Abstract
This paper presents an adaptive XML parser based on table-driven XML (TDX) parsing technology. The technique can be used to develop extensible, high-performance Web services for large, complex systems that typically require extensible schemas. The parser integrates scanning, parsing, and validation into a single pass without backtracking by using compact tabular representations of schemas and a push-down automaton (PDA) at runtime. The tabular forms are constructed from a set of schemas or WSDL descriptions through the use of a permutation grammar. The engine is implemented as a PDA-based, table-driven driver and is therefore independent of any particular XML schema. When XML schemas are updated or extended, the tabular forms can be regenerated and loaded into the generic engine without redeploying the parser. This adaptive approach balances the need for performance against the cost of rebuilding and redeploying the Web services. Our experiments show that the adaptive parser is usually about 5 times faster than traditional validating parsers, and that its performance falls no more than 20% below that of the fastest fully compiled traditional validating parsers.
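
The separation between a generic engine and schema-specific tables can be illustrated with a minimal Python sketch. It is not the TDX implementation: the table layout, the names `validate`, `TABLES_V1`, and `TABLES_V2`, and the pre-tokenized event list are all assumptions made for illustration, and the sketch omits scanning, attributes, text content, and the permutation grammar. It shows only that a stack-based driver can validate element structure in one pass, and that extending the schema means swapping tables rather than changing the driver.

```python
# Hypothetical table layout: one small automaton per element type.
# "delta" maps (state, child element name) -> next state.
LEAF = {"start": 0, "accept": {0}, "delta": {}}

# Schema v1: <order> contains one <id> followed by one or more <item>.
TABLES_V1 = {
    "order": {
        "start": 0,
        "accept": {2},
        "delta": {(0, "id"): 1, (1, "item"): 2, (2, "item"): 2},
    },
    "id": LEAF,
    "item": LEAF,
}

# Schema v2: same as v1, extended with an optional trailing <note>.
TABLES_V2 = {
    **TABLES_V1,
    "order": {
        "start": 0,
        "accept": {2, 3},
        "delta": {(0, "id"): 1, (1, "item"): 2, (2, "item"): 2, (2, "note"): 3},
    },
    "note": LEAF,
}


def validate(events, tables, root):
    """Generic driver: one pass over ("start"|"end", name) events, no backtracking."""
    stack = []          # each entry: (open element name, its automaton state)
    seen_root = False
    for kind, name in events:
        if kind == "start":
            if stack:
                # Advance the parent's automaton on this child name.
                parent, state = stack[-1]
                nxt = tables[parent]["delta"].get((state, name))
                if nxt is None:
                    return False
                stack[-1] = (parent, nxt)
            else:
                # Only a single root element of the expected name is allowed.
                if seen_root or name != root:
                    return False
                seen_root = True
            if name not in tables:
                return False
            stack.append((name, tables[name]["start"]))
        elif kind == "end":
            if not stack:
                return False
            top, state = stack.pop()
            # The closing tag must match and the content model must be complete.
            if top != name or state not in tables[top]["accept"]:
                return False
        else:
            return False
    return seen_root and not stack
```

In this sketch, a document that uses `<note>` is rejected under `TABLES_V1` and accepted under `TABLES_V2`, while `validate` itself is unchanged. That mirrors, in a much simplified form, the idea of regenerating tables instead of redeploying the parser.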