Parsing XML at the Speed of Light--Arseny Kapoulkine

Some high-performance techniques that you can use for more than just parsing, including this week's darling, memory management:

Parsing XML at the Speed of Light

a chapter from "The Performance of Open Source Applications"
by Arseny Kapoulkine

From the chapter:

This chapter describes various performance tricks that allowed the author to write a very high-performing parser in C++: pugixml. While the techniques were used for an XML parser, most of them can be applied to parsers of other formats or even unrelated software (e.g., memory management algorithms are widely applicable beyond parsers). ...

Optimizing software is hard. In order to be successful, optimization efforts almost always involve a combination of low-level micro-optimizations, high-level performance-oriented design decisions, careful algorithm selection and tuning, balancing among memory, performance, implementation complexity, and more. Pugixml is an example of a library that needs all of these approaches to deliver a very fast production-ready XML parser, even though compromises had to be made to achieve this. A lot of the implementation details can be adapted to different projects and tasks, be it another parsing library or something else entirely.

Continue reading...