chunk – Read a large (XML) file a chunk at a time

I’ve recently had to parse some pretty large XML documents, and needed a method to read one element at a time.

Here’s a fairly simple solution in PHP and Ruby form (hopefully a Python one coming soon…).

If you have the following file (complex-test.xml):

And you want to return the <Object/> elements one at a time
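As a rough illustration of the idea, here is a minimal Ruby sketch. It is not the chunk class's actual code or API. The `each_element` helper, its `block_size` parameter and the sample XML string are all made up for this example. It reads a fixed-size block at a time, buffers the text, and yields each element as soon as its closing tag has arrived, so the whole file never has to sit in memory.

```ruby
require "stringio"

# Hypothetical helper (not the chunk class's real API): yields every complete
# <tag>...</tag> element found in io, reading block_size bytes at a time.
# Assumes the element is neither self-closing nor nested inside itself.
def each_element(io, tag, block_size = 4096)
  open_re   = /<#{Regexp.escape(tag)}[\s>]/
  close_tag = "</#{tag}>"
  buffer    = String.new

  while (block = io.read(block_size))
    buffer << block
    loop do
      start = buffer.index(open_re)
      unless start
        # No opening tag yet: keep only a tail long enough to hold a
        # partially read "<tag", so the buffer stays small.
        keep = tag.length + 1
        buffer = buffer[-keep..-1] if buffer.length > keep
        break
      end
      finish = buffer.index(close_tag, start)
      break unless finish # closing tag not read yet, fetch another block
      finish += close_tag.length
      yield buffer[start...finish]
      buffer = buffer[finish..-1]
    end
  end
end

# Made-up sample data standing in for a large file on disk.
xml = '<Objects><Object id="1"><Name>a</Name></Object>' \
      '<Object id="2"><Name>b</Name></Object></Objects>'

found = []
each_element(StringIO.new(xml), "Object", 7) { |element| found << element }
puts found
```

With a real file you would pass `File.open(path, "rb")` instead of the `StringIO`. The tiny block size of 7 is only there to show that elements split across block boundaries are still returned whole.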



It (probably) doesn’t work with nested XML elements, but my use case didn’t need it to.
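To see why nesting could be a problem, assume the reader simply stops at the first closing tag it meets. That strategy is an assumption made for illustration, not something confirmed from the class's source, and `first_element` is a hypothetical helper. An element nested inside another of the same name then gets cut short:

```ruby
# Naive extraction: from the first opening tag to the FIRST closing tag.
# Assumed strategy for illustration only, not the class's verified behaviour.
def first_element(xml, tag)
  start = xml.index(/<#{Regexp.escape(tag)}[\s>]/)
  return nil unless start

  close_tag = "</#{tag}>"
  finish = xml.index(close_tag, start)
  return nil unless finish

  xml[start...(finish + close_tag.length)]
end

flat   = "<Object><Name>a</Name></Object>"
nested = "<Object><Object><Name>a</Name></Object></Object>"

puts first_element(flat, "Object")   # the whole element
puts first_element(nested, "Object") # stops at the inner closing tag
```

Handling that case would need some form of depth counting, or a real streaming parser.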

You can get the PHP version here, and the Ruby one here.

Update: The class was accepted onto PHP Classes as a notable package! 04/08/2009.

6 thoughts on “chunk – Read a large (XML) file a chunk at a time”

  2. This code helped me parse some very large files quickly. However, when I try to create more than one chunk instance in a script, it errors out, and I am not quite sure why. I receive the following exception when I try to read in the second XML file:

    I am calling again, so I am not sure why a second execution would cause this behavior. If I execute each import separately, everything works. Any help is greatly appreciated.

    • Hi there Ben,

      I’ll certainly have a play with it and see if it’s something I can resolve quickly, not sure why it would have any impact though…

      Thanks for pointing this out!

