…well, making it easier to read at least.
I often have to deal with machine-produced lumps of XML that are all but impossible to read or extract information from. So I’ve made a small page that beautifies XML without sending it to a server for processing (ideal for business-sensitive data that can’t be transmitted to who-knows-what server), and thought I’d share it in case it’s useful to anyone else.
There might be some bugs in there somewhere, but it has met most of my day-to-day needs.
It’s available here.
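For anyone who would rather do the same thing from a script, here is a minimal Python sketch of local-only XML beautifying. It is not the code behind the page above; the function name `beautify_xml` and the sample document are made up for illustration, and it only uses the standard library's `xml.dom.minidom`.

```python
from xml.dom import minidom


def beautify_xml(raw: str, indent: str = "  ") -> str:
    """Re-indent an XML string locally; nothing is sent over the network.

    Hypothetical helper for illustration, not the page's own implementation.
    """
    pretty = minidom.parseString(raw).toprettyxml(indent=indent)
    # toprettyxml() can emit whitespace-only lines when the input was
    # already partly indented, so drop those before returning.
    return "\n".join(line for line in pretty.splitlines() if line.strip())


if __name__ == "__main__":
    # Made-up sample input: one long line, as a machine might produce it.
    print(beautify_xml("<order><id>42</id><item sku='a1'>Widget</item></order>"))
```

Note that `minidom` loads the whole document into memory, so this suits the "unreadable lump" case rather than very large files.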
htmlentities() doesn’t even come close.
This small file contains four functions (two of which are taken from the PHP manual, credit given!) that let you encode and decode entities in ASCII/Unicode strings, in either decimal or hexadecimal format, for use in valid XML documents.
The xml_entity_decode() function accepts an optional second parameter, an array of non-standard XML entities (that may have been specified in your schema), in the format:
array(
    // 'entity' => 'char'
    'amp' => '&',
    'lt' => '<',
    'gt' => '>',
    'apos' => '\'',
    'quot' => '"'
)
$s = 'This should be safe, but don\'t assume!
// outputs: <strong>This</strong> should be safe, but don't assume!<br/>
You can get the script here, or there’s a demo here too.
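To make the idea concrete, here is a rough Python sketch of the same encode/decode round trip. It is not a port of the PHP file: the name `xml_entity_encode`, the `hexadecimal` flag and the `extra` parameter are assumptions made for illustration, and only the 'entity' => 'char' table shape is taken from the description above.

```python
import re
from typing import Dict, Optional

# The five entities predefined by XML, in the 'entity' => 'char' shape.
PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "apos": "'", "quot": '"'}

# Matches &#xHH; (hex), &#DD; (decimal) and &name; references.
_REFERENCE = re.compile(r"&(#x[0-9A-Fa-f]+|#[0-9]+|[A-Za-z_][\w.-]*);")


def xml_entity_encode(text: str, hexadecimal: bool = False) -> str:
    """Replace markup characters and non-ASCII characters with numeric references."""
    out = []
    for ch in text:
        if ch in "&<>'\"" or ord(ch) > 126:
            out.append(f"&#x{ord(ch):X};" if hexadecimal else f"&#{ord(ch)};")
        else:
            out.append(ch)
    return "".join(out)


def xml_entity_decode(text: str, extra: Optional[Dict[str, str]] = None) -> str:
    """Decode numeric and named references; `extra` adds non-standard entities."""
    table = {**PREDEFINED, **(extra or {})}

    def replace(match):
        body = match.group(1)
        try:
            if body.startswith("#x"):
                return chr(int(body[2:], 16))
            if body.startswith("#"):
                return chr(int(body[1:]))
        except (ValueError, OverflowError):
            return match.group(0)  # code point out of range: leave as-is
        return table.get(body, match.group(0))  # unknown names stay untouched

    # A single pass, so "&amp;lt;" decodes to "&lt;" and is not decoded twice.
    return _REFERENCE.sub(replace, text)
```

Passing something like `{"copy": "©"}` as `extra` lets `&copy;` decode alongside the predefined five, which is the role the optional second parameter plays in the description above.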
I’ve recently had to parse some pretty large XML documents, and needed a way to read them one element at a time.
Here’s a fairly simple solution in PHP and Ruby form (hopefully a Python one coming soon…).
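In the meantime, one way the chunk-at-a-time idea could look in Python is sketched below. This is an assumption about the approach rather than a translation of the PHP or Ruby versions: the name `read_chunks`, its parameters and the sample data are all hypothetical. It reads fixed-size blocks into a buffer and yields everything up to each closing tag, so the whole file never has to sit in memory.

```python
import io
from typing import Iterator, TextIO


def read_chunks(stream: TextIO, end_tag: str, block_size: int = 8192) -> Iterator[str]:
    """Yield the text up to and including each occurrence of end_tag.

    Hypothetical sketch: only the current unfinished element is buffered.
    Whatever follows the last end_tag (e.g. the root's closing tag) is dropped.
    """
    buffer = ""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        buffer += block
        # A block may complete several elements, or only part of one; a tag
        # split across two blocks is found once the next block is appended.
        while True:
            pos = buffer.find(end_tag)
            if pos == -1:
                break
            cut = pos + len(end_tag)
            yield buffer[:cut]
            buffer = buffer[cut:]


if __name__ == "__main__":
    # Made-up sample; a tiny block_size forces tags to straddle block boundaries.
    doc = io.StringIO("<items><item>1</item><item>2</item></items>")
    for chunk in read_chunks(doc, "</item>", block_size=8):
        print(chunk)
```

The first chunk also carries whatever precedes the first element (here the opening `<items>` tag), so a caller would need to trim that before parsing each chunk on its own. The plain string search is also fooled by an `end_tag` that appears inside a comment or CDATA section; Python's standard `xml.etree.ElementTree.iterparse` is the more robust route when the input is well-formed.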
Continue reading “chunk – Read a large (XML) file a chunk at a time”