On a typical workday I often need to look at logfiles containing xml to track down a problem. The xml is usually poorly formatted, and sometimes it is all on one line, which makes it very difficult to read. Ideally, I could extract the xml I am interested in into a buffer and pretty-print it with very little effort. The steps would be something like:

1. create a new buffer called *xml* in another window (C-x 4 b)
2. delete anything that exists in that buffer already
3. pretty print the xml into the new buffer

Did I gloss over step 3? That sounds pretty complicated, right? Well, I can call a simple perl script from emacs (the emacs function further down expects to find it on the path as xml_pretty_print.pl). I’m pretty pragmatic and I don’t feel the need to code absolutely everything in emacs-lisp.

    #!/usr/bin/perl
    use XML::Twig;
    use XML::Parser;

    my $xml = XML::Twig->new(pretty_print => 'indented');

    if ($ARGV[0]) {
        $xml->parse($ARGV[0]);
    }
    else {
        $xml->parse(\*STDIN);
    }

    $xml->print();

    (defun xml-pretty-print-region (start end)
      (interactive "r")
      (let ((b (get-buffer-create "*xml*")))
        (switch-to-buffer-other-window b)
        (xml-mode)
        (erase-buffer)
        (other-window -1)
        (goto-char end)
        (let ((e (point-marker)))
          (join-broken-lines start end)
          (call-process-region start e "xml_pretty_print.pl" nil b))))

Actually, sgml-mode (xml-mode is just an alias) has a function called sgml-pretty-print, but firstly I prefer the output from XML::Twig, and secondly it is nice to see how easy emacs makes it to call out to an external process and collect the results. Anyone without perl installed might prefer to replace the external call with a call to (sgml-pretty-print ...):

    (defun xml-pretty-print-region (start end)
      (interactive "r")
      (let ((cb (current-buffer))
            (buf (get-buffer-create "*xml*")))
        (set-buffer buf)
        (erase-buffer)
        (set-buffer cb)
        (copy-to-buffer buf start end)
        (switch-to-buffer-other-window buf)
        (xml-mode)
        (join-broken-lines (point-min) (point-max))
        (sgml-pretty-print (point-min) (point-max))
        (other-window -1)))
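For the external-process version, it can help to notice when the script is missing or fails instead of ending up with an empty *xml* buffer. The following is only a sketch of one way to do that: `my-xml-filter-region` is a hypothetical helper name, not something from the code above, and it assumes the program reads the xml on standard input the way the perl script does.

```elisp
;; Hypothetical helper, not part of the original code.
(defun my-xml-filter-region (start end program output-buffer-name)
  "Pipe the region START..END through PROGRAM into OUTPUT-BUFFER-NAME.
Return non-nil only if PROGRAM was found and exited with status 0."
  (let ((out (get-buffer-create output-buffer-name)))
    ;; Clear any old output without switching windows.
    (with-current-buffer out
      (erase-buffer))
    (if (not (executable-find program))
        (progn
          (message "%s not found on exec-path" program)
          nil)
      ;; call-process-region returns the exit status of PROGRAM.
      (equal 0 (call-process-region start end program nil out)))))
```

With a region active, `(my-xml-filter-region (region-beginning) (region-end) "xml_pretty_print.pl" "*xml*")` would fill the *xml* buffer and report whether the script succeeded.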

What is that call to (join-broken-lines ...)? When I cut and paste from putty, it breaks lines at the width of my window:

    <?xml version="1.0" encoding="UTF-8"?><xsl:stylesheet version="1.0"><xsl:output
     method="html"/><xsl:template match="/"><H2>Customer Listing (in Alternating ro
    w colors)</H2><table border="1"><xsl:for-each select="/customers/customer"><tr>
    <xsl:choose><xsl:when test="position() mod 2 = 1"><xsl:attribute name="class">c
    lsOdd</xsl:attribute></xsl:when><xsl:otherwise><xsl:attribute name="class">clsE
    ven</xsl:attribute></xsl:otherwise></xsl:choose><xsl:for-each select="@*"><td><
    xsl:value-of select="."/></td></xsl:for-each></tr></xsl:for-each></table><H3>To
    tal Customers<xsl:value-of select="count(customers/customer)"/></H3></xsl:templ
    ate></xsl:stylesheet>

This invalidates the xml so I need to fix this before passing the result to the xml parser.

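    ;; A newline, as a one-character string.
    (defconst cr (string ?\n))
    (defconst *broken-line-regex* cr)

    (defun join-broken-lines (start end)
      "Delete every newline between START and END."
      (interactive "r")
      ;; Use a marker for the bound so that it moves as newlines are deleted.
      (let ((end (copy-marker end)))
        (goto-char start)
        (while (re-search-forward *broken-line-regex* end t)
          (replace-match "" nil nil))
        (set-marker end nil)))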
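To see the effect of joining lines without touching a real buffer, the same idea can be tried on a string. This is just an illustrative sketch; `my-join-broken-lines-in-string` is a hypothetical name introduced here, and it works in a temporary buffer rather than on a region.

```elisp
;; Hypothetical helper, not part of the original code.
(defun my-join-broken-lines-in-string (text)
  "Return TEXT with every newline removed."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; No bound is needed: the whole temporary buffer is the region.
    (while (search-forward "\n" nil t)
      (replace-match "" t t))
    (buffer-string)))

;; A closing tag split across two lines becomes valid again.
(my-join-broken-lines-in-string "<a><b>c</b></\na>")
;; => "<a><b>c</b></a>"
```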

This is the output from the version which calls (sgml-pretty-print ...):

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0">
      <xsl:output method="html"/>
      <xsl:template match="/">
        <H2>Customer Listing (in Alternating row colors)
        </H2>
        <table border="1">
          <xsl:for-each select="/customers/customer">
            <tr>
              <xsl:choose>
                <xsl:when test="position() mod 2 = 1">
                  <xsl:attribute name="class">clsOdd
                  </xsl:attribute>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:attribute name="class">clsEven
                  </xsl:attribute>
                </xsl:otherwise>
              </xsl:choose>
              <xsl:for-each select="@*">
                <td>
                  <xsl:value-of select="."/>
                </td>
              </xsl:for-each>
            </tr>
          </xsl:for-each>
        </table>
        <H3>Total Customers
          <xsl:value-of select="count(customers/customer)"/>
        </H3>
      </xsl:template>
    </xsl:stylesheet>

We probably want a function to handle the common case where all the xml is on one line, with some junk before the xml and some junk afterwards, e.g.

    asdflkjnalkjnasdf <?xml version="1.0" encoding="UTF-8"?><a><b>c</b></a> xskldfgjnskldf

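    (defun xml-pretty-print-current ()
      (interactive)
      (save-excursion
        ;; e is the position just after the last ">" found searching
        ;; backward from the end of the current line.
        (end-of-line nil)
        (re-search-backward ">" 1)
        (let ((e (+ 1 (point))))
          (beginning-of-line nil)
          ;; "\\?" matches a literal "?"; unescaped, it would make the "<" optional.
          ;; After the search, point is just past the xml declaration.
          (re-search-forward "<\\?xml[^>]*>" e)
          (xml-pretty-print-region (point) e))))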
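The "junk, xml, junk" case can also be explored on a plain string, which makes the regex easier to experiment with. The sketch below uses a hypothetical function name, `my-xml-span-in-line`, and unlike the command above it keeps the xml declaration in the span it returns.

```elisp
;; Hypothetical helper, not part of the original code.
(defun my-xml-span-in-line (line)
  "Return the part of LINE from \"<?xml\" through the last \">\", or nil."
  (let ((beg (string-match "<\\?xml" line))   ; "\\?" matches a literal "?"
        ;; ".*>" is greedy, so its match ends just after the last ">".
        (end (and (string-match ".*>" line) (match-end 0))))
    (when (and beg end (< beg end))
      (substring line beg end))))

(my-xml-span-in-line
 "asdflkjnalkjnasdf <?xml version=\"1.0\"?><a><b>c</b></a> xskldfgjnskldf")
;; => "<?xml version=\"1.0\"?><a><b>c</b></a>"
```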

And, er, I guess that’s all I’ve got to say about that.