Tidy up your XML files

I have been working with some really large XML files (20 MB) that I have to parse and load into a database. The XML arrives as a single line (probably to save space), which makes it nearly unreadable for a human. There are many tools out there that will tidy up and reformat XML into a human-readable layout, but writing my own with C# and LINQ to XML is so trivial that I decided to roll my own. Here's the code:

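using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

namespace XmlTidy {
    class Program {
        static void Main(string[] args) {
            // Require exactly one argument: the file to tidy.
            if (args.Length != 1) {
                Console.WriteLine("Usage: xmltidy <file.xml>");
                return;
            }
            try {
                XmlTidy(args[0]);
            } catch (FileNotFoundException ex) {
                Console.WriteLine("{0}: {1}", ex.Message, ex.FileName);
            } catch (XmlException ex) {
                // Malformed XML: report it instead of crashing.
                Console.WriteLine("Invalid XML: {0}", ex.Message);
            }
        }

        static void XmlTidy(string filename) {
            if (File.Exists(filename)) {
                // Load drops insignificant whitespace by default...
                var doc = XDocument.Load(filename);
                // ...and SaveOptions.None writes the document back indented.
                doc.Save(filename, SaveOptions.None);
            } else {
                throw new FileNotFoundException("File not found", filename);
            }
        }
    }
}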

What's going on in the code above? The program expects a file name as a command-line parameter, like so: xmltidy samplefile.xml. Main passes that name to the XmlTidy function, which checks that the file exists, loads its contents into a LINQ to XML document, and then saves the document back to the same location. That's it! The XDocument class takes care of formatting the XML data for you: by default, loading discards insignificant whitespace, and saving with SaveOptions.None writes indented output. Note that the original file is overwritten in place.
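To see the formatting behavior without touching any files, here is a minimal in-memory sketch. The class name XmlTidyDemo and the helpers Format and Flatten are hypothetical names made up for this illustration; they are not part of the program above or of the GitHub project. It relies only on XDocument.Parse and ToString, which by default indent the output the same way Save does, and on SaveOptions.DisableFormatting to go back to a single line.

```csharp
using System;
using System.Xml.Linq;

static class XmlTidyDemo {
    // Hypothetical helper: parse XML text and return it indented.
    // Parse discards insignificant whitespace by default, and
    // ToString() with no arguments indents the result.
    public static string Format(string xml) {
        return XDocument.Parse(xml).ToString();
    }

    // Hypothetical helper: the reverse, collapsing XML to one line.
    public static string Flatten(string xml) {
        return XDocument.Parse(xml).ToString(SaveOptions.DisableFormatting);
    }

    static void Main() {
        // Single-line input, similar in shape to the files described above.
        string oneLine = "<root><item>1</item><item>2</item></root>";
        Console.WriteLine(Format(oneLine));
    }
}
```

Running this should print the root element with each item on its own line, indented by two spaces. Unlike Save, ToString does not emit an XML declaration, so this is best treated as a quick way to inspect the layout rather than a drop-in replacement for the file-based version.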

You can download the full source, which has some enhancements, from my GitHub project. This code works on Windows 7 with .NET 4.0 and also on Ubuntu 10.04 with Mono 2.6.7.
