XML vs. JSON

Developers often ask something to the effect of “Should I use XML or JSON here?”, “Which is better?”, “Which is faster?” or, my favorite, “Which is easier?” Before you read any further, you should know that JSON normally wins the day.

Here are my most common recommendations by task type:

  1. I’m integrating with an application that only offers XML.

    Don’t change it. Converting is twice the work, so use what exists today. I’ve seen people query a remote data source, get XML back, run it through a JSON conversion process, and then handle the JSON. If you can avoid reprocessing an already processed result, you should. Keep the XML in place and use the available tools to manipulate it the way you need.

  2. I’m updating an application that only offers XML and I have the opportunity to add JSON support

    Awesome! Do it. The benefits aren’t just on the consumption side: JSON is lighter (mainly because it lacks those XML closing tags), and lightweight is generally better. Additionally, giving consumers more choices makes for an easier pitch on why someone should use your product.

  3. I’m building an application from scratch and I only want to offer one option – should it be XML or JSON?

    JSON, unless you need something XML provides that JSON doesn’t. At this point, I can’t find much that XML offers that JSON isn’t capable of providing. Using XML, you can access data with XPath queries. Using JSON, you can access data with JSONPath.

    I used to hear quite a bit about XML’s validation, but JSON offers validation solutions, too (JSON Schema, for example).

    Attributes were once commonly cited as a feature that XML offered and JSON did not, but I still can’t find a use for XML attributes that I can’t also accomplish in well-structured JSON.

    XSL, or the ability to transform XML without a programming language doing the work: I still can’t find a use for that in the kind of web development where this question comes up. Normally, I’m serving something out of a database, maybe with some light data manipulation, then packaging that data and pushing it out to a requester. In this scenario, I can’t think of a case where I wouldn’t have a programming language readily available.
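To make the first and third recommendations concrete, here is a minimal sketch of reading the same field from XML and from JSON without converting one into the other. The sample documents and the function names (`emails_from_xml`, `emails_from_json`) are made up for illustration. It uses only the Python standard library, which supports a limited subset of XPath; JSONPath itself needs a third-party library, so plain list and dictionary access stands in for it here.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same two records in each format.
XML_DOC = """
<records>
  <record><id>1</id><first_name>Ada</first_name><email>ada@example.com</email></record>
  <record><id>2</id><first_name>Grace</first_name><email>grace@example.com</email></record>
</records>
"""

JSON_DOC = """
[
  {"id": 1, "first_name": "Ada", "email": "ada@example.com"},
  {"id": 2, "first_name": "Grace", "email": "grace@example.com"}
]
"""


def emails_from_xml(text):
    """Query the XML directly with ElementTree's XPath subset."""
    root = ET.fromstring(text.strip())
    return [node.text for node in root.findall("./record/email")]


def emails_from_json(text):
    """Read the same field from JSON with ordinary list and dict access."""
    return [record["email"] for record in json.loads(text)]


print(emails_from_xml(XML_DOC))
print(emails_from_json(JSON_DOC))
```

Both functions return the same list of addresses, and neither needs a conversion step in between.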

Certainly, XML has its place. I don’t recommend changing an existing application to remove XML support in favor of JSON. I do recommend offering both as options when building new APIs (with JSON as the default, of course). For new builds, I almost always recommend JSON as the data format for AJAX calls to the back end.
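As a rough sketch of what “both formats, JSON by default” could look like, the hypothetical `render` function below takes a list of flat records and an optional format name. The function names, the element names (`records`, `record`), and the returned content types are assumptions for illustration, not part of any particular framework.

```python
import json
import xml.etree.ElementTree as ET


def to_json(records):
    """Serialize a list of flat dictionaries as JSON."""
    return json.dumps(records)


def to_xml(records):
    """Serialize the same records as XML, one <record> element each."""
    root = ET.Element("records")
    for record in records:
        node = ET.SubElement(root, "record")
        for key, value in record.items():
            ET.SubElement(node, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def render(records, fmt="json"):
    """Return (content_type, body); JSON is the default format."""
    if fmt == "json":
        return "application/json", to_json(records)
    if fmt == "xml":
        return "application/xml", to_xml(records)
    raise ValueError(f"unsupported format: {fmt!r}")


sample = [{"id": 1, "email": "a@example.com"}]
print(render(sample))
print(render(sample, "xml"))
```

A caller that asks for nothing in particular gets JSON, while an existing XML consumer can keep requesting `"xml"` explicitly.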

For anyone who wants the details behind the “lightweight” claim, I generated 1,000 records on mockaroo.com and exported them as both XML and JSON. Each record is flat (one level) and has four fields:

  • ID (number)
  • First Name (text)
  • Last Name (text)
  • Email (text)

Benchmark         XML             JSON            Difference (actual)   Difference (percent)   Winner
---------------   -------------   -------------   -------------------   --------------------   ------
Character count   142,702 chars   137,679 chars   5,023 chars           4%                     JSON

At this point, most people think this isn’t much of a difference. About 5,000 characters over 1,000 records works out to just 5 extra characters per record.
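A comparison like this is easy to try yourself. The sketch below is an assumption-laden stand-in, not the original test: it builds 1,000 synthetic records with the same four fields instead of using the Mockaroo export, and the field and element names are invented. The exact counts will therefore differ from the table, but the direction of the result should be the same.

```python
import json
import xml.etree.ElementTree as ET


def make_records(count):
    """Build synthetic records (hypothetical data, not the Mockaroo export)."""
    return [
        {
            "id": i,
            "first_name": f"First{i}",
            "last_name": f"Last{i}",
            "email": f"user{i}@example.com",
        }
        for i in range(1, count + 1)
    ]


def as_json(records):
    # Compact separators so whitespace does not inflate the JSON count.
    return json.dumps(records, separators=(",", ":"))


def as_xml(records):
    root = ET.Element("records")
    for record in records:
        node = ET.SubElement(root, "record")
        for key, value in record.items():
            ET.SubElement(node, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


records = make_records(1000)
xml_chars = len(as_xml(records))
json_chars = len(as_json(records))
difference = xml_chars - json_chars

print(f"XML:  {xml_chars:,} chars")
print(f"JSON: {json_chars:,} chars")
print(f"Difference: {difference:,} chars ({difference / xml_chars:.1%} of the XML size)")
```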

Consider what happens on a much larger scale – not just 1,000 records, but 2 billion records.

At that scale, JSON comes out roughly 10 billion (10,046,000,000) characters lighter.
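The projection is simple arithmetic on the measured counts. The sketch below assumes the per-record difference stays constant as the data grows; the variable names are just labels for the numbers quoted above.

```python
# Measured character counts for the 1,000-record sample.
XML_CHARS = 142_702
JSON_CHARS = 137_679
SAMPLE_RECORDS = 1_000
TARGET_RECORDS = 2_000_000_000

difference = XML_CHARS - JSON_CHARS               # extra characters in the XML sample
per_record = difference / SAMPLE_RECORDS          # extra characters per record
percent = difference / XML_CHARS * 100            # savings relative to the XML size
# Assumes the per-record difference holds at the larger scale.
projected = difference * TARGET_RECORDS // SAMPLE_RECORDS

print(f"Difference: {difference:,} chars ({percent:.0f}%)")
print(f"Per record: {per_record} chars")
print(f"At {TARGET_RECORDS:,} records: {projected:,} chars")
```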

Whether you reach 2 billion records because the data is passed, parsed, downloaded, or evaluated every few minutes for the lifetime of the application, or simply because the source data grows, what seem like small differences today can easily become significant at scale.