Using JSON in IETF Protocols

Andy Newton

In IETF protocol activities in which no data serialization format is natural to the problem domain, chances are that someone will raise the idea of using JavaScript Object Notation (JSON). And chances are that the discussion will compare JSON with the Extensible Markup Language (XML). Today, in the IETF, where XML was once the preferred text-based data format, JSON is being used more and more.

This increased use by the IETF mirrors a trend in the broader industry. ProgrammableWeb.com keeps a directory of Internet application programming interfaces (APIs). In 2009, it listed only 191 registered JSON APIs [1]. Today, 3,285 JSON APIs are in its registry (compared to 4,543 registered as XML). That’s a pretty good showing, especially considering that XML is more than twice the age of JSON.

The relationship between JSON and XML was probably best described by James Clark, technical lead of the W3C Working Group that created XML 1.0, in a 2010 blog post [2]:

“I think the Web community has spoken, and it’s clear that what it wants is HTML5, JavaScript and JSON. XML isn’t going away but I see it being less and less a Web technology; it won’t be something that you send over the wire on the public Web, but just one of many technologies that are used on the server to manage and generate what you do send over the wire.”

In other words, JSON will not replace XML, but a lot of the structured data communications in which XML was once the only consideration will eventually migrate to JSON.

While a lot can be said for the heft of XML’s complexities contributing to this trend, a better explanation of JSON’s popularity is that JSON is simply the simpler of the two. After all, the abundance of XML technologies did not inhibit XML adoption before JSON came on the scene; by any measure, XML is a success. But JSON is a simpler solution to a subset of the problems solved by XML, and that subset is a rather large one.

JSON’s simplicity comes in two forms: the rules for serializing JSON data, and the interfaces for parsing JSON data, particularly in today’s more-popular, dynamic languages.

Unlike those of XML, JSON’s serialization and parsing rules are fairly simple. JSON data is composed of objects (which, in turn, have members), arrays, and the more primitive types: string, boolean, number, and null. JSON has neither mixed content nor metadata structures. It has no processing instructions or character entity references, and no facility at all for commenting.

Thanks to these basic differences, the API in many programming languages can also be quite simple. Because there is no need to account for mixed content, processing instructions, and the like, many dynamic languages easily map JSON data into their native object and array constructs, resulting in seamless access to both JSON-rendered data and the program’s other data. For many programmers, outside of a parse command, there appears to be no JSON API.
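As a minimal illustration of this mapping, the sketch below parses a small JSON text with Python’s standard json module. The member names and values are invented for the example; the point is only that, after the single parse call, the data is handled with ordinary dictionary and list operations.

```python
import json

# A small JSON text using every JSON type: object, array, string,
# number, boolean, and null. The member names are made up for this example.
text = '{"name": "example", "tags": ["a", "b"], "count": 3, "active": true, "note": null}'

# The only JSON-specific call: parse the text into native Python values.
data = json.loads(text)

# From here on, plain dict and list access is all that is needed.
second_tag = data["tags"][1]        # JSON array  -> Python list
is_active = data["active"]          # JSON true   -> Python True
note = data["note"]                 # JSON null   -> Python None
count_plus_one = data["count"] + 1  # JSON number -> Python int

print(type(data).__name__, second_tag, is_active, note, count_plus_one)
```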

The Devil Is in the Details

As with any technology, the lure of simplicity and the need for features is a function of the problem being solved. Any specification activity looking at JSON should evaluate the following issues.

JSON Is Not JavaScript. Though this is widely known, it sometimes bears repeating. JSON is a language-independent data format. That said, understanding JSON’s heritage is important for two reasons.

  • The constructs of JSON objects and arrays follow JavaScript semantics. Members of JSON objects are considered unordered; no member should be repeated within an object (this may not produce an error, but the behavior is dependent on the parser’s implementation). Arrays, however, are considered ordered collections of values.
  • Although JSON is not JavaScript, it is valid JavaScript syntax. Therefore, for security reasons, JSON parsers should take precautions against JavaScript embedded in JSON data (e.g. JavaScript engines should not simply parse JSON using the eval function), and serializers should prevent the injection of JavaScript into JSON data.
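The two points above can be sketched in Python, chosen here only for illustration. The first part shows how one particular parser, Python’s standard json module, treats a repeated member name, and how a caller might reject repeats instead; the helper names reject_duplicates, strict_loads, and is_json are hypothetical. The second part shows that a dedicated JSON parser refuses text that is code rather than data, which is the reason to prefer it over an eval-style function.

```python
import json

def reject_duplicates(pairs):
    """Hook for json.loads: build an object, failing on a repeated member name."""
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError(f"duplicate member name: {name!r}")
        result[name] = value
    return result

def strict_loads(text):
    """Parse JSON text, raising ValueError if any object repeats a member name."""
    return json.loads(text, object_pairs_hook=reject_duplicates)

def is_json(text):
    """Return True only if the text parses as JSON; nothing is ever executed."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

doc = '{"a": 1, "a": 2}'

# Python's json module does not report an error here; it keeps the last value.
# Other parsers may behave differently, which is why repeats should be avoided.
lenient = json.loads(doc)

# An expression that eval() would run is simply rejected by the JSON parser.
looks_like_code = '__import__("os").getcwd()'
print(lenient, is_json(doc), is_json(looks_like_code))
```

A protocol that cares about repeated names can call strict_loads(doc), which raises ValueError for the document above.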

Character Set Support. JSON is strictly Unicode; protocols relying on non-Unicode encodings will be incompatible. This is not generally a problem, but does contrast with XML, in which US-ASCII, ISO-8859-1, Shift_JIS, Big-5, GB, and other character sets can be explicitly specified.
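As a small sketch of what Unicode-only means in practice, the Python example below serializes the same value two ways: with non-ASCII characters escaped as \u sequences, and as literal characters encoded to UTF-8 bytes. Both forms are valid JSON and decode to the same value. The sample data is invented, and the ensure_ascii option is specific to Python’s json module.

```python
import json

value = {"city": "Zürich"}  # sample data, invented for this example

# Default behaviour of Python's json module: non-ASCII characters are
# written as \uXXXX escapes, so the output is pure ASCII.
escaped = json.dumps(value)

# Alternatively, keep the characters literal and encode the text as UTF-8.
literal = json.dumps(value, ensure_ascii=False)
wire_bytes = literal.encode("utf-8")

# Either form decodes back to the same Unicode string.
from_escaped = json.loads(escaped)
from_bytes = json.loads(wire_bytes)

print(escaped, from_escaped == from_bytes)
```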

Mixed Content and Markup. Another contrast with XML is JSON’s lack of mixed-content support, whereby data-format syntax can be intermingled with message text. This is what makes XML ideally suited for markup and annotating text with metadata. For those purposes, JSON is a poor choice.

Schemas. There is no one standardized schema language for JSON, although several are presently in the works (including one by this author). The need for a JSON schema language is controversial; JSON is regarded by most as simple enough on its own. Indeed, there is no shortage of JSON-based interchange specifications making do without schema formalism.

The inherent complexity of a generalized schema language may also be unnecessary. The Application-Layer Traffic Optimization (ALTO) working group has created a simple schema language for JSON based upon the C programming language’s struct syntax—an approach that sufficiently covers all ALTO needs.

But the significance of JSON Schema should not be understated. While JSON Schema is not yet an IETF proposed standard, it has been used in the specification of many private, JSON-based protocols.

Namespaces. JSON does not (yet) have a standardized namespace specification. A few have been proposed, but none has gained significant traction.

Many JSON specifications have no need for a namespacing system, nor for the complexities that come with a federated one. One example is the Web Extensible Internet Registration Data Service (WEIRDS) working group, which has developed a simple prefix rule for JSON names to avoid collisions between names in standardized specifications and custom extensions.
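The general idea of a prefix rule can be sketched as follows. This is not the working group’s actual rule: the standardized names, the "example" prefix, the underscore separator, and the add_extension helper are all assumptions made for illustration.

```python
# Hypothetical set of member names reserved by a standardized specification.
STANDARD_NAMES = {"handle", "status", "events"}

def add_extension(obj, prefix, name, value):
    """Return a copy of obj with a custom member stored under a prefixed name.

    The prefix keeps the custom name from colliding with a standardized
    member that happens to share the same short name.
    """
    key = f"{prefix}_{name}"
    if key in STANDARD_NAMES or key in obj:
        raise ValueError(f"name already in use: {key!r}")
    extended = dict(obj)
    extended[key] = value
    return extended

response = {"handle": "ABC-1", "status": "active"}

# A custom "status" lives alongside the standardized one without clashing.
extended = add_extension(response, "example", "status", "under-review")
print(sorted(extended))
```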

Performance. The needs of a protocol are often dictated by performance, of which there are multiple aspects. Common wisdom holds that JSON is less verbose than XML, resulting in smaller payloads and lower memory requirements.

But JSON is also considered to be faster than XML. A study conducted at Montana State University [3] that modeled the transfer of data over a network found that in some scenarios JSON could be 41 times faster than XML, while consuming noticeably less CPU time and memory.

The types of APIs available can also impact performance, particularly for applications with large datasets in which an in-memory, tree-node API would consume too many resources. For years, XML benefited from the quasi-standard SAX specification, an event-based API for consuming XML. JSON parsers with similar concepts are now available: Jackson, LitJSON, YAJL (and its various language bindings), and GSON, to name a few.

Normative Reference. No discussion about using JSON in IETF protocols is complete without noting that the JSON-defining document, RFC 4627, is published as an Informational RFC. Because JSON plays such a pivotal role in IETF standards, there are discussions of moving JSON onto the standards track. Until then, RFC 4627 is listed in the DOWNREF registry [4] and can be used as a normative reference.

A normative reference to ECMA’s ECMAScript Language Specification is also unnecessary, because the ECMAScript 5.1 Edition [5] normatively refers to RFC 4627, with slight differences called out.

An Eye to the Future

With the aforementioned considerations in hand, one should also recognize the IETF activities in the area of JSON infrastructure. The Applications Area working group (APPSAWG) has discussions from time to time regarding JSON Schema (and other JSON schema languages) and is currently working on JSON Pointer and JSON Patch, specifications for referencing values in a JSON document and for applying changes to a JSON document with HTTP PATCH, respectively. In addition, the Javascript Object Signing and Encryption (JOSE) working group is tasked with developing cryptographic integrity and encryption specifications for JSON.
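To give a feel for what referencing a value in a JSON document involves, here is a simplified Python sketch of pointer resolution. It follows the slash-separated token syntax and the ~0 and ~1 escapes as JSON Pointer was later published in RFC 6901, but it is an illustration under those assumptions rather than a complete implementation: the resolve_pointer name is hypothetical, and details such as the "-" array token and the ban on leading zeros in array indexes are not handled.

```python
def resolve_pointer(document, pointer):
    """Return the value inside a parsed JSON document that the pointer names.

    An empty pointer refers to the whole document; otherwise the pointer is
    a sequence of reference tokens, each introduced by '/'.
    """
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty pointer must start with '/'")

    current = document
    for raw_token in pointer[1:].split("/"):
        # Unescape in this order: '~1' means '/', then '~0' means '~'.
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            if not (token.isascii() and token.isdigit()):
                raise ValueError(f"not an array index: {token!r}")
            current = current[int(token)]
        elif isinstance(current, dict):
            current = current[token]
        else:
            raise KeyError(f"cannot descend into a primitive at {token!r}")
    return current

doc = {"foo": ["bar", "baz"], "a/b": 1, "m~n": 8}

first = resolve_pointer(doc, "/foo/0")   # first element of the "foo" array
slash = resolve_pointer(doc, "/a~1b")    # member whose name contains '/'
tilde = resolve_pointer(doc, "/m~0n")    # member whose name contains '~'
print(first, slash, tilde)
```

A patch format can then build on such pointers by pairing each one with an operation (for example, add, remove, or replace) to apply at the referenced location.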

Because this is the broader trend in the computing industry, one should expect to see more JSON RFCs from the IETF, both as client standards and as building blocks for JSON-based stacks.

References

1. “The Stealthy Ascendancy of JSON,” Lori MacVittie, F5 DevCentral, https://devcentral.f5.com/weblogs/macvittie/archive/2011/04/27/the-stealthy-ascendancy-of-json.aspx

2. “XML vs The Web,” James Clark, James Clark’s Random Thoughts, http://blog.jclark.com/2010/11/xml-vs-web_24.html

3. “Comparison of JSON and XML Data Interchange Formats: A Case Study,” Nurzhan Nurseitov, Michael Paulson, Randall Reynolds and Clemente Izurieta, http://www.cs.montana.edu/izurieta/pubs/caine2009.pdf

4. DOWNREF Registry, http://trac.tools.ietf.org/group/iesg/trac/wiki/DownrefRegistry

5. “ECMAScript Language Specification, 5.1 Edition,” ECMA-262, June 2011, http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf