DITA > HTML > JSON

At Information Development World 2015 some attendees expressed interest in the JSON documentation format that feeds my documentation portal.

Starting from DITA source, there is a series of two transformation:

  1. HTML2 from DITA4Publishers, which flattens the directory structure.
  2. A custom XSLT that reads the resulting index and creates nested structures representing the document.
Each topic in the map becomes a “document” element in the JSON that is made up of the following pieces:
Field Source
Title Topic title
ID Topic filename
Unique key Top-level document filename + topic filename
Ancestors List of ancestor topics at all levels
Summary* Topic shortdesc
Body Topic body
HREF Topic path + topic filename
Documents* List of sub-documents

The JSON created in stage 2 is loaded into MongoDB for rendering on the documentation portal. As the loader and the rest of the portal infrastructure was developed by the support tools team I can’t give any insight there except to say that cross-references and image links presented a bit of a challenge.

The XSLT (ditahtml2json.xsl) and a sample JSON (hierarchy.json) generated from the DITA-OT hierarchy.ditamap are available from GitHub. More background is available in the slides from the IDW presentation.

Leave a Reply

Your email address will not be published. Required fields are marked *


*