Starting from DITA source, there is a series of two transformation:
- HTML2 from DITA4Publishers, which flattens the directory structure.
- A custom XSLT that reads the resulting index and creates nested structures representing the document.
Each topic in the map becomes a “document” element in the JSON that is made up of the following pieces:
Field | Source |
---|---|
Title | Topic title |
ID | Topic filename |
Unique key | Top-level document filename + topic filename |
Ancestors | List of ancestor topics at all levels |
Summary* | Topic shortdesc |
Body | Topic body |
HREF | Topic path + topic filename |
Documents* | List of sub-documents |
The JSON created in stage 2 is loaded into MongoDB for rendering on the documentation portal. As the loader and the rest of the portal infrastructure was developed by the support tools team I can’t give any insight there except to say that cross-references and image links presented a bit of a challenge.
The XSLT (ditahtml2json.xsl) and a sample JSON (hierarchy.json) generated from the DITA-OT hierarchy.ditamap are available from GitHub. More background is available in the slides from the IDW presentation.