Flattening HTML output

My DITA repository has a number of subdirectories to keep maps and topics organized. This strategy is convenient, but it can be a drawback when I need to further process HTML output, as I had to do for a recent publishing project. The HTML output type in the DITA Open Toolkit retains the organization of the source files, so that every processing task turned into file tree navigation with tools that aren’t suited for it.

The DITA For Publishers HTML2 plugin provides a mechanism for flattening the output: the html2.file.organization.strategy Ant parameter. To make flattening a viable approach, there needs to be a provision for avoiding collisions. For example, say you have two directories, indir1 and indir2, each of which contain a topic file topic1.dita. The single-directory output, then, can’t be outdir/topic1.html because there are two topic1 files.

The plugin deals with this requirement by appending a string, created by generate-id(), to the file name. So indir1/topic1.dita would become outdir/topic1_d97.html and indir2/topic1.dita would become outdir/topic1_d84.html. The exact expression (in the get-result-topic-base-name-single-dir template) is

concat(relpath:getNamePart($topicUri), '_', generate-id(.))

While that’s a reasonable approach, it’s not the one that I want to use because the filenames ultimately get exposed to my customers. Since I don’t know the algorithm for generating the unique ID, it’s not deterministic enough, and a bookmarked link might become invalidated without my knowing. Instead, I’d like to prepend the parent directory name, so I modified the expression to this:

concat(relpath:getNamePart(relpath:getParent($topicUri)), '_', relpath:getNamePart($topicUri))

The result for the two example files then would be outdir/indir1_topic1.html and outdir/indir2_topic1.html. This approach has the added advantage that the output doesn’t lose information about its location in the source.

Leave a Reply

Your email address will not be published. Required fields are marked *