Word Conversion Tool

I initially drafted this post about a Word conversion tool two years ago, and for some forgotten reason, never published it. Eventually the project was scrapped, but we did use the tool for quite a while. I’ve gone back and made the article past-tense, and added some reflection. I think there’s still some good stuff in here…

Word. It ain’t going away anytime soon. As a DITA–and more generally, as a structured authoring–evangelist, I’ve long loathed the wild West approach to documentation Word engenders. Okay sure, it is easy to use. It doesn’t really require any training. I suppose, yes, one could argue that writing in Word lets authors focus on the content rather than on “tagging.” (I think that argument is faulty, as I’m sure you do, but we won’t go into that here.) The point is, if Word is going to be here awhile, what might we do about it?


QA Plugin Use Case: Learning Engagement

I thought it would be useful to share a use case for the QA Plugin from the Education team at Citrix. In addition to the metrics in the open-source code, we’ve added a number of our own used to measure the quality of instructional design in our courses. For example, we calculate what we call an “engagement ratio”, which is the ratio of words to interactions. We find a good target is 250 words to each interaction. The ratio gives us a single metric that tells us, at least directionally, whether the course will offer a sound experience for the student.

Of course, if the content uses a lot of “click to see more text” interactions, then a low ratio may be misleading. That’s why we also total up the number of each interaction type. Showing these two metrics together gives us a solid understanding of the variety and frequency of interaction in a course.
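
As a rough illustration of the two metrics described above, here is a minimal Python sketch. The function names, the interaction type labels, and the sample numbers are all hypothetical; only the 250-words-per-interaction target comes from the post. This is not how the QA plugin itself is implemented.

```python
from collections import Counter

# Target from the post: roughly 250 words per interaction.
TARGET_WORDS_PER_INTERACTION = 250


def engagement_ratio(word_count, interactions):
    """Return words per interaction, or None when there are no interactions."""
    if not interactions:
        return None
    return word_count / len(interactions)


def interaction_totals(interactions):
    """Count how many times each interaction type occurs."""
    return Counter(interactions)


# Hypothetical course: 3,000 words and 12 interactions of three made-up types.
sample_interactions = ["click-to-reveal"] * 8 + ["quiz"] * 3 + ["simulation"]
ratio = engagement_ratio(3000, sample_interactions)
totals = interaction_totals(sample_interactions)

print(f"Engagement ratio: {ratio:.0f} words per interaction")
print(f"Target: {TARGET_WORDS_PER_INTERACTION} words per interaction")
for kind, count in totals.most_common():
    print(f"  {kind}: {count}")
```

In this made-up course the ratio lands on the target, yet eight of the twelve interactions are the same type, which is the situation the per-type totals are meant to expose.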

In addition, we are able to calculate reading time vs other activities, like videos, labs, and simulations, as well as an estimated total course length. Therefore, we have language metrics telling us about terminology and style use, interaction metrics telling us about variety and frequency, and timing metrics about various activity types. Those metrics combined give us an accurate picture of how engaging a course will be, without having to read a single page.

But, you know, you should still read the course. 🙂 But with the QA plugin, you know where to focus, what issues you are likely to encounter, and how much work you are likely to need in order to get the course ready for release.

If you have a use case for the QA plugin, please let us know! We’d be more than happy to feature it here on ditanauts.

STC Webinar: Simplify DITA Authoring with Constraints

On Tuesday, June 19th at 9pm EDT, I’ll be presenting an STC webinar about constraints, specifically on how to download, install, and customize the ditanauts constraints example plugin. Hope to see you there!

To register, go to: http://www.stc.org/education/online-education/live-seminars/item/simplify-dita-authoring-with-constraints?category_id=53

After the webinar, please feel free to post comments and questions here.


Yesterday we got over 30 comments on a variety of posts, and I was stoked! The comments seemed intelligent at first blush, and didn’t contain any links… so I didn’t think they were spam. That is, until I started trying to match them up with the subject of the posts. For example, on the post about Automating Tasks in a CMIS Repo, which discusses Python, there was a comment that discussed a particular Python API… only, that API did not have anything to do with the post and did not follow logically from the post it was replying to.

There are others that relate particular problems with XSLT… but don’t really ask a question or have anything to do with DITA. Like this one,

"hi Mukul,i have a problem. Can you 
plesae provide me a good suggestion.the expalnation for the problem
is as follows: i have a xslt code which transforms a xml to xsd.
i want to throw an error as the output when i execute the xslt if
the schema generated by the xslt is not valid. So this should stop
the schema generation also. so the output should only be an error
message without the generation of the schemas"

That almost makes sense, except that there are no approved comments by “Mukul”. Or this one,

If you’re pursuing the beenift of XML to get the separation between
form and content, why do you want to reintroduce the requirement to
do output by hand?

OK, fair question… only the original comment was about a Facebook like button. So bizarre.

At any rate, I just deleted a ton of comments from the queue–my sincere apologies if any of them were legit. I’m pretty sure the multi-page poem in Japanese, along with the English translation, was not legit. But still… why?

Review: DITA for Practitioners, Volume 1: Architecture and Technology

Eliot Kimber has done a great job of compiling relevant, actionable guidelines and practices in the first volume of DITA for Practitioners. I fall into the “those with prior DITA experience” category. As a self-taught DITA (and XML, for that matter) user, I found a lot here that filled in the gaps in my knowledge. (Especially helpful was the section on essential terminology.) While I skimmed over some of the basic info in Chapters 2 and 3, new users will find a thorough explanation of how to get up and running, writing and producing output with DITA and the Open Toolkit.

In later chapters, Eliot goes into how to install, run, and make basic customizations to the toolkit. Even though I’ve created lots of plugins, I’m certain I’ll come back to the sections where he explains extension points and best practices for creating Ant targets. Part 2 builds on the foundation set in Part 1, layering in complexities like specialization, compound maps, vocabularies, reuse, and more. (I’m still trying to wrap my brain around Chapter 8 on linking and addressing.)

In short, I wish I’d had this book when I started out implementing DITA four years ago. I’m certainly glad I have it now.

Impressions from DITA NA 2012

DITA NA 2012 has come and gone. This year the conference boasted a record 318 attendees. They added an emerging technologies track. It was held in beautiful San Diego.

I’ve presented at this conference for the last several years, and my impression this year is that the level of discourse has noticeably risen. There were more topics that were more technical. Presenters discussed more best practices, more concrete experiences, and more practical advice than in years past, when many discussions were more or less theoretical. Instead of “this is what will/might/should happen,” I heard more of what did happen and is happening. It sounds cliche and, yes, self-serving to say it, but this is an exciting time to be “in” DITA.

For me, these were the highlights of the conference:

  • Steve Anderson mentioned the QA plugin in his presentation “Automation and testing DITA OT Content and Customizations”! I was totally stoked.
  • George Bina showed us his RelaxNG plugin, which reproduces the DITA DTDs in an easier-to-manipulate format. By combining RelaxNG with Schematron, you can deeply customize your authoring experience, both constraining elements and attributes and providing in-line guidelines to authors.
  • Michael Boses also discussed the awesomeness of Schematron. Ok, fine, I’m convinced. We’ll rewrite the QA plugin in Schematron.
  • Bryan Schnabel showed his XLIFF round trip plugin, which converts DITA to and from XLIFF. I just converted a document to XLIFF. It was glorious. I am going to be all over this one.
  • Mat Verghese from Citrix discussed a detailed and solid vision for raising the value and esteem of Content Strategists.
  • Keith Schengili-Roberts, in his keynote, gave me some great ideas for additions to the QA plugin, like calculating Flesch-Kincaid reading scale values. More to come on that front.
  • Mark Baker discussed the use cases for his SPFE architecture, which is a different solution to many of the problems DITA implementors face. I particularly liked the idea of automagically creating links based on string matches. It’d be cool if the QA plugin’s link report could suggest new links based on the content….hmmm…
  • I learned you can directly style XML with CSS! Who knew? There must be some great applications for this.
  • And of course, Eliot Kimber released his new book on implementing DITA. I’ll be posting a review sooner rather than later.

All told, a great conference.

QR Codes in DITA Output

Inspired by a thread started by Sean Healy, and building on the instructions posted by Kevin Brown, I added the ability to generate and insert QR Codes into PDF output to the mypdf plugin.

I ignore QR Codes in marketing, but I think they could be a great way to link to resources, such as videos, from printed technical documents. Readers can simply zap the codes with their phones to pull up the content.


Example Constraints Plugin

One of the most important features, in my opinion, of DITA 1.2 is the constraints mechanism. In short, constraints let you reduce the elements and attributes available to your authors. You can also specify when elements/attributes are required, and which tag structures are legal (and, therefore, which are illegal). Eliot Kimber wrote a great tutorial on how to set up constraints, but if you’d like an example plugin, you can download the one I’ve created from SourceForge.


Improving readability of DOT build log

I ran across a post on the Yahoo DITA Users’ group today about improving the readability of the toolkit’s error log. It is surprisingly easy to do; in fact, I’m not sure why this isn’t the default behavior. At any rate, if you run the toolkit from the command line via Ant, you can specify that the log be created in XML instead of plain text output to the command line window. Then, you can style it with XSL. I’ve created a stylesheet that makes the log much easier for a novice to read.
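
To give a feel for what an XML log makes possible, here is a minimal sketch that pulls only the warnings and errors out of such a log. It uses Python rather than the XSL stylesheet the post refers to, purely for illustration. It assumes the log follows the build/target/task/message layout written by Ant's built-in XmlLogger, with a priority attribute on each message. The sample log, its target and task names, and the collect_problems helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical log excerpt, assuming the build/target/task/message layout
# written by Ant's XmlLogger; a real DITA OT log is far longer.
SAMPLE_LOG = """<build time="12 seconds">
  <target name="preprocess" time="3 seconds">
    <task name="gen-list" time="1 second">
      <message priority="info"><![CDATA[Processing sample.ditamap]]></message>
      <message priority="warn"><![CDATA[Example warning: referenced file not found]]></message>
    </task>
  </target>
  <target name="transform" time="9 seconds">
    <task name="xslt" time="9 seconds">
      <message priority="error"><![CDATA[Example error: transformation failed]]></message>
    </task>
  </target>
</build>"""


def collect_problems(log_xml, priorities=("warn", "error")):
    """Return (priority, target, task, text) tuples for the selected priorities.

    Only tasks directly under a target, and messages directly under a task,
    are inspected; nested tasks are ignored to keep the sketch short.
    """
    root = ET.fromstring(log_xml)
    problems = []
    for target in root.findall("target"):
        for task in target.findall("task"):
            for message in task.findall("message"):
                if message.get("priority") in priorities:
                    problems.append((
                        message.get("priority"),
                        target.get("name"),
                        task.get("name"),
                        (message.text or "").strip(),
                    ))
    return problems


problems = collect_problems(SAMPLE_LOG)
for priority, target, task, text in problems:
    print(f"[{priority.upper()}] {target}/{task}: {text}")
```

Filtering by priority is the same idea a stylesheet would apply: hide the informational noise and show a novice only the lines that need attention, along with the target and task they came from.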
