Chunking vs. nesting

Over at the blog I’d Rather Be Writing, there is some concern about navigability when a content set has too many small files. The solution discussed there is to use @chunk="to-content", but that’s not the only option.

Although it’s common practice to have one topic per file, it’s not required. If you are not going to reuse topics in multiple contexts, you might prefer to keep multiple topics in the same file.

For example:

<!DOCTYPE dita PUBLIC "-//OASIS//DTD DITA Composite//EN" "ditabase.dtd">
<dita>
  <concept id="concept_bm1_q4m_yj">
    <title>About widgets</title>
    <conbody>
      <p>Widgets are very useful.</p>
    </conbody>
    <task id="task_bm1_q4m_yj">
      <title>Create a widget</title>
      <taskbody>
        <steps> ... </steps>
      </taskbody>
    </task>
    <task id="task_bm1_q4m_yk">
      <title>Drag a widget in place</title>
      <taskbody>
        <steps> ... </steps>
      </taskbody>
    </task>
  </concept>
</dita>
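
To illustrate how the nesting itself carries the hierarchy, here is a minimal Python sketch that walks a copy of the file above and lists each topic with its depth. The function name outline and the TOPIC_TAGS set are hypothetical, and matching on element names (rather than on @class, which is only present once the DTD defaults are applied) is a simplifying assumption.

```python
import xml.etree.ElementTree as ET

# Copy of the example above, without the DOCTYPE so no DTD is needed.
SAMPLE = """<dita>
  <concept id="concept_bm1_q4m_yj">
    <title>About widgets</title>
    <conbody>
      <p>Widgets are very useful.</p>
    </conbody>
    <task id="task_bm1_q4m_yj">
      <title>Create a widget</title>
      <taskbody>
        <steps> ... </steps>
      </taskbody>
    </task>
    <task id="task_bm1_q4m_yk">
      <title>Drag a widget in place</title>
      <taskbody>
        <steps> ... </steps>
      </taskbody>
    </task>
  </concept>
</dita>"""

# Assumption: only the base topic types are of interest here.
TOPIC_TAGS = {"topic", "concept", "task", "reference"}


def outline(element, depth=0):
    """Return (depth, id, title) for every topic nested under element."""
    rows = []
    for child in element:
        if child.tag in TOPIC_TAGS:
            rows.append((depth, child.get("id"), child.findtext("title")))
            # Recurse so that topics nested inside this one are indented.
            rows.extend(outline(child, depth + 1))
    return rows


for depth, topic_id, title in outline(ET.fromstring(SAMPLE)):
    print("  " * depth + f"{title} ({topic_id})")
```

The two tasks come out one level below the concept, which is the same parent/child relationship a map would otherwise have to express with nested topicrefs.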

The question of relying primarily on the TOC is an interesting one, discussed by both Jonatan Lundin and Mark Baker.

DITA to PPT mapping

In a discussion about my previous post on the LinkedIn DITA Awareness group, the question of mapping DITA elements to PPT was raised. Because the answer is too long to comfortably fit in the discussion thread I’ll answer it here. There are two aspects: conditions and element selection/mapping.

For the conditions, the normal Open Toolkit filtering with ditaval is used.

  • @otherprops="slide" is explicitly included in the PPT and implicitly included in PDF.
  • @audience="instructor" is explicitly included in the PPT and explicitly excluded from PDF.
  • @otherprops="noslide" is explicitly excluded from PPT and implicitly included in PDF.
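
The three rules above can be sketched in Python as two small rule tables, one per output. This is only my reading of the bullets, not the actual ditaval files: PPT_RULES, PDF_RULES, and is_included are hypothetical names, and "implicitly included" is modeled as "no rule, so the default action of include applies".

```python
# Hypothetical rule tables: (attribute, value) -> action.
PPT_RULES = {
    ("otherprops", "slide"): "include",
    ("audience", "instructor"): "include",
    ("otherprops", "noslide"): "exclude",
}
PDF_RULES = {
    ("audience", "instructor"): "exclude",
    # "slide" and "noslide" have no rule here, so they are included by default.
}


def is_included(attrs, rules):
    """Return True if an element with these attributes survives filtering.

    Simplified model: an element is dropped when every value in one of
    its attributes is set to "exclude"; values without a rule default
    to "include".
    """
    for name, raw in attrs.items():
        values = raw.split()
        if values and all(
            rules.get((name, value), "include") == "exclude" for value in values
        ):
            return False
    return True


for attrs in ({"otherprops": "slide"}, {"audience": "instructor"}, {"otherprops": "noslide"}):
    print(attrs, "PPT:", is_included(attrs, PPT_RULES), "PDF:", is_included(attrs, PDF_RULES))
```

Unmarked elements pass this filter for both outputs; whether they actually reach a slide is decided later by the element selection described next.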

The question of element selection and mapping is a bit more complicated. First, the PPT template has three bullet levels defined. The first is meant for unbulleted items, for example the lead-in to learning objectives. The second is meant for bulleted and numbered items, for example step and li elements. The third is meant for command and output references, for example codeblock and msgblock.

In the PPT utility, XPath expressions are mapped to one of these three paragraph styles or to an object.

XPath expression                             PPT mapping
$t//ph[$s]                                   P1 unbulleted
$t//p[$s]                                    P1 unbulleted
$t//*[$s]/li                                 P2 bulleted
$t//sl[$s]/sli                               P2 unbulleted
$t//dl[$s]/dlentry/dt                        P2 bulleted
$t//codeblock[$s]                            P3 fixed-width
$t//msgblock[$s]                             P3 fixed-width
$t//steps[$s]/step/cmd                       P2 numbered
$t//steps[$s]/step/info/codeblock            P3 fixed-width
$t//fig/image/@href                          image object
$t//image/@href                              image object
$t//note[contains(@audience,'instructor')]   note object
$t//lcObjectivesStem                         P1 unbulleted
$t//lcObjectivesGroup//lcObjective           P2 bulleted
$t//table[$s]                                table object
$t//simpletable[$s]                          table object

The variables $t and $s are also XPath expressions that I put into variables to keep the other expressions legible:

  • $t = "child::*[contains(@class, ' topic/body ')]"
  • $s = "contains(@otherprops, 'slide')"
Edit: correction on filtering logic.
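
To make the shorthand concrete, here is a small Python sketch that expands $t and $s into full XPath strings. It only illustrates the substitution; the utility itself may build the expressions differently, and the names VARIABLES and expand are hypothetical.

```python
from string import Template

# The two variables exactly as listed above.
VARIABLES = {
    "t": "child::*[contains(@class, ' topic/body ')]",
    "s": "contains(@otherprops, 'slide')",
}


def expand(expression):
    """Replace $t and $s in a mapping expression with their full XPath text."""
    return Template(expression).substitute(VARIABLES)


for short in ("$t//p[$s]", "$t//*[$s]/li", "$t//steps[$s]/step/cmd"):
    print(short, "->", expand(short))
```

Because contains() is a substring test, the $s predicate would also match otherprops="noslide"; presumably the ditaval filtering has already removed those elements before the expressions run.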

PowerPoint Output in an ant Task

For several years now I have been working on utilities to convert DITA to PowerPoint, as I have discussed elsewhere. The original implementation was as a VBA macro stored inside a PPT template. I won’t go on at length about all the limitations that VBA has; suffice it to say that it’s verbose and not very robust. My colleagues in the training group requested some enhancements and I couldn’t bear the thought of spending much time in VBA.

So I decided to rewrite it in PowerShell. Other posts in Ditanauts have detailed working with XML files in PowerShell, which I have found to work pretty well. In addition, PowerShell has much better capabilities for exception handling, and it can manipulate PPT objects.

Configuring Fonts for the Open Toolkit with Apache FOP

While I can’t justify the cost of an expert stylesheet developer or a fancy PDF renderer, I don’t want my PDFs to look like garbage either. Jarno Elovirta’s PDF plugin generator is a great place to start for PDF customization, and Apache FOP (bundled with the Open Toolkit) is my only option for a PDF renderer. Once I had done a lot of the work in a PDF plugin, I got to the point where I wanted to change fonts. Overuse aside, there’s nothing too offensive about Times New Roman + Arial + Courier, but that font set doesn’t conform to any branding guidelines that have been given any thought.

Should be pretty easy to do what I wanted, right?

Reporting on your repository with PowerShell, part 2

A couple of months ago, some developers and support engineers were looking over some documentation and said to me, “These procedures are too complicated!” To which I said, “I know! I made them as simple as possible, but I can do only so much within the constraints of the interface.” Then the engineers asked me an astounding question: “Can you give us some complexity measure on each procedure so we know where to start making things simpler?” Because I knew how to use PowerShell to get information out of my set of DITA topics, I calmly said, “Let me look into it” while inside I was bursting with excitement.

STC Webinar: Simplify DITA Authoring with Constraints

On Tuesday, June 19th at 9pm EDT, I’ll be presenting an STC webinar about constraints, specifically on how to download, install, and customize the ditanauts constraints example plugin. Hope to see you there!

To register, go to: http://www.stc.org/education/online-education/live-seminars/item/simplify-dita-authoring-with-constraints?category_id=53

After the webinar, please feel free to post comments and questions here.

Spambots!

Yesterday we got over 30 comments on a variety of posts, and I was stoked! The comments seemed intelligent at first blush, and didn’t contain any links… so I didn’t think they were spam. That is, until I started trying to match them up with the subjects of the posts. For example, on the post about Automating Tasks in a CMIS Repo, which discusses Python, there was a comment about a particular Python API… only, that API had nothing to do with the post, and the comment did not follow logically from the one it was replying to.

There are others that describe particular problems with XSLT… but don’t really ask a question or have anything to do with DITA. Like this one,

"hi Mukul,i have a problem. Can you 
plesae provide me a good suggestion.the expalnation for the problem
is as follows: i have a xslt code which transforms a xml to xsd.
i want to throw an error as the output when i execute the xslt if
the schema generated by the xslt is not valid. So this should stop
the schema generation also. so the output should only be an error
message without the generation of the schemas"

That almost makes sense, except that there are no approved comments by “Mukul”. Or this one,

If you’re pursuing the beenift of XML to get the separation between
form and content, why do you want to reintroduce the requirement to
do output by hand?

OK, fair question… only the original comment was about a Facebook like button. So bizarre.

At any rate, I just deleted a ton of comments from the queue–my sincere apologies if any of them were legit. I’m pretty sure the multi-page poem in Japanese, along with the English translation, was not legit. But still… why?

Review: DITA for Practitioners, Volume 1: Architecture and Technology

Eliot Kimber has done a great job of compiling relevant, actionable guidelines and practices in the first volume of DITA for Practitioners. I fall into the “those with prior DITA experience” category. As a self-taught DITA (and XML, for that matter) user, I found a lot here that filled in the gaps in my knowledge. (Especially helpful was the section on essential terminology.) While I skimmed over some of the basic info in Chapters 2 and 3, new users will find a thorough explanation of how to get up and running, writing and producing output with DITA and the Open Toolkit.

In later chapters, Eliot goes into how to install, run, and make basic customizations to the toolkit. Even though I’ve created lots of plugins, I’m certain I’ll come back to the sections where he explains extension points and best practices for creating ant targets. Part 2 builds on the foundation set in part 1, layering in complexities like specialization, compound maps, vocabularies, reuse, and more. (I’m still trying to wrap my brain around Chapter 8 on linking and addressing.)

In short, I wish I’d had this book when I started out implementing DITA four years ago. I’m certainly glad I have it now.

Python, LXML, and setting xml:lang

Having trouble figuring out how to script the @xml:lang attribute?

For a DITA document that contains a single language, the highest-level element that contains content (for example map, concept, or task) should set the @xml:lang attribute to the language that applies to the document.

The question: How do I set the @xml:lang attribute using lxml? Everyone on the forums seems confused.
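
One common way to do this, sketched below under a few assumptions, is to address the attribute by its namespace-qualified name in Clark notation ({namespace-uri}local-name). The sketch uses the standard library’s xml.etree.ElementTree so that it runs without extra packages; lxml.etree accepts the same qualified name in set() and get(). The helper name set_language and the sample concept are hypothetical, and this is not necessarily the approach the full post takes.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is permanently bound to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def set_language(root, language):
    """Set @xml:lang on the given (highest-level) element and return it."""
    root.set(XML_LANG, language)
    return root


# Hypothetical single-language topic used only for illustration.
concept = ET.fromstring('<concept id="c1"><title>About widgets</title></concept>')
set_language(concept, "en-US")

serialized = ET.tostring(concept, encoding="unicode")
print(serialized)
```

Using a plain attribute name such as "xml:lang" in set() is the usual source of trouble, because the colon is treated as part of a prefixed name rather than as a literal character; the qualified name avoids that.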