Number of steps is of course a good place to start, but it doesn’t tell the whole story. Different types of notes, environment-specific information that the user has to type, commands, and the number of different interfaces required all play a part. Most of these I can map to DITA elements, even if imperfectly. The goal is just to get some rough measures for engineering prioritization.
The overall flow is straightforward:
- Define the metrics as key-value pairs, where the key is the metric name and the value is an XPath expression.
- Iterate over each task topic in the repository (as described in my earlier post) and count the number of occurrences of each XPath match.
- Output the topic title, aggregate score, and individual metric scores as CSV to import into Google Docs.
    # Metric name -> XPath expression whose matches are counted
    $metrics = @{
        'Steps'                    = '//step[not(substeps)] | //substep'
        'root commands'            = '//step//codeblock[starts-with(text(), "#")] | //step//codeblock[contains(text(), "sudo")]'
        'Non-root commands'        = '//step//codeblock[starts-with(text(), "$")]'
        'GUI screens/menus'        = '//uicontrol'
        'Notes'                    = '//note'
        'User-supplied parameters' = '//varname'
    }
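    # Count every metric for one topic ($fileContent is the parsed topic, $fileRel its relative path)
    $title = $fileContent.SelectSingleNode("//task/title").get_InnerText()
    $key = "$title ($fileRel)"
    $fileDB = @{}   # fresh table per topic (initialization assumed; it is not shown in the snippet)
    $score = 0
    $metrics.GetEnumerator() | Foreach-Object {
        $xpath = $_.Value
        $metricName = ($_.Key | Out-String).Trim()
        $metricValue = $fileContent.SelectNodes($xpath).Count
        $score += $metricValue
        $fileDB.Add($metricName, $metricValue)
    }
    $metricDB.Add($key, $fileDB)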
    # Write one CSV row per topic: title, total, then each metric in alphabetical order
    $metricDB.GetEnumerator() | ForEach-Object {
        $key = ($_.Key | Out-String).Trim()
        $val = $_.Value
        $total = 0   # reset per topic so totals do not accumulate across rows
        $metricValues = $val.GetEnumerator() | Sort-Object Name | Foreach-Object {
            $value = ($_.Value | Out-String).Trim()
            $total += [int]$value
            $value
        }
        $metricValues = [string]::join(",", $metricValues)
        write-output "$key,$total,$metricValues"
    }
As you can see, the score for each topic is simply the sum of all the other metrics.
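To make the scoring concrete, here is a minimal, self-contained sketch that runs a subset of the metrics against a single inline topic. The sample topic, its title, and the variable names `$topic` and `$counts` are invented for illustration; only the XPath expressions come from the post.

```powershell
# Hypothetical sample topic: DITA element names, invented content.
[xml]$topic = @'
<task id="sample">
  <title>To Restart a Service</title>
  <taskbody>
    <steps>
      <step><cmd>Log on to the host.</cmd></step>
      <step>
        <cmd>Restart the service.</cmd>
        <info><codeblock># service example restart</codeblock></info>
      </step>
      <step>
        <cmd>Click <uicontrol>Refresh</uicontrol>.</cmd>
        <info><note>The status can take a minute to update.</note></info>
      </step>
    </steps>
  </taskbody>
</task>
'@

# A subset of the metrics; [ordered] keeps the columns in a predictable order.
$metrics = [ordered]@{
    'Steps'             = '//step[not(substeps)] | //substep'
    'root commands'     = '//step//codeblock[starts-with(text(), "#")]'
    'GUI screens/menus' = '//uicontrol'
    'Notes'             = '//note'
}

# Count the matches for each XPath expression and sum them into the score.
$counts = [ordered]@{}
$score = 0
foreach ($entry in $metrics.GetEnumerator()) {
    $count = $topic.SelectNodes($entry.Value).Count
    $counts[$entry.Key] = $count
    $score += $count
}

# Emit one CSV row: title, total, then the individual metric counts.
$title = $topic.SelectSingleNode('//task/title').InnerText
"$title,$score,$(@($counts.Values) -join ',')"
```

The sample topic has three steps, one root command, one `uicontrol`, and one note, so its total is 6.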
Sample output:
Title,Total,Branches,Cautions,Choice points,Dangers,GUI screens/menus,Interface switches,nCLI commands,Non-root commands,Notes,root commands,Steps,Typed text,User-supplied parameters,Warnings
To Configure a Host IP Address (ip_config\t_reconfigure_a_host_ip_address.dita),39,0,0,0,0,26,0,0,0,0,0,13,0,0,0
Then I uploaded the resulting CSV to Google Docs, sorted by the “Total” column, and let the engineers take a look.
It was clear that certain procedures were unusually complex, which gave us areas to focus on. In a few cases, the engineers explained how a procedure could be written more simply, and I was glad to make those changes. In others, they saw that the procedure needed engineering work. Now, when the next release comes out, I can calculate the complexity again and demonstrate that the procedures have become simpler.
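One way to do that release-to-release comparison is to join two CSV exports on the title and look at the change in the Total column. The sketch below is only an illustration: the inline data stands in for two exported files, the second title and every number except the 39 from the sample output are invented, and the variable names are hypothetical.

```powershell
# Hypothetical exports from two releases (inline instead of Import-Csv for brevity).
$previous = @'
Title,Total
To Configure a Host IP Address,39
To Add a Node,12
'@ | ConvertFrom-Csv

$current = @'
Title,Total
To Configure a Host IP Address,21
To Add a Node,12
'@ | ConvertFrom-Csv

# Index the previous totals by title for quick lookup.
$previousTotals = @{}
foreach ($row in $previous) {
    $previousTotals[$row.Title] = [int]$row.Total
}

# For each topic present in both releases, compute the change in total score.
$changes = foreach ($row in $current) {
    if ($previousTotals.ContainsKey($row.Title)) {
        [pscustomobject]@{
            Title    = $row.Title
            Previous = $previousTotals[$row.Title]
            Current  = [int]$row.Total
            Change   = [int]$row.Total - $previousTotals[$row.Title]
        }
    }
}

# Most-improved topics first (a negative change means the procedure got simpler).
$changes | Sort-Object Change | Format-Table -AutoSize
```

Matching on the title alone breaks if a topic is renamed between releases, so matching on the file path may be more robust.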
That’s really cool. Could you run it against the OT’s userguide bookmap and see what kind of results you get? It would be great to see a formatted example.
Would also be fun to add this to the QA report.
I think the technique is generalizable, but what “complexity” means will differ from situation to situation. Although steps/substeps will almost always be relevant and uicontrol would be relevant for software products, those in themselves wouldn’t be all that interesting. The other metrics I chose are specifically related to a Linux-based virtualization system. Number of interfaces is a pretty interesting metric from a usability standpoint, but the expressions that count interfaces wouldn’t make sense for anyone but me.
Conditional steps would also be a good general metric if there were some reliable way of testing for it. I think that conditional steps should have their own element, and not just a regular step/cmd that starts with “If”.
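Until such an element exists, a rough stand-in is to count the `cmd` elements whose text begins with “If”. The sketch below assumes that wording convention, so it is a heuristic and not a reliable test: it misses conditions phrased differently (“When…”, “For hosts that…”). The sample topic and the `$conditionalXPath` name are invented for illustration.

```powershell
# Hypothetical sample topic: two of the four steps are phrased as conditions.
[xml]$topic = @'
<task id="sample">
  <title>To Mount a Volume</title>
  <taskbody>
    <steps>
      <step><cmd>Log on to the host.</cmd></step>
      <step><cmd>If the volume is not listed, rescan the storage.</cmd></step>
      <step><cmd>  If prompted, enter the admin password.</cmd></step>
      <step><cmd>Verify that the volume is mounted.</cmd></step>
    </steps>
  </taskbody>
</task>
'@

# normalize-space() trims leading whitespace before the starts-with() check.
$conditionalXPath = '//step/cmd[starts-with(normalize-space(.), "If ")] | ' +
                    '//substep/cmd[starts-with(normalize-space(.), "If ")]'

$conditionalSteps = $topic.SelectNodes($conditionalXPath).Count
"Conditional steps: $conditionalSteps"
```

An expression like this could be added to the metrics table as one more key-value pair.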
