"DITAworks is an Eclipse based solution which is built on DITA architecture and supports collaborative modeling, maintaining, publishing of complex documentation arrays."

*instinctools GmbH

Common errors in DITA specializations

DITA, as an information architecture, can be seen as a set of best practices for topic-oriented authoring. Besides providing a set of DTDs and authoring rules, DITA also defines a way to design new information structures with custom semantics. This is achieved through the process of specialization.

Specialization opens a new dimension for customizing DITA to the needs of different enterprises. But creating a valid specialization is not a trivial process. It requires a rather deep understanding of DITA internals and a command of technologies such as DTD and XML Schema.

But even with sufficient knowledge of DTD (or XML Schema) coding, creating a valid specialization requires knowledge of certain DITA principles and rules. Designing a valid DTD does not necessarily mean that you have a valid DITA model. More on specialization can be found here: http://docs.oasis-open.org/dita/v1.0/archspec/ditaspecialization.html

This article discusses some common problems and pitfalls people face when they create DITA 1.1 specializations.

Problem 1: Changing the original DITA model

One of the most common and simplest mistakes people make with DITA is to change the original DITA DTDs directly in the DITA source files.

From the point of view of DTD syntax or XML editing software, they do nothing wrong: the editor parses the DTD without errors, and the XML can be edited and validated. But the real problem is that the resulting DTD, and the content authored according to it, is not DITA any more. The content takes on a new custom format that has nothing to do with DITA.

Solution

Use DITA specialization as the only valid way of extending DITA models. This means that all extensions are kept in separate files and no changes are made to the original DITA DTDs or XML Schemas.

Problem 2: Using the same infotype or element names as in base DITA

The next problem we find in specialized models relates to the naming of newly defined topic types (infotypes) or elements. To be more precise, naming becomes a problem when you give a new element or infotype a name that is already used in the base DITA model.

Because of the way DITA specialization works technically, new infotypes or elements cannot reuse names that are already defined in the base DITA model or in other models being used. Doing so results in a conflict during name resolution and can lead to unpredictable results during DTD parsing and transformation.

Solution

Give unique names to your new infotypes and elements. A good strategy is to choose a prefix for the names of your specialized elements. This way you will most likely avoid naming conflicts with other models, and you can easily distinguish standard DITA elements from specialized ones later on, when the content is being authored.
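The naming check and the prefix strategy can be sketched in a few lines of Python. This is only an illustration: the set of base names is a small hand-picked subset, and the `acme` prefix and the function names are our own assumptions. A real check would collect every name declared in the DTDs actually in use.

```python
# Hypothetical subset of element names declared by base DITA; a real check
# would gather all names from the DTD modules that the shell DTD includes.
BASE_NAMES = {"topic", "title", "body", "section", "p", "ph",
              "concept", "task", "reference"}


def find_name_conflicts(new_names, existing_names=BASE_NAMES):
    """Return the proposed names that are already declared elsewhere."""
    return sorted(set(new_names) & set(existing_names))


def with_prefix(prefix, names):
    """Apply a project prefix to each proposed element name."""
    return [f"{prefix}-{name}" for name in names]


proposed = ["faq", "title", "section"]
print(find_name_conflicts(proposed))                       # ['section', 'title']
print(find_name_conflicts(with_prefix("acme", proposed)))  # []
```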

Problem 3: Structural elements without a parent

People are often tempted to define new structural elements in DITA that have no reference to parent DITA elements. This is not valid according to the DITA specialization rules.

An important part of DITA is the concept of generalization. According to it, any specialized model (infotype) can be automatically translated into a valid infotype from the base DITA model. This makes data interchange between organizations easier, even if they use their own specialized models.

But benefits rarely come without costs. Here, generalization comes at the cost of several rules that a valid DITA specialization needs to follow, for example:

  1. Every XML element in a specialized model has to have a “parent” element reference. This reference shows which element from the base DITA (or another) model was used as the “base” for the specialization. During generalization, this parent element is used as the base representation of the specialized element.
  2. The content model of a specialized element has to be equal to or more restrictive than that of its parent element.

The problem discussed in this section is a violation of rule 1. Elements created this way are “floating”, with no connection to the base DITA model, and generalization is not possible for them. Transforming these elements with the Open Toolkit leads to unpredictable results (in most cases, the Open Toolkit highlights such elements with a yellow background, signaling that it could not find a proper transformation style sheet for the element).

Solution

Select a suitable parent element from DITA or from another valid DITA specialization, and define the parent reference using the class attribute.
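To illustrate why the parent reference matters, here is a minimal Python sketch of generalization driven by the class attribute. It is not how the Open Toolkit or DITAworks implement it. The `acme-faq` module and its element names are hypothetical, and the class values are written out in the instance for readability, whereas in practice they are normally defaulted by the DTD.

```python
import xml.etree.ElementTree as ET


def parse_class(value):
    """Split a class value such as '- topic/p acme-faq/acme-answer ' into
    (marker, [(module, element), ...]), ordered from base to most specialized."""
    tokens = value.split()
    if not tokens or tokens[0] not in ("-", "+"):
        raise ValueError(f"not a DITA class value: {value!r}")
    pairs = []
    for token in tokens[1:]:
        module, slash, name = token.partition("/")
        if not (module and slash and name):
            raise ValueError(f"malformed class token: {token!r}")
        pairs.append((module, name))
    return tokens[0], pairs


def generalize(root, target_module="topic"):
    """Rename every element, in place, to its ancestor in target_module."""
    for node in root.iter():
        value = node.get("class")
        if value is None:
            # A "floating" element: no parent reference, so nothing to fall back to.
            raise ValueError(f"<{node.tag}> has no class attribute")
        _, pairs = parse_class(value)
        ancestors = [name for module, name in pairs if module == target_module]
        if not ancestors:
            raise ValueError(f"<{node.tag}> has no ancestor in {target_module!r}")
        node.tag = ancestors[0]  # the class attribute itself is kept unchanged
    return root


# Hypothetical specialized topic; class values are spelled out for illustration.
SPECIALIZED = """
<acme-faq class="- topic/topic acme-faq/acme-faq ">
  <title class="- topic/title ">Returns</title>
  <acme-faqbody class="- topic/body acme-faq/acme-faqbody ">
    <acme-answer class="- topic/p acme-faq/acme-answer ">Within 30 days.</acme-answer>
  </acme-faqbody>
</acme-faq>
"""

generalized = generalize(ET.fromstring(SPECIALIZED))
print([node.tag for node in generalized.iter()])  # ['topic', 'title', 'body', 'p']
```

An element without a class attribute makes `generalize` raise an error, which corresponds to the “floating” elements described above.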

Problem 4: Wrong element content model

Very often, people violate rule 2 from the previous section (see Problem 3 above): the content model of a specialized element has to be equal to or more restrictive than that of its parent element. In more detail, the content model of a specialized element has to satisfy the following conditions:

  • The content model of a specialized element should consist of elements from the parent content model or their specializations.
  • The multiplicity of elements in the specialized content model should be the same as or more restrictive than in the parent content model.
  • The sequence of elements is also important.

This means that the parent element has to be selected with care when doing structural specialization, because it substantially constrains the possible content model of your specialized element.

Violating this rule also makes DITA generalization impossible, and as a result the whole model can no longer be treated as a DITA model.

Although violations of this rule are very common, they are very hard to detect manually, because the content models of the parent and specialized elements need to be compared while taking into account the specialization of the elements involved. Special validation tools (see below) can be used for that.
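The kind of comparison such a tool performs can be sketched in Python for a deliberately simplified case: content models that are plain sequences, with at most one specialized item per parent item. Real DTD content models also contain choice groups and nesting, which this sketch does not handle. The parent model shown is a hypothetical, shortened version of a topic, and all names are our own assumptions.

```python
from typing import NamedTuple, Optional


class Item(NamedTuple):
    name: str
    min_occurs: int
    max_occurs: Optional[int]  # None means unbounded


def is_valid_restriction(parent, child, base_of):
    """Check a sequence-only child model against a sequence-only parent model.

    base_of maps each specialized element name to the name of its parent
    element; names missing from the map are treated as unspecialized.
    """
    pos = 0
    for item in child:
        base_name = base_of.get(item.name, item.name)
        # Skip parent items that the child leaves out; they must be optional.
        while pos < len(parent) and parent[pos].name != base_name:
            if parent[pos].min_occurs > 0:
                return False  # required item omitted, or wrong order
            pos += 1
        if pos == len(parent):
            return False  # element not allowed by the parent, or out of order
        slot = parent[pos]
        if item.min_occurs < slot.min_occurs:
            return False  # child is looser than the parent
        if slot.max_occurs is not None and (
            item.max_occurs is None or item.max_occurs > slot.max_occurs
        ):
            return False  # child allows more occurrences than the parent
        pos += 1
    # Any parent items left over must be optional.
    return all(slot.min_occurs == 0 for slot in parent[pos:])


# Hypothetical, shortened parent model: title, shortdesc?, body?
PARENT = [Item("title", 1, 1), Item("shortdesc", 0, 1), Item("body", 0, 1)]
BASE_OF = {"acme-faqbody": "body"}

ok = [Item("title", 1, 1), Item("acme-faqbody", 1, 1)]          # body made required
too_many = [Item("title", 1, 1), Item("acme-faqbody", 0, None)]  # body repeated
wrong_order = [Item("acme-faqbody", 1, 1), Item("title", 1, 1)]

print(is_valid_restriction(PARENT, ok, BASE_OF))           # True
print(is_valid_restriction(PARENT, too_many, BASE_OF))     # False
print(is_valid_restriction(PARENT, wrong_order, BASE_OF))  # False
```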

Solution

The solution to this problem is not always trivial and depends on what the specialized elements are required to do. There are basically three options:

  • Adjust the content model of the specialized element.
  • Choose a different parent element.
  • Combine both of the above. This is the most probable scenario.

Problem 5: Mixing domain and structural elements

Another problem that is very hard to detect manually is the mixing of structural and domain elements in your structural specializations.

Such a mix conflicts with the DITA concept of separate specialization of infotypes and domains, as the following quote explains:

«When you define new types of topics or domain elements, remember that the hierarchies for topic specialization and domain specialization must be distinct. A specialized topic cannot use a domain element in a content model. Similarly, a domain element can specialize only from an element in the base topic or in another domain. That is, a topic and domain cannot have dependencies. To combine topics and domains, use a shell DTD.»

The principal function of a domain is to extend the set of inline elements available within structural types. Domain elements cannot be used directly in the content models of specialized structural elements.

This requirement will be relaxed with the release of DITA 1.2, which will make it easier to use elements from domains in structural specializations.

Solution

The solution is a clear separation of domain and structural elements. To specialize domain elements, new domains have to be defined and linked in at the level of shell DTDs.
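One way to spot such a mix is to look at the class attribute values: in DITA, the class value of a structural element starts with “-” and the class value of a domain element starts with “+”. The Python sketch below flags domain elements that appear directly in the content model of a structural specialization. The `acme-faq` names and the content model are hypothetical; the class values for `b` and `uicontrol` are those of the standard highlighting and user interface domains.

```python
def specialization_kind(class_value):
    """Return 'structural' or 'domain' based on the leading class marker."""
    tokens = class_value.split()
    if not tokens or tokens[0] not in ("-", "+"):
        raise ValueError(f"not a DITA class value: {class_value!r}")
    return "structural" if tokens[0] == "-" else "domain"


def domain_elements_used(content_model, class_of):
    """List the element names in content_model that are domain elements."""
    return [
        name for name in content_model
        if specialization_kind(class_of[name]) == "domain"
    ]


# Class values by element name; the acme-faq entries are hypothetical.
CLASS_OF = {
    "p": "- topic/p ",
    "acme-answer": "- topic/p acme-faq/acme-answer ",
    "b": "+ topic/ph hi-d/b ",
    "uicontrol": "+ topic/ph ui-d/uicontrol ",
}

# Hypothetical content model of a structural element <acme-faqbody>.
faqbody_model = ["acme-answer", "p", "uicontrol"]
print(domain_elements_used(faqbody_model, CLASS_OF))  # ['uicontrol']
```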

Conclusions and a bit of self-marketing

As the problems listed above show, DITA specialization in its current form is a process that requires a lot of technical expertise and discipline. This has hindered DITA adoption in enterprises and has given DITA an unfavorable public image as a “very complex” standard.

In a comprehensive article “10 DITA Lessons Learned…” on ContentWrangler (http://thecontentwrangler.com/article/10_dita_lessons_learned/), we read the following about specialization:

“Specialization is one of the best things about DITA, but it is also the least understood…”

and one of the main conclusions of the author is:

“Resist the Temptation to Specialize”.

From the same article, we also read:

“Specialization requires a lot of discipline. Until we have specialization wizards to help us correctly implement DTD changes that are needed when you specialize DITA, we will continue struggling to conceptualize what specialization is and how it works.”

We totally agree with this last statement. The majority of these problems could be addressed by modeling tools that help information architects focus on the task of data modeling instead of spending time looking for errors in their specializations. Such tools should also lower the entry barrier for people interested in adopting DITA.

That’s where we see the main benefit of the DITAworks modeling module. It is the first solution on the market that gives information architects much-needed support in designing their DITA specializations, by providing wizard-based visual editors for DITA information types. It can also import existing models from DTD or XML Schema and detect possible problems. Later, the model can be exported again to DTD or XML Schema.

We hope this is the right way to break the “Resist the Temptation to Specialize” attitude. With tools like DITAworks, working with DITA specialization can become much more agile.

Note

The problem list presented in this article was compiled from the results of validating and analyzing several DITA specializations from real customer projects. The DITAworks modeling module was used for model validation and analysis.

This list should not be treated as a complete catalog of possible problems with DITA specialization. If you have experienced other types of problems with specialization, we encourage you to share your experience in the comments.

We look forward to hearing from you.

Copyright © 2008-2010 * instinctools GmbH