
#PCDATA and DTDs

If, like me, you write DTDs to validate your XML files and catch mistakes in them, you have probably run into this problem before.

#PCDATA has a very special behavior and is heavily restricted, as follows:

  • #PCDATA must appear first in its group
  • The group containing #PCDATA must be repeated zero or more times, so only * works with it (the one exception is a bare (#PCDATA), which takes no occurrence indicator)
  • #PCDATA cannot be used within sub-groups (nested parentheses)

Something like this:

<!ELEMENT Z (P | (#PCDATA | A | B | C)* | Q)+>

does not work because the group is followed by + and #PCDATA is within a sub-group.

What you need instead is this:

<!ELEMENT Z (#PCDATA | A | B | C)*>

And if P and Q are still necessary, add them there too:

<!ELEMENT Z (#PCDATA | A | B | C | P | Q)*>
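
The rules above mirror the mixed-content production of the XML 1.0 grammar. As a rough illustration, here is a minimal Python sketch that checks a content model against that production with a regular expression. The function name is_valid_mixed is made up for this example, and the name pattern is simplified to ASCII characters, so treat it as an approximation rather than a real validator:

```python
import re

# Simplified XML Name (assumption: ASCII only; the real Name production
# also accepts many Unicode characters).
NAME = r"[A-Za-z_:][A-Za-z0-9_:.\-]*"

# XML whitespace (S): space, tab, carriage return, line feed.
S = r"[ \t\r\n]*"

# Mixed ::= '(' S? '#PCDATA' (S? '|' S? Name)* S? ')*'
#         | '(' S? '#PCDATA' S? ')'
MIXED = re.compile(
    rf"\({S}#PCDATA(?:{S}\|{S}{NAME})*{S}\)\*"
    rf"|\({S}#PCDATA{S}\)"
)


def is_valid_mixed(model: str) -> bool:
    """Return True if `model` matches the (simplified) mixed-content production."""
    return MIXED.fullmatch(model.strip()) is not None


# The content models discussed in this post.
for model in (
    "(P | (#PCDATA | A | B | C)* | Q)+",
    "(#PCDATA | A | B | C)*",
    "(#PCDATA | A | B | C | P | Q)*",
):
    print(model, "->", is_valid_mixed(model))
```

The check only covers content models that contain #PCDATA; element-only models such as (A, B)+ follow a different production and are outside the scope of this sketch.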

If you use a parameter entity and need #PCDATA, it is very likely that you will have to take the #PCDATA out of the entity and move it into the element declaration:

<!ENTITY % E "(#PCDATA | A | B | C)*">
<!ELEMENT Z %E;> <!-- this one works fine -->
<!ELEMENT Z (%E; | P | Q)*> <!-- this one fails! -->

Notice that it fails because, once the entity is expanded, #PCDATA finds itself within a sub-group. The inner group and its asterisk are logically redundant, since the outer group is already followed by an asterisk, but the mixed-content grammar still does not allow the nesting.

In most cases what you will have to do is move the #PCDATA to the element and remove all parentheses from your entities:

<!ENTITY % E "A | B | C">
<!ELEMENT Z (#PCDATA | %E; | P | Q)*> <!-- that works -->
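
To see what the parser is actually looking at in the two cases, it can help to expand the parameter entity by hand. The following Python sketch does a plain textual substitution. The helper name expand_parameter_entities is hypothetical, and the code only mimics one detail of real XML processing: the replacement text of a parameter entity referenced inside a declaration is padded with one leading and one trailing space.

```python
import re


def expand_parameter_entities(text: str, entities: dict[str, str]) -> str:
    """Replace each %name; reference with its replacement text.

    The replacement text is padded with one space on each side, as an XML
    processor does for parameter entities referenced inside a declaration.
    References to unknown names are left untouched.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in entities:
            return match.group(0)
        return f" {entities[name]} "

    # Simplified, ASCII-only entity name pattern (assumption).
    return re.sub(r"%([A-Za-z_:][A-Za-z0-9_:.\-]*);", substitute, text)


# Entity values and content models taken from the examples in this post.
failing = expand_parameter_entities(
    "(%E; | P | Q)*", {"E": "(#PCDATA | A | B | C)*"}
)
working = expand_parameter_entities(
    "(#PCDATA | %E; | P | Q)*", {"E": "A | B | C"}
)

print(failing)
print(working)
```

The first expansion starts with an opening parenthesis followed by a second, nested group holding #PCDATA, which is exactly the sub-group the grammar forbids. In the second expansion #PCDATA is the first item of the single outer group and the entity only contributes element names.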

Sample errors that we get from xmllint when #PCDATA is misused (totally confusing if you ask me!):

Entity: line 1: parser error : xmlParseElementMixedContentDecl : Name expected
 %html-data;
            ^
Entity: line 1:
#PCDATA|a|b|br|div|em|i|img|p|small|strong|u
^
Entity: line 1: parser error : expected '>'
 %html-data;
            ^
Entity: line 1:
#PCDATA|a|b|br|div|em|i|img|p|small|strong|u
^
Entity: line 1: parser error : Content error in the external subset
 %html-data;
            ^
Entity: line 1:
#PCDATA|a|b|br|div|em|i|img|p|small|strong|u
^
Entity: line 1: parser error : ContentDecl : Name or '(' expected
 %html-data;
            ^
Entity: line 1:
(#PCDATA|a|b|br|div|em|i|img|p|small|strong|u)*
 ^
Entity: line 1: parser error : ContentDecl : ',' '|' or ')' expected
 %html-data;
            ^

Source: http://stackoverflow.com/.../why-is-this-not-a-valid-xml-dtd-parameter-entity-and-pcdata
