A CDATA section contains character data, meaning the XML parser should treat its contents as raw character data, not as markup.
Put simply, a CDATA block starts with the literal '<![CDATA[' and ends with the literal ']]>'.
Everything in between is treated as raw character data, so no escaping rules are applied when reading it. Furthermore, whitespace is preserved within a CDATA block.
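To make this concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The Note element and its contents are made up for illustration; the point is only that the markup-like characters and the surrounding spaces come back exactly as typed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: '<', '>' and '&' appear unescaped inside the CDATA block.
SAMPLE = "<Note><![CDATA[  <b>bold</b> & more  ]]></Note>"

note = ET.fromstring(SAMPLE)

# The parser returns the CDATA body as plain text, including the
# leading and trailing spaces; "<b>" is NOT parsed as a child element.
cdata_text = note.text
child_count = len(note)

print(repr(cdata_text))
print(child_count)
```

Note that ElementTree does not record that the text came from a CDATA block; it simply exposes the resulting character data.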
Nesting CDATA Blocks
There is no way to escape the CDATA end literal ']]>'. As a result there is no way to nest CDATA blocks and, for that matter, no way to store the literal ']]>' within a single CDATA block.
Because of this, care should be taken to avoid inadvertently entering the literal ']]>' into the body of a CDATA block.
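A commonly used workaround, not covered in the text above, is to split the data across two adjacent CDATA blocks so that ']]>' never appears whole inside either one. The sketch below assumes a hypothetical helper named wrap_cdata; it is one way of doing this, not a standard library function.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(text):
    """Wrap text in CDATA, splitting every ']]>' across two blocks.

    Hypothetical helper for illustration: each ']]>' is rewritten so that
    ']]' ends one block and '>' starts the next.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


wrapped = wrap_cdata("a]]>b")
print(wrapped)

# A parser joins the adjacent blocks back into the original text.
recovered = ET.fromstring("<x>" + wrapped + "</x>").text
print(recovered)
```

The data still cannot contain ']]>' inside any one block; the parser simply concatenates the character data of the two blocks when reading.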
When is this useful?
If the XML is being read and written purely by an XML parser, the use of this construct is debatable: the parser is capable of escaping the data as needed, which also avoids any issues if the data happens to contain the literal ']]>'.
If the data is designed to be viewed or edited by a person, and it contains chunks with symbols, formatting and XML control characters that would normally need escaping, then CDATA is a good choice because it allows the data to be typed in without the user having to escape the input. Typical examples are where source code or script is included within an XML document.
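As a sketch of the parser-only case, the following uses Python's xml.etree.ElementTree to write and re-read a piece of script text. The element name and script string are illustrative; the library performs the escaping and unescaping itself, so no CDATA block is needed.

```python
import xml.etree.ElementTree as ET

# Illustrative script text containing '>' and '&'.
script = 'if (Item.Count > MaxItemLimit && ShowWarnings) MessageBox.Show("Too many items");'

item_added = ET.Element("ItemAdded")
item_added.text = script

# Serialising escapes '&' and '>' as entity references automatically.
serialized = ET.tostring(item_added, encoding="unicode")
print(serialized)

# Parsing the result restores the original text unchanged.
round_tripped = ET.fromstring(serialized).text
print(round_tripped == script)
```

The serialised form is harder for a person to read, which is the trade-off the next paragraph addresses.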
This example shows how source code can be entered into the XML CDATA section without the need to escape the '>', '&', '"' symbols, making it much easier for someone looking at it to understand and edit.
<ApplicationExtensibility>
    <OnLoad>
        <![CDATA[
            if (DateTime.Now > License.Expiry)
                Application.Exit();
        ]]>
    </OnLoad>
    <ItemAdded>
        <![CDATA[
            if (Item.Count > MaxItemLimit && ShowWarnings)
                MessageBox.Show("Too many items");
        ]]>
    </ItemAdded>
</ApplicationExtensibility>
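If a document like this is generated by a program but intended for later hand editing, the CDATA blocks can be written explicitly. Here is a minimal sketch using Python's standard xml.dom.minidom module; it builds only the OnLoad part of the example, and the variable names are illustrative.

```python
from xml.dom.minidom import Document

doc = Document()
root = doc.createElement("ApplicationExtensibility")
doc.appendChild(root)

on_load = doc.createElement("OnLoad")
root.appendChild(on_load)

# The script is stored as a CDATA node, so it is written out unescaped.
script = "if (DateTime.Now > License.Expiry) Application.Exit();"
on_load.appendChild(doc.createCDATASection(script))

xml_text = root.toxml()
print(xml_text)
```

The '>' in the script appears literally in the output rather than as an entity reference, which keeps the file readable for whoever edits it next.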
The syntax for a CDATA section is described by the W3C using EBNF as follows.
 CDSect  ::= CDStart CData CDEnd
 CDStart ::= '<![CDATA['
 CData   ::= (Char* - (Char* ']]>' Char*))
 CDEnd   ::= ']]>'
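The grammar can be approximated with a regular expression. The sketch below is a simplification under stated assumptions: it does not check the Char production (the set of characters legal in XML), and it scans raw text, so it would also match CDATA-like text inside a comment. The names CDSECT, is_valid_cdata_body and cdata_bodies are hypothetical.

```python
import re

# CDStart, then the shortest run of characters up to the first CDEnd.
# The non-greedy '.*?' mirrors the CData rule: the body cannot contain ']]>'.
CDSECT = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def is_valid_cdata_body(text):
    """Check the CData rule only: the body must not contain ']]>'."""
    return "]]>" not in text


def cdata_bodies(xml_text):
    """Return the body of every CDATA section found in xml_text."""
    return CDSECT.findall(xml_text)


sample = "<a><![CDATA[x > y]]><![CDATA[ && z]]></a>"
bodies = cdata_bodies(sample)
print(bodies)
print(is_valid_cdata_body("x ]]> y"))
```

For real documents a proper XML parser is the safer choice; the expression is only meant to show how the four productions fit together.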