XML Schema Tutorial, Part 1

Defining Elements and Attributes

This article gives a basic overview of the building blocks of XML Schemas and how to use them. It covers elements, cardinality, simple and complex types, compositors, global types, attributes and mixed content.

Overview

An XML schema, commonly known as an XML Schema Definition (XSD), formally describes what a given XML document can contain, in the same way that a database schema describes the data that can be contained in a database (i.e. table structure, data types, constraints etc.). The XML schema defines the shape, or structure, of an XML document, along with rules for data content and semantics such as what fields an element can contain, which sub elements it can contain and how many items can be present. It can also describe the type and values that can be placed into each element or attribute. The XML data constraints are called facets and include rules such as min and max length.

This tutorial guides you through the basics of the XSD standard and the examples use the graphical XML tool Liquid XML Studio.

XML Schema Standards

  • XML Schema Definition (XSD) is currently the de facto standard for describing XML documents and is the XML Schema standard we will concentrate on in this tutorial. XSD is controlled by the World Wide Web Consortium (W3C). An XSD is itself an XML document, and there is even an XSD to describe the XSD standard.
  • Document Type Definition (DTD) was the first formalized standard but has now, in most cases, been superseded by XSD.
  • XML Data Reduced (XDR) was an early attempt by Microsoft to provide a more comprehensive standard than DTD. It has been phased out of Microsoft products in favour of XSD.
  • There are also a number of other schema standards such as Schematron and RELAX NG.

XML Design Tools

The XSD standard has evolved over a number of years and is extremely comprehensive; as a result it has become rather complex. For this reason it is a good idea to make use of a graphical XSD design tool when working with XSDs.

Liquid XML Studio is an advanced graphical XML editor containing all the tools needed for designing, developing and testing XML applications complying with the W3C standards. Features include an XML Editor, XML Schema Editor, XML Data Mapper, XPath and XQuery Debugger, WSDL Editor, Web Service Tools, integration with Microsoft Visual Studio and much more.

XML Development Tools

For those who wish to programmatically work with XML documents, XML Data Binding is a much easier way to manipulate your documents using an object oriented approach to enforce the XML schema rules and constraints.

Liquid XML Data Binder is an advanced XML toolkit and code generator that will save you many hours of repetitive coding by allowing you to treat your XML documents as an object model within your C++, C#, Java, Silverlight or Visual Basic source code. The easy to use Wizard driven interface also generates HTML documentation for your custom API along with a Sample Application.

Elements

Tip: To add an Element in the Liquid XML Studio graphical XSD view, select menu item Edit->Add Child->Element (Ctrl+Shift+E) or select the toolbar button Element.

Elements are the main building blocks of all XML documents; they contain the data and determine the structure of the instance document.

An element can be defined within an XSD as follows:

<xs:element name="x" type="y"/>

Each element definition within the XSD must have a 'name' property, which is the tag name that will appear in the XML document. The 'type' property provides the description of what type of data can be contained within the element when it appears in the XML document. There are a number of predefined simple types, such as xs:string, xs:integer, xs:boolean and xs:date (see XSD standard for a complete list). Elements of these simple data types are said to have a 'simple content model', whereas elements that can contain other elements are said to have a 'complex content model' and elements that can contain both have a 'mixed content model'. You can also create user defined types using the <xs:simpleType> and <xs:complexType> constructs, which we will cover later.
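
The fragments in this tutorial omit the enclosing schema document. As a minimal sketch of where an element definition sits, the example below wraps one in an <xs:schema> root that binds the xs prefix to the W3C XML Schema namespace. The element name "OrderDate" is purely illustrative and not part of the tutorial's running example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal sketch of a complete schema document (no target namespace). -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- name: the tag that appears in the instance document -->
  <!-- type: here a built-in simple type -->
  <xs:element name="OrderDate" type="xs:date"/>
</xs:schema>
```

An instance document such as <OrderDate>2000-01-12</OrderDate> should validate against this schema, whereas <OrderDate>yesterday</OrderDate> should not.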

If we set the type property for an element in the XSD, the corresponding value in the XML document must be in the correct format for that type; otherwise a validating parser will report a validation error when it parses the XML document. Examples of simple elements and their XML data are shown below:

Sample XSD:
<xs:element name="Customer_dob" type="xs:date"/>
Sample XML:
<Customer_dob>
     2000-01-12
</Customer_dob>

Sample XSD:
<xs:element name="Customer_address" type="xs:string"/>
Sample XML:
<Customer_address>
     99 London Road
</Customer_address>

Sample XSD:
<xs:element name="OrderID" type="xs:int"/>
Sample XML:
<OrderID>
     5756
</OrderID>

Sample XSD:
<xs:element name="Body" type="xs:string"/>
Sample XML:
<Body></Body>
(An element can be defined as a string but not have any content; this is not true of all data types, however.)

The previous XSD definitions are shown graphically in Liquid XML Studio as follows:

The valid data values for the element in the XML document can be further constrained using the fixed and default properties.

Default means that if no value is specified in the XML document then the application reading the document, typically an XML parser or XML Data Binding Library, should use the default specified in the XSD.
Fixed means the value in the XML document can only have the value specified in the XSD.

For this reason it does not make sense to use both default and fixed in the same element definition, and is invalid to do so.

<xs:element name="Customer_name" type="xs:string" default="unknown"/>
<xs:element name="Customer_location" type="xs:string" fixed="UK"/>
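
To make the effect of default and fixed more concrete, here is a small sketch of a complete schema with comments describing which instance elements should be accepted. The <xs:schema> wrapper is an addition for completeness, and the instance values named in the comments are illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- default: an empty <Customer_name/> is treated as if it
       contained "unknown"; any other string is also accepted. -->
  <xs:element name="Customer_name" type="xs:string" default="unknown"/>

  <!-- fixed: <Customer_location>UK</Customer_location> and an empty
       <Customer_location/> are valid;
       <Customer_location>France</Customer_location> is not. -->
  <xs:element name="Customer_location" type="xs:string" fixed="UK"/>

</xs:schema>
```

Because xs:string preserves whitespace, a fixed value of " UK" (with a leading space) would only match content that also starts with a space, so it is worth keeping such values free of stray whitespace.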

Cardinality

Sometimes it is useful to constrain how many times an element may appear at a specific point in an XML document; this is referred to as cardinality. The cardinality is specified using the minOccurs and maxOccurs attributes, which allow an element to be declared as mandatory, optional, or repeatable up to a set number of times. The default value for both minOccurs and maxOccurs is 1. Therefore, if both attributes are absent, as in all the previous examples, the element must appear once and once only.

'minOccurs' can be assigned any non-negative integer value (e.g. 0, 1, 2, 3... etc.), and 'maxOccurs' can be assigned any non-negative integer value or the special string constant "unbounded" meaning there is no maximum so the element can occur an unlimited number of times.

Sample XSD Description
<xs:element name="Customer_dob"
            type="xs:date"/>
If we do not specify minOccurs or maxOccurs, then the default values of 1 are used. This means there has to be one and only one occurrence of Customer_dob, i.e. it is mandatory.
<xs:element name="Customer_order"
            type="xs:integer"
            minOccurs="0"
            maxOccurs="unbounded"/>
If we set minOccurs to 0, then the element is optional. Here, a customer can have from 0 to an unlimited number of Customer_orders.
<xs:element name="Customer_hobbies"
            type="xs:string"
            minOccurs="2"
            maxOccurs="10"/>
Setting both minOccurs and maxOccurs means the element Customer_hobbies must appear at least twice, but no more than 10 times.
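
The minOccurs and maxOccurs attributes are only permitted on elements declared inside a parent's content model, not on top-level elements. As a sketch of how the three definitions above might look in context, the schema below nests them in a hypothetical "Customer" element, using the <xs:complexType> and <xs:sequence> constructs that are covered later in this article.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Customer">
    <xs:complexType>
      <xs:sequence>
        <!-- mandatory: exactly one occurrence -->
        <xs:element name="Customer_dob" type="xs:date"/>
        <!-- optional and repeatable: zero or more occurrences -->
        <xs:element name="Customer_order" type="xs:integer"
                    minOccurs="0" maxOccurs="unbounded"/>
        <!-- between two and ten occurrences -->
        <xs:element name="Customer_hobbies" type="xs:string"
                    minOccurs="2" maxOccurs="10"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A document with no orders and two hobbies should satisfy these constraints:

```xml
<Customer>
  <Customer_dob>2000-01-12</Customer_dob>
  <Customer_hobbies>Chess</Customer_hobbies>
  <Customer_hobbies>Cycling</Customer_hobbies>
</Customer>
```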

 These XSD definitions can be shown graphically in Liquid XML Studio as follows:

 

Simple Types

Tip: To add a Simple Type in the Liquid XML Studio graphical XSD view, select menu item Edit->Add Child->Simple Type (Ctrl+Shift+S) or select the toolbar button Simple Type.

A simple type builds on one of the built-in data types, such as xs:string, xs:integer and xs:date, allowing you to create your own data types.

Examples of this are:

  • Defining an ID, which may be an integer with a maximum value limit.
  • Restricting a Postcode or Zip code so that it is the correct length and complies with a regular expression.
  • Defining a field to have a maximum length.

Creating your own types is covered more thoroughly later in this tutorial.
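
As a rough sketch of what the three examples listed above could look like, the schema below restricts built-in types using the standard facets xs:minInclusive, xs:maxInclusive, xs:pattern and xs:maxLength. The type names, limits and the US-style zip pattern are illustrative assumptions rather than anything prescribed by this tutorial.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- An ID: an integer between 1 and 99999 (limits are arbitrary) -->
  <xs:simpleType name="CustomerIdType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="99999"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- A zip code: five digits, optionally followed by a hyphen and four digits -->
  <xs:simpleType name="ZipCodeType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{5}(-[0-9]{4})?"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- A string field limited to 50 characters -->
  <xs:simpleType name="ShortNameType">
    <xs:restriction base="xs:string">
      <xs:maxLength value="50"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- A user defined type is referenced just like a built-in one -->
  <xs:element name="CustomerId" type="CustomerIdType"/>

</xs:schema>
```

With this schema, <CustomerId>5756</CustomerId> should validate, while <CustomerId>0</CustomerId> should be rejected because it falls below the minimum.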

Complex Types

Tip: To add a Complex Type in the Liquid XML Studio graphical XSD view, select menu item Edit->Add Child->Complex Type (Ctrl+Shift+C) or select the toolbar button Complex Type.

A complex type is a container for other element definitions. It lets you specify which child elements an element can contain, and so provides structure within your XML documents.

Here are some simple element definitions:

<xs:element name="Customer" type="xs:string"/> 
<xs:element name="Customer_dob" type="xs:date"/> 
<xs:element name="Customer_address" type="xs:string"/>
  
<xs:element name="Supplier" type="xs:string"/> 
<xs:element name="Supplier_phone" type="xs:integer"/> 
<xs:element name="Supplier_address" type="xs:string"/> 

We can see that some of these elements should really be represented as child elements: "Customer_dob" and "Customer_address" belong to a parent element "Customer", while "Supplier_phone" and "Supplier_address" belong to a parent element "Supplier". We can therefore rewrite this in a more structured way:

<xs:element name="Customer">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Dob" type="xs:date"/>
            <xs:element name="Address" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="Supplier">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Phone" type="xs:integer"/>
            <xs:element name="Address" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

The previous XSD definitions are shown graphically in Liquid XML Studio as follows:

What's changed?

  • We created a definition for an element called "Customer".
  • Inside the <xs:element> definition we added a <xs:complexType>. This is a container for other <xs:element> definitions, allowing us to build a simple hierarchy of elements in the resulting XML document.
  • Note that the "Customer" and "Supplier" elements themselves do not have a type attribute. They do not extend or restrict an existing type; their type is a new definition built from scratch.
  • The <xs:complexType> element contains another new element <xs:sequence>, but more on these in a minute.
  • The <xs:sequence> in turn contains the definitions for the two child elements "Dob" and "Address". Note the customer/supplier prefix has been removed as it is implied from its position within the parent element "Customer" or "Supplier".

So in plain English this is saying we can have an XML document that contains an element <Customer> which must have two child elements <Dob> and <Address>.

Example XML

<Customer>
    <Dob>2000-01-12</Dob>
    <Address>34 thingy street, someplace, sometown, w1w8uu</Address>
</Customer>

<Supplier>
    <Phone>0123987654</Phone>
    <Address>22 whatever place, someplace, sometown, ss1 6gy</Address>
</Supplier>

Compositors

There are three types of compositors: <xs:sequence>, <xs:choice> and <xs:all>. These compositors allow us to determine how the child elements contained within them will appear within the XML document.

Compositor: Description
Sequence: The child elements in the XML document MUST appear in the order they are declared in the XSD schema.
Choice: Only one of the child elements described in the XSD schema can appear in the XML document.
All: The child elements described in the XSD schema can appear in the XML document in any order.

Notes

The compositors <xs:sequence> and <xs:choice> can be nested inside other compositors, and be given their own minOccurs and maxOccurs properties. This allows for quite complex combinations to be formed.
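
The following sketch shows a choice nested inside a sequence, and a separate element using <xs:all>. The element names "Contact" and "Dimensions" and their children are hypothetical and are not part of the tutorial's running example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="Contact">
    <xs:complexType>
      <xs:sequence>
        <!-- Name must come first -->
        <xs:element name="Name" type="xs:string"/>
        <!-- then exactly one of Phone or Email -->
        <xs:choice>
          <xs:element name="Phone" type="xs:string"/>
          <xs:element name="Email" type="xs:string"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="Dimensions">
    <xs:complexType>
      <!-- Width and Height must both appear, in either order -->
      <xs:all>
        <xs:element name="Width" type="xs:decimal"/>
        <xs:element name="Height" type="xs:decimal"/>
      </xs:all>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Under these definitions a <Contact> containing <Name> followed by <Email> should validate, but one containing both <Phone> and <Email> should not. Adding maxOccurs="unbounded" to the <xs:choice> would allow any number of phone and email entries in any order.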

Example

The definitions of "Customer->Address" and "Supplier->Address" are currently not very usable, as each address is held in a single field. In the real world it would be better to break this out into a few fields. Let's fix this using the same technique shown above:

<xs:element name="Customer">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Dob" type="xs:date" />
      <xs:element name="Address">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="Line1" type="xs:string" />
            <xs:element name="Line2" type="xs:string" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="Supplier">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Phone" type="xs:integer" />
      <xs:element name="Address">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="Line1" type="xs:string" />
            <xs:element name="Line2" type="xs:string" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>

The previous XSD definitions are shown graphically in Liquid XML Studio as follows:

This is much better, but we now have two definitions for address, which are identical.

Global Types

It would make much more sense to have a single definition for "Address", which could then be used by both customer and supplier.
We can do this by defining a complexType independently of an element, and giving it a unique name:

<xs:complexType name="AddressType">
    <xs:sequence>
        <xs:element name="Line1" type="xs:string"/> 
        <xs:element name="Line2" type="xs:string"/>
    </xs:sequence>
</xs:complexType>

The previous XSD definitions are shown graphically in Liquid XML Studio as follows:

We have now defined a <xs:complexType> that describes our representation of an address, so let's use it.
Earlier, when we started looking at elements, we said you could define your own types instead of using one of the standard ones (xs:string, xs:integer), and that is exactly what we're now doing.

<xs:element name="Customer">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Dob" type="xs:date"/>
            <xs:element name="Address" type="AddressType"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="Supplier">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Phone" type="xs:integer"/>
            <xs:element name="Address" type="AddressType"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

The previous XSD definitions are shown graphically in Liquid XML Studio as follows:

Hopefully, the advantages are obvious. Instead of having to define Address twice (once for Customer and once for Supplier) we now have a single definition. This makes maintenance simpler, i.e. if you decide to add "Line3" or "Postcode" elements to your address you only have to add them in one place.

Example XML

<Customer>
    <Dob>2000-01-12</Dob>
    <Address>
        <Line1>34 thingy street, someplace</Line1>
        <Line2>sometown, w1w8uu</Line2>
    </Address>
</Customer>

<Supplier>
    <Phone>0123987654</Phone>
    <Address>
        <Line1>22 whatever place, someplace</Line1>
        <Line2>sometown, ss1 6gy</Line2>
    </Address>
</Supplier>

Note: Only complex types defined globally (as children of the <xs:schema> element) can have their own name and be reused throughout the schema. If they are defined inline within an <xs:element>, they cannot have a name (they are anonymous) and cannot be reused elsewhere.
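
To show where global and anonymous types sit relative to the <xs:schema> element, the sketch below assembles the definitions from this section into one complete schema document. The <xs:schema> wrapper and the absence of a target namespace are assumptions made for the sake of a self-contained example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global type: a direct child of xs:schema, so it is named and reusable -->
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="Line1" type="xs:string"/>
      <xs:element name="Line2" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:element name="Customer">
    <!-- Anonymous type: defined inline, has no name, cannot be reused -->
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Dob" type="xs:date"/>
        <xs:element name="Address" type="AddressType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="Supplier">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Phone" type="xs:integer"/>
        <xs:element name="Address" type="AddressType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Each valid instance document has a single root, so <Customer> and <Supplier> from the example XML above would be validated as two separate documents against this schema.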

Attributes

Tip: To add an Attribute in the Liquid XML Studio graphical XSD view, select menu item Edit->Add Child->Attribute or select the toolbar button Attribute.

An attribute provides extra information within an element. Attributes have name and type properties and are defined within an XSD as follows:

<xs:attribute name="x" type="y"/>

An Attribute can appear 0 or 1 times within a given element in the XML document. Attributes are either optional or mandatory (by default they are optional). The "use" property in the XSD definition is used to specify if the attribute is optional or mandatory.

So the following are equivalent:

<xs:attribute name="ID" type="xs:string"/>
<xs:attribute name="ID" type="xs:string" use="optional"/>

The previous XSD definitions are shown graphically in Liquid XML Studio as follows:

To specify that an attribute must be present, set use="required". (Note: use may also be set to "prohibited", but we'll come to that later.)

An attribute is typically specified within the XSD definition for an element, nesting the attribute in the element. Attributes can also be specified globally and then referenced (but more about this later).

Sample XSD:
<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID"
                type="xs:int"/>
    </xs:complexType>
</xs:element>
Sample XML:
<Order OrderID="6"/>
or
<Order/>

Sample XSD:
<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID"
                type="xs:int"
                use="optional"/>
    </xs:complexType>
</xs:element>
Sample XML:
<Order OrderID="6"/>
or
<Order/>

Sample XSD:
<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID"
                type="xs:int"
                use="required"/>
    </xs:complexType>
</xs:element>
Sample XML:
<Order OrderID="6"/>

 

The default and fixed attributes can be specified within the XSD attribute specification (in the same way as they are for elements).
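
As a sketch of how required, default and fixed attributes might be combined on an element that also has child elements, consider the schema below. The "Item" child and the "Currency" and "Version" attributes are hypothetical additions to the Order example; note that attribute declarations are placed after the compositor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Item" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Attribute declarations follow the content model -->
      <!-- required: must be present on every Order -->
      <xs:attribute name="OrderID" type="xs:int" use="required"/>
      <!-- default: if omitted, processed as if Currency="GBP" -->
      <xs:attribute name="Currency" type="xs:string" default="GBP"/>
      <!-- fixed: if present, the value must be exactly "1.0" -->
      <xs:attribute name="Version" type="xs:string" fixed="1.0"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this schema, <Order OrderID="6"><Item>Widget</Item></Order> should validate, whereas the same element without OrderID, or with Version="2.0", should not.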

Mixed Content

So far we have seen how an element can contain data, other elements and attributes. Elements can also contain a combination of all of these, including text data interleaved with child elements. You can specify this in the XSD schema by setting the mixed property.

<xs:element name="MarkedUpDesc">
    <xs:complexType mixed="true">
        <xs:sequence>
            <xs:element name="Bold" type="xs:string" />
            <xs:element name="Italic" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

A sample XML document could look like this:

<MarkedUpDesc>
    This is an <Bold>Example</Bold> of <Italic>Mixed</Italic> Content.
    Note there are elements mixed in with the element's data.
</MarkedUpDesc>
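
Setting mixed="true" only permits text between the child elements; the children themselves must still satisfy the compositor. With the <xs:sequence> above, that means exactly one <Bold> followed by exactly one <Italic>. For free-form markup, one possible variation (a sketch, not part of the original example) is a repeating <xs:choice>:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="MarkedUpDesc">
    <xs:complexType mixed="true">
      <!-- Any number of Bold and Italic elements, in any order,
           with text allowed between them -->
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="Bold" type="xs:string"/>
        <xs:element name="Italic" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

This version should accept the sample document above as well as text containing several <Bold> elements, or none at all.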

 


 

 

 XSD Tutorial Parts

1 - Defining Elements and Attributes
2 - Best Practices, Conventions & Recommendations
3 - Extending Existing Types
4 - Using XML Schema Namespaces
5 - Group, AttributeGroup, Any and AnyAttribute

This XSD tutorial was created using Liquid XML Studio.

The XSD standard is complex, and without a graphical tool it is difficult to understand. For that reason it is recommended that you install the Free 30 day trial.

The graphical schema editor, split screen editing, XSD/XML validation all make it much easier to learn how to design and work with XML Schemas.
