Liquid XML Studio
XSD Tutorial - Part 1 - Elements and Attributes


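This article gives a basic overview of the building blocks of XML Schemas and how to use them. It covers elements, cardinality, simple and complex types, compositors, type re-use, attributes and mixed element content.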

Overview

First, let's look at what an XML schema is. A schema formally describes what a given XML document contains, in the same way a database schema describes the data that can be contained in a database (table structure, data types). An XML schema describes the overall shape of the XML document: which fields an element can contain, which sub-elements it can contain, and so on. It can also describe the values that can be placed into any element or attribute.

A Note About Standards

DTD was the first formalized standard, but is rarely used anymore.
XDR was an early attempt by Microsoft to provide a more comprehensive standard than DTD. It has largely been abandoned in favor of XSD.
XSD is currently the de facto standard for describing XML documents. Two versions are in use, 1.0 and 1.1, which are largely the same (you have to dig quite deep before you notice the difference). An XSD schema is itself an XML document; there is even an XSD schema that describes the XSD standard.
There are also a number of other standards, but their take-up has been patchy at best.

The XSD standard has evolved over a number of years and is controlled by the W3C. It is extremely comprehensive, and as a result has become rather complex. For this reason it is a good idea to use design tools when working with XSDs (see Liquid XML Studio, a free XSD development tool). When working with XML documents programmatically, XML Data Binding offers a much easier, object-oriented way to manipulate your documents (see Liquid XML Data Binding).

The remainder of this tutorial guides you through the basics of the XSD standard: things you should really know even if you're using a design tool like Liquid XML Studio.

Elements

Elements are the main building block of any XML document. They contain the data and determine the structure of the document. An element can be defined within an XML Schema (XSD) as follows:

<xs:element name="x" type="y"/>

An element definition within the XSD must have a name property; this is the name that will appear in the XML document. The type property describes what can be contained within the element when it appears in the XML document. There are a number of predefined types, such as xs:string, xs:integer, xs:boolean and xs:date (see the XSD standard for a complete list). You can also create a user-defined type using the <xs:simpleType> and <xs:complexType> tags, but more on these later.
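Element declarations like the one above live inside a schema document. As a minimal sketch (assuming a schema with no target namespace, and borrowing the OrderID element purely for illustration), a complete XSD file could look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The xs prefix is bound to the W3C XML Schema namespace -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <!-- A global element declaration: a name plus a built-in type -->
    <xs:element name="OrderID" type="xs:int"/>

</xs:schema>
```

The prefix xs is only a convention; any prefix works as long as it is bound to the same namespace URI.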

If we have set the type property for an element in the XSD, then the corresponding value in the XML document must be in the correct format for its given type (failure to do this will cause a validation error). Examples of simple elements and their XML are below:

Sample XSD, each followed by sample XML:

<xs:element name="Customer_dob"
                    type="xs:date"/>
<Customer_dob>
     2000-01-12
</Customer_dob>

<xs:element name="Customer_address"
                    type="xs:string"/>
<Customer_address>
     99 London Road
</Customer_address>

<xs:element name="OrderID"
                    type="xs:int"/>
<OrderID>
     5756
</OrderID>

<xs:element name="Body" type="xs:string"/>
<Body></Body>
(An element defined as a string can have no content at all; this is not true of all data types, however.)


[Figure: the previous XSD definitions shown graphically in Liquid XML Studio]

The value an element takes in the XML document can be further constrained using the fixed and default properties.

Default means that if the element appears in the XML document with no value, the application reading the document (typically an XML parser or XML data binding library) should use the default specified in the XSD.
Fixed means the element in the XML document can only have the value specified in the XSD.
For this reason it does not make sense to use both default and fixed in the same element definition (in fact it's illegal to do so).

<xs:element name="Customer_name" type="xs:string" default="unknown"/>
<xs:element name="Customer_location" type="xs:string" fixed="UK"/>
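The XML fragments below sketch how these two declarations might behave in an instance document. They assume the fixed value is declared as exactly "UK"; whitespace is significant in xs:string, so any leading or trailing space in the declared value would also have to appear in the document. The customer name is made up for illustration.

```xml
<!-- Empty element: a schema-aware processor supplies the default, "unknown" -->
<Customer_name/>

<!-- Explicit value: used as-is, the default is ignored -->
<Customer_name>Jane Smith</Customer_name>

<!-- Valid only because the content matches the fixed value in the schema -->
<Customer_location>UK</Customer_location>
```

Any other content in Customer_location, such as France, would be reported as a validation error.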

Cardinality

Specifying how many times an element can appear is referred to as cardinality, and is done using the attributes minOccurs and maxOccurs. In this way an element can be mandatory, optional, or appear many times. minOccurs can be assigned any non-negative integer value (0, 1, 2, 3 and so on), and maxOccurs can be assigned any non-negative integer value or the string constant "unbounded", meaning no maximum.
The default value for both minOccurs and maxOccurs is 1. So if both attributes are absent, as in all the previous examples, the element must appear once and once only.

Sample XSD, each followed by its description:

<xs:element name="Customer_dob"
                    type="xs:date"/>
If we don’t specify minOccurs or maxOccurs, then the default values of 1 are used, so in this case there has to be one and only one occurrence of Customer_dob.

<xs:element name="Customer_order"
                    type="xs:integer"
                    minOccurs="0"
                    maxOccurs="unbounded"/>
Here, a customer can have any number of Customer_order elements (even 0).

<xs:element name="Customer_hobbies"
                    type="xs:string"
                    minOccurs="2"
                    maxOccurs="10"/>
In this example, the element Customer_hobbies must appear at least twice, but no more than 10 times.
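To make the three cases concrete, here is a hedged sketch of instance XML that would satisfy them. It assumes the three declarations sit, in this order, inside the content model of a hypothetical parent element called Customer (parent elements are covered under Complex Types); the values are invented for illustration.

```xml
<Customer>
    <!-- Exactly one Customer_dob is required -->
    <Customer_dob>2000-01-12</Customer_dob>

    <!-- Zero or more Customer_order elements; two are shown here -->
    <Customer_order>5756</Customer_order>
    <Customer_order>5757</Customer_order>

    <!-- Between two and ten Customer_hobbies elements -->
    <Customer_hobbies>Chess</Customer_hobbies>
    <Customer_hobbies>Cycling</Customer_hobbies>
</Customer>
```

Removing both Customer_order elements would still be valid, whereas removing one of the Customer_hobbies elements would not.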

[Figure: the previous XSD definitions shown graphically in Liquid XML Studio]

Simple Types 

So far we have touched on a few of the built-in data types: xs:string, xs:integer and xs:date. You can also define your own types by modifying existing ones, for example by restricting the range or length of the values a built-in type allows.

Creating your own types is covered more thoroughly in the next section.
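As a brief sketch of what modifying an existing type can look like, the fragment below restricts two built-in types. The type names ShortNameType and OrderIdType and the chosen limits are hypothetical, and the schema is assumed to have no target namespace.

```xml
<!-- Hypothetical named type: a string limited to 50 characters -->
<xs:simpleType name="ShortNameType">
    <xs:restriction base="xs:string">
        <xs:maxLength value="50"/>
    </xs:restriction>
</xs:simpleType>

<!-- Hypothetical named type: an integer between 1 and 99999 -->
<xs:simpleType name="OrderIdType">
    <xs:restriction base="xs:integer">
        <xs:minInclusive value="1"/>
        <xs:maxInclusive value="99999"/>
    </xs:restriction>
</xs:simpleType>

<!-- The new types are used just like the built-in ones -->
<xs:element name="Customer_name" type="ShortNameType"/>
<xs:element name="OrderID" type="OrderIdType"/>
```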

Complex Types

A complex type is a container for other element definitions. It lets you specify which child elements an element can contain, which gives your XML documents some structure.

Have a look at these simple elements:

<xs:element name="Customer" type="xs:string"/> 
<xs:element name="Customer_dob" type="xs:date"/> 
<xs:element name="Customer_address" type="xs:string"/>

<xs:element name="Supplier" type="xs:string"/> 
<xs:element name="Supplier_phone" type="xs:integer"/> 
<xs:element name="Supplier_address" type="xs:string"/> 

We can see that some of these elements should really be represented as child elements: "Customer_dob" and "Customer_address" belong to a parent element "Customer", while "Supplier_phone" and "Supplier_address" belong to a parent element "Supplier". We can therefore rewrite this in a more structured way:

<xs:element name="Customer">
        <xs:complexType>
            <xs:sequence> 
                <xs:element name="Dob" type="xs:date" />
                <xs:element name="Address" type="xs:string" /> 
            </xs:sequence> 
        </xs:complexType>
</xs:element> 

<xs:element name="Supplier">
    <xs:complexType>
        <xs:sequence> 
            <xs:element name="Phone" type="xs:integer"/> 
            <xs:element name="Address" type="xs:string"/>
        </xs:sequence>
    </xs:complexType> 
</xs:element>

[Figure: the previous XSD definitions shown graphically in Liquid XML Studio]

Example XML

<Customer>
    <Dob>2000-01-12</Dob>
    <Address> 34 thingy street, someplace, sometown, w1w8uu </Address>
</Customer> 

<Supplier> 
    <Phone>0123987654</Phone>
    <Address>22 whatever place, someplace, sometown, ss1 6gy </Address> 
</Supplier>

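What’s changed?

Let’s look at this in detail. The <xs:element> declarations for Customer and Supplier no longer have a type attribute. Instead each contains an <xs:complexType>, which in turn contains an <xs:sequence> listing the child element declarations.

In plain English, this says we can have an XML document that contains an element <Customer>, which must have 2 child elements, <Dob> and <Address>, in that order.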

Compositors

There are 3 types of compositors: <xs:sequence>, <xs:choice> and <xs:all>. These compositors allow us to determine how the child elements within them appear within the XML document.

Compositor    Description
Sequence      The child elements in the XML document MUST appear in the order they are declared in the XSD schema.
Choice        Only one of the child elements described in the XSD schema can appear in the XML document.
All           The child elements described in the XSD schema can appear in the XML document in any order.

Notes
The compositors <xs:sequence> and <xs:choice> can be nested inside other compositors and be given their own minOccurs and maxOccurs properties. This allows quite complex combinations to be formed.
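The earlier examples all use <xs:sequence>, so here is a short sketch of the other two compositors. The element names Contact, Email, Phone, Size, Width and Height are hypothetical and chosen only for illustration.

```xml
<!-- Choice: exactly one of Email or Phone must appear inside Contact -->
<xs:element name="Contact">
    <xs:complexType>
        <xs:choice>
            <xs:element name="Email" type="xs:string"/>
            <xs:element name="Phone" type="xs:string"/>
        </xs:choice>
    </xs:complexType>
</xs:element>

<!-- All: Width and Height must both appear, but in either order -->
<xs:element name="Size">
    <xs:complexType>
        <xs:all>
            <xs:element name="Width" type="xs:int"/>
            <xs:element name="Height" type="xs:int"/>
        </xs:all>
    </xs:complexType>
</xs:element>
```

With these declarations, <Size><Height>2</Height><Width>3</Width></Size> would be accepted, while a <Contact> containing both <Email> and <Phone> would be rejected.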

One step further…

The definitions of "Customer->Address" and "Supplier->Address" are currently not very usable, as each address is held in a single field. In the real world it would be better to break it out into a few fields. Let's fix this using the same technique shown above:

<xs:element name="Customer">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Dob" type="xs:date" />
            <xs:element name="Address">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="Line1" type="xs:string" />
                        <xs:element name="Line2" type="xs:string" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:element name="Supplier">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Phone" type="xs:integer" />
            <xs:element name="Address">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="Line1" type="xs:string" />
                        <xs:element name="Line2" type="xs:string" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>

[Figure: the previous XSD definitions shown graphically in Liquid XML Studio]

This is much better, but we now have 2 definitions for address, which are the same.

Re-use

It would make much more sense to have one definition of "Address" that could be used by both Customer and Supplier.
We can do this by defining a complexType independently of an element and giving it a unique name:

<xs:complexType name="AddressType">
    <xs:sequence>
        <xs:element name="Line1" type="xs:string"/> 
        <xs:element name="Line2" type="xs:string"/>
    </xs:sequence> 
</xs:complexType> 

[Figure: the previous XSD definitions shown graphically in Liquid XML Studio]

We have now defined an <xs:complexType> that describes our representation of an address, so let's use it.
Remember that when we started looking at elements, we said you could define your own type instead of using one of the standard ones (xs:string, xs:integer). That is exactly what we're doing now.

<xs:element name="Customer">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Dob" type="xs:date"/> 
            <xs:element name="Address" type="AddressType"/>
        </xs:sequence> 
    </xs:complexType> 
</xs:element> 

<xs:element name="Supplier">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Phone" type="xs:integer"/>
            <xs:element name="Address" type="AddressType"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

[Figure: the previous XSD definitions shown graphically in Liquid XML Studio]

The advantage should be obvious: instead of having to define Address twice (once for Customer and once for Supplier), we have a single definition. This makes maintenance simpler. If you decide to add "Line3" or "Postcode" elements to your address, you only have to add them in one place.

Example XML

<Customer>
    <Dob>2000-01-12</Dob>
    <Address>
        <Line1>34 thingy street, someplace</Line1>
        <Line2>sometown, w1w8uu</Line2>
    </Address>
</Customer>

<Supplier>
    <Phone>0123987654</Phone>
    <Address>
        <Line1>22 whatever place, someplace</Line1>
        <Line2>sometown, ss1 6gy</Line2>
    </Address>
</Supplier>

Note: Only complex types defined globally (as children of the <xs:schema> element) can have their own name and be re-used throughout the schema. If they are defined inline within an <xs:element>, they cannot have a name (they are anonymous) and cannot be reused elsewhere.
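The difference between a global and an inline type is easiest to see in a complete schema document. The following is a minimal sketch, assuming a schema with no target namespace (so the type can be referenced simply as AddressType):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <!-- Global (named) type: a direct child of xs:schema, so it can be reused -->
    <xs:complexType name="AddressType">
        <xs:sequence>
            <xs:element name="Line1" type="xs:string"/>
            <xs:element name="Line2" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>

    <xs:element name="Customer">
        <!-- Anonymous (inline) type: no name, usable only by Customer -->
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Dob" type="xs:date"/>
                <xs:element name="Address" type="AddressType"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>

</xs:schema>
```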

Attributes

An attribute provides extra information within an element. Attributes are defined within an XSD as follows, having name and type properties.

<xs:attribute name="x" type="y"/>

An attribute can appear 0 or 1 times within a given element in the XML document. Attributes are either optional or mandatory (by default they are optional). The "use" property in the XSD definition specifies whether the attribute is optional or mandatory.

So the following are equivalent:

<xs:attribute name="ID" type="xs:string"/>
<xs:attribute name="ID" type="xs:string" use="optional"/>

[Figure: the previous XSD definitions shown graphically in Liquid XML Studio]

To specify that an attribute must be present, set use="required". (Note that use may also be set to "prohibited", but we'll come to that later.)

An attribute is typically specified within the XSD definition for an element, which ties the attribute to the element. Attributes can also be specified globally and then referenced (but more about this later).

Sample XSD, each followed by sample XML:

<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID"
                      type="xs:int"/>
    </xs:complexType>
</xs:element>
<Order OrderID="6"/>
or
<Order/>

<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID"
                      type="xs:int"
                      use="optional"/>
    </xs:complexType>
</xs:element>
<Order OrderID="6"/>
or
<Order/>

<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID"
                      type="xs:int"
                      use="required"/>
    </xs:complexType>
</xs:element>
<Order OrderID="6"/>

The default and fixed attributes can be specified within the XSD attribute specification (in the same way as they are for elements).
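As a hedged illustration of default and fixed on attributes, the sketch below adds two hypothetical attributes, Currency and Version, to an Order element; the names and values are invented for this example.

```xml
<xs:element name="Order">
    <xs:complexType>
        <!-- If Currency is omitted, a schema-aware processor treats it as "GBP" -->
        <xs:attribute name="Currency" type="xs:string" default="GBP"/>
        <!-- If Version is present, its value must be "1.0" -->
        <xs:attribute name="Version" type="xs:string" fixed="1.0"/>
    </xs:complexType>
</xs:element>
```

Under these declarations both <Order/> and <Order Currency="EUR" Version="1.0"/> would be valid, while <Order Version="2.0"/> would not.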

Mixed Element Content

So far we have seen how an element can contain data, other elements or attributes. Elements can also contain a combination of all of these, including text data mixed in among child elements. You can allow this in the XSD schema by setting the mixed property.

<xs:element name="MarkedUpDesc">
    <xs:complexType mixed="true">
        <xs:choice minOccurs="0" maxOccurs="unbounded">
            <xs:element name="Bold" type="xs:string" />
            <xs:element name="Italic" type="xs:string" />
        </xs:choice>
    </xs:complexType>
</xs:element> 

A sample XML document could look like this.

<MarkedUpDesc>
This is an <Bold>Example</Bold> of <Italic>Mixed</Italic> Content,
Note there are elements mixed in with the element's data.
</MarkedUpDesc>
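Mixed content aside, the more common combination is an element that has both child elements and attributes. A minimal sketch, using a hypothetical Item child element alongside the OrderID attribute from the Attributes section:

```xml
<xs:element name="Order">
    <xs:complexType>
        <xs:sequence>
            <!-- One or more Item children -->
            <xs:element name="Item" type="xs:string" maxOccurs="unbounded"/>
        </xs:sequence>
        <!-- Attribute declarations come after the compositor -->
        <xs:attribute name="OrderID" type="xs:int" use="required"/>
    </xs:complexType>
</xs:element>
```

A matching instance would be <Order OrderID="6"><Item>Widget</Item></Order>.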