XML Schema Tutorial - Part 1

Defining Elements and Attributes

This article gives an overview of the basic building blocks of XML Schemas and how to use them.

XML schema (XSD) Overview

An XML schema, commonly known as an XML Schema Definition (XSD), formally describes what a given XML document can contain, in the same way that a database schema describes the data that can be contained in a database (e.g. table structure, data types, constraints etc.). The XML schema defines the shape, or structure, of an XML document, along with rules for data content and semantics such as what fields an element can contain, which child elements it can contain and how many items can be present. It can also describe the type and values that can be placed into each element or attribute. The XML data constraints are called facets and include rules such as min and max length.

This tutorial guides you through the basics of the XSD standard and the examples use the graphical XML Integrated Development Environment (IDE) Liquid Studio.

XML Schema Standards

  • XML Schema Definition (XSD) is currently the de facto standard for describing XML documents and is the XML Schema standard we will concentrate on in this tutorial. XSD is controlled by the World Wide Web Consortium (W3C). An XSD is itself an XML document, and there is even an XSD to describe the XSD standard.
  • Document Type Definition (DTD) was the first formalized standard but has now, in most cases, been superseded by XSD.
  • XML Data Reduced (XDR) was an early attempt by Microsoft to provide a more comprehensive standard than DTD. This standard has been phased out in the Microsoft products in favour of XSD.
  • There are also a number of other schema standards such as Schematron and RELAX NG.

XML Design Tools

The XSD standard has evolved over a number of years; it is extremely comprehensive and, as a result, rather complex. For this reason it is a good idea to make use of a graphical XSD design tool when working with XSDs.

Liquid Studio is an advanced graphical XML editor containing all the tools needed for designing, developing and testing XML applications complying with the W3C standards. Features include an XML Editor, XML Schema Editor, XML Data Mapper, XPath and XQuery Debugger, WSDL Editor, Web Service Tools, integration with Microsoft Visual Studio and much more.

XML Development Tools

For those who wish to programmatically work with XML documents, XML Data Binding is a much easier way to manipulate your documents using an object oriented approach to enforce the XML schema rules and constraints.

Liquid XML Data Binder is an advanced XML toolkit and code generator that will save you many hours of repetitive coding by allowing you to treat your XML documents as an object model within your C++, C#, Java, Silverlight or Visual Basic source code. The easy to use Wizard driven interface also generates HTML documentation for your custom API along with a Sample Application.

Defining XML Elements

Elements are the main building blocks of all XML documents; they contain the data and determine the structure of the instance document.

An element can be defined within an XSD as follows:

<xs:element name="x"
            type="y" />

Each element definition within the XSD must have a 'name' property, which is the tag name that will appear in the XML document. The 'type' property provides the description of what type of data can be contained within the element when it appears in the XML document. There are a number of predefined simple types, such as xs:string, xs:integer, xs:boolean and xs:date (see XSD standard for a complete list). Elements of these simple data types are said to have a 'simple content model', whereas elements that can contain other elements are said to have a 'complex content model' and elements that can contain both have a 'mixed content model'. You can also create user defined types using the <xs:simpleType> and <xs:complexType> constructs, which we will cover later.
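The fragments in this tutorial omit the enclosing schema document. As a minimal sketch, assuming a schema with no target namespace and a hypothetical element named Customer_name, a complete XSD could look something like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The xs prefix is bound to the W3C XML Schema namespace -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <!-- A global element with a simple content model -->
    <xs:element name="Customer_name" type="xs:string" />
</xs:schema>
```

An XML document that conforms to this sketch would then be:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Customer_name>Jane Smith</Customer_name>
```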

If we have set the type property for an element in the XSD, then the corresponding value in the XML document must be in the correct format for its given type; otherwise, a validating parser will report a validation error when it attempts to parse the data from the XML document. Examples of simple elements and their XML data are shown below:

Sample XSD:

<xs:element name="Customer_dob"
            type="xs:date" />

Sample XML:

<Customer_dob>2000-01-12</Customer_dob>

Sample XSD:

<xs:element name="Customer_address"
            type="xs:string" />

Sample XML:

<Customer_address>99 London Road</Customer_address>

Sample XSD:

<xs:element name="OrderID"
            type="xs:int" />

Sample XML:

<OrderID>5756</OrderID>

Sample XSD:

<xs:element name="Body"
            type="xs:string" />

Sample XML:

<Body></Body>

Note: An element of type xs:string may be empty. This is not true for all data types; an empty xs:int or xs:date element, for example, is invalid.

The previous XSD definitions are shown graphically in Liquid Studio as follows:

[Figure: Defining Elements]

The valid data values for the element in the XML document can be further constrained using the fixed and default properties.

Default means that if the element appears in the XML document without a value (i.e. it is empty), then the application reading the document, typically an XML parser or XML Data Binding Library, should use the default specified in the XSD.

Fixed means the value in the XML document can only have the value specified in the XSD.

For this reason it does not make sense to use both default and fixed in the same element definition, and it is invalid to do so.

<xs:element name="Customer_name"
            type="xs:string"
            default="unknown" />
<xs:element name="Customer_location"
            type="xs:string"
            fixed="UK" />
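
As a rough illustration of how a validating, schema-aware parser is expected to treat these two declarations, consider the instance fragments below. They are illustrative only, reuse the element names above, and assume the fixed value is exactly UK with no surrounding whitespace:

```xml
<!-- Empty element: the schema default "unknown" is used as the value -->
<Customer_name></Customer_name>

<!-- Explicit content overrides the default -->
<Customer_name>Jane Smith</Customer_name>

<!-- Valid: the content matches the fixed value -->
<Customer_location>UK</Customer_location>

<!-- Invalid: the content differs from the fixed value -->
<Customer_location>France</Customer_location>
```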

Tip: To add an Element in the Liquid Studio graphical XSD view, select menu item Edit->Add Child->Element (Ctrl+Shift+E) or use the corresponding toolbar button.

Specifying Element Cardinality

It is possible to constrain the number of instances (cardinality) of an XML element that appear in an XML document. The cardinality is specified using the minOccurs and maxOccurs attributes, which allow an element to be specified as mandatory, optional, or repeatable up to a set number of times. The default value for both minOccurs and maxOccurs is 1. Therefore, if both the minOccurs and maxOccurs attributes are absent, as in all the previous examples, the element must appear once and once only.

minOccurs can be assigned any non-negative integer value (e.g. 0, 1, 2, 3... etc.), and maxOccurs can be assigned any non-negative integer value or the special string constant "unbounded" meaning there is no maximum so the element can occur an unlimited number of times.

Sample XSD:

<xs:element name="Customer_dob"
            type="xs:date" />

Description: If we do not specify minOccurs or maxOccurs, then the default values of 1 are used. This means there has to be one and only one occurrence of Customer_dob, i.e. it is mandatory.

Sample XSD:

<xs:element name="Customer_order"
            type="xs:integer"
            minOccurs="0"
            maxOccurs="unbounded" />

Description: If we set minOccurs to 0, then the element is optional. Here, a customer can have from 0 to an unlimited number of Customer_orders.

Sample XSD:

<xs:element name="Customer_hobbies"
            type="xs:string"
            minOccurs="2"
            maxOccurs="10" />

Description: Setting both minOccurs and maxOccurs means the element Customer_hobbies must appear at least twice, but no more than 10 times.
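
Note that minOccurs and maxOccurs are only permitted on elements declared inside a complex type (complex types and compositors are covered later in this article), not on global elements declared directly under <xs:schema>. The following sketch places the three declarations above in context; the parent element name "Customer" is an assumption made for illustration:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <!-- "Customer" is a hypothetical parent used to host the local elements -->
    <xs:element name="Customer">
        <xs:complexType>
            <xs:sequence>
                <!-- Mandatory: exactly one occurrence -->
                <xs:element name="Customer_dob" type="xs:date" />
                <!-- Optional and repeatable without limit -->
                <xs:element name="Customer_order" type="xs:integer"
                            minOccurs="0" maxOccurs="unbounded" />
                <!-- Between two and ten occurrences -->
                <xs:element name="Customer_hobbies" type="xs:string"
                            minOccurs="2" maxOccurs="10" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
```

A document that should satisfy these cardinality rules:

```xml
<Customer>
    <Customer_dob>2000-01-12</Customer_dob>
    <Customer_order>1001</Customer_order>
    <Customer_order>1002</Customer_order>
    <Customer_hobbies>Chess</Customer_hobbies>
    <Customer_hobbies>Cycling</Customer_hobbies>
</Customer>
```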

These XSD definitions can be shown graphically in Liquid Studio as follows:

[Figure: Specifying Element Cardinality]

Defining Simple Types

It is possible to create a new simpleType by restricting an existing simpleType, allowing you to define your own data types. You can restrict a built-in type (xs:string, xs:integer, xs:date etc.) or one of your own previously defined simpleTypes.

Example uses:

  • Defining an ID, which may be an integer with a maximum value limit.
  • Restricting a Postcode or Zip code to ensure it is the correct length and complies with a regular expression.
  • Defining a field to have a maximum length.

Creating your own types is covered more thoroughly in Part 2 - Best Practices, Conventions and Recommendations.
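
As a brief sketch of the three uses listed above, the following schema restricts built-in types using facets. The type names (OrderIdType, ZipCodeType, ShortNameType) and the specific limits are assumptions chosen for illustration:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <!-- An ID: an integer limited to the range 1 to 99999 -->
    <xs:simpleType name="OrderIdType">
        <xs:restriction base="xs:integer">
            <xs:minInclusive value="1" />
            <xs:maxInclusive value="99999" />
        </xs:restriction>
    </xs:simpleType>

    <!-- A five digit Zip code, checked with a regular expression -->
    <xs:simpleType name="ZipCodeType">
        <xs:restriction base="xs:string">
            <xs:pattern value="[0-9]{5}" />
        </xs:restriction>
    </xs:simpleType>

    <!-- A string of at most 50 characters -->
    <xs:simpleType name="ShortNameType">
        <xs:restriction base="xs:string">
            <xs:maxLength value="50" />
        </xs:restriction>
    </xs:simpleType>

    <!-- A user defined type is referenced just like a built-in one -->
    <xs:element name="OrderID" type="OrderIdType" />
</xs:schema>
```

With this sketch, <OrderID>5756</OrderID> should validate, whereas <OrderID>0</OrderID> should be rejected because it falls below the minInclusive limit.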

Tip: To add a Simple Type in the Liquid Studio graphical XSD view, select menu item Edit->Add Child->Simple Type (Ctrl+Shift+S) or use the corresponding toolbar button.

Defining Complex Types

An xs:complexType provides the definition for an XML element; it specifies which elements and attributes are permitted, along with the rules regarding where they can appear and how many times. Complex types can be defined inline within an element definition, or named and defined globally (but more about this later).

Example use:

Here are some simple element definitions:

<xs:element name="Customer_dob"      type="xs:date" />
<xs:element name="Customer_address"  type="xs:string" />
<xs:element name="Supplier_phone"    type="xs:integer" />
<xs:element name="Supplier_address"  type="xs:string" />

We can see that some of these elements should really be represented as child elements: "Customer_dob" and "Customer_address" should belong to a parent element "Customer", while "Supplier_phone" and "Supplier_address" should belong to a parent element "Supplier". We can therefore re-write this in a more structured way:

<xs:element name="Customer">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Dob" type="xs:date" />
            <xs:element name="Address" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
</xs:element>
<xs:element name="Supplier">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Phone" type="xs:integer" />
            <xs:element name="Address" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

The previous XSD definitions are shown graphically in Liquid Studio as follows:

[Figure: Defining Child Complex Types]

What's changed?

  • We created a definition for an element called "Customer".
  • Inside the <xs:element> definition we added a <xs:complexType>. This is a container for other <xs:element> definitions, allowing us to build a simple hierarchy of elements in the resulting XML document.
  • Note the "Customer" and "Supplier" elements themselves do not have a type attribute. Their inline <xs:complexType> does not extend or restrict an existing type; it is a new definition built from scratch.
  • The <xs:complexType> element contains another new element <xs:sequence>, but more on these in a minute.
  • The <xs:sequence> in turn contains the definitions for the two child elements "Dob" and "Address". Note the customer/supplier prefix has been removed as it is implied from its position within the parent element "Customer" or "Supplier".

So in plain English this is saying we can have an XML document that contains an element <Customer> which must have two child elements <Dob> and <Address>.

Example XML (each root element would be a separate XML document)

<Customer>
    <Dob>2000-01-12</Dob>
    <Address>34 thingy street, someplace, sometown, ww1 8uu</Address>
</Customer>
<Supplier>
    <Phone>0123987654</Phone>
    <Address>22 whatever place, someplace, sometown, ss1 6gy</Address>
</Supplier>

Tip: To add an xs:complexType in the Liquid Studio graphical XSD view, select menu item Edit->Add Child->Complex Type (Ctrl+Shift+C) or use the corresponding toolbar button.

Defining Compositors

Compositors provide rules that determine how, and in what order, their children can appear within an XML document. There are three types of compositor: <xs:sequence>, <xs:choice> and <xs:all>.

Compositor descriptions:

  • Sequence: The child elements in the XML document MUST appear in the order they are declared in the XSD schema.
  • Choice: Only one of the child elements described in the XSD schema can appear in the XML document.
  • All: The child elements described in the XSD schema can appear in the XML document in any order (in XSD 1.0, each at most once).

Notes

The compositors <xs:sequence> and <xs:choice> can be nested inside other compositors, and be given their own minOccurs and maxOccurs properties. This allows for quite complex combinations to be formed.
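
To make the nesting concrete, here is a small sketch. The element names (Contact, Name, Phone, Email, Size, Width, Height) are hypothetical and are not part of the running Customer/Supplier example:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Contact">
        <xs:complexType>
            <xs:sequence>
                <!-- Name must come first -->
                <xs:element name="Name" type="xs:string" />
                <!-- Then between one and three contact methods,
                     each being either a Phone or an Email -->
                <xs:choice minOccurs="1" maxOccurs="3">
                    <xs:element name="Phone" type="xs:string" />
                    <xs:element name="Email" type="xs:string" />
                </xs:choice>
            </xs:sequence>
        </xs:complexType>
    </xs:element>

    <xs:element name="Size">
        <xs:complexType>
            <!-- Width and Height may appear in either order -->
            <xs:all>
                <xs:element name="Width" type="xs:decimal" />
                <xs:element name="Height" type="xs:decimal" />
            </xs:all>
        </xs:complexType>
    </xs:element>
</xs:schema>
```

Under this sketch a Contact containing a Name followed by an Email and then a Phone should be valid, because the choice is allowed to repeat up to three times.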

Example

The definitions of "Customer->Address" and "Supplier->Address" are currently not very usable as they are grouped into a single field. In the real world it would be better to break this out into a few fields. Let's fix this using the same technique shown above:

<xs:element name="Customer">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Dob" type="xs:date" />
            <xs:element name="Address">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="Line1" type="xs:string" />
                        <xs:element name="Line2" type="xs:string" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>
<xs:element name="Supplier">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Phone" type="xs:integer" />
            <xs:element name="Address">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="Line1" type="xs:string" />
                        <xs:element name="Line2" type="xs:string" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>

The previous XSD definitions are shown graphically in Liquid Studio as follows:

[Figure: Defining Compositors]

This is much better, but we now have two definitions for address, which are identical.

Defining Global Complex Types

An xs:complexType can also be defined globally and given a name. Named xs:complexTypes can then be re-used throughout the schema, either referenced directly or used as the basis to define other xs:complexTypes. This makes it possible to build more object oriented data structures that are easier to work with and manage.

Looking at our example again, it would make much more sense to have a single definition for "Address", which could then be used by both customer and supplier. We can do this by defining a global (named) xs:complexType:

<xs:complexType name="AddressType">
    <xs:sequence>
        <xs:element name="Line1" type="xs:string" />
        <xs:element name="Line2" type="xs:string" />
    </xs:sequence>
</xs:complexType>

The previous XSD definitions are shown graphically in Liquid Studio as follows:

[Figure: Defining Global Complex Types]

We have now defined a <xs:complexType> that describes our representation of an address, so let's use it. Earlier, when we started looking at elements, we said you could define your own types instead of using one of the standard types such as xs:string or xs:integer, and that is exactly what we're now doing.

<xs:element name="Customer">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Dob" type="xs:date" />
            <xs:element name="Address" type="AddressType" />
        </xs:sequence>
    </xs:complexType>
</xs:element>
<xs:element name="Supplier">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Phone" type="xs:integer" />
            <xs:element name="Address" type="AddressType" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

The previous XSD definitions are shown graphically in Liquid Studio as follows:

[Figure: Referencing Global Complex Types]

Hopefully, the advantages are obvious. Instead of having to define Address twice (once for Customer and once for Supplier) we now have a single definition. This makes maintenance simpler, i.e. if you decide to add "Line3" or "Postcode" elements to your address you only have to add them in one place.

Example XML

<Customer>
    <Dob>2000-01-12</Dob>
    <Address>
        <Line1>34 thingy street, someplace</Line1>
        <Line2>sometown, ww1 8uu</Line2>
    </Address>
</Customer>
<Supplier>
    <Phone>0123987654</Phone>
    <Address>
        <Line1>22 whatever place, someplace</Line1>
        <Line2>sometown, ss1 6gy</Line2>
    </Address>
</Supplier>

Notes

Note: Only complex types defined globally (as children of the <xs:schema> element) can have their own name and be re-used throughout the schema. If they are defined inline within an <xs:element> they cannot have a name (they are anonymous) and cannot be reused elsewhere.
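
The introduction to this section mentions using a named type as the basis for other types. A minimal sketch of that idea uses <xs:extension> to add an element to AddressType; the names UKAddressType, Postcode and Depot are hypothetical and not part of the tutorial's example:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <!-- The global type defined earlier in this section -->
    <xs:complexType name="AddressType">
        <xs:sequence>
            <xs:element name="Line1" type="xs:string" />
            <xs:element name="Line2" type="xs:string" />
        </xs:sequence>
    </xs:complexType>

    <!-- A derived type: everything in AddressType, followed by a Postcode -->
    <xs:complexType name="UKAddressType">
        <xs:complexContent>
            <xs:extension base="AddressType">
                <xs:sequence>
                    <xs:element name="Postcode" type="xs:string" />
                </xs:sequence>
            </xs:extension>
        </xs:complexContent>
    </xs:complexType>

    <xs:element name="Depot">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Address" type="UKAddressType" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
```

In an instance document the inherited elements come first, so the Address inside Depot would contain Line1, Line2 and then Postcode, in that order.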

Defining XML Attributes

An attribute provides extra information within an element. Attributes have name and type properties and are defined within an XSD as follows:

<xs:attribute name="x" type="y" />

An Attribute can appear 0 or 1 times within a given element in the XML document. Attributes are either optional or mandatory (by default they are optional). The "use" property in the XSD definition is used to specify if the attribute is optional or mandatory.

So the following are equivalent:

<xs:attribute name="ID"  type="xs:string" />
<xs:attribute name="ID"  type="xs:string"  use="optional" />

The previous XSD definitions are shown graphically in Liquid Studio as follows:

[Figure: Defining Attributes]

To specify that an attribute must be present, use="required" (Note: use may also be set to "prohibited", but we'll come to that later).

An attribute is specified within an xs:complexType; the type information for the attribute comes from an xs:simpleType (either defined inline or via a reference to a built-in or user-defined xs:simpleType definition). The type information describes the data the attribute can contain in the XML document, e.g. string, integer, date etc. Attributes can also be specified globally and then referenced (but more about this later).
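
As a sketch of the inline case, the attribute below carries its own anonymous xs:simpleType instead of a type reference. The names Status and Item, and the enumeration values, are assumptions for illustration. Note that attribute declarations are placed after the compositor within the xs:complexType:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Order">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Item" type="xs:string" maxOccurs="unbounded" />
            </xs:sequence>
            <!-- No type attribute here: the type is defined inline -->
            <xs:attribute name="Status" use="required">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:enumeration value="open" />
                        <xs:enumeration value="shipped" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:attribute>
        </xs:complexType>
    </xs:element>
</xs:schema>
```

A matching instance might be <Order Status="open"><Item>Widget</Item></Order>, while any Status value other than "open" or "shipped" should fail validation.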

Sample XSD:

<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID"
                      type="xs:int" />
    </xs:complexType>
</xs:element>

Sample XML:

<Order OrderID="6" />
- or no attribute -
<Order />

Sample XSD:

<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID"
                      type="xs:int"
                      use="optional" />
    </xs:complexType>
</xs:element>

Sample XML:

<Order OrderID="6" />
- or no attribute -
<Order />

Sample XSD:

<xs:element name="Order">
    <xs:complexType>
        <xs:attribute name="OrderID"
                      type="xs:int"
                      use="required" />
    </xs:complexType>
</xs:element>

Sample XML:

<Order OrderID="6" />

The default and fixed attributes can be specified within the XSD attribute specification (in the same way as they are for elements).
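
For example, a sketch along the following lines declares one attribute with a default and one with a fixed value. The attribute names Currency and Version and their values are hypothetical. A default can only be combined with an optional attribute, since a required attribute is always present:

```xml
<xs:element name="Order">
    <xs:complexType>
        <!-- If Currency is omitted, a schema-aware parser supplies "GBP" -->
        <xs:attribute name="Currency" type="xs:string" default="GBP" />
        <!-- If Version is present, it must be exactly "1.0" -->
        <xs:attribute name="Version" type="xs:string" fixed="1.0" />
    </xs:complexType>
</xs:element>
```

With this declaration, <Order /> and <Order Currency="EUR" Version="1.0" /> should both be valid, whereas <Order Version="2.0" /> should be rejected.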

Tip: To add an Attribute in the Liquid Studio graphical XSD view, select menu item Edit->Add Child->Attribute or use the corresponding toolbar button.

XML Element Mixed Content

So far we have seen how an element can contain data, other elements and attributes. Elements can also contain a combination of all of these, with text data mixed in between child elements. You can specify this in the XSD schema by setting the mixed property.

<xs:element name="MarkedUpDesc">
    <xs:complexType mixed="true">
        <xs:sequence>
            <xs:element name="Bold" type="xs:string" />
            <xs:element name="Italic" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

A sample XML document could look like this:

<MarkedUpDesc>
        This is an <Bold>Example</Bold> of <Italic>Mixed</Italic> Content,
        Note there are elements mixed in with the element's data.
</MarkedUpDesc>
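
Because the definition above uses <xs:sequence>, it requires exactly one <Bold> followed by exactly one <Italic>. For free-form markup, where the inline elements may appear any number of times and in any order, one possible variation (a sketch, not the tutorial's own definition) is a repeating <xs:choice>:

```xml
<xs:element name="MarkedUpDesc">
    <xs:complexType mixed="true">
        <!-- Zero or more Bold or Italic elements, in any order,
             interleaved with text -->
        <xs:choice minOccurs="0" maxOccurs="unbounded">
            <xs:element name="Bold" type="xs:string" />
            <xs:element name="Italic" type="xs:string" />
        </xs:choice>
    </xs:complexType>
</xs:element>
```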

Notes

Mixed content works well for some types of data (HTML being the obvious example), but is very difficult to work with programmatically and it will cause issues with many productivity tools. It's very rarely needed (and not seen at all in most XSD standards), so if you find yourself thinking of using it, I suggest you examine your design again and make sure you really do need it!
