Tuesday, March 14, 2017

XML Schema Primer

1 Introduction

Basic Concepts: The Purchase Order (§2) covers the basic mechanisms of XML Schema. It describes how to declare the elements and attributes that appear in XML documents, the distinctions between simple and complex types, defining complex types, the use of simple types for element and attribute values, schema annotation, a simple mechanism for re-using element and attribute definitions, and nil values.

Advanced Concepts I: Namespaces, Schemas & Qualification (§3) explains the basics of how namespaces are used in XML and schema documents.

2 Basic Concepts: The Purchase Order

The purpose of XML schemas

  • a schema defines a class of XML documents (is a description of an XML document)
  • an instance document is an XML document that conforms to a particular schema

The instance document, the po.xml file, describes a purchase order that may be generated by a product ordering application.

Example
The Purchase Order, po.xml
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
   <shipTo country="US">
      <name>Alice Smith</name>
      <street>123 Maple Street</street>
      <city>Mill Valley</city>
      <state>CA</state>
      <zip>90952</zip>
   </shipTo>
   <billTo country="US">
      <name>Robert Smith</name>
      <street>8 Oak Avenue</street>
      <city>Old Town</city>
      <state>PA</state>
      <zip>95819</zip>
   </billTo>
   <comment>Hurry, my lawn is going wild!</comment>
   <items>
      <item partNum="872-AA">
         <productName>Lawnmower</productName>
         <quantity>1</quantity>
         <USPrice>148.95</USPrice>
         <comment>Confirm this is electric</comment>
      </item>
      <item partNum="926-AA">
         <productName>Baby Monitor</productName>
         <quantity>1</quantity>
         <USPrice>39.98</USPrice>
         <shipDate>1999-05-21</shipDate>
      </item>
   </items>
</purchaseOrder>

The purchase order's elements have simple types or complex types

  • the purchase order consists of a main element purchaseOrder and the subelements {shipTo, billTo, comment, items}.
  • subelements can contain other subelements or data
  • elements that contain subelements or carry attributes are said to have complex types
  • elements that contain numbers, strings, dates and so on, but no subelements, are said to have simple types; attributes always have simple types

Where to find the definitions of the types in the instance document

  • the complex types in the instance document are defined in the schema for purchase orders
  • the simple types in the instance document are defined either in the schema for purchase orders or are part of the built-in simple types of XML Schema

What is the association between the instance document and the purchase order schema?

  • an instance document does not need to refer to a schema
  • the purchase order does not reference the purchase order schema
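
An author who does want to point a processor at a schema can use the noNamespaceSchemaLocation attribute from the XML Schema instance namespace. The following is only a sketch, assuming the schema is stored in a file named po.xsd next to the instance document; the attribute is a hint, and a processor is free to ignore it or to locate the schema by other means.

Example
The Purchase Order with a Schema Location Hint (illustrative, not part of po.xml)
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:noNamespaceSchemaLocation="po.xsd">
   <!-- shipTo, billTo, comment and items as in po.xml -->
</purchaseOrder>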

2.1 The Purchase Order Schema

The purchase order schema is contained in the file po.xsd:

Example
The Purchase Order Schema, po.xsd
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>

  <xsd:element name="comment" type="xsd:string"/>

  <xsd:complexType name="PurchaseOrderType">
    <xsd:sequence>
      <xsd:element name="shipTo" type="USAddress"/>
      <xsd:element name="billTo" type="USAddress"/>
      <xsd:element ref="comment" minOccurs="0"/>
      <xsd:element name="items"  type="Items"/>
    </xsd:sequence>
    <xsd:attribute name="orderDate" type="xsd:date"/>
  </xsd:complexType>

  <xsd:complexType name="USAddress">
    <xsd:sequence>
      <xsd:element name="name"   type="xsd:string"/>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city"   type="xsd:string"/>
      <xsd:element name="state"  type="xsd:string"/>
      <xsd:element name="zip"    type="xsd:decimal"/>
    </xsd:sequence>
    <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
  </xsd:complexType>

  <xsd:complexType name="Items">
    <xsd:sequence>
      <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="productName" type="xsd:string"/>
            <xsd:element name="quantity">
              <xsd:simpleType>
                <xsd:restriction base="xsd:positiveInteger">
                  <xsd:maxExclusive value="100"/>
                </xsd:restriction>
              </xsd:simpleType>
            </xsd:element>
            <xsd:element name="USPrice"  type="xsd:decimal"/>
            <xsd:element ref="comment"   minOccurs="0"/>
            <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
          </xsd:sequence>
          <xsd:attribute name="partNum" type="SKU" use="required"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:simpleType name="SKU">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{3}-[A-Z]{2}"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>

What elements constitute the purchase order schema?

  • the purchase order schema consists of a schema element and a variety of subelements, most notably element, complexType and simpleType
  • the schema's subelements determine the appearance of elements and their content in instance documents

The po schema qualifies the names of the elements using a namespace

  • the schema element contains a standard namespace declaration:
    • <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    • the namespace declaration associates the xsd: prefix with the XML namespace identified by the URI http://www.w3.org/2001/XMLSchema
  • the xsd: prefix appears on the schema elements and on the names of the built-in simple types
  • the purpose of this association is to identify the elements and the simple types as belonging to the vocabulary of the XML Schema language rather than the vocabulary of the author of the schema

2.2 Complex Type Definitions, Element & Attribute Declarations

        2.2.1 Occurrence Constraints
        2.2.2 Global Elements & Attributes
        2.2.3 Naming Conflicts

Complex types vs simple types:

  • simple types cannot have element content and cannot carry attributes
  • complex types allow elements in their content and may carry attributes

Definitions vs declarations:

  • definitions create new types, both simple and complex
  • declarations enable elements and attributes with specific names and types to appear in document instances

The aim of the chapter is defining new complex types and declaring the elements and attributes that appear within them

How to define new complex types

  • to define complex types use the xsd:complexType element and include a set of element declarations, element references and attribute declarations.
  • elements are declared using the xsd:element element, and attributes are declared using the xsd:attribute element
  • the declarations within the complex type are not themselves types, but associations between a name and the constraints that govern the appearance of that name in documents governed by the associated schema.

USAddress complex type definition example:

  • the definition of the USAddress incorporates the declarations of five elements and one attribute
Example
Defining the USAddress Type
<xsd:complexType name="USAddress" >
  <xsd:sequence>
    <xsd:element name="name"   type="xsd:string"/>
    <xsd:element name="street" type="xsd:string"/>
    <xsd:element name="city"   type="xsd:string"/>
    <xsd:element name="state"  type="xsd:string"/>
    <xsd:element name="zip"    type="xsd:decimal"/>
  </xsd:sequence>
  <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
</xsd:complexType>

Consequence of the USAddress definition in instance documents

  • any element whose type is declared to be USAddress (e.g. shipTo and billTo) must contain five elements and may carry one attribute
  • these five elements must be called name, street, city, state and zip
  • these five elements must appear in the same sequence as they are declared
  • elements whose type is declared to be USAddress may carry an attribute called country which must contain the token "US"

PurchaseOrderType complex type definition example:

  • the USAddress definition contains element declarations involving only simple types
  • in contrast, the PurchaseOrderType definition contains element declarations involving the USAddress complex type
Example
Defining PurchaseOrderType
<xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
    <xsd:element name="shipTo" type="USAddress"/>
    <xsd:element name="billTo" type="USAddress"/>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items"  type="Items"/>
  </xsd:sequence>
  <xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

Consequence of the PurchaseOrderType complex type definition example:

  • the element declarations for shipTo and billTo associate these names with the USAddress type.
  • as a consequence, each element of type PurchaseOrderType in the instance document must contain two elements named shipTo and billTo, each containing five subelements and optionally carrying the country attribute.

Attribute declaration example

  • the USAddress type definition contains a declaration of the country attribute, whose type is the simple type NMTOKEN
  • the PurchaseOrderType type definition contains a declaration of the orderDate attribute, whose type is the simple type date
  • attribute declarations must always reference simple types

Declaration that references another existing element

Example
<xsd:element ref="comment" minOccurs="0"/>

The PurchaseOrderType type definition contains an element declaration that refers to the global comment element.

  • an element declaration usually associates a name with an existing type definition; sometimes, instead of declaring a new element, it references an existing one.
  • in general, the value of the ref attribute must reference a global element, that is, one declared as a child of the schema element.

2.2.1 Occurrence Constraints

How to constrain the number of occurrences of elements

  • the minOccurs and maxOccurs attributes on element declarations constrain the number of occurrences
  • <xsd:element (name="name" type="type" | ref="ref") minOccurs="m" maxOccurs="n|unbounded"/>
  • the minOccurs attribute determines the minimum number of times an element may appear; its value m is a non-negative integer
  • for example, the comment element is optional within PurchaseOrderType because its minOccurs attribute is equal to 0
  • the maxOccurs attribute determines the maximum number of times an element may appear; its value is a non-negative integer n, not smaller than m, or the term unbounded
  • the default value for both minOccurs and maxOccurs is 1, so if both attributes are omitted the element must appear exactly once
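
The combinations described above can be sketched as follows. The element names are hypothetical and do not belong to po.xsd; the fragment is assumed to sit inside the definition of some complex type.

Example
Occurrence Constraints on Element Declarations (illustrative)
<xsd:sequence>
  <!-- exactly once: minOccurs and maxOccurs both default to 1 -->
  <xsd:element name="title" type="xsd:string"/>
  <!-- optional: zero or one occurrence -->
  <xsd:element name="subtitle" type="xsd:string" minOccurs="0"/>
  <!-- between one and three occurrences -->
  <xsd:element name="author" type="xsd:string" maxOccurs="3"/>
  <!-- any number of occurrences, including none -->
  <xsd:element name="keyword" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>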

How to constrain the number of occurrences of attributes

  • an attribute may appear once or not at all; unlike an element, it cannot appear several times
  • the use attribute on attribute declarations constrains whether an attribute appears
  • the use attribute indicates whether the attribute is required, optional or prohibited; the default value of use is optional
  • for example, in po.xsd the partNum attribute is declared as required
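
As a rough illustration of the use attribute, consider the following type definition. ProductType and its color and weight attributes are hypothetical names chosen for this sketch; only partNum echoes the purchase order schema, and it is given the plain string type here to keep the fragment self-contained.

Example
Required and Optional Attributes (illustrative)
<xsd:complexType name="ProductType">
  <xsd:sequence>
    <xsd:element name="productName" type="xsd:string"/>
  </xsd:sequence>
  <!-- must appear on every element of this type -->
  <xsd:attribute name="partNum" type="xsd:string" use="required"/>
  <!-- may appear once or be omitted -->
  <xsd:attribute name="color" type="xsd:string" use="optional"/>
  <!-- use omitted: same effect as use="optional" -->
  <xsd:attribute name="weight" type="xsd:decimal"/>
</xsd:complexType>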

Default values for elements and attributes

  • the default attribute on element and attribute declarations specifies a default value
  • the schema processor treats defaulted attributes and defaulted elements slightly differently
  • default attribute on an element declaration
    • if the element appears in the instance document without any content, the schema processor provides the default value
    • if the element does not appear in the instance document, the schema processor does not provide a value at all
    • if the element already has a value in the document, the default value does not apply
  • default attribute on an attribute declaration (meaningful only when the attribute is optional)
    • if the attribute does not appear in the instance document, the schema processor provides the default value
    • if the attribute already appears in the instance document, the default value does not apply
  • in short, default attribute values apply when attributes are missing, and default element values apply when elements are empty
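
The difference between defaulted elements and defaulted attributes can be sketched with a small declaration and three instances. The note element, its priority subelement and its lang attribute are hypothetical names used only for this illustration.

Example
Default Values for an Element and an Attribute (illustrative)
<xsd:element name="note">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="priority" type="xsd:string" default="normal" minOccurs="0"/>
    </xsd:sequence>
    <xsd:attribute name="lang" type="xsd:language" default="en"/>
  </xsd:complexType>
</xsd:element>

<!-- priority is empty, so it takes the value "normal"; lang is missing, so it takes the value "en" -->
<note><priority/></note>
<!-- priority is absent, so no value is provided for it; lang keeps the value "it" -->
<note lang="it"/>
<!-- priority keeps the value "high"; lang is missing, so it takes the value "en" -->
<note><priority>high</priority></note>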

Fixed values for elements and attributes

  • the fixed attribute is used in attribute and element declarations to ensure that the attributes and elements are set to particular values
  • NB: default values and fixed values are mutually exclusive; a declaration cannot contain both the default and the fixed attribute
  • for example, po.xsd contains a declaration for the country attribute, with a fixed value US
    • if the attribute appears, its value must be US
    • if the attribute does not appear the schema processor will provide a country attribute with the value US

2.2.2 Global Elements & Attributes

Global elements and global attributes are created by declarations that appear as children of the schema element

  • global element declarations in po.xsd allow the declared elements to appear at the top level of an instance document
    • for this reason the purchaseOrder element appears as the top-level element in the instance document po.xml
  • global elements and global attributes can be referenced by multiple local declarations that use the ref attribute
  • for example, in the PurchaseOrderType complex type definition, the comment element declaration references the global comment declaration
    • this allows a comment element to appear in the instance document po.xml at the same level as the shipTo, billTo and items elements

Global element declarations warnings:

  • global declarations cannot contain the ref attribute; they must use the type attribute or contain an anonymous type definition
  • global declarations cannot contain the occurrence constraints minOccurs, maxOccurs and use.
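
A minimal sketch of these rules follows. NoteType and the global lang attribute are hypothetical; the point is only that the occurrence constraints sit on the local references, never on the global declarations.

Example
Global Declarations and Local References (illustrative)
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- global declarations: the type is named directly; no ref, minOccurs, maxOccurs or use -->
  <xsd:element name="comment" type="xsd:string"/>
  <xsd:attribute name="lang" type="xsd:language"/>

  <xsd:complexType name="NoteType">
    <xsd:sequence>
      <!-- local reference to the global element: occurrence constraints go here -->
      <xsd:element ref="comment" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
    <!-- local reference to the global attribute: use goes here -->
    <xsd:attribute ref="lang" use="required"/>
  </xsd:complexType>

</xsd:schema>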

2.2.3 Naming Conflicts

Naming conflicts may occur if we give the same name to two elements, types or attributes.

When same names cause problems

  • a simple type and a complex type with the same name -> conflict
  • a type definition and an element or attribute declaration with the same name -> no conflict
  • two element declarations with the same name within two different type definitions -> no conflict
  • two types with the same name in different namespaces -> no conflict
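
The first three cases can be sketched in one schema. All names here (address, product, name) are hypothetical; the conflicting definition is left commented out so that the rest of the fragment stays valid.

Example
Names That Do and Do Not Conflict (illustrative)
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- no conflict: a type and an element may share the name "address" -->
  <xsd:complexType name="address">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="address" type="address"/>

  <!-- no conflict: a second local element called "name", declared inside a different type -->
  <xsd:complexType name="product">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:token"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- conflict if enabled: a simple type may not reuse the name of the complex type "address"
  <xsd:simpleType name="address">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>
  -->

</xsd:schema>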

2.3 Simple Types

        2.3.1 List Types
        2.3.2 Union Types

The purchase order schema declares many elements and attributes that have simple types. These fall into two groups:

  • simple types built in to XML Schema, for example string and decimal
  • simple types derived from the built-in types
    • for example the SKU type derived from the string built-in type

Examples of the simple types built in to XML Schema

Example
Simple types for strings
<xsd:element name="customer-string" type="xsd:string"/>
<xsd:element name="customer-normalized" type="xsd:normalizedString"/> (1)
<xsd:element name="customer-token" type="xsd:token"/> (2)

<customer-string>Joe Blogs</customer-string> 
<customer-normalized>Joe Blogs</customer-normalized>
<customer-token>Joe Blogs</customer-token>
(1) Newline, tab and carriage-return characters are converted to space characters before schema processing
(2) As for normalizedString; in addition, adjacent space characters are collapsed to a single space character, and leading and trailing spaces are removed before schema processing

Example
Numeric simple types
<xsd:element name="price-decimal" type="xsd:decimal"/>
<xsd:element name="price-integer" type="xsd:integer"/>
<xsd:element name="price-double" type="xsd:double"/>

<price-decimal>+999.5450</price-decimal> 
<price-integer>+999</price-integer>
<price-double>34.78E-2</price-double>

Example
Simple types for dates
<xsd:element name="birth-date" type="xsd:date"/>
<xsd:element name="wakeUp-time" type="xsd:time"/> 
<xsd:element name="my-datetime" type="xsd:dateTime"/>
<xsd:element name="rotationPeriod-duration" type="xsd:duration"/>

<birth-date>1970-01-24</birth-date>
<wakeUp-time>07:00:00</wakeUp-time>
<my-datetime>1970-01-24T06:00:00</my-datetime>
<rotationPeriod-duration>PT24H</rotationPeriod-duration>

Example
Other simple types
<xsd:attribute name="sale" type="xsd:boolean"/>
<xsd:element name="blob" type="xsd:base64Binary"/> 
<xsd:element name="site-URI" type="xsd:anyURI"/> 

<price-integer sale="true">+999</price-integer>
<blob>Y2lhbyBtb25kbw==</blob>
<site-URI>https://www.ibm.com/support/knowledgecenter/</site-URI>

You can define new simple types by deriving them from existing simple types, both built-in and derived

Derive new simple types by restricting existing simple types:

  • the legal range of values for the new type is a subset of the existing type's range of values
  • use the simpleType element to define and name the new simple type
  • use the restriction element to indicate the existing (base) type and to identify the "facets" that constrain the range of values

Example: the myInteger simple type is derived by restriction from the built-in simple type integer, the base type

  • we constrain the values using two facets called minInclusive and maxInclusive
Example
Defining myInteger, Range 10000-99999
<xsd:simpleType name="myInteger">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="10000"/>
    <xsd:maxInclusive value="99999"/>
  </xsd:restriction>
</xsd:simpleType>

Example: the SKU simple type (contained in the po schema) is derived by restriction from the built-in simple type string, the base type

  • we constrain the values using a facet called pattern.
Example
Defining the Simple Type "SKU"
<xsd:simpleType name="SKU">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="\d{3}-[A-Z]{2}"/>
  </xsd:restriction>
</xsd:simpleType>

Example: the USState simple type is derived by restriction from the built-in simple type string, the base type

  • we constrain the values using the enumeration facet
  • the enumeration facet can be used to constrain the values of every simple type except the boolean type
  • the enumeration facet limits a simple type to a set of distinct values: in this case one of the US states abbreviations.
Example
Using the Enumeration Facet
<xsd:simpleType name="USState">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="AK"/>
    <xsd:enumeration value="AL"/>
    <xsd:enumeration value="AR"/>
    <!-- and so on ... -->
  </xsd:restriction>
</xsd:simpleType>

2.3.1 List Types

XML Schema has the concept of atomic types, list types and union types, which are collectively called simple types

atomic types:
atomic types are those whose values the specification regards as indivisible
list types:
list types are those having values each of which consists of a sequence of values of an atomic type
union types:
union types are those whose values and lexical values are the union of the values and lexical values of one or more datatypes

Example of atomic type: the NMTOKEN is an atomic type

  • the NMTOKEN value US is indivisible, no part of the value US, such as the character "S", has any meaning by itself

Example of list type: the NMTOKENS is a list type

  • a value of this type is a white-space separated list of NMTOKENs, such as "IT ES CH"
  • the three XML Schema built-in list types are: NMTOKENS, IDREFS, and ENTITIES.

Creating list types from existing types

  • you can create new list types from existing atomic types
  • you cannot create new list types from existing list types or complex types

Example: creating a new list type from the myInteger atomic type

Example
Creating a List of myInteger's
<xsd:simpleType name="listOfMyIntType">
  <xsd:list itemType="myInteger"/>
</xsd:simpleType>

In an instance document a conforming element is:

Example
<listOfMyInt>20003 15037 95977 95945</listOfMyInt>

Deriving new list types by restricting existing list types

  • several facets can be applied to list types: length, minLength, maxLength, pattern, and enumeration

Example: derive the SixUSStates list type from the USStateList type, employing the length facet

Example
List Type for Six US States
<xsd:simpleType name="USStateList">
  <xsd:list itemType="USState"/>
</xsd:simpleType>

<xsd:simpleType name="SixUSStates">
  <xsd:restriction base="USStateList">
    <xsd:length value="6"/>
  </xsd:restriction>
</xsd:simpleType>

Elements whose type is SixUSStates must have six items, and each of the six items must be one of the (atomic) values of the enumerated type USState, for example:

Example
<sixStates>PA NY CA NY LA AK</sixStates>

2.3.2 Union Types

Union types enable the value of an element or attribute to be one or more instances of one type drawn from the union of multiple atomic and list types.

Example of union type: the zipUnion type is the union of the two types USState and listOfMyIntType

  • the memberTypes attribute value is a list of all the types in the union
Example
Union Type for Zip Codes
<xsd:simpleType name="zipUnion">
  <xsd:union memberTypes="USState listOfMyIntType"/>
</xsd:simpleType>

Assuming we have declared an element called zips of type zipUnion, valid instances of the element are:

Example
<zips>CA</zips>
<zips>95630 95977 95945</zips>
<zips>AK</zips>

2.4 Anonymous Type Definitions

How to construct a schema

  • define a number of named types and then declare elements that reference those types using the type= construction
    • for example, the PurchaseOrderType type and the purchaseOrder element
  • if a type is referenced only once, it can instead be defined as an anonymous type
    • for example, the definition of the Items type contains two element declarations, item and quantity, that use anonymous types
    • you can recognize the use of an anonymous type:
      • by the lack of the type= syntax in the element/attribute declaration
      • by the presence of a nested (simple or complex) type definition that lacks the name= syntax
    • the item element declaration has an anonymous complex type consisting of the elements {productName, quantity, USPrice, comment, shipDate} and the partNum attribute
Example
Two Anonymous Type Definitions
<xsd:complexType name="Items">
  <xsd:sequence>
    <xsd:element name="item" minOccurs="0" maxOccurs="unbounded"> 
      <xsd:complexType>       <!-- anonymous complex type -->
        <xsd:sequence>
          <xsd:element name="productName" type="xsd:string"/>
          <xsd:element name="quantity"> 
            <xsd:simpleType>   <!-- anonymous simple type -->
              <xsd:restriction base="xsd:positiveInteger">
                <xsd:maxExclusive value="100"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="USPrice"  type="xsd:decimal"/>
          <xsd:element ref="comment"   minOccurs="0"/>
          <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
        </xsd:sequence>
        <xsd:attribute name="partNum" type="SKU" use="required"/>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>

3 Advanced Concepts I: Namespaces, Schemas & Qualification

XML schema's target namespace

  • a schema can be viewed as a collection (vocabulary) of type definitions and element declarations
  • the names of all type definitions and element declarations in a schema belong to the same namespace, called the target namespace
  • target namespaces allow XML parsers to distinguish between definitions and declarations from different vocabularies that happen to share the same name but have different meanings
  • for example, target namespaces make it possible to distinguish between the declaration for the set element in the SVG vocabulary and the declaration for the set element in the MathML vocabulary, because they belong to different target namespaces

Schema validations and target namespaces

  • schema validation is the process that checks that an instance document conforms to one or more schemas
  • schema validation employs target namespaces to identify which element and attribute declarations and type definitions in the schemas should be used to check elements and attributes in the instance document

The schema author can decide how elements and attributes declared in a schema should appear in an instance document.

  • the author can decide whether or not the appearance of locally declared elements and attributes must be qualified by a namespace
  • elements and attributes in an instance document may be qualified by a namespace, either explicitly with a prefix or implicitly through a default namespace.

3.1 Target Namespaces & Unqualified Locals

The new version of the po schema, po1.xsd:

  • declares a target namespace
  • specifies that locally defined elements and attributes must be unqualified

How a schema declares a target namespace

  • the targetNamespace attribute on the schema element indicates the target namespace of the po1.xsd schema
  • <schema ...
      targetNamespace="http://www.example.com/PO1"

How a schema specifies qualification of local elements and attributes

  • the elementFormDefault and attributeFormDefault attributes on the schema element specify qualification globally
  • the form attribute specifies qualification for each local declaration
  • each of these attributes may be set to unqualified or qualified; the default value is unqualified
    • the value indicates whether locally declared elements and attributes must be qualified or unqualified in instance documents.
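
How the form attribute interacts with the schema-wide default can be sketched as follows. This fragment is hypothetical and only borrows the USAddress name and the namespace URI from the purchase order example; it is not part of the primer's schema.

Example
Overriding elementFormDefault on One Local Declaration (illustrative)
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.example.com/PO1"
        elementFormDefault="unqualified">

  <complexType name="USAddress">
    <sequence>
      <!-- follows elementFormDefault: must be unqualified in instance documents -->
      <element name="name"   type="string"/>
      <!-- form overrides the default for this declaration only: must be qualified -->
      <element name="street" type="string" form="qualified"/>
    </sequence>
  </complexType>

</schema>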

The po1.xsd schema globally specifies the qualification of elements and attributes by setting the elementFormDefault and attributeFormDefault attributes

  • <schema ...
      elementFormDefault="unqualified"
      attributeFormDefault="unqualified"
  • these settings are redundant because unqualified is already the default value of both attributes
Example
Purchase Order Schema with Target Namespace, po1.xsd
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1"
        elementFormDefault="unqualified"
        attributeFormDefault="unqualified">

  <element name="purchaseOrder" type="po:PurchaseOrderType"/>
  <element name="comment"       type="string"/>

  <complexType name="PurchaseOrderType">
    <sequence>
      <element name="shipTo"    type="po:USAddress"/>
      <element name="billTo"    type="po:USAddress"/>
      <element ref="po:comment" minOccurs="0"/>
      <!-- etc. -->
    </sequence>
    <!-- etc. -->
  </complexType>

  <complexType name="USAddress">
    <sequence>
      <element name="name"   type="string"/>
      <element name="street" type="string"/>
      <!-- etc. -->
    </sequence>
  </complexType>

  <!-- etc. -->

</schema>

How type definitions populate the schema's target namespace:

  • the USAddress type definition places the USAddress type in the schema's target namespace
  • the PurchaseOrderType type definition likewise places the PurchaseOrderType type in the schema's target namespace
  • references to these types and to global elements from within declarations are prefixed, i.e. po:USAddress and po:comment
  • a standard namespace declaration associates the prefix po: with the namespace http://www.example.com/PO1
    • <schema ...
                xmlns:po = "http://www.example.com/PO1"
         targetNamespace = "http://www.example.com/PO1"
    • the namespace bound to the po: prefix is the same as the target namespace
  • a schema processor that reads these prefixed references therefore knows to look in this schema for the definition of the USAddress type and the declaration of the comment element

How element declarations populate the schema's target namespace:

  • the purchaseOrder and comment global element declarations include these elements in the target namespace
  • in the purchaseOrder element declaration the type, po:PurchaseOrderType, is prefixed
  • in contrast, in the comment element declaration the type value, string, is not prefixed

The default namespace of the po schema: po1.xsd

  • the schema element contains a default namespace declaration:
       <schema xmlns="http://www.w3.org/2001/XMLSchema"
  • unprefixed types and elements are associated with the default namespace
    • in <complexType name="USAddress">, complexType is in the default namespace
    • in <element name="shipTo" type="po:USAddress"/>, element is in the default namespace
  • the default namespace is the target namespace of XML Schema itself
  • a schema processor that reads the po schema knows to look in the schema of XML Schema for the definition of the type string and the declaration of the element called element. The schema of XML Schema is also known as the "schema for schemas".

Let us now examine how the target namespace of the schema affects a conforming instance document:

Example
A Purchase Order with Unqualified Locals, po1.xml
<?xml version="1.0"?>
<apo:purchaseOrder xmlns:apo="http://www.example.com/PO1" orderDate="1999-10-20">
  <shipTo country="US">
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <!-- etc. -->
  </shipTo>
  <billTo country="US">
    <name>Robert Smith</name>
    <street>8 Oak Avenue</street>
    <!-- etc. -->
  </billTo>
  <apo:comment>Hurry, my lawn is going wild!</apo:comment>
  <!-- etc. -->
</apo:purchaseOrder>

How does declaring a target namespace affect a conforming instance document?

  • the previous version of the purchase order po.xml does not have any namespace
  • the new version of the purchase order po1.xml contains a standard namespace declaration with the prefix apo:
    • <apo:purchaseOrder xmlns:apo="http://www.example.com/PO1">
    • the declaration associates the namespace http://www.example.com/PO1 with the prefix apo:
    • the namespace is the same as the target namespace of the schema in po1.xsd
  • the apo: prefix is used to qualify the elements purchaseOrder and comment
  • a processor of the instance document therefore knows to look in the po schema for the declarations of purchaseOrder and comment
  • the target namespace is so named because it is the namespace that qualified elements in the instance document target; the schema with that target namespace controls the validation of those elements
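
The instance document can also suggest where the schema for a namespace may be found, through the schemaLocation attribute of the XML Schema instance namespace. The sketch below assumes that the schema is stored in a file named po1.xsd next to the instance; the attribute value pairs a namespace name with a location, and it remains a hint that a processor may ignore.

Example
A Schema Location Hint for a Target Namespace (illustrative)
<?xml version="1.0"?>
<apo:purchaseOrder xmlns:apo="http://www.example.com/PO1"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:schemaLocation="http://www.example.com/PO1 po1.xsd"
                   orderDate="1999-10-20">
  <!-- etc. -->
</apo:purchaseOrder>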

Qualified global elements and unqualified locals

  • the apo: prefix is applied only to the purchaseOrder and comment elements, which are declared globally in the schema
  • the elementFormDefault setting requires that the apo: prefix is not applied to locally declared elements such as shipTo, billTo, name and street
  • the attributeFormDefault setting requires that the apo: prefix is not applied to any of the attributes (which are all declared locally)

If local elements and attributes are not required to be qualified, how hard is it to write an instance document?

  • if only the root element is global, it is a simple matter to qualify only the root element
  • if all the elements are declared globally, then all elements can be prefixed
  • if there is no uniform pattern of declarations, the author of an instance document must have a detailed knowledge of the schema

3.2 Qualified Locals

Requiring locally declared elements to be qualified:

  • set the elementFormDefault attribute of the schema element to qualified
  • qualification of elements and qualification of attributes can be required independently of each other
Example
Modifications to po1.xsd for Qualified Locals
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified">

  <element name="purchaseOrder" type="po:PurchaseOrderType"/>
  <element name="comment"       type="string"/>

  <complexType name="PurchaseOrderType">
    <!-- etc. -->
  </complexType>

  <!-- etc. -->

</schema>

How all the elements in a conforming instance document are to be qualified:

  • prefix all the elements with the namespace prefix for explicit qualification
  • alternatively, use a default namespace for implicit qualification as in po2.xml

elements explicitly qualified:

Example
A Purchase Order with Explicitly Qualified Locals
<?xml version="1.0"?>
<apo:purchaseOrder xmlns:apo="http://www.example.com/PO1"
                   orderDate="1999-10-20">
  <apo:shipTo country="US">
    <apo:name>Alice Smith</apo:name>
    <apo:street>123 Maple Street</apo:street>
    <!-- etc. -->
  </apo:shipTo>
  <apo:billTo country="US">
    <apo:name>Robert Smith</apo:name>
    <apo:street>8 Oak Avenue</apo:street>
    <!-- etc. -->
  </apo:billTo>
  <apo:comment>Hurry, my lawn is going wild!</apo:comment>
  <!-- etc. -->
</apo:purchaseOrder>

Elements implicitly qualified by a default namespace:

  • all the elements in the instance belong to the same default namespace, hence, it is unnecessary to explicitly prefix any of the elements.
Example
A Purchase Order with Default Qualified Locals, po2.xml
<?xml version="1.0"?>
<purchaseOrder xmlns="http://www.example.com/PO1"
               orderDate="1999-10-20">
  <shipTo country="US">
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <!-- etc. -->
  </shipTo>
  <billTo country="US">
    <name>Robert Smith</name>
    <street>8 Oak Avenue</street>
    <!-- etc. -->
  </billTo>
  <comment>Hurry, my lawn is going wild!</comment>
  <!-- etc. -->
</purchaseOrder>

Attributes, like elements, must be qualified in instance documents when:

  • either the attributeFormDefault schema attribute is set to qualified
  • or the attributes are declared globally in the schema

How to prefix attributes in instance documents

  • attributes that are required to be qualified must be explicitly prefixed
  • the Namespaces in XML specification does not provide a mechanism for defaulting the namespaces of attributes
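
The lack of attribute defaulting can be sketched with an instance fragment. It assumes a hypothetical variant of the schema in which attributeFormDefault is set to qualified (this variant is not one of the Primer's schemas). The elements pick up the default namespace, but the attributes must still carry an explicit prefix bound to the same namespace.

Example
Sketch: Default Namespace for Elements, Explicit Prefix for Attributes
<?xml version="1.0"?>
<!-- assumes attributeFormDefault="qualified" in the schema (hypothetical variant) -->
<purchaseOrder xmlns="http://www.example.com/PO1"
               xmlns:po="http://www.example.com/PO1"
               po:orderDate="1999-10-20">
  <!-- the default namespace applies to shipTo, but never to its attributes -->
  <shipTo po:country="US">
    <name>Alice Smith</name>
    <!-- etc. -->
  </shipTo>
  <!-- etc. -->
</purchaseOrder>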

Example: a schema that uses the form attribute to require qualification of a single attribute declaration:

  • the schema below requires the locally declared publicKey attribute to be qualified in instance documents
  • notice that the value of the form attribute overrides the value of the attributeFormDefault attribute for the publicKey attribute only
  • the form attribute can be applied to an element declaration in the same manner
Example
Requiring Qualification of Single Attribute
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified">
  <!-- etc. -->
  <element name="secure">
    <complexType>
      <sequence>
        <!-- element declarations -->
      </sequence>
      <attribute name="publicKey" type="base64Binary" form="qualified"/>
    </complexType>
  </element>
</schema>

and a conforming instance document

Example
Instance with a Qualified Attribute
<?xml version="1.0"?>
<purchaseOrder xmlns="http://www.example.com/PO1"
               xmlns:po="http://www.example.com/PO1"
               orderDate="1999-10-20">
  <!-- etc. -->
  <secure po:publicKey="GpM7">
    <!-- etc. -->
  </secure>
</purchaseOrder>
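
The form attribute works the same way on an element declaration. The sketch below is an illustration only: the note element is a hypothetical name that does not appear in the Primer's schema. It shows a single local element exempted from the qualified default.

Example
Sketch: Overriding elementFormDefault for a Single Element
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified">
  <!-- etc. -->
  <element name="secure">
    <complexType>
      <sequence>
        <!-- hypothetical local element: form overrides elementFormDefault for note only -->
        <element name="note" type="string" form="unqualified"/>
      </sequence>
    </complexType>
  </element>
</schema>

In a conforming instance, secure is prefixed while note stays in no namespace:

Example
Sketch: Instance with One Unqualified Local Element
<?xml version="1.0"?>
<po:secure xmlns:po="http://www.example.com/PO1">
  <note>for internal use</note>
</po:secure>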

3.3 Global vs. Local Declarations

Schema declaring only global elements

  • if all element names are unique within a namespace, you can construct a schema in which all elements are declared globally
  • in a schema with only global declarations you can omit the elementFormDefault and attributeFormDefault attributes, because their values become irrelevant
  • declaring every element globally in a schema is similar to using <!ELEMENT> in a DTD
Example
Modified version of po1.xsd using only global element declarations
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1">

  <element name="purchaseOrder" type="po:PurchaseOrderType"/>

  <element name="shipTo"  type="po:USAddress"/>
  <element name="billTo"  type="po:USAddress"/>
  <element name="comment" type="string"/>

  <element name="name" type="string"/>
  <element name="street" type="string"/>

  <complexType name="PurchaseOrderType">
    <sequence>
      <element ref="po:shipTo"/>
      <element ref="po:billTo"/>
      <element ref="po:comment" minOccurs="0"/>
      <!-- etc. -->
    </sequence>
  </complexType>

  <complexType name="USAddress">
    <sequence>
      <element ref="po:name"/>
      <element ref="po:street"/>
      <!-- etc. -->
    </sequence>
  </complexType>

  <!-- etc. -->

</schema>

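Which schemas (with a target namespace) validate the purchase order with default qualified locals, po2.xml?

  • the purchase order with a default namespace is schema valid against:
    • the version of po1.xsd with qualified locals
    • the version of po1.xsd with only global element declarations
  • both schema approaches can validate the same namespace-defaulted document, but when all elements are global you cannot take advantage of local names: two elements with the same name cannot be declared in one namespace.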
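
What local declarations allow, and an all-global schema does not, can be sketched as follows. The book, author and AuthorType names are hypothetical and chosen only for illustration. The same target namespace holds two local title elements with different types, which is impossible if title has to be a single global declaration.

Example
Sketch: Two Local Elements Sharing the Name title
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1"
        elementFormDefault="qualified">

  <element name="book">
    <complexType>
      <sequence>
        <!-- first local title: a plain string -->
        <element name="title"  type="string"/>
        <element name="author" type="po:AuthorType"/>
      </sequence>
    </complexType>
  </element>

  <complexType name="AuthorType">
    <sequence>
      <!-- second local title: an enumeration, distinct from the book title -->
      <element name="title">
        <simpleType>
          <restriction base="string">
            <enumeration value="Mr"/>
            <enumeration value="Mrs"/>
            <enumeration value="Ms"/>
          </restriction>
        </simpleType>
      </element>
      <element name="name" type="string"/>
    </sequence>
  </complexType>

</schema>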

3.4 Undeclared Target Namespaces

The first po schema po.xsd does not declare a target namespace and the first purchase order does not declare a namespace.

  • what is the target namespace in the first po schema?
  • how to reference the undeclared namespace?

Consequences for the po schema that does not use a target namespace

  • the schema po.xsd does not declare a target namespace
  • the schema po.xsd does not declare a namespace prefix associated with the schema's target namespace
  • as a consequence, the schema's own definitions and declarations are referenced without namespace qualification: no explicit or implicit namespace prefix is applied to the references
  • for example, the purchaseOrder element is declared with the unprefixed type reference PurchaseOrderType
  • in contrast, all the XML Schema elements and types are qualified with the xsd: prefix associated with the XML Schema namespace

Recommendations for schemas that do not use a target namespace

  • all XML Schema elements and types should be explicitly qualified with a prefix associated with the XML Schema namespace
    • for example, the po.xsd schema qualifies elements and types with the xsd: prefix
    • if XML Schema elements and types are instead associated with the XML Schema namespace through a default namespace declaration, then references to XML Schema types are not distinguishable from references to user-defined types
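
A minimal sketch of this convention, in the style of po.xsd with the content of the type elided: the unprefixed reference PurchaseOrderType is in no namespace, while xsd:string carries the XML Schema prefix, so the two kinds of reference cannot be confused.

Example
Sketch: A Schema with No Target Namespace
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- user-defined type: referenced without a prefix -->
  <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>

  <!-- built-in type: referenced with the xsd: prefix -->
  <xsd:element name="comment" type="xsd:string"/>

  <xsd:complexType name="PurchaseOrderType">
    <!-- etc. -->
  </xsd:complexType>

</xsd:schema>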

Schemas with no target namespace and instance documents with unqualified elements

  • element declarations from a schema with no target namespace validate unqualified elements in instance documents
  • unqualified elements are those for which no namespace is provided, either by an explicit prefix or by a default namespace declaration (xmlns)
  • to validate an XML document that does not use namespaces, you must provide a schema with no target namespace
    • there are many XML documents that do not use namespaces and many schemas without target namespaces
    • you must be sure to give your processor a schema document whose vocabulary matches that of the instance document
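
One way to point a processor at such a schema is the xsi:noNamespaceSchemaLocation attribute. The sketch below assumes that po.xsd sits next to the instance document. The location is only a hint, and a processor may obtain the schema by other means.

Example
Sketch: Hinting at a Schema with No Target Namespace
<?xml version="1.0"?>
<!-- the relative location po.xsd is an assumption made for this sketch -->
<purchaseOrder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:noNamespaceSchemaLocation="po.xsd"
               orderDate="1999-10-20">
  <!-- etc. -->
</purchaseOrder>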
