Recursion in XML Schema

by Michael Szul on


XML Schema is a powerful validation tool for XML documents, and it is virtually a requirement if you accept third-party XML as incoming data for a web service. Many shy away from the technology because of its verbosity and complexity, but it offers the granularity necessary for fine-tuned validation.

XML allows many different markup structures, including elements nested inside elements of the same name. For example, maybe you have a <container> element that can exist within itself:

<container>
    <container>
        <container />
    </container>
</container>

It can be difficult at first to work out how to validate this kind of self-nesting with XML Schema. It requires a global element declaration that is then referenced from within itself (and from elsewhere in the schema).

Below is an example:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="container">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="container" minOccurs="0" maxOccurs="unbounded" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>

    <xs:element name="containers">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="container" minOccurs="0" maxOccurs="unbounded" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

One thing you'll notice right away is that the <container> element is declared as a direct child of <xs:schema>, which is what makes XML Schema consider it a global element. You can only ref a global element. The global declaration carries the name attribute, while every other place where the element is allowed to appear uses ref instead.
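
To see why the global declaration matters, here is a rough sketch of a variation that should not work. It is a hypothetical rewrite of the schema above, not something from a real project: <container> is declared locally inside <containers>, so there is no global declaration for the inner ref to point at, and a schema processor would be expected to report an unresolved reference.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="containers">
        <xs:complexType>
            <xs:sequence>
                <!-- Local declaration: it exists only inside <containers> -->
                <xs:element name="container" minOccurs="0" maxOccurs="unbounded">
                    <xs:complexType>
                        <xs:sequence>
                            <!-- Does not resolve: no global "container" is declared -->
                            <xs:element ref="container" minOccurs="0" maxOccurs="unbounded" />
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Moving the <container> declaration back up to be a direct child of <xs:schema> is what makes the reference resolvable again.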

The second <xs:element> declaration is the root of our document: <containers>. It is within this structure that we make our first reference to the global element, and the global element, in turn, references itself, creating the recursive validation.
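
As an illustration, here is a small instance document that should validate against the schema above. The particular nesting is made up for the example. Because both references use minOccurs="0" and maxOccurs="unbounded", any number of <container> elements can appear at any depth, including none at all.

<containers>
    <container>
        <container>
            <container />
        </container>
        <container />
    </container>
    <container />
</containers>

An empty <containers /> element would be accepted as well, since the sequence allows zero occurrences of <container>.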