34 XML format

XML is a markup language, similar to HTML (which is used to build web pages), that is defined and maintained by the World Wide Web Consortium (W3C). Its design goals emphasize simplicity, generality, and usability across the Internet. Although XML was designed for documents, it is also widely used to represent arbitrary data structures, for example when exchanging data between computer systems. A typical XML file has the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<Exemplo>
  <Localidade número="1">
   <Continente>África</Continente>
   <País>Angola</País>
   <Capital>Luanda</Capital>
  </Localidade>
  <Localidade número="2">
   <Continente>América do Norte</Continente>
   <País>Estados Unidos</País>
   <Capital>Washington DC</Capital>
  </Localidade>
  <Localidade número="3">
   <Continente>América Central</Continente>
   <País>México</País>
   <Capital>Cidade do México</Capital>
  </Localidade>
  <Localidade número="4">
   <Continente>América do Sul</Continente>
   <País>Brasil</País>
   <Capital>Brasília</Capital>
  </Localidade>
  <Localidade número="5">
   <Continente>Europa</Continente>
   <País>Espanha</País>
   <Capital>Madri</Capital>
  </Localidade>
  <Localidade número="6">
   <Continente>Europa</Continente>
   <País>Alemanha</País>
   <Capital>Berlim</Capital>
  </Localidade>
  <Localidade número="7">
   <Continente>Oceania</Continente>
   <País>Austrália</País>
   <Capital>Camberra</Capital>
  </Localidade>
  <Localidade número="8">
   <Continente>Ásia</Continente>
   <País>Japão</País>
   <Capital>Tóquio</Capital>
  </Localidade>
</Exemplo>
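As a rough illustration of how a program might read such a file, the sketch below parses a shortened copy of the example (two of the eight entries) with Python's standard xml.etree.ElementTree module. The variable XML_TEXT and the helper read_capitals are names invented for this sketch; they are not part of any product or of the original example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example document (entries 4 and 5 only).
XML_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<Exemplo>
  <Localidade número="4">
   <Continente>América do Sul</Continente>
   <País>Brasil</País>
   <Capital>Brasília</Capital>
  </Localidade>
  <Localidade número="5">
   <Continente>Europa</Continente>
   <País>Espanha</País>
   <Capital>Madri</Capital>
  </Localidade>
</Exemplo>
"""


def read_capitals(xml_text):
    """Return a dict mapping each country (País) to its capital (Capital)."""
    # Encode to bytes so the parser decodes the data using the declared encoding.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        loc.findtext("País"): loc.findtext("Capital")
        for loc in root.findall("Localidade")
    }


capitals = read_capitals(XML_TEXT)
```

Here capitals maps "Brasil" to "Brasília" and "Espanha" to "Madri"; the same function would work unchanged on the full eight-entry file.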

Markup and content

An XML file consists of two kinds of text: markup and content. Generally, strings that constitute "markup" either begin with the character < and end with >, or begin with the character & and end with ;. Strings of characters that are not markup are "content." In the example above, <País> and </País> are markup, while the names of countries, continents and capitals are the "content."
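Because < and & introduce markup, content that contains those characters has to be written with the & ... ; form instead. The following sketch, which assumes Python's standard library and uses a made-up sample string, shows the escaping step and confirms that a parser turns the escaped markup back into the original content.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Made-up content that contains characters reserved for markup.
raw = "Trinidad & Tobago <ilhas>"

# escape() rewrites &, < and > as the entity references &amp;, &lt; and &gt;.
escaped = escape(raw)

# Wrapped in tags, the escaped string is well-formed XML, and the parser
# hands back the original characters as the element's content.
element = ET.fromstring("<País>" + escaped + "</País>")
content = element.text
```

After this runs, content is equal to raw, while escaped holds the form that would actually be stored in the file.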

Tags

Tags are markup constructs that begin with “<” and end with “>”. There are three types of tags:

Start-tags; for example: <Localidade número="1">

End-tags; for example: </Localidade>

Empty-element tags; for example: <line-break />
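An empty-element tag is shorthand for a start-tag followed immediately by its end-tag. The sketch below checks this with Python's standard xml.etree.ElementTree module; the element name line-break and the helper is_empty are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

# "line-break" is a hypothetical element name used only for illustration.
short_form = ET.fromstring("<line-break />")
long_form = ET.fromstring("<line-break></line-break>")


def is_empty(element):
    """True when the element has no text content and no child elements."""
    return element.text is None and len(element) == 0


# Both spellings produce an element with the same name and no content.
both_empty = is_empty(short_form) and is_empty(long_form)
same_name = short_form.tag == long_form.tag

# When written back out, the long form is serialized as an empty-element tag.
serialized = ET.tostring(long_form, encoding="unicode")
```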

Elements

Elements are XML components that either begin with a start-tag and end with a matching end-tag, or consist of a single empty-element tag. The characters between the start-tag and the end-tag, if any, are the element's content, which may contain markup, including other elements, called "child" elements. In the example above, one element is:

<País>Brasil</País>
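The parent/child relationship can be seen by walking an element's children. This is a minimal sketch using Python's standard xml.etree.ElementTree module on one Localidade element copied from the example; the variable names are chosen for this illustration only.

```python
import xml.etree.ElementTree as ET

# One element from the example, with its three child elements.
fragment = """<Localidade número="4">
   <Continente>América do Sul</Continente>
   <País>Brasil</País>
   <Capital>Brasília</Capital>
  </Localidade>"""

parent = ET.fromstring(fragment)

# Iterating over an element yields its direct children in document order.
children = [(child.tag, child.text) for child in parent]
```

Each entry in children pairs a child element's name with its content, for example ("País", "Brasil").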

Attributes

Attributes are name/value pairs that appear within a start-tag or empty-element tag. In the example above, the element <Localidade> has a "número" attribute with a corresponding value:

<Localidade número="8">

The name of the attribute is "número" and its value is "8". An attribute holds a single quoted value, and an attribute name can appear at most once in a given start-tag or empty-element tag.
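Both rules can be checked with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper is_well_formed is a name invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Reading an attribute: the value always comes back as a string.
element = ET.fromstring('<Localidade número="8"/>')
value = element.get("número")


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The same attribute name twice in one tag is rejected by the parser.
duplicate_accepted = is_well_formed('<Localidade número="8" número="9"/>')
```

Here value is the string "8" rather than a number, so a program that needs an integer has to convert it itself, and duplicate_accepted is False.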

XML declaration

XML documents should begin with an XML declaration, which states the XML version and, optionally, the character encoding of the document, as in the following example:

<?xml version="1.0" encoding="UTF-8"?>
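The encoding named in the declaration tells a parser how to turn the file's bytes back into characters. As a small sketch under that assumption, the code below stores the same element in two different encodings and parses each with Python's standard xml.etree.ElementTree module; the function name parse_with_declared_encoding is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_with_declared_encoding(encoding):
    """Encode a tiny document in `encoding`, declare it, and parse the bytes."""
    text = (
        '<?xml version="1.0" encoding="%s"?>'
        "<Capital>Brasília</Capital>" % encoding
    )
    data = text.encode(encoding)  # the bytes as they would be stored on disk
    return ET.fromstring(data).text


from_utf8 = parse_with_declared_encoding("UTF-8")
from_latin1 = parse_with_declared_encoding("ISO-8859-1")
```

The two byte sequences differ (the letter í takes two bytes in UTF-8 and one in ISO-8859-1), yet both results are the same string, because each declaration matches the bytes that follow it.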