The Differences Between XML and HTML
In this article, we first explain XML terminology and concepts by reviewing a simple example. Next, we discuss the differences between HTML and XML programming. Along the way, you will learn essential tips for working with XML, so if you are coming from a web design background, you will find our practical tips and advice very helpful.
What Is XML?
XML is essentially an agreement among people to store and share textual data using standard methods. XML stands for eXtensible Markup Language, while XSL stands for eXtensible Stylesheet Language. Below is an example of a simple XML document.
<?xml version="1.0" encoding="utf-8"?>
<HiWorld>
  <Message>Here is my first XML document!</Message>
</HiWorld>
The first line, the XML declaration, identifies the file as an XML document and states its version and character encoding. The rest of the document is plain text whose structure is defined by tags enclosed in angle brackets.
While data is stored in XML documents, XSL documents describe how to transform XML documents into other types of documents (such as HTML, plain text, or even other XML). The transformation language is called XSLT, short for XSL Transformations. XML documents can be parsed in a variety of programming languages, such as PHP, Java, and Python. To better understand how XML works, let’s consider a very simple piece of data:
Tom Johnson 52 Maryland
Given just this information, you wouldn't know what it was for or what it meant.
XML provides a way to mark up this data so that it can be interpreted by other people, as well as other computer programs. So, the data above might be marked up like this:
<BROKERINFO>
  <BROKERNAME>Tom Johnson</BROKERNAME>
  <BROKERAGE>52</BROKERAGE>
  <BROKERSTATE>Maryland</BROKERSTATE>
</BROKERINFO>
Now you know a lot more about the meaning of this information. With the XML tags added, you can tell that Tom Johnson is a broker, that he is 52 years old, and that he operates in Maryland. The markup might look a lot like HTML to you, yet it doesn't work exactly like HTML.
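Because the tags name each value, a program can pick the fields out by name. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree module; the read_broker helper and the dictionary keys are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# The broker record from the example above, held in a string.
BROKER_XML = """
<BROKERINFO>
  <BROKERNAME>Tom Johnson</BROKERNAME>
  <BROKERAGE>52</BROKERAGE>
  <BROKERSTATE>Maryland</BROKERSTATE>
</BROKERINFO>
"""


def read_broker(xml_text):
    """Return the broker fields as a dict (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    return {
        "name": root.findtext("BROKERNAME"),
        "age": int(root.findtext("BROKERAGE")),
        "state": root.findtext("BROKERSTATE"),
    }


broker = read_broker(BROKER_XML)
print(broker)
```

The element names carry the meaning, so the code never has to guess which value is the age and which is the state.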
XML does not tell us how content should look; instead, it gives content context and meaning. In the next section, we discuss differences between XML and HTML coding.
Differences Between XML and HTML
Since HTML is a markup language like XML, they have many similarities, but there are a few key differences between HTML and XML. Here are some fundamental differences you should be aware of:
- With HTML, small errors in syntax are often ignored by browsers, whereas an XML parser must reject a document that is not well-formed.
- HTML has only predefined tags, whereas XML tags are created by the author. An XML document can be structured logically, with the author choosing the appropriate structure, while HTML has a predefined "head" and "body" structure. HTML tags are also sometimes used together with a specific JavaScript API, such as HTML5 Geolocation, or with features such as HTML5 custom data attributes.
- XML is not always useful on its own. Translating it to different forms (such as HTML) is one of its great powers.
- XML has different rules from HTML because XML was created to serve a different purpose.
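As an illustration of translating XML into another form, the sketch below turns the broker record used earlier in this article into an HTML table. It is plain Python rather than XSLT, and the broker_to_html function name and the table layout are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

BROKER_XML = (
    "<BROKERINFO>"
    "<BROKERNAME>Tom Johnson</BROKERNAME>"
    "<BROKERAGE>52</BROKERAGE>"
    "<BROKERSTATE>Maryland</BROKERSTATE>"
    "</BROKERINFO>"
)


def broker_to_html(xml_text):
    """Render each child element as one table row (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    rows = []
    for child in root:
        # Escape tag names and text so the output is safe to embed in HTML.
        rows.append(
            "<tr><th>{}</th><td>{}</td></tr>".format(
                escape(child.tag), escape(child.text or "")
            )
        )
    return "<table>" + "".join(rows) + "</table>"


html_table = broker_to_html(BROKER_XML)
print(html_table)
```

The XML holds the data and its meaning; the presentation is decided only at the point where the HTML is generated.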
Most of the differences between HTML and XML syntax serve to make parsing XML documents faster and easier. It is also worth mentioning that mistakes in XML are relatively more consequential than those made in HTML.
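The difference in error tolerance can be seen with Python's standard library, which ships both a lenient HTML parser and a strict XML parser. This is only a sketch: the TagCollector class, the is_well_formed helper, and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every start tag the lenient HTML parser encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def is_well_formed(text):
    """Return True if the strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Two paragraphs whose closing tags were left out.
snippet = "<p>First paragraph<p>Second paragraph"

collector = TagCollector()
collector.feed(snippet)
collector.close()
html_tags = collector.tags  # the HTML parser carries on without complaint

print(html_tags)
print(is_well_formed(snippet))
print(is_well_formed("<p>First paragraph</p>"))
```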
You should be aware of some common pitfalls that many novice XML programmers run into. As an experienced HTML programmer, you're probably used to HTML's more flexible syntax structure. If you're converting existing HTML to XML (defining some HTML tags in your XML specification), you may see some problems. Here are six areas in which HTML coding differs from XML.
XML treats white space very differently from HTML. In HTML, white space (spaces, newlines, tabs, and other "white" characters) is largely ignored: browsers collapse runs of it when rendering. This is not the case in XML, where a parser passes every character, including white space, on to the application. If you are using XML only to produce HTML output, there is little need to worry about white space, but for strictly XML applications it becomes an issue.
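A short sketch of how white space survives parsing, using Python's xml.etree.ElementTree. The extra spaces in the sample document are added here for illustration, and collapsing them afterwards is an application-level choice, not something the parser does.

```python
import xml.etree.ElementTree as ET

# Indentation and extra spaces are deliberately included in the source text.
source = (
    "<HiWorld>\n"
    "  <Message>  Here is   my first XML document!  </Message>\n"
    "</HiWorld>"
)

root = ET.fromstring(source)
message = root.find("Message")

raw_text = message.text           # spaces inside the element are preserved
indentation = root.text           # white space before the first child element
after_message = message.tail      # white space after the closing Message tag

# An application that wants HTML-like behaviour must collapse the runs itself.
collapsed = " ".join(raw_text.split())

print(repr(raw_text))
print(repr(indentation))
print(repr(after_message))
print(repr(collapsed))
```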
All tags must be closed in XML, so an opening tag with no matching closing tag is an error. There is a simple fix for elements that have no content: a single tag can serve as both the opening and the closing tag if it ends with /> instead of >.
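The closing rule can be checked quickly in Python. In this sketch the note and br element names are hypothetical; the br element stands in for any tag that HTML authors often leave unclosed.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if the text is accepted by the XML parser."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


unclosed = "<note>line one<br>line two</note>"      # br is never closed
self_closed = "<note>line one<br/>line two</note>"  # br ends with />

print(parses(unclosed))
print(parses(self_closed))

# When serializing, ElementTree writes an empty element in the short form.
empty_element = ET.tostring(ET.Element("br"), encoding="unicode")
print(empty_element)
```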
All XML tags must be nested correctly. In HTML, incorrect nesting is often tolerated. Consider the example below:
<?xml version="1.0"?>
<div><span></div></span>
The above code is wrong because the span tag is not closed before the div tag. In other words, even though this nesting error can be tolerated by browsers in an HTML page, it is an error in XML. Structure is extremely important in XML, so important that your document will not be processed if the structure is incorrect. Let's change the code and make it work:
<?xml version="1.0"?>
<div><span></span></div>
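A parser can also report where the nesting goes wrong. The sketch below assumes the two div and span arrangements described above and uses the position attribute of ParseError; the first_error helper is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def first_error(text):
    """Return the (line, column) of the first parse error, or None."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err.position
    return None


badly_nested = "<div><span></div></span>"   # div closes while span is open
well_nested = "<div><span></span></div>"    # inner element closes first

bad_position = first_error(badly_nested)
good_position = first_error(well_nested)

print(bad_position)
print(good_position)
```

The rule of thumb is last opened, first closed: the parser tracks open elements as a stack and rejects a closing tag that does not match the top of it.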
In addition to nesting elements correctly, XML requires a single root element, which acts as a wrapper that contains all the other elements. Suppose we change our previous XML code as shown below:
<?xml version="1.0"?>
<div><span></span></div>
<div><span></span></div>
The code is wrong because there is no wrapper or containing element. It is fixed below by wrapping everything in an html element.
<?xml version="1.0"?>
<html>
  <div><span></span></div>
  <div><span></span></div>
</html>
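The single-root rule can be demonstrated the same way. In this sketch the wrap_in_root helper is hypothetical, and wrapping by string formatting is only suitable for a trusted fragment like this one.

```python
import xml.etree.ElementTree as ET

# Two sibling elements with nothing containing them.
two_roots = "<div><span></span></div><div><span></span></div>"


def wrap_in_root(fragment, root_tag="html"):
    """Enclose a fragment in a single root element (hypothetical helper)."""
    return "<{0}>{1}</{0}>".format(root_tag, fragment)


try:
    ET.fromstring(two_roots)
    rejected = False
except ET.ParseError:
    rejected = True  # the parser refuses content after the first root

wrapped_root = ET.fromstring(wrap_in_root(two_roots))
child_tags = [child.tag for child in wrapped_root]

print(rejected)
print(wrapped_root.tag, child_tags)
```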
Another major difference between HTML and XML is capitalization. In HTML, browsers tolerate tags in upper case, lower case, or even a combination of the two. This is not the case with XML, which is case sensitive. When you write XML, adopt either lowercase or uppercase tag names and use them consistently: if you open a tag in uppercase like <BODY>, make sure to close it with </BODY>, not </body>.
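Case sensitivity affects both parsing and lookups. A small sketch, reusing the HiWorld document from the start of this article; the lowercase lookup is there to show that it finds nothing.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<HiWorld><Message>hello</Message></HiWorld>")

exact_match = doc.find("Message")   # same case as in the document
wrong_case = doc.find("message")    # different case, so no element matches

try:
    ET.fromstring("<BODY></body>")
    mixed_case_ok = True
except ET.ParseError:
    mixed_case_ok = False  # BODY and body are two different names in XML

print(exact_match is not None)
print(wrong_case)
print(mixed_case_ok)
```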
Another difference between HTML and XML may be in the syntax of your inline style attributes. Let’s consider the following XML example:
<?xml version="1.0"?> <span>
Although the above code may work on a web page and can be processed by a web browser, it is wrong for an XML document, because every attribute value in XML, including a style attribute, must be enclosed in quotation marks, as fixed below:
<?xml version="1.0"?> <span>
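The quoting rule can be verified with a parser. The style value color:red below is a made-up example, not taken from the original listing; either single or double quotes are accepted.

```python
import xml.etree.ElementTree as ET

unquoted = "<span style=color:red>text</span>"         # tolerated in HTML only
double_quoted = '<span style="color:red">text</span>'  # valid XML
single_quoted = "<span style='color:red'>text</span>"  # also valid XML

try:
    ET.fromstring(unquoted)
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False

double_value = ET.fromstring(double_quoted).get("style")
single_value = ET.fromstring(single_quoted).get("style")

print(unquoted_ok)
print(double_value, single_value)
```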
In this article, we reviewed what XML is and in what ways writing an XML document differs from writing HTML. In particular, if you are coming from a web design background, you may not be used to paying attention to issues such as tag nesting, white space, and tag case sensitivity. Yet all of those issues can cause errors in an XML document.
Another important takeaway from this article is that you define your own tag names and data structure in XML, as opposed to using HTML's preset tags, so before you start writing XML you need some foresight about what you want to achieve with your document. Note that these differences also apply to HTML written with front-end frameworks such as Bootstrap.
Now that you have learned the differences between XML and HTML coding, it is time to practice by creating a few XML documents. To start, you can use a text editor that highlights your XML errors.
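Besides writing documents by hand, you can generate them from code, which avoids most well-formedness mistakes. The sketch below rebuilds the HiWorld example in Python; it assumes Python 3.8 or later, where ET.tostring accepts the xml_declaration argument.

```python
import xml.etree.ElementTree as ET

# Build the tree element by element instead of typing the tags.
root = ET.Element("HiWorld")
message = ET.SubElement(root, "Message")
message.text = "Here is my first XML document!"

# Serialize to bytes, including the XML declaration on the first line.
xml_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))

# Parsing the result again confirms that the output is well-formed.
round_trip = ET.fromstring(xml_bytes)
print(round_trip.findtext("Message"))
```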