Extensible is the word that defines the Internet today. The next-generation
Net will be more interactive. And what is the one thing that could make this
transition possible? XML, a markup language for documents containing
structured information.
Structured information contains both content (sentences, pictures, etc.) and
some indication of what role that content plays (for example, content in a
section heading has a different meaning from content in a footnote, which means
something different than content in a figure caption or content in a database
table, etc.). Almost all documents have some structure. A markup language is a
mechanism to identify structures in a document. The XML specification defines a
standard way to add markup to documents.
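
As a rough illustration of content paired with its role, the Python sketch below parses a small marked-up document and lists each piece of content next to the markup that identifies it. The element names (document, heading, footnote, caption) and the sample text are invented for this example and do not come from any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are invented for illustration.
# The same words appear three times, but the markup gives each a different role.
DOC = """<document>
  <heading>Quarterly results</heading>
  <footnote>Quarterly results</footnote>
  <caption>Quarterly results</caption>
</document>"""


def roles(xml_text):
    """Return (role, content) pairs, one per child element of the root."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


if __name__ == "__main__":
    for role, content in roles(DOC):
        print(f"{role}: {content}")
```

The content is identical in all three places; only the markup tells software whether it is looking at a heading, a footnote, or a caption.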
XML is defined as an application profile of SGML. SGML, the grandfather of
HTML, is the Standard Generalized Markup Language defined by ISO 8879 and based
on concepts developed at IBM. SGML is a comprehensive language for defining
markup languages; HTML, with its hypertext links, is one of them. An SGML
document adheres to a model (its document type definition), which is similar to
a database schema. This means that the document can be stored in a database or
processed by software designed to interpret the model.
As new technologies and new applications emerge, the limitations of existing
languages start to become evident. This happened with FORTRAN when
object-oriented languages such as C++ took over. Now it is HTML’s turn. The
language works fine for marking up Web pages consisting of simple text and
graphics, but it does not meet the demand when it comes to handling streaming
media, interactive applications, database operations, and formatting content for
many different display devices. With XML, one can design a custom markup
language for any application. As new needs and new capabilities come up, one
simply creates the necessary new elements, all within the standard XML framework.
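
The idea of adding new elements inside the same framework can be sketched in Python. The playlist vocabulary below (playlist, track, title, bitrate) is hypothetical and invented for this example. The second version adds a bitrate element for a new need, and code written against the first version still works because it only looks for the elements it knows.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom vocabulary, version 1: tracks that only have a title.
V1 = "<playlist><track><title>Intro</title></track></playlist>"

# Version 2 adds a new <bitrate> element to cover a new need (streaming).
V2 = (
    "<playlist><track><title>Intro</title>"
    "<bitrate unit='kbps'>128</bitrate></track></playlist>"
)


def titles(xml_text):
    """Collect the title of every <track>, ignoring elements it does not know."""
    root = ET.fromstring(xml_text)
    return [track.findtext("title") for track in root.iter("track")]


def bitrates(xml_text):
    """Collect (value, unit) for every <bitrate> element, if any are present."""
    root = ET.fromstring(xml_text)
    return [(b.text, b.get("unit")) for b in root.iter("bitrate")]


if __name__ == "__main__":
    print(titles(V1), titles(V2))
    print(bitrates(V1), bitrates(V2))
```

Both versions are parsed by the same standard XML parser; only the application-level code needs to learn about the new element.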
The alternatives to XML are HTML and SGML. But HTML comes bound to a fixed set
of semantics and does not provide arbitrary structure. SGML provides arbitrary
structure, but is too difficult to implement just for a Web browser.
XML is similar to HTML in many ways. Essentially, XML holds the data for a
page in between tags, which look very similar to HTML tags. The difference is
that in XML you get to "invent" the tags yourself! Additionally, many, but not
all, of the markup tags used in HTML come as a start tag and an end tag. In
XML, every start tag must have a matching end tag (an empty element may be
written with a single self-closing tag). The tags are used to define the data
they contain. In XML the term "element" is used to describe the tags (both
start and end tags) and the data they contain as one unit. For example, a
SECTION element consists of a SECTION start tag, the data it contains, and a
SECTION end tag.
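
A SECTION element of this kind might look like the one in the Python sketch below. The text inside the element is an assumption made for illustration. The sketch also checks the rule about matching start and end tags by asking the standard-library parser whether a string is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Illustrative SECTION element: start tag, data, end tag.
# The text content is invented for this example.
SECTION = "<SECTION>Introduction to XML</SECTION>"


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    element = ET.fromstring(SECTION)
    print(element.tag, "->", element.text)

    print(is_well_formed(SECTION))             # start and end tag match
    print(is_well_formed("<SECTION>No end"))   # end tag missing
    print(is_well_formed("<SECTION/>"))        # self-closing empty element
```

The parser treats the start tag, the data, and the end tag as a single element object, which is exactly the unit the term "element" describes.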
The nesting of tags also occurs in both HTML and XML.
XML uses a nesting method that follows the logical structure of the data. For
instance, let's suppose that our first tag is "ENTERTAINMENT." Our next tag
could be "MOVIES," and under that we can create two elements, "HINDI" and
"ENGLISH". Even without knowing anything about XML, you're likely to realize
that each is nested inside the other: movies belong to the entertainment
industry, and Hindi and English are categories of movies. Thus, HINDI and
ENGLISH nest inside MOVIES, which in turn nests inside ENTERTAINMENT.
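
The nesting described above can be written out and checked with a short Python sketch. The element names come from the text; the indentation of the XML and the helper function outline are choices made for this example.

```python
import xml.etree.ElementTree as ET

# The nesting from the text: HINDI and ENGLISH inside MOVIES,
# and MOVIES inside ENTERTAINMENT.
NESTED = """<ENTERTAINMENT>
  <MOVIES>
    <HINDI></HINDI>
    <ENGLISH></ENGLISH>
  </MOVIES>
</ENTERTAINMENT>"""


def outline(element, depth=0):
    """Return (depth, tag) pairs for element and its descendants, in document order."""
    pairs = [(depth, element.tag)]
    for child in element:
        pairs.extend(outline(child, depth + 1))
    return pairs


if __name__ == "__main__":
    for depth, tag in outline(ET.fromstring(NESTED)):
        print("  " * depth + tag)
```

Each end tag closes the most recently opened element, so the parser recovers the same hierarchy a reader would expect from the names alone.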
In our next session, we will see some more aspects of the tags used in
XML.