Wednesday, April 24, 2013

Some Information about DOM and SAX

12:45 PM

Anonymous

DOM and SAX

XML DOM

The XML DOM defines a standard for accessing and manipulating XML.

What is the DOM?

The DOM is a W3C (World Wide Web Consortium) standard.

The DOM defines a standard for accessing documents like XML and HTML:

"The W3C Document Object Model (DOM) is a platform and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of a document."

The DOM is separated into 3 different parts / levels:

Core DOM - standard model for any structured document
XML DOM - standard model for XML documents
HTML DOM - standard model for HTML documents

The DOM defines the objects and properties of all document elements, and the methods (interface) to access them.

What is the XML DOM?

The XML DOM is:

A standard object model for XML
A standard programming interface for XML
Platform- and language-independent
A W3C standard

The XML DOM defines the objects and properties of all XML elements, and the methods (interface) to access them.

In other words: The XML DOM is a standard for how to get, change, add, or delete XML elements.
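As a rough illustration of "get, change, add, or delete", here is a minimal sketch using Python's standard `xml.dom.minidom` module, which implements a subset of the W3C DOM. The tiny `<book>` document and the variable names are made up for this example.

```python
from xml.dom import minidom

# A small illustrative document (not taken from the post).
doc = minidom.parseString("<book><title>Old Title</title><year>2005</year></book>")
book = doc.documentElement

# Get: read the text node inside <title>.
title = book.getElementsByTagName("title")[0]
original = title.firstChild.data

# Change: overwrite the text node's data.
title.firstChild.data = "New Title"

# Add: create a new <price> element with a text child and append it.
price = doc.createElement("price")
price.appendChild(doc.createTextNode("30.00"))
book.appendChild(price)

# Delete: remove the <year> element from its parent.
year = book.getElementsByTagName("year")[0]
book.removeChild(year)

result = book.toxml()
print(result)
```

Every operation goes through nodes: even the title text is changed by editing the text node under the element, not the element itself.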

DOM Nodes

According to the DOM, everything in an XML document is a node.

The DOM says:

The entire document is a document node
Every XML element is an element node
The text in each XML element is a text node
Every attribute is an attribute node
Comments are comment nodes

Node type	nodeName returns	nodeValue returns
Document	#document	null
DocumentFragment	#document-fragment	null
DocumentType	doctype name	null
EntityReference	entity reference name	null
Element	element name	null
Attr	attribute name	attribute value
ProcessingInstruction	target	content of node
Comment	#comment	comment text
Text	#text	content of node
CDATASection	#cdata-section	content of node
Entity	entity name	null
Notation	notation name	null
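A few rows of the table above can be checked with Python's `xml.dom.minidom`, which returns `None` where the table says null. This is only a sketch; the `<note>` document and the `describe` helper are invented for the example.

```python
from xml.dom import minidom

# Illustrative document containing an attribute, a comment and some text.
doc = minidom.parseString('<note lang="en"><!-- greeting -->Hello</note>')
root = doc.documentElement

def describe(node):
    """Return the (nodeName, nodeValue) pair for any DOM node."""
    return (node.nodeName, node.nodeValue)

document_info = describe(doc)                        # Document node
element_info = describe(root)                        # Element node
attr_info = describe(root.getAttributeNode("lang"))  # Attr node
comment_info = describe(root.childNodes[0])          # Comment node
text_info = describe(root.childNodes[1])             # Text node

for info in (document_info, element_info, attr_info, comment_info, text_info):
    print(info)
```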

Element Object Properties

Property	Description	IE	Firefox	Opera	W3C
attributes	Returns a NamedNodeMap of attributes for the element	5	1	9	Yes
baseURI	Returns the absolute base URI of the element	No	1	No	Yes
childNodes	Returns a NodeList of child nodes for the element	5	1	9	Yes
firstChild	Returns the first child of the element	5	1	9	Yes
lastChild	Returns the last child of the element	5	1	9	Yes
localName	Returns the local part of the name of the element	No	1	9	Yes
namespaceURI	Returns the namespace URI of the element	No	1	9	Yes
nextSibling	Returns the node immediately following the element	5	1	9	Yes
nodeName	Returns the name of the node, depending on its type	5	1	9	Yes
nodeType	Returns the type of the node	5	1	9	Yes
ownerDocument	Returns the Document object that owns the element	5	1	9	Yes
parentNode	Returns the parent node of the element	5	1	9	Yes
prefix	Sets or returns the namespace prefix of the element	No	1	9	Yes
previousSibling	Returns the node immediately before the element	5	1	9	Yes
schemaTypeInfo	Returns the type information associated with the element			No	Yes
tagName	Returns the name of the element	5	1	9	Yes
textContent	Sets or returns the text content of the element and its descendants	No	1	No	Yes
text	Returns the text of the node and its descendants. IE-only property	5	No	No	No
xml	Returns the XML of the node and its descendants. IE-only property	5	No	No	No
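Several of these properties can be tried out with Python's `xml.dom.minidom`. This is a sketch under assumptions: minidom covers only part of the table (for example, it has no `textContent`), and the one-line `<book>` document is written without whitespace so that no extra text nodes appear between elements.

```python
from xml.dom import minidom

# Illustrative document, kept on one line to avoid whitespace-only text nodes.
doc = minidom.parseString(
    '<book category="cooking"><title lang="en">Everyday Italian</title>'
    '<year>2005</year></book>'
)
book = doc.documentElement
title = book.firstChild                              # firstChild

tag = book.tagName                                   # tagName
category = book.attributes["category"].value         # attributes (NamedNodeMap)
child_names = [c.nodeName for c in book.childNodes]  # childNodes (NodeList)
last_name = book.lastChild.nodeName                  # lastChild
sibling = title.nextSibling.tagName                  # nextSibling
parent_is_book = title.parentNode is book            # parentNode
owner_is_doc = title.ownerDocument is doc            # ownerDocument
is_element = book.nodeType == book.ELEMENT_NODE      # nodeType

print(tag, category, child_names, last_name, sibling)
```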

DOM Example

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
</book>
<book category="children">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
</book>
<book category="web">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
</book>
<book category="web" cover="paperback">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
</book>
</bookstore>
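To show how a document like this might be read through the DOM, here is a minimal sketch using Python's `xml.dom.minidom`. It embeds a shortened two-book version of the sample so that it runs on its own; the `text_of` helper and the `catalog` list are assumptions made for the example.

```python
from xml.dom import minidom

# Shortened copy of the bookstore sample (two books, fewer fields).
BOOKSTORE = """<bookstore>
<book category="cooking">
    <title lang="en">Everyday Italian</title>
    <price>30.00</price>
</book>
<book category="web" cover="paperback">
    <title lang="en">Learning XML</title>
    <price>39.95</price>
</book>
</bookstore>"""

doc = minidom.parseString(BOOKSTORE)

def text_of(element):
    """Join the data of the text nodes directly under an element."""
    return "".join(
        n.data for n in element.childNodes if n.nodeType == n.TEXT_NODE
    )

catalog = []
for book in doc.getElementsByTagName("book"):
    title = book.getElementsByTagName("title")[0]
    price = book.getElementsByTagName("price")[0]
    catalog.append(
        (book.getAttribute("category"), text_of(title), float(text_of(price)))
    )

for entry in catalog:
    print(entry)
```

Because the whole tree is in memory, the books could just as well be visited in reverse or looked up by category.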

SAX

SAX (Simple API for XML) is an event-based, sequential-access parser API for XML documents, developed by the XML-DEV mailing list. SAX provides a mechanism for reading data from an XML document that is an alternative to the one provided by the Document Object Model (DOM). Where the DOM operates on the document as a whole, SAX parsers operate on each piece of the XML document sequentially.

Unlike DOM, there is no formal specification for SAX. The Java implementation of SAX is considered to be normative. SAX is used for state-independent processing of XML documents, in contrast to StAX, which processes documents state-dependently.
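The event-based style can be sketched with Python's standard `xml.sax` module, which follows the same handler design as the Java API. The `EventLogger` class and the sample document are hypothetical names for this example. Note that a parser may deliver text in several `characters` calls; the short string here is assumed to arrive in one.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Record each parsing event in the order the parser reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Skip whitespace-only chunks to keep the log short.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))

handler = EventLogger()
xml.sax.parseString(
    b'<book category="web"><title>Learning XML</title></book>', handler
)

for event in handler.events:
    print(event)
```

The handler never sees a tree. It only receives callbacks in document order and decides for itself what to keep.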

Benefits

SAX parsers have some benefits over DOM-style parsers. A SAX parser only needs to report each parsing event as it happens, and normally discards almost all of that information once reported. It does, however, keep some things, for example a list of all elements that have not been closed yet, in order to catch later errors such as end-tags in the wrong order. Thus, the minimum memory required for a SAX parser is proportional to the maximum depth of the XML file (i.e., of the XML tree) and the maximum data involved in a single XML event (such as the name and attributes of a single start-tag, or the content of a processing instruction).
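The list of open elements can be mimicked in a handler to see why the bookkeeping grows with depth and not with document length. This is a sketch with Python's `xml.sax`; `DepthTracker` and the generated document are assumptions for the example, and the handler's stack only mirrors what a parser would track internally.

```python
import xml.sax

class DepthTracker(xml.sax.ContentHandler):
    """Keep only the stack of currently open elements, plus two counters."""

    def __init__(self):
        super().__init__()
        self.open_elements = []
        self.max_depth = 0
        self.element_count = 0

    def startElement(self, name, attrs):
        self.open_elements.append(name)
        self.element_count += 1
        self.max_depth = max(self.max_depth, len(self.open_elements))

    def endElement(self, name):
        # The element is closed, so nothing about it is kept.
        self.open_elements.pop()

# A long but shallow document: 1000 books, each three levels deep.
doc = "<bookstore>" + "<book><title>t</title></book>" * 1000 + "</bookstore>"
tracker = DepthTracker()
xml.sax.parseString(doc.encode("utf-8"), tracker)

print(tracker.element_count, tracker.max_depth, tracker.open_elements)
```

The stack never holds more than three names here, however many books the document contains.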

This much memory is usually considered negligible. A DOM parser, in contrast, typically builds a tree representation of the entire document in memory to begin with, thus using memory that increases with the entire document length. This takes considerable time and space for large documents (memory allocation and data-structure construction take time). The compensating advantage, of course, is that once loaded any part of the document can be accessed in any order.

Because of the event-driven nature of SAX, processing documents with a SAX parser is generally far faster than with a DOM-style parser, so long as the processing can be done in a single start-to-end pass. Many tasks, such as indexing, conversion to other formats, and very simple formatting, can be done that way. Other tasks, such as sorting, rearranging sections, getting from a link to its target, or looking up information on one element to help process a later one, require accessing the document structure in complex orders and will be much faster with DOM than with multiple SAX passes.
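Sorting is a typical case where having the whole tree in memory helps. The following sketch uses Python's `xml.dom.minidom` to reorder books by price. The document and the `price_of` helper are invented for the example; it relies on `appendChild` moving a node that already has a parent.

```python
from xml.dom import minidom

# Illustrative document, written without whitespace between elements.
doc = minidom.parseString(
    "<bookstore>"
    "<book><title>XQuery Kick Start</title><price>49.99</price></book>"
    "<book><title>Harry Potter</title><price>29.99</price></book>"
    "<book><title>Learning XML</title><price>39.95</price></book>"
    "</bookstore>"
)
store = doc.documentElement

def price_of(book):
    """Read a book's <price> text as a float."""
    return float(book.getElementsByTagName("price")[0].firstChild.data)

# Sort a snapshot of the books, then re-append each one in sorted order.
# Appending an existing child detaches it first, so this moves the node.
for book in sorted(store.getElementsByTagName("book"), key=price_of):
    store.appendChild(book)

titles_in_order = [
    b.getElementsByTagName("title")[0].firstChild.data
    for b in store.getElementsByTagName("book")
]
print(titles_in_order)
```

Doing the same with SAX alone would mean buffering every book before emitting any of them, which amounts to building a tree by hand.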

Some implementations do not neatly fit either category: a DOM approach can keep its persistent data on disk, cleverly organized for speed (editors such as SoftQuad Author/Editor and large-document browser/indexers such as DynaText do this); while a SAX approach can cleverly cache information for later use (any validating SAX parser keeps more information than described above). Such implementations blur the DOM/SAX tradeoffs, but are often very effective in practice.

Due to the nature of DOM, streamed reading from disk requires techniques such as lazy evaluation, caches, virtual memory, persistent data structures, or other techniques (one such technique is disclosed in [US Patent 5,557,722]). Processing XML documents larger than main memory is sometimes thought impossible because some DOM parsers do not allow it. However, it is no less possible than sorting a dataset larger than main memory, using disk space as memory to sidestep this limitation.

Drawbacks

The event-driven model of SAX is useful for XML parsing, but it does have certain drawbacks.

Virtually any kind of XML validation requires access to the document in full. The most trivial example is that an attribute declared in the DTD to be of type IDREF requires that there be an element in the document that uses the same value for an ID attribute. To validate this in a SAX parser, one must keep track of all ID attributes (any one of them might end up being referenced by an IDREF attribute at the very end), as well as every IDREF attribute until it is resolved. Similarly, to validate that each element has an acceptable sequence of child elements, information about which child elements have been seen for each parent must be kept until the parent closes.
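The bookkeeping for the IDREF case might look like the following sketch with Python's `xml.sax`. It is not a real validator: with no DTD involved, it simply assumes that an attribute named `id` is of type ID and one named `ref` is of type IDREF, and the `IdRefChecker` class and `check` function are hypothetical.

```python
import xml.sax

class IdRefChecker(xml.sax.ContentHandler):
    """Collect ID and IDREF values so references can be resolved at the end."""

    def __init__(self):
        super().__init__()
        self.seen_ids = set()       # every ID value seen so far
        self.pending_refs = set()   # every IDREF value seen so far

    def startElement(self, name, attrs):
        # Assumption: "id" plays the role of an ID attribute, "ref" of an IDREF.
        if "id" in attrs:
            self.seen_ids.add(attrs["id"])
        if "ref" in attrs:
            self.pending_refs.add(attrs["ref"])

    def unresolved(self):
        """Only meaningful after parsing: references with no matching ID."""
        return self.pending_refs - self.seen_ids

def check(xml_text):
    checker = IdRefChecker()
    xml.sax.parseString(xml_text.encode("utf-8"), checker)
    return checker.unresolved()

ok = check('<doc><link ref="a"/><item id="a"/></doc>')   # forward reference
bad = check('<doc><link ref="b"/><item id="a"/></doc>')  # dangling reference
print(ok, bad)
```

Both sets grow with the document, so this check costs memory beyond the open-element stack.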

Additionally, some kinds of XML processing simply require having access to the entire document. XSLT and XPath, for example, need to be able to access any node at any time in the parsed XML tree. Editors and browsers likewise need to be able to display, modify, and perhaps re-validate at any time. While a SAX parser may well be used to construct such a tree initially, SAX provides no help for such processing as a whole.
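Using SAX events to build a tree in the first place could be sketched as follows. The dictionary-based node layout and the `TreeBuilder` class are assumptions for illustration, not a DOM implementation.

```python
import xml.sax

class TreeBuilder(xml.sax.ContentHandler):
    """Assemble a nested dict tree from SAX events."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # chain of elements that are still open

    def startElement(self, name, attrs):
        node = {
            "name": name,
            "attrs": dict(attrs.items()),
            "text": "",
            "children": [],
        }
        if self._stack:
            self._stack[-1]["children"].append(node)
        else:
            self.root = node
        self._stack.append(node)

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._stack:
            self._stack[-1]["text"] += content

    def endElement(self, name):
        self._stack.pop()

builder = TreeBuilder()
xml.sax.parseString(
    b'<book category="web"><title>Learning XML</title></book>', builder
)
tree = builder.root
print(tree["name"], tree["attrs"], tree["children"][0]["text"])
```

Once the tree exists, all further navigation and editing happens on the tree, with no further involvement from SAX.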
