2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article walks through XML parsing with worked examples. It is intended as a practical reference for readers who need to read and write XML from Java.
In project development, HTML is mainly used to display data, whereas XML is used to standardize how data is stored. XML has its own syntax, and its markup elements can be defined freely by the user.
1. Understanding XML
XML (eXtensible Markup Language) provides a cross-platform, cross-network, cross-language way to describe data. XML makes common tasks such as data exchange, system configuration, and content management straightforward.
XML, like HTML, is a markup language. The biggest difference is that HTML's elements are fixed and oriented toward display, while XML's tags are user-defined and mainly used to store data.
Comparison between XML and HTML
Every XML file is made up of two parts: the prolog (the XML declaration) and the data area.
Prolog: declares properties of the XML document through the following three attributes:
version: the XML version in use, normally 1.0.
encoding: the character encoding used in the document. If the document contains Chinese text, be sure to specify the encoding.
standalone: whether the document is self-contained ("yes") or relies on external markup declarations such as an external DTD ("no"). How the document is displayed is a separate matter, handled with CSS or XSL.
The attributes must appear in the fixed order version, encoding, standalone; if the order is wrong, the parser reports an error.
Data area: the data area must have exactly one root element, under which any number of child elements can be nested. Every element must be closed, and tag names are case-sensitive.
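The rules above (attribute order in the declaration, a single root element, closed and case-sensitive tags) can be checked with the JDK's own parser. The following is a minimal sketch, not part of the original article; the class name WellFormedCheck and the sample strings are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: asks the JAXP parser whether a text is well-formed XML.
class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Correct order: version, then encoding.
        System.out.println(isWellFormed("<?xml version=\"1.0\" encoding=\"UTF-8\"?><a><b/></a>"));
        // Wrong attribute order in the declaration.
        System.out.println(isWellFormed("<?xml encoding=\"UTF-8\" version=\"1.0\"?><a/>"));
        // Tag names are case-sensitive, so <b> is not closed by </B>.
        System.out.println(isWellFormed("<a><b></B></a>"));
        // Two root elements.
        System.out.println(isWellFormed("<a/><b/>"));
    }
}
```

Only the first call should report true; the other three violate one rule each.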
XML provides CDATA sections for data that should not be parsed as markup. When the XML parser encounters a CDATA section, it does not interpret any symbols or tags inside it, but passes the raw data to the application intact.
CDATA syntax format: <![CDATA[ unparsed content ]]>
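As a small illustration of this behavior, the sketch below parses a document whose only element holds a CDATA section and reads the text back. The class name CdataDemo, the note element, and the sample text are assumptions chosen for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo: text inside CDATA reaches the application unchanged.
class CdataDemo {

    // Parses the XML text and returns the text content of the root element.
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>"
                + "<note><![CDATA[if (a < b && b > 0) { <b>not a tag</b> }]]></note>";
        // The '<', '&' and '<b>' characters are returned as plain text.
        System.out.println(rootText(xml));
    }
}
```

Without the CDATA wrapper the same text would have to be escaped as &lt; and &amp;, or the parser would reject it.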
2. XML parsing
An XML file mostly carries descriptive data, so once a program has an XML document, it needs to extract the relevant content according to the element names defined in it. This operation is called XML parsing.
Two parsing approaches are in common use: DOM, which is a W3C standard, and SAX, which is a de facto standard rather than a W3C specification. The two approaches operate as follows:
XML parsing operation
As shown, the application does not operate on the XML document directly. The XML parser analyzes the document first, and the application then works with the parsed result through the DOM or SAX interface that the parser provides, accessing the XML document indirectly.
2.1 DOM parsing operation
In an application, a DOM (Document Object Model) based XML parser converts an XML document into a tree of objects, usually called the DOM tree. The application then works with the XML data by operating on this object model. Through the DOM interface, an application can access any part of the document at any time, which is why this mechanism is also called random access.
Because the DOM parser converts the entire XML document into a DOM tree held in memory, memory consumption is high when the document is large or its structure is complex, and traversing a tree with a complex structure is time-consuming.
In short, a DOM operation loads the whole XML file into memory as a DOM tree.
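To make the in-memory tree concrete, here is a minimal sketch of a depth-first traversal using only the Node interface. The class name DomWalk and the depth:name output format are illustrative choices, and the sample document assumes the addresslist/linkman/name/email structure used in this article's examples.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo: walks the DOM tree and records each element with its depth.
class DomWalk {

    static List<String> elementOutline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> outline = new ArrayList<>();
            collect(doc.getDocumentElement(), 0, outline);
            return outline;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Visits the node, then each of its children from first to last.
    private static void collect(Node node, int depth, List<String> outline) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            outline.add(depth + ":" + node.getNodeName());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, depth + 1, outline);
        }
    }

    public static void main(String[] args) {
        String xml = "<addresslist><linkman><name>Xiao Ming</name>"
                + "<email>xiaoming@163.com</email></linkman></addresslist>";
        System.out.println(elementOutline(xml));
    }
}
```

Every node is visited once, so the cost of a full traversal grows with the size of the document, which is the overhead described above.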
DOM parsing has four core interfaces:
Document: this interface represents the entire XML document. It is the root of the DOM tree and the entry point for accessing and manipulating the data in the document. All element content in the XML file can be reached through the Document node.
Common methods of Document Interface
Node: most of the core DOM interfaces inherit from the Node interface, for example Document, Element, and Attr. Each Node represents one node in the DOM tree.
Common methods of Node Interface
NodeList: this interface represents an ordered collection of nodes, for example the children of a node. A NodeList is live: when the document changes, the change is reflected directly in the collection.
Common methods of NodeList Interface
NamedNodeMap: this interface represents a one-to-one mapping between a set of nodes and their unique names, and is mainly used to represent attribute nodes.
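A short sketch of how NodeList and NamedNodeMap behave in practice follows. The class name CoreInterfacesDemo, the helper names, and the type attribute in the sample document are assumptions made for illustration only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo of the NodeList and NamedNodeMap interfaces.
class CoreInterfacesDemo {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Copies an element's attributes (a NamedNodeMap) into a map sorted by name.
    static Map<String, String> attributesOf(Element element) {
        Map<String, String> result = new TreeMap<>();
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            result.put(attr.getNodeName(), attr.getNodeValue());
        }
        return result;
    }

    // Shows that a NodeList is live: its length changes when the tree changes.
    static int[] lengthBeforeAndAfterAppend(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        int before = nodes.getLength();
        doc.getDocumentElement().appendChild(doc.createElement(tag));
        return new int[] {before, nodes.getLength()};
    }

    public static void main(String[] args) {
        Document doc = parse("<addresslist><linkman>"
                + "<name id=\"xm\" type=\"friend\">Xiao Ming</name>"
                + "</linkman></addresslist>");
        Element name = (Element) doc.getElementsByTagName("name").item(0);
        System.out.println(attributesOf(name));
        int[] lengths = lengthBeforeAndAfterAppend(doc, "linkman");
        System.out.println(lengths[0] + " -> " + lengths[1]);
    }
}
```

The attributes are copied into a TreeMap because NamedNodeMap does not promise any particular order.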
Beyond these four core interfaces, a program that parses and reads XML with DOM follows these steps:
(1) Create the DocumentBuilderFactory: DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
(2) Create the DocumentBuilder: DocumentBuilder builder = factory.newDocumentBuilder();
(3) Create the Document: Document doc = builder.parse("path of the file to read");
(4) Obtain the NodeList: NodeList nl = doc.getElementsByTagName("name of the node to read");
(5) Read the XML information.
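// xml_demo.xml
<?xml version="1.0" encoding="UTF-8"?>
<addresslist>
    <linkman>
        <name>Xiao Ming</name>
        <email>asaasa@163.com</email>
    </linkman>
    <linkman>
        <name>Xiao Zhang</name>
        <email>xiaoli@163.com</email>
    </linkman>
</addresslist>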
Reading XML with DOM:
package com.demo;

import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class XmlDomDemo {
    public static void main(String[] args) {
        // (1) Create the DocumentBuilderFactory
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document doc;
        try {
            // (2) Obtain a DocumentBuilder from the factory
            DocumentBuilder builder = factory.newDocumentBuilder();
            // (3) Parse the XML file at the given path into an in-memory DOM tree
            doc = builder.parse("xml_demo.xml");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
            return; // nothing to read if parsing failed
        }
        // (4) Find the linkman nodes
        NodeList nl = doc.getElementsByTagName("linkman");
        // (5) For each linkman, print the text of its first name and email child
        for (int i = 0; i < nl.getLength(); i++) {
            // Take out each element
            Element element = (Element) nl.item(i);
            System.out.println("name: "
                    + element.getElementsByTagName("name").item(0).getFirstChild().getNodeValue());
            System.out.println("mailbox: "
                    + element.getElementsByTagName("email").item(0).getFirstChild().getNodeValue());
        }
    }
}
Writing XML to a file with DOM:
To write XML, use the interfaces that DOM provides, such as Element, and set up the relationships between the nodes manually. When creating the Document object, you must call newDocument() to create a new, empty DOM tree.
To save the XML to a file on disk, you need four classes: TransformerFactory, Transformer, DOMSource, and StreamResult.
TransformerFactory class: obtains an instance of the Transformer class.
DOMSource class: wraps a Document object.
StreamResult class: specifies the output destination (a file or a given output stream).
Transformer class: performs the actual output of the content.
The construction method of StreamResult class
package com.demo;

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmlDemoWrite {
    public static void main(String[] args) {
        // (1) Create the DocumentBuilderFactory
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // (2) Obtain a DocumentBuilder from the factory
        DocumentBuilder builder;
        try {
            builder = factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
            return;
        }
        // (3) Create a new, empty document
        Document doc = builder.newDocument();
        // (4) Create each node
        Element addresslist = doc.createElement("addresslist");
        Element linkman = doc.createElement("linkman");
        Element name = doc.createElement("name");
        Element email = doc.createElement("email");
        // (5) Set the text content by adding a text node to each element
        name.appendChild(doc.createTextNode("Xiao Ming"));
        email.appendChild(doc.createTextNode("xiaoming@163.com"));
        // (6) Set up the node relationships
        linkman.appendChild(name);
        linkman.appendChild(email);
        addresslist.appendChild(linkman);
        doc.appendChild(addresslist);
        // (7) Output the document to a file
        TransformerFactory tf = TransformerFactory.newInstance();
        try {
            Transformer t = tf.newTransformer();
            // Set the output encoding
            t.setOutputProperty(OutputKeys.ENCODING, "GBK");
            // The document to output
            DOMSource source = new DOMSource(doc);
            // The output location
            StreamResult result = new StreamResult(new File("xml_write.xml"));
            t.transform(source, result);
            System.out.println("yes");
        } catch (TransformerException e) {
            e.printStackTrace();
        }
    }
}
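Generated document:
// xml_write.xml (indented here for readability)
<?xml version="1.0" encoding="GBK" standalone="no"?>
<addresslist>
    <linkman>
        <name>Xiao Ming</name>
        <email>xiaoming@163.com</email>
    </linkman>
</addresslist>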
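StreamResult is not limited to files; it can also wrap a Writer, which is convenient when the XML is needed as a string (for logging or tests, say). The sketch below is one way to do that under the same document structure as above; the class name DomToString and its helper methods are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo: serializes a DOM tree to a String instead of a file.
class DomToString {

    // Builds addresslist > linkman > (name, email) in a new, empty document.
    static Document buildAddressList(String nameText, String emailText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element addresslist = doc.createElement("addresslist");
            Element linkman = doc.createElement("linkman");
            Element name = doc.createElement("name");
            Element email = doc.createElement("email");
            name.appendChild(doc.createTextNode(nameText));
            email.appendChild(doc.createTextNode(emailText));
            linkman.appendChild(name);
            linkman.appendChild(email);
            addresslist.appendChild(linkman);
            doc.appendChild(addresslist);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs the identity transformation into a StringWriter.
    static String toXml(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            // Leave out the <?xml ...?> declaration so only the elements are returned.
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(buildAddressList("Xiao Ming", "xiaoming@163.com")));
    }
}
```

Swapping the StringWriter for a FileWriter or an OutputStream changes the destination without touching the rest of the code.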
2.2 SAX parsing operation
SAX (Simple API for XML) differs from DOM in that it reads the document sequentially, which makes it a fast way to read XML data.
A series of events is triggered as the SAX parser scans the document.
Main events of SAX
As the parser scans the document, it calls the corresponding handler method at the start and end of the document (Document) and at the start and end of each element (Element). These methods carry out the processing until the whole document has been scanned.
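The order in which these events arrive can be observed by recording them. The following is a rough sketch rather than the article's own listing; the class name SaxEventTrace and the event labels are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: records each SAX callback in the order the parser fires it.
class SaxEventTrace extends DefaultHandler {

    private final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver long text in several chunks; short text usually arrives in one.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text:" + text);
        }
    }

    static List<String> trace(String xml) {
        SaxEventTrace handler = new SaxEventTrace();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        for (String event : trace("<linkman><name>Xiao Ming</name></linkman>")) {
            System.out.println(event);
        }
    }
}
```

Nothing is kept in memory beyond what the handler itself stores, which is why SAX copes well with large documents.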
To use SAX parsing in development, first write a SAX handler: define a class that extends DefaultHandler and override the methods in the table above.
package com.sax.demo;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XmlSax extends DefaultHandler {
    @Override
    public void startDocument() throws SAXException {
        // Print an XML declaration at the start of the document
        System.out.println("<?xml version=\"1.0\"?>");
    }

    @Override
    public void endDocument() throws SAXException {
        System.out.println("\nEnd of document reading.");
    }

    @Override
    public void startElement(String uri, String localName, String name, Attributes attributes)
            throws SAXException {
        // Print the opening tag
        System.out.print("<" + name + ">");
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Print the text content
        System.out.print(new String(ch, start, length));
    }

    @Override
    public void endElement(String uri, String localName, String name) throws SAXException {
        // Print the closing tag
        System.out.print("</" + name + ">");
    }
}
After writing the SAX handler, you also need to create SAXParserFactory and SAXParser objects, and then pass the XML file to parse and the handler to the parse() method of SAXParser.
Create a file to read: sax_demo.xml
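<?xml version="1.0" encoding="UTF-8"?>
<addresslist>
    <linkman>
        <name>Xiao Ming</name>
        <email>xiaoming@163.com</email>
    </linkman>
    <linkman>
        <name>Xiao Li</name>
        <email>xiaoli@163.com</email>
    </linkman>
</addresslist>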
Using the SAX parser
package com.sax.demo;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class SaxTest {
    public static void main(String[] args) throws Exception {
        // (1) Create the SAX parser factory
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // (2) Create the parser
        SAXParser parser = factory.newSAXParser();
        // (3) Parse the XML using the handler
        parser.parse("sax_demo.xml", new XmlSax());
    }
}
From the program above, you can see that parsing with SAX is simpler than parsing with DOM.
The difference between DOM parsing and SAX parsing
The characteristics of the two approaches show where they differ:
DOM parsing is suitable for modifying documents and for random access, but not for working with large files.
SAX reads the document piece by piece, so it can handle large files and works well when only specific content needs to be read. With SAX, users can also build their own object model from the events.
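Building your own object model from SAX events might look like the sketch below, which keeps only the name and email of each contact. The class names LinkmanSax and Linkman are hypothetical, and the input is assumed to follow the addresslist/linkman/name/email structure used in this article.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: turns SAX events into a list of simple contact objects.
class LinkmanSax extends DefaultHandler {

    // Minimal object model for one contact.
    static final class Linkman {
        final String name;
        final String email;

        Linkman(String name, String email) {
            this.name = name;
            this.email = email;
        }

        @Override
        public String toString() {
            return name + " <" + email + ">";
        }
    }

    private final List<Linkman> contacts = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private String name;
    private String email;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Start collecting text afresh for every element.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text can arrive in several chunks, so append rather than assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        switch (qName) {
            case "name":
                name = text.toString().trim();
                break;
            case "email":
                email = text.toString().trim();
                break;
            case "linkman":
                contacts.add(new Linkman(name, email));
                break;
            default:
                break;
        }
    }

    static List<Linkman> read(String xml) {
        LinkmanSax handler = new LinkmanSax();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.contacts;
    }

    public static void main(String[] args) {
        String xml = "<addresslist>"
                + "<linkman><name>Xiao Ming</name><email>xiaoming@163.com</email></linkman>"
                + "<linkman><name>Xiao Li</name><email>xiaoli@163.com</email></linkman>"
                + "</addresslist>";
        System.out.println(read(xml));
    }
}
```

Only the two fields of the current contact are held while parsing, so memory use depends on what the handler chooses to keep, not on the size of the file.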
2.3 A good helper for XML parsing: JDOM
JDOM is a set of components written in Java for reading, writing, and manipulating XML.
JDOM = DOM's ability to modify documents + SAX's fast reading
The main operation classes of JDOM
Generate XML files using JDOM
package com.jdom.demo;

import java.io.FileOutputStream;
import java.io.IOException;
import org.jdom.Attribute;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.output.XMLOutputter;

public class WriteXml {
    public static void main(String[] args) {
        // Define the nodes
        Element addresslist = new Element("addresslist");
        Element linkman = new Element("linkman");
        Element name = new Element("name");
        Element email = new Element("email");
        // Define the attribute
        Attribute id = new Attribute("id", "xm");
        // Declare a Document object with addresslist as the root
        Document doc = new Document(addresslist);
        // Set the content of the elements
        name.setText("Xiaoming");
        email.setText("xiaoming@163.com");
        name.setAttribute(id);
        // Set the child nodes
        linkman.addContent(name);
        linkman.addContent(email);
        // Add linkman as a child of the root
        addresslist.addContent(linkman);
        // Create the outputter and set the output encoding
        XMLOutputter out = new XMLOutputter();
        out.setFormat(out.getFormat().setEncoding("GBK"));
        // Output the XML file; the stream is closed automatically
        try (FileOutputStream fos = new FileOutputStream("jdom_write.xml")) {
            out.output(doc, fos);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
// jdom_write.xml (indented here for readability)
<?xml version="1.0" encoding="GBK"?>
<addresslist>
    <linkman>
        <name id="xm">Xiaoming</name>
        <email>xiaoming@163.com</email>
    </linkman>
</addresslist>
Use JDOM to read XML files
package com.jdom.demo;

import java.io.IOException;
import java.util.List;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;

public class ReadXml {
    public static void main(String[] args) throws JDOMException, IOException {
        // Create the SAX-based builder
        SAXBuilder builder = new SAXBuilder();
        // Build the Document
        Document readDoc = builder.build("jdom_write.xml");
        // Read the root element
        Element stu = readDoc.getRootElement();
        // Get all linkman child elements
        List list = stu.getChildren("linkman");
        for (int i = 0; i < list.size(); i++) {
            Element e = (Element) list.get(i);
            String name = e.getChildText("name");
            String id = e.getChild("name").getAttribute("id").getValue();
            String email = e.getChildText("email");
            System.out.println("- contact -");
            System.out.println("name: " + name + " number: " + id);
            System.out.println("Email: " + email);
        }
    }
}
JDOM is a general-purpose XML component that is widely used in real-world development.
2.4 Parsing tool: dom4j
dom4j is another XML component package, mainly used to read and write XML files. It is widely used because of its good performance, rich functionality, and ease of use. For example, the Hibernate and Spring frameworks use dom4j to parse XML.
JAR files required during development: dom4j-1.6.1.jar, lib/jaxen-1.1-beta-6.jar
The interfaces used in dom4j are defined in the org.dom4j package. The other packages can be used as needed.
The main interface of dom4j
Generate the XML file with dom4j:
package com.dom4j.demo;

import java.io.FileOutputStream;
import java.io.IOException;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class Dom4jWrite {
    public static void main(String[] args) {
        // Create the document
        Document doc = DocumentHelper.createDocument();
        // Define the nodes; addElement also attaches each node to its parent
        Element addresslist = doc.addElement("addresslist");
        Element linkman = addresslist.addElement("linkman");
        Element name = linkman.addElement("name");
        Element email = linkman.addElement("email");
        // Set the node content
        name.setText("Xiaoming");
        email.setText("xiaoming@163.com");
        // Set the output format
        OutputFormat format = OutputFormat.createPrettyPrint();
        // Specify the output encoding
        format.setEncoding("GBK");
        try {
            XMLWriter writer = new XMLWriter(new FileOutputStream("dom4j_demo.xml"), format);
            writer.write(doc);
            writer.close();
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException and FileNotFoundException
            e.printStackTrace();
        }
    }
}
// dom4j_demo.xml
<?xml version="1.0" encoding="GBK"?>
<addresslist>
  <linkman>
    <name>Xiaoming</name>
    <email>xiaoming@163.com</email>
  </linkman>
</addresslist>
Read the XML file with dom4j:
package com.dom4j.demo;

import java.io.File;
import java.util.Iterator;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class Dom4jRead {
    public static void main(String[] args) {
        // The file to read
        File file = new File("dom4j_demo.xml");
        // Create the SAX-based reader
        SAXReader reader = new SAXReader();
        Document doc;
        try {
            // Read the document
            doc = reader.read(file);
        } catch (DocumentException e) {
            e.printStackTrace();
            return; // nothing to read if parsing failed
        }
        // Get the root element
        Element root = doc.getRootElement();
        // Get all the child nodes
        Iterator iter = root.elementIterator();
        while (iter.hasNext()) {
            // Get each linkman
            Element linkman = (Element) iter.next();
            System.out.println("name: " + linkman.elementText("name"));
            System.out.println("email address: " + linkman.elementText("email"));
        }
    }
}