2025-02-21 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/03 Report
This article walks through parsing XML in Python with worked examples. The methods introduced here are simple, fast, and practical.
Parsing XML with Python
The two common XML programming interfaces are DOM and SAX. They process XML files in different ways and suit different scenarios.
Python offers three ways to parse XML: SAX, DOM, and ElementTree:
SAX: The Python standard library includes a SAX parser. SAX uses an event-driven model: while reading the XML, the parser fires events and calls user-defined callback functions.
DOM: Parses the XML data into a tree in memory and manipulates the XML by operating on that tree.
ElementTree: ElementTree (the element tree) works like a lightweight DOM with a convenient, friendly API. It is easy to use, fast, and light on memory.
Because DOM has to map the whole XML document to an in-memory tree, it is comparatively slow and memory-hungry. SAX reads the XML file as a stream, which is faster and uses less memory, but the user has to implement the callback functions (the handler).
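ElementTree is described above but has no example of its own in this article. The following is a minimal sketch of the ElementTree style of access. The inline sample data and the helper name summarize_movies are assumptions made for illustration, not part of the original article.

```python
import xml.etree.ElementTree as ET

# Illustrative sample data (an assumption, not the article's movies.xml).
SAMPLE_XML = """\
<collection shelf="Demo shelf">
  <movie title="First">
    <type>Comedy</type>
    <year>2014</year>
  </movie>
  <movie title="Second">
    <type>Action</type>
    <year>2015</year>
  </movie>
</collection>
"""


def summarize_movies(xml_text):
    """Return a (title, type, year) tuple for every <movie> element."""
    root = ET.fromstring(xml_text)  # root is the <collection> element
    rows = []
    for movie in root.findall("movie"):  # direct children named "movie"
        rows.append((
            movie.get("title"),           # attribute lookup
            movie.findtext("type"),       # text of the first <type> child
            int(movie.findtext("year")),  # convert the year text to an int
        ))
    return rows


if __name__ == "__main__":
    for row in summarize_movies(SAMPLE_XML):
        print(row)
```

For a file on disk, ET.parse(filename).getroot() would take the place of ET.fromstring.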
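The XML example file movies.xml used in this section is as follows:

    <collection shelf="New product recommendation">
    <movie title="Return to 20">
       <type>Comedy, Affection, Love, Fantasy</type>
       <format>DVD</format>
       <year>2014</year>
       <rating>PG</rating>
       <stars>8</stars>
       <description>The film tells the story of a 70-year-old woman who unbelievably turns into a young woman and returns to daily life under a new identity, triggering a series of ironic fantasy stories.</description>
    </movie>
    <movie title="Wandering Earth">
       <type>Science Fiction, Disaster, Adventure, Action</type>
       <format>DVD</format>
       <year>2019</year>
       <rating>R</rating>
       <stars>10</stars>
       <description>Tells the story of the sun's imminent destruction, which leaves it no longer suitable for human survival; facing this desperate situation, mankind starts the "Wandering Earth" project.</description>
    </movie>
    <movie title="Villain Angel">
       <type>Comedy, Love</type>
       <format>DVD</format>
       <year>2015</year>
       <rating>PG</rating>
       <stars>10</stars>
       <description>Tells the story of a girl with a high IQ and low EQ who meets a full-time debt-collecting villain; introduced by a miracle doctor, the two "treat" each other in a series of stories.</description>
    </movie>
    <movie title="Wolf Warriors">
       <type>Action, War, Military</type>
       <format>VHS</format>
       <year>2015</year>
       <rating>PG</rating>
       <stars>10</stars>
       <description>Tells the legendary story of a nobody who grows into a lone hero and saves the fate of the country and the nation.</description>
    </movie>
    </collection>

Parsing XML with the SAX API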
SAX is an event-driven API. Parsing an XML document with SAX involves two parts: the parser and the event handler. The parser reads the XML document and sends events to the event handler, such as element-start and element-end events. The event handler responds to those events and processes the XML data passed to it.
To process XML with SAX in Python, first import the parse function from xml.sax and the ContentHandler class from xml.sax.handler.
ContentHandler class methods
characters(content) method: called with character data. content holds the text found in any of these positions: from the start of a line up to the next tag; between one tag and the next; from a tag up to the line terminator. The tag can be either a start tag or an end tag.
startDocument() method: called when the document starts.
endDocument() method: called when the parser reaches the end of the document.
startElement(name, attrs) method: called when an opening tag is encountered. name is the tag name, and attrs is a dictionary-like object holding the tag's attribute values.
endElement(name) method: called when a closing tag is encountered.
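The order in which these callbacks fire can be observed with a small handler that records every event. This is only a sketch: the class name EventLogger and the helper log_events are hypothetical, and whitespace-only text chunks are skipped to keep the log short. It uses xml.sax.parseString so that no file is needed.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records each callback in call order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("startDocument",))

    def endDocument(self):
        self.events.append(("endDocument",))

    def startElement(self, name, attrs):
        # attrs maps attribute names to values; copy it into a plain dict
        self.events.append(("startElement", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def characters(self, content):
        # Text can arrive in several chunks; whitespace-only chunks are skipped
        if content.strip():
            self.events.append(("characters", content))


def log_events(xml_text):
    """Parse xml_text with SAX and return the list of recorded events."""
    handler = EventLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    for event in log_events('<movie title="Demo"><year>2014</year></movie>'):
        print(event)
```

Because the parser may deliver one run of text through several characters() calls, a handler that needs the full text of an element usually accumulates the chunks and reads them in endElement.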
make_parser method
The make_parser() function creates a new parser object and returns it. The parser object created is the first parser type the system finds.
The syntax is as follows:

    xml.sax.make_parser([parser_list])

parser_list: optional; a list of parsers.
parse method
The parse() function creates a SAX parser and parses an XML document.
The syntax is as follows:

    xml.sax.parse(xmlfile, contenthandler[, errorhandler])

xmlfile: the XML file name
contenthandler: must be a ContentHandler object
errorhandler: if specified, must be a SAX ErrorHandler object
parseString method
The parseString() function creates an XML parser and parses an XML string.
The syntax is as follows:

    xml.sax.parseString(xmlstring, contenthandler[, errorhandler])

xmlstring: the XML string
contenthandler: must be a ContentHandler object
errorhandler: if specified, must be a SAX ErrorHandler object
Python XML parsing example

    import xml.sax


    class MovieHandler(xml.sax.ContentHandler):
        def __init__(self):
            super().__init__()
            self.CurrentData = ""
            self.type = ""
            self.format = ""
            self.year = ""
            self.rating = ""
            self.stars = ""
            self.description = ""

        # Called when an element starts
        def startElement(self, tag, attributes):
            self.CurrentData = tag
            if tag == "movie":
                print("----- Movie -----")
                title = attributes["title"]
                print("Title:", title)

        # Called when an element ends
        def endElement(self, tag):
            if self.CurrentData == "type":
                print("Type:", self.type)
            elif self.CurrentData == "format":
                print("Format:", self.format)
            elif self.CurrentData == "year":
                print("Year:", self.year)
            elif self.CurrentData == "rating":
                print("Rating:", self.rating)
            elif self.CurrentData == "stars":
                print("Stars:", self.stars)
            elif self.CurrentData == "description":
                print("Description:", self.description, "\n")
            self.CurrentData = ""

        # Called when character data is read
        def characters(self, content):
            if self.CurrentData == "type":
                self.type = content
            elif self.CurrentData == "format":
                self.format = content
            elif self.CurrentData == "year":
                self.year = content
            elif self.CurrentData == "rating":
                self.rating = content
            elif self.CurrentData == "stars":
                self.stars = content
            elif self.CurrentData == "description":
                self.description = content


    if __name__ == "__main__":
        # Create an XMLReader
        parser = xml.sax.make_parser()
        # Turn off namespace processing
        parser.setFeature(xml.sax.handler.feature_namespaces, 0)

        # Override the default ContentHandler
        Handler = MovieHandler()
        parser.setContentHandler(Handler)

        parser.parse("movies.xml")

After running the program, the output is as follows:

    ----- Movie -----
    Title: Return to 20
    Type: Comedy, Affection, Love, Fantasy
    Format: DVD
    Year: 2014
    Rating: PG
    Stars: 8
    Description: The film tells the story of a 70-year-old woman who unbelievably turns into a young woman and returns to daily life under a new identity, triggering a series of ironic fantasy stories.

    ----- Movie -----
    Title: Wandering Earth
    Type: Science Fiction, Disaster, Adventure, Action
    Format: DVD
    Year: 2019
    Rating: R
    Stars: 10
    Description: Tells the story of the sun's imminent destruction, which leaves it no longer suitable for human survival; facing this desperate situation, mankind starts the "Wandering Earth" project.

    ----- Movie -----
    Title: Villain Angel
    Type: Comedy, Love
    Format: DVD
    Year: 2015
    Rating: PG
    Stars: 10
    Description: Tells the story of a girl with a high IQ and low EQ who meets a full-time debt-collecting villain; introduced by a miracle doctor, the two "treat" each other in a series of stories.

    ----- Movie -----
    Title: Wolf Warriors
    Type: Action, War, Military
    Format: VHS
    Year: 2015
    Rating: PG
    Stars: 10
    Description: Tells the legendary story of a nobody who grows into a lone hero and saves the fate of the country and the nation.

Parsing XML with the DOM API
The Document Object Model (DOM) is a cross-language API from the W3C for accessing and modifying XML documents.
When parsing an XML document, a DOM parser reads the entire document at once and stores all of its elements in a tree structure in memory. You can then use the functions DOM provides to read or modify the document's content and structure, and write the modified content back to an XML file.
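The read, modify, and write-back cycle described above can be sketched with xml.dom.minidom. The sample string and the helper name set_stars are assumptions for illustration only; the sketch works on a string so it runs without any file.

```python
from xml.dom.minidom import parseString

# Illustrative sample data (an assumption, not the article's movies.xml).
SAMPLE_XML = (
    '<collection shelf="Demo shelf">'
    '<movie title="First"><stars>8</stars></movie>'
    '</collection>'
)


def set_stars(xml_text, new_value):
    """Set the text of every <stars> element and return the serialized XML."""
    dom = parseString(xml_text)  # the whole document is loaded into memory
    for stars in dom.getElementsByTagName("stars"):
        stars.firstChild.data = str(new_value)  # edit the text node in place
    # Structure can be changed too, for example by adding an attribute
    dom.documentElement.setAttribute("updated", "yes")
    return dom.documentElement.toxml()


if __name__ == "__main__":
    print(set_stars(SAMPLE_XML, 9))
```

Writing the returned string to a file with an ordinary open(..., "w") call would complete the write-back step.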
For parsing XML files, xml.dom.minidom provides a simple parse function that quickly builds a DOM tree from an XML file.
Example:

    import xml.dom.minidom

    # Open the XML document with the minidom parser
    DOMTree = xml.dom.minidom.parse("movies.xml")
    collection = DOMTree.documentElement
    if collection.hasAttribute("shelf"):
        print("Root element: %s" % collection.getAttribute("shelf"))

    # Get all movies in the collection
    movies = collection.getElementsByTagName("movie")

    # Print the details of each movie
    for movie in movies:
        print("----- Movie -----")
        if movie.hasAttribute("title"):
            print("Title: %s" % movie.getAttribute("title"))

        # Names avoid shadowing the built-ins type() and format()
        movie_type = movie.getElementsByTagName("type")[0]
        print("Type: %s" % movie_type.childNodes[0].data)
        movie_format = movie.getElementsByTagName("format")[0]
        print("Format: %s" % movie_format.childNodes[0].data)
        rating = movie.getElementsByTagName("rating")[0]
        print("Rating: %s" % rating.childNodes[0].data)
        stars = movie.getElementsByTagName("stars")[0]
        print("Stars: %s" % stars.childNodes[0].data)
        description = movie.getElementsByTagName("description")[0]
        print("Description: %s" % description.childNodes[0].data, "\n")

After running the program, the output is as follows:

    Root element: New product recommendation
    ----- Movie -----
    Title: Return to 20
    Type: Comedy, Affection, Love, Fantasy
    Format: DVD
    Rating: PG
    Stars: 8
    Description: The film tells the story of a 70-year-old woman who unbelievably turns into a young woman and returns to daily life under a new identity, triggering a series of ironic fantasy stories.

    ----- Movie -----
    Title: Wandering Earth
    Type: Science Fiction, Disaster, Adventure, Action
    Format: DVD
    Rating: R
    Stars: 10
    Description: Tells the story of the sun's imminent destruction, which leaves it no longer suitable for human survival; facing this desperate situation, mankind starts the "Wandering Earth" project.

    ----- Movie -----
    Title: Villain Angel
    Type: Comedy, Love
    Format: DVD
    Rating: PG
    Stars: 10
    Description: Tells the story of a girl with a high IQ and low EQ who meets a full-time debt-collecting villain; introduced by a miracle doctor, the two "treat" each other in a series of stories.

    ----- Movie -----
    Title: Wolf Warriors
    Type: Action, War, Military
    Format: VHS
    Rating: PG
    Stars: 10
    Description: Tells the legendary story of a nobody who grows into a lone hero and saves the fate of the country and the nation.

By now you should have a better understanding of parsing XML with Python. Try the examples out in practice.