In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/03 Report--
Editor to share with you how to solve the problem of python converting xml format files into txt files, I believe most people do not know much about it, so share this article for your reference, I hope you can learn a lot after reading this article, let's go to know it!
Overview
Let's first introduce the file in xml format. From the point of view of data analysis, the dataset in xml format has the following advantages: openness (ability to read and process data on any platform, allowing xml data to be exchanged through some network protocols), simplicity (plain text, data can be exchanged between different systems), structure and content separation (unlike HTML, data display and data itself are separate), scalability
Problem description
So how do we use the data in xml when we analyze the data?
We need to convert this type of file into other types of files.
(in fact, I think it's better to extract xml data to form a new type file.) from my personal point of view, dealing with this problem is a bit similar to a web crawler, but unlike a crawler, you don't need to consider the IP proxy address (anti-crawling is really a difficult problem to deal with)
Solution to the problem
The content of the file in xml format is roughly as follows:
Import osimport sysimport xml.etree.ElementTree as ETimport globdef xml_to_txt (indir, outdir): os.chdir (indir) # indir is the folder from which xml files are sourced Outdir for the converted txt file storage path annotated = os.listdir ('.') # returns a list containing the file names in the directory print (annotated) for I, file in enumerate (annotated): file_save = file.split ('.) [0] + .txt'# split separates the file name from the suffix name file_txt = outdir + "\\" + file_save file w = open (file_txt) 'w') in_file = open (file,encoding='UTF-8') tree = ET.parse (in_file) root = tree.getroot () # the following code can be ignored You need to find the label corresponding to the data you need on the xml dataset, find a way to assign it to a variable, and then write it into a new file to ok the for value in root.iter ('xxx'): value = value.text f_w.write (value) f_w.write ('\ n\ n')
Also, I would like to say that this method is quite useful, and it is also convenient to read all xml files directly when you are dealing with a folder that contains a lot of .xml.
The above is all the contents of this article entitled "how to solve the problem of python converting xml format files into txt files". Thank you for reading! I believe we all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.