2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Many newcomers are unsure how to use Python as a scripting tool for converting data. To help with that, this article walks through a worked example in detail. Anyone with a similar need can follow along, and hopefully come away with something useful.
Python is often described as an interpreted language, meaning that a program is read and executed one line at a time. The Python program presented here converts a CSV file, using both its file name and its contents, into a Linux directory structure in which each entry in the contents becomes a directory. That description is rather abstract, so a concrete example makes it easier to follow. Suppose there is a small requirement: a compressed file, JiangSu-1949-10-01-HumanScience.csv.gz, contains a detailed list of the humanities universities established in Jiangsu Province before October 1, 1949, and it needs to be converted into a set of folders. Each folder stores the details of one university taken from the overall list. (Note: the script below is only a sample Python implementation of this small requirement, and its argument is the file JiangSu-1949-10-01-HumanScience.csv.gz.)
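The file name itself carries the metadata that drives the conversion, so it helps to look at that step in isolation first. The following is a minimal sketch, assuming the name always follows the Province-YYYY-MM-DD-Category pattern of the example file. The helper name parse_archive_name and the constant FILENAME_PATTERN are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical constant: province, year, month, day, category, then a literal dot.
FILENAME_PATTERN = re.compile(r"(\w{2,})-(\d{4})-(\d{2})-(\d{2})-(\w+)\.")


def parse_archive_name(filename):
    """Return (province, year, month, day, category), or None if the name does not match."""
    match = FILENAME_PATTERN.search(filename)
    return match.groups() if match else None


parts = parse_archive_name("JiangSu-1949-10-01-HumanScience.csv.gz")
print(parts)  # ('JiangSu', '1949', '10', '01', 'HumanScience')
```

Returning None for a name that does not match lets the caller decide whether to skip the file or stop with an error.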
#!/usr/bin/env python3
# Requires Python 3.
import csv, gzip, os, sys, re

def ResolveUnivercity(TotalUniversities, pattern):
    # Split the archive file name into its parts; None if it does not match.
    SpecifiedUnivercity = pattern.search(TotalUniversities)
    if SpecifiedUnivercity is None:
        return None
    return SpecifiedUnivercity.groups()

def TransformData(TotalUniversities, RootUnivercity_Path, pattern):
    SpecifiedUniversity = ResolveUnivercity(TotalUniversities, pattern)
    if SpecifiedUniversity is None:
        print('Cannot resolve TotalUniversities file name')
        return 1
    else:
        print('UniversityName: [%s] Born: [%s-%s-%s] Area: [%s]' % SpecifiedUniversity)
    if SpecifiedUniversity[4] not in ["NatureScience", "HumanScience"]:
        print("system only analyze Nature or Human Science")
        return 1
    if RootUnivercity_Path is None:
        # Default root directory: the first part of the file name (e.g. JiangSu).
        RootUnivercity_Path = '%s' % SpecifiedUniversity[0]
    gfile = gzip.open(TotalUniversities, "rt", newline="")
    rownum = 0
    hasNext = False
    # Maps a university name to [output file, csv writer, flag].
    SpecifiedUnivercity_Content = {}
    SpecifiedUnivercity_DetailContent = None
    for row in csv.reader(gfile):
        if rownum == 0:
            header = row  # the header row is not copied to the output files
        else:
            SpecifiedUniversity_Name = row[0]
            if len(SpecifiedUniversity_Name) > 0:
                hasNext = True
                if SpecifiedUniversity_Name in SpecifiedUnivercity_Content:
                    SpecifiedUnivercity_DetailContent = SpecifiedUnivercity_Content[SpecifiedUniversity_Name][1]
                else:
                    rownum = 0
                    # create new file
                    # (the '_' separators are an assumption; the source text was garbled here)
                    SpecifiedUniversity_File = '%s_%s_%s.csv' % (SpecifiedUniversity[0], SpecifiedUniversity_Name, SpecifiedUniversity[2])
                    AbsolutePath = os.path.join(RootUnivercity_Path, SpecifiedUniversity_Name)
                    AbsolutePathName = os.path.join(AbsolutePath, SpecifiedUniversity_File)
                    if not os.path.exists(AbsolutePath):
                        os.makedirs(AbsolutePath, 0o777)
                    outfile = open(AbsolutePathName, 'w', newline='')
                    SpecifiedUnivercity_DetailContent = csv.writer(outfile, delimiter=',')
                    # insert into SpecifiedUniversity
                    SpecifiedUnivercity_Content[SpecifiedUniversity_Name] = [outfile, SpecifiedUnivercity_DetailContent, False]
            else:
                # An empty first column continues the current university.
                hasNext = False
            if SpecifiedUnivercity_DetailContent is not None:
                SpecifiedUnivercity_DetailContent.writerow(row)
            else:
                print("row %d has no owner file" % rownum)
        rownum += 1
    gfile.close()
    # Close every output file that was opened.
    for key in SpecifiedUnivercity_Content.keys():
        SpecifiedUnivercity_Content[key][0].close()

if __name__ == "__main__":
    # JiangSu-1949-10-01-HumanScience.csv.gz
    ResolvedSpecofiedUnivercity = re.compile(r'(\w{2,})-(\d{4})-(\d{2})-(\d{2})-(\w{1,})\.')
    for file in sys.argv[1:]:
        TransformData(file, None, ResolvedSpecofiedUnivercity)
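To try the core idea without a real archive, the sketch below builds a tiny gzip-compressed CSV in a temporary directory and splits it into one folder per name in the first column. It is a simplified illustration, not the script above. The function name split_by_first_column, the three-column layout and the sample rows are all made-up placeholders, and the sketch assumes that a row with an empty first column continues the previous university.

```python
import csv
import gzip
import os
import tempfile


def split_by_first_column(gz_path, root_dir):
    """Write each group of rows to root_dir/<name>/<name>.csv; return rows written per name."""
    handles = {}   # name -> (open file, csv writer)
    counts = {}    # name -> number of rows written
    current = None
    with gzip.open(gz_path, "rt", newline="") as src:
        reader = csv.reader(src)
        next(reader, None)  # skip the header row
        for row in reader:
            if not row:
                continue  # ignore blank lines
            if row[0]:
                current = row[0]
                if current not in handles:
                    folder = os.path.join(root_dir, current)
                    os.makedirs(folder, exist_ok=True)
                    out = open(os.path.join(folder, current + ".csv"), "w", newline="")
                    handles[current] = (out, csv.writer(out))
                    counts[current] = 0
            if current is None:
                continue  # continuation row before any named row: no owner file
            handles[current][1].writerow(row)
            counts[current] += 1
    for out, _writer in handles.values():
        out.close()
    return counts


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "JiangSu-1949-10-01-HumanScience.csv.gz")
    with gzip.open(archive, "wt", newline="") as f:
        # Placeholder rows, not real data.
        csv.writer(f).writerows([
            ["name", "field", "value"],
            ["UniversityA", "detail", "first"],
            ["", "detail", "second"],
            ["UniversityB", "detail", "first"],
        ])
    root = os.path.join(tmp, "JiangSu")
    counts = split_by_first_column(archive, root)
    folders = sorted(os.listdir(root))
    print(counts)   # {'UniversityA': 2, 'UniversityB': 1}
    print(folders)  # ['UniversityA', 'UniversityB']
```

Keeping one open writer per name means rows for the same university can appear anywhere in the input and still land in the same file.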