2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article introduces how to process large files on a single machine with Python. Many people run into this problem in practice, so the following walks through two ways of handling it.
The discussion assumes that each row of data can be processed on its own, with no dependence between rows.
Method 1:
Read the file into memory line by line using only Python's built-in features.
Using yield has the benefit of decoupling the read operation from the processing operation:
def python_read(filename):
    with open(filename, 'r', encoding='utf-8') as f:
        while True:
            line = f.readline()
            if not line:  # an empty string means end of file
                return
            yield line
The generator reads one line at a time, so the caller iterates over it and processes the data line by line:
if __name__ == '__main__':
    g = python_read('./data/movies.dat')
    for c in g:
        print(c)
        # process c
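As a rough sketch of the decoupling described above, the reader below only yields lines while a separate function consumes them. It iterates over the file object directly, which in Python also produces one line per step. The names read_lines and count_fields, the '::' delimiter, and the sample rows are assumptions made for illustration and are not part of the original article.

```python
import os
import tempfile


def read_lines(filename):
    """Yield lines one at a time; a file object is itself a line iterator."""
    with open(filename, 'r', encoding='utf-8') as f:
        for line in f:
            yield line.rstrip('\n')


def count_fields(lines, sep='::'):
    """Hypothetical processing step: count the fields in each row."""
    for line in lines:
        yield len(line.split(sep))


# Build a tiny sample file so the sketch runs on its own (made-up rows).
with tempfile.NamedTemporaryFile('w', suffix='.dat', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('1::Movie A::Drama\n2::Movie B::Comedy\n')
    sample_path = tmp.name

# The processing step never touches the file; it only sees an iterator.
field_counts = list(count_fields(read_lines(sample_path)))
os.remove(sample_path)
print(field_counts)
```

Because count_fields accepts any iterable of strings, the reading side can be swapped out without changing the processing side.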
Method 2:
The first method has a shortcoming: it reads and hands over one line at a time, and the frequent read operations can slow processing down. Is there a way to read multiple lines in a single operation?
The read_csv function in the pandas package takes dozens of parameters (38 by this article's count, though the number varies across pandas versions) and is very powerful.
For processing large files on a single machine, the chunksize parameter of read_csv does exactly this: setting it to 5 means reading five rows at a time.
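import pandas as pd

def pandas_read(filename, sep=',', chunksize=5):
    # sep must be passed by keyword in recent pandas versions;
    # engine='python' supports multi-character separators such as '::'
    reader = pd.read_csv(filename, sep=sep, chunksize=chunksize, engine='python')
    while True:
        try:
            yield reader.get_chunk()
        except StopIteration:
            print('---Done---')
            break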
It is used in the same way as method 1:
if __name__ == '__main__':
    g = pandas_read('./data/movies.dat', sep='::')
    for c in g:
        print(c)
        # process c
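Because rows are assumed to be independent, per-chunk results can be combined into a running total without ever holding the whole file in memory. The following is a minimal sketch under that assumption: the in-memory CSV and its columns user and score are made up for illustration and stand in for a real large file. It iterates over the reader with a for loop, which yields the same chunks as repeated get_chunk() calls.

```python
import io

import pandas as pd

# Made-up sample data standing in for a large file on disk.
sample = io.StringIO("user,score\na,1\nb,2\na,3\nb,4\na,5\n")

total = 0
rows = 0
# With chunksize set, read_csv returns an iterator of DataFrames.
for chunk in pd.read_csv(sample, chunksize=2):
    total += int(chunk['score'].sum())  # aggregate within the chunk
    rows += len(chunk)                  # then fold into the running totals

print(total, rows)
```

Only one chunk is held in memory at a time, so peak memory use is governed by chunksize rather than by the size of the file.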