2025-02-23 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Which seven libraries should every Python developer know? Many less experienced developers are not sure, so this article introduces seven of them, each with a short example.
Note that I deliberately left out libraries such as SQLAlchemy and Flask, because they are already too well known to need a mention.
Let's get started:
1. PyQuery (with lxml)
Installation method: pip install pyquery
Beautiful Soup is the library most often recommended for parsing HTML in Python, and it does the job well. It offers a good Pythonic API, and documentation is easy to find online. But when you need to parse a large number of documents in a short time, you run into performance problems: it is simple, but really slow.
A performance comparison chart from 2008 (not reproduced here) shows that lxml is far faster, but its documentation is sparse and it is quite clumsy to use. So do you choose a library that is simple to use but extremely slow, or one that is fast but very complex to use?
Who says we have to pick one or the other? What we want is an XML/HTML parsing library that is both easy to use and fast.
PyQuery meets both requirements: ease of use and parsing speed.
Look at the following lines of code:
    from pyquery import PyQuery

    page = PyQuery(some_html)
    last_red_anchor = page('#container > a.red:last')
It's very simple and very similar to jQuery, but it's Python.
There are some drawbacks, however. For example, when iterating over a selection you have to wrap each element in PyQuery again before you can get its text:
    for paragraph in page('#container > p'):
        paragraph = PyQuery(paragraph)
        text = paragraph.text()
2. Dateutil
Installation method: pip install python-dateutil
Handling dates is painful; thank goodness for dateutil.
    >>> from dateutil.parser import parse
    >>> parse('Mon, 11 Jul 2011 10:01:56 +0200 (CEST)')
    datetime.datetime(2011, 7, 11, 10, 1, 56, tzinfo=tzlocal())
    >>> # fuzzy ignores unknown tokens
    >>> s = """Today is 25 of September of 2003, exactly. At 10:49:41 with timezone -03:00"""
    >>> parse(s, fuzzy=True)
    datetime.datetime(2003, 9, 25, 10, 49, 41, tzinfo=tzoffset(None, -10800))
3. Fuzzywuzzy
Installation method: pip install fuzzywuzzy
Fuzzywuzzy lets you do fuzzy comparisons between strings, which is useful when you have to deal with human-generated data. The following code uses the Levenshtein distance to match user input against a list of possible choices.
    from Levenshtein import distance

    countries = ['Canada', 'Antarctica', 'Togo', ...]

    def choose_least_distant(element, choices):
        'Return the one element of choices that is most similar to element'
        return min(choices, key=lambda s: distance(element, s))

    user_input = 'canaderp'
    choose_least_distant(user_input, countries)  # 'Canada'
This is good, but it can be done better:
    from fuzzywuzzy import process

    process.extractOne("canaderp", countries)  # ("Canada", 97)
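To make the Levenshtein distance used in the first snippet concrete, here is a rough pure-Python sketch of it. It is not the implementation inside the Levenshtein or fuzzywuzzy packages; the function name levenshtein and the three-item country list are assumptions for illustration only:

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b (case-sensitive)."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (free on a match)
            ))
        previous = current
    return previous[-1]


def choose_least_distant(element, choices):
    """Return the choice with the smallest edit distance to element."""
    return min(choices, key=lambda s: levenshtein(element, s))


countries = ['Canada', 'Antarctica', 'Togo']  # illustrative list
best = choose_least_distant('canaderp', countries)
print(best)
```

The distance counts single-character insertions, deletions and substitutions, so a smaller number means a closer match.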
4. Watchdog
Installation method: pip install watchdog
Watchdog is a Python API and shell utility for monitoring file system events.
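As a rough illustration of the Python API, the sketch below records files created under a directory. It assumes watchdog is installed; the class name CreatedFilesRecorder and the helper watch are hypothetical names chosen for this example:

```python
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer


class CreatedFilesRecorder(FileSystemEventHandler):
    """Collect the paths of files created under the watched directory."""

    def __init__(self):
        super().__init__()
        self.created = []

    def on_created(self, event):
        # watchdog calls this for every creation event; skip new directories.
        if not event.is_directory:
            self.created.append(event.src_path)


def watch(path, seconds):
    """Watch path recursively for the given number of seconds."""
    handler = CreatedFilesRecorder()
    observer = Observer()
    observer.schedule(handler, path, recursive=True)
    observer.start()
    try:
        time.sleep(seconds)
    finally:
        observer.stop()
        observer.join()
    return handler.created
```

Calling watch('.', 10), for example, would watch the current directory for ten seconds and return the paths of any files created in that time.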
5. Sh
Installation method: pip install sh
Sh allows you to call any program as if it were a function:
    from sh import git, ls, wc

    # check out the master branch
    git.checkout("master")

    # print the contents of this directory
    print(ls("-l"))

    # get the length of the longest line of this file
    longest_line = wc(__file__, "-L")
6. Pattern
Installation method: pip install pattern
Pattern is a web mining module for Python. It can be used for data mining, natural language processing, machine learning and network analysis.
7. Path.py
Installation method: pip install path.py
When I started learning Python, os.path was my least favorite part of the stdlib, even for something as simple as building a list of the files in a directory:
    import os

    some_dir = '/some_dir'
    files = []
    for f in os.listdir(some_dir):
        files.append(os.path.join(some_dir, f))
But listdir is in os, not os.path.
With path.py, dealing with file paths becomes easier:
    from path import path

    some_dir = path('/some_dir')
    files = some_dir.files()
Other uses:
    >>> path('/').owner
    'root'
    >>> path('a/b/c').splitall()
    [path(''), 'a', 'b', 'c']
    # overriding __div__
    >>> path('a') / 'b' / 'c'
    path('a/b/c')
    >>> path('ab/c').relpathto('ab/d/f')
    path('../d/f')
Isn't it much better?
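For comparison, a similar workflow can be sketched with the standard library's pathlib module. This is a swapped-in alternative, not path.py's API; the helper name list_files and the temporary files are assumptions made for illustration:

```python
import tempfile
from pathlib import Path


def list_files(directory):
    """Return the regular files directly inside directory, sorted by name."""
    return sorted(p for p in Path(directory).iterdir() if p.is_file())


# The / operator joins path components, much like the overloaded division above.
joined = Path('a') / 'b' / 'c'
print(joined.parts)

# Build a throwaway directory so the example does not depend on real paths.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / 'one.txt').write_text('1')
    (Path(tmp) / 'two.txt').write_text('2')
    (Path(tmp) / 'sub').mkdir()
    names = [p.name for p in list_files(tmp)]

print(names)
```

Here the subdirectory is filtered out by is_file(), so only the two text files are listed.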
After reading this, you should be familiar with the seven libraries that Python developers should know. Thank you for reading!