Updated 2025-01-14. From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article introduces a number of practical Python libraries. The explanations are short and easy to follow, so let's work through them one by one.
Wget
Extracting data from the web is one of a data scientist's important tasks. Wget is a free utility for downloading files from the web non-interactively. It supports the HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. Because it is non-interactive, it can work in the background even when the user is not logged in. So the next time you want to download all the images on a website or page, wget can help. Note that this description is of the command-line tool; the wget package installed below offers a simple download function for use from Python.
Installation:
$ pip install wget
Example:
import wget
url = 'http://www.futurecrew.com/skaven/song_files/mp3/razorback.mp3'
filename = wget.download(url)
# 3841532 / 3841532 (download progress)
filename
# 'razorback.mp3'
Pendulum
If working with dates and times in Python frustrates you, Pendulum is for you. It is a Python package that simplifies datetime manipulation and is designed as a drop-in replacement for Python's native datetime class. Please refer to the documentation for further study.
Installation:
$ pip install pendulum
Example:
import pendulum
dt_toronto = pendulum.datetime(2012, 1, 1, tz='America/Toronto')
dt_vancouver = pendulum.datetime(2012, 1, 1, tz='America/Vancouver')
print(dt_vancouver.diff(dt_toronto).in_hours())
# 3
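For comparison, here is a minimal sketch of the same calculation using only the standard library. It assumes fixed standard-time offsets for January (UTC-5 for Toronto, UTC-8 for Vancouver) instead of real time zone data, which is exactly the kind of manual bookkeeping Pendulum is meant to take off your hands. The constant names are chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed standard-time offsets that hold in January.
# Daylight saving time is ignored, so this is only valid for winter dates.
TORONTO = timezone(timedelta(hours=-5), "EST")
VANCOUVER = timezone(timedelta(hours=-8), "PST")

dt_toronto = datetime(2012, 1, 1, tzinfo=TORONTO)
dt_vancouver = datetime(2012, 1, 1, tzinfo=VANCOUVER)

# Subtracting two aware datetimes yields a timedelta; convert it to whole hours.
diff_hours = int((dt_vancouver - dt_toronto).total_seconds() // 3600)
print(diff_hours)  # 3
```

Midnight in Vancouver falls three hours after midnight in Toronto, which agrees with the Pendulum result.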
Imbalanced-learn
Most classification algorithms work best when every class has roughly the same number of samples, that is, when the data is balanced. Most real-world datasets, however, are imbalanced, which strongly affects both the training phase and the subsequent predictions of machine learning algorithms. This library was created to address that problem. It is compatible with scikit-learn and is part of the scikit-learn-contrib project. The next time you encounter an imbalanced dataset, give it a try.
Installation:
$ pip install -U imbalanced-learn
# or
$ conda install -c conda-forge imbalanced-learn
Example:
Please refer to the documentation for usage and examples.
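To make the idea of rebalancing concrete, here is a minimal pure-Python sketch of random oversampling, one of the simplest strategies: minority-class rows are duplicated at random until every class is as large as the biggest one. This is an illustration of the technique, not the library's implementation; the function name and toy data are hypothetical. In imbalanced-learn itself, resamplers such as RandomOverSampler expose this kind of operation through fit_resample(X, y).

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_res, y_res = list(X), list(y)
    for label, count in counts.items():
        # Original rows that belong to this class.
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Draw, with replacement, the number of extra rows still needed.
        for _ in range(target - count):
            X_res.append(rng.choice(rows))
            y_res.append(label)
    return X_res, y_res


# Hypothetical toy data: six samples of class 0, two samples of class 1.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [2.0], [2.1]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_res, y_res = random_oversample(X, y)
print(Counter(y), "->", Counter(y_res))
```

Duplicating rows is crude and can encourage overfitting, which is one reason a dedicated library with several resampling strategies is worth having.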
FlashText
In NLP tasks, cleaning text data often requires replacing keywords in sentences or extracting keywords from them. This can usually be done with regular expressions, but that becomes cumbersome when there are thousands of terms to search for. Python's FlashText module, based on the FlashText algorithm, provides a suitable alternative for this situation. The best thing about FlashText is that its run time stays almost the same regardless of the number of search terms. See the FlashText documentation to learn more.
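The reason the run time barely depends on the number of terms is that the keywords are stored in a trie and the sentence is scanned once against it. Below is a minimal sketch of that idea using a trie over lowercase word tokens. The class TinyKeywordProcessor is hypothetical and much simpler than the real library, which works on characters and handles word boundaries more carefully.

```python
import re

_END = object()  # sentinel key marking "a keyword ends here"


class TinyKeywordProcessor:
    """Toy keyword extractor: a trie over lowercase word tokens."""

    def __init__(self):
        self.trie = {}

    def add_keyword(self, keyword, clean_name=None):
        node = self.trie
        for token in keyword.lower().split():
            node = node.setdefault(token, {})
        # Store the name to report when this keyword is found.
        node[_END] = clean_name or keyword

    def extract_keywords(self, sentence):
        tokens = re.findall(r"\w+", sentence.lower())
        found, i = [], 0
        while i < len(tokens):
            node, j = self.trie, i
            match, match_end = None, i
            # Follow the trie as far as the tokens allow,
            # remembering the longest keyword seen on the way.
            while j < len(tokens) and tokens[j] in node:
                node = node[tokens[j]]
                j += 1
                if _END in node:
                    match, match_end = node[_END], j
            if match is not None:
                found.append(match)
                i = match_end
            else:
                i += 1
        return found


kp = TinyKeywordProcessor()
kp.add_keyword('Big Apple', 'New York')
kp.add_keyword('Bay Area')
print(kp.extract_keywords('I love Big Apple and Bay Area.'))
# ['New York', 'Bay Area']
```

Adding more keywords makes the trie larger, but each token of the sentence still only triggers a handful of dictionary lookups.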
Installation:
$ pip install flashtext
Example:
Extract keywords
from flashtext import KeywordProcessor
keyword_processor = KeywordProcessor()
# keyword_processor.add_keyword(<keyword>, <clean name>)
keyword_processor.add_keyword('Big Apple', 'New York')
keyword_processor.add_keyword('Bay Area')
keywords_found = keyword_processor.extract_keywords('I love Big Apple and Bay Area.')
keywords_found
# ['New York', 'Bay Area']
Replace keyword
keyword_processor.add_keyword('New Delhi', 'NCR region')
new_sentence = keyword_processor.replace_keywords('I love Big Apple and new delhi.')
new_sentence
# 'I love New York and NCR region.'
Fuzzywuzzy
The name sounds odd, but fuzzywuzzy is a very handy library for string matching. It makes it easy to compute string similarity ratios, token-based ratios and similar measures, and it is also convenient for matching records stored in different databases.
Installation:
$ pip install fuzzywuzzy
Example:
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
# Simple ratio
fuzz.ratio("this is a test", "this is a test!")
# 97
# Partial ratio
fuzz.partial_ratio("this is a test", "this is a test!")
# 100
More interesting examples can be found in the GitHub repository.
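To see where scores such as 97 and 100 come from, here is a minimal sketch built on the standard library's difflib. The helper names simple_ratio and partial_ratio are chosen for illustration, and the sliding-window search is a naive stand-in, not the library's actual implementation.

```python
from difflib import SequenceMatcher


def simple_ratio(a, b):
    """Similarity of the two whole strings on a 0-100 scale."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def partial_ratio(a, b):
    """Best similarity between the shorter string and any equally long
    window of the longer string."""
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    if not shorter:
        return 0
    width = len(shorter)
    return max(
        simple_ratio(shorter, longer[start:start + width])
        for start in range(len(longer) - width + 1)
    )


print(simple_ratio("this is a test", "this is a test!"))   # 97
print(partial_ratio("this is a test", "this is a test!"))  # 100
```

The whole-string ratio is 2 x 14 matching characters out of 29 in total, about 0.966, which rounds to 97. The partial ratio is 100 because the shorter string appears verbatim inside the longer one.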
PyFlux
Time series analysis is one of the most common problems in machine learning. PyFlux is an open source Python library built specifically for time series problems. It offers a good selection of modern time series models, including but not limited to ARIMA, GARCH and VAR models. In short, PyFlux takes a probabilistic approach to time series modeling. It's worth a try.
Installation
$ pip install pyflux
Examples
Please refer to the official documentation for detailed usage and examples.
Ipyvolume
Presenting results is also an important part of data science, and being able to visualize them is a great advantage. ipyvolume is a Python library for visualizing 3D volumes and glyphs (such as 3D scatter plots) in the Jupyter notebook with only minimal configuration. It is still in the pre-1.0 stage, however. A fitting analogy: ipyvolume's volshow is to 3D arrays what matplotlib's imshow is to 2D arrays. See the ipyvolume documentation for more.
Using pip:
$ pip install ipyvolume
Using Conda/Anaconda:
$ conda install -c conda-forge ipyvolume
Examples
[Figure: animation]
[Figure: volume rendering]
Dash
Dash is a productive Python framework for building web applications. It is built on top of Flask, Plotly.js and React.js and ties modern UI elements such as drop-downs, sliders and graphs directly to your analytical Python code, with no JavaScript required. Dash is ideal for building data visualization applications, which are then rendered in a web browser. See the user guide for details.
Installation
$ pip install dash==0.29.0  # Core dash backend
$ pip install dash-html-components==0.13.2  # HTML components
$ pip install dash-core-components==0.36.0  # Enhancement components
$ pip install dash-table==3.1.3  # Interactive DataTable component (the latest!)
The article's example is a highly interactive chart with a drop-down menu. When the user selects a value in the drop-down, the application code dynamically loads data from Google Finance into a pandas DataFrame.
Gym
OpenAI's Gym is a toolkit for developing and comparing reinforcement learning algorithms. It is compatible with any numerical computation library, such as TensorFlow or Theano. The Gym library is a collection of test problems, called environments, that you can use to develop your reinforcement learning algorithms. These environments share a common interface, which lets you write general algorithms.
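The value of a shared interface is easiest to see in code. The sketch below does not use Gym at all: CoinFlipEnv is a hypothetical toy environment that mimics the convention of older Gym releases, where reset() returns an observation and step(action) returns (observation, reward, done, info). The point is that run_episode never needs to know which environment it is driving.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment, not part of Gym: guess a coin flip."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0  # dummy initial observation

    def sample_action(self):
        return self.rng.choice([0, 1])

    def step(self, action):
        coin = self.rng.choice([0, 1])
        reward = 1.0 if action == coin else 0.0
        self.steps += 1
        done = self.steps >= self.episode_length
        return coin, reward, done, {}


def run_episode(env, policy):
    """Generic loop: works with any object exposing reset() and step()."""
    observation = env.reset()
    total_reward, done, steps = 0.0, False, 0
    while not done:
        action = policy(observation)
        observation, reward, done, _info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


env = CoinFlipEnv(episode_length=10)
total, steps = run_episode(env, lambda obs: env.sample_action())
print(steps, total)
```

Swapping in a different environment with the same two methods would leave run_episode unchanged, which is what makes general-purpose algorithms possible.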
Installation
$ pip install gym
Example
This example runs an instance of the CartPole-v0 environment for 1000 time steps, rendering the whole scene at each step.
Thank you for reading. That concludes this overview of practical Python libraries. Hopefully you now have a better sense of what is available; how each library behaves in your own work is best verified in practice.