Shulou (Shulou.com), 06/03 report. Updated 2025-01-18. From: SLTechnology News&Howtos.
This article shows several ways to download files with Python: the requests and wget modules, the standard urllib library, urllib3, boto3 for Amazon S3, and asyncio. It also covers redirects, large files, parallel downloads, progress bars, and proxies.
1. Use requests
You can use the requests module to download files from a URL.
Consider the following code:
You call the get method of the requests module with the URL and store the response in a variable called "myfile". Then you write the content of this variable to a file.
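The steps above can be sketched as follows. This is a minimal illustration, not the article's original listing: the function name download_file, the timeout value, and the raise_for_status() check are assumptions added here; only requests.get and the myfile variable come from the text.

```python
import requests


def download_file(url, path):
    """Fetch url and write the response body to path (hypothetical helper)."""
    # "myfile" mirrors the variable name used in the text.
    myfile = requests.get(url, timeout=30)
    # Assumption: stop on 4xx/5xx instead of saving an error page.
    myfile.raise_for_status()
    # Open in binary mode so images and PDFs are written unchanged.
    with open(path, "wb") as f:
        f.write(myfile.content)
    return path
```

For example, download_file("https://example.com/file.bin", "file.bin") would save the response body as file.bin in the current directory; the URL is a placeholder.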
2. Use wget
You can also download files from a URL using Python's wget module. Install it with pip by running: pip install wget
Consider the following code, which we will use to download the logo image of Python.
In this code, the URL and path (where the image will be stored) are passed to the download method of the wget module.
3. Download redirected files
In this section, you will learn how to download a file from a URL using requests, and the URL will be redirected to another URL with a .pdf file. The URL looks like this:
To download this pdf file, use the following code:
In this code, the first step is to specify the URL. Then we use the get method of the requests module to fetch it. In the get call, we set allow_redirects to True (which is already the default for GET requests), so requests follows the redirect and the content of the final URL is assigned to the variable myfile.
Finally, we open a file and write the fetched content to it.
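The redirect case can be sketched like this. The function name download_redirected, the timeout, and the status check are assumptions for illustration, and the URL is left as a parameter because the one used in the article is not shown.

```python
import requests


def download_redirected(url, path):
    """Follow redirects from url and save the final response body to path."""
    # allow_redirects=True is the default for GET; it is spelled out here
    # only to match the text.
    myfile = requests.get(url, allow_redirects=True, timeout=30)
    myfile.raise_for_status()
    with open(path, "wb") as f:
        f.write(myfile.content)
    # myfile.url holds the final URL after all redirects were followed.
    return myfile.url
```

Returning the final URL is a design choice made here so that the caller can see where the redirect ended up.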
4. Download large files in blocks
Consider the following code:
First, we use the get method of the requests module as before, but this time we set the stream parameter to True, so the response body is not loaded into memory all at once.
Next, we create a file called PythonBook.pdf in the current working directory and open it for writing.
Then we specify the size of the chunk to download each time. We set it to 1024 bytes, then iterate over the chunks and write each one to the file until no chunks remain.
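A sketch of the chunked download described above. The function name download_in_chunks, the timeout, and the status check are assumptions; stream=True and iter_content are the requests features the text refers to.

```python
import requests


def download_in_chunks(url, path, chunk_size=1024):
    """Stream url to path in pieces of chunk_size bytes."""
    # stream=True defers downloading the body until it is iterated.
    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()
        with open(path, "wb") as pdf:
            for chunk in response.iter_content(chunk_size=chunk_size):
                # Skip empty chunks rather than writing them.
                if chunk:
                    pdf.write(chunk)
    return path
```

Calling it with the path "PythonBook.pdf" would reproduce the file name used in the text.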
Isn't it beautiful? Don't worry, we'll display a progress bar for the download process later.
5. Download multiple files (parallel / bulk download)
To download multiple files at the same time, import the following modules:
We import the os and time modules to measure how long the downloads take. The ThreadPool class (from multiprocessing.pool) lets you run a function across a pool of worker threads.
Let's create a simple function that sends the response in chunks to a file:
The function's argument is a (path, URL) pair. The full list of downloads is a two-dimensional array of such pairs, each giving the path to save to and the URL of the page you want to download.
As in the previous section, we pass the URL to requests.get. Finally, we open the file (at the path given in the pair) and write the page content to it.
Now we can call this function for each URL one at a time, or for all URLs at the same time. Let's first call it for each URL in a for loop and take note of the timer:
Now replace the for loop with the following line of code:
Run the script.
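The sequential and parallel variants described in this section can be sketched together as below. The function names download_sequentially and download_in_parallel, the worker count, and the timeout are assumptions; url_response follows the description of the helper in the text. The os import is omitted because this sketch does not need it.

```python
import time
from multiprocessing.pool import ThreadPool

import requests


def url_response(entry):
    """Download one (path, url) pair, writing the response in chunks."""
    path, url = entry
    response = requests.get(url, stream=True, timeout=30)
    response.raise_for_status()
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size=1024):
            f.write(chunk)
    return path


def download_sequentially(urls):
    """Call url_response for each pair in a plain for loop, timing the run."""
    start = time.time()
    paths = [url_response(entry) for entry in urls]
    return paths, time.time() - start


def download_in_parallel(urls, workers=4):
    """Run url_response on a pool of worker threads, timing the run."""
    start = time.time()
    with ThreadPool(workers) as pool:
        # Consuming the iterator makes sure every download has finished
        # before the pool is shut down.
        paths = list(pool.imap_unordered(url_response, urls))
    return paths, time.time() - start
```

Both functions take a list such as [("page1.html", "https://example.com/1"), ("page2.html", "https://example.com/2")] (placeholder URLs) and return the saved paths with the elapsed seconds, so the two timings can be compared directly.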
6. Download with a progress bar
The progress bar is a UI component of the clint module. Install clint with pip by running: pip install clint
Consider the following code:
In this code, we first import the requests module and then the progress component from clint.textui. The only difference from the chunked download is in the for loop: when writing content to the file, we wrap the chunk iterator in the bar method of the progress module.
7. Download a web page using urllib
In this section, we will use urllib to download a web page.
The urllib library is part of Python's standard library, so you don't need to install it.
The following lines of code can easily download a web page:
Specify the URL of the page you want to save and the path where you want to store it.
In this code, we call the urlretrieve function and pass it the URL and the path to save to. For a web page, give the file an .html extension.
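A minimal sketch using only the standard library. The wrapper name download_page is hypothetical; urllib.request.urlretrieve is the function the text describes.

```python
import urllib.request


def download_page(url, path):
    """Save the resource at url to path and return the saved path."""
    # urlretrieve returns a (filename, headers) tuple.
    saved_path, _headers = urllib.request.urlretrieve(url, path)
    return saved_path
```

For instance, download_page("https://example.com/", "example.html") would store the page as example.html; the URL is a placeholder.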
8. Download through a proxy
If you need to download your files through a proxy, you can use the ProxyHandler class of the urllib.request module. Look at the following code:
In this code, we create a ProxyHandler object and pass it to urllib's build_opener function to build an opener. Then we use the opener to request the page.
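The proxy setup can be sketched as follows. The function names and the proxy address used in the note below are placeholders, not values from the article; ProxyHandler and build_opener are standard-library calls.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that routes HTTP and HTTPS requests through proxy_url."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(proxy)


def fetch_via_proxy(url, proxy_url, timeout=30):
    """Fetch url through the proxy and return the response body as bytes."""
    opener = build_proxy_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

A call such as fetch_via_proxy("https://example.com/", "http://proxy.example.com:8080") would only succeed if a proxy is actually listening at that address.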
In addition, you can use the requests module as described in the official documentation:
You just need to import the requests module and create a dictionary that maps each URL scheme to your proxy address. Then you pass it to the request through the proxies parameter and get the file.
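With requests, the same idea can be sketched like this. The function name get_via_proxy, the timeout, and the proxy address in the note are assumptions; the proxies parameter is part of the requests API.

```python
import requests


def get_via_proxy(url, proxy_url):
    """Send a GET request for url through the proxy at proxy_url."""
    # Map each URL scheme to the proxy that should handle it.
    proxies = {"http": proxy_url, "https": proxy_url}
    return requests.get(url, proxies=proxies, timeout=30)
```

For example, get_via_proxy("https://example.com/", "http://proxy.example.com:8080") returns a response object whose content attribute holds the downloaded bytes, provided the placeholder proxy exists.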
9. Use urllib3
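urllib3 is a third-party HTTP client library that adds features such as connection pooling on top of what the standard urllib module offers. You can install it with pip by running: pip install urllib3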
We will use urllib3 to get a web page and store it in a text file.
Import the following modules:
We use the shutil module to copy the response stream into a file.
Now, let's initialize the URL string variable like this:
Then we create urllib3's PoolManager, which keeps track of the necessary connection pools.
Create a file:
Finally, we send a GET request to get the URL and open a file, and then write the response to the file:
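The urllib3 steps above can be combined into one sketch. The function name download_with_urllib3 is hypothetical, and streaming with preload_content=False is a choice made here so that shutil can copy the body without holding it all in memory.

```python
import shutil

import urllib3


def download_with_urllib3(url, path):
    """Send a GET request with urllib3 and copy the response into path."""
    # PoolManager keeps track of the connection pools.
    http = urllib3.PoolManager()
    # preload_content=False leaves the body unread so it can be streamed.
    response = http.request("GET", url, preload_content=False)
    try:
        with open(path, "wb") as out:
            shutil.copyfileobj(response, out)
    finally:
        # Return the connection to the pool once the copy is done.
        response.release_conn()
    return path
```

For instance, download_with_urllib3("https://example.com/", "page.txt") would store the page in a text file; the URL is a placeholder.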
10. Download files from S3 using Boto3
To download files from Amazon S3, you can use the Python boto3 module.
Before you begin, install the awscli module with pip by running: pip install awscli
To configure AWS, run: aws configure
Then enter your details at the prompts: AWS access key ID, secret access key, default region name, and default output format.
To download files from Amazon S3, you need to import boto3 and botocore. Boto3 is the AWS SDK for Python, which gives Python code access to Amazon web services such as S3. Botocore is the lower-level library that both boto3 and the AWS command-line interface are built on.
Botocore is installed along with awscli. To install boto3, run: pip install boto3
Now, import these two modules:
When downloading a file from Amazon, we need three parameters:
Bucket name
The name of the file you need to download
The name of the file after download
Initialize variables:
Now, let's initialize a variable that holds the S3 resource for the session. To do this, we call the resource() function of boto3 and pass in the service name, "s3":
Finally, download the file using the download_file method and pass in the variables:
11. Use asyncio
The asyncio module is used to write concurrent code with the async and await keywords. It works around an event loop that waits for an event to occur and then reacts to it. The reaction can be a call to another function. This process is called event handling. The asyncio module uses coroutines for event handling.
To use asyncio event handling and coroutines, we import the asyncio module:
Now, define an asyncio coroutine like this:
The keyword async indicates that this is a native asyncio coroutine. Inside the coroutine, the await keyword suspends execution until the awaited operation completes and then yields its result. We can also use the return keyword.
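A minimal coroutine can be sketched as below. The names my_func and coroutine and the short sleep that stands in for real work are hypothetical.

```python
import asyncio


async def my_func():
    """Stand-in for real work: pause briefly, then return a value."""
    await asyncio.sleep(0.01)
    return "done"


async def coroutine():
    """Await another coroutine and pass its result on with return."""
    result = await my_func()
    return result
```

Calling coroutine() only creates a coroutine object; it runs when an event loop executes it, for example through asyncio.run(coroutine()).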
Now, let's use coroutines to write code that downloads files from a website:
In this code, we create an asynchronous coroutine function that downloads a file and returns a message.
Then another coroutine, main_func, creates a download coroutine for each URL, collects them into a list, and awaits them. The wait function of asyncio waits for the coroutines to complete.
Finally, to start the coroutines, we get the event loop with asyncio's get_event_loop() function and run it with the loop's run_until_complete() method. In Python 3.7 and later, asyncio.run() does both steps in one call.
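The download coroutines can be sketched as follows. Two substitutions are made here and are not from the article: the blocking download uses the standard library's urlretrieve inside asyncio.to_thread (Python 3.9+) so that it does not stall the event loop, and asyncio.run replaces the get_event_loop/run_until_complete pair. The (url, path) job format is also an assumption.

```python
import asyncio
import urllib.request


async def download(url, path):
    """Download url to path in a worker thread and return a message."""
    # urlretrieve blocks, so it runs in a thread while the loop stays free.
    await asyncio.to_thread(urllib.request.urlretrieve, url, path)
    return f"Downloaded {url} to {path}"


async def main_func(jobs):
    """Start one download task per (url, path) job and wait for all of them."""
    # asyncio.wait needs tasks (not bare coroutines) on current Python versions.
    tasks = [asyncio.create_task(download(url, path)) for url, path in jobs]
    done, _pending = await asyncio.wait(tasks)
    # task.result() re-raises any exception from a failed download.
    return [task.result() for task in done]
```

Running asyncio.run(main_func([("https://example.com/", "example.html")])) would start the event loop, perform the download, and return the list of messages; the URL is a placeholder.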
This concludes "how to download with Python". Thank you for reading.
© 2024 shulou.com SLNews company. All rights reserved.