
How to use Python to draw the global diffusion map of COVID-19

2025-01-29 Update From: SLTechnology News&Howtos shulou


Shulou(Shulou.com)06/03 Report--

This article explains how to use Python to draw a map of the global spread of COVID-19.

In a world where global travel is commonplace, the spread of disease is a real concern. Some organizations track major epidemics (and all pandemics) and publish the data they collect. However, this raw data can be hard to work with, which is why data science matters. For example, using Python and Pandas to visualize the global spread of COVID-19 can help in analyzing the data.

A large volume of raw data can seem hard to handle at first, but once you start processing it, workable approaches emerge. Here are some common tasks when working with COVID-19 data:

Download the daily per-country COVID-19 data from GitHub and load it into a Pandas DataFrame. This requires the Pandas library for Python.

Process and clean the downloaded data so that it fits the input format the visualization expects. The downloaded data is in good shape (it is regular). One problem is that it identifies each country by name, whereas a three-letter ISO 3 code (ISO 3166-1 alpha-3) is a better identifier. To generate the ISO 3 codes, you can use the Python library pycountry. Once the codes are generated, you can add a column to the original DataFrame and populate it with them.

Finally, for the visualization, we use the express module of the Plotly library. This article uses a choropleth map (available in Plotly) to visualize the spread of the disease around the world.

Step 1: Corona data

Download the latest coronavirus data from the following address (LCTT note: still accessible as of 2020-12-14, though access may be blocked on some networks):

https://raw.githubusercontent.com/datasets/covid-19/master/data/countries-aggregated.csv

Next, we load the downloaded data into a Pandas DataFrame. Pandas provides a function, read_csv(), that reads data directly from a URL and returns a DataFrame object, as shown below:

    import pycountry
    import plotly.express as px
    import pandas as pd

    URL_DATASET = r'https://raw.githubusercontent.com/datasets/covid-19/master/data/countries-aggregated.csv'
    df1 = pd.read_csv(URL_DATASET)
    print(df1.head(3))  # Get first 3 entries in the dataframe
    print(df1.tail(3))  # Get last 3 entries in the dataframe

[Figure: screenshot of the output in Jupyter]

From this output, you can see that the DataFrame (df1) includes the following columns of data:

Date

Country

Confirmed

Recovered

Deaths

You can also see that the Date column contains entries from January 22 to March 31. The data is updated every day, so your download will include values up to the current day.
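Because the dataset grows daily, it can help to check which date range a given download actually covers. The sketch below does this on a few made-up rows, read from an in-memory string so that it runs without network access; the column layout is assumed to match the dataset, and the numbers are illustrative rather than real counts.

```python
import io

import pandas as pd

# Made-up sample rows in the assumed column layout of the dataset
SAMPLE_CSV = """Date,Country,Confirmed,Recovered,Deaths
2020-01-22,Albania,0,0,0
2020-01-22,Algeria,0,0,0
2020-01-23,Albania,0,0,0
2020-01-23,Algeria,0,0,0
"""

# read_csv accepts a file-like object as well as a URL
df_sample = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Parse the strings into datetimes so min/max compare chronologically
dates = pd.to_datetime(df_sample["Date"])
first_day = dates.min().date()
last_day = dates.max().date()
n_days = dates.nunique()

print(f"{first_day} to {last_day}: {n_days} distinct days, {len(df_sample)} rows")
```

The same three lines (to_datetime, min, max) should work unchanged on df1 once it has been loaded from the URL.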

Step 2: clean up and modify DataFrame

We are going to add a column to this DataFrame that contains the ISO 3 codes. This can be done in the following three steps:

Create a list of all countries. In the Country column of df1, each country is repeated for every date, so the column holds multiple entries per country. I use unique().tolist() to reduce it to one entry per country.

I create a dictionary, d_country_code (initially empty), whose keys will be the country names and whose values will be the corresponding ISO 3 codes.

I use pycountry.countries.search_fuzzy(country) to generate the ISO 3 code for each country. Note that this function returns a list of Country objects, which I assign to country_data. The first element of this list (index 0) has an alpha_3 attribute, so country_data[0].alpha_3 gives the ISO 3 code. However, some names in this DataFrame may have no corresponding ISO 3 code (disputed territories, for example). For these "countries", I replace the ISO 3 code with a blank string. You can handle this with a try-except block, where the except clause contains: print('could not add ISO 3 code for ->', country). That way you get a message whenever no ISO 3 code is found. These "countries" appear in white in the final output.

After getting the ISO 3 code for each country (some are blank strings), I add the country name (as key) and its ISO 3 code (as value) to the dictionary d_country_code. You can do this with the update() method of Python's dictionary object.

After creating the dictionary of country names and ISO 3 codes, I add the codes to the DataFrame using a simple loop.
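The loop that fills the new column can also be written with Series.map, which looks up every country name in the dictionary in one pass. This is an alternative to the loop described above, not the article's method. In the sketch below the frame is a stand-in for df1 with made-up values, and the dictionary is written by hand for illustration rather than generated with pycountry.

```python
import pandas as pd

# Stand-in for df1: each country repeats once per date (made-up values)
df_demo = pd.DataFrame({
    "Date": ["2020-01-22", "2020-01-22", "2020-01-23", "2020-01-23"],
    "Country": ["France", "Japan", "France", "Japan"],
    "Confirmed": [1, 2, 3, 4],
})

# One entry per country, as in the first step
list_countries = df_demo["Country"].unique().tolist()

# Hand-written name -> ISO alpha-3 mapping (the article builds this with pycountry)
d_country_code = {"France": "FRA", "Japan": "JPN"}

# map() replaces each name with its dictionary value; names missing from
# the dictionary would become NaN
df_demo["iso_alpha"] = df_demo["Country"].map(d_country_code)

print(list_countries)
print(df_demo)
```

For a dataset of this size the difference in speed is small; the main benefit is that the column is filled in a single expression.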

Step 3: use Plotly to visualize the propagation path

A choropleth map is a map made up of colored polygons. It is often used to show how a variable varies across space. We use the px module in Plotly to create the choropleth map; the function is px.choropleth.

The parameters of this function are as follows:

plotly.express.choropleth(data_frame=None, lat=None, lon=None, locations=None, locationmode=None, geojson=None, featureidkey=None, color=None, hover_name=None, hover_data=None, custom_data=None, animation_frame=None, animation_group=None, category_orders={}, labels={}, color_discrete_sequence=None, color_discrete_map={}, color_continuous_scale=None, range_color=None, color_continuous_midpoint=None, projection=None, scope=None, center=None, title=None, template=None, width=None, height=None)

There are a few other things to note about the choropleth() function:

geojson is a geometry object (the sixth argument of the function above). This object is a bit confusing because it is not explicitly mentioned in the function documentation. The argument is optional. If you provide a geojson object, it is used to draw the geographic features. If you do not, the function uses a built-in geometry object by default. (In this example we use the built-in geometry, so we do not pass a value for geojson.)

The data_frame argument takes a Pandas DataFrame; here we pass df1, which we created earlier.

We use the Confirmed column (the number of confirmed cases) to determine the color of each country's polygon.

Finally, we pass the Date column as animation_frame. This splits the data by date, so each country's color changes as its Confirmed value changes.

The final complete code is as follows:

    import pycountry
    import plotly.express as px
    import pandas as pd

    # ----------- Step 1 ------------
    URL_DATASET = r'https://raw.githubusercontent.com/datasets/covid-19/master/data/countries-aggregated.csv'
    df1 = pd.read_csv(URL_DATASET)
    # print(df1.head())  # Uncomment to see what the dataframe is like

    # ----------- Step 2 ------------
    list_countries = df1['Country'].unique().tolist()
    # print(list_countries)  # Uncomment to see list of countries

    d_country_code = {}  # To hold the country names and their ISO codes
    for country in list_countries:
        try:
            country_data = pycountry.countries.search_fuzzy(country)
            # country_data is a list of objects of class pycountry.db.Country
            # The first item, i.e. at index 0 of the list, is the best fit
            # Objects of class Country have an alpha_3 attribute
            country_code = country_data[0].alpha_3
            d_country_code.update({country: country_code})
        except LookupError:
            # search_fuzzy raises LookupError when no country matches
            print('could not add ISO 3 code for ->', country)
            # If could not find country, make ISO code ' '
            d_country_code.update({country: ' '})

    # print(d_country_code)  # Uncomment to check dictionary

    # Create a new column iso_alpha in the df
    # and fill it with the appropriate ISO 3 code
    for k, v in d_country_code.items():
        df1.loc[(df1.Country == k), 'iso_alpha'] = v

    # print(df1.head())  # Uncomment to confirm that ISO codes were added

    # ----------- Step 3 ------------
    fig = px.choropleth(data_frame=df1,
                        locations="iso_alpha",
                        color="Confirmed",  # value in column 'Confirmed' determines color
                        hover_name="Country",
                        color_continuous_scale='RdYlGn',  # color scale red, yellow, green
                        animation_frame="Date")
    fig.show()

The output of this code is an animated choropleth map in which each country's color follows its Confirmed value by date.
