2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article introduces elegant ways of handling JSON files in Python. The content is fairly detailed, and we hope interested readers find it helpful.
1. Introduction
We will learn how to read, parse, and write JSON files using Python.
We will discuss how best to handle simple JSON files as well as nested ones, and how to access specific values in the JSON data.
2. What is a JSON file?
JSON (JavaScript Object Notation) is a popular file format used mainly to store and transfer data in web applications. Anyone who works with data regularly has almost certainly run into JSON files, so it is worth learning how to read and write them.
The following figure shows an example of a common JSON file structure.
The JSON structure looks very similar to a Python dictionary. Note that a JSON object consists of key: value pairs, where the key is a string and the value is a string, number, Boolean, array, object, or null.
To illustrate this more intuitively, the following figure highlights every key in blue and every value in orange. Note that the key/value pairs are separated by commas.
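As a rough illustration of the key: value structure described above, the sketch below parses a small JSON text and prints the Python type each value becomes. The sample data is made up for this example and is not the article's file.

```python
import json

# Hypothetical JSON text, invented for illustration only.
sample_text = """
{
  "name": "Ada",
  "age": 36,
  "active": true,
  "tags": ["a", "b"],
  "address": {"city": "London"},
  "nickname": null
}
"""

parsed = json.loads(sample_text)

# Every key is a string; each JSON value type maps to a Python type.
for key, value in parsed.items():
    print(f"{key!r}: {type(value).__name__}")
```

A JSON object becomes a dict, an array becomes a list, true/false become True/False, and null becomes None.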
3. Using Python to process JSON files
Python's standard library includes the json module for reading JSON files. Here are a few examples of how to parse a JSON file into a Python object.
3.1. Read JSON files as dictionary types
First we import the json library, then we open the JSON file with the open function, and finally we use the json.load() function to convert the file's contents into a Python dictionary.
As simple as that, the code is as follows:
    import json

    with open('superheroes.json') as f:
        superHeroSquad = json.load(f)

    print(type(superHeroSquad))
    # Output: <class 'dict'>
    print(superHeroSquad.keys())
    # Output: dict_keys(['squadName', 'homeTown', 'formed', 'secretBase', 'active', 'members'])
The above code is simple and intuitive. The only thing to note is that the json library has two similarly named functions, load() and loads().
The function load() reads JSON from an open file object and returns Python objects. The function loads() parses JSON from a string and returns Python objects.
We can read the s in loads() as "load string".
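The difference between the two functions can be sketched as follows. This is a minimal example under stated assumptions: the JSON text is a shortened, made-up sample, and io.StringIO stands in for a real open file so the snippet runs without superheroes.json on disk.

```python
import io
import json

# Shortened sample text, assumed for illustration.
json_text = '{"squadName": "Super hero squad", "active": true}'

# json.loads() parses a string that is already in memory.
from_string = json.loads(json_text)

# json.load() reads from a file-like object; io.StringIO plays the
# role of the object returned by open() here.
from_file = json.load(io.StringIO(json_text))

print(from_string == from_file)
```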
3.2. Read the JSON file as a Pandas type
Of course, we can also use the read_json function in the Pandas library to read the corresponding JSON file.
The code is as follows:
    import pandas as pd

    df = pd.read_json('superheroes.json')
The running results are as follows:
Note that Pandas can read not only JSON files on the local disk but also files stored on the network, via a URL.
The code is as follows:
    df1 = pd.read_json('https://mdn.github.io/learning-area/javascript/oojs/json/superheroes.json')

3.3. Use Pandas to read nested JSON types
The JSON files we encounter are sometimes nested, which often makes them harder to read. In fact, nested JSON is similar to nested dictionaries in Python, that is, dictionaries inside dictionaries.
Look at the members field in the example above: its value is a list of dictionaries. In the following figure we use indentation to show the nested structure.
When we load the JSON file into a Pandas DataFrame, the members column looks like this: each row contains a dictionary.
Next we discuss two approaches to parsing the data so that each key becomes a separate column.
Option 1
We can use the apply method on the members column with the following code:
    df['members'].apply(pd.Series)
After the above code is executed, the members column is split into four new columns, as follows:
Of course, to merge the split result with the remaining original columns, we can use the pd.concat function.
The code is as follows:
    df = pd.concat([df['members'].apply(pd.Series), df.drop('members', axis=1)], axis=1)
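To make the apply-and-concat approach easier to try without the file, here is a small self-contained sketch. The DataFrame below is a hand-built stand-in for what pd.read_json would produce, with only two members and two fields each; those values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame loaded from superheroes.json.
df = pd.DataFrame({
    "squadName": ["Super hero squad", "Super hero squad"],
    "members": [
        {"name": "Molecule Man", "age": 29},
        {"name": "Madame Uppercut", "age": 39},
    ],
})

# Expand each dictionary in the members column into its own columns.
expanded = df["members"].apply(pd.Series)

# Place the expanded columns next to the remaining original columns.
combined = pd.concat([expanded, df.drop("members", axis=1)], axis=1)
print(combined.columns.tolist())
```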
Option 2
The Pandas library also has a function json_normalize() that lets us flatten nested JSON. This is the easiest way to parse nested JSON.
The code is as follows:
    import json
    import pandas as pd

    def test2():
        with open('superheroes.json') as f:
            superHeroSquad = json.load(f)
        out = pd.json_normalize(
            superHeroSquad,
            record_path=['members'],
            meta=['squadName', 'homeTown', 'formed', 'secretBase', 'active'],
        )
        print(out)
In the above code:
record_path is the path to the list of records we want to expand, here the members list.
meta is the list of top-level fields to attach to each record as extra columns, in the order given.
The running results are as follows:
Finally, note that we can pass the meta_prefix parameter to json_normalize, which adds a uniform prefix to the column names listed in meta.
The code is as follows:
    pd.json_normalize(
        superHeroSquad,
        record_path=['members'],
        meta=['squadName', 'homeTown', 'formed', 'secretBase', 'active'],
        meta_prefix='members_',
    )
The running results are as follows:
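Since the result figures are not reproduced here, the following sketch shows what record_path, meta, and meta_prefix do on a trimmed-down dictionary. The data is a shortened, hand-written sample, and the prefix 'squad_' is a hypothetical choice for this example rather than the one used above.

```python
import pandas as pd

# Trimmed-down sample in the same shape as the squad data (assumed values).
squad = {
    "squadName": "Super hero squad",
    "homeTown": "Metro City",
    "members": [
        {"name": "Molecule Man", "age": 29},
        {"name": "Madame Uppercut", "age": 39},
    ],
}

flat = pd.json_normalize(
    squad,
    record_path=["members"],          # one output row per member
    meta=["squadName", "homeTown"],   # top-level fields repeated on each row
    meta_prefix="squad_",             # prefix applied to the meta columns only
)
print(flat.columns.tolist())
```

The member fields keep their own names, while the prefix is applied only to the columns that come from meta.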
3.4. Access to data at a specific location
In Python, we can access data anywhere in the loaded JSON through key names or list indexes.
For example, suppose we want to know the secret identity of the second superhero. The value we need to access is highlighted in purple in the following illustration.
To get this value, we can use the following statement directly:
    superHeroSquad['members'][1]['secretIdentity']
Working from the top of the hierarchy downward, the first key we need is 'members', because it is the parent node of the value we want.
The value for 'members' is a list (note the square brackets), so index 1 selects its second element. Within that element we then look up the field 'secretIdentity', as follows:
Combining the steps above gives us the value 'Jane Wilson' at that location.
Careful readers may have noticed that two values are highlighted in blue in the JSON fragment above. Interested readers can try to access these values as an exercise. You are welcome to share your code in the comments section at the end of the article.
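The lookup can also be written one step at a time, which may make the path through the hierarchy easier to follow. The JSON text below is an abbreviated, hand-written copy containing only the fields needed here, and the dict.get() call at the end is an extra illustration that does not appear in the original walkthrough.

```python
import json

# Abbreviated copy of the squad data; only the fields used below.
squad_text = """
{
  "squadName": "Super hero squad",
  "members": [
    {"name": "Molecule Man", "secretIdentity": "Dan Jukes"},
    {"name": "Madame Uppercut", "secretIdentity": "Jane Wilson"},
    {"name": "Eternal Flame", "secretIdentity": "Unknown"}
  ]
}
"""
squad = json.loads(squad_text)

members = squad["members"]                          # a list of dictionaries
second_member = members[1]                          # index 1 is the second element
second_identity = second_member["secretIdentity"]   # value stored under that key
print(second_identity)

# dict.get() returns a default instead of raising KeyError when a key is absent.
missing_powers = second_member.get("powers", [])
print(missing_powers)
```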
3.5. Export JSON
Let's edit our last superhero, changing his secretIdentity from 'Unknown' to 'Will Smith', and then export the dictionary to a JSON file. Here we use the json.dump() function to write the dictionary to the file.
The code is as follows:
    # update the secret identity of Eternal Flame
    superHeroSquad['members'][2]['secretIdentity'] = 'Will Smith'

    with open('superheroes.json', 'w') as file:
        json.dump(superHeroSquad, file)
After the above code runs, we open the file superheroes.json and find that the secretIdentity of the last superhero has changed from Unknown to Will Smith.
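One way to check the edit without overwriting the original file is to write to a temporary location and read the result back. This is a sketch under assumptions: the dictionary is a one-member stand-in for the real data, and the file name superheroes_copy.json is hypothetical.

```python
import json
import os
import tempfile

# One-member stand-in for the loaded squad dictionary (assumed data).
squad = {"members": [{"name": "Eternal Flame", "secretIdentity": "Unknown"}]}
squad["members"][0]["secretIdentity"] = "Will Smith"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, so the original superheroes.json is left untouched.
    path = os.path.join(tmp_dir, "superheroes_copy.json")

    with open(path, "w", encoding="utf-8") as file:
        json.dump(squad, file)

    with open(path, encoding="utf-8") as file:
        reloaded = json.load(file)

print(reloaded["members"][0]["secretIdentity"])
```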
As an alternative, we can also use the to_json() function in Pandas to write the file.
    df.to_json('superheroes.json')

3.6. Formatted output
When we print JSON data directly to the terminal, we usually get output that is hard to read, as shown in the following example:
To make it easier to read, we can use the indent parameter of json.dump to control the output format. The code is as follows:
    with open('superheroes.json', 'w') as file:
        json.dump(superHeroSquad, file, indent=4)
The output is as follows. It looks much better, doesn't it?
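The effect of indent can be seen directly in the terminal with json.dumps(), which returns the JSON as a string instead of writing it to a file. The dictionary here is a shortened sample assumed for illustration.

```python
import json

# Shortened sample data (assumed values).
squad = {
    "squadName": "Super hero squad",
    "members": [{"name": "Molecule Man", "age": 29}],
}

compact = json.dumps(squad)           # everything on a single line
pretty = json.dumps(squad, indent=4)  # nested levels indented by 4 spaces

print(compact)
print(pretty)
```

Both strings parse back to the same dictionary; indent changes only the whitespace.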
3.7. Sort the output fields
The dump function also takes the parameter sort_keys; setting it controls whether keys are sorted on output. Note that all keys, including nested ones, are sorted.
Examples are as follows:
    with open('superheroes.json', 'w') as file:
        json.dump(superHeroSquad, file, indent=4, sort_keys=True)
The running results are as follows:
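A quick way to see the sorting, including at nested levels, is to serialize a small dictionary whose keys are deliberately out of alphabetical order. The data is a shortened sample assumed for this sketch.

```python
import json

# Keys are intentionally not in alphabetical order (assumed sample data).
squad = {
    "squadName": "Super hero squad",
    "active": True,
    "members": [{"name": "Molecule Man", "age": 29}],
}

sorted_text = json.dumps(squad, sort_keys=True)
print(sorted_text)
```

In the printed string, "active" comes before "members" and "squadName", and inside each member "age" comes before "name".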
4. Summary
Finally, let's review this article and summarize it as follows:
JSON files consist of key: value pairs, where the key is a string and the value is a string, number, Boolean, array, object, or null.
Python's built-in json module makes it easy to read JSON files and convert them to Python dictionaries, and Pandas can load them into types it can handle.
Use pd.read_json() to read simple JSON and pd.json_normalize() to read nested JSON.
We can easily get the value at a specific location in the JSON data through key names or list indexes.
Python objects can be written out as JSON files, and the output can be formatted to improve readability.
That concludes our look at handling JSON files gracefully in Python. I hope the content above is of some help. If you think the article is good, feel free to share it so more people can see it.