2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article walks through "Python Jupyter Notebook example analysis". The methods introduced here are simple, fast, and practical.
1. Jupyter Notebook Basic Introduction
Jupyter Notebook (formerly IPython Notebook) is an interactive notebook that supports running more than 40 programming languages.
Before you can start using Notebook, install it in one of two ways: (1) run pip install jupyter from the command line; (2) install Anaconda, which ships with Jupyter Notebook.
Running jupyter notebook from the command line starts the Jupyter service in the current directory and opens the page in the default browser. You can also copy the link and open it in another browser.
The notebook interface consists of the following parts: (1) notebook name;(2) main toolbar, which provides options such as save, export, reload notebook, and restart kernel; and (3) notebook main area, which contains notebook content editing area.
2. Use of Jupyter Notebook
The main area at the bottom of the Jupyter page consists of sections called cells. Each notebook consists of multiple cells, and each cell can have a different use. A code cell is marked with [ ] at its start, and in this type of cell you can enter any code and execute it. For example, enter 1 + 2 and press Shift + Enter: the code in the cell is evaluated, the result 3 is shown below it, and the cursor moves to a new cell.
If you want to create a new notebook, just click New and select the notebook type you want to start.
Notebook can modify previous cells and recalculate them so that the entire document can be updated. This feature is especially powerful if you don't want to rerun the entire script, but just want to test a program with different parameters. However, you can also recalculate the entire notebook by clicking Cell -> Run all.
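As a minimal sketch of that workflow, assume one cell holds a parameter and a later cell uses it; editing the first cell and rerunning both updates the result without rerunning anything else. The names rate and compound are hypothetical and chosen only for illustration.

```python
# Cell 1: a parameter you might change between runs (hypothetical example)
rate = 0.05


# Cell 2: a computation that depends on the parameter
def compound(principal, rate, years):
    """Return the value of principal after compounding once per year."""
    return principal * (1 + rate) ** years


result = compound(1000, rate, 2)
print(round(result, 2))
```

Changing rate in the first cell and pressing Shift + Enter on both cells recomputes result with the new value.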
Cells are not limited to code. After a title is added at the top of the notebook and the cells are rerun, the title appears above the code, and statements such as for loops execute like any other code.
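A for loop in a code cell might look like the sketch below. The variable names are illustrative, not taken from the original notebook.

```python
# A code cell containing a for loop: sum the integers 1 through 5
total = 0
for i in range(1, 6):
    total += i
    print(i, total)  # show the running total on each pass
```

In a notebook, the printed lines appear directly under the cell once it is run.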
3. Python in Jupyter
Python variables and data types can be tested directly in a notebook cell.
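A minimal sketch of such a cell, with hypothetical variable names, assigns a few values and prints the type of each:

```python
# Assign values of several built-in types (names are illustrative only)
count = 3
price = 9.5
name = "notebook"
tags = ["python", "jupyter"]
info = {"name": name, "count": count}

# Print each value's type name next to the value
for value in (count, price, name, tags, info):
    print(type(value).__name__, value)
```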
Python modules can be imported and tested in the same way.
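For instance, a cell could import two standard-library modules and use them. This is only a sketch; the original notebook may have tested different modules.

```python
import math
from datetime import date

# Use math.pi to compute the area of a circle of radius 2
radius = 2
area = math.pi * radius ** 2
print(round(area, 2))

# Build a date object and format it as an ISO 8601 string
d = date(2025, 1, 19)
print(d.isoformat())
```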
Data reading and writing is important because data analysis must first read the data and save it after processing.
4. Data interaction case
Load CSV data, process it, and save it to a MongoDB database
Two files hold the input: shopproducts.csv contains the product data and userratings.csv contains the user rating data.
The goal is to read both files with Python and save the specified fields to MongoDB. This requires the pymongo package, which can be installed in Anaconda with the command conda install pymongo.
The Python code is as follows:

import pymongo


class Product:
    def __init__(self, productId: int, name, imageUrl, categories, tags):
        self.productId = productId
        self.name = name
        self.imageUrl = imageUrl
        self.categories = categories
        self.tags = tags

    def __str__(self) -> str:
        # productId is an int, so convert every field to str before joining
        return '^'.join(str(v) for v in (self.productId, self.name, self.imageUrl, self.categories, self.tags))


class Rating:
    def __init__(self, userId: int, productId: int, score: float, timestamp: int):
        self.userId = userId
        self.productId = productId
        self.score = score
        self.timestamp = timestamp

    def __str__(self) -> str:
        # All fields are numeric, so convert each one to str before joining
        return '^'.join(str(v) for v in (self.userId, self.productId, self.score, self.timestamp))


if __name__ == '__main__':
    myclient = pymongo.MongoClient("mongodb://127.0.0.1:27017/")
    mydb = myclient["goods-users"]

    # Product rows are '^'-separated; fields 0, 1, 4, 5 and 6 are kept
    shopproducts = mydb['shopproducts']
    with open('shopproducts.csv', 'r', encoding='UTF-8') as f:
        for item in f:
            attr = item.split('^')
            product = Product(int(attr[0]), attr[1].strip(), attr[4].strip(), attr[5].strip(), attr[6].strip())
            shopproducts.insert_one(product.__dict__)
            # print(product)

    # Rating rows are comma-separated: userId, productId, score, timestamp
    userratings = mydb['userratings']
    with open('userratings.csv', 'r', encoding='UTF-8') as f:
        for item in f:
            attr = item.split(',')
            rating = Rating(int(attr[0]), int(attr[1].strip()), float(attr[2].strip()), int(attr[3].strip()))
            userratings.insert_one(rating.__dict__)
            # print(rating)
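The parsing step can be checked without a running MongoDB instance. The sketch below assumes a '^'-separated product row with at least seven fields, as the indices in the code above imply. The helper name parse_product_line and the sample row are hypothetical, since the article does not show the file contents.

```python
def parse_product_line(line):
    """Split one '^'-separated product row and keep fields 0, 1, 4, 5 and 6."""
    attr = line.split('^')
    return {
        'productId': int(attr[0]),
        'name': attr[1].strip(),
        'imageUrl': attr[4].strip(),
        'categories': attr[5].strip(),
        'tags': attr[6].strip(),
    }


# Hypothetical sample row; the real file contents are not shown in the article
sample = "1001^Example product^x^y^http://example.com/a.jpg^books|fiction^gift|paperback\n"
doc = parse_product_line(sample)
print(doc)
```

A dictionary of this shape is what insert_one receives when product.__dict__ is passed to it.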
After starting the MongoDB service, run the Python code. When it finishes, open the database in Robo 3T to confirm that the shopproducts and userratings collections were written.
A second case uses store data whose fields include name, number of comments, price, address, and rating list. The number of comments, price, and ratings are irregularly formatted and need to be cleaned.
The cleaning is done in Jupyter.
After cleaning, the data is in a regular, consistent form.
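The split-based cleaning idea can be tried on a few strings first. The sketch below is a variation under stated assumptions, not the article's exact functions: the function names are hypothetical, the raw strings are invented samples, and missing values are returned as None instead of a marker string.

```python
def clean_price(s):
    """Take the text after the last '¥' as the per-capita price; None if absent."""
    if '¥' in s:
        return float(s.split('¥')[-1])
    return None


def clean_scores(s):
    """Split on spaces, drop a two-character label from each item, convert to float."""
    parts = s.strip().split(' ')
    if len(parts) < 3:
        return None
    return [float(p[2:]) for p in parts[:3]]


# Hypothetical raw values; the real data format is not shown in the article
print(clean_price('avg ¥85'))
print(clean_scores('Q:8.1 E:7.9 S:8.0'))
print(clean_price('-'))
```

Returning None keeps the missing-value check to a simple `is None` test, at the cost of differing from the marker-string convention in the article's code.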
The complete Python code is as follows:
# Read the data and preview the first rows
f = open('store data.csv', 'r', encoding='utf8')
for i in f.readlines()[1:15]:
    print(i.split(','))


# Cleaning functions for the comment count, price, and comment list
def fcomment(s):
    '''Comment-count cleaner: split on spaces, take the first item, and convert it to an integer.'''
    if 'article' in s:
        return int(s.split(' ')[0])
    else:
        return 'missing data'


def fprice(s):
    '''Price cleaner: split on ¥, take the last item as the per-capita price, and convert it to float.'''
    if '¥' in s:
        return float(s.split('¥')[-1])
    else:
        return 'missing data'


def fcommentl(s):
    '''Comment-list cleaner: split on spaces and convert the quality, environment and service scores to float.'''
    if ' ' in s:
        quality = float(s.split(' ')[0][2:])
        environment = float(s.split(' ')[1][2:])
        service = float(s.split(' ')[2][2:-1])
        return [quality, environment, service]
    else:
        return 'missing data'


# Clean the data
datalist = []  # list of cleaned records
f.seek(0)
n = 0  # number of records loaded
for i in f.readlines():
    data = i.split(',')
    classify = data[0]  # category
    name = data[1]  # store name
    comment_count = fcomment(data[2])  # number of comments
    star = data[3]  # star rating
    price = fprice(data[4])  # per-capita price
    address = data[5]  # address
    scores = fcommentl(data[6])  # [quality, environment, service] or 'missing data'
    if 'missing data' not in [comment_count, price, scores]:  # skip rows with missing data
        quality, env, service = scores
        n += 1
        data_re = [['classify', classify],
                   ['name', name],
                   ['comment_count', comment_count],
                   ['star', star],
                   ['price', price],
                   ['address', address],
                   ['quality', quality],
                   ['environment', env],
                   ['service', service]]
        datalist.append(dict(data_re))  # build a dictionary and store it in datalist
        print('successfully loaded %i pieces of data' % n)
    else:
        continue
print(datalist)
print('Total load %i data' % n)
f.close()

By this point you should have a deeper understanding of "Python Jupyter Notebook example analysis". The best way to consolidate it is to run the examples yourself.