What are the most helpful project settings in Jupyter Notebook?

Updated 2025-01-18. Source: SLTechnology News&Howtos (Shulou.com)

This article introduces the most helpful project settings in Jupyter Notebook. Many people are unsure which settings are worth adding at the start of a project, so the editor has collected seven simple, easy-to-use ones below.

1. Check the Python version

Check the Python interpreter version in Jupyter Notebook:

    >>> import sys
    >>> sys.version
    '3.7.6 (default, Jan 8 2020, 13:42:34) \n[Clang 4.0.1 (tags/RELEASE_401/final)]'

To ensure that the project runs on at least the minimum required version of the Python interpreter, add the following code to the project settings:

    # Python ≥ 3.7 is required
    import sys
    assert sys.version_info >= (3, 7)

Python must be version 3.7 or above; otherwise an AssertionError is raised.
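One caveat with the assert-based check is that assert statements are skipped when Python runs with the -O flag. The following is a minimal sketch of an alternative that raises an explicit exception with a readable message. The helper name require_python is an assumption made for illustration and is not part of the original article.

```python
import sys


def require_python(minimum=(3, 7)):
    """Raise RuntimeError if the running interpreter is older than `minimum`.

    `require_python` is a hypothetical helper name used only in this sketch.
    """
    if sys.version_info < minimum:
        found = ".".join(str(part) for part in sys.version_info[:3])
        needed = ".".join(str(part) for part in minimum)
        raise RuntimeError(f"Python >= {needed} is required, found {found}")


# Passes silently on Python 3.7 or newer, raises RuntimeError otherwise.
require_python((3, 7))
```

Calling require_python() in the first cell of the notebook stops execution early with a clear message instead of failing later on a missing language feature.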

2. Check the package version

Check the version of an installed package, such as TensorFlow:

    >>> import tensorflow as tf
    >>> tf.__version__
    '2.0.0'

Make sure the project runs on TensorFlow 2.0 or above; otherwise an AssertionError is raised.

    # TensorFlow ≥ 2.0 is required
    import tensorflow as tf
    assert tf.__version__ >= "2.0"
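Comparing version strings directly is lexicographic, so a hypothetical version "10.1.0" would compare as smaller than "2.0". The following is a minimal sketch that compares numeric tuples instead. The helper name version_tuple is an assumption for illustration, and the TensorFlow line is left as a comment so that the block runs without TensorFlow installed.

```python
import re


def version_tuple(version, parts=2):
    """Return the leading numeric components of a version string as ints.

    `version_tuple` is a hypothetical helper name used only in this sketch.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    numbers = [int(part) for part in match.group().split(".")]
    numbers += [0] * (parts - len(numbers))  # pad "2" to (2, 0)
    return tuple(numbers[:parts])


print(version_tuple("2.0.0"))             # (2, 0)
print("10.1.0" >= "2.0")                  # False: string comparison
print(version_tuple("10.1.0") >= (2, 0))  # True: numeric comparison

# With TensorFlow installed, the check from this section would become:
# import tensorflow as tf
# assert version_tuple(tf.__version__) >= (2, 0)
```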

3. Avoid blurry plots

Plots rendered in Jupyter Notebook look slightly blurry by default. Take, for example, a simple heat map that shows missing values.

(https://towardsdatascience.com/using-pandas-pipe-function-to-improve-code-readability-96d66abfaf8)

    import seaborn as sns
    import matplotlib.pyplot as plt
    %matplotlib inline
    # Default figure format: png
    # df is a pandas DataFrame that has been loaded beforehand
    sns.heatmap(df.isnull(), yticklabels=False, cbar=False, cmap='viridis')

[Figure: the default image looks blurry]

As you can see from the image above, the text is blurred, the missing values in the Cabin column are too crowded, and the missing values in the Embarked column cannot be recognized.

To resolve this problem, set %config InlineBackend.figure_format = 'retina' or %config InlineBackend.figure_format = 'svg' after %matplotlib inline, that is:

    %matplotlib inline
    %config InlineBackend.figure_format = 'retina'  # or 'svg'
    sns.heatmap(df.isnull(), yticklabels=False, cbar=False, cmap='viridis')

[Figure: the figure format is set to retina or svg]

Compared with the previous image, this one is sharper, and the missing values in the Embarked column can now be identified.

4. Keep the output stable across runs

Random numbers are used in many places in data science projects. For example:

train_test_split() from scikit-learn

np.random.rand() for initializing weights

If the random seed is not reset, a different number appears for each call:

    >>> import numpy as np
    >>> np.random.rand(4)
    array([0.83209492, 0.10917076, 0.15798519, 0.99356723])
    >>> np.random.rand(4)
    array([0.46183001, 0.7523687, 0.96599624, 0.32349079])

np.random.seed(0) makes random numbers predictable:

    >>> np.random.seed(0)
    >>> np.random.rand(4)
    array([0.5488135, 0.71518937, 0.60276338, 0.54488318])
    >>> np.random.seed(0)
    >>> np.random.rand(4)
    array([0.5488135, 0.71518937, 0.60276338, 0.54488318])

If the random seed is reset each time, the same sequence of numbers is generated each time. Setting the seed at the start of the project therefore keeps the output stable across runs.
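A project often uses more than one source of randomness, so it can help to seed them in one place. The following is a minimal sketch under that assumption. The helper name seed_everything is hypothetical, and NumPy is seeded only if it is installed, so the block also runs without it.

```python
import random


def seed_everything(seed=0):
    """Seed Python's built-in generator and, if available, NumPy's global one.

    `seed_everything` is a hypothetical helper name used only in this sketch.
    """
    random.seed(seed)
    try:
        import numpy as np
    except ImportError:
        return  # NumPy is not installed, nothing more to seed
    np.random.seed(seed)


seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)  # True: the same seed gives the same number
```

scikit-learn functions such as train_test_split() additionally accept a random_state argument, which fixes the split independently of the global seed.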

5. Multiple outputs per cell

By default, Jupyter Notebook displays only the result of the last expression in a cell. To display the result of every expression, reconfigure the interactive shell through IPython:

    from IPython.core.interactiveshell import InteractiveShell
    InteractiveShell.ast_node_interactivity = "all"

6. Save figures to a file

Matplotlib can save a figure through the savefig() method, but it raises an error if the given path does not exist.

    >>> plt.savefig('./figures/my_plot.png')
    FileNotFoundError: [Errno 2] No such file or directory: './figures/my_plot.png'

The best practice is to keep all figures in one place, such as a figures folder in the workspace. You can create the folder manually through the operating system's GUI or by running a command in Jupyter Notebook, but it is better to write a small function that does it.

This approach is especially useful when you need custom plot settings or additional subfolders to group figures. The following function saves a figure to a file:

    import os
    %matplotlib inline
    import matplotlib.pyplot as plt

    # Where to save the figures
    PROJECT_ROOT_DIR = "."
    SUB_FOLDER = "sub_folder"  # a sub-folder
    IMAGES_PATH = os.path.join(PROJECT_ROOT_DIR, "images", SUB_FOLDER)

    def save_fig(name, images_path=IMAGES_PATH, tight_layout=True,
                 extension="png", resolution=300):
        # Create the target directory on first use
        if not os.path.isdir(images_path):
            os.makedirs(images_path)
        path = os.path.join(images_path, name + "." + extension)
        print("Saving figure:", name)
        if tight_layout:
            plt.tight_layout()
        plt.savefig(path, format=extension, dpi=resolution)

Now call save_fig('figure_name'). An images/sub_folder directory is created in the workspace, and the figure is saved there under the name "figure_name.png". The function also exposes the three most commonly used settings:

tight_layout automatically adjusts the subplot padding.

extension lets you save figures in different formats.

resolution sets the figure resolution (dpi).
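The directory handling in save_fig can also be written with pathlib. The following is a minimal sketch of that part only, without any plotting. The helper name figure_path and its parameters are assumptions made for illustration, and the demonstration runs in a temporary directory so that nothing is written to the project.

```python
import tempfile
from pathlib import Path


def figure_path(name, root=".", sub_folder="sub_folder", extension="png"):
    """Return <root>/images/<sub_folder>/<name>.<extension>, creating the folder.

    `figure_path` is a hypothetical helper name used only in this sketch.
    """
    folder = Path(root) / "images" / sub_folder
    # parents=True creates intermediate folders, exist_ok=True tolerates reruns
    folder.mkdir(parents=True, exist_ok=True)
    return folder / f"{name}.{extension}"


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = figure_path("figure_name", root=tmp)
    print(path.name)             # figure_name.png
    print(path.parent.is_dir())  # True
```

The returned path can be passed straight to plt.savefig(), which accepts path objects as well as strings.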

7. Download the data (and extract it)

Data scientists often work with data hosted on the web. You can download the data in a browser and run commands to extract the file, but it is better to write a small function that performs both steps. This is especially important when the data changes on a regular basis.

Write a small script that you run whenever you need the latest data (you can also set up a scheduled job that runs it automatically at regular intervals). Automating the download is also useful if you need to install the data set on multiple machines.

The following is the function to download and extract the data:

    import os
    import tarfile
    import zipfile
    import urllib.request

    # Where to save the data
    PROJECT_ROOT_DIR = "."
    SUB_FOLDER = "group_name"
    LOCAL_PATH = os.path.join(PROJECT_ROOT_DIR, "datasets", SUB_FOLDER)

    def download(file_url, local_path=LOCAL_PATH):
        if not os.path.isdir(local_path):
            os.makedirs(local_path)

        # Download file
        print(">>> downloading")
        filename = os.path.basename(file_url)
        file_local_path = os.path.join(local_path, filename)
        urllib.request.urlretrieve(file_url, file_local_path)

        # untar/unzip file
        if filename.endswith("tgz") or filename.endswith("tar.gz"):
            print(">>> unpacking file:", filename)
            tar = tarfile.open(file_local_path, "r:gz")
            tar.extractall(path=local_path)
            tar.close()
        elif filename.endswith("tar"):
            print(">>> unpacking file:", filename)
            tar = tarfile.open(file_local_path, "r:")
            tar.extractall(path=local_path)
            tar.close()
        elif filename.endswith("zip"):
            print(">>> unpacking file:", filename)
            zip_file = zipfile.ZipFile(file_local_path)
            zip_file.extractall(path=local_path)
            zip_file.close()
        print("Done")

Now call download("http://a_valid_url/housing.tgz"). It creates a datasets/group_name directory in the workspace, downloads housing.tgz, and extracts housing.csv into that directory. This small function can also be used for CSV and text files, which are downloaded and left as they are.
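The extraction step of such a function can also rely on the standard library's shutil.unpack_archive(), which picks the archive format from the file extension. The following is a minimal sketch of that variant. The helper name unpack_if_archive is an assumption for illustration, and the demonstration builds a small local archive instead of downloading one, so it runs offline.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def unpack_if_archive(file_path, dest_dir):
    """Unpack file_path into dest_dir if its extension is a known archive type.

    Returns True if the file was unpacked and False otherwise.
    `unpack_if_archive` is a hypothetical helper name used only in this sketch.
    """
    known_extensions = [
        ext
        for _name, extensions, _description in shutil.get_unpack_formats()
        for ext in extensions
    ]
    if not any(str(file_path).endswith(ext) for ext in known_extensions):
        return False  # e.g. a plain CSV or text file: leave it as it is
    shutil.unpack_archive(str(file_path), str(dest_dir))
    return True


# Demonstration with a locally built archive in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    csv_file = tmp / "housing.csv"
    csv_file.write_text("rooms,price\n3,100\n")
    archive = tmp / "housing.tgz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(csv_file, arcname="housing.csv")

    out_dir = tmp / "extracted"
    out_dir.mkdir()
    print(unpack_if_archive(archive, out_dir))   # True
    print((out_dir / "housing.csv").exists())    # True
    print(unpack_if_archive(csv_file, out_dir))  # False
```

Delegating the format detection to shutil replaces the chain of endswith() checks with a single call, at the cost of unpacking any format that shutil has registered.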

That concludes this overview of the most helpful project settings in Jupyter Notebook. Theory sticks best when paired with practice, so try these settings in your own project.
