2025-02-25 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
I have recently been learning Spark machine learning. Because Python performs well for machine learning work, I chose it as my development language for Spark, which also lays the groundwork for studying deep learning later. This article builds an Eclipse 4.4.2 + Python 2.7.14 + Spark 2.1.0 development environment on Windows 8.1. The process is as follows:
1. Install Python under Windows
1.1. Download Python
Download the Python installer for your operating system from the following address:
https://www.python.org/downloads/release/python-2714/
My system is 64-bit Windows 8.1, so I downloaded the "Windows x86-64 MSI installer".
1.2. Install Python
1) Double-click the downloaded installer.
2) In the dialog that appears, choose to install for the current user, then click Next.
3) Select the installation path. I chose D:\Python27\. Then click Next.
4) Click Next again and wait for the installation to complete.
5) When the completion screen appears, the installation is done. Click Finish.
1.3. Environment variable configuration
1.3.1. The first way
Add the Python directory to the Path environment variable.
At the command prompt (cmd), enter:
path=%path%;D:\Python27
Then press Enter.
Note: D:\Python27 is the directory where Python is installed. A value set this way applies only to the current cmd window.
1.3.2. The second way
You can also set the variable through the system settings:
1) Right-click Computer, then click Properties.
2) Click "Advanced system settings" and open the environment variables dialog.
3) In the "System variables" list, select "Path" and double-click it.
4) Append the Python installation path (D:\Python27 in my case) to the end of the "Path" value. Note: entries are separated by semicolons.
Finally, restart the computer after saving the setting. After the restart, enter the command "python" at the cmd command line. If the interpreter prints its version information and shows the >>> prompt, Python has been installed successfully.
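Whether either method took effect can also be checked from Python itself. The following is a minimal sketch, assuming the install directory D:\Python27 used above; the helper name `is_on_path` is hypothetical and not part of any library. It tests whether a directory appears among the semicolon-separated entries of a Path value.

```python
import os


def is_on_path(directory, path_value, sep=";"):
    """Return True if `directory` is one of the entries in a Path string.

    `sep` defaults to ";" because Windows separates Path entries with
    semicolons. The comparison ignores case, surrounding quotes, and
    trailing slashes.
    """
    def normalize(entry):
        return entry.strip().strip('"').rstrip("\\/").lower()

    wanted = normalize(directory)
    return any(normalize(entry) == wanted
               for entry in path_value.split(sep) if entry.strip())


if __name__ == "__main__":
    # D:\Python27 is the install directory chosen in this article.
    print(is_on_path(r"D:\Python27", os.environ.get("PATH", "")))
```

Run in a new cmd window, the script prints True when the Python directory is visible to that window.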
1.3.3. Python environment variables
Here are a few important environment variables that apply to Python:
Variable name: Description
PYTHONPATH: The Python module search path. Directories listed in PYTHONPATH are added to the locations searched when a module is imported.
PYTHONSTARTUP: When Python starts in interactive mode, it looks for this variable and executes the code in the file that the variable names.
PYTHONCASEOK: If this variable is set, Python ignores case when importing modules.
PYTHONHOME: An alternative module search path. It is usually embedded in the PYTHONSTARTUP or PYTHONPATH directories to make it easier to switch between module libraries.
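To see which of these variables are set on a given machine, a short script can read them from the process environment. This is only an illustrative sketch; the function name `python_env_report` and the constant `PYTHON_ENV_VARS` are made up for this example.

```python
import os
import sys

# The four variables described in the list above.
PYTHON_ENV_VARS = ("PYTHONPATH", "PYTHONSTARTUP", "PYTHONCASEOK", "PYTHONHOME")


def python_env_report(environ=None):
    """Map each variable name to its value, or None if it is not set."""
    if environ is None:
        environ = os.environ
    return dict((name, environ.get(name)) for name in PYTHON_ENV_VARS)


if __name__ == "__main__":
    for name, value in sorted(python_env_report().items()):
        print("%s = %s" % (name, value if value is not None else "<not set>"))
    # Directories from PYTHONPATH appear in sys.path, which import searches.
    for entry in sys.path:
        print(entry)
```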
2. Install Eclipse under Windows
This step is simple, so it is omitted here. My Eclipse version is 4.4.2.
Note: the JDK must be installed before installing Eclipse.
3. Install and configure the PyDev plug-in in Eclipse
3.1. Install the PyDev plug-in
1) Start Eclipse and click Help -> Install New Software... In the dialog that appears, click the Add button. Enter pydev in Name and https://dl.bintray.com/fabioz/pydev/5.2.0 in Location, then follow the installation steps. My Eclipse is 4.4.2, so I install the matching plug-in version 5.2.0; with the latest Eclipse, use http://pydev.org/updates directly. If an error occurs while loading, retry the installation.
2) In the next step, select only the items under the PyDev node, then click Next.
3) In this step, click Next directly.
4) In this step, select "I accept" to accept the license, then click Next. Wait for the plug-in installation to complete and restart Eclipse.
3.2. Configure the PyDev plug-in
After installing PyDev, you need to configure the Python interpreter.
1) In the Eclipse menu bar, click Window -> Preferences.
2) In the dialog box, click PyDev -> Interpreters -> Python Interpreter. Click the New button, select the path to python.exe, and click OK to bring up the next window.
3) A new window with many check boxes appears. Click OK to move to the next window.
4) Click OK in that window to complete the plug-in configuration.
4. Write code to test the Python environment
1) Start Eclipse and create a new project: click File -> New -> Project..., select PyDev -> PyDev Project, and enter the project name.
2) Create a new PyDev Package and enter the package name Test1.
3) Write the code in the __init__.py file and run it. If the output appears normally on the console, the development environment has been built.
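The article does not reproduce the test code, so the following is a hypothetical stand-in rather than the author's script. It prints a greeting plus the version and location of the interpreter that PyDev launched, which also confirms that the interpreter configured in section 3.2 is the one being used.

```python
import sys


def environment_summary():
    """Describe the interpreter that is running this script."""
    version = "%d.%d.%d" % tuple(sys.version_info[:3])
    return "Python %s at %s" % (version, sys.executable)


if __name__ == "__main__":
    print("Hello from PyDev")
    print(environment_summary())
```

With the setup described here, the reported location would be expected to be under D:\Python27.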
5. Configure the environment for developing Spark with Python
5.1. Download and extract the Spark installation package
You can download the version you need from http://spark.apache.org/downloads.html. I use spark-2.1.0-bin-hadoop2.7.tgz. After downloading the archive, extract it. I extracted it to F:\BigData\Spark\spark-2.1.0-bin-hadoop2.7
5.2. Configure Spark environment variables
1) Create a new SPARK_HOME variable with the value F:\BigData\Spark\spark-2.1.0-bin-hadoop2.7, add %SPARK_HOME%\bin to the system Path variable, and then restart the computer.
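After the restart, the two settings can be verified with a few lines of Python. This is a sketch under the assumption that Path entries are separated by semicolons, as on Windows; `check_spark_env` is a hypothetical helper name.

```python
import os


def check_spark_env(environ):
    """Return a list of problems found; an empty list means the setup looks OK."""
    spark_home = environ.get("SPARK_HOME", "").strip()
    if not spark_home:
        return ["SPARK_HOME is not set"]

    # By the time a process reads Path, the %SPARK_HOME% reference has
    # usually been expanded, so compare against the expanded directory.
    expected = spark_home.rstrip("\\/") + "\\bin"
    entries = [entry.strip().rstrip("\\/").lower()
               for entry in environ.get("PATH", "").split(";")]
    if expected.lower() not in entries:
        return ["Path has no entry for " + expected]
    return []


if __name__ == "__main__":
    for line in check_spark_env(os.environ) or ["Spark environment looks OK"]:
        print(line)
```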
5.3. Python configuration
Copy the pyspark folder from the Spark directory (F:\BigData\Spark\spark-2.1.0-bin-hadoop2.7\python\pyspark) to the Python installation directory D:\Python27\Lib\site-packages, and then run the pyspark command in a cmd window. If the PySpark shell starts, the installation succeeded.
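Beyond starting the shell, a small job run from the PyDev project exercises the whole chain (Python, pyspark, and the JVM). The sketch below is an assumption-laden example, not taken from the article: the function name `run_smoke_test`, the application name "SmokeTest", and the master setting local[2] are all illustrative choices.

```python
def run_smoke_test():
    """Start a local SparkContext, sum the integers 1..100, and stop it."""
    # Imported inside the function so that a missing pyspark package
    # surfaces as an ImportError only when the test is actually run.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setMaster("local[2]").setAppName("SmokeTest")
    sc = SparkContext(conf=conf)
    try:
        # Distribute the numbers as an RDD and add them up on the workers.
        return sc.parallelize(range(1, 101)).sum()
    finally:
        # Always release the context so a later run can create a new one.
        sc.stop()
```

Calling `run_smoke_test()` should return 5050 when the environment is working. An ImportError at this point usually corresponds to one of the problems listed in section 6.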
6. FAQ for developing Spark with Python
6.1. ImportError: No module named py4j.protocol
Cause: this error is reported when running Python code and means that the py4j module is not installed for this Python.
Solution: in a cmd window, run cd D:\Python27\Scripts (my Python is installed in D:\Python27; you must change to the directory that contains pip before you can run it, and pip does not need to be installed separately beforehand), and then run pip install py4j to install the library. When pip reports that the package was installed successfully, the fix is complete.
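Spark distributions also ship a copy of py4j as a zip archive under python\lib, so an alternative to pip is to put Spark's own Python directories on sys.path at the start of a script. This is a different technique from the author's pip-based fix and is shown only as a sketch; the helper names are hypothetical, and the wildcard is used because the py4j version in the file name varies between Spark releases.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the directory and archives under spark_home that import needs."""
    python_dir = os.path.join(spark_home, "python")
    # Match the bundled py4j archive without hard-coding its version.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def add_spark_to_sys_path(spark_home):
    """Prepend the Spark Python paths to sys.path, skipping duplicates."""
    added = []
    for path in spark_python_paths(spark_home):
        if path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added
```

A script would call `add_spark_to_sys_path(os.environ["SPARK_HOME"])` before its first `import pyspark`.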
6.2. ImportError: No module named numpy
Cause: this error is reported when running Python code and means that the numpy module is not installed for this Python.
Solution: in a cmd window, run cd D:\Python27\Scripts (my Python is installed in D:\Python27; you must change to the directory that contains pip before you can run it, and pip does not need to be installed separately beforehand), and then run pip install numpy to install the library. When pip reports that the package was installed successfully, the fix is complete.
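Both errors can be detected before running any Spark code by trying the imports up front. The following sketch uses the standard importlib module; the function name `missing_modules` is made up for this example.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # py4j and numpy are the two modules named in the errors above.
    for name in missing_modules(["py4j", "numpy"]):
        print("Missing module: %s (try: pip install %s)" % (name, name))
```

If the script prints nothing, both modules are importable from the interpreter that ran it.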