2025-02-23 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains the difference between the MLlib and ML libraries in Spark. The content is fairly detailed and may be useful as a reference.
Machine learning library (MLlib)
MLlib is Spark's machine learning (ML) library. Its goal is to make practical machine learning scalable and easy. At a high level, it provides the following tools:
ML algorithms: common learning algorithms such as classification, regression, clustering, and collaborative filtering
Featurization: feature extraction, transformation, dimensionality reduction, and selection
Pipelines: tools for constructing, evaluating, and tuning ML pipelines
Persistence: saving and loading algorithms, models, and pipelines
Utilities: linear algebra, statistics, data handling, etc.
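To make the list above concrete, here is a minimal PySpark sketch that touches three of these tools: a featurizer, a pipeline, and persistence. The column names, sample rows, parameter values, and function names are illustrative assumptions rather than anything prescribed by Spark, and the sketch assumes PySpark and a local Java runtime are installed.

```python
import tempfile

# Hypothetical toy data: (id, text, label). Labels are doubles, as LogisticRegression expects.
TRAINING_ROWS = [
    (0, "spark makes machine learning scalable", 1.0),
    (1, "the weather is nice today", 0.0),
    (2, "spark mllib pipelines are easy", 1.0),
    (3, "lunch was late again", 0.0),
]


def build_text_pipeline():
    """Chain tokenizer -> hashed term frequencies -> logistic regression.

    Call this only after a SparkSession exists, because PySpark's ML
    stages are thin wrappers around JVM objects.
    """
    # Imported here so the sample data can be inspected without PySpark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import HashingTF, Tokenizer

    tokenizer = Tokenizer(inputCol="text", outputCol="words")
    hashing_tf = HashingTF(inputCol="words", outputCol="features", numFeatures=1024)
    lr = LogisticRegression(maxIter=10, regParam=0.01)
    return Pipeline(stages=[tokenizer, hashing_tf, lr])


def train_save_reload(spark, model_dir):
    """Fit the pipeline, persist the fitted model, and load it back."""
    from pyspark.ml import PipelineModel

    training = spark.createDataFrame(TRAINING_ROWS, ["id", "text", "label"])
    model = build_text_pipeline().fit(training)
    model.write().overwrite().save(model_dir)
    return PipelineModel.load(model_dir)


def main():
    """Run the whole sketch against a local Spark session."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("pipeline-sketch").getOrCreate()
    reloaded = train_save_reload(spark, tempfile.mkdtemp())
    test = spark.createDataFrame([(4, "spark is scalable")], ["id", "text"])
    reloaded.transform(test).select("text", "prediction").show()
    spark.stop()
```

Calling `main()` trains on the four toy rows, writes the fitted pipeline to a temporary directory, reloads it, and scores one unseen sentence.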
Announcement: the DataFrame-based API is the primary API
The MLlib RDD-based API is now in maintenance mode.
As of Spark 2.0, the RDD-based API in the spark.mllib package has entered maintenance mode. The primary machine learning API for Spark is now the DataFrame-based API in the spark.ml package.
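In PySpark the two packages appear as `pyspark.mllib` (RDD-based) and `pyspark.ml` (DataFrame-based). The sketch below trains a logistic regression on the same toy data through each API to show how the entry points differ. The data, parameter values, and function names are made up for illustration.

```python
# Hypothetical toy samples: (label, [feature values]).
SAMPLES = [
    (0.0, [0.0, 1.0]),
    (0.0, [0.5, 1.5]),
    (1.0, [2.0, 0.0]),
    (1.0, [2.5, 0.5]),
]


def train_with_rdd_api(spark):
    """RDD-based API (maintenance mode): the input is an RDD of LabeledPoint."""
    # Imported here so the sample data can be inspected without PySpark installed.
    from pyspark.mllib.classification import LogisticRegressionWithLBFGS
    from pyspark.mllib.regression import LabeledPoint

    points = spark.sparkContext.parallelize(
        [LabeledPoint(label, features) for label, features in SAMPLES]
    )
    return LogisticRegressionWithLBFGS.train(points, iterations=10, regParam=0.01)


def train_with_dataframe_api(spark):
    """DataFrame-based API (primary): the input is a DataFrame with label and features columns."""
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.linalg import Vectors

    df = spark.createDataFrame(
        [(label, Vectors.dense(features)) for label, features in SAMPLES],
        ["label", "features"],
    )
    return LogisticRegression(maxIter=10, regParam=0.01).fit(df)
```

Given a `SparkSession` named `spark`, the RDD-based model is queried with `predict` on a single vector or an RDD, while the DataFrame-based model is applied with `transform`, which returns a new DataFrame carrying a `prediction` column.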
What are the implications?
MLlib will still support the RDD-based API in spark.mllib with bug fixes.
MLlib will not add new features to the RDD-based API.
In the Spark 2.x releases, MLlib will add features to the DataFrame-based API to reach feature parity with the RDD-based API.
After feature parity is reached (roughly estimated for Spark 2.2), the RDD-based API will be deprecated.
Under the plan published with Spark 2.x, the RDD-based API was expected to be removed in Spark 3.0.
Why is MLlib switching to the DataFrame-based API?
DataFrames provide a more user-friendly API than RDDs. Their benefits include Spark data sources, SQL/DataFrame queries, Tungsten and Catalyst optimizations, and uniform APIs across languages.
The DataFrame-based API for MLlib provides a uniform API across ML algorithms and across multiple languages.
DataFrames facilitate practical ML pipelines, particularly feature transformations.
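As an illustration of the last point, the following PySpark sketch filters rows with an ordinary DataFrame query and then passes the result through a feature transformer, all on the same DataFrame. The column names (`age`, `income`, `clicked`), the rows, and the function name are hypothetical.

```python
# Hypothetical raw records: (age, income, clicked).
RAW_ROWS = [
    (25, 40000.0, 0.0),
    (32, 52000.0, 1.0),
    (47, 81000.0, 1.0),
    (17, 0.0, 0.0),
]
FEATURE_COLUMNS = ["age", "income"]


def assemble_features(spark):
    """Filter with a DataFrame query, then build the vector column that ML stages expect."""
    # Imported here so the sample data can be inspected without PySpark installed.
    from pyspark.ml.feature import VectorAssembler

    raw = spark.createDataFrame(RAW_ROWS, ["age", "income", "clicked"])
    adults = raw.filter(raw["age"] >= 18)  # plain DataFrame query, no ML involved
    assembler = VectorAssembler(inputCols=FEATURE_COLUMNS, outputCol="features")
    return assembler.transform(adults).select("features", "clicked")
```

The returned DataFrame can be handed to any `pyspark.ml` estimator that reads a `features` column, provided its label column is set to `clicked`.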
What is "Spark ML"?
"Spark ML" is not an official name, but it is occasionally used to refer to the MLlib DataFrame-based API. This is mainly because of the org.apache.spark.ml Scala package name used by the DataFrame-based API, and the term "Spark ML Pipelines" that was used initially to emphasize the pipeline concept.
Is MLlib deprecated?
No. MLlib includes both the RDD-based API and the DataFrame-based API. The RDD-based API is now in maintenance mode, but neither API is deprecated, nor is MLlib as a whole.
Dependencies
MLlib uses the linear algebra package Breeze, which depends on netlib-java for optimized numerical processing. If native libraries are not available at runtime, you will see a warning message and a pure JVM implementation will be used instead.
Because of licensing issues with the runtime proprietary binaries, the netlib-java native proxies are not included by default. To configure netlib-java / Breeze to use system-optimized binaries, include com.github.fommil.netlib:all:1.1.2 as a dependency of your project (or build Spark with -Pnetlib-lgpl), and read the netlib-java documentation for your platform's additional installation instructions.
To use MLlib in Python, you will need NumPy 1.4 or later.
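A quick way to check this requirement before submitting a job is sketched below. The helper names are hypothetical, and the check compares only the major and minor version numbers, which is an assumption that is sufficient for a "1.4 or later" rule.

```python
import re

MINIMUM_NUMPY = (1, 4)  # minimum version stated in the text above


def meets_minimum(version_text, minimum=MINIMUM_NUMPY):
    """Return True if the major.minor prefix of version_text is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_text!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


def numpy_is_recent_enough():
    """Check the installed NumPy; report False if NumPy is missing."""
    try:
        import numpy
    except ImportError:
        return False
    return meets_minimum(numpy.__version__)
```

Running `numpy_is_recent_enough()` in the same Python environment that the Spark driver and executors use tells you whether the interpreter can satisfy MLlib's NumPy requirement.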
That covers the difference between the MLlib and ML libraries in Spark. I hope the content above is helpful.