2025-01-16 Update From: SLTechnology News & Howtos > Servers
Shulou(Shulou.com)06/01 Report--
This article explains how to use Spark to find the maximum value in a data set. Many people run into trouble with this in practice, so the example below walks through the steps in full. I hope you read it carefully and get something out of it!
1: I had previously used Hadoop to read data from files and find the maximum value; here I do the same with Spark. It took a while to get working because documentation on Spark was scarce. Finding the maximum of 14,750,778 records took about 10 seconds in local mode.
2: Download spark-0.9.1-bin-hadoop1 and extract it to F:\BigData (Spark can run on Windows). Open F:\BigData\spark-0.9.1-bin-hadoop1\assembly\target\scala-2.10 and add spark-assembly_2.10-0.9.1-hadoop1.0.4.jar to the new project's build path.
Prepare the data: create a new data file with the following contents:
1,1,5.0
1,2,1.0
1,3,5.0
1,4,1.0
2,1,5.0
2,2,1.0
2,3,5.0
2,4,1.0
3,1,1.0
3,2,5.0
3,3,1.0
3,4,5.0
4,1,1.0
4,2,5.0
4,3,1.0
4,4,5.0
1,1,5.0
1,2,1.0
1,3,5.0
1,4,1.0
2,1,5.0
2,2,1.0
2,3,5.0
2,4,1.0
3,1,1.0
3,2,5.0
3,3,1.0
3,4,5.0
4,1,1.0
4,2,5.0
4,3,1.0
4,4,5.0
1,1,5.0
1,2,1.0
1,3,5.0
1,4,1.0
2,1,5.0
2,2,1.0
Values are separated by commas.
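Before wiring up Spark, the per-line parsing can be sketched in plain Java with no Spark dependency. This mirrors the split-then-parse steps used later in the Spark job; `ParseDemo` and `parseLine` are illustrative names introduced here, not part of any API:

```java
import java.util.regex.Pattern;

public class ParseDemo {
    private static final Pattern COMMA = Pattern.compile(",");

    // Split one comma-separated line into double values, treating
    // blank fields as 0 (the same guard the Spark job applies).
    static double[] parseLine(String line) {
        String[] fields = COMMA.split(line);
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            String s = fields[i].trim();
            values[i] = s.isEmpty() ? 0.0 : Double.parseDouble(s);
        }
        return values;
    }

    public static void main(String[] args) {
        double[] v = parseLine("1,2,5.0");
        System.out.println(v[0] + " " + v[1] + " " + v[2]);
    }
}
```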
Open Eclipse and create a new Java project. In any package, add the following class:
package com.spark.test;

import java.util.Arrays;
import java.util.regex.Pattern;

import org.apache.spark.api.java.JavaDoubleRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.DoubleFunction;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;

public final class Max {
    private static final Pattern COMMA = Pattern.compile(",");

    public static void main(String[] args) throws Exception {
        // Spark installation directory
        String sparkHome = "F:\\BigData\\spark-0.9.1-bin-hadoop1";
        // "local" selects local mode
        JavaSparkContext ctx = new JavaSparkContext("local", "Max",
                sparkHome, JavaSparkContext.jarOfClass(Max.class));
        // Load the file
        JavaRDD<String> lines = ctx.textFile(
                "E:\\workspace\\spark\\src\\com\\spark\\resource\\test.data", 1);
        // flatMap turns each line into multiple elements by splitting on the
        // delimiter; for example "1,2,3" becomes the three elements
        // 1
        // 2
        // 3
        // The point of this step is to put every value into one JavaRDD.
        JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
            @Override
            public Iterable<String> call(String s) {
                return Arrays.asList(COMMA.split(s));
            }
        });
        // Convert the JavaRDD<String> into a JavaDoubleRDD
        JavaDoubleRDD one = words.map(new DoubleFunction<String>() {
            @Override
            public Double call(String s) throws Exception {
                if (s.trim().length() == 0) {
                    s = "0";
                }
                return Double.parseDouble(s);
            }
        });
        // Count how many values there are
        System.out.println(one.count());
        // Find the maximum. Function2's three type parameters correspond to
        // the first and second arguments of call and to its return type.
        Double max = one.reduce(new Function2<Double, Double, Double>() {
            @Override
            public Double call(Double i1, Double i2) throws Exception {
                return Math.max(i1, i2);
            }
        });
        System.out.println(max);
        System.exit(0);
    }
}
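The Spark job above is essentially a distributed split-parse-reduce loop. As a sanity check of what it computes, here is a sequential sketch in plain Java with no Spark dependency; `MaxLocal` and `maxOf` are illustrative names introduced here:

```java
import java.util.regex.Pattern;

public class MaxLocal {
    private static final Pattern COMMA = Pattern.compile(",");

    // Sequential equivalent of the Spark pipeline: flatMap (split on
    // commas), map (parse to double, blanks become 0), reduce (Math.max).
    static double maxOf(String[] lines) {
        double max = Double.NEGATIVE_INFINITY;
        for (String line : lines) {
            for (String field : COMMA.split(line)) {
                String s = field.trim();
                double d = s.isEmpty() ? 0.0 : Double.parseDouble(s);
                max = Math.max(max, d);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        String[] sample = { "1,1,5.0", "1,2,1.0", "2,3,5.0" };
        System.out.println(maxOf(sample)); // prints 5.0
    }
}
```

On the sample file above, every value is between 1 and 5, so both this sketch and the Spark job should report 5.0.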
That concludes "how to use Spark to find the maximum value of data". Thank you for reading, and I hope you found it useful!