The difference between flatMap and map in Spark

2025-04-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains the difference between flatMap and map in Spark. The approach is simple and practical: define the two operations, compare them in a small Java test, then look at the same difference in Spark.

Background

The flattening behaviour of map and flatMap, whether taken from the method names or from the official documentation, can be hard to understand (it was for me too), so I took the time to analyze them. My notes are organized below.

First, let's define the terms.

My understanding

map: the function passed to map returns an object, and map replaces the current element in the stream with that return value.

flatMap: the function passed to flatMap returns a stream, and flatMap replaces the current element in the stream with the elements of that returned stream. The flattening removes exactly one level of nesting. It is not recursive: if the elements of the returned stream are themselves collections, they stay collections unless flatMap is applied again.
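To make the flattening depth concrete, here is a minimal sketch using the standard java.util.stream API. The class name FlattenOnce and its two helper methods are made up for illustration and are not part of the original test code. Under these assumptions, it suggests that one flatMap call removes a single level of nesting and that a second call is needed to reach the innermost values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative, not from the article.
class FlattenOnce {

    // One flatMap call: List<List<List<Integer>>> becomes List<List<Integer>>.
    static List<List<Integer>> flattenOnce(List<List<List<Integer>>> nested) {
        return nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    // Two flatMap calls: List<List<List<Integer>>> becomes List<Integer>.
    static List<Integer> flattenTwice(List<List<List<Integer>>> nested) {
        return nested.stream()
                .flatMap(List::stream)
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<List<Integer>>> nested = Arrays.asList(
                Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3)),
                Arrays.asList(Arrays.asList(4, 5)));

        System.out.println(flattenOnce(nested));  // [[1, 2], [3], [4, 5]]
        System.out.println(flattenTwice(nested)); // [1, 2, 3, 4, 5]
    }
}
```

The inner lists survive the first call untouched, which is why deeper structures need one flatMap per level.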

Official explanation (from the Java Stream API documentation)

map: Returns a stream consisting of the results of applying the given function to the elements of this stream.

In other words, map returns a stream containing the result of applying the given function to each element of the original stream.

flatMap: Returns a stream consisting of the results of replacing each element of this stream with the contents of a mapped stream produced by applying the provided mapping function to each element.

In other words, flatMap applies the mapping function to each element to produce a stream, then replaces that element with the contents of the stream it produced.
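The two definitions can be tried side by side in a short sketch. The class MapVsFlatMap and its methods lengths and letters are hypothetical names chosen for this illustration. It assumes plain Java streams rather than Spark: with map, the function returns one value per word, while with flatMap the function returns a stream per word and the contents of those streams end up in the result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class illustrating the two definitions.
class MapVsFlatMap {

    // map: each word is replaced by the single value the function returns.
    static List<Integer> lengths(List<String> words) {
        return words.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    // flatMap: each word is replaced by the contents of the stream
    // the function returns (here, the word's individual letters).
    static List<String> letters(List<String> words) {
        return words.stream()
                .flatMap(w -> Arrays.stream(w.split("")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ab", "cde");
        System.out.println(lengths(words)); // [2, 3]
        System.out.println(letters(words)); // [a, b, c, d, e]
    }
}
```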

An example

There are two boxes of eggs with 5 eggs in each. We need to fry the eggs and hand them out to students.

What map does: fry the eggs in both boxes, put them back into their original two boxes, and give one box to each of 2 groups of students.

What flatMap does: fry the eggs in both boxes, pool them together (10 fried eggs), and give one to each of 10 students.

The complete test code is as follows:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    import org.junit.Before;
    import org.junit.Test;

    public class Map_FlatMap {

        List<String[]> eggs = new ArrayList<>();

        @Before
        public void init() {
            // the first box of eggs
            eggs.add(new String[]{"egg_1", "egg_1", "egg_1", "egg_1", "egg_1"});
            // the second box of eggs
            eggs.add(new String[]{"egg_2", "egg_2", "egg_2", "egg_2", "egg_2"});
        }

        // auto-incrementing group number
        static int group = 1;
        // auto-incrementing student number
        static int student = 1;

        /**
         * Fry both boxes of eggs, keep them in their original two boxes,
         * and give one box to each of 2 groups of students.
         */
        @Test
        public void map() {
            eggs.stream()
                .map(x -> Arrays.stream(x).map(y -> y.replace("egg", "fried egg")))
                .forEach(x -> System.out.println("group " + group++ + ": " + Arrays.toString(x.toArray())));
            /*
             * console print:
             * group 1: [fried egg_1, fried egg_1, fried egg_1, fried egg_1, fried egg_1]
             * group 2: [fried egg_2, fried egg_2, fried egg_2, fried egg_2, fried egg_2]
             */
        }

        /**
         * Fry both boxes of eggs, pool them together (10 fried eggs),
         * and give one to each of 10 students.
         */
        @Test
        public void flatMap() {
            eggs.stream()
                .flatMap(x -> Arrays.stream(x).map(y -> y.replace("egg", "fried egg")))
                .forEach(x -> System.out.println("student " + student++ + ": " + x));
            /*
             * console print:
             * student 1: fried egg_1
             * student 2: fried egg_1
             * student 3: fried egg_1
             * student 4: fried egg_1
             * student 5: fried egg_1
             * student 6: fried egg_2
             * student 7: fried egg_2
             * student 8: fried egg_2
             * student 9: fried egg_2
             * student 10: fried egg_2
             */
        }
    }

In Spark (Scala), the difference looks like this:

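flatMap

    // assumes an existing SparkContext named sc, as in spark-shell
    val lineArray = Array("hello you", "hello me", "hello world")
    val lines = sc.parallelize(lineArray, 1)
    // each line is split into words, and the words are flattened into one RDD[String]
    val words = lines.flatMap(line => line.split(" "))
    words.foreach(word => println(word.mkString))

Results (each element is printed on its own line):

    hello
    you
    hello
    me
    hello
    world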

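map

    // assumes an existing SparkContext named sc, as in spark-shell
    val lineArray = Array("hello you", "hello me", "hello world")
    val lines = sc.parallelize(lineArray, 1)
    // each line becomes one Array[String], so the result is an RDD[Array[String]]
    val words = lines.map(line => line.split(" "))
    // mkString joins the words of each array with no separator
    words.foreach(word => println(word.mkString))

Results (each element is printed on its own line):

    helloyou
    hellome
    helloworld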

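map: each input element produces exactly one new element, so the number of elements stays the same.

flatMap: each input element produces zero or more new elements, which are flattened into a single collection, so the result usually has more elements than the input (though it can also have the same number or fewer).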
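The effect on element counts can be checked with a small sketch. It uses plain Java streams rather than Spark, and the class ElementCounts with its methods countAfterMap and words is a hypothetical name chosen for this illustration. One assumption to note: an empty line is explicitly mapped to an empty stream, so that it contributes zero words.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; it mirrors the word-splitting example with Java streams.
class ElementCounts {

    // map: one String[] per input line, so the element count is unchanged.
    static int countAfterMap(List<String> lines) {
        return lines.stream()
                .map(line -> line.split(" "))
                .collect(Collectors.toList())
                .size();
    }

    // flatMap: each line contributes zero or more words to a single flat list.
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> line.isEmpty()
                        ? Stream.<String>empty()            // assumption: empty lines yield nothing
                        : Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello you", "", "hello world");
        System.out.println(countAfterMap(lines)); // 3
        System.out.println(words(lines));         // [hello, you, hello, world]
    }
}
```

With three input lines, map still yields three elements, while flatMap yields four words here and would yield none if every line were empty.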

At this point, you should have a clearer understanding of the difference between flatMap and map in Spark. The best way to consolidate it is to run the examples yourself.
