Apache Beam Program Wizard 4

2025-04-01 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--


Today, while experimenting with Beam on Spark on a cluster, I ran into a tricky problem, which boils down to a java.lang.NoClassDefFoundError, as shown in Figure 1 below.

Figure 1: error prompt

This error means the JVM cannot find the definition of SparkStreamingContext at runtime, even though the class was available at compile time. In other words, the corresponding dependency is declared and the code compiles, so there is no problem at that stage; the next step is to check whether the version of the dependency is wrong.
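To make the diagnosis concrete, here is a minimal sketch of probing at runtime whether a class that compiled fine can actually be loaded. The class name `ClasspathProbe` and the probed class are illustrative, not from the original article; a load failure at this point usually means the jar is missing from the runtime classpath or carries a different version than the one compiled against.

```java
// Hypothetical probe: checks whether a class can be loaded by the current JVM.
// If a class that was present at compile time fails here, the runtime
// classpath is missing the jar or holds an incompatible version of it.
public class ClasspathProbe {

    // Returns true if the named class is loadable on the current classpath.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | NoClassDefFoundError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Illustrative: the Spark streaming context class from the article's error.
        String cls = "org.apache.spark.streaming.StreamingContext";
        if (isLoadable(cls)) {
            System.out.println(cls + " is on the runtime classpath");
        } else {
            System.out.println(cls + " cannot be loaded: missing jar or version mismatch");
        }
    }
}
```

Running this on the cluster with the same classpath as the failing job quickly separates "jar not shipped" from "wrong version" problems.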

My Maven dependencies are shown in Figure 2 below.

Figure 2: Maven dependency packages

After various attempts, I finally found a promising fix: changing the Spark version from 2.1.0 to 1.6.3.

The revised status is as follows:

Figure 3: successful local run
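In pom.xml terms, the change amounts to pinning the Spark dependencies to 1.6.3. A sketch, assuming the standard Spark artifact coordinates (the original article's actual dependency list is only shown as a figure, so other entries are omitted):

```xml
<!-- Sketch of the version change described above: pin Spark to 1.6.3.
     Artifact names are the standard Spark ones (Scala 2.10 builds for
     the 1.6.x line); other dependencies in the real pom are omitted. -->
<properties>
  <spark.version>1.6.3</spark.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
</dependencies>
```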

This looks like success, but so far it is only a local test run of Beam on Spark. To see whether it can run on a real Spark cluster, we still need to build the corresponding jar package following the official method, submit it to the YARN cluster with spark-submit, and let YARN schedule the job. Only if it runs successfully there is it a real success. So it is too early to celebrate; on to the next experiment.

First, add the following shade build configuration to Maven's pom.xml (this goes inside the <plugins> node, under the <build> node of the project root):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <createDependencyReducedPom>false</createDependencyReducedPom>
    <filters>
      <filter>
        <artifact>*:*</artifact>
        <excludes>
          <exclude>META-INF/*.SF</exclude>
          <exclude>META-INF/*.DSA</exclude>
          <exclude>META-INF/*.RSA</exclude>
        </excludes>
      </filter>
    </filters>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <shadedClassifierName>shaded</shadedClassifierName>
      </configuration>
    </execution>
  </executions>
</plugin>

First, archive the project source, copy it to the machine from which the application will be submitted, and extract it there.

If the Maven environment is already installed, run the following command from the root of the extracted directory:

mvn package

Then wait for the compilation to finish.

As shown in the following figure, the compiled jar package is in the target directory

Submit the jar package via spark-submit. The submission script is as follows (comments are placed on their own lines, since a trailing comment after a backslash would break the line continuation):

# --class: the class to run (full package path + class name)
# --master yarn: submit to YARN for scheduling
# --deploy-mode cluster: run in cluster mode
# last argument: path to the shaded jar package
${SPARK_HOME}/bin/spark-submit \
  --class org.tongfang.beam.examples.WordCount \
  --master yarn \
  --deploy-mode cluster \
  /home/ubuntu/shaded/target/shaded-1.0.0-shaded.jar

After adding executable permission to the script (chmod +x), you can run it:

After a short wait, the job succeeds, as shown in the following figure:

The article is from the Mathematical Model Hall.
