
How to analyze the principle of Zeppelin

2025-01-19 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article looks at how Zeppelin works in practice. The notes are fairly detailed and are meant as a reference for anyone setting it up.

Zeppelin, introduced in an earlier technical exploration, is a visual programming interface for Spark/Hive/R analysis, based on the same idea as Jupyter. This week I wanted to move some of our Spark jobs out of the original codebase and into Zeppelin. The main reasons are:

- Changing Spark code in the codebase is tedious: git push, git pull, compile, package, run. I want to modify and run Spark with one click.

- The output of a Spark job is hard to read. I want to see tables and charts on a web page.

- I want colleagues who are not engineers to be able to run Spark too.

Reality was harsher than the plan: what I expected to finish in one day dragged on for four. Zeppelin is interesting, but it still has too many small problems. At least the version is only v0.6.0; if they had dared to call it v1.0, I would have been furious.

## Installation

I downloaded the all-in-binary version directly from the official website. There seem to be compatibility problems between Zeppelin and Spark, so the safest approach is to download the Zeppelin source code and build it with mvn against the matching Spark/Hadoop versions.

The official website says:

Zeppelin will work with any version of Spark and any deployment type without rebuilding Zeppelin in this way. (Zeppelin 0.5.6-incubating release works up to Spark 1.6.1)

I believed it. In the end Spark did run, at the cost of my lost youth.

But Spark is now at 2.0, so if you can, you should still use your own build.
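A source build for a specific Spark/Hadoop pair might look like the sketch below. The profile names (`spark-1.6`, `hadoop-2.6`) follow the pattern used in the Zeppelin 0.6 README, but they are assumptions here; check the README of the release you are building for the profiles it actually offers.

```shell
#!/bin/sh
# Assumed Maven profile names; pick the ones that match your cluster.
SPARK_PROFILE="spark-1.6"
HADOOP_PROFILE="hadoop-2.6"

# Compose the Maven command for the chosen versions.
BUILD_CMD="mvn clean package -P${SPARK_PROFILE} -P${HADOOP_PROFILE} -DskipTests"

# Print the command so it can be reviewed before running.
echo "$BUILD_CMD"
```

Run the printed command from the root of the Zeppelin source checkout.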

## Creating a Notebook

A Notebook is equivalent to a task and can be divided into multiple Paragraphs. Each Paragraph can use a different interpreter, and therefore a different syntax.

I created two Notebooks: one connects to HBase to clean up data, and the other uses Spark to compute statistics on the HBase data. Basically, you can copy Java/Scala code straight into them.

The following problems cost me a lot of time:

### Adding dependency packages

There are three ways to load dependencies (Load Dependency); the one described here is reliable.

Using z.load() is not recommended.

Note that versions 0.5.6 and earlier do not have this interpreter dependency management feature!
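As a sketch of one option that avoids z.load(), Zeppelin's Spark interpreter documentation describes passing spark-submit options through `SPARK_SUBMIT_OPTIONS` in conf/zeppelin-env.sh. The Maven coordinate below is a hypothetical example, not the package used in this article, and as far as I know the setting only takes effect when `SPARK_HOME` is also set.

```shell
# conf/zeppelin-env.sh
# Hypothetical coordinate (group:artifact:version); replace with your own.
DEP_PACKAGES="org.apache.hbase:hbase-client:1.2.0"

# spark-submit resolves --packages from Maven repositories at startup.
export SPARK_SUBMIT_OPTIONS="--packages ${DEP_PACKAGES}"
```

Restart Zeppelin after editing the file so the interpreter picks up the new options.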

### Spark cannot be connected

Zeppelin reports "Can't get status", and the error message is probably `org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused`.

If all other settings are correct, the likely cause is that SPARK_HOME is not set. Set it with `export SPARK_HOME=` followed by your Spark installation path.

Fixed settings like this can be written in conf/zeppelin-env.sh.
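A minimal sketch of that entry in conf/zeppelin-env.sh follows. The path `/opt/spark` is a placeholder assumption; the sanity check is an addition for illustration and only prints a warning.

```shell
# conf/zeppelin-env.sh
# /opt/spark is a hypothetical install location; use your own path.
export SPARK_HOME="/opt/spark"

# Optional check: warn if spark-submit is not where we expect it.
if [ ! -x "$SPARK_HOME/bin/spark-submit" ]; then
  echo "warning: no spark-submit under $SPARK_HOME/bin" >&2
fi
```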

### Jackson problem

The specific error reported should be:

The cause is a conflict between the Jackson version that Spark depends on and the one shipped with Zeppelin. Just delete the Jackson-related packages under Zeppelin's lib directory. A more detailed solution from an expert is here: Apache Zeppelin & Spark parses Json exceptions
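The cleanup can be sketched as a small shell function. Two assumptions: the jars match the pattern `jackson-*.jar`, and they are moved into a backup directory (the hypothetical `lib-jackson-backup`) instead of being deleted, so the change can be undone.

```shell
#!/bin/sh
# Move Jackson jars out of Zeppelin's lib directory.
# $1: Zeppelin installation directory.
move_jackson_jars() {
  zeppelin_home="$1"
  backup_dir="$zeppelin_home/lib-jackson-backup"
  mkdir -p "$backup_dir" || return 1
  for jar in "$zeppelin_home"/lib/jackson-*.jar; do
    # Skip the literal pattern when the glob matches nothing.
    [ -e "$jar" ] || continue
    mv "$jar" "$backup_dir/" || return 1
    echo "moved $(basename "$jar")"
  done
}
```

Call it as `move_jackson_jars /path/to/zeppelin`, then restart Zeppelin.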

## Final result

## Comparison with similar tools

This article briefly compares Zeppelin, Spark-notebook and Jupyter. The author's conclusion is:

- If you only use Scala without Spark, you can use jupyter-scala.

- If you want Spark + Scala with fancy JavaScript visualizations, use spark-notebook.

- Choose Zeppelin when you want to use AWS EMR or other backends.

