2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com) 06/03 Report--
1. Environment preparation
Apache Spark 2.2.0 is installed (this Moonbox release supports only Spark 2.2.0; other Spark versions will be supported later). MySQL is installed and started, with remote access enabled. Passwordless SSH login is configured on every installation node.
2. Download
Moonbox-0.3.0-beta download: https://github.com/edp963/moonbox/releases/tag/0.3.0-beta
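The passwordless-SSH prerequisite from the environment step can be set up roughly as follows. This is a sketch with placeholder names: user, worker1, and the ./demo_key path are illustrative only (a real deployment would normally use ~/.ssh/id_rsa).

```shell
# Generate an RSA key pair with no passphrase (demo path used here;
# a real deployment would typically use ~/.ssh/id_rsa).
ssh-keygen -q -t rsa -N "" -f ./demo_key

# Push the public key to each worker node (asks for the password once):
#   ssh-copy-id -i ./demo_key.pub user@worker1
# Afterwards this should print the worker's hostname with no password prompt:
#   ssh -i ./demo_key user@worker1 hostname
```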
3. Decompress
tar -zxvf moonbox-assembly_2.11-0.3.0-beta-dist.tar.gz
4. Modify the configuration files
The configuration file is located in the conf directory
Step 1: modify slaves
mv slaves.example slaves
vim slaves
You will see the following:
localhost
Please modify it to the address of the worker node that needs to be deployed according to the actual situation, one address per line.
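For example, a three-worker deployment might use a slaves file like the following (worker1..worker3.example.com are placeholder hostnames); the trailing wc -l just confirms there is one address per line:

```shell
# Write an example slaves file: one worker address per line.
# worker1..worker3.example.com are placeholder hostnames.
cat > slaves <<'EOF'
worker1.example.com
worker2.example.com
worker3.example.com
EOF
wc -l < slaves   # three lines -> Moonbox will start a Worker on each host
```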
Step 2: modify moonbox-env.sh
mv moonbox-env.sh.example moonbox-env.sh
chmod u+x moonbox-env.sh
vim moonbox-env.sh
You will see the following:
export JAVA_HOME=path/to/installed/dir
export SPARK_HOME=path/to/installed/dir
export YARN_CONF_DIR=path/to/yarn/conf/dir
export MOONBOX_SSH_OPTS="-p 22"
export MOONBOX_HOME=path/to/installed/dir
# export MOONBOX_LOCAL_HOSTNAME=localhost
export MOONBOX_MASTER_HOST=localhost
export MOONBOX_MASTER_PORT=2551
Please modify it according to the actual situation
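As an illustration, a filled-in moonbox-env.sh might look like this (all paths and hostnames below are placeholder values chosen for the example, not defaults):

```shell
# Write a sample moonbox-env.sh with placeholder paths and hosts.
cat > moonbox-env.sh <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk
export SPARK_HOME=/opt/spark-2.2.0
export YARN_CONF_DIR=/etc/hadoop/conf
export MOONBOX_SSH_OPTS="-p 22"
export MOONBOX_HOME=/opt/moonbox
export MOONBOX_MASTER_HOST=master.example.com
export MOONBOX_MASTER_PORT=2551
EOF

# Source it and confirm the master address is what we expect.
. ./moonbox-env.sh
echo "$MOONBOX_MASTER_HOST:$MOONBOX_MASTER_PORT"   # -> master.example.com:2551
```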
Step 3: modify moonbox-defaults.conf
mv moonbox-defaults.conf.example moonbox-defaults.conf
vim moonbox-defaults.conf
You will see the following, where:
catalog
Configures the metadata storage location. This must be modified; set it according to your environment.
rest
Configures the REST service; modify as needed.
tcp
Configures the TCP (JDBC) service; modify as needed.
local
Configures Spark local-mode jobs. The value is an array; the number of elements determines how many Spark local-mode jobs each Worker node starts. Delete this entry if it is not needed.
cluster
Configures Spark YARN-mode jobs. The value is an array; the number of elements determines how many Spark YARN-mode jobs each Worker node starts. Delete this entry if it is not needed.
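To illustrate the array semantics (a sketch only; the resourcemanager hostname is a placeholder), a Worker that should run two local-mode jobs and one YARN-mode job would use:

```
local = [{}, {}]        # two elements -> two Spark local-mode jobs per Worker
cluster = [{            # one element -> one Spark YARN-mode job per Worker
    spark.hadoop.yarn.resourcemanager.hostname = "master"
}]
```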
moonbox {
    deploy {
        catalog {
            implementation = "mysql"
            url = "jdbc:mysql://host:3306/moonbox?createDatabaseIfNotExist=true"
            user = "root"
            password = "123456"
            driver = "com.mysql.jdbc.Driver"
        }
        rest {
            enable = true
            port = 9099
            request.timeout = "600s"
            idle.timeout = "600s"
        }
        tcp {
            enable = true
            port = 10010
        }
    }
    mixcal {
        pushdown.enable = true
        column.permission.enable = true
        spark.sql.constraintPropagation.enabled = false
        local = [{}]
        cluster = [{
            spark.hadoop.yarn.resourcemanager.hostname = "master"
            spark.hadoop.yarn.resourcemanager.address = "master:8032"
            spark.yarn.stagingDir = "hdfs://master:8020/tmp"
            spark.yarn.access.namenodes = "hdfs://master:8020"
            spark.loglevel = "ERROR"
            spark.cores.max = 2
            spark.yarn.am.memory = "512m"
            spark.yarn.am.cores = 1
            spark.executor.instances = 2
            spark.executor.cores = 1
            spark.executor.memory = "2g"
        }]
    }
}
Optional: if HDFS is configured with high availability (HA), HDFS is configured with Kerberos, YARN is configured with HA, or YARN is configured with Kerberos, change the relevant part of the cluster element to the following configuration, adjusting it to your environment. The specific values can be found in the HDFS and YARN configuration files.
# HDFS HA #
spark.hadoop.fs.defaultFS = "hdfs://service_name"
spark.hadoop.dfs.nameservices = "service_name"
spark.hadoop.dfs.ha.namenodes.service_name = "xxx1,xxx2"
spark.hadoop.dfs.namenode.rpc-address.service_name.xxx1 = "xxx1_host:8020"
spark.hadoop.dfs.namenode.rpc-address.service_name.xxx2 = "xxx2_host:8020"
spark.hadoop.dfs.client.failover.proxy.provider.service_name = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
spark.yarn.stagingDir = "hdfs://service_name/tmp"

# HDFS Kerberos #
dfs.namenode.kerberos.principal = ""
dfs.namenode.kerberos.keytab = ""

# YARN HA #
spark.hadoop.yarn.resourcemanager.ha.enabled = true
spark.hadoop.yarn.resourcemanager.ha.rm-ids = "yyy1,yyy2"
spark.hadoop.yarn.resourcemanager.hostname.rm1 = "yyy1_host"
spark.hadoop.yarn.resourcemanager.hostname.rm2 = "yyy2_host"

# YARN Kerberos #
spark.yarn.principal = ""
spark.yarn.keytab = ""

5. Distribute the installation package
Place the MySQL JDBC driver jar in both the libs and runtime directories, then copy the entire moonbox installation directory to every installation node, keeping the path identical to that on the master node.
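The distribution step can be sketched as a dry run that only prints the copy commands (the install path, jar name, and worker hosts are placeholders; a real run would read conf/slaves and drop the echo):

```shell
# Demo worker list; a real deployment would read conf/slaves instead.
printf 'worker1\nworker2\n' > slaves.demo
MOONBOX_HOME=/opt/moonbox-0.3.0-beta
while read -r host; do
  # MySQL JDBC driver into libs and runtime, then the whole install dir,
  # keeping the same path as on the master node.
  echo "scp mysql-connector-java.jar $host:$MOONBOX_HOME/libs/"
  echo "scp mysql-connector-java.jar $host:$MOONBOX_HOME/runtime/"
  echo "rsync -a $MOONBOX_HOME/ $host:$MOONBOX_HOME/"
done < slaves.demo
```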
6. Start the cluster
Execute on the master node
sbin/start-all.sh
7. Stop the cluster
Execute on the master node
sbin/stop-all.sh
8. Check whether the cluster started successfully
Execute the following command on the master node and you will see the MoonboxMaster process
jps | grep Moonbox
Execute the following command on the worker node and you will see the MoonboxWorker process
jps | grep Moonbox
Execute the following command on the worker nodes; you should see as many SparkSubmit processes as the configuration file specifies
jps -m | grep Spark
Use the moonbox-cluster command to view cluster information
bin/moonbox-cluster workers
bin/moonbox-cluster apps
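The jps checks above can be wrapped in a small script. This sketch runs against a hard-coded sample listing so the logic is visible without a live cluster; on a real node, set the variable from the actual jps output instead.

```shell
# Sample jps output (placeholder PIDs); on a real node use: listing="$(jps)"
listing="12345 MoonboxMaster
23456 SparkSubmit"

# Report whether each expected Moonbox process appears in the listing.
for proc in MoonboxMaster MoonboxWorker; do
  if printf '%s\n' "$listing" | grep -q "$proc"; then
    echo "$proc: running"
  else
    echo "$proc: missing"
  fi
done
# -> MoonboxMaster: running
# -> MoonboxWorker: missing
```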
If the checks pass, the cluster has started successfully, and you can refer to the examples section to get started. If a check fails, troubleshoot by inspecting the logs in the logs directory on the master or worker node.
Open source address: https://github.com/edp963/moonbox
Source: Yixin Institute of Technology