How to use spark-sql-perf

2025-01-16 Update From: SLTechnology News&Howtos

This article walks through how to use spark-sql-perf in detail, from setting up the environment to running the TPC-DS test, in the hope of giving anyone who wants to try it a simpler way to get started.

Installation of the basic environment

Blade server: one machine with 126 GB of memory and 64 cores, running CentOS 7.2

VirtualBox runs four virtual machines on it (CentOS 7.2, 16 GB of memory and 4 cores each): master, worker1, worker2, worker3

Spark version: 2.0

Hadoop version: 2.6

For installation, refer to a Hadoop installation guide or a Spark on YARN installation guide

[Screenshot: the cluster after installation]

Downloading, compiling and deploying davies/tpcds-kit

davies/tpcds-kit is a tool for generating the TPC-DS test data

Download

git clone https://github.com/davies/tpcds-kit.git

Compile

Choose one machine (here we use master) and install the following build tools on it (the default installation does not include them)

yum install gcc gcc-c++ bison flex cmake ncurses-devel
cd tpcds-kit/tools
cp Makefile.suite Makefile # copy Makefile.suite to Makefile
make # run make against that Makefile
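It can save a failed build to confirm that the toolchain is on the PATH first, and to confirm afterwards that the build produced the dsdgen data generator. The following is a minimal sketch rather than part of tpcds-kit: the helper names missing_tools and has_dsdgen are made up for illustration, and the list of tools checked is an assumption based on the packages installed above.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Succeed when an executable dsdgen exists in the given tools directory.
has_dsdgen() {
    [ -x "$1/dsdgen" ]
}

# Assumed tool list; adjust it to match what the build needs on your system.
missing=$(missing_tools gcc make bison flex)
if [ -n "$missing" ]; then
    echo "missing build tools:" $missing
else
    echo "all build tools found"
fi

# Relative path as used in the commands above; run from the clone's parent.
if has_dsdgen tpcds-kit/tools; then
    echo "dsdgen built"
else
    echo "dsdgen not found; run make in tpcds-kit/tools"
fi
```

Running the script before make shows which packages still need installing; running it again after make confirms that dsdgen is in place before the directory is copied anywhere.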

Next, copy the tpcds-kit to the same directory on all machines (important)
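Because the same copy is repeated for every worker, it can be wrapped in a loop. This is a minimal sketch under a few assumptions: copy_to_workers and DRY_RUN are names invented here, root SSH access to the workers is already set up, and /directory/tpcds-kit stands in for the real path. By default it only prints the scp commands so they can be reviewed first.

```shell
#!/bin/sh
# Copy a directory to the same path on each worker. With DRY_RUN=1 (the
# default here) the scp commands are only printed, not executed.
DRY_RUN=${DRY_RUN:-1}

copy_to_workers() {
    src=$1
    shift
    for host in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "scp -r $src root@$host:$src"
        else
            # Stop at the first host that fails so the error is noticed.
            scp -r "$src" "root@$host:$src" || return 1
        fi
    done
}

# /directory/tpcds-kit is a placeholder path, not a real location.
copy_to_workers /directory/tpcds-kit worker1 worker2 worker3
```

Setting DRY_RUN=0 in the environment performs the copy for real. The per-host command is shown on its own next.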

scp -r /directory/tpcds-kit root@worker1:/directory/tpcds-kit # run this command three times, once each for worker1, worker2 and worker3

Downloading and packaging databricks/spark-sql-perf

Download

git clone https://github.com/databricks/spark-sql-perf.git

Packaging

A jar built with sbt package could not find its dependencies at run time, so we import the project into IntelliJ IDEA and build the jar there instead.

Edit build.sbt and change the Scala version to 2.11.8

Package the project as a jar

Set up Project Structure

Set up Artifacts

Build

The jar does not need to be present on every node.

Running the TPC-DS test

Change the driver memory limit in spark-env.sh

SPARK_DRIVER_MEMORY=8G # adjust to your environment
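Editing the file by hand is fine for a single setting, but repeated edits tend to leave duplicate lines behind. Here is a minimal sketch of an idempotent update. The helper set_env_var is hypothetical, and it assumes the file uses plain NAME=VALUE lines, so an existing "export NAME=..." line would not be replaced. The demonstration writes to a temporary file rather than a real Spark configuration.

```shell
#!/bin/sh
# Set NAME=VALUE in a spark-env.sh style file, replacing any existing
# plain assignment of NAME so the file never holds duplicates.
set_env_var() {
    file=$1 name=$2 value=$3
    touch "$file"
    # Keep every line except a previous assignment of the same variable.
    # grep exits non-zero when nothing is kept, which is not an error here.
    grep -v "^$name=" "$file" > "$file.tmp" || true
    printf '%s=%s\n' "$name" "$value" >> "$file.tmp"
    mv "$file.tmp" "$file"
}

# Demonstration on a temporary file: the second call replaces the first.
demo=$(mktemp)
set_env_var "$demo" SPARK_DRIVER_MEMORY 4G
set_env_var "$demo" SPARK_DRIVER_MEMORY 8G
cat "$demo"   # prints SPARK_DRIVER_MEMORY=8G
rm -f "$demo"
```

Against a real installation the call would target conf/spark-env.sh under the Spark directory.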

Run spark-shell

cd spark-2.0.0-bin-hadoop2.6
./bin/spark-shell --jars /jar-directory/spark-sql-perf.jar --num-executors 20 --executor-cores 2 --executor-memory 8G --master spark://master:7077

Run the test in spark-shell

// create the sqlContext
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
import com.databricks.spark.sql.perf.tpcds.Tables
// generate the data
// parameter 1: sqlContext; parameter 2: tpcds-kit tools directory; parameter 3: amount of data to generate (GB)
val tables = new Tables(sqlContext, "/directory/tpcds-kit/tools", 1)
tables.genData("hdfs://master:8020/tpctest", "parquet", true, false, false, false, false)
// create the table structure (external tables or temporary tables)
// tables.createExternalTables("hdfs://master:8020/tpctest", "parquet", "mytest", false)
tables.createTemporaryTables("hdfs://master:8020/tpctest", "parquet")
import com.databricks.spark.sql.perf.tpcds.TPCDS
val tpcds = new TPCDS(sqlContext = sqlContext)
// run the test
val experiment = tpcds.runExperiment(tpcds.tpcds1_4Queries)
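The spark-shell launch above carries several settings on one long line, which makes typos easy. The following minimal sketch assembles the same command from variables and prints it for review. The function spark_shell_cmd and the variable names are invented for illustration, the jar path is a placeholder, and the executor settings simply repeat the values used in this article.

```shell
#!/bin/sh
# Defaults mirror the values used in this article; override them through
# the environment. The jar path is a placeholder, not a real location.
SPARK_HOME=${SPARK_HOME:-spark-2.0.0-bin-hadoop2.6}
PERF_JAR=${PERF_JAR:-/jar-directory/spark-sql-perf.jar}
MASTER_URL=${MASTER_URL:-spark://master:7077}

# Print the full spark-shell command line without running it.
spark_shell_cmd() {
    printf '%s --jars %s --num-executors %s --executor-cores %s --executor-memory %s --master %s\n' \
        "$SPARK_HOME/bin/spark-shell" "$PERF_JAR" 20 2 8G "$MASTER_URL"
}

spark_shell_cmd
```

Once the printed line looks right, it can be pasted into a terminal to start the shell.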

In spark-shell, we can call experiment.html to check the execution status

[Screenshot: the data generated on HDFS]

[Screenshot: the test run]

The results of the run are saved in the spark/performance directory

[Screenshot: the evaluation results on HDFS]

That covers how to use spark-sql-perf. I hope the above is of some help to you.
