How Hive integrates Phoenix

Updated 2025-07-12. Source: SLTechnology News&Howtos (Shulou.com), 05/31 report.

This article shows how to integrate Hive with Phoenix. The steps are short and easy to follow, and the problems encountered along the way are listed at the end.

First, Phoenix must already be integrated with HBase.

Hive must also be integrated with HBase; refer to the earlier notes on that.

Copy the Phoenix jars {core,queryserver,4.8.0-HBase-0.98,hive} to $hive/lib/.

Modify the configuration files as required by the official website:

> vim conf/hive-env.sh

> vim conf/hive-site.xml

Start:

> hive --hiveconf phoenix.zookeeper.quorum=hadoop01:2181

Create an internal table

create table phoenix_table (
  s1 string,
  i1 int,
  f1 float,
  d1 double
)
STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES (
  "phoenix.table.name" = "phoenix_table",
  "phoenix.zookeeper.quorum" = "hadoop01",
  "phoenix.zookeeper.znode.parent" = "/hbase",
  "phoenix.zookeeper.client.port" = "2181",
  "phoenix.rowkeys" = "s1, i1",
  "phoenix.column.mapping" = "s1:s1, i1:i1, f1:f1, d1:d1",
  "phoenix.table.options" = "SALT_BUCKETS=10, DATA_BLOCK_ENCODING='DIFF'"
);

The table is created successfully. Querying Phoenix and HBase shows that the corresponding table has been generated in both.

Properties

phoenix.table.name

Specifies the Phoenix table name.

Default value: the same as the Hive table name

phoenix.zookeeper.quorum

Specifies the ZooKeeper address.

Default value: localhost

phoenix.zookeeper.znode.parent

Specifies the HBase parent znode in ZooKeeper.

Default value: /hbase

phoenix.zookeeper.client.port

Specifies the ZooKeeper port.

Default value: 2181

phoenix.rowkeys

Specifies the rowkey columns of the Phoenix table, that is, the rowkey of the HBase table.

Required

phoenix.column.mapping

Column mapping between Hive and Phoenix.
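
The property list above can be turned into a small generator for the Hive DDL. The following is only a sketch: render_create_table and its parameter names are hypothetical helpers invented for illustration, the defaults are the ones listed above, and the column mapping is always the identity mapping (each Hive column maps to a Phoenix column of the same name).

```python
def render_create_table(table, columns, rowkeys, zookeeper_quorum="localhost",
                        znode_parent="/hbase", client_port=2181, external=False):
    """Render a Hive CREATE TABLE statement that uses PhoenixStorageHandler.

    columns: list of (name, hive_type) pairs
    rowkeys: list of column names that form the Phoenix primary key
    """
    # phoenix.rowkeys has no default, so refuse to build a statement without it.
    if not rowkeys:
        raise ValueError("phoenix.rowkeys is required")
    names = [name for name, _ in columns]
    unknown = [key for key in rowkeys if key not in names]
    if unknown:
        raise ValueError(f"rowkeys not among the table columns: {unknown}")

    properties = {
        "phoenix.table.name": table,
        "phoenix.zookeeper.quorum": zookeeper_quorum,
        "phoenix.zookeeper.znode.parent": znode_parent,
        "phoenix.zookeeper.client.port": str(client_port),
        "phoenix.rowkeys": ", ".join(rowkeys),
        # Identity mapping: Hive column name == Phoenix column name.
        "phoenix.column.mapping": ", ".join(f"{n}:{n}" for n in names),
    }
    column_lines = ",\n".join(f"  {n} {t}" for n, t in columns)
    # Properties are separated by commas; only the last one has none.
    property_lines = ",\n".join(f'  "{k}" = "{v}"' for k, v in properties.items())
    keyword = "create external table" if external else "create table"
    return (
        f"{keyword} {table} (\n{column_lines}\n)\n"
        "STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'\n"
        f"TBLPROPERTIES (\n{property_lines}\n);"
    )


if __name__ == "__main__":
    print(render_create_table(
        "phoenix_table",
        [("s1", "string"), ("i1", "int"), ("f1", "float"), ("d1", "double")],
        ["s1", "i1"],
        zookeeper_quorum="hadoop01",
    ))
```

The generated text can be pasted into the Hive CLI. Table options such as SALT_BUCKETS are deliberately left out to keep the sketch short.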

Insert data

Import data using the hive test table pokes

> insert into table phoenix_table select bar,foo,12.3 as fl,22.2 as dl from pokes

The insert succeeds, and the data can be queried from Hive.

The data can also be queried in Phoenix.

You can also use Phoenix to import data, as explained on the official website.

Note: Phoenix 4.8 treats the added table keyword as a syntax error. I did not try other versions, and I do not know why this is not explained on the official website.

Create an external table

For external tables, Hive works with an existing Phoenix table and manages only the Hive metadata. Deleting an external table from Hive deletes only the Hive metadata and keeps the Phoenix table.

First create a table in Phoenix:

phoenix> create table PHOENIX_TABLE_EXT (aa varchar not null primary key, bb varchar);

Then create an external table in Hive:

create external table phoenix_table_ext_1 (aa string, bb string)
STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES (
  "phoenix.table.name" = "phoenix_table_ext",
  "phoenix.zookeeper.quorum" = "hadoop01",
  "phoenix.zookeeper.znode.parent" = "/hbase",
  "phoenix.zookeeper.client.port" = "2181",
  "phoenix.rowkeys" = "aa",
  "phoenix.column.mapping" = "aa:aa, bb:bb"
);

The table is created successfully, and inserts into it succeed as well.

The following options can be set in the Hive CLI.

Performance tuning

phoenix.upsert.batch.size

Batch size for upserts.

Default value: 1000

[phoenix-table-name].disable.wal

Temporarily sets the table property DISABLE_WAL = true. Can be used to improve performance.

Default value: false

[phoenix-table-name].auto.flush

When the WAL is disabled and this value is true, the MemStore is flushed to an HFile.

Default value: false
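
Because two of these options are prefixed with the Phoenix table name, it can help to see how the [phoenix-table-name] placeholder expands into concrete Hive CLI statements. The sketch below is illustrative only: load_tuning_statements is a hypothetical helper, and the disable.wal and auto.flush suffixes are taken from the Phoenix storage handler documentation as an assumption that should be checked against the Phoenix version in use.

```python
def load_tuning_statements(phoenix_table, batch_size=1000,
                           disable_wal=False, auto_flush=False):
    """Return Hive CLI "set" statements for the load-time tuning options.

    The defaults mirror the documented defaults, so calling this with only a
    table name reproduces the out-of-the-box behaviour.
    """
    def as_flag(value):
        # Hive expects lowercase true/false, not Python's True/False.
        return "true" if value else "false"

    return [
        f"set phoenix.upsert.batch.size={int(batch_size)};",
        # Table-scoped options: the placeholder is replaced by the table name.
        f"set {phoenix_table}.disable.wal={as_flag(disable_wal)};",
        f"set {phoenix_table}.auto.flush={as_flag(auto_flush)};",
    ]


if __name__ == "__main__":
    for statement in load_tuning_statements("phoenix_table",
                                            batch_size=5000, disable_wal=True):
        print(statement)
```

The printed statements would be run in the Hive session before the insert. Whether disabling the WAL is acceptable depends on how much data loss the load can tolerate.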

Query data

You can use HiveQL to query data in a Phoenix table. With hive.fetch.task.conversion=more and hive.exec.parallel=true, a simple single-table query can be as fast as in the Phoenix CLI.

hbase.scan.cache

Number of rows read per unit request.

Default value: 100

hbase.scan.cacheblock

Whether to cache blocks.

Default value: false

split.by.stats

If true, mappers use table statistics, with one mapper per guide post.

Default value: false

[hive-table-name].reducer.count

Number of reducers. In Tez mode this affects only single-table queries. See the Limitations section on the official website.

Default value: 1

[phoenix-table-name].query.hint

Hint for the Phoenix query (like NO_INDEX).

Problems encountered:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hbase.client.Scan.isReversed()Z

At first I used hbase-0.96.2-hadoop2, which could not be integrated: this setup requires the hbase-client-0.98.21-hadoop2.jar package. Replacing that jar resolved the error above, but the following error was still reported.

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:ERROR 103 (08004): Unable to establish connection.

So I replaced HBase itself with version 0.98.21, and then it worked.
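
Both errors above came down to an HBase version older than the one the Phoenix build targets (4.8.0-HBase-0.98). A quick way to spot this is to look at the HBase jar names on the classpath. The sketch below is a hypothetical helper, not part of Hive or Phoenix; the 0.98.0 threshold is an assumption drawn from the experience described above (0.96.2 failed, 0.98.21 worked), and the jar names are examples.

```python
import re

# Matches names such as "hbase-client-0.98.21-hadoop2.jar" or "hbase-0.96.2-hadoop2".
_HBASE_VERSION = re.compile(r"^hbase(?:-[a-z][a-z0-9]*)*-(\d+)\.(\d+)\.(\d+)")


def hbase_version(name):
    """Extract (major, minor, patch) from an HBase artifact name, or None."""
    match = _HBASE_VERSION.match(name)
    return tuple(int(part) for part in match.groups()) if match else None


def older_than(names, minimum=(0, 98, 0)):
    """Return the HBase artifacts whose version is below `minimum`.

    Names that are not HBase artifacts (for example Phoenix jars) are ignored.
    """
    flagged = []
    for name in names:
        version = hbase_version(name)
        if version is not None and version < minimum:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    jars = [
        "hbase-client-0.96.2-hadoop2.jar",
        "hbase-client-0.98.21-hadoop2.jar",
        "phoenix-core-4.8.0-HBase-0.98.jar",
    ]
    print(older_than(jars))
```

In practice the list of names would come from a directory listing of $hive/lib/ and the HBase lib directory.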

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.StringIndexOutOfBoundsException: String index out of range: -1

This happens because the Hive fields and the Phoenix fields do not correspond:

create table phoenix_table_3 (a string, b int) STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler' TBLPROPERTIES ("phoenix.table.name" = "phoenix_table_3", "phoenix.zookeeper.quorum" = "hadoop01", "phoenix.zookeeper.znode.parent" = "/hbase", "phoenix.zookeeper.client.port" = "2181", "phoenix.rowkeys" = "a1", "phoenix.column.mapping" = "a:a1, b:b1", "phoenix.table.options" = "SALT_BUCKETS=10, DATA_BLOCK_ENCODING='DIFF'")

Solution: make the Hive table fields the same as the Phoenix fields.

In the next case the table is created successfully and inserts also succeed, but a Hive query reports that column A1 cannot be found, because the column in Phoenix is named aa.

Failed with exception java.io.IOException:java.lang.RuntimeException: org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): Undefined column. ColumnName=A1

create external table phoenix_table_ext (a1 string, b1 string) STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler' TBLPROPERTIES ("phoenix.table.name" = "phoenix_table_ext", "phoenix.zookeeper.quorum" = "hadoop01", "phoenix.zookeeper.znode.parent" = "/hbase", "phoenix.zookeeper.client.port" = "2181", "phoenix.rowkeys" = "aa", "phoenix.column.mapping" = "a1:aa, b1:bb")

Solution: as above, make the Hive fields the same as the Phoenix fields.
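
The same-name rule used to fix both failures can be checked before running the DDL. The sketch below is a hypothetical pre-flight check, not a Phoenix API: it encodes only the workaround found in this article for Phoenix 4.8 (Hive field names equal to Phoenix field names, rowkeys drawn from the Hive columns) and is not a general statement of what the storage handler accepts.

```python
def parse_mapping(mapping):
    """Parse a phoenix.column.mapping value such as "a:a1, b:b1" into a dict."""
    pairs = {}
    for item in mapping.split(","):
        if not item.strip():
            continue  # tolerate empty entries and trailing commas
        hive_name, _, phoenix_name = item.strip().partition(":")
        pairs[hive_name.strip()] = phoenix_name.strip()
    return pairs


def check_field_names(hive_columns, rowkeys, mapping):
    """Return a list of problems; an empty list means the names line up."""
    problems = []
    pairs = parse_mapping(mapping)
    for column in hive_columns:
        if column not in pairs:
            problems.append(f"Hive column {column} has no mapping entry")
        elif pairs[column].lower() != column.lower():
            problems.append(
                f"Hive column {column} maps to Phoenix column {pairs[column]}")
    known = {column.lower() for column in hive_columns}
    for key in rowkeys.split(","):
        if key.strip().lower() not in known:
            problems.append(f"rowkey {key.strip()} is not a Hive column")
    return problems


if __name__ == "__main__":
    # The failing definition of phoenix_table_3 from above.
    for problem in check_field_names(["a", "b"], "a1", "a:a1, b:b1"):
        print(problem)
    # The working external table definition reports no problems.
    print(check_field_names(["aa", "bb"], "aa", "aa:aa, bb:bb"))
```

Names are compared case-insensitively because the Phoenix error message above reports the column in upper case (A1).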
