
The pit encountered by Federated HDFS+beeline+hiveserver2

2025-02-23 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

Pit encountered:

1. A Hive job moves data from a temporary directory into the data warehouse directory. By default Hive uses /tmp as the temporary directory, and users typically use /user/hive/warehouse/ as the warehouse directory. Under Federated HDFS, /tmp and /user are two different ViewFS mount-table entries, so the job ends up moving data across two mount points. Federated HDFS does not support renames across mount points, so the job fails.
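To make the failure mode concrete, here is a minimal, self-contained sketch (not the real ViewFileSystem API; the mount table, namespace URIs, and method names are invented for illustration) of how ViewFS resolves a path to a mount target and why a move from /tmp to /user is rejected when the two mount points back onto different namespaces:

```java
import java.net.URI;
import java.util.Map;

public class MountPointCheck {
    // Hypothetical mount table: mount point -> backing namespace URI.
    // Under federation, /tmp and /user may live on different namenodes.
    static final Map<String, URI> MOUNT_TABLE = Map.of(
        "/tmp",  URI.create("hdfs://nn1:8020/tmp"),
        "/user", URI.create("hdfs://nn2:8020/user"));

    // Resolve a path to the URI of the namespace backing its mount point.
    static URI resolve(String path) {
        for (Map.Entry<String, URI> e : MOUNT_TABLE.entrySet()) {
            if (path.startsWith(e.getKey())) {
                return e.getValue();
            }
        }
        throw new IllegalArgumentException("no mount point for " + path);
    }

    // Mirrors the spirit of the check in ViewFileSystem.rename:
    // a rename is only allowed when source and destination resolve
    // to the same target namespace.
    static boolean renameAllowed(String src, String dst) {
        return resolve(src).equals(resolve(dst));
    }

    public static void main(String[] args) {
        // Hive's staging-to-warehouse move crosses mount points: rejected.
        System.out.println(
            renameAllowed("/tmp/.hive-staging", "/user/hive/warehouse/t")); // false
        // A move within one mount point is fine.
        System.out.println(renameAllowed("/user/a", "/user/b")); // true
    }
}
```

This is why the workarounds below all amount to keeping the staging directory and the warehouse directory on the same mount point.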

Error message:

ERROR: Failed with exception Unable to move source viewfs://cluster9/tmp/.hive-staging_hive_2015-07-29_12-34-11_306_6082682065011532871-5/-ext-10002 to destination viewfs://cluster9/user/hive/warehouse/tandem.db/cust_loss_alarm_unit

org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source viewfs://cluster9/tmp/warehouse/.hive-staging_hive_2015-07-29_12-34-11_306_6082682065011532871-5/-ext-10002 to destination viewfs://cluster9/user/hive/warehouse/tandem.db/cust_loss_alarm_unit
    at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2521)
    at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:105)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:222)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144)
    at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
    at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Renames across Mount points not supported
    at org.apache.hadoop.fs.viewfs.ViewFileSystem.rename(ViewFileSystem.java:444)
    at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2509)
    ... 21 more

Related code:

org.apache.hadoop.fs.viewfs.ViewFileSystem:

    /*
    // Alternate 1: renames within same file system - valid but we disallow
    // Alternate 2: (as described in next para - valid but we have disallowed it
    //
    // Note we compare the URIs. The URIs include the link targets.
    // hence we allow renames across mount links as long as the mount links
    // point to the same target.
    if (!resSrc.targetFileSystem.getUri().equals(
        resDst.targetFileSystem.getUri())) {
      throw new IOException("Renames across Mount points not supported");
    }
    */

    //
    // Alternate 3: renames ONLY within the same mount links.
    //
    if (resSrc.targetFileSystem != resDst.targetFileSystem) {
      throw new IOException("Renames across Mount points not supported");
    }

Workaround:

A. Create a /user/hive/warehouse/staging directory in HDFS and grant it 777 permissions.

Then add the configuration:

<property>
  <name>hive.exec.stagingdir</name>
  <value>/user/hive/warehouse/staging/.hive-staging</value>
</property>

B. Alternatively, create only one mount point, such as /cluster, create directories such as /tmp and /user under that mount point, and then change the default values of the Hive-related directories accordingly.

2. When a query returns a large result set, the beeline client hangs or runs out of memory

Error message:

org.apache.thrift.TException: Error in calling method FetchResults
    at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1271)
    at com.sun.proxy.$Proxy0.FetchResults(Unknown Source)
    at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:363)
    at org.apache.hive.beeline.BufferedRows.<init>(BufferedRows.java:42)
    at org.apache.hive.beeline.BeeLine.print(BeeLine.java:1756)
    at org.apache.hive.beeline.Commands.execute(Commands.java:806)
    at org.apache.hive.beeline.Commands.sql(Commands.java:665)
    at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:974)
    at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:810)
    at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:767)
    at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:480)
    at org.apache.hive.beeline.BeeLine.main(BeeLine.java:463)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.lang.Double.valueOf(Double.java:521)

Workaround:

Reading the source shows that beeline has two modes for fetching the result set: incremental mode and buffer mode.

org.apache.hive.beeline.BeeLine:

    int print(ResultSet rs) throws SQLException {
      String format = getOpts().getOutputFormat();
      OutputFormat f = (OutputFormat) formats.get(format);

      if (f == null) {
        error(loc("unknown-format", new Object[] {
            format, formats.keySet()}));
        f = new TableOutputFormat(this);
      }

      Rows rows;

      if (getOpts().getIncremental()) {
        rows = new IncrementalRows(this, rs); // incremental mode
      } else {
        rows = new BufferedRows(this, rs); // buffer mode
      }

      return f.print(rows);
    }

org.apache.hive.beeline.BeeLineOpts:

    private boolean incremental = false; // defaults to buffer mode
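The memory behavior of the two modes can be sketched with a minimal, self-contained example (the class and method names here are invented for illustration, not Beeline's real classes): buffer mode materializes every row before printing, which is exactly what exhausts the heap on a large result set, while incremental mode only ever holds one row at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class FetchModes {
    // Buffer mode: copy the entire result set into memory first,
    // so memory usage is proportional to the result-set size.
    static List<String> buffered(Iterator<String> rows) {
        List<String> all = new ArrayList<>();
        while (rows.hasNext()) {
            all.add(rows.next());
        }
        return all; // printing happens only after everything is held in RAM
    }

    // Incremental mode: consume and emit one row at a time,
    // keeping memory usage constant regardless of result-set size.
    static int incremental(Iterator<String> rows) {
        int printed = 0;
        while (rows.hasNext()) {
            rows.next(); // a real client would print the row here, then drop it
            printed++;
        }
        return printed;
    }

    public static void main(String[] args) {
        List<String> resultSet = List.of("row1", "row2", "row3");
        System.out.println(buffered(resultSet.iterator()).size());  // 3
        System.out.println(incremental(resultSet.iterator()));      // 3
    }
}
```

Both modes see the same rows; the difference is only how many are alive on the heap at once, which is why switching to incremental mode avoids the OutOfMemoryError above.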

However, beeline --help lists no option for this mode:

beeline --help

Usage: java org.apache.hive.cli.beeline.BeeLine
   -u <database url>               the JDBC URL to connect to
   -n <username>                   the username to connect as
   -p <password>                   the password to connect as
   -d <driver class>               the driver class to use
   -i <init file>                  script file for initialization
   -e <query>                      query that should be executed
   -f <exec file>                  script file that should be executed
   -w (or) --password-file <file>  the password file to read password from
   --hiveconf property=value       Use value for given property
   --hivevar name=value            hive variable name and value
                                   This is Hive specific settings in which variables
                                   can be set at session level and referenced in Hive
                                   commands or queries.
   --color=[true/false]            control whether color is used for display
   --showHeader=[true/false]       show column names in query results
   --headerInterval=ROWS           the interval between which heades are displayed
   --fastConnect=[true/false]      skip building table/column list for tab-completion
   --autoCommit=[true/false]       enable/disable automatic transaction commit
   --verbose=[true/false]          show verbose error messages and debug info
   --showWarnings=[true/false]     display connection warnings
   --showNestedErrs=[true/false]   display nested errors
   --numberFormat=[pattern]        format numbers using DecimalFormat pattern
   --force=[true/false]            continue running script even after errors
   --maxWidth=MAXWIDTH             the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH    the maximum width to use when displaying columns
   --silent=[true/false]           be more silent
   --autosave=[true/false]         automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for result display
                                   Note that csv, and tsv are deprecated - use csv2, tsv2 instead
   --truncateTable=[true/false]    truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER     specify the delimiter for delimiter-separated values output format (default: |)
   --isolation=LEVEL               set the transaction isolation level
   --nullemptystring=[true/false]  set to true to get historic behavior of printing null as empty string
   --help                          display this message
Beeline version 1.1.0-cdh6.4.3 by Apache Hive

It doesn't matter, though: the flag is still accepted even though it is undocumented.

beeline -u jdbc:hive2://10.17.28.173:10000 -n xxxx -p xxxx --incremental=true still enters incremental mode.
