The ImportTsv tool runs as a MapReduce job, so YARN must be started first. The tool also depends on several jar packages, so take care to configure the classpath. By default, ImportTsv inserts data through the HBase Put API.
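For example, the prerequisites can be satisfied like this (a minimal sketch; it assumes the PATH configured in the .bash_profile shown below, and uses HADOOP_CLASSPATH as one common way to put the HBase jars on the MapReduce classpath):
start-yarn.sh
export HADOOP_CLASSPATH=$(hbase classpath)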
[hadoop-user@rhel work]$ cat /home/hadoop-user/.bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
JAVA_HOME=/usr/java/jdk1.8.0_171-amd64
PATH=$PATH:$JAVA_HOME/bin
CLASSPATH=$CLASSPATH:$JAVA_HOME/lib
HADOOP_HOME=/home/hadoop-user/hadoop-2.8.0
PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib
HBASE_HOME=/home/hadoop-user/hbase-2.0.0
PATH=$PATH:$HBASE_HOME/bin
CLASSPATH=$CLASSPATH:$HBASE_HOME/lib
ZOOKEEPER_HOME=/home/hadoop-user/zookeeper-3.4.12
PATH=$PATH:$ZOOKEEPER_HOME/bin
PHOENIX_HOME=/home/hadoop-user/apache-phoenix-5.0.0-alpha-HBase-2.0-bin
PATH=$PATH:$PHOENIX_HOME/bin
export PATH
Create a table
hbase(main):033:0> create 'test','cf'
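To confirm the table and its column family before importing, an optional check in the HBase shell (not part of the original steps) is:
hbase(main):034:0> describe 'test'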
Create a file to import
[hadoop-user@rhel work]$ cat /home/hadoop-user/work/sample1.csv
row10,"mjj10"
row11,"mjj11"
row12,"mjj12"
row14,"mjj13"
Put the file into HDFS
[hadoop-user@rhel work]$ hdfs dfs -put /home/hadoop-user/work/sample1.csv /sample1.csv
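The upload can be verified before running the import, for example:
[hadoop-user@rhel work]$ hdfs dfs -ls /sample1.csv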
ImportTsv Import Command
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.columns=HBASE_ROW_KEY,cf:a test /sample1.csv
Note: HBASE_ROW_KEY marks the position of the row key in the file; the entries that follow define the columns. Here the imported column family is cf and the column qualifier is a. The file to import is /sample1.csv in HDFS.
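Once the MapReduce job finishes, a quick scan in the HBase shell (a sanity check added here, not part of the original walkthrough) should show the imported rows under cf:a:
hbase(main):035:0> scan 'test'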
Explanation from the tool's help output
Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
Imports the given input directory of TSV data into the specified table.
The column names of the TSV data must be specified using the -Dimporttsv.columns
option. This option takes the form of comma-separated column names, where each
column name is either a simple column family, or a columnfamily:qualifier. The special
column name HBASE_ROW_KEY is used to designate that this column should be used
as the row key for each imported record. You must specify exactly one column
to be the row key, and you must specify a column name for every column that exists in the
input data. Another special column HBASE_TS_KEY designates that this column should be
used as timestamp for each record. Unlike HBASE_ROW_KEY, HBASE_TS_KEY is optional.
You must specify at most one column as timestamp key for each imported record.
Record with invalid timestamps (blank, non-numeric) will be treated as bad record.
Note: if you use this option, then 'importtsv.timestamp' option will be ignored.
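As an illustration only (this variation is not used in this article), if the input file carried a numeric timestamp as its second field, the columns option could be written as:
-Dimporttsv.columns=HBASE_ROW_KEY,HBASE_TS_KEY,cf:a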
Note: the content imported by ImportTsv cannot be seen by Phoenix. In fact, tables created directly in HBase are not visible to Phoenix by default. Tables created by Phoenix can be seen from HBase, but their content is encoded.
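If Phoenix needs to query a table that was created directly in HBase, one common workaround (not used in this article) is to map the table with a Phoenix view; a sketch in sqlline, assuming the table and column names used above:
0: jdbc:phoenix:rhel> CREATE VIEW "test" ( pk VARCHAR PRIMARY KEY, "cf"."a" VARCHAR );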
The importtsv tool uses the HBase Put API to import data by default. When the option -Dimporttsv.bulk.output is used, it instead writes the data as HBase internal-format (HFile) files.
The importtsv tool, by default, uses the HBase Put API to insert data into the HBase table using TableOutputFormat in its map phase. But when the -Dimporttsv.bulk.output option is specified, it instead generates HBase internal format (HFile) files on HDFS by using HFileOutputFormat. We can then use the completebulkload tool to load the generated files into a running cluster. The following steps use the bulk output and load tools:
Command to generate files in HFile format
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.bulk.output=/hfiles_tsv -Dimporttsv.columns=HBASE_ROW_KEY,cf:a test /sample1.csv
Note: this generates files in HFile format and stores them in the /hfiles_tsv directory in HDFS. The directory is created by the command itself.
[hadoop-user@rhel work]$ hdfs dfs -ls /hfiles_tsv/cf
18/06/28 10:49:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   1 hadoop-user supergroup       5125 2018-06-28 10:40 /hfiles_tsv/cf/0e466616d42a4a128fb60caa7dbe075a
Note: the file name 0e466616d42a4a128fb60caa7dbe075a follows the same naming format as the region names shown in the HBase web UI.
Running
hadoop jar hbase-server-2.0.0.jar completebulkload /hfiles_tsv 'test'
throws an exception: Exception in thread "main" java.lang.ClassNotFoundException: completebulkload
The HBase documentation describes two ways to invoke this utility:
There are two ways to invoke this utility, with explicit classname and via the driver:
Explicit Classname
$ bin/hbase org.apache.hadoop.hbase.tool.LoadIncrementalHFiles <hdfs://storefileoutput> <tablename>
Driver
HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar completebulkload <hdfs://storefileoutput> <tablename>
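Applied to this example, the explicit-classname form would be (assuming the HFiles were generated under /hfiles_tsv as above and the table is named test):
hbase org.apache.hadoop.hbase.tool.LoadIncrementalHFiles /hfiles_tsv test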