Introduction to Analysis of Hive (1) 04/17 Update SLTechnology News&Howtos

Introduction to Analysis of Hive (1)

2025-04-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

1 introduction to Hive

1.1 Hive definition

Hive is a data warehouse tool based on Hadoop, which can map structured data files to a database table and provide SQL-like query functions.

The essence is to convert SQL into MapReduce program.

1.2 Why use Hive

1. Problems faced

The cost of personnel learning is too high.

The project cycle requirement is too short.

I just need a simple environment.

How to deal with MapReduce

Complex query is very difficult.

How to implement Join

2. Why use Hive

The operation interface adopts SQL-like syntax to provide the ability of rapid development.

Avoid writing MapReduce and reduce the learning cost of developers.

It is convenient to expand the function.

1.3Features of Hive

1. Scalable

Hive is free to expand the size of the cluster. Generally, there is no need to restart the service.

2. Malleability

Hive supports user-defined functions. Users can implement their own functions according to their own needs.

3. Fault tolerance

Good fault tolerance, SQL can still be executed if there is a problem with the node.

1.4 the relationship between Hive and Hadoop

1.5 the relationship between Hive and traditional databases

1.6 History of Hive

Implemented by FaceBook and open source

In March 2011, version 0.7.0 was released, which is a major upgrade, with many advanced features such as simple indexing, HAING and so on.

In June 2011, version 0.7.1 was released, fixing some BUG issues, such as using JDBC on Windows

In December 2011, version 0.8.0 was released, which is a major upgrade, adding many advanced features such as insert into, HA, and so on.

February 5, 2012, version 0.8.1 was released, fixed some BUG, such as enabling Hive to run on both Hadoop0.20.x and 0.23.0

On April 30, 2012, version 0.9.0 was released with major improvements, adding support for Hadoop 1.0.0, implementation of BETWEEN and other features.

1.7 Future Development of Hive

Add more functions similar to traditional databases, such as stored procedures

Improve the performance of converted MapReduce

The ability to have a real data warehouse

Partial reinforcement of UI

2 Software preparation and environmental planning

2.1 introduction to the Hadoop environment

Hadoop installation path: / home/test/Desktop/hadoop-1.0.0/

Hadoop metadata storage directory: / home/test/data/core/namenode

Hadoop data storage path: / home/test/data/core/datanode

Hive installation path: / home/test/Desktop/

Hive data storage path: / user/hive/warehouse

Hive metadata

Third-party database: derby mysql

2.2 Software preparation

Ubuntu

JDK

Java 1.6.0_27

Hadoop

Hadoop-1.0.0.tar

Hive

Hive-0.8.1.tar

2.3 Project structure

2.4 introduction to Hive profile

1. Introduction of Hive configuration file

Configuration file for hive-site.xml hive

Running environment file for hive-env.sh hive

Hive-default.xml.template default template

Hive-env.sh.template hive-env.sh default configuration

Hive-exec-log4j.properties.template exec default configuration

Hive-log4j.properties.template log default configuration

2 、 hive-site.xml

< property>

Javax.jdo.option.ConnectionURLjdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true JDBC connectstring for a JDBC metastorejavax.jdo.option.ConnectionDriverName com.mysql.jdbc.Driver Driver classname for a JDBC metastorejavax.jdo.option.ConnectionUserName root username touse against metastore databasejavax.jdo.option.ConnectionPassword test password touse against metastore database

3 、 hive-env.sh

Configure the configuration file path for Hive: export HIVE_CONF_DIR= your path

Configure the installation path of Hadoop: HADOOP_HOME=your hadoop home

2.5 installation using the Derby database

1. What is the Derby installation method

ApacheDerby is a database written entirely in java, so it can be cross-platform, but it needs to be run in JVM

Derby is an Open source product based on Apache License 2.0 distribution

Metadata is stored in the Derby database, which is also the default installation of Hive.

2. Install Hive

Decompress Hive:tar zxvf hive-0.8.1.tar / home/test/Desktop

Establish a soft connection: ln-s hive-0.8.1 hive

Add environment variabl

Export HIVE_HOME=/home/test/Desktop/hiveexport PATH=... .HIVE _ HOME/bin:$PATH:.

3. Configure Hive

Enter the hive/conf directory

Create a hive-env.sh file based on hive-env.sh.template

Cp hive-env.sh.template hive-env.sh

Modify hive-env.sh

Specify the path to the hive configuration file

Export HIVE_CONF_DIR=/home/test/Desktop/hive/conf

Specify Hadoop path

HADOOP_HOME=/home/test/Desktop/hadoop

4 、 hive-site.xml

Javax.jdo.option.ConnectionURLjdbc:derby:;databaseName=metastore_db;create=true JDBCconnect string for a JDBC metastore javax.jdo.option.ConnectionDriverName org.apache.derby.jdbc.EmbeddedDriver Driver class name for a JDBC metastore javax.jdo.option.ConnectionUserName APP username to use against metastoredatabase javax.jdo.option.ConnectionPassword mine password to use against metastoredatabase

5. Start hive

Command line type

Hive

Display

WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Pleaseuse org.apache.hadoop.log.metrics.EventCounter in all the log4j.propertiesfiles.Logging initialized using configuration injarvus filebank bank Homebank test historyfile=/tmp/test/hive_job_log_test_201208260529_167273830.txthive Desktopstop HiveMays 0.8.1, HiveMutel 0.8.1, HiveMattel 0.8.1.jarbank, hiveMux log4j.propertiesHive,

5. Test statement

Set up the test table test

Createtable test (key string); showtables

2.6 installation using the MySQL database

1. Install MySQL

Ubuntu is installed with apt-get

Sudo apt-get install mysql-server

Set up the database hive

Create database hive

Create a hive user and authorize

Grant all on hive.* to hive@'%' identified by 'hive';flush privileges

2. Install Hive

Extract the Hive:

Tar zxvf hive-0.8.1.tar / home/test/Desktop

Establish a soft connection:

Ln-s hive-0.8.1 hive

Add environment variabl

ExportHIVE_HOME=/home/test/Desktop/hiveexportPATH=... .HIVE _ HOME/bin:$PATH:.

3. Modify hive-site.xml

Javax.jdo.option.ConnectionURL jdbc:mysql://localhost:3306/hive javax.jdo.option.ConnectionDriverName com.mysql.jdbc.Driver javax.jdo.option.ConnectionPassword hive hive.hwi.listen.port 9999 This is the port the Hive Web Interface will listenon datanucleus.autoCreateSchema false datanucleus.fixedDatastore true hive.metastore.local true controls whether toconnect to remove metastore server or open a new metastore server in HiveClient JVM

4. Start Hive

On the command line, type: Hive

Display

WARNING: org.apache.hadoop.metrics.jvm.EventCounter isdeprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all thelog4j.properties files.Logging initialized using configuration injarvus filebank bank Homebank test historyfile=/tmp/test/hive_job_log_test_201208260529_167273830.txthive Desktopstop HiveMays 0.8.1, HiveMutel 0.8.1, HiveMattel 0.8.1.jarbank, hiveMux log4j.propertiesHive,

5. Test statement

Set up the test table test

Create table test (key string); show tables

3 Hive built-in operators and function development

3.1 Relational operator

Equivalent comparison: =

Unequal value comparison:

Less than comparison: =

Null value judgment: IS NULL

Non-empty judgment: IS NOT NULL

LIKE comparison: LIKE

LIKE operation of JAVA: RLIKE

REGEXP operation: REGEXP

Equivalent comparison: =

Grammar: axib

Operation types: all basic types

Description: TRUE; if expression An is equal to expression B, otherwise FALSE

For example: hive > select 1 from dual where 1

Unequal value comparison:

Grammar: a B

Operation types: all basic types

Description: if expression An is NULL, or expression B is NULL, return NULL; if expression An and expression B are not equal, TRUE; otherwise FALSE

For example: hive > select 1 from dual where 1 2

Less than comparison: select 1 from dual where 1

< 2; 小于等于比较: = B 操作类型: 所有基本类型描述: 如果表达式A为NULL，或者表达式B为NULL，返回NULL；如果表达式A大于或者等于表达式B，则为TRUE；否则为FALSE 举例：hive>

Select 1 from dual where 1 > = 1

Null value judgment: IS NULL

Syntax: An IS NULL

Operation types: all types

Description: TRUE; if the value of expression An is NULL, otherwise FALSE

For example: hive > select 1 from dual where null is null

Non-empty judgment: IS NOT NULL

Syntax: An IS NOT NULL

Operation types: all types

Description: FALSE; if the value of expression An is NULL, otherwise TRUE

For example: hive > select 1 from dual where 1 is not null

LIKE comparison: LIKE

Syntax: A LIKE B

Operation type: strings

Description: returns NULL; if string An or string B is NULL. If string A conforms to the regular syntax of expression B, it is TRUE; otherwise FALSE. The character "_" in B represents any single character, while the character "%" represents any number of characters.

For example: hive > select 1 from dual where 'key' like' foot%'

Hive > select 1 from dual where 'key' like'foot____'

Note: use NOT A LIKE B when negative comparison

Hive > select 1 from dual where NOT 'key' like 'fff%'

LIKE operation of JAVA: RLIKE

Syntax: A RLIKE B

Operation type: strings

Description: returns NULL; if string An or string B is NULL. If string A conforms to the regular syntax of JAVA regular expression B, TRUE; otherwise FALSE.

For example: hive > select 1 from dual where 'footbar' rlike' ^ f.

Note: determine whether a string is all numeric:

Hive > select 1 from dual where '123456' rlike' ^\\ dbath'

Hive > select 1 from dual where '123456aaa' rlike' ^\\ dbath'

REGEXP operation: REGEXP

Syntax: A REGEXP B

Operation type: strings

Description: the function is the same as RLIKE

For example: hive > select 1 from dual where 'key' REGEXP' ^ f.

3.2 logical operation and mathematical operation

Addition operation: +

Subtraction operation:-

Multiplication: *

Division operation: /

Residual operation:%

Bit and operation: &

Bit or operation: |

Bit XOR operation: ^

Bit inversion operation: ~

Logic and operation: AND

Logic or operation: OR

Logical non-operation: NOT

Take the whole function: round

Specified precision integer function: round

Take the whole function down: floor

Take the whole function up: ceil

Take the whole function up: ceiling

Take random number function: rand

Natural exponential function: exp

Logarithmic function with base 10: log10

Logarithmic function with base 2: log2

Logarithmic function: log

Exponentiation function: pow

Exponentiation function: power

Square function: sqrt

Binary function: bin

Hexadecimal function: hex

Reverse hexadecimal function: unhex

Binary conversion function: conv

Absolute value function: abs

Positive remainder function: pmod

Sine function: sin

Arcsine function: asin

CoSine function: cos

Inverse cosine function: acos

Positive function: positive

Negative function: negative

UNIX timestamp date function: from_unixtime

Get the current UNIX timestamp function: unix_timestamp

Date to UNIX timestamp function: unix_timestamp

Specified format date to UNIX timestamp function: unix_timestamp

Date time to date function: to_date

Date transfer function: year

Date-to-month function: month

Date-to-day function: day

Date-to-hour function: hour

Date to minute function: minute

Date to second function: second

Date transfer function: weekofyear

Date comparison function: datediff

Date increment function: date_add

Date reduction function: date_sub

If function: if

Non-empty lookup function: COALESCE

Conditional judgment function: CASE

String length function: length

String inversion function: reverse

String concatenation function: concat

Delimited string concatenation function: concat_ws

String interception function: substr,substring

String to uppercase function: upper,ucase

String to lowercase function: lower,lcase

Unspace function: trim

Go to the left space function: ltrim

The right-hand space function: rtrim

Regular expression replacement function: regexp_replace

Regular expression parsing function: regexp_extract

URL analytic function: parse_url

Json analytic function: get_json_object

Space string function: space

Repeating string function: repeat

First character ascii function: ascii

Left complement function: lpad

Right complement function: rpad

Split string function: split

Collection lookup function: find_in_set

Map type construction: map

Struct type construction: struct

Array type construction: array

Array type access: a [n]

Map type access: M [key]

Struct type access: S.X

Map type length function: size (Map)

Array type length function: size (Array)

Type conversion function

1. Addition operation: +

Grammar: a + B

Operation types: all numeric types

Description: returns the result of the addition of An and B. The numeric type of the result is equal to the type of An and the smallest parent of the type of B. (see the inheritance relationship of the data type for details). For example, the general result of int + int is int, while the general result of int + double is double.

For example: hive > select 1 + 9 from dual; 10

2. Subtraction operation:-

Grammar: a-B

Operation types: all numeric types

Description: returns the result of subtracting An and B. The numeric type of the result is equal to the type of An and the smallest parent of the type of B. (see the inheritance relationship of the data type for details). For example, the general result of int-int is int, while the general result of int-double is double.

For example: hive > select 10-5 from dual;5

3. Multiplication operation: *

Grammar: a * B

Operation types: all numeric types

Description: returns the result of multiplying An and B. The numeric type of the result is equal to the type of An and the smallest parent of the type of B. (see the inheritance relationship of the data type for details). Note that if the result of multiplying A by B exceeds the numerical range of the default result type, you need to convert the result to a wider range of numeric types through cast

For example: hive > select 40 * 5 from dual;200

4. Division operation: /

Grammar: a / B

Operation types: all numeric types

Description: returns the result of A divided by B. The numerical type of the result is double

For example: hive > select 40 / 5 from dual;8.0

Note: the highest precision data type in hive is double, which is only accurate to 16 places after the decimal point. Pay special attention to when doing division operations:

Hive > select ceil (28.0 amp 6.99999999999999) from dual limit 1; 4

Hive > select ceil (28.0 amp 6.99999999999999) from dual limit 1; 5

5. Surplus operation:%

Syntax: a% B

Operation types: all numeric types

Description: returns the remainder of A divided by B. The numeric type of the result is equal to the type of An and the smallest parent of the type of B. (see the inheritance relationship of the data type for details).

For example: hive > select 41% 5 from dual; 1

Hive > select 8.4% 4 from dual; 0.400000000000036

Note: precision is a big problem in hive, and operations like this are best specified through round.

Hive > select round (8.4% 4,2) from dual;0.4

6. Bit and operation: &

Grammar: a & B

Operation types: all numeric types

Description: returns the results of bit-by-bit operations of An and B. The numeric type of the result is equal to the type of An and the smallest parent of the type of B. (see the inheritance relationship of the data type for details).

For example: hive > select 4 & 8 from dual;0

Hive > select 6 & 4 from dual;4

7. Bit or operation: |

Syntax: a | B

Operation types: all numeric types

Description: returns the result of bit-by-bit or operation of An and B. The numeric type of the result is equal to the type of An and the smallest parent of the type of B. (see the inheritance relationship of the data type for details).

For example: hive > select 4 | 8 from dual; 12

Hive > select 6 | 8 from dual; 14

8. XOR operation: ^

Grammar: a ^ B

Operation types: all numeric types

Description: returns the result of bit-by-bit XOR operations of An and B. The numeric type of the result is equal to the type of An and the smallest parent of the type of B. (see the inheritance relationship of the data type for details).

For example: hive > select 4 ^ 8 from dual; 12

Hive > select 6 ^ 4 from dual; 2

9. Bit inversion operation: ~

Syntax: ~ A

Operation types: all numeric types

Description: returns the result of A bitwise reverse operation. The numeric type of the result is equal to the type of A.

For example: hive > select ~ 6 from dual;-7

Hive > select ~ 4 from dual;-5

10. Logic and operation: AND

Syntax: An AND B

Operation type: boolean

Note: if both An and B are TRUE, it is TRUE;, otherwise it is FALSE. NULL if An is NULL or B is NULL

For example: hive > select 1 from dual where 1 and 2; 1

11. Logic or operation: OR

Syntax: An OR B

Operation type: boolean

Note: if An is TRUE, or B is TRUE, or An and B are both TRUE, it is TRUE; otherwise it is FALSE

For example: hive > select 1 from dual where 1 or 2; 1

12. Logical non-operation: NOT

Syntax: NOT A

Operation type: boolean

Note: if An is FALSE, or An is NULL, it is TRUE;, otherwise it is FALSE.

For example: hive > select 1 from dual where not 1 # 2

13. Rounding function: round

Syntax: round (double a)

Return value: BIGINT

Description: returns the integer value part of the double type (following rounding)

Example: hive > select round (3.1415926) from dual; 3

Hive > select round (3.5) from dual; 4

Hive > create table dual as select round (9542.158) fromdual

Hive > describe dual; _ c0 bigint

14. Specify precision rounding function: round

Syntax: round (double a, int d)

Return value: DOUBLE

Description: returns the double type of the specified precision d

For example: hive > selectround (3.1415926) from dual; 3.1416

15. Take the whole function down: floor

Syntax: floor (double a)

Return value: BIGINT

Description: returns the largest integer equal to or less than the double variable

Example: hive > select floor (3.1415926) from dual; 3

Hive > select floor (25) from dual; 25

16. Take the whole function up: ceil

Syntax: ceil (double a)

Return value: BIGINT

Description: returns the smallest integer equal to or greater than the double variable

Example: hive > select ceil (3.1415926) from dual; 4

Hive > select ceil (46) from dual; 46

17. Take the whole function up: ceiling

Syntax: ceiling (double a)

Return value: BIGINT

Description: same function as ceil

Example: hive > select ceiling (3.1415926) from dual; 4

Hive > select ceiling (46) from dual; 46

18. Take the random number function: rand

Syntax: rand (), rand (int seed)

Return value: double

Description: returns a random number in the range of 0 to 1. If you specify a seed seed, you will wait for a stable sequence of random numbers

For example: hive > select rand () from dual; 0.5577432776034763

19. Natural exponential function: exp

Syntax: exp (double a)

Return value: double

Description: returns the a power of the natural logarithm e

For example: hive > select exp (2) from dual; 7.38905609893065

20. Natural logarithmic function: ln

Syntax: ln (double a)

Return value: double

Description: returns the natural logarithm of a

21. Logarithmic function based on 10: log10

Syntax: log10 (double a)

Return value: double

Description: returns the logarithm of a with a base of 10

For example: hive > select log10 (100) from dual;2.0

22. Logarithmic function with 2 as base: log2

Syntax: log2 (double a)

Return value: double

Description: returns the logarithm of a with base 2

Example: hive > select log2 (8) from dual; 3.0

23. Logarithmic function: log

Syntax: log (double base, double a)

Return value: double

Description: returns the logarithm of a with base as the base

For example: hive > select log (4256) from dual; 4.0

24. Power arithmetic function: pow

Syntax: pow (double a, double p)

Return value: double

Description: returns the p power of a

Example: hive > select pow (2pm 4) from dual; 16.0

25. Square function: sqrt

Syntax: sqrt (double a)

Return value: double

Description: returns the square root of a

Example: hive > select sqrt (16) from dual; 4.0

26. Binary function: bin

Syntax: bin (BIGINT a)

Return value: string

Description: returns the binary representation of a

For example: hive > select bin (7) from dual; 111

27. Hexadecimal function: hex

Syntax: hex (BIGINT a)

Return value: string

Description: returns the hexadecimal representation of an if the variable is of type int; if the variable is of type string, the hexadecimal representation of the string

Example: hive > select hex (17) from dual; 11

Hive > select hex ('abc') from dual; 616263

28. Reverse hexadecimal function: unhex

Syntax: unhex (string a)

Return value: string

Description: returns the string coded by the hexadecimal string

For example: hive > selectunhex ('616263') from dual; abc

Hive > select unhex ('11') from dual;-

Hive > select unhex (616263) from dual; abc

29. Binary conversion function: conv

Syntax: conv (BIGINT num, int from_base, int to_base)

Return value: string

Description: convert the numeric num from from_base to to_base

For example: hive > select conv (17, 10, 10, 16) from dual; 11

Hive > select conv (170.10jue 2) from dual; 10001

30. Absolute value function: abs

Syntax: abs (double a) abs (int a)

Return value: double int

Description: returns the absolute value of the value a

For example: hive > select abs (- 3.9) from dual; 3.9

Hive > select abs (10.9) from dual;

31. Positive remainder function: pmod

Syntax: pmod (int a, int b), pmod (double a, double b)

Return value: int double

Description: returns the remainder of positive a divided by b

For example: hive > select pmod (9 from dual; 4)

Hive > select pmod (- 9pm 4) from dual; 3

32. Sinusoidal function: sin

Syntax: sin (double a)

Return value: double

Description: returns the sine of a

For example: hive > select sin (0.8) from dual; 0.7173560908995228

33. Arcsine function: asin

Syntax: asin (double a)

Return value: double

Description: returns the arcsine of a

Example: hive > select asin (0.7173560908995228) from dual; 0.8

34. CoSine function: cos

Syntax: cos (double a)

Return value: double

Description: returns the cosine of a

For example: hive > select cos (0.9) from dual; 0.6216099682706644

35. Inverse cosine function: acos

Syntax: acos (double a)

Return value: double

Description: returns the inverse cosine of a

Example: hive > select acos (0.6216099682706644) from dual; 0.9

36. Positive function: positive

Syntax: positive (int a), positive (double a)

Return value: int double

Description: return a

For example: hive > selectpositive (- 10) from dual;-10

Hive > select positive (12) from dual; 12

37. Negative function: negative

Syntax: negative (int a), negative (double a)

Return value: int double

Description: return-a

For example: hive > select negative (- 5) from dual; 5

Hive > select negative (8) from dual;-8

38. UNIX timestamp transfer date function: from_unixtime

Syntax: from_unixtime (bigint unixtime [, string format])

Return value: string

Description: converts the UNIX timestamp (the number of seconds from 1970-01-01 00:00:00 UTC to the specified time) to the time format of the current time zone

For example: hive > select from_unixtime (1323308943 MMdd') from dual; 20111208

39. Get the current UNIX timestamp function: unix_timestamp

Syntax: unix_timestamp ()

Return value: bigint

Description: get the UNIX timestamp of the current time zone

For example: hive > select unix_timestamp () from dual; 1323309615

40. Date to UNIX timestamp function: unix_timestamp

Syntax: unix_timestamp (string date)

Return value: bigint

Description: convert the date in the format "yyyy-MM-ddHH:mm:ss" to the UNIX timestamp. If the conversion fails, 0 is returned.

For example: hive > select unix_timestamp ('2011-12-07 13 from dual; 01purl 03')

41. Date in specified format is converted to UNIX timestamp function: unix_timestamp

Syntax: unix_timestamp (string date, string pattern)

Return value: bigint

Description: convert the date in pattern format to UNIX timestamp. If the conversion fails, 0 is returned.

For example: hive > select unix_timestamp ('2011120713 HH:mm:ss' 01bank 01lo 03mm HH:mm:ss') from dual; 1323234063

42. Date time to date function: to_date

Syntax: to_date (string timestamp)

Return value: string

Description: returns the date section in the date time field.

For example: hive > select to_date ('2011-12-08 10 fromdual 0315) fromdual

43. Date transfer function: year

Syntax: year (string date)

Return value: int

Description: returns the year in the date.

For example: hive > select year ('2011-12-08 10 fromdual;2011 0315) fromdual;2011

Hive > select year ('2012-12-08') from dual; 2012

44. Date-to-month function: month

Syntax: month (string date)

Return value: int

Description: returns the month in the date.

For example: hive > select month ('2011-12-08 10 fromdual;12 0315) fromdual;12

Hive > select month ('2011-08-08') from dual; 8

45. Date transfer function: day

Syntax: day (string date)

Return value: int

Description: returns the day in the date.

For example: hive > select day ('2011-12-08 10 from dual; 0315 01')

Hive > select day ('2011-12-24') from dual; 24

46. Date-to-hour function: hour

Syntax: hour (string date)

Return value: int

Description: returns the hour in the date.

For example: hive > select hour ('2011-12-08 10 fromdual;10 0315) fromdual;10

47. Date to minute function: minute

Syntax: minute (string date)

Return value: int

Description: returns the minutes in the date.

For example: hive > select minute ('2011-12-08 10 fromdual; 01') fromdual; 3

48. Date to second function: second

Syntax: second (string date)

Return value: int

Description: returns the seconds in the date.

For example: hive > select second ('2011-12-08 10 fromdual; 0315 01')

49. Date transfer function: weekofyear

Syntax: weekofyear (string date)

Return value: int

Description: the return date is the current number of weeks.

For example: hive > select weekofyear ('2011-12-08 10 fromdual;49 0315) fromdual;49

50. Date comparison function: datediff

Syntax: datediff (string enddate, string startdate)

Return value: int

Description: returns the number of days from the end date minus the start date.

For example: hive > select datediff ('2012-12-08) from dual; 213

51. Date increment function: date_add

Syntax: date_add (string startdate, int days)

Return value: string

Description: returns the start date the date after startdate adds days days.

For example: hive > select date_add ('2012-12-08) from dual

52. Date reduction function: date_sub

Syntax: date_sub (string startdate, int days)

Return value: string

Description: returns the date after the start date startdate minus days.

For example: hive > select date_sub ('2012-12-08) from dual

53. If function: if

Syntax: if (boolean testCondition, T valueTrue, TvalueFalseOrNull)

Return value: t

Note: when the condition testCondition is TRUE, return valueTrue; otherwise return valueFalseOrNull

For example: hive > select if (1x 2100200) from dual; 200

Hive > select if (1mm 1100200) from dual;100

54. Non-empty lookup function: COALESCE

Syntax: COALESCE (T v1, T v2, …)

Return value: t

Description: returns the first non-null value in the parameter; if all values are NULL, return NULL

For example: hive > select COALESCE (null,'100','50') from dual; 100

55. Conditional judgment function: CASE

Syntax: CASE a WHEN b THEN c [WHEN d THEN e] * [ELSE f] END

Return value: t

Note: if an equals b, return c; if an equals d, return e; otherwise return f

For example: hive > Select case 100 when 50 then 'tom' when 100then' mary' else 'tim' end from dual; mary

56. String length function: length

Syntax: length (string A)

Return value: int

Description: returns the length of string A

For example: hive > select length ('abcedfg') from dual; 7

57. String inversion function: reverse

Syntax: reverse (string A)

Return value: string

Description: returns the inversion result of the string A

For example: hive > select reverse (abcedfg') from dual; gfdecba

58. String concatenation function: concat

Syntax: concat (string A, string B...)

Return value: string

Description: returns the result of the input string connection. Any input string is supported.

For example: hive > select concat ('abc','def','gh') from dual

Abcdefgh

59. Delimited string concatenation function: concat_ws

Syntax: concat_ws (string SEP, string A, string B...)

Return value: string

Description: returns the result after the input string is concatenated. SEP represents the delimiter between each string.

For example: hive > select concat_ws (',', 'abc','def','gh') fromdual

Abc,def,gh

60. String intercepting function: substr,substring

Syntax: substr (string A, intstart), substring (string A, intstart)

Return value: string

Description: returns the string A from the start position to the end of the string

For example: hive > select substr ('abcde',3) from dual; cde

Hive > select substring ('abcde',3) from dual; cde

Hive > selectsubstr ('abcde',-1) from dual; e

61. String intercepting function: substr,substring

Syntax: substr (stringA, int start, int len), substring (stringA, int start, int len)

Return value: string

Description: returns the string A that starts at the start position and has a length of len.

For example: hive > select substr ('abcde',3,2) from dual; cd

Hive > select substring ('abcde',3,2) from dual; cd

Hive > select substring ('abcde',-2,2) from dual; de

62. String capitalization function: upper,ucase

Syntax: upper (string A) ucase (string A)

Return value: string

Description: returns the uppercase format of string A

For example: hive > select upper ('abSEd') from dual; ABSED

Hive > select ucase ('abSEd') from dual; ABSED

63. String to lowercase function: lower,lcase

Syntax: lower (string A) lcase (string A)

Return value: string

Description: returns the lowercase format of string A

For example: hive > select lower ('abSEd') from dual; absed

Hive > select lcase ('abSEd') from dual; absed

64. Unblanking function: trim

Syntax: trim (string A)

Return value: string

Description: remove spaces on both sides of the string

For example: hive > select trim ('abc') from dual; abc

65. Remove the space function to the left: ltrim

Syntax: ltrim (string A)

Return value: string

Description: remove the space on the left side of the string

For example: hive > select ltrim ('abc') from dual; abc

64. Remove the space function on the right: rtrim

Syntax: rtrim (string A)

Return value: string

Description: remove the space on the right side of the string

For example: hive > select rtrim ('abc') from dual; abc

65. Regular expression replacement function: regexp_replace

Syntax: regexp_replace (string A, string B, string C)

Return value: string

Description: replace the part of the string A that conforms to the java regular expression B with C. Note that escape characters are used in some cases, similar to the regexp_replace function in oracle.

For example: hive > select regexp_replace ('foobar',' oo | ar','') from dual; fb

66. Regular expression parsing function: regexp_extract

Syntax: regexp_extract (string subject, string pattern, intindex)

Return value: string

Description: splits the string subject according to the rules of pattern regular expressions and returns the characters specified by index.

For example: hive > select regexp_extract ('foothebar','foo (. *?) (bar)', 1) from dual; the

Hive > select regexp_extract ('foothebar','foo (. *?) (bar)', 2) from dual; bar

Hive > select regexp_extract ('foothebar','foo (. *?) (bar)', 0) from dual; foothebar

Note that escape characters are used in some cases, and the equal sign below is escaped with double vertical lines, which is the rule of java regular expressions.

Select data_field

Regexp_extract (data_field,'.*?bgStart\ = ([^ &] +)', 1) as aaa

Regexp_extract (data_field,'.*?contentLoaded_headStart\ = ([^ &] +)', 1) as bbb

Regexp_extract (data_field,'.*?AppLoad2Req\ = ([^ &] +)', 1) as ccc

From pt_nginx_loginlog_st

Where pt = '2012-03-26' limit 2

67. URL analytic function: parse_url

Syntax: parse_url (string urlString, string partToExtract [, string keyToExtract])

Return value: string

Description: returns the specified part of the URL. The valid values of partToExtract are: HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO.

For example:

Hive > selectparse_url ('http://facebook.com/path2/p.php?k1=v1&k2=v2#Ref1',' HOST') fromdual; facebook.com

Hive > selectparse_url ('http://facebook.com/path2/p.php?k1=v1&k2=v2#Ref1',' QUERY','k1') from dual; v1

68. Json analytic function: get_json_object

Syntax: get_json_object (string json_string, string path)

Return value: string

Description: parses the string json_string of json and returns the content specified by path. If the json string entered is not valid, NULL is returned.

For example:

Hive > select get_json_object ('{"store":

> {"fruit":\ [{"weight": 8, "type": "apple"}, {"weight": 9, "type": "pear"}]

> "bicycle": {"price": 19.95, "color": "red"}

> "email": "amy@only_for_json_udf_test.net"

> "owner": "amy"

>','$.owner') from dual

Amy

69. Space string function: space

Syntax: space (int n)

Return value: string

Description: returns a string of length n

For example:

Hive > select space (10) from dual

Hive > select length (space (10)) from dual; 10

70. Repeat string function: repeat

Syntax: repeat (string str, int n)

Return value: string

Description: returns the str string repeated n times

For example: hive > select repeat ('abc',5) from dual;abcabcabcabcabc

71. The first character ascii function: ascii

Syntax: ascii (string str)

Return value: int

Description: returns the ascii code of the first character of the string str

For example: hive > select ascii ('abcde') from dual; 97

72. Left complement function: lpad

Syntax: lpad (string str, int len, string pad)

Return value: string

Description: use pad to complete str to Lenn bit

For example: hive > select lpad ('abc',10,'td') from dual;tdtdtdtabc

Note: unlike GP and ORACLE, pad cannot default

73. Right complement function: rpad

Syntax: rpad (string str, int len, string pad)

Return value: string

Description: right complement of str to Lenn bit with pad

For example: hive > select rpad ('abc',10,'td') from dual;abctdtdtdt

74. Split string function: split

Syntax: split (stringstr, string pat)

Return value: array

Note: split the str according to the pat string and return the split string array.

For example:

Hive > select split ('abtcdtef','t') from dual

["ab", "cd", "ef"]

75. Set search function: find_in_set

Syntax: find_in_set (string str, string strList)

Return value: int

Description: returns the position where str first appeared in strlist. Strlist is a string separated by commas. If the str character is not found, 0 is returned.

For example: hive > select find_in_set ('ab','ef,ab,de') fromdual;2

Hive > select find_in_set ('at','ef,ab,de') from dual;0

76. Aggregate statistical function

Syntax: count (*), count (expr), count (DISTINCT expr [, expr_.])

Return value: int

Description: count (*) counts the number of rows retrieved, including rows with null values; count (expr) returns the number of non-null values in the specified field; count (DISTINCTexpr [, expr_.]) Returns the number of different non-null values for the specified field

For example: hive > select count (*) from dual; 20

Hive > select count (distinct t) from dual; 10

77. Summation statistical function: sum

Syntax: sum (col), sum (DISTINCT col)

Return value: double

Description: the result of the addition of col in the statistical result set of sum (col); the result of the addition of different values of col in the statistical result of sum (DISTINCT col)

For example: hive > select sum (t) from dual; 100

Hive > select sum (distinct t) from dual; 70

78. Average statistical function: avg

Syntax: avg (col), avg (DISTINCT col)

Return value: double

Description: the average value of col in the avg (col) statistical result set; the average value of the addition of different col values in the avg (DISTINCT col) statistical result

For example: hive > select avg (t) from dual; 50

Hive > select avg (distinct t) from dual; 30

79. Minimum statistical function: min

Syntax: min (col)

Return value: double

Description: the minimum value of col field in the statistical result set

Example: hive > select min (t) from dual; 20

80. Maximum statistical function: max

Syntax: maxcol)

Return value: double

Description: the maximum value of the col field in the statistical result set

For example: hive > select max (t) from dual; 120

81. Map type construction: map

Syntax: map (key1, value1, key2, value2, …)

Description: build the map type based on the input key and value pairs

For example:

Hive > Create table alex_testas select map ('100th and 2000s) as t from dual

Hive > describe alex_test

T map

Hive > select t from alex_test

{"100,200": "tom", "200": "mary"}

82. Struct type construction: struct

Syntax: struct (val1, val2, val3, …)

Description: build the structure struct type according to the input parameters

For example:

Hive > create table alex_test as selectstruct ('tom','mary','tim') as t from dual

Hive > describe alex_test

T struct

Hive > select t from alex_test

{"col1": "tom", "col2": "mary", "col3": "tim"}

83. Array type construction: array

Syntax: array (val1, val2, …)

Description: build an array array type based on the input parameters

For example:

Hive > create table alex_test as selectarray ("tom", "mary", "tim") as t from dual

Hive > describe alex_test

T array

Hive > select t from alex_test

["tom", "mary", "tim"]

84. Array type access: a [n]

Syntax: a [n]

Operation type: an is array type, n is int type

Description: returns the nth variable value in array A. The starting subscript of the array is 0. For example, if An is an array type with a value of ['foo','bar'], then A [0] will return' foo', 'and A [1] will return' bar''

For example:

Hive > create table alex_test as selectarray ("tom", "mary", "tim") as t from dual

Hive > select t [0], t [1], t [2] from alex_test; tom mary tim

85. Map type access: M [key]

Syntax: M [key]

Operation type: M is the map type, and key is the key value in map

Description: returns the map type M where the key value is the value of the specified value. For example, if M is a map type with a value of {'f'- > 'foo',' b'- > 'bar',' all'-> 'foobar'}, then M [' all'] will return 'foobar''.

For example:

Hive > Create table alex_test as selectmap ('100th and 2000s) as t from dual

Hive > select t ['200'], t [' 100'] from alex_test; mary tom

86. Struct type access: S.X

Syntax: S.X

Operation type: s is struct type

Description: returns the x field in structure S. For example, for the structure struct foobar {int foo, int bar}, foobar.foo returns the foo field in the structure

For example:

Hive > create table alex_test as selectstruct ('tom','mary','tim') as t from dual

Hive > describe alex_test

T struct

Hive > select t.col1ret t.col3from alex_test

Tom tim

87. Map type length function: size (Map)

Syntax: size (Map)

Return value: int

Description: returns the length of the map type

For example: hive > select size (map)) from dual; 2

88. Array type length function: size (Array)

Syntax: size (Array)

Return value: int

Description: returns the length of the array type

For example: hive > select size (array ('100th, 101th, 102th, 103th)) from dual; 4

89. Type conversion function

Type conversion function: cast

Syntax: cast (expr as)

Return value: Expected "=" to follow "type"

Description: returns the length of the array type

For example: hive > select cast (1 as bigint) from dual; 1

4 Hive JDBC

4.1 introduction of basic operating objects

1 、 Connection

Description: Connection object connected to Hive

Connection of Hive

Jdbc:hive://IP:10000/default "

The method of getting Connection

DriverManager.getConnection ("jdbc:hive://IP:10000/default", "", ")

2 、 Statement

Description: used to execute statements

Creation method

Statementstmt = con.createStatement ()

Main methods

ExecuteQueryexecute

3 、 ResultSet

Description: used to store result sets

Creation method

Stmt.executeQuery

Main methods

GetString ()

4. Special types of processing

Array

Map

Struct

4.2 datafile write operation

Try {Class.forName (driverName); Connection con = DriverManager.getConnection ("jdbc:hive://IP:10000/default", "", "); Statement stmt = con.createStatement (); String sql =" show tables "; / / show tablesSystem.out.println (" Running: "+ sql); ResultSet res = stmt.executeQuery (sql); if (res.next ()) {System.out.println (res.getString (1));}

4.3Database connection pool based on Hive

Using DataSource as the implementation of data source

DBConnectionManager adopts singleton mode

Provide a way to get a connection and close it

SetupDataSource () DBConnectionManagergetInstance () close (Connectionconn) synchronizedConnection getConnection ()

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.