Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Apache Hawq-- optimization notes

2025-03-05 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/03 Report--

Optimization: the data table partition uses numeric type fields as far as possible, such as the Date type is converted to an absolute number of days from 1970-01-01. SQL syntax: try to avoid the use of between and (check the query plan, the impact is small), try to use CTE (with v as … ) query. The data distribution strategy (randomly distributed or hash distribution) and bucketnum parameter settings are designed according to the amount of data in the table and the type of most queries. Hot and cold data is stored in the same table with different schema, minimizing the number of partitions in the table where the hot data is located (by dynamically increasing and deleting partitions). Set the parameters hawq_rm_stat_nvseg and hawq_rm_vseg_memory based on the resources required by SQL. Execute "vacuum table_name;analyze table_name;" regularly every day to get statistics for each table in order to generate the optimal query plan. Reduce the number of metadata records by executing the "vacuum pg_class; reindex table pg_class" source data table pg_class. Execute the analysis query plan, find the performance bottleneck of SQL, and optimize it. Take measures to increase the localization ratio of query data as much as possible. Actual test

The data table is randomly distributed: bucketnum=9

1. Set the number of virtual segment used for query

Statement level:

SET hawq_rm_stmt_nvseg=10;SET hawq_rm_stmt_vseg_memory='256mb'; disables statement-level SET hawq_rm_stmt_nvseg=0;set hawq_rm_nvseg_perquery_perseg_limit=10;set hawq_rm_nvseg_perquery_limit=512

   can adjust the number of virtual segments used in query execution through the hawq_rm_nvseg_perquery_limit and hawq_rm_nvseg_perquery_perseg_limit parameters

two。 Relevant parameters of the hash distribution table:

Default_hash_table_bucket_number

Hawq_rm_nvseg_perquery_limit

Hawq_rm_nvseg_perquery_perseg_limit

3. You can use the pg_partitions view to find information about the partition design. For example, view the partition design of the sales table:

SELECT partitionboundary, partitiontablename, partitionname, partitionlevel, partitionrankFROM pg_partitionsWHERE tablename='ins_wifi_dates'

The following tables and views display information about partitioned tables.

Pg_partition-tracks partitioned tables and their inheritance relationships. Pg_partition_templates-displays subpartitions created using the subpartition template. Pg_partition_columns-displays the partition key columns used in the partition design.

4. View the segment file distribution of the table

SELECT gpr.tablespace_oid, gpr.database_oid, gpf.relfilenode_oid, gpf.segment_file_num,'/ hawq_data/' | | gpr.tablespace_oid | |'/'| | gpr.database_oid | |'/'| | gpf.relfilenode_oid | |'/'| | gpf.segment_file_num as path, pg_class.relname, gpr.persistent_state, gpf.persistent_stateFROM gp_persistent_relfile_node gpf,pg_class Gp_persistent_relation_node gprWHERE gpf.relfilenode_oid = pg_class.relfilenode AND gpr.relfilenode_oid = pg_class.relfilenode AND pg_class.relname='person' SELECT distinct gpr.tablespace_oid, gpr.database_oid, gpf.relfilenode_oid, pg_class.relname, gpr.persistent_state, gpf.persistent_stateFROM gp_persistent_relfile_node gpf,pg_class, gp_persistent_relation_node gprWHERE gpf.relfilenode_oid = pg_class.relfilenode AND gpr.relfilenode_oid = pg_class.relfilenode AND pg_class.relname like 'person_%' order by pg_class.relname

# schema

SELECT gpr.tablespace_oid, gpr.database_oid, gpf.relfilenode_oid, gpf.segment_file_num,'/ hawq_data/' | | gpr.tablespace_oid | |'/'| | gpr.database_oid | |'/'| | gpf.relfilenode_oid | |'/'| | gpf.segment_file_num as path, pgn.nspname AS schemaname, pg_class.relname AS tablename, gpr.persistent_state Gpf.persistent_stateFROM gp_persistent_relfile_node gpf,pg_class, gp_persistent_relation_node gpr, pg_namespace pgnWHERE gpf.relfilenode_oid = pg_class.relfilenode AND gpr.relfilenode_oid = pg_class.relfilenode AND pgn.oid = pg_class.relnamespace AND pg_class.relname='t_wifi_terminal_chrs_1_prt_1'

After testing, it is found that:

The storage location of    data in hdfs is: tablespace/database/table/segfile

The partition table A directory has a segfile with the default number of hash buckets, but the size is 0, while its word table (such as A1) directory has a segfile with the default number of hash buckets and files.

View the table size:

Select sotdsize from hawq_toolkit.hawq_size_of_table_disk where sotdtablename='t_net_access_log'

5. When viewing the query plan using explain or explain analyze, specify

Set gp_log_dynamic_partition_pruning=on

The name of the scanned partition can be displayed.

Unlike the    Explain analyze and explain statements, explain analyze actually executes the query and gets the statistics during the query execution. The results of explain analyze are of great help to understand the specific situation of query execution and the causes of query performance problems.

SELECT * FROM pg_stats WHERE tablename = 'inventory'

# query session information

Select * from pg_stat_activity;select application_name, datname, procpid, sess_id, usename, waiting, client_addr, client_port, waiting_resource, query_start, backend_start, xact_start from pg_stat_activity;select application_name, datname, procpid, sess_id, usename, waiting, client_addr, client_port, waiting_resource, current_query, query_start, backend_start, xact_start from pg_stat_activity Select application_name, datname, procpid, sess_id, usename, waiting, client_addr, client_port, waiting_resource, query_start, backend_start, xact_start from pg_stat_activity where application_name='psql' and current_query''

Datname represents the database name

Procpid represents the PID corresponding to the current SQL

Query_start indicates the start time of the SQL execution

Current_query represents the currently executed SQL statement

Waiting indicates whether execution is in progress, t indicates execution, and f indicates completion of execution.

Client_addr represents the client IP address

284933

There are two ways to kill. The first is:

SELECT pg_cancel_backend (PID)

This method can only be queried by kill select, and does not work for update, delete and DML)

The second is:

SELECT pg_terminate_backend (PID)

This can kill various operations (select, update, delete, drop, etc.)

Under pg_cancel_backend (), the session is still there, and things go back.

After the pg_terminate_backend () operation, the session disappears and things fall back.

If pg_terminate_backend () can't kill session at some point, then you can directly kill-9 pid at the os level.

Select * from pg_resqueue_status

-- Resource queue

SELECT * FROM dump_resource_manager_status (2)

-- Segment

SELECT * FROM dump_resource_manager_status (3); SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE relname = 'ins_wifi_dates';SELECT * FROM pg_stats WHERE tablename =' ins_wifi_dates';SELECT gp_segment_id, COUNT (*) FROM ins_wifi_datesGROUP BY gp_segment_idORDER BY gp_segment_idset gp_select_invisible=true;select count (*) from pg_class;set gp_select_invisible=false;select count (*) from pg_class;vacuum pg_class;reindex table pg_class

Log:

Set

Table redistribution:

ALTER TABLE sales SET WITH (REORGANIZE=TRUE)

Check tables that are not analyze:

Select * from hawq_toolkit.hawq_stats_missing

Http://hawq.incubator.apache.org/docs/userguide/2.2.0.0-incubating/reference/toolkit/hawq_toolkit.html#topic46

HAWQ view table size: / / does not contain partitioned tables

SELECT relname AS name, sotdsize AS size, sotdtoastsize AS toast, sotdadditionalsize AS other FROM hawq_toolkit.hawq_size_of_table_disk AS sotd, pg_catalog.pg_class WHERE sotd.sotdoid=pg_class.oid and pg_class.relname='t_wifi_terminal_chrs'ORDER BY relname

Hawq_size_of_partition_and_indexes_disk

Select relname AS name, sopaidpartitionoid, sopaidpartitiontablename, sopaidpartitiontablesize as size, sotailtablesizeuncompressed as uncompressed from hawq_toolkit.hawq_size_of_partition_and_indexes_disk sopi,pg_catalog.pg_class WHERE sopi.sopaidparentoid=pg_class.oid and pg_class.relname='t_wifi_terminal_chrs'ORDER BY sopaidpartitionoid Select relname AS name,sum (sopaidpartitiontablesize) as size from hawq_toolkit.hawq_size_of_partition_and_indexes_disk sopi,pg_catalog.pg_class WHERE sopi.sopaidparentoid=pg_class.oid and pg_class.relname='t_wifi_terminal_chrs'group by relname

Memory / vore ratio

[root@master2 pg_log] # cat hawq-2017-10-17_224829.csv 2017-10-17 1821 CST,p237647,th317192736,0,con4,seg-10000 57.319620 CST,p237647,th317192736,0,con4,seg-10000, "LOG", "00000", "Resource manager chooses ratio 5120 MB per core as cluster level memory to core ratio, there are 2304 MB memory 6 CORE resource unable to be utilized.", 0Magic, "resourcepool.c", 4641 May 2017-10-17 1821 CST,p237647,th317192736,0,con4,seg-10000 57.319668 CST,p237647,th317192736 Resource manager adjusts segment hd4.bigdata original global resource manager resource capacity from (154368 MB, 32 CORE) to (153600 MB, 30 CORE), 0meme, "resourcepool.c", 4787 CST,p237647,th317192736,0,con4,seg-10000 2017-10-17 1821 Glaze 57.319716 CST,p237647,th317192736,0,con4,seg-10000, "LOG", "00000" "Resource manager adjusts segment hd1.bigdata original global resource manager resource capacity from (154368 MB, 32 CORE) to (153600 MB, 30 CORE)", 0Magnee, "resourcepool.c", 4787 May 2017-10-17 18purr 21 CORE 57.319762 CST,p237647,th317192736,0,con4,seg-10000, "LOG", "00000", "Resource manager adjusts segment hd2.bigdata original global resource manager resource capacity from (154368 MB, 32 CORE) to (153600 MB, 30 CORE)", 0 "resourcepool.c", 4787

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report