Best practices for the Internet of Vehicles on the cloud (4)

2025-04-07 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

Key business testing and performance tuning on the cloud

1. Load balancing type selection and performance index

Performance-guaranteed instances are recommended for load balancing. Shared-performance instances share resources with other instances, so their performance metrics cannot be guaranteed. Because Internet of Vehicles workloads are characterized by high concurrency, performance-guaranteed instances are the recommended choice. The three key metrics of a performance-guaranteed instance are as follows:

Maximum connections (Max Connection)

The maximum number of connections that a load balancer instance can carry. When the number of connections on the instance exceeds the maximum defined by the specification, new connection requests are discarded.

New connections per second (Connections Per Second, CPS)

The rate at which new connections can be established. When the rate of new connections exceeds the CPS defined by the specification, new connection requests are discarded.

Queries per second (Queries Per Second, QPS)

QPS is specific to Layer 7 listeners and refers to the number of HTTP/HTTPS queries (requests) that can be completed per second. When the request rate exceeds the QPS defined by the specification, new connection requests are discarded.

Aliyun load balancer performance-guaranteed instances come in the following six specifications (the specifications on offer may vary slightly between regions depending on available resources; refer to the purchase page in the console).

Specification      Type                           Maximum connections   New connections per second (CPS)   Queries per second (QPS)
Specification 1    Simple I (slb.s1.small)        5000                  3000                               (not listed)
Specification 2    Standard I (slb.s2.small)      50000                 5000                               (not listed)
Specification 3    Standard II (slb.s2.medium)    100000                10000                              (not listed)
Specification 4    Advanced I (slb.s3.small)      200000                20000                              (not listed)
Specification 5    Advanced II (slb.s3.medium)    500000                50000                              (not listed)
Specification 6    Super I (slb.s3.large)         1000000               100000                             (not listed)

Note: these are the specifications that can be purchased in the console, and the largest one supports only 1,000,000 connections. What if an Internet of Vehicles enterprise with tens of millions of vehicles needs a load balancer with a maximum of 10,000,000 connections and cannot buy it in the console? Aliyun opens only these six specifications to ordinary small and medium-sized enterprise users in the console on the official website. Enterprise users with larger requirements can contact their Aliyun account manager to customize a larger load balancer instance, which can provide a maximum number of connections on the order of 100 million.
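As a rough illustration of how the specification table can be used for sizing, the following Python sketch picks the smallest listed specification that covers a required connection count and new-connection rate. It assumes that the two figures in each table row are the maximum connections and the CPS; the function name pick_spec is hypothetical and is not part of any Aliyun API.

```python
# Figures taken from the specification table; each tuple is
# (instance class, maximum connections, new connections per second).
# Reading the two numbers this way is an assumption about the flattened table.
SPECS = [
    ("slb.s1.small", 5_000, 3_000),
    ("slb.s2.small", 50_000, 5_000),
    ("slb.s2.medium", 100_000, 10_000),
    ("slb.s3.small", 200_000, 20_000),
    ("slb.s3.medium", 500_000, 50_000),
    ("slb.s3.large", 1_000_000, 100_000),
]


def pick_spec(required_connections, required_cps):
    """Return the smallest listed spec covering both requirements, or None."""
    for name, max_conn, cps in SPECS:  # SPECS is ordered from smallest to largest
        if max_conn >= required_connections and cps >= required_cps:
            return name
    # Nothing sold in the console fits; this is the custom-instance case.
    return None


print(pick_spec(80_000, 8_000))       # slb.s2.medium
print(pick_spec(10_000_000, 50_000))  # None
```

A result of None corresponds to the situation described in the note: the requirement exceeds what the console offers, so a customized instance would have to be requested.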

Since performance-guaranteed instances have a clear SLA for maximum connections, new connections per second, queries per second, and other metrics, we will not test them here. During normal use, you can observe these metrics in real time through cloud monitoring.

Concurrent connection number monitoring

New connection Monitoring

QPS monitoring

2. ECS selection and performance test

Elastic Compute Service (ECS) is a basic cloud computing service provided by Aliyun. Using ECS is as convenient and efficient as using water, electricity, gas, and other utilities. You do not need to purchase hardware in advance; you can create as many ECS instances as you need at any time according to business needs. As the business grows, you can expand disk capacity and increase bandwidth at any time. If you no longer need an instance, you can release it at any time to save money.

For ECS selection, you can select the corresponding ECS instance specifications according to different application scenarios. If you don't know how to choose, you can refer to the official website:

Aliyun ECS offers different instance specifications for different application scenarios to meet users' diverse needs.

Given the industry characteristics of high concurrency and high throughput, a compute-optimized (c5) instance with 4 cores and 8 GB of memory and an SSD cloud disk as the system disk is recommended for the web front end. A general-purpose (g5) instance with 4 cores and 16 GB of memory and an SSD cloud disk as the system disk is recommended for the back-end application.

Next, let's test the performance of one of these instances: a general-purpose g5 instance with 4 cores, 16 GB of memory, an SSD system disk, and the CentOS 6.8 operating system.

The test has two parts: a benchmark score test for the CPU, and an IOPS test for the SSD system disk.

1) CPU benchmark test

Log in to the Aliyun console and purchase a general-purpose g5 ECS instance with 4 cores and 16 GB of memory.

Log in to the server and verify that the machine has 4 cores and 16 GB of memory.

Install the test tool UnixBench as follows:

wget http://soft.laozuo.org/scripts/UnixBench6.1.3.tgz

tar xf UnixBench6.1.3.tgz

cd UnixBench

make

If ./Run reports the following error:

Can't locate Time/HiRes.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at ./Run line 6.

BEGIN failed--compilation aborted at ./Run line 6.

Solution: yum install perl-Time-HiRes -y

If the error bash: make: command not found occurs:

Solution: yum -y install gcc automake autoconf libtool make

Test screenshot

Summary: the final UnixBench score is 3911.9. A typical 4-core 8 GB desktop scores about 2100, so the ECS instance delivers almost twice the performance of such a desktop.

2) Performance test of the SSD cloud disk

The SLA of SSD cloud disks announced on the official website is:

On Linux, tools such as dd, fio, or sysbench can be used to test block storage performance.

Use fio to test the IOPS of the SSD cloud disk used as the system disk of the general-purpose g5 instance (4 cores, 16 GB):

Warning:

Testing a raw disk gives the real performance of the block storage device, but it destroys the file system structure on that disk. Back up your data before testing. To avoid data loss, it is recommended to run block storage performance tests only on newly purchased ECS instances that hold no data.

Log in to the server and install the test command fio

To test random write IOPS, run the following command:

fio -direct=1 -iodepth=128 -rw=randwrite -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=iotest -name=Rand_Write_Testing

To test random read IOPS, run the following command:

fio -direct=1 -iodepth=128 -rw=randread -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=iotest -name=Rand_Read_Testing

To test sequential write throughput, run the following command:

fio -direct=1 -iodepth=64 -rw=write -ioengine=libaio -bs=1024k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=iotest -name=Write_PPS_Testing

The following table explains the parameters in the command, using the random write IOPS command as an example.

The following example shows how to read the results of a random read IOPS test on an SSD cloud disk.

Rand_Read_Testing: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
fio-2.2.8
Starting 1 process
Jobs: 1 (f=1): [r(1)] [100.0% done] [80576KB/0KB/0KB /s] [20.2K/0/0 iops] [eta 00m:00s]
Rand_Read_Testing: (groupid=0, jobs=1): err= 0: pid=9845: Tue Sep 26 20:21:01 2017
  read : io=1024.0MB, bw=80505KB/s, iops=20126, runt= 13025msec
    slat (usec): min=1, max=674, avg= 4.09, stdev= 6.11
    clat (usec): min=172, max=82992, avg=6353.90, stdev=19137.18
     lat (usec): min=175, max=82994, avg=6358.28, stdev=19137.16
    clat percentiles (usec):
     |  1.00th=[  454],  5.00th=[  668], 10.00th=[  812], 20.00th=[  996],
     | 30.00th=[ 1128], 40.00th=[ 1256], 50.00th=[ 1368], 60.00th=[ 1480],
     | 70.00th=[ 1624], 80.00th=[ 1816], 90.00th=[ 2192], 95.00th=[79360],
     | 99.00th=[81408], 99.50th=[81408], 99.90th=[82432], 99.95th=[82432],
     | 99.99th=[82432]
    bw (KB  /s): min=79530, max=81840, per=99.45%, avg=80064.69, stdev=463.90
    lat (usec) : 250=0.04%, 500=1.49%, 750=6.08%, 1000=12.81%
    lat (msec) : 65.86%, 6.84%, 0.49%, 0.04%, 6.35%
  cpu          : usr=3.19%, sys=10.95%, ctx=23746, majf=0, minf=160
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=262144/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: io=1024.0MB, aggrb=80504KB/s, minb=80504KB/s, maxb=80504KB/s, mint=13025msec, maxt=13025msec

Disk stats (read/write):
  vdb: ios=258422/0, merge=0/0, ticks=1625844/0, in_queue=1625990, util=99.30%

In the output, we mainly focus on the following line:

read : io=1024.0MB, bw=80505KB/s, iops=20126, runt= 13025msec

This line indicates that fio performed 1 GiB of I/O at about 80 MiB/s, with a total IOPS of 20126 and a running time of 13 seconds. The measured IOPS of the SSD cloud disk is therefore 20126, while the value calculated from the formula is:

min{1200 + 30 * capacity, 20000} = min{1200 + 30 * 800, 20000} = 20000

The test results are close to those calculated by the formula.
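The comparison above can be reproduced with a short Python sketch. It reads the formula as min{1200 + 30 * capacity, 20000} with an 800 GB disk, as in the worked example; the function names are hypothetical, and the regular expression only targets the fio 2.x summary line format shown in this article.

```python
import re


def expected_ssd_iops(capacity_gb):
    """IOPS ceiling from the formula in the text: min{1200 + 30 * capacity, 20000}."""
    return min(1200 + 30 * capacity_gb, 20000)


def parse_fio_summary(line):
    """Extract (bandwidth in KB/s, IOPS) from a fio 2.x summary line."""
    match = re.search(r"bw=(\d+)KB/s, iops=(\d+)", line)
    if match is None:
        raise ValueError("not a fio summary line")
    return int(match.group(1)), int(match.group(2))


line = "read : io=1024.0MB, bw=80505KB/s, iops=20126, runt= 13025msec"
bw_kb, iops = parse_fio_summary(line)
expected = expected_ssd_iops(800)  # 800 is the capacity used in the text

print(expected)                   # 20000
print(round(iops / expected, 4))  # 1.0063
print(iops * 4)                   # 80504, IOPS x 4 KB block size, close to bw
```

The last line is a quick consistency check: with a 4 KB block size, IOPS multiplied by 4 should land near the reported bandwidth in KB/s.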

3. Database RDS testing and tuning

1) RDS for MySQL test

Today we run a performance test on Aliyun RDS for MySQL 5.6.

A) Test environment

- All tests were carried out in Zone B of China East 2 (Shanghai).
- The ECS instance used for testing is a series II instance.
- The instance is configured with 8 cores and 16 GB of memory.
- The network type is classic network.
- The image used for the stress test is CentOS 7.0 64-bit.

B) Test tools

SysBench is a cross-platform, multi-threaded, modular benchmark tool for evaluating the performance of core parameters of a database running under high load. It lets you bypass complex database benchmark setups and quickly get a picture of database system performance, even without a database installed.

Installation method

The version of SysBench used in this article is 0.5.

Run the following commands to install SysBench:

yum install gcc gcc-c++ autoconf automake make libtool bzr mysql-devel
unzip sysbench-0.5.zip
cd sysbench-0.5
./autogen.sh
./configure --prefix=/usr --mandir=/usr/share/man
make
make install

C) Test procedure

Prepare data

sysbench --num-threads=32 --max-time=3600 --max-requests=999999999 --test=oltp.lua --oltp-table-size=10000000 --oltp-tables-count=64 --db-driver=mysql --mysql-table-engine=innodb --mysql-host=XXXX --mysql-port=3306 --mysql-user=XXXX --mysql-password=XXXX prepare

Run the stress test

sysbench --num-threads=32 --max-time=3600 --max-requests=999999999 --test=oltp.lua --oltp-table-size=10000000 --oltp-tables-count=64 --db-driver=mysql --mysql-table-engine=innodb --mysql-host=XXXX --mysql-port=3306 --mysql-user=XXXX --mysql-password=XXXX run

Clean up the environment

sysbench --num-threads=32 --max-time=3600 --max-requests=999999999 --test=oltp.lua --oltp-table-size=10000000 --oltp-tables-count=64 --db-driver=mysql --mysql-table-engine=innodb --mysql-host=XXXX --mysql-port=3306 --mysql-user=XXXX --mysql-password=XXXX cleanup

D) Test model

Database table structure

CREATE TABLE sbtest (
  id int(10) unsigned NOT NULL AUTO_INCREMENT,
  k int(10) unsigned NOT NULL DEFAULT '0',
  c char(120) NOT NULL DEFAULT '',
  pad char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (id),
  KEY k_1 (k)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Data format

id: 1
k: 3718516
c: 08566691963-88624912351-16662227201-46648573979-646226163-77505759394-75470094713-41097360717-15161106334-50535565977
pad: 63188288836-92351140030-06390587585-66802097351-49282961843

SQL style

Query:

SELECT c FROM sbtest64 WHERE id=4957216

SELECT c FROM sbtest43 WHERE id BETWEEN 4573346 AND 4573346+99

SELECT SUM(K) FROM sbtest57 WHERE id BETWEEN 5034894 AND 5034894

SELECT DISTINCT c FROM sbtest50 WHERE id BETWEEN 4959831 AND 4959831+99 ORDER BY c

Write:

INSERT INTO sbtest3 (id, k, c, pad) VALUES (4974042, 4963580, '33958272865-80411528812-36334179010-84793024318-25708692091-43736213170-37853797624-40480626242-32131452190-24509204411', '07716658989-39745043214-17284860193-80004426880-14154945098')

Update:

UPDATE sbtest11 SET k=k+1 WHERE id=5013989

UPDATE sbtest14 SET c='10695174948-02130015518-68664370682-70336600207-55943744221-72419172189-36252607855-75106351226-86920614936-86254476316' WHERE id=5299388

E) Test indicators

TPS

Transactions Per Second: the number of transactions the database executes per second, counted by successful commits.

QPS

Queries Per Second: the number of SQL statements the database executes per second (including insert, select, update, delete, and so on).
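The two indicators can be derived from raw counters as in the following Python sketch. The counter values are hypothetical and chosen only to make the arithmetic easy to follow; they are not measured results, and the function name throughput is an illustration rather than part of SysBench.

```python
def throughput(committed_transactions, sql_statements, elapsed_seconds):
    """Return (TPS, QPS): committed transactions and SQL statements per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    tps = committed_transactions / elapsed_seconds
    qps = sql_statements / elapsed_seconds
    return tps, qps


# Hypothetical counters for a 3600-second run (not measured results).
tps, qps = throughput(
    committed_transactions=3_600_000,
    sql_statements=72_000_000,
    elapsed_seconds=3600,
)
print(tps, qps)  # 1000.0 20000.0
```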

F) Test results

General-purpose type

Dedicated type

2) Suggestions for tuning MySQL instance parameters

For RDS for MySQL instances, you can modify some parameters in the console. For some important parameters, inappropriate values can lead to instance performance problems or application errors, so this section gives suggested values for important parameters to reduce uncertainty when setting them. The notes in parentheses after a parameter name (marked in red in the original article) are tuning suggestions for Internet of Vehicles scenarios, which are characterized by high concurrency, large data volumes, and heavy reads and writes.

back_log (increase this value in high-concurrency scenarios)

Default value: 3000

Restart required after modification: yes

Function: each time MySQL handles a connection request, it creates a new thread for it. While the main thread is creating that thread, if the front-end application sends a large number of short-lived connection requests, MySQL holds the new connections in a request queue whose length is controlled by back_log. If the number of waiting connections exceeds back_log, new connection requests are refused. To let MySQL handle a large number of short-lived connections, increase this parameter.

Symptom: if the value is too small, the application may see the following error.

SQLSTATE[HY000] [2002] Connection timed out

Recommendation: increase the value of this parameter.

innodb_autoinc_lock_mode (helps avoid deadlocks and improves performance)

Default value: 1

Restart required after modification: yes

Function: since MySQL 5.1.22, InnoDB has provided the parameter innodb_autoinc_lock_mode to control the locking mechanism for auto-increment primary keys and so address the table locking they used to cause. The parameter can be set to 0, 1, or 2. The default value in RDS is 1, which means InnoDB uses a lightweight mutex to obtain the auto-increment lock instead of the original table-level lock. However, bulk loads such as LOAD DATA (including INSERT ... SELECT and REPLACE ... SELECT) still use the table-level auto-increment lock, which may cause deadlocks when applications import data concurrently.

Symptom: in bulk-load scenarios such as LOAD DATA (including INSERT ... SELECT and REPLACE ... SELECT), where the table-level auto-increment lock is used, the following deadlock can occur when data is imported concurrently.

RECORD LOCKS space id xx page no xx n bits xx index PRIMARY of table xx.xx trx id xxx lock_mode X insert intention waiting. TABLE LOCK table xxx.xxx trx id xxxx lock mode AUTO-INC waiting

Recommendation: change this parameter to 2, which means a lightweight mutex is used for all inserts (only with the row binlog format). This avoids auto_inc deadlocks and greatly improves performance in INSERT ... SELECT scenarios.

Note: when the parameter value is 2, the binlog format must be set to row.

query_cache_size (Internet of Vehicles scenarios are both read-heavy and write-heavy, so disabling the query cache is recommended)

Default value: 3145728

Restart required after modification: no

Function: this parameter controls the memory size of the MySQL query cache. If the query cache is enabled, MySQL locks the query cache before executing each query and checks whether the query is already cached. If it is, the result is returned directly; if not, the query goes on to the storage engine. Insert, update, and delete operations invalidate the query cache, as do any changes to table structure or indexes. Maintaining cache invalidation is expensive and puts considerable pressure on MySQL. The query cache is therefore useful when the database is rarely updated, but if writes are very frequent and concentrated on a few tables, the query cache lock causes frequent lock conflicts: reads and writes on those tables wait for each other to release the lock, which reduces SELECT query efficiency.

Symptom: a large number of connections in the database are in the states checking query cache for query, Waiting for query cache lock, or storing result in query cache.

Recommendation: RDS disables the query cache by default. If the query cache is enabled on your instance, you can disable it when the situation above occurs.

net_write_timeout (avoids write failures caused by timeouts when sending vehicle data)

Default value: 60

Restart required after modification: no

Function: the timeout for waiting for a block to be sent to the client.

Symptom: if the value is too small, the client may report the error "the last packet successfully received from the server was milliseconds ago" or "the last packet sent successfully to the server was milliseconds ago".

Recommendation: this parameter defaults to 60 seconds in RDS. When the network is poor or the client takes a long time to process each block, a small net_write_timeout can easily cause the connection to be dropped. In that case, increase the value of this parameter.

tmp_table_size (a larger value is recommended to improve query performance)

Default value: 2097152

Restart required after modification: no

Function: this parameter determines the maximum size of an internal in-memory temporary table. The memory is allocated per thread, and the effective limit is the smaller of tmp_table_size and max_heap_table_size. If an in-memory temporary table exceeds the limit, MySQL automatically converts it to a disk-based MyISAM table. When optimizing queries, avoid temporary tables where possible; where they cannot be avoided, make sure they stay in memory.

Symptom: complex SQL statements that contain group by, distinct, and similar clauses that cannot be optimized with an index use temporary tables, which lengthens SQL execution time.

Recommendation: if the application runs many group by, distinct, and similar statements and the database has enough memory, increase tmp_table_size (and max_heap_table_size) to improve query performance.
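The rule that the in-memory limit is the smaller of the two settings can be sketched in Python as follows. The tmp_table_size value is the default quoted above; the max_heap_table_size value and the 4 MB table size are illustrative assumptions, and the function names are hypothetical.

```python
def effective_tmp_table_limit(tmp_table_size, max_heap_table_size):
    """In-memory temporary tables are capped by the smaller of the two settings."""
    return min(tmp_table_size, max_heap_table_size)


def spills_to_disk(table_bytes, tmp_table_size, max_heap_table_size):
    """True if a temporary table of this size would be converted to an on-disk table."""
    return table_bytes > effective_tmp_table_limit(tmp_table_size, max_heap_table_size)


TMP_TABLE_SIZE = 2097152        # default quoted in the text (2 MB)
MAX_HEAP_TABLE_SIZE = 16777216  # illustrative value, an assumption (16 MB)

print(effective_tmp_table_limit(TMP_TABLE_SIZE, MAX_HEAP_TABLE_SIZE))            # 2097152
print(spills_to_disk(4 * 1024 * 1024, TMP_TABLE_SIZE, MAX_HEAP_TABLE_SIZE))      # True
print(spills_to_disk(1 * 1024 * 1024, TMP_TABLE_SIZE, MAX_HEAP_TABLE_SIZE))      # False
```

Because the smaller value wins, raising only one of the two parameters may have no effect, which is why the recommendation mentions both.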

loose_rds_max_tmp_disk_space

Default value: 10737418240

Restart required after modification: no

Function: controls the size of the temporary files that MySQL can use.

Symptom: if temporary files exceed the value of loose_rds_max_tmp_disk_space, the application sees the following error.

The table '/home/mysql/dataxxx/tmp/#sql_2db3_1' is full

Recommendation: first, analyze whether the SQL statements that cause the temporary files to grow can be optimized with an index or by other means. Second, if the instance has enough free space, increase the value of this parameter so that the SQL can run normally.

loose_tokudb_buffer_pool_ratio

Default value: 0

Restart required after modification: yes

Function: controls the amount of buffer memory that the TokuDB engine can use. For example, if innodb_buffer_pool_size is set to 1000 MB and tokudb_buffer_pool_ratio is set to 50 (representing 50%), tables that use the TokuDB engine can use 500 MB of buffer memory.

Recommendation: if the TokuDB engine is used in RDS, increase this parameter to improve access performance for TokuDB tables.
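The example in the function description can be written out as a small Python calculation. It assumes the ratio is a percentage of innodb_buffer_pool_size, which is how the 1000 MB and 50% example reads; the function name is hypothetical.

```python
def tokudb_buffer_mb(innodb_buffer_pool_size_mb, tokudb_buffer_pool_ratio):
    """Buffer memory available to TokuDB tables, as a percentage of the InnoDB buffer pool."""
    if not 0 <= tokudb_buffer_pool_ratio <= 100:
        raise ValueError("ratio must be between 0 and 100")
    return innodb_buffer_pool_size_mb * tokudb_buffer_pool_ratio / 100


print(tokudb_buffer_mb(1000, 50))  # 500.0
print(tokudb_buffer_mb(1000, 0))   # 0.0, the default ratio
```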

loose_max_statement_time

Default value: 0

Restart required after modification: no

Function: controls the maximum execution time of a query in MySQL. A query that runs longer than this value fails automatically. The default is no limit.

Symptom: if the query time exceeds the value of this parameter, the following error occurs.

ERROR 3006 (HY000): Query execution was interrupted, max_statement_time exceeded

Recommendation: if you want to limit SQL execution time in the database, set this parameter. The unit is milliseconds.

loose_rds_threads_running_high_watermark (protects the database in high-concurrency scenarios)

Default value: 50000

Restart required after modification: no

Function: controls the number of concurrent queries in MySQL. For example, if rds_threads_running_high_watermark is set to 100, MySQL allows up to 100 queries to run concurrently, and queries beyond that limit are rejected. This parameter is used together with rds_threads_running_ctl_mode (the default is select).

Recommendation: this parameter is commonly used in flash-sale or other high-concurrency scenarios, where it protects the database well.
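The behaviour described for the high watermark can be pictured with a toy admission gate in Python. This is only a model of the idea (reject new queries once the number of running queries reaches the watermark); it is not how RDS implements the parameter, and the class and method names are hypothetical.

```python
class HighWatermarkGate:
    """Toy model: reject new queries once `watermark` queries are running."""

    def __init__(self, watermark):
        self.watermark = watermark
        self.running = 0

    def try_start(self):
        """Admit a query if there is room; return whether it was admitted."""
        if self.running >= self.watermark:
            return False
        self.running += 1
        return True

    def finish(self):
        """Mark one running query as finished."""
        if self.running > 0:
            self.running -= 1


gate = HighWatermarkGate(watermark=100)
accepted = sum(gate.try_start() for _ in range(120))
print(accepted)          # 100
gate.finish()            # one query completes and frees a slot
print(gate.try_start())  # True
```

In this model, 20 of the 120 simultaneous requests are turned away immediately instead of piling up inside the database, which is the protective effect the recommendation refers to.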

Parameter setting steps

Log in to the RDS console, find the RDS instance, click Manage, click Parameter Settings, set the parameters, click Submit Parameters, and then click OK.

4. Elasticsearch performance testing and tuning

Elasticsearch is a Lucene-based search and data analysis tool that provides distributed services. It is an open source product released under the Apache open source license and is currently a mainstream enterprise search engine.

Aliyun Elasticsearch provides Elasticsearch 5.5.3 and the commercial plug-in X-Pack, and targets scenarios such as data analysis and data search. On top of open source Elasticsearch, it provides enterprise-grade access control, security monitoring and alerting, automatic report generation, and other functions. Today we test the read and write performance of Aliyun Elasticsearch.

Write performance

Test environment

- The Elasticsearch version tested is 5.5.3, with 3 nodes. Three cluster specifications are tested (2-core, 4-core 16 GB, and 16-core), all with 1 TB SSD cloud disks.
- The stress test uses esrally over the REST API.
- Two stress-test data sets are used:
  - esrally official data set geonames: 3.3 GB, single document size 311 B
  - simulated business log data: 80 GB, single document size 1432 B

Test result

- The number of shards is 6.
- The CPU, load, and ioutil values below are approximate.
- The log figures were obtained after optimizing for batch log-write scenarios, for example with minor adjustments to asynchronous translog and the merge strategy.

Comparison of write performance metrics

Impact of machine specification on write performance

- The number of shards is 6.
- The number of replicas is 0 in all cases.
- Each disk is a 1 TB SSD cloud disk.

Impact of the number of replicas on performance metrics

- The number of shards is 6.
- All data samples are log data.
- Each disk is a 1 TB SSD cloud disk.

Query performance

Test environment

- The Elasticsearch version tested is 5.5.3, with 3 nodes. The 2-core 4 GB and 4-core clusters both use 1 TB SSD cloud disks.
- The stress test uses esrally.
- The stress-test data is the esrally official data set geonames: 3.3 GB, single document size 311 B.

Test result

- The number of shards is 6.
- The number of replicas is 0 in all cases.
- The query types are term and phrase.

Machine   QPS     CPU   Load   JVM memory   90th percentile service time
2 core    12378   89%   3.98   47%          17.6141 ms
4 core    23498   93%   4.63   48%          20.3789 ms
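To put the two query results side by side, the following Python sketch computes how QPS scales when the core count doubles. The QPS figures are the ones in the table; the scaling-efficiency metric is an added illustration, not something the original test reports.

```python
# cores -> QPS, copied from the query result table
results = {2: 12378, 4: 23498}


def scaling(results, small, large):
    """Return (speedup, efficiency) when moving from `small` to `large` cores."""
    speedup = results[large] / results[small]
    efficiency = speedup / (large / small)  # 1.0 would be perfectly linear scaling
    return speedup, efficiency


speedup, efficiency = scaling(results, 2, 4)
print(round(speedup, 3))     # 1.898
print(round(efficiency, 3))  # 0.949
```

Doubling the cores yields roughly 1.9 times the QPS, so query throughput scales close to linearly between these two specifications.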

5. Performance test and tuning of cloud database HBase

ApsaraDB for HBase is a stable, reliable, and scalable distributed NoSQL database that is compatible with the open source HBase protocol. Existing and upcoming features include security, public network access, HBase on OSS, backup and recovery, and hot and cold data separation. Storage media options include ultra cloud disks, SSD cloud disks, and local disks. Product forms include a standalone version and a distributed version.

1) ApsaraDB for HBase test

Today we conduct a performance test based on Aliyun ApsaraDB for HBase.

A) Test environment

- All tests were carried out in Zone B of China East 2 (Shanghai).
- The instance is a dedicated instance with 8 cores and 32 GB of memory.
- The network type is classic network.
- The image used for the stress test is CentOS 6.8 64-bit.
- Each value is 1 KB in size, and 100 threads are used for the stress test.

B) Test tools

The test uses the PerformanceEvaluation tool that ships with open source HBase. The tool provides a wealth of test scenarios, including random-key reads and writes, sequential-key reads and writes, and custom scan operations, and it is fully compatible with the ApsaraDB for HBase access API. It also reports the output data needed for the test in detail.

Installation method

No installation is required. The binaries are included in the HBase installation package and do not need to be installed manually. You can simply execute the command to operate.

C) Test methods

Prepare data

In the actual test, you need to load a certain amount of data into HBase to simulate a real production scenario. The command to write the data is:

sh hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --writeToWAL=true --table=test --rows=5000 randomWrite 100

This command starts 100 threads, each of which writes 5000 rows to the test table, and every write operation is recorded in the WAL.
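The amount of data this preparation step loads can be estimated with a quick Python calculation. It uses the 100 threads and 5000 rows per thread from the command and the 1 KB value size from the test environment, and counts value payload only; row keys, column names, timestamps, and storage overhead are ignored, so the real footprint is larger. The function name is hypothetical.

```python
def prepared_data(threads, rows_per_thread, value_size_bytes):
    """Return (total rows, value payload in MiB) written by the preparation step."""
    total_rows = threads * rows_per_thread
    payload_mib = total_rows * value_size_bytes / (1024 * 1024)
    return total_rows, payload_mib


total_rows, payload_mib = prepared_data(threads=100, rows_per_thread=5000,
                                        value_size_bytes=1024)
print(total_rows)             # 500000
print(round(payload_mib, 1))  # 488.3
```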

Pressure test performance

The stress test covers read, write, and scan operations.

Write operation:

sh hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --writeToWAL=true --table=test --rows=5000 randomWrite 100

This is the command for the write operation. You can change the number of entries written per batch by adjusting the write buffer setting in the configuration file. We tested batches of 1, 2, and 100 entries.

Read operation:

sh hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --table=test --rows=5000 randomRead 100

This is the command for the read operation. For reads, we test both the case where the memory cache hit rate is very high and the case where almost nothing is served from the memory cache. For the high-hit-rate case, enable the block cache and warm it up by reading several times while watching the hit rate on the monitoring page. For the miss case, simply disable the cache.

Scan operation:

sh hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --table=test --rows=5000 --caching=100 scanRange100 100

For scans, we likewise look at the case with a high hit rate and the case with almost no memory cache hits. The hit rate is determined in the same way as for reads.

D) Test model

Database table structure

hbase(main):001:0> describe 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_DATA => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

Data format

Row: 00000000000000000000000428

Column=info:0

Timestamp=1526367336121, value=NNNNNNNNLLLLLLLLCCCCCCCCJJJJJJJJGGGGGGGGZZZZZZZZPPP

PPPPPQQQQQQQQNNNNNNNNEEEEEEEEGGGGGGGGYYYYYYYYBBBBBBBBLLLLLLLLN NNNNNNNGGGGGGGGNNNNNNNNRRRRRRRREEEE EEEEQQQQQQQQXXXXXXXXTTTTTTTTJJJJJJJJYYYYYYYYXXXXXXXXMMMMMMMMJJJJJJJJMMMMMMMMFFFFFFFFHHHHHHHHDDDDD

DDDOOOOOOOOSSSSSSSSYYYYYYYYRRRRRRRRPPPPPPPPBBBBBBBBHHHHHHHHXXXXXXXXVVVVVVVVEEEEEEEEMMMMMMMMEEEEEE

EEMMMMMMMMDDDDDDDDLLLLLLLLKKKKKKKKSSSSSSSSWWWWWWWWVVVVVVVVXXXXXXXXVVVVVVVVJJJJJJJJAAAAAAAANNNNNNN

NDDDDDDDDLLLLLLLLAAAAAAAAOOOOOOOOKKKKKKKKFFFFFFFFHHHHHHHHIIIIIIIIZZZZZZZZQQQQQQQQAAAAAAAAKKKKKKKK

VVVVVVVVTTTTTTTTXXXXXXXXTTTTTTTTNNNNNNNNYYYYYYYYRRRRRRRRPPPPPPPPMMMMMMMMRRRRRRRRHHHHHHHHKKKKKKKKG

GGGGGGGBBBBBBBBMMMMMMMMBBBBBBBBTTTTTTTTFFFFFFFFSSSSSSSSQQQQQQQQCCCCCCCCSSSSSSSSVVVVVVVVBBBBBBBBFF

FFFFFFZZZZZZZZLLLLLLLLAAAAAAAAIIIIIIIIDDDDDDDDCCCCCCCCEEEEEEEESSSSSSSSOOOOOOOOFFFFFFFFKKKKKKKKHHH

HHHHHVVVVVVVVXXXXXXXXKKKKKKKKKKKKKKKKPPPPPPPPCCCCCCCCMMMMMMMMRRRRRRRREEEEEEEEGGGGGGGGGGGGGGGGPPPP

PPPPUUUUUUUUAAAAAAAAKKKKKKKKAAAAAAAAKKKKKKKKXXXXXXXXCCCCCCCCUU

E) Test indicators

RPS

Requests Per Second: the number of requests the database executes per second (including put, read, scan, and so on).

Single-request latency

The average latency of a single request.

F) Test results

2) Suggestions for tuning HBase client parameters

- AutoFlush: when you wrap the HBase client with HTable and send requests to HBase, HTable lets you configure AutoFlush. It defaults to true. Setting it to false means data is flushed only when HTable's local buffer is full, which can improve the write speed perceived by the client.

Restart of the client required after modification: yes

- Scan caching: when wrapping the HBase client, you can also set the scan caching parameter, which specifies how many rows each scan request fetches from the server and caches. The default value is 1. If individual values are large, do not set it too high.

- Scan attribute selection: when scanning, restrict the scan to the table attributes you need. For example, if you add only one column family, only the data under that column family is returned. Otherwise all column families are returned, including unnecessary ones. Restricting the scan reduces unnecessary traffic.

- WAL flag: for non-critical data, where performance matters most, some inconsistency is acceptable, and data safety requirements are low, you can set the write-to-WAL flag to false for put operations. The data is then written without first being recorded in the WAL, which gives good write performance.

- Bloom filter: enabling the bloom filter improves read performance to some extent, because it trades space for time on read operations.

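The effect of turning AutoFlush off can be pictured with a toy model of a client-side write buffer. This is a sketch of the buffering idea only, not the HBase client API: `BufferedWriter`, `count_round_trips`, the 1000-byte buffer limit, and the 100-byte put size are all hypothetical values chosen for illustration.

```python
class BufferedWriter:
    """Toy model of a client-side write buffer (hypothetical, not HBase code)."""

    def __init__(self, send, buffer_limit_bytes, auto_flush=True):
        self._send = send                  # callable that performs one round trip
        self._limit = buffer_limit_bytes
        self._auto_flush = auto_flush
        self._pending = []
        self._pending_bytes = 0

    def put(self, row: bytes, value: bytes):
        self._pending.append((row, value))
        self._pending_bytes += len(row) + len(value)
        # auto_flush=True sends every put immediately;
        # auto_flush=False waits until the local buffer is full.
        if self._auto_flush or self._pending_bytes >= self._limit:
            self.flush()

    def flush(self):
        if self._pending:
            self._send(list(self._pending))
            self._pending.clear()
            self._pending_bytes = 0


def count_round_trips(auto_flush, n_puts=100):
    calls = []
    writer = BufferedWriter(calls.append, buffer_limit_bytes=1000,
                            auto_flush=auto_flush)
    for i in range(n_puts):
        writer.put(b"row%05d" % i, b"x" * 92)   # 8 + 92 = 100 bytes per put
    writer.flush()                               # push out any remainder
    return len(calls)


print(count_round_trips(auto_flush=True))    # one round trip per put
print(count_round_trips(auto_flush=False))   # one round trip per full buffer
```

The trade-off is the same one the WAL flag makes: puts that are still sitting in the local buffer are lost if the client process dies before the next flush, so buffering suits data that can tolerate that.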
6. HiTSDB performance testing and tuning

HiTSDB Edge is a high-performance time-series database that runs close to the client.

6.1 Test model

The data model used in the test is built from randomly generated metrics. The metric and the tagKey/tagValue pairs that make up a timeline are each named as a fixed-length string (10 bytes) plus an index. Metric and tag names with real meaning are not used because they cannot cover every kind of real scenario. The purpose of the test is to compare HiTSDB horizontally with other time-series databases, so as long as the test data, software, and hardware are kept identical, the results remain highly credible. Each data point carries a timestamp (8-byte long integer) and a value (8-byte double). Sample data are as follows:
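As a rough illustration of this model, the following sketch generates data points of the described shape. It is not the generator used in the test: the function names, the dict layout (metric, timestamp, value, tags), the lowercase alphabet, and the choice of three tags per timeline are all assumptions.

```python
import random
import string
import struct


def fixed_name(rng, index, length=10):
    """Fixed-length random string (10 characters by default) plus an index."""
    prefix = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return f"{prefix}{index}"


def make_timeline(rng, index, n_tags=3):
    """One timeline = a metric plus a set of tagKey/tagValue pairs."""
    metric = fixed_name(rng, index)
    tags = {fixed_name(rng, i): fixed_name(rng, i) for i in range(n_tags)}
    return metric, tags


def make_point(metric, tags, timestamp, value):
    """A data point on a timeline (dict layout is an assumption)."""
    return {"metric": metric, "timestamp": timestamp, "value": value, "tags": tags}


def payload_size(point):
    """Bytes used by the timestamp (8-byte long) and value (8-byte double)."""
    return len(struct.pack(">qd", point["timestamp"], point["value"]))


rng = random.Random(42)                      # fixed seed for repeatable runs
metric, tags = make_timeline(rng, index=0)
point = make_point(metric, tags, timestamp=1526367336, value=rng.random())
print(point)
print(payload_size(point))                   # 16 bytes: 8 + 8
```

Seeding the generator keeps the data identical across databases, which is the condition the comparison relies on.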

6.2 Notes on the statistical results

The database service is restarted before each test so that caches left over from earlier runs do not affect the results.

Throughput (TPS) is the number of data point operations completed per unit time.

After each write, timelines are randomly selected by hand and queried as a coarse-grained check of write consistency. In addition, SDK callbacks are registered for HiTSDB writes, so every failure or success is observed.

To keep leftover test data from affecting the results, the data is cleaned up before each test.

The number of timelines is verified through the HiTSDB internal interface /api/tscount on every test run, and each increase is in line with expectations.

HiTSDB and other database single node comparison test

6.3 Test environment and procedure

All databases are compared on the same Aliyun ECS server. The detailed configuration of the server is as follows:

CPUs: 4

Memory (GiB): 8

Network Performance: High

Disk Capacity (GiB): 60

HiTSDB is tested with client version 1.4.9. InfluxDB runs with its default settings.

The test measures the write and read performance of each database with timeline counts of different orders of magnitude.

6.4 Write performance

A write request to a database can contain one or more data points. In general, the more data points a request contains, the better the write performance. In the following tests, P/R stands for Points/Request (the number of data points in one request).

6.4.1 HiTSDB test results

According to the results in the table below, HiTSDB's write TPS decreases to some extent as the number of timelines grows. At 1 million timelines, the limit specified for the 6xlarge specification, write TPS settles at about 50,000, which is close to the 60,000 points per second advertised on Aliyun's official website.

Across the three groups of tests, HiTSDB's write TPS trends downward as the number of timelines increases. At the 1-million-timeline scale, adding clients changes TPS very little: going from one client to two raises TPS somewhat, but going from two to four does not. The peak write TPS of HiTSDB with 1 million timelines on 6xlarge hardware can therefore be taken to be on the order of 150,000. It is called a peak because each test starts from a cleared database. As timelines accumulate in the environment, write TPS drops, and measurements put it at about 50,000 to 60,000 TPS.

6.5 Optimization recommendations

1) Write optimization

Write multiple data points per request, in batches.
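Batching can be sketched independently of any particular SDK. In the example below, `send_request` stands in for whatever call actually submits one write request, and the batch size of 500 points per request is an assumed value for illustration, not a recommendation from the test.

```python
from itertools import islice


def batched(points, points_per_request):
    """Yield lists of at most points_per_request items from any iterable."""
    if points_per_request < 1:
        raise ValueError("points_per_request must be at least 1")
    it = iter(points)
    while True:
        batch = list(islice(it, points_per_request))
        if not batch:
            return
        yield batch


def write_all(points, send_request, points_per_request=500):
    """Send points in batches; return the number of requests issued."""
    requests = 0
    for batch in batched(points, points_per_request):
        send_request(batch)   # hypothetical: one write request per batch
        requests += 1
    return requests


sent = []
n = write_all(range(1050), sent.append, points_per_request=500)
print(n, [len(b) for b in sent])   # 3 requests carrying 500, 500 and 50 points
```

Writing the same 1050 points one at a time would take 1050 requests, so the per-request overhead is paid far less often.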

2) Query optimization

If you need to query data over a long time range, it is recommended to split the range into several hour-sized windows and query them one by one.
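One way to split a long range is shown in the sketch below. `query_fn` is a hypothetical stand-in for the real query call, timestamps are in seconds, and windows are treated as half-open intervals [start, end). If the real query interface includes both endpoints, points that fall exactly on a window boundary would need de-duplication.

```python
def hourly_windows(start_s, end_s, window_s=3600):
    """Yield (lo, hi) pairs covering [start_s, end_s) in window_s-sized steps."""
    if end_s < start_s or window_s <= 0:
        raise ValueError("need start_s <= end_s and a positive window")
    cursor = start_s
    while cursor < end_s:
        upper = min(cursor + window_s, end_s)
        yield cursor, upper
        cursor = upper


def query_range(query_fn, start_s, end_s):
    """Run one sub-query per window and concatenate the results in order."""
    results = []
    for lo, hi in hourly_windows(start_s, end_s):
        results.extend(query_fn(lo, hi))
    return results


# Stand-in data source: one point every 30 minutes over 2.5 hours.
stored = list(range(0, 9000, 1800))
fake_query = lambda lo, hi: [t for t in stored if lo <= t < hi]

print(list(hourly_windows(0, 9000)))     # three windows, the last one shorter
print(query_range(fake_query, 0, 9000))  # same points as a single long query
```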
