In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
The source code makes one thing clear these days: the WEB server receives browser requests, sends requests to PHP/Python processes (FCGI, etc.), and interacts with databases using socket (sockets).
In other words, these behaviors are inter-process communication. When the hardware and operating system of a WEB server remain unchanged, its performance mainly depends on the speed of socket communication. If all processes are on the same server, the speed depends on the efficiency of communication.
For example, when interacting with a MySQL database, the program invokes the driver to access the database, which mainly does a few things:
1. Create a socket and connect to the MySQL.
two。 Translate the API called by the program into SQL statements, send it to MySQL;MySQL through socket for execution, and then send the result back; the driver passes the result (string) to the program and, if necessary, can automatically translate it into variable types that the program can recognize (such as integers).
3. Disconnect.
It can be seen that the speed of connection, the speed of translation and the timeliness of receiving response are the three most important aspects.
After understanding this, it is not difficult to find that these are only a small part of the database execution time; and PHP and Python are implemented in C, while JDBC is implemented in Java, so there is no need to worry about the performance of PHP and Python at all.
However, there are still differences in algorithm and implementation in translation, and the client can also cache some statements, so there will still be some performance differences.
In order to prove my idea, I specially went to test it.
First, list the test platform:
Quote
CPU:Intel Core2 Duo T9400 @ 2.53GHz
Memory: 3GB
Operating system: Windows XP Pro SP2
MySQL:5.1.36
Java:1.6.0_17-b04
JDBC:MySQL Connector/J 5.1.10
PHP:5.2.11 (cli)
MySQLi:5.2.11.11
Python:2.6.4
MySQL-Python:1.2.3c1
The libraries used are all up-to-date and use the most recommended libraries.
But the database did not use the latest stable version because I was too lazy to redo it. 5.4.3-beta beta has also tried, in continuous insertion, the performance is 1 / 2 orders of magnitude faster than 5.1, which is estimated to be the reason for server-side caching and setting.
Test project:
1. Create 1 million random numbers and generate SQL statements that insert those random numbers.
two。 Connect to the local database and, if unsuccessful, try to create the database.
3. Delete and create the database table with the engine type InnoDB, the primary key as an auto-incrementing integer, and a floating-point field (no index).
4. They were divided into 100 groups and 10, 000 random numbers were inserted each time. (because each group has a large amount of execution, auto-commit transactions are enabled. )
5. Use SELECT COUNT (*) to count random numbers less than 0.1. (about 100000)
6. Use SELECT * to take out and count random numbers greater than 0. 9. (about 100000)
7. Add all random numbers between 0.4 and 0.5 by 1. (about 100000)
8. Delete all lines between 0.5 and 0.6. (about 200000)
9. Disconnect the database.
10. Connect to the database again.
Test the code:
Java:
Java code
Import java.sql.Connection
Import java.sql.DriverManager
Import java.sql.SQLException
Import java.sql.Statement
Import java.sql.ResultSet
Import java.util.Random
Public final class Test {
Public static void main (String [] args) {
Final int SIZE1 = 10000
Final int SIZE2 = 100
Final String DB_ENGINE = "InnoDB"; / / InnoDB Memory MyISAM
Final double NANO_TIME_PER_SEC = 1000000000.0
System.out.printf ("Test data volume:% d\ n", SIZE1 * SIZE2)
System.out.printf ("Test engine:% s\ n", DB_ENGINE)
Long T1 = System.nanoTime (), T2, T3 = 0, T4, T5, T6, T7, T8, T9, T10, T11
Connection conn = null
Statement stmt = null
ResultSet rs = null
Random r = new Random ()
String [] sqls = new String [SIZE2]
For (int I = 0; I
< SIZE2; ++i){ StringBuilder buffer = new StringBuilder("INSERT INTO test (value) VALUES ("); for (int j = 0; j < SIZE1; ++j){ buffer.append(r.nextDouble()).append("),("); } sqls[i] = buffer.substring(0, buffer.length() -2); } t2 = System.nanoTime(); try { conn = DriverManager.getConnection("jdbc:mysql://localhost/testdb?user=root&password=123456"); t3 = System.nanoTime(); stmt = conn.createStatement(); } catch (SQLException e) { try { conn = DriverManager.getConnection("jdbc:mysql://localhost/?user=root&password=123456"); t3 = System.nanoTime(); stmt = conn.createStatement(); stmt.execute("CREATE DATABASE testdb"); } catch (SQLException ex) { System.out.println("SQLException: " + ex.getMessage()); System.out.println("SQLState: " + ex.getSQLState()); System.out.println("VendorError: " + ex.getErrorCode()); } } try { stmt.execute("DROP TABLE test"); } catch (SQLException e) { } try { stmt.execute("CREATE TABLE test (`id` INT AUTO_INCREMENT PRIMARY KEY, `value` REAL) ENGINE = " + DB_ENGINE); } catch (SQLException e) { } t4 = System.nanoTime(); try { for (String sql: sqls){ stmt.execute(sql); } t5 = System.nanoTime(); rs = stmt.executeQuery("SELECT COUNT(*) FROM test WHERE value < 0.1"); if (rs.next()) System.out.printf("共有%d个小于0.1的随机数\n", rs.getInt(1)); t6 = System.nanoTime(); rs = stmt.executeQuery("SELECT * FROM test WHERE value >0.9 ")
If (rs.last ())
System.out.printf ("% d random numbers greater than 0.9\ n", rs.getRow ())
T7 = System.nanoTime ()
Stmt.executeUpdate ("UPDATE test SET value = value + 0.1 WHERE value > 0.4 AND value
< 0.5"); t8 = System.nanoTime(); stmt.execute("DELETE FROM test WHERE value >0.5 AND value
< 0.6"); t9 = System.nanoTime(); stmt.close(); conn.close(); t10 = System.nanoTime(); conn = DriverManager.getConnection("jdbc:mysql://localhost/?user=root&password=123456"); t11 = System.nanoTime(); conn.close(); System.out.printf("创建随机数:%f\n", (t2 - t1) / NANO_TIME_PER_SEC); System.out.printf("初次连接数据库:%f\n", (t3 - t2) / NANO_TIME_PER_SEC); System.out.printf("再次连接数据库:%f\n", (t11 - t10) / NANO_TIME_PER_SEC); System.out.printf("初始化数据库和表:%f\n", (t4 - t3) / NANO_TIME_PER_SEC); System.out.printf("插入:%f\n", (t5 - t4) / NANO_TIME_PER_SEC); System.out.printf("选择(COUNT):%f\n", (t6 - t5) / NANO_TIME_PER_SEC); System.out.printf("选择:%f\n", (t7 - t6) / NANO_TIME_PER_SEC); System.out.printf("更新:%f\n", (t8 - t7) / NANO_TIME_PER_SEC); System.out.printf("删除:%f\n", (t9 - t8) / NANO_TIME_PER_SEC); System.out.printf("关闭连接:%f\n", (t10 - t9) / NANO_TIME_PER_SEC); System.out.printf("总时间:%f\n", (t10 - t1) / NANO_TIME_PER_SEC); } catch (SQLException ex) { System.out.println("SQLException: " + ex.getMessage()); System.out.println("SQLState: " + ex.getSQLState()); System.out.println("VendorError: " + ex.getErrorCode()); } } } PHP: Php代码 Python: Python代码 # -*- coding: gbk -*- import MySQLdb from random import random from time import clock SIZE1 = 10000 SIZE2 = 100 DB_ENGINE = 'InnoDB' # InnoDB Memory MyISAM print '测试数据量:', SIZE1 * SIZE2 print '测试引擎:', DB_ENGINE t1 = clock() sqls = ['INSERT INTO test (value) VALUES (%s)' % '),('.join([`random()` for i in xrange(SIZE1)]) for j in xrange(SIZE2)] t2 = clock() try: con = MySQLdb.connect(user='root', passwd='123456', db='testdb') t3 = clock() cu = con.cursor() except: con = MySQLdb.connect(user='root', passwd='123456') t3 = clock() cu = con.cursor() cu.execute('CREATE DATABASE testdb') con.select_db('testdb') con.autocommit(True) try: cu.execute('DROP TABLE test') except: pass cu.execute('''''CREATE TABLE test ( `id` INT AUTO_INCREMENT PRIMARY KEY, `value` REAL) ENGINE = %s''' % DB_ENGINE) t4 = clock() for sql in sqls: cu.execute(sql) t5 = clock() cu.execute('SELECT COUNT(*) FROM test WHERE value < 0.1') print '共有%d个小于0.1的随机数' % cu.fetchone()[0] t6 = clock() cu.execute('SELECT * FROM test WHERE value >0.9')
Print 'has% d random numbers greater than 0. 9% len (cu.fetchall ())
T7 = clock ()
Cu.execute ('UPDATE test SET value = value + 0.1 WHERE value > 0.4 AND value
< 0.5') t8 = clock() cu.execute('DELETE FROM test WHERE value >0.5 AND value
< 0.6') t9 = clock() cu.close() con.close() t10 = clock() con = MySQLdb.connect(user='root', passwd='123456', db='testdb') t11 = clock() con.close() print '创建随机数:', t2 - t1 print '初次连接数据库:', t3 - t2 print '再次连接数据库:', t11 - t10 print '初始化数据库:', t4 - t3 print '插入:', t5 - t4 print '选择(COUNT)', t6 - t5 print '选择:', t7 - t6 print '更新:', t8 - t7 print '删除:', t9 - t8 print '关闭连接:', t10 - t9 print '总时间:', t10 - t1 MySQL-Python还有个底层的模块,一并测试下: Python代码 # -*- coding: gbk -*- import _mysql from MySQLdb.converters import conversions from random import random from time import clock SIZE1 = 10000 SIZE2 = 100 DB_ENGINE = 'InnoDB' # InnoDB Memory MyISAM print '测试数据量:', SIZE1 * SIZE2 print '测试引擎:', DB_ENGINE t1 = clock() sqls = ['INSERT INTO test (value) VALUES (%s)' % '),('.join([`random()` for i in xrange(SIZE1)]) for j in xrange(SIZE2)] t2 = clock() try: con = _mysql.connect(user='root', passwd='123456', db='testdb', conv=conversions) t3 = clock() except: con = _mysql.connect(user='root', passwd='123456', conv=conversions) t3 = clock() con.query('CREATE DATABASE testdb') con.select_db('testdb') con.autocommit(True) try: con.query('DROP TABLE test') except: pass con.query('''''CREATE TABLE test ( `id` INT AUTO_INCREMENT PRIMARY KEY, `value` REAL) ENGINE = %s''' % DB_ENGINE) t4 = clock() for sql in sqls: con.query(sql) t5 = clock() con.query('SELECT COUNT(*) FROM test WHERE value < 0.1') print '共有%d个小于0.1的随机数' % con.store_result().fetch_row()[0] t6 = clock() con.query('SELECT * FROM test WHERE value >0.9')
Print 'has% d random numbers greater than 0.9'% con.store_result () .num_rows ()
T7 = clock ()
Con.query ('UPDATE test SET value = value + 0.1 WHERE value > 0.4 AND value
< 0.5') t8 = clock() con.query('DELETE FROM test WHERE value >0.5 AND value < 0.6')
T9 = clock ()
Con.close ()
T10 = clock ()
Con = _ mysql.connect (user='root', passwd='123456', db='testdb', conv=conversions)
T11 = clock ()
Con.close ()
Print 'create a random number:', T2-T1
Print 'first connection to the database:', T3-T2
Print 'connect to the database again:', T11-T10
Print 'initialize database:', T4-T3
Print 'insert:', T5-T4
Print 'Select (COUNT)', T6-T5
Print 'Select:', T7-T6
Print 'update:', T8-T7
Print 'delete:', T9-T8
Print 'close connection:', T10-T9
Print 'total time:', T10-T1
Each test 3 times (wait for the hard disk light to flicker before the next test), take the best one as the test result:
Java:
Quote
Amount of test data: 1000000
Test engine: InnoDB
There are 99465 random numbers less than 0.1.
There are 99859 random numbers greater than 0.90.
Create a random number: 2.367840
First connection to the database: 0.220420
Connect to the database again: 0.013174
Initialize databases and tables: 0.075140
Insert: 12.139346
Select (COUNT): 1.130345
Choice: 1.017769
Update: 6.173245
Delete: 9.380070
Close connection: 0.002131
Total time: 32.506307
PHP:
Quote
Amount of test data: 1000000
Test engine: InnoDB
There are 99898 random numbers less than 0.1.
There are 100152 random numbers greater than 0.90.
Create a random number: 1.506294
First connection to the database: 0.003146
Connect to the database again: 0.001808
Initialize databases and tables: 0.131754
Insert: 12.046944
Select (COUNT): 1.236742
Choice: 1.238153
Update: 6.115232
Delete: 8.145547
Close connection: 0.000125
Total time: 30.423937
Python (MySQLdb):
Quote
Amount of test data: 1000000
Test engine: InnoDB
There are 100040 random numbers less than 0.1.
There are 100351 random numbers greater than 0.90.
Create a random number: 1.6822107279
First connection to the database: 0.0332120423126
Connect to the database again: 0.00221704155137
Initialize database: 0.131054924578
Insert: 11.7999030603
Select (COUNT) 1.27067266929
Selection: 1.16714526567
Updated: 6.29200638629
Delete: 8.13660563005
Close the connection: 0.000131022238861
Total time: 30.5129417286
Python (_ mysql):
Quote
Amount of test data: 1000000
Test engine: InnoDB
There are 99745 random numbers less than 0.1.
There are 99869 random numbers greater than 0.90.
Create a random number: 1.68099074044
First connection to the database: 0.0112056141213
Connect to the database again: 0.00159293988482
Initialization database: 0.130169616529
Insert: 12.1364623157
Select (COUNT) 1.125517908
Selection: 0.968366649951
Update: 6.8042843434
Delete: 8.9760508668
Close connection: 9.61015995031e-05
Total time: 31.8331441566
As you can see, in mass data testing, Java is the slowest, while PHP is the fastest.
Regardless of fluctuations in IO performance, Java is mainly slow in connecting and shutting down databases. JDBC 4.0 dynamically loads drivers when connecting to the database for the first time, which is very time-consuming, so when using Java, remember to use database connection pooling to avoid wasting a lot of time on connections. Of course, this also creates a burden on the database, which is bound to affect the memory footprint. The algorithms for creating random numbers are different, so they are not comparable; what surprises me is the translation speed of SELECT, which converts strings into floating-point numbers more slowly than Python, whose floating-point numbers are objects.
PHP connects to the database very quickly, so there is no need to use connection pooling at all, because maintaining connection pooling adds complexity.
Python performs similar to PHP, but it is slower to connect to the database for the first time (still an order of magnitude faster than Java). If you do not use connection pooling, it is more appropriate to use methods such as FCGI to run. The communication of the _ mysql module is fast, but the update and delete operations are not ideal, which may be the reason for the fluctuations in IO performance. In addition, I used conversion parameters when connecting to the database. In fact, the statements I used didn't need to be translated, and it would be faster if I didn't use them.
Then try the small data, change it to the most commonly used MyISAM engine, and insert 100 entries (1 group). It is impossible for general applications to insert so many at a time, so it is enough to meet the usual applications; and because the amount of data is very small, it is basically not affected by IO.
Test results:
Java:
Quote
Amount of test data: 100
Test engine: MyISAM
There are 9 random numbers less than 0.1
There are 10 random numbers greater than 0.9
Create a random number: 0.001596
First connection to the database: 0.224135
Connect to the database again: 0.018656
Initialize databases and tables: 0.055601
Insert: 0.001476
Select (COUNT): 0.000529
Choice: 0.000433
Update: 0.000304
Delete: 0.000313
Close connection: 0.000927
Total time: 0.285314
PHP:
Quote
Amount of test data: 100
Test engine: MyISAM
There are 12 random numbers less than 0.1
There are 9 random numbers greater than 0.9
Create a random number: 0.000649
First connection to the database: 0.008077
Connect to the database again: 0.001609
Initialize databases and tables: 0.060421
Insert: 0.001860
Select (COUNT): 0.000580
Choice: 0.000465
Update: 0.000326
Delete: 0.000373
Close connection: 0.000127
Total time: 0.072878
Python (MySQLdb):
Quote
Amount of test data: 100
Test engine: MyISAM
There are 14 random numbers less than 0.1
There are 9 random numbers greater than 0.9
Create a random number: 0.000198907961766
First connection to the database: 0.0334640296462
Connect to the database again: 0.0015057779899
Initialize database: 0.0123194428342
Insert: 0.00125211444471
Select (COUNT) 0.000581079438867
Selection: 0.000484139744018
Updated: 0.000250311142897
Delete: 0.000262323842835
Close connection: 7.98984228442e-05
Total time: 0.0488922474784
Python (_ mysql):
Quote
Amount of test data: 100
Test engine: MyISAM
There are 12 random numbers less than 0.1
There are 10 random numbers greater than 0.9
Create a random number: 0.000214273043082
First connection to the database: 0.0118774872225
Connect to the database again: 0.00123702872851
Initialize database: 0.0315031659052
Insert: 0.00120322554962
Select (COUNT) 0.000596165155069
Selection: 0.000507327048549
Update: 0.0002447238406
Delete: 0.00026148574749
Close connection: 5.78285787719e-05
Total time: 0.0464656820909
As can be seen from the results, although the gap is very small, Python still has a slight advantage. However, Java's SELECT operation is slightly better, which is the most commonly used operation in practical applications.
From a language point of view, Python is undoubtedly the happiest, generating random numbers only takes one line of code; the variable of PHP has to write a $so that I always make mistakes, but database operations do not need to handle exceptions, which saves a lot of code Java has a lot of code and has to use a lot of try...catch, and I don't even bother to put close () in the finally block in a safe way, because it also throws exceptions that I don't bother to care about, and prompts me that the timing variable may not be initialized.
Overall, it is a bit unwise for Google to abandon Python in favor of C++ and Java. Because the page response time mainly lies in the database communication and the disk file IO, the influence of language is basically ignored.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.