2025-01-18 Update. From: SLTechnology News&Howtos (shulou)
Shulou(Shulou.com)06/02 Report--
This article explains the advantages of HTTPS, concisely and in plain terms. I hope the detailed walkthrough gives you something useful.
While developing our HTTPS project, it became obvious that the domestic Internet does not attach much importance to HTTPS, which is to say it does not attach much importance to user privacy and network security. Starting from the protection of user privacy, this article briefly describes the current state of privacy leakage and traffic hijacking, then explains why HTTPS can protect users and what to watch for when deploying it.
1. There is a great risk of user privacy disclosure.
People's lives are increasingly inseparable from the Internet: for socializing, shopping, and search alike, it brings great convenience. At the same time, users leave more and more information online, and one problem grows more serious with it: privacy and security.
Almost every Internet company faces the risk of user privacy leakage and traffic hijacking. BAT (Baidu, Alibaba, Tencent), being the largest targets, has the most serious problems here. For example, after a user searches Baidu for the keyword "abortion", a hospital soon calls to advertise abortion surgery, and the unwitting user assumes that Baidu sold the phone number and search history. Similarly, keywords searched on Taobao can easily be intercepted by third parties, who then harass users privately by phone or other forms of advertising. As for QQ and WeChat, users obviously do not want their chat content to be easily read by others. Why is it impossible for BAT to sell users' private information to third parties? Because protecting user privacy is the foundation of any Internet company that wants to last. If users find that a company's products leak their privacy badly, they will stop trusting those products, and the company will eventually fall into crisis as users leave en masse. So no large Internet company can afford to sell user privacy, or even neglect it, for short-term gain.
So, since Internet companies know how important user privacy is, is it well protected? The reality is not entirely satisfactory. Most current web applications and websites are based on HTTP, and no large domestic Internet company has adopted site-wide HTTPS to protect user privacy (apart from payment and login pages and the PC version of WeChat). HTTP is simple, convenient, and easy to deploy, but security was not considered in its original design: all content is transmitted in clear text, which is the root of today's security problems. Content that users transmit over HTTP-based web applications can easily be viewed and modified by intermediaries.
For example, if you search for a keyword "https" on Baidu, the middleman can easily know all the contents of the request through tools such as tcpdump or wireshark. The screenshot of wireshark is as follows:
The so-called middleman here means any network node that the content passes through, hardware or software: intermediate proxy servers, routers, community WiFi hotspots, a company's unified gateway exit, and so on. Those best placed to obtain user content are the telecom operators and second-tier bandwidth providers. The nodes most likely to be tampered with by third-party hackers are those relatively close to the user.
Why would a middleman view or modify the content a user actually requested? Simple: for profit. Several common and harmful forms of content hijacking are:
Obtaining wireless users' mobile phone numbers and search content, then harassing them privately with phone advertisements. Why can they get the phone number? Ha ha, because they cooperate with the operator.
Obtaining the user's account cookie and stealing useful information from the account.
Adding third-party content, such as advertisements, phishing links, or Trojans, to the content returned by the user's destination site.
To sum up, due to the plaintext transmission of HTTP and the huge benefits of intermediate content hijacking, the risk of user privacy disclosure is very high.
2. HTTPS can effectively protect the privacy of users.
HTTPS is HTTP plus TLS (SSL). The HTTPS protocol has three main goals:
Data confidentiality. Ensure that the content cannot be seen by a third party in transit. It is like a courier delivering a sealed parcel: nobody else knows what is inside.
Data integrity. Promptly detect content that a third party has tampered with in transit. For example, although the courier does not know what is in the parcel, he could swap it for another one along the way. Data integrity means that if the parcel is swapped, we can easily detect it and refuse delivery.
Identity verification. Ensure that the data reaches the intended destination. Just as when we mail a parcel: even if it is sealed and has not been swapped, we must still make sure it is not delivered to the wrong place.
Put simply, the three goals are sealing by encryption, preventing tampering and swapping, and preventing impersonation. So how does TLS achieve them? Each is described briefly below.
2.1 data confidentiality
2.1.1 asymmetric encryption and key exchange
Data confidentiality is achieved mainly through encryption. Encryption algorithms generally fall into two types: asymmetric encryption (also called public-key encryption) and symmetric encryption (also called secret-key encryption). Asymmetric encryption means that encryption and decryption use different keys, as shown below:
HTTPS uses asymmetric cryptography for two main purposes: key negotiation and digital signatures. Key negotiation simply means that the two parties, each using its own information, compute the key that will be used for symmetric encryption and decryption of the transmitted content.
In general, public-key encryption works like this: the server holds the private key and the client holds the public key. The public key encrypts and the private key decrypts. The public key can be given to anyone, but the private key is held only by the server, so public-key encryption is very secure. This security, of course, depends on the key being long enough. At present the minimum secure length is 2048 bits, and the major CAs no longer accept certificate applications with keys shorter than 2048 bits, because keys of 1024 bits or fewer are no longer considered secure against attackers with very large computing resources. The price is that computational performance falls off steeply as the key gets longer.
In that case, why is symmetric encryption needed at all? Why not use asymmetric algorithms for the whole encryption and decryption process? There are two main reasons:
Asymmetric encryption and decryption are very expensive. In a complete TLS handshake, the asymmetric decryption performed during key exchange accounts for 95% of the computation of the whole handshake, while symmetric encryption costs only about 0.1% as much as asymmetric encryption. If application-layer data also used asymmetric encryption, the performance overhead would be unbearable.
Asymmetric algorithms limit the length of the content that can be encrypted: it cannot exceed the length of the public key. For example, with the commonly used 2048-bit key, the content to be encrypted cannot exceed 256 bytes (somewhat less once padding is included).
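As a rough check on the size limit above, the sketch below computes the largest message a single RSA operation can carry for a given modulus size. The padding overheads (11 bytes for PKCS#1 v1.5, 2 x hash length + 2 bytes for OAEP) come from the PKCS#1 specification rather than from this article, and the function name is purely illustrative.

```python
def max_rsa_plaintext_bytes(modulus_bits: int, padding: str = "none", hash_len: int = 32) -> int:
    """Upper bound on the bytes one RSA encryption can carry (illustrative helper)."""
    k = modulus_bits // 8          # modulus size in bytes
    if padding == "none":          # raw RSA: bounded by the modulus size
        return k
    if padding == "pkcs1v15":      # PKCS#1 v1.5 reserves at least 11 bytes
        return k - 11
    if padding == "oaep":          # OAEP reserves 2 * hash_len + 2 bytes
        return k - 2 * hash_len - 2
    raise ValueError(f"unknown padding: {padding}")


if __name__ == "__main__":
    for mode in ("none", "pkcs1v15", "oaep"):
        print(mode, max_rsa_plaintext_bytes(2048, mode))
```

Whatever the padding, the limit is a few hundred bytes at most, which is why the asymmetric step only protects a short session secret and the bulk data goes through a symmetric cipher.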
The most commonly used asymmetric algorithm today is RSA. I want to stress that RSA is the most important algorithm in the whole PKI system and in the field of encryption and decryption. If you want a deep understanding of every aspect of HTTPS, RSA is required knowledge. Its principle rests mainly on three points:
The one-way nature of multiplication. Given two factors it is easy to compute their product, but given only a product it is hard to determine which two factors produced it.
Euler's totient function. φ(n) is the number of positive integers less than or equal to n that are coprime with n.
Fermat's little theorem. If a is an integer and p is a prime, then a^p - a is a multiple of p.
RSA was the first algorithm that can be used for both key exchange and digital signatures. Another very important key negotiation algorithm is Diffie-Hellman (DH). DH can complete key negotiation without the two communicating parties knowing anything about each other in advance. It uses the multiplicative group of integers modulo a prime p together with a primitive root g, and its security is based on the discrete logarithm problem.
Currently, OpenSSL supports only the following key exchange algorithms: RSA, DH, ECDH, DHE, and ECDHE. The performance of each and its effect on speed are covered in later sections. For reasons of space, the implementations are not described in detail.
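To make the three ingredients concrete, here is a toy RSA sketch with deliberately tiny primes. The primes, exponent, and function names are assumptions chosen for illustration. Real keys use primes hundreds of digits long plus a padding scheme, so this is not how a TLS library implements RSA.

```python
from math import gcd

# Toy parameters only: real RSA primes are 1024 bits or more each.
p, q = 61, 53
n = p * q                  # public modulus (3233); hard to factor when p and q are large
phi = (p - 1) * (q - 1)    # Euler's totient of n for two distinct primes (3120)
e = 17                     # public exponent, must be coprime with phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)        # private exponent: inverse of e modulo phi (Python 3.8+)


def totient(m: int) -> int:
    """Brute-force Euler totient, straight from the definition."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def fermat_holds(a: int, prime: int) -> bool:
    """Fermat's little theorem: a^p - a is a multiple of p."""
    return (pow(a, prime) - a) % prime == 0


def encrypt(m: int) -> int:
    return pow(m, e, n)    # anyone holding (n, e) can encrypt


def decrypt(c: int) -> int:
    return pow(c, d, n)    # only the holder of d can decrypt


if __name__ == "__main__":
    message = 65
    print(decrypt(encrypt(message)) == message, totient(n) == phi, fermat_holds(5, 13))
```

The round trip works because e * d is congruent to 1 modulo φ(n), and recovering d from the public values requires factoring n.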
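The Diffie-Hellman exchange described above can be sketched in a few lines. The prime 23 and primitive root 5 are textbook toy values, not parameters from this article, and the function names are illustrative. Real deployments use groups of 2048 bits or more, or elliptic curves.

```python
import secrets

# Toy public parameters: prime P and primitive root G.
P, G = 23, 5


def dh_keypair():
    """Pick a secret exponent and derive the public value sent to the peer."""
    private = secrets.randbelow(P - 2) + 1   # secret exponent in [1, P-2]
    public = pow(G, private, P)
    return private, public


def dh_shared(own_private: int, peer_public: int) -> int:
    """Combine our secret exponent with the peer's public value."""
    return pow(peer_public, own_private, P)


if __name__ == "__main__":
    a_priv, a_pub = dh_keypair()
    b_priv, b_pub = dh_keypair()
    # Only a_pub and b_pub cross the network; both sides derive the same secret.
    print(dh_shared(a_priv, b_pub) == dh_shared(b_priv, a_pub))
```

An eavesdropper sees only the two public values and would need to solve a discrete logarithm to recover either secret exponent.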
2.1.2 symmetric encryption
Symmetric encryption means that both encryption and decryption use the same key. As shown below:
Once key negotiation with the asymmetric algorithm has finished, the symmetric key for the session is available. Symmetric encryption comes in two forms: stream ciphers and block ciphers. The stream cipher in common use has been RC4, but RC4 is no longer secure, and Microsoft recommends that websites avoid it. Alipay may not have noticed this, or may have other reasons, but it is still using RC4 and TLS 1.0.
A newer stream cipher that replaces RC4 is ChaCha20, a faster and more secure algorithm promoted by Google. It has been adopted in Android and Chrome and is built into BoringSSL, Google's open-source fork of OpenSSL. nginx 1.7.4 also supports building against BoringSSL. I have not benchmarked this algorithm myself, but some data show that its performance cost is relatively small, especially on mobile devices.
The commonly used block cipher mode is AES-CBC, but CBC has been shown to be vulnerable to the BEAST and LUCKY13 attacks. The currently recommended block cipher mode is AES-GCM. Its drawback is heavy computation, with high performance and power costs, so it is not well suited to phones and tablets. Even so, it remains our first choice.
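To illustrate what "the same key encrypts and decrypts" means for a stream cipher, here is a deliberately simplified sketch. The keystream construction (SHA-256 over key plus counter) is an assumption made for illustration only. It is not RC4, ChaCha20, or AES, and it must not be used to protect real data.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key || counter. Illustrative, not secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(b ^ k for b, k in zip(data, keystream(key, len(data))))


if __name__ == "__main__":
    session_key = b"negotiated-session-key"   # hypothetical key from the handshake
    ciphertext = xor_cipher(session_key, b"GET /s?wd=https HTTP/1.1")
    print(xor_cipher(session_key, ciphertext))
```

Both directions are a single cheap pass over the data, which is why symmetric encryption is so much less costly than the asymmetric step of the handshake.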
2.2 data integrity
This part is relatively simple. OpenSSL currently uses two families of integrity-checking algorithms: MD5 and SHA. Because MD5 collisions are likely in practice, avoid using MD5 to verify content consistency. Within SHA, SHA-0 and SHA-1 should not be used either: in 2005 Professor Wang Xiaoyun of Shandong University in China announced a break of the full SHA-1 algorithm. SHA-2 is recommended, that is, an output digest of 224 bits or more.
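A minimal sketch of an integrity check using Python's standard `hashlib` and `hmac` modules follows. It compares digest lengths and shows a keyed digest (HMAC-SHA256) detecting a modified message. The key and messages are made-up values, and the helper names are illustrative, not part of any TLS library.

```python
import hashlib
import hmac


def digest_bits(name: str) -> int:
    """Output length, in bits, of a named hash function."""
    return hashlib.new(name).digest_size * 8


def make_tag(key: bytes, message: bytes) -> bytes:
    """Keyed digest of the message (HMAC-SHA256)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_intact(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), received_tag)


if __name__ == "__main__":
    for name in ("sha1", "sha224", "sha256"):
        print(name, digest_bits(name))
    key = b"hypothetical-mac-key"
    tag = make_tag(key, b"amount=100")
    print(is_intact(key, b"amount=100", tag), is_intact(key, b"amount=900", tag))
```

Because the tag depends on a secret key, a middleman who alters the message cannot produce a matching tag without that key.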
2.3 Authentication and authorization
The main topics here are PKI and digital certificates. A digital certificate serves two purposes:
Authentication. Ensure that the site visited by the client is a trusted site that has been verified by CA.
Distribute the public key. Each digital certificate contains the public key generated by the registrant. During the SSL handshake, the certificate message is transmitted to the client.
Here is a brief introduction to how digital certificates verify the identity of a website.
The certificate applicant first generates a key pair, consisting of a public key and a private key, and then sends the public key, domain name, and CU to the RA in CSR format. The RA engages an independent third party and a legal team to confirm the applicant's identity, then forwards the CSR to the CA, which creates a certificate in X.509 format.
So the applicant receives the certificate issued by the CA and deploys it on the website's servers. When a browser visits and receives the certificate, how does it confirm that the certificate was signed by the CA? How is a third party prevented from forging it?
The answer is the digital signature. A digital signature can be thought of as the anti-counterfeiting label on a certificate. The most widely used scheme today is SHA-RSA. Making and verifying such a signature works as follows:
Issuing the signature. First, a hash function is applied to the certificate data to produce a message digest. The digest is then encrypted with the CA's own private key, and the result is the signature.
Verifying the signature. Decrypt the signature with the CA's public key to recover the digest, hash the certificate content with the same hash function, and compare the two digests. If they match, verification succeeds.
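The issue-and-verify steps above can be sketched with a toy "CA" key. The two Mersenne primes, the exponent 65537, and the function names are assumptions chosen so the example runs with the standard library alone. Real CAs use keys of 2048 bits or more and a padding scheme such as PKCS#1 v1.5 or PSS, which this sketch omits.

```python
import hashlib

# Toy CA key pair built from two known Mersenne primes (illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                               # CA public modulus
E = 65537                               # CA public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # CA private exponent (Python 3.8+)


def digest_as_int(data: bytes) -> int:
    # Reduced modulo N only because the toy modulus is shorter than a SHA-256 digest.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(cert_body: bytes) -> int:
    """CA side: hash the certificate data, then apply the private key."""
    return pow(digest_as_int(cert_body), D, N)


def verify(cert_body: bytes, signature: int) -> bool:
    """Browser side: apply the CA public key, then compare with a fresh hash."""
    return pow(signature, E, N) == digest_as_int(cert_body)


if __name__ == "__main__":
    body = b"CN=www.example.com, public key=..."   # hypothetical certificate content
    sig = sign(body)
    print(verify(body, sig), verify(b"CN=evil.example.com, public key=...", sig))
```

Changing any byte of the certificate changes its hash, so a signature copied from a genuine certificate fails verification on a forged one.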
The graphic representation is as follows:
Here are a few points to explain:
The key pair used to sign and verify the digital signature is the CA's own public and private key. It has nothing to do with the public key submitted by the certificate applicant.
Signing is the reverse of public-key encryption: the private key encrypts and the public key decrypts.
All the large CAs now use certificate chains. The first advantage of a chain is security: the root CA's private key can be kept offline. The second is ease of deployment and revocation: if a certificate has a problem, only the certificate at that level needs to be revoked, and the root certificate remains safe.
The root CA certificate is self-signed, that is, it is signed and verified with its own private and public key. Every other certificate on the chain is signed and verified with the key pair of the certificate one level above it.
How are the key pairs of the root CA and the intermediate CAs obtained, and can they be trusted? They can, because these vendors cooperate with browser and operating system makers, and their public keys are installed in the browser or operating system by default. For example, Firefox maintains its own list of trusted CAs, while Chrome and IE use the operating system's list.
In fact, digital certificates are not expensive. Small and medium-sized websites can use cheap or even free certificate services (which may carry security risks), while certificates from the well-known VeriSign generally cost from several thousand to tens of thousands of yuan a year. Of course, a company with a large demand for certificates and heavy customization needs can set up its own CA, as Google has, and issue its own certificates at will.
3. The effect of HTTPS on speed and performance
Since HTTPS is so secure and certificates are not expensive, why don't all Internet companies use HTTPS? There are two main reasons:
HTTPS has a very noticeable effect on speed. Each HTTPS connection generally adds 1 to 3 RTTs, and with the cost of encryption and decryption on top, latency may rise by a further few dozen milliseconds.
HTTPS consumes a great deal of CPU. Under full handshakes, the processing capacity of a web server drops to 10% of its HTTP capacity or even less.
Let's make a brief analysis of these two points.
3.1 impact of HTTPS on access speed
I use a picture to show the possible increase in latency for a user to visit a website using HTTPS:
The increased latency in HTTPS is mainly reflected in three phases, including phases 2 and 3 shown in the figure above.
The 302 redirect. Why is a 302 needed? Because users are lazy. I think the vast majority of netizens type www.baidu.com or baidu.com when visiting Baidu. Very few type http://www.baidu.com, right? Even fewer type https://www.baidu.com to reach Baidu's HTTPS service directly. So to force users onto HTTPS, the HTTP request a user sends to www.baidu.com has to be 302-redirected to https://www.baidu.com. That undoubtedly adds one RTT of redirect delay.
The SSL full handshake, phase 3 in the figure above, has an even more noticeable effect on latency. The effect is not only the RTTs of network transmission but also the verification of digital signatures. Because clients, mobile devices especially, have weak computing performance, an added computation delay of tens of milliseconds is common.
One more delay is not drawn in the figure: the certificate status check. Newer browsers use OCSP to check whether a certificate has been revoked. After receiving the server's certificate, they contact the OCSP site to obtain its status. If the OCSP site is overseas, or the OCSP server fails, normal user access is obviously slowed. Fortunately the OCSP check usually happens only once every 7 days, so the effect on speed is infrequent. In addition, Chrome turns off OCSP and CRL checks by default, while the latest version of Firefox enables them: if OCSP returns an error, the user cannot open the site.
In actual tests, without any optimization, HTTPS adds more than 200 ms of latency.
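A back-of-the-envelope model shows how the extra round trips add up. All inputs below (a 60 ms round-trip time, 30 ms of client-side computation) are hypothetical numbers picked for illustration, not measurements from this article. The round-trip counts assume a TLS 1.2-style handshake: two round trips for a full handshake and one for an abbreviated handshake.

```python
# All figures are hypothetical inputs for illustration, not measurements.
REDIRECT_RTTS = 1         # HTTP -> HTTPS 302 redirect
FULL_HANDSHAKE_RTTS = 2   # full TLS handshake (TLS 1.2 and earlier)
ABBREVIATED_RTTS = 1      # handshake that reuses an earlier session


def added_latency_ms(rtt_ms: float, round_trips: int, compute_ms: float = 0.0) -> float:
    """Extra latency from additional round trips plus crypto computation."""
    return rtt_ms * round_trips + compute_ms


if __name__ == "__main__":
    rtt = 60.0  # assumed mobile round-trip time in milliseconds
    first_visit = added_latency_ms(rtt, REDIRECT_RTTS + FULL_HANDSHAKE_RTTS, compute_ms=30.0)
    reused = added_latency_ms(rtt, ABBREVIATED_RTTS)
    print(f"first visit: +{first_visit:.0f} ms, reused session, no redirect: +{reused:.0f} ms")
```

Under these assumptions an unoptimized first visit lands in the same range as the 200 ms figure, and most of it is round trips rather than computation, which is why the optimizations target redirects and handshake reuse.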
Is it true that we can't optimize these delays? Obviously not. Some of the optimization methods are as follows:
Configure HSTS on the server to reduce 302 redirects. In fact, the most important function of HSTS is to prevent hijacking of the 302 redirect over HTTP. Its drawbacks are that browser support is not high, and that once HSTS is configured it is very hard to downgrade from HTTPS to HTTP in real time.
Set up a shared-memory SSL session cache. Taking nginx as an example, it currently supports sharing the session cache only among the processes of a single machine. The configuration is:
    ssl_session_cache shared:SSL:10m;
If the front end is a multi-server architecture, such a single-machine session cache is of little use, so a mechanism for sharing the session cache across machines is needed. We have implemented a multi-machine shared session cache on nginx 1.6.0. The difficulty with a multi-machine session cache is that it requires synchronous access to an external store such as redis. Because the API that OpenSSL currently provides is synchronous, we are working on asynchronous implementations for OpenSSL and nginx.
Configure the same session ticket key on multiple servers, so that different servers produce interchangeable session tickets. The drawback of session tickets is that client support is limited, only about 40%. Session ID, by contrast, is a standard part of the client hello and has been supported by all clients since SSL 2.0.
    ssl_session_tickets on;
    ssl_session_ticket_key ticket_keys;
Set up an OCSP stapling file, so that OCSP requests go to the site's own web server rather than to the OCSP site provided by the CA. The nginx configuration is:
    ssl_stapling on;
    ssl_stapling_file domain.staple;
The file named by ssl_stapling_file is generated with the following commands:
    # Save each certificate presented by the server as level0.crt, level1.crt, ...
    openssl s_client -showcerts -connect yourdomain:443 < /dev/null |
        awk -v c=-1 '/-----BEGIN CERTIFICATE-----/{inc=1;c++} inc {print > ("level" c ".crt")} /-----END CERTIFICATE-----/{inc=0}'
    # Print the serial, subject, and issuer of each saved certificate
    for i in level?.crt
    do
        openssl x509 -noout -serial -subject -issuer -in "$i"
        echo
    done
    # Fetch the OCSP response and save it as the stapling file
    openssl ocsp -text -no_nonce -issuer level1.crt -CAfile CAbundle.crt -cert level0.crt -VAfile level1.crt -url $ocsp_url -respout domain.staple
Here $ocsp_url is the URL of the OCSP site, which can be found with:
    for i in level?.crt; do echo "$i:"; openssl x509 -noout -text -in "$i" | grep OCSP; done
For a certificate chain, it is usually the lowest value.
Prefer the ECDHE key exchange algorithm, because it supports PFS (perfect forward secrecy) and makes False Start possible.
Tune the TLS record size, ideally dynamically: when a connection is first established, set the record size to the MSS, and increase it once the connection is stable.
Enable TCP Fast Open if conditions allow, although no clients support it at the moment.
Enable SPDY. SPDY requires HTTPS, and the protocol is complex enough to need a separate article. What is certain is that requests over SPDY not only significantly speed up HTTPS but can even be faster than HTTP. On WiFi, SPDY is about 50 ms faster than HTTP. On 3G it is about 250 ms faster.
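For the first item in the list above, the sketch below shows the two responses involved: the 302 that moves a plain-HTTP request to HTTPS, and the `Strict-Transport-Security` header that tells a supporting browser to skip that redirect on later visits. The helper names and the one-year `max-age` are assumptions for illustration. In nginx the same header would be set in the server configuration rather than in application code.

```python
def https_redirect(host: str, path: str = "/"):
    """Response to a plain-HTTP request: send the client to the HTTPS origin."""
    return 302, {"Location": f"https://{host}{path}"}


def hsts_header(max_age_seconds: int = 31536000, include_subdomains: bool = True):
    """Header sent on HTTPS responses so the browser goes straight to HTTPS next time."""
    value = f"max-age={max_age_seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return {"Strict-Transport-Security": value}


if __name__ == "__main__":
    print(https_redirect("www.baidu.com", "/s?wd=https"))
    print(hsts_header())
```

The long `max-age` is also what makes a quick rollback to HTTP difficult: browsers that have cached the policy keep insisting on HTTPS until it expires.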
3.2 impact of HTTPS on performance
Why does HTTPS degrade performance so badly? Mainly because of the big-number arithmetic in the handshake phase. The most expensive step is the private-key decryption during key exchange (the function is rsa_private_decryption). It accounts for 95% of the cost of the whole SSL handshake.
As mentioned earlier, only four key exchange algorithms are in play for OpenSSL here: rsa, dhe, ecdhe, and dh. dh is rarely used because of security issues, so only the first three are compared. The specific data are as follows:
The figure above shows the time taken to complete 1000 handshakes. Obviously, the higher the time, the lower the performance.
The key exchange step is a stage that cannot be bypassed during the SSL full handshake. We can only take the following measures:
Use the session cache and session tickets to raise the session reuse rate, reducing the number of full handshakes and increasing the share of abbreviated handshakes.
For the sake of forward secrecy and False Start, we configure ECDHE for key exchange first. If performance is insufficient, RSA can be configured as the key exchange algorithm instead to improve performance.
OpenSSL ships with tools for measuring the performance of symmetric encryption, digital signatures, and hash functions, so I will not list detailed data here. Readers can test for themselves.
The conclusion is that RC4 is the fastest symmetric cipher, but RC4 itself is insecure, so AES should normally be used. Among hash functions, MD5 and SHA-1 perform about the same. Among digital signatures, ECDSA is the fastest, but its support rate is not high.
In fact, since key exchange accounts for 95% of the cost of the whole handshake while symmetric encryption and decryption cost less than 0.1%, optimizing symmetric encryption brings little benefit on the server side. Conversely, because the CPUs of clients, mobile devices in particular, are weak, optimizations of symmetric encryption and digital signatures are aimed mainly at mobile clients.
Poly1305, used together with ChaCha20, is an encryption scheme promoted by Google that claims to outperform AES-GCM. It suits mobile devices and is worth trying.
Finally, after testing, the optimal cipher suite configuration for comprehensive security and performance is: ECDHE-RSA-AES128-GCM-SHA256.
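As a hedged illustration of pinning that suite, the sketch below uses Python's standard `ssl` module rather than nginx (where the `ssl_ciphers` directive plays this role). The function names are illustrative, and whether the suite is actually available depends on the OpenSSL build that Python is linked against.

```python
import ssl

PREFERRED_SUITE = "ECDHE-RSA-AES128-GCM-SHA256"


def make_server_context() -> ssl.SSLContext:
    """Server-side TLS context restricted to the preferred suite."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # set_ciphers governs TLS 1.2 and earlier; TLS 1.3 suites are managed separately.
    ctx.set_ciphers(PREFERRED_SUITE)
    return ctx


def enabled_suites(ctx: ssl.SSLContext) -> list:
    """Names of the cipher suites the context will offer."""
    return [cipher["name"] for cipher in ctx.get_ciphers()]


if __name__ == "__main__":
    print(enabled_suites(make_server_context()))
```

A real server would also load a certificate and private key with `load_cert_chain` before accepting connections. That step is left out here because it needs key files.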
If performance drops significantly, the configuration can be changed to improve performance at the cost of security. That configuration is RC4-MD5. Under OpenSSL's rules, RSA is then used by default for both key exchange and digital signatures.
4. Analysis of HTTPS support rates
Analysis of 1 million wireless access log entries on Baidu's servers (mainly from browsers on phones and tablets) gives the following relationship between protocol version and handshake time:
TLS protocol version    Client usage    Handshake time (ms)
TLS 1.2                 24.8%           299.496
TLS 1.1                 0.9%            279.383
TLS 1.0                 74%             307.077
SSL 3.0                 0.3%            484.564
As the table shows, SSL 3.0 is the slowest, but its usage share is very low. TLS 1.0 has the widest usage.
The relationship between cipher suite and handshake time is as follows:
Cipher suite                    Client usage    Handshake time (ms)
ECDHE-RSA-AES128-SHA            58.5%           294.36
ECDHE-RSA-AES128-SHA256         21.1%           303.065
DHE-RSA-AES128-SHA              16.7%           351.063
ECDHE-RSA-AES128-GCM-SHA256     3.7%            274.83
Obviously, DHE has a great impact on speed, the performance of ECDHE is really much better, and AES128-GCM has a little improvement on speed.
Through the analysis of client hello requests through tcpdump, it is found that 56.53% of the requests sent session id. This means that these requests can be reused through session cache. Some of the other extended attributes are as follows:
TLS extension                 Support rate
server_name                   76.99%
session_tickets               38.6%
next_protocol_negotiation     40.54%
elliptic_curves               90.6%
ec_point_formats              90.6%
These extensions are all very meaningful and can be explained as follows:
server_name is SNI (Server Name Indication). 77% of requests carry the domain name being visited in the client hello, which lets the server host multiple domain names on a single IP.
next_protocol_negotiation, or NPN, means that 40.54% of clients support SPDY.
session_tickets has a relatively low support rate of 38.6%. This is why we modified the nginx mainline code to implement the multi-machine shared session cache.
elliptic_curves refers to ECC (elliptic curve cryptography), introduced earlier, which achieves the same level of security as DH with a shorter key and greatly improves computing performance.
5. Conclusion
At present there is relatively little Chinese-language material on HTTPS online, and because HTTPS involves a great deal of knowledge about protocols, cryptography, and the PKI system, the learning threshold is fairly high. In practice there are also still many pitfalls and areas that need continuous improvement. I hope this article is of some help. My own understanding is shallow in many places, so I welcome your comments and hope we can improve together.
Finally, to prevent traffic hijacking and protect users' privacy, let's all adopt HTTPS across the whole site. HTTPS is really not that difficult or scary. It only seems so when it has not been optimized properly.