[Chat and Miscellaneous Talk] A Detailed Explanation of How HTTPS Works

The difference between HTTPS and HTTP

Although HTTP is widely used, it has serious security flaws, mainly that it transmits data in plaintext and provides no message-integrity checking. These two points happen to be exactly what matters most for security in applications such as online payment and online trading.

Against HTTP's plaintext transmission, the most common attack is network sniffing: the attacker tries to extract sensitive data from the traffic, for example an administrator's login to the back end of a web application, thereby gaining site-management privileges and, from there, a foothold to compromise the entire server. Even if the back-end login cannot be captured, an attacker can still pull ordinary users' secrets off the network, including phone numbers, ID numbers, credit-card numbers, and other important information, leading to serious security incidents. Mounting a sniffing attack is very simple and demands very little of the attacker; with any packet-capture tool freely available on the Internet, even a novice can harvest user data from a large website.

HTTPS can be understood simply as HTTP over a secure channel: it is built on the original HTTP protocol, and the S refers to an SSL certificate added at the session layer (it is also called a TLS certificate; TLS, Transport Layer Security, is the successor to SSL, the Secure Sockets Layer). The seven-layer OSI model is really just a theoretical international standard. What is generally implemented is a five-layer model, in which the application, presentation, and session layers of the seven-layer model are merged into a single application layer; seen that way, SSL sits in the application layer of the five-layer model. There is also the four-layer TCP/IP model, implemented very early on in the United States, which further merges the physical and data-link layers into a network-interface layer.

So what is the difference between SSL and TLS? SSL began as a proprietary protocol launched by Netscape in the early 1990s; at first it existed only in Netscape's own browser. Netscape's browser was hugely popular in the 1990s, with a market share as high as 90%. It then fought a full-blown browser war with Microsoft, which leveraged its operating-system advantage to gradually erode Netscape's share; in the end Netscape could only be acquired by AOL, and it was finally disbanded on July 15, 2003. The SSL protocol, however, survived, and its descendants are used everywhere on today's Internet.

After Netscape's defeat, no one at the company maintained SSL, and as the versions evolved it was handed over to an international standards working group. By 1999 SSL was in wide use on the Internet and had become a de facto standard, so that year the IETF standardized the SSL 3.0 specification as the TLS protocol (Transport Layer Security). However, because of differences in their encryption algorithms, SSL 3.0 and TLS cannot interoperate and are regarded as two different protocols.

The most important difference between HTTPS and HTTP, then, is security.

HTTPS ensures the security of data transmission through encrypted transmission, identity authentication, integrity assurance, and a non-repudiation mechanism.

Encrypted transmission: the entire exchange is encrypted, so even if it is intercepted by a third party, the content cannot be read. Encryption is implemented through asymmetric encryption plus symmetric encryption, the so-called hybrid encryption mode;

Identity authentication: the server and the client can each confirm the other's identity, so each side can trust that the data it sends and receives is genuine. Without identity authentication, a hacker could impersonate the server to intercept data sent by the client, or impersonate the client to steal data from the server. It is like two secret agents using a token recognized by both parties to confirm each other's identity; that token can be a certificate issued by a third-party authority both sides trust, and the certificate ensures that neither party is being impersonated;

Integrity assurance: even though the transmitted content is encrypted, that does not mean it cannot be modified in transit, in which case the receiver would end up with corrupted data even though the attacker never learned its contents. The sender therefore generates a digest of the data it sends; when the receiver gets the data, it recomputes the digest from the original text with the same algorithm and compares the two. If they match, the data is intact; if either the digest or the original text was altered, the comparison fails, meaning the data must have been tampered with in transit. The receiver then discards the bad data and asks the sender to resend;

Non-repudiation: every interaction between the client and the server is authenticated, and a record of each operation is kept, so if something goes wrong the records can be traced back to the source. Because of identity authentication, every operation by either party is on record and cannot be denied;

Besides the most important security difference, there are some other differences between HTTPS and HTTP:

Because the HTTPS protocol puts safety first, it adds a great deal of encryption and decryption work. Encryption and decryption are CPU-intensive and consume a lot of CPU resources, so in terms of raw speed HTTPS is necessarily slower than HTTP, and overall it is less flexible. There is also a very practical issue: HTTPS costs money. For identity authentication you need to obtain a certificate from a third-party organization; there are many such organizations, collectively called CAs.

Workflow of HTTPS

Before walking through the HTTPS workflow, let's briefly review how TCP works. The TCP process is the familiar three-way handshake to connect and four-way handshake to tear down. In the three-way handshake, the client first sends a connection request (carrying no application data) to the server; the server replies with an acknowledgment; the client then sends a final confirmation back to the server, and once both sides agree the connection is established.

After the TCP connection is established, the real HTTPS connection setup begins, namely the TLS handshake. This handshake is much more complicated than TCP's; depending on how you divide it up, it takes 4, 7, or even 12 steps. The general flow is:

1. The client initiates a request;
2. The server returns the public key certificate;
3. The client verifies the certificate;
4. The client generates a symmetric secret key, encrypts it with the public key and sends it to the server;
5. The server uses the private key to decrypt and obtains the symmetric key;
6. Both the client and the server use the symmetric key to encrypt and decrypt;

In essence, the whole handshake does one thing: it securely delivers the symmetric key generated by the client to the server. The steps above are a simplified summary; the actual process is considerably more involved.
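The key-delivery idea can be sketched with the JDK's built-in crypto classes. This is a minimal illustration of RSA-based key transport, assuming the simple model above (real TLS 1.2 typically uses ECDHE instead, as described later); the class and variable names are my own:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Arrays;

public class HandshakeSketch {
    public static void main(String[] args) throws Exception {
        // Server side: generate an RSA key pair; the public key travels in the certificate
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair serverKeys = kpg.generateKeyPair();

        // Client side: generate a symmetric (AES) session key ...
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey symmetricKey = kg.generateKey();

        // ... encrypt it with the server's PUBLIC key and "send" it over
        Cipher rsa = Cipher.getInstance("RSA");
        rsa.init(Cipher.ENCRYPT_MODE, serverKeys.getPublic());
        byte[] wrapped = rsa.doFinal(symmetricKey.getEncoded());

        // Server side: decrypt with the PRIVATE key to recover the same AES key
        rsa.init(Cipher.DECRYPT_MODE, serverKeys.getPrivate());
        byte[] recovered = rsa.doFinal(wrapped);

        System.out.println(Arrays.equals(symmetricKey.getEncoded(), recovered)); // true
    }
}
```

An eavesdropper who sees only `wrapped` cannot recover the AES key without the server's private key; from here on both sides can encrypt symmetrically.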

As we have said, HTTPS uses a hybrid encryption mode, i.e. both symmetric and asymmetric encryption. The asymmetric encryption just described handles the key exchange, and symmetric encryption is used for the subsequent data transfer.

After the hybrid key exchange is complete, both the client and the server hold the same symmetric key. As mentioned earlier, HTTPS is implemented on top of the TLS protocol, and the current mainstream TLS versions are 1.2 and 1.3.

Handshake process based on TLS 1.2

1st time (ClientHello):

The client initiates a request to the server, passing: the TLS version number, a random number (Random C), and the cipher suites (CipherSuites). A cipher suite describes the set of algorithms the two sides will use to encrypt data and derive the session key. So in this first request the client is telling the server which algorithms it supports, and the server will perform its cryptographic work using algorithms the client supports. Put bluntly, this first handshake message hands the server the client's basic capabilities so the server can tailor its response accordingly;

2nd time (ServerHello):

When the server receives the client's request, it replies with a larger message of its own, containing: the TLS version number, a random number (Random S), the chosen cipher suite (in version 1.2 the mainstream key-exchange algorithm is ECDHE), the server certificate (Certificate, which proves the server's identity and tells the client it is trustworthy and authoritative), and ServerKeyExchange (the parameters required by the key-exchange algorithm). At this point both ends have exchanged random numbers and agreed on which encryption algorithms to use;

3rd time (ClientKeyExchange):

After the client receives all this information from the server, it verifies it, generates a symmetric key from the parameters the server supplied, encrypts that key, and sends it to the server;

The 4th time (ChangeCipherSpec):

After the server receives the key material, it tells the client that the key is accepted.

After these 4 steps, all subsequent data between client and server is encrypted and decrypted with the symmetric key. The 4-way handshake above is really a summary that merges many finer-grained steps; broken out fully, there are about 10. For example, in step 2 (ServerHello) the server actually sends its data to the client in several small pieces: the ACK confirmation is one step, the TLS version and random number another, the server certificate another, the cipher suite yet another, and so on. These small steps have no strict order, and network fluctuations can reorder both sending and arrival.

Step 3, ClientKeyExchange, is a little more involved. When generating the symmetric key, the client first verifies the certificate and signature sent by the server. Once that passes, the client generates a new random value; this is the client's key-exchange parameter, which it sends to the server. Both sides then feed the parameters generated on each end into the ECDHE algorithm to compute the PreMaster. Each party computes the PreMaster locally, with no transmission in between, so it cannot be captured by any eavesdropper. The PreMaster is then combined with the two earlier random numbers to derive the final master key, the symmetric key actually used to encrypt the transmitted data. At this point both sides hold the master key, and the client issues Change Cipher Spec, telling the server to switch cipher specifications: we were using asymmetric encryption until now, we both hold the symmetric key, so it is time to switch to symmetric encryption.
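The key point above, that both sides compute the same secret locally without ever transmitting it, can be demonstrated with the JDK's KeyAgreement API. This is a bare ECDH sketch, not the full TLS key schedule; the names are my own:

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Arrays;
import javax.crypto.KeyAgreement;

public class EcdheSketch {
    public static void main(String[] args) throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("EC");
        kpg.initialize(256); // P-256 curve

        // Each side generates its own EC key pair; only the PUBLIC halves are exchanged
        KeyPair client = kpg.generateKeyPair();
        KeyPair server = kpg.generateKeyPair();

        // Client combines its own private key with the server's public key
        KeyAgreement clientAgree = KeyAgreement.getInstance("ECDH");
        clientAgree.init(client.getPrivate());
        clientAgree.doPhase(server.getPublic(), true);
        byte[] clientSecret = clientAgree.generateSecret();

        // Server does the mirror-image computation
        KeyAgreement serverAgree = KeyAgreement.getInstance("ECDH");
        serverAgree.init(server.getPrivate());
        serverAgree.doPhase(client.getPublic(), true);
        byte[] serverSecret = serverAgree.generateSecret();

        // Both arrive at the same shared secret, and it never crossed the wire
        System.out.println(Arrays.equals(clientSecret, serverSecret)); // true
    }
}
```

In real TLS 1.2 this shared secret plays the role of the PreMaster, which is then mixed with the two Hello random numbers to derive the master key.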

Finally, the server sends back a confirmation of its own, and that is the entire TLS 1.2 handshake. The main subtlety is the master-key generation in step 3: what actually travels over the wire is the core parameters, never the master key itself.

Handshake process based on TLS 1.3

Version 1.3 was born mainly to fix problems in version 1.2, and those problems fall along three dimensions: security, performance, and compatibility. Version 1.2 came out back in 2008; Internet technology changes by the day, and the older version's algorithms naturally could not keep up with the new era. So version 1.3 cut out many outdated algorithms and optimized many of the mainstream ones.

Most importantly, 1.3 substantially improves on 1.2's performance. In version 1.2 each handshake involves several round trips of data, which is inefficient; each round trip costs one round-trip time (RTT), so two round trips cost 2 RTT, and more round trips cost proportionally more. To optimize this, you have to cut the number of exchanges. In 1.2, the client first generates a random number and sends it to the server; the server replies with a pile of its own data; only then does the client produce its second random value using the server's reply. But does generating that second random value really depend on anything from the server? The end result is just a random number. Wouldn't it be simpler to generate both random values up front?

So in version 1.3 the client sends everything it has to the server in the first handshake flight, letting the server derive the subsequent keys immediately. The server likewise does everything on its side in one go and returns it all to the client. The client then performs its follow-up computations locally, and in the third flight sends the server a short message telling it to switch to the encrypted mode.

There are some interesting details here. Both sides carry a key_share field: "key" as in key, "share" as in shareable, i.e. the shareable half of a key pair, which is obviously the public key. Note that already in the first step, ClientHello, the client sends the server the public half of a key pair it generated itself; the server uses that public key in its computations and returns its corresponding data to the client. In 1.2 the flow began with the server's public key going to the client; in 1.3 the client's public key goes to the server in the very first message. But there seems to be a problem: the client sends its key material before the two sides have agreed on an algorithm, so why does the client assume the server will be able to use it? In effect it is a gamble. There are only a handful of mainstream key-exchange algorithms today, so each side assumes the other supports them; no round trip is wasted negotiating which algorithm to use, which greatly improves the efficiency of the exchange.
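You can check what your own runtime will negotiate for TLS 1.3 directly from Java (JDK 11 and later ship TLS 1.3 support). This small probe just lists the protocols a TLSv1.3 context enables by default:

```java
import javax.net.ssl.SSLContext;

public class Tls13Check {
    public static void main(String[] args) throws Exception {
        // Ask the JDK for a TLS 1.3 context and list the protocols it will offer
        SSLContext ctx = SSLContext.getInstance("TLSv1.3");
        ctx.init(null, null, null); // default key/trust managers and RNG
        for (String protocol : ctx.getDefaultSSLParameters().getProtocols()) {
            System.out.println(protocol);
        }
    }
}
```

On a modern JDK the output includes TLSv1.3 (usually alongside TLSv1.2 for compatibility), which mirrors the "gamble on mainstream support" idea above.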

Principles of HTTPS Implementation

Confidentiality

Confidentiality means using both asymmetric and symmetric encryption, the hybrid mode described above, and using them in stages: asymmetric encryption during the key-exchange stage, symmetric encryption for the subsequent data-transfer stage. The whole multi-step handshake exists so the communicating parties can negotiate a key for symmetric encryption.

The ECDHE algorithm mentioned above belongs to the ECC family. The RSA algorithm, by contrast, carries some security risk; the security concerns in version 1.2 mostly refer to RSA. RSA was enthusiastically adopted when it launched and enjoys enormous name recognition, but version 1.3 removed it outright. RSA works by using the product of two very large primes as the basis for generating the key; if a hacker collects a large amount of RSA-encrypted traffic, recovering the plaintext is theoretically possible. In practice this is very hard to exploit, because HTTP is stateless: every connection between client and server repeats all the steps, so each session key is effectively one-off, and even a recovered key would be of little use. The hidden danger is thus mostly theoretical. Still, ECC beats RSA on both security and efficiency, so with a better option available, RSA can be said to have honorably completed its mission.

To illustrate symmetric versus asymmetric encryption, here is a simple toy program. Symmetric encryption uses the same key for both encryption and decryption, and the XOR operation is the classic example:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class SymmetricEncryption {

    private static final byte SYMMETRIC_KEY = 10;

    /**
     * One method for both encryption and decryption. The principle is simple:
     * XOR-ing a byte twice with the same value yields the original byte.
     */
    public static void encryptOrDecrypt(String from, String to) throws IOException {
        try (FileInputStream inputStream = new FileInputStream(new File(from));
             FileOutputStream outputStream = new FileOutputStream(new File(to))) {
            byte[] datas = inputStream.readAllBytes();

            for (int i = 0; i < datas.length; i++) {
                datas[i] = (byte) (datas[i] ^ SYMMETRIC_KEY);
            }

            outputStream.write(datas);
        }
    }

    public static void main(String[] args) throws IOException {
        String source = "image/dzq.jpeg";
        String encryptPath = "image/symm-dzq.jpeg";
        String decryptPath = "image/dec-dzq.jpeg";

        // Encrypt
        encryptOrDecrypt(source, encryptPath);

        // Decrypt
        encryptOrDecrypt(encryptPath, decryptPath);
    }
}

XOR-ing a number twice with the same value gives back the original. Here a picture is used for encryption and decryption; the original picture:

Encrypted picture:

Decrypted picture:

This is symmetric encryption with a single key: the decrypted picture is identical to the original. Now let's look at asymmetric encryption, where encryption and decryption simply use different keys. Encryption algorithms generally do not rely on keeping the algorithm itself secret; their strength rests on the attacker being unable to break in without knowing the key. The advantage of asymmetric encryption is that data cannot be decrypted without the private key. The disadvantages: the public key is public and may be intercepted; the public key alone carries no server information, so it cannot authenticate the server, leaving a risk of man-in-the-middle attacks; and it is less efficient than symmetric encryption.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class AsymmetricEncryption {

    // Toy "key pair": adding 1 encrypts, adding -1 undoes it
    private static final int PUBLIC_KEY = 1;
    private static final int PRIVATE_KEY = -1;

    public static void encryptOrDecrypt(int key, String from, String to) throws IOException {
        try (FileInputStream inputStream = new FileInputStream(new File(from));
             FileOutputStream outputStream = new FileOutputStream(new File(to))) {
            byte[] datas = inputStream.readAllBytes();

            for (int i = 0; i < datas.length; i++) {
                datas[i] = (byte) (datas[i] + key);
            }

            outputStream.write(datas);
        }
    }

    public static void main(String[] args) throws IOException {
        String source = "image/dzq.jpeg";
        String encryptPath = "image/symm-dzq-asy.jpeg";
        String decryptPath = "image/dec-dzq-asy.jpeg";

        // Encrypt with the "public key"
        encryptOrDecrypt(PUBLIC_KEY, source, encryptPath);

        // Decrypt with the "private key"
        encryptOrDecrypt(PRIVATE_KEY, encryptPath, decryptPath);
    }
}

The same picture is encrypted and decrypted again; the original picture:

Encrypted picture:

Decrypted picture:
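The XOR and +1/-1 programs above are toys to show the idea. Real HTTPS traffic uses vetted symmetric ciphers such as AES; here is a minimal sketch with the JDK's javax.crypto using AES in GCM mode (my own choice for the example, since GCM also detects tampering), with made-up plaintext:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class AesGcmDemo {
    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey(); // the shared symmetric key

        byte[] iv = new byte[12];         // GCM nonce, must be unique per key
        new SecureRandom().nextBytes(iv);

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal("hello https".getBytes(StandardCharsets.UTF_8));

        // The same key (and nonce) decrypts; GCM throws if the ciphertext was tampered with
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] plaintext = cipher.doFinal(ciphertext);

        System.out.println(new String(plaintext, StandardCharsets.UTF_8));
    }
}
```

Unlike the XOR toy, flipping even one byte of `ciphertext` here makes decryption fail outright, which previews the integrity topic below.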

Integrity

Once confidentiality is in place, it doesn't matter if encrypted data is intercepted; the interceptor cannot read it. But in the spirit of "if I can't have it, I'll break it," an attacker who cannot read the content can still scramble the data: chop off the head, clip the tail, insert garbage in the middle, so that the recipient can no longer make sense of it. The data has lost its integrity. When the server receives incomplete data, it cannot repair it; it can only rely on the retransmission mechanism, asking the client to resend a complete copy.

The retransmission mechanism is beyond today's scope; what I want to discuss here is how the server verifies data integrity. When the client sends data, it also sends a digital digest of the original text. The server recomputes the digest from the received text with the same algorithm; if either the text or the digest was modified, the recomputed digest cannot match the one delivered, and the server knows the data was altered along the way.

The digest algorithm is a hash function: it compresses input of any length into a fixed-length string, which is why the result is also called a data fingerprint. Hashing is one-way and cannot be decrypted, so the original text cannot be recovered from the digest; both sides can only run the same algorithm and compare whether the digests agree.
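The fixed-length fingerprint property is easy to see with Java's MessageDigest. A sketch using SHA-256 (my choice of hash for the example), with made-up inputs:

```java
import java.security.MessageDigest;
import java.util.Arrays;

public class DigestDemo {
    public static void main(String[] args) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");

        byte[] d1 = md.digest("important data".getBytes());
        byte[] d2 = md.digest("important datA".getBytes()); // one character changed

        // Fixed-length fingerprint: always 32 bytes for SHA-256, whatever the input size
        System.out.println(d1.length);

        // Even a one-character change produces a completely different digest
        System.out.println(Arrays.equals(d1, d2));
    }
}
```

This is exactly the comparison the receiver performs: recompute the digest of what arrived and check it against the digest that was delivered.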

Authentication and non-repudiation

Identity authentication is the client and the server proving to each other that they are legitimate endpoints. Hackers can easily imitate a website, and users may unknowingly submit important information to the fake and be defrauded. So both the client and the server use digital signatures to prevent impersonation.

How to understand a digital signature? It works just like using a handwritten signature to prove identity in daily life. Of course, signatures can be forged, so the signature, issuer, certifier, validity period, and other information are all packaged into a digital certificate. And this certificate is not self-generated; it is issued by an authoritative, trusted third party. The certificate is like our ID card, and the issuing organization is like the state that issues the ID.

It is easy for a hacker to forge a website, but very hard for a fake site to obtain a trusted certificate. A certificate indicates that the client or server is on record with an authority and can be trusted. It also underpins non-repudiation: once a party holds a certificate, all of its operations are recorded and cannot be denied, so when one party misbehaves, the records can be traced back to the source of the operation.
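The signing relationship behind a certificate can be sketched with Java's Signature class: the CA signs content with its private key, and anyone holding the CA's public key can verify it. This is a simplified stand-in for real X.509 signing; the key pair and strings are made up:

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignatureDemo {
    public static void main(String[] args) throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair ca = kpg.generateKeyPair(); // stands in for the CA's key pair

        byte[] certContent = "www.feenix.com owns this key".getBytes();

        // The CA signs the certificate content with its PRIVATE key
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(ca.getPrivate());
        signer.update(certContent);
        byte[] signature = signer.sign();

        // Anyone can verify with the CA's PUBLIC key (shipped with the OS/browser)
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(ca.getPublic());
        verifier.update(certContent);
        System.out.println(verifier.verify(signature));

        // Tampered content fails verification against the same signature
        verifier.initVerify(ca.getPublic());
        verifier.update("www.evil.com owns this key".getBytes());
        System.out.println(verifier.verify(signature));
    }
}
```

A forged site would need the CA's private key to produce a signature that verifies, which is exactly what makes fake certificates hard to obtain.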

The authorities that issue certificates are called CAs, and a CA's authority rests entirely on trust: as long as you consider it trustworthy, it is. If you trust none of the CAs on the market, you can issue a certificate for yourself; this is common among major banks. When you sign up for online banking, the bank will likely hand you a USB shield of some kind, and you must use that shield before you can transact securely.

Certificates currently come in 3 validation levels:

DV certificate: domain validation. The review consists of verifying ownership of the domain, after which the certificate is issued. This type suits individuals and small or micro businesses: it is cheap and fast to obtain, but the certificate cannot display company information and offers the weakest assurance. Deployed on a website, it shows a lock icon in the browser.

OV certificate: organization validation. The review verifies both domain ownership and the real identity of the applying company before issuance. OV is currently the most widely used and most broadly compatible certificate type worldwide; it suits medium-sized enterprises and Internet-facing businesses. Deployed on a website, it shows a lock icon in the browser, and clicking it reveals the organization's details. It supports high-strength ECC encryption, making the encrypted data more secure with better performance.

EV certificate: extended validation. Its review is the strictest of all types: on top of the OV checks, it additionally verifies other corporate information, such as the company's bank-account opening license. EV certificates are mostly used in high-security industries such as banking, finance, securities, and payments. Deployed on a website, it can display the distinctive EV green address bar, signaling the site's trust level to the greatest extent. It likewise supports high-strength ECC encryption.

In daily work, most of the time there is no need to study HTTPS too deeply. Small and micro businesses generally just buy servers from the big cloud vendors; platforms such as Alibaba Cloud and Tencent Cloud support HTTPS out of the box, so you simply buy a certificate from a trusted CA and configure it. As a developer you rarely need to care about the exact validation and generation process, let alone a self-built CA; most of the time a bit of Nginx configuration is all it takes.

But how could I settle for a simple configuration and call it a day? In the spirit of digging to the bottom, let's walk through the entire HTTPS certificate-generation process step by step.

Install the OpenSSL tools

Before starting, you need to complete the necessary preparations, that is, install the necessary OpenSSL tools:
win - http://slproweb.com/products/Win32OpenSSL.html
Linux - https://www.openssl.org/source/

The site is in English; it doesn't matter whether you can read it, just scroll down to find the download table.

The top half of the table is the newer 3.x series, and the bottom half the older 1.x series. The older line is in fact the long-term-support version, with very mature security, stability, and compatibility; a newer version's compatibility may actually be worse, although the project of course tries hard to preserve it. Judge for yourself.

After installation, open the installation directory and look around; it contains the tools for generating certificates and the various encryption algorithms. Next we will generate the server certificate on Windows and then upload the relevant files to a Linux server. Generally 3 files are produced: key (the private key), csr (the certificate signing request, i.e. the certificate to be signed, which also carries the public key), and crt (the certificate).

Generating the certificate on Windows

Go to the bin directory in the OpenSSL installation directory, open the cmd window, and execute the command:
openssl genrsa -des3 -out g:/my_cert/server.key

You are asked to enter a passphrase to protect the generated private key (the second prompt is for confirmation). You will then see a private key in the specified directory:

Opened up, it is just a long encoded string:

With the private key in hand, the next step is to generate the certificate to be signed; again in the same cmd window, execute:
openssl req -new -key g:/my_cert/server.key -out g:/my_cert/pub.csr

The password required here is the passphrase of the private key just generated.

The following input is some information encapsulated in the certificate, which is required to be provided when generating:

Country Name (2 letter code) [XX]:CN                                # country code; CN means China
State or Province Name (full name) [Some-State]:Fujian              # province
Locality Name (eg, city) []:Xiamen                                  # city
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Feenix   # company name
Organizational Unit Name (eg, section) []:Laboratory                # department name
Common Name (eg server FQDN or YOUR name) []:www.feenix.com         # usually the requester's server domain name
Email Address []:[email protected]                                  # requester's email address

As for the last two:
A challenge password []: # A challenge password
An optional company name []: # An optional company name

They are optional; since I'm just playing around, I skipped them. You will then see the certificate to be signed appear in the specified directory:

Likewise, opened up it is another long encoded string:

With the private key and the certificate-to-be-signed in hand, the last step is to have an authoritative CA sign it. However, no serious CA would ever validate the made-up information we just filled in, so rather than asking others, we'll help ourselves: we will build our own CA to stamp and sign the certificate.

Self-built CA

Next, the same logic is used to create the CA. A CA is in essence just a set of certificates, but a set that carries trust and can therefore sign for others.

The operating systems we use (Windows, Linux, Unix, Android, iOS...) all ship with many trusted root certificates preinstalled; my Windows machine, for example, contains VeriSign's root certificate. When a browser visits a server, say Alipay at www.alipay.com, the server sends its certificate to the browser during the SSL handshake; that server certificate was issued by a CA such as VeriSign, so verification naturally succeeds.

In the same cmd window, create the CA private key:
openssl genrsa -out g:/my_cert/myca.key 2048

Use the CA private key to generate the CA's certificate signing request:
openssl.exe req -new -key g:/my_cert/myca.key -out g:/my_cert/myca.csr

The prompts are the same as when generating the server CSR, so they are not repeated here. With the CA private key and the CA CSR in place, generate the CA root certificate:
openssl.exe x509 -req -in g:/my_cert/myca.csr -extensions v3_ca -signkey g:/my_cert/myca.key -out g:/my_cert/myca.crt

Since the CA's certificate is signed by itself rather than by any higher authority, it gets a rather grand name: the root certificate. Once generated, even its icon looks different from ordinary certificates:

At this point the CA has been created — it is just three files on disk: myca.key, myca.csr, and myca.crt. A CA, then, is not a physical entity but a set of digital certificates encoding information about one.
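The three CA steps can be sketched together, ending with a check that the root really is self-signed — its issuer and subject are identical. (A minimal sketch: throwaway local paths, prompts replaced by `-subj`, and the `-extensions v3_ca` option omitted for simplicity.)

```shell
# CA private key -> CA CSR -> self-signed root certificate
openssl genrsa -out myca.key 2048
openssl req -new -key myca.key -out myca.csr -subj "/CN=Feenix Root CA"
openssl x509 -req -in myca.csr -signkey myca.key -days 365 -out myca.crt
# A root is self-signed: issuer and subject print the same name
openssl x509 -in myca.crt -noout -issuer -subject
```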

Now you can finally use the CA to sign the CSR generated earlier:
openssl x509 -days 365 -req -in g:/my_cert/pub.csr -extensions v3_req -CAkey g:/my_cert/myca.key -CA g:/my_cert/myca.crt -CAcreateserial -out g:/my_cert/server.crt

-days 365                   validity: 365 days
-in g:/my_cert/pub.csr      the CSR to be signed
-CAkey g:/my_cert/myca.key  the CA's private key
-CA g:/my_cert/myca.crt     the CA's certificate
-out g:/my_cert/server.crt  the signed certificate to produce

Once the CA has signed the CSR, the CA itself is no longer needed. At this point the generation of the https certificate is complete; all that remains is to configure it in Nginx. In a real production environment you basically only ever configure certificates — you rarely need to generate them yourself.
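The whole pipeline — CA, server key, CSR, signing — can be checked end to end with `openssl verify`. (A minimal sketch with throwaway names; `req -new -x509` is a shortcut that self-signs the CA certificate in one step.)

```shell
# Build the CA (key + self-signed root in one command)
openssl genrsa -out myca.key 2048
openssl req -new -x509 -key myca.key -days 365 -subj "/CN=Feenix Root CA" -out myca.crt
# Build the server key + CSR, then sign the CSR with the CA
openssl genrsa -out server.key 2048
openssl req -new -key server.key -subj "/CN=www.feenix.com" -out pub.csr
openssl x509 -req -in pub.csr -CA myca.crt -CAkey myca.key -CAcreateserial \
  -days 365 -out server.crt
# Should report: server.crt: OK
openssl verify -CAfile myca.crt server.crt
```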

HTTPS Nginx configuration and installation

Before the certificate is configured, the Nginx welcome page can only be reached over http; https access fails.

Copy the two files generated earlier, server.key and server.crt, to the server, then edit the nginx.conf configuration file:

At the end of nginx.conf, just point the ssl directives at the paths of those two files. After saving the configuration, restart the Nginx service — you will find it asks for the password you set when generating the private key:
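The relevant part of nginx.conf might look roughly like this (a minimal sketch — the server_name and the certificate paths are placeholders for your own environment):

```nginx
server {
    listen 443 ssl;
    server_name www.feenix.com;                     # placeholder domain

    ssl_certificate     /etc/nginx/certs/server.crt;  # the signed certificate
    ssl_certificate_key /etc/nginx/certs/server.key;  # the private key

    location / {
        root  html;
        index index.html;
    }
}
```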

After Nginx restarts, you can see it is now listening on port 443.

Use https to access the nginx homepage, a security warning will pop up:

Why the warning, when a certificate is configured? Because only the server has a certificate; the client has nothing to check it against — so why should it recognize your fly-by-night server certificate? The client currently has no way to judge the certificate's authenticity: yes, it was signed by some CA, but whether that CA is authoritative or credible enough is, as far as the client knows, anyone's guess.

However, the warning is a warning and does not affect continued access:

So the next step is to install the corresponding certificate on the client. Here comes the question: two files were configured in Nginx — which one does the client need to install? Or both? As noted earlier, during the handshake the server sends its own certificate to the client for verification. The client takes that certificate, rummages through its local certificate trust list, and finds nothing matching — in other words, it does not trust this certificate — and so raises the insecure warning.

Run certmgr.msc to view all trusted root CAs (root certificates) on the local system; our freshly self-built CA is not among them:

Accessing the Nginx homepage over https in Google Chrome, the insecure prompt reports the certificate as invalid.

In essence, the client's distrust of a certificate is distrust of whoever signed it. So as long as the CA's root certificate is installed on the client, the problem is solved. Installing a certificate on Windows is trivial — just double-click it:

After installation, the self-built CA's root certificate appears in the root certificate store shown earlier.

At this point the certificate installation on both ends of https is complete. Yet even with the root certificate installed, the client may still see a warning on https access. The reason: although the operating system now trusts the self-built root, the browser itself remains suspicious. Like operating systems, browsers from the major vendors ship with their own preloaded lists of trusted root certificates. A self-built CA root may satisfy the server and the OS, but it cannot fool the browser — not unless you can dig into and modify the browser's built-in root list, and the difficulty of that can be imagined.

HTTPS performance optimization

Why do you need to optimize

In truth, most so-called performance optimization does not require us to optimize anything at all. On the contrary, some people force optimizations just to show off their abilities, which is usually counterproductive. Although https was said earlier to be less efficient than http, after years of iteration its performance is now nearly on par with http.

If you insist on optimizing something, consider these angles: protocol optimization, certificate optimization, session resumption, and so on. Looking further ahead, http2 is compatible with both http1 and https.

As mentioned earlier, adding the S to http secures the data, but every TLS handshake costs time. After the client receives the server's certificate, it takes time to verify it — is it trustworthy, has it expired, and a series of similar questions. Both sides must also encrypt every message before sending it and decrypt it after receiving it. These are significant invisible costs.

Open any website, switch the browser to developer mode, and observe the timing waterfall:

Queueing: the browser can only process a handful of requests (about six per domain) concurrently; extra requests must wait in line. Here the request waited 1.41 ms in the queue.

Once the wait ends, actual processing begins. Stalled: time the browser spends preparing resources — 0.75 ms here. DNS Lookup: domain name resolution took 22 microseconds, a vanishingly short time. Initial connection: the TCP three-way handshake plus the SSL setup that follows, 37.73 ms in total — not short. SSL: the handshakes of the TLS 1.2/1.3 workflow described above took 19.07 ms.

Only after the connection is established do the real request and response happen. Request sent: sending the request took just 0.23 ms. Waiting for server response: waiting for the server's reply is, as always, the biggest block, consuming 248.5 ms. Content download: finally, downloading the content took 27.18 ms — reasonably fast.

The picture makes it plain: the genuinely useful communication between client and server is only the 0.23 ms request and the 27.18 ms download; everything else is one delay or another, preparing this or loading that. Very little time goes to real work — much like the ratio of productive time to slacking time in an average workday.
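The same phase breakdown can be reproduced from the command line with curl's built-in timers (a sketch — example.com is only a placeholder URL, and the numbers will differ on every network):

```shell
# curl -w exposes the same phases as the DevTools waterfall:
# DNS, TCP connect, TLS handshake, time to first byte, total.
curl -s -o /dev/null https://example.com -w '
DNS lookup:    %{time_namelookup}s
TCP connect:   %{time_connect}s
TLS handshake: %{time_appconnect}s
First byte:    %{time_starttransfer}s
Total:         %{time_total}s
'
```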

Optimization Strategy

Since most of the time is spent waiting on various delays, the optimization idea is simple: find ways to reduce them.

From the protocol angle, the fundamental fix is upgrading outright from TLS 1.2 to 1.3. If the protocol cannot be upgraded, you can at least switch the elliptic-curve key exchange to the fast x25519 curve, and enable False Start (the "preemptive" mechanism), which lets a party send data before the handshake fully completes. Tuned this way, TLS 1.2 can come close to the efficiency of TLS 1.3.
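In nginx this protocol-level tuning is a two-line change (a sketch, assuming nginx built against OpenSSL 1.1.1 or later for TLS 1.3 support):

```nginx
# Allow TLS 1.3, keep 1.2 as a fallback, and prefer the fast X25519 curve
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ecdh_curve X25519:prime256v1;
```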

From the certificate angle: if certificate A signs certificate B, then A is the parent certificate of B, and this hierarchy is called the certificate chain. Generally the server sends its certificates to the client one by one, and the client accepts and verifies them one by one. What the client does to verify a certificate is far more involved than it sounds: to check revocation status it may have to contact the CA's servers, issuing a full http request — remember, this is still the basic handshake stage, and here comes a complete extra request! If that isn't slow, what is.

To speed up the related computation, it is recommended to use the ECDSA algorithm instead of RSA for the server certificate: ECDSA uses fewer resources, runs faster, and offers comparable security at smaller key sizes. Certificates also carry validity periods, the root certificate included. Suppose the root certificate has expired: verifying the chain layer by layer, the client only discovers the invalid root at the top, and must then contact the root CA's site to remotely check whether the current certificate is legal. After all that back and forth, the time cost is nearly unacceptable.
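Swapping RSA for ECDSA is a one-line change when generating the key (a sketch; the file names and the CN are illustrative):

```shell
# ECDSA key on the prime256v1 (P-256) curve, plus a CSR for it
openssl ecparam -genkey -name prime256v1 -out server_ec.key
openssl req -new -key server_ec.key -subj "/CN=www.feenix.com" -out server_ec.csr
# Inspect the key: a 256-bit EC key, far smaller than a 2048-bit RSA key
openssl ec -in server_ec.key -noout -text | head -n 2
```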

Therefore it is recommended to replace certificate revocation lists with the online OCSP protocol: query the CA of the current certificate directly and let it report the certificate's status. Better still, this check need not be done by the client — the server performs it in advance, sending the certificate together with its status, so the client merely accepts the result.
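This server-side variant is known as OCSP stapling, and nginx supports it directly (a sketch; the chain file path and the resolver address are placeholders):

```nginx
# The server fetches and caches the certificate's OCSP status and
# "staples" it into the TLS handshake, sparing the client the lookup.
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/nginx/certs/chain.crt;  # CA chain for verification
resolver 8.8.8.8 valid=300s;                         # DNS for the OCSP responder
```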

From the session-reuse angle: http is a stateless protocol, so each request triggers a fresh handshake and a fresh session, and for https that means sending certificates, accepting certificates, decrypting, verifying... all over again, every time. Instead, when the first session is established the server generates a SessionID and sends it to the client; on each subsequent request the client presents this SessionID, and the server can reuse the original session rather than creating a new one, skipping the decryption and verification steps in the middle. Efficiency naturally improves greatly.

But there is a catch. The client connects to relatively few servers, so storing SessionIDs costs it little; the server, however, talks to countless clients at once, and storing every client's session state locally adds up to no small amount of data. Hence the concept of the SessionTicket: the server encrypts the session state with a key known only to itself, and the resulting encrypted string is the ticket, which it hands to the client. On the next request the client presents the ticket, the server simply decrypts it to recover the session, and nothing needs to be stored server-side — precious server resources are saved.
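Both reuse mechanisms can be switched on in nginx (a sketch; the cache size and timeout values are illustrative):

```nginx
# SessionID reuse: a shared server-side cache (~4000 sessions per 1 MB)
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 1h;
# SessionTicket reuse: state lives in the client-held ticket instead
ssl_session_tickets on;
```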

Besides the SessionID mechanism there is the PSK mechanism — Pre-Shared Key. It achieves 0-RTT: no extra round-trip delay, just send the data directly! Since session reuse implies a connection was established before, why shake hands again? Simply fire data at the other side — crude but effective. Note: the handshake here means the TLS handshake that follows the TCP handshake; whatever the reuse scheme, the TCP handshake remains an unavoidable stage of every connection. The drawback is equally obvious: if the data is intercepted, an attacker can replay the same data to the server endlessly — every copy looks legitimate — and a flood of such requests can crash the server. The fix is equally simple: include a timestamp.
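nginx (1.15.4+) exposes TLS 1.3's 0-RTT as a single directive (a sketch; given the replay risk just described, it should only be enabled when the application can tolerate replayed requests):

```nginx
# Accept TLS 1.3 early data (0-RTT). Replayable by design --
# safe only for idempotent requests, or with application-level
# protection such as the timestamp check mentioned above.
ssl_early_data on;
```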

Beyond the software optimizations above, the hardware level can also be upgraded, but that scenario almost never comes up in actual work, so it is not covered in detail here; the main points were listed in the picture above for anyone interested.

HTTP2 features

https means adding the TLS protocol at the session layer on top of http to secure the data, while http2 is simply an upgrade of http itself and, strictly speaking, is independent of https. Being an upgrade of http, though, it is compatible with both http and https. On the open Internet, http2 is used only over the https protocol, while plain-http sites continue with http1 — the goal being to spread encryption across the open Internet and provide strong protection against active attacks.

Why does http2 need header compression? Open any request and look at its headers — there is quite a lot of content:

Compressing the request headers saves a fair amount of bandwidth. Most of the time the actual payload of a request is small, yet it drags along piles of miscellaneous header data — how much of it is really needed? A typical case of top-heaviness. A compression algorithm was invented specifically for these headers: HPACK, which combines indexing tables with Huffman coding.

Huffman coding solves the problem of variable-length encoding: some data gets short codes and some gets long ones. A large binary tree is built over all the characters, with each character at a leaf node; the path from the root to each leaf is unique, and that path is the character's so-called code. Naturally the server and the client must share the same code table, so that both sides decode identical content.

Another upgrade: the old version's text transmission gives way to binary streams. Characters suit human eyes, but computers do not know characters — they only know binary. So http2 transmits data directly as binary, and a message is no longer one monolithic header-plus-body; it is assembled into a complete package and split into N segments, each called a frame — making full use of the divide-and-conquer idea. The benefit is that frames belonging to different streams can be interleaved and sent in parallel over the same connection, and one connection can carry many concurrent requests, so channel resources are fully used, performance is squeezed as far as possible, and the channel is left idle as little as possible.

Moreover, each time the client sends a request, the server not only returns the requested data but can also proactively push data the client is likely to request next (or over the next few requests), cutting down the number of requests the client must initiate. This is the server push mechanism, sometimes called cache push because pushed responses land in the client's cache.

That covers the basic concepts of http2 for now. After all, http2 is not yet the mainstream force in the market — even TLS 1.3 has not been fully deployed — so the complete adoption of these technologies still has a long way to go.


Origin blog.csdn.net/FeenixOne/article/details/129261291