Changing the MySQL character set to utf8mb4

       A UTF-8 character occupies 1 to 4 bytes, but MySQL's utf8 character set stores at most 3 bytes per character, and the emoji sent by mobile clients are 4-byte characters. If you insert emoji directly into a database that uses utf8, the Java program reports an SQL exception:

 

java.sql.SQLException: Incorrect string value: '\xF0\x9F\x92\x94' for column 'name' at row 1
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3593)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3525)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1986)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2140)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2620)
at com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1662)
at com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1581)
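
The bytes \xF0\x9F\x92\x94 in the error message are the UTF-8 encoding of a single emoji (U+1F494). As a small illustration, the following Java sketch prints how many bytes a few characters need in UTF-8. The class and method names are made up for this example and do not come from the original article.

```java
import java.nio.charset.StandardCharsets;

class Utf8ByteLength {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // UTF-8 bytes rendered the way MySQL prints them in the error message.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String brokenHeart = new String(Character.toChars(0x1F494)); // U+1F494
        System.out.println(utf8Length("a"));          // 1 (ASCII letter)
        System.out.println(utf8Length("\u00E9"));     // 2 (e with acute accent)
        System.out.println(utf8Length("\u4E2D"));     // 3 (a CJK character)
        System.out.println(utf8Length(brokenHeart));  // 4 (too long for MySQL's utf8)
        System.out.println(utf8Hex(brokenHeart));     // \xF0\x9F\x92\x94
    }
}
```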

       One workaround is to encode 4-byte characters before storing them and decode them after reading them back. The drawback is that every piece of code that touches the data then has to encode or decode it.
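
To picture this workaround, here is a minimal sketch that assumes Base64 as the encoding. The article does not say which encoding was used, so Base64 and the class name EmojiSafeCodec are assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class EmojiSafeCodec {

    // Encode before INSERT: the result is plain ASCII, so a utf8 column accepts it.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Decode after SELECT: every reader of the column has to remember this step.
    static String decode(String stored) {
        return new String(Base64.getDecoder().decode(stored), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "hi " + new String(Character.toChars(0x1F494));
        String stored = encode(original);
        System.out.println(stored);                          // ASCII-only text
        System.out.println(decode(stored).equals(original)); // true
    }
}
```

With this approach the stored value is no longer readable or searchable as text in the database, which is part of why switching to utf8mb4 is the simpler route.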

 

       utf8mb4 is a superset of utf8: it is compatible with existing utf8 data and can also store 4-byte characters such as emoji.
       With utf8mb4, emoji need no special encoding or decoding when they are stored or retrieved.

 

Change database encoding to utf8mb4

1. MySQL version

      utf8mb4 requires MySQL 5.5.3 or later. If your server is older, upgrade it first.

2. MySQL driver

      Connector/J 5.1.34 works. The driver must not be older than 5.1.13.
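
The two minimum versions above can be checked with a simple numeric comparison. The sketch below is only an illustration: the class VersionCheck and its methods are hypothetical, and the version strings are hard-coded samples. In practice the server version could come from SELECT VERSION().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VersionCheck {

    private static final Pattern VERSION = Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)");

    // Extract the first "major.minor.patch" triple, e.g. from "5.5.3-m3-log".
    static int[] parse(String version) {
        Matcher m = VERSION.matcher(version);
        if (!m.find()) {
            throw new IllegalArgumentException("No version number in: " + version);
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    // True if version >= minimum, comparing each part as a number.
    static boolean isAtLeast(String version, String minimum) {
        int[] v = parse(version);
        int[] min = parse(minimum);
        for (int i = 0; i < 3; i++) {
            if (v[i] != min[i]) {
                return v[i] > min[i];
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAtLeast("5.5.3-m3-log", "5.5.3")); // server: true
        System.out.println(isAtLeast("5.1.73", "5.5.3"));       // server: false
        System.out.println(isAtLeast("5.1.34", "5.1.13"));      // driver: true
    }
}
```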

3. Modify the MySQL configuration file

 

Edit the MySQL configuration file my.cnf (my.ini on Windows). 
my.cnf is usually located at /etc/mysql/my.cnf. Add the following settings to the three sections shown: 
[client] 
default-character-set = utf8mb4 
[mysql] 
default-character-set = utf8mb4 
[mysqld] 
character-set-client-handshake = FALSE  # ignore the character set the client requests in the handshake
character-set-server = utf8mb4
# collation-server = utf8mb4_unicode_ci   (optional)
# init_connect='SET NAMES utf8mb4'        (optional)

4. Restart the database and check the variables

SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%';

Variable_name              Value
character_set_client       utf8mb4
character_set_connection   utf8mb4
character_set_database     utf8mb4
character_set_filesystem   binary
character_set_results      utf8mb4
character_set_server       utf8mb4
character_set_system       utf8
collation_connection       utf8mb4_unicode_ci
collation_database         utf8mb4_unicode_ci
collation_server           utf8mb4_unicode_ci

The values of collation_connection, collation_database, and collation_server do not matter.

The following variables, however, must all be utf8mb4:

System variable            Description
character_set_client       character set of the data sent by the client
character_set_connection   connection-layer character set
character_set_database     default character set of the currently selected database
character_set_results      character set of query results
character_set_server       default character set for internal server operations
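
This check can also be automated from the application side. The sketch below is one possible approach rather than part of the original procedure: the class CharsetCheck is hypothetical, readVariables needs a live JDBC connection, and main only exercises the comparison logic on sample values.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CharsetCheck {

    // The five variables that have to be utf8mb4.
    static final String[] REQUIRED = {
        "character_set_client", "character_set_connection", "character_set_database",
        "character_set_results", "character_set_server"
    };

    // Read the character_set_% variables over an existing connection.
    static Map<String, String> readVariables(Connection conn) throws SQLException {
        Map<String, String> vars = new HashMap<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SHOW VARIABLES LIKE 'character_set_%'")) {
            while (rs.next()) {
                vars.put(rs.getString(1), rs.getString(2)); // Variable_name, Value
            }
        }
        return vars;
    }

    // Names of required variables that are missing or not utf8mb4.
    static List<String> findWrong(Map<String, String> vars) {
        List<String> wrong = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!"utf8mb4".equalsIgnoreCase(vars.get(name))) {
                wrong.add(name);
            }
        }
        return wrong;
    }

    public static void main(String[] args) {
        // Sample values only; a real run would call readVariables(conn).
        Map<String, String> sample = new HashMap<>();
        for (String name : REQUIRED) {
            sample.put(name, "utf8mb4");
        }
        sample.put("character_set_results", "utf8");
        System.out.println(findWrong(sample)); // [character_set_results]
    }
}
```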

5. Configuration of database connection

In the JDBC connection parameters: 
characterEncoding=utf8 is automatically treated as utf8mb4; if the parameter is omitted, the driver detects the character set automatically. 
autoReconnect=true must also be added.
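
As an illustration of these parameters, a connection URL could be assembled as follows. The host, port, and the class JdbcUrlExample are placeholders chosen for this sketch; the database name caitu99 is borrowed from step 6, and connect is shown only to indicate where the URL would be used.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class JdbcUrlExample {

    // Build a MySQL JDBC URL carrying the two parameters discussed above.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?characterEncoding=utf8&autoReconnect=true";
    }

    // Open a connection; needs the MySQL driver on the classpath and a running server.
    static Connection connect(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    public static void main(String[] args) {
        // Placeholder host and port for a local server.
        System.out.println(buildUrl("localhost", 3306, "caitu99"));
    }
}
```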

6. Convert the database and existing tables to utf8mb4

Change the database character set (caitu99 is the example database name): ALTER DATABASE caitu99 CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

Change a table's character set: ALTER TABLE TABLE_NAME CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
If necessary, change the character set of individual columns as well.
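
When many tables need converting, the statements can be generated rather than typed by hand. The following sketch assumes the table names are already known (for example from information_schema.tables); the class Utf8mb4Migration and the table names user and comment are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Utf8mb4Migration {

    // Quote an identifier with backticks, doubling any backtick inside it.
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    // One ALTER DATABASE statement followed by one ALTER TABLE per table.
    static List<String> buildStatements(String database, List<String> tables) {
        List<String> sql = new ArrayList<>();
        sql.add("ALTER DATABASE " + quote(database)
                + " CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci");
        for (String table : tables) {
            sql.add("ALTER TABLE " + quote(table)
                    + " CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci");
        }
        return sql;
    }

    public static void main(String[] args) {
        // Hypothetical table names, for illustration only.
        for (String statement : buildStatements("caitu99", Arrays.asList("user", "comment"))) {
            System.out.println(statement + ";");
        }
    }
}
```

The generated statements should be reviewed and run against a backup first, since converting a large table rewrites it.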

 

See: https://www.cnblogs.com/shihaiming/p/5855616.html

See: http://blog.csdn.net/woslx/article/details/49685111

 

 

 

 
