Three ways to query MySQL with JDBC (regular, streaming, cursor)

There are three ways to run a query against MySQL through JDBC:

  • Regular query: the JDBC driver blocks until it has read the entire result set into JVM memory. To limit memory use, the application has to page through the data itself.
  • Streaming query: each call to rs.next() checks whether more data must be read from the MySQL server. If so, the driver reads a batch of data (possibly several rows) into JVM memory for the application to process.
  • Cursor query: the fetchSize parameter controls how many rows are read from the MySQL server on each fetch.

1. Regular query

public static void normalQuery() throws SQLException {
    Connection connection = DriverManager.getConnection("jdbc:mysql://localhost:3307/test?useSSL=false", "root", "123456");
    PreparedStatement statement = connection.prepareStatement(sql);
    //statement.setFetchSize(100); // has no effect
    ResultSet resultSet = statement.executeQuery();
    
    while(resultSet.next()){
        System.out.println(resultSet.getString(2));
    }
    resultSet.close();
    statement.close();
    connection.close();
}

1) Description:

  1. The commented-out statement.setFetchSize(100) call would have no effect here.
  2. statement.executeQuery() blocks until all rows have been returned and stored in memory. Each subsequent resultSet.next() call then reads from memory.

2) With a small JVM heap (-Xms16m -Xmx16m), a query that returns a large result set fails with an OutOfMemoryError (OOM).

To avoid OOM, we usually either page through the data or use one of the following two approaches.

2. Streaming query

public static void streamQuery() throws Exception { 
    Connection connection = DriverManager.getConnection("jdbc:mysql://localhost:3307/test?useSSL=false", "root", "123456");
    PreparedStatement statement = connection.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);    
    statement.setFetchSize(Integer.MIN_VALUE); 
    // or, equivalently, via com.mysql.jdbc.StatementImpl
    ((StatementImpl) statement).enableStreamingResults();
    
    ResultSet rs = statement.executeQuery();
    while (rs.next()) {
        System.out.println(rs.getString(2));
    }
    rs.close();
    statement.close();
    connection.close();
}

2.1) Conditions for streaming queries:

For result sets of millions or tens of millions of rows, a streaming query effectively avoids OOM. statement.executeQuery() does not read all the data from the TCP response stream; each subsequent rs.next() call reads part of the data from the stream as needed. Two conditions must be met:

  1. When creating the Statement, specify ResultSet.TYPE_FORWARD_ONLY and ResultSet.CONCUR_READ_ONLY
  2. Set fetchSize to Integer.MIN_VALUE

Alternatively, call the enableStreamingResults() method of com.mysql.jdbc.StatementImpl. The two are equivalent: in the Connector/J 5.1 source, enableStreamingResults() itself sets the fetch size to Integer.MIN_VALUE and the result set type to TYPE_FORWARD_ONLY.

2.2) Principle of streaming query:

1) Basic concepts

The JDBC client and the MySQL server communicate over TCP, and the data is transmitted using the MySQL protocol. One concept first: once the three-way handshake has established a TCP connection, both sides can communicate over that channel until the connection is closed.

In TCP, the sender and receiver can be client/server or server/client: the connection is full-duplex, so either party can send or receive data at any time. TCP is a byte stream, and neither the sender nor the receiver preserves record boundaries, so the boundaries must be expressed by an agreed protocol. For MySQL, that is the MySQL protocol.
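Because TCP carries no record boundaries, the MySQL protocol frames its own messages: each packet starts with a 4-byte header made of a 3-byte little-endian payload length and a 1-byte sequence id. The following is a minimal sketch of reading such frames from any InputStream. The class and method names (PacketFraming, readPacket, frame, readAll) are illustrative and are not part of Connector/J, and payloads large enough to be split across several packets are not handled.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: reads MySQL-style framed packets from a plain byte stream.
final class PacketFraming {

    /** Reads one packet payload, or returns null at a clean end of stream. */
    static byte[] readPacket(InputStream in) throws IOException {
        int b0 = in.read();
        if (b0 < 0) {
            return null; // no more packets
        }
        DataInputStream data = new DataInputStream(in);
        int b1 = data.readUnsignedByte();
        int b2 = data.readUnsignedByte();
        data.readUnsignedByte(); // sequence id, ignored in this sketch
        int length = b0 | (b1 << 8) | (b2 << 16); // 3-byte little-endian length
        byte[] payload = new byte[length];
        data.readFully(payload); // throws EOFException if the stream is cut short
        return payload;
    }

    /** Builds a framed packet; used here only to fabricate sample input. */
    static byte[] frame(int sequenceId, byte[] payload) {
        if (payload.length > 0xFFFFFF) {
            throw new IllegalArgumentException("payload does not fit in one packet");
        }
        byte[] packet = new byte[4 + payload.length];
        packet[0] = (byte) (payload.length & 0xFF);
        packet[1] = (byte) ((payload.length >> 8) & 0xFF);
        packet[2] = (byte) ((payload.length >> 16) & 0xFF);
        packet[3] = (byte) sequenceId;
        System.arraycopy(payload, 0, packet, 4, payload.length);
        return packet;
    }

    /** Splits a buffer holding several framed packets into their payloads. */
    static List<byte[]> readAll(byte[] wire) {
        List<byte[]> payloads = new ArrayList<>();
        try (InputStream in = new ByteArrayInputStream(wire)) {
            for (byte[] p = readPacket(in); p != null; p = readPacket(in)) {
                payloads.add(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return payloads;
    }
}
```

The reader never needs to know how the bytes were split into TCP segments; the length prefix alone tells it where each packet ends.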

With these concepts in place, we can restate the two kinds of query:

When st.executeQuery() is executed, the JDBC driver sends the SQL command over the TCP connection held by the connection object and receives the response on the same channel. The two kinds of query differ in how they read that response:

  1. Regular query (also called batch query): the JDBC client blocks and reads all the data returned by the MySQL server from the TCP channel in one go.
  2. Streaming query: the JDBC client reads the data returned by the MySQL server from the TCP channel in batches. When the client calls rs.next(), the driver reads part of the data from the TCP stream as needed. The amount read each time is not a single row; it is usually one packet's worth. (So it is not one row per read, contrary to what many articles online claim.)

2) Source code viewing:

Following the call chain down from statement.executeQuery(), the main calls are as follows:

protected ResultSetInternalMethods executeInternal(int maxRowsToRetrieve, Buffer sendPacket, boolean createStreamingResultSet, boolean queryIsSelectOnly,
            Field[] metadataFromCache, boolean isBatch) throws SQLException {
        synchronized (checkClosed().getConnectionMutex()) {
            MySQLConnection locallyScopedConnection = this.connection;
            rs = locallyScopedConnection.execSQL(this, null, maxRowsToRetrieve, sendPacket, this.resultSetType, this.resultSetConcurrency,
                            createStreamingResultSet, this.currentCatalog, metadataFromCache, isBatch);
            return rs;
        }
}
public ResultSetInternalMethods execSQL(StatementImpl callingStatement, String sql, int maxRows, Buffer packet, int resultSetType, int resultSetConcurrency,
            boolean streamResults, String catalog, Field[] cachedMetadata, boolean isBatch) throws SQLException {
        synchronized (getConnectionMutex()) {
            return this.io.sqlQueryDirect(callingStatement, null, null, packet, maxRows, resultSetType, resultSetConcurrency, streamResults, catalog,
                        cachedMetadata);
        }
}
final ResultSetInternalMethods sqlQueryDirect(StatementImpl callingStatement, String query, String characterEncoding, Buffer queryPacket, int maxRows,
            int resultSetType, int resultSetConcurrency, boolean streamResults, String catalog, Field[] cachedMetadata) throws Exception {
        Buffer resultPacket = sendCommand(MysqlDefs.QUERY, null, queryPacket, false, null, 0);
        ResultSetInternalMethods rs = readAllResults(callingStatement, maxRows, resultSetType, resultSetConcurrency, streamResults, catalog, resultPacket,
                    false, -1L, cachedMetadata);
        return rs;
}
ResultSetImpl readAllResults(StatementImpl callingStatement, int maxRows, int resultSetType, int resultSetConcurrency, boolean streamResults,
            String catalog, Buffer resultPacket, boolean isBinaryEncoded, long preSentColumnCount, Field[] metadataFromCache) throws SQLException {
        ResultSetImpl topLevelResultSet = readResultsForQueryOrUpdate(callingStatement, maxRows, resultSetType, resultSetConcurrency, streamResults, catalog,
                resultPacket, isBinaryEncoded, preSentColumnCount, metadataFromCache);
        return topLevelResultSet;
}
protected final ResultSetImpl readResultsForQueryOrUpdate(StatementImpl callingStatement, int maxRows, int resultSetType, int resultSetConcurrency,
            boolean streamResults, String catalog, Buffer resultPacket, boolean isBinaryEncoded, long preSentColumnCount, Field[] metadataFromCache) throws SQLException {
            com.mysql.jdbc.ResultSetImpl results = getResultSet(callingStatement, columnCount, maxRows, resultSetType, resultSetConcurrency, streamResults,
                    catalog, isBinaryEncoded, metadataFromCache);
            return results;
}
protected ResultSetImpl getResultSet(StatementImpl callingStatement, long columnCount, int maxRows, int resultSetType, int resultSetConcurrency,
            boolean streamResults, String catalog, boolean isBinaryEncoded, Field[] metadataFromCache) throws SQLException {
        Buffer packet; // The packet from the server
        RowData rowData = null;
        if (!streamResults) {
            rowData = readSingleRowSet(columnCount, maxRows, resultSetConcurrency, isBinaryEncoded, (metadataFromCache == null) ? fields : metadataFromCache);
        } else {
            rowData = new RowDataDynamic(this, (int) columnCount, (metadataFromCache == null) ? fields : metadataFromCache, isBinaryEncoded);
            this.streamingData = rowData;
        }
        ResultSetImpl rs = buildResultSetWithRows(callingStatement, catalog, (metadataFromCache == null) ? fields : metadataFromCache, rowData, resultSetType,
                resultSetConcurrency, isBinaryEncoded);
        return rs;
}

Notes:

  1. sendCommand() in the sqlQueryDirect() method sends the SQL command to the MySQL server over the I/O channel; the response is then read back from the connection's input stream (mysqlInput in MysqlIO).
  2. The getResultSet() method decides whether the query is a streaming query or a batch query. Depending on the parameter settings, the MySQL driver selects one of three RowData implementations, corresponding to the three query methods:
  • RowDataStatic: static result set, the default; regular query
  • RowDataDynamic: dynamic result set; streaming query
  • RowDataCursor: cursor result set; cursor-based server-side query

In getResultSet() above, for a batch query the readSingleRowSet() method loops over nextRow() to fetch all the rows and stores them in JVM memory.

For a streaming query, the driver simply creates a RowDataDynamic object and returns it. Later, each rs.next() call reads data from the server's response stream (mysqlInput in MysqlIO) as needed.
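The difference between the two strategies can be shown with a toy model that has no database behind it. In this sketch, RowSource stands in for the server response and counts how many rows have been pulled from it. All names here are hypothetical and are not Connector/J classes. The model also pulls exactly one row per call, whereas the real driver reads a packet's worth of data at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Toy model of batch reading versus streaming; not the driver's implementation.
final class RowReadingModel {

    /** Stands in for the server response: hands out rows and counts the pulls. */
    static final class RowSource {
        private final int total;
        private int pulled = 0;

        RowSource(int total) {
            this.total = total;
        }

        boolean hasMore() {
            return pulled < total;
        }

        String pull() {
            pulled++;
            return "row-" + pulled;
        }

        int pulledSoFar() {
            return pulled;
        }
    }

    /** Batch style: drain the whole source into memory before returning. */
    static List<String> readAllUpFront(RowSource source) {
        List<String> rows = new ArrayList<>();
        while (source.hasMore()) {
            rows.add(source.pull());
        }
        return rows;
    }

    /** Streaming style: return at once; each next() pulls from the source on demand. */
    static Iterator<String> readOnDemand(RowSource source) {
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return source.hasMore();
            }

            @Override
            public String next() {
                if (!source.hasMore()) {
                    throw new NoSuchElementException();
                }
                return source.pull();
            }
        };
    }
}
```

With readAllUpFront the memory cost grows with the size of the result set, while readOnDemand holds only the row currently being processed.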

2.3) A pitfall of streaming queries:

public static void streamQuery2() throws Exception { 
    Connection connection = DriverManager.getConnection("jdbc:mysql://localhost:3307/test?useSSL=false", "root", "123456");
    //statement1
    PreparedStatement statement = connection.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);    
    statement.setFetchSize(Integer.MIN_VALUE); 
    ResultSet rs = statement.executeQuery();
    if (rs.next()) {
        System.out.println(rs.getString(2));
    }
    //statement2
    PreparedStatement statement2 = connection.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);    
    statement2.setFetchSize(Integer.MIN_VALUE); 
    ResultSet rs2 = statement2.executeQuery();
    if (rs2.next()) {
        System.out.println(rs2.getString(2));
    }
//      rs.close();
//      statement.close();
//      connection.close();
}

Output:

test1
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@45c8e616 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:869)
	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:865)
	at com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:3217)
	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2453)
	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2683)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2482)
	at com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1465)
	at com.mysql.jdbc.StatementImpl.setupStreamingTimeout(StatementImpl.java:726)
	at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1939)
	at com.tencent.clue_disp_api.MysqlTest.streamQuery2(MysqlTest.java:79)
	at com.tencent.clue_disp_api.MysqlTest.main(MysqlTest.java:25)

The MySQL Connector/J 5.1 Developer Guide states:

There are some caveats with this approach. You must read all of the rows in the result set (or close it) before you can issue any other queries on the connection, or an exception will be thrown.

In other words, after obtaining a streaming ResultSet, you cannot issue another query on the same database connection until you have either iterated over all of its rows with next() or called close() on it; otherwise an exception is thrown. In the example above, the first query succeeds and the second one throws.
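The rule, and the way around it, can be illustrated with a toy model that needs no database. Conn and StreamingRows below are hypothetical stand-ins for a JDBC connection and a streaming result set; they only mimic the "one active streaming result set per connection" check and are not the driver's code.

```java
// Toy model of the rule; real code would use java.sql.Connection and java.sql.ResultSet.
final class StreamingGuardModel {

    static final class Conn {
        private StreamingRows active; // the open streaming result set, if any

        StreamingRows executeStreamingQuery(String sql) {
            if (active != null) {
                throw new IllegalStateException("Streaming result set is still active");
            }
            active = new StreamingRows(this);
            return active;
        }
    }

    static final class StreamingRows implements AutoCloseable {
        private final Conn owner;

        StreamingRows(Conn owner) {
            this.owner = owner;
        }

        @Override
        public void close() {
            owner.active = null; // frees the connection for the next query
        }
    }

    /** Mirrors streamQuery2(): the second query fails while the first stream is open. */
    static boolean overlappingQueriesFail() {
        Conn conn = new Conn();
        conn.executeStreamingQuery("select 1");
        try {
            conn.executeStreamingQuery("select 2");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    /** The fix: close each stream (try-with-resources) before issuing the next query. */
    static boolean sequentialQueriesSucceed() {
        Conn conn = new Conn();
        try (StreamingRows first = conn.executeStreamingQuery("select 1")) {
            // consume the rows of the first query here
        }
        try (StreamingRows second = conn.executeStreamingQuery("select 2")) {
            // consume the rows of the second query here
        }
        return true;
    }
}
```

With real JDBC the same shape applies: wrap each PreparedStatement and ResultSet in try-with-resources so that the first result set is closed before the second executeQuery() call, or run the second query on a separate connection.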

2.4) Packet capture verification:

Looking at the packets sent from port 3307 to port 62169 (server to client), the ACK number is 1324 in every one, which shows that they all carry response data for the same SQL request.

3. Cursor query

public static void cursorQuery() throws Exception {
    Connection connection = DriverManager.getConnection("jdbc:mysql://localhost:3307/test?useSSL=false&useCursorFetch=true", "root", "123456");
    ((JDBC4Connection) connection).setUseCursorFetch(true); //com.mysql.jdbc.JDBC4Connection
    Statement statement = connection.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);    
    statement.setFetchSize(2);    
    ResultSet rs = statement.executeQuery(sql);    
    while (rs.next()) {
        System.out.println(rs.getString(2));
        Thread.sleep(5000);
    }
    
    rs.close();
    statement.close();
    connection.close();
}

1) Description:

  • useCursorFetch=true must be appended to the connection URL;
  • When creating the Statement, set ResultSet.TYPE_FORWARD_ONLY and ResultSet.CONCUR_READ_ONLY;
  • Set fetchSize to control how many rows are fetched each time.
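Taken together, the conditions this article lists for the three modes can be summarised in a small decision function. This is a simplified model written for illustration, not the driver's own code: the class name FetchModeModel, the method modeFor, and the requirement that the cursor fetch size be positive are assumptions.

```java
import java.sql.ResultSet;

// Simplified model of how the settings map to the three query modes.
final class FetchModeModel {

    enum Mode { STATIC, STREAMING, CURSOR }

    static Mode modeFor(int resultSetType, int concurrency, int fetchSize, boolean useCursorFetch) {
        boolean forwardReadOnly = resultSetType == ResultSet.TYPE_FORWARD_ONLY
                && concurrency == ResultSet.CONCUR_READ_ONLY;
        if (forwardReadOnly && fetchSize == Integer.MIN_VALUE) {
            return Mode.STREAMING; // rows are read from the socket on demand
        }
        if (forwardReadOnly && useCursorFetch && fetchSize > 0) {
            return Mode.CURSOR; // rows are fetched from a server-side cursor in batches
        }
        return Mode.STATIC; // the whole result set is read into memory
    }

    public static void main(String[] args) {
        int type = ResultSet.TYPE_FORWARD_ONLY;
        int concurrency = ResultSet.CONCUR_READ_ONLY;
        System.out.println(modeFor(type, concurrency, 100, false));
        System.out.println(modeFor(type, concurrency, Integer.MIN_VALUE, false));
        System.out.println(modeFor(type, concurrency, 2, true));
    }
}
```

The first call in main corresponds to the regular query in section 1, where setting a fetch size of 100 without useCursorFetch changes nothing.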

2) Packet capture verification:

A Wireshark capture shows that whenever rs.next() needs more rows, the driver sends a fetch request to the MySQL server, and the server returns two rows (the fetchSize set above).
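The batching can be modelled without a database. In the sketch below, CursorFetchModel is a hypothetical class that keeps a small client-side buffer and counts one simulated round trip each time the buffer has to be refilled. The real driver may issue a different number of requests (for example, an extra one to discover the end of the cursor), so the counts apply to this toy model only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of cursor-style fetching; not a Connector/J class.
final class CursorFetchModel {
    private final int totalRows;
    private final int fetchSize;
    private final Deque<Integer> buffer = new ArrayDeque<>();
    private int nextRowOnServer = 0;
    private int fetchRequests = 0;

    CursorFetchModel(int totalRows, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        this.totalRows = totalRows;
        this.fetchSize = fetchSize;
    }

    /** Returns the next row id, or -1 when the cursor is exhausted. */
    int next() {
        if (buffer.isEmpty() && nextRowOnServer < totalRows) {
            fetchRequests++; // one simulated round trip to the server
            for (int i = 0; i < fetchSize && nextRowOnServer < totalRows; i++) {
                buffer.add(nextRowOnServer++);
            }
        }
        return buffer.isEmpty() ? -1 : buffer.poll();
    }

    /** Number of simulated round trips made so far. */
    int fetchRequests() {
        return fetchRequests;
    }

    /** Number of rows currently held on the client side. */
    int buffered() {
        return buffer.size();
    }
}
```

In this model the client never holds more than fetchSize rows at once, which is the property that keeps memory use bounded.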

3) Points to pay attention to in cursor query:

MySQL does not know when the client will finish consuming the data, and the underlying table may be modified by DML writes in the meantime, so MySQL has to create temporary storage for the rows that are still to be fetched. As a result, when you read a large table with useCursorFetch enabled, you will see several effects on the MySQL server:

  1. IOPS soars (IOPS, Input/Output Operations Per Second: the number of disk reads and writes per second)
  2. Disk usage soars
  3. After the JDBC client sends the SQL, it waits a long time for the response data; during this time, the server is preparing the data
  4. Once the data is ready and transmission starts, network traffic soars, and the disk I/O changes from "reading and writing" to "reading"
  5. CPU and memory usage rise by a certain percentage

Origin blog.csdn.net/liuxiao723846/article/details/130726967