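Thoughts on and Understanding of NIO and Netty (Part 1)

My understanding of blocking versus non-blocking in Java IO and NIO has long been one-sided. I recently studied the topic in more depth, and on that basis this series goes on to cover Netty's architecture and design ideas.

Main differences between Java NIO and IO

Tutorials, blog posts, and books say a lot about how NIO differs from IO; most of them list the following as the biggest differences.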

IO                  NIO
Stream-oriented     Buffer-oriented
Blocking IO         Non-blocking IO
(none)              Selectors

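Put simply, IO is stream-oriented and NIO is block (buffer) oriented. A stream-oriented I/O system processes data one byte at a time, while a block (buffer) oriented I/O system processes data in blocks.
Many books and tutorials illustrate blocking versus non-blocking directly with network programming examples; in practice, file reads/writes and network programming need to be treated separately.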
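The stream versus buffer distinction can be sketched as follows. This is only an illustration under my own assumptions: the class and method names (StreamVsBuffer, readByStream, readByBuffer) are made up for the example, and an in-memory byte array stands in for a real file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

class StreamVsBuffer {
    // Stream style: pull one byte per call until read() reports end of stream (-1).
    static String readByStream(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            sb.append((char) b);
        }
        return sb.toString();
    }

    // Buffer style: the channel fills a ByteBuffer, which is then flipped and drained.
    static String readByBuffer(ReadableByteChannel ch) throws IOException {
        StringBuilder sb = new StringBuilder();
        ByteBuffer buf = ByteBuffer.allocate(4); // deliberately small to force several rounds
        while (ch.read(buf) != -1) {
            buf.flip();                          // switch from filling to draining
            while (buf.hasRemaining()) {
                sb.append((char) buf.get());
            }
            buf.clear();                         // ready to be filled again
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "abcdefg".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readByStream(new ByteArrayInputStream(data)));
        System.out.println(readByBuffer(Channels.newChannel(new ByteArrayInputStream(data))));
    }
}
```

Both methods return the same text; the difference is only in how the data is moved, one byte per call versus one buffer per call.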

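1. Causes of thread blocking

A thread can block for many reasons. According to the book 《Java网络编程精解》, the main ones are the following.

  • The thread calls Thread.sleep(long millis): it gives up the CPU, sleeps for the given number of milliseconds, and then resumes.
  • The thread wants to run a synchronized block but cannot acquire the lock, so it blocks until it obtains the lock.
  • The thread calls wait() on an object and blocks; it can only be woken when another thread calls notify() or notifyAll() on that object.
  • The thread performs I/O or remote communication and blocks while waiting for the resource. For example, when a thread calls System.in.read() and the user has not typed anything, the thread waits until input arrives before read() returns.
    Stack Overflow explains the blocking behavior of InputStream.read() like this: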

OK, this is a bit of a mess so first thing lets clear this up: InputStream.read() blocking has nothing to do with multi-threading. If you have multiple threads reading from the same input stream and you trigger two events very close to each other - where each thread is trying to consume an event then you’d get corruption: the first thread to read will get some bytes (possibly all the bytes) and when the second thread gets scheduled it will read the rest of the bytes. If you plan to use a single IO stream in more then one thread, always synchronized() {} on some external constraint.
Second, if you can read from your InputStream until you get -1 and then wait and can read again later, then the InputStream implementation you are using is broken! The contract for InputStream clearly states that an InputStream.read() should only return -1 when there is no more data to read because the end of the entire stream has been reached and no more data will EVER be available - like when you read from a file and you reach the end.
The behavior for “no more data is available now, please wait and you’ll get more” is for read() to block and not return until there is some data available (or an exception is thrown).

The third point describes behavior similar to a synchronized method: when several threads call it, only one of them runs at a time.
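The first three causes in the list above can be observed through Thread.getState(). The sketch below is only an illustration; the class name BlockingStates and its helper methods are my own, and the polling helper exists just to give each thread time to reach its state.

```java
class BlockingStates {
    // Polls until the thread reaches the expected state (or about 5 s pass),
    // then returns the state that was actually observed.
    static Thread.State await(Thread t, Thread.State expected) throws InterruptedException {
        for (int i = 0; i < 500 && t.getState() != expected; i++) {
            Thread.sleep(10);
        }
        return t.getState();
    }

    static Thread.State[] observe() throws InterruptedException {
        Object lock = new Object();
        Object monitor = new Object();

        Thread sleeper = new Thread(() -> {
            try { Thread.sleep(60_000); } catch (InterruptedException ignored) { }
        });
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                try { monitor.wait(); } catch (InterruptedException ignored) { }
            }
        });
        Thread contender = new Thread(() -> {
            synchronized (lock) { /* entered only after the main thread releases the lock */ }
        });

        Thread.State[] states = new Thread.State[3];
        sleeper.start();
        waiter.start();
        synchronized (lock) {                 // hold the lock so the contender cannot enter
            contender.start();
            states[0] = await(sleeper, Thread.State.TIMED_WAITING);
            states[1] = await(waiter, Thread.State.WAITING);
            states[2] = await(contender, Thread.State.BLOCKED);
        }
        sleeper.interrupt();                  // wake the sleeping and waiting threads so they finish
        waiter.interrupt();
        sleeper.join();
        waiter.join();
        contender.join();
        return states;
    }

    public static void main(String[] args) throws InterruptedException {
        for (Thread.State s : observe()) {
            System.out.println(s);
        }
    }
}
```

The fourth cause is harder to see this way: as far as I know, on HotSpot a thread blocked inside a native I/O call is usually still reported as RUNNABLE, so the thread state alone does not reveal I/O blocking.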

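2. File reads and writes always block

Whether you use IO or NIO, file operations are blocking.

2.1 Reading a file with traditional IO

daily01.txt contains just a few characters: abcdefg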

public static void main(String[] args) throws Exception{
        try {
            FileInputStream fis = new FileInputStream(new File("daily01.txt"));
            InputStreamReader isr = new InputStreamReader(fis);// reader over the file

            Thread  t1 = new Thread(()->{
                try {
                    int b = 0;
                    while ((b = isr.read()) != -1) {// -1 signals end of stream; compare with the > 0 form used in t2 (see the source analysis below)
                        Thread.sleep(1000);
                        System.out.print((char)b);
                    }
                }catch (FileNotFoundException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            });

            t1.start();

            Thread  t2 = new Thread(()->{
                try {
                    int b = 0;
                    while ((b = isr.read()) > 0) {
                        Thread.sleep(1500);
                        System.out.print((char)b);
                    }
                }catch (FileNotFoundException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            });

            t2.start();

        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

Output:

abcedfg
Process finished with exit code 0

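The result shows that sharing the reader is thread-safe: each character is read exactly once. The loop while ((b = isr.read()) != -1) can terminate because read() reports end of file (EOF) once the end of the file is reached.
Both while ((b = isr.read()) != -1) and while ((b = isr.read()) > 0) appear online, so it is worth analyzing which form is more correct.
The place to start is openjdk\jdk\src\share\native\java\io\FileInputStream.c in OpenJDK, at: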

JNIEXPORT jint JNICALL
Java_java_io_FileInputStream_readBytes(JNIEnv *env, jobject this,
        jbyteArray bytes, jint off, jint len) {
    return readBytes(env, this, bytes, off, len, fis_fd);//this calls readBytes in io_util.c
}

Next, look at io_util.c:

jint
readBytes(JNIEnv *env, jobject this, jbyteArray bytes,
          jint off, jint len, jfieldID fid)
{
    jint nread;
    char stackBuf[BUF_SIZE];
    char *buf = NULL;
    FD fd;
    //a series of checks follows; most of them throw an exception and return -1
    if (IS_NULL(bytes)) {
        JNU_ThrowNullPointerException(env, NULL);
        return -1;
    }

    if (outOfBounds(env, off, len, bytes)) {
        JNU_ThrowByName(env, "java/lang/IndexOutOfBoundsException", NULL);
        return -1;
    }
   //note: this returns 0
    if (len == 0) {
        return 0;
    } else if (len > BUF_SIZE) {
        buf = malloc(len);
        if (buf == NULL) {
            JNU_ThrowOutOfMemoryError(env, NULL);
            return 0;
        }
    } else {
        buf = stackBuf;
    }

    //get the file descriptor and read from it
    fd = GET_FD(this, fid);
    if (fd == -1) {
        JNU_ThrowIOException(env, "Stream Closed");
        nread = -1;
    } else {
        //IO_Read is mapped by the JVM to the low-level I/O call of each operating system; on Linux it calls
        //#include <unistd.h>
        //ssize_t read (int fd, void *buf, size_t len); 
        nread = IO_Read(fd, buf, len);
        if (nread > 0) {
            (*env)->SetByteArrayRegion(env, bytes, off, nread, (jbyte *)buf);
        } else if (nread == -1) {
            JNU_ThrowIOExceptionWithLastError(env, "Read error");
        } else { /* EOF */
            nread = -1;//EOF (IO_Read returned 0) is also reported as -1
        }
    }

    if (buf != stackBuf) {
        free(buf);
    }
    return nread;
}

I looked up the Linux read() function:

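NAME
       read - read from a file descriptor

SYNOPSIS
       #include <unistd.h>

       ssize_t read(int fd, void *buf, size_t count);

       fd:     the file descriptor to read from
       buf:    address of the buffer that receives the data
       count:  number of bytes requested

       Return value:
           > 0   success; the number of bytes read
           0     end of file reached
           -1    read error

Note: read() only returns what is actually there. If the file holds just 10 bytes and the caller asks for 100, only 10 bytes are read, because there is nothing more to give.
      The requested size is therefore an upper bound, and callers commonly ask for more than the real size of the data in the file.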

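A call to read() can have several outcomes:

1. It returns a value equal to len; all bytes are stored in buf.
2. It returns a value greater than 0 but less than len. This happens when a signal interrupts the read, an error occurs partway through, fewer than len bytes are available, or EOF is reached before len bytes are read.
3. It returns 0, which marks EOF: there is no data to read.
4. The call blocks because no data is available yet.
5. It returns -1 with errno set to EINTR: a signal arrived before any data was read, and the call can be retried.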
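Outcome 2 (a short read) has a direct counterpart in Java: InputStream.read(byte[], int, int) may also return fewer bytes than requested. A common way to deal with it is to loop until the buffer is full or the stream ends. The sketch below is my own illustration; ReadFully and trickle are hypothetical names, and trickle deliberately hands out at most 3 bytes per call to imitate short reads.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadFully {
    // Keeps calling read() until len bytes have arrived or the stream has ended.
    // Returns the number of bytes stored, which is less than len only at end of stream.
    static int readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n == -1) {
                break;          // end of stream: stop with whatever was read so far
            }
            total += n;         // short read: keep going
        }
        return total;
    }

    // Test helper: a stream that returns at most 3 bytes per read call.
    static InputStream trickle(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789".getBytes();
        byte[] buf = new byte[100];
        // Asking for 100 bytes from a 10-byte source yields 10, as in the note above.
        System.out.println(readFully(trickle(data), buf, 0, 100));
        // Asking for 5 bytes needs two short reads (3 + 2) but still yields 5.
        System.out.println(readFully(trickle(data), buf, 0, 5));
    }
}
```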

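To summarize: every error path in the native code returns -1 with an exception pending, and even the operating system's normal end-of-file result of 0 is converted to -1 by Java. A return value of 0 is only seen when len == 0 is passed in (the allocation-failure branch also returns 0, but an OutOfMemoryError has already been raised there).
For array reads with len > 0 the two loop conditions therefore behave the same. For the single-character isr.read(), however, the return value is the character itself, so > 0 would also stop at a NUL character (value 0); the safer form is while ((b = isr.read()) != -1).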
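A quick way to compare the two loop conditions on the single-character read() is to feed the reader data that contains a NUL character. The sketch below is an illustration of my own (EofCheck and its method names are hypothetical); it reads the three characters 'a', NUL, 'b' with each condition and counts how many characters each loop sees.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class EofCheck {
    // Counts characters using the "!= -1" end-of-stream test.
    static int countUntilMinusOne(Reader r) throws IOException {
        int n = 0;
        while (r.read() != -1) {
            n++;
        }
        return n;
    }

    // Counts characters using the "> 0" test, which also stops at a NUL character.
    static int countWhilePositive(Reader r) throws IOException {
        int n = 0;
        while (r.read() > 0) {
            n++;
        }
        return n;
    }

    static Reader sample() {
        byte[] data = {'a', 0, 'b'};   // the middle byte decodes to the NUL character
        return new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countUntilMinusOne(sample())); // prints 3
        System.out.println(countWhilePositive(sample())); // prints 1
    }
}
```

For plain text files the two conditions rarely differ in practice, which is probably why both forms circulate.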

2.2 Reading a file with NIO

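    // needs: java.io.RandomAccessFile, java.nio.ByteBuffer, java.nio.channels.FileChannel
    RandomAccessFile randomAccessFile = new RandomAccessFile("test.txt", "rw");
    FileChannel inChannel = randomAccessFile.getChannel();
    ByteBuffer buf = ByteBuffer.allocate(1024);
    int bytesRead = inChannel.read(buf);
    while (bytesRead != -1) {
        System.out.println("Read " + bytesRead);
        buf.flip();   // switch the buffer from filling to draining

        while (buf.hasRemaining()) {
            System.out.print((char) buf.get());
        }
        buf.clear();  // make the buffer ready to be filled again
        bytesRead = inChannel.read(buf);
    }
    randomAccessFile.close();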

The code is still written in a blocking style, and the underlying implementation of FileChannel, FileChannelImpl.read(), is synchronous as well.

public int read(ByteBuffer var1) throws IOException {
        this.ensureOpen();
        if (!this.readable) {
            throw new NonReadableChannelException();
        } else {
            Object var2 = this.positionLock;
            synchronized(this.positionLock) {
                int var3 = 0;
                int var4 = -1;

                try {
                    this.begin();
                    var4 = this.threads.add();
                    if (!this.isOpen()) {
                        byte var12 = 0;
                        return var12;
                    } else {
                        do {
                            var3 = IOUtil.read(this.fd, var1, -1L, this.nd);
                        } while(var3 == -3 && this.isOpen());

                        int var5 = IOStatus.normalize(var3);
                        return var5;
                    }
                } finally {
                    this.threads.remove(var4);
                    this.end(var3 > 0);

                    assert IOStatus.check(var3);

                }
            }
        }
    }
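Another way to see why file reads stay blocking under NIO: configureBlocking(false) and register(selector, ...) are defined on SelectableChannel, and FileChannel does not extend it. The small check below is my own illustration (SelectableCheck is a hypothetical name); it only inspects the class hierarchy and does not open any channel.

```java
import java.nio.channels.FileChannel;
import java.nio.channels.SelectableChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

class SelectableCheck {
    // True if the channel type can be put into non-blocking mode and registered with a Selector.
    static boolean selectable(Class<?> channelType) {
        return SelectableChannel.class.isAssignableFrom(channelType);
    }

    public static void main(String[] args) {
        System.out.println("FileChannel:         " + selectable(FileChannel.class));         // false
        System.out.println("SocketChannel:       " + selectable(SocketChannel.class));       // true
        System.out.println("ServerSocketChannel: " + selectable(ServerSocketChannel.class)); // true
    }
}
```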

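3. Blocking network IO

In the traditional BIO model, the three main calls socket.accept(), socket.read(), and socket.write() are all synchronous and blocking, so the actual work is usually handled with a multi-threaded model. For read() and write(), the two sides must also agree on an end marker; otherwise the reader keeps blocking while it waits for more data.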

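        ServerSocket serverSocket = new ServerSocket();
        serverSocket.bind(new InetSocketAddress(8899));

        while(true) {//accepting connections is done by a single blocking thread
            Socket socket = serverSocket.accept();//blocks here; if the work is not handed to other threads, the next request waits in the queue, much like calling a synchronized method
            if (socket.isConnected()) {
                SocketAddress remoteAddress = socket.getRemoteSocketAddress();
                System.out.println("Connection established: " + remoteAddress.toString());

                try {
                    InputStream is = socket.getInputStream();
                    InputStreamReader isr = new InputStreamReader(is);//reader over the socket
                    int b = 0;
                    //carriage return (13) is the end marker; without one the loop would keep blocking.
                    //-1 is checked as well so the loop also ends when the client closes the connection
                    while ((b = isr.read()) != -1 && b != 13) {
                        System.out.print((char)b);
                    }
                    socket.close();
                }catch (IOException e){
                    e.printStackTrace();
                }
            }
        }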
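The multi-threaded BIO model mentioned above can be sketched roughly like this. It is a minimal illustration under my own assumptions, not a production server: the names ThreadedBioServer, handle, serve, and roundTrip are made up, the pool size of 4 is arbitrary, and the protocol is simply "one line in, one line out". The acceptor thread blocks in accept(), and each worker thread blocks in readLine() for its own connection only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadedBioServer {
    // Worker: reads one line from the client and echoes it back.
    static void handle(Socket socket) {
        try (Socket s = socket;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line = in.readLine();          // blocks only this worker thread
            out.println("echo: " + line);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Acceptor: accept() blocks this thread; each connection is handed to the pool.
    static void serve(ServerSocket server, ExecutorService workers) {
        try {
            while (true) {
                Socket socket = server.accept();
                workers.execute(() -> handle(socket));
            }
        } catch (IOException e) {
            // accept() fails once the server socket is closed; treated here as shutdown
        }
    }

    // Starts the server on a free port, sends one line as a client, and returns the reply.
    static String roundTrip(String message) throws IOException {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        try (ServerSocket server = new ServerSocket(0)) {   // port 0 = any free port
            new Thread(() -> serve(server, workers)).start();
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8))) {
                out.println(message);             // the line terminator is the end marker
                return in.readLine();
            }
        } finally {
            workers.shutdownNow();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello"));   // prints "echo: hello"
    }
}
```

The cost of this design is one blocked thread per open connection, which is the limitation the NIO selector model addresses.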

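In the example above the InputStream implementation is SocketInputStream. The socketRead0 source (the Windows version, as the WSA* calls indicate) is as follows: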

JNIEXPORT jint JNICALL
Java_java_net_SocketInputStream_socketRead0(JNIEnv *env, jobject this,
                                            jobject fdObj, jbyteArray data,
                                            jint off, jint len, jint timeout)
{
    char *bufP;
    char BUF[MAX_BUFFER_LEN];
    jint fd, newfd;
    jint nread;

    if (IS_NULL(fdObj)) {
        JNU_ThrowByName(env, JNU_JAVANETPKG "SocketException", "socket closed");
        return -1;
    }
    fd = (*env)->GetIntField(env, fdObj, IO_fd_fdID);
    if (fd == -1) {
        NET_ThrowSocketException(env, "Socket closed");
        return -1;
    }

    /*
     * If the caller buffer is large than our stack buffer then we allocate
     * from the heap (up to a limit). If memory is exhausted we always use
     * the stack buffer.
     */
    if (len <= MAX_BUFFER_LEN) {
        bufP = BUF;
    } else {
        if (len > MAX_HEAP_BUFFER_LEN) {
            len = MAX_HEAP_BUFFER_LEN;
        }
        bufP = (char *)malloc((size_t)len);
        if (bufP == NULL) {
            /* allocation failed so use stack buffer */
            bufP = BUF;
            len = MAX_BUFFER_LEN;
        }
    }


    if (timeout) {
        if (timeout <= 5000 || !isRcvTimeoutSupported) {
            int ret = NET_Timeout (fd, timeout);

            if (ret <= 0) {
                if (ret == 0) {
                    JNU_ThrowByName(env, JNU_JAVANETPKG "SocketTimeoutException",
                                    "Read timed out");
                } else if (ret == JVM_IO_ERR) {
                    JNU_ThrowByName(env, JNU_JAVANETPKG "SocketException", "socket closed");
                } else if (ret == JVM_IO_INTR) {
                    JNU_ThrowByName(env, JNU_JAVAIOPKG "InterruptedIOException",
                                    "Operation interrupted");
                }
                if (bufP != BUF) {
                    free(bufP);
                }
                return -1;
            }

            /*check if the socket has been closed while we were in timeout*/
            newfd = (*env)->GetIntField(env, fdObj, IO_fd_fdID);
            if (newfd == -1) {
                NET_ThrowSocketException(env, "Socket Closed");
                if (bufP != BUF) {
                    free(bufP);
                }
                return -1;
            }
        }
    }

    nread = recv(fd, bufP, len, 0);//recv is a blocking call
    if (nread > 0) {
        (*env)->SetByteArrayRegion(env, data, off, nread, (jbyte *)bufP);
    } else {
        if (nread < 0) {
            // Check if the socket has been closed since we last checked.
            // This could be a reason for recv failing.
            if ((*env)->GetIntField(env, fdObj, IO_fd_fdID) == -1) {
                NET_ThrowSocketException(env, "Socket closed");
            } else {
                switch (WSAGetLastError()) {
                    case WSAEINTR:
                        JNU_ThrowByName(env, JNU_JAVANETPKG "SocketException",
                            "socket closed");
                        break;

                    case WSAECONNRESET:
                    case WSAESHUTDOWN:
                        /*
                         * Connection has been reset - Windows sometimes reports
                         * the reset as a shutdown error.
                         */
                        JNU_ThrowByName(env, "sun/net/ConnectionResetException",
                            "");
                        break;

                    case WSAETIMEDOUT :
                        JNU_ThrowByName(env, JNU_JAVANETPKG "SocketTimeoutException",
                                       "Read timed out");
                        break;

                    default:
                        NET_ThrowCurrent(env, "recv failed");
                }
            }
        }
    }
    if (bufP != BUF) {
        free(bufP);
    }
    return nread;
}

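As the source above shows, the cases that return -1 are generally errors or timeouts, and recv itself blocks. A short-connection protocol such as HTTP therefore uses CRLF as a marker in the data, so that the reader knows where to stop instead of waiting on the socket indefinitely.
This is sun.net.httpserver.Request.readLine(), which reads from the InputStream until it reaches a terminating CRLF.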

public String readLine() throws IOException {
        boolean var1 = false;
        boolean var2 = false;
        this.pos = 0;
        this.lineBuf = new StringBuffer();

        while(!var2) {
            int var3 = this.is.read();
            if (var3 == -1) {
                return null;
            }

            if (var1) {
                if (var3 == 10) {//LF (line feed)
                    var2 = true;
                } else {
                    var1 = false;
                    this.consume(13);
                    this.consume(var3);
                }
            } else if (var3 == 13) {//CR (carriage return)
                var1 = true;
            } else {
                this.consume(var3);
            }
        }

        this.lineBuf.append(this.buf, 0, this.pos);
        return new String(this.lineBuf);
    }

Likewise, Tomcat before 6.0 (6.0 supports both BIO and NIO) used the BIO SocketInputStream to read Strings. Below is the CoyoteReader source, located under apache-tomcat-6.0.53-src\java\org\apache\catalina\connector

private static final char[] LINE_SEP = { '\r', '\n' };
private static final int MAX_LINE_LENGTH = 4096;
public String readLine()
        throws IOException {

        if (lineBuffer == null) {
            lineBuffer = new char[MAX_LINE_LENGTH];
       }

        String result = null;

        int pos = 0;
        int end = -1;
        int skip = -1;
        StringBuffer aggregator = null;
        while (end < 0) {
            mark(MAX_LINE_LENGTH);
            while ((pos < MAX_LINE_LENGTH) && (end < 0)) {
                int nRead = read(lineBuffer, pos, MAX_LINE_LENGTH - pos);
                if (nRead < 0) {
                    if (pos == 0 && aggregator == null) {
                        return null;
                    }
                    end = pos;
                    skip = pos;
                }
                for (int i = pos; (i < (pos + nRead)) && (end < 0); i++) {
                    if (lineBuffer[i] == LINE_SEP[0]) {
                        end = i;
                        skip = i + 1;
                        char nextchar;
                        if (i == (pos + nRead - 1)) {
                            nextchar = (char) read();
                        } else {
                            nextchar = lineBuffer[i+1];
                        }
                        if (nextchar == LINE_SEP[1]) {
                            skip++;
                        }
                    } else if (lineBuffer[i] == LINE_SEP[1]) {
                        end = i;
                        skip = i + 1;
                    }
                }
                if (nRead > 0) {
                    pos += nRead;
                }
            }
            if (end < 0) {
                if (aggregator == null) {
                    aggregator = new StringBuffer();
                }
                aggregator.append(lineBuffer);
                pos = 0;
            } else {
                reset();
                skip(skip);
            }
        }

        if (aggregator == null) {
            result = new String(lineBuffer, 0, end);
        } else {
            aggregator.append(lineBuffer, 0, end);
            result = aggregator.toString();
        }

        return result;

    }

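In httpcore-4.4.10, DefaultBHttpClientConnection constructs a ContentLengthInputStream. This class takes a length argument, which is the value of the HTTP Content-Length: header, and uses that length to decide when the stream has been fully read.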

    public int read (final byte[] b, final int off, final int len) throws java.io.IOException {
        if (closed) {
            throw new IOException("Attempted read from closed stream.");
        }
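        //pos tracks how many bytes have been read so far; once it reaches the length given in the HTTP message, return -1 to signal the end of the stream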
        if (pos >= contentLength) {
            return -1;
        }

        int chunk = len;
        if (pos + len > contentLength) {
            chunk = (int) (contentLength - pos);
        }
        final int count = this.in.read(b, off, chunk);
        if (count == -1 && pos < contentLength) {
            throw new ConnectionClosedException(
                    "Premature end of Content-Length delimited message body (expected: "
                    + contentLength + "; received: " + pos);
        }
        if (count > 0) {
            pos += count;
        }
        return count;
    }
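The same idea, using a known length instead of an end marker, can be sketched with a simple length-prefixed frame. This is an illustration of the technique only and is not how httpcore is implemented; the class LengthPrefixed, the 4-byte length prefix, and the method names are my own assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class LengthPrefixed {
    // Writes a 4-byte big-endian length followed by the payload.
    static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream d = new DataOutputStream(out);
        d.writeInt(payload.length);
        d.write(payload);
        d.flush();
    }

    // Reads the length first, then exactly that many bytes, so the reader
    // knows when the message is complete without waiting for the stream to end.
    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream d = new DataInputStream(in);
        int len = d.readInt();            // throws EOFException if the stream ends early
        if (len < 0) {
            throw new IOException("negative frame length: " + len);
        }
        byte[] payload = new byte[len];
        d.readFully(payload);             // loops internally over short reads
        return payload;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeFrame(wire, "first".getBytes(StandardCharsets.UTF_8));
        writeFrame(wire, "second message".getBytes(StandardCharsets.UTF_8));

        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(new String(readFrame(in), StandardCharsets.UTF_8)); // prints "first"
        System.out.println(new String(readFrame(in), StandardCharsets.UTF_8)); // prints "second message"
    }
}
```

On a real socket the two frames could arrive in arbitrary pieces; because the reader counts bytes, it still recovers the message boundaries.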

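4. Non-blocking network IO (NIO)

Non-blocking network programming usually depends on low-level support from the operating system, and explanations of it generally come with a diagram.
[Figure not preserved in this copy]
Every system I/O call has two phases: waiting for readiness and the actual operation. A read, for example, consists of waiting until the system is readable and then actually reading; likewise, a write consists of waiting until the network card is writable and then actually writing.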

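Note that blocking while waiting for readiness uses no CPU; the thread is simply idle. Blocking during the actual read or write does use CPU and does real work, but this phase is very fast: it is a memory copy, with bandwidth typically above the 1 GB/s level, so it can be regarded as taking almost no time.
Take socket.read() as an example:
In traditional BIO, if the TCP receive buffer holds no data, socket.read() blocks until data arrives and then returns it.
In NIO (with the channel in non-blocking mode), if the TCP receive buffer holds data, the data is copied into user memory and returned; otherwise the call returns 0 immediately and never blocks.
The newer AIO (Async I/O) goes one step further: not only is waiting for readiness non-blocking, the transfer of the data into memory is asynchronous as well.
In other words, in BIO the user cares most about "I want to read", in NIO about "I can read now", and in AIO about "the read has completed".
An important property of NIO: the main socket functions for reading, writing, registering, and accepting are non-blocking during the wait-for-readiness phase, while the actual I/O operation is synchronous and blocking (it consumes CPU but performs very well).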
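The "returns 0 instead of blocking" behavior can be checked on the loopback interface. The sketch below is my own illustration (NonBlockingReadDemo and demo are hypothetical names): it reads from a non-blocking SocketChannel before the peer has sent anything, then again after the peer has written two bytes. The polling loop with a short park is only there to make the demo deterministic; a real server would wait on a Selector instead of polling.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.locks.LockSupport;

class NonBlockingReadDemo {
    // Returns {result of read() before any data was sent, result of read() after data arrived}.
    static int[] demo() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0)); // any free port
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                ByteBuffer buf = ByteBuffer.allocate(16);

                int first = accepted.read(buf);   // nothing sent yet: returns 0 right away

                client.write(ByteBuffer.wrap("hi".getBytes(StandardCharsets.US_ASCII)));

                // Poll until the bytes show up in the receive buffer (give up after about 5 s).
                long deadline = System.nanoTime() + 5_000_000_000L;
                int second;
                while ((second = accepted.read(buf)) == 0 && System.nanoTime() < deadline) {
                    LockSupport.parkNanos(1_000_000);
                }
                return new int[] {first, second};
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int[] r = demo();
        System.out.println("read before data: " + r[0]);
        System.out.println("read after data:  " + r[1]);
    }
}
```

The first value is 0 and the second is a positive byte count; with a blocking channel the first read() would instead have waited for the client to write.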

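Java NIO is in fact built on the operating system's I/O multiplexing and is event-based: a polling loop handles whatever events are ready and otherwise returns immediately, without having to wait idly (block), so the thread can go on to handle other events. In essence it resembles a circuit-breaker switch.
[Figure not preserved in this copy]
Code may make this easier to understand: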

    public static void main(String[] args) {
        try {
            ServerSocketChannel serverSocketChannel = ServerSocketChannel.open();
            serverSocketChannel.configureBlocking(false);
            serverSocketChannel.bind(new InetSocketAddress(8899));
            // obtain a selector
            Selector selector = Selector.open();
            // register interest in accept events
            serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT);

            while(selector.select() > 0){//the main thread polls for ready events
                 Iterator<SelectionKey> iterator = selector.selectedKeys().iterator();
                 while(iterator.hasNext()){
                     SelectionKey selectionKey = iterator.next();
                     iterator.remove();//remove each key as it is handled, otherwise it stays in the selected set
                     if(selectionKey.isAcceptable()){
                         SocketChannel socketChannel = serverSocketChannel.accept();
                         socketChannel.configureBlocking(false);
                         socketChannel.register(selector, SelectionKey.OP_READ);//once connected, register interest in read events
                         System.out.println("Client connected: " + socketChannel.getRemoteAddress().toString());
                     }else if(selectionKey.isReadable()){
                         SocketChannel socketChannel = (SocketChannel) selectionKey.channel();
                         StringBuilder sb = new StringBuilder();
                         ByteBuffer buffer = ByteBuffer.allocate(1024);
                         Charset charset = Charset.forName("utf-8");
                         CharsetDecoder decoder = charset.newDecoder();
                         int n;
                         while((n = socketChannel.read(buffer))>0){//does not block; returns 0 right away when there is no data
                             buffer.flip();
                             CharBuffer charBuffer = decoder.decode(buffer);
                             String str = charBuffer.toString();
                             sb.append(str);
                             buffer.clear();
                         }
                         System.out.println(sb.toString());
                         if(n == -1){//the peer closed the connection; close the channel so it is not selected again
                             socketChannel.close();
                         }
                     }
                 }
             }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

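Summary

I had long been vague about blocking versus non-blocking in IO and NIO; working through some of the source code and other articles gave me a much clearer understanding of how they relate. Next I will continue with the NIO blocking model, sendFile, memory mapping, zero-copy, and related topics, and finally share my notes from learning Netty. This knowledge is the foundation of high-performance services and queues, so understanding it thoroughly is well worth the effort.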

References

  1. http://imgfeve.co-/java-novs-io
  2. https://stackoverflow.com/questions/611760/java-inputstream-blocking-read
  3. https://www.zhihu.com/question/337609338/answer/769836232
  4. https://tech.meituan.com/2016/11/04/nio.html

Reposted from blog.csdn.net/xiaolong7713/article/details/105443259