Netty Source Code Reading and Analysis: HashedWheelTimer

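Netty is a low-level network communication framework that may manage a very large number of connections, potentially millions. Nearly every connection has timeout tasks of some kind, such as write timeouts and heartbeat checks. Starting a separate Timer for each connection would be both inefficient and wasteful of resources. Netty instead manages this large volume of timed tasks with the timing wheel proposed in the paper "Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility". The implementation lives in the HashedWheelTimer class, and it works as follows: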

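A timing wheel is essentially a circular data structure. Think of it as a clock face divided into many slots, each representing a fixed span of time. Each slot holds a linked list of the timeout tasks that fall due in it. A pointer advances one slot at a time, and when it reaches a slot, the tasks stored there are executed. Tasks are placed into slots according to a simple rule, as shown in the figure below: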

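Taking the figure above as an example, suppose each slot represents 1s, so the wheel covers a span of 8s. If the pointer currently points at slot 3 and a task must run 3s from now, the task goes into slot 3+3=6. If a task must run 6s from now, it goes into slot (3+6)%8=1. Next, let's see how Netty's HashedWheelTimer class implements this algorithm. The constructor is as follows: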
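The slot arithmetic above can be sketched in a few lines. This is only an illustration of the rule, not Netty's code: the class name WheelSlotDemo and the method slotFor are hypothetical, and the sketch deliberately ignores delays longer than one full turn of the wheel.

```java
// Minimal sketch of the slot rule; WheelSlotDemo and slotFor are hypothetical names.
final class WheelSlotDemo {
    // Returns the slot a task lands in, given the pointer position,
    // the delay measured in ticks, and the number of slots on the wheel.
    static int slotFor(int currentSlot, int delayTicks, int wheelSize) {
        return (currentSlot + delayTicks) % wheelSize;
    }

    public static void main(String[] args) {
        System.out.println(slotFor(3, 3, 8)); // 3s delay from slot 3: prints 6
        System.out.println(slotFor(3, 6, 8)); // 6s delay wraps around: prints 1
    }
}
```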

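public HashedWheelTimer(
            ThreadFactory threadFactory, // used to create the worker thread
            long tickDuration, // duration of one slot, i.e. how often the pointer advances
            TimeUnit unit, // time unit of tickDuration
            int ticksPerWheel, // number of slots in one revolution
            boolean leakDetection, // whether to enable memory leak detection
            long maxPendingTimeouts // upper bound on pending timeouts; only enforced when greater than 0
        ) {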

        if (threadFactory == null) {
            throw new NullPointerException("threadFactory");
        }
        if (unit == null) {
            throw new NullPointerException("unit");
        }
        if (tickDuration <= 0) {
            throw new IllegalArgumentException("tickDuration must be greater than 0: " + tickDuration);
        }
        if (ticksPerWheel <= 0) {
            throw new IllegalArgumentException("ticksPerWheel must be greater than 0: " + ticksPerWheel);
        }

        // Normalize ticksPerWheel to power of two and initialize the wheel.
        wheel = createWheel(ticksPerWheel); // create the wheel; the slot count is a power of two
        mask = wheel.length - 1; // because the slot count is a power of two, "& mask" can replace the slower % operation

        // Convert tickDuration to nanos.
        this.tickDuration = unit.toNanos(tickDuration); // convert to nanoseconds

        // Prevent overflow.
        if (this.tickDuration >= Long.MAX_VALUE / wheel.length) { // overflow check: the tick interval must not be so long that tickDuration * wheel.length > Long.MAX_VALUE
            throw new IllegalArgumentException(String.format(
                    "tickDuration: %d (expected: 0 < tickDuration in nanos < %d",
                    tickDuration, Long.MAX_VALUE / wheel.length));
        }
        workerThread = threadFactory.newThread(worker);

        leak = leakDetection || !workerThread.isDaemon() ? leakDetector.track(this) : null;

        this.maxPendingTimeouts = maxPendingTimeouts;

        if (INSTANCE_COUNTER.incrementAndGet() > INSTANCE_COUNT_LIMIT &&
            WARNED_TOO_MANY_INSTANCES.compareAndSet(false, true)) {
            reportTooManyInstances();
        }
    }
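To see why the mask can stand in for the remainder operation, here is a small sketch. It assumes a non-negative tick counter and a wheel length that is a power of two; the class and method names are made up for the illustration.

```java
// Sketch: for a power-of-two length n and a non-negative tick, tick & (n - 1) == tick % n.
// MaskDemo, indexByMask and indexByModulo are hypothetical names.
final class MaskDemo {
    static int indexByMask(long tick, int wheelLength) {
        int mask = wheelLength - 1; // e.g. 8 -> 0b111
        return (int) (tick & mask);
    }

    static int indexByModulo(long tick, int wheelLength) {
        return (int) (tick % wheelLength);
    }

    public static void main(String[] args) {
        // Compare both ways of computing the slot index over a range of ticks.
        for (long tick = 0; tick < 1_000; tick++) {
            if (indexByMask(tick, 8) != indexByModulo(tick, 8)) {
                throw new AssertionError("mismatch at tick " + tick);
            }
        }
        System.out.println(indexByMask(9, 8)); // prints 1
    }
}
```

The equivalence breaks for lengths that are not powers of two, which is one reason the constructor normalizes ticksPerWheel first.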

Next, let's look at the method that creates the wheel:

private static HashedWheelBucket[] createWheel(int ticksPerWheel) {
        if (ticksPerWheel <= 0) {
            throw new IllegalArgumentException(
                    "ticksPerWheel must be greater than 0: " + ticksPerWheel);
        }
        if (ticksPerWheel > 1073741824) {
            throw new IllegalArgumentException(
                    "ticksPerWheel may not be greater than 2^30: " + ticksPerWheel);
        }

        ticksPerWheel = normalizeTicksPerWheel(ticksPerWheel); // ensure the slot count is a power of two
        HashedWheelBucket[] wheel = new HashedWheelBucket[ticksPerWheel];
        for (int i = 0; i < wheel.length; i ++) {
            wheel[i] = new HashedWheelBucket();
        }
        return wheel;
    }
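The body of normalizeTicksPerWheel is not shown here. One straightforward way to round a value up to a power of two is sketched below; treat it as an assumption about the approach rather than the actual Netty implementation. The name nextPowerOfTwo is hypothetical, and the sketch relies on the same 2^30 upper bound that createWheel checks so that the shift cannot overflow.

```java
// Sketch of rounding up to a power of two; PowerOfTwoSketch and nextPowerOfTwo are hypothetical names.
final class PowerOfTwoSketch {
    // Assumes 1 <= value <= 2^30, matching the range checks in createWheel.
    static int nextPowerOfTwo(int value) {
        int normalized = 1;
        while (normalized < value) {
            normalized <<= 1; // double until we reach or pass the requested size
        }
        return normalized;
    }

    public static void main(String[] args) {
        System.out.println(nextPowerOfTwo(5));   // prints 8
        System.out.println(nextPowerOfTwo(512)); // prints 512
    }
}
```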

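As we can see, a timing wheel consists of an array (more precisely, an array used as a ring, matching the circular data structure), and each element, a HashedWheelBucket, is a linked list: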

private static final class HashedWheelBucket {
        // Used for the linked-list datastructure
        private HashedWheelTimeout head;
        private HashedWheelTimeout tail;
	.....
}
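The listing only shows the head and tail fields. As a rough illustration of how such a bucket can behave, the sketch below keeps a doubly linked list with append and unlink operations. The Node type, its prev/next fields, and the method names are assumptions made for this example and are not taken from the listing above.

```java
// Sketch of a head/tail linked-list bucket; all names here are hypothetical.
final class BucketSketch {
    static final class Node {
        final String name;
        Node prev;
        Node next;
        Node(String name) { this.name = name; }
    }

    private Node head;
    private Node tail;

    // Appends a node at the tail in O(1).
    void add(Node node) {
        if (head == null) {
            head = tail = node;
        } else {
            tail.next = node;
            node.prev = tail;
            tail = node;
        }
    }

    // Unlinks a node that is currently in this bucket in O(1), e.g. on cancellation or expiry.
    void remove(Node node) {
        if (node.prev != null) { node.prev.next = node.next; } else { head = node.next; }
        if (node.next != null) { node.next.prev = node.prev; } else { tail = node.prev; }
        node.prev = null;
        node.next = null;
    }

    // Returns the node names from head to tail, comma separated.
    String describe() {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) { sb.append(','); }
            sb.append(n.name);
        }
        return sb.toString();
    }
}
```

Keeping both ends of the list makes appending cheap, and back links let a single task be removed without walking the whole slot.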

Next, let's look at starting the timer, stopping it, and adding tasks. First, starting:

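public void start() {
        // Starts the timer. newTimeout() calls this method automatically when a task is added,
        // because running the wheel while it holds no tasks would only waste resources.

        // Check the current state: if it is INIT, start the worker thread and with it the whole wheel;
        // if it is already STARTED, do nothing; if it is SHUTDOWN, throw. Several threads may race to
        // start the timer, so a lock-free CAS is used here.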
	switch (WORKER_STATE_UPDATER.get(this)) {
            case WORKER_STATE_INIT:
                if (WORKER_STATE_UPDATER.compareAndSet(this, WORKER_STATE_INIT, WORKER_STATE_STARTED)) {
                    workerThread.start();
                }
                break;
            case WORKER_STATE_STARTED:
                break;
            case WORKER_STATE_SHUTDOWN:
                throw new IllegalStateException("cannot be started once stopped");
            default:
                throw new Error("Invalid WorkerState");
        }

        // Wait until the startTime is initialized by the worker.
        while (startTime == 0) {
            try {
                startTimeInitialized.await(); // wait for the worker thread to start
            } catch (InterruptedException ignore) {
                // Ignore - it will be ready very soon.
            }
        }
    }
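The start-once behaviour can be tried out in isolation. The sketch below mirrors the state switch with an AtomicIntegerFieldUpdater and lets several threads race to call start(). Everything in it, including the launches counter that stands in for workerThread.start(), is a hypothetical illustration rather than Netty code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Sketch: many threads call start(), but the CAS lets only one of them launch the worker.
final class StartOnceSketch {
    static final int INIT = 0, STARTED = 1, SHUTDOWN = 2;

    private static final AtomicIntegerFieldUpdater<StartOnceSketch> STATE =
            AtomicIntegerFieldUpdater.newUpdater(StartOnceSketch.class, "state");

    private volatile int state = INIT;
    final AtomicInteger launches = new AtomicInteger(); // stands in for workerThread.start()

    void start() {
        switch (STATE.get(this)) {
            case INIT:
                if (STATE.compareAndSet(this, INIT, STARTED)) {
                    launches.incrementAndGet();
                }
                break;
            case STARTED:
                break;
            case SHUTDOWN:
                throw new IllegalStateException("cannot be started once stopped");
            default:
                throw new Error("Invalid state");
        }
    }

    // Releases threadCount threads at once and returns how many launches happened.
    static int race(int threadCount) {
        StartOnceSketch timer = new StartOnceSketch();
        CountDownLatch go = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    go.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                timer.start();
            });
            threads[i].start();
        }
        go.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return timer.launches.get();
    }

    public static void main(String[] args) {
        System.out.println(race(8)); // prints 1
    }
}
```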

Next, stop:

public Set<Timeout> stop() {
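        // The worker thread must not stop the timer, which means a TimerTask (it runs on the worker thread)
        // cannot call this method. This keeps a misbehaving task from shutting down the timer for all other tasks.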
        if (Thread.currentThread() == workerThread) {
            throw new IllegalStateException(
                    HashedWheelTimer.class.getSimpleName() +
                            ".stop() cannot be called from " +
                            TimerTask.class.getSimpleName());
        }
        // Try to CAS the state from STARTED (1) to SHUTDOWN (2). If that fails, the state can only be
        // INIT (0) or SHUTDOWN (2), so simply set it to SHUTDOWN (2).
        if (!WORKER_STATE_UPDATER.compareAndSet(this, WORKER_STATE_STARTED, WORKER_STATE_SHUTDOWN)) {
            // workerState can be 0 or 2 at this moment - let it always be 2.
            if (WORKER_STATE_UPDATER.getAndSet(this, WORKER_STATE_SHUTDOWN) != WORKER_STATE_SHUTDOWN) {
                INSTANCE_COUNTER.decrementAndGet();
                if (leak != null) {
                    boolean closed = leak.close(this);
                    assert closed;
                }
            }

            return Collections.emptySet();
        }
	
        try {
            boolean interrupted = false;
            while (workerThread.isAlive()) {
                workerThread.interrupt(); // interrupt the worker thread
                try {
                    workerThread.join(100);
                } catch (InterruptedException ignored) {
                    interrupted = true;
                }
            }

            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        } finally {
            INSTANCE_COUNTER.decrementAndGet();
            if (leak != null) {
                boolean closed = leak.close(this);
                assert closed;
            }
        }
        return worker.unprocessedTimeouts(); // return the tasks that were never processed
    }
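The interrupt-and-join loop in stop() is a pattern that also works on its own. The sketch below isolates it with a stand-in worker that sleeps until interrupted. The names sleepingWorker and interruptAndJoin are hypothetical, and the worker body is an assumption for the demo.

```java
// Sketch of the interrupt-and-join loop; StopSketch and its methods are hypothetical names.
final class StopSketch {
    // Stand-in worker: sleeps in a loop and exits once it is interrupted.
    static Thread sleepingWorker() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(1_000);
                }
            } catch (InterruptedException e) {
                // Interrupted: leave the loop and let the thread finish.
            }
        });
        t.setDaemon(true);
        return t;
    }

    // Keeps interrupting the worker until it has died. If the caller itself is interrupted
    // while waiting, that fact is remembered and restored afterwards instead of being lost.
    // Returns whether the caller was interrupted during the wait.
    static boolean interruptAndJoin(Thread worker) {
        boolean interrupted = false;
        while (worker.isAlive()) {
            worker.interrupt();
            try {
                worker.join(100);
            } catch (InterruptedException e) {
                interrupted = true;
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
        return interrupted;
    }

    public static void main(String[] args) {
        Thread worker = sleepingWorker();
        worker.start();
        interruptAndJoin(worker);
        System.out.println(worker.isAlive()); // prints false
    }
}
```

Interrupting repeatedly, rather than once, covers a worker that swallows an interrupt while busy and only checks again on a later iteration.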

Next, the method that adds a task:

public Timeout newTimeout(TimerTask task, long delay, TimeUnit unit) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        if (unit == null) {
            throw new NullPointerException("unit");
        }

        long pendingTimeoutsCount = pendingTimeouts.incrementAndGet();

        if (maxPendingTimeouts > 0 && pendingTimeoutsCount > maxPendingTimeouts) {
            pendingTimeouts.decrementAndGet();
            throw new RejectedExecutionException("Number of pending timeouts ("
                + pendingTimeoutsCount + ") is greater than or equal to maximum allowed pending "
                + "timeouts (" + maxPendingTimeouts + ")");
        }

        start();

        // Add the timeout to the timeout queue which will be processed on the next tick.
        // During processing all the queued HashedWheelTimeouts will be added to the correct HashedWheelBucket.
        long deadline = System.nanoTime() + unit.toNanos(delay) - startTime;

        // Guard against overflow.
        if (delay > 0 && deadline < 0) {
            deadline = Long.MAX_VALUE;
        }
        HashedWheelTimeout timeout = new HashedWheelTimeout(this, task, deadline);
        timeouts.add(timeout);
        return timeout;
    }

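In essence, the task is wrapped in a HashedWheelTimeout object and placed on the timeouts queue. As the comment in the code notes, the queued timeouts are moved into the correct HashedWheelBucket on the next tick.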
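The deadline arithmetic in newTimeout can be checked with plain numbers. The sketch below reproduces the relative deadline and its overflow guard, then adds one plausible way to turn a deadline into a slot index. That second step is an assumption on my part, since the worker code is not covered here, and all names in the sketch are hypothetical.

```java
// Sketch of the deadline computation; DeadlineSketch and its methods are hypothetical names.
final class DeadlineSketch {
    // Deadline in nanoseconds relative to the timer's start time, clamped on overflow.
    static long deadline(long nowNanos, long delayNanos, long startTimeNanos) {
        long deadline = nowNanos + delayNanos - startTimeNanos;
        if (delayNanos > 0 && deadline < 0) {
            deadline = Long.MAX_VALUE; // the sum wrapped around, so treat it as "as late as possible"
        }
        return deadline;
    }

    // Assumption: the slot is the deadline expressed in ticks, masked to the wheel length
    // (wheelLength must be a power of two and deadline non-negative).
    static int slotFor(long deadline, long tickDurationNanos, int wheelLength) {
        long ticks = deadline / tickDurationNanos;
        return (int) (ticks & (wheelLength - 1));
    }

    public static void main(String[] args) {
        long d = deadline(1_000, 500, 200);
        System.out.println(d);                  // prints 1300
        System.out.println(slotFor(d, 100, 8)); // 13 ticks on an 8-slot wheel: prints 5
    }
}
```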

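Having read the Netty source, and the Disruptor source before it, we can see that lock-free design and the java.util.concurrent (JUC) package are used very widely in frameworks.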
