Source Code Analysis of PoolArena, Netty's Memory Allocation Entry Point

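PoolArena is the main entry point for memory allocation in Netty. Borrowing the Arena design from jemalloc, Netty allocates memory through a fixed number of Arenas; the default count is usually the number of CPU cores * 2. The first time a thread requests memory, it picks one PoolArena from the PoolArena array (round-robin in early versions; current 4.1 releases choose the arena with the fewest threads bound to it) and deals only with that PoolArena for the rest of its lifetime. Each thread therefore stores a reference to its PoolArena, which improves access efficiency.

This article takes a close look at the source code and core principles of PoolArena.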


Structure of PoolArena

PoolArena is an abstract class with two subclasses, DirectArena and HeapArena. Its class diagram is shown below:

[Figure: class diagram of PoolArena and its subclasses DirectArena and HeapArena]

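PoolArena extends SizeClasses and implements the PoolArenaMetric interface. PoolArenaMetric exposes methods for reading PoolArena metrics, which give a clearer picture of how the memory pool is being used and help with tuning the application.

The important fields of PoolArena are: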

abstract class PoolArena<T> extends SizeClasses implements PoolArenaMetric {
    enum SizeClass {
        Small,
        Normal
    }
  
    private final PoolSubpage<T>[] smallSubpagePools;

    private final PoolChunkList<T> q050;
    private final PoolChunkList<T> q025;
    private final PoolChunkList<T> q000;
    private final PoolChunkList<T> qInit;
    private final PoolChunkList<T> q075;
    private final PoolChunkList<T> q100;
    
    // ....    
}

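As this shows, PoolArena has only two size classes, Small and Normal, and correspondingly two ways of allocating memory:

smallSubpagePools, an array of PoolSubpage: serves Small requests, up to 28 KB.
A doubly linked list made up of 6 PoolChunkList nodes: serves Normal requests, above 28 KB and up to 4 MB (the chunk size assumed in this article).

The structure is as follows: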

[Figure: structure of PoolArena, with the smallSubpagePools array and the PoolChunkList linked list]

The PoolArena constructor

The constructor is as follows:

    protected PoolArena(PooledByteBufAllocator parent, int pageSize,
          int pageShifts, int chunkSize, int cacheAlignment) {
        super(pageSize, pageShifts, chunkSize, cacheAlignment);
        
        // the allocator this arena belongs to
        this.parent = parent;
        directMemoryCacheAlignment = cacheAlignment;
         
        // nSubpages is 39 with the default 8 KB page size
        numSmallSubpagePools = nSubpages; 
        smallSubpagePools = newSubpagePoolArray(numSmallSubpagePools);
        for (int i = 0; i < smallSubpagePools.length; i ++) {
            // initialize the head node of each subpage list
            smallSubpagePools[i] = newSubpagePoolHead();
        }

        q100 = new PoolChunkList<T>(this, null, 100, Integer.MAX_VALUE, chunkSize);
        q075 = new PoolChunkList<T>(this, q100, 75, 100, chunkSize);
        q050 = new PoolChunkList<T>(this, q075, 50, 100, chunkSize);
        q025 = new PoolChunkList<T>(this, q050, 25, 75, chunkSize);
        q000 = new PoolChunkList<T>(this, q025, 1, 50, chunkSize);
        qInit = new PoolChunkList<T>(this, q000, Integer.MIN_VALUE, 25, chunkSize);

        q100.prevList(q075);
        q075.prevList(q050);
        q050.prevList(q025);
        q025.prevList(q000);
        q000.prevList(null);
        qInit.prevList(qInit);

        List<PoolChunkListMetric> metrics = new ArrayList<PoolChunkListMetric>(6);
        metrics.add(qInit);
        metrics.add(q000);
        metrics.add(q025);
        metrics.add(q050);
        metrics.add(q075);
        metrics.add(q100);
        chunkListMetrics = Collections.unmodifiableList(metrics);
    }

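The constructor mainly initializes the smallSubpagePools array and the PoolChunkList doubly linked list. The PoolChunkList list deserves a closer look. As the constructor shows, it consists of 6 nodes, each holding Chunks within a particular range of memory usage:

qInit: Chunks with memory usage of 0% ~ 25%.
q000: Chunks with memory usage of 1% ~ 50%.
q025: Chunks with memory usage of 25% ~ 75%.
q050: Chunks with memory usage of 50% ~ 100%.
q075: Chunks with memory usage of 75% ~ 100%.
q100: Chunks with memory usage of 100%.

The resulting doubly linked list looks like this: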

[Figure: the PoolChunkList doubly linked list, from qInit to q100]

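This structure raises two questions:

What is the difference between qInit and q000? Why are two such similar nodes not merged into one?
The usage ranges of adjacent nodes overlap heavily. Why is it designed this way?

Question 1: What is the difference between qInit and q000? Why are two such similar nodes not merged into one?

Look closely at the PoolChunkList list and you will see that it is not a complete doubly linked list. It differs from one in two ways:

The predecessor of qInit is qInit itself. This means a PoolChunk in qInit is not reclaimed even when its usage drops to 0%.
q000 has no predecessor. As a result, when the usage of a PoolChunk in q000 falls below 1%, the chunk does not move back to qInit; instead it is reclaimed once it has been fully released.

So a PoolChunk whose usage keeps fluctuating between 0% and 25% can stay in qInit indefinitely, which avoids repeated initialization. The main purpose of qInit is thus to avoid frequently creating and destroying a PoolChunk whose usage changes little, which improves allocation efficiency. q000, by contrast, handles PoolChunks whose usage varies widely: once such a chunk is fully released it is reclaimed, so it does not stay resident in memory forever.

Used together, qInit and q000 make Netty's memory allocation and reclamation more efficient.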
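The difference between the two predecessor links can be illustrated with a small model. This is only a sketch: Node, buildLists() and onUsageDrop() are hypothetical names, usage is simplified to a whole percentage, and a usage below 1% is treated as "fully released". It is not Netty's PoolChunkList code.

```java
public class ChunkListLinkSketch {
    /** Hypothetical, simplified stand-in for one PoolChunkList node. */
    static final class Node {
        final String name;
        final int minUsage;   // lower bound of the usage range, in percent
        Node prev;            // predecessor: qInit points to itself, q000 has none

        Node(String name, int minUsage) {
            this.name = name;
            this.minUsage = minUsage;
        }

        /**
         * Decides where a chunk held by this node ends up after its usage
         * has dropped to usagePercent. Returns the name of the node that
         * holds the chunk afterwards, or "destroyed" if it is reclaimed.
         */
        String onUsageDrop(int usagePercent) {
            if (usagePercent >= minUsage) {
                return name;            // still inside this node's range
            }
            if (prev == null) {
                return "destroyed";     // no predecessor: the chunk is reclaimed
            }
            return prev.name;           // otherwise it moves to the predecessor
        }
    }

    /** Builds qInit, q000 and q025 with the same links as the constructor above. */
    static Node[] buildLists() {
        Node qInit = new Node("qInit", Integer.MIN_VALUE);
        Node q000 = new Node("q000", 1);
        Node q025 = new Node("q025", 25);
        qInit.prev = qInit;   // self-reference
        q000.prev = null;     // end of the line
        q025.prev = q000;
        return new Node[] {qInit, q000, q025};
    }

    public static void main(String[] args) {
        Node[] lists = buildLists();
        System.out.println("qInit chunk at 0%  -> " + lists[0].onUsageDrop(0));
        System.out.println("q000 chunk at 0%   -> " + lists[1].onUsageDrop(0));
        System.out.println("q025 chunk at 10%  -> " + lists[2].onUsageDrop(10));
    }
}
```

In this model a chunk in qInit can never leave downwards, while a chunk that has once been promoted to q000 or beyond can only end its life by being destroyed.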

Question 2: The usage ranges of adjacent nodes overlap heavily. Why is it designed this way?

Consider the figure below:

[Figure: overlapping memory usage ranges of the PoolChunkList nodes]

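As the figure shows, roughly half of each node's range overlaps with its neighbor. Why? Suppose q025 covered [25%, 50%) and q050 covered [50%, 75%). A PoolChunk whose usage went 40%, 55%, 45%, 60%, 48%, 66% would keep moving back and forth between q025 and q050, which inevitably costs performance. With ranges of [25%, 75%) and [50%, 100%), the same sequence stays entirely in q025: the chunk moves to q050 only when its usage exceeds 75%, and as its usage falls it does not return to q025 at 75% but only once it drops below 50%. That leaves a much wider band in which no move is needed.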
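The effect of the overlap can be checked with a short simulation of the usage sequence above. This is a minimal sketch under simplifying assumptions: each list is reduced to a [min, max) pair, a chunk moves up when its usage reaches max and down when it falls below min, and countMoves() is a hypothetical helper, not part of Netty.

```java
public class ChunkListHysteresis {
    /**
     * Counts how often a chunk changes list while its usage follows the
     * given sequence. ranges[i] = {minUsage, maxUsage} of list i, ordered
     * from the least used list to the most used one.
     */
    static int countMoves(int[][] ranges, int startList, int[] usages) {
        int idx = startList;
        int moves = 0;
        for (int usage : usages) {
            // usage reached the upper bound: move to the next list
            while (idx < ranges.length - 1 && usage >= ranges[idx][1]) {
                idx++;
                moves++;
            }
            // usage fell below the lower bound: move to the previous list
            while (idx > 0 && usage < ranges[idx][0]) {
                idx--;
                moves++;
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        int[] usages = {40, 55, 45, 60, 48, 66};

        int[][] disjoint = {{25, 50}, {50, 75}};       // hypothetical non-overlapping design
        int[][] overlapping = {{25, 75}, {50, 100}};   // ranges of q025 and q050

        System.out.println("disjoint ranges:    " + countMoves(disjoint, 0, usages) + " moves");
        System.out.println("overlapping ranges: " + countMoves(overlapping, 0, usages) + " moves");
    }
}
```

With disjoint ranges the chunk changes list on every step after the first (5 moves); with the overlapping ranges it never leaves q025.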

Memory allocation

PoolArena provides allocate() for memory allocation. The method picks an allocation path according to the size class of the requested memory:

    PooledByteBuf<T> allocate(PoolThreadCache cache, int reqCapacity, int maxCapacity) {
        PooledByteBuf<T> buf = newByteBuf(maxCapacity);
        allocate(cache, buf, reqCapacity);
        return buf;
    }
    
    private void allocate(PoolThreadCache cache, PooledByteBuf<T> buf, final int reqCapacity) {
        // compute sizeIdx from the requested size
        final int sizeIdx = size2SizeIdx(reqCapacity);

        if (sizeIdx <= smallMaxSizeIdx) {
            // Small size class: allocate from a PoolSubpage
            tcacheAllocateSmall(cache, buf, reqCapacity, sizeIdx);
        } else if (sizeIdx < nSizes) {
            // Normal size class: allocate from a PoolChunk
            tcacheAllocateNormal(cache, buf, reqCapacity, sizeIdx);
        } else {
            // Huge size class: allocate directly, without pooling
            int normCapacity = directMemoryCacheAlignment > 0
                    ? normalizeSize(reqCapacity) : reqCapacity;
            allocateHuge(buf, normCapacity);
        }
    }

First, sizeIdx is computed from the requested size reqCapacity. The computation lives in SizeClasses:

    public int size2SizeIdx(int size) {
        if (size == 0) {
            return 0;
        }
        if (size > chunkSize) {
            return nSizes;
        }

        size = alignSizeIfNeeded(size, directMemoryCacheAlignment);
        
        // sizes up to lookupMaxSize are read directly from the size2idxTab lookup table
        if (size <= lookupMaxSize) {
            return size2idxTab[size - 1 >> LOG2_QUANTUM];
        }
        
        // the rest inverts the formula that generates the size classes; the derivation is omitted here
        int x = log2((size << 1) - 1);
        int shift = x < LOG2_SIZE_CLASS_GROUP + LOG2_QUANTUM + 1
                ? 0 : x - (LOG2_SIZE_CLASS_GROUP + LOG2_QUANTUM);

        int group = shift << LOG2_SIZE_CLASS_GROUP;

        int log2Delta = x < LOG2_SIZE_CLASS_GROUP + LOG2_QUANTUM + 1
                ? LOG2_QUANTUM : x - LOG2_SIZE_CLASS_GROUP - 1;

        int deltaInverseMask = -1 << log2Delta;
        int mod = (size - 1 & deltaInverseMask) >> log2Delta &
                  (1 << LOG2_SIZE_CLASS_GROUP) - 1;

        return group + mod;
    }

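Once sizeIdx is known, we can tell which allocation path applies. With the 4 MB chunk size assumed above, nSizes is 68, so the ranges are:

Small: [0, 38]
Normal: [39, 67]
Huge: 68 (size2SizeIdx returns nSizes for any size larger than chunkSize)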
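To make the mapping concrete, here is a standalone sketch of the computation. It assumes 8 KB pages and 4 MB chunks, hard-codes smallMaxSizeIdx = 38 and nSizes = 68 for that configuration, and leaves out the alignment step and the size2idxTab lookup table (the formula gives the same index for small sizes). The class SizeIdxSketch and the method sizeClassOf() are illustrative names, not Netty API.

```java
public class SizeIdxSketch {
    static final int LOG2_QUANTUM = 4;           // smallest size class is 16 bytes
    static final int LOG2_SIZE_CLASS_GROUP = 2;  // 4 size classes per group

    // Assumed configuration: 8 KB pages, 4 MB chunks
    static final int CHUNK_SIZE = 4 * 1024 * 1024;
    static final int SMALL_MAX_SIZE_IDX = 38;    // index of the 28 KB size class
    static final int N_SIZES = 68;               // number of size classes up to 4 MB

    /** Floor of log2(val) for val > 0. */
    static int log2(int val) {
        return 31 - Integer.numberOfLeadingZeros(val);
    }

    /** Same arithmetic as size2SizeIdx above, minus alignment and the lookup table. */
    static int size2SizeIdx(int size) {
        if (size == 0) {
            return 0;
        }
        if (size > CHUNK_SIZE) {
            return N_SIZES;
        }
        int x = log2((size << 1) - 1);
        int shift = x < LOG2_SIZE_CLASS_GROUP + LOG2_QUANTUM + 1
                ? 0 : x - (LOG2_SIZE_CLASS_GROUP + LOG2_QUANTUM);
        int group = shift << LOG2_SIZE_CLASS_GROUP;
        int log2Delta = x < LOG2_SIZE_CLASS_GROUP + LOG2_QUANTUM + 1
                ? LOG2_QUANTUM : x - LOG2_SIZE_CLASS_GROUP - 1;
        int deltaInverseMask = -1 << log2Delta;
        int mod = (size - 1 & deltaInverseMask) >> log2Delta
                & (1 << LOG2_SIZE_CLASS_GROUP) - 1;
        return group + mod;
    }

    /** Mirrors the three branches of allocate(). */
    static String sizeClassOf(int size) {
        int sizeIdx = size2SizeIdx(size);
        if (sizeIdx <= SMALL_MAX_SIZE_IDX) {
            return "Small";
        }
        return sizeIdx < N_SIZES ? "Normal" : "Huge";
    }

    public static void main(String[] args) {
        int[] sizes = {16, 28 * 1024, 28 * 1024 + 1, 32 * 1024, CHUNK_SIZE, CHUNK_SIZE + 1};
        for (int size : sizes) {
            System.out.println(size + " bytes -> sizeIdx " + size2SizeIdx(size)
                    + " (" + sizeClassOf(size) + ")");
        }
    }
}
```

Under these assumptions 28 KB maps to index 38 (the last Small class), 32 KB to index 39 (the first Normal class), 4 MB to index 67, and anything larger to 68, which is the Huge path.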

tcacheAllocateSmall: Small size class

tcacheAllocateSmall() allocates memory of the Small size class:

    private void tcacheAllocateSmall(PoolThreadCache cache, PooledByteBuf<T> buf, final int reqCapacity,
                                     final int sizeIdx) {
        // try the thread-local cache first
        if (cache.allocateSmall(this, buf, reqCapacity, sizeIdx)) {
            // was able to allocate out of the cache so move on
            return;
        }

        // locate the PoolSubpage list for this size index
        final PoolSubpage<T> head = smallSubpagePools[sizeIdx];
        final boolean needsNormalAllocation;
        // lock the whole list
        synchronized (head) {
            final PoolSubpage<T> s = head.next;
            needsNormalAllocation = s == head;
            // the list holds a subpage with free memory to allocate from
            if (!needsNormalAllocation) {
                assert s.doNotDestroy && s.elemSize == sizeIdx2size(sizeIdx) : "doNotDestroy=" +
                        s.doNotDestroy + ", elemSize=" + s.elemSize + ", sizeIdx=" + sizeIdx;
                long handle = s.allocate();
                assert handle >= 0;
                s.chunk.initBufWithSubpage(buf, null, handle, reqCapacity, cache);
            }
        }
        
        // needsNormalAllocation == true: the list has no usable PoolSubpage, so one must be allocated from a PoolChunk
        if (needsNormalAllocation) {
            synchronized (this) {
                allocateNormal(buf, reqCapacity, sizeIdx, cache);
            }
        }
        
        // increment the allocationsSmall counter
        incSmallAllocation();
    }

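First check the PoolThreadCache; if it can serve the request, allocate from it directly.
If the PoolThreadCache cannot serve the request, take a subpage from the smallSubpagePools array. Because of concurrency, synchronized (head) is used here for thread safety; locking head locks the whole list.
If head.next == head, the list has no free memory to allocate, and a PoolSubpage must be allocated from a PoolChunk.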
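The head.next == head test works because each entry of smallSubpagePools is a sentinel node of a circular doubly linked list. The sketch below shows that idea in isolation. Subpage, addAfterHead(), remove() and needsNormalAllocation() are hypothetical stand-ins written for this example; they are not Netty's PoolSubpage methods.

```java
public class SubpageListSketch {
    /** Hypothetical, simplified list node; a fresh node links to itself. */
    static final class Subpage {
        final String name;
        Subpage prev;
        Subpage next;

        Subpage(String name) {
            this.name = name;
            this.prev = this;
            this.next = this;
        }
    }

    /** Inserts s directly after the sentinel head. */
    static void addAfterHead(Subpage head, Subpage s) {
        synchronized (head) {
            s.prev = head;
            s.next = head.next;
            head.next.prev = s;
            head.next = s;
        }
    }

    /** Unlinks s, for example once it has no free elements left. */
    static void remove(Subpage head, Subpage s) {
        synchronized (head) {
            s.prev.next = s.next;
            s.next.prev = s.prev;
            s.prev = s;
            s.next = s;
        }
    }

    /** True when the list holds only the sentinel, so a new subpage is needed. */
    static boolean needsNormalAllocation(Subpage head) {
        synchronized (head) {
            return head.next == head;
        }
    }

    public static void main(String[] args) {
        Subpage head = new Subpage("head");
        System.out.println("empty list:   " + needsNormalAllocation(head));

        Subpage s = new Subpage("subpage-1");
        addAfterHead(head, s);
        System.out.println("after add:    " + needsNormalAllocation(head));

        remove(head, s);
        System.out.println("after remove: " + needsNormalAllocation(head));
    }
}
```

Because the sentinel is never removed, it doubles as the lock object for the whole list, which is what synchronized (head) relies on.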

tcacheAllocateNormal: Normal size class

tcacheAllocateNormal() allocates memory of the Normal size class.

    private void tcacheAllocateNormal(PoolThreadCache cache, PooledByteBuf<T> buf, final int reqCapacity,
                                      final int sizeIdx) {
        if (cache.allocateNormal(this, buf, reqCapacity, sizeIdx)) {
            // was able to allocate out of the cache so move on
            return;
        }
        // note that the whole PoolArena is locked here
        synchronized (this) {
            allocateNormal(buf, reqCapacity, sizeIdx, cache);
            ++allocationsNormal;
        }
    }

Normal-size memory has to be allocated from a PoolChunk, which is done by trying 5 of the PoolChunkLists in turn (all except q100). Since a PoolArena has only one PoolChunkList linked list, the whole PoolArena must be locked.

    private void allocateNormal(PooledByteBuf<T> buf, int reqCapacity, int sizeIdx, PoolThreadCache threadCache) {
        if (q050.allocate(buf, reqCapacity, sizeIdx, threadCache) ||
            q025.allocate(buf, reqCapacity, sizeIdx, threadCache) ||
            q000.allocate(buf, reqCapacity, sizeIdx, threadCache) ||
            qInit.allocate(buf, reqCapacity, sizeIdx, threadCache) ||
            q075.allocate(buf, reqCapacity, sizeIdx, threadCache)) {
            return;
        }

        // create a new PoolChunk
        PoolChunk<T> c = newChunk(pageSize, nPSizes, pageShifts, chunkSize);
        boolean success = c.allocate(buf, reqCapacity, sizeIdx, threadCache);
        assert success;
        // add it to qInit
        qInit.add(c);
    }

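This method shows that allocation does not walk the PoolChunkList list in order from qInit to q100. Instead it follows the order q050 -> q025 -> q000 -> qInit -> q075, which makes allocation relatively more efficient. A commonly given rationale is that starting with moderately used chunks (q050) offers a good chance of success while keeping those chunks well utilized, and leaves lightly used chunks in q000 and qInit free to drain and be reclaimed; q075 comes last because its chunks have little free space and are the most likely to fail.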

allocateHuge: Huge size class

allocateHuge() allocates memory of the Huge size class. Such memory is not pooled; it is allocated directly from the heap or from off-heap memory.

    private void allocateHuge(PooledByteBuf<T> buf, int reqCapacity) {
        PoolChunk<T> chunk = newUnpooledChunk(reqCapacity);
        activeBytesHuge.add(chunk.chunkSize());
        buf.initUnpooled(chunk, reqCapacity);
        allocationsHuge.increment();
    }

Memory release

PoolArena provides free() to release memory:

    void free(PoolChunk<T> chunk, ByteBuffer nioBuffer, long handle, int normCapacity, PoolThreadCache cache) {
        if (chunk.unpooled) { 
            // unpooled: release it directly
            int size = chunk.chunkSize();
            destroyChunk(chunk);
            activeBytesHuge.add(-size);
            deallocationsHuge.increment();
        } else {
            SizeClass sizeClass = sizeClass(handle);
            // try to put it into the thread-local cache
            if (cache != null && cache.add(this, chunk, nioBuffer, handle, normCapacity, sizeClass)) {
                // cached so not free it.
                return;
            }
            
            // release the memory
            freeChunk(chunk, handle, normCapacity, sizeClass, nioBuffer, false);
        }
    }

Unpooled memory such as Huge allocations is released by destroying the PoolChunk directly.
Pooled memory is first offered to the PoolThreadCache; if that fails, freeChunk() is called to release it:

    void freeChunk(PoolChunk<T> chunk, long handle, int normCapacity, SizeClass sizeClass, ByteBuffer nioBuffer,
                   boolean finalizer) {
        final boolean destroyChunk;
        // lock the arena
        synchronized (this) {
            // release the memory in the owning PoolChunkList and adjust its data structures
            destroyChunk = !chunk.parent.free(chunk, handle, normCapacity, nioBuffer);
        }
        if (destroyChunk) {
            destroyChunk(chunk);
        }
    }
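The decision order in free() (unpooled first, then the thread cache, then the arena) can be sketched with a bounded cache. Everything here is hypothetical: BoundedCache and the returned strings only label which branch was taken, and a full cache is just one assumed reason for add() to fail.

```java
import java.util.ArrayDeque;

public class FreePathSketch {
    /** Hypothetical stand-in for a per-thread cache with a fixed capacity. */
    static final class BoundedCache {
        private final int capacity;
        private final ArrayDeque<Long> handles = new ArrayDeque<>();

        BoundedCache(int capacity) {
            this.capacity = capacity;
        }

        /** Returns false when the cache is full and cannot take the handle. */
        boolean add(long handle) {
            if (handles.size() >= capacity) {
                return false;
            }
            handles.addLast(handle);
            return true;
        }
    }

    /** Mirrors the branch structure of free(); the result names the branch taken. */
    static String free(BoundedCache cache, long handle, boolean unpooled) {
        if (unpooled) {
            return "destroyed";          // Huge allocation: release the chunk itself
        }
        if (cache != null && cache.add(handle)) {
            return "cached";             // kept for reuse by the same thread
        }
        return "freed to arena";         // cache unavailable or full
    }

    public static void main(String[] args) {
        BoundedCache cache = new BoundedCache(1);
        System.out.println(free(cache, 1L, true));    // unpooled chunk
        System.out.println(free(cache, 2L, false));   // fits into the cache
        System.out.println(free(cache, 3L, false));   // cache is full
        System.out.println(free(null, 4L, false));    // no cache at all
    }
}
```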

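PoolArena is only the entry point for allocating and releasing memory; the actual work is done in PoolChunk and PoolSubpage. This article therefore does not go into those two classes here; they will be analyzed in detail in later articles on PoolChunk and PoolSubpage.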

Reposted from: 大明哥_
