Notes on 《STL源码剖析》 (The Annotated STL Sources): Allocators


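When you merely use the STL, you never see the allocator, because it works quietly behind the containers. Every object the STL operates on lives in a container, and a container needs space to hold its data. Why is it called a space allocator rather than a memory allocator? Because the space need not be memory; it could also be disk or some other storage medium. That said, the allocator described below, the one shipped with SGI (Silicon Graphics Computer Systems, Inc.) STL, allocates memory.

The standard allocator interface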

// A set of type definitions; they are covered later
allocator::value_type
allocator::pointer
allocator::const_pointer
allocator::reference
allocator::const_reference
allocator::size_type
allocator::difference_type
allocator::rebind // the nested class template rebind<U> has a single member, other, a typedef for allocator<U>

// Constructors and destructor. The allocator has no data members, so nothing needs initializing, but these must still be defined
allocator::allocator()
allocator::allocator(const allocator&)
template <class U> allocator::allocator(const allocator<U>&)
allocator::~allocator()

// Allocation and address-related functions
pointer allocator::allocate(size_type n, const void* = 0)
size_type allocator::max_size() const
pointer allocator::address(reference x) const
const_pointer allocator::address(const_reference x) const
void allocator::deallocate(pointer p, size_type n)

// Construction and destruction
void allocator::construct(pointer p, const T& x)
void allocator::destroy(pointer p)
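
To make the interface above concrete, here is a minimal sketch of a user-defined allocator. It is not SGI code: the names MallocAllocator and sum_with_malloc_allocator are made up for illustration. It assumes a C++11 or later compiler, where std::allocator_traits supplies defaults for most of the members listed above (pointer, rebind, construct, destroy and so on), so only value_type, allocate, deallocate and the comparison operators are written out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical minimal allocator (not SGI code): forwards to malloc/free.
template <class T>
struct MallocAllocator {
    typedef T value_type;

    MallocAllocator() {}
    template <class U>
    MallocAllocator(const MallocAllocator<U>&) {}   // rebinding copy; there is no state to copy

    T* allocate(std::size_t n) {
        void* p = std::malloc(n * sizeof(T));
        if (p == 0) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};

// All instances are interchangeable because the allocator holds no state.
template <class T, class U>
bool operator==(const MallocAllocator<T>&, const MallocAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const MallocAllocator<T>&, const MallocAllocator<U>&) { return false; }

// Sums 1..n using a vector whose storage comes from MallocAllocator.
inline int sum_with_malloc_allocator(int n) {
    assert(n >= 0);
    std::vector<int, MallocAllocator<int> > v;
    for (int i = 1; i <= n; ++i) v.push_back(i);
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}
```

The container never names malloc or free; it only sees allocate and deallocate, which is the separation the interface is designed to provide.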

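SGI's special allocator

The allocator in SGI STL differs from the standard: it is named alloc rather than allocator, and it takes no template arguments.

std::vector<int, std::allocator<int>> vec;  // with the VC compiler
std::vector<int, std::alloc> vec;           // with the GCC compiler

In practice this matters little, because code usually relies on the default allocator and does not specify one. SGI STL also defines an allocator named allocator that partially conforms to the standard, but it is inefficient and its use is discouraged.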

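In C++, new and delete each perform two steps: new allocates memory and then calls the constructor; delete calls the destructor and then frees the memory. To divide the work more precisely, the allocator separates these steps: allocate and deallocate handle acquiring and releasing memory, while construct and destroy handle construction and destruction.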
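
The four steps can be carried out by hand with standard C++ facilities. The sketch below is only an illustration (build_and_destroy is a made-up name), using ::operator new and ::operator delete in place of allocate and deallocate:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Performs separately the steps that new and delete bundle together.
inline std::size_t build_and_destroy(const char* text) {
    assert(text != 0);
    typedef std::string Str;

    // 1. obtain raw, uninitialized storage (the role of allocate)
    void* raw = ::operator new(sizeof(Str));
    // 2. construct an object in that storage with placement new (the role of construct)
    Str* s = new (raw) Str(text);
    std::size_t len = s->size();
    // 3. run the destructor without freeing the storage (the role of destroy)
    s->~Str();
    // 4. release the raw storage (the role of deallocate)
    ::operator delete(raw);
    return len;
}
```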

Construction and destruction

    template <class T1, class T2>  
    inline void construct(T1* p, const T2& value) {  
        // placement new: construct in memory that has already been allocated
        new (p) T1(value);  
    }  

    // First version of destroy(): takes a pointer and only calls the destructor
    template <class T>  
    inline void destroy(T* pointer) {  
        pointer->~T();  
    }  

    // Second version of destroy(): takes two iterators. It works out the elements' value type
    // and then uses __type_traits<> to choose the most appropriate action
    template <class ForwardIterator>  
    inline void destroy(ForwardIterator first, ForwardIterator last) {  
      __destroy(first, last, value_type(first));  
    }  

    // Check whether the elements' value type has a trivial destructor
    template <class ForwardIterator, class T>  
    inline void __destroy(ForwardIterator first, ForwardIterator last, T*) {  
      typedef typename __type_traits<T>::has_trivial_destructor trivial_destructor;  
      __destroy_aux(first, last, trivial_destructor());  
    }  

    // The value type has a non-trivial destructor
    template <class ForwardIterator>  
    inline void  
    __destroy_aux(ForwardIterator first, ForwardIterator last, __false_type) {  
      for ( ; first < last; ++first)  
        destroy(&*first);  
    }  

    // The value type has a trivial destructor
    template <class ForwardIterator>   
    inline void __destroy_aux(ForwardIterator, ForwardIterator, __true_type) {}  

    // Specializations of the second version of destroy() for char* and wchar_t* iterators
    inline void destroy(char*, char*) {}  
    inline void destroy(wchar_t*, wchar_t*) {}

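The STL standard requires an allocator to provide construct and destroy. The construct function above sets an initial value in the storage the pointer refers to, and is implemented with placement new (see https://blog.csdn.net/d_guco/article/details/54019495). destroy has two versions. The first takes a pointer and simply calls the destructor. The second takes two iterators and destroys every object in the range between them. Because the range may be large, it first checks whether the value type's destructor is trivial (that is, it does nothing of consequence; see https://blog.csdn.net/WizardtoH/article/details/80767740). If it is trivial, skipping the destructor calls changes nothing, so no work is done; if it is not, the destructor is called on each object in turn. How triviality is determined is covered later.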
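
The same dispatch can be sketched with the standard library's type traits. This is not the SGI implementation: it swaps __type_traits for std::is_trivially_destructible (C++11), and Counted, destroy_range and count_destructor_calls are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <iterator>
#include <new>
#include <type_traits>

// Hypothetical type with a non-trivial destructor that counts its own destructions.
struct Counted {
    int* counter;
    explicit Counted(int* c) : counter(c) {}
    ~Counted() { ++*counter; }
};

// Trivial destructor: nothing to do, so the loop is skipped entirely.
template <class Iter>
void destroy_range_aux(Iter, Iter, std::true_type) {}

// Non-trivial destructor: call it on every element.
template <class Iter>
void destroy_range_aux(Iter first, Iter last, std::false_type) {
    typedef typename std::iterator_traits<Iter>::value_type T;
    for (; first != last; ++first) (&*first)->~T();
}

// Picks one of the two overloads above at compile time.
template <class Iter>
void destroy_range(Iter first, Iter last) {
    typedef typename std::iterator_traits<Iter>::value_type T;
    destroy_range_aux(first, last, std::is_trivially_destructible<T>());
}

inline int count_destructor_calls() {
    int calls = 0;
    alignas(Counted) unsigned char buf[3 * sizeof(Counted)];
    Counted* p = reinterpret_cast<Counted*>(buf);
    for (int i = 0; i < 3; ++i) new (p + i) Counted(&calls);
    assert(calls == 0);              // nothing destroyed yet
    destroy_range(p, p + 3);         // non-trivial path: three destructor calls

    int ints[3] = {1, 2, 3};
    destroy_range(ints, ints + 3);   // trivial path: compiles to nothing
    return calls;
}
```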

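Allocating and releasing memory: std::alloc

SGI STL's design for allocating and releasing memory takes the following into account:

Requesting space from the system heap.
Multithreading.
Out-of-memory conditions.
Memory fragmentation.

To keep things simple, everything below ignores multithreading.

C++ acquires and releases memory with new and delete, which correspond to malloc and free in C; SGI STL uses malloc and free. To address memory fragmentation, SGI STL uses a two-level allocator design: the first level calls malloc and free directly, and the second level chooses a strategy according to the request size:

When the requested block is larger than 128 bytes, it is considered "large enough" and the first-level allocator is called.
When the requested block is 128 bytes or smaller, it is considered "too small"; to reduce overhead, it is served from a more elaborate memory pool (described later) instead of going to the first-level allocator.

Whether the design exposes only the first-level allocator or the second-level allocator as well depends on whether __USE_MALLOC is defined. If it is defined, alloc is the first-level allocator. It is normally not defined, so in most cases alloc is the second-level allocator, which in turn hands large requests to the first level. Whichever one alloc is defined as, SGI wraps it in one more layer, a kind of adapter, so that the allocator's interface conforms to the STL specification: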

template<class _Tp, class _Alloc>
class simple_alloc {
public:
    static _Tp* allocate(size_t __n)
      { return 0 == __n ? 0 : (_Tp*) _Alloc::allocate(__n * sizeof (_Tp)); }

    static _Tp* allocate(void)
      { return (_Tp*) _Alloc::allocate(sizeof (_Tp)); }

    static void deallocate(_Tp* __p, size_t __n)
      { if (0 != __n) _Alloc::deallocate(__p, __n * sizeof (_Tp)); }

    static void deallocate(_Tp* __p)
      { _Alloc::deallocate(__p, sizeof (_Tp)); }
};

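Its four member functions simply forward to the member functions of the underlying allocator, which may be the first-level or the second-level one. Every SGI STL container uses this interface, for example the familiar vector:

template<class T, class Alloc = alloc>        // alloc is the default allocator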
class vector
{
protected:

    // uses simple_alloc, which allocates in units of one element
    typedef simple_alloc<value_type, Alloc> data_allocator;

    void deallocate()
    {
        if(...)
        {
            data_allocator::deallocate(start, end_of_storage - start);
        }
    }
    //...

};
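
The adapter's job is to turn element counts into byte counts. The sketch below mirrors the shape of simple_alloc under stated assumptions: CountingByteAlloc is a hypothetical stand-in for alloc that records the size of the last request, and typed_adapter and bytes_requested_for_doubles are likewise made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical byte-oriented allocator standing in for alloc.
struct CountingByteAlloc {
    static std::size_t& last_request() { static std::size_t n = 0; return n; }
    static void* allocate(std::size_t bytes) {
        last_request() = bytes;
        return std::malloc(bytes);
    }
    static void deallocate(void* p, std::size_t) { std::free(p); }
};

// Same shape as simple_alloc: element counts in, byte counts out.
template <class T, class Alloc>
struct typed_adapter {
    static T* allocate(std::size_t n) {
        return n == 0 ? 0 : static_cast<T*>(Alloc::allocate(n * sizeof(T)));
    }
    static void deallocate(T* p, std::size_t n) {
        if (n != 0) Alloc::deallocate(p, n * sizeof(T));
    }
};

// Allocates room for n doubles and reports how many bytes the adapter asked for.
inline std::size_t bytes_requested_for_doubles(std::size_t n) {
    typedef typed_adapter<double, CountingByteAlloc> A;
    double* p = A::allocate(n);
    assert(p != 0 || n == 0);
    A::deallocate(p, n);
    return CountingByteAlloc::last_request();
}
```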

The first-level allocator

#if 0   
#   include <new>   
#   define  __THROW_BAD_ALLOC throw bad_alloc   
#elif !defined(__THROW_BAD_ALLOC)   
#   include <iostream.h>   
#   define  __THROW_BAD_ALLOC cerr << "out of memory" << endl; exit(1)   
#endif   

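// malloc-based allocator. Usually slower than the default alloc described later,
// generally thread-safe, and efficient in its use of space.
// This is the first-level allocator.
// Note: there is no template type parameter, and the non-type parameter inst is never used.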
template <int inst>     
class __malloc_alloc_template {   

private:   
// The following members handle out-of-memory conditions.
// oom: out of memory.
static void *oom_malloc(size_t);   
static void *oom_realloc(void *, size_t);   
static void (* __malloc_alloc_oom_handler)();   

public:   

static void * allocate(size_t n)   
{   
    void *result = malloc(n);   // the first-level allocator calls malloc() directly
    // if the request cannot be satisfied, fall back to oom_malloc()
    if (0 == result) result = oom_malloc(n);
    return result;
}
static void deallocate(void *p, size_t /* n */)
{
    free(p);   // the first-level allocator calls free() directly
}

static void * reallocate(void *p, size_t /* old_sz */, size_t new_sz)
{
    void *result = realloc(p, new_sz);   // the first-level allocator calls realloc() directly
    // if the request cannot be satisfied, fall back to oom_realloc()
    if (0 == result) result = oom_realloc(p, new_sz);
    return result;
}

// Emulates C++'s set_new_handler(): through it you can
// install your own out-of-memory handler
static void (* set_malloc_handler(void (*f)()))()
{
    void (* old)() = __malloc_alloc_oom_handler;
    __malloc_alloc_oom_handler = f;
    return(old);
}   
};   

// malloc_alloc out-of-memory handling
// Initially 0; to be set by the client.
template <int inst>   
void (* __malloc_alloc_template<inst>::__malloc_alloc_oom_handler)() = 0;   

template <int inst>   
void * __malloc_alloc_template<inst>::oom_malloc(size_t n)   
{   
    void  (* my_malloc_handler)();   
    void  *result;   

    for (;;) {
        // keep trying: release, allocate, release again, allocate again...
        my_malloc_handler = __malloc_alloc_oom_handler;
        if (0 == my_malloc_handler) { __THROW_BAD_ALLOC; }
        (*my_malloc_handler)();   // call the handler, which tries to free some memory
        result = malloc(n);       // try to allocate again
        if (result) return(result);
    }   
}   

template <int inst>   
void * __malloc_alloc_template<inst>::oom_realloc(void *p, size_t n)   
{   
    void  (* my_malloc_handler)();   
    void  *result;   
    for (;;) {   // keep trying: release, allocate, release again, allocate again...
        my_malloc_handler = __malloc_alloc_oom_handler;
        if (0 == my_malloc_handler) { __THROW_BAD_ALLOC; }
        (*my_malloc_handler)();    // call the handler, which tries to free some memory
        result = realloc(p, n);    // try to allocate again
        if  (result)  return(result);   
    }   
}   

// Note: the parameter inst is fixed to 0 here.
typedef __malloc_alloc_template<0> malloc_alloc;

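As the code above shows, the first-level allocator implements a mechanism similar to the C++ new-handler; because it does not allocate with new and delete, it cannot use the C++ new-handler itself. allocate and reallocate first call the C functions directly, and only when those fail do they call oom_malloc and oom_realloc, which retry in a loop in the hope that some attempt succeeds. This presupposes that a handler has been installed with set_malloc_handler; otherwise __THROW_BAD_ALLOC is executed at once, which throws bad_alloc or, with the macro definition shown above, prints "out of memory" and exits. Note that designing and installing the handler is the client's responsibility. For how to write a new-handler (an out-of-memory handling routine), see Item 7 of "Effective C++" (2nd edition) (see https://blog.csdn.net/wangqiulin123456/article/details/8253279).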
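
The retry loop is easier to follow in a setting where failure can be forced. The sketch below is a simulation, not SGI code: fake_malloc pretends memory is exhausted while a hypothetical "reserve" is held, release_reserve plays the role of a client-supplied handler, and all names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

typedef void (*oom_handler_t)();

// Hypothetical stand-ins for the real allocator state.
static bool g_reserve_held = true;       // while true, fake_malloc reports failure
static oom_handler_t g_oom_handler = 0;  // no handler until the client installs one

inline void* fake_malloc(std::size_t n) { return g_reserve_held ? 0 : std::malloc(n); }

// A client-written handler: gives up the reserve so the next attempt can succeed.
inline void release_reserve() { g_reserve_held = false; }

// Same contract as set_malloc_handler: install f, return the previous handler.
inline oom_handler_t set_oom_handler(oom_handler_t f) {
    oom_handler_t old = g_oom_handler;
    g_oom_handler = f;
    return old;
}

// Try once; on failure, alternate between calling the handler and retrying.
inline void* allocate_with_retry(std::size_t n) {
    assert(n > 0);
    void* p = fake_malloc(n);
    while (p == 0) {
        oom_handler_t handler = g_oom_handler;
        if (handler == 0) throw std::bad_alloc();  // nobody can help: give up
        handler();                                 // let the client free something
        p = fake_malloc(n);                        // and try again
    }
    return p;
}
```

A handler that cannot free anything should not simply return, since the loop would then spin forever; the usual choices are to install a different handler, throw, or terminate the program.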

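The second-level allocator

Compared with the first level, the second-level allocator adds mechanisms that guard against the fragmentation caused by many consecutive small allocations. It works as follows: a request larger than 128 bytes is passed to the first-level allocator; a request of 128 bytes or less is managed through a memory pool, where a large chunk of memory is obtained at a time and managed as linked lists. For ease of management, every request size is rounded up to a multiple of 8, and 16 free-lists are maintained for block sizes of 8, 16, 24, ..., 128 bytes; each free-list holds blocks of a single size.

A free-list node is defined as follows: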

union obj
{
    union obj* free_list_link;
    char client_data[1];
};

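The structure can be seen as overlaying the first pointer-sized bytes of a block (4 bytes on a 32-bit platform). While the block is free, those bytes store the address of the next free block; once the block is handed to the user, they hold the user's data. The reason for having two members is presumably (this is a guess) to make switching between the two views convenient, since a single member would require casts; and because it is a union, the second member costs no extra space.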
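
A minimal sketch of the idea, assuming blocks that are at least pointer-sized and suitably aligned: the link lives inside the free block itself, so the list needs no bookkeeping memory of its own. Node, FreeList and free_list_is_lifo are illustrative names, not SGI's.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

union Node {                  // same layout idea as obj
    Node* next;               // meaningful while the block sits on the free list
    char client_data[1];      // first byte of the block once it is handed out
};

struct FreeList {
    Node* head;
    FreeList() : head(0) {}

    // Return a block to the list: it becomes the new head.
    void push(void* block) {
        assert(block != 0);
        Node* n = ::new (block) Node;   // reuse the block's own bytes for the link
        n->next = head;
        head = n;
    }

    // Take a block from the list: unlink the head, or return 0 if the list is empty.
    void* pop() {
        Node* n = head;
        if (n != 0) head = n->next;
        return n;
    }
};

// Pushes two 16-byte blocks and checks that they come back in last-in, first-out order.
inline bool free_list_is_lifo() {
    alignas(Node) unsigned char a[16];
    alignas(Node) unsigned char b[16];
    FreeList list;
    list.push(a);
    list.push(b);
    void* first = list.pop();
    void* second = list.pop();
    return first == b && second == a && list.pop() == 0;
}
```

push and pop correspond to the small-block paths of deallocate and allocate in the second-level allocator, with one such list kept per block size.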

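Below is part of the second-level allocator's implementation:

enum {__ALIGN = 8};  // alignment boundary for small blocks
enum {__MAX_BYTES = 128};   // upper bound on small-block size
enum {__NFREELISTS = __MAX_BYTES/__ALIGN};   // number of free-lists

// No template type parameters; the second parameter is unused.
// The first parameter is for multithreading, which is not discussed here.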
template <bool threads, int inst>
class __default_alloc_template {

private:
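    /* Round bytes up to a multiple of 8.
    In binary terms: if bytes is a multiple of __ALIGN, its low bits are all 0, adding
    __ALIGN - 1 does not carry, and the mask returns bytes itself; otherwise some low bit
    is 1, and adding __ALIGN - 1 carries into bit i (where 2^i = __ALIGN). Masking off the
    low bits with & then leaves a multiple of 8.
    */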
    static size_t ROUND_UP(size_t bytes) {
        return (((bytes) + __ALIGN-1) & ~(__ALIGN - 1));
    }
private:
    union obj {   // free-list node
        union obj * free_list_link;
        char client_data[1];    /* The client sees this.     */
    };

private:
    // the 16 free-lists
    static obj * __VOLATILE free_list[__NFREELISTS];
    // pick the free-list for a given block size and return its index (counting from 0)
    static size_t FREELIST_INDEX(size_t bytes) {
        return (((bytes) + __ALIGN-1)/__ALIGN - 1);
    }

    // returns an object of size n, and may add further blocks of size n to the matching free-list
    static void *refill(size_t n);
    // allocates a chunk large enough for nobjs blocks of size "size";
    // nobjs may be reduced if allocating that many blocks is inconvenient
    static char *chunk_alloc(size_t size, int &nobjs);

    // Chunk allocation state
    static char *start_free;
    static char *end_free;
    static size_t heap_size;

public:
    // described below
    static void * allocate(size_t n);
    static void deallocate(void *p, size_t n);
    static void * reallocate(void *p, size_t old_sz, size_t new_sz);
};

// definitions and initial values of the static data members
template <bool threads, int inst>
char * __default_alloc_template<threads, inst>::start_free = 0;

template <bool threads, int inst>
char * __default_alloc_template<threads, inst>::end_free = 0;

template <bool threads, int inst>
size_t __default_alloc_template<threads, inst>::heap_size = 0;

template <bool threads, int inst>
__default_alloc_template<threads, inst>::obj * volatile
__default_alloc_template<threads, inst>::free_list[__NFREELISTS] = 
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, };
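
The two helpers, ROUND_UP and FREELIST_INDEX, are small enough to check in isolation. The sketch below restates them as free functions (round_up, freelist_index and kAlign are names chosen for the example) so that a few values can be worked through:

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kAlign = 8;   // plays the role of __ALIGN

// Round bytes up to the next multiple of kAlign (kAlign must be a power of two).
inline std::size_t round_up(std::size_t bytes) {
    return (bytes + kAlign - 1) & ~(kAlign - 1);
}

// Index of the free-list that serves a request of the given size (1..128).
inline std::size_t freelist_index(std::size_t bytes) {
    assert(bytes > 0);          // 0 would wrap around in the subtraction below
    return (bytes + kAlign - 1) / kAlign - 1;
}
```

For example, a 13-byte request is rounded up to 16 bytes and served from list 1, while requests of 1 to 8 bytes all share list 0 and a 128-byte request uses the last list, index 15.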

The allocate function of __default_alloc_template

 static void * allocate(size_t n)
  {
    obj * __VOLATILE * my_free_list;
    obj * __RESTRICT result;

    if (n > (size_t) __MAX_BYTES) {
        return(malloc_alloc::allocate(n));
    }

    // the request is no larger than 128 bytes: look in the matching free-list
    my_free_list = free_list + FREELIST_INDEX(n);
    result = *my_free_list;
    if (result == 0) {
        // this free-list has no block available; refill it
        void *r = refill(ROUND_UP(n));     // refill is described later
        return r;
    }
    // the free-list has a block: unlink it and hand it out
    *my_free_list = result -> free_list_link;
    return (result);
  };


The deallocate function of __default_alloc_template

static void deallocate(void *p, size_t n)
{
    obj *q = (obj *)p;
    obj * __VOLATILE * my_free_list;

    if (n > (size_t) __MAX_BYTES) {
        malloc_alloc::deallocate(p, n);
        return;
    }

    // small block: find the free-list the block belongs to
    my_free_list = free_list + FREELIST_INDEX(n);
    q -> free_list_link = *my_free_list;

    // make the returned block the new head of that free-list
    *my_free_list = q;
}


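Refilling a free-list: the refill function

The flow by which the second-level allocator takes blocks from and returns blocks to the free-lists has been described above. During allocation, the free-list for the requested size may turn out to be empty; the list then has to be refilled, which is done by calling refill.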

template <bool threads, int inst>
void* __default_alloc_template<threads, inst>::refill(size_t n)
{
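    // by default, obtain 20 blocks (n has already been rounded up to a multiple of 8)
    int nobjs = 20;

    // try to obtain nobjs blocks of n bytes to serve as new free-list nodes;
    // note that fewer than 20 may come back if the memory pool is short of space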
    char * chunk = chunk_alloc(n, nobjs);
    obj * __VOLATILE * my_free_list;
    obj * result;
    obj * current_obj, * next_obj;
    int i;

    // if only one block was obtained, hand it straight to the caller
    if (1 == nobjs) return(chunk);

    // otherwise find the free-list this block size belongs to
    my_free_list = free_list + FREELIST_INDEX(n);

    // the first block is returned to the caller below; the rest are linked into free_list[FREELIST_INDEX(n)]
    result = (obj *)chunk;
    *my_free_list = next_obj = (obj *)(chunk + n);
    for (i = 1; ; i++) {
        current_obj = next_obj;

        // chunk_alloc returns char*, so step forward in bytes
        next_obj = (obj *)((char *)next_obj + n);
        // stop once nobjs - 1 nodes have been linked
        if (nobjs - 1 == i) {
            current_obj -> free_list_link = 0;
            break;
        } else {
            current_obj -> free_list_link = next_obj;
        }
    }

    return(result);
}
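
The linking step can be sketched on its own. The version below is a simplification rather than a copy of refill: it threads every block of a chunk into a list (refill keeps the first block back for the caller), and Link, carve and carved_length are hypothetical names. It assumes block_size is a multiple of 8 and the chunk is aligned for a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

union Link {
    Link* next;
    char client_data[1];
};

// Threads count blocks of block_size bytes, laid out back to back in chunk,
// into a singly linked list and returns its head.
inline Link* carve(char* chunk, std::size_t block_size, int count) {
    assert(block_size >= sizeof(Link) && count > 0);
    Link* head = 0;
    // Walk backwards so the finished list runs in increasing address order.
    for (int i = count - 1; i >= 0; --i) {
        Link* node = ::new (chunk + i * block_size) Link;
        node->next = head;
        head = node;
    }
    return head;
}

// Carves a freshly allocated chunk and counts the nodes on the resulting list.
inline int carved_length(std::size_t block_size, int count) {
    // A vector of Link gives storage that is correctly aligned for the nodes.
    std::vector<Link> storage(block_size * count / sizeof(Link));
    Link* p = carve(reinterpret_cast<char*>(storage.data()), block_size, count);
    int n = 0;
    for (; p != 0; p = p->next) ++n;
    return n;
}
```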

The memory pool

chunk_alloc, used by refill, obtains space from the memory pool. Its implementation is as follows:

template <bool threads, int inst>
char*
__default_alloc_template<threads, inst>::chunk_alloc(size_t size, int& nobjs)
{
    char * result;
    size_t total_bytes = size * nobjs;
    size_t bytes_left = end_free - start_free;  // space remaining in the memory pool

    if (bytes_left >= total_bytes) {
        // the pool can satisfy the whole request
        result = start_free;
        start_free += total_bytes;
        return(result);
    }
    else if (bytes_left >= size) {
        // the pool cannot satisfy the whole request but can supply at least one block;
        // this is the case mentioned in refill where fewer than 20 blocks come back
        nobjs = bytes_left/size;
        total_bytes = size * nobjs;
        result = start_free;
        start_free += total_bytes;
        return(result);
    }
    else {
        // the pool cannot supply even one block
        size_t bytes_to_get = 2 * total_bytes + ROUND_UP(heap_size >> 4);

        // put whatever is left in the pool onto the appropriate free-list
        if (bytes_left > 0) {
            obj * __VOLATILE * my_free_list = free_list + FREELIST_INDEX(bytes_left);
            ((obj *)start_free) -> free_list_link = *my_free_list;
            *my_free_list = (obj *)start_free;
        }

        // replenish the memory pool with malloc
        start_free = (char *)malloc(bytes_to_get);
        if (0 == start_free) {
            // malloc failed
            int i;
            obj * __VOLATILE * my_free_list, *p;

            // search the free-lists for an unused block that is large enough
            for (i = size; i <= __MAX_BYTES; i += __ALIGN) {
                my_free_list = free_list + FREELIST_INDEX(i);
                p = *my_free_list;

                // move that block from its free-list into the memory pool
                if (0 != p) {
                    *my_free_list = p -> free_list_link;
                    start_free = (char *)p;
                    end_free = start_free + i;
                    return(chunk_alloc(size, nobjs));  // recurse; any leftover is later put back onto the appropriate free-list
                }
            }

            // the free-lists have no usable block either
            end_free = 0;

            // try the first-level allocator, which either throws or obtains the memory
            start_free = (char *)malloc_alloc::allocate(bytes_to_get);
        }

        heap_size += bytes_to_get;
        end_free = start_free + bytes_to_get;
        return (chunk_alloc(size, nobjs));
    }
}

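The memory pool handles the following cases:

The pool has enough space: allocate directly.
The pool cannot satisfy the whole request but can supply at least one block: hand out as many blocks as it can.
The pool cannot supply even one block: request more memory with malloc to enlarge the pool, then allocate.
malloc fails and no sufficiently large unused block is left on the free-lists: fall back to the first-level allocator, whose out-of-memory handling (the new-handler-like mechanism) may be able to free other memory; if that fails too, an exception is thrown.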
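
The growth formula in the third case can be worked through with concrete numbers. The sketch below restates it as a free function (bytes_to_get and round_up8 are names chosen for the example; the formula itself is the one in chunk_alloc):

```cpp
#include <cassert>
#include <cstddef>

// Round up to a multiple of 8, as ROUND_UP does.
inline std::size_t round_up8(std::size_t bytes) {
    return (bytes + 7) & ~static_cast<std::size_t>(7);
}

// Bytes requested from malloc when the pool cannot supply even one block:
// twice the immediate need, plus an extra that grows with the heap obtained so far.
inline std::size_t bytes_to_get(std::size_t size, int nobjs, std::size_t heap_size) {
    assert(nobjs > 0);
    return 2 * size * nobjs + round_up8(heap_size >> 4);
}
```

With an empty pool (heap_size 0), a first request for 32-byte blocks asks malloc for 2 * 32 * 20 = 1280 bytes; 640 of them are returned to refill as 20 blocks and the other 640 stay in the pool. If a later request for 96-byte blocks finds the pool exhausted while heap_size is 1280, it asks for 2 * 96 * 20 + ROUND_UP(1280 >> 4) = 3840 + 80 = 3920 bytes.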


Basic memory handling tools

The STL defines five global functions. construct and destroy were covered above; the other three are uninitialized_copy, uninitialized_fill and uninitialized_fill_n.

// uninitialized_copy function template
template <class InputIterator, class ForwardIterator>
inline ForwardIterator
uninitialized_copy(InputIterator first, InputIterator last,
        ForwardIterator result)
{
    // value_type() is a function template that extracts the iterator's value type T
    // and returns a null T*, used only to select the right overload
    return __uninitialized_copy(first, last, result, value_type(result));
}

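uninitialized_copy copy-constructs every object in [first, last) into the uninitialized range starting at result. It is useful when implementing your own container, because a container's range constructor usually works in two steps: allocate a block of memory, then construct the elements with uninitialized_copy. The C++ standard requires uninitialized_copy to have "commit or rollback" semantics: either all elements are constructed, or none are, with anything already constructed being destroyed again.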
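
The rollback can be observed with std::uninitialized_copy from the standard library. In the sketch below, Tracked is a hypothetical type that counts live objects and throws from its copy constructor on a chosen copy; live_after_failed_copy is likewise a made-up helper.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <stdexcept>

// Hypothetical element type: counts live objects and can be told to fail on the Nth copy.
struct Tracked {
    static int& live() { static int n = 0; return n; }
    static int& copies_until_throw() { static int n = 0; return n; }

    Tracked() { ++live(); }
    Tracked(const Tracked&) {
        if (--copies_until_throw() == 0) throw std::runtime_error("copy failed");
        ++live();
    }
    ~Tracked() { --live(); }
};

// Copies four objects into raw storage, with the third copy throwing, and
// returns how many copies are still alive afterwards.
inline int live_after_failed_copy() {
    Tracked src[4];
    assert(Tracked::live() == 4);
    Tracked::copies_until_throw() = 3;           // the third copy will throw

    void* raw = ::operator new(4 * sizeof(Tracked));
    try {
        std::uninitialized_copy(src, src + 4, static_cast<Tracked*>(raw));
    } catch (const std::runtime_error&) {
        // by the time the exception arrives here, the two copies that
        // succeeded have already been destroyed by uninitialized_copy
    }
    int live_copies = Tracked::live() - 4;       // objects alive besides src
    ::operator delete(raw);
    return live_copies;
}
```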

// uninitialized_fill function template
template <class ForwardIterator, class T>
inline void uninitialized_fill(ForwardIterator first, ForwardIterator last,
        const T& x)
{
    __uninitialized_fill(first, last, x, value_type(first));
}

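uninitialized_fill lets us separate memory allocation from object construction. If every iterator in [first, last) points to uninitialized memory, uninitialized_fill() constructs a copy of x at each position in the range. Unlike uninitialized_copy, which initializes the destination from the elements of a source range, it initializes every element to the same given value x.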
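
A short usage sketch with the standard library's std::uninitialized_fill, keeping allocation and construction as separate steps (fill_raw_storage is a made-up helper for this illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Allocates raw storage for n strings, fills it with copies of x,
// and returns the n copies concatenated.
inline std::string fill_raw_storage(std::size_t n, const std::string& x) {
    assert(n > 0);
    typedef std::string Str;

    // Step 1: memory only; no Str object exists yet.
    Str* first = static_cast<Str*>(::operator new(n * sizeof(Str)));
    // Step 2: copy-construct x into every slot of the uninitialized range.
    std::uninitialized_fill(first, first + n, x);

    std::string joined;
    for (std::size_t i = 0; i < n; ++i) joined += first[i];

    // Tear down in the reverse order: destroy the objects, then free the memory.
    for (std::size_t i = 0; i < n; ++i) first[i].~Str();
    ::operator delete(first);
    return joined;
}
```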

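// uninitialized_fill_n function template
template <class ForwardIterator, class Size, class T>
inline ForwardIterator uninitialized_fill_n(ForwardIterator first, Size n,
        const T& x)
{
    return __uninitialized_fill_n(first, n, x, value_type(first));
}

uninitialized_fill_n is similar to uninitialized_fill, except that the range runs from first to first+n.

All three functions first check whether the value type is a POD type (see https://blog.csdn.net/WizardtoH/article/details/80767740); the check is resolved at compile time through __type_traits. For a POD type they fill in the initial values in the most efficient way; otherwise they take the safest approach. Taking uninitialized_fill_n as an example: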

template <class ForwardIterator, class Size, class T>
inline ForwardIterator
__uninitialized_fill_n_aux(ForwardIterator first, Size n,
        const T& x, __true_type)
{
    // POD type: fill the whole range in one go
    return fill_n(first, n, x);
}

template <class ForwardIterator, class Size, class T>
ForwardIterator
__uninitialized_fill_n_aux(ForwardIterator first, Size n,
        const T& x, __false_type)
{
    // non-POD type: construct the elements one at a time and catch exceptions
    ForwardIterator cur = first;
    __STL_TRY
    {
        for (; n > 0; --n, ++cur)
            construct(&*cur, x);
        return cur;
    }
    __STL_UNWIND(destroy(first, cur));
}

template <class ForwardIterator, class Size, class T, class T1>
inline ForwardIterator __uninitialized_fill_n(ForwardIterator first,
        Size n, const T& x, T1*)
{
    typedef typename __type_traits<T1>::is_POD_type is_POD;
    return __uninitialized_fill_n_aux(first, n, x, is_POD());
}

template <class ForwardIterator, class Size, class T>
inline ForwardIterator uninitialized_fill_n(ForwardIterator first,Size n,
        const T& x)
{
    return __uninitialized_fill_n(first, n, x, value_type(first));
}
