数据结构基本定义

在具体的分配内存的时候，都会看见Layout这个数据结构，这里来对源码进行研究
定义为：

#[stable(feature = "alloc_layout", since = "1.28.0")]
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[lang = "alloc_layout"]
pub struct Layout {
    // size of the requested block of memory, measured in bytes.
    size_: usize,
    // alignment(一排) of the requested block of memory, measured in bytes.
    // we ensure that this is always a power-of-two, because API's
    // like `posix_memalign` require it and it is a reasonable
    // constraint to impose on Layout constructors.
    // 确保大小必须为2的幂次，因为POSIX标准的内存挨着一排分配要求
    // 但是注意，并不同样的要求下面这个，即使POSIX的这个方式要求
    // (However, we do not analogously require `align >= sizeof(void*)`,
    //  even though that is *also* a requirement of `posix_memalign`.)
    align_: NonZeroUsize,
}
// 插播一下NonZeroUsize这个东西的定义,用了很多宏：
pub struct NonZeroUsize(usize);
// 实现了这几个方法
pub const unsafe fn new_unchecked(n: usize) -> Self {
    // SAFETY: this is guaranteed to be safe by the caller.
    unsafe { Self(n) }
}
// 核心在这里，确保不为0就是在这里实现的
pub const fn new(n: usize) -> Option<Self> {
    if n != 0 {
        // SAFETY: we just checked that there's no `0`
        Some(unsafe { Self(n) })
    } else {
        None
    }
}
pub const fn get(self) -> $Int {
    self.0
}
// 然后实现了各种按位取或的逻辑：
#[stable(feature = "nonzero_bitor", since = "1.45.0")]
#[rustc_const_unstable(feature = "const_ops", issue = "90080")]
impl const BitOr for $Ty {
    type Output = Self;
    #[inline]
    fn bitor(self, rhs: Self) -> Self::Output {
        // SAFETY: since `self` and `rhs` are both nonzero, the
        // result of the bitwise-or will be nonzero.
        unsafe { $Ty::new_unchecked(self.get() | rhs.get()) }
    }
}
// 然后一些默认的整数类型都继承的方法：
//$Uint 是 usize::MAX
pub const fn trailing_zeros(self) -> u32 {
    // 获取后置0，比如0b000_111_000 就是 3 个后置0
    // SAFETY: since `self` can not be zero it is safe to call cttz_nonzero
    unsafe { intrinsics::cttz_nonzero(self.0 as $Uint) as u32 }
}
pub const fn leading_zeros(self) -> u32 {
    // 获取前置0，0b000_111_000 就是 3 个前置0
    // SAFETY: since `self` can not be zero it is safe to call ctlz_nonzero
    unsafe { intrinsics::ctlz_nonzero(self.0 as $Uint) as u32 }
}
// 还有就是各种实现的乘法除法等没有列出来

看懂源码的核心需要两个关键的概念：

align：必须是2的幂次（意味着不为0），alignment意味着内存对齐，必须是2的幂次来进行分配，就算我需要的大小不是这么大。见下图，如果内存对齐的话，我们保证4byte的缓存中，能够只进行一次读取，而没有对齐的情况下，从缓存中读取int b_需要两次，保证内存对齐可以确保缓存读取的高效率。(Rust in Action 提到了4Kb缓存的情况，保证每一次从缓存中读取的是完整的4Kb)

size：我们实际需要的内存的大小

因此就能看懂初始化内存布局的情况下的边界情况了：

细说内存对齐

来源于博客
很多 CPU ，如基于 Alpha, IA-64, MIPS, 和 SuperH 体系的，拒绝读取未对齐数据。当一个程序要求其中之一的 CPU 读取未对齐数据时，这时 CPU 会进入异常处理状态并且通知程序不能继续执行。
对齐性定义为：
对齐性是一种内存地址的特性，表现为内存地址模上 2 的幂。例如，内存地址 0x0001103F 模 4 结果为 3 ；这个地址就叫做与 4n ＋ 3 对齐， 4 指出了所选用的 2 的幂的值。内存地址的对齐性取决于所选择的关于 2 的幂值。同样的地址模 8 结果为 7 。
所谓的4对齐与8对齐是这样：

单子节存取的时候就是单纯的一个字节一个字节的读，但是实际上的CPU为了提高效率，不会一次性就读一个字节进入寄存器。

双字节存储器，比单字节存储器减少了一倍的开销，可以看见，我们在地址1读取数据，而数据并没有对齐，即从地质1开始读取，因此数据会出现截断的情况，这时候必须再读一次才能够读取到正确的数据

四字节，问题就更严重了，本来只需要读取一次的，但是因为取的时候内存并未对齐，造成了很多额外开销

CPU 执行指令就是对存储于内存上的数据进行操作，这些数据在内存上是以地址为标记的。对于地址来说，单独的数据会有其占用内存的字节数。如果它的地址对齐于它的字节数，那么就称作该数据自然对齐，否则称为未对齐。例如，一个标记 8 字节的浮点数据的地址对齐于 8 ，那么这个数据就自然对齐。
因此，假设我们的CPU选择8字节对齐，align就是8，那么虽然u8类型是大小为1字节，那么实际上他也会被扩展到8个字节。而对于结构体来说，结构体第一个成员相对结构体位置的偏移量为0，那么就按照最小的align，不断的排列
比如：

struct Test {
    data: u8,           // 8 位,一个字节
    // name: String,       
    age: [u8; 13],      // 8位
    // gogo: Vec<i32>      
}

fn main() {
    let test = Test {
        data: 1,
        // name: "sisi".to_string(),
        age: [0; 13],
        // gogo: vec![1, 2, 3]
    };
    println!("内存大小为：{}字节， 对齐大小为：{}字节",std::mem::size_of_val(&test) ,std::mem::align_of_val(&test));
    println!("{}", std::mem::size_of::<String>());
}
// 输出是
内存大小为：14， 对齐大小为：1
24
// 如果放开string类型是，有趣的是，这个结果对于任意长度的string一样：
内存大小为：40， 对齐大小为：8
// 再放开vec也是，不论vec多长，说明结构体的大小被编译器优化过
内存大小为：64， 对齐大小为：8

注意第一二个输出，原本的结构大小是14，String的大小是24，但是加起来之后，偏移到了40，多了2字节用以对齐
第二三个输出以及随意修改里面的内容，可以发现string和vec竟然是固定大小！！！
为什么Vec是固定大小呢？
因为实际上结构体存储的Vec是一个引用的大小：

fn main() {
    let kk = vec!["1", "2", "3"];
    let al : &&str = kk.get(1).unwrap();
    println!("vec长度为{}", kk.len());
    println!("vec的大小为{}", core::mem::size_of_val(&kk));
    println!("vec中元素的大小为：{}", core::mem::size_of_val(al));
    println!("&str中元素的大小为：{}", core::mem::size_of_val(*al));
}

// 输出
vec长度为3
vec的大小为24
vec中元素的大小为：16
&str中元素的大小为：1

当然，对于末尾的位置，也需要填充直到结构的大小完整，每个字段都对齐：

// 在layout中的方法
// Returns the amount of padding we must insert after self to ensure that the following address will satisfy align (measured in bytes).
// e.g., if self.size() is 9, then self.padding_needed_for(4) returns 3, because that is the minimum number of bytes of padding required to get a 4-aligned address (assuming that the corresponding memory block starts at a 4-aligned address).
// 所谓的舍入大小即为，在内存对齐要求下，通过填充能够得到的完整内存大小，这个函数就是为了计算满足align的内存大小需要添加的内存大小
pub const fn padding_needed_for(&self, align: usize) -> usize {
    let len = self.size();

    // Rounded up value is:
    //   len_rounded_up = (len + align - 1) & !(align - 1);
    // and then we return the padding difference: `len_rounded_up - len`.
    //
    // We use modular arithmetic throughout:
    //
    // 1. align is guaranteed to be > 0, so align - 1 is always
    //    valid.
    //
    // 2. `len + align - 1` can overflow by at most `align - 1`,
    //    so the &-mask with `!(align - 1)` will ensure that in the
    //    case of overflow, `len_rounded_up` will itself be 0.
    //    Thus the returned padding, when added to `len`, yields 0,
    //    which trivially satisfies the alignment `align`.
    //    即，如果是4位的align(方便计算，但是不一定是这样)，我们分配的内存大小最多还差3到达溢出，如果是小于3的话，将不会填充，此时出现了溢出，直接返回需要填充的大小为0；如果是大于三的话，那么就可以按照align再次分配一部分内存，返回的就会是正确需要的大小
    // (Of course, attempts to allocate blocks of memory whose
    // size and padding overflow in the above manner should cause
    // the allocator to yield an error anyway.)

    let len_rounded_up = len.wrapping_add(align).wrapping_sub(1) & !align.wrapping_sub(1);
    // 比如上面官方的例子，没有溢出情况下，舍入大小为 (9 + 4 - 1)&!(3)
    // 1100 & !0011 = 1100 = 12
    // 12 - 9 = 3 即为了满足align需要的内存大小
     len_rounded_up.wrapping_sub(len)
 }

// 返回了一个layout满足了末尾元素的内存对齐
pub fn pad_to_align(&self) -> Layout {
    // 上面的代码确保了不会溢出
    let pad = self.padding_needed_for(self.align());
    // This cannot overflow. Quoting from the invariant of Layout:
    // > `size`, when rounded up to the nearest multiple of `align`,
    // > must not overflow (i.e., the rounded value must be less than
    // > `usize::MAX`)
    let new_size = self.size() + pad;

    // SAFETY: self.align is already known to be valid and new_size has been
    // padded already.
    unsafe { Layout::from_size_align_unchecked(new_size, self.align()) }
}

Layout的构造

明白了内存对齐的概念之后就知道下面这两个函数了：
其中可以看见我们能在堆上分配的最大内存是必须小于64位（我的电脑）的寻址空间所能代表的最大大小

    pub const fn from_size_align(size: usize, align: usize) -> Result<Self, LayoutError> {
        if !align.is_power_of_two() {
            return Err(LayoutError);
        }

        // (power-of-two implies align != 0.)

        // Rounded up size is:
        //     上面的舍入大小的计算公式
        //     在这里，我们能够分配的最大内存是usize::MAX，因此基于此来防止溢出，这也意味着我们分配的内存的最大大小不能大过64位的寻址空间的最大值。
        //   size_rounded_up = (size + align - 1) & !(align - 1);
        //
        // We know from above that align != 0. If adding (align - 1)
        // does not overflow, then rounding up will be fine.
        //
        // Conversely, &-masking with !(align - 1) will subtract off
        // only low-order-bits. Thus if overflow occurs with the sum,
        // the &-mask cannot subtract enough to undo that overflow.
        //
        // Above implies that checking for summation overflow is both
        // necessary and sufficient.
        if size > usize::MAX - (align - 1) {
            return Err(LayoutError);
        }

        // SAFETY: the conditions for `from_size_align_unchecked` have been
        // checked above.
        unsafe { Ok(Layout::from_size_align_unchecked(size, align)) }
    }

    /// Creates a layout, bypassing all checks.
    ///
    /// # Safety
    ///
    /// This function is unsafe as it does not verify the preconditions from
    /// [`Layout::from_size_align`].
    #[stable(feature = "alloc_layout", since = "1.28.0")]
    #[rustc_const_stable(feature = "alloc_layout", since = "1.36.0")]
    #[must_use]
    #[inline]
    pub const unsafe fn from_size_align_unchecked(size: usize, align: usize) -> Self {
        // SAFETY: the caller must ensure that `align` is greater than zero.
        Layout { size_: size, align_: unsafe { NonZeroUsize::new_unchecked(align) } }
    }

对于单纯的new函数来说，就更为简单一些了：

const fn size_align<T>() -> (usize, usize) {
    (mem::size_of::<T>(), mem::align_of::<T>())
}
// 为了照顾没有好好优化的代码，才在未检查上面包装了一个new
pub const fn new<T>() -> Self {
    let (size, align) = size_align::<T>();
    // SAFETY: the align is guaranteed by Rust to be a power of two and
    // the size+align combo is guaranteed to fit in our address space. As a
    // result use the unchecked constructor here to avoid inserting code
    // that panics if it isn't optimized well enough.
    unsafe { Layout::from_size_align_unchecked(size, align) }
}

最后就是我们需要注意的，创建的对于数组的layout：

pub fn array<T>(n: usize) -> Result<Self, LayoutError> {
    let array_size = mem::size_of::<T>().checked_mul(n).ok_or(LayoutError)?;

    // SAFETY:
    // - Size: `array_size` cannot be too big because `size_of::<T>()` must
    //   be a multiple of `align_of::<T>()`. Therefore, `array_size`
    //   rounded up to the nearest multiple of `align_of::<T>()` is just
    //   `array_size`. And `array_size` cannot be too big because it was
    //   just checked by the `checked_mul()`.
    // - Alignment: `align_of::<T>()` will always give an acceptable
    //   (non-zero, power of two) alignment.
    Ok(unsafe { Layout::from_size_align_unchecked(array_size, mem::align_of::<T>()) })
}

一个工具函数

// 将align变为了一个悬挂指针，关键是这个指针是well-aligned的，一定是类型T的align的倍数
pub const fn dangling(&self) -> NonNull<u8> {
    // SAFETY: align is guaranteed to be non-zero
    unsafe { NonNull::new_unchecked(self.align() as *mut u8) }
}

有趣的一个方法

有一个方法，尚未完全理解透：

// Creates a layout describing the record for self followed by next, including any necessary padding to ensure that next will be properly aligned, but no trailing padding.
#[stable(feature = "alloc_layout_manipulation", since = "1.44.0")]
#[inline]
pub fn extend(&self, next: Self) -> Result<(Self, usize), LayoutError> {
    let new_align = cmp::max(self.align(), next.align());
    let pad = self.padding_needed_for(next.align());

    let offset = self.size().checked_add(pad).ok_or(LayoutError)?;
    let new_size = offset.checked_add(next.size()).ok_or(LayoutError)?;

    let layout = Layout::from_size_align(new_size, new_align)?;
    Ok((layout, offset))
}

// 例子：
pub fn repr_c(fields: &[Layout]) -> Result<(Layout, Vec<usize>), LayoutError> {
    let mut offsets = Vec::new();
    let mut layout = Layout::from_size_align(0, 1)?;
    for &field in fields {
        let (new_layout, offset) = layout.extend(field)?;
        layout = new_layout;
        offsets.push(offset);
    }
    // Remember to finalize with `pad_to_align`!
    Ok((layout.pad_to_align(), offsets))
}

数据结构与算法

内存布局：Layout

数据结构基本定义

细说内存对齐

Layout的构造

一个工具函数

有趣的一个方法