Locks - Cache Line - Concurrent Programming

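The CPU does not read main memory one value at a time; it reads it in blocks.
When the ALU reads X, it first checks the L1 cache and returns the value if it is there. Otherwise it checks L2, then L3, and only then main memory. A read from main memory fetches the whole block that contains X (a cache line, typically 64 bytes), fills L3, L2 and L1 in that order, and then returns the value to the ALU.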
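
The block granularity described above can be sketched with a little address arithmetic. This is only an illustration: the class name CacheLineMath and the addresses 100, 120 and 130 are hypothetical, and the 64-byte line size is taken from the text.

```java
// Sketch: which 64-byte block does a given byte address fall into?
final class CacheLineMath {
    static final long LINE_SIZE = 64; // bytes per cache line, as stated in the text

    // Index of the cache line that contains the (non-negative) address.
    static long lineIndex(long address) {
        return address / LINE_SIZE;
    }

    // Address of the first byte of that cache line.
    static long lineStart(long address) {
        return lineIndex(address) * LINE_SIZE;
    }

    // True when both addresses are fetched together as one block.
    static boolean sameLine(long a, long b) {
        return lineIndex(a) == lineIndex(b);
    }

    public static void main(String[] args) {
        long x = 100; // hypothetical address of X
        System.out.println("line start: " + lineStart(x));
        System.out.println("shares a line with 120: " + sameLine(x, 120));
        System.out.println("shares a line with 130: " + sameLine(x, 130));
    }
}
```

Under these assumptions, reading the byte at address 100 pulls in bytes 64 to 127, so address 120 arrives with it while address 130 does not.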

// Two threads each write their own volatile long; the two T objects are small
// enough to end up in the same cache line.
public class T01_CacheLineDemo {
    private static class T {
        public volatile long x = 0L;
    }
    public static T[] arr = new T[2];
    static {
        arr[0] = new T();
        arr[1] = new T();
    }
    public static void main(String[] args) throws InterruptedException {
        // t1 only ever writes arr[0].x
        Thread t1 = new Thread(() -> {
            for (long i = 0; i < 10_000_000L; i++) {
                arr[0].x = i;
            }
        });
        // t2 only ever writes arr[1].x
        Thread t2 = new Thread(() -> {
            for (long i = 0; i < 10_000_000L; i++) {
                arr[1].x = i;
            }
        });
        final long start = System.nanoTime();
        t1.start(); t2.start();
        t1.join(); t2.join();
        // elapsed time in milliseconds
        System.out.println((System.nanoTime() - start) / 1_000_000);
    }
}

The run takes roughly 300 ms.

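In the first program, arr[0] and arr[1] are small objects that sit in the same cache line. Each of the 10 million writes that t1 makes to arr[0].x therefore has to invalidate the copy of that line held by the core running t2 (and the same happens in the other direction) so that memory stays coherent. The two threads never touch the same variable, yet they keep fighting over the same line. This effect is known as false sharing.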
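
The cost described above can be illustrated with a toy model that counts how often a cache line has to change owner. This is a deliberately simplified sketch, not how the hardware coherence protocol works in detail: the class LineOwnershipModel, the strictly alternating write order and the addresses used below are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: two cores take turns writing, and we count ownership transfers.
final class LineOwnershipModel {
    static final long LINE_SIZE = 64; // bytes per cache line, as stated in the text

    // Core 0 writes addr0 and core 1 writes addr1, alternating writesPerCore times.
    // Returns how many writes had to take the line away from the other core.
    static long countTransfers(long addr0, long addr1, long writesPerCore) {
        Map<Long, Integer> owner = new HashMap<>(); // cache line index -> last writer
        long[] addrs = {addr0, addr1};
        long transfers = 0;
        for (long i = 0; i < writesPerCore; i++) {
            for (int core = 0; core < 2; core++) {
                long line = addrs[core] / LINE_SIZE;
                Integer prev = owner.put(line, core);
                if (prev != null && prev != core) {
                    transfers++; // the other core owned this line: invalidate and refetch
                }
            }
        }
        return transfers;
    }

    public static void main(String[] args) {
        // Hypothetical addresses: 16 and 40 share line 0; 72 and 152 are in lines 1 and 2.
        System.out.println("same line:      " + countTransfers(16, 40, 1000));
        System.out.println("separate lines: " + countTransfers(72, 152, 1000));
    }
}
```

In this model every write after the first one steals the shared line from the other core, while writes to separate lines never interfere. Real threads do not alternate this neatly, so the model only shows the direction of the effect, not its size.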

// Same test as before, but each T now carries 56 bytes of padding in front of x.
public class T02_CacheLineDemo {
    // Seven unused longs (7 * 8 = 56 bytes) that only take up space.
    private static class Padding {
        public volatile long p1, p2, p3, p4, p5, p6, p7;
    }
    private static class T extends Padding {
        public volatile long x = 0L;
    }
    public static T[] arr = new T[2];
    static {
        arr[0] = new T();
        arr[1] = new T();
    }
    public static void main(String[] args) throws InterruptedException {
        // t1 only ever writes arr[0].x
        Thread t1 = new Thread(() -> {
            for (long i = 0; i < 10_000_000L; i++) {
                arr[0].x = i;
            }
        });
        // t2 only ever writes arr[1].x
        Thread t2 = new Thread(() -> {
            for (long i = 0; i < 10_000_000L; i++) {
                arr[1].x = i;
            }
        });
        final long start = System.nanoTime();
        t1.start(); t2.start();
        t1.join(); t2.join();
        // elapsed time in milliseconds
        System.out.println((System.nanoTime() - start) / 1_000_000);
    }
}

The run takes roughly 60 ms.

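In the second program, the parent class Padding declares 7 fields of type long. A long takes 8 bytes, so the 7 fields take 8*7=56 bytes. The subclass T adds the long field x, another 8 bytes, for 56+8=64 bytes of field data (the object header comes on top of that). A T object is therefore at least as large as one cache line, so the x fields of two different T objects are always at least 64 bytes apart and cannot fall in the same line. Writes by one thread no longer invalidate the line the other thread is writing, and the coherence traffic disappears.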
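
The arithmetic can be checked with a small sketch. It rests on assumptions that are not from the original text: the first long field is assumed to start 16 bytes into the object (header plus alignment), and the fields are assumed to be laid out in declaration order with the parent class first. Actual layouts depend on the JVM and its flags, and the name PaddingMath is hypothetical.

```java
// Sketch: can the x fields of two back-to-back objects ever share a cache line?
final class PaddingMath {
    static final long LINE_SIZE = 64; // bytes per cache line, as stated in the text
    static final long HEADER = 16;    // ASSUMPTION: offset of the first long field

    // Size in bytes of an object that holds the given number of long fields.
    static long objectSize(int longFields) {
        return HEADER + 8L * longFields;
    }

    // Two objects of this size placed back to back have their x fields exactly
    // objectSize bytes apart. Fields closer than one line can land in the same
    // line; fields at least one line apart never can.
    static boolean mayShareLine(int longFields) {
        return objectSize(longFields) < LINE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println("unpadded T (x only):  size " + objectSize(1)
                + ", may share a line: " + mayShareLine(1));
        System.out.println("padded T (p1..p7, x): size " + objectSize(8)
                + ", may share a line: " + mayShareLine(8));
    }
}
```

Under these assumptions the unpadded object is 24 bytes, so two of them fit in one line, while the padded object is 80 bytes, which keeps the two x fields in different lines wherever the objects are placed.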