Computer Organization - Geek Time Computer Organization Column Notes (Part 2) - "💻 Computer Fundamentals"

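05|Computer Instructions
06|Instruction Jumps
07|Function Calls
08|ELF and Static Linking
09|Program Loading
- Memory Paging
- Exercise: How is a Java program loaded into memory?
10|Dynamic Linking
- Exercise
11|Binary Encoding
12|Understanding Circuits
13|Adders
14|Multipliers
15|Floating-Point and Fixed-Point Numbers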

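05|Computer Instructions

Instruction set
Intel and ARM architectures

Five common classes of instructions

Arithmetic instructions
Data transfer instructions
Logic instructions
Conditional branch instructions
Unconditional jump instructions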
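
As a rough illustration of the five classes, the C function below annotates each statement with the kind of instruction an unoptimized x86-64 build would typically emit for it. The function name classify_demo is made up for this sketch, and the exact instructions depend on the compiler and optimization level.

```c
/* Hypothetical sketch: each statement is annotated with the class of
 * instruction it would typically compile to without optimization. */
int classify_demo(int a, int b)
{
    int sum = a + b;          /* arithmetic: add */
    int copy = sum;           /* data transfer: mov between register and memory */
    int masked = copy & 0xFF; /* logic: and */

    if (masked > 100) {       /* conditional branch: cmp followed by a jcc */
        masked = 100;
    }

    goto done;                /* unconditional jump: jmp */
done:
    return masked;
}
```

Compiling this with gcc -g -c and inspecting it with objdump -d -M intel -S is one way to check which instructions actually appear.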

Geek Time Computer Organization Column Notes (Part 2) - Figure 1

Open Ubuntu on Windows 10
sudo apt-get install build-essential   # installs gcc
gcc --version
vim test.c
gcc -g -c test.c
objdump -d -M intel -S test.o

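06|Instruction Jumps

How does the CPU execute instructions?

PC register (program counter register), also called the instruction address register:
holds the memory address of the next instruction to be executed.
Instruction register: holds the instruction currently being executed.
Status register (condition code register): uses individual flag bits to record the results of the CPU's arithmetic or logic operations.
General-purpose registers: hold data, addresses, and so on; examples are integer registers, floating-point registers, vector registers, and address registers.

When a program runs, the CPU uses the address in the PC register to read the instruction from memory into the instruction register and executes it. The PC is then incremented by the length of that instruction, so the next instruction is read in sequence. A jump instruction modifies the address stored in the PC register.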
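
The fetch, increment, execute cycle can be sketched as a toy interpreter. The three-opcode instruction set, the struct insn layout, and the run function are all invented for this sketch; a real CPU works on encoded bytes rather than structs.

```c
#include <stddef.h>

/* Toy instruction set, invented for this sketch. */
enum opcode { OP_ADD, OP_JMP, OP_HALT };

struct insn {
    enum opcode op;
    int operand;   /* value to add, or index of the jump target */
};

/* Runs a program and returns the accumulator once OP_HALT is reached,
 * the PC runs off the end, or max_steps instructions have executed. */
int run(const struct insn *program, size_t len, int max_steps)
{
    size_t pc = 0;   /* program counter: index of the next instruction */
    int acc = 0;     /* a single general-purpose register */

    for (int step = 0; step < max_steps && pc < len; step++) {
        struct insn ir = program[pc]; /* fetch into the "instruction register" */
        pc += 1;                      /* advance by one instruction length */

        switch (ir.op) {
        case OP_ADD:
            acc += ir.operand;
            break;
        case OP_JMP:
            pc = (size_t)ir.operand;  /* a jump overwrites the PC */
            break;
        case OP_HALT:
            return acc;
        }
    }
    return acc;
}
```

For the program ADD 1, JMP 3, ADD 100, ADD 2, HALT, the jump skips the ADD 100 at index 2, so the accumulator ends at 3.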

Assembly for a switch/case statement

paradise@PradiseXPS:~$ cat switch.c
#include <stdio.h>
int main()
{
        int a = 3;
        int x = 0;
        switch(a)
        {
                case 0:
                        a = 0;
                        break;
                case 1:
                        a = 1;
                        break;
                case 2:
                        a = 2;
                        break;
                case 3:
                        a = 3;
                        break;
                case 4:
                        a = 4;
                        break;
                case 5:
                        a = 5;
                        break;
                case 6:
                        a = 6;
                        break;
                default:
                        a = 0;
        }
        return 0;
}
paradise@PradiseXPS:~$ gcc -g -c switch.c
paradise@PradiseXPS:~$ ls
print.c  print.o  switch.c  switch.o  test.c  test.o
paradise@PradiseXPS:~$ objdump -d -M intel -S switch.o
switch.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
#include <stdio.h>
int main()
{
   0:   55                      push   rbp
   1:   48 89 e5                mov    rbp,rsp
        int a = 3;
   4:   c7 45 f8 03 00 00 00    mov    DWORD PTR [rbp-0x8],0x3
        int x = 0;
   b:   c7 45 fc 00 00 00 00    mov    DWORD PTR [rbp-0x4],0x0
        switch(a)
  12:   83 7d f8 06             cmp    DWORD PTR [rbp-0x8],0x6
  16:   77 63                   ja     7b <main+0x7b>
  18:   8b 45 f8                mov    eax,DWORD PTR [rbp-0x8]
  1b:   48 8d 14 85 00 00 00    lea    rdx,[rax*4+0x0]
  22:   00
  23:   48 8d 05 00 00 00 00    lea    rax,[rip+0x0]        # 2a <main+0x2a>
  2a:   8b 04 02                mov    eax,DWORD PTR [rdx+rax*1]
  2d:   48 63 d0                movsxd rdx,eax
  30:   48 8d 05 00 00 00 00    lea    rax,[rip+0x0]        # 37 <main+0x37>
  37:   48 01 d0                add    rax,rdx
  3a:   ff e0                   jmp    rax
        {
                case 0:
                        a = 0;
  3c:   c7 45 f8 00 00 00 00    mov    DWORD PTR [rbp-0x8],0x0
                        break;
  43:   eb 3d                   jmp    82 <main+0x82>
                case 1:
                        a = 1;
  45:   c7 45 f8 01 00 00 00    mov    DWORD PTR [rbp-0x8],0x1
                        break;
  4c:   eb 34                   jmp    82 <main+0x82>
                case 2:
                        a = 2;
  4e:   c7 45 f8 02 00 00 00    mov    DWORD PTR [rbp-0x8],0x2
                        break;
  55:   eb 2b                   jmp    82 <main+0x82>
                case 3:
                        a = 3;
  57:   c7 45 f8 03 00 00 00    mov    DWORD PTR [rbp-0x8],0x3
                        break;
  5e:   eb 22                   jmp    82 <main+0x82>
                case 4:
                        a = 4;
  60:   c7 45 f8 04 00 00 00    mov    DWORD PTR [rbp-0x8],0x4
                        break;
  67:   eb 19                   jmp    82 <main+0x82>
                case 5:
                        a = 5;
  69:   c7 45 f8 05 00 00 00    mov    DWORD PTR [rbp-0x8],0x5
                        break;
  70:   eb 10                   jmp    82 <main+0x82>
                case 6:
                        a = 6;
  72:   c7 45 f8 06 00 00 00    mov    DWORD PTR [rbp-0x8],0x6
                        break;
  79:   eb 07                   jmp    82 <main+0x82>
                default:
                        a = 0;
  7b:   c7 45 f8 00 00 00 00    mov    DWORD PTR [rbp-0x8],0x0
        }
        return 0;
  82:   b8 00 00 00 00          mov    eax,0x0
}
  87:   5d                      pop    rbp
  88:   c3                      ret
paradise@PradiseXPS:~$
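
In the disassembly above, the switch begins with cmp ... 0x6 and ja, then loads an entry from a table and executes jmp rax. This is a jump table: one bounds check, then an indexed lookup, instead of a chain of comparisons. A rough C sketch of the same idea follows. It uses an array of function pointers rather than the code offsets gcc stores, and the handler names and dispatch function are made up for illustration.

```c
/* Hypothetical handlers, one per case label. */
static int case0(void) { return 0; }
static int case1(void) { return 1; }
static int case2(void) { return 2; }
static int case3(void) { return 3; }
static int case4(void) { return 4; }
static int case5(void) { return 5; }
static int case6(void) { return 6; }

typedef int (*handler)(void);

/* The jump table: entry i is the target for "case i". */
static const handler table[7] = {
    case0, case1, case2, case3, case4, case5, case6
};

int dispatch(int a)
{
    /* Mirrors "cmp ..., 0x6; ja default". Because the comparison is
     * unsigned, negative values also fall through to the default. */
    if ((unsigned)a > 6u)
        return 0;          /* default: a = 0 */

    return table[a]();     /* indexed lookup, then an indirect jump */
}
```

The cost of the lookup does not grow with the number of cases, which is why compilers tend to choose this form for dense case ranges.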

Supplementary material

paradise@PradiseXPS:~$ objdump -H
Usage: objdump <option(s)> <file(s)>
 Display information from object <file(s)>.
 At least one of the following switches must be given:
  -a, --archive-headers    Display archive header information
  -f, --file-headers       Display the contents of the overall file header
  -p, --private-headers    Display object format specific file header contents
  -P, --private=OPT,OPT... Display object format specific contents
  -h, --[section-]headers  Display the contents of the section headers
  -x, --all-headers        Display the contents of all headers
  -d, --disassemble        Display assembler contents of executable sections
  -D, --disassemble-all    Display assembler contents of all sections
  -S, --source             Intermix source code with disassembly
  -s, --full-contents      Display the full contents of all sections requested
  -g, --debugging          Display debug information in object file
  -e, --debugging-tags     Display debug information using ctags style
  -G, --stabs              Display (in raw form) any STABS info in the file
  -W[lLiaprmfFsoRtUuTgAckK] or
  --dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
          =frames-interp,=str,=loc,=Ranges,=pubtypes,
          =gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
          =addr,=cu_index,=links,=follow-links]
                           Display DWARF info in the file
  -t, --syms               Display the contents of the symbol table(s)
  -T, --dynamic-syms       Display the contents of the dynamic symbol table
  -r, --reloc              Display the relocation entries in the file
  -R, --dynamic-reloc      Display the dynamic relocation entries in the file
  @<file>                  Read options from <file>
  -v, --version            Display this program's version number
  -i, --info               List object formats and architectures supported
  -H, --help               Display this information
 The following switches are optional:
  -b, --target=BFDNAME           Specify the target object format as BFDNAME
  -m, --architecture=MACHINE     Specify the target architecture as MACHINE
  -j, --section=NAME             Only display information for section NAME
  -M, --disassembler-options=OPT Pass text OPT on to the disassembler
  -EB --endian=big               Assume big endian format when disassembling
  -EL --endian=little            Assume little endian format when disassembling
      --file-start-context       Include context from start of file (with -S)
  -I, --include=DIR              Add DIR to search list for source files
  -l, --line-numbers             Include line numbers and filenames in output
  -F, --file-offsets             Include file offsets when displaying information
  -C, --demangle[=STYLE]         Decode mangled/processed symbol names
                                  The STYLE, if specified, can be `auto', `gnu',
                                  `lucid', `arm', `hp', `edg', `gnu-v3', `java'
                                  or `gnat'
  -w, --wide                     Format output for more than 80 columns
  -z, --disassemble-zeroes       Do not skip blocks of zeroes when disassembling
      --start-address=ADDR       Only process data whose address is >= ADDR
      --stop-address=ADDR        Only process data whose address is <= ADDR
      --prefix-addresses         Print complete address alongside disassembly
      --[no-]show-raw-insn       Display hex alongside symbolic disassembly
      --insn-width=WIDTH         Display WIDTH bytes on a single line for -d
      --adjust-vma=OFFSET        Add OFFSET to all displayed section addresses
      --special-syms             Include special symbols in symbol dumps
      --inlines                  Print all inlines for source line (with -l)
      --prefix=PREFIX            Add PREFIX to absolute paths for -S
      --prefix-strip=LEVEL       Strip initial directory names for -S
      --dwarf-depth=N        Do not display DIEs at depth N or greater
      --dwarf-start=N        Display DIEs starting with N, at the same depth
                             or deeper
      --dwarf-check          Make additional dwarf internal consistency checks.
objdump: supported targets: elf64-x86-64 elf32-i386 elf32-iamcu elf32-x86-64 a.out-i386-linux pei-i386 pei-x86-64 elf64-l1om elf64-k1om elf64-little elf64-big elf32-little elf32-big pe-x86-64 pe-bigobj-x86-64 pe-i386 plugin srec symbolsrec verilog tekhex binary ihex
objdump: supported architectures: i386 i386:x86-64 i386:x64-32 i8086 i386:intel i386:x86-64:intel i386:x64-32:intel i386:nacl i386:x86-64:nacl i386:x64-32:nacl iamcu iamcu:intel l1om l1om:intel k1om k1om:intel plugin
The following i386/x86-64 specific disassembler options are supported for use
with the -M switch (multiple options should be separated by commas):
  x86-64      Disassemble in 64bit mode
  i386        Disassemble in 32bit mode
  i8086       Disassemble in 16bit mode
  att         Display instruction in AT&T syntax
  intel       Display instruction in Intel syntax
  att-mnemonic
              Display instruction in AT&T mnemonic
  intel-mnemonic
              Display instruction in Intel mnemonic
  addr64      Assume 64bit address size
  addr32      Assume 32bit address size
  addr16      Assume 16bit address size
  data32      Assume 32bit data size
  data16      Assume 16bit data size
  suffix      Always display instruction suffix in AT&T syntax
  amd64       Display instruction in AMD64 ISA
  intel64     Display instruction in Intel64 ISA
Report bugs to <http://www.sourceware.org/bugzilla/>.
paradise@PradiseXPS:~$

07|Function Calls

Stack overflow

Example program with a function call

paradise@PradiseXPS:~$ cat example.c
// function_example.c
#include <stdio.h>
int static add(int a,int b)
{
        return a + b;
}
int main()
{
        int x = 5;
        int y = 10;
        int u = add(x,y);
}
paradise@PradiseXPS:~$ objdump -d -M intel -S example.o
example.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <add>:
// function_example.c
#include <stdio.h>
int static add(int a,int b)
{
   0:   55                      push   rbp
   1:   48 89 e5                mov    rbp,rsp
   4:   89 7d fc                mov    DWORD PTR [rbp-0x4],edi
   7:   89 75 f8                mov    DWORD PTR [rbp-0x8],esi
        return a + b;
   a:   8b 55 fc                mov    edx,DWORD PTR [rbp-0x4]
   d:   8b 45 f8                mov    eax,DWORD PTR [rbp-0x8]
  10:   01 d0                   add    eax,edx
}
  12:   5d                      pop    rbp
  13:   c3                      ret
0000000000000014 <main>:
int main()
{
  14:   55                      push   rbp
  15:   48 89 e5                mov    rbp,rsp
  18:   48 83 ec 10             sub    rsp,0x10
        int x = 5;
  1c:   c7 45 f4 05 00 00 00    mov    DWORD PTR [rbp-0xc],0x5
        int y = 10;
  23:   c7 45 f8 0a 00 00 00    mov    DWORD PTR [rbp-0x8],0xa
        int u = add(x,y);
  2a:   8b 55 f8                mov    edx,DWORD PTR [rbp-0x8]
  2d:   8b 45 f4                mov    eax,DWORD PTR [rbp-0xc]
  30:   89 d6                   mov    esi,edx
  32:   89 c7                   mov    edi,eax
  34:   e8 c7 ff ff ff          call   0 <add>
  39:   89 45 fc                mov    DWORD PTR [rbp-0x4],eax
  3c:   b8 00 00 00 00          mov    eax,0x0
}
  41:   c9                      leave
  42:   c3                      ret
paradise@PradiseXPS:~$

As with a jump, the operand that follows the call instruction is the address of the code to transfer control to.
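
In the listing above, the call is encoded as e8 c7 ff ff ff: the opcode e8 followed by a 32-bit little-endian displacement, 0xffffffc7, which is -57 as a signed number. The displacement is relative to the address of the next instruction, so the target is 0x39 - 57 = 0, the start of add. The short jumps in the switch listing (for example eb 3d at address 0x43) work the same way with an 8-bit displacement. A small sketch of the arithmetic, with branch_target as a made-up helper name:

```c
#include <stdint.h>

/* Target of a relative call or jump: the signed displacement is added
 * to the address of the NEXT instruction (address + length). */
uint64_t branch_target(uint64_t insn_addr, unsigned insn_len, int32_t rel)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)rel;
}
```

branch_target(0x34, 5, -57) gives 0 for the call, and branch_target(0x43, 2, 0x3d) gives 0x82 for the first break in the switch listing.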

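The add function

The function begins with a push instruction and a mov instruction.
The function ends with a pop instruction and a ret instruction.
Together, these four instructions push onto and pop off the stack.

A function call differs from a conditional jump in that, after jumping to the target address, execution must later come back and continue from where it left off.
To support this, a region of memory is set aside and used as a stack, a LIFO (last in, first out) data structure.

Stack frame: all of the stack memory occupied by a function A is function A's stack frame.

push rbp pushes the base address of the caller's stack frame (here, main's) onto the top of the stack.
mov rbp,rsp copies the value of the stack pointer rsp into rbp; rsp always points to the top of the stack.

rbp - register base pointer (start of stack): the frame pointer, a register holding the location of the current stack frame
rsp - register stack pointer (current location in stack, growing downwards): the stack pointer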
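
To make the push/mov and pop/ret pairs more concrete, here is a small simulation of the stack bookkeeping. The struct machine type and every function name are invented for this sketch, the stack is an array of 64-bit words indexed from the top down, and there are no bounds checks.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical model of the two registers and the stack memory. */
struct machine {
    uint64_t stack[STACK_WORDS];
    uint64_t rsp;   /* index of the current top of stack; grows downward */
    uint64_t rbp;   /* index of the base of the current frame */
};

void machine_init(struct machine *m)
{
    m->rsp = STACK_WORDS;   /* empty stack: rsp is just past the end */
    m->rbp = STACK_WORDS;
}

static void push(struct machine *m, uint64_t v) { m->stack[--m->rsp] = v; }
static uint64_t pop(struct machine *m) { return m->stack[m->rsp++]; }

/* What happens on entry to a function. */
void enter_function(struct machine *m, uint64_t return_addr)
{
    push(m, return_addr);   /* call: save where to resume in the caller */
    push(m, m->rbp);        /* push rbp: save the caller's frame base */
    m->rbp = m->rsp;        /* mov rbp,rsp: start the new frame here */
}

/* What happens on exit; returns the address execution resumes at. */
uint64_t leave_function(struct machine *m)
{
    m->rsp = m->rbp;        /* drop any locals (the first half of leave) */
    m->rbp = pop(m);        /* pop rbp: restore the caller's frame base */
    return pop(m);          /* ret: pop the return address */
}
```

After a matching enter_function and leave_function, rsp and rbp are back at the caller's values, which is what lets main continue at address 0x39 after add returns. If calls nest too deeply, rsp eventually runs past the memory reserved for the stack, which is a stack overflow.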

Summary

This part left me rather confused.
I need to come back to it when I have time and review assembly language.

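08|ELF and Static Linking

The objdump command

Executable program
Object file
Linker

Running gcc with -o and without -c goes through the link step and writes the executable to the named output file.

"C code - assembly code - machine code"

Compile, assemble, link
The loader loads the executable file into memory; the CPU then reads instructions and data from memory and starts executing the program.

ELF format and linking: understanding the linking process

ELF (Executable and Linkable Format)

With a loader that can parse the PE format, it becomes possible to run Windows programs on Linux; this is what Wine does.
WSL (Windows Subsystem for Linux) can parse and load files in ELF format.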
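
A loader first has to recognize which format it has been given. ELF files start with the four bytes 0x7f 'E' 'L' 'F', while PE files start with the two bytes 'M' 'Z'. A minimal sketch of the ELF check, with has_elf_magic as a made-up function name:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf begins with the 4-byte ELF magic number, else 0. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}
```

This only identifies the format. A real loader goes on to read the ELF header and the program headers to decide what to map into memory.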

Exercise:

readelf reads a program's symbol table
objdump reads the relocation table

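09|Program Loading

Loading a program into memory:

The memory occupied by a loaded executable should be contiguous.
Many programs have to be loaded at the same time, and a program must not dictate where in memory it is loaded.

Memory segmentation

Memory fragmentation
Memory swapping

Memory paging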
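
With paging, memory is cut into fixed-size pages, and a virtual address splits into a page number and an offset within the page. The sketch below assumes 4 KiB pages and a flat, single-level page table stored as an array; both are simplifications chosen for illustration, and all names are made up.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                  /* assumed: 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Upper bits of the address select the page. */
uint64_t page_number(uint64_t vaddr)
{
    return vaddr >> PAGE_SHIFT;
}

/* Lower 12 bits are the byte offset inside the page. */
uint64_t page_offset(uint64_t vaddr)
{
    return vaddr & (PAGE_SIZE - 1);
}

/* page_table[vpn] holds the physical frame number for virtual page vpn.
 * The caller must make sure the page number is within the table. */
uint64_t translate(const uint64_t *page_table, uint64_t vaddr)
{
    uint64_t frame = page_table[page_number(vaddr)];

    return (frame << PAGE_SHIFT) | page_offset(vaddr);
}
```

Because only whole pages are mapped, a program's virtual pages can sit in scattered physical frames, and swapping moves one page at a time instead of a whole segment.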

Exercise: How is a Java program loaded into memory?

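10|Dynamic Linking

Dynamic linking
Static linking
Shared libraries

dll: dynamic-link library
so: shared object

Relative address

PLT: Procedure Linkage Table
GOT: Global Offset Table
The physical memory holding a shared library's code is shared, but each application that dynamically links the library gets its own copy of the data section.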
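
One way to picture how the PLT and GOT cooperate is as a call through a function pointer that gets patched on first use. The sketch below is a simplified model, not how the dynamic linker is implemented: real PLT entries are short assembly stubs, and every name here is invented.

```c
/* Hypothetical model of lazy binding through one GOT slot. */
typedef int (*func_ptr)(int);

static int resolve_count = 0;     /* how many times the resolver ran */

/* Stands in for a function that lives in a shared library. */
static int real_square(int x) { return x * x; }

static int resolver_stub(int x);  /* defined below */

/* The GOT slot: it initially points at the resolver. */
static func_ptr got_square = resolver_stub;

/* Plays the role of the dynamic linker: find the real address,
 * patch the GOT slot, then complete the original call. */
static int resolver_stub(int x)
{
    resolve_count++;
    got_square = real_square;
    return got_square(x);
}

/* Plays the role of the PLT entry: an indirect call through the GOT. */
int square_via_plt(int x)
{
    return got_square(x);
}

int times_resolved(void)
{
    return resolve_count;
}
```

The calling code never changes; only the data slot does. That is why the code pages can stay shared between processes while each process keeps its own GOT in its data section.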

Exercise

Saves memory space

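11|Binary Encoding

Integers; converting between binary and hexadecimal
Negative numbers

Sign-magnitude representation

Two's complement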
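
The two representations can be compared on 8-bit values. In sign-magnitude the top bit is only a sign flag; in two's complement the top bit carries a weight of -128. The function names below are made up for this sketch.

```c
#include <stdint.h>

/* Sign-magnitude: top bit is the sign, low 7 bits hold the absolute
 * value. Only meaningful for v in -127..127. */
uint8_t encode_sign_magnitude(int v)
{
    if (v < 0)
        return (uint8_t)(0x80u | (unsigned)(-v));
    return (uint8_t)v;
}

/* Two's complement: converting to an unsigned 8-bit type wraps
 * modulo 256, which yields exactly the two's complement bit pattern. */
uint8_t encode_twos_complement(int v)
{
    return (uint8_t)v;
}

/* Decode by giving the top bit a weight of -128 and the rest
 * their usual positive weights. */
int decode_twos_complement(uint8_t bits)
{
    return (bits & 0x7F) - (bits & 0x80);
}
```

For -5, sign-magnitude gives 0x85 (1000 0101) and two's complement gives 0xFB (1111 1011). Sign-magnitude also has two encodings of zero, 0x00 and 0x80, while in two's complement 0x80 is -128 and ordinary binary addition works unchanged for negative numbers.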

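ASCII: 128 of the 256 values of an 8-bit binary number are mapped to 128 distinct characters.

When storing data, use binary serialization. For integers and floating-point numbers alike, binary serialization can take much less space than storing the same values as text.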
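
The size difference is easy to measure for a 32-bit integer: as text, each decimal digit costs one byte, while the binary form is always 4 bytes. A small sketch, with text_size and binary_size as made-up helper names:

```c
#include <stdint.h>
#include <stdio.h>

/* Bytes needed to store v as decimal text (without a terminator). */
int text_size(int32_t v)
{
    char buf[16];   /* large enough for any 32-bit value plus sign */

    return snprintf(buf, sizeof buf, "%d", (int)v);
}

/* Bytes needed to store v as a fixed-width binary integer. */
int binary_size(int32_t v)
{
    return (int)sizeof v;
}
```

The largest 32-bit signed integer, 2147483647, takes 10 bytes as text and 4 bytes in binary. The saving applies to large values: a single digit such as 9 is only 1 byte as text, so fixed-width binary is not smaller in every case.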

Character set (charset)
Character encoding

UTF

Wielding two 锟斤拷 in hand, crying out 烫烫烫;
treading on a thousand 屯屯屯, laughing as all things turn to 锘锘锘
