Skip to content

L09 Machine-Level Programming V: Advanced Topics

Source: https://www.bilibili.com/video/BV1iW411d7hd?p=9

Today

​ 内存安排 Memory Layout

​ 缓冲区溢出 Buffer Overflow

​ 漏洞 Vulnerability

​ 防护 Protection

​ 联合体 Unions【这是一种数据结构】


x86-64 Linux 内存安排

--------------------- \  <- 0000 7FFF FFFF FFFF 由于硬件限制,内存仅使用 47 位(128TB)而非 64 位
|       Stack       |  \
---------------------   \
|         |         |    8MB 此为常用机器的设置,你可以使用 limit 命令查看得到  stacksize 这一数据
|         v         |   /
|                   |  /
--------------------- /
|                   |
|                   |
|                   |
---------------------
|  Shared Libraries |
---------------------
|                   |
|         ^         |
|         |         |
---------------------
|                   |       
|       Heap        |
|                   |
---------------------
|       Data        |
---------------------
|       Text        |
--------------------- <- Hex address 0000 4000 0000 0000
|                   | 
--------------------- <- Hex address 0000 0000 0000 0000

栈 Stack

​ Runtime stack (8MB limit)

​ 例如本地变量 e. g. local variables

堆 Heap【这部分用于存放所有因 malloc() 或其相关函数申请的变量】

​ Dynamically allocated as needed

​ When call malloc(), calloc(), new()

数据区 Data 【这部分存放了所有的全局变量】

​ (存储)静态分配数据 Statically allocated data

​ 例如全局变量、静态变量、字符串常量 e.g. global vars, static vars, string constants

文本段 Text / Shared Libraries 【文本段用于存放可执行程序】【SL 用于存放库函数对应的可执行程序,动态加载】

​ (存储)可执行机器指令 Executable machine instructions

​ 只读 Read-only

例子如下:

char big_array[1L<<24];       /*  16 MB */ // -> 0000 0000 8060 1060 Data
char huge_array[1L<<31];      /*   2 GB */ // -> 0000 0000 0060 1060 Data

int global = 0;

int useless() { return 0; }                // -> 0000 0000 0040 0590 Text

int main() {                               // -> 0000 0000 0040 060c Text
    void *p1, *p2, *p3, *p4;
    int local = 0;                         // -> 0000 7ffe 4d3b e87c Stack 
    p1 = malloc(1L << 28);    /* 256 MB */ // -> 0000 7f72 62a1 e010 Heap (Upper)
    p2 = malloc(1L << 8);     /* 256  B */ // -> 0000 0000 8359 d010 Heap (Lower)
    p3 = malloc(1L << 32);    /*   4 GB */ // -> 0000 7f71 62a1 d010 Heap (Upper)
    p4 = malloc(1L << 8);     /* 256  B */ // -> 0000 0000 8359 d120 Heap (Lower)
    /* Some print statements ... */
}

可以看到有上下两个堆,其扩展方向也相反,小空间接近数据区,大空间接近栈区,原因是 malloc() 使用 mmap 分配大块内存,使用 brk 分配小块内存。

缓冲区溢出

一般称作缓冲区溢出 Generally called a "buffer overflow"

​ 当超过了为数组分配的内存大小时 when exceeding the memory size allocated for an array

为什么很严重? Why a big deal?

​ 这是第一技术漏洞 It's the #1 technical cause of security vulnerability

​ 在所有起因中第一的时社会工程/用户忽略 #1 overall cause is social engineering / user ignorance

最常见的形式 Most common form

​ 未检查字符串输入的长度 Unchecked lengths on string inputs

​ 部分是因为栈上有界字符数组 Particularity for bounded character arrays on the stack

​ sometimes referred to as stack smashing

字符串库代码 String Library Code

Unix 中函数 gets() 的实现 Implementation of Unix function gets()

/* Get string form stdin */
char *gets(char *dest) {
    int c = getchar();
    char *p = dest;
    while (c != EOF && c != '\n') {
        *p++ = c;
        c = getchar();
    }
    *p = '\0';
    return dest;
}

不去特殊地限制读入字符数量 No way to specify limit on number of characters to read

其他库函数也有类似问题 Similar problems with other library functions

strcpy()strcat():复制一定长度的字符串 Copy strings of arbitrary length

scanf()fscanf()sscanf(),特别是使用 %s 时 when given %s conversion specifically.

有漏洞的缓冲区代码 Vulnerable Buffer Code

/* Echo Line */
void echo() {
    char buf[4];  /* Way too small! */
    gets(buf);
    puts(buf);
}
void call_echo() {
    echo();
}
unix>./bufdemo-nsp
Type a string:012345678901234567890123

缓冲区溢出反编译 Buffer Overflow Disassembly

0000 0000 0040 06cf <echo>:
  4006cf: 48 83 ec 18        sub     $0x18,%rsp
  4006d3: 48 89 e7           mov     %rsp,%rdi
  4006d6: e8 a5 ff ff ff     callq   400680 <gets>
  4006db: 48 89 e7           mov     %rsp,%rdi
  4006de: e8 3d fe ff ff     callq   400520 <puts@plt>
  4006e3: 48 83 c4 18        add     $0x18,%rsp
  4006e7: c3                 retq

call_echo()

  4006e8: 48 83 ec 08        sub     $0x8,%rsp
  4006ec: b8 00 00 00 00     mov     $0x0,%eax
  4006f1: e8 d9 ff ff ff     callq   4006cf <echo>
  4006f6: 48 83 c4 08        add     $0x8,%rsp
  4006fa: c3                 retq

缓冲区溢出到栈 Buffer Overflow Stack

--------------------- 
|    Stack Frame    |
|   for call_echo   |   
|                   |   
|                   |  
--------------------- 
|  Return Address   |
|     (8 Bytes)     |
---------------------
|  20 Bytes Unused  |
|                   |
---------------------
|  [3]|[2]|[1]|[0]  | buf <- %rsp
---------------------

如上图所示,我们输入 23 个字符,恰好不越界(字符串结尾自带值为 00 的 \0 字符),用光了整个缓冲区。

再加,则会溢出到返回地址区域,并引发段错误(segmentation fault)【当试图访问不被允许访问的内存区域就会段错误】

代码注入攻击 Code Injection Attacks

void P() {
    Q();
    ...   // <- return address
}
int Q() {
    char buf[64];
    gets(buf);
    ...
    return ...;
}
---------------------  ---
|                   |   ^
|                   |   |
|                   |   P stack frame
---------------------   |---
|         B         |   v ^
---------------------  ---|--- 
|        pad        |     | ^
|                   |     da|ta written by gets()
---------------------     | |
|      exploit      |     v Q stack frame
|       code        | <--B- |
---------------------       |
|                   |       v
---------------------      ---

换句话说,攻击者通过输入特定字符串,将程序返回地址篡改了,从而劫持了程序运行的控制权,进而执行任何该控制权可以执行的命令(即执行注入的代码),当这个命令为最高权限时,攻击者就完全掌握了这个机器。【此事在 NAD pwn 中亦有记载】

Input string contains byte representation of executable code

Overwrite return address A with address of buffer B

When Q() executes ret, will jump to exploit code.

代码注入攻击曾经是一个相当大的问题(你看这些输入字符串的库函数随随便便拉出来一个都有这漏洞),1988 年,第一个互联网攻击被称为「莫里斯蠕虫」(the Morris worm),当时互联网规模还比较小,所以它基本上感染了当时互联网上大部分机器(笑点解析:MIT6.824 分布式系统的课程的作者就是该病毒的作者)

基于缓冲区溢出的攻击 Exploits Based on Buffer Overflows

​ 缓冲区溢出漏洞能让远端机器在受害机上执行任意命令 Buffer overflow bugs can allow remote machines to execute arbitrary code on victim machines

​ Distressingly common in real programs

​ Programmers keep making the same mistakes.

​ Recent measures make these attacks much more difficult

​ Examples across the decades

​ Original "Internet worm" (1988)

​ "IM wars" (1999)

​ Twilight hack on Wii (2000s)

​ ... and many, many more

​ You will learn some of the tricks in attack lab

​ Hopefully to convince you to never leave such holes in your programs.

Example: the original Internet worm (1988)

​ Exploit a few vulnerability to spread

​ Early versions of the finger server (fingered) used gets() to read the argument sent by the client.

finger droh @cs.cmu.edu 【现在被广泛禁用的 finger 命令,用于向远端机器发送消息】

​ Worm attacked fingered server by sending phony argument:

finger "exploit-code padding new-returen-address"

​ exploit code: executed a root shell on the victim machine with a direct TCP connection to the attacker

​ Once on a machine, scanned for other machines to attack

​ invaded about 6000 computers in hours (10% of the Internet)

​ see June 1989 article in Comm. of the ACM

​ the young author of the worm was prosecuted...

​ and CERT was formed... still homed at CMU

Example 2: IM War

​ July, 1999

​ Microsoft launches MSN Messenger (instant messaging system)

​ Messenger clients can access popular AOL Instant Messaging Service (AIM) servers

                              AIM client
                                    ^
                                    |
                                    v 
MSN server <-> MSN client <-> AIM server
                                    ^
                                    |
                                    V
                              AIM client
August, 1999

​ Mysteriously, Messenger clients can no longer access AIM servers

​ Microsoft and AOL begin the IM war:

​ AOL changes server to disallow Messenger clients

​ Microsoft makes changes to clients to defeat AOL changes

​ At least 13 such skirmishes

​ What was really happening?

​ AOL had discovered a buffer overflow bug in their own AIM clients

​ They exploited it to detect and block Microsoft: the exploit code returned a 4-byte signature (the bytes at some location in the AIM client to server)

​ When Microsoft changed code to match signature, AOL changes signature location

AOL 利用漏洞偷信息这件事被一个自称为 Phil Bucking 的人披露,后来确认这封邮件是从微软内部发出的。

异同:蠕虫和病毒 Aside: Worms and Viruses

​ 蠕虫:一段这样的程序 Worm: A program that

​ 可以自行运行 Can run by itself

​ 可以传播一个自己的可工作版本到其他计算机上 Can propagate a fully working version of itself to other computers.

​ 病毒:这样的代码 Virus: Code that

​ 将自己添加到其他程序当中 Adds itself to other programs

​ 无法独立运行 Does not run independently

​ 两者都(通常)被设计出来在计算机之间传播,并且大肆破坏 Both are (usually) designed to spread among computers and to wreak havoc.

如何处理缓冲区溢出 OK, what to do about buffer overflow attacks

​ 1. 避免溢出漏洞 Avoid overflow vulnerabilities

​ 2. 利用系统级别防护 Employ system-level protections

​ 3. 编译器在栈上使用「金丝雀」Have compiler use "stack canaries"【金丝雀是一个常用比喻,以前矿工下煤矿,会带一只金丝雀等脆弱动物来探测矿洞内气体成分变化,人无法察觉出来的气体成分变化能够被它们察觉到,因此金丝雀用于比喻敏感的探测器。】

​ Lets talk about each...

  1. 避免溢出漏洞
/* Echo Line */
void echo() {
    char buf[4]; /* Way too small! */
    fgets(buf, 4, stdin);
    puts(buf);
}

例如,使用限制字符串长度的库函数 For example, use library routines that limit string lengths

​ 使用 fgets() 而不是 gets() fgets instead of gets

​ 使用 strncpy 而不是 strcpy strncpy instead of strcpy

​ 不要使用带 %sscanf() Don't use scanf with %s conversion specification

​ 使用 fgets() 函数来读入字符串 Use fgets to read the string

​ 或者使用 %ns,其中 n 是一个合适的整数 Or use %ns where n is a suitable integer.

  1. 系统级别防护 System-Level Protection can help

​ 随机栈偏移 Randomized stack offsets【这种技术也被称为「地址空间布局随机化」ASLR, address space layout randomization】

​ At start of program, allocate random amount of space on stack

​ Shifts stack addresses for entire program

​ Makes it difficult for hacker to predict beginning of inserted code

​ e.g. 5 executions of memory allocation code

​ stack repositioned each time program executes

​ 不可执行代码段 Nonexecutable code segments

​ In traditional x86, can mark region of memory as either "read-only" or "writeable"

​ Can execute anything readable

​ x86-64 added explicit "execute" permission【这逐渐演化成了 3 位的读写执权限码】

​ stack marked as nonexecutable

  1. 使用金丝雀 Stack Canaries can help

​ Idea

​ Place special value ("canary") on stack just beyond buffer

​ Check for corruption before exiting function

​ GCC Implementation

-fstack-pretector

​ Now the default (disabled earlier)

unix>./bufdemo-sp
Type a string:0123456
0123456

unix>./bufdemo-sp
Type a string:01234567
*** stack smashing detected ***

受保护的缓冲区反编译 Protected Buffer DIsassembly

echo:

  40072f: sub     $0x18,%rsp 
  400733: mov     %fs:0x28,%rax                  #
  40073c: mov     %rax,0x8(%rsp)                 #
  400741: xor     %eax,%eax
  400743: mov     %rsp,%rdi
  400746: callq   4006e0 <gets>
  40074b: mov     %rsp,%rdi
  40074e: callq   400570 <puts@plt>
  400753: mov     0x8(%rsp),%rax                 #
  400758: xor     %fs:0x28,%rax                  #
  400761: je      400768 <echo+0x39>
  400763: callq   400580 <__stack_chk_fail@plt>  #
  400768: add     $0x18,%rsp
  40076c: retq

针对以上常见保护措施,又有了新的攻击手段,其被称为面向返回编程(return oriented programming),成功率可观。前两种都可以比较轻易地破解,金丝雀比较难。

Return-Oriented Programming Attacks

​ Challenge (for hackers)

​ Stack randomization makes it hard to predict buffer location

​ Marking stack nonexecutable makes it hard to insert binary code

​ Alternative Strategy

​ Use existing code

​ e.g. library code from stdlib

​ string together fragments to achieve overall desired outcome

Does not over come stack canaries

​ Construct program from gadgets (gadget 指部分可执行程序的一系列字节)

​ Sequence of instructions ending in ret

​ Encoded by single byte 0xc3

​ Code positions fixed from run to run

​ Code is executable

【面向返回编程就是利用 gadget,gadget 利用各个可执行代码中 ret 及前面的一部分代码,因为返回地址可以得知,所以可以使用缓冲区溢出跳转到这些地方,然后每到一个地方执行一部分代码后返回,这个返回又跳转到下一个 gadget 处,由此拼凑出了可以完成目标的指令序列。】

联合体

联合体内存分配

根据最大元素分配,一次只能使用一个字段 Allocate according to largest element. Can only use one field at a time.

union U1 {
    char c;
    int i[2];
    double v;
} *up;
struct S1 {
    char c;
    int i[2];
    double v;
} *sp;
---
|c|
--+--+-----
|i[0]|i[1]|
-----+-----
|v        |
-----------
up+0      up+8

-----+----+----+----+---------
|c\3B|i[0]|i[1]\4B  |v       |
-----+----+----+----+---------
sp+0 sp+4 sp+8 sp+12sp+16    sp+24
typedef union {
    float f;
    unsigned u;
} bit_float_t;

float bit2float(unsigned u) {
    bit_float_t arg;
    arg.u = u;
    return arg.f;
}
unsigned float2bit(float f) {
    bit_float_t arg;
    arg.f = f;
    return arg.u;
} 
// 此处 u 和 f 的空间完全重叠。

Byte Ordering Revisited

​ Idea

​ Short/long/quad words stored in memory as 2/4/8 consecutive bytes

​ Which byte is most (least) significant?

​ Can cause problems when exchanging binary data between machines

​ 大端方式 Big Endian

​ Most significant byte has lowest address

​ Sparc

​ 小端方式 Little Endian

​ Least significant byte has lowest address

​ Intel x86, ARM Android and IOS

​ 双端方式 Bi Endian

​ Can be configured either way

​ ARM

联合体最初是用来解决字节序列切换问题的,让数据在大端小端之间切换。

IA32 使用小端,Sun 使用大端,x86-64 使用小端,总之现在大多数机器都是小端


Summary of Compound Types in C

​ Arrays

​ Contiguous allocation of memory

​ Aligned to satisfy every element's alignment requirement

​ Pointer to first element

​ No bounds checking

​ Structures

​ Allocate bytes in order declared

​ Pad in middle and at end to satisfy alignment

​ Unions

​ Overlay declaration

​ Way to circumvent type system