'pwntools Source Code Analysis'

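0x00 Preface

I have to say the comments in the pwntools source are well written. Every file and every function documents its purpose and comes with examples, which makes the code very friendly to read.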

0x01 The process class

This class is defined in the file pwntools/pwnlib/tubes/process.py

class process(tube):

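As we can see, process inherits from tube, which is defined in tube.py in the same directory. The tube class defines many of the commonly used functions, such as send() and sendline(). If you want to add a custom function, the tube class is the place to write it.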
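
To make the inheritance idea concrete, here is a toy model written for this post. It is not pwntools' actual code: the class names Tube and RecordingTube and the helper send_hex() are made up for illustration. The point is only that sendline() is built on send(), so a helper added in one place works for every kind of tube.

```python
class Tube:
    """Toy stand-in for a tube base class (hypothetical, not pwntools code)."""
    newline = b"\n"

    def send_raw(self, data):
        # Each concrete tube decides how bytes actually leave the process.
        raise NotImplementedError

    def send(self, data):
        self.send_raw(data)

    def sendline(self, data=b""):
        # sendline() is just send() plus a newline.
        self.send(data + self.newline)


class RecordingTube(Tube):
    """Concrete tube that records what was sent instead of doing real I/O."""

    def __init__(self):
        self.sent = []

    def send_raw(self, data):
        self.sent.append(data)

    def send_hex(self, value):
        # A custom helper (hypothetical): send an integer as a hex line.
        self.sendline(b"%#x" % value)


t = RecordingTube()
t.send_hex(255)
print(t.sent)
```

Subclassing like this keeps the custom helper out of the library source, which is usually easier to maintain than editing tube.py directly.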

0x02 The fmtstr class

This class is defined in pwntools/pwnlib/fmtstr.py

def __init__(self, execute_fmt, offset = None, padlen = 0, numbwritten = 0):
    """
    Instantiates an object which try to automating exploit the vulnerable process

    Arguments:
        execute_fmt(function): function to call for communicate with the vulnerable process
        offset(int): the first formatter's offset you control
        padlen(int): size of the pad you want to add before the payload
        numbwritten(int): number of already written bytes
    """
    self.execute_fmt = execute_fmt
    self.offset = offset
    self.padlen = padlen
    self.numbwritten = numbwritten

    if self.offset == None:
        self.offset, self.padlen = self.find_offset()
        log.info("Found format string offset: %d", self.offset)

    self.writes = {}
    self.leaker = MemLeak(self._leaker)

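First, note that __init__ takes four parameters:

  • execute_fmt: a function we define ourselves. It sends a payload to the target program, triggers the format string vulnerability, and returns the output. That sounds a bit abstract, lol.
  • offset: the index of the first printf argument whose content you control.
  • padlen: the number of padding bytes placed before the payload.
  • numbwritten: the number of bytes printf has already output.

An example makes this clearer.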

//gcc  -z noexecstack -fstack-protector-all -fpie -pie -z now stack_printf_format.c -o stack_printf_format
#include<stdio.h>
#include<string.h>
void init(){
    setbuf(stdout,0);
    setbuf(stdin,0);
    setbuf(stderr,0);
}
int main(){
    init();
    char op[0x10];
    char context[0x100];
    memset(context,0,0x100);
    memset(op,0,0x10);
    while(1){
        puts("do you want input something");
        fgets(op,0x10,stdin);
        if(!strcmp(op,"yes\n")){
            puts("ok,input something,I will call back for you");
            fgets(context,0x100,stdin);
            printf(context);
        }else{
            break;
        }
    }
    return 0;
}

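We can see that the line

    printf(context);

has a format string vulnerability. Because it sits inside a loop, we should define a function that performs one full round of the interaction, for example: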

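# p is assumed to be an existing tube, e.g. p = process('./stack_printf_format')
def execute_fmt(payload):
    # One pass through the loop: answer "yes", then send the format string.
    p.sendlineafter(b'do you want input something\n', b'yes')
    p.sendlineafter(b'ok,input something,I will call back for you\n', payload)
    msg = p.recvuntil(b'\n')
    print("msg=%r" % msg)          # debug output
    print("payload=%r" % payload)  # debug output
    return msg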

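The function must return a value: the output produced by the format string. The two print calls are only there for debugging.

If offset is not given, the constructor finds both offset and padlen itself by calling find_offset().

Now let's look at find_offset():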

def find_offset(self):
    marker = cyclic(20)
    for off in range(1,1000):
        leak = self.leak_stack(off, marker)
        leak = pack(leak)

        pad = cyclic_find(leak)
        if pad >= 0 and pad < 20:
            return off, pad
    else:
        log.error("Could not find offset to format string on stack")
        return None, None

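It first generates a 20-byte cyclic pattern to use as the marker.

It then tries offsets 1 through 999. For each one it calls leak_stack() to read the stack word at that offset and looks the leaked bytes up in the pattern with cyclic_find(). If the lookup succeeds and the position is in range, it returns the offset together with that position as the padding length; otherwise it moves on to the next offset. If no offset matches, the else branch of the loop reports an error. In pwntools log.error() raises an exception, so the trailing return None, None is normally never reached.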
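
The search can be tried out without a real target. The sketch below is a simplified simulation, not pwntools code: it assumes a 32-bit little-endian target, models the stack as a plain bytes object, and uses bytes.find() as a stand-in for cyclic_find(). The names fake_leak_stack and the sample stack layout are invented for this example.

```python
import struct

WORD = 4                          # assumed 32-bit target
MARKER = b"aaaabaaacaaadaaaeaaa"  # the bytes pwntools' cyclic(20) yields by default


def fake_leak_stack(stack, off):
    """Return the word that %<off>$p would print on the fake stack."""
    start = (off - 1) * WORD
    return struct.unpack("<I", stack[start:start + WORD])[0]


def find_offset(stack):
    for off in range(1, len(stack) // WORD + 1):
        leak = struct.pack("<I", fake_leak_stack(stack, off))
        pad = MARKER.find(leak)   # stand-in for cyclic_find()
        if 0 <= pad < 20:
            return off, pad
    return None, None


# 17 junk bytes, then our buffer: it starts one byte past a word boundary.
stack = b"\x11" * 17 + MARKER + b"\x11" * 7
print(find_offset(stack))
```

Here the first word that lies entirely inside the buffer is argument 6, and it begins 3 bytes into the marker, so the result is offset 6 with padlen 3. Those 3 bytes are exactly the padding needed to line the payload up with a word boundary.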

Next, let's look at leak_stack():

def leak_stack(self, offset, prefix=b""):
    leak = self.execute_fmt(prefix + b"START%%%d$pEND" % offset)
    try:
        leak = re.findall(br"START(.*)END", leak, re.MULTILINE | re.DOTALL)[0]
        leak = int(leak, 16)
    except ValueError:
        leak = 0
    return leak

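This function runs execute_fmt() once, uses a regular expression to extract whatever sits between START and END in the output (that is, the stack content at the given offset), parses it as a hexadecimal integer, and returns it. If the text is not valid hex, it returns 0.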
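
The parsing step can be checked in isolation. The sketch below reuses the same regular expression and int() call as leak_stack(), wrapped in a helper called parse_leak() (a name invented here); the sample outputs are made up to resemble what glibc's printf prints for %p.

```python
import re


def parse_leak(output):
    """Extract the hex value between START and END, or 0 if it is not hex."""
    try:
        leak = re.findall(br"START(.*)END", output, re.MULTILINE | re.DOTALL)[0]
        return int(leak, 16)
    except ValueError:
        return 0


print(hex(parse_leak(b"aaaaSTART0x61616161END\n")))
print(parse_leak(b"aaaaSTART(nil)END\n"))
```

A null pointer printed as (nil) fails the hex conversion and falls back to 0. Note that only ValueError is caught: if the output contains no START...END pair at all, indexing the empty match list raises IndexError instead.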

That completes the analysis of __init__().

So how do we write to an address? With write() and execute_writes().

First, write():

def write(self, addr, data):
    self.writes[addr] = data

It simply adds an entry to the writes dictionary; nothing is sent yet. The sending happens in execute_writes():

def execute_writes(self):
    fmtstr = randoms(self.padlen).encode()
    fmtstr += fmtstr_payload(self.offset, self.writes, numbwritten=self.padlen, write_size='byte')
    self.execute_fmt(fmtstr)
    self.writes = {}
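
The queue-then-flush pattern of write() and execute_writes() can be sketched without pwntools. Everything below is a toy: WriteQueue and the build_payload callback are invented for illustration, and the callback only stands in for fmtstr_payload(); it does not produce a real format string.

```python
class WriteQueue:
    """Toy model of the write()/execute_writes() pair (not pwntools code)."""

    def __init__(self, execute_fmt, build_payload, padlen=0):
        self.execute_fmt = execute_fmt
        self.build_payload = build_payload  # stand-in for fmtstr_payload()
        self.padlen = padlen
        self.writes = {}

    def write(self, addr, data):
        # Only records the request; nothing is sent.
        self.writes[addr] = data

    def execute_writes(self):
        # Padding first, then one payload covering every queued write.
        fmtstr = b"P" * self.padlen + self.build_payload(self.writes)
        self.execute_fmt(fmtstr)
        self.writes = {}


sent = []
describe = lambda w: b"|".join(b"%#x=%#x" % kv for kv in sorted(w.items()))
q = WriteQueue(sent.append, describe, padlen=2)
q.write(0x1000, 0x41)
q.write(0x2000, 0x42)
print(sent)        # still empty: writes are only queued
q.execute_writes()
print(sent)
```

Batching matters because all queued writes end up in a single payload, so the target only has to be triggered once.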

execute_writes() calls fmtstr_payload():

def fmtstr_payload(offset, writes, numbwritten=0, write_size='byte', write_size_max='long', overflows=16, strategy="small", badbytes=frozenset(), offset_bytes=0):
    sz = WRITE_SIZE[write_size]
    szmax = WRITE_SIZE[write_size_max]
    all_atoms = make_atoms(writes, sz, szmax, numbwritten, overflows, strategy, badbytes)

    fmt = b""
    for _ in range(1000000):
        data_offset = (offset_bytes + len(fmt)) // context.bytes
        fmt, data = make_payload_dollar(offset + data_offset, all_atoms, numbwritten=numbwritten)
        fmt = fmt + cyclic((-len(fmt)-offset_bytes) % context.bytes)

        if len(fmt) + offset_bytes == data_offset * context.bytes:
            break
    else:
        raise RuntimeError("this is a bug ... format string building did not converge")

    return fmt + data
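
The loop in fmtstr_payload() exists because of a circular dependency: the addresses are appended after the format string, so the argument indices depend on the format string's length, and that length in turn depends on how many digits the indices have. The sketch below reproduces only this fixed-point iteration. It is a simplification under stated assumptions: an 8-byte word, a made-up make_fmt() that emits one %k$hhn per write in place of make_payload_dollar(), and b"A" padding in place of cyclic().

```python
WORD = 8  # assumed 64-bit target (plays the role of context.bytes)


def make_fmt(first_arg, n_writes):
    """Hypothetical stand-in for make_payload_dollar(): one %k$hhn per write."""
    return b"".join(b"%%%d$hhn" % (first_arg + i) for i in range(n_writes))


def converge(offset, n_writes, offset_bytes=0):
    fmt = b""
    for _ in range(1000):
        # Guess where the addresses start, based on the current length.
        data_offset = (offset_bytes + len(fmt)) // WORD
        fmt = make_fmt(offset + data_offset, n_writes)
        # Pad so the addresses that follow are word-aligned.
        fmt += b"A" * ((-len(fmt) - offset_bytes) % WORD)
        if len(fmt) + offset_bytes == data_offset * WORD:
            return fmt, offset + data_offset
    raise RuntimeError("format string building did not converge")


fmt, first_arg = converge(offset=6, n_writes=4)
print(fmt, first_arg)
```

With offset 6 and four writes, the first guess puts the addresses at argument 6, the second at argument 9, and the third at argument 10, where the padded length (32 bytes, i.e. 4 words) finally agrees with the guess.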

Another function worth analyzing is fmtstr_split():

def fmtstr_split(offset, writes, numbwritten=0, write_size='byte', write_size_max='long', overflows=16, strategy="small", badbytes=frozenset()):
    if write_size not in ['byte', 'short', 'int']:
        log.error("write_size must be 'byte', 'short' or 'int'")
    if write_size_max not in ['byte', 'short', 'int', 'long']:
        log.error("write_size_max must be 'byte', 'short', 'int' or 'long'")
    sz = WRITE_SIZE[write_size]
    szmax = WRITE_SIZE[write_size_max]
    atoms = make_atoms(writes, sz, szmax, numbwritten, overflows, strategy, badbytes)
    return make_payload_dollar(offset, atoms, numbwritten)

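The important parameters:

  • offset: the offset of the first argument you control
  • writes: a dictionary of {addr: val} pairs
  • numbwritten: the number of bytes printf has already output
  • write_size: the smallest unit written at a time; the check at the top of the function accepts three options
    • byte: %hhn
    • short: %hn
    • int: %n
  • write_size_max: the largest unit written at a time; it additionally accepts long
    • byte: 1 byte
    • short: 2 bytes
    • int: 4 bytes
    • long: 8 bytes (%lln)

The return value is a pair: the format string and the packed addresses. When the buffer is not on the stack, you can use fmtstr_split() to build the payload and place the two parts yourself.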
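
To show what write_size means in practice, here is a deliberately simplified builder. It is not the algorithm pwntools uses (there is no atom merging, no sorting, no bad-byte handling); it assumes a 32-bit little-endian target and that the index of the first address argument is already known. The names split_value and build_split, the address 0x0804a000, and the argument index 7 are all made up for the example. Like fmtstr_split(), it returns the format string and the packed addresses separately.

```python
import struct

SPECIFIER = {1: b"hhn", 2: b"hn", 4: b"n"}  # byte, short, int


def split_value(addr, value, size, total=4):
    """Cut a little-endian value into (address, chunk) pairs of `size` bytes."""
    raw = value.to_bytes(total, "little")
    return [(addr + i, int.from_bytes(raw[i:i + size], "little"))
            for i in range(0, total, size)]


def build_split(first_arg, chunks, size, numbwritten=0):
    modulus = 256 ** size        # the counter wraps at the write width
    written = numbwritten
    fmt = b""
    for i, (_, target) in enumerate(chunks):
        pad = (target - written) % modulus
        if pad:
            fmt += b"%%%dc" % pad            # print `pad` more characters
            written += pad
        fmt += b"%%%d$" % (first_arg + i) + SPECIFIER[size]
    data = b"".join(struct.pack("<I", addr) for addr, _ in chunks)
    return fmt, data


chunks = split_value(0x0804a000, 0x0804a010, size=1)
fmt, data = build_split(7, chunks, size=1)
print(fmt)
print(data.hex())
```

With byte-sized writes the value 0x0804a010 becomes the four chunks 0x10, 0xa0, 0x04 and 0x08. Each %Nc raises the output counter until its low byte equals the next chunk, wrapping modulo 256 where needed, and each %hhn stores that low byte at the matching address.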

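0x00 Summary

The pwntools fmtstr module handles the case where the format string buffer is on the stack very well. When the buffer is not on the stack, it takes more work and you have to build the payload yourself.

0x03 The ROP class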