September 22, 2022

GoogleCTF2022 eldar

翻译了国外一个大神的WP,我什么时候能有这么强😭

原文链接:GoogleCTF2022 - eldar
复现强网杯的deeprev的时候知道的这题,本文是照着复现,翻译了一部分,部分名词我会标注英文原文。很多地方翻的很别扭,因为本人的英语水平实在是太🌶️🐔了……
墙裂建议阅读原文 不然看我写的可能只会收获一头雾水。
其实感觉还是搞得不是很明白,以后如果顿悟了再过来补充(x

涉及到的技术:
“Weird Machines” in ELF: A Spotlight on the Underappreciated Metadata
Programming Weird Machines with ELF Metadata

ELF重定位

ELF动态重定位的结构如下:

1
2
3
4
5
typedef struct {
Elf64_Addr r_offset;
Elf64_Xword r_info;
Elf64_Sxword r_addend;
} Elf64_Rela;

info包括一个type和symble(可选)索引:

1
2
#define ELF64_R_SYM(info)             ((info)>>32)
#define ELF64_R_TYPE(info) ((Elf64_Word)(info))

动态符号表也是相关的,结构如下:

1
2
3
4
5
6
7
8
typedef struct {
Elf64_Word st_name;
unsigned char st_info;
unsigned char st_other;
Elf64_Half st_shndx;
Elf64_Addr st_value;
Elf64_Xword st_size;
} Elf64_Sym;

所有重定位信息都写入r_offset指向的地址。写入的值取决于重定位类型。

eldar使用了以下4种重定位信息:

REL (type 8)

将重定位信息设置为一个相对基址的值(但是基址始终为0)

1
*(uint64_t *)r.r_offset = r.r_addend;

R64(type 1)

将重定位信息设置为 符号值+常数 (原文中是addend)

1
*(uint64_t *)r.r_offset = symbols[ELF64_R_SYM(r.r_info)] + r.r_addend;

R32 (type 10)

和R64一样,只不过被截断为32位。

1
*(uint32_t *)r.r_offset = symbols[ELF64_R_SYM(r.r_info)].st_value + r.r_addend;

CPY(type 5)

从符号地址复制n字节到重定位地址

1
2
Elf64_Sym sym = symbols[ELF64_R_SYM(r.r_info)];
memcpy(r.r_offset, sym.st_value, sym.st_size);

Part1 解析

我发现lief解析重定位数据和符号信息非常方便

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
import lief
from collections import namedtuple

b = lief.ELF.parse('./eldar')

def to_sym(name):
assert len(name) == 1
return ord(name[0])

Rel = namedtuple('REL', ['dst','val','ridx'])
Copy = namedtuple('CPY', ['dst', 'symbol', 'ridx'])
R64 = namedtuple('R64', ['dst','symbol','addend','ridx'])
R32 = namedtuple('R32', ['dst','symbol','addend','ridx'])

def parse(b) -> list:
print('[*] Loading relocations...')
relocs = list(b.relocations)

print('[*] Parsing...')
instructions = []
for i in range(3, len(relocs)):
r = relocs[i]
match r.type:
case 1: # R64
instructions.append(
R64(r.address, to_sym(r.symbol.name), r.addend, i))
case 5: # CPY
instructions.append(
Copy(r.address, to_sym(r.symbol.name), i))
case 8: # REL
instructions.append(
Rel(r.address, r.addend, i))
case 10: # R32
instructions.append(
R32(r.address, to_sym(r.symbol.name), r.addend, i))

return instructions

def dump(instructions):
for op in instructions:
match op:
case Rel(dst, val, ridx):
print(f'[{ridx:04d}] :: rel {dst}, {val}')
case Copy(dst, symbol, ridx):
print(f'[{ridx:04d}] :: copy {dst}, {symbol}')
case R64(dst, symbol, addend, ridx):
print(f'[{ridx:04d}] :: r64 {dst}, {symbol} + {addend}')
case R32(dst, symbol, addend, ridx):
print(f'[{ridx:04d}] :: r64 {dst}, {symbol} + {addend}')

instructions = parse(b)
dump(instructions)

结果:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
...
[0354] :: copy 8405268, 2
[0355] :: rel 8405148, 8405268
[0356] :: rel 8405156, 8
[0357] :: copy 8414226, 2
[0358] :: r64 8405196, 4 + 0
[0359] :: rel 8414226, 0
[0360] :: r64 8405197, 1 + 0
[0361] :: rel 8405148, 8405690
[0362] :: copy 8405244, 2
[0363] :: rel 8405148, 8405196
[0364] :: copy 8414394, 2
[0365] :: r64 8405220, 4 + 0
[0366] :: rel 8414394, 0
[0367] :: rel 8405148, 8405220
[0368] :: copy 8414490, 2
[0369] :: r64 8405220, 5 + 0
[0370] :: rel 8414490, 0
[0371] :: copy 8414562, 2
[0372] :: r64 8405220, 5 + 0
[0373] :: rel 8414562, 0
[0374] :: r64 8405220, 5 + 8405690
[0375] :: copy 8405690, 5
[0376] :: copy 8414666, 2
[0377] :: r64 0, 6 + 0
[0378] :: rel 8405148, 8405698
...
[101463] :: copy 10840770, 2
[101464] :: r64 8405244, 6 + 0
[101465] :: rel 10840770, 0
[101466] :: rel 8405148, 8405244
[101467] :: copy 10840866, 2
[101468] :: r64 8405244, 6 + 0
[101469] :: rel 10840866, 0
[101470] :: rel 8405148, 8405220
[101471] :: copy 10840962, 2
[101472] :: r64 8405244, 6 + 0
[101473] :: rel 10840962, 0
[101474] :: rel 8405148, 8405244
[101475] :: copy 10841058, 2
[101476] :: r64 8405244, 6 + 0
[101477] :: rel 10841058, 0
[101478] :: copy 10841130, 2
[101479] :: r64 8405244, 6 + 0
[101480] :: rel 10841130, 0
[101481] :: rel 8405148, 8405220

10w多行重定位信息对于人类来说还是太抽象了 (这句话是我写的
这时我注意到了一些奇怪的重定位信息:[0377] :: r64 0, 6 + 0
这是往一个空指针里写东西,但是程序没🐔,所以可能是一些重定位信息在执行时会被修改。

我们可以检查数字落在哪个范围内,如果它位于.rela.dyn.dynsym部分内,则为其分配一个符号值,而不是使用原始数字。我们也知道flag(serial符号)在[0x404040:0x40405d]fail0x404060

定义我们的符号值:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
@dataclass
class Symbol(object):
idx: int
def __repr__(self):
return f's{self.idx}'

@dataclass
class Reloc(object):
idx: int
def __repr__(self):
return f'r{self.idx}'

@dataclass
class Ref(object):
val: Any
def __repr__(self):
return f'&{self.val}'

@dataclass
class SymAddr(object):
sym: Symbol
field: str
def __repr__(self):
return f'{self.sym}.{self.field}'

@dataclass
class RelocAddr(object):
reloc: Reloc
field: str
def __repr__(self):
return f'{self.reloc}.{self.field}'

def vaddr(self):
off = 0
match self.field:
case 'r_address':off = 0
case 'r_info': off = 8
case 'r_addend': off = 16

return (self.reloc.idx * 24) + off + rela.virtual_address

@dataclass
class FlagAddr(object):
idx: int
def __repr__(self):
return f'flag[{self.idx}]'

BaseAddr = namedtuple('baseaddr', [])
FailAddr = namedtuple('fail', [])

然后再写一个函数format_addr把虚拟地址转为符号值:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
rela = [x for x in b.sections if x.name == '.rela.dyn'][0]
dynsym = [x for x in b.sections if x.name == '.dynsym'][0]

def format_addr(addr: int):
if (addr >= rela.virtual_address
and addr < rela.virtual_address + rela.size
):
offset = addr - rela.virtual_address
r_offset = (offset // 24)
r_rem = offset % 24

match r_rem:
case 0: return RelocAddr(Reloc(r_offset), 'r_address')
case 8: return RelocAddr(Reloc(r_offset), 'r_info')
case 16: return RelocAddr(Reloc(r_offset), 'r_addend')
case _: return RelocAddr(Reloc(r_offset), r_rem)
elif (addr > dynsym.virtual_address
and addr < dynsym.virtual_address + dynsym.size
):
offset = addr - dynsym.virtual_address
r_offset = (offset // 24)
r_rem = offset % 24

match r_rem:
case 0: return SymAddr(Symbol(r_offset), 'st_name')
case 8: return Symbol(r_offset)
case 16: return SymAddr(Symbol(r_offset), 'st_size')
case _: return SymAddr(Symbol(r_offset), r_rem)
elif addr >= 0x404040 and addr < 0x404040+29:
off = addr-0x404040
return FlagAddr(off)
elif addr == 0x804000:
return BaseAddr()
elif addr == 0x404060:
return FailAddr()
else:
return addr

解析时,对每一个数值先调用函数format_addr(此处未展示)。可以得到以下内容:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
// Interesting section of NOPs:
[0083] :: rel baseaddr(), 0
[0084] :: rel baseaddr(), 0
[0085] :: rel baseaddr(), 0
[0086] :: rel baseaddr(), 0
[0087] :: rel baseaddr(), 0
[0088] :: rel baseaddr(), 0
[0089] :: rel baseaddr(), 0

// Writing some consecutive values into previously
// applied relocations:
[0090] :: rel r3.r_address, 0
[0091] :: rel r3.r_info, 1
[0092] :: rel r3.r_addend, 2
[0093] :: rel r4.r_address, 3
[0094] :: rel r4.r_info, 4
[0095] :: rel r4.r_addend, 5
[0096] :: rel r5.r_address, 6
[0097] :: rel r5.r_info, 7
[0098] :: rel r5.r_addend, 8
[0099] :: rel r6.r_address, 9
[0100] :: rel r6.r_info, 10
[0101] :: rel r6.r_addend, 11

// Much more readable code!
[0343] :: rel r87.r_info, 253
[0344] :: rel r87.r_addend, 254
[0345] :: rel r88.r_address, 255
[0346] :: rel s4, 0
[0347] :: rel s2, r3.r_address
[0348] :: rel s2.st_size, 8
[0349] :: copy r350.r_addend, s2
[0350] :: r64 s4, s4 + 0
[0351] :: rel r350.r_addend, 0
[0352] :: rel s2, flag[0]
[0353] :: rel s2.st_size, 1

我们还能看到类似的东西:

1
2
[0349] :: copy r350.r_addend, s2
[0350] :: r64 s4, s4 + 0

349覆写了350中的加数。实际上,这两句话应该是s4 += s2(0被覆写为s2)

下面也有相似的覆写地址的操作:

1
2
3
[0376] :: copy r377.r_address, s2
[0377] :: r64 0, s6 + 0
//s2 = s6

如果我们对这些结构进行模式匹配并定义更高级别的操作,我们可能会得到一些更简洁的代码

Part2

不要问我为什么不翻译标题,因为这句话我也没看懂!😭(如果有知道是什么意思的师傅请务必告诉我

Arr + Out references

在解析过程中,我注意到一些重定位信息被作为NOP使用(向固定地址写入0)。其他重定位信息也会向那些重定位点(如一个uint64_t数组)写入任意的值。

特别指出,r3到r88似乎被用作一个数组。r89似乎是某种输出值(之后会说)。

我们可以修改format_addr来理解这些引用:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
@dataclass
class OutAddr(object):
idx: int

def __repr__(self):
return f'out[{self.idx}]'

@dataclass
class ArrAddr(object):
idx: int

def __repr__(self):
return f'arr[{self.idx}]'

def format_addr(addr: int):
if (addr >= rela.virtual_address
and addr < rela.virtual_address + rela.size
):
offset = addr - rela.virtual_address
r_offset = (offset // 24)
r_rem = offset % 24

# --- NEW ---
if r_offset >= 3 and r_offset <= 88:
arr_idx = (r_offset - 3) * 3 + (r_rem // 8)
return ArrAddr(arr_idx)
elif r_offset == 89:
return OutAddr(r_rem)
# --- END ---

match r_rem:
# ...

样例1(102000条指令)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// Nice array references!
[0338] :: rel arr[248], 248
[0339] :: rel arr[249], 249
[0340] :: rel arr[250], 250
[0341] :: rel arr[251], 251
[0342] :: rel arr[252], 252
[0343] :: rel arr[253], 253
[0344] :: rel arr[254], 254
[0345] :: rel arr[255], 255
[0346] :: rel s4, 0
[0347] :: rel s2, arr[0] // neat
[0348] :: rel s2.st_size, 8
[0349] :: copy r350.r_addend, s2
[0350] :: r64 s4, s4 + 0
[0351] :: rel r350.r_addend, 0
[0352] :: rel s2, flag[0] // flag!
[0353] :: rel s2.st_size, 1
[0354] :: copy s7, s2
[0355] :: rel s2, s7
[0356] :: rel s2.st_size, 8
[0357] :: copy r358.r_addend, s2
[0358] :: r64 s4, s4 + 0
[0359] :: rel r358.r_addend, 0
[0360] :: r64 s4.9, s1 + 0
[0361] :: rel s2, arr[0]
[0362] :: copy s6, s2

好看多了!

Mov + Add

下一步我们可以把RelR64Copy使用更容易理解的MovAdd替换

为了知道mov的大小(根据Copy变化),我们需要追踪所有指令,然后跟踪每个偏移处的符号大小。我们可以只看向符号st_size写的部分来观察这些变化。幸运的是,它们在这个程序里是确定的。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
Mov = namedtuple('mov', ['dst', 'src', 'sz', 'ridx'])
Add = namedtuple('add', ['dst', 'src', 'addend', 'ridx'])

def lift_mov_add(instructions):
idx = 0

sizes = []
curr = [8] * 8
sizes.append(curr)

# Trace instructions and monitor the size of every symbol.
for instr in instructions:
c = list(curr)
match instr:
case Rel(SymAddr(Symbol(idx), 'st_size'), val, ridx):
c[idx] = val
sizes.append(c)

# Iterate and find Mov/Add patterns.
while idx < len(instructions):
match instructions[idx]:
case Rel(dst, val, ridx):
instructions[idx] = Mov(dst, Ref(val), 8, ridx)
case Copy(dst, sym, ridx):
instructions[idx] = Mov(
dst, sym, sizes[idx][sym.idx], ridx)
case R64(dst, sym, add, ridx):
instructions[idx] = Add(dst, sym, add, ridx)
idx += 1
return instructions

def remove_sizes(instructions):
# Sizes are now nops.
idx = 0
while idx < len(instructions):
match instructions[idx]:
case Mov(SymAddr(Symbol(s), 'st_size'), _, _, _):
instructions[idx:idx+1] = []

idx += 1
return instructions

我们还需要更新我们的dump函数以支持这些新类型:

1
2
3
4
5
6
7
8
9
10
def dump(instructions):
for op in instructions:
match op:
# --- snip ---
case Mov(dst, src, 8, ridx):
print(f'[{ridx:04d}] :: mov {dst}, {src}')
case Mov(dst, src, sz, ridx):
print(f'[{ridx:04d}] :: mov({sz}) {dst}, {src}')
case Add(dst, src, add, ridx):
print(f'[{ridx:04d}] :: add {dst}, {src}, {add}')

示例2(96179条指令)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// smoooth
[0336] :: mov arr[246], &246
[0337] :: mov arr[247], &247
[0338] :: mov arr[248], &248
[0339] :: mov arr[249], &249
[0340] :: mov arr[250], &250
[0341] :: mov arr[251], &251
[0342] :: mov arr[252], &252
[0343] :: mov arr[253], &253
[0344] :: mov arr[254], &254
[0345] :: mov arr[255], &255
[0346] :: mov s4, &0
[0347] :: mov s2, &arr[0]
[0349] :: mov r350.r_addend, s2
[0350] :: add s4, s4, 0
[0351] :: mov r350.r_addend, &0
[0352] :: mov s2, &flag[0]
[0354] :: mov(1) s7, s2
[0355] :: mov s2, &s7
[0357] :: mov r358.r_addend, s2
[0358] :: add s4, s4, 0
[0359] :: mov r358.r_addend, &0
[0360] :: r64 s4.9, s1 + 0
[0361] :: mov s2, &arr[0]
[0362] :: mov s6, s2
[0363] :: mov s2, &s4

间接mov

前面提到过,某些重定位里的r_addend会在用到之前被修改。举个🌰:

1
2
3
[0349] :: mov r350.r_addend, s2
[0350] :: add s4, s4, 0
[0351] :: mov r350.r_addend, &0

这3句指令其实就是s4 = s2。我感觉产生第3条指令是程序编译方式的问题,这玩意其实应该没什么用。

用python3.10来match,我们可以对这种类型的结构执行清晰的模式匹配,并用一条Add指令替换它:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
def lift_indirect(instructions):
idx = 0
while idx < len(instructions):
match instructions[idx:idx+3]:
case [
Mov(RelocAddr(Reloc(r1), 'r_addend'), sym, _, ridx_1),
Add(dst_2, sym_2, _, ridx_2),
Mov(RelocAddr(Reloc(r3), 'r_addend'), Ref(0), _, _),
] if (
(r1 == ridx_2) and (r3 == ridx_2)
):
instructions[idx:idx+3] = [
Add(dst_2, sym_2, sym, ridx_1)
]

idx += 1
return instructions

样例3 (58155条指令)

1
2
3
4
5
6
7
8
9
10
11
12
13
[0346] :: mov s4, &0
[0347] :: mov s2, &arr[0]
[0349] :: add s4, s4, s2 // wow
[0352] :: mov s2, &flag[0]
[0354] :: mov(1) s7, s2
[0355] :: mov s2, &s7
[0357] :: add s4, s4, s2 // cool
[0360] :: r64 s4.9, s1 + 0
[0361] :: mov s2, &arr[0]
[0362] :: mov s6, s2
[0363] :: mov s2, &s4
[0364] :: add s5, s4, s2 // amazing
[0367] :: mov s2, &s5

分块

检查汇编代码,我们可以发现重复的代码块:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
// ---------------
[0378] :: mov s2, &arr[1]
[0379] :: add s4, s4, s2
[0382] :: mov s2, &flag[1]
[0384] :: mov(1) s7, s2
[0385] :: mov s2, &s7
[0387] :: add s4, s4, s2
[0390] :: r64 s4.9, s1 + 0
[0391] :: mov s2, &arr[1]
[0392] :: mov s6, s2
[0393] :: mov s2, &s4
[0394] :: add s5, s4, s2
[0397] :: mov s2, &s5
[0398] :: add s5, s5, s2
[0401] :: add s5, s5, s2
[0404] :: add s5, s5, arr[0]
[0405] :: mov arr[1], s5
[0406] :: mov r407.r_address, s2
[0407] :: add 0, s6, 0
// ---------------
[0408] :: mov s2, &arr[2]
[0409] :: add s4, s4, s2
[0412] :: mov s2, &flag[0]
[0414] :: mov(1) s7, s2
[0415] :: mov s2, &s7
[0417] :: add s4, s4, s2
[0420] :: r64 s4.9, s1 + 0
[0421] :: mov s2, &arr[2]
[0422] :: mov s6, s2
[0423] :: mov s2, &s4
[0424] :: add s5, s4, s2
[0427] :: mov s2, &s5
[0428] :: add s5, s5, s2
[0431] :: add s5, s5, s2
[0434] :: add s5, s5, arr[0]
[0435] :: mov arr[2], s5
[0436] :: mov r437.r_address, s2
[0437] :: add 0, s6, 0
// ---------------

这些块干的事情就是:

1
2
3
4
5
6
7
# ---
s6 = (s6 + arr[1] + flag[1]) & 0xff
arr[1], arr[s6] = arr[s6], arr[1]
# ---
s6 = (s6 + arr[2] + flag[0]) & 0xff
arr[2], arr[s6] = arr[s6], arr[2]
# ---

似乎只有arr和flag的索引改变了。改一下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
Block = namedtuple('block', ['arr', 'flag', 'ridx'])

def lift_block(instructions):
idx = 0
while idx < len(instructions):
match instructions[idx:idx+18]:
case [
Mov(_,arr,_,ridx),
Add(_,_,_,_),
Mov(_,flag,_,_),
Mov(_,_,_,_),
Mov(_,_,_,_),
Add(_,_,_,_),
R32(_,_,_,_),
Mov(_,_,_,_),
Mov(_,_,_,_),
Mov(_,_,_,_),
Add(_,_,_,_),
Mov(_,_,_,_),
Add(_,_,_,_),
Add(_,_,_,_),
Add(_,_,_,_),
Mov(_,_,_,_),
Mov(_,_,_,_),
Add(_,_,_,_),
]:
instructions[idx:idx+18] = [
Block(arr, flag, ridx)
]
idx += 1
return instructions

样例4(23339条指令)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
// array set
[0342] :: mov arr[252], &252
[0343] :: mov arr[253], &253
[0344] :: mov arr[254], &254
[0345] :: mov arr[255], &255

// s4 = 0
[0346] :: mov s4, &0

// repeating shuffles...
[0347] :: shuffle &arr[0], &flag[0]
[0378] :: shuffle &arr[1], &flag[1]
[0408] :: shuffle &arr[2], &flag[0]
[0438] :: shuffle &arr[3], &flag[1]
[0468] :: shuffle &arr[4], &flag[0]
[0498] :: shuffle &arr[5], &flag[1]
[0528] :: shuffle &arr[6], &flag[0]

// later
[24275] :: add s5, s5, s2
[24278] :: add s5, s5, s2
[24281] :: add s5, s5, arr[0]
[24282] :: mov out[8], s5

// array is reset again...
[24283] :: mov arr[0], &0
[24284] :: mov arr[1], &1
[24285] :: mov arr[2], &2
[24286] :: mov arr[3], &3
[24287] :: mov arr[4], &4
[24288] :: mov arr[5], &5
[24289] :: mov arr[6], &6
[24290] :: mov arr[7], &7
[24291] :: mov arr[8], &8
[24292] :: mov arr[9], &9
[24293] :: mov arr[10], &10
[24294] :: mov arr[11], &11

重置(Reset)和块移动(ShuffleBlock)

(这里的重置就是清零的意思,块移动看看样例就懂了(雾)

很多指令是用于数组初始化的,我们可以把它们淦掉(雾
还可以淦掉很多用于移动的指令(lift和shuffle不知道怎么翻译了,不过看了样例就知道什么意思了

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
Reset = namedtuple('reset', ['ridx'])
ShuffleBlock = namedtuple('shuffleblock', ['f1', 'f2', 'ridx'])

def lift_reset(instructions):
idx = 0
while idx < len(instructions) - 256:
good = True

for i in range(256):
op = instructions[idx+i]
if type(op) == Mov:
dst, src, _, _ = op
if dst != ArrAddr(i) or src != Ref(i):
good = False
break
else:
good = False
break

if good:
instructions[idx:idx+256] = [
Reset(instructions[idx].ridx)
]

idx += 1
return instructions

def lift_shuffle_block(instructions):
idx = 0
while idx < len(instructions) - 256:
good = True

for i in range(256):
op = instructions[idx+i]
if type(op) == Block:
arr, flag, ridx = op
if arr != Ref(ArrAddr(i)):
good = False
break
else:
good = False
break

if good:
instructions[idx:idx+256] = [
ShuffleBlock(
instructions[idx].flag,
instructions[idx+1].flag,
instructions[idx].ridx)]

idx += 1
return instructions

样例5(19259条指令)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
[0090] :: reset                             // wow
[0346] :: mov s4, &0
[0347] :: shuffleblock &flag[0], &flag[1] // so smol
[8028] :: mov s4, &0
[8029] :: mov s2, &arr[0]
[8030] :: add s4, s4, s2
[8033] :: r32 s4.9, s1, 0
// ...
[8154] :: mov out[2], s5
[8155] :: reset
[8411] :: mov s4, &0
[8412] :: shuffleblock &flag[2], &flag[3] // another!
[16092] :: mov s4, &0
[16093] :: mov s2, &arr[0]
[16094] :: add s4, s4, s2
[16097] :: r32 s4.9, s1, 0

Output

在每个reset/shuffleblock对之间,似乎有三个重复的指令块写入该out区域:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
// ---------------
[8029] :: mov s2, &arr[0]
[8030] :: add s4, s4, s2
[8033] :: r32 s4.9, s1, 0
[8034] :: mov s6, s2
[8035] :: mov s2, &s4
[8036] :: add s5, s4, s2
[8039] :: mov s2, &s5
[8040] :: add s5, s5, s2
[8043] :: add s5, s5, s2
[8046] :: add s5, s5, arr[0]
[8047] :: mov arr[0], s5
[8048] :: mov s2, &arr[0]
[8049] :: mov s7, s2
[8050] :: mov s2, &s5
[8051] :: mov r8052.r_address, s2
[8052] :: add 0, s6, 0
[8053] :: mov s2, &s7
[8054] :: add s6, s6, s2
[8057] :: r32 s6.9, s1, 0
[8058] :: mov s2, &s6
[8059] :: add s5, s6, s2
[8062] :: mov s2, &s5
[8063] :: add s5, s5, s2
[8066] :: add s5, s5, s2
[8069] :: add s5, s5, arr[0]
[8070] :: mov out[0], s5
// ---------------
[8071] :: mov s2, &arr[1]
[8072] :: add s4, s4, s2
[8075] :: r32 s4.9, s1, 0
[8076] :: mov s6, s2
[8077] :: mov s2, &s4
[8078] :: add s5, s4, s2
[8081] :: mov s2, &s5
[8082] :: add s5, s5, s2
[8085] :: add s5, s5, s2
[8088] :: add s5, s5, arr[0]
[8089] :: mov arr[1], s5
[8090] :: mov s2, &arr[1]
[8091] :: mov s7, s2
[8092] :: mov s2, &s5
[8093] :: mov r8094.r_address, s2
[8094] :: add 0, s6, 0
[8095] :: mov s2, &s7
[8096] :: add s6, s6, s2
[8099] :: r32 s6.9, s1, 0
[8100] :: mov s2, &s6
[8101] :: add s5, s6, s2
[8104] :: mov s2, &s5
[8105] :: add s5, s5, s2
[8108] :: add s5, s5, s2
[8111] :: add s5, s5, arr[0]
[8112] :: mov out[1], s5
// ---------------

它们干的事情就是:

1
2
3
4
5
6
7
8
9
# ---
s4 = (s4 + arr[0]) & 0xff
arr[0], arr[s4] = arr[s4], arr[0]
out[0] = arr[(arr[0] + arr[s4]) & 0xff]
# ---
s4 = (s4 + arr[1]) & 0xff
arr[1], arr[s4] = arr[s4], arr[1]
out[1] = arr[(arr[1] + arr[s4]) & 0xff]
# ---

我们把它淦掉:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
Output = namedtuple('output', ['out', 'arr', 'ridx'])

def lift_output(instructions):
idx = 0
while idx < len(instructions):
match instructions[idx:idx+26]:
case [
Mov(_,arr,_,ridx),
Add(_,_,_,_),
R32(_,_,_,_),
Mov(_,_,_,_),
Mov(_,_,_,_),
Add(_,_,_,_),
Mov(_,_,_,_),
Add(_,_,_,_),
Add(_,_,_,_),
Add(_,_,_,_),
Mov(_,_,_,_),
Mov(_,_,_,_),
Mov(_,_,_,_),
Mov(_,_,_,_),
Mov(_,_,_,_),
Add(_,_,_,_),
Mov(_,_,_,_),
Add(_,_,_,_),
R32(_,_,_,_),
Mov(_,_,_,_),
Add(_,_,_,_),
Mov(_,_,_,_),
Add(_,_,_,_),
Add(_,_,_,_),
Add(_,_,_,_),
Mov(out,_,_,_),
]:
instructions[idx:idx+26] = [Output(out, arr, ridx)]
idx += 1
return instructions

样例6(18659条指令)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// hey this is looking almost readable
[0090] :: reset
[0346] :: mov s4, &0
[0347] :: shuffleblock &flag[0], &flag[1]
[8028] :: mov s4, &0
[8029] :: output out[0], &arr[0] // neat
[8071] :: output out[1], &arr[1]
[8113] :: output out[2], &arr[2]
[8155] :: reset
[8411] :: mov s4, &0
[8412] :: shuffleblock &flag[2], &flag[3]
[16092] :: mov s4, &0
[16093] :: output out[3], &arr[0]
[16135] :: output out[4], &arr[1]
[16177] :: output out[5], &arr[2]
[16219] :: reset
[16475] :: mov s4, &0
[16476] :: shuffleblock &flag[4], &flag[5]
[24156] :: mov s4, &0
[24157] :: output out[6], &arr[0]
[24199] :: output out[7], &arr[1]
[24241] :: output out[8], &arr[2]
[24283] :: reset
[24539] :: mov s4, &0
[24540] :: shuffleblock &flag[6], &flag[7]
[32220] :: mov s4, &0
[32221] :: output out[9], &arr[0]
[32263] :: output out[10], &arr[1]
[32305] :: output out[11], &arr[2]
[32347] :: reset
[32603] :: mov s4, &0

MultAdd

(就是多次加法)

check到flag[16]之后,上面那种代码没了,换了新的加密方式。
遇到一些新的块,它们似乎不太一样:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
/ ----------
[64730] :: mov s2, &out[2]
[64732] :: mov(1) s5, s2
[64733] :: mov s6, &0
[64734] :: mov s2, &s6
[64736] :: add s6, s6, s2
[64739] :: mov s2, &s5
[64740] :: add s6, s6, s2
[64743] :: mov s2, &s6
[64744] :: add s6, s6, s2
[64747] :: mov s2, &s5
[64748] :: add s6, s6, s2
[64751] :: mov s2, &s6
[64752] :: add s6, s6, s2
[64755] :: add s6, s6, s2
[64758] :: mov s2, &s5
[64759] :: add s6, s6, s2
[64762] :: mov s2, &s6
[64763] :: add s6, s6, s2
[64766] :: add s6, s6, s2
[64769] :: add s6, s6, s2
[64772] :: add s6, s6, s2
[64775] :: add s7, s7, s2
// ----------
[64778] :: mov s2, &out[3]
[64780] :: mov(1) s5, s2
[64781] :: mov s6, &0
[64782] :: mov s2, &s6
[64784] :: add s6, s6, s2
[64787] :: mov s2, &s5
[64788] :: add s6, s6, s2
[64791] :: mov s2, &s6
[64792] :: add s6, s6, s2
[64795] :: mov s2, &s5
[64796] :: add s6, s6, s2
[64799] :: mov s2, &s6
[64800] :: add s6, s6, s2
[64803] :: mov s2, &s5
[64804] :: add s6, s6, s2
[64807] :: mov s2, &s6
[64808] :: add s6, s6, s2
[64811] :: mov s2, &s5
[64812] :: add s6, s6, s2
[64815] :: mov s2, &s6
[64816] :: add s6, s6, s2
[64819] :: add s6, s6, s2
[64822] :: mov s2, &s5
[64823] :: add s6, s6, s2
[64826] :: mov s2, &s6
[64827] :: add s6, s6, s2
[64830] :: mov s2, &s5
[64831] :: add s6, s6, s2
[64834] :: mov s2, &s6
[64835] :: add s7, s7, s2
/ ----------

这些块干的事情就是:

1
2
3
4
5
6
7
8
9
// s5 = out[3]
[64778] :: mov s2, &out[3]
[64780] :: mov(1) s5, s2
[64781] :: mov s6, &0

// compute some multiple of s5

// add result to s7
[64835] :: add s7, s7, s2

内部的一些代码执行的就是s6 += s5(+)或s6 += s6(×2)

举个🌰:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
[64730] :: mov s2, &out[2]
[64732] :: mov(1) s5, s2 // s5 = out[2]
[64733] :: mov s6, &0 // s6 = 0

[64734] :: mov s2, &s6
[64736] :: add s6, s6, s2 // s6 = 0
[64739] :: mov s2, &s5
[64740] :: add s6, s6, s2 // s6 = out[2]
[64743] :: mov s2, &s6
[64744] :: add s6, s6, s2 // s6 = 2 * out[2]
[64747] :: mov s2, &s5
[64748] :: add s6, s6, s2 // s6 = 3 * out[2]
[64751] :: mov s2, &s6
[64752] :: add s6, s6, s2 // s6 = 6 * out[2]
[64755] :: add s6, s6, s2 // s6 = 12 * out[2]
[64758] :: mov s2, &s5
[64759] :: add s6, s6, s2 // s6 = 13 * out[2]
[64762] :: mov s2, &s6
[64763] :: add s6, s6, s2 // s6 = 26 * out[2]
[64766] :: add s6, s6, s2 // s6 = 52 * out[2]
[64769] :: add s6, s6, s2 // s6 = 104 * out[2]
[64772] :: add s6, s6, s2 // s6 = 208 * out[2]

[64775] :: add s7, s7, s2 // s7 += (208 * out[2])

虽然有点麻烦,但我们还是可以把它淦掉:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
MultAdd = namedtuple('multadd', ['out', 'val', 'k', 'ridx'])

def lift_multadd(instructions):
idx = 0
while idx < len(instructions):
match instructions[idx:idx+3]:
# block prefix
case [
Mov(Symbol(2), out, _, ridx),
Mov(Symbol(5), Symbol(2), _, _),
Mov(Symbol(6), Ref(0), _, _),
]:
k = 0
double = False

ptr = idx + 3

good = True
while ptr < len(instructions):
match instructions[ptr]:
case Mov(Symbol(2), Ref(Symbol(6)), _, _):
double = True
case Mov(Symbol(2), Ref(Symbol(5)), _, _):
double = False
case Add(Symbol(6), Symbol(6), Symbol(2), _):
k = (k * 2) if double else (k + 1)
case Add(Symbol(7), Symbol(7), Symbol(2), _):
ptr += 1
break
case _:
good = False
break

ptr += 1

if good:
instructions[idx:ptr] = [
MultAdd(Symbol(7), out, k, ridx)
]

idx += 1

return instructions

样例7(2377条指令)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
[56795] :: mov s4, &0
[56796] :: shuffleblock &flag[14], &flag[15]
[64476] :: mov s4, &0
[64477] :: output out[21], &arr[0]
[64519] :: output out[22], &arr[1]
[64561] :: output out[23], &arr[2]
[64603] :: mov s2, &r101917.r_address
[64605] :: mov(2064) arr[0], s2
[64606] :: mov s4, &0
[64607] :: mov s5, &0
[64608] :: mov s7, &-393039
[64609] :: madd s7 += (&out[0] * 155) // math!
[64667] :: madd s7 += (&out[1] * 190)
[64730] :: madd s7 += (&out[2] * 208)
[64778] :: madd s7 += (&out[3] * 123)
[64838] :: madd s7 += (&out[4] * 214)
[64896] :: madd s7 += (&out[5] * 146)
[64944] :: madd s7 += (&out[6] * 211)
[65002] :: madd s7 += (&out[7] * 85)
[65052] :: madd s7 += (&out[8] * 87)
[65107] :: madd s7 += (&out[9] * 251)
[65175] :: madd s7 += (&out[10] * 122)
[65230] :: madd s7 += (&out[11] * 60)
[65277] :: madd s7 += (&out[12] * 237)
[65340] :: madd s7 += (&out[13] * 27)
[65384] :: madd s7 += (&out[14] * 63)
[65441] :: madd s7 += (&out[15] * 207)
[65504] :: madd s7 += (&out[16] * 33)
[65541] :: madd s7 += (&out[17] * 138)
[65589] :: madd s7 += (&out[18] * 51)
[65636] :: madd s7 += (&out[19] * 103)
[65691] :: madd s7 += (&out[20] * 58)
[65738] :: madd s7 += (&out[21] * 68)
[65778] :: madd s7 += (&out[22] * 120)
[65828] :: madd s7 += (&out[23] * 111)
[65888] :: mov s2, &s5.11
[65890] :: mov(5) s7.11, s2
[65891] :: mov s2, &s7
[65893] :: add s4, s4, s2
[65896] :: mov s7, &-368502
[65897] :: madd s7 += (&out[0] * 254)
[65965] :: madd s7 += (&out[1] * 21)
[66004] :: madd s7 += (&out[2] * 157)
[66062] :: madd s7 += (&out[3] * 204)
[66115] :: madd s7 += (&out[4] * 56)
[66157] :: madd s7 += (&out[5] * 134)

Trunc

每组madd指令之后,我们看到一些奇怪的指令:

1
2
[65888] :: mov s2, &s5.11
[65890] :: mov(5) s7.11, s2

我们正在往一个符号值的偏移为11的地方(就是&sym.st_value + 3)写入值。

检查这些代码,可以发现s5总为0。所以这个指令就是把这个符号值的高5字节清零,即sym &= 0xffffff

1
2
3
4
5
6
7
8
9
10
11
12
13
14
Trunc = namedtuple('trunc', ['val', 'k', 'ridx'])

def lift_truncate(instructions):
idx = 0
while idx < len(instructions):
match instructions[idx:idx+2]:
case [
Mov(Symbol(2), Ref(SymAddr(Symbol(5), 11)), _, ridx),
Mov(SymAddr(Symbol(7), 11), Symbol(2), 5, _)
]:
instructions[idx:idx+2] = [
Trunc(Symbol(7), 0xffffff, ridx)]
idx += 1
return instructions

样例8(2336条指令)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
[80967] :: madd s7 += (&out[17] * 219)
[81030] :: madd s7 += (&out[18] * 74)
[81075] :: madd s7 += (&out[19] * 223)
[81143] :: madd s7 += (&out[20] * 220)
[81201] :: madd s7 += (&out[21] * 235)
[81264] :: madd s7 += (&out[22] * 151)
[81322] :: madd s7 += (&out[23] * 165)
[81375] :: trunc s7 &= 0xffffff // yay
[81378] :: mov s2, &s7
[81380] :: add s4, s4, s2
[81383] :: mov s7, &-488931
[81384] :: madd s7 += (&out[0] * 207)
[81447] :: madd s7 += (&out[1] * 123)
[81507] :: madd s7 += (&out[2] * 9)

Array slots

在新的字节码里,我们发现了巨多常量。还有一些像数组的新地方:

注意:setinfo只是一个进入st_name设置其他字段的8字节mov。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
// NOP relocations for array space
[93678] :: mov baseaddr(), &0
[93679] :: mov baseaddr(), &0
[93680] :: mov baseaddr(), &0
[93681] :: mov baseaddr(), &0
[93682] :: mov baseaddr(), &0
[93683] :: mov baseaddr(), &0
// Array data...
[93684] :: mov r93678.r_address, &6124966137932793672
[93685] :: mov r93678.r_info, &-4041842636978026168
[93686] :: mov r93678.r_addend, &5027987508982512709
[93687] :: mov r93679.r_address, &-8413873093579112200
[93688] :: mov r93679.r_info, &5011179070235933765
[93689] :: mov r93679.r_addend, &4323655821306185960
[93690] :: mov r93680.r_address, &7154245383694349856
[93691] :: mov r93680.r_info, &-3458402876407461680
[93692] :: mov r93680.r_addend, &-5186046161349200369
[93693] :: mov r93681.r_address, &51797902590214145
[93694] :: mov r93681.r_info, &5009120185953550336
[93695] :: mov r93681.r_addend, &-4648246994148326916
[93696] :: mov r93682.r_address, &-8410243179269569141
[93697] :: mov r93682.r_info, &51245831810508869
[93698] :: mov r93682.r_addend, &5011371985779865103
[93699] :: mov r93683.r_address, &215243856300280
[93700] :: mov s3, &r93678.r_address
[93701] :: setinfo s3, name=0x1a, info=0x1, other=0x0, shndx=0x0
[93702] :: add s5, s3, 0
[93703] :: mov s2, &r101997.r_address
[93705] :: mov(144) r93678.r_address, s2
[93706] :: mov s2, &s5
[93708] :: add s4, s4, s2
[93711] :: mov s5, &0
[93712] :: mov s7, &-55708
[93713] :: madd s7 += (&flag[16] * 76)
[93758] :: madd s7 += (&flag[17] * 104)
[93803] :: mov s2, &flag[18]
[93805] :: mov(1) s5, s2
// Array space?
[93806] :: mov baseaddr(), &0
// Array data...
[93807] :: mov r93806.r_address, &-6989445605485108096
[93808] :: mov r93806.r_info, &195
[93809] :: mov s3, &r93806.r_address
[93810] :: setinfo s3, name=0x1a, info=0x1, other=0x0, shndx=0x0
[93811] :: add r93806.r_address, s3, 0
[93812] :: mov s2, &r102002.r_address
[93814] :: mov(24) r93806.r_address, s2
[93815] :: mov s2, &s5

这些数组似乎总是由一堆nop指令(mov baseaddr(), &0)组成,后面跟着一堆mov来给数组赋值。

我们可以扫描这种字节码并放到一个固定的常量数组中:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
ArraySlots = namedtuple('arr', ['values', 'ridx'])

def lift_array_slots(instructions):
idx = 0
while idx < len(instructions):
match instructions[idx]:
case Mov(BaseAddr(), Ref(0), _, ridx):
ptr = idx+1
while ptr < len(instructions):
op = instructions[ptr]
if type(op) != Mov or op.dst != BaseAddr():
break
ptr += 1

start = idx
end = ptr

data = []

# Check for movs into array.
vstart = RelocAddr(Reloc(ridx), 'r_address').vaddr()
offset = 0
while end + offset < len(instructions) and offset < ((end - start) * 3):
op = instructions[end + offset]
if type(op) == Mov and type(op.dst) is RelocAddr and op.dst.vaddr() == vstart + (offset * 8):
data.append(op.src.val)
else:
break
offset += 1

if len(data) > 0:
data += [0] * (((end - start) * 3) - len(data))
instructions[idx:end+offset] = [
ArraySlots(data, ridx)
]

idx += 1
return instructions

样例9(2161条指令)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
[93677] :: mov s5, &0
// what a chonker
[93678] :: array [0x5500404040c7c748, -0x38178276b71a76b8, 0x45c700000000fc45, -0x74c414ffffffff08, 0x458b48d06348fc45, 0x3c00b60fd00148e8, 0x6348fc458b147620, -0x2ffeb717ba74b730, -0x47f88981c3ff49f1, 0xb805eb00000001, 0x4583f84509000000, -0x4081e403827cfe04, -0x74b72f9cb703ba75, 0xb60fd00148e845, 0x458bf84509c0b60f, 0xc3c35d9848f8, 0x0, 0x0]
[93700] :: mov s3, &r93678.r_address
[93701] :: setinfo s3, name=0x1a, info=0x1, other=0x0, shndx=0x0
[93702] :: add s5, s3, 0
[93703] :: mov s2, &r101997.r_address
[93705] :: mov(144) r93678.r_address, s2
[93706] :: mov s2, &s5
[93708] :: add s4, s4, s2
[93711] :: mov s5, &0
[93712] :: mov s7, &-55708
[93713] :: madd s7 += (&flag[16] * 76)
[93758] :: madd s7 += (&flag[17] * 104)
[93803] :: mov s2, &flag[18]
[93805] :: mov(1) s5, s2
// smol
[93806] :: array [-0x60ff7fbf1bdadb80, 0xc3, 0x0]
[93809] :: mov s3, &r93806.r_address
[93810] :: setinfo s3, name=0x1a, info=0x1, other=0x0, shndx=0x0
[93811] :: add r93806.r_address, s3, 0
[93812] :: mov s2, &r102002.r_address

Shellcode

现在问题是那些值是啥东西?我刚开始以为它们是某些矩阵算出来的值。

但是我注意到0x48前缀和0xc3后缀,并对字节码产生了怀疑。我之前还注意到,重新定位部分被标记了rwx。

所以让我们把第一个块扔进 Binary Ninja,看看会发生什么……

彳亍,那肯定是字节码了。第一个大函数似乎只是检查所有标志字节是否在(0x20, 0x7e)范围内。

这也是唯一的大功能,其余的似乎都是9字节的shellcode数据。

我不太了解shellcode 是如何在这里实际执行的,但它与加载指向 shellcode 的符号有关。我们可以将以下块识别为“执行 shellcode”块:

1
2
3
4
5
6
[93806] :: array [-0x60ff7fbf1bdadb80, 0xc3, 0x0]
[93809] :: mov s3, &r93806.r_address
[93810] :: setinfo s3, name=0x1a, info=0x1, other=0x0, shndx=0x0
[93811] :: add r93806.r_address, s3, 0
[93812] :: mov s2, &r102002.r_address
[93814] :: mov(24) r93806.r_address, s2

把它淦掉:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
from capstone import *
md = Cs(CS_ARCH_X86, CS_MODE_64)

# ---

Shellcode = namedtuple('shellcode', ['dst', 'code', 'ridx'])

def lift_shellcode(instructions):
idx = 0
while idx < len(instructions):
match instructions[idx:idx+6]:
case [
ArraySlots(values, ridx),
Mov(Symbol(3), Ref(RelocAddr(Reloc(rel2), 'r_address')), _, _),
Mov(SymAddr(Symbol(3), 'st_name'), _, _, _),
Add(dst, Symbol(3), _, _),
Mov(Symbol(2), _, _, _),
Mov(RelocAddr(Reloc(rel6), 'r_address'), Symbol(2), _, _)
] if (rel2 == ridx) and (rel6 == ridx):
instructions[idx:idx+6] = [
Shellcode(dst, b''.join([(x & 0xffffffffffffffff).to_bytes(8, 'little') for x in values]), ridx)
]
idx += 1
return instructions

dump的时候可以通过capstone得到漂亮的反汇编代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
def dump(instructions):
# ...
case Shellcode(dst, code, ridx):
print(f'[{ridx:04d}] :: exec {dst} <- {code.hex()}')
print('-' * 20)
for i in md.disasm(code, 0):
if i.mnemonic == 'ret':
break
print(" 0x%x:\t%s\t%s" % (
i.address,
i.mnemonic,
i.op_str.replace('0x8040e4', 's5')
.replace('0x8040cc', 's4')))
print('-' * 20)

样例10(1771条指令)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
// have you seen sexier disassembly?

[100408] :: add s4, s4, s2
[100411] :: mov s7, &-30801
[100412] :: madd s7 += (&flag[16] * 15)
[100453] :: madd s7 += (&flag[17] * 41)
[100495] :: mov s2, &flag[18]
[100497] :: mov(1) s5, s2
[100498] :: exec r100498.r_address <- c00425e4408000b4c3000000000000000000000000000000
--------------------
0x0: rol byte ptr [s5], 0xb4
--------------------
[100507] :: mov s2, &s5
[100509] :: add s7, s7, s2
[100512] :: mov s2, &flag[19]
[100514] :: mov(1) s5, s2
[100515] :: exec r100515.r_address <- c02c25e440800003c3000000000000000000000000000000
--------------------
0x0: shr byte ptr [s5], 3
--------------------
[100524] :: mov s2, &s5
[100526] :: add s7, s7, s2
[100529] :: mov s2, &flag[20]
[100531] :: mov(1) s5, s2
[100532] :: exec r100532.r_address <- c00c25e440800064c3000000000000000000000000000000
--------------------
0x0: ror byte ptr [s5], 0x64
--------------------
[100541] :: mov s2, &s5
[100543] :: add s7, s7, s2
[100546] :: madd s7 += (&flag[21] * 167)
[100604] :: mov s2, &flag[22]
[100606] :: mov(1) s5, s2
[100607] :: exec r100607.r_address <- c00425e440800051c3000000000000000000000000000000
--------------------
0x0: rol byte ptr [s5], 0x51
--------------------
[100616] :: mov s2, &s5

Aop

所以这个shellcode只是被用来做更多类型的操作的一种机制。

举个🌰:

1
2
3
4
5
6
7
8
[100512] :: mov s2, &flag[19]
[100514] :: mov(1) s5, s2
[100515] :: exec r100515.r_address <- c02c25e440800003c3000000000000000000000000000000
--------------------
0x0: shr byte ptr [s5], 3
--------------------
[100524] :: mov s2, &s5
[100526] :: add s7, s7, s2

这个块就是s7 += (flag[19] >> 3)

我们可以定义一个类似于madd的aop(加操作),不过它多了一个操作符号:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
Aop = namedtuple('aop', ['dst', 'op', 'val', 'k', 'ridx'])

def lift_aop(instructions):
idx = 0
while idx < len(instructions):
match instructions[idx:idx+5]:
case [
Mov(Symbol(2), val, _, ridx),
Mov(Symbol(5), Symbol(2), _, _),
Shellcode(_, data, _),
Mov(Symbol(2), Ref(Symbol(5)), _, _),
Add(dst, dst2, Symbol(2), _)
] if len(data) == 24 and (dst == dst2):
op = next(md.disasm(data, 0))

t = op.mnemonic
k = int(op.op_str.split(', ')[-1], 16)

instructions[idx:idx+5] = [
Aop(dst, t, val, k, ridx)
]

idx += 1
return instructions

样例11(1147条指令):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
// out madd computation
[64607] :: mov s5, &0
[64608] :: mov s7, &-393039
[64609] :: madd s7 += (&out[0] * 155)
[64667] :: madd s7 += (&out[1] * 190)
[64730] :: madd s7 += (&out[2] * 208)
[64778] :: madd s7 += (&out[3] * 123)
[64838] :: madd s7 += (&out[4] * 214)
[64896] :: madd s7 += (&out[5] * 146)
[64944] :: madd s7 += (&out[6] * 211)
[65002] :: madd s7 += (&out[7] * 85)
[65052] :: madd s7 += (&out[8] * 87)
[65107] :: madd s7 += (&out[9] * 251)
[65175] :: madd s7 += (&out[10] * 122)
[65230] :: madd s7 += (&out[11] * 60)
[65277] :: madd s7 += (&out[12] * 237)
[65340] :: madd s7 += (&out[13] * 27)
[65384] :: madd s7 += (&out[14] * 63)
[65441] :: madd s7 += (&out[15] * 207)
[65504] :: madd s7 += (&out[16] * 33)
[65541] :: madd s7 += (&out[17] * 138)
[65589] :: madd s7 += (&out[18] * 51)
[65636] :: madd s7 += (&out[19] * 103)
[65691] :: madd s7 += (&out[20] * 58)
[65738] :: madd s7 += (&out[21] * 68)
[65778] :: madd s7 += (&out[22] * 120)
[65828] :: madd s7 += (&out[23] * 111)
[65888] :: trunc s7 &= 0xffffff
[65891] :: mov s2, &s7
[65893] :: add s4, s4, s2

// flag 16-27 computation
[98563] :: mov s2, &s7
[98565] :: add s4, s4, s2
[98568] :: mov s7, &-62593
[98569] :: madd s7 += (&flag[16] * 162)
[98617] :: aop s7 += (&flag[17] ror 59)
[98634] :: madd s7 += (&flag[18] * 4)
[98657] :: aop s7 += (&flag[19] shl 0)
[98674] :: madd s7 += (&flag[20] * 207)
[98737] :: madd s7 += (&flag[21] * 128)
[98775] :: aop s7 += (&flag[22] ror 21)
[98792] :: aop s7 += (&flag[23] or 72)
[98809] :: madd s7 += (&flag[24] * 23)
[98853] :: aop s7 += (&flag[25] rol 1)
[98870] :: madd s7 += (&flag[26] * 247)
[98938] :: aop s7 += (&flag[27] shr 3)
[98955] :: trunc s7 &= 0xffffff
[98958] :: mov s2, &s7
[98960] :: add s4, s4, s2
[98963] :: mov s7, &-80711
[98964] :: madd s7 += (&flag[16] * 62)
[99016] :: aop s7 += (&flag[17] or 139)
[99033] :: madd s7 += (&flag[18] * 21)
[99072] :: madd s7 += (&flag[19] * 239)
[99140] :: madd s7 += (&flag[20] * 145)
[99188] :: madd s7 += (&flag[21] * 103)
[99243] :: aop s7 += (&flag[22] shr 0)
[99260] :: aop s7 += (&flag[23] rol 228)
[99277] :: madd s7 += (&flag[24] * 184)
[99330] :: madd s7 += (&flag[25] * 241)
[99388] :: madd s7 += (&flag[26] * 72)
[99428] :: aop s7 += (&flag[27] xor 38)
[99445] :: trunc s7 &= 0xffffff
[99448] :: mov s2, &s7
[99450] :: add s4, s4, s2
[99453] :: mov s7, &-98024

跟着走下来突然感觉1k条指令已经算少的了🤣
怪不得av神曾曰:“区区几千个函数

![](/img/code/chall_photo/chunk2.png width=50% />

Part3 解决

根据前面的反汇编,我们可以看到约束检查器(constraint checker)被分成了这样的块:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
[98963] :: mov s7, &-80711
[98964] :: madd s7 += (&flag[16] * 62)
[99016] :: aop s7 += (&flag[17] or 139)
[99033] :: madd s7 += (&flag[18] * 21)
[99072] :: madd s7 += (&flag[19] * 239)
[99140] :: madd s7 += (&flag[20] * 145)
[99188] :: madd s7 += (&flag[21] * 103)
[99243] :: aop s7 += (&flag[22] shr 0)
[99260] :: aop s7 += (&flag[23] rol 228)
[99277] :: madd s7 += (&flag[24] * 184)
[99330] :: madd s7 += (&flag[25] * 241)
[99388] :: madd s7 += (&flag[26] * 72)
[99428] :: aop s7 += (&flag[27] xor 38)
[99445] :: trunc s7 &= 0xffffff
[99448] :: mov s2, &s7
[99450] :: add s4, s4, s2

接近结尾的时候可以看到:

1
2
3
4
5
6
7
8
9
[101625] :: mov fail(), &0
[101626] :: exec s4 <- 8b0425cc408000f7d819c0c3000000000000000000000000
--------------------
0x0: mov eax, dword ptr [s4]
0x7: neg eax
0x9: sbb eax, eax
--------------------
[101635] :: mov s2, &s4
[101637] :: mov fail(), s2

所以s4必须保持0,否则就设置fail符号,说明flag是错误的。

所以上面的块表示对部分flag的操作,然后再加上-80711(s7里)必须等于0

后半段flag

flag[16:27]约束一下,z3一把梭了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
from z3 import *

def solve_end_flag(instructions):
orig = [BitVec('f%d' % i, 8) for i in range(16,28)]
flag = ([0] * 16) + [ZeroExt(24, x) for x in orig]

s = Solver()
for o in orig:
s.add(UGT(o, 0x20))
s.add(ULT(o, 0x7f))

idx = 0
while idx < len(instructions):
match instructions[idx]:
case Mov(Symbol(7), Ref(v), _, ridx) if type(v) is int:
x = BitVecVal(v, 32)

ptr = idx+1
while ptr < len(instructions):
match instructions[ptr]:
case MultAdd(Symbol(7), Ref(FlagAddr(f)), k, _):
x += (flag[f] * k)
case Aop(Symbol(7), op, Ref(FlagAddr(f)), k, _):
v = None
match op:
case 'and': v = flag[f] & k
case 'xor': v = flag[f] ^ k
case 'or': v = flag[f] | k
case 'rol': v = ZeroExt(24, RotateLeft(Extract(7,0,flag[f]), k))
case 'ror': v = ZeroExt(24, RotateRight(Extract(7,0,flag[f]), k))
case 'shl': v = ZeroExt(24, Extract(7,0,flag[f]) << k)
case 'shr': v = flag[f] >> k
case _:
raise Exception(f'unknown aop: {op}')
x += v
case Trunc(Symbol(7), k, _):
s.add(x == 0)
case _:
break

ptr += 1

idx += 1

print(s.check())

m = s.model()
flag = bytes([m.eval(o).as_long() for o in orig])

return flag
'''
sat
b'3_3LF_m4g1c}'
'''

output数组

我用z3去约束out数组(这数组里只用了madd),但是8太彳亍。

ctfer宁愿用z3解线性方程组,也不愿学习矩阵是个啥。
— cts (@gf_256) July 2, 2022

每个单独的约束只有madd,相当于矩阵里的一行,因为正好有24个out值和24组约束(是个方阵),所以我们可以求解这个矩阵。

这次不用z3,用np.linalg.solve来求解矩阵:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
import numpy as np

# ---

def solve_out_arr(instructions):
X = []
Y = []

idx = 0
while idx < len(instructions):
match instructions[idx]:
case Mov(Symbol(7), Ref(v), _, ridx) if type(v) is int:
row = []

ptr = idx+1
while ptr < len(instructions):
match instructions[ptr]:
case MultAdd(Symbol(7), Ref(OutAddr(_)), k, _):
row.append(k)
case Trunc(Symbol(7), k, _):
X.append(row)
Y.append(-v)
case _:
break

ptr += 1

idx += 1

a = np.array(X, dtype=np.uint32)
b = np.array(Y, dtype=np.uint32)

return [int(x) for x in np.linalg.solve(a,b)]

#[134, 72, 7, 236, 29, 49, 88, 228, 231, 231, 227, 16, 242, 80, 242, 1, 224, 113, 46, 224, 108, 91, 103, 182]

前半段flag

这个out数组是根据flag的前16个字节和这些块计算的:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
[0347] :: shuffleblock &flag[0], &flag[1]
[8028] :: mov s4, &0
[8029] :: output out[0], &arr[0]
[8071] :: output out[1], &arr[1]
[8113] :: output out[2], &arr[2]
[8155] :: reset
[8411] :: mov s4, &0
[8412] :: shuffleblock &flag[2], &flag[3]
[16092] :: mov s4, &0
[16093] :: output out[3], &arr[0]
[16135] :: output out[4], &arr[1]
[16177] :: output out[5], &arr[2]
[16219] :: reset
[16475] :: mov s4, &0
[16476] :: shuffleblock &flag[4], &flag[5]
[24156] :: mov s4, &0
[24157] :: output out[6], &arr[0]
[24199] :: output out[7], &arr[1]
[24241] :: output out[8], &arr[2]
[24283] :: reset

具体来说,out[0:3]flag[0:2]计算得到。out[3:6]flag[2:4]得到,以此类推。

对于每个3元组可以直接爆破:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
def solve_start_flag(output):
def sim(a,b):
arr = list(range(256))

z = 0
for i in range(256):
p = a if i % 2 == 0 else b
z = (z + arr[i] + p) & 0xff
arr[i], arr[z] = arr[z], arr[i]

out = [0,0,0]
z = 0
for i in range(3):
z = (z + arr[i]) & 0xff
arr[i], arr[z] = arr[z], arr[i]
out[i] = arr[(arr[i] + arr[z]) & 0xff]

return out

def solve_chunk(k1,k2,k3):
# Brute force chunk.
for a in range(0x20, 0x7f):
for b in range(0x20, 0x7f):
out = sim(a,b)

if abs(out[0] - k1) <= 1 and abs(out[1] - k2) <= 1 and abs(out[2] - k3) <= 1:
return (a,b)

return None

f = []
for i in range(0,len(output),3):
f += list(solve_chunk(*output[i:i+3]))

return bytes(f)

#CTF{H0p3_y0u_l1k

完结撒花

完整脚本:https://gist.github.com/hgarrereyn/9e536e8b3471d3cb8ecbb5932a776b95

关于本文

本文作者 云之君, 许可由 CC BY-NC 4.0.