INFECTING ISO CD IMAGES ----------------------- additional stuff: INFISO\*.* Era of the computer virus began from information exchange via diskettes. After some time the most part of this exchange has moved into networking. A bit later cd-roms became widely spreaded. There even appeared cd-related scenes, and today cd burning technology is available mostly to all. And now its time to our move. So, there is a specificity of burning cd disks - since cds are read-only, the whole cd image should be created before burning. Such images has different formats and contains cd filesystem, all the files in defragmented form, and other info. Sometimes these images are called ISOs, because they're in ISO9660 image format specification. There are a lot of different image formats. Look into ISOBUSTER program, OpenFileDialog to see some of them. Btw, this program is used to look into different cd images and extract files, and is required when testing image infecting. But, though formats are different and supporting them is hard, there are the following advantages of cd images infecting: 1. Cd disks can contain AUTORUN.INF file in the root directory, so when cd is inserted, some setup file can be automatically runned. (its default windows settings). 2. Since cd-r disk is burned, it will never be erased, and in most cases user will keep it for a while, and probably pass to some friends. 3. In most cases cd-disks contains distributives, i.e. installation programs. 4. Cds are stamped by warez pirates. (same way as CIH became popular) 5. Cd disks contains extremly big amount of information - 600-700 mbs, and checking such cd image will require tons of time for avers, moreover, most of files on such disks are archives. Now, about information exchange. Some of images are created temporary, and, in theory, it is possible to infect them on-the-fly - if you know this temporary format. But, other ISOs are to be exchanged via internet, and this is our chance to spread via them. In other words, modern virus should be able to deal with cd images and to infecte them when required. So, here is some thoughts (and sources) about cd images infecting. At first, some comments. 1. All the information here were obtained in experimental way, so it can differ from reality. 2. There are also multitrack and multisession disks, which makes our task more complex. 3. There are also audio disks, and mixed disks, where audio tracks follows data track(s). Now, lets talk about the main thing - CD sector. CD sector has fixed length of 2352 bytes, which in case of audio sector can be any bytes, and in case of data sector has the following format: 16-bytes - header: (probably used by drive to search sector) [12 bytes] PREFIX: 00 FF FF FF FF FF FF FF FF FF FF 00 [1 byte] minute } sector number on the disk (or track?), [1 byte] second } represented in [minute : second : second/75] form, [1 byte] second/75 } all in BCD format, 00:02:00 - based. I.e. sector #0 begins from 2 seconds, and each second contains 75 cd sectors. Since cd sector is of 2352 bytes length, each 75 sectors uses 75*2352 = 176400 bytes, which is the same as 44100*2*2 - one second of audio data in [44kHz 16-bit Stereo] format. [1 byte] sector type 0=empty, 1=MODE1, 2=MODE2, never saw other values data: [8 bytes] some shit, or data, depending on sector type [2040 bytes] data [8 bytes] data, or some shit [276 bytes] data, or error correction code (ecc), calculation begins from "minute" byte, and 1st 4 bytes (m/s/s75/type) should be temporary zerofilled when calculating. [4 bytes] total sector checksum, which in some cases, when burning, is re-calculated by the cd drive itself. Some copy-protection schemes are based on it. Now, about track types. There exists two data track types: MODE1 and MODE2. So, sector_type field will contain 1 or 2. Also, information about track type is written somewhere in sub-codes, which are not directly available, since are not contained within iso images. Also, sector_type field can be of zero value, that means that except 1st 16 bytes, all other sector bytes are zero. This is also used by some copy-protection schemes. MODE1 is used mostly by all of the iso images, since this is IBM disks. MODE2 is used mostly by Sony. Now, about cd image format. It is evident, that best cd image format is just a set of cd sectors. And it is so, but only in case of single track. In case of multiple tracks, somewhere should be stored information about track layout and types. For example, CDR-Win program uses additional .CUE-file: ----[begin of XZ.CUE]---- FILE "XZ.BIN" BINARY TRACK 01 MODE1/2352 INDEX 01 00:00:00 TRACK 02 AUDIO PREGAP 00:02:00 INDEX 01 25:44:70 ----[end of XZ.CUE]------ As it seems, all the information from such .CUE file will be written to cd in a form of sub-codes. But its not interesting for us, since in most cases 1st track within image is main data track where main stuff is located. Btw, two and more data tracks can be on both multitrack and multisession disks, and, if you see multiple data tracks, you can not know if it was multitrack or multisession; and it is used by some copy-protection shemes. In addition, it should be said that sector size within cd image can be not only of 2352 bytes; it can be also 2336(MODE2,XA) bytes and 2048 bytes(MODE1). But, in any case, each sector contains 2048 bytes of data. Also, the stuff that was shown in the .CUE file, in case of some other cd image format can be located in the header of the whole image, which complicates situation. Now lets talk about cd file system. Main 2048 bytes of sector data begins from offset 10h in case of MODE1/2352, and from offset 18h in case of MODE2/2352. In case of 2048-bytes sectors, each sector contains data without any prefixes/eccs. In case of 2336(MODE2,XA)-sectors, only 1st 16-byte sector header is stripped, and sector data can be of 2048 or 2336 size, depending on file type - normal or XA. Btw, XA means eXtended Audio. Now lets imagine that we can deal with cd image on 2048-byte sector level. For sure, there is no file fragmentation at all, and each file is in continuous form, and to specify file location it is enough to know only 1st file sector. In most cases, 1st 16 sectors are unused. Sector #16 (0-based) begins from CD001 signature. If it is so, at offsets 0x9C and 0xA6 (here and later, add 10h or 18h to all offsets in case of 2352-byte sectors) there are two DWORDs, first one is 1st TOC sector, and 2nd one is size of TOC, in bytes (2048-aligned). TOC means Table Of Contents. It is the same as root directory. But here is a little trouble. In case of MODE1, there can be 2nd TOC, which differs from 1st one in one thing: all file names in 2nd TOC are in 2-byte character format. Its called long file names. Both TOC entries are parallel. If 2nd TOC exists, then cd file system driver will use its filenames. 2nd TOC can be found same as 1st one, with only difference that you should look not to 16th, but to 17,18 or 19 (or maybe higher) sector, and signature of this sector should be CD001 (1st byte = 0x01 for 1st TOC, 0x02 for 2nd). Also, there exists bootable disks. In this case somewhere within cd image there is image of bootable diskette (or kind of). Would be interesting to infect it too. Now, about TOC format. Lets we have one or two TOCs, each of them is specified by starting sector and size. TOC format is the following: WORD n; // entry size, in bytes BYTE entry[n]; // data, specifies single file or directory. ... WORD 0; // end of TOC sector Here should be said that such structure is in each TOC sector, and number of TOC sectors may be calculated as toc_size/2048. I dont know, if one entry can be splitted into two parts, on the border of sectors; i think it cant, and each TOC sector ends with 0000. Now, about single TOC entry. 00 WORD n; // length of the entry, in bytes. 0000=last entry 02 DWORD sector; // 1st sector of file or directory 06 DWORD sector2; // the same in non-intel format 0A DWORD size; // length of file or directory 0E DWORD size2; // the same in non-intel format 12 BYTE datetime[6]; // ? 18 BYTE attr[2]; // if file: MODE1:000C, MODE2:0024 1A BYTE xz[6]; // 00 00 01 00 00 01 20 BYTE/WORD namelen;// length of the following name, in _bytes_ 21 BYTE*namelen // BYTE-elements for 1st TOC, WORD-elements for 2nd TOC Size of namelen field is BYTE in TOC1 and WORD in TOC2. File name characters are of the same size. In some cases entry sizes are 2 or 4-aligned, but, as it seems, its not required. File names may optionaly terminate with 00 (0000), mostly in case of alignment. In case of MODE2, after filename follows padding with 0s to make WORD-alignment, then four (4) zero-bytes, then four (4) bytes 0D,55,58,41 and six (6) zero-bytes. So, it is enough to infect iso image. In theory, image infecting is divided into the following steps: 1. Determine image format. 2. Setup sector read/write subprograms, to deal with image on the sector level. 3. By means of these subprograms determine if it is MODE1 or MODE2. 4. Process TOC and collect file information, including info about file location within image. 5. Depending on infectiion method: a) choose some executable file and replace or modify it. b) 1. Find some free/unused/whatever place within image and insert own executable there. 2. Add TOC entry about inserted file. In the INFISO tool, cd images are infected in the following way: both TOCs are scanned for AUTORUN.EXE/AUTORUN.INF filenames, which are replaced with OUTARUN.*; then, to the end of the each TOC, new AUTORUN.EXE/.INF are added, if possible; then new files are inserted into the middle of the image. * * *