Self-splicing software

Self-splicing software and molecular genetics

Overdrive and Ultralock were two utilities for MSDOS that we wrote and marketed in the late 1980s. Overdrive accelerated access to large files and Ultralock kept data transparently encrypted on the disk. Things have moved on and neither program is now particularly interesting for what it did, but the technique that they used to set themselves up was novel, has not as far as I know been documented, and may well be used in the future for less beneficent purposes.

To do its work, Ultralock (or Overdrive) had to splice itself into the program code of the operating system itself. For example, it had to detect when MSDOS was about to write some data to the disk, check whether the data belonged to a file that was defined as needing to be encrypted, and if so, encrypt the data before allowing MSDOS to write them to the disk. Other interceptions were needed in order to identify which file the data belonged to in the first place, to decrypt data when the disk was being read, and so on. At the deepest level, all the interceptions worked in the same way. They overwrote 5 bytes of program instructions in the operating system, replacing them with an instruction saying "Start executing instructions at XXXX", where XXXX was a location within Ultralock; of course they then also had to remember where the interception had come from so that Ultralock could arrange to return to the operating system at the appropriate point.
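The splice can be illustrated on a simulated block of memory. This is only a sketch under assumptions: the text says that 5 bytes were replaced by a jump, and here that jump is assumed to be an x86 real-mode far jump (opcode 0xEA followed by a 16-bit offset and a 16-bit segment, which is 5 bytes). The function names and the record layout are hypothetical, and what the real programs did with the displaced instructions is not described above.

```python
import struct

FAR_JMP_OPCODE = 0xEA   # assumed encoding: x86 real-mode "JMP FAR segment:offset"
SPLICE_LEN = 5          # from the text: 5 bytes of host code are overwritten


def install_splice(memory, site, target_segment, target_offset):
    """Overwrite SPLICE_LEN bytes at `site` with a far jump to the target.

    Returns a record holding the displaced bytes and the address just
    past the splice, i.e. the point at which the host must be resumed.
    """
    displaced = bytes(memory[site:site + SPLICE_LEN])
    # opcode, then offset and segment as little-endian 16-bit values
    jump = struct.pack("<BHH", FAR_JMP_OPCODE, target_offset, target_segment)
    memory[site:site + SPLICE_LEN] = jump
    return {"site": site, "displaced": displaced, "resume_at": site + SPLICE_LEN}


def remove_splice(memory, record):
    """Put the displaced bytes back, undoing the splice."""
    site = record["site"]
    memory[site:site + SPLICE_LEN] = record["displaced"]


# Simulated "operating system" code: 16 arbitrary bytes, not real MSDOS.
host = bytearray(range(16))
record = install_splice(host, site=4, target_segment=0x1234, target_offset=0x5678)
print(host[4:9].hex())   # ea78563412
remove_splice(host, record)
```

The record returned by `install_splice` plays the part of "remembering where the interception had come from": it holds both the original bytes and the address at which the host code continues.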

MSDOS at that time was slightly different on different computers, and each flavour had its program code stored in a slightly different location in memory. To start with, Ultralock had a list of flavours of MSDOS that it understood, and for each of those flavours it had a list of memory locations to check. It would go through these lists and check them against the computer that it found itself in, and when it came across a flavour that matched the computer exactly, it knew what host it had to deal with and where to place its interceptions. If nothing matched exactly, it asked the user to send us a system disk so that we could analyse it, work out where the necessary interceptions were, and add the new flavour to Ultralock's repertoire. Biology is not quite so simple as computing, but you can imagine a host that came in various strains and a parasite that had to match each strain exactly in order to infect it.
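The exact-match approach amounts to a table lookup. The sketch below assumes that each flavour is described by a set of addresses and the bytes expected there; the flavour names, addresses and byte values are invented for illustration and are not taken from any real version of MSDOS.

```python
from typing import Dict, Optional

# Hypothetical table: flavour name -> {address: bytes expected at that address}
KNOWN_FLAVOURS: Dict[str, Dict[int, bytes]] = {
    "flavour A": {0x10: b"\x2e\x3a", 0x40: b"\xcd\x21"},
    "flavour B": {0x18: b"\x2e\x3a", 0x52: b"\xcd\x21"},
}


def identify_flavour(memory, flavours=KNOWN_FLAVOURS) -> Optional[str]:
    """Return the first flavour whose every check matches exactly, else None."""
    for name, checks in flavours.items():
        if all(memory[addr:addr + len(expected)] == expected
               for addr, expected in checks.items()):
            return name
    return None   # unknown host: the user would be asked to send a system disk


# A simulated memory image that happens to look like "flavour B".
image = bytearray(0x80)
image[0x18:0x1A] = b"\x2e\x3a"
image[0x52:0x54] = b"\xcd\x21"
print(identify_flavour(image))
```

A single byte out of place makes the lookup fail, which is why every new flavour had to be analysed by hand and added to the table.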

This worked but it became tedious, and with the appearance of version 3 of MSDOS we took a different approach. Now Ultralock was given a list of instruction patterns to look for. It would scan through the operating system's program code looking for these patterns, and as it found each pattern it inserted the appropriate splice. The net effect was that we did not have to educate Ultralock any more: the parasite had become adaptive.
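The difference from a fixed table of addresses can be seen in a short sketch of the scanning step, again on simulated memory. The pattern names and byte sequences are invented for illustration, and the rule that a pattern must occur exactly once before it is used is an assumption added here for safety, not something stated above.

```python
def find_splice_sites(memory, patterns):
    """Scan `memory` for each byte pattern; return {name: offset}.

    A pattern that is missing, or that occurs more than once, is left
    out of the result (the uniqueness rule is an assumption of this sketch).
    """
    sites = {}
    for name, pattern in patterns.items():
        first = memory.find(pattern)
        if first == -1:
            continue                      # pattern not present in this host
        if memory.find(pattern, first + 1) != -1:
            continue                      # ambiguous: more than one candidate
        sites[name] = first
    return sites


# Hypothetical instruction patterns, one per interception.
PATTERNS = {
    "write_sector": b"\x50\x53\x51\x52\x06",
    "open_file": b"\x1e\x06\x8b\xec",
}

# Two simulated hosts with the same code at different locations.
host_v1 = b"\x90" * 7 + PATTERNS["write_sector"] + b"\x90" * 3 + PATTERNS["open_file"]
host_v2 = b"\x90" * 20 + PATTERNS["open_file"] + b"\x90" * 2 + PATTERNS["write_sector"]

print(find_splice_sites(host_v1, PATTERNS))
print(find_splice_sites(host_v2, PATTERNS))
```

The same pattern list locates the sites in both hosts even though the code has moved, which is the sense in which no further education was needed.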

This is (at least to an educated layman) exactly what is done when splicing genes using restriction enzymes: a pattern is recognised and an insertion is made at that point. To complete the analogy, the only thing that Ultralock lacked was a reverse transcriptase: it operated solely on the copy of the operating system that was already loaded into the computer's memory (the messenger RNA, transcribed and active), and the copy that was on disk (the DNA) was untouched.

The future of viruses

The present generation of computer viruses do relatively benign things. They erase crucial files or an entire hard disk, or they bring a network to its knees by a chain reaction of broadcasts. These actions are benign at a system level because they are detectable, and what you detect, you can (given adequate backups) recover from. They also do not bring financial benefit to their creators.

Now imagine a virus that did nothing, not even replicate itself... except that it used recombinant technology to recognise and splice itself into a spreadsheet program, where it did nothing... except to bias a certain financial calculation by one per cent (net present value, perhaps, or redemption yields). Such is the reliance of organisations on computers, and such is their uncritical trust of what a spreadsheet tells them (especially when the calculation is too tedious to do by hand) that the bias would go undetected for a long time. Imagine a company that depends on such calculations (a leasing company, or an annuity provider) for which a mismatch of 1% could be crippling. Such a situation might free an important market for other players or might make a substantial profit for speculators on the victim's share price.
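To see why a bias of one per cent would be hard to spot by eye, here is a small worked calculation. It is only an illustration: the cash flows and the discount rate are invented, the function names are hypothetical, and the bias is assumed to be a simple multiplication of the final result.

```python
def net_present_value(rate, cash_flows):
    """NPV of cash_flows[t] received t years from now (t = 0 is today)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def biased_npv(rate, cash_flows, bias=0.01):
    """The same calculation with the result inflated by `bias` (1% by default)."""
    return net_present_value(rate, cash_flows) * (1.0 + bias)


# Invented example: five annual lease payments of 10,000, discounted at 8%.
payments = [0, 10_000, 10_000, 10_000, 10_000, 10_000]
true_value = net_present_value(0.08, payments)
shown_value = biased_npv(0.08, payments)
print(f"true NPV:  {true_value:,.2f}")
print(f"shown NPV: {shown_value:,.2f}")
```

Both figures look equally plausible to anyone who is not repeating the discounting by hand, which is the point of the scenario.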

The contribution of recombinant technology in this case is that the spreadsheet program itself does not have to be altered: there is no need to manufacture and supply forged Microsoft CDROMs. With the prospect of substantial benefits, a substantial investment could be justified: for example, a batch of devices (CDROM drives, display cards, ...) could be offered to the victim at an irresistible price, and the driver software of those devices could incorporate the virus. Device drivers have unlimited privileges within Microsoft operating systems, so the driver could do its work during the whole of the year 2005 and then remain inert for the rest of its life. Operating system suppliers must recognise that their architectural choices have made such attacks possible and do something to protect against them. Ultimately, this is one more argument for product liability in software and the concomitant changes in design practices that are needed to establish exactly where liability lies.

Martin Kochanski’s web site / Software
