40Hex Number 9 Volume 2 Issue 5 File 007 ------------------------------------------------------------------------- A New Virus Naming Convention At the Anti-Virus Product Developers Conference organized by NCSA in Washington in November 1991 a committee was formed with the objective of reducing the confusion in virus naming. This committee consisted of Fridrik Skulason (Virus Bulletin's technical editor) Alan Solomon (S&S International) and Vesselin Bontchev (University of Hamburg). The following naming convention was chosen: The full name of a virus consists of up to four parts, desimited by points ('.'). Any part may be missing, but at least one must be present. The general format is Family_Name.Group_Name.Major_Variant.Minor_Variant Each part is an identifier, constructed with the characters [A-Za-z0-9_$%&!'`#-]. The non-alphanumeric characters are permitted, but should be avoided. The identifier is case-insensitive, but mixed-case characters should be used for readability. Usage of underscore ('_') (instead of space) is permitted, if it improves readability. Each part is up to 20 characters long (in order to allow such monstriosities like "Green_Caterpillar"), but shorter names should be used whenever possible. However, if the shorter name is just an abbreviation of the long name, it's better to use the long name. 1. Family names. The Family_Name represents the family to which the virus belongs. Every attempt is made to group the existing viruses into families, depending on the structural similarities of the viruses, but we understand that a formal definition of a family is impossible. When selecting a Family_Name, the following guidelines must be applied: "Must" 1) Do not use company names, brand names or names of living people, except where the virus is provably written by the person. Common first names are permissible, but be careful - avoid if possible. In particular, avoid names associated with the anti-virus world. If a virus claims to be written by a particular person or company do not believe it without further proof. 2) Do not use an existing Family_Name, unless the viruses belong to the same family. 3) Do not invent a new name if there is an existing, acceptable name. 4) Do not use obscene or offensive names. 5) Do not assume that just because an infected sample arrives with a particular name, that the virus has that name. 6) Avoid numeric Family_Names like V845. They should never be used as family names, as the members of the family may have different lengths. When a new virus appears and a new Family_Name must be selected for it, it is acceptable to us a temporary name like _1234, but this must be changed as soon as possible. "Should" 1) Avoid Family_Names like Friday 13th, September 22nd. They should not be used as family names, as members of the family may have different activation dates. 2) Avoid geographic names which are based on the discovery site - the same virus might appear simultaneously in several different places. 3) If multiple acceptable names exist, select the original one, the one used by the majority of existing anti-virus programs or the more descriptive one. "General" 1) All short (less than 60 bytes) overwriting viruses are grouped under a Family_Name, called Trivial. 2. Group names. The Group_Name represents a major group of similar viruses in a virus family, something like a sub-family. Examples are AntiCAD (a distinguished clone of the Jerusalem family, containing numerous variants), or 1704 (a group of several virus variants in the Cascade family). When selecting a Group_Name, the same guidelines as for a Family_Name should be applied, except that numeric names are more permissible - but only if the respective group of viruses is well known under this name. 3. Major variant name. The major variant name is used to group viruses in a Group_Name, which are very similar, and usually have one and the same infective length. Again, the above guidelines are applied, with one major exception. The Major_Variant is almost always a number, representing the infective length, since it helps to distinguish that particular sub-group of viruses. The infective length should be used as Major_Variant name always when it is known. Exceptions of this rule are: 1) When the infective length is not known, because the viruses are not yet analyzed. In this case, consecutive numbers are used (1, 2, 3, etc.). This should be changed as soon as more information about the viruses becomes known. 2) When an alpha-numeric name of the virus sub-group already exists and is popular, or more descriptive. 4. Minor variant name. Minor variants are viruses with the same infective length, with similar structure and behaviour, but slightly different. Usually the minor variants are different patches of one and the same virus. When selecting a Minor_Variant name, usually consecutive letters of the alphabet are used (A, B, C, etc...). However, this is not a very hard restriction and longer names can be used as well, especially if the virus is already known under this (longer) name, or if the name is more descriptive than just a letter. The producers of virus detection software are strongly usrged to use the virus names proposed here. The anti-virus researchers are advised to use the described guidelines when selecting names for new viruses, in order to avoid further confusion. If a scanner is not able to distinguish between tow minor variants of a virus, it should output the virus name up to the recognized major variant. For instance, if it cannot distinguish between Dark_Avenger.2000.Traveller.Copy and Dark.Avenger.Traveller.Zopy, it should report both variants of the virus as Dark.Avenger.Traveller. If it is also not able to distinguish between the major variants, it should report the virus up to the recognized group name. That is, if the scanner cannot make the difference between Dark_Avenger.2000.Traveller.* and Dark_Avenger.2000.Die_Young, it should report all the variants as Dark_Avenger.2000. ------------------------------------------------------------------------- We at Phalcon/Skism welcome the proposals of this new committee. It is a step in the right direction, helping clear up the mess caused by the generation disorganisation which has dominated the virus naming conventions to date. Additionally, if implemented properly, it will aid in identification of strains. John McAfee's SCAN, which had been the best virus scanner, fell from grace recently, when it implemented a new policy of merging scan strings, causing confusion in identification. Fridrik Skulason's F-Prot is the current champion of virus identification. However, we must voice concerns that the rules are not strict enough. There are clearly too few rules to cover the numerous viruses which currently exist. Family, group, and major variant names for most current common viruses should be established now. These guidelines need be created ASAP to avoid later confusion. In the example in the last two paragraphs, Dark Avenger strains are labelled separately as Dark_Avenger.2000 and Dark.Avenger. Such confusion is simply not acceptable. Wherever possible, the current common names should be kept. It would be a shame if the world lost the Jerusalem family to some mad individual who wishes to name it 1808. The rules cover this, but it is important to set this down initially before stupid people butcher the rules. Number names are neither informative nor interesting. Imagine advertising a product as being able to catch "the deadly 605 virus." Some knobs have proposed a numerical classification scheme of viruses. They're living in a dream world. We applaud the efforts of the committee and may only hope that anti- virus developers attempt to adhere to the proposed rules. Hopefully, Mr. Skulason and Dr. Solomon will lead the way, converting their own products to this new naming convention. And who will classify the viruses? We propose an open forum for discussion on a large network such as UseNet or FidoNet moderated by either a virus researcher or anti-virus developer. This will allow input from many people, some of whom have particular specialties within certain groups of viruses.