ISO 639 is a set of international standards that list short codes for language names. It consists of several parts, of which three have been published (the third, ISO 639-3, in 2007). The other parts are works in progress.
- ISO 639-1:2002 Codes for the representation of names of languages -- Part 1: Alpha-2 code (see List of ISO 639-1 codes)
- ISO 639-2:1998 Codes for the representation of names of languages -- Part 2: Alpha-3 code (see List of ISO 639-2 codes)
- ISO 639-3:2007 Codes for the representation of names of languages -- Part 3: Alpha-3 code for comprehensive coverage of languages (see List of ISO 639-3 codes)
- ISO/CD 639-4: 2008? Codes for the representation of names of languages -- Part 4: Implementation guidelines and general principles for language coding
- ISO/DIS 639-5: 2008? Codes for the representation of names of languages -- Part 5: Alpha-3 code for language families and groups
- ISO/CD 639-6: 2008? Codes for the representation of names of languages -- Part 6: Alpha-4 representation for comprehensive coverage of language variation
Use of ISO 639 codes
The language codes defined in the various parts of ISO 639 are used for bibliographic purposes and, in computing and Internet environments, as a key element of locale data. The codes are also used in various applications, for example in the URLs of Wikipedia's different language editions.
Alpha-2 code space
"Alpha-2" codes (for codes composed of 2 letters of the basic Latin alphabet) are used in ISO 639-1. Thus, there are <math>26^2=676</math> distinct Alpha-2 codes. This is clearly insufficient to cover all languages, which led to the creation of ISO 639-2 and the use of Alpha-3 codes.
Alpha-3 code space
"Alpha-3" codes (for codes composed of 3 letters of the basic Latin alphabet) are used in ISO 639-2 and ISO 639-3 and will eventually be used in ISO 639-5. Mathematically, the upper limit for the number of languages and language collections that can be so represented is <math>26^3=17,576</math>. The common use of Alpha-3 codes by three parts of ISO 639 requires some coordination within a larger system. Part 2 defines four special codes mul, und, mis, zxx, a reserved range qaa-qtz (20 × 26 = 520 codes) and has 23 double entries (the B/T codes). This sums up to 520 + 23 + 4 = 547 codes that cannot be used in part 3 to represent languages or in part 5 to represent language families or groups. The remainder is 17,576 – 547 = 17,029. There are somewhere around six or seven thousand languages on Earth today[1][2]. So those 17,029 codes are adequate to assign a unique code to each language, although some languages may end up with arbitrary codes that sound nothing like traditional name(s) of that language.
Alpha-4 code space
"Alpha-4" codes (for codes composed of 4 letters of the basic Latin alphabet) is proposed to be used in ISO 639-6. Mathematically, the upper limit for the number of languages and dialects that can be so represented is <math>26^4=456,976</math>.
See also
- list of ISO 639-1 codes
- list of ISO 639-2 codes
- list of ISO 639-3 codes
- language code
- language families and languages
- list of languages
- list of official languages
- ISO 3166 (codes for countries)
- ISO 15924 (codes for writing systems)
- IETF language tags (based on ISO 639)
External links
- ISO 639-2 Registration Authority
- XML version of the official ISO 639-2 HTML data from the Library of Congress
- ISO 639-3 Registration Authority
- ISO 639 and the Ethnologue
- Language codes in English and Italian with Perl scripts for parsing and PHP code
- British Standards Institute

