Latin alphabet

The Latin alphabet, also called the Roman alphabet, is the most widely used alphabetic writing system in the world, the standard script of the English language and most of the languages of western and central Europe, and of those areas settled by Europeans. In the nineteenth and twentieth centuries, the Latin alphabet became the standard script for a number of non-European languages as well.

Letters of the alphabet

As used by the English language, it consists of the following 26 letters:

A	B	C	D	E	F	G	H	I	J	K	L	M
N	O	P	Q	R	S	T	U	V	W	X	Y	Z

These letters may or may not be exclusive to a single language.

Evolution

According to legend, the Latin, or Roman, alphabet was created in the 8th century BC (more precisely, in 753 BC). It was based on the Etruscan alphabet, which was derived from the Greek. Of the original twenty-six Etruscan letters, the Romans adopted twenty-one. The original Latin alphabet was:

A	B	C	D	E	F	I	H	I	K	L
M	N	O	P	Q	R	S	T	V	X

C stood for both g and k.
The first I (between F and H) is the Greek zeta.
The second I stood for both i and j.
For a long time, R was written P.
V stood for u, v, and w.

Later the Greek zeta (I) was dropped and a new letter, G, was placed in its position. After the conquest of Greece in the first century BC, the letters Y and Z were adopted from the Greek alphabet and placed at the end, bringing the Latin alphabet to twenty-three letters. It was not until the Middle Ages that the letter J (to distinguish it from I) and the letters U and W (to distinguish them from V) were added. [1]

The alphabet used by the Romans consisted only of capital (upper case or majuscule) letters. The lower case (minuscule) letters developed in the Middle Ages from cursive writing, first as the uncial script, and later as minuscule script. The old Roman letters were retained for formal inscriptions and for emphasis in written documents. The languages that use the Latin alphabet generally use capital letters to begin paragraphs and sentences and for proper nouns. The rules for capitalization have changed over time, and different languages vary somewhat in their rules for capitalization. English, for example, used to capitalize all nouns, as German still does today.

Spread of the Latin Alphabet

The Latin alphabet spread from Italy, along with the Latin language, to the lands surrounding the Mediterranean Sea with the expansion of the Roman Empire. The eastern half of the Roman Empire, including Greece, Asia Minor, the Levant, and Egypt, continued to use Greek as a lingua franca, but Latin was widely spoken in the western half of the Empire. As the western Romance languages, including Spanish, French, Catalan, Portuguese and Italian, evolved out of Latin, they continued to use and adapt the Latin alphabet. The Latin alphabet spread to the Germanic peoples of northern Europe with the spread of western Christianity, displacing the earlier Runic alphabets. During the Middle Ages the Latin alphabet also came into use among the western and some of the southern Slavic peoples, including the Poles, Czechs, Croats, Slovenes, and Slovaks, as these nations adopted Roman Catholicism; the eastern Slavs generally adopted both Orthodox Christianity and the Cyrillic alphabet. The Baltic Lithuanians and Latvians, as well as the non-Indo-European Finns, Estonians, and Hungarians, also adopted the Latin alphabet.

By 1492, the Latin alphabet was limited primarily to the Roman Catholic and Protestant nations of western and central Europe. The Orthodox Christian Slavs of eastern and southern Europe mostly used the Cyrillic alphabet, and the Greek alphabet was still in use by Greek-speakers around the eastern Mediterranean. The Arabic alphabet was widespread within Islam, both among Arabs and non-Arab nations like the Turks and Iranians. Most of the rest of Asia used a variety of Brahmic alphabets or the Chinese script.

In the last 500 years, the Latin alphabet has spread around the world. It spread to the Americas, Australia, and parts of Asia, Africa, and the Pacific with European colonization, along with the Spanish, Portuguese, English, French, and Dutch languages. In the late eighteenth century, the Romanians adopted the Latin alphabet; although Romanian is a Romance language, the Romanians were predominantly Orthodox Christians, and until the nineteenth century the Church used the Cyrillic alphabet. Vietnam, under French rule, adapted the Latin alphabet for use with the Vietnamese language, which had previously used Chinese characters. The Latin alphabet is also used for many Austronesian languages, including Tagalog and the other languages of the Philippines, and the official Malaysian and Indonesian languages, replacing earlier Arabic and indigenous Brahmic alphabets. In 1928, as part of Kemal Atatürk's reforms, Turkey adopted the Latin alphabet for the Turkish language, replacing the Arabic alphabet. After the collapse of the Soviet Union in 1991, several of the newly independent Turkic-speaking republics adopted the Latin alphabet, replacing Cyrillic. Azerbaijan, Uzbekistan, and Turkmenistan have officially adopted the Latin alphabet for Azeri, Uzbek, and Turkmen, respectively. In the 1950s, the People's Republic of China developed an official romanization of Mandarin Chinese, called Pinyin, although the use of Chinese characters is still predominant.

Use in other languages

In the course of its history, the Latin alphabet was adapted for use with new languages, and some new letters and diacritics were created for this purpose, e.g.:

the cedilla in ç (originally a little z written below the c), which symbolized /ts/ in Romance
the háček in Slavonic languages, used to mark palatalised versions of the base letter, e.g. č
the tilde (originally a little n written above the letter), used in Spanish ñ and in some Portuguese vowels to mark the elision of a former N and later the nasalisation of the base letter; it also appears in the Estonian õ
the letters ă, â, î, ş and ţ, as used in the Romanian language

Please see 'Alphabets derived from the Latin' for a more complete list.

W is a letter made up of two V's or U's. It was added in late Roman times to represent a Germanic sound. U and J were originally not distinguished from V and I respectively. In Old English, ash (æ), eth (ð) and the Runic letters thorn (þ) and wynn (ƿ) were added. Eth and thorn were later replaced with 'th', and wynn with the new letter 'w'. In modern Icelandic, thorn and eth are still used. The additional letters used in German are special presentations of earlier ligature forms (ae → ä, ue → ü, ſs → ß). French uses the circumflex to record elided consonants that were present in earlier forms and are often still present in the modern English cognate forms (Old French hostel → French hôtel = English hotel; Late Latin pasta → Middle French paste → French pâte and English paste).

Some Slavic languages use the Latin alphabet rather than the Cyrillic. Among these, Polish uses a variety of digraphs with z to represent special phonetic values, and a stroked l (ł) for a sound similar to w. Czech uses diacritics, as in Dvořák; the term háček (caron) originates from Czech. Croatian uses carons in č, š, ž, an acute in ć and a bar in đ. The languages of Eastern Orthodox Slavs generally use Cyrillic instead, which is much closer to the Greek alphabet.

The African language Hausa uses three additional consonants: ɓ, ɗ and ƙ.

Collating in other languages

Alphabets derived from the Latin have varying collating rules:

In French and English, characters with a diaeresis (ä, ë, ï, ö, ü, ÿ) are usually treated just like their unaccented versions. If two French words differ only by an accent, the one with the accent sorts later. (However, the Unicode 3.0 book specifies a more complex traditional French sorting rule for accented letters.)
In German, the umlauted vowels (Ä, Ö, Ü) are generally treated just like their non-umlauted versions; ß is always sorted as ss. This gives the alphabetic order Arg, Ärgerlich, Arm, Assistent, Aßlar, Assoziation. For phone directories and similar lists of names, the umlauts are instead collated like the letter combinations "ae", "oe", "ue". This gives the alphabetic order Udet, Übelacker, Uell, Ülle, Ueve, Üxküll, Uffenbach.
In the Swedish alphabet, "W" is seen as a variant of "V" and not a separate letter. It is, however, recognised and maintained in names, as in "William". The alphabet also has three extra vowels placed at its end (..., X, Y, Z, Å, Ä, Ö). The same alphabet and collating rules are used for Finnish.
The same extra vowels as in Swedish are also present in the Danish and Norwegian alphabets but in a different order and with different glyphs (..., X, Y, Z, Æ, Ø, Å). Also, "Aa" collates as an equivalent to "Å". The Danish alphabet sees "W" as a variant of "V".
The Faroese alphabet also has some of these extra letters, namely Æ and Ø. Furthermore, the Faroese alphabet uses the eth, which follows the D. Five of the six vowels (A, I, O, U and Y) can take an accent, and the accented forms are considered separate letters. The consonants C, Q, X, W and Z are not found. Therefore the first five letters are A, Á, B, D and Ð, and the last five are V, Y, Ý, Æ and Ø.
Some languages have more complex rules: for example, Spanish treated "CH" and "LL" as single letters until 1997, giving the orderings CINCO, CREDO, CHISPA and LOMO, LUZ, LLAMA. This is no longer the case: in 1997 the RAE adopted the more conventional usage, so LL is now collated between LI and LO, and CH between CE and CI. The only remaining Spanish-specific collating rule is that Ñ (eñe) is a separate letter collated after N.
In Dutch, the combination IJ (representing the ligature Ĳ, the Dutch Y) was formerly collated as Y (or sometimes as a separate letter, Y < IJ < Z), but is currently mostly collated as two letters (II < IJ < IK). Note that when a word starting with ij is capitalized, both the I and the J are written as capitals, e.g. the town IJmuiden (mun. Velsen) and the river IJssel.
The Hungarian language has accents, umlauts, and double accents. The accent is ignored in collating, and the double accent, which indicates a long umlauted vowel, is treated as equal to the umlaut.
In Icelandic, Þ is added, and D is followed by Ð.
Both letters were also used by Anglo-Saxon scribes who also used the Runic letter Wynn to represent /w/.
Þ (called thorn; lowercase þ) is also a Runic letter; some scholars derive it from Latin D.
Ð (called eth; lowercase ð) is the letter D with an added stroke.
In Polish, specifically Polish letters derived from the Latin alphabet are collated after their originals: A, Ą, B, C, Ć, D, E, Ę, ..., L, Ł, M, N, Ń, O, Ó, P, ..., S, Ś, T, ..., Z, Ź, Ż.
In Czech, accented vowels are treated as their unaccented forms, but accented consonants (the ones with a háček) immediately follow their unaccented counterparts. The letter CH goes between H and I.
In Esperanto, consonants with circumflex accents (ĉ, ĝ, ĥ, ĵ, ŝ), as well as ŭ (u with breve), are counted as separate letters and collated separately (c, ĉ, d, e, f, g, ĝ, h, ĥ, i, j, ĵ ... s, ŝ, t, u, ŭ, v, z).
In Romanian, special characters derived from the Latin alphabet are collated after their originals: A, Ă, Â, ..., I, Î, ..., S, Ş, T, Ţ, ..., Z.
In Tatar, there are nine additional letters. Five of them are vowels, paired with main alphabet vowels as hard-smooth: a-ä, o-ö, u-ü, í-i, ı-e. The remaining four are consonants: ş is sh, ç is ch, ñ is ng and ğ is gh.
In Croatian and Serbian and related South Slavic languages, the five accented characters and two conjoined characters are sorted after the originals: ..., C, Č, Ć, D, DŽ, Đ, E, ..., L, LJ, M, N, NJ, O, ..., S, Š, T, ..., Z, Ž.
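
The two German orderings described above can be reproduced with a simple sort key. The following Python sketch is illustrative only: the names DICTIONARY, PHONEBOOK, german_key and sort_german are hypothetical, and the sketch makes no attempt to break ties between words that differ only in an umlaut.

```python
# Dictionary order: umlauted vowels sort like their base letters, ß like "ss".
DICTIONARY = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})

# Phone-book order: umlauted vowels sort like "ae", "oe", "ue"; ß like "ss".
PHONEBOOK = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def german_key(word, table):
    """Return the string that stands in for `word` when comparing."""
    return word.lower().translate(table)


def sort_german(words, table):
    """Sort words using one of the two substitution tables above."""
    return sorted(words, key=lambda word: german_key(word, table))


if __name__ == "__main__":
    print(sort_german(
        ["Assoziation", "Arm", "Aßlar", "Arg", "Assistent", "Ärgerlich"],
        DICTIONARY,
    ))
    print(sort_german(
        ["Uffenbach", "Üxküll", "Ueve", "Ülle", "Uell", "Übelacker", "Udet"],
        PHONEBOOK,
    ))
```

Run on the example words from the German entry, the two tables give the dictionary order and the phone-book order listed there.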

For multilingual situations with no one preferred language or alphabet, the Unicode Collation Algorithm can be used.
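
For a single language, the alphabet orders listed above can be approximated without the full Unicode Collation Algorithm by ranking each letter according to its position in that language's alphabet. The Python sketch below is a simplified illustration, not an implementation of the Unicode algorithm: make_sort_key and the two alphabet lists are hypothetical names, every letter gets its own rank (so variant letters such as the Swedish W and accent tie-breaking are not modelled), and letters made of several characters, such as the traditional Spanish CH and LL, are matched greedily.

```python
def make_sort_key(alphabet):
    """Build a sort key from an ordered list of letters (digraphs allowed)."""
    rank = {letter: index for index, letter in enumerate(alphabet)}
    longest = max(len(letter) for letter in alphabet)

    def key(word):
        word = word.lower()
        ranks = []
        pos = 0
        while pos < len(word):
            # Try the longest candidate first, so "ch" wins over "c" + "h".
            for size in range(longest, 0, -1):
                chunk = word[pos:pos + size]
                if chunk in rank:
                    ranks.append(rank[chunk])
                    pos += len(chunk)
                    break
            else:
                # Characters outside the alphabet sort after all listed letters.
                ranks.append(len(alphabet) + ord(word[pos]))
                pos += 1
        return ranks

    return key


# Swedish: three extra vowels at the end of the alphabet (assumed layout).
SWEDISH = list("abcdefghijklmnopqrstuvwxyzåäö")

# Traditional Spanish: CH and LL as single letters, Ñ after N.
SPANISH_TRADITIONAL = (
    "a b c ch d e f g h i j k l ll m n ñ o p q r s t u v w x y z".split()
)

if __name__ == "__main__":
    print(sorted(["öl", "zebra", "åka", "apa", "ärlig"],
                 key=make_sort_key(SWEDISH)))
    print(sorted(["CHISPA", "CREDO", "CINCO"],
                 key=make_sort_key(SPANISH_TRADITIONAL)))
```

With the traditional Spanish alphabet, the key places CHISPA after CREDO and LLAMA after LUZ, as in the orderings quoted above. A plain code-point sort of the Swedish words would instead put ä before å.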

References

Jensen, Hans. 1970. Sign Symbol and Script. London: George Allen and Unwin Ltd. Transl. of Die Schrift in Vergangenheit und Gegenwart. VEB Deutscher Verlag der Wissenschaften. 1958, as revised by the author.
Rix, Helmut. 1993. "La scrittura e la lingua". In: Cristofani, Mauro (ed.). 1993. Gli etruschi - Una nuova immagine. Firenze: Giunti. pp. 199-227.
Sampson, Geoffrey. 1985. Writing systems. London (etc.): Hutchinson.
Wachter, Rudolf. 1987. Altlateinische Inschriften: sprachliche und epigraphische Untersuchungen zu den Dokumenten bis etwa 150 v.Chr. Bern (etc.): Peter Lang.
Biktaş, Şamil. 2003. Tuğan Tel.
