Ubykh is a language of the Northwest Caucasian group, spoken by the Ubykh people until the early 1990s.

The name Ubykh is derived from wəbəx, its name in the Abdzakh Adyghe (Circassian) language. The language is known in the linguistic literature by many names: variants of Ubykh, such as Ubikh, Ubıh (Turkish) and Oubykh (French); and Pekhi (from Ubykh tʷaχə) with its Germanicised variant Päkhy.

Major features

Ubykh is distinguished by the following features, some of which are shared with other Northwest Caucasian languages:

  • It is ergative, making no syntactic distinction between the subject of an intransitive sentence and the direct object of a transitive sentence.
  • It is highly agglutinative, using mainly monosyllabic or bisyllabic roots, but with single morphological words sometimes reaching nine or more syllables in length: [aχʲazbatʂʾaʁawdətʷaajlafaqʾajtʾmadaχ] if only you had not been able to make him take it all out from under me again for them. Affixes rarely fuse in any way.
  • It has a simple nominal system, contrasting just four noun cases, and not marking grammatical number in the direct or locative cases.
  • Its system of verbal agreement is quite complex. English verbs must agree only with the subject; Ubykh verbs must agree with the subject, the direct object and the indirect object, and benefactive objects must also be marked in the verb.
  • It is phonologically complex as well, with 83 distinct consonants (three of which, however, appear only in loan words). It has only two phonological vowels, but these vowels have a large range of allophones because the range of consonants which surround them is so large.


  • Ubykh once held the world record for consonant sounds. It has since been eclipsed by the !Kung Bushman language, which is now known to exceed Ubykh by 34 consonants.
  • Ubykh has 26 fricative phonemes, more than any other known language.
  • Ubykh has just seven of the 10 phonemes noted in Pirahã, the language with the fewest phonemes.
  • The Ubykh phoneme vˤ, a pharyngealised labiodental voiced fricative, may not exist in any other language on Earth.
  • Ubykh has some 17 ejective phonemes, but lacks a glottal stop.
  • The sounds dʷ, tʷ and tʷʾ appear only in Ubykh, its relatives Abkhaz and Abaza, and two other languages, both of which are found only in the Amazon rainforest.
  • Ubykh has more than twenty verb prefixes to delineate spatial relationships.
  • Ubykh may be related to Hattic, a language spoken in Anatolia before 2000 BC and written in a cuneiform script.


Unfortunately, the phonetics of Ubykh are so complex that it still does not have a satisfactory ASCII transcription system. Ubykh had no native writing system, so all transcriptions here are in IPA.


Ubykh has only two (arguably three) basic phonemic vowels: close /ə/ (schwa, as in English "about") and open /a/ and /aa/. The latter two differ in quality rather than in length, although diachronically aa derives from sequences of a + a.

However, there are many vowel allophones, conditioned by the secondary articulation of the surrounding consonants. Ten basic phonetic vowels appear, derived from the two phonemic vowels when they are adjacent to labialised or palatalised consonants. These ten phonetic vowels are a e i o u and aː eː iː oː uː, that is, the standard five found in many of the world's languages, such as Georgian, and the same five vowels with increased phonetic length. In general, the following rules apply:

Cʷa > Co    aw > oː
Cʲa > Ce    aj > eː
Cʷə > Cu    əw > uː
Cʲə > Ci    əj > iː
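The consonant-plus-vowel rules above can be sketched as a simple string rewrite. This is only an illustration, assuming IPA input in which the labialisation mark ʷ or the palatalisation mark ʲ stands directly before the vowel; the name colour_vowels is hypothetical, and the sketch ignores ejectives, long vowels and the finer allophony of the language.

```python
# Illustrative sketch only: applies the four rules
#   Cʷa > Co, Cʲa > Ce, Cʷə > Cu, Cʲə > Ci
# to an IPA string. The secondary articulation mark is absorbed
# into the quality of the following vowel.
RULES = {
    "ʷa": "o",
    "ʲa": "e",
    "ʷə": "u",
    "ʲə": "i",
}


def colour_vowels(phonemic: str) -> str:
    """Return a rough phonetic form for a phonemic IPA string."""
    phonetic = phonemic
    for source, target in RULES.items():
        # The targets never contain ʷ, ʲ, a or ə, so the order of
        # replacement does not matter.
        phonetic = phonetic.replace(source, target)
    return phonetic


print(colour_vowels("tʷə"))  # tu
print(colour_vowels("gʲa"))  # ge
```

Sequences in which another mark intervenes between the secondary articulation and the vowel (for example tʷʾ before a) are deliberately left untouched by this sketch.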

Other, more complex vowels have been noted in Ubykh. The word /ajəwɕqʾa/ you did it can be realised as [ayɕqʾa] rather than the expected [ajəwɕqʾa]. On occasion, nasal sonorants (particularly n) may even decay into vowel nasality. For instance, /najnɕʷ/ young man has been noted as [nɛ̃jɕʷ], not as [najnɕʷ], which the phonemic notation would indicate.

a appears initially very frequently, particularly in the function of the definite article. ə is extremely restricted initially, appearing only in doubly transitive verb forms where all three arguments are third person. Even then, ə itself may be dropped to provide an even shorter form:

He gave it to him

Both vowels appear without restriction finally, although when ə is unstressed finally, it tends to be dropped: tʷə father becomes the definite form a.tʷ the father.


Eighty-three basic consonants are noted at nine basic points of articulation. Labialisation is present on all classes barring the glottal, bilabial, labiodental and retroflex consonants; palatalisation may be noted on uvulars and velars. Pharyngealisation of consonants, rare among the world's languages, is a distinctive feature. The system is very symmetrical in the main (for instance, the sets of affricates are all complete), but some interesting asymmetries may be noted.

                       Voiced  Voiceless  Ejective  Nasal  Approximant
Bilabial stop          b       p          p'        m      w
Phar. bilabial stop    b       p          p'        m      w
Labiodental fricative  v       f
Alveolar stop          d       t          t'        n      r
Alveolar fricative     z       s
Alveolar affricate     j       c          c' 
Alv. labialised stop   dw      tw         tw' 
Alveolar lateral               lh         lh'/l'           l
Postalveolar fric.     zh      sh                          y
Postalveolar affr.     jh      ch         ch' 
Postalv. lab. fric.    zhw     shw
Alveolopalatal fric.   zj      sj
Alveolopalatal affr.   jj      cj         cj' 
Alv-pal. lab. fric.    zy      sy
Alv-pal. lab. affric.  jy      cy         cy' 
Retroflex fric.        zr      sr
Retroflex affr.        jr      cr         cr' 
Velar stop             g *     k *        k' *
Velar fricative        g       k
Palatalised velar stop gj      kj         kj' 
Labialised velar stop  gw      kw         kw' 
Uvular stop                    q          q' 
Uvular fricative       gh      qh
Pal. uvular stop               qj         qj' 
Pal. uvular fric.      ghj     qhj
Labialised uvular stop         qw         qw' 
Lab. uvular fricative  ghw     qhw
Phar. uvular stop              q          q' 
Phar. uvular fric.     gh      qh
Phar. lab. uvular stop         qw         qw' 
Phar. lab. uv. fric.   ghw     qhw
Glottal                        h
* borrowed from Turkish and Circassian

An IPA rendition of the above is available in the Ubykh phonology article.

The consonants /r/ and /h/ are very rare in Ubykh, freeing them for use as digraphic elements. Underlining marks a pharyngealised consonant, /h/ marks postalveolarisation and frication of dorsal sounds (as in English orthography), /r/ marks retroflexion (as in Vietnamese orthography) and apostrophe marks ejectivity.

All but three of the 83 consonants are found in native vocabulary. The plain velars [k' g k] are found only in loans: gaarga crow (from Turkish), kawar slat, batten (from Abdzakh Adyghe), mak'æf estate, legacy. As well, the pharyngealised labial consonants p and p' are almost exclusively noted in words where they are associated with another pharyngealised consonant (for instance, q'aap'a handful), but are occasionally found outside this context (the verb root t'ap' is an example, meaning to explode, to burst).

Some consonants are extremely rare: g (fricative) is noted in the words adæga Circassia and ga testis, and v is noted in just five words: va (four homophones meaning oak, to spy on, moustache and acorn), vacr'ækj' meaning spark, vasra firebrand, ava thick (of fabric) and sæp'ava coarse flour. The frequency of consonants in Ubykh is very variable; the two phonemes n and q' account for over 20% of the consonant phonemes encountered.

Far fewer allophones of consonants are noted, mainly because a small acoustic difference can be phonemic when so many consonants are involved. The alveolopalatal labialised fricatives were sometimes realised as alveolar labialised fricatives, and the uvular ejective stop q' in the past tense suffix -q'a was often pronounced as a glottal stop due to the influence of the Kabardian and Adyghe languages.

All consonants can appear word-initially. Restrictions on word-final consonants have not yet been investigated; however, Ubykh has a slight preference for open syllables (CV) over closed ones (VC or CVC). The pharyngealised consonants m, w, p and p' have not been noted word-finally.


Ubykh is agglutinative and polysynthetic, meaning that many sentence components can be incorporated into one word:

we shall not be able to go back
if you had said it

Ubykh is often extremely concise in its word forms: the word azbacr'aghawtwaaylafaq'ayt'daqh if only you had been able to take it all out from under me again is just nine syllables, much shorter than the 19 syllables of the English translation.
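The syllable count quoted above can be checked with a rough sketch. It assumes that, in the practical transcription used here, every syllable nucleus is written a, aa or æ and that aa counts as a single nucleus; the helper name count_syllables is hypothetical.

```python
import re

# Assumption: each syllable has exactly one vowel nucleus, written
# a, aa or æ in the practical transcription. "aa" is listed first so
# that it is matched as one nucleus rather than two.
NUCLEUS = re.compile(r"aa|a|æ")


def count_syllables(word: str) -> int:
    """Count vowel nuclei as a proxy for syllables."""
    return len(NUCLEUS.findall(word))


word = "azbacr'aghawtwaaylafaq'ayt'daqh"
print(count_syllables(word))  # 9
```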

The boundary between nouns and verbs in Ubykh is somewhat blurred. Any noun can be used as the root of a stative verb (mæzæ child, sæmæzæyt' I was a child), and many verb roots can become nouns simply by the use of noun affixes (q'a to say, sæ.q'a my speech, what I say).


The noun system in Ubykh is quite simple. Ubykh has four noun cases (the oblique-ergative case may be two homophonous cases with differing function, thus presenting five cases in total):

A pair of postpositions, -laaq and -ghaafa, have been noted as synthetic datives (cf. a.xjæ.laaq a.s.twad(æ).aw I will send it to the prince), but their status as cases is best discounted.

Nouns do not distinguish grammatical gender; feminine gender is distinguished in the verb paradigm only. The definite article is a-: atæt the man. There is no indefinite article, but za-(root)-gwara (literally one-(root)-certain) translates French un and Turkish bir: za.naynshw.gwara a certain young man.
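As a rough illustration of the two constructions just described, the following sketch builds the definite form with a- and the indefinite frame za-(root)-gwara by plain concatenation. The function names are hypothetical, and the sketch assumes no sound changes at the morpheme boundaries, which is a simplification (a final unstressed ə, for example, may be dropped).

```python
def definite(noun: str) -> str:
    """Prefix the definite article a- to a noun root."""
    return "a" + noun


def indefinite(noun: str) -> str:
    """Wrap a noun root in the frame za-(root)-gwara, 'a certain ...'.

    Morpheme boundaries are marked with dots, as in the examples above.
    """
    return f"za.{noun}.gwara"


print(definite("tæt"))        # atæt
print(indefinite("naynshw"))  # za.naynshw.gwara
```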

Number is only marked on the noun in the ergative case, with -na. The number marking of the absolutive argument is either by suppletive verb roots (e.g. akwæn blas he is in the car vs akwæn blazhwa they are in the car) or by a verb suffix -aa: akj'an he goes, akj'aan they go. Interestingly, the second person plural prefix syæ- triggers this plural suffix regardless of whether that prefix represents the ergative, the absolutive or the oblique argument:

syastwaan   I give you all to him (absolutive)
sæsyæntwaan he gives me to you all (oblique)
asæsytwaan  you all give it/them to me (ergative)

Note that in this last sentence, the plurality of it (a-) is obscured; the meaning can be either you all give it to me or you all give them to me.

Adjectives, in most cases, are simply suffixed to the noun: chæbzjæya pepper with plhæ red becomes chæbzjæyaplhæ red pepper.

Postpositions are rare; most locative semantic functions, as well as some non-local ones, are provided with preverbal elements: a.wæ.s.qhja.txæ.n you wrote it for me. However, there are a few postpositions: sæghwa sæ.gjacr', like me; a.xjæ.laaq, near the prince.


A past-present-future distinction of verb tense exists (the suffixes -q'a and -awt represent past and future), and an imperfective aspect suffix is also found (-yt', which can combine with tense suffixes). Dynamic and stative verbs are contrasted, as in Arabic, and verbs have several nominal forms. The conjunctions and and but are expressed by verb suffixes: -gjæ and, -gjæla but, however, although. Morphological causatives are not uncommon.

Verbs agree with the subject, the direct object and the indirect object. Pronominal benefactives are also part of the verbal complex:

He gives it to you for me

Gender only appears as part of the second person paradigm, and then only at the speaker's discretion. The feminine second person index is qha-, which behaves like other pronominal prefixes:

He gives it to you (masc/fem) for me
He gives it to you (fem) for me

Note that the normal second person prefix wæ- can represent either a male or female referent, borne out by the fact that the free pronoun for second person singular is wæghwa (*qhaghwa is not used):

wæghwa wæ.gjacr' 
you 2sg-POSS.like
like you (masc/fem)
wæghwa qha.gjacr' 
you 2sg(fem)-POSS.like
like you (fem)
*qhaghwa qha.gjacr' 
(not possible)


A few meanings covered in English by adverbs or auxiliary verbs are given in Ubykh by verb suffixes:

I need to eat it
I can eat it
I eat it all the time
I am eating it all up
I eat it too much
I eat it again


Questions may be marked grammatically, using verb suffixes or prefixes:

wana a.w.bya.q'a.sr
that 3sg-OBJ.2sg-SUBJ.see.PAST.QUESTION(yes/no)
Did you see that?
saakj'a wæ.p'c'a.y
what 2sg-SUBJ.name.QUESTION(complex)
What is your name?
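The interlinear examples in this article separate morphemes with dots in both the Ubykh line and the gloss line, so the two lines can be paired up mechanically. The sketch below assumes that convention holds (one gloss label per morpheme, words separated by spaces); the function name align is hypothetical.

```python
def align(form: str, gloss: str) -> list[list[tuple[str, str]]]:
    """Pair each dot-separated morpheme with its gloss label.

    Raises ValueError if the two lines do not match in shape.
    """
    words = form.split()
    glosses = gloss.split()
    if len(words) != len(glosses):
        raise ValueError("different number of words in form and gloss")

    aligned = []
    for word, word_gloss in zip(words, glosses):
        morphemes = word.split(".")
        labels = word_gloss.split(".")
        if len(morphemes) != len(labels):
            raise ValueError(f"cannot align {word!r} with {word_gloss!r}")
        aligned.append(list(zip(morphemes, labels)))
    return aligned


pairs = align(
    "wana a.w.bya.q'a.sr",
    "that 3sg-OBJ.2sg-SUBJ.see.PAST.QUESTION(yes/no)",
)
for word in pairs:
    print(word)
```

Each inner list corresponds to one word, so the verb in the first example comes out as five morpheme and label pairs, from a (3sg-OBJ) to sr (the yes/no question marker).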

Other types of questions, involving the pronouns where and what, may also be marked only in the verbal complex:

Where are you going?
What had you said?

Preverbs and Determinants

Many local and other functions are provided by preverbal elements, and it is here that Ubykh is at its most complex. Two main types of preverbal elements exist in Ubykh: determinants and preverbs. The number of preverbs is limited, and they mainly show location and direction. The number of determinants is also limited, but the class is more open; some determinant prefixes include cja- with regard to a horse and lha- with regard to the foot or base of an object.

For simple locations, there are a number of possibilities that can be encoded with preverbs, including (but not limited to):

  • above and touching
  • above and not touching
  • below and touching
  • below and not touching
  • at the side of
  • through a space
  • through solid matter
  • on a flat horizontal surface
  • on a non-horizontal or vertical surface
  • in a homogeneous mass
  • towards
  • in an upward direction
  • in a downward direction
  • into a tubular space
  • into an enclosed space

There is also a separate directional preverb meaning towards the speaker: y-, which occupies a separate slot in the verbal complex. However, preverbs can have meanings that would take up entire phrases in English. The preverb ycy'aa signifies on the earth or in the earth, for instance:

gha.dya a.ycy'aa.naa.lh.q'a
3sg-POSS.corpse 3sg-OBJ.in-the-earth.3pl-SUBJ.lie.PAST
They buried his body (lit. They put his body in the earth)

and faa signifies that an action is done out of, into or with regard to a fire:

a.mjja.n za.cjæcjaqja faa.s.tqhwæ.n
DEF.fire.OBL one.fire-brand fire.1sg-SUBJ.extract.PRES
I take a brand out of the fire


Native Vocabulary

Ubykh syllables have a strong tendency to be CV, although VC and CVC also exist. Consonant clusters are not as large as in Abzhui Abkhaz or in Georgian, being almost always of two terms. Three-term clusters exist in two words, ndgha sun and psta to swell up, but psta is a loan from Adyghe, and ndgha is more often pronounced nædgha when it appears alone. Compounding plays a large part in Ubykh and, indeed, in all Northwest Caucasian semantics. There is no verb to love, for instance; one says I love you this way:

ch'a.næ wæ.z.bya.n
good.ADV 2sg-OBJ.1sg-SUBJ.see.PRES
I see you well

Reduplication occurs in some roots, often those with onomatopoeic values (qhaqha to curry(comb) from qha to scrape; k'ærk'ær, to cluck like a chicken (a loan from Adyghe); warqwarq, to croak like a frog).

Roots and affixes can be as small as one phoneme. The word wantwaan they give you to him, for instance, contains six phonemes, and each is a separate morpheme:

w-    2nd singular absolutive
a-    3rd singular dative
n-    3rd ergative
tw-   to give
-aa   ergative plural
-n    present tense
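The breakdown above can be written out as data and reassembled, which makes the one-phoneme-per-morpheme claim easy to check. The variable names are illustrative only; tw and aa are each a single phoneme written with two letters in the practical transcription.

```python
# Each entry is (morpheme, function), in the order given above.
MORPHEMES = [
    ("w", "2nd singular absolutive"),
    ("a", "3rd singular dative"),
    ("n", "3rd ergative"),
    ("tw", "to give"),
    ("aa", "ergative plural"),
    ("n", "present tense"),
]

# Concatenating the morphemes in order rebuilds the full word.
word = "".join(morpheme for morpheme, _ in MORPHEMES)

print(word)            # wantwaan
print(len(MORPHEMES))  # 6
```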

However, some words may be as long as seven syllables (although these are usually compounds): shæqw'awæsjalhadæcja staircase.

Slang and Idioms

As with all other languages, Ubykh is replete with idioms. The word ntwa door, for instance, is an idiom meaning either magistrate, court or government. Some slang terms and idioms can be shown to be caused by historical events; the term wæræs Russian, a Turkish loan, has come to be a slang term meaning infidel, non-Muslim or enemy (see section History).

Foreign Loans

The majority of loanwords in Ubykh are derived from either Adyghe or Turkish. Towards the end of Ubykh's life, a large influx of Adyghe words was noted; Hans Vogt's Ubykh dictionary of some 3000 roots notes more than a hundred examples. The phonemes g, k, and k' (all as stops, not fricatives) were borrowed from Turkish and Adyghe. lh' also appears to be an Adyghe loan, although at a greater time depth. It is possible, too, that g (fricative) is a loan from Adyghe.

The following words are all loans:

alaman   Germany, German (Turkish)
aslan    lion (Turkish)
bærwæ    to drill, to auger, to perforate (Turkish, via Abdzakh)
ga       testis (Adyghe)
gaarga   crow (Turkish)
kawar    slat, batten (Abdzakh)
k'ærk'ær to cluck like a chicken (Adyghe)

Many loanwords have native Ubykh equivalents, which were being displaced under the influence of these other languages:

  • bærwæ to make a hole in, to perforate (Turkish) = psjaatqhw
  • chay tea (Turkish) = bzæpsræ
  • wæræs enemy (Turkish) = baq'a
  • kawar slat, batten (Abdzakh) = sætqha

Some words, usually much older ones, are borrowed from less influential stock: xwa pig is believed to be borrowed from a proto-Semitic *huka, and agjaræ slave from an Iranian root.


In the scheme of Northwest Caucasian evolution, Ubykh is the most divergent language of the Abkhaz-Abaza branch, and has a number of features which are unique even within that family. It has fossilised palatal class markers where all other Northwest Caucasian languages preserve traces of an original labial class: the Ubykh word for heart, gjæ, corresponds to gwæ (where æ stands for schwa), which is the word for heart in Abkhaz, Abaza, Kabardian and Adyghe.

Ubykh also possesses groups of pharyngealised consonants otherwise found in the Northwest Caucasian family only in some dialects of Abkhaz and Abaza. All other Northwest Caucasian languages possess true pharyngeal consonants, but Ubykh is the only language to use pharyngealisation as a feature of secondary articulation.

With regard to the other languages of the family, Ubykh is closer to Abkhaz than to any other member, but is quite close, both lexically and grammatically, to Adyghe.


While not many dialects of Ubykh exist, one divergent dialect of Ubykh has been noted. Grammatically, it is basically the same as standard Ubykh, but has a very different sound system, which has collapsed into just 62-odd phonemes:

  • tw tw' dw have collapsed into p p' b.
  • sy zy are indistinguishable from shw zhw.
  • Fricative g seems to have disappeared.
  • Pharyngealisation is no longer distinctive, having been replaced in many cases by geminate consonants.
  • Palatalisation of the uvular consonants is no longer phonemic.
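The first two mergers in the list above amount to a mapping from standard segments to their counterparts in the divergent dialect. The sketch below models only those two bullets and works on a list of already separated segments in the practical transcription; the name to_divergent is hypothetical, and the input is an arbitrary list of segments, not an attested word.

```python
# Only the first two changes listed above are modelled here:
#   tw tw' dw  >  p p' b
#   sy zy      >  shw zhw
MERGERS = {
    "tw": "p",
    "tw'": "p'",
    "dw": "b",
    "sy": "shw",
    "zy": "zhw",
}


def to_divergent(segments: list[str]) -> list[str]:
    """Map standard Ubykh segments to the divergent dialect's segments."""
    return [MERGERS.get(segment, segment) for segment in segments]


standard = ["tw", "p", "sy", "shw", "q'"]
merged = to_divergent(standard)

print(merged)
# Five distinct segments in the input collapse to three.
print(len(set(standard)), len(set(merged)))
```

The loss of fricative g, of distinctive pharyngealisation and of palatalised uvulars would need further rules, which the description above does not spell out in enough detail to model.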


Ubykh was spoken on the eastern coast of the Black Sea, around Sochi, until 1864, when the Ubykhs were driven out of the region by the Russians. They eventually settled in Turkey, and came to use Turkish and Circassian for everyday communication. Many words from these languages entered Ubykh in that period.

The Ubykh language died out on October 7, 1992, when its last fluent speaker, Tevfik Esenç, died in his sleep. Fortunately, before that time thousands of pages of material and many audio recordings had been collected and collated by a number of linguists, including Georges Dumézil, Hans Vogt and George Hewitt, with the help of some of its last speakers, particularly Tevfik Esenç and Huseyin Kozan.

Julius von Mészáros, a Hungarian linguist, visited Turkey in 1930 and took down some notes on Ubykh. His work Die Päkhy-Sprache was extensive and accurate to the extent allowed by his transcription system (which could not represent all the phonemes of Ubykh), and marked the foundation of Ubykh linguistics.

The Frenchman Georges Dumézil also visited Turkey in 1930 to record some Ubykh, and would eventually become the most celebrated Ubykh linguist of all time. He published a collection of Ubykh folktales in the late 1950s, and the language soon attracted the attention of linguists for its small number (two) of phonemic vowels. Hans Vogt, a Norwegian, produced a monumental dictionary that, in spite of its many errors (later corrected by Dumézil), is still one of the masterpieces and essential tools of Ubykh linguistics.

Later in the 1960s and into the early 1970s, Dumézil published a series of papers on Ubykh etymology in particular and Northwest Caucasian etymology in general. Dumézil's book Le Verbe Oubykh (1975), a comprehensive account of the verbal and nominal morphology of the language, is another cornerstone of Ubykh linguistics.

Since the 1980s, Ubykh linguistics has slowed drastically. No other major treatises have been published; however, one Dutch linguist is currently trying to compile a new Ubykh dictionary based on Vogt's 1963 book, and a similar project is also underway in Australia. The Ubykh themselves have shown interest in relearning their difficult language. A partial Ubykh to English dictionary (in Microsoft Word format) is available for downloading.

People who have published literature on Ubykh include:

  • Brian George Hewitt
  • Catherine Paris
  • Christine Leroy
  • Georg Bossong
  • Georges Dumézil
  • Hans Vogt
  • John Colarusso
  • Julius von Mészáros
  • Rieks Smeets
  • Tevfik Esenç
  • Wim Lucassen


  • Dumézil, Georges. Le Verbe Oubykh: Etudes Descriptives et Comparatives. Imprimerie Nationale, Paris (1975).
  • Vogt, Hans. Dictionnaire de la Langue Oubykh. Universitetsforlaget, Oslo (1963).

Sample of Ubykh

faaqhja t'qw'a.kwabjja kj'aghæ.n a.za.qhja.sjæ.na.n 
   a.mghja.n gjæ.kja.q'an
once two.men companion.OBL they.each-other.BEN.become.ERG-PL.PART 
   DEF.road.OBL on-a-surface.enter-PL.PAST-PL
Once, two men set out together on the road.
a.f.awtæ.næ mghjawæf a.qhwada.wt.æn a.kja.na.n a.za.n
   facr'.aala syæb.aala qhwada.q'a
3sABS.eat.FUT.ADV provision 3sABS.buy.FUT.PART DEF.enter.ERG-PL.PART DEF.one.ERG 
   cheese.and bread.and buy.PAST
They bought some provisions for the journey. The one bought cheese and bread;
aydæqhæ.n.gjæ syæb.aala ps(a).aala qhwada.n a.y.næ.w.q'a
other.ERG.and bread.and fish.and buy.PART 3sABS.hither.3sERG.enter(SG).PAST
the other bought bread and fish.
a.mghja.n gjæ.kja.na.gjæ
DEF.road.OBL on-a-surface.enter.PART.GER(?)
While they were on the road,
wa facr' dæ.qhwada.q'a.yt'.æ gha.kj'agh.ghaafa "syæghwalha psa yada sy.f.aa.n,
that cheese REL.buy.PAST.ASP.GER his.companion.DAT you-all fish much 2pERG.eat.ABS-PL.PRES
the one who had bought the cheese asked the other, "You people eat a lot of fish;"
saaba wana.n.gjaafæ psa sy.f.aa.n.æy?" q'a.n gh.aa.jgha.q'a
why that.OBL.as-much-as fish 2pERG.eat.ABS-PL.PRES.INTERROG say.PART to-him.IND.ask.PAST
"why do you eat fish as much as that?"
"psa wæfæba wæ.cr'a yada sj.awt,
fish 2sERG.eat.COND your.knowledge much become.FUT-II
"If you eat fish, you get smarter,"
wana.ghaafa sjæghwalha psa yada sj.fæ.n" q'a.q'a
that.for we fish much 1pERG.eat.PRES say.PAST
"so we eat a lot of fish," he answered.

