Kanji

This article is about a form of writing; for the Australian shrub, see Kanji bush.

Kanji (漢字, literally "Han characters") is the name of Chinese characters in the Japanese language. Kanji are one of the three main forms of Japanese writing, the other two being hiragana and katakana, collectively known as kana.

This article focuses on the Japanese use of these characters; see Chinese character for a general discussion of Chinese characters, which are also used in several other languages.

History

The characters for Kanji, lit. "Han characters".

There is some disagreement about how Chinese characters came to Japan, but it is generally accepted that Buddhist monks brought Chinese texts back to Japan in about the 5th century. These texts were in the Chinese language and would have been read as such at first. Over time, however, a system known as kanbun (漢文) emerged; it essentially used Chinese text with diacritical marks to allow Japanese speakers to read it in accordance with the rules of Japanese grammar.

The Japanese language itself had no written form at the time. A writing system called man'yōgana (used in the ancient poetry anthology Man'yōshū) evolved that used a limited set of Chinese characters for their sound, rather than for their meaning.

Man'yōgana written in curvilinear style became hiragana, a writing system that was accessible to women (who were denied higher education). Major works of Heian era literature by women were written in hiragana. Katakana emerged via a parallel path: monastery students simplified man'yōgana to a single constituent element. Thus the two other writing systems, hiragana and katakana, referred to collectively as kana, are ultimately derived from kanji.

In modern Japanese, kanji are used to write parts of the language such as nouns, adjective stems and verb stems, while hiragana are used to write inflected verb and adjective endings (okurigana), particles, and words whose kanji are too difficult to read or remember. Katakana are used for representing onomatopoeia and non-Chinese loanwords. The usage of katakana to write loanwords is a very recent phenomenon dating to after World War II. Originally loanwords were written using kanji, either used for their meaning (煙草 or 莨 tabako; "tobacco") or to spell the word phonetically (天婦羅 or 天麩羅 tempura). For example, many Japanese words of Portuguese origin, borrowed from the 16th century onwards, have kanji forms.

Types of kanji: categorized by history

Kokuji

While some kanji and Chinese hanzi are mutually readable, many more are not. In addition to characters that have different meanings in Japanese, and characters that have identical meanings but are written differently, there are also characters peculiar to Japan known as kokuji (国字; literally "national characters"). Kokuji are also known as wasei kanji (和製漢字; lit. "Chinese characters made in Japan"). There are hundreds of kokuji (see the sci.lang.japan AFAQ list), and although some are rarely used, many others have become important additions to the written Japanese language. These include:

峠 tōge (mountain pass)
榊 sakaki (sakaki tree, genus Cleyera)
畑 hatake (field of crops)
辻 tsuji (crossroads, street)
働 dō, hatara(ku) (work)

Kokkun

In addition to kokuji, there are kanji that have been given meanings in Japanese different from their original Chinese meanings. These kanji are not considered kokuji but are instead called kokkun (国訓) and include characters such as:

沖 oki (offing, offshore; Ch. chōng rinse)
森 mori (forest; Ch. sēn gloomy, majestic, luxuriant growth)
椿 tsubaki (Camellia japonica; Ch. chūn Ailanthus)

Old characters and new characters

The same kanji character can sometimes be written in two different ways, 旧字体 (kyū-jitai; lit. "old character") (舊字體 in kyū-jitai) and 新字体 (shin-jitai; "new character"). The following are some examples of kyū-jitai followed by the corresponding shin-jitai:

國国 kuni (country)
號号 gō (number)
變変 hen, ka(waru) (change)

Kyū-jitai were used before the end of World War II and are mostly, if not completely, the same as traditional Chinese characters. After the war the government introduced the simplified shin-jitai. Some of the new characters are similar to the simplified characters used in the People's Republic of China. As in the Chinese simplification process, some shin-jitai were originally abbreviated forms (略字 ryakuji) used in handwriting; unlike the "proper" unsimplified characters (正字 seiji), these were acceptable only in colloquial contexts. This page [1] shows examples of these handwritten abbreviations, identical to their modern shin-jitai forms, from the postwar era. There are also handwritten simplifications today that are significantly simpler than their standard forms (which were either left untouched or only slightly simplified in the postwar reforms); examples can be seen here [2]. Despite their wide usage and popularity, they, like their postwar counterparts, are not considered socially acceptable and are used only in handwriting.

Some Chinese characters are only used phonetically in Japanese (当て字 ateji), and many Chinese characters are not used in Japanese at all. Theoretically, however, any Chinese character can also be a Japanese character—the Daikanwa Jiten, one of the largest dictionaries of kanji ever compiled, has about 50,000 entries, even though most of the entries have never been used in Japanese.

Readings

A kanji character may have several possible pronunciations, or "readings", depending on its context, intended meaning, use in compounds, and location in the sentence. Some common kanji have ten or more possible readings. These readings are categorized as either Chinese-derived (on'yomi or on) or native (kun'yomi or kun).

On'yomi (Chinese reading)

The on'yomi (音読み), the Chinese reading, is a Japanese approximation of the Chinese pronunciation of the character at the time it was introduced. Some kanji were introduced several times, from different parts of China at different times, and so have multiple on'yomi and often multiple meanings. Kanji invented in Japan typically have no on'yomi. For example, the kanji 込 is Japanese in origin and thus lacks any on'yomi.

Generally, on'yomi are classified into four types:

Go-on (呉音; literally Wu sound) readings, from the pronunciation of the Wu region (in the vicinity of modern Shanghai), during the 5th and 6th centuries.
Kan-on (漢音; literally Han sound) readings, from the pronunciation during the Tang Dynasty in the 7th to 9th centuries, primarily from the standard speech of the capital, Chang'an.
Tō-on (唐音; literally Tang sound) readings, from the pronunciations of later dynasties, such as the Song and Ming; this type covers all readings adopted from the Heian era to the Edo period.
Kan'yō-on (慣用音) readings, which are mistaken readings of the kanji which have become accepted into the language.

Examples

Kanji	Meaning	Go-on	Kan-on	Tō-on	Kan'yō-on
明	light	myō	mei	min	*
行	go	gyō	kō	an	*
極	extremely	goku	kyoku	*	*
珠	pearl	*	shu	*	ju, zu
度	level	do	taku	to	*

Kan-on readings are the most common. Tō-on readings occur in some words such as isu "chair" or futon. Go-on readings are especially common in Buddhist terminology such as gokuraku 極楽 "paradise".

Due to trade and navigation patterns, a great volume of Chinese vocabulary was introduced to Japan by natives of southern China, thus many common pronunciations more closely mirror those of Southern Chinese languages ("dialects") than Northern pronunciations. Chinese languages have changed over time and pronunciations used at the time of introduction of vocabulary from China to Japan may no longer be used in a recognizable form by contemporary Chinese.

On'yomi are usually single-syllable readings, since each character expresses a single Chinese syllable. However, tonality aside, most Chinese syllables (especially in Middle Chinese, in which final stop consonants were more prevalent than in most modern dialects) did not fit the largely-CV (consonant-vowel) phonotactics of classical Japanese. Thus most on'yomi are composed of two moras (syllables or beats), the second of which is either a lengthening of the vowel in the first mora (this being i in the case of e and u in the case of o, due to linguistic drift in the centuries since), or one of the syllables ku, ki, tsu, chi, or syllabic n, chosen for their approximation to the final consonants of Middle Chinese. In fact, palatalized consonants before vowels other than i, as well as syllabic n, were probably added to Japanese to better simulate Chinese; none of these features occur in words of native Japanese origin.
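The two-mora pattern described above can be made concrete with a small sketch. The helper below is hypothetical (its name, its categories, and the assumption of Hepburn romanization with macrons are illustrative choices, not an established tool); it reports which kind of second mora a romanized on'yomi ends in. It also treats ū as a long vowel, which goes slightly beyond the two cases named in the text.

```python
import unicodedata

# Vowels in Hepburn romanization, including macron (long) vowels.
VOWELS = set("aeiouāēīōū")
# Endings that approximate the final stop consonants of Middle Chinese.
STOP_ENDINGS = ("tsu", "chi", "ku", "ki")


def second_mora(reading: str) -> str:
    """Classify the second mora of a romanized on'yomi (illustrative only)."""
    r = unicodedata.normalize("NFC", reading.strip().lower())
    if not r:
        raise ValueError("empty reading")
    if r.endswith("n"):
        return "syllabic n"
    for ending in STOP_ENDINGS:
        stem = r[: -len(ending)]
        # The ending is a second mora only if a first mora precedes it.
        if r.endswith(ending) and any(ch in VOWELS for ch in stem):
            return ending
    if r[-1] in "ōū" or r.endswith("ei"):
        return "long vowel"
    if sum(ch in VOWELS for ch in r) == 1:
        return "single mora"
    return "other"


if __name__ == "__main__":
    # Readings taken from the examples table above, plus sun (寸).
    for reading in ["myō", "mei", "min", "goku", "kyoku", "shu", "sun"]:
        print(reading, "->", second_mora(reading))
```

Readings that fall outside the pattern described here, such as tai, are reported as "other" rather than forced into one of the categories.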

On'yomi primarily occur in multi-kanji compound words (熟語 jukugo), many of which are the result of the adoption (along with the kanji themselves) of Chinese words for concepts that either didn't exist in Japanese or could not be articulated as elegantly using native words. This borrowing process is often compared to the English borrowings from Latin and Norman French, since Chinese-borrowed terms are often more specialized, or considered to sound more erudite or formal, than their native counterparts. The major exception to this rule is surnames, in which the native kun'yomi reading is usually used (see below).

Kun'yomi (Japanese reading)

The kun'yomi (訓読み), Japanese reading, or somewhat misleadingly native reading, is a reading based on the pronunciation of a native Japanese word, or yamatokotoba, that closely approximated the meaning of the Chinese character when it was introduced. Again, there can be multiple kun readings for the same kanji, and some kanji have no kun'yomi at all.

For instance, the kanji for east, 東, has the on reading tō. However, Japanese already had two words for east, higashi and azuma. Thus the kanji character 東 had the latter pronunciations added as kun'yomi. However, the kanji 寸, denoting a Chinese unit of measurement (slightly over an inch), had no native Japanese equivalent; thus it only has an on'yomi, sun.

Kun'yomi are characterized by the strict (C)V syllable structure of yamatokotoba. Most noun or adjective kun'yomi are two to three syllables long, while verb kun'yomi are more often one or two syllables in length (not counting trailing hiragana called okurigana, although those are usually considered part of the reading).

In a number of cases, multiple kanji were assigned to cover a single word. Typically when this occurs, the different kanji have different meanings. For instance, the word なおす, naosu, when written 治す, means "to heal an illness or sickness". When written 直す it means "to fix or correct something" (e.g. a bicycle or a badly written Wikipedia article). Sometimes the differences are very clear, other times they are quite subtle. Sometimes there are differences of opinion in different reference works -- one dictionary may say the kanji are equivalent, while another dictionary may draw distinctions of use between them. Because of this confusion, Japanese people have trouble knowing which kanji to use in some cases. One workaround is simply to write the word in hiragana, a method frequently employed with more complex cases such as もと moto, which has five different kanji, 元, 基, 本, 下, 素, three of which have only very subtle differences.

When to use which reading

Words for similar concepts, such as "east" (東), "north" (北) and "northeast" (東北), can have completely different pronunciations: the kun readings higashi and kita are used for the first two, while the on reading tōhoku is used for the third.

Two basic guidelines help determine the pronunciation of a particular kanji in a given context. First, and most simply, kanji occurring in compounds are usually read using on'yomi. These sorts of words are sometimes called jukugo (熟語). For example, 情報 jōhō "information", 学校 gakkō "school", and 新幹線 shinkansen "bullet train" all follow this pattern.

Secondly, kanji occurring in isolation -- that is, written adjacent only to kana, not to other kanji -- are typically read using their kun'yomi. Together with their okurigana, if any, they generally function either as a noun or as an inflected adjective or verb: e.g. 月 tsuki "moon", 情け nasake "sympathy", 赤い akai "red", 新しい atarashii "new", 見る miru "(to) see". Kanji compounds that also have okurigana, such as 空揚げ (also written 唐揚げ) karaage "fried food" and 折り紙 origami "artistic paper folding", also fall into this category. However, many compounds of this kind can alternatively be written with the okurigana omitted (e.g. 空揚 or 折紙).

There are numerous exceptions to both rules. 手紙 tegami "letter", 日傘 higasa "parasol", and the famous 神風 kamikaze "divine wind" all use kun'yomi despite being simple kanji compounds. Fortunately, most exceptions to the second rule are simple nouns: 愛 ai "love", 禅 Zen, 点 ten "mark, dot" -- most of these cases involve kanji that have no kun'yomi, so there can be no confusion.
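The two guidelines can be sketched as a rough heuristic. The function names below are hypothetical, and the check covers only the main CJK Unified Ideographs block and the hiragana and katakana blocks; as the exceptions above show, the guess is a first approximation and not a reliable rule.

```python
def is_kanji(ch: str) -> bool:
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def is_kana(ch: str) -> bool:
    """True for hiragana (U+3040-U+309F) or katakana (U+30A0-U+30FF)."""
    return "\u3040" <= ch <= "\u30ff"


def guess_reading_type(word: str) -> str:
    """Guess 'on' or 'kun' from the shape of the written word alone."""
    kanji_count = sum(is_kanji(ch) for ch in word)
    has_kana = any(is_kana(ch) for ch in word)
    if kanji_count == 0:
        return "none"
    # Guideline 2: a lone kanji, or kanji written with okurigana, suggests kun'yomi.
    if has_kana or kanji_count == 1:
        return "kun"
    # Guideline 1: a pure kanji compound suggests on'yomi.
    return "on"


if __name__ == "__main__":
    for word in ["学校", "新幹線", "月", "見る", "折り紙", "手紙", "愛"]:
        print(word, guess_reading_type(word))
```

For 手紙 the heuristic answers "on" and for 愛 it answers "kun", both of which are wrong; these are exactly the kinds of exceptions a dictionary lookup is needed to resolve.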

The situation is further complicated by the fact that many kanji have more than one on'yomi: witness 先生 sensei "teacher" versus 一生 isshō "one's whole life".

There are many kanji compounds that use a mixture of on'yomi and kun'yomi, known as jūbako (重箱) or yutō (湯桶) words. The words jūbako and yutō themselves are examples: the first character of jūbako is read using on'yomi, the second kun'yomi, while it is the other way around with yutō. Other examples include 金色 kin'iro "golden" (on-kun) and 合気道 aikidō "the martial art Aikido" (kun-on-on).

There are also several words that can be read in multiple ways, much like the English words "live" or "read" -- in some cases the words have different meanings depending on how they are read. One example is 上手, which can be read in three different ways -- jōzu (skilled), uwate (upper part), or kamite (upper part). In addition, 上手い has the reading umai (skilled).

Some famous place names, including those of Tokyo (東京 Tōkyō) and Japan itself (日本 Nihon or sometimes Nippon) are read with on'yomi; however, the majority of Japanese place names are read with kun'yomi (e.g. 大阪 Ōsaka, 青森 Aomori, 箱根 Hakone). Family names are also usually read with kun'yomi (e.g., 山田 Yamada, 田中 Tanaka, 鈴木 Suzuki). Personal names, although they are not typically considered jūbako/yutō, often contain mixtures of kun'yomi, on'yomi, and nanori, and are generally only readable with some experience (e.g., 大助 Daisuke [on-kun], 夏美 Natsumi [kun-on]).

Pronunciation assistance

Because of the ambiguities involved, kanji sometimes have their pronunciation for the given context spelled out in ruby characters known as furigana (small kana written above or to the right of the character) or kumimoji (small kana written in-line after the character). This is especially true in texts for children or foreign learners and manga (comics). It is also used in newspapers for rare or unusual readings and for characters not included in the officially recognized set of essential kanji (see below).

Orthographic reform and lists of kanji

In 1946, following World War II, the Japanese government instituted a series of orthographic reforms. Some characters were given simplified glyphs, called 新字体 (shinjitai). The number of characters in circulation was reduced, and formal lists of characters to be learned during each grade of school were established. Many variant forms of characters and obscure alternatives for common characters were officially discouraged. This was done with the goal of facilitating learning for children and simplifying kanji use in literature and periodicals. These are simply guidelines, so many characters outside these standards are still widely known and commonly used.

Kyōiku kanji

Main article: Kyōiku kanji

The Kyōiku kanji 教育漢字 are 1,006 characters that Japanese children learn in elementary school. The number was 881 until 1981. The grade-level breakdown of the education kanji is known as the Gakunen-betsu kanji haitōhyō (学年別漢字配当表), or the gakushū kanji.

Jōyō kanji

Main article: Jōyō kanji

The Jōyō kanji 常用漢字 are 1,945 characters consisting of all the kyōiku kanji, plus an additional 939 kanji taught in junior high and high school. In publishing, characters outside this category are often given furigana. The Jōyō kanji were introduced in 1981. They replaced an older list of 1,850 characters known as the General-use kanji (tōyō kanji 当用漢字) introduced in 1946.

Jinmeiyō kanji

Main article: Jinmeiyō kanji

The Jinmeiyō kanji 人名用漢字 are 2,928 characters consisting of the Jōyō kanji, plus an additional 983 kanji found in people's names. Over the years, the Minister of Justice has on several occasions added to this list. Sometimes the phrase Jinmeiyō kanji refers to all 2,928, and sometimes it refers only to the 983 that are used only for names.

Japanese Industrial Standards for kanji

The Japanese Industrial Standards for kanji and kana define character code-points for each kanji and kana, as well as other forms of writing such as Arabic numerals, for use in information processing. They have had numerous revisions. The current standards are:

JIS X 0208:1997, the most recent version of the main standard. It has 6,355 kanji.
JIS X 0212:1990, a supplementary standard containing a further 5,801 kanji. This standard is rarely used, mainly because the common Shift JIS encoding system could not use it, and is effectively obsolete.
JIS X 0213:2000, a further revision which extended the JIS X 0208 set with 3,625 additional kanji, of which 2,741 were in JIS X 0212. The standard is in part designed to be compatible with Shift JIS encoding.
JIS X 0221:1995, the Japanese version of the ISO 10646/Unicode standard.
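As a rough illustration of how the same characters receive different byte sequences under different encodings, the sketch below encodes the word 漢字 with several codecs from Python's standard library. The function name encode_all is hypothetical; shift_jis, euc_jp and iso2022_jp are encodings built on the JIS X 0208 character set, while utf-8 encodes Unicode code points.

```python
TEXT = "漢字"

# Codec names as shipped with CPython's standard library.
CODECS = ["shift_jis", "euc_jp", "iso2022_jp", "utf-8"]


def encode_all(text: str) -> dict:
    """Return a mapping from codec name to the encoded bytes of the text."""
    return {name: text.encode(name) for name in CODECS}


if __name__ == "__main__":
    for name, data in encode_all(TEXT).items():
        # Show each encoding as space-separated hexadecimal bytes.
        print(f"{name:12} {data.hex(' ')}")
```

Decoding each byte sequence with the codec that produced it recovers the original string; decoding with a different codec is the usual source of garbled Japanese text.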

Gaiji

Gaiji (外字), literally meaning "external characters", are kanji that are not represented in existing Japanese encoding systems. These include variant forms of common kanji that need to be represented alongside the more conventional glyph in reference works, and can include non-kanji symbols as well.

Gaiji can be either user-defined characters or system-specific characters. Both are a problem for information interchange, as the code-point used to represent an external character will not be consistent from one computer or operating system to another.

Gaiji were nominally prohibited in JIS X 0208:1997, and JIS X 0213:2000 used the range of code-points previously allocated to gaiji, making them completely unusable. Nevertheless, they persist today with NTT DoCoMo's "i-mode" service, where they are used for pictorial characters.

Unicode allows for optional encoding of gaiji in private use areas.
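A minimal sketch of how text might be scanned for characters in the private use areas, which is where gaiji would appear if encoded this way. The function names are hypothetical; the check relies on the Unicode general category "Co" (private use) as reported by Python's unicodedata module.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """True if the character has Unicode general category Co (private use)."""
    return unicodedata.category(ch) == "Co"


def find_gaiji_candidates(text: str) -> list:
    """Return (index, code point label) pairs for private-use characters."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_private_use(ch)
    ]


if __name__ == "__main__":
    # U+E000 is the first code point of the Basic Multilingual Plane private use area.
    sample = "漢\ue000字"
    print(find_gaiji_candidates(sample))
```

Such a scan can only flag that a private-use code point is present; what glyph it stands for depends entirely on the font or system that defined it, which is the interchange problem described above.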

Total number of kanji characters

The number of possible characters is disputed. The "Daikanwa Jiten" contains about 50,000 characters, and this was thought to be comprehensive, but more recent mainland Chinese dictionaries contain 80,000 or more characters, many consisting of obscure variants. Most of these are not in common use in either Japan or China.

Types of kanji: by category

Main article: Chinese character classification

The Han dynasty scholar Xu Shen, in the Shuowen jiezi ca. 100 CE, classified Chinese characters into six categories (Japanese: 六書 rikusho). The classification is open to interpretation, and some characters belong to more than one category. The first four categories refer to the structure of characters; the last two refer to functions of the characters.

(For a table of all the 教育漢字 broken down by category see this page, from which the above description has been extracted.)

象形文字 (shōkeimoji)

These characters are sketches of the object they represent. For example, 目 is an eye, 木 is a tree, etc. The current forms of the characters are very different from the original, and it is now hard to see the origin in many of these characters. It is somewhat easier to see in seal script. This kind of character is often called a "pictograph" in English (象形 is also the Japanese word for Egyptian hieroglyphs).

指事文字 (shijimoji)

These are called "logograms", "simple ideographs" and sometimes just "symbols" in English. They are usually simple and represent an abstract concept such as a direction: 上: up/above, 下: down/below, etc.

会意文字 (kaiimoji)

Often called "compound ideographs", or just "ideographs". These are usually a combination of pictographs that combine to present an overall meaning. An example is 峠 (mountain pass) made from 山 (mountain), 上 (up) and 下 (down). Another is 休 (rest) from 人 (person) and 木 (tree).

形声文字 (keiseimoji)

These are called "semasio-phonetic" or "phonetic-ideographic" characters in English. They are by far the largest category, making up about 85% of characters. Typically they are made up of two components, one of which indicates the meaning or semantic context, and the other the pronunciation. (The pronunciation relates to the original Chinese, and may now be only distantly detectable in the modern Japanese ON reading of the kanji. The same is true of the semantic context, which may have changed over the centuries or in the transition from Chinese to Japanese.)

As examples of this, consider the kanji with the 言 shape: 語, 記, 訳, 説, etc. All are related to word/language/meaning. Similarly kanji with the 雨 (rain) shape (雲, 電, 雷, 雪, 霜, etc.) are almost invariably related to weather. Kanji with the 寺 (temple) shape on the right (詩, 持, 時, 侍, etc.) usually have an ON reading of SHI or JI. Sometimes one can guess the meaning and/or reading simply from the components. However, exceptions do exist -- for example, neither 需 nor 霊 has anything to do with weather (at least in its modern usage), and 待 has an ON reading of TAI.
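The idea of guessing a reading from the phonetic component can be sketched with a tiny lookup table. The data below covers only the 寺 examples from the text, with ON readings in lowercase romanization; the table layout and the function names are illustrative assumptions, not a real dictionary format.

```python
# Phonetic component of each kanji (only the examples from the text).
PHONETIC_COMPONENT = {"詩": "寺", "持": "寺", "時": "寺", "侍": "寺", "待": "寺"}

# A common ON reading for each kanji, in romanization.
ON_READING = {"詩": "shi", "持": "ji", "時": "ji", "侍": "ji", "待": "tai"}

# Readings that the component usually suggests.
EXPECTED_READINGS = {"寺": {"shi", "ji"}}


def follows_pattern(kanji: str) -> bool:
    """True if the kanji's ON reading matches what its component suggests."""
    component = PHONETIC_COMPONENT[kanji]
    return ON_READING[kanji] in EXPECTED_READINGS[component]


def exceptions() -> list:
    """List the kanji whose reading is not predicted by the component."""
    return [k for k in PHONETIC_COMPONENT if not follows_pattern(k)]


if __name__ == "__main__":
    print(exceptions())
```

On this small sample the component predicts four of the five readings, with 待 as the lone exception, which matches the "usually" in the description above.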

転注文字 (tenchūmoji)

This group is sometimes called "derivative characters", and is rather vaguely defined. It refers to kanji whose meaning or application has become extended. For example, 楽 is used for 'music' and 'comfort, ease', with different pronunciations in Chinese reflected in Sino-Japanese gaku 'music' and raku 'pleasure'.

仮借文字 (kashamoji)

These are called "phonetic loan characters." Historically, they were the predecessors of the "phonetic-ideographic" characters. For example, 来 in ancient Chinese was originally a pictograph for 'wheat'. Its syllable was homophonous with the verb meaning 'to come' and the character is used for that verb as a result, without any embellishing "meaning" element attached.

Related symbols

The ideographic iteration mark (々) is used to indicate that the preceding kanji is to be repeated, functioning similarly to a ditto mark in English. It is pronounced as though the kanji were written twice in a row, for example 色々 (iroiro "various") and 時々 (tokidoki "sometimes"). This mark also appears in personal and place names, as in the surname Sasaki (佐々木). This symbol is a simplified version of the kanji 仝.
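The behaviour of the iteration mark can be sketched as a simple text expansion. The function name is hypothetical, and the sketch handles only the written form: it does not model pronunciation changes such as the voicing in tokidoki.

```python
ITERATION_MARK = "\u3005"  # the ideographic iteration mark 々


def expand_iteration_marks(text: str) -> str:
    """Replace each 々 with a copy of the character that precedes it."""
    out = []
    for ch in text:
        if ch == ITERATION_MARK and out:
            # Repeat the previous character in place of the mark.
            out.append(out[-1])
        else:
            # Ordinary characters (and a leading mark) are kept unchanged.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for word in ["色々", "時々", "佐々木"]:
        print(word, "->", expand_iteration_marks(word))
```

An expansion like this can be useful before a dictionary lookup, since entries may be keyed on either the abbreviated or the fully written form.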

Another frequently used symbol is ヶ (a small katakana "ke"), pronounced "ka" when used to indicate quantity (such as 六ヶ月, rokkagetsu "six months") or "ga" in place names like Kasumigaseki (霞ヶ関). This symbol is a simplified version of the kanji 箇.

Kanji Kentei

Main article: Kanji Kentei

The Kanji kentei (日本漢字能力検定試験 Nihon kanji nōryoku kentei shiken; "Test of Japanese Kanji Aptitude"), administered by the Japan Kanji Aptitude Testing Foundation, tests the ability to read and write kanji. The highest level of the Kanji kentei tests about 6,000 kanji.
