Yes, it really can take a year to learn how to look up a Chinese character


Kangxi Dictionary 1827

I’ve posted before about some of the difficulties non-Chinese speakers may face with the language. Recently I read a wonderful 1991 piece called Why Chinese Is So Damn Hard by none other than David Moser, a professor of Chinese studies. It’s very interesting and worth reading in full, although decent rebuttals do exist.

One item from the article that really highlights the difficulty is that it can take a long time even to learn how to look up a new character in the dictionary. So I thought I’d go through the process involved in looking up one character. Because I bet most people think: “You know who I should read about Chinese from? Some guy who’s never learned any but is randomly self-taught in looking up characters!”

Now, the article is from the pre-internet, pre-smartphone era, so it’s not such a big deal today. But even now, anyone looking at non-digital written Chinese (e.g. magazines, shop signs) will sometimes need to look something up, and the OCR app may fail. So, let’s suspend disbelief and imagine ourselves staring at an unfamiliar character:

(To avoid bias I downloaded a list of the 10000 most frequent characters [HTML|Excel] and used a random number to arrive at the 5579th one. Original character image is from here, others are modifications I’ve made.)

Most characters are made up of a radical component and a phonetic component. The radical component often has something to do with the meaning of the character (e.g. characters for sea creatures often contain the radical for “fish”) but might not. The phonetic component is usually a character with a similar pronunciation to the one in question. Or at least it was when the character was codified, which could have been millennia ago.

The most common combination is for the left side to have the radical and for the right to have the phonetic (although the radical can also be above, below, around or even to the right of the phonetic). Looks like we’re in luck, because the left-hand side looks like it could be one of the radicals. There are only 214, but I don’t have them memorised, not being an actual student of Chinese.

The next step is to try to find this radical by counting the number of strokes it has. So you have to know a little about the order in which characters are written. This one’s not rocket surgery: even if you’re not sure of the order, you can count five strokes.

Now that we know our radical has 5 strokes, we can find it among the 214, since they’re arranged by number of strokes. So we just look at all the ones under 5 and see if we can find ours on sight. The image below is an old-school list in a fancier script which doesn’t show the numbers, but the radical we want is number 115.
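For the programmers in the audience, here's a rough sketch of that narrowing-down step. It assumes the traditional 214-radical ordering, with radical n sitting at code point U+2F00 + n - 1 in Unicode's "Kangxi Radicals" block. The names `RADICAL_RANGES`, `radicals_with_strokes` and `radical_glyph` are made up for this example and don't come from any dictionary software.

```python
# Stroke count -> (first, last) radical number in the traditional
# 214-radical ordering. The radicals are sorted by stroke count, so each
# count owns one contiguous run of numbers.
RADICAL_RANGES = {
    1: (1, 6), 2: (7, 29), 3: (30, 60), 4: (61, 94), 5: (95, 117),
    6: (118, 146), 7: (147, 166), 8: (167, 175), 9: (176, 186),
    10: (187, 194), 11: (195, 200), 12: (201, 204), 13: (205, 208),
    14: (209, 210), 15: (211, 211), 16: (212, 213), 17: (214, 214),
}


def radicals_with_strokes(strokes):
    """Return the radical numbers that are written with this many strokes."""
    first, last = RADICAL_RANGES[strokes]
    return list(range(first, last + 1))


def radical_glyph(number):
    """Return the glyph for a radical number (1..214).

    Assumes the Unicode "Kangxi Radicals" block, which starts at U+2F00
    and lists the radicals in dictionary order.
    """
    if not 1 <= number <= 214:
        raise ValueError("radical numbers run from 1 to 214")
    return chr(0x2F00 + number - 1)


# The shortlist we would scan by eye for our 5-stroke radical.
shortlist = radicals_with_strokes(5)
print(" ".join(f"{n}:{radical_glyph(n)}" for n in shortlist))
```

So counting strokes cuts 214 candidates down to the couple of dozen numbered 95 to 117, and it's still up to your eyes to spot number 115 among them.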

Great! We now know which of the 214 sections in the dictionary we should look in: 115. If we couldn’t find the radical, it was probably because we identified the wrong part of the character and need to divide it another way. Or the radical looks a bit different when it’s by itself compared to when it’s inside a character. Or many, many other reasons. But we were lucky here.

The characters that belong to each radical are arranged by the number of extra strokes. So we need to count the number of strokes in the phonetic component, which we already decided was probably this one:

The top part is easy: it’s clearly 4 strokes.

The bottom part is trickier, so we’d need to know a bit about the stroke order, which I’ve marked. In particular, a top right corner is drawn in one motion (stroke 2). So we get 5 for the bottom.

The total number of strokes is then 4 + 5 = 9. Since it’s easy to miscount, we’ll use 9 as the first guess and fall back to 8 or 10 if that turns up nothing. To give an example, this is what I got when using the Mandarin Tools online character dictionary with the above settings (radical 115 + 9 strokes).
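The whole lookup boils down to a two-part key: radical number plus the number of extra strokes. Here's a minimal sketch of that, including the "try 8 or 10 if 9 fails" fallback. The index is a tiny hand-typed sample of a few characters filed under radical 115, not the Mandarin Tools data and not a real dictionary, and `SAMPLE_INDEX` and `candidates` are names invented for this example.

```python
from typing import Dict, List, Tuple

# (radical number, extra strokes) -> characters filed under that key.
# A tiny hand-typed sample for illustration only; a real dictionary
# has far more entries per key.
SAMPLE_INDEX: Dict[Tuple[int, int], List[str]] = {
    (115, 8): ["稚"],
    (115, 9): ["種", "稱"],
    (115, 10): ["稻", "稿"],
}


def candidates(radical, strokes, index=SAMPLE_INDEX, slack=1):
    """Return [(extra_strokes, characters), ...], best guess first.

    The exact stroke count is tried first, then counts that are off by
    one (up to `slack`), because miscounting strokes is easy.
    """
    order = [strokes]
    for delta in range(1, slack + 1):
        order += [strokes - delta, strokes + delta]

    found = []
    for n in order:
        chars = index.get((radical, n), [])
        if chars:
            found.append((n, chars))
    return found


# Top part (4 strokes) + bottom part (5 strokes) of the phonetic.
extra = 4 + 5
for n, chars in candidates(115, extra):
    print(n, " ".join(chars))
```

The first group printed is the exact match, and the later groups are what you'd wade through if you had miscounted by one.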

Again we’re in luck. It’s possible to get a shitload of characters to wade through, all with the same radical and same number of strokes. Here, we just have two so our culprit’s easy to find. We can now get the pronunciations and meanings — usually there are lots!

“Fuckenell!” you might say, “this is pretty hard!” The writing system’s actually quite hard for everyone, including native speakers. And of course finding out the meaning of the individual characters may be only a tenth of the work in getting the meaning of a sentence. But for that, read Moser.