Wednesday, June 12, 2013

Hanyu Da Zidian 漢語大字典(汉语大字典) Online

The Hanyu Da Zidian (Chinese: 漢語大字典/汉语大字典; pinyin: Hànyǔ Dà Zìdiǎn; literally "Great Compendium of Chinese Characters") is one of the best available reference works on Chinese characters. A group of more than 400 editors and lexicographers began compilation in 1979, and it was published in eight volumes from 1986 to 1989. A separate volume of essays (Li and Zhao 1990) documents the lexicographical complexities for this full-scale Chinese dictionary. Besides the weighty 5,790-page first edition, there are 3-volume (1995) and pocket (1999) editions. A second edition (pictured at right) was published in 2006, and has a list of radicals printed on the dust jacket of each volume for quicker character look up.

 The above snippet is from the Wikipedia entry for the Hanyu Da Zidian.

Hanyu Da Zidian

It's only a character dictionary. But it does have more individual unique characters than the Hanyu Da Cidian.

Hanyu Da Cidian

I think they also came out with an edition in traditional characters as well.

Not sure about that. But you might want to use the online edition in traditional.

online Hanyu Da Zidian

It only has slightly more than 3,000 of the most common characters, but then, beggars can't be choosers.

And there are minor errors because of font issues. But nothing major.

For instance, for the character 幸, they've got:


Whereas a scanned copy obtained off the Internet has:


I figure this is a font issue: they couldn't type out the character, so they used a placeholder instead of the actual character.

It's a pretty good literary Chinese dictionary.






Who would have known that 甫 meant “I, me” according to the 爾雅(尔雅)? Look in any literary Chinese dictionary. And I've looked in several.

Oh, and you could also use the Firefox Add to Search Bar function if you've got the add-on installed. This dictionary and the Guoyu Cidian, put out by the Ministry of Education of the Republic of China (Taiwan), make pretty good literary dictionaries.

Kobo.

Dictionary of Chinese Character Variants 教育部異體字字典

The Dictionary of Chinese Character Variants (教育部異體字字典) put out by the Republic of China (Taiwan)'s Ministry of Education is an invaluable resource, but if you do use it, use the beta version, because the older edition has a lot of errors, probably because of font issues.

Dictionary of Chinese Character Variants 教育部異體字字典 (old edition)

Dictionary of Chinese Character Variants 教育部異體字字典 (new trial edition)

For instance, here is the explanation behind the variant character 将 for the more "traditional" 將.


As you can see they've got the wrong character in the explanation behind their research. They've got the character 婔 instead of 将.




There are a lot of these errors in the current edition of the dictionary. I compiled a long list of examples but lost them all in a catastrophic hard disk failure.

So, if you're doing research on character variants, be aware and use the trial version, though it probably has errors as well. I wonder how many errors are being introduced because of issues with technology.

 I just realized that if you don't have the proper fonts installed 将 will look like either 


The variant that is now the standard in Japan. Or it'll look like this.



This means that the new trial version is going to have a lot of errors because of fonts.

If I recall correctly from Ken Lunde's book on CJKV information processing, published by O'Reilly, when they came up with the encoding scheme they consulted with the various nations and regions that use, or at one time used, Chinese characters in their languages, and agreed on the codepoints for a core character set for writing. The mainland would use the same codepoints but with "simplified" forms, while the other regions used "traditional" forms according to their chosen "standard". It was only later that they decided to include every character variant under the sun. So they screwed up with the original set, where a single codepoint was shared by the various variants of the same character.

Difficult to explain. Anyway, with the Dictionary of Character Variants going for character encoding for the variants instead of graphics, this is going to introduce a whole new variety of error to their dictionary.
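
A quick way to see the problem is to look at the codepoints themselves. Here is a minimal Python sketch (the helper name codepoint_label is made up for illustration) that prints the codepoint of each character. 将 has a single codepoint no matter which regional glyph your font draws for it, so the encoded text alone can't tell you which shape the reader will see.

```python
import unicodedata


def codepoint_label(ch: str) -> str:
    """Return the code point of a single character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


# The two characters discussed above. Which glyph is actually drawn for
# each one depends on the installed font, not on the code point.
for ch in ("将", "將"):
    print(ch, codepoint_label(ch), unicodedata.name(ch, "<no name>"))
```

If two people see different shapes for the same codepoint, the difference is coming from their fonts, which is exactly the kind of error a graphic image of the character would avoid.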
 

Add to Search Bar 2.0

Here is a useful tip for those using the Firefox Internet browser.

It's the Add to Search Bar 2.0 add-on.

If a site that you regularly use has a search function, with this add-on you can add that search to the search bar on your browser.

For instance, let's say you want to add the search feature from Lin Yutang's Chinese-English Dictionary of Modern Usage - 林語堂《當代漢英詞典》網絡版.

After you've installed the add-on from Mozilla, go to the web page and right click on the search bar.



  


You'll have an option to "Add to Search Bar..."

 

Once the search function has been added to the search bar, all you have to do is highlight any character or character combination to look up in Lin Yutang. It'll pop open a tab with the definition page.



Cool, huh?

It'll work for most sites that have a search feature but not all.
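
For the curious, a search added to the search bar amounts to, roughly, a URL template with a slot for whatever you highlighted. The sketch below assumes the OpenSearch-style `{searchTerms}` placeholder and a made-up site, dictionary.example. I haven't checked how the add-on stores things internally, and the real dictionaries each have their own query URLs.

```python
from urllib.parse import quote

# Hypothetical template; a real site has its own search URL.
TEMPLATE = "https://dictionary.example/search?q={searchTerms}"


def build_search_url(template: str, terms: str) -> str:
    """Percent-encode the terms as UTF-8 and substitute them into the template."""
    return template.replace("{searchTerms}", quote(terms, safe=""))


# Chinese characters have to be percent-encoded before they go into the URL.
print(build_search_url(TEMPLATE, "漢語"))
```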

Edit:

Here are some sites with search functions that you might like to add.

http://www.zdic.net/

Hanyu Da Zidian

Guoyu Cidian

Lin Yutang's Chinese-English Dictionary of Modern Usage

Unicode's Unihan index

The Quest For Kobo

Space...the final frontier...these are the voyages of the StarDict, enterprise edition (ok, highly unlikely since it's an open source program). Its ongoing mission...to boldly go...where no Chinese learner has gone before.

I know it's been quite a while since this blog was last updated, but I had fallen into a wormhole and have just got back.

Actually, I forgot my login password and Google wouldn't let me back in. They said they'd send me the password by e-mail, but I forgot that password as well. I had written it down on a piece of paper but didn't know where I'd laid it.

Well, now, I'm back and rarin' to get back to blogging.

So much to write.   :)

Monday, September 6, 2010

Tuesday, August 3, 2010

Microsoft Mines The Web To Build Up Her Online Chinese-English Dictionary

An interesting August 3, 2010 Wall Street Journal article titled "Microsoft Mines Web to Hone Language Tool" describes how Microsoft is mining the web for data to build up her Engkoo.com Chinese web dictionary.

Eventually it'll include machine-generated dictation of sample sentences and even machine-generated videos of sentences being spoken, so that language learners can see the movement of the lips when the words are spoken.

The focus so far is for Chinese learners of English but might eventually go the other way as well.

Plans are also in the works for other languages such as Japanese and English.

http://online.wsj.com/article/SB10001424052748703545604575406771145298614.html

I wonder why it's Engkoo and not Yingku?

Wednesday, July 28, 2010

Unihan & Mojikyo Both Down

I like looking up Chinese character variants but both Mojikyo and Unicode's Unihan Database lookup features are down.

http://www.mojikyo.org/

This is what Mojikyo says:

We are now reconstructing our "mojikyo.org" pages.
The work will be completed in a few year.
Therefore, the download service is discontinued at a while*1.
Please use the actual expenses distribution service of CD-R*2that we are doing.

*1 We expect that it might extend for a considerable long term.
If you cannot wait, it might be good for you to visit the following sites, but those information is not the latest.
Information Technology standards Commission of Japan's IPSJ-TS 0002:2004 "Character Shapes Identification".
*2 However, this service is done only to those who demand in Japan at present.


"The work will be completed in a few year"?

What does that mean?

Will it be completed within a year? Or in a few years?

It's been down for quite a while already.

It says if you can't wait then visit the itscj site but the information isn't the latest.

This really pisses Kobo off. I feel like that stand-up comic Lewis Black who's always ranting about something or other in his act. :)

The Unicode Unihan database's look-up feature is down for maintenance and has been down for at least 2 days.

http://www.unicode.org/charts/unihanrsindex.html

I hope they get everything up and running soon.

Update: The Unihan look-up feature is back online.