Added Cantonese support, fixed some bugs. #11

jckt · 2018-07-27T13:52:34Z

Added Cantonese pronunciation support

Main features added:

Cantonese support, including a new dictionary with Cantonese pingjam (Cantonese pinyin).
Integration of Cantonese pingjam into theme/colour display options.
Ability to limit number of entries shown in popup (partly because the new dictionary has many more entries, Cantonese-specific words/phrases, etc.).

Fixed bugs, notably:

Popup would not jump up when cursor is near bottom of window.
Conversion to simplified chars was incorrectly implemented (for example see here), and there were also some other issues with the "source" and "target" strings used (for example, 冇 was converted to 沒, which is incorrect). The current implementation is less efficient but does not lead to incorrect/incomplete results.

…g. Ctrl), but that modifier key is not in use by LiuChan, LiuChan would still try to interpret the key. For example, suppose the user wants to copy some text (Ctrl+C) and does not have Ctrl configured as a modifier key. Before this fix, LiuChan would interpret the pressed C as its command to copy the dictionary text, when the user really expects the behaviour of Ctrl+C to be unchanged.

Paperfeed · 2018-07-31T09:34:54Z

Which dictionary are you using for Cantonese? Is it CC-Canto from http://cantonese.org/?

edit:
Thanks for the contribution by the way :)

jckt · 2018-08-01T01:16:11Z

The raw data comes from CC-Canto and the CC-CEDICT Cantonese readings (both from cantonese.org); these were processed into a single file.

You're welcome!

gkovacs · 2018-09-16T12:41:23Z

This is great! I tested it and it resolves an issue that has been constantly annoying me, where words like 捨棄 fail to be looked up. Is there anything blocking this from being merged?

fix incorrect zhuyin - mo corresponds to ㄇㄛ (\u3107\u311b) not ㄇㄨㄛ (\u3107\u3128\u311b)

gkovacs · 2018-10-05T17:38:42Z

I noticed that with this branch jyutping seems to be unavailable for 律 and all words containing it, e.g. 法律，律师，旋律，音律，因果律，定律，菲律宾. I'm not sure why.

gkovacs · 2018-10-05T22:20:35Z

Found the reason for the above error: it looks like the script that generates cedict_combined.u8 might have some bugs, as it doesn't seem to include jyutping everywhere. See the entry below (the jyutping should be between the { }):

法律 法律 [fa3 lu:4] { } /law/CL:條|条[tiao2], 套[tao4], 個|个[ge4]/
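A minimal sketch of how such entries could be found in bulk. It assumes every line follows the layout of the sample above (traditional, simplified, `[pinyin]`, `{jyutping}`, `/definitions/`); the names `ENTRY_RE`, `parse_entry` and `missing_jyutping` are made up for illustration and are not part of LiuChan or the generator scripts.

```python
import re

# Assumed line layout, inferred from the sample entry above:
#   TRADITIONAL SIMPLIFIED [pinyin] {jyutping} /definition/.../
ENTRY_RE = re.compile(
    r"^(?P<trad>\S+)\s+(?P<simp>\S+)\s+"
    r"\[(?P<pinyin>[^\]]*)\]\s*"
    r"\{(?P<jyutping>[^}]*)\}\s*"
    r"/(?P<defs>.*)/\s*$"
)


def parse_entry(line):
    """Return the fields of one dictionary line as a dict, or None if it does not match."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    entry["jyutping"] = entry["jyutping"].strip()
    entry["defs"] = entry["defs"].split("/")
    return entry


def missing_jyutping(lines):
    """Yield the parsed entries whose { } field is empty."""
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and not entry["jyutping"]:
            yield entry
```

Running `missing_jyutping` over the lines of cedict_combined.u8 would list every headword that falls back to an empty { } field.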

gkovacs · 2018-10-05T22:36:42Z

Oh, this seems to impact every word containing a character whose pinyin contains ü (written v or u:), like 女，绿，吕，驴. Presumably it is an issue with the script that generates cedict_combined.u8 (which unfortunately doesn't seem to be included in the repository).
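One way to make the two spellings comparable is to rewrite them to a single form before matching entries. The sketch below assumes numbered pinyin in which the letter v only ever stands for ü; `normalize_umlaut` and `same_reading` are hypothetical helper names, not functions from the generator scripts.

```python
def normalize_umlaut(pinyin: str) -> str:
    """Rewrite the "v" spelling of ü to the "u:" spelling."""
    # Assumption: "v" never occurs in numbered pinyin except as a stand-in for ü.
    return pinyin.replace("v", "u:").replace("V", "U:")


def same_reading(a: str, b: str) -> bool:
    """Compare two numbered-pinyin strings, ignoring case and the spelling of ü."""
    return normalize_umlaut(a).lower() == normalize_umlaut(b).lower()
```

With this, `same_reading("fa3 lv4", "fa3 lu:4")` is true, while plain u and ü stay distinct.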

jckt · 2018-10-06T11:29:44Z

I wrote a big message just now about how in general I've tried to avoid autocompleting jyutpings on a per-character basis (it leads to many errors; even the Pleco dictionary on iPhone has them, and it uses a better version of the CC-Canto sources AFAIK). But you're right, in this case it's my fault: there is a bug in the generator scripts. In fact, the entry is double-entered; somewhere else in the file:
法律法律 [fa3 lv4] {faat3 leot6}
So there are now two ways of expressing ü in the dictionary (I forget if this is a problem, I'll check again soon when I have the time). In this case I guess one could either condense the two entries (easy in this case since the entry above is deformed: it has no / / field for a (blank) definition, so the regex just misses it completely, which is why it doesn't even show up as a definition-free entry), or just leave the two entries but auto-clean the pinyins and the / / definition field. I'll try to fix it as soon as I have the time.
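A rough sketch of the first option (condensing the two entries), under several assumptions: the two headwords are separated by whitespace, the / / field may be absent, and entries count as duplicates when headwords and pinyin agree after the ü spelling is unified. `LENIENT_RE` and `merge_duplicates` are illustrative names only; this is not the code in generators.zip.

```python
import re

# Lenient pattern: the trailing /definitions/ field is optional, so an entry
# that has no / / field at all is still matched instead of being skipped.
LENIENT_RE = re.compile(
    r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s*\{([^}]*)\}\s*(?:/(.*)/)?\s*$"
)


def merge_duplicates(lines):
    """Merge entries sharing headwords and pinyin; return the rebuilt lines."""
    merged = {}  # (trad, simp, lowercased pinyin) -> collected fields
    for line in lines:
        m = LENIENT_RE.match(line.strip())
        if m is None:
            continue  # a real script would probably log unparsable lines
        trad, simp, pinyin, jyutping, defs = m.groups()
        pinyin = pinyin.replace("v", "u:")  # unify the two spellings of ü
        key = (trad, simp, pinyin.lower())
        slot = merged.setdefault(key, {"pinyin": pinyin, "jyutping": "", "defs": []})
        if jyutping.strip() and not slot["jyutping"]:
            slot["jyutping"] = jyutping.strip()
        for d in (defs or "").split("/"):
            if d and d not in slot["defs"]:
                slot["defs"].append(d)
    return [
        f"{t} {s} [{v['pinyin']}] {{{v['jyutping']}}} /{'/'.join(v['defs'])}/"
        for (t, s, _), v in merged.items()
    ]
```

The first non-empty jyutping wins and definitions are de-duplicated in order of appearance; whether that is the right policy for real conflicts between sources is a separate question.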

For now, I've attached the dictionary generator scripts. I didn't include them in the branch since I thought I would quickly clean them up and include some autocomplete system that also gave correct results (but that's actually a much harder problem than I thought it was).

Thanks again for pointing this out.
generators.zip

orientalperil · 2020-04-26T08:06:15Z

@Paperfeed Any chance this can get merged and deployed to the Chrome Web Store? I'm interested in being able to use Cantonese and can help push this along if more changes are needed

jckt added 5 commits July 27, 2018 15:18

Cantonese support

8ae1d09

Cantonese support

313ccfc

Clean up (accidental piped file).

7bb7177

Fixed popup location when popup is near bottom of screen

7f597a8

gkovacs added 2 commits September 16, 2018 04:59

fix incorrect zhuyin - mo corresponds to ㄇㄛ (\u3107\u311b) not ㄇㄨㄛ (\u3107\u3128\u311b). fixes Paperfeed#12

49d4327

Merge branch 'master' into cantonese

8b1eef2

gkovacs and others added 2 commits September 16, 2018 13:20

fix zhuyin for po (should be ㄆㄛ is currently ㄆㄨㄛ). fixes Paperfeed#14

9baa67d

Merge pull request #1 from gkovacs/cantonese

478d305
