Why does GoNotoCurrent not render Korean glyphs whereas GoNotoCJKCore does? #39
Comments
Hi Phillip, thanks for the bug report. The reason is that GoNotoCurrent does not include the "Hangul Syllables" Unicode block (U+AC00 to U+D7AF) whereas GoNotoCJKCore does. This block contains 11,172 assigned codepoints and at least as many glyphs. However, GoNotoCurrent is currently at ~61000 glyphs in the font file, and the maximum is 64K (65,535; this limit is imposed by the spec). Hence there is not enough "glyph space" to include all Hangul syllables, and not much can be done. One option is to find a smaller subset (say ~2500 glyphs) of the 11K codepoints and include them in GoNotoCurrent, still honouring the 64K limit. Obviously this leaves out a large chunk of the Korean repertoire, so it is of limited practical utility. |
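For anyone following the numbers, here is a minimal sketch of the glyph-budget arithmetic described above. The ~61000 figure is the approximate count quoted in the comment, not a measured value, and the variable names are only illustrative.

```python
# Hangul Syllables block: U+AC00..U+D7AF; the assigned syllables end at U+D7A3.
BLOCK_START, BLOCK_END = 0xAC00, 0xD7AF
LAST_ASSIGNED = 0xD7A3

MAX_GLYPHS = 65535       # numGlyphs is a 16-bit field in the font format
CURRENT_GLYPHS = 61000   # approximate figure quoted in the thread (assumption)

block_size = BLOCK_END - BLOCK_START + 1      # code points in the block
syllables = LAST_ASSIGNED - BLOCK_START + 1   # precomposed syllables actually assigned
headroom = MAX_GLYPHS - CURRENT_GLYPHS        # glyph slots still free

print(f"block size: {block_size}, assigned syllables: {syllables}")
print(f"free glyph slots: {headroom}")
print(f"full block fits: {syllables <= headroom}")
print(f"~2500-glyph subset fits: {2500 <= headroom}")
```

With these inputs the full block does not fit into the remaining glyph slots, while a subset of roughly 2500 glyphs would, as far as the glyph count alone is concerned.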
Many precomposed syllables are not actually used in Korean. You could use KS X 1001’s list of 2,350 common Hangul syllables. |
@dscorbett I gave it a try on my local machine (KSX1001 subset), but now we hit the cmap format 4 table limit of 65535. Such subsetting causes fragmentation of "Hangul Syllables" block (U+AC00 to U+D7AF) -- the subset ttf's cmap 4 table is about 13000 length whereas GoNotoCurrent is already at 64706, so the total 77666 > 65535 |
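To illustrate why a sparse subset costs more 'cmap' space than the whole block, here is a rough sketch. It assumes every run of consecutive code points becomes one format 4 segment mapped through `idDelta` (so no `glyphIdArray`), which is a simplification of what a real font compiler emits. The `every_fifth` subset is a made-up stand-in and is not the KS X 1001 list.

```python
def count_runs(codepoints):
    """Count maximal runs of consecutive code points."""
    cps = sorted(set(codepoints))
    return sum(1 for i, cp in enumerate(cps) if i == 0 or cp != cps[i - 1] + 1)


def format4_size_estimate(codepoints):
    """Rough byte size of a cmap format 4 subtable for BMP code points.

    Layout: 14-byte header + 2-byte reservedPad + four uint16 arrays
    (endCode, startCode, idDelta, idRangeOffset) of segCount entries each.
    One extra segment is the mandatory final 0xFFFF terminator.
    """
    seg_count = count_runs(codepoints) + 1
    return 16 + 8 * seg_count


FORMAT4_MAX_LENGTH = 65535  # the subtable's length field is a uint16

full_block = range(0xAC00, 0xD7A4)       # all assigned Hangul syllables
every_fifth = range(0xAC00, 0xD7A4, 5)   # hypothetical fragmented subset

print("full block:", format4_size_estimate(full_block), "bytes")
print("fragmented subset:", format4_size_estimate(every_fifth), "bytes")
```

Under these assumptions the contiguous block needs a single segment, while the fragmented subset needs one segment per isolated code point, which is the effect described above.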
I forgot about 'cmap' fragmentation. I guess that idea won’t work. The syllables are exploded because the lookups that join them together are not applied when the language system is Korean. I’m not sure why. |
Thanks a lot for the explanations and taking a stab at it already! I wasn’t aware Korean relied so heavily on the precomposed syllables. If the glyph limit is reached then I suppose there is not so much that can be done. I think for my personal use case, having Korean in the font is more important than the Math, Music, and Symbol Fonts, though. I quickly tried to rebuild the GoNotoCurrent font without those four ( Out came a font file that seems to render my Korean sample texts fine. The command |
@xplip Yes, that is a good approach and that's all there is to it. Enjoy your new font! |
Hey there, sorry for bringing this topic up again, I originally thought I could just follow the steps proposed by @xplip and generate a GoNotoCurrent file with increased support for Korean Hangul syllables, but when trying to run the The stacktrace is as follows:
From what I can tell this exception gets thrown whilst trying to merge the base font files into the big single font file. Since I am unfortunately pretty new to this field I am quite clueless on what to do in order to fix this issue. The last logs before this exception happens are always different, so there's nothing that would help debugging it. The first issue I was thinking of was that maybe there might somehow be too many glyphs to fit into the font file. Confusingly enough this exception occurred even after removing more fonts from the I am running the Any help or hint on how to get this working would be greatly appreciated! Thanks so much for the awesome work :) |
For the record: I have managed to fix the issue I was facing. Basically, there were more glyphs than what the spec allows (64K). Thus, the error @xplip explanation is good, but, to make it clearer and easier, I would change the following line:
for:
That way all the Hangul syllables are added to the korean subset font and the glyph count limit is respected. I hope I am not skipping any important glyphs for Korean. All my tests were successful, so I don't think so. |
AFAIK, open source font projects, especially large fonts with many glyphs, usually ship their fonts as two files. Take the Hanazono fonts as an example: they are released as two files, HanaMinA.ttf, which contains the more commonly used CJK glyphs, and HanaMinB, which contains the less used glyphs. Most systems nowadays (Windows, *nix, Android) can be set to use them as a pair. Two files of up to 65535 glyphs each should be enough for daily use. |
@stephen1864 Thanks, that is a good idea to create two "A" and "B" fonts, one with Korean glyphs and one without them. I could work on it in the coming days or weeks. |
I also ran into the issue of the missing Korean glyphs. I think xplip's workaround is the one I need (I can easily skip the Math, Music, and Symbol fonts, but I need Korean), but currently I have no idea how to create the font correctly. On the other hand, the split into GoNotoCurrent A and B fonts may help, if the A font is similar to GoNotoCurrent with Korean. As I would like to embed the font in a PDF, I think using it as a pair may not work. I need to use one TTF font. |
@user6905 https://fonts.google.com/noto/fonts?noto.lang=ko_Kore&noto.continent=Asia&noto.script=Kore |
@stephen: That does not help in my case. |
@user6905 here's the font I've created back when I participated in this thread. Feel free to use it and test it in your specific scenario. I don't remember the details of what IS and what IS NOT included. But you can check by yourself. |
Thank you Miguel.
from GoNotoCurrent, I built my own font based on that. @satbyy: Would you consider including that font in your collection? I think it could be helpful for others too. @satbyy: BTW, is GoNoto... a correct name for the fonts? According to the OFL license, I thought you must not use reserved font names (RFNs). And Noto is a trademark of Google. |
@user6905 and all, can you please download the font from the CI pipeline? Now there are two variants:
If you are satisfied, I will close this issue and make a new release. |
Generally the script works well on Ubuntu and the created font includes the Korean glyphs; I can confirm that. Thanks a lot. I only wonder why GoNotoCurrent-Regular.ttf from your zip file has only 14.669.722 bytes. Mine has 15.485.612 bytes and 64623 glyphs. |
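When comparing two builds like this, the glyph count can be read directly from the font's 'maxp' table. Below is a minimal sketch using only the Python standard library; it assumes a plain single-font TTF/OTF file (not a TTC collection or WOFF), and the function names are illustrative rather than part of any project tooling.

```python
import struct


def num_glyphs(font_bytes):
    """Return numGlyphs from the 'maxp' table of a single-font sfnt file."""
    # Offset table: uint32 sfntVersion, uint16 numTables, then 3 more uint16 fields.
    num_tables = struct.unpack_from(">H", font_bytes, 4)[0]
    for i in range(num_tables):
        # Each 16-byte table record: tag, checksum, offset, length.
        tag, _checksum, offset, _length = struct.unpack_from(
            ">4sIII", font_bytes, 12 + 16 * i
        )
        if tag == b"maxp":
            # maxp starts with a 4-byte version, followed by uint16 numGlyphs.
            return struct.unpack_from(">H", font_bytes, offset + 4)[0]
    raise ValueError("no 'maxp' table found")


def num_glyphs_in_file(path):
    """Convenience wrapper that reads the font file from disk."""
    with open(path, "rb") as f:
        return num_glyphs(f.read())
```

Calling `num_glyphs_in_file` on each of the two GoNotoCurrent-Regular.ttf files would show whether the size difference comes with a different glyph count.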
Amazing, thank you @satbyy and @xplip. We are using this recipe, and specifically the Kurrent font, within the @globaleaks project together with the FPDF2 library. This makes it possible for us to print PDFs that render text from any international user! |
Thank you for providing this great library!
I am currently trying to render text in various languages with the pygame library and it seems that when I am using GoNotoCurrent, I can render Japanese and Chinese glyphs just fine, but Korean glyphs are only rendered as empty boxes. When I am using GoNotoCJKCore, Korean is rendered properly as well, so I am wondering what the main difference between the two is.
I can get around the issue by rendering my texts with the Pillow library and a libraqm layout engine which builds on harfbuzz, but this is horribly slow, so I'd prefer to keep using pygame and get it to work with GoNotoCurrent. Do you have an idea why rendering Korean might not work in my setup?