Does this support font names that contain Chinese characters? #38
The .dat file in this project is a database file; it is not used directly as a Ghostscript config file.
Thank you very much for your reply.
When I change the shell's encoding to GB2312, it shows:
These warnings indicate that the PDF needs fonts whose names contain Chinese characters.
Sorry, we've never tested this usage, which embeds a real font into a PDF that has non-embedded fonts; the script was originally intended to set up Ghostscript properly for PostScript -> PDF conversion when the PostScript input file contains CJK font names. After a little testing, I found that it works when the PostScript Name written in the PDF input file contains only ASCII characters, but fails when the PostScript Name contains other characters, as in "方正黑体_GBK". This seems to be due to a limitation of Ghostscript. However, I also found that a fallback font named "Adobe-GB1" can be used in place of such a non-ASCII PostScript Name. The hint was there in your post:
Currently the file
When I add
the resulting PDF has FZSSK.TTF embedded. Of course, there is a limitation: all such fonts are converted to a single font, regardless of the original font family (Serif or Sans-Serif). Also, some non-CJK fonts are not rendered correctly, at least on my side, but this should be unrelated to this topic ...
@aminophen you should read the "CIDFontSubstitution" section of the
Thus, if you have "方正黑体_GBK" as a TrueType font, you can tell Ghostscript about it by making a hex entry for that name.
@HinTak Thanks for the information ;-) But I don't have such a TrueType font, so I can't test it. @ZhangTiny1703 You have 3 TrueType fonts, right? If so, could you test hex entries for those fonts?
@aminophen it is not about what fonts you have, but about which fonts the PDF requires (and does not have embedded). The above is one, I imagine.
@HinTak Hmm, then it might not be something we can support in this project. For example, FZHTK.TTF is already registered in cjkgs-founder.dat; in this case, adding an alias "方正黑体_GBK (in hex) => FZHTK--GBK1-0" in cidfmap.aliases should work. As I don't know which real font is embedded under which name in a PDF, I don't want to add such aliases. I mean, a simpler name such as "方正黑体" instead of "方正黑体_GBK" may appear in some PDFs ???
Or, should I simply add code to encode non-ASCII characters into hex, for those who want to add a user-defined database containing such characters?
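A minimal sketch of what such an encoding step could look like, written in Python rather than the project's Perl. The function name `hex_if_non_ascii` and the default `gbk` encoding are assumptions for illustration only; the right encoding depends on how the software that produced the PDF wrote the font name, and how the resulting hex digits are written in a cidfmap entry is not shown here.

```python
def hex_if_non_ascii(font_name: str, encoding: str = "gbk") -> str:
    """Return font_name unchanged if it is pure ASCII, otherwise its bytes
    in the given encoding as uppercase hex digits.

    Hypothetical helper: neither the name nor the default encoding comes
    from cjk-gs-integrate.pl.
    """
    if font_name.isascii():
        # ASCII names already work, so leave them alone.
        return font_name
    return font_name.encode(encoding).hex().upper()


# An ASCII name from the database is passed through untouched.
print(hex_if_non_ascii("FZHTK--GBK1-0"))

# A non-ASCII name is turned into hex digits of its GBK bytes.
print(hex_if_non_ascii("方正黑体_GBK"))
```

Leaving ASCII names untouched keeps existing database entries working as before; only names that Ghostscript cannot handle directly are converted.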
@aminophen yes, I got here because I was looking for an answer for a PDF with "黑体" as one of its non-embedded font names (it looks like it was made by the same piece of sh*t software). I knew an answer existed, as I used to work on Ghostscript, and even on that part of it... Anyway, it is as I wrote. The table is non-exclusive: you can have multiple font names mapped to the same font file (i.e. substitution), and also the same font name mapped to different font files (later entries override earlier ones, I think).
Here is a "cidfmap" file which would process the PDF posted above (% for comments):
Edit the paths as appropriate for yourself. (I have all of them symlinked in the current directory, and one of them has a Chinese name, "方正细等线简体.ttf", but it will be different for you.) This makes the Chinese content render as intended. There seems to be a bug in Ghostscript with the English fonts; I just filed it as https://bugs.ghostscript.com/show_bug.cgi?id=703716 . This is the converted output: Note that the Chinese content is correct, but the English parts are not.
BTW, it is exactly as Ken Sharp replied on Stack Overflow (to the same reporter, I think); he just has not given you an actual example of how. E.g. "BFACCCE55f474232333132" is "楷体_GB2312" in GB2312 encoding, written in hex; "5f474232333132" is "_GB2312", which is the same in UTF-8/ASCII encoding as in GB2312 encoding.
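The hex string quoted above can be checked with a few lines of Python. This is only an illustration of the byte-level relationship, using Python's built-in `gb2312` codec; it is not part of the project's script.

```python
# Hex string quoted in the comment above.
hex_name = "BFACCCE55f474232333132"

# Interpret the hex digits as raw bytes, then decode them as GB2312.
decoded = bytes.fromhex(hex_name).decode("gb2312")
print(decoded)  # 楷体_GB2312

# The trailing bytes are plain ASCII, so they are identical in
# GB2312 and UTF-8; only the two Chinese characters differ.
suffix = bytes.fromhex("5f474232333132").decode("ascii")
print(suffix)  # _GB2312
same_suffix = "_GB2312".encode("gb2312") == "_GB2312".encode("utf-8")

# Going the other way: from the font name back to hex digits.
re_encoded = "楷体_GB2312".encode("gb2312").hex()
```

The comparison is case-insensitive because hex digits may be written in either case, as the mixed-case string in the comment shows.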
I added some entries to cjkgs-founder.dat:
Then I used the command
perl cjk-gs-integrate.pl
to generate cidfmap.local. It shows:
Then I used the command
gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dPDFSTOPONERROR -dNOOUTERSAVE -dCompressFonts=true -dSubsetFonts=false -dEmbedAllFonts=true -sColorConversionStrategy=RGB -dCompatibilityLevel=1.6 -sOutputFile=output.pdf 1000027661706311.pdf
to convert the PDF. It gives an error:
I guess Ghostscript doesn't support Chinese, and the files are all ASCII text. But I read your blog:
some 漢字 とかたかな in config file ?