encoding. of course it is encoding... #4
Comments
This is the file that doesn't work (I had to zip it, because GitHub doesn't accept EPUB uploads).
I extracted a few parts, and the HTML files within are encoded correctly; that is, there is a charset tag in the
So I guess it could read that tag, or default to UTF-8.
read_html might need the encoding argument (defaults to "").
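To make the "read that tag, or default to UTF-8" idea concrete, here is a minimal base-R sketch. The helper names `detect_charset()` and `decode_html()` are hypothetical and not part of the package under discussion; the regular expression only covers the common `charset=...` declaration forms, so treat it as an illustration rather than a robust sniffer.

```r
# Hypothetical helper: look for a charset declaration near the top of an
# HTML document (given as a raw vector) and fall back to a default.
detect_charset <- function(raw_html, default = "UTF-8") {
  # Only inspect the first 2 KB; declarations normally sit in the head.
  bytes <- raw_html[seq_len(min(length(raw_html), 2048L))]
  # Keep plain ASCII bytes only, so the regex never sees invalid input.
  bytes <- bytes[bytes != as.raw(0x00) & bytes < as.raw(0x80)]
  head_txt <- rawToChar(bytes)

  pattern <- "charset\\s*=\\s*[\"']?([A-Za-z0-9._:-]+)"
  m <- regmatches(
    head_txt,
    regexec(pattern, head_txt, ignore.case = TRUE, perl = TRUE)
  )[[1]]

  # m[1] is the full match, m[2] the captured charset name.
  if (length(m) >= 2) toupper(m[2]) else default
}

# Hypothetical helper: decode the document to UTF-8 using the detected charset.
decode_html <- function(raw_html) {
  iconv(rawToChar(raw_html), from = detect_charset(raw_html), to = "UTF-8")
}

# A tiny Latin-1 page: the "é" in "café" is the single byte 0xE9.
latin1_page <- c(
  charToRaw("<html><head><meta charset=\"ISO-8859-1\"></head><body>caf"),
  as.raw(0xE9),
  charToRaw("</body></html>")
)

charset <- detect_charset(latin1_page)
decoded <- decode_html(latin1_page)
```

The detected value could equally be handed to the parser, for example `xml2::read_html(latin1_page, encoding = charset)`, instead of decoding by hand.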
Nope, that's not it (xml2::read_html(doc) would also always default to UTF-8).
So, the default was UTF-8 but I added a pass-through
In theory, it should have dealt with ^^ properly since it (honest!) passed it in all the way through and I even do a final But, if you do (this text is Latin1 btw):
it works. I'll keep this open since I'd like to provide robust support in the long run, but at least the
(Just saw your extended comments.) Aye, I even pass something (IMO) "weird" is happening either as a result of
It would be very nice if the text parsing would default to utf-8, because I have something that doesn't seem to be right. 1001 nights
should be