Work around crazy emails with non-base64 encoded attachments #283
Gmail IMAP push gets into a weird state if it encounters an email that is encoded in this manner: after hitting the raw binary data, the connection is reset following a long timeout.

I suspect that an earlier version of the Gmail server software, or some non-IMAP import path, allows or once allowed such messages in, and yet it is not possible to upload them back verbatim.

For my purposes it suffices to skip these rare messages, but in general the affected email parts should be re-encoded as base64, or the invalidly encoded attachments should be removed.

The illegally encoded binary emails are detected here by looking for the PNG magic string, which contains bytes that SHOULD NOT appear in any legitimate email data.

A screenshot of one of the affected emails is at https://imgur.com/a/i09Xa.
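For readers who want a concrete picture of the detection described above, here is a minimal sketch. It assumes the check is a plain substring search for the standard 8-byte PNG signature over the raw message bytes; the function name `contains_raw_png` is hypothetical and the actual patch may search for a different byte sequence.

```python
# Standard 8-byte PNG file signature. The first byte (0x89) is outside
# 7-bit ASCII, so it should not appear in a properly base64-encoded part.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def contains_raw_png(raw_message: bytes) -> bool:
    """Return True if the raw PNG signature occurs anywhere in the message.

    Hypothetical helper for illustration only; it flags the one case
    discussed in this PR (an unencoded PNG attachment) and nothing else.
    """
    return PNG_MAGIC in raw_message
```

A caller could skip any message for which this returns True before pushing it over IMAP.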
Thanks eliask, but this seems to be just a particular case? I see only the Unicode character "character tabulation with justification" (U+0089) and the start of a PNG file. This seems to solve your particular use case but not others?
Yeah, it's true. It only solves this particular case, but I am not sure
whether non-base64/non-ASCII/non-printable characters can appear in email
bodies in general.
If they are illegal in general, it would suffice to look for bytes with
values outside the legal range (maybe just >0x7f?).
But yeah, I don't know if that is the case. Email is complicated.
For this particular problem, it would also suffice to parse the MIME parts
and check whether data declared as base64 is ACTUALLY base64 or not. But
then again, this introduces some extra complexity. Worth it or not, I don't
know. It also leaves open whether Gmail dies on binary data in general or
only inside this context.
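The ">0x7f" idea floated above could be sketched as follows. This is only an illustration of that suggestion, not something the patch is known to do, and the name `has_non_ascii_bytes` is hypothetical. Note that messages using the `8bit` content transfer encoding may legitimately contain such bytes, so a check this broad could reject valid mail.

```python
def has_non_ascii_bytes(raw_message: bytes) -> bool:
    """Return True if any byte in the message is outside 7-bit ASCII.

    Assumption for illustration: every byte above 0x7F is treated as
    suspicious. This is stricter than MIME requires, since 8bit parts
    are allowed to carry such bytes.
    """
    return any(byte > 0x7F for byte in raw_message)
```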
On Mon, 2 Jan 2017, 12:32 Guillaume Aubert, <***@***.***> wrote:
> Thanks eliask but this seems to be just a particular case ? I see only a
> unicode character tabulation with justification and the start of a PNG file
> ? This seems to solve your particular use case but not others ?
> What do you think ?
I have encountered the same issue. The email source claims that it contains a base64-encoded PDF attachment, but when I look at it, I see raw binary. Somehow the Gmail browser client manages to open such a file for viewing or download it correctly, yet when I use the API I fail to read the attachment, because the base64 claim is false.
This commit tests all base64-encoded parts of the emails to push. This generalizes the previous ad hoc method, which only detected the PNG magic header anywhere in the message. If an email part/payload claims to be base64-encoded but contains characters outside the base64 alphabet, the email is never pushed to the server. The erroneous emails are NOT sent to quarantine, since they are assumed/observed to be rare and can be detected client side (and notably CANNOT be detected server side as such, because they break the protocol contract).
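A rough sketch of the kind of check this commit message describes is shown below. It is not the code from the patch: it assumes Python 3's standard `email` package, and the names `has_invalid_base64_part` and `_BASE64_TEXT` are hypothetical. It only tests that the characters belong to the base64 alphabet (plus padding), as the description says, and does not verify that the data decodes.

```python
import email
import re

# Base64 alphabet followed by at most two padding characters.
# Whitespace (line breaks) is stripped from the payload before matching.
_BASE64_TEXT = re.compile(r"[A-Za-z0-9+/]*={0,2}")


def has_invalid_base64_part(raw_message: bytes) -> bool:
    """Return True if any part declared as base64 holds non-base64 characters."""
    msg = email.message_from_bytes(raw_message)
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        cte = str(part.get("Content-Transfer-Encoding", "")).strip().lower()
        if cte != "base64":
            continue
        # decode=False keeps the payload as transmitted; raw non-ASCII
        # bytes come back as replacement characters, which fail the match.
        payload = part.get_payload(decode=False)
        compact = "".join(str(payload).split())
        if _BASE64_TEXT.fullmatch(compact) is None:
            return True
    return False
```

A message for which this returns True would be skipped on the client rather than pushed, matching the behavior described in the commit message.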
@BangisBanginis thanks for the heads up. It gave me the motivation to generalize this patch to test all base64-encoded data in emails. @gaubert now that the patch is "complete", could you merge it to master? NOTE: it is theoretically possible that invalid data in parts that do not claim to be base64 may cause similar issues, but on balance I believe only base64 is in a privileged position, and other data would probably not cause protocol issues.