Fix processing of attachments encoded as Latin 1
When a body has been encoded as Latin 1 / ISO-8859-1 we should ensure the
strings are converted back to UTF-8 before generating their hexdigest.

This fixes applying masks and censor rules to some attachments.

Fixes #7915
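
To illustrate why the conversion matters, here is a minimal sketch (not code from this commit) showing that the same text yields different MD5 digests when its bytes are ISO-8859-1 rather than UTF-8, and that the digests agree once the Latin 1 bytes are transcoded. The method names `md5_of` and `latin1_to_utf8` are hypothetical and exist only for this example.

```ruby
require 'digest'

# Hypothetical helper for this example only.
def md5_of(str)
  Digest::MD5.hexdigest(str)
end

# Hypothetical helper: reinterpret raw bytes as ISO-8859-1, then transcode.
def latin1_to_utf8(bytes)
  bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

utf8_body   = "na\u00EFve"   # "naïve" in UTF-8: ï is the two bytes C3 AF
latin1_body = "na\xEFve".b   # same text in ISO-8859-1: ï is the single byte EF

# The byte sequences differ, so the raw digests differ.
puts md5_of(utf8_body) == md5_of(latin1_body)

# After transcoding, the bytes (and therefore the digests) are identical.
puts md5_of(utf8_body) == md5_of(latin1_to_utf8(latin1_body))
```

A digest computed from the stored UTF-8 copy of an attachment can therefore only match the one computed from the raw mail part if both are hashed in the same encoding.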
gbp committed Sep 25, 2023
1 parent 905fc22 commit 6a14613
Showing 2 changed files with 7 additions and 1 deletion.
2 changes: 1 addition & 1 deletion lib/mail_handler/backends/mail_backend.rb
@@ -404,7 +404,7 @@ def attempt_to_find_original_attachment_attributes(mail, body:, nested: false)
         def calculate_hexdigest(body)
           # ensure bodies have the same line endings
           Digest::MD5.hexdigest(Mail::Utilities.binary_unsafe_to_lf(
-            body.rstrip
+            convert_string_to_utf8(body.rstrip).string
           ))
         end

Check warning on line 407 in lib/mail_handler/backends/mail_backend.rb (GitHub Actions / build, [rubocop] reported by reviewdog):
lib/mail_handler/backends/mail_backend.rb:407:13: C: [Correctable] Layout/FirstArgumentIndentation: Indent the first argument one step more than Mail::Utilities.binary_unsafe_to_lf(.
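
The change above can be approximated outside the application with the sketch below. It is not the project's implementation: `to_utf8_assuming_latin1` is a hypothetical stand-in for `convert_string_to_utf8` (whose internals are not shown in this diff), and a plain `gsub` stands in for `Mail::Utilities.binary_unsafe_to_lf`. The sketch converts before stripping so that `rstrip` and `gsub` always run on valid UTF-8, which is a choice made for this example.

```ruby
require 'digest'

# Hypothetical stand-in for convert_string_to_utf8: keep the bytes if they
# are already valid UTF-8, otherwise assume they are ISO-8859-1.
def to_utf8_assuming_latin1(str)
  utf8 = str.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  str.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# Normalise the encoding, trailing whitespace and line endings, then hash.
def normalised_hexdigest(body)
  text = to_utf8_assuming_latin1(body).rstrip
  Digest::MD5.hexdigest(text.gsub("\r\n", "\n"))
end

stored = "na\u00EFve\r\nbar\r\n"  # UTF-8 text with CRLF line endings
raw    = "na\xEFve\nbar".b        # binary ISO-8859-1 bytes with LF line endings

puts normalised_hexdigest(stored) == normalised_hexdigest(raw)
```

Falling back to ISO-8859-1 only when the bytes are not valid UTF-8 is an assumption of this sketch; a real helper may detect the source encoding differently.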

6 changes: 6 additions & 0 deletions spec/lib/mail_handler/backends/mail_backend_spec.rb
@@ -461,6 +461,7 @@
     add_file filename: 'lf.txt', content: "bar\nbar"
     add_file filename: 'crlf-non-ascii.txt', content: "Aberdâr\r\n"
     add_file filename: 'lf-non-ascii.txt', content: "Aberdâr\n"
+    add_file filename: 'latin1.txt', content: "naïve"
     add_file filename: 'mail.eml', content: mail_attachment
     add_file filename: 'uuencoded.eml', content: mail_with_uuencoded
   end
@@ -492,6 +493,11 @@
     it { is_expected.to include(body: "Aberdâr\n") }
   end
 
+  context 'when binary body encoded as Latin 1 / ISO-8859-1' do
+    let(:body) { "na\xEFve".b }
+    it { is_expected.to include(body: "naïve") }
+  end
+
   context 'when attached email headers are different' do
     let(:body) do
       <<~EML