Docx mime type incorrectly guessed #117

Open
AlfonsoUceda opened this issue Jan 23, 2025 · 0 comments


AlfonsoUceda commented Jan 23, 2025

Hi!

I've been experiencing an error with Marcel: a docx file (a real one) gets detected as application/zip instead of application/vnd.openxmlformats-officedocument.wordprocessingml.document, whereas the file --mime-type command detects the right MIME type.

I've been researching why this happens, and it seems the matchers defined here https://github.com/rails/marcel/blob/main/lib/marcel/tables.rb#L2416 expect those strings to appear in a particular order in the file.
I've compared two docx files, one that Marcel detects correctly and one that it doesn't. In the undetected file the identifiers aren't in the expected order. E.g. the [Content_Types].xml entry is almost at the end of the file.

The following snippet finds all three identifiers, but it reads the whole file. I guess the gem only reads the first bytes for performance reasons, so it can't run an include? check against the entire content.

require 'pathname'

# Read the whole file in binary mode (PATH_DOCX is a placeholder).
content = Pathname.new('PATH_DOCX').binread
content.include?('[Content_Types].xml') # => true
content.include?('word/') # => true
content.include?('_rels/.rels') # => true
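
To see how far into the file each identifier sits, something like the following might help. It's only a rough sketch: marker_offsets is a hypothetical helper of mine, not part of Marcel, and the sample string is a synthetic stand-in for a docx where word/ comes early and [Content_Types].xml comes late.

```ruby
# Hypothetical helper (not part of Marcel): return the byte offset of the
# first occurrence of each marker, or nil when the marker is absent.
def marker_offsets(content, markers)
  bytes = content.b # work on raw bytes so offsets are byte positions
  markers.to_h { |marker| [marker, bytes.index(marker.b)] }
end

MARKERS = ['[Content_Types].xml', 'word/', '_rels/.rels'].freeze

# Synthetic stand-in for a docx: zip signature, some filler, then the
# identifiers in an "unexpected" order. A real file would be loaded with
# File.binread('PATH_DOCX') instead.
sample = "PK\x03\x04".b + ('x' * 26) + 'word/document.xml' +
         ('x' * 100) + '[Content_Types].xml' + '_rels/.rels'

offsets = marker_offsets(sample, MARKERS)
offsets.each { |marker, offset| puts "#{marker}: #{offset.inspect}" }
```

Running this against both the detected and the undetected docx should show whether the identifiers in the failing file fall outside whatever range the matchers scan.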

Cheers
