Understand Content Type, Mime type guessing and Magic

Florent Viard edited this page Mar 18, 2020 · 3 revisions

When a file is uploaded to an S3 server, the client can indicate its media type (MIME type) using the Content-Type header.
(More info about media types can be found here: MIME types definition (Mozilla docs).)

This type information is not strictly required, but it helps a web browser render the content correctly when the resource is accessed directly through its URL.


Users often report issues when the automatically selected media type is not the one they expect.
This is the case, for example, when a user expects "application/javascript" to be used, but the file is uploaded with the "text/plain" type.

The problem is that s3cmd is not directly responsible for the incorrect guesses.

By default, the guess_mime_type and mime-magic options are enabled.
This means that detection is performed using two different methods:

  • guess_mime_type: Will try to guess the right type to use based on the "file extension".
  • mime-magic: The content of the file itself will be inspected to determine its type.

For both of these operations, s3cmd will only use external libraries and simply ask them for their opinion on the type to associate with any given file.
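As a rough illustration of the extension-based method, the sketch below asks the Python "mimetypes" standard library for its opinion on a few file names. The helper name guess_by_extension and the sample file names are hypothetical and are not taken from s3cmd. The answers depend on the rules available on your machine, so no expected output is shown.

```python
import mimetypes


def guess_by_extension(filename):
    """Return the MIME type suggested by the file extension, or None."""
    # guess_type() returns a (type, encoding) tuple; only the type is kept.
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type


# Hypothetical file names, used for illustration only.
for name in ("index.html", "script.js", "notes.zzunknownzz"):
    print(name, "->", guess_by_extension(name))
```

Running this on the machine that performs the upload is a quick way to see what the extension-based lookup reports for a problematic file.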

So, when the guess is wrong, it means that the external libraries being used are wrong about the type of your file.

In such a case, there are three solutions:

  • Use the command-line options to force s3cmd to use a specific MIME type for all the uploaded files.
  • Report the issue upstream to the external libraries so that they can fix their detection rules.
  • Create or fix the correct rule for them, locally on your machine.

Configure a local "guess_mime_type" file extension rule

The "guess_mime_type" function uses the "mimetypes" module of the Python standard library. The rule database of this module is hardcoded but extensible.

On Windows, additional rules will come from the Windows registry.

On Linux (and OSX?), additional rules will be read from any of the following files that exist:

knownfiles = [
    "/etc/mime.types",
    "/etc/httpd/mime.types",                    # Mac OS X
    "/etc/httpd/conf/mime.types",               # Apache
    "/etc/apache/mime.types",                   # Apache 1
    "/etc/apache2/mime.types",                  # Apache 2
    "/usr/local/etc/httpd/conf/mime.types",
    "/usr/local/lib/netscape/mime.types",
    "/usr/local/etc/httpd/conf/mime.types",     # Apache 1.2
    "/usr/local/etc/mime.types",                # Apache 1.3
    ]
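
To find out which of these files actually exist on a given machine, a small check like the following can help. It is a minimal sketch: the function name existing_mime_files is hypothetical, and it relies on the "mimetypes" module exposing this list as mimetypes.knownfiles.

```python
import mimetypes
import os


def existing_mime_files():
    """Return the entries of mimetypes.knownfiles that exist on this machine."""
    return [path for path in mimetypes.knownfiles if os.path.isfile(path)]


# The result depends on the machine; it may be an empty list.
print(existing_mime_files())
```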

The format of a line is: MIME/TYPE [...TAB...] extensions.
For example:

application/x-yaml                                yml yaml
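
To check that a rule in this format is understood before adding it to one of the files above, it can be loaded into a separate "mimetypes" database. The sketch below is an illustration only: the helper name load_rules and the file names config.yml and config.yaml are hypothetical, and the rule is read from a string rather than from a real mime.types file.

```python
import io
import mimetypes

# One rule in the mime.types format: MIME type, whitespace, then extensions.
RULE = "application/x-yaml\t\t\t\tyml yaml\n"


def load_rules(text):
    """Return a MimeTypes database extended with the rules found in text."""
    db = mimetypes.MimeTypes()
    # readfp() parses the same line format as the mime.types files.
    db.readfp(io.StringIO(text))
    return db


db = load_rules(RULE)
print(db.guess_type("config.yml")[0])   # application/x-yaml
print(db.guess_type("config.yaml")[0])  # application/x-yaml
```

Extensions are written without the leading dot, and several extensions can share one line.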