Configuring Mountpoint for Amazon S3

In most scenarios, you can use Mountpoint by running the following command, where you should replace DOC-EXAMPLE-BUCKET with the name of your Amazon S3 bucket, and /path/to/mount with the directory you want to mount your bucket into:

mount-s3 DOC-EXAMPLE-BUCKET /path/to/mount

We've tried hard to make this simple command adopt good defaults for most scenarios. However, some scenarios may need additional configuration. This document shows how to configure these elements of Mountpoint: AWS credentials, S3 bucket configuration, file system configuration, and logging.

AWS credentials

Mountpoint uses the same credentials configuration options as the AWS CLI, and will automatically discover credentials from multiple sources. If you are able to run AWS CLI commands like aws s3 ls against your bucket, you should generally also be able to use Mountpoint against that bucket.

Note: Mountpoint does not currently support authenticating with IAM Identity Center (SSO or Legacy SSO). This issue is tracked in #433.

We recommend you use short-term AWS credentials whenever possible. Mountpoint supports several options for short-term AWS credentials:

If you need to use long-term AWS credentials, you can store them in the configuration and credentials files in ~/.aws, or specify them with environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY).

To manage multiple AWS credentials, you can use the --profile command-line argument or AWS_PROFILE environment variable to select a profile from the configuration and credentials files.

For public buckets that do not require AWS credentials, you can use the --no-sign-request command-line flag to disable AWS credentials.
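As a rough sketch of how these options look on the command line, the snippet below prints the mount command for three cases: the default credential chain, a named profile, and an unsigned mount of a public bucket. The `mount_cmd` helper and the profile name `my-profile` are hypothetical. The snippet only prints commands so you can review them before running them yourself.

```shell
# Hypothetical helper: prints (does not run) a mount-s3 command line.
#   $1 = credential mode: default | profile | public
#   $2 = profile name (only used when $1 is "profile")
mount_cmd() {
    case "$1" in
        profile) echo "mount-s3 --profile $2 DOC-EXAMPLE-BUCKET /path/to/mount" ;;
        public)  echo "mount-s3 --no-sign-request DOC-EXAMPLE-BUCKET /path/to/mount" ;;
        *)       echo "mount-s3 DOC-EXAMPLE-BUCKET /path/to/mount" ;;
    esac
}

mount_cmd default               # credentials discovered automatically
mount_cmd profile my-profile    # credentials from a named profile
mount_cmd public                # no credentials, public bucket
```

Setting `AWS_PROFILE=my-profile` in the environment before running the default command should select the same profile as the `--profile` form.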

IAM permissions

Amazon S3 offers both resource-based access policies attached to your S3 buckets (bucket policies) and user policies attached to IAM users (user policies). You can use either or both of these access policy options to control access to your S3 objects with Mountpoint.

The IAM credentials you use with Mountpoint must have permission for the s3:ListBucket action for the S3 bucket you mount. To be able to read files with Mountpoint, you also need permission for the s3:GetObject action for the objects you read.

By default, Mountpoint allows writing new files to your S3 bucket, and does not allow deleting existing files. You can disable writing new files, or enable deleting existing files, with file system configuration flags. Writing files requires permission for the s3:PutObject and s3:AbortMultipartUpload actions. Deleting existing files requires permission for the s3:DeleteObject action.

If you only mount a prefix of your S3 bucket rather than the entire bucket, you need these IAM permissions only for the prefix you mount. You can scope down your IAM permissions to a prefix using the Resource element of the policy statement for most of these permissions, but for s3:ListBucket you must use the s3:prefix condition key instead.

Here is an example least-privilege policy document to add to an IAM user or role that allows full access to your S3 bucket for Mountpoint. Replace DOC-EXAMPLE-BUCKET with the name of your bucket. Alternatively, you can use the AmazonS3FullAccess managed policy, but the managed policy grants more permissions than needed for Mountpoint.

{
   "Version": "2012-10-17",
   "Statement": [
        {
            "Sid": "MountpointFullBucketAccess",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::DOC-EXAMPLE-BUCKET"
            ]
        },
        {
            "Sid": "MountpointFullObjectAccess",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*"
            ]
        }
   ]
}
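If you mount only a prefix, the policy above can be narrowed as described earlier. The following is a minimal sketch, assuming a hypothetical prefix `2023/`. It writes a prefix-scoped policy to a file, using the s3:prefix condition key for s3:ListBucket and the Resource element for the object actions. The `Sid` values and the output file name are illustrative. Review the result against your own requirements before attaching it.

```shell
# Write a prefix-scoped policy to a file (file name is illustrative).
POLICY_FILE="${TMPDIR:-/tmp}/mountpoint-prefix-policy.json"

cat > "$POLICY_FILE" <<'EOF'
{
   "Version": "2012-10-17",
   "Statement": [
        {
            "Sid": "MountpointPrefixList",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": ["arn:aws:s3:::DOC-EXAMPLE-BUCKET"],
            "Condition": {
                "StringLike": { "s3:prefix": ["2023/*"] }
            }
        },
        {
            "Sid": "MountpointPrefixObjectAccess",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:DeleteObject"
            ],
            "Resource": ["arn:aws:s3:::DOC-EXAMPLE-BUCKET/2023/*"]
        }
   ]
}
EOF

echo "Wrote $POLICY_FILE"
```

This policy is intended to pair with mounting the same prefix through the --prefix command-line argument.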

Mountpoint also respects access control lists (ACLs) applied to objects in your S3 bucket, but does not allow you to automatically attach ACLs to objects created with Mountpoint. A majority of modern use cases in Amazon S3 no longer require the use of ACLs. We recommend that you keep ACLs disabled for your S3 bucket, and instead use bucket policies to control access to your objects.

S3 bucket configuration

By default, Mountpoint will automatically mount your S3 bucket given only the bucket name, and will automatically select the appropriate S3 network endpoint. However, you can override this automation if you need finer control over how Mountpoint connects to your bucket.

Mounting a bucket prefix

You can use Mountpoint to access only a prefix of your S3 bucket rather than the entire bucket. This allows you to isolate multiple users, applications, or workloads from each other within a single bucket. Use the --prefix command-line argument to specify a prefix of your S3 bucket, which must end with the / character. With this argument, only objects in your bucket that begin with the given prefix will be visible with Mountpoint.

When constructing the directory structure for your mount, Mountpoint removes the prefix you specify with --prefix from object keys. For example, if your bucket has a key 2023/Files/data.json, and you specify the --prefix 2023/ command-line argument, the mounted directory will contain a single sub-directory Files with a file data.json inside it. If you specify the --prefix 2023/Files/ command-line argument, the mounted directory will contain only a file data.json at its root.
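The mapping from object keys to paths can be sketched in a few lines of shell. This only illustrates the rule described above and is not Mountpoint's own code. The `key_to_path` helper name is hypothetical.

```shell
# Strip the mount prefix from an object key to get the path inside the mount.
# Returns non-zero when the key is outside the prefix (not visible in the mount).
key_to_path() {
    prefix="$1"
    key="$2"
    case "$key" in
        "$prefix"*) printf '%s\n' "${key#"$prefix"}" ;;
        *) return 1 ;;
    esac
}

key_to_path "2023/" "2023/Files/data.json"        # prints Files/data.json
key_to_path "2023/Files/" "2023/Files/data.json"  # prints data.json
```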

Region detection

Amazon S3 buckets are associated with a single AWS Region. Mountpoint attempts to automatically detect the region for your S3 bucket at startup time and directs all S3 requests to that region. However, in some scenarios this region detection may fail, preventing your bucket from being mounted and displaying Access Denied or No Such Bucket errors. You can override Mountpoint's automatic bucket region detection with the --region command-line argument or AWS_REGION environment variable.

Mountpoint uses instance metadata (IMDS) to help detect the region for an S3 bucket. If you want to disable IMDS, set the AWS_EC2_METADATA_DISABLED environment variable to true.
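For example, the two ways of overriding region detection might look like the following. The Region `us-west-2` is an assumed example value. The snippet prints the commands rather than running them.

```shell
# Example Region (assumption); use the Region your bucket is in.
REGION="us-west-2"

# Option 1: pass the Region as a command-line argument.
CMD_WITH_FLAG="mount-s3 --region $REGION DOC-EXAMPLE-BUCKET /path/to/mount"

# Option 2: set the Region through the environment, and disable IMDS lookups.
CMD_WITH_ENV="AWS_REGION=$REGION AWS_EC2_METADATA_DISABLED=true mount-s3 DOC-EXAMPLE-BUCKET /path/to/mount"

echo "$CMD_WITH_FLAG"
echo "$CMD_WITH_ENV"
```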

Access points

Amazon S3 access points are network endpoints attached to buckets that you can use to perform S3 object operations. Each access point has distinct permissions and network controls that S3 applies for any request that is made through that access point.

You can use an access point with Mountpoint by specifying either the access point ARN or the access point bucket-style alias as the bucket argument to mount-s3. For example, if your access point has the following ARN and alias:

  • ARN: arn:aws:s3:region:account-id:accesspoint/my-access-point
  • Access point alias: my-access-point-hrzrlukc5m36ft7okagglf3gmwluquse1b-s3alias

then you can mount your S3 bucket to the /path/to/mount directory with either of the following commands:

  • mount-s3 arn:aws:s3:region:account-id:accesspoint/my-access-point /path/to/mount
  • mount-s3 my-access-point-hrzrlukc5m36ft7okagglf3gmwluquse1b-s3alias /path/to/mount

Multi-Region Access Points

Amazon S3 Multi-Region Access Points provide a global endpoint that applications can use to fulfill requests to S3 buckets that are located in multiple AWS Regions. You can use a Multi-Region Access Point with Mountpoint by specifying its ARN as the bucket argument to mount-s3. For example, if your Multi-Region Access Point ARN is arn:aws:s3::123456789012:accesspoint/mfzwi23gnjvgw.mrap, then you can mount your S3 bucket to the /path/to/mount directory with the command mount-s3 arn:aws:s3::123456789012:accesspoint/mfzwi23gnjvgw.mrap /path/to/mount.

S3 Object Lambda

Amazon S3 Object Lambda allows you to add your own code to Amazon S3 GET, LIST, and HEAD requests to modify and process data as it is returned to an application. S3 Object Lambda uses AWS Lambda functions to automatically process the output of standard S3 GET, LIST, or HEAD requests.

You can use S3 Object Lambda with Mountpoint by mounting an Object Lambda Access Point. Mounting an Object Lambda Access Point works the same way as mounting an access point, by specifying either the ARN or the bucket-style alias of the Object Lambda Access Point as the bucket argument to mount-s3. To use S3 Object Lambda with Mountpoint (or any other client), your IAM identity needs additional permissions.

To use S3 Object Lambda with Mountpoint, your Lambda function must satisfy three additional properties that may not be required by other applications:

  1. Mountpoint uses the Range HTTP header for all GetObject requests to S3. To use S3 Object Lambda with Mountpoint, your Lambda function must be configured to enable the Range header, and must map the provided Range header to the transformed object. See Working with Range and partNumber headers in the Amazon S3 User Guide for more details.
  2. When looking up files and directories in your S3 bucket, Mountpoint sends concurrent HeadObject and ListObjectsV2 requests. The HeadObject request is expected to fail with a 404 Not Found HTTP status code when a file does not exist. For example, if your bucket contains a key Files/data.json and you run a command like ls Files on your mount, Mountpoint sends a HeadObject request for the key Files to discover if a file exists with that name, and will receive a 404 Not Found response from S3. Your Lambda function must correctly generate a 404 Not Found response for these requests.
  3. When working with ListObjectsV2 requests, your Lambda function's response can either include a JSON-formatted listBucketResult result that S3 Object Lambda automatically converts to a valid ListObjectsV2 XML response, or include an XML-formatted listResultXML result that S3 Object Lambda does not validate further. If your Lambda function's response includes listResultXML, it must precisely match the XML schema for ListObjectsV2 responses, or Mountpoint may fail to parse it.

Endpoints and AWS PrivateLink

In most scenarios, Mountpoint automatically infers the appropriate Amazon S3 endpoint to send requests to based on the bucket name and region. This includes automatically using gateway endpoints you have created in your VPC to access S3 without internet access. However, you may need to provide additional command-line arguments to change the endpoint Mountpoint uses in some situations:

  • To make requests to S3 over IPv6, use the --dual-stack command-line flag.
  • To use Amazon S3 Transfer Acceleration to optimize transfer speeds when accessing your S3 bucket over the internet, use the --transfer-acceleration command-line flag. Transfer Acceleration must be enabled on your S3 bucket to use this option.
  • To use interface VPC endpoints provisioned with AWS PrivateLink for Amazon S3, specify the interface endpoint's DNS name with the --endpoint-url command-line argument. You must replace the * part of the DNS name displayed in the console with bucket. For example, if the console shows your interface endpoint's DNS name as *.vpce-0e25b8cdd720f900e-argc85vg.s3.us-east-1.vpce.amazonaws.com, specify the following endpoint URL argument to Mountpoint:
    --endpoint-url https://bucket.vpce-0e25b8cdd720f900e-argc85vg.s3.us-east-1.vpce.amazonaws.com
    
    Alternatively, if you enable private DNS for your interface endpoint, you do not need to provide the --endpoint-url command-line argument.
  • In other scenarios, you can use the --endpoint-url command-line argument to fully override Mountpoint's endpoint detection. For example, the argument --endpoint-url https://example.com will force Mountpoint to send S3 requests to example.com. You may need to also use the --region flag to correctly specify the region to use in AWS request signing, and the --force-path-style flag to disable virtual-hosted-style addressing if the endpoint does not support it.
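The interface endpoint substitution described above (replacing the * in the console's DNS name with bucket) can be sketched as follows. The DNS name is the one from the example above. The snippet prints the resulting command rather than running it.

```shell
# DNS name as displayed in the console (value from the example above).
CONSOLE_DNS='*.vpce-0e25b8cdd720f900e-argc85vg.s3.us-east-1.vpce.amazonaws.com'

# Drop the leading "*", then prepend "bucket" and the https:// scheme.
HOST_SUFFIX=${CONSOLE_DNS#\*}
ENDPOINT_URL="https://bucket$HOST_SUFFIX"

echo "mount-s3 --endpoint-url $ENDPOINT_URL DOC-EXAMPLE-BUCKET /path/to/mount"
```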

Data encryption

Amazon S3 supports a number of server-side encryption types. Mountpoint supports reading objects that are encrypted with Amazon S3 managed keys (SSE-S3), with AWS KMS keys (SSE-KMS), or with dual-layer encryption with AWS KMS keys (DSSE-KMS). It does not currently support reading objects encrypted with customer-provided keys (SSE-C). For new objects written by Mountpoint, Amazon S3 automatically applies server-side encryption with Amazon S3 managed keys (SSE-S3), and Mountpoint does not support using other encryption types.

Mountpoint does not support client-side encryption using the Amazon S3 Encryption Client.

Other S3 bucket configuration

If the bucket you are mounting is a Requester Pays bucket, you must acknowledge that you will be charged for the request and the data transferred, rather than the bucket owner. You provide this acknowledgement by using the --requester-pays command-line flag. If you try to mount a Requester Pays bucket without using this flag, mounting will fail with an Access Denied error.

If you want to verify that the S3 bucket you are mounting is owned by the expected AWS account, use the --expected-bucket-owner command-line argument. For example, if you expect the bucket to be owned by the AWS account 111122223333, specify the argument --expected-bucket-owner 111122223333. If the argument doesn't match the bucket owner's account ID, mounting will fail with an Access Denied error.

File system configuration

Mountpoint automatically configures reasonable defaults for file system settings such as permissions and performance. You can adjust these settings if you need finer control over how the Mountpoint file system behaves.

File modifications and deletions

By default, Mountpoint allows creating new files, and does not allow deleting or overwriting existing objects. You can adjust these defaults in two ways:

  • If you want to allow file deletion, use the --allow-delete command-line flag. When you delete a file from your Mountpoint file system with this flag enabled, the corresponding object is immediately deleted from your S3 bucket.
  • If you want to forbid all mutating actions on your S3 bucket, use the --read-only command-line flag.

You cannot currently use Mountpoint to overwrite existing objects. However, if you use the --allow-delete flag, you can first delete the object and then create it again.
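A minimal sketch of the delete-then-recreate approach, assuming the bucket was mounted with --allow-delete. The `replace_file` helper is hypothetical, and the demo runs against a temporary directory standing in for the mount so that it works anywhere.

```shell
# Replace an existing file by deleting it first, then writing it again.
replace_file() {
    src="$1"
    dst="$2"
    rm -f "$dst" && cp "$src" "$dst"
}

# Demo against a temporary directory standing in for /path/to/mount.
demo_dir="$(mktemp -d)"
echo "old" > "$demo_dir/data.json"
echo "new" > "$demo_dir/new.json"
replace_file "$demo_dir/new.json" "$demo_dir/data.json"
cat "$demo_dir/data.json"
```

Note that this is not an atomic replacement: between the delete and the completed write, readers of the bucket will not find the object.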

S3 storage classes

Amazon S3 offers a range of storage classes that you can choose from based on the data access, resiliency, and cost requirements of your workloads. When creating new files with Mountpoint, you can control which storage class the corresponding objects are stored in. By default, Mountpoint uses the S3 Standard storage class, which is appropriate for a wide variety of use cases. To store new objects in a different storage class, use the --storage-class command-line argument. Possible values for this argument include:

For the full list of possible storage classes, see the PutObject documentation in the Amazon S3 User Guide.

Mountpoint supports reading existing objects from your S3 bucket when they are stored in any instant-retrieval storage class. You cannot use Mountpoint to read objects stored in the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes, or the Archive Access or Deep Archive Access tiers of S3 Intelligent-Tiering. This limitation exists even if you have restored the object. However, you can still use Mountpoint to write new objects into these storage classes or S3 Intelligent-Tiering.

File and directory permissions

Mountpoint applies default permissions that allow all files in your mounted directory to be read and written by the local user who ran the mount-s3 command. You can override these defaults in several ways:

  • To apply a different permission mode to files or directories, use the --file-mode and --dir-mode command-line arguments.
  • To change the ownership (user and group) of all files and directories, use the --uid and --gid command-line arguments. These arguments take user and group identifiers rather than names. You can find your user and group identifiers with the id command on Linux.

By default, users other than the user who ran the mount-s3 command cannot access your mounted directory, even if the permissions and ownership settings above would allow it. This is true even for the root user, and is a limitation of the FUSE system Mountpoint uses to create a file system. To allow other non-root users to access your mounted directory, use the --allow-other command-line flag. To allow the root user to access your mounted directory if you ran mount-s3 as a different user, use the --allow-root command-line flag. To use these flags, you may need to first configure FUSE by adding the line user_allow_other to the /etc/fuse.conf file. Even with these flags enabled, Mountpoint still respects the permissions and ownership configured with the other flags above.

Despite these configurations, IAM permissions still always apply to accessing the files and directories in your S3 bucket.
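The permission and ownership options above might be combined as in the following sketch. The mode values are examples, and writing them in octal is an assumption about the argument format. The snippet prints the mount command rather than running it, and checks whether user_allow_other is already present in /etc/fuse.conf.

```shell
# Look up the numeric IDs that --uid and --gid expect.
MOUNT_UID="$(id -u)"
MOUNT_GID="$(id -g)"

# Example modes, written in octal (the format is an assumption).
FILE_MODE="0640"
DIR_MODE="0750"

echo "mount-s3 --uid $MOUNT_UID --gid $MOUNT_GID --file-mode $FILE_MODE --dir-mode $DIR_MODE --allow-other DOC-EXAMPLE-BUCKET /path/to/mount"

# --allow-other may need user_allow_other in /etc/fuse.conf.
if grep -qs '^user_allow_other' /etc/fuse.conf; then
    echo "user_allow_other is set in /etc/fuse.conf"
else
    echo "user_allow_other is not set in /etc/fuse.conf"
fi
```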

Configuring Mountpoint performance

At mount time, Mountpoint automatically selects appropriate defaults to provide high-performance access to Amazon S3. These defaults include Amazon S3 performance best practices such as scaling requests across multiple S3 connections, using range GET requests to parallelize sequential reads, and using request timeouts and retries. Most applications should not need to adjust these defaults, but if necessary, you can change them in several ways:

  • Mountpoint scales the number and rate of parallel requests to meet a targeted maximum network throughput. This maximum is shared across all file and directory accesses made by a single Mountpoint process. By default, Mountpoint sets this maximum network throughput to the available network bandwidth when running on an EC2 instance or to 10 Gbps elsewhere. To change this default, use the --maximum-throughput-gbps command-line argument, providing a value in gigabits per second (Gbps). For example, if you have multiple Mountpoint processes on the same instance, you can adjust this argument to partition the available network bandwidth between them.
  • By default, Mountpoint can serve up to 16 concurrent file or directory operations, and automatically scales up to reach this limit. If your application makes more than this many concurrent reads and writes (including to the same or different files), you can improve performance by increasing this limit with the --max-threads command-line argument. Higher values of this flag might cause Mountpoint to use more of your instance's resources.
  • When reading or writing files to S3, Mountpoint divides them into parts and uses parallel requests to improve throughput. You can change the part size Mountpoint uses for these parallel requests using the --part-size command-line argument, providing a maximum number of bytes per part. The default value of this argument is 8 MiB (8,388,608 bytes), which in our testing is the highest value that achieves maximum throughput. Higher values of this argument can reduce the number of billed requests Mountpoint makes, but also reduce the throughput of object reads and writes to S3.
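As an illustration of partitioning bandwidth between several Mountpoint processes, the sketch below splits an instance's bandwidth evenly. The 100 Gbps instance bandwidth and the count of four mounts are assumed example values. The snippet prints the command rather than running it.

```shell
# Example values (assumptions): a 100 Gbps instance shared by 4 mounts.
INSTANCE_GBPS=100
MOUNT_COUNT=4

# Give each process an equal share (integer division rounds down).
PER_MOUNT_GBPS=$(( INSTANCE_GBPS / MOUNT_COUNT ))

echo "mount-s3 --maximum-throughput-gbps $PER_MOUNT_GBPS DOC-EXAMPLE-BUCKET /path/to/mount"
```

An even split is only one choice. If one mount carries most of the traffic, you may prefer to give it a larger share.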

Maximum object size

In its default configuration, there is no maximum on the size of objects Mountpoint can read. However, Mountpoint uses multipart upload when writing new objects, and multipart upload allows a maximum of 10,000 parts for an object. This means Mountpoint can only upload objects up to 80,000 MiB (78.1 GiB) in size. If your application tries to write objects larger than this limit, writes will fail with an out of space error.

To increase the maximum object size for writes, use the --part-size command-line argument to specify a maximum number of bytes per part, which defaults to 8 MiB. The maximum object size will be 10,000 multiplied by the value you provide for this argument. Even with multipart upload, S3 allows a maximum object size of 5 TiB, and so setting this argument higher than 524.3 MiB will not further increase the object size limit.
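The arithmetic above can be checked with a short sketch. It computes the default write limit from the 8 MiB part size, then the smallest part size that would allow a target object size. The 500 GiB target is an assumed example. The snippet prints the resulting command rather than running it.

```shell
MIB=$(( 1024 * 1024 ))
MAX_PARTS=10000    # multipart upload allows at most 10,000 parts

# The default part size of 8 MiB gives the default write limit.
DEFAULT_PART_SIZE=$(( 8 * MIB ))
DEFAULT_LIMIT_MIB=$(( DEFAULT_PART_SIZE * MAX_PARTS / MIB ))
echo "default write limit: $DEFAULT_LIMIT_MIB MiB"

# Smallest part size in bytes that allows the target size, rounded up.
TARGET_BYTES=$(( 500 * 1024 * MIB ))    # example target: 500 GiB
PART_SIZE=$(( (TARGET_BYTES + MAX_PARTS - 1) / MAX_PARTS ))
echo "mount-s3 --part-size $PART_SIZE DOC-EXAMPLE-BUCKET /path/to/mount"
```

Keep in mind the trade-off noted earlier: part sizes above the default can reduce the throughput of reads and writes, so it may be best to raise the value only as far as your largest objects require.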

Logging

By default, Mountpoint emits high-severity log information to syslog if available on your system. You can change what level of information is logged, and to where it is logged. See LOGGING.md for more details on configuring logging.