From a2dbe1db20eb883ff1610025e85b112d416aa9ac Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 10:39:27 -0500 Subject: [PATCH 01/15] clean up default formatting --- content/en/admin/optional/object-storage.md | 64 +++++++-------------- 1 file changed, 20 insertions(+), 44 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index 7fd856e52..0c86ff6ca 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -111,87 +111,63 @@ You must serve the files with CORS headers, otherwise some functions of Mastodon #### `S3_OPEN_TIMEOUT` -Default: 5 (seconds) - The number of seconds before the HTTP handler should timeout while trying to open a new HTTP session. -#### `S3_READ_TIMEOUT` +Default: `5` (seconds) -Default: 5 (seconds) +#### `S3_READ_TIMEOUT` The number of seconds before the HTTP handler should timeout while waiting for an HTTP response. -#### `S3_FORCE_SINGLE_REQUEST` +Default: `5` (seconds) -Default: false +#### `S3_FORCE_SINGLE_REQUEST` Set this to `true` if you run into trouble processing large files. +Default: `false` + #### `S3_ENABLE_CHECKSUM_MODE` -Default: false +Enables verification of object checksums when Mastodon is retrieving an object from the storage provider. This feature is available in AWS S3 but may not be available in other S3-compatible implementations. -Enables verification of object checksums when Mastodon is retrieving -an object from the storage provider. This feature is available in AWS -S3 but may not be available in other S3-compatible implementations. +Default: `false` #### `S3_STORAGE_CLASS` -Default: none +When using AWS S3, this variable can be set to one of the [storage class](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html) options which influence the storage selected for uploaded objects (and thus their access times and costs). 
If no storage class is specified then AWS S3 will use the `STANDARD` class, but options include `REDUCED_REDUNDANCY`, `GLACIER`, and others. -When using AWS S3, this variable can be set to one of the [storage -class](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html) -options which influence the storage selected for uploaded objects (and -thus their access times and costs). If no storage class is specified -then AWS S3 will use the `STANDARD` class, but options include -`REDUCED_REDUNDANCY`, `GLACIER`, and others. +Default: `STANDARD` #### `S3_MULTIPART_THRESHOLD` -Default: 15 (megabytes) +Objects of this size and smaller will be uploaded in a single operation, but larger objects will be uploaded using the multipart chunking mechanism, which can improve transfer speeds and reliability. -Objects of this size and smaller will be uploaded in a single -operation, but larger objects will be uploaded using the multipart -chunking mechanism, which can improve transfer speeds and reliability. +Default: `15` (megabytes) #### `S3_PERMISSION` -Default: `public-read` +Defines the S3 object ACL when uploading new files. Use caution when using [S3 Block Public Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) and turning on the `BlockPublicAcls` option, as uploading objects with ACL `public-read` will fail (403). In that case, set `S3_PERMISSION` to `private`. -Defines the S3 object ACL when uploading new files. Use caution when -using [S3 Block Public -Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) -and turning on the `BlockPublicAcls` option, as uploading objects with -ACL `public-read` will fail (403). In that case, set `S3_PERMISSION` -to `private`. 
+Default: `public-read` {{< hint style="danger" >}} -Regardless of the ACL configuration, your -S3 bucket must be set up to ensure that all objects are publicly -readable but not writable or listable. At the same time, Mastodon -itself should have write access to the bucket. This configuration is -generally consistent across all S3 providers, and common ones are -highlighted below. +Regardless of the ACL configuration, your S3 bucket must be set up to ensure that all objects are publicly readable but not writable or listable. At the same time, Mastodon itself should have write access to the bucket. This configuration is generally consistent across all S3 providers, and common ones are highlighted below. {{}} #### `S3_BATCH_DELETE_LIMIT` -Default: `1000` +The official [Amazon S3 API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) can handle deleting 1,000 objects in one batch job, but some providers may have issues handling this many in one request, or offer lower limits. -The official [Amazon S3 -API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) -can handle deleting 1,000 objects in one batch job, but some providers -may have issues handling this many in one request, or offer lower -limits. +Default: `1000` #### `S3_BATCH_DELETE_RETRY` -Default: 3 - -During batch delete operations, S3 providers may perodically fail or -timeout while processing deletion requests. Mastodon will back off and +During batch delete operations, S3 providers may perodically fail or timeout while processing deletion requests. Mastodon will back off and retry the request up to this maximum number of times. +Default: `3` + ### MinIO MinIO is an open-source implementation of an S3 object provider. This section does not cover how to install it, but how to configure a bucket for use in Mastodon. 
From 2bfee836ac1d10f7d0471f4913fa1791cc7de737 Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 10:40:55 -0500 Subject: [PATCH 02/15] Revert "clean up default formatting" This reverts commit a2dbe1db20eb883ff1610025e85b112d416aa9ac. --- content/en/admin/optional/object-storage.md | 64 ++++++++++++++------- 1 file changed, 44 insertions(+), 20 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index 0c86ff6ca..7fd856e52 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -111,62 +111,86 @@ You must serve the files with CORS headers, otherwise some functions of Mastodon #### `S3_OPEN_TIMEOUT` -The number of seconds before the HTTP handler should timeout while trying to open a new HTTP session. +Default: 5 (seconds) -Default: `5` (seconds) +The number of seconds before the HTTP handler should timeout while trying to open a new HTTP session. #### `S3_READ_TIMEOUT` -The number of seconds before the HTTP handler should timeout while waiting for an HTTP response. +Default: 5 (seconds) -Default: `5` (seconds) +The number of seconds before the HTTP handler should timeout while waiting for an HTTP response. #### `S3_FORCE_SINGLE_REQUEST` -Set this to `true` if you run into trouble processing large files. +Default: false -Default: `false` +Set this to `true` if you run into trouble processing large files. #### `S3_ENABLE_CHECKSUM_MODE` -Enables verification of object checksums when Mastodon is retrieving an object from the storage provider. This feature is available in AWS S3 but may not be available in other S3-compatible implementations. +Default: false -Default: `false` +Enables verification of object checksums when Mastodon is retrieving +an object from the storage provider. This feature is available in AWS +S3 but may not be available in other S3-compatible implementations. 
#### `S3_STORAGE_CLASS` -When using AWS S3, this variable can be set to one of the [storage class](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html) options which influence the storage selected for uploaded objects (and thus their access times and costs). If no storage class is specified then AWS S3 will use the `STANDARD` class, but options include `REDUCED_REDUNDANCY`, `GLACIER`, and others. +Default: none -Default: `STANDARD` +When using AWS S3, this variable can be set to one of the [storage +class](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html) +options which influence the storage selected for uploaded objects (and +thus their access times and costs). If no storage class is specified +then AWS S3 will use the `STANDARD` class, but options include +`REDUCED_REDUNDANCY`, `GLACIER`, and others. #### `S3_MULTIPART_THRESHOLD` -Objects of this size and smaller will be uploaded in a single operation, but larger objects will be uploaded using the multipart chunking mechanism, which can improve transfer speeds and reliability. +Default: 15 (megabytes) -Default: `15` (megabytes) +Objects of this size and smaller will be uploaded in a single +operation, but larger objects will be uploaded using the multipart +chunking mechanism, which can improve transfer speeds and reliability. #### `S3_PERMISSION` -Defines the S3 object ACL when uploading new files. Use caution when using [S3 Block Public Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) and turning on the `BlockPublicAcls` option, as uploading objects with ACL `public-read` will fail (403). In that case, set `S3_PERMISSION` to `private`. - Default: `public-read` +Defines the S3 object ACL when uploading new files. 
Use caution when +using [S3 Block Public +Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) +and turning on the `BlockPublicAcls` option, as uploading objects with +ACL `public-read` will fail (403). In that case, set `S3_PERMISSION` +to `private`. + {{< hint style="danger" >}} -Regardless of the ACL configuration, your S3 bucket must be set up to ensure that all objects are publicly readable but not writable or listable. At the same time, Mastodon itself should have write access to the bucket. This configuration is generally consistent across all S3 providers, and common ones are highlighted below. +Regardless of the ACL configuration, your +S3 bucket must be set up to ensure that all objects are publicly +readable but not writable or listable. At the same time, Mastodon +itself should have write access to the bucket. This configuration is +generally consistent across all S3 providers, and common ones are +highlighted below. {{}} #### `S3_BATCH_DELETE_LIMIT` -The official [Amazon S3 API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) can handle deleting 1,000 objects in one batch job, but some providers may have issues handling this many in one request, or offer lower limits. - Default: `1000` +The official [Amazon S3 +API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) +can handle deleting 1,000 objects in one batch job, but some providers +may have issues handling this many in one request, or offer lower +limits. + #### `S3_BATCH_DELETE_RETRY` -During batch delete operations, S3 providers may perodically fail or timeout while processing deletion requests. Mastodon will back off and -retry the request up to this maximum number of times. +Default: 3 -Default: `3` +During batch delete operations, S3 providers may perodically fail or +timeout while processing deletion requests. Mastodon will back off and +retry the request up to this maximum number of times. 
### MinIO From a6f01f4449a4c23ed3f653df70297b9c077fe6a7 Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 10:41:25 -0500 Subject: [PATCH 03/15] cleanup default formatting --- content/en/admin/optional/object-storage.md | 64 +++++++-------------- 1 file changed, 20 insertions(+), 44 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index 7fd856e52..0c86ff6ca 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -111,87 +111,63 @@ You must serve the files with CORS headers, otherwise some functions of Mastodon #### `S3_OPEN_TIMEOUT` -Default: 5 (seconds) - The number of seconds before the HTTP handler should timeout while trying to open a new HTTP session. -#### `S3_READ_TIMEOUT` +Default: `5` (seconds) -Default: 5 (seconds) +#### `S3_READ_TIMEOUT` The number of seconds before the HTTP handler should timeout while waiting for an HTTP response. -#### `S3_FORCE_SINGLE_REQUEST` +Default: `5` (seconds) -Default: false +#### `S3_FORCE_SINGLE_REQUEST` Set this to `true` if you run into trouble processing large files. +Default: `false` + #### `S3_ENABLE_CHECKSUM_MODE` -Default: false +Enables verification of object checksums when Mastodon is retrieving an object from the storage provider. This feature is available in AWS S3 but may not be available in other S3-compatible implementations. -Enables verification of object checksums when Mastodon is retrieving -an object from the storage provider. This feature is available in AWS -S3 but may not be available in other S3-compatible implementations. +Default: `false` #### `S3_STORAGE_CLASS` -Default: none +When using AWS S3, this variable can be set to one of the [storage class](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html) options which influence the storage selected for uploaded objects (and thus their access times and costs). 
If no storage class is specified then AWS S3 will use the `STANDARD` class, but options include `REDUCED_REDUNDANCY`, `GLACIER`, and others. -When using AWS S3, this variable can be set to one of the [storage -class](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html) -options which influence the storage selected for uploaded objects (and -thus their access times and costs). If no storage class is specified -then AWS S3 will use the `STANDARD` class, but options include -`REDUCED_REDUNDANCY`, `GLACIER`, and others. +Default: `STANDARD` #### `S3_MULTIPART_THRESHOLD` -Default: 15 (megabytes) +Objects of this size and smaller will be uploaded in a single operation, but larger objects will be uploaded using the multipart chunking mechanism, which can improve transfer speeds and reliability. -Objects of this size and smaller will be uploaded in a single -operation, but larger objects will be uploaded using the multipart -chunking mechanism, which can improve transfer speeds and reliability. +Default: `15` (megabytes) #### `S3_PERMISSION` -Default: `public-read` +Defines the S3 object ACL when uploading new files. Use caution when using [S3 Block Public Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) and turning on the `BlockPublicAcls` option, as uploading objects with ACL `public-read` will fail (403). In that case, set `S3_PERMISSION` to `private`. -Defines the S3 object ACL when uploading new files. Use caution when -using [S3 Block Public -Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) -and turning on the `BlockPublicAcls` option, as uploading objects with -ACL `public-read` will fail (403). In that case, set `S3_PERMISSION` -to `private`. 
+Default: `public-read` {{< hint style="danger" >}} -Regardless of the ACL configuration, your -S3 bucket must be set up to ensure that all objects are publicly -readable but not writable or listable. At the same time, Mastodon -itself should have write access to the bucket. This configuration is -generally consistent across all S3 providers, and common ones are -highlighted below. +Regardless of the ACL configuration, your S3 bucket must be set up to ensure that all objects are publicly readable but not writable or listable. At the same time, Mastodon itself should have write access to the bucket. This configuration is generally consistent across all S3 providers, and common ones are highlighted below. {{}} #### `S3_BATCH_DELETE_LIMIT` -Default: `1000` +The official [Amazon S3 API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) can handle deleting 1,000 objects in one batch job, but some providers may have issues handling this many in one request, or offer lower limits. -The official [Amazon S3 -API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) -can handle deleting 1,000 objects in one batch job, but some providers -may have issues handling this many in one request, or offer lower -limits. +Default: `1000` #### `S3_BATCH_DELETE_RETRY` -Default: 3 - -During batch delete operations, S3 providers may perodically fail or -timeout while processing deletion requests. Mastodon will back off and +During batch delete operations, S3 providers may perodically fail or timeout while processing deletion requests. Mastodon will back off and retry the request up to this maximum number of times. +Default: `3` + ### MinIO MinIO is an open-source implementation of an S3 object provider. This section does not cover how to install it, but how to configure a bucket for use in Mastodon. 
From eeacc399796fc217b7f6e112e2f3b9361371f055 Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 11:16:18 -0500 Subject: [PATCH 04/15] line breaks --- content/en/admin/optional/object-storage.md | 68 +++++---------------- 1 file changed, 14 insertions(+), 54 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index 0c86ff6ca..04c55d461 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -28,56 +28,28 @@ The web server must be configured to serve those files but not allow listing the ## S3-compatible object storage backends {#S3} Mastodon can use S3-compatible object storage backends. ACL support is recommended as it allows Mastodon to quickly make the content of temporarily suspended users unavailable, or marginally improve the security of private data. +Mastodon uses the S3 API (`S3_REGION`, `S3_ENDPOINT`, `S3_BUCKET`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `S3_SIGNATURE_VERSION`, `S3_OVERRIDE_PATH_STYLE`) for all write, delete, and permissions-modification operations. This includes media uploads (from the web interface, from Mastodon API clients, and from ActivityPub servers), media deletion (when a post is edited or deleted), and blocking access to media (when an account is suspended). -Mastodon uses the S3 API (`S3_REGION`, `S3_ENDPOINT`, `S3_BUCKET`, -`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `S3_SIGNATURE_VERSION`, -`S3_OVERRIDE_PATH_STYLE`) for all write, delete, and -permissions-modification operations. This includes media uploads (from -the web interface, from Mastodon API clients, and from ActivityPub -servers), media deletion (when a post is edited or deleted), and -blocking access to media (when an account is suspended). - -Mastodon sends URLs to the web interface, Mastodon API clients, and -ActivityPub servers for all 'read' operations. 
As a result those -operations are anonymous (no authentication or authorization needed) -and use plain HTTP GET methods, which means they can be routed through -reverse proxies and CDNs, and can be cached. It also means that those -URLs can contain host/domain names which are entirely different from -those used by the S3 storage provider itself, if desired. See the -detailed documentation below which describes how those URLs are -constructed and which environment variables are involved. +Mastodon sends URLs to the web interface, Mastodon API clients, and ActivityPub servers for all 'read' operations. As a result, those operations are anonymous (no authentication or authorization needed) and use plain HTTP GET methods, which means they can be routed through reverse proxies and CDNs, and can be cached. It also means that those URLs can contain host/domain names which are entirely different from those used by the S3 storage provider itself, if desired. See the detailed documentation below which describes how those URLs are constructed and which environment variables are involved. To enable S3 storage, set the `S3_ENABLED` environment variable to `true`. 
### Environment variables for S3 API access -- `S3_REGION` (defaults to 'us-east-1', required if using AWS S3, may - not be required with other storage providers) -- `S3_ENDPOINT` (defaults to 's3..amazonaws.com', required - if not using AWS S3) -- `S3_BUCKET=mastodata` (replacing `mastodata` with the name of your - bucket) -- `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` need to be set to - your credentials -- `S3_SIGNATURE_VERSION` (defaults to 'v4', should be compatible with - most storage providers) -- `S3_OVERRIDE_PATH_STYLE` (only used if `S3_ENDPOINT` is configured, - set this to `true` if the storage provider requires API operations - to be sent to '.` (domain-style)) +- `S3_REGION` (defaults to 'us-east-1', required if using AWS S3, may not be required with other storage providers) +- `S3_ENDPOINT` (defaults to 's3..amazonaws.com', required if not using AWS S3) +- `S3_BUCKET=mastodata` (replacing `mastodata` with the name of your bucket) +- `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` need to be set to your credentials +- `S3_SIGNATURE_VERSION` (defaults to 'v4', should be compatible with most storage providers) +- `S3_OVERRIDE_PATH_STYLE` (only used if `S3_ENDPOINT` is configured, set this to `true` if the storage provider requires API operations to be sent to '.` (domain-style)) ### Environment variables for client access to media objects - `S3_PROTOCOL` (defaults to `https`) -- `S3_HOSTNAME` (defaults to 's3-.amazonaws.com', required - if not using AWS S3 and `S3_ALIAS_HOST` is not set) -- `S3_ALIAS_HOST` (can be used instead of `S3_HOSTNAME` if you do not - want `S3_BUCKET` to be included in the media URLs, and requires that - you have provisioned a reverse proxy or CDN in front of the storage - provider) +- `S3_HOSTNAME` (defaults to 's3-.amazonaws.com', required if not using AWS S3 and `S3_ALIAS_HOST` is not set) +- `S3_ALIAS_HOST` (can be used instead of `S3_HOSTNAME` if you do not want `S3_BUCKET` to be included in the media URLs, and requires 
that you have provisioned a reverse proxy or CDN in front of the storage provider) -As noted above, Mastodon will send URLs to clients when they need to -access media objects from the storage provider. The URLs are -constructed as follows: +As noted above, Mastodon will send URLs to clients when they need to access media objects from the storage provider. The URLs are constructed as follows: - If `S3_ALIAS_HOST` is not set, then the URL will be ':////\' @@ -85,21 +57,9 @@ constructed as follows: - If `S3_ALIAS_HOST` is set, then the URL will be ':///\' -It is important to note that when `S3_ALIAS_HOST` is set, the bucket -name is **not** included in the generated URL; this means the bucket -name must be included in `S3_ALIAS_HOST` (referred to as -'domain-style' object access), or that `S3_ALIAS_HOST` must point to a -reverse proxy or CDN which can include the bucket name in the URL it -uses to send the request onward to the storage provider. This type of -configuration allows you to 'hide' the usage of the storage provider -from the instance's clients, which means you can change storage -providers without changing the resulting URLs. - -In addition to hiding the usage of the storage provider, this can also -allow you to cache the media after retrieval from the storage -provider, reducing egress bandwidth costs from the storage -provider. This can be done in your own reverse proxy, or by using a -CDN. +It is important to note that when `S3_ALIAS_HOST` is set, the bucket name is **not** included in the generated URL; this means the bucket name must be included in `S3_ALIAS_HOST` (referred to as 'domain-style' object access), or that `S3_ALIAS_HOST` must point to a reverse proxy or CDN which can include the bucket name in the URL it uses to send the request onward to the storage provider. 
This type of configuration allows you to 'hide' the usage of the storage provider from the instance's clients, which means you can change storage providers without changing the resulting URLs. + +In addition to hiding the usage of the storage provider, this can also allow you to cache the media after retrieval from the storage provider, reducing egress bandwidth costs from the storage provider. This can be done in your own reverse proxy, or by using a CDN. {{< page-ref page="admin/optional/object-storage-proxy.md" >}} From 588eee00a6d36a99d01b562f1e3f6eba4ce507cf Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 11:17:07 -0500 Subject: [PATCH 05/15] use example.com --- content/en/admin/optional/object-storage.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index 04c55d461..0591b0a9d 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -15,14 +15,14 @@ The simplest way to store user uploads is by using the server's file system. Thi By default, Mastodon will store file uploads under `public/system` in its installation directory, but that can be overridden using the `PAPERCLIP_ROOT_PATH` environment variable. -By default, the files are served at `https://your-domain/system`, which can be overridden using `PAPERCLIP_ROOT_URL` and `CDN_HOST`. +By default, the files are served at `https://example.com/system`, which can be overridden using `PAPERCLIP_ROOT_URL` and `CDN_HOST`. {{< hint style="info" >}} While using the server's file system is perfectly serviceable for small servers, using external object storage is more scalable. {{}} {{< hint style="danger" >}} -The web server must be configured to serve those files but not allow listing them (that is, `https://your-domain/system/` should not return a file list). 
This should be the case if you use the configuration files distributed with Mastodon, but it is worth double-checking. +The web server must be configured to serve those files but not allow listing them (that is, `https://example.com/system/` should not return a file list). This should be the case if you use the configuration files distributed with Mastodon, but it is worth double-checking. {{}} ## S3-compatible object storage backends {#S3} From 1841dcc1005bd6b1eb203fa7b1bccaca4e67c552 Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 12:11:06 -0500 Subject: [PATCH 06/15] one line per sentence style --- content/en/admin/config.md | 2 - content/en/admin/optional/object-storage.md | 150 ++++++++++---------- 2 files changed, 78 insertions(+), 74 deletions(-) diff --git a/content/en/admin/config.md b/content/en/admin/config.md index a4244056b..815748648 100644 --- a/content/en/admin/config.md +++ b/content/en/admin/config.md @@ -579,14 +579,12 @@ The bucket must support access control lists (ACLs). For AWS S3, this means sett #### `S3_OVERRIDE_PATH_STYLE` - #### `S3_PROTOCOL` #### `S3_HOSTNAME` #### `S3_ALIAS_HOST` - #### `S3_OPEN_TIMEOUT` #### `S3_READ_TIMEOUT` diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index 0591b0a9d..6232696d2 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -7,34 +7,26 @@ menu: parent: admin-optional --- -User-uploaded files can be stored on the main server's file system, or using an external object storage server, which can be required for scaling. +User-uploaded files can be stored on the main server's file system, or using an external object storage server. -## Using the filesystem {#FS} - -The simplest way to store user uploads is by using the server's file system. This is how it works by default and is suitable for small servers. 
- -By default, Mastodon will store file uploads under `public/system` in its installation directory, but that can be overridden using the `PAPERCLIP_ROOT_PATH` environment variable. - -By default, the files are served at `https://example.com/system`, which can be overridden using `PAPERCLIP_ROOT_URL` and `CDN_HOST`. +By default, Mastodon will store user uploaded and federated media files on the server's file system, under `public/system` in its installation directory and the files are served at `https://example.com/system`. {{< hint style="info" >}} While using the server's file system is perfectly serviceable for small servers, using external object storage is more scalable. {{}} -{{< hint style="danger" >}} -The web server must be configured to serve those files but not allow listing them (that is, `https://example.com/system/` should not return a file list). This should be the case if you use the configuration files distributed with Mastodon, but it is worth double-checking. -{{}} - -## S3-compatible object storage backends {#S3} +To enable S3 storage, start by setting the `S3_ENABLED` environment variable to `true`. -Mastodon can use S3-compatible object storage backends. ACL support is recommended as it allows Mastodon to quickly make the content of temporarily suspended users unavailable, or marginally improve the security of private data. -Mastodon uses the S3 API (`S3_REGION`, `S3_ENDPOINT`, `S3_BUCKET`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `S3_SIGNATURE_VERSION`, `S3_OVERRIDE_PATH_STYLE`) for all write, delete, and permissions-modification operations. This includes media uploads (from the web interface, from Mastodon API clients, and from ActivityPub servers), media deletion (when a post is edited or deleted), and blocking access to media (when an account is suspended). +### Access Control -Mastodon sends URLs to the web interface, Mastodon API clients, and ActivityPub servers for all 'read' operations. 
As a result, those operations are anonymous (no authentication or authorization needed) and use plain HTTP GET methods, which means they can be routed through reverse proxies and CDNs, and can be cached. It also means that those URLs can contain host/domain names which are entirely different from those used by the S3 storage provider itself, if desired. See the detailed documentation below which describes how those URLs are constructed and which environment variables are involved. +When using an S3-compatible object storage backend, it is recommended to use a backend with ACL support, as it allows Mastodon to quickly improve the security of private data. -To enable S3 storage, set the `S3_ENABLED` environment variable to `true`. +Mastodon sends URLs to the web interface, Mastodon API clients, and ActivityPub servers for all 'read' operations. +As a result, those operations are anonymous (no authentication or authorization needed) and use plain HTTP GET methods, which means they can be routed through reverse proxies and CDNs, and can be cached. +It also means that those URLs can contain host/domain names which are entirely different from those used by the S3 storage provider itself, if desired. +See the detailed documentation below which describes how those URLs are constructed and which environment variables are involved. 
-### Environment variables for S3 API access +### Required Environment Variables - `S3_REGION` (defaults to 'us-east-1', required if using AWS S3, may not be required with other storage providers) - `S3_ENDPOINT` (defaults to 's3..amazonaws.com', required if not using AWS S3) @@ -57,9 +49,11 @@ As noted above, Mastodon will send URLs to clients when they need to access medi - If `S3_ALIAS_HOST` is set, then the URL will be ':///\' -It is important to note that when `S3_ALIAS_HOST` is set, the bucket name is **not** included in the generated URL; this means the bucket name must be included in `S3_ALIAS_HOST` (referred to as 'domain-style' object access), or that `S3_ALIAS_HOST` must point to a reverse proxy or CDN which can include the bucket name in the URL it uses to send the request onward to the storage provider. This type of configuration allows you to 'hide' the usage of the storage provider from the instance's clients, which means you can change storage providers without changing the resulting URLs. +It is important to note that when `S3_ALIAS_HOST` is set, the bucket name is **not** included in the generated URL; this means the bucket name must be included in `S3_ALIAS_HOST` (referred to as 'domain-style' object access), or that `S3_ALIAS_HOST` must point to a reverse proxy or CDN which can include the bucket name in the URL it uses to send the request onward to the storage provider. +This type of configuration allows you to 'hide' the usage of the storage provider from the instance's clients, which means you can change storage providers without changing the resulting URLs. -In addition to hiding the usage of the storage provider, this can also allow you to cache the media after retrieval from the storage provider, reducing egress bandwidth costs from the storage provider. This can be done in your own reverse proxy, or by using a CDN. 
+In addition to hiding the usage of the storage provider, this can also allow you to cache the media after retrieval from the storage provider, reducing egress bandwidth costs from the storage provider. +This can be done in your own reverse proxy, or by using a CDN. {{< page-ref page="admin/optional/object-storage-proxy.md" >}} @@ -95,7 +89,8 @@ Default: `false` #### `S3_STORAGE_CLASS` -When using AWS S3, this variable can be set to one of the [storage class](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html) options which influence the storage selected for uploaded objects (and thus their access times and costs). If no storage class is specified then AWS S3 will use the `STANDARD` class, but options include `REDUCED_REDUNDANCY`, `GLACIER`, and others. +When using AWS S3, this variable can be set to one of the [storage class](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html) options which influence the storage selected for uploaded objects (and thus their access times and costs). +If no storage class is specified then AWS S3 will use the `STANDARD` class, but options include `REDUCED_REDUNDANCY`, `GLACIER`, and others. Default: `STANDARD` @@ -107,12 +102,15 @@ Default: `15` (megabytes) #### `S3_PERMISSION` -Defines the S3 object ACL when uploading new files. Use caution when using [S3 Block Public Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) and turning on the `BlockPublicAcls` option, as uploading objects with ACL `public-read` will fail (403). In that case, set `S3_PERMISSION` to `private`. +Defines the S3 object ACL when uploading new files. Use caution when using [S3 Block Public Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) and turning on the `BlockPublicAcls` option, as uploading objects with ACL `public-read` will fail (403). +In that case, set `S3_PERMISSION` to `private`. 
Default: `public-read` {{< hint style="danger" >}} -Regardless of the ACL configuration, your S3 bucket must be set up to ensure that all objects are publicly readable but not writable or listable. At the same time, Mastodon itself should have write access to the bucket. This configuration is generally consistent across all S3 providers, and common ones are highlighted below. +Regardless of the ACL configuration, your S3 bucket must be set up to ensure that all objects are publicly readable but not writable or listable. +At the same time, Mastodon itself should have write access to the bucket. +This configuration is generally consistent across all S3 providers, and common ones are highlighted below. {{</ hint >}} #### `S3_BATCH_DELETE_LIMIT` @@ -123,49 +121,53 @@ Default: `1000` #### `S3_BATCH_DELETE_RETRY` -During batch delete operations, S3 providers may perodically fail or timeout while processing deletion requests. Mastodon will back off and -retry the request up to this maximum number of times. +During batch delete operations, S3 providers may perodically fail or timeout while processing deletion requests. +Mastodon will back off and retry the request up to this maximum number of times. Default: `3` +## Provider Specific Configurations + ### MinIO -MinIO is an open-source implementation of an S3 object provider. This section does not cover how to install it, but how to configure a bucket for use in Mastodon. +MinIO is an open-source implementation of an S3 object provider. +Installing MinIO is outide the scope of this documentation, but this should show how to configure a bucket for use in Mastodon. You need to set a policy for anonymous access that allows read-only access to objects contained by the bucket without allowing listing them.
- To do this, you need to set a custom policy (replace `mastodata` with the actual name of your S3 bucket): + ```json { - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Principal": { - "AWS": "*" - }, - "Action": "s3:GetObject", - "Resource": "arn:aws:s3:::mastodata/*" - } - ] + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Principal": { + "AWS": "*" + }, + "Action": "s3:GetObject", + "Resource": "arn:aws:s3:::mastodata/*" + } + ] } ``` Mastodon itself needs to be able to write to the bucket, so either use your admin MinIO account (discouraged) or an account specific to Mastodon (recommended) with the following policy attached (replace `mastodata` with the actual name of your S3 bucket): + ```json { - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "s3:*", - "Resource": "arn:aws:s3:::mastodata/*" - } - ] + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": "s3:*", + "Resource": "arn:aws:s3:::mastodata/*" + } + ] } ``` -You can set those policies from the MinIO Console (web-based user interface) or the command-line client (`mcli` / `mc`). +You can set these policies from the MinIO Console (web-based user interface) or the command-line client (`mcli` / `mc`). #### Using the MinIO Console @@ -176,7 +178,8 @@ Then, configure the “Access Policy” to a custom one that allows read access ![](/assets/object-storage/minio-access-policy.png) {{< hint style="info" >}} -If the MinIO Console does not allow you to set a “Custom” policy, you will likely need to update MinIO. If you are using MinIO in *standalone* or *filesystem* mode, [`RELEASE.2022-10-24T18-35-07Z`](https://github.com/minio/minio/releases/tag/RELEASE.2022-10-24T18-35-07Z) should be a safe version to update to that does not require [an involved migration procedure](https://min.io/docs/minio/linux/operations/install-deploy-manage/migrate-fs-gateway.html#migrate-from-gateway-or-filesystem-mode). 
+If the MinIO Console does not allow you to set a “Custom” policy, you will likely need to update MinIO. +If you are using MinIO in _standalone_ or _filesystem_ mode, [`RELEASE.2022-10-24T18-35-07Z`](https://github.com/minio/minio/releases/tag/RELEASE.2022-10-24T18-35-07Z) should be a safe version to update to that does not require [an involved migration procedure](https://min.io/docs/minio/linux/operations/install-deploy-manage/migrate-fs-gateway.html#migrate-from-gateway-or-filesystem-mode). {{< /hint >}} Create a new `mastodon-readwrite` policy (see above): @@ -209,19 +212,20 @@ Apply the `mastodon-readwrite` policy to the `mastodon` user: ### Wasabi Object Storage Create a new bucket and define its policy to allow objects to be anonymously readable but not listable: + ```json { - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Principal": { - "AWS": "*" - }, - "Action": "s3:GetObject", - "Resource": "arn:aws:s3:::mastodata/*" - } - ] + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Principal": { + "AWS": "*" + }, + "Action": "s3:GetObject", + "Resource": "arn:aws:s3:::mastodata/*" + } + ] } ``` @@ -229,21 +233,20 @@ Create a new bucket and define its policy to allow objects to be anonymously rea {{< hint style="info" >}} If you are using an old bucket, ensure you are not giving “Everyone” read access to objects through Wasabi's legacy Access Control settings, as that allows listing objects and take precedence over the IAM policy defined above. 
- -![](/assets/object-storage/wasabi-access-control.png) {{< /hint >}} Then, create a `mastodon-readwrite` policy to grant read and write access to your bucket: + ```json { - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "s3:*", - "Resource": "arn:aws:s3:::mastodata/*" - } - ] + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": "s3:*", + "Resource": "arn:aws:s3:::mastodata/*" + } + ] } ``` @@ -264,7 +267,8 @@ In your DigitalOcean Spaces Bucket, make sure that “File Listing” is “Rest If you want to use Scaleway Object Storage, we strongly recommend you create a Scaleway project dedicated to your Mastodon instance assets and use a custom IAM policy. -First, create a new Scaleway project, in which you create your object storage bucket. You need to set your bucket visibility to "Private" to not allow objects to be listed. +First, create a new Scaleway project, in which you create your object storage bucket. +You need to set your bucket visibility to "Private" to not allow objects to be listed. ![](/assets/object-storage/scaleway-bucket.png) @@ -280,7 +284,8 @@ This policy needs to have one rule, allowing it to read, write and delete object Then head to the IAM Applications page, and create a new one (eg `my-mastodon-instance`) and select the policy you created above. -Finally, click on the application you just created, then "API Keys", and create a new API key to use in your instance configuration. You should use the "Yes, set up preferred Project" option and select the project you created above as the default project for this key. +Finally, click on the application you just created, then "API Keys", and create a new API key to use in your instance configuration. +You should use the "Yes, set up preferred Project" option and select the project you created above as the default project for this key. 
![](/assets/object-storage/scaleway-api-key.png) @@ -298,7 +303,8 @@ On Mastodon's side, you need to set `S3_FORCE_SINGLE_REQUEST=true` to properly h ### Cloudflare R2 -Cloudflare R2 does not support ACLs, so Mastodon needs to be instructed not to try setting them. To do that, set the `S3_PERMISSION` environment variable to an empty string. +Cloudflare R2 does not support ACLs, so Mastodon needs to be instructed not to try setting them. +To do that, set the `S3_PERMISSION` environment variable to an empty string. {{< hint style="warning" >}} Without support for ACLs, media files from temporarily-suspended users will remain accessible. From 34417ba0a6ef6599814217d2e47f16117761634d Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 13:55:47 -0500 Subject: [PATCH 07/15] client access --- content/en/admin/optional/object-storage.md | 115 +++++++++++++------- 1 file changed, 77 insertions(+), 38 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index 6232696d2..ff833eddd 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -12,68 +12,98 @@ User-uploaded files can be stored on the main server's file system, or using an By default, Mastodon will store user uploaded and federated media files on the server's file system, under `public/system` in its installation directory and the files are served at `https://example.com/system`. {{< hint style="info" >}} -While using the server's file system is perfectly serviceable for small servers, using external object storage is more scalable. +While using the server's file system is perfectly serviceable for small servers with a handful of users, using external object storage is more scalable. {{</ hint >}} -To enable S3 storage, start by setting the `S3_ENABLED` environment variable to `true`.
+### Backend Variables -### Access Control +The variables define how Mastodon communicates with your backend S3 storage provider. +It is important to note that even though are many references to AWS as the default provider, many different storage providers are able to be consumed by Mastodon including AWS S3, DigitalOcean Spaces, Cloudflare R2, Wasabi, MinIO, Exoscale, Scaleway, OVH, or any other other S3-compatible provider. -When using an S3-compatible object storage backend, it is recommended to use a backend with ACL support, as it allows Mastodon to quickly improve the security of private data. +Please refer to your provider's documentation for assistance in identifying the proper settings for many of these options. -Mastodon sends URLs to the web interface, Mastodon API clients, and ActivityPub servers for all 'read' operations. -As a result, those operations are anonymous (no authentication or authorization needed) and use plain HTTP GET methods, which means they can be routed through reverse proxies and CDNs, and can be cached. -It also means that those URLs can contain host/domain names which are entirely different from those used by the S3 storage provider itself, if desired. -See the detailed documentation below which describes how those URLs are constructed and which environment variables are involved. +#### `S3_ENABLED` -### Required Environment Variables +Defaults to `false`, must be set to `true` to enable S3 storage. 
-- `S3_REGION` (defaults to 'us-east-1', required if using AWS S3, may not be required with other storage providers) -- `S3_ENDPOINT` (defaults to 's3..amazonaws.com', required if not using AWS S3) -- `S3_BUCKET=mastodata` (replacing `mastodata` with the name of your bucket) -- `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` need to be set to your credentials -- `S3_SIGNATURE_VERSION` (defaults to 'v4', should be compatible with most storage providers) -- `S3_OVERRIDE_PATH_STYLE` (only used if `S3_ENDPOINT` is configured, set this to `true` if the storage provider requires API operations to be sent to '.` (domain-style)) +#### `S3_BUCKET` -### Environment variables for client access to media objects +Must be set to the name of the bucket hosted by your S3 provider. -- `S3_PROTOCOL` (defaults to `https`) -- `S3_HOSTNAME` (defaults to 's3-.amazonaws.com', required if not using AWS S3 and `S3_ALIAS_HOST` is not set) -- `S3_ALIAS_HOST` (can be used instead of `S3_HOSTNAME` if you do not want `S3_BUCKET` to be included in the media URLs, and requires that you have provisioned a reverse proxy or CDN in front of the storage provider) +Default: _None_ -As noted above, Mastodon will send URLs to clients when they need to access media objects from the storage provider. The URLs are constructed as follows: +#### `S3_REGION` -- If `S3_ALIAS_HOST` is not set, then the URL will be - ':////\' +Defaults to `us-east-1` (AWS) but will be specific to where your S3 bucket was created. -- If `S3_ALIAS_HOST` is set, then the URL will be - ':///\' +#### `S3_ENDPOINT` -It is important to note that when `S3_ALIAS_HOST` is set, the bucket name is **not** included in the generated URL; this means the bucket name must be included in `S3_ALIAS_HOST` (referred to as 'domain-style' object access), or that `S3_ALIAS_HOST` must point to a reverse proxy or CDN which can include the bucket name in the URL it uses to send the request onward to the storage provider. 
-This type of configuration allows you to 'hide' the usage of the storage provider from the instance's clients, which means you can change storage providers without changing the resulting URLs. +Defaults to `s3..amazonaws.com` (AWS) but if using a different provider will need to be set to the specific target where Mastodon connects to perform API operations. -In addition to hiding the usage of the storage provider, this can also allow you to cache the media after retrieval from the storage provider, reducing egress bandwidth costs from the storage provider. -This can be done in your own reverse proxy, or by using a CDN. +#### `AWS_ACCESS_KEY_ID` -{{< page-ref page="admin/optional/object-storage-proxy.md" >}} +No default value, must be setup on your S3 provider. + +#### `AWS_SECRET_ACCESS_KEY` + +No default value, must be setup on your S3 provider. + +### Client Access Variables + +Once S3 file storage is enabled, Mastodon will provide new URLs from the web interface, Mastodon API clients, and to other ActivityPub servers for all media 'read' operations. +Accessing these URLs does not require authentication, using plain HTTP GET methods, which means they can be routed and/or cached through reverse proxies and CDNs. +In addition to hiding the usage of the storage provider, with proper configuration you can reduce egress bandwidth costs from the storage provider. +It also means that those URLs can contain host/domain names which are entirely different from those used by the S3 storage provider itself, if desired. {{< hint style="info" >}} -You must serve the files with CORS headers, otherwise some functions of Mastodon's web UI will not work. For example, `Access-Control-Allow-Origin: *` +Remember you must serve the files with proper CORS headers, otherwise media may not be visible in the user's browser and some functions of Mastodon's web UI will not work. 
For example, `Access-Control-Allow-Origin: *` {{</ hint >}} -### Optional environment variables +It is highly reccomended that you consider using a domain (or subdomain) you control, for delivery of S3 stored media. +Instead of delivering media from an address like `https://s3-us-east-1.amazonaws.com/example-mastodon-bucket/image.jpg` with the proper configuration it can come from something like `https://files.example.com/image.jpg`. +This allows flexibility should you decide to change S3 providers at a later date, especially where the address for your file storage has already federated to other servers for older posts, which may lead to those files being no longer accessible if you need to change this address. + +{{< page-ref page="admin/optional/object-storage-proxy.md" >}} + +#### `S3_ALIAS_HOST` + +- If `S3_ALIAS_HOST` is not set, then the URL will be `:////`. +- If `S3_ALIAS_HOST` is set, then the URL will be `:///`. + +#### `S3_PROTOCOL` + +Defaults to `https`, which generally should not be changed. + +#### `S3_HOSTNAME` + +Defaults to `s3-.amazonaws.com`, required if not using AWS S3 and `S3_ALIAS_HOST` is not set. + +### Additional Variables + +#### `S3_SIGNATURE_VERSION` + +The signature version used to authenticate and authorize requests to the S3 provider. + +Default: `v4` + +#### `S3_OVERRIDE_PATH_STYLE` + +Set this to `true` if the storage provider requires API operations to be sent to `.` (domain-style). +Only used if `S3_ENDPOINT` is also configured. + +Default: `false` #### `S3_OPEN_TIMEOUT` The number of seconds before the HTTP handler should timeout while trying to open a new HTTP session. -Default: `5` (seconds) +Default: `5` #### `S3_READ_TIMEOUT` The number of seconds before the HTTP handler should timeout while waiting for an HTTP response.
-Default: `5` (seconds) +Default: `5` #### `S3_FORCE_SINGLE_REQUEST` @@ -96,21 +126,27 @@ Default: `STANDARD` #### `S3_MULTIPART_THRESHOLD` -Objects of this size and smaller will be uploaded in a single operation, but larger objects will be uploaded using the multipart chunking mechanism, which can improve transfer speeds and reliability. +The maximum size (in megabytes) of objects that will be uploaded in a single operation. +Objects above this threshold will be uploaded using the multipart chunking mechanism, which can improve transfer speeds and reliability. -Default: `15` (megabytes) +Default: `15` #### `S3_PERMISSION` -Defines the S3 object ACL when uploading new files. Use caution when using [S3 Block Public Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) and turning on the `BlockPublicAcls` option, as uploading objects with ACL `public-read` will fail (403). -In that case, set `S3_PERMISSION` to `private`. +Defines the S3 object ACL when uploading new files. +When using an S3-compatible object storage backend, it is recommended to use a backend with ACL support, as it allows Mastodon to quickly improve the security of private data. + +{{< hint style="danger" >}} +Use caution when using [S3 Block Public Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) and turning on the `BlockPublicAcls` option, as uploading objects with ACL `public-read` will fail (403). +In that configuration you should set `S3_PERMISSION` to `private`. +{{</ hint >}} Default: `public-read` {{< hint style="danger" >}} Regardless of the ACL configuration, your S3 bucket must be set up to ensure that all objects are publicly readable but not writable or listable. At the same time, Mastodon itself should have write access to the bucket. -This configuration is generally consistent across all S3 providers, and common ones are highlighted below.
+This configuration is generally consistent across all S3 providers. {{</ hint >}} #### `S3_BATCH_DELETE_LIMIT` @@ -131,7 +167,10 @@ Default: `3` ### MinIO MinIO is an open-source implementation of an S3 object provider. + +{{< hint style="info" >}} Installing MinIO is outide the scope of this documentation, but this should show how to configure a bucket for use in Mastodon. +{{</ hint >}} You need to set a policy for anonymous access that allows read-only access to objects contained by the bucket without allowing listing them. To do this, you need to set a custom policy (replace `mastodata` with the actual name of your S3 bucket): From 11a1e92469e3fadaefc9c7db0d44002062e99a65 Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 14:02:55 -0500 Subject: [PATCH 08/15] headers --- content/en/admin/optional/object-storage.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index ff833eddd..16701ff2d 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -15,6 +15,8 @@ By default, Mastodon will store user uploaded and federated media files on the s While using the server's file system is perfectly serviceable for small servers with a handful of users, using external object storage is more scalable. {{</ hint >}} +## Configuration Options + ### Backend Variables The variables define how Mastodon communicates with your backend S3 storage provider. @@ -80,6 +82,8 @@ Defaults to `s3-.amazonaws.com`, required if not using AWS S3 and `S3 ### Additional Variables +Due to the large number of S3 provider options, but inconsistencies in how they implement the S3 API, there may be some tuning required specific to your implemention. + #### `S3_SIGNATURE_VERSION` The signature version used to authenticate and authorize requests to the S3 provider.
From a2fa30ac99c32cb87aa86bcf5364c8742a7a0ee2 Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 14:08:38 -0500 Subject: [PATCH 09/15] default formatting --- content/en/admin/optional/object-storage.md | 39 ++++++++++++++------- 1 file changed, 27 insertions(+), 12 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index 16701ff2d..bd9960cd1 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -26,7 +26,9 @@ Please refer to your provider's documentation for assistance in identifying the #### `S3_ENABLED` -Defaults to `false`, must be set to `true` to enable S3 storage. +Must be set to `true` to enable S3 storage. + +Default: `false` #### `S3_BUCKET` @@ -36,24 +38,30 @@ Default: _None_ #### `S3_REGION` -Defaults to `us-east-1` (AWS) but will be specific to where your S3 bucket was created. +The S3 region where your bucket was created. +May not be required by all providers. + +Default: `us-east-1` #### `S3_ENDPOINT` -Defaults to `s3..amazonaws.com` (AWS) but if using a different provider will need to be set to the specific target where Mastodon connects to perform API operations. +The specific S3 target where Mastodon connects to perform API operations. + +Default: `s3..amazonaws.com` #### `AWS_ACCESS_KEY_ID` -No default value, must be setup on your S3 provider. +_No default value, must be setup on your S3 provider._ #### `AWS_SECRET_ACCESS_KEY` -No default value, must be setup on your S3 provider. +_No default value, must be setup on your S3 provider._ ### Client Access Variables Once S3 file storage is enabled, Mastodon will provide new URLs from the web interface, Mastodon API clients, and to other ActivityPub servers for all media 'read' operations. Accessing these URLs does not require authentication, using plain HTTP GET methods, which means they can be routed and/or cached through reverse proxies and CDNs. 
+ In addition to hiding the usage of the storage provider, with proper configuration you can reduce egress bandwidth costs from the storage provider. It also means that those URLs can contain host/domain names which are entirely different from those used by the S3 storage provider itself, if desired. @@ -63,22 +71,29 @@ Remember you must serve the files with proper CORS headers, otherwise media may It is highly reccomended that you consider using a domain (or subdomain) you control, for delivery of S3 stored media. Instead of delivering media from an address like `https://s3-us-east-1.amazonaws.com/example-mastodon-bucket/image.jpg` with the proper configuration it can come from something like `https://files.example.com/image.jpg`. + This allows flexibility should you decide to change S3 providers at a later date, especially where the address for your file storage has already federated to other servers for older posts, which may lead to those files being no longer accessible if you need to change this address. {{< page-ref page="admin/optional/object-storage-proxy.md" >}} #### `S3_ALIAS_HOST` -- If `S3_ALIAS_HOST` is not set, then the URL will be `:////`. -- If `S3_ALIAS_HOST` is set, then the URL will be `:///`. +- If `S3_ALIAS_HOST` is not set, then the media access URL will be `:////`. +- If `S3_ALIAS_HOST` is set, then the media access URL will be `:///`. + +Default: _None_ #### `S3_PROTOCOL` -Defaults to `https`, which generally should not be changed. +Generally should not be changed from the default of HTTPS. + +Default: `https` #### `S3_HOSTNAME` -Defaults to `s3-.amazonaws.com`, required if not using AWS S3 and `S3_ALIAS_HOST` is not set. +Required if not using AWS S3 and `S3_ALIAS_HOST` is not set. + +Default: `s3-.amazonaws.com` ### Additional Variables @@ -140,14 +155,14 @@ Default: `15` Defines the S3 object ACL when uploading new files. 
When using an S3-compatible object storage backend, it is recommended to use a backend with ACL support, as it allows Mastodon to quickly improve the security of private data. +Default: `public-read` + {{< hint style="danger" >}} Use caution when using [S3 Block Public Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) and turning on the `BlockPublicAcls` option, as uploading objects with ACL `public-read` will fail (403). In that configuration you should set `S3_PERMISSION` to `private`. {{</ hint >}} -Default: `public-read` - -{{< hint style="danger" >}} +{{< hint style="info" >}} Regardless of the ACL configuration, your S3 bucket must be set up to ensure that all objects are publicly readable but not writable or listable. At the same time, Mastodon itself should have write access to the bucket. This configuration is generally consistent across all S3 providers. {{</ hint >}} From b4edb914767c39b3ea8d69b50d4ec5312dd88c90 Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 14:10:41 -0500 Subject: [PATCH 10/15] capital --- content/en/admin/optional/object-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index bd9960cd1..a1c3be867 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -1,5 +1,5 @@ --- -title: Object storage +title: Object Storage description: Serving user-uploaded files in Mastodon using external object storage menu: docs: From 5ee7f5f0b7eb07bf19c14d2802e14c3666bdef8b Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 14:50:30 -0500 Subject: [PATCH 11/15] rewrite for clarity on some paragraphs --- content/en/admin/optional/object-storage.md | 26 +++++++++++---------- 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/content/en/admin/optional/object-storage.md
b/content/en/admin/optional/object-storage.md index a1c3be867..aca80e5e2 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -19,10 +19,9 @@ While using the server's file system is perfectly serviceable for small servers ### Backend Variables -The variables define how Mastodon communicates with your backend S3 storage provider. -It is important to note that even though are many references to AWS as the default provider, many different storage providers are able to be consumed by Mastodon including AWS S3, DigitalOcean Spaces, Cloudflare R2, Wasabi, MinIO, Exoscale, Scaleway, OVH, or any other other S3-compatible provider. +These variables specify how Mastodon connects to your backend S3 storage provider. While AWS is mentioned as the default, Mastodon can work with various providers like AWS S3, DigitalOcean Spaces, Cloudflare R2, Wasabi, MinIO, Exoscale, Scaleway, OVH, or any other S3-compatible provider. -Please refer to your provider's documentation for assistance in identifying the proper settings for many of these options. +Consult your provider's documentation for help in setting up these options correctly. #### `S3_ENABLED` @@ -59,25 +58,28 @@ _No default value, must be setup on your S3 provider._ ### Client Access Variables -Once S3 file storage is enabled, Mastodon will provide new URLs from the web interface, Mastodon API clients, and to other ActivityPub servers for all media 'read' operations. -Accessing these URLs does not require authentication, using plain HTTP GET methods, which means they can be routed and/or cached through reverse proxies and CDNs. +Once S3 file storage is enabled, Mastodon will provide new URLs for all media 'read' operations. +These URLs can be accessed using plain HTTP GET methods, without requiring authentication. +This means that they can be routed and/or cached through reverse proxies and CDNs. 
-In addition to hiding the usage of the storage provider, with proper configuration you can reduce egress bandwidth costs from the storage provider. -It also means that those URLs can contain host/domain names which are entirely different from those used by the S3 storage provider itself, if desired. +By properly configuring the URLs, you can hide the usage of the storage provider and reduce egress bandwidth costs. +You can also use host/domain names that are different from those used by the S3 storage provider itself. {{< hint style="info" >}} -Remember you must serve the files with proper CORS headers, otherwise media may not be visible in the user's browser and some functions of Mastodon's web UI will not work. For example, `Access-Control-Allow-Origin: *` +Remember to serve the files with proper CORS headers, such as `Access-Control-Allow-Origin: *`, to ensure media visibility in the user's browser and proper functioning of Mastodon's web UI. {{</ hint >}} -It is highly reccomended that you consider using a domain (or subdomain) you control, for delivery of S3 stored media. -Instead of delivering media from an address like `https://s3-us-east-1.amazonaws.com/example-mastodon-bucket/image.jpg` with the proper configuration it can come from something like `https://files.example.com/image.jpg`. +It is highly recommended to use a domain (or subdomain) that you control for delivering S3 stored media. -This allows flexibility should you decide to change S3 providers at a later date, especially where the address for your file storage has already federated to other servers for older posts, which may lead to those files being no longer accessible if you need to change this address. +This provides flexibility in case you decide to change S3 providers in the future. It also ensures that the address for your file storage, which may have already federated to other servers for older posts, remains accessible even if you need to change the storage provider's address.
{{< page-ref page="admin/optional/object-storage-proxy.md" >}} #### `S3_ALIAS_HOST` +Instead of using an address like `https://s3-us-east-1.amazonaws.com/example-mastodon-bucket/image.jpg`, you can configure it to be delivered from something like `https://files.example.com/image.jpg`. +In this example, `S3_ALIAS_HOST` would be set to `files.example.com` and constructed as shown: + - If `S3_ALIAS_HOST` is not set, then the media access URL will be `:////`. - If `S3_ALIAS_HOST` is set, then the media access URL will be `:///`. @@ -164,7 +166,7 @@ In that configuration you should set `S3_PERMISSION` to `private`. {{< hint style="info" >}} Regardless of the ACL configuration, your S3 bucket must be set up to ensure that all objects are publicly readable but not writable or listable. -At the same time, Mastodon itself should have write access to the bucket. +Mastodon itself should also have write access to the bucket. This configuration is generally consistent across all S3 providers. {{</ hint >}} From ca72ed348fa59bece9a2f59fb0c787a4bc9dca86 Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 15:12:01 -0500 Subject: [PATCH 12/15] access keys --- content/en/admin/optional/object-storage.md | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index aca80e5e2..71b977125 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -31,30 +31,39 @@ Default: `false` #### `S3_BUCKET` -Must be set to the name of the bucket hosted by your S3 provider. +The name of the S3 bucket at your provider. Default: _None_ #### `S3_REGION` The S3 region where your bucket was created. Used to help construct `S3_ENDPOINT` when using AWS, but not required by other providers.
Default: `us-east-1` #### `S3_ENDPOINT` The specific S3 target where Mastodon connects to perform API operations. +Used in conjuction with `S3_REGION` when using AWS, but should be specifically set when using other providers. Default: `s3..amazonaws.com` #### `AWS_ACCESS_KEY_ID` -_No default value, must be setup on your S3 provider._ +Effectively this is the API username for the S3 provider. +This is created/assigned to you by your S3 provider. +Despite the name it is not AWS specific. + +Default: _None_ #### `AWS_SECRET_ACCESS_KEY` -_No default value, must be setup on your S3 provider._ +Effectively this is the API password for the S3 provider. +This is created/assigned to you by your S3 provider. +Despite the name it is not AWS specific. + +Default: _None_ ### Client Access Variables From fde2397e9bb14ab74a568eec08d19a319940b1f8 Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 15:42:11 -0500 Subject: [PATCH 13/15] bold defaults --- content/en/admin/optional/object-storage.md | 44 ++++++++++----------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index 71b977125..fd039a268 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -27,27 +27,27 @@ Consult your provider's documentation for help in setting up these options corre Must be set to `true` to enable S3 storage. -Default: `false` +**Default:** `false` #### `S3_BUCKET` The name of the S3 bucket at your provider. -Default: _None_ +**Default:** _None_ #### `S3_REGION` The S3 region where your bucket was created. Used to help construct `S3_ENDPOINT` when using AWS, but not required by other providers. -Default: `us-east-1` +**Default:** `us-east-1` #### `S3_ENDPOINT` The specific S3 target where Mastodon connects to perform API operations. 
Used in conjuction with `S3_REGION` when using AWS, but should be specifically set when using other providers. -Default: `s3..amazonaws.com` +**Default:** `s3..amazonaws.com` #### `AWS_ACCESS_KEY_ID` @@ -55,7 +55,7 @@ Effectively this is the API username for the S3 provider. This is created/assigned to you by your S3 provider. Despite the name it is not AWS specific. -Default: _None_ +**Default:** _None_ #### `AWS_SECRET_ACCESS_KEY` @@ -63,7 +63,7 @@ Effectively this is the API password for the S3 provider. This is created/assigned to you by your S3 provider. Despite the name it is not AWS specific. -Default: _None_ +**Default:** _None_ ### Client Access Variables @@ -89,22 +89,22 @@ This provides flexibility in case you decide to change S3 providers in the futur Instead of using an address like `https://s3-us-east-1.amazonaws.com/example-mastodon-bucket/image.jpg`, you can configure it to be delivered from something like `https://files.example.com/image.jpg`. In this example, `S3_ALIAS_HOST` would be set to `files.example.com` and constructed as shown: -- If `S3_ALIAS_HOST` is not set, then the media access URL will be `:////`. -- If `S3_ALIAS_HOST` is set, then the media access URL will be `:///`. +- If `S3_ALIAS_HOST` is not set, then the media access URL will be `:////` +- If `S3_ALIAS_HOST` is set, then the media access URL will be `:///` -Default: _None_ +**Default:** _None_ #### `S3_PROTOCOL` Generally should not be changed from the default of HTTPS. -Default: `https` +**Default:** `https` #### `S3_HOSTNAME` Required if not using AWS S3 and `S3_ALIAS_HOST` is not set. -Default: `s3-.amazonaws.com` +**Default:** `s3-.amazonaws.com` ### Additional Variables @@ -114,59 +114,59 @@ Due to the large number of S3 provider options, but inconsistencies in how they The signature version used to authenticate and authorize requests to the S3 provider. 
-Default: `v4` +**Default:** `v4` #### `S3_OVERRIDE_PATH_STYLE` Set this to `true` if the storage provider requires API operations to be sent to `.` (domain-style). Only used if `S3_ENDPOINT` is also configured. -Default: `false` +**Default:** `false` #### `S3_OPEN_TIMEOUT` The number of seconds before the HTTP handler should timeout while trying to open a new HTTP session. -Default: `5` +**Default:** `5` #### `S3_READ_TIMEOUT` The number of seconds before the HTTP handler should timeout while waiting for an HTTP response. -Default: `5` +**Default:** `5` #### `S3_FORCE_SINGLE_REQUEST` Set this to `true` if you run into trouble processing large files. -Default: `false` +**Default:** `false` #### `S3_ENABLE_CHECKSUM_MODE` Enables verification of object checksums when Mastodon is retrieving an object from the storage provider. This feature is available in AWS S3 but may not be available in other S3-compatible implementations. -Default: `false` +**Default:** `false` #### `S3_STORAGE_CLASS` When using AWS S3, this variable can be set to one of the [storage class](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html) options which influence the storage selected for uploaded objects (and thus their access times and costs). If no storage class is specified then AWS S3 will use the `STANDARD` class, but options include `REDUCED_REDUNDANCY`, `GLACIER`, and others. -Default: `STANDARD` +**Default:** `STANDARD` #### `S3_MULTIPART_THRESHOLD` The maximum size (in megabytes) of objects that will be uploaded in a single operation. Objects above this threshold will be uploaded using the multipart chunking mechanism, which can improve transfer speeds and reliability. -Default: `15` +**Default:** `15` #### `S3_PERMISSION` Defines the S3 object ACL when uploading new files. When using an S3-compatible object storage backend, it is recommended to use a backend with ACL support, as it allows Mastodon to quickly improve the security of private data. 
-Default: `public-read` +**Default:** `public-read` {{< hint style="danger" >}} Use caution when using [S3 Block Public Access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) and turning on the `BlockPublicAcls` option, as uploading objects with ACL `public-read` will fail (403). @@ -183,14 +183,14 @@ This configuration is generally consistent across all S3 providers. The official [Amazon S3 API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) can handle deleting 1,000 objects in one batch job, but some providers may have issues handling this many in one request, or offer lower limits. -Default: `1000` +**Default:** `1000` #### `S3_BATCH_DELETE_RETRY` During batch delete operations, S3 providers may perodically fail or timeout while processing deletion requests. Mastodon will back off and retry the request up to this maximum number of times. -Default: `3` +**Default:** `3` ## Provider Specific Configurations From 9b074697a14fe82fa0c201a51337a5e87a64b6ac Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 15:48:26 -0500 Subject: [PATCH 14/15] caching explination --- content/en/admin/optional/object-storage.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index fd039a268..a3aa1e423 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -71,16 +71,17 @@ Once S3 file storage is enabled, Mastodon will provide new URLs for all media 'r These URLs can be accessed using plain HTTP GET methods, without requiring authentication. This means that they can be routed and/or cached through reverse proxies and CDNs. -By properly configuring the URLs, you can hide the usage of the storage provider and reduce egress bandwidth costs. 
-You can also use host/domain names that are different from those used by the S3 storage provider itself. - {{< hint style="info" >}} Remember to serve the files with proper CORS headers, such as `Access-Control-Allow-Origin: *`, to ensure media visibility in the user's browser and proper functioning of Mastodon's web UI. {{}} It is highly recommended to use a domain (or subdomain) that you control for delivering S3 stored media. +This provides flexibility in case you decide to change S3 providers in the future. +By properly configuring the URLs, you can hide the usage of the storage provider and use caching to reduce egress bandwidth costs. +It also ensures that the address for your file storage, which may have already federated to other servers for older posts, remains accessible even if you need to change the storage provider's address. -This provides flexibility in case you decide to change S3 providers in the future. It also ensures that the address for your file storage, which may have already federated to other servers for older posts, remains accessible even if you need to change the storage provider's address. +Some S3 providers, such as DigitalOcean Spaces, provide integrated CDN/caching services as part of the S3 service. +For others, you will need to configure this manually or partner with another provider. 
{{< page-ref page="admin/optional/object-storage-proxy.md" >}} From 0ee843b8f5ef00e4bd2ed44711e5d79b21d69065 Mon Sep 17 00:00:00 2001 From: Michael Stanclift Date: Mon, 17 Jun 2024 15:53:32 -0500 Subject: [PATCH 15/15] move public data access info block --- content/en/admin/optional/object-storage.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/content/en/admin/optional/object-storage.md b/content/en/admin/optional/object-storage.md index a3aa1e423..188cc74d8 100644 --- a/content/en/admin/optional/object-storage.md +++ b/content/en/admin/optional/object-storage.md @@ -65,6 +65,12 @@ Despite the name it is not AWS specific. **Default:** _None_ +{{< hint style="info" >}} +The access id/key must provide Mastodon the ability to write data to your S3 bucket. +You must also set up your S3 bucket to ensure that all objects are publicly readable, but only writable or listable with proper authentication. +Consult your provider documentation for assistance. +{{}} + ### Client Access Variables Once S3 file storage is enabled, Mastodon will provide new URLs for all media 'read' operations. @@ -174,12 +180,6 @@ Use caution when using [S3 Block Public Access](https://docs.aws.amazon.com/Amaz In that configuration you should set `S3_PERMISSION` to `private`. {{}} -{{< hint style="info" >}} -Regardless of the ACL configuration, your S3 bucket must be set up to ensure that all objects are publicly readable but not writable or listable. -Mastodon itself should also have write access to the bucket. -This configuration is generally consistent across all S3 providers. -{{}} - #### `S3_BATCH_DELETE_LIMIT` The official [Amazon S3 API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) can handle deleting 1,000 objects in one batch job, but some providers may have issues handling this many in one request, or offer lower limits.