AWS S3 Synchronization

Synchronize an S3 bucket and a filesystem directory using

raco s3-sync ‹src› ‹dest›

where either ‹src› or ‹dest› should start with s3:// to identify a bucket and item name or prefix, while the other is a path in the local filesystem to a file or directory. Naturally, a / within a bucket item’s name corresponds to a directory separator in the local filesystem, and a trailing / syntactically indicates a prefix (as opposed to a complete item name).

A bucket item is ignored if its name ends in /. A bucket can contain an item whose name plus / is a prefix for other bucket items, in which case attempting to synchronize both from the bucket will produce an error, since the name cannot be used for both a file and a directory.
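The file-versus-directory conflict described above can be checked ahead of a download. The following Python sketch is only an illustration of the rule, not part of s3-sync; the function name find_conflicts and the sample item names are hypothetical.

```python
def find_conflicts(names):
    """Return item names that would have to be both a file and a directory."""
    # Items whose name ends in "/" are ignored, as described above.
    items = {n for n in names if not n.endswith("/")}
    conflicts = set()
    for name in items:
        parts = name.split("/")
        # Each proper ancestor ("a", "a/b", ...) becomes a local directory,
        # so it must not also be the name of an item.
        for i in range(1, len(parts)):
            ancestor = "/".join(parts[:i])
            if ancestor in items:
                conflicts.add(ancestor)
    return sorted(conflicts)


if __name__ == "__main__":
    # "docs" is an item and also a prefix of "docs/index.html".
    print(find_conflicts(["docs", "docs/index.html", "img/", "img/a.png"]))
```

An empty result means every item name can be mapped to a distinct local file.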

For example, to upload the content of ‹src-dir› with a prefix ‹dest-path› within ‹bucket›, use

raco s3-sync ‹src-dir› s3://‹bucket›/‹dest-path›

To download the items with prefix ‹src-path› within ‹bucket› to ‹dest-dir›, use

raco s3-sync s3://‹bucket›/‹src-path› ‹dest-dir›

If ‹src› refers to a directory or prefix (either syntactically or as determined by consulting the filesystem or bucket), ‹dest› cannot refer to a file or bucket item. If ‹dest› refers to a directory or prefix while ‹src› refers to a file or item, the ‹src› file or item name is implicitly added to the ‹dest› directory or prefix.
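As an illustration of these two rules, the sketch below computes the effective destination name. It is a simplified Python model, not the tool's implementation: the function resolve_dest and its "dir", "file", and "missing" kind arguments are hypothetical stand-ins for what the tool learns by consulting the filesystem or bucket, and both sides are written with / separators for simplicity.

```python
import posixpath


def resolve_dest(src, src_kind, dest, dest_kind):
    """Return the effective destination name, or raise ValueError.

    src_kind and dest_kind are "dir", "file", or "missing" (hypothetical
    labels for the result of consulting the filesystem or bucket).
    """
    # A trailing "/" marks a directory or prefix syntactically.
    src_is_dir = src_kind == "dir" or src.endswith("/")
    dest_is_dir = dest_kind == "dir" or dest.endswith("/")
    if src_is_dir:
        if dest_kind == "file":
            raise ValueError("cannot synchronize a directory or prefix to a single file or item")
        return dest
    if dest_is_dir:
        # The source file or item name is implicitly added to the destination.
        return posixpath.join(dest, posixpath.basename(src))
    return dest


if __name__ == "__main__":
    print(resolve_dest("build/index.html", "file", "site/", "missing"))
```

With a single source file and a destination that ends in /, the file name is appended; without the trailing / (and with no existing directory or prefix of that name), the destination is used as the complete name.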

The following options (supply them after s3-sync and before ‹src›) are supported:

--dry-run — report actions that would be taken, but don’t upload, download, delete, or change redirection rules.

--jobs ‹n› or -j ‹n› — perform up to ‹n› downloads or uploads in parallel.

--shallow — when downloading, constrain downloads to existing directories at ‹dest› (i.e., no additional subdirectories); in both upload and download modes, extract the current bucket state in a directory-like way (which is useful if the bucket contains many more nested items than the local filesystem).

--delete — delete destination items that have no corresponding source item.

--acl ‹acl› — when uploading, use ‹acl› for the access control list; for example, use public-read to make items public.

--reduced — when uploading, specify reduced-redundancy storage.

--check-metadata – when uploading, check whether an existing item has the metadata that would be uploaded (including access control), and adjust the metadata if not.

--include ‹regexp› — consider only items whose name within the S3 bucket matches ‹regexp›, where ‹regexp› uses “Perl-compatible” syntax.

--exclude ‹regexp› — do not consider items whose name within the S3 bucket matches ‹regexp› (even if they match an inclusion pattern).

--gzip ‹regexp› — on upload or for checking download hashes (to avoid unnecessary downloads), compress files whose name within the S3 bucket matches ‹regexp›.

--gzip-min ‹bytes› — when combined with --gzip, compress only files that are at least ‹bytes› in size.

++upload-metadata ‹name› ‹value› — includes ‹name› with ‹value› as metadata when uploading (without updating metadata for any file that is not uploaded). Metadata specified this way overrides metadata determined in other ways, except via ++upload-metadata-mapping. This flag can be specified multiple times to add multiple metadata entries.

++upload-metadata-mapping ‹file› — reads ‹file› to obtain a hash table that maps bucket-item names to a hash table of metadata, where a metadata hash table maps symbols to strings. Metadata supplied this way overrides metadata determined in other ways. This flag can be specified multiple times, and the mappings are merged so that later files override mappings supplied by earlier files.

--s3-hostname ‹hostname› — set the S3 hostname to ‹hostname› instead of s3.amazonaws.com.

--region ‹region› — set the S3 region to ‹region› (e.g., us-east-1) instead of issuing a query to locate the bucket’s region.

--error-links — report an error if a soft link is found; this is the default treatment of soft links.

--follow-links — follow soft links.

--redirect-links — treat soft links as redirection rules to be installed for ‹bucket› as a web site (upload only).

--redirects-links — treat soft links as individual redirections to be installed as metadata on a ‹bucket›’s item, while the item itself is made empty (upload only).

--ignore-links — ignore soft links.

--web — sets defaults to public-read access, reduced redundancy, compression for ".html", ".css", ".js", and ".svg" files that are 1K or larger, Cache-Control "max-age=0, no-cache" metadata for most files, and Cache-Control "max-age=31536000, public" metadata for files with the following suffixes: ".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico", or ".woff".

1 S3 Synchronization API

The s3-sync library uses aws/s3, so use ensure-have-keys, s3-host, and s3-region before calling s3-sync.

Synchronizes the content of local-path and s3-path within s3-bucket, where s3-path can be #f to indicate an upload to the bucket with no prefix path. If upload? is true, s3-bucket at s3-path is changed to have the content of local-path, otherwise local-path is changed to have the content of s3-bucket at s3-path.

Typically, local-path refers to a directory and s3-path refers to a prefix for bucket item names. If local-path refers to a file and upload? is true, then a single file is synchronized to s3-bucket at s3-path. In that case, if s3-path ends with a / or it is already used as a prefix for bucket items, then the file name of local-path is added to s3-path to form the uploaded item’s name; otherwise, s3-path names the uploaded item. If upload? is #f and s3-path is an item name (and not a prefix on other item names), then a single bucket item is downloaded to local-path; if local-path refers to a directory, then the portion of s3-path after the last / is used as the downloaded file name.

If shallow? is true, then in download mode, bucket items are downloaded only when they correspond to directories that exist already in local-path (which is useful when local-path refers to a directory). In both download and upload modes, a true value of shallow? causes the state of s3-bucket to be queried in a directory-like way, exploring only relevant directories; that exploration can be faster than querying the full content of s3-bucket if it contains many more nested items (with the prefix s3-path) than files within local-path.
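The download half of the shallow? rule can be pictured as a filter on item names. The Python sketch below is a loose model under stated assumptions, not the library's code: shallow_wants is a hypothetical helper, item names are taken relative to s3-path, and existing_dirs holds the directories that already exist under local-path (also relative, with / separators).

```python
import posixpath


def shallow_wants(item_rel, existing_dirs):
    """Decide whether a bucket item is downloaded in shallow mode.

    An item is kept only if the directory that would contain it already
    exists locally; items directly under the root are always kept.
    """
    parent = posixpath.dirname(item_rel)
    return parent == "" or parent in existing_dirs


if __name__ == "__main__":
    dirs = {"css"}
    for name in ["index.html", "css/site.css", "img/logo.png"]:
        print(name, shallow_wants(name, dirs))
```

Under this model, an item such as img/logo.png is skipped when no img directory exists locally, so no new subdirectories are created.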

If check-metadata? is true, then in upload mode, bucket items are checked to ensure that the current metadata matches the metadata that would be uploaded, and the bucket item’s metadata is adjusted if not.

If dry-run? is true, then actions needed for synchronization are reported via log, but no uploads, downloads, deletions, or redirection-rule updates are performed.

If jobs is more than 1, then downloads and uploads proceed in background threads.

If delete? is true, then destination items that have no corresponding item at the source are deleted.

If include-rx is not #f, then it is matched against bucket paths (including s3-path in the path). Only items that match the regexp are considered for synchronization. If exclude-rx is not #f, then any item whose path matches is not considered for synchronization (even if it also matches a provided include-rx).
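The interaction of the two patterns can be sketched in a few lines of Python. This only illustrates the precedence described above: is_considered is a hypothetical name, and Python's re syntax is close to, but not identical with, the regexp syntax that Racket accepts.

```python
import re


def is_considered(path, include_rx=None, exclude_rx=None):
    """Return True if a bucket path takes part in synchronization."""
    # With an inclusion pattern, only matching paths are considered.
    if include_rx is not None and not re.search(include_rx, path):
        return False
    # An exclusion match wins even when the inclusion pattern also matches.
    if exclude_rx is not None and re.search(exclude_rx, path):
        return False
    return True


if __name__ == "__main__":
    for p in ["site/a.html", "site/draft.html", "site/a.css"]:
        print(p, is_considered(p, r"\.html$", r"draft"))
```

Note that the match is against the full bucket path, including the s3-path prefix, so a pattern anchored with ^ must account for that prefix.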

If make-call-with-file-stream is not #f, it is called to get a function that acts like call-with-input-file to get the content of a file for upload or for hashing. The arguments to make-call-with-file-stream are the S3 name and the local file path. If make-call-with-file-stream or its result is #f, then call-with-input-file is used. See also make-gzip-handlers.

If get-content-type is not #f, it is called to get the Content-Type field for each file on upload. The arguments to get-content-type are the S3 name and the local file path. If get-content-type or its result is #f, then a default value is used based on the file extension (e.g., "text/css" for a "css" file).

The get-content-encoding argument is like get-content-type, but for the Content-Encoding field. If no encoding is provided for an item, a Content-Encoding field is omitted on upload. Note that the Content-Encoding field of an item can affect the way that it is downloaded from a bucket; for example, a bucket item whose encoding is "gzip" will be uncompressed on download, even though the item’s hash (which is used to avoid unnecessary downloads) is based on the encoded content.

If acl is not #f, then it is used as the S3 access control list on upload. For example, supply "public-read" to make items public for reading. More specifically, if acl is not #f, then 'x-amz-acl is set to acl in upload-metadata (if it is not set already).

If reduced-redundancy? is true, then items are uploaded to S3 with reduced-redundancy storage (which costs less, so it is suitable for files that are backed up elsewhere). More specifically, if reduced-redundancy? is true, then 'x-amz-storage-class is set to "REDUCED_REDUNDANCY" in upload-metadata (if it is not set already).

The upload-metadata hash table provides metadata to include with any file upload (and only to files that are otherwise determined to need uploading). The upload-metadata-mapping provides a mapping from bucket item names to metadata that adds and overrides metadata for the specific item.
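One way to read the precedence among acl, reduced-redundancy?, upload-metadata, and upload-metadata-mapping is sketched below in Python. It is an interpretation, not the library's code: effective_metadata is a hypothetical helper, metadata keys are plain strings rather than symbols, and the assumption that item-specific metadata is merged key by key on top of the general metadata is labeled as such in the code.

```python
def effective_metadata(item, upload_metadata, mapping, acl=None, reduced_redundancy=False):
    """Compute the metadata sent with one uploaded item (illustrative only)."""
    meta = dict(upload_metadata)
    # acl and reduced-redundancy? fill in entries only if not set already.
    if acl is not None:
        meta.setdefault("x-amz-acl", acl)
    if reduced_redundancy:
        meta.setdefault("x-amz-storage-class", "REDUCED_REDUNDANCY")
    # Assumption: item-specific metadata adds to and overrides the rest,
    # key by key.
    meta.update(mapping.get(item, {}))
    return meta


if __name__ == "__main__":
    general = {"Cache-Control": "max-age=0, no-cache"}
    per_item = {"a.css": {"Cache-Control": "max-age=31536000, public"}}
    print(effective_metadata("a.css", general, per_item, acl="public-read"))
    print(effective_metadata("b.html", general, per_item, acl="public-read"))
```

In this model an explicit 'x-amz-acl entry in upload-metadata takes priority over acl, while an entry in the mapping takes priority over both.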

The link-mode argument determines the treatment of soft links in local-path:

'error — reports an error

'follow — follows soft links (i.e., treats each link as the file or directory that it refers to)

'redirect — treat it as a redirection rule to be installed for s3-bucket as a web site on upload

'redirects — treat it as a redirection rule to be installed for s3-bucket’s item as metadata on upload, while the item itself is uploaded as empty

'ignore — ignore

The log-info and raise-error arguments determine how progress is logged and errors are reported. The default log-info function logs the given string at the 'info level to a logger whose name is 's3-sync.

Changed in version 1.2 of package s3-sync: Added 'redirects mode.

Changed in version 1.3: Added the upload-metadata argument.

Changed in version 1.4: Added support for a single file as local-path and a bucket item name as s3-path.

Changed in version 1.5: Added the check-metadata? argument.

Changed in version 1.6: Added the upload-metadata-mapping argument.

Changed in version 1.7: Changed upload-metadata-mapping to allow a procedure.

2 S3 gzip Support

Returns values that are suitable as the #:make-call-with-input-file and #:get-content-encoding arguments to s3-sync to compress items whose name within the bucket matches pattern and whose local file size is at least min-size bytes.

3 S3 Web Page Support

Added in version 1.3 of package s3-sync.

Accepts the same arguments as s3-sync, but adapts the defaults to be suitable for web-page uploads:

#:acl — defaults to web-acl

#:reduced-redundancy? — defaults to web-reduced-redundancy?

#:upload-metadata — defaults to web-upload-metadata

#:upload-metadata-mapping — defaults to web-upload-metadata-mapping

#:make-call-with-input-file — defaults to a gzip of files that match web-gzip-rx and web-gzip-min-size

#:get-content-encoding — defaults to a gzip of files that match web-gzip-rx and web-gzip-min-size

4 S3 Web Page Configuration

Added in version 1.3 of package s3-sync.

The default access control list for web content, currently "public-read".

The default storage mode for web content, currently #t.

Default metadata for web content, currently (hash 'Cache-Control "max-age=0, no-cache").

Item-specific metadata for web content, currently produces (hash 'Cache-Control "max-age=31536000, public") for an item that ends in ".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico", or ".woff".

Added in version 1.7 of package s3-sync.

Changed in version 1.8: Added ".woff".

Default regexp for paths to be gzipped, currently #rx"[.](html|css|js|svg)$".

Default minimum size for files to be gzipped, currently 1024.
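Taken together, the web defaults amount to two per-item decisions: which Cache-Control value to attach and whether to gzip. The Python sketch below restates them for illustration only. The constant and function names are hypothetical, and the 1024-byte threshold is an assumption based on the "1K or larger" wording in the --web description.

```python
import re

# Pattern taken from the default gzip regexp listed above.
WEB_GZIP_RX = re.compile(r"[.](html|css|js|svg)$")
# Assumption: "1K" is read as 1024 bytes.
WEB_GZIP_MIN_SIZE = 1024
# Suffixes that receive the long-lived Cache-Control value.
LONG_CACHE_SUFFIXES = (".css", ".js", ".png", ".jpg", ".jpeg",
                       ".gif", ".svg", ".ico", ".woff")


def web_cache_control(name):
    """Return the Cache-Control value that the web defaults give an item."""
    if name.endswith(LONG_CACHE_SUFFIXES):
        return "max-age=31536000, public"
    return "max-age=0, no-cache"


def web_should_gzip(name, size):
    """Return True if an item of the given size in bytes would be gzipped."""
    return WEB_GZIP_RX.search(name) is not None and size >= WEB_GZIP_MIN_SIZE


if __name__ == "__main__":
    for name, size in [("index.html", 2048), ("css/site.css", 100), ("logo.png", 50000)]:
        print(name, web_cache_control(name), web_should_gzip(name, size))
```

An ".html" page is therefore compressed when it is large enough but never cached for long, while a ".png" image is cached for a year and never compressed.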

5 S3 Routing Rules

Added in version 1.9 of package s3-sync.

Configures the web-site routing rules at bucket to include each of the routing rules in rules. Unless preserve-existing? is false, existing routing rules are preserved except as overridden by rules.

Creates a routing rule that redirects an access with a prefix matching prefix so that the prefix is replaced by new-prefix, the access is redirected to new-host, or both. At least one of new-prefix or new-host must be non-#f.

Changed in version 1.10 of package s3-sync: Added #:redirect-code argument.

Returns #t if v is a routing rule as created by redirect-prefix-routing-rule, #f otherwise.