S3

Protocol: REST XML

Endpoint: http://localhost:4566/{bucket}/{key}

Supported Operations

| Category | Operations |
|---|---|
| Buckets | ListBuckets, CreateBucket, HeadBucket, DeleteBucket, GetBucketLocation |
| Objects | PutObject, GetObject, GetObjectAttributes, HeadObject, DeleteObject, DeleteObjects, CopyObject |
| Listing | ListObjects, ListObjectsV2, ListObjectVersions |
| Multipart | CreateMultipartUpload, UploadPart, CompleteMultipartUpload, AbortMultipartUpload, ListMultipartUploads |
| Versioning | PutBucketVersioning, GetBucketVersioning |
| Tagging | PutBucketTagging, GetBucketTagging, PutObjectTagging, GetObjectTagging, DeleteObjectTagging |
| Policy | PutBucketPolicy, GetBucketPolicy, DeleteBucketPolicy |
| CORS | PutBucketCors, GetBucketCors, DeleteBucketCors |
| Lifecycle | PutBucketLifecycle, GetBucketLifecycle, DeleteBucketLifecycle |
| ACL | PutBucketAcl, GetBucketAcl, PutObjectAcl, GetObjectAcl |
| Encryption | PutBucketEncryption, GetBucketEncryption, DeleteBucketEncryption |
| Notifications | PutBucketNotification, GetBucketNotification |
| Object Lock | PutObjectLockConfiguration, GetObjectLockConfiguration, PutObjectRetention, GetObjectRetention, PutObjectLegalHold, GetObjectLegalHold |
| Pre-signed URLs | Generates and validates pre-signed GET/PUT URLs |
| S3 Select | SelectObjectContent |
| Public Access Block | PutPublicAccessBlock, GetPublicAccessBlock, DeletePublicAccessBlock |

RestoreObject is accepted but stubbed: Floci validates the request and returns 202 Accepted, but no restore state machine runs.

S3 Select

SelectObjectContent runs SQL queries directly against S3 objects, so the client does not have to download the entire file. Floci supports CSV, JSON Lines, JSON array, and Parquet inputs. The SQL dialect follows the AWS S3 Select SQL reference.

Execution modes

Floci chooses the execution engine automatically based on the input format and whether the floci-duck sidecar is running:

| Condition | Engine | Notes |
|---|---|---|
| Input is Parquet | floci-duck (required) | Uses DuckDB's read_parquet; the sidecar must be available |
| Input is CSV with FileHeaderInfo=USE and floci-duck is running | floci-duck | Full DuckDB SQL: all operators, LIKE, BETWEEN, IN, IS NULL, AND/OR/NOT |
| Input is JSON and floci-duck is running | floci-duck | Uses read_json_auto; supports JSON Lines and JSON arrays |
| Input is CSV with FileHeaderInfo=NONE or IGNORE, or floci-duck is not running | Java evaluator | Supports SELECT *, column projection, simple WHERE with =, !=, <, >, <=, >=, LIKE, BETWEEN, IN, IS NULL, AND/OR/NOT, LIMIT |

The floci-duck sidecar starts lazily on the first Athena query. Until then, isAvailable() returns false and S3 Select falls back to the Java evaluator for CSV and JSON. Once the sidecar is running, subsequent S3 Select calls route through DuckDB automatically.

If floci-duck is not running and the object is Parquet, S3 Select returns an error — Parquet decoding requires DuckDB.
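The routing rules above can be summarized as a small decision function. This is only a sketch of the documented behavior, not Floci's actual code: `choose_engine` and its parameter names are hypothetical, and treating a missing FileHeaderInfo as NONE is an assumption.

```python
from typing import Optional


def choose_engine(input_format: str,
                  duck_available: bool,
                  file_header_info: Optional[str] = None) -> str:
    """Return "floci-duck" or "java", or raise if the query cannot run.

    Hypothetical helper that mirrors the documented routing table.
    """
    fmt = input_format.upper()
    if fmt == "PARQUET":
        # Parquet decoding requires DuckDB; there is no Java fallback.
        if not duck_available:
            raise RuntimeError("Parquet input requires floci-duck")
        return "floci-duck"
    if fmt not in ("CSV", "JSON"):
        raise ValueError(f"unsupported input format: {input_format}")
    if not duck_available:
        # Sidecar not started yet (or mock mode): CSV and JSON fall back.
        return "java"
    if fmt == "JSON":
        return "floci-duck"
    # CSV: only header-aware input is routed to DuckDB.
    # Assumption: a missing FileHeaderInfo is treated as NONE.
    mode = (file_header_info or "NONE").upper()
    return "floci-duck" if mode == "USE" else "java"
```

For example, `choose_engine("CSV", True, "IGNORE")` yields `"java"`, while the same call with `"USE"` yields `"floci-duck"`.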

FileHeaderInfo modes (CSV)

| Value | Behavior |
|---|---|
| USE | First row is the header; column names are available in WHERE and SELECT |
| IGNORE | First row is skipped and not included in output; only positional _N references work |
| NONE | All rows are data; only positional _N references work (e.g. WHERE _1 = 'Alice') |
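To make the three modes concrete, the sketch below parses the same CSV text under each mode and shows which column keys a query could reference. `read_rows` is a hypothetical helper written for illustration; it simplifies by exposing only header names under USE and only positional names otherwise.

```python
import csv
import io
from typing import Dict, List


def read_rows(text: str, file_header_info: str) -> List[Dict[str, str]]:
    """Parse CSV text into dict rows keyed the way each mode exposes columns."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    mode = file_header_info.upper()
    if mode == "USE":
        # First row supplies the column names.
        header, data = rows[0], rows[1:]
        return [dict(zip(header, row)) for row in data]
    if mode == "IGNORE":
        data = rows[1:]  # header row is dropped, names are not usable
    elif mode == "NONE":
        data = rows      # every row, including the first, is data
    else:
        raise ValueError(f"unknown FileHeaderInfo: {file_header_info}")
    # Positional references are 1-based: _1, _2, ...
    return [{f"_{i}": v for i, v in enumerate(row, start=1)} for row in data]
```

With NONE, a header line in the object is returned as an ordinary data row, which is usually the first thing to check when a query returns one row too many.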

Supported SQL operators

When using the Java evaluator (no floci-duck, or CSV with FileHeaderInfo=NONE/IGNORE):

  • Comparison: =, !=, <>, <, >, <=, >=
  • Pattern matching: LIKE (supports % and _ wildcards)
  • Range: BETWEEN ... AND ...
  • Set membership: IN (...)
  • Null checks: IS NULL, IS NOT NULL
  • Logical: AND, OR, NOT
  • Clauses: SELECT *, column projection, LIMIT
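As an illustration of the LIKE wildcard semantics listed above, the following sketch translates a pattern into a regular expression, with % matching any run of characters and _ matching exactly one. It is not the Java evaluator's implementation; `like_to_regex` and `sql_like` are hypothetical names, and case-sensitive matching without ESCAPE support is an assumption.

```python
import re
from typing import Optional


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern into an anchored regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # any run of characters, possibly empty
        elif ch == "_":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)


def sql_like(value: Optional[str], pattern: str) -> bool:
    """Evaluate `value LIKE pattern`; a NULL value never matches."""
    if value is None:
        return False
    return like_to_regex(pattern).fullmatch(value) is not None
```

Escaping literal characters matters: without it, a pattern such as `a.c` would match `abc`.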

Output formats

S3 Select supports cross-format output: a CSV object can produce JSON output and vice versa.

| Input | Output | Format |
|---|---|---|
| CSV | CSV | Default; comma-separated values |
| CSV | JSON | One JSON object per row: {"col1":"val1","col2":"val2"} |
| JSON | JSON | Default; one JSON object per line |
| JSON | CSV | Comma-separated values, quoted when they contain commas or newlines |
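The two output shapes can be sketched with the Python standard library. This only approximates the formats described above (compact JSON objects, one per line, and minimally quoted CSV); the helper names are hypothetical and Floci's exact byte-level output may differ.

```python
import csv
import io
import json
from typing import Dict, List


def rows_to_json_lines(rows: List[Dict[str, str]]) -> str:
    """Serialize each row as one compact JSON object per line."""
    return "".join(json.dumps(row, separators=(",", ":")) + "\n" for row in rows)


def rows_to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize rows as CSV, quoting only values that need it."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in rows:
        writer.writerow(row.values())
    return buf.getvalue()
```

A value such as `Paris, FR` is emitted as `"Paris, FR"` in CSV output, while it needs no special handling in JSON output.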

Example

export AWS_ENDPOINT_URL=http://localhost:4566

# Create the bucket (skip this step if it already exists)
aws s3 mb s3://my-bucket

# Upload a CSV file
printf 'name,age,city\nAlice,30,New York\nBob,25,\nCharlie,35,London\n' \
  | aws s3 cp - s3://my-bucket/people.csv

# Query with WHERE and column projection
aws s3api select-object-content \
  --bucket my-bucket \
  --key people.csv \
  --expression "SELECT name, city FROM S3Object WHERE age >= 30" \
  --expression-type SQL \
  --input-serialization '{"CSV":{"FileHeaderInfo":"USE"}}' \
  --output-serialization '{"CSV":{}}' \
  /dev/stdout

# IS NULL check
aws s3api select-object-content \
  --bucket my-bucket \
  --key people.csv \
  --expression "SELECT name FROM S3Object WHERE city IS NULL" \
  --expression-type SQL \
  --input-serialization '{"CSV":{"FileHeaderInfo":"USE"}}' \
  --output-serialization '{"CSV":{}}' \
  /dev/stdout

# JSON Lines input
printf '{"name":"Alice","score":95}\n{"name":"Bob","score":72}\n' \
  | aws s3 cp - s3://my-bucket/scores.json

aws s3api select-object-content \
  --bucket my-bucket \
  --key scores.json \
  --expression "SELECT * FROM S3Object WHERE score > 80" \
  --expression-type SQL \
  --input-serialization '{"JSON":{"Type":"LINES"}}' \
  --output-serialization '{"JSON":{}}' \
  /dev/stdout
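The same query can be issued from Python. The sketch below assumes boto3 is installed, Floci is listening on http://localhost:4566, and the `test`/`test` credentials and `us-east-1` region are acceptable; `select_people` and `collect_records` are hypothetical helper names. SelectObjectContent returns an event stream, so the record payloads have to be concatenated by the caller.

```python
from typing import Any, Dict, Iterable


def collect_records(events: Iterable[Dict[str, Any]]) -> str:
    """Concatenate the Records payloads from a SelectObjectContent event stream."""
    chunks = []
    for event in events:
        # Other event types (Stats, Progress, Cont, End) carry no row data.
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


def select_people(endpoint_url: str = "http://localhost:4566") -> str:
    """Run the CSV query from the CLI example and return the result text."""
    import boto3  # imported lazily so collect_records is usable without boto3

    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        region_name="us-east-1",
        aws_access_key_id="test",
        aws_secret_access_key="test",
    )
    response = s3.select_object_content(
        Bucket="my-bucket",
        Key="people.csv",
        Expression="SELECT name, city FROM S3Object WHERE age >= 30",
        ExpressionType="SQL",
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )
    return collect_records(response["Payload"])
```

Calling `print(select_people())` against a running emulator prints the matching rows as CSV text.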

Mock mode note

When FLOCI_SERVICES_ATHENA_MOCK=true is set, Athena queries are stubbed but floci-duck does not start. In that configuration, S3 Select uses the Java evaluator for CSV and JSON. Parquet queries will fail unless FLOCI_SERVICES_DUCK_URL points to an already-running floci-duck instance.

Not Implemented

These AWS S3 features have no handler in Floci. Calls will return an error (typically 404 or NoSuchBucket-style):

  • Replication (PutBucketReplication, GetBucketReplication, DeleteBucketReplication)
  • Website hosting (PutBucketWebsite, GetBucketWebsite, DeleteBucketWebsite)
  • Access logging (PutBucketLogging, GetBucketLogging)
  • Request payment (PutBucketRequestPayment, GetBucketRequestPayment)
  • Intelligent-Tiering configurations
  • Inventory configurations
  • Metrics and Analytics configurations

Configuration

| Variable | Default | Description |
|---|---|---|
| FLOCI_SERVICES_S3_ENABLED | true | Enable or disable the service |
| FLOCI_SERVICES_S3_DEFAULT_PRESIGN_EXPIRY_SECONDS | 3600 | Default pre-signed URL expiry (1 hour) |
| FLOCI_AUTH_PRESIGN_SECRET | local-emulator-secret | Secret used to sign pre-signed URLs |
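A test harness that needs to mirror these settings could read them with the documented defaults, roughly as below. `S3Settings` and `load_s3_settings` are hypothetical names, and the boolean parsing rule (only the string "true", case-insensitive, enables the service) is an assumption rather than documented Floci behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class S3Settings:
    enabled: bool
    default_presign_expiry_seconds: int
    presign_secret: str


def load_s3_settings(env: Mapping[str, str] = os.environ) -> S3Settings:
    """Read the S3 variables, falling back to the documented defaults."""
    return S3Settings(
        # Assumption: any value other than "true" disables the service.
        enabled=env.get("FLOCI_SERVICES_S3_ENABLED", "true").strip().lower() == "true",
        default_presign_expiry_seconds=int(
            env.get("FLOCI_SERVICES_S3_DEFAULT_PRESIGN_EXPIRY_SECONDS", "3600")
        ),
        presign_secret=env.get("FLOCI_AUTH_PRESIGN_SECRET", "local-emulator-secret"),
    )
```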

Examples

export AWS_ENDPOINT_URL=http://localhost:4566

# Create bucket
aws s3 mb s3://my-bucket --endpoint-url $AWS_ENDPOINT_URL

# Upload a file
aws s3 cp ./report.pdf s3://my-bucket/reports/report.pdf --endpoint-url $AWS_ENDPOINT_URL

# Upload inline content
echo '{"hello":"world"}' | aws s3 cp - s3://my-bucket/data.json --endpoint-url $AWS_ENDPOINT_URL

# Download
aws s3 cp s3://my-bucket/data.json ./data.json --endpoint-url $AWS_ENDPOINT_URL

# Inspect object attributes without downloading the body
aws s3api get-object-attributes \
  --bucket my-bucket \
  --key data.json \
  --object-attributes ETag ObjectSize StorageClass \
  --endpoint-url $AWS_ENDPOINT_URL

# List
aws s3 ls s3://my-bucket --endpoint-url $AWS_ENDPOINT_URL

# Delete
aws s3 rm s3://my-bucket/data.json --endpoint-url $AWS_ENDPOINT_URL

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket my-bucket \
  --versioning-configuration Status=Enabled \
  --endpoint-url $AWS_ENDPOINT_URL

# Generate a pre-signed URL (valid for 1 hour)
aws s3 presign s3://my-bucket/reports/report.pdf \
  --expires-in 3600 \
  --endpoint-url $AWS_ENDPOINT_URL

Addressing Styles

Floci supports both path-style and virtual-hosted style S3 addressing.

Path-Style (always works)

Path-style embeds the bucket name in the URL path:

http://localhost:4566/my-bucket/my-key

Enable it in the SDK with forcePathStyle / pathStyleAccessEnabled:

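Java (AWS SDK v2):

import java.net.URI;
import software.amazon.awssdk.services.s3.S3Client;

S3Client s3 = S3Client.builder()
    .endpointOverride(URI.create("http://localhost:4566"))
    .forcePathStyle(true)
    .build();

JavaScript (AWS SDK v3):

import { S3Client } from "@aws-sdk/client-s3";

const s3 = new S3Client({
  endpoint: "http://localhost:4566",
  forcePathStyle: true,
});

Python (boto3):

import boto3
from botocore.config import Config

s3 = boto3.client("s3",
    endpoint_url="http://localhost:4566",
    config=Config(s3={"addressing_style": "path"}))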

Virtual-Hosted Style

Virtual-hosted style puts the bucket name in the hostname:

http://my-bucket.s3.localhost.floci.io:4566/my-key

Floci supports this natively — no forcePathStyle needed. The following wildcard DNS domains resolve to 127.0.0.1 via public DNS, so virtual-hosted requests reach Floci on the host machine automatically:

| Domain pattern | Resolves to |
|---|---|
| *.localhost.floci.io | 127.0.0.1 |
| *.s3.localhost.floci.io | 127.0.0.1 |
| *.localhost.localstack.cloud | 127.0.0.1 |
| *.s3.localhost.localstack.cloud | 127.0.0.1 |

Plain http://localhost:4566 also works without forcePathStyle — the SDK sends Host: my-bucket.localhost:4566 and Floci's filter extracts the bucket name from that header.
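To illustrate how a bucket name can be recovered from the Host header, here is a minimal sketch. It is not Floci's filter: `bucket_from_host` and `BASE_HOSTS` are hypothetical, the base-host list is taken from the domain patterns above plus plain localhost, and IPv6 hosts are not handled.

```python
from typing import Optional

# Base hostnames under which a bucket label may appear, longest first so that
# "my-bucket.s3.localhost.floci.io" is not parsed as bucket "my-bucket.s3".
# The list and the matching rules are assumptions for illustration.
BASE_HOSTS = (
    "s3.localhost.localstack.cloud",
    "s3.localhost.floci.io",
    "localhost.localstack.cloud",
    "localhost.floci.io",
    "localhost",
)


def bucket_from_host(host_header: str) -> Optional[str]:
    """Return the bucket encoded in a virtual-hosted Host header, if any."""
    host = host_header.split(":", 1)[0].lower()  # drop the port
    if host in BASE_HOSTS:
        return None  # request to the bare endpoint, i.e. path-style
    for base in BASE_HOSTS:
        suffix = "." + base
        if host.endswith(suffix):
            return host[: -len(suffix)] or None
    return None
```

A `None` result means the request should be treated as path-style, with the bucket taken from the first path segment instead.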

Configure the SDK endpoint to one of the base domains (no forcePathStyle):

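Java (AWS SDK v2):

import java.net.URI;
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;

S3Client s3 = S3Client.builder()
    .endpointOverride(URI.create("http://s3.localhost.floci.io:4566"))
    .region(Region.US_EAST_1)
    .credentialsProvider(StaticCredentialsProvider.create(
        AwsBasicCredentials.create("test", "test")))
    .build();
// SDK sends requests to: my-bucket.s3.localhost.floci.io:4566

JavaScript (AWS SDK v3):

import { S3Client } from "@aws-sdk/client-s3";

const s3 = new S3Client({
  endpoint: "http://s3.localhost.floci.io:4566",
  // no forcePathStyle: the SDK uses virtual-hosted style by default
});
// SDK sends requests to: my-bucket.s3.localhost.floci.io:4566

Python (boto3):

import boto3

s3 = boto3.client("s3",
    endpoint_url="http://s3.localhost.floci.io:4566")
# SDK sends requests to: my-bucket.s3.localhost.floci.io:4566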

Virtual-Hosted Style Inside Docker

Inside Docker containers, 127.0.0.1 resolves to the container itself — not Floci. Floci's embedded DNS server handles this automatically: it resolves *.localhost.floci.io, *.s3.localhost.floci.io, *.localhost.localstack.cloud, and *.s3.localhost.localstack.cloud to Floci's container IP on the Docker network.

To use virtual-hosted style from a test container, point its DNS at Floci and set the endpoint to Floci's service hostname:

FLOCI_IP=$(docker inspect -f '{{.NetworkSettings.Networks.floci_default.IPAddress}}' floci)

docker run --rm \
  --network floci_default \
  --dns "$FLOCI_IP" \
  -e FLOCI_ENDPOINT=http://floci:4566 \
  -e FLOCI_S3_VHOST_ENDPOINT=http://floci:4566 \
  my-test-image

With FLOCI_HOSTNAME=floci set on the Floci container (default in the provided docker-compose.yml), the embedded DNS resolves my-bucket.floci to Floci's IP, and S3VirtualHostFilter extracts the bucket name from the Host header.

Object Attribute Notes

Floci now persists and returns the following object attribute state on S3 object APIs:

  • user metadata from x-amz-meta-*
  • storage class from x-amz-storage-class
  • checksum metadata for object reads and GetObjectAttributes
  • multipart part manifests for GetObjectAttributes(ObjectParts)
  • canned object ACLs from x-amz-acl on PutObject, CopyObject, and multipart initiation
  • explicit object SSE headers from x-amz-server-side-encryption on PutObject, CopyObject, and multipart initiation, replayed on GetObject and HeadObject
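The attributes listed above map onto ordinary boto3 parameters. The sketch below shows one way to set them on upload and read them back; the helper names and the sample values (`owner=alice`, `STANDARD_IA`) are illustrative assumptions, and `s3` is expected to be a boto3 S3 client pointed at Floci.

```python
from typing import Any, Dict


def put_with_attributes(s3: Any, bucket: str, key: str, body: bytes) -> Dict[str, Any]:
    """Upload an object with metadata, storage class, ACL, SSE, and checksum set."""
    return s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        Metadata={"owner": "alice"},    # sent as x-amz-meta-owner
        StorageClass="STANDARD_IA",     # x-amz-storage-class
        ACL="private",                  # x-amz-acl (canned ACL)
        ServerSideEncryption="AES256",  # x-amz-server-side-encryption
        ChecksumAlgorithm="SHA256",     # ask the SDK to send a SHA-256 checksum
    )


def read_attributes(s3: Any, bucket: str, key: str) -> Dict[str, Any]:
    """Return the persisted attributes that HeadObject is expected to replay."""
    head = s3.head_object(Bucket=bucket, Key=key)
    return {
        "Metadata": head.get("Metadata", {}),
        "StorageClass": head.get("StorageClass"),
        "ServerSideEncryption": head.get("ServerSideEncryption"),
    }
```

Uploading with `put_with_attributes` and then comparing the result of `read_attributes` against the values sent is a quick round-trip check in an integration test.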

Current limitations:

  • checksum responses focus on SHA-1 and SHA-256
  • copy-based metadata updates support x-amz-metadata-directive: REPLACE for user metadata and content type, but do not yet cover every AWS copy header
  • explicit ACL grant headers such as x-amz-grant-read and x-amz-grant-full-control are not modeled yet
  • cross-account canned ACL variants collapse to the emulator's single synthetic owner where Floci does not model a distinct second principal
  • aws-exec-read is accepted for compatibility, but Floci does not yet model a distinct EC2 bundle-reader grantee in GetObjectAcl