S3

Protocol: REST XML
Endpoint: http://localhost:4566/{bucket}/{key}

Supported Operations

Category	Operations
Buckets	ListBuckets, CreateBucket, HeadBucket, DeleteBucket, GetBucketLocation
Objects	PutObject, GetObject, GetObjectAttributes, HeadObject, DeleteObject, DeleteObjects, CopyObject
Listing	ListObjects, ListObjectsV2, ListObjectVersions
Multipart	CreateMultipartUpload, UploadPart, CompleteMultipartUpload, AbortMultipartUpload, ListMultipartUploads
Versioning	PutBucketVersioning, GetBucketVersioning
Tagging	PutBucketTagging, GetBucketTagging, PutObjectTagging, GetObjectTagging, DeleteObjectTagging
Policy	PutBucketPolicy, GetBucketPolicy, DeleteBucketPolicy
CORS	PutBucketCors, GetBucketCors, DeleteBucketCors
Lifecycle	PutBucketLifecycle, GetBucketLifecycle, DeleteBucketLifecycle
ACL	PutBucketAcl, GetBucketAcl, PutObjectAcl, GetObjectAcl
Encryption	PutBucketEncryption, GetBucketEncryption, DeleteBucketEncryption
Notifications	PutBucketNotification, GetBucketNotification
Object Lock	PutObjectLockConfiguration, GetObjectLockConfiguration, PutObjectRetention, GetObjectRetention, PutObjectLegalHold, GetObjectLegalHold
Pre-signed URLs	Generates and validates pre-signed GET/PUT URLs
S3 Select	SelectObjectContent
Public Access Block	PutPublicAccessBlock, GetPublicAccessBlock, DeletePublicAccessBlock

RestoreObject is accepted but stubbed: Floci validates the request and returns 202 Accepted, but no restore state machine runs.

S3 Select

SelectObjectContent runs SQL queries directly against S3 objects without downloading the entire file. Floci supports CSV, JSON Lines, JSON arrays, and Parquet inputs. The SQL dialect follows the AWS S3 Select SQL reference.

Execution modes

Floci chooses the execution engine automatically based on the input format and whether the floci-duck sidecar is running:

Condition	Engine	Notes
Input is Parquet	floci-duck (required)	DuckDB's `read_parquet` — sidecar must be available
Input is CSV with `FileHeaderInfo=USE` and floci-duck is running	floci-duck	Full DuckDB SQL: all operators, LIKE, BETWEEN, IN, IS NULL, AND/OR/NOT
Input is JSON and floci-duck is running	floci-duck	`read_json_auto` — supports JSON Lines and JSON arrays
Input is CSV with `FileHeaderInfo=NONE` or `IGNORE`, or floci-duck is not running (CSV and JSON only)	Java evaluator	Supports SELECT *, column projection, simple WHERE with =, !=, <>, <, >, <=, >=, LIKE, BETWEEN, IN, IS NULL, IS NOT NULL, AND/OR/NOT, LIMIT

The floci-duck sidecar starts lazily on the first Athena query. Until then, isAvailable() returns false and S3 Select falls back to the Java evaluator for CSV and JSON. Once the sidecar is running, subsequent S3 Select calls route through DuckDB automatically.

If floci-duck is not running and the object is Parquet, S3 Select returns an error — Parquet decoding requires DuckDB.
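The routing rules above can be sketched as a small decision function. This is only an illustration of the table, not Floci's actual code: the name `choose_engine`, its parameters, and the assumption that a missing `FileHeaderInfo` behaves like `NONE` are all hypothetical.

```python
from typing import Optional


def choose_engine(input_format: str,
                  duck_available: bool,
                  file_header_info: Optional[str] = None) -> str:
    """Return "floci-duck" or "java" following the execution-mode table."""
    fmt = input_format.upper()
    if fmt == "PARQUET":
        if not duck_available:
            # Parquet decoding requires DuckDB, so there is no fallback engine.
            raise RuntimeError("Parquet input requires the floci-duck sidecar")
        return "floci-duck"
    if fmt not in ("CSV", "JSON"):
        raise ValueError(f"unsupported input format: {input_format}")
    if not duck_available:
        # CSV and JSON fall back to the Java evaluator.
        return "java"
    if fmt == "JSON":
        return "floci-duck"
    # CSV: only header-aware input is routed to DuckDB.
    # Assumption: an absent FileHeaderInfo is treated like NONE.
    header_mode = (file_header_info or "NONE").upper()
    return "floci-duck" if header_mode == "USE" else "java"
```

For example, `choose_engine("CSV", True, "IGNORE")` yields `"java"`, while the same call with `"USE"` yields `"floci-duck"`.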

FileHeaderInfo modes (CSV)

Value	Behavior
`USE`	First row is the header; column names are available in WHERE and SELECT
`IGNORE`	First row is skipped and not included in output; only positional `_N` references work
`NONE`	All rows are data; only positional `_N` references work (e.g. `WHERE _1 = 'Alice'`)
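To make the three modes concrete, the sketch below shows how each one might turn CSV text into addressable columns. It is a simplified model, not Floci's parser: `read_rows` is a hypothetical helper, and under `USE` it exposes header names only.

```python
import csv
import io
from typing import Dict, List


def read_rows(csv_text: str, file_header_info: str = "NONE") -> List[Dict[str, str]]:
    """Map CSV rows to dicts keyed by header name (USE) or by _1, _2, ... (IGNORE/NONE)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    mode = file_header_info.upper()
    if mode == "USE":
        # First row supplies the column names and is not returned as data.
        header, data = rows[0], rows[1:]
        return [dict(zip(header, row)) for row in data]
    if mode == "IGNORE":
        data = rows[1:]   # first row is skipped entirely
    elif mode == "NONE":
        data = rows       # every row is data
    else:
        raise ValueError(f"unknown FileHeaderInfo: {file_header_info}")
    # Positional references are 1-based: _1 is the first column.
    return [{f"_{i}": value for i, value in enumerate(row, start=1)} for row in data]
```

With the input `name,age\nAlice,30\n`, mode `USE` produces a single row keyed by `name` and `age`, `IGNORE` produces the same row keyed by `_1` and `_2`, and `NONE` additionally returns the header line as a data row.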

Supported SQL operators

When using the Java evaluator (no floci-duck, or CSV with FileHeaderInfo=NONE/IGNORE):

Comparison: =, !=, <>, <, >, <=, >=
Pattern matching: LIKE (supports % and _ wildcards)
Range: BETWEEN ... AND ...
Set membership: IN (...)
Null checks: IS NULL, IS NOT NULL
Logical: AND, OR, NOT
Clauses: SELECT *, column projection, LIMIT
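As a rough illustration of the `%` and `_` wildcard semantics listed above, a LIKE pattern can be translated into a regular expression. This is a sketch under assumptions: `sql_like` is a hypothetical helper, matching is case-sensitive, and the ESCAPE clause is not handled; Floci's Java evaluator may differ in those details.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern into an anchored regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")          # % matches any run of characters, including none
        elif ch == "_":
            parts.append(".")           # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)


def sql_like(value: str, pattern: str) -> bool:
    """Return True when the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None
```

For instance, `sql_like("Alice", "A%")` and `sql_like("Bob", "B_b")` both hold, whereas `sql_like("abc", "a.c")` does not because the dot is treated literally.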

Output formats

S3 Select supports cross-format output: a CSV object can produce JSON output and vice versa.

Input	Output	Format
CSV	CSV	Default — comma-separated values
CSV	JSON	One JSON object per row: `{"col1":"val1","col2":"val2"}`
JSON	JSON	Default — one JSON object per line
JSON	CSV	Comma-separated values, values quoted when they contain commas or newlines
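The two serializations in the table can be approximated with the Python standard library. This is a minimal sketch of the output shapes, not Floci's serializer; the helper names are hypothetical, and `csv.QUOTE_MINIMAL` also quotes values that contain a double quote, which the table does not mention.

```python
import csv
import io
import json
from typing import Dict, Iterable, List


def rows_to_json_lines(rows: Iterable[Dict[str, str]]) -> str:
    """One compact JSON object per row, each terminated by a newline."""
    return "".join(json.dumps(row, separators=(",", ":")) + "\n" for row in rows)


def rows_to_csv(rows: Iterable[List[str]]) -> str:
    """Comma-separated output; values are quoted only when they need it."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()
```

A row such as `["Bob", "a,b"]` is written as `Bob,"a,b"`, while `["Alice", "New York"]` stays unquoted.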

Example

export AWS_ENDPOINT_URL=http://localhost:4566

# Upload a CSV file (assumes my-bucket already exists; create it with: aws s3 mb s3://my-bucket)
printf 'name,age,city\nAlice,30,New York\nBob,25,\nCharlie,35,London\n' \
  | aws s3 cp - s3://my-bucket/people.csv

# Query with WHERE and column projection
aws s3api select-object-content \
  --bucket my-bucket \
  --key people.csv \
  --expression "SELECT name, city FROM S3Object WHERE age >= 30" \
  --expression-type SQL \
  --input-serialization '{"CSV":{"FileHeaderInfo":"USE"}}' \
  --output-serialization '{"CSV":{}}' \
  /dev/stdout

# IS NULL check
aws s3api select-object-content \
  --bucket my-bucket \
  --key people.csv \
  --expression "SELECT name FROM S3Object WHERE city IS NULL" \
  --expression-type SQL \
  --input-serialization '{"CSV":{"FileHeaderInfo":"USE"}}' \
  --output-serialization '{"CSV":{}}' \
  /dev/stdout

# JSON Lines input
printf '{"name":"Alice","score":95}\n{"name":"Bob","score":72}\n' \
  | aws s3 cp - s3://my-bucket/scores.json

aws s3api select-object-content \
  --bucket my-bucket \
  --key scores.json \
  --expression "SELECT * FROM S3Object WHERE score > 80" \
  --expression-type SQL \
  --input-serialization '{"JSON":{"Type":"LINES"}}' \
  --output-serialization '{"JSON":{}}' \
  /dev/stdout
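The same queries can be issued from an SDK. The sketch below assumes boto3 and keeps the network call in comments so the helpers stay runnable on their own; `build_select_request` and `collect_records` are hypothetical helper names, and the `test`/`test` credentials and `us-east-1` region are placeholder assumptions.

```python
from typing import Any, Dict, Iterable


def build_select_request(bucket: str, key: str, expression: str,
                         input_serialization: Dict[str, Any],
                         output_serialization: Dict[str, Any]) -> Dict[str, Any]:
    """Keyword arguments for boto3's S3 client select_object_content call."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Expression": expression,
        "ExpressionType": "SQL",
        "InputSerialization": input_serialization,
        "OutputSerialization": output_serialization,
    }


def collect_records(events: Iterable[Dict[str, Any]]) -> str:
    """Join the Records payloads of a select event stream, skipping Stats/End events."""
    chunks = [event["Records"]["Payload"] for event in events if "Records" in event]
    return b"".join(chunks).decode("utf-8")


request = build_select_request(
    "my-bucket", "people.csv",
    "SELECT name, city FROM S3Object WHERE age >= 30",
    {"CSV": {"FileHeaderInfo": "USE"}},
    {"CSV": {}},
)

# Against a running Floci instance (assumed dummy credentials and region):
# import boto3
# s3 = boto3.client("s3", endpoint_url="http://localhost:4566",
#                   region_name="us-east-1",
#                   aws_access_key_id="test", aws_secret_access_key="test")
# response = s3.select_object_content(**request)
# print(collect_records(response["Payload"]))
```

Results arrive as a stream of events, so the record payloads have to be concatenated before decoding; a single row may be split across two `Records` events.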

Mock mode note

When FLOCI_SERVICES_ATHENA_MOCK=true is set, Athena queries are stubbed but floci-duck does not start. In that configuration, S3 Select uses the Java evaluator for CSV and JSON. Parquet queries will fail unless FLOCI_SERVICES_DUCK_URL points to an already-running floci-duck instance.

Not Implemented

These AWS S3 features have no handler in Floci. Calls will return an error (typically 404 or NoSuchBucket-style):

Replication (PutBucketReplication, GetBucketReplication, DeleteBucketReplication)
Website hosting (PutBucketWebsite, GetBucketWebsite, DeleteBucketWebsite)
Access logging (PutBucketLogging, GetBucketLogging)
Request payment (PutBucketRequestPayment, GetBucketRequestPayment)
Intelligent-Tiering configurations
Inventory configurations
Metrics and Analytics configurations

Configuration

Variable	Default	Description
`FLOCI_SERVICES_S3_ENABLED`	`true`	Enable or disable the service
`FLOCI_SERVICES_S3_DEFAULT_PRESIGN_EXPIRY_SECONDS`	`3600`	Default pre-signed URL expiry (1 hour)
`FLOCI_AUTH_PRESIGN_SECRET`	`local-emulator-secret`	Secret used to sign pre-signed URLs

Examples

export AWS_ENDPOINT_URL=http://localhost:4566

# Create bucket
aws s3 mb s3://my-bucket --endpoint-url $AWS_ENDPOINT_URL

# Upload a file
aws s3 cp ./report.pdf s3://my-bucket/reports/report.pdf --endpoint-url $AWS_ENDPOINT_URL

# Upload inline content
echo '{"hello":"world"}' | aws s3 cp - s3://my-bucket/data.json --endpoint-url $AWS_ENDPOINT_URL

# Download
aws s3 cp s3://my-bucket/data.json ./data.json --endpoint-url $AWS_ENDPOINT_URL

# Inspect object attributes without downloading the body
aws s3api get-object-attributes \
  --bucket my-bucket \
  --key data.json \
  --object-attributes ETag ObjectSize StorageClass \
  --endpoint-url $AWS_ENDPOINT_URL

# List
aws s3 ls s3://my-bucket --endpoint-url $AWS_ENDPOINT_URL

# Delete
aws s3 rm s3://my-bucket/data.json --endpoint-url $AWS_ENDPOINT_URL

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket my-bucket \
  --versioning-configuration Status=Enabled \
  --endpoint-url $AWS_ENDPOINT_URL

# Generate a pre-signed URL (valid for 1 hour)
aws s3 presign s3://my-bucket/report.pdf \
  --expires-in 3600 \
  --endpoint-url $AWS_ENDPOINT_URL

Addressing Styles

Floci supports both path-style and virtual-hosted style S3 addressing.

Path-Style (always works)

Path-style embeds the bucket name in the URL path:

http://localhost:4566/my-bucket/my-key

Enable it in the SDK with forcePathStyle / pathStyleAccessEnabled:

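Java

import java.net.URI;
import software.amazon.awssdk.services.s3.S3Client;

S3Client s3 = S3Client.builder()
    .endpointOverride(URI.create("http://localhost:4566"))
    .forcePathStyle(true)
    .build();

Node.js

import { S3Client } from "@aws-sdk/client-s3";

const s3 = new S3Client({
  endpoint: "http://localhost:4566",
  forcePathStyle: true,
});

Python

import boto3
from botocore.config import Config

s3 = boto3.client("s3",
    endpoint_url="http://localhost:4566",
    config=Config(s3={"addressing_style": "path"}))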

Virtual-Hosted Style

Virtual-hosted style puts the bucket name in the hostname:

http://my-bucket.s3.localhost.floci.io:4566/my-key

Floci supports this natively — no forcePathStyle needed. The following wildcard DNS domains resolve to 127.0.0.1 via public DNS, so virtual-hosted requests reach Floci on the host machine automatically:

Domain pattern	Resolves to
`*.localhost.floci.io`	`127.0.0.1`
`*.s3.localhost.floci.io`	`127.0.0.1`
`*.localhost.localstack.cloud`	`127.0.0.1`
`*.s3.localhost.localstack.cloud`	`127.0.0.1`

Plain http://localhost:4566 also works without forcePathStyle — the SDK sends Host: my-bucket.localhost:4566 and Floci's filter extracts the bucket name from that header.
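The Host-header handling described above can be pictured roughly as follows. This is a hypothetical sketch, not the actual filter: the function name, the `BASE_DOMAINS` tuple, and the matching rules are assumptions based on the domains listed in this section.

```python
from typing import Optional, Sequence

# Assumed base domains, taken from the table above plus plain localhost.
BASE_DOMAINS = (
    "s3.localhost.floci.io",
    "localhost.floci.io",
    "s3.localhost.localstack.cloud",
    "localhost.localstack.cloud",
    "localhost",
)


def bucket_from_host(host_header: str,
                     base_domains: Sequence[str] = BASE_DOMAINS) -> Optional[str]:
    """Return the bucket encoded in a virtual-hosted Host header, or None for path-style."""
    host = host_header.split(":", 1)[0].lower()   # drop the port, if any
    if host in base_domains:
        return None                               # bare endpoint: bucket is in the path
    # Try the longest base first so "s3.localhost.floci.io" wins over "localhost.floci.io".
    for base in sorted(base_domains, key=len, reverse=True):
        suffix = "." + base
        if host.endswith(suffix):
            return host[: -len(suffix)] or None
    return None
```

Passing a different base, for example `bucket_from_host("my-bucket.floci", base_domains=("floci",))`, models a deployment where the emulator is reached by a container hostname instead of a public wildcard domain.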

Configure the SDK endpoint to one of the base domains (no forcePathStyle):

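Java

import java.net.URI;
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;

S3Client s3 = S3Client.builder()
    .endpointOverride(URI.create("http://s3.localhost.floci.io:4566"))
    .region(Region.US_EAST_1)
    .credentialsProvider(StaticCredentialsProvider.create(
        AwsBasicCredentials.create("test", "test")))
    .build();
// SDK sends requests to: my-bucket.s3.localhost.floci.io:4566

Node.js

import { S3Client } from "@aws-sdk/client-s3";

const s3 = new S3Client({
  endpoint: "http://s3.localhost.floci.io:4566",
  // no forcePathStyle: the SDK uses virtual-hosted style by default
});
// SDK sends requests to: my-bucket.s3.localhost.floci.io:4566

Python

import boto3
from botocore.config import Config

# boto3 falls back to path-style when a custom endpoint_url is set,
# so virtual-hosted style has to be requested explicitly.
s3 = boto3.client("s3",
    endpoint_url="http://s3.localhost.floci.io:4566",
    config=Config(s3={"addressing_style": "virtual"}))
# SDK sends requests to: my-bucket.s3.localhost.floci.io:4566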

Virtual-Hosted Style Inside Docker

Inside Docker containers, 127.0.0.1 resolves to the container itself — not Floci. Floci's embedded DNS server handles this automatically: it resolves *.localhost.floci.io, *.s3.localhost.floci.io, *.localhost.localstack.cloud, and *.s3.localhost.localstack.cloud to Floci's container IP on the Docker network.

To use virtual-hosted style from a test container, point its DNS at Floci and set the endpoint to Floci's service hostname:

FLOCI_IP=$(docker inspect -f '{{.NetworkSettings.Networks.floci_default.IPAddress}}' floci)

docker run --rm \
  --network floci_default \
  --dns "$FLOCI_IP" \
  -e FLOCI_ENDPOINT=http://floci:4566 \
  -e FLOCI_S3_VHOST_ENDPOINT=http://floci:4566 \
  my-test-image

With FLOCI_HOSTNAME=floci set on the Floci container (default in the provided docker-compose.yml), the embedded DNS resolves my-bucket.floci to Floci's IP, and S3VirtualHostFilter extracts the bucket name from the Host header.

Object Attribute Notes

Floci persists the following object attributes and returns them from the S3 object APIs:

user metadata from x-amz-meta-*
storage class from x-amz-storage-class
checksum metadata for object reads and GetObjectAttributes
multipart part manifests for GetObjectAttributes(ObjectParts)
canned object ACLs from x-amz-acl on PutObject, CopyObject, and multipart initiation
explicit object SSE headers from x-amz-server-side-encryption on PutObject, CopyObject, and multipart initiation, replayed on GetObject and HeadObject
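Two of the items above lend themselves to a short illustration: collecting user metadata from `x-amz-meta-*` request headers, and producing a SHA-256 checksum in the base64 form S3 uses for `x-amz-checksum-sha256`. The helper names are hypothetical and the sketch says nothing about how Floci stores these values internally.

```python
import base64
import hashlib
from typing import Dict, Mapping

META_PREFIX = "x-amz-meta-"


def extract_user_metadata(headers: Mapping[str, str]) -> Dict[str, str]:
    """Keep only x-amz-meta-* headers, keyed by the lowercased name after the prefix."""
    return {
        name.lower()[len(META_PREFIX):]: value
        for name, value in headers.items()
        if name.lower().startswith(META_PREFIX)
    }


def sha256_checksum(body: bytes) -> str:
    """Base64-encoded SHA-256 digest of the object body."""
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
```

Header names are compared case-insensitively because HTTP header names are not case-sensitive; other `x-amz-*` headers such as `x-amz-storage-class` are deliberately left out of the metadata map.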

Current limitations:

checksum responses focus on SHA-1 and SHA-256
copy-based metadata updates support x-amz-metadata-directive: REPLACE for user metadata and content type, but do not yet cover every AWS copy header
explicit ACL grant headers such as x-amz-grant-read and x-amz-grant-full-control are not modeled yet
cross-account canned ACL variants collapse to the emulator's single synthetic owner where Floci does not model a distinct second principal
aws-exec-read is accepted for compatibility, but Floci does not yet model a distinct EC2 bundle-reader grantee in GetObjectAcl