AWS S3 Complete Guide: Storage, Security & Cost Optimization (2026)
Amazon S3 (Simple Storage Service) is the backbone of AWS cloud storage. Nearly every AWS workload touches S3 in some form: application assets, database backups, data lakes, static websites, Lambda deployment packages, CloudTrail logs, and more. S3 is one of the oldest AWS services (launched in 2006) and one of the most feature-rich.
This guide covers everything you need to know about S3 for the AWS Solutions Architect exam and production usage.
S3 Core Concepts
Buckets and Objects
S3 is object storage — you store files (objects) in containers (buckets).
- Bucket: A globally unique container for objects. Tied to a region but names are globally unique across all AWS accounts.
- Object: A file plus metadata. Max size 5 TB. A single PUT can upload at most 5 GB; larger objects require multipart upload, which AWS recommends for anything over about 100 MB.
- Key: The object's unique identifier within a bucket. Looks like a path, for example /images/products/123.jpg, but S3 has no real folders, just key prefixes.
- Object URL: https://bucket-name.s3.us-east-1.amazonaws.com/key
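To make the "no real folders" point concrete, here is a minimal sketch in plain Python (no AWS calls). The helper names `object_url` and `list_folder` are hypothetical and exist only for illustration; `list_folder` imitates how a delimiter-based listing turns flat keys into apparent folders.

```python
from urllib.parse import quote


def object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style URL for an object (illustrative helper)."""
    # Percent-encode the key but keep "/" so the path-like shape survives.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


def list_folder(keys, prefix="", delimiter="/"):
    """Split keys under `prefix` into direct 'files' and one-level
    'subfolders' (common prefixes), the way a delimiter listing does."""
    files, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter looks like a folder.
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            files.append(key)
    return files, sorted(folders)


keys = ["images/products/123.jpg", "images/products/124.jpg", "images/logo.png"]
files, folders = list_folder(keys, prefix="images/")
print(files)    # ['images/logo.png']
print(folders)  # ['images/products/']
print(object_url("bucket-name", "us-east-1", keys[0]))
```

The "folder" images/products/ never exists as an object; it is only a shared prefix of two keys.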
S3 Is Not a Filesystem
Common misconception: S3 is like a network drive.
S3 is an object store, not a block device or filesystem:
- No in-place editing — you replace objects entirely
- No locking (use DynamoDB for distributed locking)
- Strongly consistent for reads, writes, and LIST operations since December 2020, but with no transactions across objects: concurrent writes to the same key are last-writer-wins
- Designed for "write once, read many" patterns
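As a rough illustration of "no in-place editing", the sketch below "appends" a line to an object by downloading it, changing the bytes locally, and uploading a full replacement. The function name `append_line` is hypothetical; `s3` is assumed to be a boto3 S3 client (or anything with the same `get_object` / `put_object` methods).

```python
def append_line(s3, bucket: str, key: str, line: str) -> int:
    """Read the whole object, add one line, and write the whole object back.
    Returns the new size in bytes."""
    current = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    updated = current + line.encode("utf-8") + b"\n"
    # put_object always replaces the entire object; there is no partial write.
    s3.put_object(Bucket=bucket, Key=key, Body=updated)
    return len(updated)


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   append_line(boto3.client("s3"), "my-bucket", "logs/app.log", "started")
```

Two callers running this at the same time can overwrite each other's change, which is why the list above points to an external lock for coordination.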
Storage Classes
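The seven general-purpose S3 storage classes below are optimized for different access patterns (specialized classes such as S3 Express One Zone are not covered here):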
| Storage Class | Use Case | Min Duration | Retrieval | Price |
|---|---|---|---|---|
| Standard | Frequently accessed | None | Instant | $$$ |
| Standard-IA | Infrequent (monthly) | 30 days | Instant | $$ |
| One Zone-IA | Infrequent, non-critical | 30 days | Instant | $ |
| Glacier Instant | Archives, accessed quarterly | 90 days | Instant | $ |
| Glacier Flexible | Archives, hours OK | 90 days | Minutes/hours | $ |
| Glacier Deep Archive | Rarely accessed | 180 days | 12-48 hours | ¢ |
| Intelligent-Tiering | Unknown pattern | None | Instant | Tier rates + monitoring fee |
Decision guide:
- Daily access → Standard
- Monthly access → Standard-IA (retrieval fee applies)
- Annual audits → Glacier Flexible Retrieval
- Compliance archives → Glacier Deep Archive
- Unpredictable pattern → Intelligent-Tiering (pay $0.0025 per 1,000 objects per month for monitoring)
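The decision guide can be sketched as a small function, together with the monitoring-fee arithmetic. The numeric thresholds in `suggest_storage_class` are assumptions chosen to mirror the guide (daily, monthly, quarterly, annual), not AWS rules, and the fee uses the figure quoted above.

```python
from decimal import Decimal
from typing import Optional

# USD per 1,000 monitored objects per month (figure quoted in this guide).
MONITORING_FEE_PER_1000 = Decimal("0.0025")


def monthly_monitoring_fee(object_count: int) -> Decimal:
    """Intelligent-Tiering monitoring charge for a given number of objects."""
    return Decimal(object_count) / 1000 * MONITORING_FEE_PER_1000


def suggest_storage_class(accesses_per_year: Optional[float]) -> str:
    """Map a rough access frequency to a storage class name.
    None means the pattern is unknown. Thresholds are illustrative."""
    if accesses_per_year is None:
        return "INTELLIGENT_TIERING"
    if accesses_per_year >= 52:   # weekly or more often
        return "STANDARD"
    if accesses_per_year >= 12:   # about monthly
        return "STANDARD_IA"
    if accesses_per_year >= 4:    # about quarterly
        return "GLACIER_IR"
    if accesses_per_year >= 1:    # annual audits
        return "GLACIER"
    return "DEEP_ARCHIVE"         # compliance archives, rarely read


print(suggest_storage_class(365))   # STANDARD
print(suggest_storage_class(12))    # STANDARD_IA
print(suggest_storage_class(None))  # INTELLIGENT_TIERING
print(monthly_monitoring_fee(10_000_000))  # monitoring cost in USD for 10M objects
```

At 10 million objects the monitoring fee works out to 25 USD per month, so Intelligent-Tiering tends to pay off for larger objects rather than for huge numbers of tiny ones.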
Lifecycle Policies
Automate storage class transitions and deletion based on age.
resource "aws_s3_bucket_lifecycle_configuration" "data_lake" {
  bucket = aws_s3_bucket.data_lake.id

  rule {
    id     = "data-lifecycle"
    status = "Enabled"

    # Empty filter: the rule applies to every object in the bucket
    filter {}

    # Move to Standard-IA after 30 days
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    # Move to Glacier Flexible Retrieval after 90 days
    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    # Delete after 365 days
    expiration {
      days = 365
    }

    # Clean up incomplete multipart uploads (their parts are billed as storage)
    abort_incomplete_multipart_upload {
      days_after_initiation = 7
    }
  }

  # Expire noncurrent versions 30 days after they are superseded,
  # but always keep the 3 newest noncurrent versions
  rule {
    id     = "version-cleanup"
    status = "Enabled"

    filter {}

    noncurrent_version_expiration {
      noncurrent_days           = 30
      newer_noncurrent_versions = 3
    }
  }
}
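To sanity-check a lifecycle rule before applying it, it can help to model it in a few lines. The sketch below mirrors the 30/90/365-day rule and the noncurrent-version rule above; the function names are hypothetical, and real transitions can lag because S3 evaluates lifecycle rules asynchronously.

```python
def storage_state(age_days: int) -> str:
    """Where an object would sit under the 30/90/365-day rule."""
    if age_days >= 365:
        return "EXPIRED"
    if age_days >= 90:
        return "GLACIER"
    if age_days >= 30:
        return "STANDARD_IA"
    return "STANDARD"


def noncurrent_versions_to_expire(days_noncurrent, keep_newest=3, min_days=30):
    """Given noncurrent versions ordered newest-first (value = days since each
    became noncurrent), return those the version-cleanup rule would expire:
    old enough AND not among the `keep_newest` most recent ones."""
    return [
        days
        for index, days in enumerate(days_noncurrent)
        if index >= keep_newest and days >= min_days
    ]


for age in (0, 29, 30, 89, 90, 364, 365):
    print(age, storage_state(age))

print(noncurrent_versions_to_expire([1, 10, 31, 45, 60]))  # [45, 60]
```

Note that the 31-day-old version survives in the last example: it is old enough to expire but is still one of the three newest noncurrent versions.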
Versioning
Versioning keeps every version of an object. It protects against accidental deletes and overwrites.
resource "aws_s3_bucket_versioning" "data" {
  bucket = aws_s3_bucket.data.id

  versioning_configuration {
    status = "Enabled"
  }
}
When you delete a versioned object, S3 adds a delete marker instead of removing the data. The object appears deleted but can be recovered by removing the delete marker.
Cost note: Versioning stores ALL versions — costs multiply quickly. Always pair versioning with lifecycle policies to expire old versions.
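Recovering a "deleted" object in a versioned bucket can be sketched as follows. The function name `undelete` is hypothetical; `s3` is assumed to be a boto3 S3 client, and the sketch ignores pagination, so it only inspects the first page of `list_object_versions` results.

```python
def undelete(s3, bucket: str, key: str) -> bool:
    """Remove the current delete marker for `key`, if there is one.
    Returns True when a marker was removed."""
    response = s3.list_object_versions(Bucket=bucket, Prefix=key)
    for marker in response.get("DeleteMarkers", []):
        # Prefix matching can return other keys, so compare the key exactly.
        if marker["Key"] == key and marker["IsLatest"]:
            # Deleting a specific version ID removes the marker itself,
            # which makes the previous version current again.
            s3.delete_object(Bucket=bucket, Key=key, VersionId=marker["VersionId"])
            return True
    return False


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   undelete(boto3.client("s3"), "my-bucket", "reports/q1-2026.pdf")
```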
S3 Security
Bucket Policies
Resource-based policies attached to the bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudFrontAccess",
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudfront.amazonaws.com"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-website-bucket/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/ABCDEFG"
        }
      }
    },
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-website-bucket",
        "arn:aws:s3:::my-website-bucket/*"
      ],
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}
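When the same TLS-only rule is needed on many buckets, generating the policy document in code avoids copy-paste mistakes in the bucket ARNs. This is a minimal sketch; the helper names `tls_only_statement` and `build_policy` are hypothetical.

```python
import json


def tls_only_statement(bucket: str) -> dict:
    """Deny every S3 action on the bucket when the request is not over HTTPS."""
    return {
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        # Both ARNs are needed: one for the bucket, one for its objects.
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }


def build_policy(*statements: dict) -> str:
    """Wrap statements in a policy document and serialize it to JSON."""
    return json.dumps({"Version": "2012-10-17", "Statement": list(statements)}, indent=2)


policy_json = build_policy(tls_only_statement("my-bucket"))
print(policy_json)

# Applying it would look like this (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("s3").put_bucket_policy(Bucket="my-bucket", Policy=policy_json)
```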
Block Public Access
Always enable for buckets that should not be public. This is now the default for new buckets.
resource "aws_s3_bucket_public_access_block" "data" {
  bucket = aws_s3_bucket.data.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
Encryption
Enable server-side encryption for all objects:
resource "aws_s3_bucket_server_side_encryption_configuration" "data" {
  bucket = aws_s3_bucket.data.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
      # Use a customer managed key (CMK) for compliance requirements:
      # kms_master_key_id = aws_kms_key.s3.arn
    }
    bucket_key_enabled = true # Can cut KMS request costs by up to 99%
  }
}
Encryption types:
- SSE-S3: AWS manages keys (AES-256). Free. No extra API calls.
- SSE-KMS: AWS KMS manages keys. Audit trail in CloudTrail. Extra cost for KMS API calls.
- SSE-C: You provide keys with each request. You manage key rotation.
- Client-side encryption: Encrypted before upload. AWS never sees plaintext.
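The three server-side options map onto different upload parameters. The sketch below builds the extra keyword arguments that would be passed to boto3's `put_object` for each mode; the function name `encryption_args` is hypothetical, and client-side encryption is left out because it happens before S3 is involved at all.

```python
from typing import Optional


def encryption_args(mode: str,
                    kms_key_id: Optional[str] = None,
                    customer_key: Optional[bytes] = None) -> dict:
    """Return put_object keyword arguments for the chosen encryption mode."""
    if mode == "SSE-S3":
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms", "BucketKeyEnabled": True}
        if kms_key_id:
            # Without a key ID, S3 falls back to the AWS managed key for S3.
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        # The same key must be supplied again on every later GET.
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode}")


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("s3").put_object(
#       Bucket="my-bucket", Key="report.pdf", Body=b"...",
#       **encryption_args("SSE-KMS", kms_key_id="alias/my-key"),
#   )
```

The key alias `alias/my-key` in the usage comment is a placeholder, not a name from this guide.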
Presigned URLs
Generate temporary URLs for secure, time-limited access:
import boto3
from botocore.config import Config

s3 = boto3.client(
    's3',
    config=Config(signature_version='s3v4'),
    region_name='us-east-1',
)

# Generate URL to download a file (valid for 1 hour)
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'reports/q1-2026.pdf'},
    ExpiresIn=3600,
)
print(f"Download URL: {url}")

# Generate URL to upload a file (POST is better for browser uploads)
presigned_post = s3.generate_presigned_post(
    'my-bucket',
    'uploads/user-avatar.png',
    Fields={'Content-Type': 'image/png'},
    Conditions=[
        ['content-length-range', 1, 5242880],  # 1 byte to 5 MiB
        {'Content-Type': 'image/png'},
    ],
    ExpiresIn=300,  # 5 minutes
)
# Returns: {'url': '...', 'fields': {...}}
# Use in browser: POST to presigned_post['url'] with presigned_post['fields']
Use cases for presigned URLs:
- Downloads of private files (such as résumés) straight from S3, without proxying them through your server
- User avatar uploads directly to S3 (no bandwidth cost on your server)
- Temporary report access for external partners
S3 Event Notifications
Trigger processing when objects are created, deleted, or restored:
resource "aws_s3_bucket_notification" "data_lake" {
  bucket = aws_s3_bucket.data_lake.id

  # Trigger Lambda when CSV files arrive
  lambda_function {
    lambda_function_arn = aws_lambda_function.csv_processor.arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = "incoming/"
    filter_suffix       = ".csv"
  }

  # Send SQS message for all deletions
  queue {
    queue_arn = aws_sqs_queue.s3_events.arn
    events    = ["s3:ObjectRemoved:*"]
  }

  depends_on = [
    aws_lambda_permission.s3_invoke,
    aws_sqs_queue_policy.s3_events
  ]
}
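On the receiving side, the Lambda function gets an event with one or more records. A minimal handler sketch follows; the name `handler` and the returned tuple shape are illustrative choices, not part of the configuration above. Object keys arrive URL-encoded in the event, so they need decoding before use.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Extract (event name, bucket, key) from each S3 notification record."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in notifications: spaces arrive as "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        processed.append((record["eventName"], bucket, key))
    return processed


sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "data-lake"},
                "object": {"key": "incoming/sales+report%282026%29.csv"},
            },
        }
    ]
}
print(handler(sample_event))
# [('ObjectCreated:Put', 'data-lake', 'incoming/sales report(2026).csv')]
```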
Static Website Hosting
S3 can host static websites (HTML, CSS, JavaScript) without a web server:
resource "aws_s3_bucket" "website" {
  bucket = "my-company-website"
}

resource "aws_s3_bucket_website_configuration" "website" {
  bucket = aws_s3_bucket.website.id

  index_document { suffix = "index.html" }
  error_document { key = "error.html" }
}

# For production: put CloudFront in front for HTTPS + CDN
resource "aws_cloudfront_distribution" "website" {
  origin {
    domain_name              = aws_s3_bucket.website.bucket_regional_domain_name
    origin_id                = "S3-${aws_s3_bucket.website.id}"
    origin_access_control_id = aws_cloudfront_origin_access_control.website.id
  }

  enabled             = true
  default_root_object = "index.html"

  default_cache_behavior {
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "S3-${aws_s3_bucket.website.id}"
    viewer_protocol_policy = "redirect-to-https"

    forwarded_values {
      query_string = false
      cookies { forward = "none" }
    }
  }

  restrictions {
    geo_restriction { restriction_type = "none" }
  }

  viewer_certificate {
    acm_certificate_arn = aws_acm_certificate.website.arn
    ssl_support_method  = "sni-only"
  }
}
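One practical detail when uploading site files from a script: each object should carry a sensible Content-Type, otherwise browsers may download pages instead of rendering them. The sketch below derives the key and content type for a local file; `upload_args` is a hypothetical helper, and the directory layout is an assumption.

```python
import mimetypes
from pathlib import PurePosixPath


def upload_args(site_root: str, file_path: str) -> dict:
    """Derive the S3 key and Content-Type for one file of a static site."""
    # The key is the file's path relative to the site root.
    key = PurePosixPath(file_path).relative_to(site_root).as_posix()
    content_type, _ = mimetypes.guess_type(file_path)
    return {"Key": key, "ContentType": content_type or "application/octet-stream"}


print(upload_args("site", "site/index.html"))
print(upload_args("site", "site/css/style.css"))

# Uploading with a real client (requires boto3 and AWS credentials):
#   import boto3
#   args = upload_args("site", "site/index.html")
#   boto3.client("s3").upload_file(
#       "site/index.html", "my-company-website", args["Key"],
#       ExtraArgs={"ContentType": args["ContentType"]},
#   )
```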
S3 Cost Optimization Checklist
□ Enable S3 Intelligent-Tiering OR implement lifecycle policies
□ Set lifecycle rule for incomplete multipart uploads (abort after 7 days)
□ Expire old versions with lifecycle (especially for log buckets)
□ Use S3 Storage Lens for visibility across all buckets
□ Enable bucket key for SSE-KMS to reduce KMS request costs by up to 99%
□ Use gateway VPC endpoints for S3 (free, and they keep S3 traffic off the NAT Gateway)
□ Compress data before uploading (reduce storage + transfer costs)
□ Use Transfer Acceleration only for cross-continent uploads (costs extra)
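□ Review unused buckets and delete stale data (an empty bucket costs nothing, but forgotten objects and old versions do)
□ Retrieve partial objects with byte-range GETs, or with S3 Select if your account still has access to it (reduces data transfer)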
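For the partial-retrieval item, a byte-range GET is the simplest option. The sketch below reads a slice of an object; `read_range` is a hypothetical helper and `s3` is assumed to be a boto3 S3 client.

```python
def read_range(s3, bucket: str, key: str, start: int, length: int) -> bytes:
    """Fetch `length` bytes of an object starting at byte offset `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    end = start + length - 1  # HTTP byte ranges are inclusive on both ends
    response = s3.get_object(Bucket=bucket, Key=key, Range=f"bytes={start}-{end}")
    return response["Body"].read()


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   header = read_range(boto3.client("s3"), "my-bucket", "big/file.parquet", 0, 1024)
```

Only the requested bytes are transferred, which matters most for large files where a reader needs just a header or footer.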
SAA-C03 S3 Exam Tips
- S3 is regional — bucket is in one region, but the namespace is global
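- Durability: 11 nines (99.999999999%) across 3+ AZs. One Zone-IA is also designed for 11 nines, but within a single AZ, so data is lost if that AZ is destroyed.
- Availability: 99.99% for Standard, 99.9% for Standard-IA, 99.5% for One Zone-IA
- S3 Object Lock: WORM (Write Once Read Many) for compliance. Locked object versions cannot be deleted or overwritten until retention expires. Requires versioning. Governance mode can be bypassed with special permissions; compliance mode cannot, even by the root user.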
- S3 Replication: Cross-Region Replication (CRR) and Same-Region Replication (SRR). Requires versioning on both buckets.
- Pre-signed URLs: Give time-limited access to private objects without changing permissions.
- Access Points: Named network endpoints, each with its own access point policy. Great for large shared datasets.
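The durability and availability percentages are easier to remember once converted into concrete numbers. The sketch below does that arithmetic with exact decimals; both function names are hypothetical, and the results are design-target averages, not guarantees.

```python
from decimal import Decimal


def expected_annual_losses(object_count: int, durability_percent: str) -> Decimal:
    """Average number of objects lost per year at the given durability."""
    loss_probability = 1 - Decimal(durability_percent) / 100
    return object_count * loss_probability


def downtime_minutes_per_year(availability_percent: str) -> Decimal:
    """Minutes of unavailability per 365-day year at the given availability."""
    return (1 - Decimal(availability_percent) / 100) * 365 * 24 * 60


losses = expected_annual_losses(10_000_000, "99.999999999")
print(losses)                             # average objects lost per year
print(1 / losses)                         # years per single lost object
print(downtime_minutes_per_year("99.99"))
print(downtime_minutes_per_year("99.9"))
```

With 10 million objects at 11 nines, the expected loss is 0.0001 objects per year, or one object every 10,000 years on average. 99.99% availability allows about 52.56 minutes of downtime per year, and 99.9% allows about 525.6 minutes (roughly 8.8 hours).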
*CloudPath Academy's Phase 1 covers S3 in the Storage Services lesson, and Phase 2 goes deeper with hands-on S3 labs covering lifecycle policies, replication, and event-driven processing.*