AWS Lambda Tutorial: Complete Serverless Guide for Cloud Engineers (2026)
AWS Lambda is one of the most transformative services in AWS's catalog. It lets you run code without thinking about servers — no provisioning, no patching, no capacity planning. You write a function, upload it, and Lambda runs it in response to events. You pay only for what you use: the number of requests and the duration.
For cloud engineers, Lambda is essential. It powers everything from API backends to data pipelines to event-driven microservices. It features heavily on every AWS certification exam, and nearly every cloud engineering role will have Lambda somewhere in the stack.
What is AWS Lambda?
Lambda is a serverless compute service that runs your code in response to triggers:
- HTTP requests via API Gateway
- S3 object uploads (new file → process it)
- DynamoDB stream changes (database update → trigger workflow)
- SQS messages (message arrives → process it)
- EventBridge rules (every 5 minutes → run scheduled job; see the Terraform sketch after this list)
- SNS notifications, Kinesis streams, Cognito events...
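For example, wiring up the scheduled-job trigger from the list above takes three small Terraform resources. A minimal sketch, assuming a function aws_lambda_function.cleanup defined elsewhere (the names and schedule are placeholders):

# Run a Lambda function every 5 minutes via an EventBridge rule
resource "aws_cloudwatch_event_rule" "every_five_minutes" {
  name                = "every-five-minutes"
  schedule_expression = "rate(5 minutes)"
}

resource "aws_cloudwatch_event_target" "cleanup" {
  rule = aws_cloudwatch_event_rule.every_five_minutes.name
  arn  = aws_lambda_function.cleanup.arn # hypothetical function defined elsewhere
}

# EventBridge needs permission to invoke the function
resource "aws_lambda_permission" "eventbridge" {
  statement_id  = "AllowEventBridge"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.cleanup.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.every_five_minutes.arn
}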
The execution model:
- Event arrives
- Lambda spins up a container (if none available)
- Your function runs
- Container stays warm for a bit, then terminates
- You pay only for the execution time, billed in 1 ms increments (see the cost sketch below)
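To make that billing model concrete, here is a back-of-the-envelope cost calculation. The rates are assumed (roughly the published us-east-1 x86 prices of $0.20 per million requests and $0.0000166667 per GB-second, ignoring the free tier); check the current pricing page before relying on them:

# Rough Lambda cost estimate (assumed us-east-1 x86 rates; verify before use)
PRICE_PER_REQUEST = 0.20 / 1_000_000  # dollars per request
PRICE_PER_GB_SECOND = 0.0000166667    # dollars per GB-second of duration

def monthly_cost(invocations: int, avg_duration_ms: float, memory_mb: int) -> float:
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return invocations * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# 5 million invocations/month, 200 ms average duration, 512 MB memory
print(f"${monthly_cost(5_000_000, 200, 512):.2f}")  # ≈ $9.33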
Limits to know for certifications:
- Max execution time: 15 minutes
- Max memory: 10,240 MB (10 GB)
- Max package size: 50 MB zipped, 250 MB unzipped (layers count toward the unzipped limit); container images can be up to 10 GB
- Max concurrent executions: 1,000 per region (soft limit)
- /tmp storage: 512 MB by default, configurable up to 10 GB
Your First Lambda Function
Python
# handler.py
import json
import boto3
import os
def handler(event, context):
    """
    event: the triggering event (dict)
    context: Lambda runtime info (function name, timeout remaining, etc.)
    """
    print(f"Event: {json.dumps(event)}")

    # Example: process an S3 event
    if "Records" in event:
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            print(f"Processing s3://{bucket}/{key}")

            # Do your processing here
            s3 = boto3.client("s3")
            response = s3.get_object(Bucket=bucket, Key=key)
            content = response["Body"].read().decode("utf-8")
            print(f"File content length: {len(content)}")

    return {
        "statusCode": 200,
        "body": json.dumps({"message": "processed", "count": len(event.get("Records", []))})
    }
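Before deploying, you can exercise this handler locally by calling it with a hand-built S3 event. A minimal sketch, assuming handler.py above is importable and your AWS credentials can read the (placeholder) bucket and key:

# invoke_local.py - call the handler directly with a fake S3 event
from handler import handler

sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-example-bucket"},  # placeholder bucket
                "object": {"key": "uploads/report.csv"},  # placeholder key
            }
        }
    ]
}

# This handler never touches context, so None works for a local smoke test
print(handler(sample_event, None))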
Node.js
// handler.js
const { S3Client, GetObjectCommand } = require("@aws-sdk/client-s3");
const s3 = new S3Client({ region: process.env.AWS_REGION });
exports.handler = async (event, context) => {
  console.log("Event:", JSON.stringify(event));
  console.log("Remaining time (ms):", context.getRemainingTimeInMillis());

  try {
    const results = await Promise.all(
      event.Records.map(async (record) => {
        const bucket = record.s3.bucket.name;
        const key = decodeURIComponent(record.s3.object.key);
        const { Body } = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
        const content = await Body.transformToString();
        return { bucket, key, size: content.length };
      })
    );

    return {
      statusCode: 200,
      body: JSON.stringify({ processed: results }),
    };
  } catch (error) {
    console.error("Error:", error);
    throw error; // Let Lambda retry the event
  }
};
Lambda with API Gateway
The most common Lambda pattern: HTTP API backend.
# API Gateway HTTP API (newer, cheaper, lower latency than REST API)
resource "aws_apigatewayv2_api" "main" {
name = "my-api"
protocol_type = "HTTP"
cors_configuration {
allow_origins = ["https://myapp.com"]
allow_methods = ["GET", "POST", "PUT", "DELETE"]
allow_headers = ["Content-Type", "Authorization"]
max_age = 300
}
}
resource "aws_lambda_function" "api" {
filename = "api.zip"
function_name = "my-api-handler"
role = aws_iam_role.lambda_api.arn
handler = "handler.handler"
runtime = "python3.12"
timeout = 30
memory_size = 512
environment {
variables = {
DATABASE_URL = aws_ssm_parameter.db_url.value
ENVIRONMENT = "production"
}
}
# Uncomment for VPC deployment
# vpc_config {
# subnet_ids = module.vpc.private_subnets
# security_group_ids = [aws_security_group.lambda.id]
# }
}
# Allow API Gateway to invoke Lambda
resource "aws_lambda_permission" "api_gateway" {
statement_id = "AllowAPIGateway"
action = "lambda:InvokeFunction"
function_name = aws_lambda_function.api.function_name
principal = "apigateway.amazonaws.com"
source_arn = "${aws_apigatewayv2_api.main.execution_arn}/*"
}
# Route all traffic to Lambda
resource "aws_apigatewayv2_integration" "lambda" {
api_id = aws_apigatewayv2_api.main.id
integration_type = "AWS_PROXY"
integration_uri = aws_lambda_function.api.invoke_arn
payload_format_version = "2.0"
}
resource "aws_apigatewayv2_route" "default" {
api_id = aws_apigatewayv2_api.main.id
route_key = "$default"
target = "integrations/${aws_apigatewayv2_integration.lambda.id}"
}
resource "aws_apigatewayv2_stage" "production" {
api_id = aws_apigatewayv2_api.main.id
name = "$default"
auto_deploy = true
}
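On the function side, the handler.handler referenced above receives an HTTP API payload format 2.0 event with fields like rawPath and requestContext.http.method. A minimal sketch of what it might look like; the routes and response bodies are illustrative:

# handler.py - minimal router for API Gateway HTTP API (payload format 2.0)
import json

def handler(event, context):
    method = event["requestContext"]["http"]["method"]
    path = event["rawPath"]

    if method == "GET" and path == "/health":
        return {"statusCode": 200, "body": json.dumps({"status": "ok"})}

    if method == "POST" and path == "/orders":
        order = json.loads(event.get("body") or "{}")
        # ... persist the order here ...
        return {"statusCode": 201, "body": json.dumps({"received": order})}

    return {"statusCode": 404, "body": json.dumps({"error": "not found"})}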
Lambda with SQS (Event-Driven Processing)
resource "aws_sqs_queue" "orders" {
name = "orders"
visibility_timeout_seconds = 60 # Must be >= Lambda timeout
message_retention_seconds = 86400 # 1 day
# Dead Letter Queue for failed messages
redrive_policy = jsonencode({
deadLetterTargetArn = aws_sqs_queue.orders_dlq.arn
maxReceiveCount = 3 # Retry 3 times, then move to DLQ
})
}
resource "aws_sqs_queue" "orders_dlq" {
name = "orders-dlq"
message_retention_seconds = 1209600 # 14 days to investigate
}
resource "aws_lambda_event_source_mapping" "sqs_to_lambda" {
event_source_arn = aws_sqs_queue.orders.arn
function_name = aws_lambda_function.order_processor.arn
batch_size = 10 # Process up to 10 messages at once
maximum_batching_window_in_seconds = 5 # Wait up to 5s to fill a batch
function_response_types = ["ReportBatchItemFailures"] # Partial failures
}
Python handler for SQS:
import json

def handler(event, context):
    failures = []
    for record in event["Records"]:
        message_id = record["messageId"]
        try:
            body = json.loads(record["body"])
            process_order(body)  # your business logic, defined elsewhere
        except Exception as e:
            print(f"Failed to process {message_id}: {e}")
            failures.append({"itemIdentifier": message_id})

    # Return partial failures — only failed messages go back to the queue
    return {"batchItemFailures": failures}
Cold Starts
The biggest Lambda performance consideration. When Lambda needs to start a new container:
- Download your function code/container
- Initialize the runtime (Python, Node.js)
- Run your initialization code (outside the handler)
- Execute the handler
Cold start time: 100ms–2s depending on language, package size, and VPC.
Reducing cold starts:
# BAD: Imports inside the handler = reinitialize every invocation
def handler(event, context):
    import boto3              # Cold AND warm start overhead
    s3 = boto3.client("s3")   # Creates new client every time

# GOOD: Imports and clients outside handler = initialized once per container
import boto3
import os

# This runs once when the container starts, cached for subsequent invocations
s3 = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ["TABLE_NAME"])

def handler(event, context):
    # s3 and table are already initialized — fast!
    return table.get_item(Key={"id": event["id"]})
Provisioned Concurrency: keep N warm containers ready at all times. It eliminates cold starts for traffic within that capacity, but you pay for the provisioned capacity for as long as it is configured, whether or not it is invoked, so it behaves more like always-on pricing than pure pay-per-use. Use it for latency-sensitive APIs.
resource "aws_lambda_provisioned_concurrency_config" "api" {
function_name = aws_lambda_function.api.function_name
qualifier = aws_lambda_alias.production.name
provisioned_concurrent_executions = 5 # 5 warm containers always ready
}
Lambda Layers
Share code across multiple functions without duplicating it in every deployment package.
resource "aws_lambda_layer_version" "dependencies" {
layer_name = "common-dependencies"
filename = "layer.zip" # Contains /python/lib/python3.12/site-packages/
compatible_runtimes = ["python3.12"]
description = "Shared libraries: boto3, requests, pydantic, sqlalchemy"
}
resource "aws_lambda_function" "api" {
layers = [aws_lambda_layer_version.dependencies.arn]
# Now your deployment package only needs YOUR code, not the dependencies
}
Lambda Power Tuning
Lambda has a memory setting that also controls CPU proportionally. More memory = more CPU = faster execution.
Common mistake: setting 128 MB because "it's serverless and cheap." Run the AWS Lambda Power Tuning tool:
# Deploy the Power Tuning state machine via SAR
# Then run it against your function to find the optimal memory setting
# Often 512MB or 1024MB runs 3x faster and costs the same or LESS
# because total cost (duration × price per ms) drops even though the per-ms price is higher
The math: 512 MB costs 2× the per-millisecond price of 256 MB, but if the function finishes in 400 ms instead of 1,000 ms, the total cost is 2 × 0.4 = 0.8× the original, i.e. about 20% cheaper, and the response is 2.5× faster.
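Worked out in code, with an assumed duration price of about $0.0000166667 per GB-second (the exact rate varies by region and architecture):

PRICE_PER_GB_SECOND = 0.0000166667  # assumed x86 us-east-1 rate

def cost_per_invocation(memory_mb: int, duration_ms: float) -> float:
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_SECOND

slow = cost_per_invocation(256, 1000)  # ~ $0.0000042 per invocation
fast = cost_per_invocation(512, 400)   # ~ $0.0000033 per invocation
print(f"512 MB @ 400 ms costs {fast / slow:.0%} of 256 MB @ 1000 ms")  # 80%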
Lambda Best Practices
□ Keep packages small: use Layers for dependencies
□ Initialize clients outside the handler (reuse across invocations)
□ Set timeouts appropriately (default is 3s — too low for DB calls)
□ Use environment variables for config, Parameter Store for secrets
□ Enable X-Ray tracing for distributed tracing
□ Set DLQ/destination for async invocations
□ Use ReportBatchItemFailures for SQS (partial batch failures)
□ Test locally with AWS SAM or Docker
□ Use Lambda Powertools (Python/TypeScript) for logging, tracing, metrics (see the sketch after this list)
□ Avoid putting Lambda in a VPC unless you need VPC resources
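For the Powertools item above, here is a minimal sketch of structured logging, tracing, and custom metrics with Powertools for AWS Lambda (Python); the service and namespace names are placeholders:

from aws_lambda_powertools import Logger, Metrics, Tracer
from aws_lambda_powertools.metrics import MetricUnit

logger = Logger(service="orders")    # placeholder service name
tracer = Tracer(service="orders")
metrics = Metrics(namespace="MyApp", service="orders")

@logger.inject_lambda_context(log_event=True)  # structured JSON logs with request IDs
@tracer.capture_lambda_handler                 # X-Ray subsegment around the handler
@metrics.log_metrics                           # flush metrics in CloudWatch EMF format
def handler(event, context):
    logger.info("processing order", extra={"order_id": event.get("order_id")})
    metrics.add_metric(name="OrdersProcessed", unit=MetricUnit.Count, value=1)
    return {"statusCode": 200}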
Lambda vs ECS Fargate vs EC2
|  | Lambda | ECS Fargate | EC2 |
|---|---|---|---|
| Max runtime | 15 min | Unlimited | Unlimited |
| Pricing | Per ms | Per second | Per hour |
| Cold start | Yes (100ms–2s) | No | No |
| State | None | None (containers ephemeral) | Persistent |
| Scaling | Automatic (0→1000) | Minutes | Minutes |
| Best for | Event-driven, short tasks | APIs, services | Long-running, stateful |
Rule of thumb:
- < 15 minutes, event-driven, bursty: Lambda
- Always-on API, predictable traffic: ECS Fargate
- Need GPU, custom kernel, persistent storage: EC2
*Phase 2 of CloudPath Academy covers Lambda in depth — from basic handlers to production event-driven architectures with API Gateway, SQS, and Step Functions.*