2025-07-06 • 15 min read
System Architecture

Lambda Considered Harmful

How AWS Lambda Turns Simple Problems Into Distributed Systems Nightmares

Look, I get it. You went to re:Invent. You saw Werner Vogels descend from the heavens on a cloud of distributed systems buzzwords. You watched him demo a serverless architecture that could process the entire Library of Congress in 30 milliseconds while simultaneously teaching itself Mandarin. You came back to your five-person startup with visions of infinitely scalable functions dancing in your head, convinced that your weekly user metrics email MUST be deployed as seventeen separate Lambda functions orchestrated by Step Functions, because what if—hear me out—what if you suddenly need to send 10 billion emails per second?
Congratulations, you've just spent six weeks building what a junior developer could have written as a cron job in 20 minutes. But hey, at least it's "web scale."

The Seductive Lie of Serverless

AWS Lambda is a solution to a problem that approximately 0.01% of companies on Earth actually have. It's like buying a Formula 1 race car to commute to your job at the strip mall. Sure, it CAN go 200mph, but you're still stuck in traffic on the 405 like the rest of us schmucks.
The sales pitch is intoxicating:
"You only pay for what you use!" Translation: You'll receive a 47-page billing statement that requires a team of forensic accountants to decipher. You'll discover that the "free tier" is like the free bread at Olive Garden: technically free, but you're paying $89 for the privilege of sitting at the table. Between API Gateway ($3.50 per million requests, which sounds cheap until you realize your health checks alone are costing you a mortgage payment), CloudWatch Logs (did you know AWS charges you to READ YOUR OWN LOGS?), and the seventeen other services you need just to make the damn thing work, you're basically funding Jeff Bezos's next penis rocket.
"It scales automatically!" Translation: It will scale to infinity faster than you can say "recursive function without a base case." Remember that time you forgot to add a condition to your while loop? Well, now that mistake costs $47,000 instead of just making your laptop fan sound like a jet engine. AWS will happily let you DDoS yourself into bankruptcy. They'll even send you a helpful email about it... three days later.
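For the morbidly curious, the arithmetic of that self-inflicted DDoS is easy to sketch. Everything below is an assumption for illustration: ballpark pricing, the default 1,000-execution concurrency cap, and a three-day detection lag.

```python
# A hedged sketch of how a runaway Lambda bill compounds.
# Assumed pricing (check current AWS rates): $0.0000166667 per GB-second,
# $0.20 per million invocations. Assumed runaway: a misfired recursive
# function pinning the default 1,000 concurrent executions at 1GB each
# for the three days it takes anyone to notice.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_MILLION_INVOKES = 0.20
CONCURRENCY = 1_000
MEMORY_GB = 1
SECONDS = 3 * 24 * 3600
AVG_INVOKE_SECONDS = 0.1  # each call runs 100ms, then re-invokes itself

# Compute cost: concurrency * memory * wall-clock seconds
gb_seconds = CONCURRENCY * MEMORY_GB * SECONDS
# Request cost: how many 100ms invocations fit in that window
invokes = CONCURRENCY * SECONDS / AVG_INVOKE_SECONDS
bill = (gb_seconds * PRICE_PER_GB_SECOND
        + invokes / 1e6 * PRICE_PER_MILLION_INVOKES)
print(f"{invokes:,.0f} invocations, ${bill:,.2f} before anyone notices")
```

Under these made-up but plausible numbers, one missing base case is a few thousand dollars of pure heat.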
"No servers to manage!" Translation: Instead of managing one server that you understand, you now manage:
  • IAM roles (a permission system designed by someone who clearly hates both humans and computers)
  • VPC configurations (because your function needs its own private network, apparently)
  • Cold starts (your function taking a nice 5-second nap before doing anything useful)
  • Execution timeouts (15 minutes max, because fuck your data processing job)
  • Memory limits (CPU is welded to memory, so you overprovision RAM just to buy compute, up to a whopping 10GB)
  • Deployment packages (ZIP files capped at 50MB zipped and 250MB unzipped, unless you use layers, which are just ZIP files with extra steps)
  • Environment variables (displayed in plaintext in the console unless you bolt on KMS encryption yourself)
  • Dead letter queues (for when your functions die, which they will)
  • And approximately 73 other things that will break at 3 AM on a Saturday
You haven't eliminated complexity. You've taken it, put it in a blender with some uranium, and scattered the radioactive smoothie across 19 different AWS services.

The "I Watched a YouTube Tutorial" Architecture

Here's what happens. You need to send a welcome email when someone signs up. In the before times, you'd write:
```python
def handle_signup(user):
    save_user_to_db(user)
    send_welcome_email(user.email)
    return "Welcome!"
```
But no, that's not "cloud native" enough. After watching 47 hours of AWS tutorials narrated by someone who sounds like they're being held hostage, you build this monstrosity:
  1. API Gateway triggers Lambda Function #1
  2. Lambda #1 validates the input (because you don't trust API Gateway's validation)
  3. Lambda #1 publishes to SNS Topic
  4. SNS triggers Lambda #2 which writes to DynamoDB
  5. DynamoDB Stream triggers Lambda #3
  6. Lambda #3 publishes to SQS
  7. Lambda #4 polls SQS (yes, POLLS, because push is too mainstream)
  8. Lambda #4 calls SES to send the email
  9. Lambda #5 logs the event to CloudWatch
  10. Step Functions orchestrates all of this
  11. CloudFormation deploys it (after 45 minutes of "CREATE_IN_PROGRESS")
You've replaced 3 lines of code with 11 services, 5 functions, and 2,000 lines of YAML. If any single component fails, good fucking luck figuring out which one. Your CloudWatch logs are scattered across five different log groups, your traces require a PhD in distributed systems to understand, and your new developer onboarding now includes a week-long course on "How We Send Emails."

The Cron Job That Ate Cincinnati

But the crown jewel of Lambda abuse is scheduled tasks. You need to generate a report every night at midnight. In the old world:
```bash
0 0 * * * /usr/bin/python /home/app/generate_report.py
```
Done. It runs. If it fails, cron emails you. You can see the output. Life is good.
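For reference, the script behind that cron line can be as dull as this sketch (the users table, schema, and report format are all invented):

```python
# generate_report.py -- the entire "nightly report pipeline".
# Sketch only: swap sqlite3 for whatever database you actually run.
import datetime
import sqlite3


def generate_report(db_path="app.db"):
    """Return the one-line nightly report that cron will mail you."""
    conn = sqlite3.connect(db_path)
    try:
        (count,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
    finally:
        conn.close()
    return f"{datetime.date.today()}: {count} users"

# cron runs this file nightly; anything printed lands in your inbox
```

No IAM, no VPC, no cold starts. If it throws, cron mails you the traceback.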
But in Lambda Land? Oh boy:
  1. Create a CloudWatch Event Rule (which is now called EventBridge because AWS loves renaming things)
  2. Write your Lambda function
  3. Package it with its dependencies (hope they're under 250MB!)
  4. Set up IAM roles (minimum 6 hours of googling required)
  5. Configure VPC access (because your Lambda needs to talk to RDS)
  6. Set up a NAT Gateway (that'll be $45/month, thanks)
  7. Add error handling (remember, no retry by default!)
  8. Set up DLQ for failures
  9. Create CloudWatch alarms
  10. Build a custom dashboard to see what the fuck is happening
  11. Implement your own retry logic with exponential backoff
  12. Add X-Ray tracing to debug the inevitable failures
  13. Set up SNS notifications for failures
  14. Create a separate Lambda to process the failures
  15. Realize you've built a distributed system to run SELECT COUNT(*) FROM users
Six months later, you've spent $10,000 and 400 engineering hours to run a SQL query once a day. But hey, it's "serverless"!
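Item 11 from that list is a small engineering project all by itself. Here's a minimal sketch of retry-with-exponential-backoff-and-jitter; the names and defaults are illustrative, not any AWS API:

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0):
    """Retry fn() with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller see the failure
            # Double the wait each attempt, cap it, and add jitter so a
            # crowd of retriers doesn't wake up at the same instant.
            delay = min(base_delay * 2 ** attempt, max_delay)
            time.sleep(delay * random.uniform(0.5, 1.0))
```

Ten lines in your worker, or five Lambda functions and a Step Functions state machine. Your call.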

Lambda's Dirty Little Secrets

Let me tell you what the AWS marketing team won't:
Cold Starts Are Real and They Will Hurt You: Your function hasn't run in 10 minutes? Enjoy a 5-second cold start while AWS spins up a container, loads your runtime, initializes your code, and contemplates the meaning of life. Your users just wanted to reset their password, not experience the digital equivalent of waiting for Windows 95 to boot.
15-Minute Timeout Maximum: Got a job that takes 20 minutes? Tough shit. Split it into multiple functions and pray they don't fail halfway through. AWS assumes your workloads are all quick API responses, not actual work.
Debugging Is a Nightmare: Good luck reproducing that production issue locally. You'll need to simulate API Gateway, set up local DynamoDB, mock IAM roles, and sacrifice a goat to the AWS gods. By the time you've reproduced the issue, you could have rewritten the entire thing as a normal application.
Vendor Lock-in Like You've Never Seen: Think you'll migrate off AWS Lambda someday? HAHAHAHA. Your code is more tightly coupled to AWS than Jeff Bezos's bank account. You're not writing Python/Node/Java anymore—you're writing AWS Lambda Python/Node/Java, which is about as portable as a concrete submarine.

Just Use a Fucking Job Queue

Here's a revolutionary idea that'll blow your mind: use a job queue. You know, that technology we've had since the dawn of computing that actually works.
Want to run a background job? Here's the entire architecture:
```python
# Add a job
queue.push({"task": "send_email", "to": "user@example.com"})

# Worker picks it up
while True:
    job = queue.pop()
    process_job(job)
```
That's it. That's the whole thing. It's so simple it's almost embarrassing.
  • Failed job? It's right there in your failed_jobs table
  • Want to retry? Move it back to pending
  • Need logs? They're in one fucking place
  • Want to debug? Run the worker locally
  • Need to scale? Run more workers
  • Want monitoring? One dashboard shows everything
You can build this in an afternoon with:
  • Python: Celery (battle-tested since 2009)
  • Ruby: Sidekiq (so good it makes you want to use Ruby)
  • Node: BullMQ (Redis-backed and rock solid)
  • Go: machinery, asynq, or just channels because Go is beautiful
  • Literally any language: A for loop and a database table
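That last bullet isn't a joke. Here's a sketch of the whole thing on sqlite3; the schema and job states are my own invention:

```python
import json
import sqlite3


def make_queue(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS jobs (
        id INTEGER PRIMARY KEY,
        payload TEXT,
        status TEXT DEFAULT 'pending',
        attempts INTEGER DEFAULT 0)""")
    return db


def push(db, payload):
    db.execute("INSERT INTO jobs (payload) VALUES (?)",
               (json.dumps(payload),))
    db.commit()


def pop(db):
    row = db.execute(
        "SELECT id, payload FROM jobs WHERE status = 'pending' LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    db.execute("UPDATE jobs SET status = 'running', attempts = attempts + 1 "
               "WHERE id = ?", (row[0],))
    db.commit()
    return row[0], json.loads(row[1])


def finish(db, job_id, ok=True):
    # Failed jobs just sit in the table -- that's the whole monitoring story.
    db.execute("UPDATE jobs SET status = ? WHERE id = ?",
               ("done" if ok else "failed", job_id))
    db.commit()
```

A real multi-worker deployment would claim jobs inside a transaction (or `SELECT ... FOR UPDATE SKIP LOCKED` on Postgres) to avoid double-processing, but the shape stays this small.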
But no, that's too simple. You need to embrace hyperscale serverless edge-compute. (I just gave a cloud architect a nosebleed.)

When Lambda Actually Makes Sense

Fine, I'll admit it. There's exactly ONE scenario where Lambda makes sense:
You have completely unpredictable, massively spiky traffic that goes from 0 to 100,000 requests per second and back to 0.
Examples:
  • Voting for reality TV shows
  • Processing Black Friday transactions
  • Handling the traffic when Elon tweets about your app
  • Running the backend for Pokémon Go on launch day
  • Processing uploads when a celebrity's nudes leak
Notice what's NOT on this list:
  • Your SaaS app's PDF generator
  • Your daily report cron job
  • Your image resizing service
  • Your webhook processor
  • Your API that gets 100 requests per minute
  • Literally 99.9% of all workloads on Earth
For everyone else, your t3.medium EC2 instance running a job queue will handle your workload just fine. It'll cost 1/10th as much, be 100x easier to debug, and won't require a PhD in AWS to operate.

The Real Cost

The real cost of Lambda isn't the AWS bill (though that's bad enough). It's the complexity you've added to your system. It's the 3 AM pages when CloudWatch Events mysteriously stops triggering. It's the junior developer who takes three weeks to understand how to add a field to that email. It's the fact that your "simple" email sender now has 17 points of failure.
You're not Google. You're not Netflix. You're a startup with 1,000 users and dreams of greatness. Your cron job doesn't need to be a distributed system. Your background task doesn't need to be event-driven. Your architecture doesn't need to look like the result of a conference talk titled "Serverless at Scale: How We Process 50 Billion Events Per Second Using Only AWS Services and Pure Determination."

In Conclusion: Stop It

Put down the AWS console. Step away from the Lambda function. Take a deep breath.
Now repeat after me:
"I do not need serverless for my cron jobs. A simple job queue is fine. I am not operating at Google scale. My 50 daily active users do not require a distributed architecture. I will not architect for problems I don't have. Servers are not evil. YAML is not a programming language. I will stop reading Hacker News comments about system design."
Your future self will thank you. Your coworkers will thank you. Your AWS bill will thank you.
And somewhere, in a data center far away, a lonely EC2 instance running a simple job queue will smile, knowing it's doing honest work in a world gone mad with complexity.
Now if you'll excuse me, I need to go debug why my Lambda function is timing out. It's probably the VPC configuration. Or the IAM role. Or the security group. Or sunspots. With Lambda, who the fuck knows?