Comparing EC2 vs Lambda for batch processing in ERP workloads: cost, scalability, and operational overhead

We’re redesigning our ERP batch processing architecture and debating between EC2 and Lambda. We currently run nightly batch jobs on dedicated EC2 instances (m5.2xlarge); the jobs process invoices, inventory updates, and financial reconciliation.

The jobs run 2-4 hours nightly, processing 50k-200k records. We’re looking at Lambda to reduce costs, since the compute sits idle most of the day, but we’re concerned about the 15-minute runtime limit and the operational overhead of managing thousands of concurrent Lambda invocations.

Has anyone migrated ERP batch workloads from EC2 to Lambda? What were the tradeoffs around runtime limits, cost savings, and operational complexity? Interested in real-world experiences with both approaches.

The runtime limit issue is overblown. Use Step Functions to orchestrate multiple Lambda invocations for long-running workflows. Break your 2-4 hour job into stages: data extraction (5 min), processing batches (10 min each), aggregation (5 min). Step Functions handles the coordination and retry logic. Operational overhead is actually lower than managing EC2 instances, Auto Scaling groups, and patching.
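To make the "processing batches (10 min each)" idea concrete, here is a minimal sketch of how the record range could be split so each Lambda invocation stays well under the 15-minute limit. The function name `plan_batches` and the throughput of 20 records/second are assumptions for illustration (200k records in roughly 2.8 hours works out to about that rate); measure your own job before relying on it.

```python
import math

LAMBDA_TIMEOUT_S = 15 * 60  # Lambda's hard per-invocation limit, in seconds


def plan_batches(total_records, records_per_second, target_seconds=600):
    """Split a record range into batches that should each take ~target_seconds.

    records_per_second is an assumed, measured throughput for one worker.
    """
    if target_seconds >= LAMBDA_TIMEOUT_S:
        raise ValueError("target must leave headroom under the 15-minute limit")
    batch_size = max(1, int(records_per_second * target_seconds))
    n_batches = math.ceil(total_records / batch_size)
    # Half-open ranges [start, end) so batches never overlap.
    return [
        {"start": i * batch_size, "end": min((i + 1) * batch_size, total_records)}
        for i in range(n_batches)
    ]


batches = plan_batches(total_records=200_000, records_per_second=20)
print(len(batches))  # 17 with these assumed numbers
```

A list like this could be handed to a Step Functions Map state as its input array, with one Lambda invocation per item; the state machine definition itself is not shown here.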

For ERP workloads with hard deadlines, I’d actually recommend a hybrid approach. Use EC2 Spot instances with Auto Scaling triggered on a schedule by an EventBridge rule (formerly CloudWatch Events). You get roughly 70% cost savings versus on-demand (the actual Spot discount varies by instance type and region), with no runtime limits or cold start issues. Keep a small on-demand instance as a fallback in case Spot capacity is interrupted. This gives you cost efficiency without the operational overhead of refactoring everything for Lambda’s constraints.
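A quick back-of-the-envelope sketch of where the savings come from. The hourly rate below is an assumed on-demand price for an m5.2xlarge and the 70% discount is the figure quoted above; both are placeholders, so plug in current pricing for your region.

```python
def monthly_cost(hourly_rate, hours_per_night, nights=30):
    """Cost of one instance running hours_per_night for a month of nights."""
    return hourly_rate * hours_per_night * nights


ON_DEMAND_RATE = 0.384  # assumed USD/hour for m5.2xlarge; verify for your region
SPOT_DISCOUNT = 0.70    # assumed; real Spot prices fluctuate

# Dedicated instance left running around the clock.
always_on = monthly_cost(ON_DEMAND_RATE, 24)
# Same instance, started only for a 4-hour nightly window.
scheduled_on_demand = monthly_cost(ON_DEMAND_RATE, 4)
# Scheduled window on Spot at the assumed discount.
scheduled_spot = monthly_cost(ON_DEMAND_RATE * (1 - SPOT_DISCOUNT), 4)

print(f"always on:           ${always_on:.2f}")
print(f"scheduled on-demand: ${scheduled_on_demand:.2f}")
print(f"scheduled Spot:      ${scheduled_spot:.2f}")
```

Under these assumptions, most of the saving comes from simply not running the instance for the other 20 hours (a 6x reduction); Spot then cuts the remaining bill further.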

We run similar ERP batch processing and tested both. Lambda cold starts killed performance: the first invocation took 8-12 seconds, versus sub-second for warm containers. With 200k records, that’s significant overhead. We landed on Fargate with scheduled ECS tasks. That gets you container portability, no server management, and predictable performance. For us it costs more than Lambda but less than EC2, with no runtime limit headaches.
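How much cold starts hurt depends heavily on how many records each invocation handles. A rough sketch for estimating the share of wall time lost to cold starts; the function name, the 0.05 s per-record work time, and the invocation counts are illustrative assumptions, not measurements from our system.

```python
def cold_start_overhead(n_invocations, cold_start_s, work_s_per_invocation,
                        cold_fraction=1.0):
    """Fraction of total invocation time spent in cold starts.

    cold_fraction is the assumed share of invocations that start cold.
    """
    cold_total = n_invocations * cold_fraction * cold_start_s
    work_total = n_invocations * work_s_per_invocation
    return cold_total / (cold_total + work_total)


# One invocation per record, every one cold (worst case, assumed 0.05 s of work each).
per_record = cold_start_overhead(200_000, 12, 0.05)

# 17 batched invocations of ~10 minutes each, every one cold.
batched = cold_start_overhead(17, 12, 600)

print(f"per-record: {per_record:.1%}, batched: {batched:.1%}")
```

Fine-grained invocations make the cold start penalty dominate, while large batches mostly hide it, so measure with the batch size you would actually run.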