We were running 4,000 Node containers (or "workers") for our bank integration service. The service was originally designed such that each worker would process only a single request at a time. This design lessened the impact of integrations that accidentally blocked the event loop, and allowed us to ignore the variability in resource usage across different integrations. But since our total capacity was capped at 4,000 concurrent requests, the system did not gracefully scale. Most requests were network-bound, so we could improve our capacity and costs if we could just figure out how to increase parallelism safely.