Capacity management for peak season: 200 bookings/day and 99.7% uptime
Travel-agency software is not evenly loaded. A Hungarian agency's annual chart shows the pattern: ~200 bookings/day from March to May and again from September to October, and ~20/day for the rest of the year. That 10x swing is the real test of the system's architecture.
Anatomy of the peak
April to May 2026 data across the full Travelium customer base:
- 6-week peak (21 March to 6 May)
- 14,800 bookings in that window
- daily average: 215 bookings, peak day 312
- peak hours: 17:00 to 19:00 (visitors book after work)
- system uptime: 99.7% (about 30 cumulative minutes of downtime)
- tail-off: from 7 May, ~25 bookings/day
A jump of roughly 8x (from ~25 to ~215 bookings/day) cannot be absorbed by scaling up at the last minute. It requires deliberate preparation.
Supplier API rate limiting
The biggest danger is not our own system but the suppliers'. Amadeus's base limit is 30 calls/second per agency. Exceed it and the agency is banned for 5 minutes. During one peak hour that means 1,500 lost searches.
The Travelium-side fix is a token bucket per supplier and per agency. Each supplier has a configured limit (Amadeus: 25 calls/second, which leaves headroom under the 30 calls/second ceiling; Sabre: 40; direct hotel APIs vary). The bucket lives in Redis and is updated with an atomic INCR + TTL workflow.
If an agency would exceed its own limit, the request returns 429 to the UI and the agent sees a discreet toast: "High concurrent search load, please retry in 3 seconds." On average a given agent sees this about 4 times per year, which is tolerable.
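To make the limiter described above concrete, here is a minimal sketch. It is not Travelium's actual code. It implements a fixed-window counter, which is the simplest thing an INCR plus a TTL can express; a full token bucket would also track refill. The `CounterStore` interface, the in-memory `MemoryStore`, the `rl:` key layout and the `checkLimit` name are all assumptions made for illustration. Against a real Redis, the increment and the TTL would need to run atomically (for example in a Lua script or a transaction).

```typescript
// Store interface mirroring the two Redis commands named in the text.
// With a real client these would map to INCR and EXPIRE (assumption).
interface CounterStore {
  incr(key: string): number;
  expire(key: string, seconds: number): void;
}

// In-memory stand-in so the sketch runs without Redis (demo only).
class MemoryStore implements CounterStore {
  private counts = new Map<string, number>();
  incr(key: string): number {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
  expire(_key: string, _seconds: number): void {
    // A real store would drop the key after the TTL. Here each window has
    // its own key name, so stale keys never affect a later window.
  }
}

// Configured limits from the text, in calls per second.
const SUPPLIER_LIMITS: Record<string, number> = { amadeus: 25, sabre: 40 };

interface Decision {
  allowed: boolean;
  status: 200 | 429;
  retryAfterSeconds: number;
}

function checkLimit(
  store: CounterStore,
  supplier: string,
  agencyId: string,
  nowMs: number,
): Decision {
  const limit = SUPPLIER_LIMITS[supplier];
  if (limit === undefined) throw new Error(`no limit configured for ${supplier}`);

  // One counter per supplier, agency and one-second window (hypothetical key layout).
  const windowSec = Math.floor(nowMs / 1000);
  const key = `rl:${supplier}:${agencyId}:${windowSec}`;

  const count = store.incr(key);
  if (count === 1) store.expire(key, 2); // first hit in the window sets the TTL

  if (count > limit) {
    // 3 seconds matches the retry hint in the toast message.
    return { allowed: false, status: 429, retryAfterSeconds: 3 };
  }
  return { allowed: true, status: 200, retryAfterSeconds: 0 };
}
```

Rejecting on our side at 25 calls/second costs the agent a short retry, whereas letting the 26th to 31st call through risks the supplier's 5-minute ban for the whole agency.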
BullMQ background jobs with priorities
Background jobs (voucher generation, invoice issuance, email send, supplier state sync) run in a BullMQ + Redis queue with three priority lanes:
- high: customer-blocking (payment confirmation, voucher generation). Target: < 5 s.
- normal: background (review-email send, statistics aggregation). Target: < 5 min.
- low: optional (monthly report generation, partner sync). Target: < 60 min.
At peak, the queue depth in the normal lane typically sits at 800 to 1,200 jobs, while high stays near zero. The 99.7% uptime comes from queue priority, not hardware.
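The three lanes can be sketched as a small lookup table. This is an illustration under assumptions, not the production configuration: the numeric priorities (1, 5, 10), the job names and the helper names `jobOptions` and `drainOrder` are hypothetical. The one BullMQ fact relied on is that a lower `priority` number is served first.

```typescript
type Lane = "high" | "normal" | "low";

interface LaneConfig {
  priority: number;      // lower number = served first (BullMQ convention)
  targetSeconds: number; // latency target from the text
}

const LANES: Record<Lane, LaneConfig> = {
  high: { priority: 1, targetSeconds: 5 },
  normal: { priority: 5, targetSeconds: 5 * 60 },
  low: { priority: 10, targetSeconds: 60 * 60 },
};

// Hypothetical job names, one per example given in the text.
const JOB_LANES: Record<string, Lane> = {
  "payment-confirmation": "high",
  "voucher-generation": "high",
  "review-email": "normal",
  "stats-aggregation": "normal",
  "monthly-report": "low",
  "partner-sync": "low",
};

// Builds the options object for a job; unknown jobs default to the normal lane.
function jobOptions(jobName: string): { priority: number } {
  const lane: Lane = JOB_LANES[jobName] ?? "normal";
  return { priority: LANES[lane].priority };
}

// Shows the order in which a set of waiting jobs would be picked up.
// The sort is stable, so jobs in the same lane keep their arrival order.
function drainOrder(waiting: string[]): string[] {
  return [...waiting].sort(
    (a, b) => jobOptions(a).priority - jobOptions(b).priority,
  );
}
```

With BullMQ, the object returned by `jobOptions` would be passed as the third argument to `queue.add(name, data, opts)`. Because a newly added high job jumps ahead of everything already waiting in normal and low, a backlog of a thousand review emails does not delay a payment confirmation.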
Pre-warming the cache
The search cache for popular packages is pre-warmed starting 4 weeks before the season. A cache-warmer cron walks the top 200 destination + date combinations hourly and pre-fetches everything from suppliers.
The payoff: at peak the average "search-to-response" time is 2.1 seconds (warm cache hit) rather than 5.5 seconds (cold lookup). The difference is perceptible to the visitor and means fewer impatient bounces.
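One pass of the cache warmer described above might look like the following sketch. The `SearchCombo` shape, the `search:` key format, the injected `fetchFromSupplier` function and the use of a plain `Map` as the cache are assumptions for illustration; a real warmer would write to the shared search cache and be triggered hourly by a scheduler.

```typescript
interface SearchCombo {
  destination: string;
  date: string;        // ISO date, e.g. "2026-04-08"
  searchCount: number; // how often this combination was searched
}

// Picks the most-searched combinations without mutating the input.
function topCombos(stats: SearchCombo[], limit = 200): SearchCombo[] {
  return [...stats]
    .sort((a, b) => b.searchCount - a.searchCount)
    .slice(0, limit);
}

// Hypothetical cache key layout.
const cacheKey = (c: SearchCombo): string =>
  `search:${c.destination}:${c.date}`;

type Fetcher = (c: SearchCombo) => Promise<unknown>;

// One warming pass: fetch each top combination and store the result.
// Fetches run one at a time so the warmer itself does not burn through
// the supplier rate limit.
async function warmCache(
  stats: SearchCombo[],
  fetchFromSupplier: Fetcher,
  cache: Map<string, unknown>,
  limit = 200,
): Promise<number> {
  let warmed = 0;
  for (const combo of topCombos(stats, limit)) {
    try {
      cache.set(cacheKey(combo), await fetchFromSupplier(combo));
      warmed++;
    } catch {
      // A failed fetch leaves any older cache entry in place; the next
      // hourly pass will try again.
    }
  }
  return warmed;
}
```

Running the fetches sequentially is a deliberate trade-off in this sketch: the pass takes longer, but it cannot trigger the supplier-side ban that the rate limiter exists to prevent.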
The April 2026 crunch
Now the real exam. 8 April, a Wednesday, 17:42, 312 concurrent searches. System metrics:
- API p95 response: 2.8 s (target: < 3 s, ✓)
- DB connection pool use: 78% (max: 100, ✓)
- Redis memory: 4.1 GB / 8 GB allocated (✓)
- Supplier 429s: 8 (Amadeus, recovered within 30 s, ✓)
- Failed bookings: 0 (✓)
That hour produced the month's biggest revenue for a 14-person Hungarian agency: 41 successful bookings, 11.2 million HUF GMV. No customer ever saw a "system temporarily overloaded" error.
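A per-minute check over metrics like the ones listed above could be sketched as follows. Only the p95 target (< 3 s) comes from the text. The 85% utilisation ceiling, the field names and the `breaches` helper are hypothetical, and reading "78% (max: 100)" as 78 of 100 connections in use is an assumption.

```typescript
interface Snapshot {
  apiP95Seconds: number;
  dbPoolInUse: number;
  dbPoolMax: number;
  redisUsedGb: number;
  redisAllocatedGb: number;
  failedBookings: number;
}

const P95_TARGET_SECONDS = 3;  // target stated in the text
const MAX_UTILISATION = 0.85;  // hypothetical alert ceiling, not from the text

// Returns the names of all checks that are out of bounds for one snapshot.
function breaches(s: Snapshot): string[] {
  const out: string[] = [];
  if (s.apiP95Seconds >= P95_TARGET_SECONDS) out.push("api-p95");
  if (s.dbPoolInUse / s.dbPoolMax > MAX_UTILISATION) out.push("db-pool");
  if (s.redisUsedGb / s.redisAllocatedGb > MAX_UTILISATION) out.push("redis-memory");
  if (s.failedBookings > 0) out.push("failed-bookings");
  return out;
}

// The 17:42 snapshot from the text, under the reading described above.
const peakSnapshot: Snapshot = {
  apiP95Seconds: 2.8,
  dbPoolInUse: 78,
  dbPoolMax: 100,
  redisUsedGb: 4.1,
  redisAllocatedGb: 8,
  failedBookings: 0,
};
```

Under these assumed thresholds, `breaches(peakSnapshot)` returns an empty list, which corresponds to the all-green row of checks reported for that hour.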
Lesson
Peak season is not a hardware purchase. Token-bucket rate limiting, BullMQ priorities, cache pre-warming and per-minute metric watching together give you the architecture to absorb the 8x swing. The 99.7% uptime behind a 14-person agency matches that of a large OTA, at two orders of magnitude less engineering spend.