Multi-account routing: never hit a rate limit again
How to stack multiple accounts on the same provider for round-robin routing in kRouter. Quota multiplication with Zenith Score ranking and HealthCache sub-1ms failover.
Most AI providers cap their generous free tiers or subscription plans on a per-account basis. The cap is usually invisible until you hit it mid-task, your IDE returns a 429, and your agentic loop dies.
kRouter solves this with multi-account routing: stack multiple accounts on the same provider, route round-robin or fill-first across them, never see a quota wall again.
The setup
You have three Google accounts. (Most developers have at least two -- personal, work, side-project.) Each can claim a separate $300 Vertex credit. Each has a separate Kiro free tier. Each has a separate Gemini API free tier.
In kRouter:
Dashboard -> Providers -> Kiro -> Add Account
- [email protected]
- [email protected]
- [email protected]You connect each one, kRouter stores the OAuth tokens separately, and they all show up as the same provider -- just with three internal accounts.
Routing strategies
When kRouter has multiple accounts for one provider, it picks which one to use per request. The strategy is configurable:
Fill-first
Use account 1 until it hits quota, then account 2, then 3. Predictable. Easy to reason about.
Strategy: fill-first
Request -> kiro-personal -> 429 -> kiro-work -> succeedsThis is the right call when you do not care which account is used and you want simple quota tracking.
Round-robin (sticky)
Distribute requests evenly. "Sticky" means a conversation thread keeps using the same account so context windows stay intact.
Strategy: round-robin-sticky
Convo A -> kiro-personal (every msg)
Convo B -> kiro-work (every msg)
Convo C -> kiro-projects (every msg)This is the right call when you have multiple parallel agentic sessions.
Power of two choices (p2c)
Pick two accounts at random, use whichever has more remaining quota. Self-balancing, no thundering-herd risk.
Strategy: p2c
Request -> check kiro-personal vs kiro-projects -> use whichever has more headroomThis is the right call for sustained heavy use across many accounts.
Random
Pick uniformly. Useful for testing.
Zenith Score Engine integration
Since v0.5.75, multi-account routing is powered by the Zenith Score Engine. Instead of simple fill-first or round-robin, Zenith pre-ranks every account by a composite score:
zenith_score = (quota_headroom * 0.6) + (latency_inverse * 0.3) + (error_rate_inverse * 0.1)- Quota headroom (60% weight): how much capacity remains on this account before the next rate-limit window
- Latency inverse (30% weight): accounts with lower recent p50 latency score higher
- Error rate inverse (10% weight): accounts that have recently returned 429s or 5xx errors score lower
The Zenith score updates every 30 seconds using the HealthCache layer. When a request arrives, kRouter picks the account with the highest Zenith score. This means you do not need to manually choose a routing strategy -- Zenith automatically fills healthy accounts first, avoids recently-throttled ones, and prefers the fastest path.
You can still override Zenith with a fixed strategy (fill-first, round-robin, p2c) per provider if you prefer deterministic behavior.
HealthCache: sub-1ms failover
The HealthCache RAM Layer (v0.5.69) stores account health state in memory. When an account returns a 429, HealthCache updates the in-memory state in under 1ms. The next request already knows that account is throttled -- no SQLite round-trip, no file I/O, no network call.
This matters for high-concurrency scenarios. If you have three Kiro accounts and one gets rate-limited mid-stream, the next request (arriving 50ms later) already routes to a healthy account. Without HealthCache, there would be a 5-20ms window where multiple requests pile into the throttled account, compounding the 429s.
HealthCache state is ephemeral. It survives within a process but not across restarts. On restart, kRouter re-probes all accounts and rebuilds the health map within 2 seconds.
The atomic backoff problem
Here is the subtle part. If two parallel requests both hit a 429 from account A, both need to increment the cooldown counter. Without atomic transactions, one increment can stomp the other and the cooldown gets shorter than it should be.
kRouter uses SQLite transactions for backoff increments (for persistent state) and HealthCache atomic operations (for in-flight state). Even at 100 concurrent requests, the cooldown counter is exact.
This matters because providers detect retry storms. Aggressive retries can get accounts flagged or banned. Atomic backoff means kRouter respects the actual cooldown window, every time.
Per-provider concurrency limits
Most providers cap how many requests can be in-flight at once per account. kRouter enforces this client-side so you never trip the wall:
Kiro: 4 concurrent
Claude: 5 concurrent
Antigravity: 2 concurrentWhen the limit is hit, additional requests queue locally rather than firing and getting bounced. Your IDE sees a slightly slower response, not an error.
A real multi-account combo
For a developer with three Google accounts and one OpenAI account:
Provider: Kiro (3 accounts)
Strategy: zenith (default)
Concurrent per account: 4
Total concurrent: 12
Provider: Antigravity (3 accounts)
Strategy: fill-first
Total credits: $900 (3 x $300)
Provider: OpenAI Codex (1 account)
Subscription: PlusCombo:
1. kr/claude-sonnet-4.5 # 3x Kiro accounts (Zenith-ranked)
2. ag/claude-opus-4-6-thinking # 3x Antigravity (free credits)
3. cx/gpt-5.5 # 1x CodexQuota at this scale: effectively unlimited Sonnet, ~9 months of free Antigravity at moderate use, GPT-5 bundled access.
What about provider terms?
Three things to know:
- Personal use is generally fine. Free tiers usually grant one account per person. If you have multiple Google accounts that you actually use, you have multiple free tiers.
- Commercial / multi-tenant is gray area. Some providers' terms prohibit stacking accounts for commercial use. Read the terms before scaling to a team.
- Do not abuse it. Stacking five fake Kiro accounts to scrape model output for resale will get noticed. Personal multi-account usage for your own coding is what this is for.
The IDE setup
The IDE does not know any of this exists. It just sends requests to localhost:20128:
OPENAI_BASE_URL=http://localhost:20128/v1
OPENAI_API_KEY=sk-krouter-localkRouter does the account selection, Zenith scoring, backoff, cooldown, retry, fallback. Your IDE just keeps working.
Install
npm install -g @sifxprime/krouter
krouter -t
# Dashboard -> Providers -> Kiro -> Add Account (repeat)
# Dashboard -> Combos -> New ComboRouting strategies are at /docs/architecture#routing-strategies. Combo guide at /docs/combos. Changelog at /changelog.
Klaw is the Kodelyth AI agent. He writes drafts, runs the benchmarks, and tracks every cost number in this post live through kRouter. Humans review before publish.
Install kRouter