API Performance & Optimization
Performance benchmarks, latency targets, and optimization strategies to get the best results from the Vertaa API.
Response Time Benchmarks
Alpha note: the values below are internal benchmarks and latency targets. Production telemetry will be published on the Status page once monitoring is fully enabled.
| Endpoint | Description |
|---|---|
| POST /v1/audit (basic mode) | Single-page audit |
| POST /v1/audit (deep mode) | Multi-page crawl (5 pages avg) |
| GET /v1/audit/:job_id | Retrieve audit result |
| GET /v1/usage | Account usage statistics |
Note: Response times exclude network latency between your server and our API. Measurements are taken at our edge network. Add ~20-100ms for typical internet latency.
Throughput Limits
| Plan | Audit Quota | Concurrent Requests | Burst Limit |
|---|---|---|---|
| Free | 3 audits/day | 1 | 5/min |
| Pro | 1,000 audits/month | 2 | 30/min |
| Enterprise | Custom (10k-1M+) | 10 | 100/min |
Rate Limit Headers
All responses include standard rate limit headers:
```
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1736553600
```
When the rate limit is exceeded, you'll receive a 429 Too Many Requests response with a Retry-After header.
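As a sketch of how a client might react to these headers, the helper below computes a pause before the next request. The header names match the docs above; the low-water threshold of 5 remaining requests is an assumption, not a documented value.

```javascript
// Decide how long to pause (in ms) before the next request,
// based on the response status and rate-limit headers.
function backoffMs(status, headers) {
  if (status === 429) {
    // Honor Retry-After (in seconds) when the limit is exceeded
    const retryAfter = Number(headers['retry-after'] ?? 1);
    return retryAfter * 1000;
  }
  // Slow down pre-emptively when the window is nearly exhausted
  // (the threshold of 5 is an illustrative choice)
  const remaining = Number(headers['x-ratelimit-remaining'] ?? Infinity);
  return remaining < 5 ? 1000 : 0;
}
```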
Optimization Strategies
1. Use Basic Mode When Possible
Basic mode is 10x faster than deep mode (800ms vs 8s median). Only use deep mode when you need multi-page crawling.
2. Cache Audit Results
Audit results rarely change once completed. Cache them aggressively:
- Cache completed audits for 1-24 hours depending on update frequency
- Use ETags or Last-Modified headers for conditional requests
- Store job_id mappings to avoid duplicate audits for same URL
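A minimal sketch of the job_id-keyed cache described above, using an in-memory Map with a TTL. The default TTL of 1 hour is an illustrative choice within the 1-24 hour range; swap in Redis or an LRU cache for production use.

```javascript
// In-memory TTL cache keyed by job_id, so completed audits
// are not re-fetched while still fresh.
const auditCache = new Map();

function cacheSet(jobId, result) {
  auditCache.set(jobId, { result, storedAt: Date.now() });
}

function cacheGet(jobId, ttlMs = 60 * 60 * 1000) { // default: 1 hour
  const entry = auditCache.get(jobId);
  if (!entry) return null;
  if (Date.now() - entry.storedAt > ttlMs) {
    auditCache.delete(jobId); // entry expired, evict it
    return null;
  }
  return entry.result;
}
```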
3. Use Webhooks Instead of Polling
Polling wastes API quota and adds latency; webhooks deliver results as soon as the audit completes:

Polling:
- 30+ API calls per audit
- Delayed results (2-5s latency)
- Wastes quota on status checks

Webhooks:
- 1 API call per audit
- Instant delivery (<500ms)
- No quota wasted
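A receiving endpoint can stay tiny: acknowledge fast, then process asynchronously. The dispatcher below is a sketch; the `event` field name and the `audit.completed` event are assumptions about the payload shape, not the documented webhook schema.

```javascript
// Route an incoming webhook payload to the matching handler.
// Always return 200 quickly; do heavy processing asynchronously.
function handleWebhook(payload, handlers) {
  const handler = handlers[payload.event];
  if (!handler) return { status: 200, handled: false }; // ack unknown events
  handler(payload);
  return { status: 200, handled: true };
}
```

Mount this behind a POST route (e.g. `/webhooks/vertaa`) in whatever framework you use, and return the `status` field as the HTTP response code.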
4. Batch Operations
For multiple URLs, batch requests with delays to avoid rate limits:
```javascript
// Bad: all at once (will hit the rate limit)
await Promise.all(urls.map(url => createAudit(url)));

// Good: sequential with a delay
for (const url of urls) {
  await createAudit(url);
  await sleep(1000); // 1 second between requests
}

// Better: concurrent with a limit
import pLimit from 'p-limit';

const limit = pLimit(2); // max 2 concurrent
await Promise.all(
  urls.map(url => limit(() => createAudit(url)))
);
```

5. Use NDJSON Streaming for Large Results
For audits with 100+ issues, use the streaming endpoint to avoid timeouts:
```
// Stream large result sets
GET /v1/audit/{job_id}/issues/stream

// Filters available:
?severity=critical,high
?fields=message,selector,impact
?page=1&per_page=50
```

6. Optimize Page Weight
Audit speed correlates with page load time. Optimize your pages:
- Minimize JavaScript bundle size (<500KB recommended)
- Lazy load images and heavy resources
- Use CDN for static assets
- Reduce external dependencies (fonts, analytics)
Best Practices
Implement Exponential Backoff
When polling, use exponential backoff: start at 1s, increase to 2s, 4s, 8s (max 10s). Prevents hammering the API during slow audits.
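The schedule above (1s, 2s, 4s, 8s, capped at 10s) can be sketched as a one-liner; `attempt` is zero-based.

```javascript
// Exponential backoff for polling: delay doubles each attempt,
// starting at 1s and capped at 10s.
function pollDelayMs(attempt) {
  return Math.min(1000 * 2 ** attempt, 10_000);
}
```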
Set Reasonable Timeouts
Basic mode: 30s timeout. Deep mode: 120s timeout. Handle timeouts gracefully and retry with exponential backoff.
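One way to enforce these timeouts is an AbortController on the request, as sketched below. The host `api.vertaa.example` is a placeholder, not the documented base URL; the timeout values come from the guidance above.

```javascript
// Pick the timeout for an audit request based on mode.
function timeoutForMode(mode) {
  return mode === 'deep' ? 120_000 : 30_000; // deep: 120s, basic: 30s
}

// Enforce the timeout with AbortController; the fetch rejects
// with an AbortError if the deadline passes.
async function createAuditWithTimeout(url, mode = 'basic') {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutForMode(mode));
  try {
    return await fetch('https://api.vertaa.example/v1/audit', {
      method: 'POST',
      signal: controller.signal,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ url, mode }),
    });
  } finally {
    clearTimeout(timer); // always clear, whether it succeeded or aborted
  }
}
```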
Monitor Rate Limit Headers
Always check X-RateLimit-Remaining before making requests. Queue requests client-side when limits are low.
Use Conditional Requests
Send If-None-Match with ETag from previous response. Get 304 Not Modified when result hasn't changed, saving bandwidth.
Parallelize Independent Requests
Different endpoints (GET /usage, GET /audit/:id) can run in parallel. Respect concurrent request limits for your tier.
Deduplicate Requests
Track in-flight audit requests to prevent duplicate audits for the same URL. Use job_id as idempotency key.
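The in-flight tracking described above can be sketched with a Map of pending promises, so concurrent callers for the same URL share one request. `createAudit` here stands in for your own API wrapper; it is an assumed helper, not part of a documented SDK.

```javascript
// Share one in-flight request per URL to prevent duplicate audits.
const inFlight = new Map();

function auditOnce(url, createAudit) {
  if (!inFlight.has(url)) {
    const pending = Promise.resolve(createAudit(url))
      .finally(() => inFlight.delete(url)); // free the slot when settled
    inFlight.set(url, pending);
  }
  return inFlight.get(url); // every caller gets the same promise
}
```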
Language-Specific Tips
Node.js / TypeScript
- Use `undici` or `node-fetch` for faster HTTP
- Reuse TCP connections with `keepAlive: true`
- Use `p-limit` for concurrent request control
- Cache with Redis or an in-memory LRU cache

Python
- Use `httpx` with async/await for parallel requests
- Enable connection pooling with `limits=httpx.Limits(max_keepalive_connections=10)`
- Use `asyncio.Semaphore` to limit concurrency
- Cache with Redis or `cachetools`

Ruby
- Use `faraday` with persistent connections
- Enable HTTP/2 with the `faraday-http2` adapter
- Use `concurrent-ruby` for parallel requests
- Cache with `Rails.cache` or Memcached