
API Performance & Optimization

Performance benchmarks, latency targets, and optimization strategies to get the best results from the Vertaa API.

Response Time Benchmarks

Alpha note: the values below are internal benchmarks and latency targets. Production telemetry will be published on the Status page once monitoring is fully enabled.

POST /v1/audit (basic mode)

Single-page audit

Target: <2s p95
p50 (median): 800ms
p95: 1.8s
p99: 3.5s

POST /v1/audit (deep mode)

Multi-page crawl (5 pages avg)

Target: <30s p95
p50 (median): 8s
p95: 25s
p99: 45s

GET /v1/audit/:job_id

Retrieve audit result

Target: <100ms p95
p50 (median): 35ms
p95: 85ms
p99: 150ms

GET /v1/usage

Account usage statistics

Target: <50ms p95
p50 (median): 15ms
p95: 42ms
p99: 80ms

Note: Response times exclude network latency between your server and our API. Measurements are taken at our edge network. Add ~20-100ms for typical internet latency.

Throughput Limits

API throughput limits and concurrent request limits by plan tier

Plan        Audit Quota          Concurrent Requests  Burst Limit
Free        3 audits/day         1                    5/min
Pro         1,000 audits/month   2                    30/min
Enterprise  Custom (10k-1M+)     10                   100/min

Rate Limit Headers

All responses include standard rate limit headers:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1736553600

When the rate limit is exceeded, you'll receive a 429 Too Many Requests response with a Retry-After header.
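
One way to act on these headers is sketched below. The helper names readRateLimit and retryDelayMs are hypothetical and not part of any official SDK. The sketch assumes X-RateLimit-Reset is a Unix timestamp in seconds (as the sample value suggests) and that Retry-After carries a number of seconds; neither detail is confirmed above.

```javascript
// Hypothetical helpers for reading the rate limit headers shown above.
// `headers` is any object with a get(name) method, such as fetch's Response.headers.
function readRateLimit(headers) {
  return {
    limit: Number(headers.get('X-RateLimit-Limit')),
    remaining: Number(headers.get('X-RateLimit-Remaining')),
    // Assumption: the reset value is a Unix timestamp in seconds.
    resetAt: new Date(Number(headers.get('X-RateLimit-Reset')) * 1000),
  };
}

// Milliseconds to wait after a 429 response.
// Assumption: Retry-After is given in seconds. HTTP also allows a date form,
// which this sketch does not parse and treats as "use the fallback".
function retryDelayMs(headers, fallbackMs = 1000) {
  const seconds = Number.parseFloat(headers.get('Retry-After'));
  return Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : fallbackMs;
}
```

A caller would typically check response.status === 429, wait retryDelayMs(response.headers), and then retry the request.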

Optimization Strategies

1. Use Basic Mode When Possible

Basic mode is 10x faster than deep mode (800ms vs 8s median). Only use deep mode when you need multi-page crawling.

Recommendation: Start with basic mode for landing pages, and reserve deep mode for full-site audits during CI/CD builds.

2. Cache Audit Results

Audit results rarely change once completed. Cache them aggressively:

  • Cache completed audits for 1-24 hours depending on update frequency
  • Use ETags or Last-Modified headers for conditional requests
  • Store job_id mappings to avoid duplicate audits for the same URL
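
The caching advice above could be sketched as a small in-memory cache keyed by audited URL. AuditCache is a hypothetical helper, the one-hour default TTL is simply the low end of the 1-24 hour range, and a production setup would more likely use Redis or an LRU cache.

```javascript
// Minimal in-memory TTL cache mapping an audited URL to its job_id and result.
// Hypothetical helper for illustration only.
class AuditCache {
  constructor(ttlMs = 60 * 60 * 1000, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;            // injectable clock, handy for tests
    this.entries = new Map();  // url -> { jobId, result, expiresAt }
  }

  // Store a completed audit for this URL.
  set(url, jobId, result) {
    this.entries.set(url, { jobId, result, expiresAt: this.now() + this.ttlMs });
  }

  // Return the cached entry, or undefined if it is missing or expired.
  get(url) {
    const entry = this.entries.get(url);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(url); // evict stale entries lazily
      return undefined;
    }
    return entry;
  }
}
```

Before creating an audit, check cache.get(url); on a hit, reuse entry.result (or entry.jobId) and skip the API call.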

3. Use Webhooks Instead of Polling

Polling wastes API quota and adds latency. Webhooks deliver results instantly:

❌ Polling
  • 30+ API calls per audit
  • Delayed results (2-5s latency)
  • Wastes quota on status checks

✅ Webhooks
  • 1 API call per audit
  • Instant delivery (<500ms)
  • No quota wasted

4. Batch Operations

For multiple URLs, batch requests with delays to avoid rate limits:

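// Requires: npm install p-limit
import pLimit from 'p-limit';

// Small promise-based delay helper
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Bad: fires every request at once and trips the rate limit
await Promise.all(urls.map(url => createAudit(url)));

// Good: sequential with a delay between requests
for (const url of urls) {
  await createAudit(url);
  await sleep(2000); // 2 s spacing stays within a 30/min burst limit (Pro)
}

// Better: concurrent with a cap that matches your plan
const limit = pLimit(2); // Max 2 concurrent (Pro)
await Promise.all(
  urls.map(url => limit(() => createAudit(url)))
);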

5. Use NDJSON Streaming for Large Results

For audits with 100+ issues, use the streaming endpoint to avoid timeouts:

// Stream large result sets
GET /v1/audit/{job_id}/issues/stream

// Query parameters available:
?severity=critical,high
?fields=message,selector,impact
?page=1&per_page=50
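
A client-side sketch of consuming such a stream follows. NDJSON puts one JSON object per line, so the parser buffers partial lines between chunks. createNdjsonParser and streamIssues are hypothetical names, the shape of each issue object is not specified above, and authentication and the base URL are left to the caller. streamIssues assumes Node.js 18+, where the built-in fetch exposes an async-iterable response body.

```javascript
// Incremental NDJSON parser: feed it text chunks, get back complete objects.
function createNdjsonParser() {
  let buffer = '';
  return {
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split('\n');
      buffer = lines.pop(); // the last piece may be an incomplete line
      return lines
        .filter((line) => line.trim() !== '')
        .map((line) => JSON.parse(line));
    },
    // Call once the stream ends to parse a final line with no trailing newline.
    flush() {
      const rest = buffer.trim();
      buffer = '';
      return rest ? [JSON.parse(rest)] : [];
    },
  };
}

// Hypothetical usage: stream issues and hand each one to a callback.
async function streamIssues(url, onIssue, fetchImpl = fetch) {
  const res = await fetchImpl(url);
  const parser = createNdjsonParser();
  const decoder = new TextDecoder();
  for await (const chunk of res.body) {
    for (const issue of parser.push(decoder.decode(chunk, { stream: true }))) {
      onIssue(issue);
    }
  }
  for (const issue of parser.flush()) onIssue(issue);
}
```

Processing issues as they arrive keeps memory use flat for large audits, since the full result set is never held in one response body.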

6. Optimize Page Weight

Audit speed correlates with page load time. Optimize your pages:

  • Minimize JavaScript bundle size (<500KB recommended)
  • Lazy load images and heavy resources
  • Use CDN for static assets
  • Reduce external dependencies (fonts, analytics)

Impact: A page that loads in 3s takes roughly 5s to audit. A page that loads in 10s takes roughly 15s.

Best Practices

Implement Exponential Backoff

When polling, use exponential backoff: start at 1s, then double to 2s, 4s, and 8s, capping at 10s. This prevents hammering the API during slow audits.
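
The schedule described above can be sketched as follows. backoffDelayMs and pollUntilDone are hypothetical names, and checkStatus is a caller-supplied function assumed to resolve to an object with a boolean done field; the actual status field returned by GET /v1/audit/:job_id is not specified here.

```javascript
// Delay before the next poll: 1s, 2s, 4s, 8s, then capped at 10s.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 10000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll until checkStatus() reports completion or the attempt budget runs out.
// Assumption: checkStatus resolves to an object with a boolean `done` field.
async function pollUntilDone(checkStatus, { maxAttempts = 10, sleep = defaultSleep } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await checkStatus();
    if (result.done) return result;
    await sleep(backoffDelayMs(attempt));
  }
  throw new Error('Audit did not complete within the polling budget');
}
```

The sleep option is injectable so the loop can be exercised in tests without real delays.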

Set Reasonable Timeouts

Basic mode: 30s timeout. Deep mode: 120s timeout. Handle timeouts gracefully and retry with exponential backoff.
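
A minimal sketch of enforcing these timeouts with AbortController follows. fetchWithTimeout and its mode, timeoutMs, and fetchImpl options are hypothetical; the default fetch assumes Node.js 18+.

```javascript
// Timeouts from the guidance above, in milliseconds.
const TIMEOUT_MS = { basic: 30000, deep: 120000 };

// Wrap a fetch call so it is aborted once the timeout for its mode elapses.
// Note: the timer is cleared once the response headers arrive, so reading a
// slow response body is not covered by this sketch.
async function fetchWithTimeout(
  url,
  { mode = 'basic', timeoutMs = TIMEOUT_MS[mode], fetchImpl = fetch, ...init } = {}
) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer); // avoid leaving a live timer behind
  }
}
```

When the timeout fires, the fetch promise rejects; callers can catch that rejection and retry with exponential backoff.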

Monitor Rate Limit Headers

Check the X-RateLimit-Remaining value from your most recent response before sending more requests. Queue requests client-side when the remaining count is low.

Use Conditional Requests

Send If-None-Match with the ETag from the previous response. The API returns 304 Not Modified when the result hasn't changed, which saves bandwidth.
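
As a sketch of this pattern, the helper below remembers the ETag and body per URL and replays the cached body on a 304. getWithEtag and etagCache are hypothetical names, the response is assumed to be JSON, and the default fetch assumes Node.js 18+.

```javascript
// url -> { etag, body } from the last full (non-304) response
const etagCache = new Map();

// GET a JSON resource, revalidating with If-None-Match when an ETag is known.
async function getWithEtag(url, fetchImpl = fetch) {
  const cached = etagCache.get(url);
  const headers = cached ? { 'If-None-Match': cached.etag } : {};
  const res = await fetchImpl(url, { headers });

  // 304 Not Modified: the stored body is still current.
  if (res.status === 304 && cached) return cached.body;

  const body = await res.json();
  const etag = res.headers.get('ETag');
  if (etag) etagCache.set(url, { etag, body });
  return body;
}
```

Authentication headers are omitted here and would need to be merged into the request in real use.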

Parallelize Independent Requests

Requests to different endpoints (GET /v1/usage, GET /v1/audit/:job_id) can run in parallel. Respect the concurrent request limit for your tier.

Deduplicate Requests

Track in-flight audit requests to prevent duplicate audits for the same URL. Reuse the job_id of an existing audit instead of creating a new one.
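
In-flight tracking could be sketched by sharing one promise per URL. createAuditOnce and inFlight are hypothetical names, and createAudit stands in for whatever function your code uses to call POST /v1/audit.

```javascript
// url -> Promise for the audit request currently in progress
const inFlight = new Map();

// Start an audit for `url`, or join the request that is already running.
function createAuditOnce(url, createAudit) {
  if (inFlight.has(url)) return inFlight.get(url);

  const promise = Promise.resolve()
    .then(() => createAudit(url))
    .finally(() => inFlight.delete(url)); // allow a fresh audit once this settles

  inFlight.set(url, promise);
  return promise;
}
```

Concurrent callers asking for the same URL receive the same promise, so only one API call is made while the audit is pending.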

Language-Specific Tips

Node.js / TypeScript

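  • Use undici (the client behind the built-in fetch in Node.js 18+) or node-fetch for HTTP requests
  • Reuse connections with keep-alive (keepAlive: true on an http.Agent), and enable HTTP/2 where your client supports it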
  • Use p-limit for concurrent request control
  • Cache with Redis or in-memory LRU cache

Python

  • Use httpx with async/await for parallel requests
  • Enable connection pooling with limits=httpx.Limits(max_keepalive_connections=10)
  • Use asyncio.Semaphore to limit concurrency
  • Cache with Redis or cachetools

Ruby

  • Use faraday with persistent connections
  • Enable HTTP/2 with faraday-http2 adapter
  • Use concurrent-ruby for parallel requests
  • Cache with Rails.cache or Memcached