Rate Limits

The Variational API uses weighted rate limiting for incoming requests. This means that rate limits are defined in terms of the maximum cost incurred per unit of time, not simply the number of requests. This cost is measured in request tokens.

Most read requests cost 1 token; some more complicated read and write requests have higher costs. The exact cost for each endpoint is specified in the Endpoint Reference.


Each user company is allowed to spend 100 request tokens within a sliding 10-second window.

Therefore, simple reads are allowed at an average rate of 10 requests/second with bursts up to 100 requests/second followed by a cooldown period. Making more costly requests will reduce that rate.
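To make the arithmetic concrete, here is a minimal client-side sketch of a 100-token budget over a sliding 10-second window. The SlidingWindowBudget class and its method names are hypothetical and not part of the SDK, and the server's exact accounting (for example, how it treats a spend exactly at the window boundary) may differ, so treat this as an approximation.

```python
import time
from collections import deque


class SlidingWindowBudget:
    """Client-side estimate of the token budget (hypothetical helper, not part of the SDK)."""

    def __init__(self, capacity=100, window_seconds=10.0, clock=time.monotonic):
        self.capacity = capacity
        self.window_seconds = window_seconds
        self.clock = clock
        self._spent = deque()  # (timestamp, cost) pairs inside the current window

    def _evict(self, now):
        # Drop spends that have slid out of the window.
        while self._spent and now - self._spent[0][0] >= self.window_seconds:
            self._spent.popleft()

    def available(self):
        """Return the number of tokens still unspent in the current window."""
        now = self.clock()
        self._evict(now)
        return self.capacity - sum(cost for _, cost in self._spent)

    def try_spend(self, cost=1):
        """Record the spend and return True if it fits, else return False."""
        if cost > self.available():
            return False
        self._spent.append((self.clock(), cost))
        return True
```

Calling try_spend(cost) before each request, with the cost taken from the Endpoint Reference, gives a rough local view of when a burst has used up the allowance.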

The allowance is measured per company, regardless of how many different IPs or API keys are used.

A call that exceeds the allowance during the current window receives an HTTP 429 Too Many Requests response. The response includes an X-Rate-Limit-Resets-In-Ms header indicating the minimum delay, in milliseconds, after which the next request will be allowed.

Note that the recommended delay returned in the header is calculated on the assumption that only one client is making requests to the Variational API for the account. Multiple clients running in parallel must coordinate request throttling among themselves.
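For clients that do not use the SDK, the header can be turned into a wait time along the following lines. This is a sketch only: retry_delay_seconds is a hypothetical helper, and the 1000 ms fallback for a missing or malformed header is an assumption, not documented API behavior.

```python
def retry_delay_seconds(status_code, headers, default_ms=1000):
    """Return seconds to wait after a 429, or None if the response was not rate limited."""
    if status_code != 429:
        return None
    # Header names are case-insensitive in HTTP, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-rate-limit-resets-in-ms")
    try:
        delay_ms = int(raw)
    except (TypeError, ValueError):
        # Assumption: fall back to default_ms when the header is missing or malformed.
        delay_ms = default_ms
    return max(delay_ms, 0) / 1000.0
```

When several clients share one company allowance, the returned value is best treated as a lower bound, since the other clients may spend tokens in the meantime.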

Python SDK

By default, the Client class from the provided Python SDK automatically retries requests that fail with a rate limiting error.

When a rate limiting error is detected, Client sleeps for the duration indicated in X-Rate-Limit-Resets-In-Ms, plus a small buffer that increases with each attempt, and then retries the request.

This behavior can be turned off by setting the retry_rate_limits option to False explicitly:

from variational import Client, TESTNET

# API_KEY and API_SECRET hold your API credentials and must be defined beforehand.
client = Client(API_KEY, API_SECRET, base_url=TESTNET, retry_rate_limits=False)
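With automatic retries turned off, the caller has to handle rate limiting errors itself. The sketch below shows one way to do this and is not the SDK's actual implementation. It assumes a hypothetical zero-argument callable send that returns a (status_code, headers, body) tuple; how the SDK reports a 429 when retry_rate_limits is False is not covered here, so adapt the wrapper to the real interface. The max_attempts and buffer_step_ms values are illustrative.

```python
import time


def call_with_rate_limit_retry(send, max_attempts=5, buffer_step_ms=50, sleep=time.sleep):
    """Call send() until it is not rate limited or the attempts run out.

    send is a hypothetical zero-argument callable returning (status_code, headers, body).
    Returns the (status_code, body) of the last attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status_code, headers, body = send()
        if status_code != 429 or attempt == max_attempts:
            return status_code, body
        # Wait for the server-provided delay plus a buffer that grows with each attempt.
        resets_in_ms = int(headers.get("X-Rate-Limit-Resets-In-Ms", 0))
        sleep((resets_in_ms + buffer_step_ms * attempt) / 1000.0)
```

The sleep parameter is injectable so that the waiting logic can be exercised in tests without real delays.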
