Rate Limiting

Access to Voicegain resources is controlled by the following limit settings on the account. Newly created accounts get the default values listed below. If you need higher limits, please contact us at support@voicegain.ai.

These limits apply to use of the Voicegain Platform in the Cloud. On the Edge, the limits are determined by the type of license you purchase.

Types of Rate Limits

Limit	Default value	Description
apiRequestLimitPerMinute	75	Basic rate limit with a fixed window of 1 minute, applying to all API requests. Requests to the /data API are counted at 10x the weight of other requests.
apiRequestLimitPerHour	2000	Basic rate limit with a fixed window of 1 hour, applying to all API requests. Requests to the /data API are counted at 10x the weight of other requests.
asrConcurrencyLimit	4	Limit on the number of concurrent ASR requests. Does not apply to OFF-LINE requests.
offlineQueueSizeLimit	10	Maximum number of OFF-LINE transcription jobs that may be submitted to the queue.
offlineThroughputLimitPerHour	4	Maximum number of hours of audio that can be processed by OFF-LINE transcription within 1 hour. Note: For Edge deployment, the limit interval is per day instead of per hour.
offlineWorkerLimit	2	Maximum number of OFF-LINE transcription job workers that will be used to process the account's audio.
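
As a rough illustration of how a fixed-window request limit and the 10x weighting of /data requests interact, here is a minimal client-side sketch in Python. The names `request_cost` and `FixedWindowBudget`, the path-prefix check used to detect /data requests, and the alignment of windows to multiples of the window length are all assumptions made for illustration; the server's actual accounting may differ.

```python
from dataclasses import dataclass

# Weight from the table above: /data requests count as 10 ordinary requests.
DATA_API_WEIGHT = 10


def request_cost(path: str) -> int:
    """Units consumed by one request (assumes /data paths start with '/data')."""
    return DATA_API_WEIGHT if path.startswith("/data") else 1


@dataclass
class FixedWindowBudget:
    """Client-side estimate of one fixed-window limit (hypothetical helper)."""
    limit: int            # e.g. 75 for apiRequestLimitPerMinute
    window_seconds: int   # e.g. 60
    _window_id: int = -1  # index of the window currently being counted
    _used: int = 0        # units consumed in that window

    def try_consume(self, path: str, now: float) -> bool:
        """Record the request and return True if it fits in the current window."""
        window_id = int(now // self.window_seconds)
        if window_id != self._window_id:
            # A new fixed window has started, so the count resets.
            self._window_id = window_id
            self._used = 0
        cost = request_cost(path)
        if self._used + cost > self.limit:
            return False
        self._used += cost
        return True
```

With the default per-minute limit of 75, such a budget would allow seven /data requests (70 units) plus five other requests within one window before refusing further calls. The RateLimit-Remaining header returned by the service remains the authoritative count.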

For API requests that run longer than the rate limit window, the request is counted in both the window in which it started and the window in which it finished.

Every HTTP API response includes several rate-limit headers. Their values show the applicable limit, the remaining request count in the current window, and the number of seconds until the limit resets. For example:

RateLimit-Limit: 75, 75;window=60, 2000;window=3600
RateLimit-Remaining: 1
RateLimit-Reset: 7
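
The headers above could be parsed on the client roughly as follows. This is a sketch, not an official client: the function name `parse_rate_limit_headers` is hypothetical, and the reading of RateLimit-Limit (first item is the limit currently in effect, later items are policies with a `window` parameter in seconds) is an assumption based on the example shown.

```python
from typing import Any, Dict, List, Mapping, Optional, Tuple


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Dict[str, Any]:
    """Parse the rate-limit headers of one response into plain Python values.

    `headers` is looked up case-sensitively here; real HTTP clients usually
    provide a case-insensitive mapping.
    """
    parts = [p.strip() for p in headers["RateLimit-Limit"].split(",")]
    # Assumption: the first item is the limit that applies right now.
    active_limit = int(parts[0])

    # Remaining items look like "75;window=60": a quota plus its window length.
    policies: List[Tuple[int, Optional[int]]] = []
    for part in parts[1:]:
        quota, _, params = part.partition(";")
        window: Optional[int] = None
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key == "window":
                window = int(value)
        policies.append((int(quota), window))

    return {
        "limit": active_limit,
        "policies": policies,
        "remaining": int(headers["RateLimit-Remaining"]),
        "reset_seconds": int(headers["RateLimit-Reset"]),
    }
```

A client might check `remaining` after each call and pause for `reset_seconds` when it reaches zero, rather than waiting for a 429 response.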

When Rate Limits are Hit

If a rate limit is hit, the HTTP status code 429 Too Many Requests is returned. The response headers additionally include a Retry-After value, for example:

RateLimit-Limit: 75, 75;window=60, 2000;window=3600
RateLimit-Remaining: 0
RateLimit-Reset: 6
Retry-After: 6
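
One way a client could turn such a response into a wait time is sketched below. The helper name `retry_delay_seconds` and the fallback order (Retry-After first, then RateLimit-Reset, then a default) are assumptions for illustration. The sketch only handles Retry-After given as a whole number of seconds, as in the example above.

```python
from typing import Mapping


def retry_delay_seconds(
    status: int,
    headers: Mapping[str, str],
    default: float = 1.0,
) -> float:
    """Return how long to wait before retrying; 0.0 if no limit was hit."""
    if status != 429:
        return 0.0
    # Prefer Retry-After; fall back to RateLimit-Reset if it is missing.
    for name in ("Retry-After", "RateLimit-Reset"):
        value = headers.get(name)
        if value is not None and value.strip().isdigit():
            return float(value)
    # Neither header was usable, so use a conservative default.
    return default
```

For the example response above, this returns 6.0, after which the request can be sent again.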

If asrConcurrencyLimit is hit, the response headers will contain:

X-ResourceLimit-Type: ASR-Concurrency
X-ResourceLimit-Limit: 4
RateLimit-Limit: 0
RateLimit-Remaining: 0
RateLimit-Reset: 120
Retry-After: 120

Note that these headers are a superset of those returned for the basic API request limits, so client code written to handle basic rate limiting can handle concurrency limiting too.

Note also that for the concurrency limit the Retry-After value is approximate and not guaranteed, so client code may have to retry multiple times. (If the limit is hit repeatedly, we return increasing back-off Retry-After values.)
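
Because a single wait may not be enough, client code typically wraps the call in a bounded retry loop. The following is a minimal sketch under stated assumptions: `call_with_retries` and its parameters are hypothetical, `send` stands in for whatever function performs the actual HTTP request and returns a `(status, headers)` pair, and the exponential fallback is used only when no numeric Retry-After is present.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]


def call_with_retries(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    fallback_delay: float = 1.0,
) -> Response:
    """Call `send` until it stops returning 429 or attempts run out."""
    response = send()
    for attempt in range(1, max_attempts):
        status, headers = response
        if status != 429:
            break
        raw = headers.get("Retry-After", "")
        if raw.strip().isdigit():
            # Honor the server's hint, which may grow on repeated hits.
            delay = float(raw)
        else:
            # No usable hint: back off exponentially on our own.
            delay = fallback_delay * 2 ** (attempt - 1)
        sleep(delay)
        response = send()
    # Either a non-429 response or the last 429 after giving up.
    return response
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. Callers should still check the returned status, since the final response can be a 429 if every attempt was limited.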

If offlineQueueSizeLimit is hit, the response headers will contain, for example:

X-ResourceLimit-Type: Offline-Queue-Size
X-ResourceLimit-Limit: 10
RateLimit-Limit: 0
RateLimit-Remaining: 0
RateLimit-Reset: 120
Retry-After: 120
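
Since all three kinds of 429 response share the basic headers, a client can distinguish them by checking for X-ResourceLimit-Type. The sketch below shows one possible way to do that; `LimitHit`, `classify_429`, and the label "API-Request-Rate" for the basic case are names chosen here for illustration and are not part of the API.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LimitHit:
    kind: str                   # e.g. "ASR-Concurrency", "Offline-Queue-Size"
    limit: Optional[int]        # the limit value reported by the server
    retry_after: Optional[int]  # seconds to wait, if provided


def _to_int(value: Optional[str]) -> Optional[int]:
    """Convert a header value to int, or None if missing or non-numeric."""
    if value is not None and value.strip().isdigit():
        return int(value)
    return None


def classify_429(headers: Mapping[str, str]) -> LimitHit:
    """Summarize which limit a 429 response refers to."""
    kind = headers.get("X-ResourceLimit-Type")
    if kind is not None:
        # Resource limits report their value in X-ResourceLimit-Limit.
        limit = _to_int(headers.get("X-ResourceLimit-Limit"))
    else:
        # Basic request limit: take the first item of RateLimit-Limit.
        kind = "API-Request-Rate"  # label invented for this sketch
        limit = _to_int(headers.get("RateLimit-Limit", "").split(",")[0])
    return LimitHit(kind, limit, _to_int(headers.get("Retry-After")))
```

A caller might, for instance, wait and resubmit on "Offline-Queue-Size" while reducing the number of parallel sessions on "ASR-Concurrency".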
