API Rate Limiting

Instructure
Instructure
12 12 8,384

Last year we updated our API rate limit policy. Since then I've been asked quite a few questions about how the rate limits work and heard quite a few schools concerned their applications will be limited.

tl;dr The API is rate limited based on the API token (not the account, instance, or user). Exceeding rate limits will result in a 403 Forbidden (Rate Limit Exceeded) error. API calls made sequentially shouldn't be an issue. However, you may run into issues if you're running concurrent API requests with the same token.

Terms:

CostCPU time + database time (in seconds)
High Water MarkSize of the "bucket", this is the allowable maximum of un-leaked cost (default is 700 as of this writing)
LeakRate at which cost is subtracted from the bucket (outflow * timespan)
OutflowRate of bucket cost reduction (currently 10 units)
TimespanTime in seconds between the last check to the current check
Bucket costAggregate of un-leaked cost subtracted from the high water mark (remaining cost available in the bucket)

General

Canvas monitors API requests to prevent abuse and/or excessive use. We do this to ensure that the rest of the environment and other schools are not negatively impacted by a single user or application. Canvas limits are imposed at the token level, so it’s helpful to create a separate token for each application that has a significantly different purpose.

So how does Canvas determine rate limits?

Canvas employs a leaky bucket algorithm. Imagine a bucket with a hole in the bottom. You continually add water to the bucket and it trickles out of the hole in the bottom at a consistent rate. If you were to increase the inflow rate faster than the hole was able to drain the water you'd eventually fill the bucket. The top of the bucket is called a high water mark. When you reach your high water mark the water will overflow the sides of the bucket. The overflow would represent rate limited API calls.

In the leaky bucket algorithm the water represents the cost of API calls. Some API calls cost more than others. This is akin to larger water droplets. For example, a GET call to retrieve your user information will cost less than a POST call to create a new user. The cost of any given API request is determined as the CPU time (in seconds) plus the database time (in seconds).

When you initially create a request Canvas adds 50 units cost up front. After the true cost of the call is determined the 50 units cost is then subtracted. This is done to prevent multiple simultaneous requests from overflowing the bucket before Canvas can calculate the cost.  Sending multiple simultaneous calls will quickly fill the bucket regardless of the true cost of each call (until the upfront cost is removed).

As more and more requests are made the bucket will begin to fill. Once the bucket cost exceeds the high water mark you will receive a 403 Forbidden (Rate Limit Exceeded).

leaky-bucket.jpg

How does the bucket "leak"?

The bucket will leak at the rate of outflow * timespan. The outflow defaults to 10 units (but is subject to change). The timespan is a representation of time in seconds between the last check and the current check.

When a new API request is made Canvas will determine the bucket cost, the leak, and the request of the current call. It will then increment the bucket cost by the cost of the request minus the leak (e.g. Bucket Cost (HWM - sum of previous API cost) - Request Cost + Leak).

This means that a completely full bucket (assuming High Water Mark as of this writing of 700 units) could be completely emptied within 70 seconds.

This also means that running processes sequentially will result in no rate limiting. The bucket will leak faster than it's possible to fill (again assuming the same default settings as of this writing). Depending on the requests it's possible you could run some parallel processes without hitting your limit as well. This will vary from request to request but you probably wouldn't be able to a large number of parallel requests. Particularly when you account for the upfront cost.

How do I keep track all of this?

Canvas will include several parameters in the header of the response to your request. The field name X-Rate-Limit-Remaining is pretty self explanatory, it's the amount of cost available in the bucket. So a low number in this field can be cause for concern. You'll also receive X-Request-Cost which is the cost of the current call. So if you do need to run requests in parallel you can use this to determine the rate at which these requests will run.

I'm pretty sure I'm going to hit my rate limit, what are my options?

We've got a couple options in place to ensure you're able to effectively run your applications.

If you're running API calls on behalf of users developer tokens are a great option for you. For example, you've developed an application for students to submit an assignment. This is a great case for developer tokens; which allow you to programmatically generate individual access tokens for users. So when a user logs into your application you can generate a token for that individual user to submit their assignment. At the same time another student can login and generate their access token. Because we rate limit based on token these two students won't interrupt each other from submitting their assignment.

12 Comments
Learner II

This is an interesting article Trevor​. I'm a bit concerned as we just implemented UDOIT at our institution which is self-hosted. This application makes use of instructor-level API tokens and performs several GET requests in succession when a course is scanned for accessibility issues. When hundreds or even potentially thousands of instructors are performing scans of content, does this pose a risk with your API rate limit policy?

Thanks!

Instructure
Instructure

Hi JEFHQ12951

Unfortunately, I'm not too familiar with UDOIT. Any issues with the API rate limit would depend on how UDOIT was developed. Do you know if UDOIT makes simultaneous API requests? Because the cost of an API call is roughly based on the time it takes to process that request and the bucket empties at a rate faster than real time an application making a single request at a given time is pretty unlikely to be throttled. If it does make multiple simultaneous requests it would be up to the application to monitor the remaining cost available in the bucket.

Explorer II

I believe tr_jbates​ developed UDOIT. Jacob?

Learner II

Thanks to dgrobani​ for bringing me in on the conversation.  Hopefully I can clear things up.

Each user of UDOIT is actually using their own API key (obtained through the Oauth2 process), and all API calls are done sequentially (rather than simultaneously).  My understanding is that since the bucket "empties" faster than sequential requests can "fill" it, we should never run into any rate limiting issues.  Also, the rate limit is on a per-user basis, so it shouldn't matter if 1 person or 100 people are using UDOIT simultaneously.

Does that sound right tfullwood@instructure.com​?

Learner II

Thanks for explaining, Jacob

Trevor, can you confirm these assertions?

Instructure
Instructure

Hey JEFHQ12951​ and tr_jbates

Based on what tr_jbates described this sounds well designed as far as rate limits are concerned. And tr_jbates​ you are correct. The only clarification I have is that the throttling is done at the individual token level rather than the user level.

Learner II

This satisfies my concern. Thanks, Trevor.

For my own benefit, what's the difference between an individual token level and the user level? Are you referring to the Developer key?

Instructure
Instructure

I'm referring to the individual's API token (in this case a specific user's token generated by the OAuth process). A single user could have any number of access tokens associated with their account. We're going to throttle specifically to a token, so API using applications will be throttled separately.

Surveyor II

tfullwood@instructure.com​ thanks for this info, I have been aware of throttling, but have not yet experienced it and have not understood how it is applied.

I have written an algorithm to transfer grades from Canvas courses to the SIS, which is a multi-tasking process, processing multiple courses simultaneously using the same token.

So far there has not been a problem and I have not received the error "403 Forbidden (Rate Limit Exceeded)".

Thanks again.

Community Member

Thank you for the API Rate Limiting policy information.

Community Member

When you initially create a request Canvas adds 50 units cost up front. After the true cost of the call is determined the 50 units cost is then subtracted. This is done to prevent multiple simultaneous requests from overflowing the bucket before Canvas can calculate the cost. Sending multiple simultaneous calls will quickly fill the bucket regardless of the true cost of each call (until the upfront cost is removed).

Does Canvas still add 50 units cost up front? I've run some tests with concurrent calls in C# and it looks like I lose about 15 points per call. 

For example: I make a large list of API calls, 2 at a time. The first call returns with an X-Rate-Limit-Remaining of 700, the second call returns with an X-Rate-Limit-Remaining of 685. The next group of calls waits until both of those calls have returned before launching themselves, and they return with the same values. The cost of each call varies between 0.05 - 0.12.

After further tests it appears that I'm able to make 45 concurrent calls before I get close to my 700 point limit (less than 100 remaining). The next batch of 45 calls start running right away once the first returns and my (actual total cost is close to 4 points so no problems there) but the next batch starts off with a full 700 points remaining.

(Also, is there a recommended API call I can use to cheaply see my X-Rate-Limit-Remaining?)

Community Member

I need an recommendation / solution for this situation as noted in https://community.canvaslms.com/thread/53565