@kenneth_robinso
The safest way is to make one request at a time, sequentially. Canvas has said that you should not exceed your limit doing that.
With 20k students, though, doing it that way would take an unreasonable amount of time.
You can pay attention to the x-rate-limit-remaining and the x-request-cost headers. In practice, this is more difficult than it seems, perhaps depending on the library you're using to make the API calls. You can make concurrent calls, but the 50-point penalty that Canvas charges at the beginning of each request limits how many can be made at a time.
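As a rough sketch of reading those two headers, assuming your HTTP library hands you the response headers as a dict-like mapping of strings: the function names and the threshold of 100 are made up for illustration and are not anything Canvas prescribes.

```python
from typing import Mapping, Optional, Tuple


def read_rate_limit(headers: Mapping[str, str]) -> Tuple[Optional[float], Optional[float]]:
    """Return (remaining, cost) from response headers; None for a missing or bad value."""
    # Header names are case-insensitive, so normalize them before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_float(name: str) -> Optional[float]:
        value = lowered.get(name)
        if value is None:
            return None
        try:
            return float(value)
        except ValueError:
            return None

    return as_float("x-rate-limit-remaining"), as_float("x-request-cost")


def should_pause(remaining: Optional[float], threshold: float = 100.0) -> bool:
    """Hypothetical policy: back off when the remaining allowance drops below threshold."""
    return remaining is not None and remaining < threshold
```

You would call `read_rate_limit(response.headers)` after each response and slow down when `should_pause` returns True. With concurrent calls the value you read is already stale by the time you act on it, which is part of why this is harder than it sounds.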
The best solution I have found is to stagger the requests so that Canvas has time to calculate the costs and not apply that 50 penalty all at once. The delay doesn't have to be great, 25-50 ms depending on the call that you're making, but a longer delay is safer and less likely to trigger an error. I say that because some calls are more expensive than others. I also request per_page=50 in most cases since it's quicker to get 50 than 100.
- For enrollment data, I allow up to 50 simultaneous requests but stagger them by 100 ms.
- For the terms, I allow 30 simultaneous requests but stagger them by 100 ms.
- For the courses, I allow 30 simultaneous requests but stagger them by 50 ms.
- For the assignments, I allow 40 simultaneous requests but stagger them by 100 ms.
- For submissions, I allow 30 at a time with a delay of 250 ms. This is because I'm fetching submission_history and that can be really large.
In most cases, the delay is the limiting factor, not the number of concurrent requests allowed. If I delay each request by 50 ms, then I can only make 20 per second, so the concurrent limit would only come into play if they take longer than a second to complete.
Along the way, I allow each type of request to empty the queue before starting the next type. That allows me to get back up to the 700 limit for each new type. In another program I wrote recently, I started downloading the user list (only 20 at a time, I think) and then making calls off of it before I finished downloading the entire user list.
The code I described took 230 seconds to make 919 requests last night. We're slow because it's summer. In a regular term, it might take 13 minutes to run. I haven't gotten any error messages with timeouts and I've been running this nightly for about 1.5 years now.
We are also a much smaller school than the one you want to handle, but I don't fetch individual user information as I can get what I need as part of another call. You may also be able to incorporate the GraphQL interface, which allows you to get select information from multiple tables in one call rather than having to make calls to each of those APIs. You will still have to mess with pagination, though.
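On the REST side, where per_page applies, Canvas puts the pagination links in the Link response header, and you keep following the one marked rel="next" until there isn't one. A small sketch of pulling that URL out, assuming you already have the header as a string; `next_page_url` is a name I made up and the URLs in the usage example are placeholders.

```python
import re
from typing import Optional

# Matches one entry of a Link header, e.g. <https://host/path?page=2>; rel="next"
_LINK_PART = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next" in a Link header, or None on the last page."""
    if not link_header:
        return None
    for url, rel in _LINK_PART.findall(link_header):
        if rel == "next":
            return url
    return None
```

For example, `next_page_url('<https://example.test/api/v1/courses?page=2&per_page=50>; rel="next"')` returns the URL inside the angle brackets. It is safer to follow that URL as given than to build page numbers yourself, because not every endpoint uses numeric pages.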
This question isn't really as helpful as you may think it is:
What is the best way to call the APIs if I want to get the LMS record for every student, ...
What do you mean by "LMS record for every student"?
There is user information, there is enrollment data, there is submission data, there are analytics, there are ...
Knowing what you're trying to fetch can help figure out the best way to get it.
Another data source you may be able to use is Canvas Live Events. Keep the data on your end and let Canvas notify you when it changes. Then you don't have to download the complete set of data every time.