Chip ( @maguire ),
I am always impressed by the work your students do for their projects and your willingness to share them. Your students' research serves as a gentle reminder of how little I know when it comes to computer science. It makes me appreciate people like you who know what you're doing. Even more appreciated is that you take the time to share that understanding with people. Thanks for filling in many of the details I omitted or was unaware of.
This message is going to seem like I'm rambling compared to your well-organized comments. I'm on break and really need to be working on classes for next semester.
One thing that popped out at me as I was reading that section on throttling (I didn't read the entire paper) is that the paper only looks at the time required to compute the throttling. Having 500 users (presumably with their own API tokens) make the same request simultaneously doesn't get slowed down by the rate limiting, because the limiting is per user. The request was also made on a course with only 6 enrollments, so pagination didn't come into play, which let the test focus on just the throttling code.
While interesting in its own right, I felt it doesn't measure what @i_oliveira is trying to achieve here. We don't want to optimize 500 users concurrently downloading the same results; we want to optimize 1 user downloading multiple (possibly 500) pages. Stress testing your own system is important, but Instructure frowns on using multiple user accounts and tokens to bypass the rate limiting. Pagination is only mentioned once in the paper, on page 50, where it mentions using your Python code with a time.sleep() between calls. At one of the InstructureCon conferences, I went to a presentation by the software engineers, who said the rate limiting is such that if you make calls sequentially, you don't need to worry about hitting the limit. I haven't tested that formally, but when I make calls sequentially, I have never run into rate limiting issues. In my own testing of multiple calls across a network to an Instructure-hosted site that implemented rate limiting, I would max out at around 10 API calls per second, well below the 22 to 26 given in the paper. Again, that is heavily dependent on the calls being made.
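To make the sequential approach concrete, here is a minimal sketch of one-page-at-a-time fetching with a short pause, following the Canvas REST convention of a Link header with rel="next". The token, URL, and pause length are placeholders, not real values, and this is just an illustration of the pattern, not anyone's production code.

```python
# Sequential paginated fetching: one request at a time with a short sleep,
# so the rate limiter should never come into play.
import re
import time
import urllib.request

def next_page_url(link_header):
    """Return the rel="next" URL from a Canvas-style Link header, or None."""
    if not link_header:
        return None
    for part in link_header.split(","):
        match = re.match(r'\s*<([^>]+)>;\s*rel="next"', part)
        if match:
            return match.group(1)
    return None

def fetch_all(url, token, pause=0.25):
    """Fetch every page sequentially, sleeping briefly between calls."""
    results = []
    while url:
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(request) as response:
            results.append(response.read())
            url = next_page_url(response.headers.get("Link"))
        time.sleep(pause)  # stay well under the throttle
    return results
```

The loop keys off the Link header rather than counting pages, which is the safe way to paginate Canvas REST results since page counts aren't always available.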
I see rate limiting as one of those necessary evils (it adds a little time to each request) for the greater good of keeping the system usable and responsive for users. If you're running your own instance, then you could query the database directly to get the information you need rather than relying on the API.
The pipelining of requests through a single connection is what led me to implement a throttling library in my publicly shared code. My early code was in Perl, which didn't let me make multiple simultaneous requests, so throttling wasn't an issue then. I then went through a PHP phase, but its API calls were still serialized. My early JavaScript code was written to run in the browser, and the browsers themselves limited the connections per site to about 6 at a time. That was built-in throttling, and I didn't have to do any of my own. When browsers started multiplexing multiple requests over a single connection, that built-in throttling was gone and my scripts started failing. After working with JavaScript in the browser, I started using Node JS for my scripts and was finally able to take advantage of asynchronous calls.
My use of the Bottleneck library should be taken more as me documenting what I use than as a recommendation of it. When I choose to rely on someone else's library, I look at whether it works, its popularity, functionality, ease of use, documentation, and sometimes size. I tried other throttling libraries but settled on Bottleneck for those reasons. I admit there are limitations to it that I found very frustrating at first. Then I came to the realization that I should play nice and don't have to hammer the Canvas system as fast as the x-rate-limit-remaining header will allow. We're a small school, and taking 13 minutes instead of 8 minutes in the overnight hours isn't a big deal to us.
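For anyone curious what Bottleneck is actually doing when used as a simple rate limiter: its minTime option amounts to "at least this long between the start of consecutive jobs." Here is a rough stdlib Python equivalent of just that one feature, to illustrate the idea — it is not a port of the library, and the class name and interval are my own invention.

```python
# A minimal minTime-style limiter: each scheduled job starts at least
# min_interval seconds after the previous one started.
import threading
import time

class MinTimeLimiter:
    def __init__(self, min_interval):
        self.min_interval = min_interval   # seconds between job starts
        self._lock = threading.Lock()
        self._next_start = 0.0

    def schedule(self, func, *args, **kwargs):
        """Run func, waiting first if the previous job started too recently."""
        with self._lock:
            now = time.monotonic()
            wait = max(0.0, self._next_start - now)
            self._next_start = max(now, self._next_start) + self.min_interval
        if wait:
            time.sleep(wait)
        return func(*args, **kwargs)
```

Spacing job starts (rather than capping concurrency) is why this style of limiter adds a predictable, bounded amount of time to a batch of calls: n calls take at least (n - 1) x min_interval.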
In other cases, speed is an issue. There are some scripts I run from within the browser while the user is waiting for the results to be fetched. Thankfully, those are infrequently executed scripts, but the user will have to wait a while for one to, say, remove the missing flags, or the system will time out. I feel like there should be a way to make Bottleneck work better, but sometimes you need to be a computer scientist to understand the documentation. You're right that right now I'm simply using it as a basic rate limiter. It was frustrating because I added code to check all those other values and came up with a system for scaling back based on those two headers, but the system ignored it. Eventually I just removed the code since I wasn't using it.
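The kind of scaling-back I was attempting can be expressed as a pure function: given the x-rate-limit-remaining value from the last response (Canvas starts the bucket at roughly 700), return how long to pause before the next call. This is only a sketch of the idea, not the code I removed, and the threshold and maximum delay are made-up tuning knobs, not Canvas-documented values.

```python
# Adaptive backoff from the rate-limit bucket: no delay while the bucket
# is healthy, ramping linearly up to max_delay as it empties.
def backoff_delay(remaining, threshold=300.0, max_delay=2.0):
    """Seconds to sleep before the next API call, based on the
    x-rate-limit-remaining header value from the previous response."""
    if remaining >= threshold:
        return 0.0
    return max_delay * (1.0 - remaining / threshold)
```

A function like this could sit between responses and the next request, slowing down only when the server signals pressure instead of paying a fixed cost on every call.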
I also store the results of my API calls in a local database and then query it for most things. That means I can get a complete list of courses quickly, but the information may be up to a day old. For reporting processes, that's fine. If you need real-time data, the course-list API endpoint lets you filter the results, including by date, so it may be possible to get the list you want without fetching everything. You can use Live Events (the Canvas Data Services you mentioned) to further reduce the delay. By setting up your own endpoint, you can receive near real-time notifications when courses are created or updated. It is not 100% reliable, though, so you still need to periodically obtain the information another way.
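As an example of pushing a filter to the server, here is a small helper that builds the query string for the account course-list endpoint. The parameter names shown in the usage (ends_after, per_page) are from my reading of the REST docs, so verify them against your Canvas version before relying on this; the base URL and account id are placeholders.

```python
# Build an account courses URL with server-side filters, so Canvas does
# the filtering instead of the client downloading everything.
from urllib.parse import urlencode

def course_list_url(base, account_id, **filters):
    """Return /api/v1/accounts/:id/courses with the given query filters;
    filters with a value of None are dropped."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{base}/api/v1/accounts/{account_id}/courses?{query}"
```

For example, course_list_url("https://example.instructure.com", 1, ends_after="2024-01-01", per_page=100) asks only for courses ending after that date, 100 per page.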
GraphQL has potential, but the lack of filtering is one of my major gripes with Canvas' GraphQL implementation. In theory, it's nice to be able to download just the information you want, but in practice you have to download more than you need so you can filter it on the client side. Another complaint: I like the structure-agnostic style of the REST API, which gives me an array of objects and makes pagination easier to handle. With GraphQL, you need to know the structure of the object so you can traverse it, and pagination can appear at multiple locations. There are GraphQL libraries, but I haven't found one that meets my requirements yet, so I still do most things through the REST API.
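The pagination difference is easier to see in code. With REST, every response paginates the same way (a Link header); with GraphQL, you have to know where in the result tree each connection lives. Here is a sketch of a helper that walks a known path to a Relay-style connection and pulls out the nodes plus the cursor for the next request. The nodes/pageInfo/hasNextPage/endCursor shape matches Canvas's Relay-style connections, but the field names in the test data (course, enrollmentsConnection) are just an example query shape.

```python
# Extract one Relay-style connection from a GraphQL response: the caller
# must supply the path, because pagination can appear at multiple
# locations in the tree.
def read_connection(data, path):
    """Follow path to a connection; return (nodes, end_cursor_or_None)."""
    node = data
    for key in path:
        node = node[key]
    page_info = node.get("pageInfo", {})
    cursor = page_info.get("endCursor") if page_info.get("hasNextPage") else None
    return node.get("nodes", []), cursor
```

The need to pass in a path per query is exactly the structure-awareness that the REST API's flat array of objects avoids.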
One thing to understand with Canvas is that there is not a single source that has all of the information you might need (unless you're self-hosting). You can use the REST API, GraphQL, Canvas Data, Live Events, and the web interface. Some information is available in multiple locations, but sometimes you can only get the information from one source.