We’re starting to look into the requests table in Canvas Data. We are looking at the URL column to understand which requests are made most often, and which of those we need to filter out because they are made by automated jobs that we have running. Some of them are intuitive, such as /courses(.:format) taking users to the My Courses expanded listing. However, some of them are more obscure, such as when the route is /api/v1/courses/:course_id/ping(.json)(.:format) or /images/thumbnails/:id/:uuid(.:format). Has anyone started pulling together any resources about these requests? I've been taking a look at the code in general and canvas-lms/routes.rb in particular, but there aren't many comments in the code. It would be really helpful to have this information digestible in plain English.
I have the same question. How can we filter out the "noise" and be able to tell which resources are being used when...
Hi, Ruby
This isn't an answer to Marlee's original question, but it's possible that the solution that we've started looking into at Harvard might be along the lines of what you're asking. Our first pass at processing is to convert the raw requests into collections of requests to a given Rails endpoint, and to parse out the parameters that are passed through the URLs. Basically what we're doing is standing up an instance of the open source Canvas code, and asking it to parse each of the lines in the requests table for us. That way we can get Rails to do the hard work of figuring out the parameters for courses, users, groups, etc., and tell us where in the Canvas platform the request was routed. We initially tried to do the same thing using regular expressions, but the intricacies of Rails routing made things unmaintainable pretty fast.
The code is at:
canvas-data-tools/parse_request_urls at master · penzance/canvas-data-tools · GitHub
There are instructions there for how we quickly stand up a Canvas server, as well as basic usage of the script. It's open source although, of course, very early Alpha quality. Bear in mind that this is only the first step in processing the data. We're currently working on other scripts to break down requests by user, course, dates and so on.
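To illustrate what "parsing a request URL into an endpoint plus parameters" means, here is a minimal Python sketch for the three route templates quoted in the original question. It is not the Harvard script, which hands the work to Rails itself; it is a toy regular-expression translation of the kind that, as noted above, stops being maintainable once the full routes file is involved. The function names `compile_route` and `match_request` are made up for this example, and the translation rules (`:name` as a dynamic segment, parentheses as an optional group) cover only the basic Rails route syntax.

```python
import re

# Route templates quoted in the original question.
ROUTE_TEMPLATES = [
    "/courses(.:format)",
    "/api/v1/courses/:course_id/ping(.json)(.:format)",
    "/images/thumbnails/:id/:uuid(.:format)",
]

# Tokens: a ":name" dynamic segment, a parenthesis, or a run of literal text.
_TOKEN = re.compile(r":(\w+)|([()])|([^:()]+)")


def compile_route(template):
    """Translate a Rails-style route template into a compiled regex (simplified)."""
    parts = []
    for name, paren, literal in _TOKEN.findall(template):
        if name:
            # Dynamic segment: anything up to the next "/", "." or "?".
            parts.append(f"(?P<{name}>[^/.?]+)")
        elif paren == "(":
            parts.append("(?:")   # start of an optional group
        elif paren == ")":
            parts.append(")?")    # end of an optional group
        else:
            parts.append(re.escape(literal))
    return re.compile("^" + "".join(parts) + "$")


ROUTES = [(t, compile_route(t)) for t in ROUTE_TEMPLATES]


def match_request(url, routes=ROUTES):
    """Return (template, params) for the first matching route, else (None, {})."""
    path = url.split("?", 1)[0]  # ignore the query string
    for template, pattern in routes:
        m = pattern.match(path)
        if m:
            params = {k: v for k, v in m.groupdict().items() if v is not None}
            return template, params
    return None, {}


if __name__ == "__main__":
    for url in ["/courses", "/api/v1/courses/1234/ping", "/images/thumbnails/55/abc-def"]:
        print(url, "->", match_request(url))
```

Once every row of the requests table is reduced to a template like this, counting requests per endpoint becomes a simple group-by on the template string.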
Marlee's original question regards the next step in our process - figuring out what those endpoints actually do. That takes rather more detective work, although through a combination of watching traffic go in and out of the browser, inspecting the Canvas code and good old-fashioned guesswork we're gradually figuring out some of the most commonly used routes. Instructure folks: Any help that you can give us around documentation of what individual routes are used for would be a big help.
20764325, @rubyn and @phil_mcgachey -- you may be interested in attending this webinar: Getting Started with Canvas Data.
There is some information available in the page view object of the Users API that isn't available in Canvas Data. It might help with the detective work.
The page view object has "user_request", which is described as "A flag indicating whether the request was user-initiated, or automatic (such as an AJAX call)". It is true when the user makes the call themselves and false when it is automated.
Now, the problem is that it doesn't come with the requests table, so you can't use the information directly in Canvas Data. But if you come across an item you're not sure about, you can go to the API, retrieve the corresponding data from there, and then classify the request.
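As a rough sketch of that classification step, the Python below tallies the "user_request" flag per URL across a batch of page view objects that have already been fetched from the API. Fetching is left out on purpose. The sample records are invented, the "url" field name is an assumption about the page view object, and the function names and the 0.5 threshold are hypothetical choices for this example.

```python
from collections import Counter, defaultdict

# Invented sample records; only "user_request" is taken from the description above.
# The "url" key is an assumption about the shape of the page view object.
SAMPLE_PAGE_VIEWS = [
    {"url": "/courses", "user_request": True},
    {"url": "/courses", "user_request": True},
    {"url": "/api/v1/courses/1/ping", "user_request": False},
    {"url": "/api/v1/courses/1/ping", "user_request": False},
    {"url": "/api/v1/courses/1/ping", "user_request": True},
]


def summarize_user_request(page_views, key=lambda pv: pv["url"]):
    """Count user-initiated vs. automatic page views for each key."""
    counts = defaultdict(Counter)
    for pv in page_views:
        counts[key(pv)][bool(pv.get("user_request"))] += 1
    return counts


def classify(counts, threshold=0.5):
    """Label a key 'user' when more than `threshold` of its views were user-initiated."""
    labels = {}
    for k, c in counts.items():
        total = c[True] + c[False]
        labels[k] = "user" if c[True] / total > threshold else "automated"
    return labels


if __name__ == "__main__":
    print(classify(summarize_user_request(SAMPLE_PAGE_VIEWS)))
```

The `key` argument is there so the same tally can be run per route template instead of per raw URL, which makes the resulting labels reusable against the requests table.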
Another item missing from Canvas Data, but available through the API, is whether the item counted as a participation or not. That's under "participated" which is described as "True if the request counted as participating, such as submitting homework". A similar process using the API could be used to determine whether you should count a request as a participation.
A third item I see missing is the "render_time", which is "How long the response took to render, in seconds". This isn't going to help at all for classification purposes since it varies for every request. Where it might be useful is if you're trying to determine when there was slowness in the system or places that need optimization. These are generally things Canvas should be looking at, but at least one use case was when a person was trying to troubleshoot a quiz that froze for some students at the beginning and end of a quiz but was fine in the middle.
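For the slowness use case, one possible approach is to bucket page views by minute and look at the median "render_time" in each bucket. The Python sketch below assumes the page view objects have already been retrieved, that each carries a "created_at" timestamp in ISO 8601 form (an assumption, as is the field name), and that 2 seconds is a reasonable cutoff. The sample data and the function name `slow_minutes` are made up for illustration.

```python
from collections import defaultdict
from statistics import median

# Invented sample records. "render_time" is described above; "created_at" and its
# "YYYY-MM-DDTHH:MM:SSZ" format are assumptions for this sketch.
SAMPLE_PAGE_VIEWS = [
    {"created_at": "2016-03-01T10:00:05Z", "render_time": 0.2},
    {"created_at": "2016-03-01T10:00:40Z", "render_time": 0.4},
    {"created_at": "2016-03-01T10:01:02Z", "render_time": 3.0},
    {"created_at": "2016-03-01T10:01:15Z", "render_time": 5.0},
    {"created_at": "2016-03-01T10:01:48Z", "render_time": 4.0},
]


def slow_minutes(page_views, threshold_seconds=2.0):
    """Return {minute: median render_time} for minutes whose median exceeds the threshold."""
    by_minute = defaultdict(list)
    for pv in page_views:
        minute = pv["created_at"][:16]  # keep "YYYY-MM-DDTHH:MM"
        by_minute[minute].append(float(pv["render_time"]))
    result = {}
    for minute, times in sorted(by_minute.items()):
        mid = median(times)
        if mid > threshold_seconds:
            result[minute] = mid
    return result


if __name__ == "__main__":
    print(slow_minutes(SAMPLE_PAGE_VIEWS))
```

The median is used rather than the mean so that a single very slow request does not flag an otherwise normal minute.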
This discussion post is outdated and has been archived. Please use the Community question forums and official documentation for the most current and accurate information.