Your Community is getting an upgrade!
Read about our partnership with Higher Logic and how we will build the next generation of the Instructure Community.
Found this content helpful? Log in or sign up to leave a like!
Looking to discuss these features from the 2024-10-09 Release Notes? Post a reply and start a conversation!
Interface Updates
Other Updates
Feature Options
ADD FEEDBACK FOR THESE FEATURES IN Q&A TOPIC
- This topic is for feature discussion only—Please share use cases, best practices, questions, clarifications, etc. regarding this feature.
- Please thread replies as possible to keep posts organized. This will help everyone to track discussions and to participate.
ADDITIONAL FEEDBACK—WHERE TO POST IDEA SUGGESTIONS AND SUBMIT BUG REPORTS
- Idea requests for the Product Team to consider should be submitted in ideas.canvaslms.com . This is the best way to ask for additional or alternative functionality than what is included in the release. Linking to the related idea here so others can find it is encouraged!
- Bug reports should be submitted to Canvas Support. This way, cases can be connected and uniform updates can be provided to those involved. Bugs will not be triaged in this thread
Hi Team,
I really feel like the GraphQL changes are pretty major for any developers who have implemented GraphQL into any of their code. With that said, it seems like this change should be included in a release rather than a deploy, and probably be announced much sooner than a couple weeks before the change is implemented.
Is there any chance this change and the release schedule for it could be reviewed and modified to minimize the risk of breaking a bunch of 3rd party apps and custom code?
-Chris
I agree, going from unlimited to limited is going to be a big change. Many libraries likely didn't even implement pagination since it wasn't a thing, and are going to only end up getting the first 10 results without code changes.
It seems like implementing a timeout would have been a less disruptive approach. Having to page through 200 pages and make 200 HTTP requests for a graphql query that previously ran for 5 seconds and returned 20K results (like all course id's in a term) seems much less efficient to features in Canvas and external apps.
I want to mirror @chriscas sentiment.
I understand that if we're trying to prevent denial of service that something needs done (and relatively quickly), but this seems excessive to the point that the benefits of GraphQL (as previously implemented) are lost.
There are some requests that are very simple and don't take a lot of resources. We are a smaller institution. I can fetch 419 courses from Spring 2024 in 300 ms using a single GraphQL request. It takes 160 ms to get 100 results, so I'm now looking at 800 ms to make 5 requests to get those same 419 courses. That's a quick and one could argue that 800 ms vs 300 ms is still pretty quick.
I wonder if it might actually be less load on your servers to return all of the results from a simple request like this rather than making the same request five times -- or 200 times like @jonespm mentioned. If I make 200 sequential requests at 160 ms each, that's 32 seconds, plus the extra load for your servers to process each of those requests and query the database.
Perhaps, there could be some kind of limit such as letting requests with no more than x (pick a number, 3?) connections return unlimited responses while those with more than x connections require pagination. I'm not a fan of that because I acknowledge that some connections are more costly, like the enrollments connection.
Maybe you could just base it on a percentage of the limits since you're already computing the depth, complexity, and number of tokens to decide whether to return an error. Say requests that are under 1/3 of the limits allow unlimited responses. If the max depth is less than 5, the complexity is less than 125,000, and there are less than 1666 tokens, then you get unlimited results. That's just a suggestion, but it seems that someone with a query that complex is expecting it to timeout with more than 60 seconds anyway, so they've probably already implemented pagination.
Did you look at requests to see which ones seem legitimate when you arrived at these numbers? Did you reach out to the institutions making those calls to see if they could simplify them?
Other suggestions include implementing some kind of x-rate-limit-remaining strategy like what is used for the REST API would be helpful to people. Perhaps reducing the amount of time allowed from 60 seconds to 30 or 15 seconds would cut back on the denial of service attacks. Then people writing long queries would know they would have to involve pagination.
But changing from unlimited to restricted is a major breaking change for some of us. Not just a case of "no error, but fewer values are returned than expected."
Part of the great benefit of the GraphQL over REST API was the ability to get all of the results at once. It reduced the complexity of the code. As you can guess, none of my GraphQL queries use pagination. I also don't use a library for GraphQL because it's a simple fetch response that I need to then parse. The GraphQL libraries for JavaScript were pretty substantial additions and, if you're running inside a browser, you don't want to download extra libraries you're not going to need.
For Canvas, it's not a big deal. They have the Apollo library and can make the change in one place.
For some of us, it's a non-trivial change. While no error is returned, you just get fewer responses. If someone misses this change (thank you Chris for alerting me; I had a subscription to API changes but this showed up under a different thread), all of a sudden some things just stop processing. I use the "graded since" and "submitted since" for submissions to reduce the amount of information requested. That is frequently under 20, so I wouldn't think anything about it and assume I had fetched all of the information and update my timestamp for the next query. The extra records would never get processed and we would wonder why students weren't getting handled properly. Eventually, we would track it down.
Worse, if someone is assuming that all results are being returned, they may be relying on that behavior and assuming a lack of a record means it's not there. I have code that automatically downloads submissions for an LTI assignment and puts zeros in for students who haven't submitted yet. That code would now put in zeros for the students who had submitted but just didn't happen to be in the first 20 items returned.
There's a huge difference between 20 responses and knowing that the rest haven't been fetched and thinking that there's just 20 responses in total. In some cases, a complete failure would be better than a partial failure.
Making multiple requests and merging results from GraphQL is not as trivial as merging results from the REST API. The merging could appear at various levels of nesting with GraphQL while they were as simple as combining a top-level array with the REST API.
My primary job is a full-time mathematics professor and I handle the Canvas side of things around that job. I'm the only person at our institution who writes code for the backend processes. 12 days is a really short time period.
I've been busy rewriting years of code after we lost the server that housed it on February 17. That rewrite is still not complete, but I've heavily implementing GraphQL since I started rewriting the code because of the speed improvements and the simplicity. For my teaching, I have a lot of standalone scripts with GraphQL statements in a bunch of files and all of them are going to need to be rewritten.
Perhaps you could implement the limit on the max depth of 15, complexity of over 375,000, and limiting query strings to 5000 tokens now (October 9) while holding off on the pagination limitations for the time being.
While those limits seem pretty generous, they may cut back on some of the denial of service attacks alone without limiting the number of results returned.
I acknowledge that some legitimate scripts might break if you restrict the unlimited to those just under 1/3 of the those thresholds. Hopefully that's going to be a small number. And hopefully it's something that Canvas could look at their logs and reach out to customers who are making those calls. Those intended to be denial of service are likely going to be much longer and complex. They could also be well-intentioned by people who just don't understand what they're doing and made it overly complex (like including a connection that had already been included previously).
A suggestion for that is to generate an error if someone is above the 1/3 range but does not specify pagination. That way it fails and they don't think that it generated the full set of results.
I would propose a tiered system for the time being.
The middle tier is configurable, of course. I just picked 1/3 as a potential threshold.
If you want to ruin GraphQL, then go ahead and force pagination. But give us more time to fix things.
Pagination with GraphQL is going to be a bigger issue to resolve than I initially thought.
I went into the beta instance and looked at the graphiql interface. Every connection is limited to 20 results by default or 100 maximum.
Let's say I want to know every assignment for every course within the current term.
This query, which currently returns all assignments for all courses in this term took 14.8 s, but got everything in one request. Now, in beta, it only returns up to 400 assignments. I'm getting the first 20 courses and the first 20 assignments within each course.
query courseAssignments {
term(sisId: "2024c") {
coursesConnection {
nodes {
_id name state
assignmentsConnection {
nodes {_id name state }
}
}
}
}
}
I cannot guarantee that all of the courses have fewer than 100 assignments (one of mine has well over 100 assignments). So now I have to handle pagination twice.
I can get up to 100×100=10000 assignments with this query.
query courseAssignments {
term(sisId: "2024c") {
coursesConnection(first: 100) {
nodes { _id name state
assignmentsConnection(first: 100) {
nodes { _id name state }
pageInfo { hasNextPage endCursor }
}
}
pageInfo { hasNextPage endCursor }
}
}
}
This request took 3 s, which is pretty quick, but it's incomplete and I now have to handle pagination on two levels: courses and assignment.
That is going to dramatically increase the complexity for people trying to get the data on their own without including some GraphQL library that handles multiple levels of pagination. Previously, I didn't have to parse the results within the code making the call, I could do that after the results were fetched.
When I've had to use pagination before, because the requests were too long, I only had to do it on the outer layer. This is similar to how nested requests work with the REST API. If you get assignment groups and include the assignments, the pagination applies to the number of groups, not the number of assignments within each group. There was just one level of pagination.
This is not something I want to tackle at 2 am after a full day of teaching so I will have to come back to it.
To interact with Panda Bot, our automated chatbot, you need to sign up or log in:
Sign InTo interact with Panda Bot, our automated chatbot, you need to sign up or log in:
Sign In