From the Engineering Deck: Instructure & GraphQL

The content in this blog is over six months old, and the comments are closed. For the most recent product updates and discussions, you're encouraged to explore newer posts from Instructure's Product Managers.

EthanVizitei
Instructure Alumni

“Not every change is an improvement but every improvement is a change”  ~Eliezer Yudkowsky

Instructure has a long history of opening the systems we build to extension and enhancement. When we build products like Canvas, we want to make sure they work well for ALL our customers. That means building the best out-of-the-box features we can for the 80% of our users who need an off-the-shelf LMS, and keeping the cogs and gears open and accessible so the other 20% can build what they need to help their specific teachers and students be successful. After a long series of forays and experiments examining whether GraphQL can help us improve our API development experience, we think the answer is “yes”. Therefore, over the next couple of years Canvas will be migrating its venerable API in that direction, with the goal of greatly simplifying what it means to build with Canvas.

Why are we changing how our API works?

All changes have costs, so if your immediate reaction to hearing about an overhaul of something that’s worked well for you in the past is “whyyyyyy?!?!”, you’re not alone.  We certainly don’t take this kind of directional shift lightly, as it means a lot of new work for us too. If you’ve worked with Canvas APIs yourself in the past though, hopefully I can help you see some of the things we’re seeing.  The main idea to keep in mind is that GraphQL takes us away from APIs that are organized according to “how the data is stored” in the database, and instead focuses on “how the data is going to be used.”


Let’s look at a Canvas API endpoint together, maybe something simple like a single Assignment in a course.

[Image: screenshot of the Assignment object returned by the Canvas API]

…OK, maybe it’s not as simple as it sounded. Can you believe how far I had to zoom out to capture all of that in this post? There are almost 70 fields on that single object; no wonder so many of them have been made optional over time. Imagine your goal is to build a little widget that just shows the names of the courses you teach and lists the assignments in each one. You’ll need to get all the courses you’re associated with ( https://canvas.instructure.com/doc/api/courses.html#method.courses.index ), and then you’ll need to make a separate “assignments” request for each of them ( https://canvas.instructure.com/doc/api/assignments.html#method.assignments_api.index ), just to parse and extract the names. That means for your fairly simple use case we’re giving you both WAY TOO MUCH data and also not quite the data you need. You have to adapt our data storage patterns to your use case.
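To make the request count concrete, here is a rough sketch of that widget written against the REST API. The two paths correspond to the documentation links above; the `get_json` callable, the helper name, and the fake data are assumptions made up for illustration, and pagination is ignored.

```python
from typing import Any, Callable

# Hypothetical transport: any function that takes an API path and returns parsed JSON.
GetJson = Callable[[str], Any]


def rest_assignment_names(get_json: GetJson) -> dict[str, list[str]]:
    """Collect assignment names per course, REST-style (pagination ignored)."""
    result: dict[str, list[str]] = {}
    # One request for the course list...
    for course in get_json("/api/v1/courses"):
        # ...plus one more request per course for its assignments.
        assignments = get_json(f"/api/v1/courses/{course['id']}/assignments")
        # Each assignment arrives with all of its fields; we keep only the name.
        result[course["name"]] = [a["name"] for a in assignments]
    return result


# Stand-in for a real HTTP client so the sketch runs offline (made-up data).
FAKE_API = {
    "/api/v1/courses": [{"id": 1, "name": "Biology"}, {"id": 2, "name": "Chemistry"}],
    "/api/v1/courses/1/assignments": [{"id": 10, "name": "Cell Lab", "points_possible": 20}],
    "/api/v1/courses/2/assignments": [{"id": 11, "name": "Titration", "points_possible": 15}],
}
requests_made: list[str] = []


def fake_get_json(path: str) -> Any:
    requests_made.append(path)
    return FAKE_API[path]


names = rest_assignment_names(fake_get_json)
print(names)               # {'Biology': ['Cell Lab'], 'Chemistry': ['Titration']}
print(len(requests_made))  # 3: one for the courses, one per course for assignments
```

With N courses this pattern costs N + 1 requests, and every response carries far more fields than the widget reads.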

It goes a little deeper, though, because there will be other users with other use cases that are similarly sparse. Maybe another teacher is trying to put together a script to put all the upcoming due dates for assignments in their classes up on a live monitor in the hall outside their classroom. Maybe an IT support staff member is computing tallies for the school on how many assignments do and do not require peer reviews. Each of these other cases only needs a couple of fields from each assignment, but today they all have to sift through many requests with too much data.

It’s not just end users writing API integrations who have those impedance mismatches, by the way. Instructure developers are API consumers too, and different features and workflow areas in our Canvas browser experience require very different views of common objects. This is especially true for mobile applications, which often have tightly focused workflows and have to work in very bandwidth-limited situations. With GraphQL, we think we can make this story better.

How does GraphQL help me?

Imagine the Canvas schema looks something like this (NOTE: THIS IS NOT THE REAL SCHEMA! FOR DEMONSTRATION PURPOSES ONLY!):

[Image: example GraphQL schema, for demonstration only]
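For readers following along in text, a schema of that general shape might be sketched as follows. Everything here is hypothetical: the type names come from this post, but the exact field names (`dueAt`, `peerReviews`, and so on) and field types are assumptions, and none of it is Canvas’ real schema. The small helper just prints the dict in GraphQL’s schema definition language.

```python
# Hypothetical schema, written as {type name: {field name: field type}}.
# Illustration only; this is not Canvas' real schema.
SCHEMA = {
    "Query": {"myCourseLoad": "[Course!]!"},
    "Course": {"name": "String!", "assignments": "[Assignment!]!"},
    "Assignment": {
        "name": "String!",
        "dueAt": "String",
        "peerReviews": "Boolean",
        "submissions": "[Submission!]!",
    },
    "Submission": {"user": "User!"},
    "User": {"name": "String!"},
}


def to_sdl(schema: dict[str, dict[str, str]]) -> str:
    """Render the dict in GraphQL schema definition language (SDL)."""
    blocks = []
    for type_name, fields in schema.items():
        lines = [f"  {field}: {field_type}" for field, field_type in fields.items()]
        blocks.append("type " + type_name + " {\n" + "\n".join(lines) + "\n}")
    return "\n\n".join(blocks)


sdl = to_sdl(SCHEMA)
print(sdl)
```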

If you haven’t read a GraphQL schema before you might not know what to make of it, but hopefully it’s at least clear that there are some familiar types in there (Course, Assignment), and that there are connections between them (Submission has a User, Course has a list of Assignments, etc). Naturally, Assignment and Course have many other fields defined (maybe 60 or 70 each!), but that’s not important for the example; just assume they are there. Now let’s re-evaluate what a query pattern would look like if you want to get the assignment names from your courses:

[Image: example GraphQL query for course and assignment names]
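In text form, a query along those lines could look like the sketch below. The field names are taken from the demonstration schema idea in this post and are not Canvas’ real fields. The helper wraps the query the way GraphQL servers conventionally accept it over HTTP, as a JSON body with a `query` key; it only builds the body and sends nothing.

```python
import json
from typing import Any, Optional

# Hypothetical query: `myCourseLoad`, `assignments`, and `name` are
# illustrative names, not Canvas' real schema.
QUERY = """
query {
  myCourseLoad {
    name
    assignments {
      name
    }
  }
}
"""


def build_request_body(query: str, variables: Optional[dict[str, Any]] = None) -> str:
    """Serialize a query as a JSON body with "query" and optional "variables"."""
    body: dict[str, Any] = {"query": query.strip()}
    if variables:
        body["variables"] = variables
    return json.dumps(body)


body = build_request_body(QUERY)
print(body)
```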

 

Here we’ve constructed a query that asks for the “myCourseLoad” entry point and specifies the fields it cares about in the schema graph.  We want names for our courses, and the names for the assignments for those courses, and that’s it. So what might the response look like?

[Image: example response to the query]
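As a sketch of what such a response could contain and how little code it takes to consume, consider the example below. The course and assignment names are invented, and the `myCourseLoad` shape is an assumption carried over from the demonstration schema; the top-level `data` key is where GraphQL responses put their results.

```python
import json

# Made-up response for the hypothetical `myCourseLoad` query; real data would differ.
RESPONSE_TEXT = """
{
  "data": {
    "myCourseLoad": [
      {"name": "Biology", "assignments": [{"name": "Cell Lab"}, {"name": "Field Notes"}]},
      {"name": "Chemistry", "assignments": [{"name": "Titration"}]}
    ]
  }
}
"""


def names_from_response(response_text: str) -> dict[str, list[str]]:
    """Flatten the nested response into {course name: [assignment names]}."""
    courses = json.loads(response_text)["data"]["myCourseLoad"]
    return {c["name"]: [a["name"] for a in c["assignments"]] for c in courses}


course_names = names_from_response(RESPONSE_TEXT)
print(course_names)
# {'Biology': ['Cell Lab', 'Field Notes'], 'Chemistry': ['Titration']}
```

The response mirrors the shape of the query, so there is nothing to filter out before use.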

Well look at that, it’s only the data we care about (nothing about submissions, peer reviews, or due dates, even though those are in the schema), and perhaps more importantly we got all those assignment names we cared about in a single query rather than issuing a different query per course. Furthermore, with the same schema above, our other use cases (the ones caring about due dates and peer reviews) can also craft queries that only involve the data THEY care about, without anyone having to make any changes to the API.

That flexibility doesn’t only benefit you directly; it also has implications for your quality of life in using Instructure-built API clients (Canvas in the browser, mobile apps, etc). As we move to GraphQL everywhere, those systems can each cut down on user wait time (by issuing fewer requests), bandwidth (by sending less data over the wire), and development time for the system improvements you want (by crafting novel queries that recombine existing API elements in specific ways rather than writing new special-purpose API endpoints for new use cases). We’ve seen this firsthand in the experimental steps our mobile teams have taken in this direction. In Instructure’s mobile speed grader, courses with more than the “normal” number of students would take so long to load that in some cases it was impossible to interact with them. On top of GraphQL, we can get just the data the speed grader needs in as few requests as possible, so features that were contextually unusable before can become not just possible but delightful once they’re allowed to specify their data fetching according to how they want to use it.

In addition to flexibility, there’s the delightful tooling this API ecosystem enables. Because of the strong typing in the schema definition, tools can validate proposed queries automatically without an execution test (i.e. you can ask your schema-aware tool whether a query you want to run checks out as valid for the target schema). The schema itself can also be introspected, which makes it possible to stand up interactive tools that mix the declared types and their documentation into an experience you can actually fiddle with live to experiment with your query ideas. It will take some time to shift our current API documentation into that kind of intermixed format, but the benefits for enabling external integrators to develop their queries efficiently are potentially huge.
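The idea of validating a query against the schema without running it can be illustrated with a toy checker. This is only a sketch: real GraphQL tooling parses the query text and applies many more rules, whereas here both the schema and the field selection are plain dicts, and all type and field names are hypothetical.

```python
# Hypothetical schema as {type: {field: type of that field}}; scalars have no entry.
SCHEMA = {
    "Query": {"myCourseLoad": "Course"},
    "Course": {"name": "String", "assignments": "Assignment"},
    "Assignment": {"name": "String", "dueAt": "String"},
}


def validate(selection: dict, type_name: str = "Query", path: str = "") -> list[str]:
    """Return a list of problems found in a nested field selection.

    `selection` maps each requested field to a sub-selection dict, or to None
    for a leaf. Nothing is fetched or executed; only the schema is consulted.
    """
    errors: list[str] = []
    fields = SCHEMA[type_name]
    for field, sub_selection in selection.items():
        where = f"{path}.{field}" if path else field
        if field not in fields:
            errors.append(f"{where}: no such field on {type_name}")
            continue
        field_type = fields[field]
        if field_type in SCHEMA:
            # Object types must say which of their fields are wanted.
            if not sub_selection:
                errors.append(f"{where}: {field_type} needs a sub-selection")
            else:
                errors.extend(validate(sub_selection, field_type, where))
        elif sub_selection:
            errors.append(f"{where}: {field_type} is a scalar and has no fields")
    return errors


good = {"myCourseLoad": {"name": None, "assignments": {"name": None}}}
bad = {"myCourseLoad": {"title": None, "assignments": {"name": {"first": None}}}}

print(validate(good))  # []
print(validate(bad))
# ['myCourseLoad.title: no such field on Course',
#  'myCourseLoad.assignments.name: String is a scalar and has no fields']
```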

What’s the roadmap?

It’s early days for GraphQL at Instructure, and we’re still figuring out the “when” and “how”. As always, we try to give our customers the time they need to migrate when changes do occur, and we’re keeping an eye on stability and scalability as we introduce new ecosystem components. That means it may be a couple of years before GraphQL has taken over our whole ecosystem. What we do have a solid bead on today is the “why”. We see tremendous potential in this style of API organization, where we focus on “how the data is going to be used”.

We want the use of Instructure’s API data (by our customers, our partners, and our own web/mobile clients) to be both simple and efficient, and this seems like a step in that direction. As a result, we believe this is one of those changes where the benefits to our customers and to ourselves outweigh the costs, and we’re going to do everything we can to make those costs bearable and the transition period smooth. That includes reasonable and accommodating deprecation periods, thorough and interactive documentation, and a commitment to being the earliest-adopting customers of our own APIs so that we hit the rough patches before you do. Instructure will probably never stop changing, but we’ll work hard to ensure those changes are all improvements.

Related Reading

  1. Learn about GraphQL: https://graphql.org/learn/
  2. Learn about the Apollo tooling ecosystem: https://www.apollographql.com/docs/
  3. Canvas’ GraphQL playground: https://canvas.instructure.com/graphiql (substitute your institution’s domain)
