Bookmarking for Enrollments Index API in upcoming release

nardell
Community Participant

This question may be covered in the release notes and I missed it. One of the pending changes is the "Bookmarking for Enrollments Index API". It seems that this change will be applied to production Canvas instances on June 17. I believe I understand the effect of the change: the Link section in the Response Header will no longer provide a range of pages, as in:

<https://your.institution.instructure.com/api/v1/courses/:course_id/enrollments?page=1&per_page=100>; rel="current",<https://your.institution.instructure.com/api/v1/courses/:course_id/enrollments?page=2&per_page=100>; rel="next",<https://your.institution.instructure.com/api/v1/courses/:course_id/enrollments?page=1&per_page=100>; rel="first",<https://your.institution.instructure.com/api/v1/courses/:course_id/enrollments?page=3&per_page=100>; rel="last"

Instead, links will be returned as:

<https://your.institution.instructure.com/api/v1/courses/:course_id/enrollments?page=first&per_page=100>; rel="current",<https://your.institution.instructure.com/api/v1/courses/:course_id/enrollments?page=bookmark:WyJTdHVkZW50RW5yb2xsbWVudCIsIkxhc3RuYW1lLCBGaXJzdG5hbWUgIiwxMjM0NTY3XQ==&per_page=100>; rel="next",<https://your.institution.instructure.com/api/v1/courses/:course_id/enrollments?page=first&per_page=100>; rel="first"

Henceforth it will be necessary to recursively follow links in any API code that uses the Enrollments endpoint. Just want to make sure I have the story correct. It seems that, since this would be a potentially breaking change for an institution's integrations (if the integration relied upon explicit page numbers in its page-following scheme), Instructure is providing a period of time for the institution to accommodate this change, and that period is soon coming to an end. I am curious if anyone knows whether Instructure is moving away from providing page ranges in REST responses. Though I do not depend on the page numbers in production code, I appreciated that they allow for parallel execution of API requests.
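For anyone adapting their code, here is a minimal sketch of following the rel="next" link until it disappears, rather than counting page numbers. The `fetch` callable is a hypothetical stand-in for whatever HTTP client you use (it is not part of any Canvas library); it is assumed to return the parsed JSON body and the raw Link header for a URL.

```python
import re
from typing import Callable, Dict, Iterator, Optional, Tuple

# Matches one entry of a Link header: <url>; rel="name"
_LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def parse_link_header(header: str) -> Dict[str, str]:
    """Return a {rel: url} mapping built from a Link response header."""
    return {rel: url for url, rel in _LINK_RE.findall(header or "")}


def iter_pages(first_url: str,
               fetch: Callable[[str], Tuple[list, str]]) -> Iterator[list]:
    """Yield each page of results, following rel="next" until it is absent.

    `fetch` is a hypothetical callable that performs the GET and returns
    (parsed JSON body, raw Link header); plug in your own HTTP client.
    """
    url: Optional[str] = first_url
    while url:
        body, link_header = fetch(url)
        yield body
        # The URL is treated as opaque: it may hold a number or a bookmark.
        url = parse_link_header(link_header).get("next")
```

Because the next URL is never inspected or rebuilt, the same loop should work whether the endpoint returns numbered pages or bookmarks.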

Thanks,

Michael Nardell

1 Solution
James
Community Champion

 @nardell  

If you look at the code used for the bookmarking, it generates SQL code to use as a starting point the next time you make the call, based on where it left off with the current data. If my key for sorting was id and I left off with id=1234, then the bookmark for the next page would translate into a WHERE id > 1234. It's more complicated than that, but that appears to be the gist of it.

For the enrollments API, it uses enrollment.type, user.sortable_name, and enrollment.id as the key. In this case, the information returned after a bookmark could change. If I leave off with a StudentEnrollment for "Smith, John" who has an enrollment_id of 1234, then it generates code something like this:

WHERE
(type > 'StudentEnrollment') OR
(type = 'StudentEnrollment' AND sortable_name > 'Smith, John') OR
(type = 'StudentEnrollment' AND sortable_name = 'Smith, John' AND id > 1234)

Looking at that, there isn't anything immutable about it. Between the time I made the first call and the time I made the second call, I might have added "Doe, Jane" as a StudentEnrollment and I would never get her out of the call because I had already passed that point. Likewise, I might delete the enrollment for "Thomas, Jordan" after I had fetched it and so when I went back to make that call again, I would get different data than I did the first time.

In the case of Jane, she is missing until the next time you fetch the enrollments. Jordan remains there even though he's been deleted.

This could have happened without using bookmarks as well. If John was record number 20 (page=2&per_page=10) and Jane gets added, then John gets duplicated since he's now record number 21. Jane is still missing, though. You wouldn't get the duplication with the bookmark approach, so at least it fixes that issue.
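To make the difference concrete, here is a small in-memory simulation of both paging styles when Jane is added between two requests. It is only a sketch: the rows, names, and ids are made up, and Python's tuple comparison stands in for the database's ordering (a real database would sort names by its collation, not by code point).

```python
from typing import List, Optional, Tuple

Row = Tuple[str, str, int]  # (enrollment type, sortable_name, enrollment id)


def offset_page(rows: List[Row], page: int, per_page: int) -> List[Row]:
    """Numbered paging: slice the sorted list by position."""
    start = (page - 1) * per_page
    return sorted(rows)[start:start + per_page]


def keyset_page(rows: List[Row], after: Optional[Row],
                per_page: int) -> List[Row]:
    """Bookmark paging: return rows whose key sorts after the bookmark.

    Python compares tuples element by element, which mirrors the three
    OR-ed conditions in the WHERE clause shown earlier.
    """
    ordered = sorted(rows)
    if after is not None:
        ordered = [r for r in ordered if r > after]
    return ordered[:per_page]


rows = [
    ("StudentEnrollment", "Adams, Amy", 1),
    ("StudentEnrollment", "Smith, John", 2),
    ("StudentEnrollment", "Thomas, Jordan", 3),
]
jane = ("StudentEnrollment", "Doe, Jane", 4)

# Numbered pages: Jane is added between the first and second request.
offset_first = offset_page(rows, 1, 2)
offset_second = offset_page(rows + [jane], 2, 2)

# Bookmarked pages: same scenario, resuming after the last row seen.
keyset_first = keyset_page(rows, None, 2)
keyset_second = keyset_page(rows + [jane], keyset_first[-1], 2)
```

With numbered pages, John closes the first page and opens the second; with the bookmark, the second page starts strictly after John. Jane is missed either way.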

In reality, no one is likely to be jumping straight to page 3 as their first request for a list of enrollments. Similarly, you shouldn't use a bookmarked page a month after it was generated (if they're around that long).

The history on the patch to use bookmarks says it's to fix the sort. Fixing the sorting wasn't what I remembered from the announcement, though. It was to improve the performance (or something like that) of the servers. That may well have been because too many people were hitting the API with parallel requests. The enrollments API is more costly than some to generate. When I was updating my code to download the access report for every student in a class, it was the list of the users with enrollments that was the expensive call. Getting the access report data wasn't costly at all. That might be why they chose to do this to enrollments.

For a long time, I've checked the link headers for the presence of a last link and whether it contains a numeric page parameter (I've got some code I probably need to double check and probably update). If the page number was present, then I would take advantage of the parallelism. If not, then I would make the calls sequentially.
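A rough sketch of that check in Python, using only the standard library. The function names are hypothetical, and it assumes the first request's URL and Link header are already in hand: if the rel="last" link carries a numeric page, it builds the remaining page URLs so they can be requested in parallel; otherwise it returns None as the signal to fall back to following the next links one at a time.

```python
import re
from typing import List, Optional
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

_LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def last_page_number(link_header: str) -> Optional[int]:
    """Return the numeric page of the rel="last" link.

    Returns None when there is no last link or when its page parameter
    is not a plain number (for example, a bookmark).
    """
    for url, rel in _LINK_RE.findall(link_header or ""):
        if rel == "last":
            page = parse_qs(urlsplit(url).query).get("page", [""])[0]
            return int(page) if page.isdecimal() else None
    return None


def parallel_page_urls(first_url: str,
                       link_header: str) -> Optional[List[str]]:
    """Return URLs for pages 2..last, or None to signal sequential paging."""
    last = last_page_number(link_header)
    if last is None:
        return None
    parts = urlsplit(first_url)
    query = parse_qs(parts.query)
    urls = []
    for n in range(2, last + 1):
        query["page"] = [str(n)]
        new_query = urlencode(query, doseq=True)
        urls.append(urlunsplit(parts._replace(query=new_query)))
    return urls
```

The returned list could be handed to a thread pool or an async client; how much parallelism the server tolerates is a separate question.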

Some people have commented about how they just put in a sequence of numbers and rely on getting an error when the data is exhausted, without looking at the link headers at all. That's not a good way to do this. They will definitely get bitten if they're using this endpoint, and there's a good chance that Canvas will do this to other endpoints in the future.


Right now, it doesn't appear to be a wholesale change. The code to switch it over to a collection and bookmarks wasn't trivial.

For what it's worth (I didn't know this until I was preparing this response), a bookmark is a Base64URL encoding of the stringified JSON of the data structure that contains the key information from where the request left off. You can take it and paste it into an online Base64URL decoder (such as the one at Base64Decode.org) and find out what information is contained in it. It is not a bookmark saved to a database somewhere that is looked up when received to find out what the information is.

If someone wanted to, they could say "Give me all of the enrollments after 'Smith, John'" without knowing what page that happened with. That's something you cannot do with the numbered pages. Any extra padded equal signs at the end can be ignored. In my limited testing, you do have to put something in for the enrollment_id, but it doesn't have to be correct.

I put in ["StudentEnrollment","Smith, John",0] and got the Base64URL encoded value of WyJTdHVkZW50RW5yb2xsbWVudCIsIlNtaXRoLCBKb2huIiwwXQ==. When I tried page=bookmark:WyJTdHVkZW50RW5yb2xsbWVudCIsIlNtaXRoLCBKb2huIiwwXQ, I picked up with the first person after John. 
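The same round trip can be reproduced in Python with the standard library. This is a sketch of the encoding as I've described it, not Canvas's own code, and the helper names are made up. Note the compact JSON separators: the default `json.dumps` output adds a space after each comma and would produce a different token.

```python
import base64
import json
from typing import Any, List


def encode_bookmark(key: List[Any]) -> str:
    """Serialize the key as compact JSON and Base64URL-encode it."""
    raw = json.dumps(key, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_bookmark(token: str) -> List[Any]:
    """Decode a bookmark token, restoring any stripped '=' padding first."""
    padded = token + "=" * (-len(token) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


token = encode_bookmark(["StudentEnrollment", "Smith, John", 0])
# Value for the page query parameter, with the trailing padding dropped.
page_param = "bookmark:" + token.rstrip("=")
```

Here `token` comes out as the padded value above, and `decode_bookmark` accepts it with or without the trailing equal signs.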

I don't know how useful that is, but it explains a little mystery I had about just what a bookmark was.

Strangely, when I decode the bookmark you provided, it has "StuUentEnrollment" instead of "StudentEnrollment".
