glparker
Adventurer

Submissions API not returning all submissions?

I think I'm running into an API bug and am hoping you can confirm. I know the URLs won't work for most of you, but hopefully you can recognize something I'm missing.

I am calling the Submissions API to return all the submissions for a particular quiz in a particular course. FWIW, there are 6300 students in this course.

The Course (982096)

https://usflearn.instructure.com/courses/982096

The Assignment : Quiz 5 (3156530)

The student is Adam Wetsch (ID 3867214). His submission clearly has a score of 100; I can click it and see his submission to the quiz.

http://note.io/1DYnwhg

However, the API is not including his submission in the results. I am iterating over the 600-ish paged calls, and I do not see his submission in the list. I see more than 6000 other submissions, but his (and a few others) are not included.

Here is the API call I am making

https://usflearn.instructure.com/api/v1/courses/982096/assignments/3156530/submissions

If I use the API to load his submission directly by UserID, it loads fine. 

https://usflearn.instructure.com/api/v1/courses/982096/assignments/3156530/submissions/3867214

I've put in a ticket with support to see if they can replicate what I'm seeing. 

Has anyone else used Submissions and found that some records are missing?

Thanks, Glen

23 Replies

marcelo.amorim​,

Are you having the same issue with submissions or with some other API call not returning all of the information?

It's related to not specifying an order for the data returned. When no order is specified, PostgreSQL is free to return the data in whatever order it wants to. That's not a problem when everything fits on one page, but it is sometimes a problem when pagination is involved.
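
To see why an unspecified order matters once results are paged, here is a toy JavaScript sketch. It is not Canvas or PostgreSQL code: the two orderings and the `pageOf` helper are invented purely to illustrate how a row can fall between pages when the order changes from one request to the next.

```javascript
// Toy model: the "database" returns the same rows in a different order
// on each request because no ordering was specified (orders are invented).
const orderOnFirstRequest = [101, 102, 103, 104, 105, 106];
const orderOnSecondRequest = [101, 104, 103, 102, 105, 106];

// Take page `page` (1-based) of size `perPage` from whatever order we got.
function pageOf(rows, page, perPage) {
  const start = (page - 1) * perPage;
  return rows.slice(start, start + perPage);
}

const perPage = 3;
const seen = new Set([
  ...pageOf(orderOnFirstRequest, 1, perPage),  // 101, 102, 103
  ...pageOf(orderOnSecondRequest, 2, perPage), // 102, 105, 106
]);

// Rows that never appeared on any page, although both pages were fetched.
const missing = orderOnFirstRequest.filter((id) => !seen.has(id));
console.log(missing); // [ 104 ]
```

Row 104 is lost and row 102 arrives twice, even though the client requested every page exactly once.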

In addition to the postings above, there was a discussion of this in a related thread (see the bottom of it): "How is the To-Do List sorted?"

Somewhere, I'm having trouble finding it right now, I think Glen said he just specified a particular per_page and it worked so he stopped worrying about it. It may have been in person when I saw him at InstructureCon15. I don't remember which per_page he picked, but the real solution is for Canvas to make sure that all of the information is returned.

Well, I've developed an external app that uses the API to get all submissions from a specific quiz.

But unfortunately I can't get it right. I'm worried about how my app is going to work this way.

Just to be clear -- by all submissions, do you mean that there are students whose submissions do not show up through the API or that there are students who completed multiple attempts and you're not getting all of those attempts?

If it's the first kind, which is what Glen was experiencing where some were sporadically and seemingly unpredictably not getting returned, then you might file a trouble ticket with Canvas and refer to this discussion and the other one I mentioned. Maybe they never fixed it.

If it's a kind where you get a bunch and then it stops, it might be a timeout issue. I don't know what software you're using, but Google Sheets has a five-minute timeout. In another issue that came up, Chrome seemed to apply a timeout while Firefox didn't. This shouldn't be an issue if you're running server-based command-line software written in Perl, Python, PHP, etc. If you're accessing the results through a web browser, it might time out before it's done. For instance, in PHP, there is a max_execution_time parameter that defaults to 30 seconds when running from within a web server (for example, Apache) but 0 (no timeout) when running from the command line.

If you're only getting 10 (or whatever the per_page setting is), then it's a pagination issue.

I mention those other issues just in the off-chance that we're missing something. I'd hate for you to submit a trouble ticket and find out it's something that we could have helped with. At this point, I don't know enough other than if it's the same issue that Glen was seeing, I'd file a trouble ticket. If it's anything else, it might have a simpler fix that didn't involve contacting Canvas support.

marcelo_amorim
Community Member

I need the API to verify which students have submitted, by comparing the results against a previously prepared list of students. The API data is processed through a PHP Google App Engine service, and it seems to be unstable: I have multiple quizzes and most of them work well. I checked the "per_page" attribute and it doesn't influence the result.
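
The roster check described here amounts to a set difference. Below is a minimal sketch of it, written in JavaScript to match the snippet that appears later in this thread rather than the PHP the app actually uses. `missingSubmitters` is a hypothetical helper and the sample data is made up; the sketch assumes each submission object carries the `user_id` and `workflow_state` fields.

```javascript
// Return the roster IDs that have no real submission in the API results.
// Assumes each submission carries `user_id` and `workflow_state`.
function missingSubmitters(rosterIds, submissions) {
  const submitted = new Set(
    submissions
      .filter((s) => s.workflow_state !== 'unsubmitted')
      .map((s) => s.user_id)
  );
  return rosterIds.filter((id) => !submitted.has(id));
}

// Made-up sample data.
const roster = [11, 12, 13, 14];
const submissions = [
  { user_id: 11, workflow_state: 'graded' },
  { user_id: 13, workflow_state: 'unsubmitted' },
  { user_id: 14, workflow_state: 'submitted' },
];
console.log(missingSubmitters(roster, submissions)); // [ 12, 13 ]
```

A check like this is only as reliable as the list it is given, so any records dropped during paging will show up as false "missing" students.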

Maybe, opening a ticket is the best thing. I'll be back if a solution comes.

Thanks for replying!

stuart_ryan
Community Coach

Hi  @glparker ,

I am going through having a look at some of the early days in Canvas Developers and checking in to see if older enquiries have been answered.

I am wondering, were you ever able to find out the cause of your issue? I am hoping I can assume that it is well and truly resolved by now, but if not, please let us know.

I am going to mark this as assumed answered for the time being, however by all means please let us know if you still have an outstanding issue and we can have another look!

Cheers,
Stuart

aetherus_zhou
Community Member

I found the cause of this problem.

If I do the pagination manually (i.e. ignore the Link header), then I have to be very careful about the `per_page` parameter, because Canvas has an upper limit on this parameter for each API that supports pagination, and the limits vary among the APIs. If you set `per_page` beyond the limit, Canvas lowers it to the limit and does not tell you.

For example, when I call the API `GET /api/v1/accounts/1/courses?page=1&per_page=200`, Canvas lowers the page size to 50, but I don't know that, so my next call is `GET /api/v1/accounts/1/courses?page=2&per_page=200`. It should respond with the 51st course through the 100th, but somehow it decides to return the 201st course through the 250th, so the 51st through the 200th courses will never be fetched.
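
The failure mode described above can be reproduced with a toy model. This is a sketch of the behaviour as reported in this post, not Canvas's actual implementation: `fakeServer`, `fetchByHand`, the 260-course data set, and the rule "offset from the requested size, limit clamped to 50" are all assumptions made for illustration.

```javascript
const MAX_PER_PAGE = 50; // limit taken from the example above
const courses = Array.from({ length: 260 }, (_, i) => i + 1); // courses 1..260

// Toy server: the offset uses the requested size, the limit is silently clamped.
function fakeServer(page, perPage) {
  const offset = (page - 1) * perPage;
  return courses.slice(offset, offset + Math.min(perPage, MAX_PER_PAGE));
}

// Client that pages by hand and trusts its own per_page value.
function fetchByHand(perPage) {
  const got = [];
  for (let page = 1; ; page += 1) {
    const batch = fakeServer(page, perPage);
    if (batch.length === 0) break; // stop at the first empty page
    got.push(...batch);
  }
  return got;
}

const got = fetchByHand(200);
console.log(got.length); // 100
console.log(got[50]);    // 201
```

In this model only 100 of the 260 courses come back when 200 per page is requested, and the second batch starts at course 201. Requesting 50 per page returns all 260.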

The solution is simple: set `per_page` to a ridiculously large number (e.g. 9999), and rely on the `Link` header to do the right thing. This header is a little bit hard to parse. Here's a Ruby method to get the URL for the next page:

```
# Returns the URL tagged rel="next" in a Link header, or nil when there is none.
def next_link(link_header)
  link_header =~ /<([^>]+)>;\s*rel="next"(?:,|$)/ && $1
end
```
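
For anyone working in JavaScript, here is a rough counterpart that also shows the loop around it: keep requesting whatever `rel="next"` points to until no such link is returned. It is a sketch under assumptions, not a definitive client. `nextLink`, `fetchAll`, and the caller-supplied `fetchPage(url)` function are hypothetical names, and `canvas.example.com` is a made-up host standing in for real HTTP calls.

```javascript
// Extract the URL tagged rel="next" from a Link header, or null if absent.
function nextLink(linkHeader) {
  const match = /<([^>]+)>\s*;\s*rel="next"/.exec(linkHeader || '');
  return match ? match[1] : null;
}

// Collect every page by following rel="next" until it disappears.
// `fetchPage(url)` is supplied by the caller and returns { items, link }.
function fetchAll(startUrl, fetchPage) {
  const all = [];
  let url = startUrl;
  while (url) {
    const { items, link } = fetchPage(url);
    all.push(...items);
    url = nextLink(link);
  }
  return all;
}

// Stand-in for real HTTP calls: two pages keyed by URL (made-up host).
const base = 'https://canvas.example.com/api/v1/courses/1/assignments/2/submissions';
const pages = {
  [`${base}?per_page=100`]: {
    items: [1, 2],
    link: `<${base}?page=2&per_page=100>; rel="next",<${base}?page=1&per_page=100>; rel="first"`,
  },
  [`${base}?page=2&per_page=100`]: {
    items: [3],
    link: `<${base}?page=1&per_page=100>; rel="prev"`,
  },
};
console.log(fetchAll(`${base}?per_page=100`, (url) => pages[url])); // [ 1, 2, 3 ]
```

Because the loop never builds a page URL itself, it keeps working whether the server clamps `per_page` or not.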

Hi  @stuart_ryan ,

I'm having the same problem as Glen did. There's more to it, though. When I compare the Student Analysis Report for a quiz with the data I get from calling the submissions API for the same quiz, I don't get data for all the students, and for those students I do get data for, only their first submission's data is present.

Changing the per_page value has an effect but doesn't solve the problem. When per_page is set to 200, I get the last 69 records ordered by student id, ignoring any subsequent submissions that may have been made by the student; when the value is 100, the same result. When the value is 50, the last 19 records are returned. When the value is 25, the same 19 are returned. When the value is 20, 9 records are returned and same again when the per_page value is set to 10. This is for a quiz allowing 2 attempts with 193 submissions.

I may be missing something but the results seem to suggest there's something amiss with the Submissions API?

 @ric_canale  

The submissions API can be a little difficult to understand, but I think it's operating correctly if you understand what it's doing.

If you want submission data for additional attempts, try adding the query parameter include[]=submission_history. As a bonus, when you do this for a quiz it will also return the responses given by the students, although you have some work to do to make it usable.

Don't set per_page=200; in most cases, including the submissions API, Canvas only supports 100 at a time.
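
As a rough illustration of that parameter, the sketch below builds the query string and then flattens the result into one row per attempt. The course and assignment IDs are the ones from the first post; `attemptsOf` is a hypothetical helper, the sample response is made up, and the assumed shape (each submission holding a `submission_history` array whose entries have `attempt` and `score`) should be checked against a real response.

```javascript
// Build the query string; `include[]` may be repeated, so use append().
const params = new URLSearchParams();
params.append('include[]', 'submission_history');
params.append('per_page', '100');
const url = `/api/v1/courses/982096/assignments/3156530/submissions?${params}`;
console.log(url);

// Flatten to one row per attempt, assuming a response shaped like
// [{ user_id, submission_history: [{ attempt, score }, ...] }, ...].
function attemptsOf(submissions) {
  return submissions.flatMap((s) =>
    (s.submission_history || []).map((h) => ({
      user_id: s.user_id,
      attempt: h.attempt,
      score: h.score,
    }))
  );
}

// Made-up sample response: one student with two attempts, one with one.
const sample = [
  { user_id: 1, submission_history: [{ attempt: 1, score: 60 }, { attempt: 2, score: 90 }] },
  { user_id: 2, submission_history: [{ attempt: 1, score: 75 }] },
];
console.log(attemptsOf(sample).length); // 3
```

`URLSearchParams` percent-encodes the brackets, so the printed URL contains `include%5B%5D=submission_history`.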

If you keep getting the same values returned over and over, it sounds like what happens when you have a bookmark for the page in the URL. This happens with the list multiple submissions endpoint. That bookmark specifies where to start and the per_page tells it how many to take starting at that point.

For example, when I started with

/api/v1/courses/2610710/students/submissions?student_ids[]=all&assignment_ids[]=23468640&per_page=5

I got a next link header that had a query parameter of 

assignment_ids[]=23468640&student_ids[]=all&page=bookmark:WzI5NTI4NDM3OF0&per_page=5

If I load that, I get 5 submissions, starting with submission_id=295284374 and ending with submission_id=295284378. That is, it's the second 5 results. If I change the per_page and make it 10, and fetch the data, then I get 10 submissions, but still starting with submission_id=295284374, containing the first five submissions from before (including the 295284378), and then five more that I didn't have before.

If you're on position 51, then you would always get the same values subject to a limitation of either the per_page or the 19 that remain. If your per_page is >= 19, then you get the last 19 submissions. If your per_page < 19, then you get the first per_page results, but starting with position 51.

If you're trying to use page= and per_page= to get all of the results, you should be careful. I regularly take advantage of that in a lot of my code, but I always check the link headers (either manually when writing the code or programmatically as part of the fetch). If Canvas doesn't supply a page= in their next header, then I do not attempt to use those because it's not supported for that API call. It may look like it works -- for example, I can set page=2&per_page=5 in my call and get the same 5 results I described before. However, if you're doing this and another submission comes in while you're fetching the results, the results are not predictable. That's why there's a bookmark for this kind of data -- to make sure that you get all of the data.
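
The check described above can be done programmatically. Here is a minimal sketch: it looks at the `page` value in the `rel="next"` link and reports whether it is a plain number or a bookmark. `nextPageKind` is a hypothetical helper, and `canvas.example.com` is a made-up host added only because the `URL` constructor needs an absolute URL.

```javascript
// Classify the rel="next" link: 'numeric', 'bookmark', 'other', or 'none'.
function nextPageKind(linkHeader) {
  const match = /<([^>]+)>\s*;\s*rel="next"/.exec(linkHeader || '');
  if (!match) return 'none'; // last page, or no Link header at all
  const page = new URL(match[1]).searchParams.get('page') || '';
  if (/^\d+$/.test(page)) return 'numeric';
  if (page.startsWith('bookmark:')) return 'bookmark';
  return 'other';
}

// Made-up host; the query string is the one quoted earlier in this reply.
const bookmarked =
  '<https://canvas.example.com/api/v1/courses/2610710/students/submissions' +
  '?assignment_ids[]=23468640&student_ids[]=all&page=bookmark:WzI5NTI4NDM3OF0&per_page=5>; rel="next"';
const numbered =
  '<https://canvas.example.com/api/v1/courses/1/quizzes/2/submissions?page=2&per_page=100>; rel="next"';

console.log(nextPageKind(bookmarked)); // bookmark
console.log(nextPageKind(numbered));   // numeric
```

Only when the answer is `numeric` is it reasonable to build `page=N` URLs by hand; otherwise the safer option is to request the `next` URL exactly as given.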

Thank you, James, and apologies for the delayed response. I've been away for a few days. I got some great clues from your post (this is a learning exercise for me as much as a problem to solve). I understand now that the Submissions API is acting correctly. I am actually using your canvasAPI() function, originally from your Course Due Dates Google sheet. When I check the Link headers, I get exactly what you suggested:

"Link":"<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=1&per_page=100>; rel=\"current\",<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=2&per_page=100>; rel=\"next\",<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=1&per_page=100>; rel=\"first\",<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=3&per_page=100>; rel=\"last\"","status":"200 OK"}

"Link":"<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=2&per_page=100>; rel=\"current\",<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=3&per_page=100>; rel=\"next\",<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=1&per_page=100>; rel=\"prev\",<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=1&per_page=100>; rel=\"first\",<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=3&per_page=100>; rel=\"last\"","status":"200 OK"}

"Link":"<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=3&per_page=100>; rel=\"current\",<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=2&per_page=100>; rel=\"prev\",<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=1&per_page=100>; rel=\"first\",<https://rmit.instructure.com/api/v1/courses/20447/quizzes/73499/submissions?page=3&per_page=100>; rel=\"last\"","status":"200 OK"}

So I needed to look elsewhere. Then I remembered that I had trouble with the object returned by the canvasAPI() function. When calling the Submissions API, the canvasAPI() function returns an object of the form {"quiz_submissions":[{},{},{}]} instead of [{},{},{}], so I was fixing this after using your canvasAPI() function. But what if this unexpected form of the array creates a problem within the canvasAPI() function itself? Sure enough, when looking through it I could see where it was falling out because the returned Submissions object was a single key/value pair. So, I used:

if (json.hasOwnProperty('quiz_submissions')) {
  json = json.quiz_submissions;
}

to capture the condition, amend the object to the expected form and continue on. Now I'm getting all the data.
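
A slightly more general version of that fix might look like the sketch below. It is not part of canvasAPI(); `unwrapList` is a hypothetical helper, and the rule it applies (accept a bare array, or an object whose only key holds an array) is an assumption based on the `quiz_submissions` shape described above.

```javascript
// Return the list inside a response, whether it is a bare array
// or an object wrapping the array under a single key.
function unwrapList(json) {
  if (Array.isArray(json)) return json;
  const keys = Object.keys(json || {});
  if (keys.length === 1 && Array.isArray(json[keys[0]])) {
    return json[keys[0]];
  }
  throw new Error('Unexpected response shape');
}

console.log(unwrapList([{ id: 1 }, { id: 2 }]).length);               // 2
console.log(unwrapList({ quiz_submissions: [{ id: 1 }] }).length);     // 1
```

Unwrapping each page before concatenating matters in a paging loop, because code that expects an array may otherwise stop after the first page without raising an error.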

Thanks for your help and your scripts!

Cheers

Ric

The Google Sheets API was never completed; it was written for the functions that I was using, which used page and per_page, but it wasn't robust. Other people took early versions, used them, and shared them, so some people who think they are using my version are really using a derivative version. Since you got it directly from the course due dates sheet, at least you're working with an original, but it was written early in my learning how things worked, and if I were to start again, I would probably make some changes. I'm very good at starting things and then getting distracted before finishing them :(