parikhharshal
Community Member

Assignment Submission API data keeps changing

Hi there

Every day I pull data from the Assignment Submission API through a REST API call and, surprisingly, the number of rows increases on some days and decreases on others. Is that weird, or is that how it works?

How can I track that? Could there be any particular reason for it?

Thanks

Harshal.

4 Replies
James
Navigator

 @parikhharshal  

I'm not sure I exactly follow what you're saying. Are you saying that the number of rows downloaded varies each day or that there is some information that is there on some days that isn't there on other days or something else? You also didn't say which API call you were using -- submissions for a single assignment or submissions for multiple assignments. The multiple submissions one is more powerful, but comes with some potential issues if you don't understand the query parameters fully.

Submission entries are created when the assignment is created. If you have a class with 50 students and 10 assignments, there are 500 submission entries. Submission entries will change as students submit information, but the number of submission entries will remain constant unless the number of students or the number of assignments changes. Changing the number of students may include marking student enrollments as completed, inactive, or deleted. It doesn't have to be an additional enrollment.
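As a quick sanity check on that rule of thumb, the expected row count can be compared with what a fetch actually returned. This is only a sketch: the function names are hypothetical, and it assumes every active student gets an entry for every assignment.

```python
def expected_submission_rows(active_students: int, assignments: int) -> int:
    """Rule of thumb from above: one submission entry per student per assignment."""
    return active_students * assignments


def row_count_gap(fetched_rows: int, active_students: int, assignments: int) -> int:
    """Negative means fewer rows than expected, positive means more."""
    return fetched_rows - expected_submission_rows(active_students, assignments)


# 50 students and 10 assignments should give 500 entries.
print(expected_submission_rows(50, 10))
# A fetch that returned 490 rows is 10 short of that.
print(row_count_gap(490, 50, 10))
```

A nonzero gap is not an error by itself; it is a prompt to look at enrollments and assignments that changed since the last pull.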

There are all kinds of options that control which information is downloaded.

  • You can specify a date to receive only the information that has been graded (or submitted) since that date and time. If you are fetching just the recent information, then obviously the number of submissions made each day varies, but the total number should remain the same.
  • Submission entries are made each time there is a new assignment, so if faculty are creating new assignments, there will be more rows than before and the count will increase.
  • Once student enrollments are marked as completed or inactive, they don't show up in the fetch by default. This might explain a decrease in the amount of information returned. With the multiple submissions API, you need to include the IDs of those students to get them included; student_ids[]=all won't include them.
  • It may be related to how you're handling pagination. Some people try to use a page=# approach to fetch the information. This works with many API calls and so people think it works for everything. I've seen people say just keep increasing the page number until you don't get any more information. That won't work for the submissions API, which may use a page=bookmark: in the link header when there is a lot of data. For example, if I fetch submissions for a single assignment (48 students) I get the page=# links, but if I use the multiple submissions API and fetch submissions for that same single assignment with student_ids[]=all, I get the bookmark approach.
  • Also related to pagination, some people try to put in a per_page of more than 100 and then don't check the link headers to see that Canvas has reduced it to 100. I've seen some in the Community say to put in some really large number like 2000, but Canvas just doesn't go that high and it's bad advice.
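One way to track which of these causes applies is to keep each day's pull and compare it with the previous one. The sketch below assumes each downloaded row is a dict carrying the `assignment_id` and `user_id` fields of a submission; the function name `diff_snapshots` and the sample data are made up for illustration.

```python
def diff_snapshots(previous, current):
    """Compare two daily pulls, keyed by (assignment_id, user_id).

    Returns the keys that appeared and the keys that disappeared.
    """
    prev_keys = {(row["assignment_id"], row["user_id"]) for row in previous}
    curr_keys = {(row["assignment_id"], row["user_id"]) for row in current}
    return {
        "added": sorted(curr_keys - prev_keys),
        "removed": sorted(prev_keys - curr_keys),
    }


# Made-up example: user 11 dropped out of the fetch, assignment 2 is new.
yesterday = [
    {"assignment_id": 1, "user_id": 10},
    {"assignment_id": 1, "user_id": 11},
]
today = [
    {"assignment_id": 1, "user_id": 10},
    {"assignment_id": 2, "user_id": 10},
]
print(diff_snapshots(yesterday, today))
```

If the removed keys all share a few user IDs, enrollment changes are the likely cause; if the added keys all share new assignment IDs, faculty created assignments. Rows that vanish and reappear without either pattern point toward pagination.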

If those things don't answer the question, then sharing exactly which API you're calling and what parameters you're calling it with may allow us to provide additional insight.

Hi James Jones

I am finding the course ID and assignment ID from the SQL query below and then passing them to the "https://swinburneonline.instructure.com/api/v1/courses/" + course_id + "/assignments/" + assignment_id + "/submissions?per_page=100" URL, and I iterate until last page URL changes.

select distinct cd.canvas_id as course_id, ad.canvas_id as assignment_id
from canvas_data_sut.assignment_dim ad
inner join canvas_data_sut.course_dim cd on (cd.id = ad.course_id)
inner join canvas_data_sut.enrollment_term_dim etd on (etd.id = cd.enrollment_term_id)
left join canvas_data_sut.quiz_dim qd on (qd.assignment_id = ad.id)
where etd.name in
         (
          select distinct etd.name from landing.saas_canvas_enrollment_term_dim_history etd
          left join presentation.dim_date_final dd on dd.teaching_period = right(etd.name,4) + ' TP' + right(split_part(etd.name,',',1),1)
          WHERE dd.partner_id=1 and current_date=full_date
          ) -- Basically find current TP
  and cd.workflow_state != 'deleted'
  and ad.workflow_state != 'deleted'
  and lower(ad.title) not in ('originality report submission', 'student confidentiality agreement')
  order by course_id

Not sure if I am doing something wrong.

until last page URL changes

What do you mean by this?

When handling pagination, you should focus on the next page rather than the last page. When bookmarks are used for pagination, the last link is missing. As long as there is a next page, you should continue fetching it. When you get to the last page, the next link header disappears.
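A minimal sketch of that "follow next until it disappears" loop is below. It assumes the Link header has the usual `<url>; rel="next"` shape. The helper names are hypothetical, and `get_page` stands in for whatever HTTP client you use: it takes a URL and returns the page's rows together with the raw Link header value. A dict of fake pages plays that role here so the loop can be run without a Canvas instance.

```python
import re


def parse_link_header(header):
    """Turn '<url>; rel="next",<url>; rel="first"' into {"next": url, "first": url}."""
    pairs = re.findall(r'<([^>]+)>\s*;\s*rel="([^"]+)"', header or "")
    return {rel: url for url, rel in pairs}


def fetch_all(first_url, get_page):
    """Follow rel="next" sequentially until no next link is present."""
    rows, url = [], first_url
    while url:
        page_rows, link_header = get_page(url)
        rows.extend(page_rows)
        # No "next" entry means this was the final page.
        url = parse_link_header(link_header).get("next")
    return rows


# Fake two-page response standing in for real HTTP calls.
pages = {
    "u1": ([{"id": 1}], '<u1>; rel="current",<u2>; rel="next",<u1>; rel="first"'),
    "u2": ([{"id": 2}], '<u2>; rel="current",<u1>; rel="first"'),
}
print(fetch_all("u1", pages.__getitem__))
```

Because the loop never looks at the last link or builds page numbers itself, it behaves the same whether the server hands back numbered pages or bookmarks.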

I try to cheat the system sometimes and use the last page to tell me how many requests I should make using the page= option and then I can make the requests in parallel. However, that approach will not work when bookmarks are used.

My class only has 48 students in it, which wasn't enough for the single-assignment API call that you're using to kick in the bookmarks. I checked another course that had 1458 submissions for a single assignment and it didn't go to the bookmark approach either. It may be because of a low server load or it may be safe to do page=2&per_page=100, page=3&per_page=100, etc, until you get to the point where there is no next page left. If you follow the links sequentially (not in parallel), the approach of using the value in the next link header will work whether it uses bookmarks or not. When I try to game the system, I check the format of the next page header to see if it's returning page numbers or bookmarks and only use the cheat when it's returning numbers.
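Checking the format of the next link could look something like this. It is a sketch that assumes the bookmark form shows up as a `page` query parameter starting with `bookmark:`, as described earlier in the thread; the function name and the example URLs are made up.

```python
from urllib.parse import parse_qs, urlsplit


def pagination_style(next_url: str) -> str:
    """Classify a next-page URL as 'numbered', 'bookmark', or 'unknown'."""
    # parse_qs also decodes percent-encoding, so bookmark%3A... is handled too.
    page = parse_qs(urlsplit(next_url).query).get("page", [""])[0]
    if page.startswith("bookmark:"):
        return "bookmark"
    if page.isdigit():
        return "numbered"
    return "unknown"


numbered = "https://canvas.example.edu/api/v1/courses/1/assignments/2/submissions?page=2&per_page=100"
bookmarked = "https://canvas.example.edu/api/v1/courses/1/students/submissions?page=bookmark:WyIxIl0&per_page=100"
print(pagination_style(numbered))
print(pagination_style(bookmarked))
```

Only the "numbered" result makes parallel page=N requests an option; anything else should fall back to following the next link one page at a time.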

There is another potential issue going on. Back in 2015 there was a thread (https://community.canvaslms.com/message/4115) where, when per_page=100 was used, it wasn't returning all of the rows and some of them were being returned twice. Changing the per_page value seemed to fix it, at least in that case. I thought Canvas had fixed the issue, but sometimes issues pop back into the mix.
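To see whether a pull is affected by that kind of problem, the combined pages can be checked for rows that came back more than once. This sketch assumes each row is a dict with the submission's `id` field; the function name and sample rows are hypothetical.

```python
from collections import Counter


def find_duplicate_ids(rows):
    """Return the submission ids that appear more than once across all pages."""
    counts = Counter(row["id"] for row in rows)
    return sorted(sub_id for sub_id, n in counts.items() if n > 1)


# Made-up pull where id 102 came back on two different pages.
fetched = [{"id": 101}, {"id": 102}, {"id": 103}, {"id": 102}]
print(find_duplicate_ids(fetched))
```

If duplicates show up, the total row count can look normal while other rows are silently missing, so it is worth comparing the number of distinct ids against the expected count as well.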

James Jones

Thanks for the quick reply.

Yes, I am iterating to the next page until the last page is reached, so that part should be all right. What I suspect, as you rightly mentioned, is that some student enrolments are becoming inactive or withdrawn, which might be reducing the number of records. I will still double-check.

Thanks a lot for your pointers.