Get a List of Students and Missing Work


In a Canvas course, you can check the number of missing assignments for a single student relatively quickly. You can also message groups of students missing specific assignments from the analytics page (or the gradebook). What you can't do is get a list of all students in a course and their missing assignments as a CSV for quick analysis.

In my never-ending exploration of the Canvas API, I've written a Python script that creates a missing assignments report for a course, broken down by section.

Sidebar...

I have my own specific thoughts about using the "missing" flag to communicate with students about work. The bigger picture is that while we're distance learning, it's helpful to be able to get a bird's-eye view of the entire course in terms of assignment submission. We have also enlisted building principals to help check in on progress, and having this report available is helpful for their lookup purposes.

The Script

from canvasapi import Canvas # pip install canvasapi
import csv
import concurrent.futures
from functools import partial


KEY = '' # Your Canvas API key
URL = '' # Your Canvas API URL
COURSE = '' # Your course ID

canvas = Canvas(URL, KEY)
course = canvas.get_course(COURSE)
# newline='' stops the csv module from writing blank rows on Windows.
writer = csv.writer(open('report.csv', 'w', newline=''))

def main():
    sections = course.get_sections()

    # The fourth column holds the count of missing assignments, the fifth their names.
    writer.writerow(['Name', 'Building', 'Last Activity', 'Missing Count', 'Missing'])

    for section in sections:
        # Materialize the paginated enrollments once so the total can be reused.
        enrollments = list(section.get_enrollments(state="active", type="StudentEnrollment"))
        total = len(enrollments)
        data = []

        # Play with the number of workers.
        with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
            job = partial(process_user, section=section)

            results = [executor.submit(job, enrollment) for enrollment in enrollments]

            for f in concurrent.futures.as_completed(results):
                data.append(f.result())
                print(f'Processed {len(data)} of {total} in {section.name}')
                
        writer.writerows(data)

def process_user(enrollment, section):
    missing = get_user_missing(section, enrollment.user['id'])
    return [ 
        enrollment.user['sortable_name'], 
        section.name, 
        enrollment.last_activity_at, 
        len(missing), ', '.join(missing)
    ]

def get_user_missing(section, user_id):
    submissions = section.get_multiple_submissions(student_ids=[user_id], 
                                                   include=["assignment", "submission_history"], 
                                                   workflow_state="unsubmitted")

    # Re-check the state locally and skip excused work.
    missing_list = [item.assignment['name'] for item in submissions
                    if item.workflow_state == "unsubmitted" and item.excused is not True]

    return missing_list


if __name__ == "__main__":
    main()

 

How does it work?

The script uses UCF's canvasapi library to handle all of the endpoints. Make sure to pip install canvasapi before you try to run the script. The Canvas object makes it easy to pass course and section references around for processing. Because each student has to be looked up individually, it uses multiple threads to speed things up. The work is almost entirely waiting on API calls and a little data wrangling rather than heavy computation, so multithreading worked better than multiprocessing.
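The threading pattern can be tried without a Canvas account. This is a minimal sketch, not the report script: fake_lookup, run_section, the student names, and the 0.05-second sleep are all made up for illustration and only simulate a slow API call.

```python
import concurrent.futures
import time
from functools import partial


def fake_lookup(student, section):
    """Hypothetical stand-in for one Canvas API call (an I/O-bound wait)."""
    time.sleep(0.05)  # simulate network latency
    return [student, section]


def run_section(students, section, workers=3):
    """Look up every student in one section using a small thread pool."""
    # partial pins the section so each task only needs the student
    job = partial(fake_lookup, section=section)
    rows = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(job, student) for student in students]
        # as_completed yields futures in finish order, not submit order
        for future in concurrent.futures.as_completed(futures):
            rows.append(future.result())
    return rows


students = ["Ada", "Ben", "Cy", "Dee", "Eli", "Flo"]
rows = run_section(students, "Period 1")
print(f"Collected {len(rows)} rows")
```

Because the threads spend their time waiting rather than computing, raising workers shortens the run until the server starts rate limiting, which is why the worker count is worth experimenting with.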

For each section, the script requests each student's submissions, passing workflow_state="unsubmitted" so that the filtering happens on the Canvas servers. From this filtered list, it builds the final list by re-checking the workflow state and dropping anything flagged as excused. Each worker returns its row to the main thread, and the section is written as a whole so that only one thread ever touches the CSV writer.

When the script is finished, you'll have a CSV report named report.csv on your filesystem (in the directory you ran the script from) that you can use.
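As one example of the quick analysis the report allows, here is a rough sketch that ranks students by how many assignments they are missing. The most_missing helper and the sample rows are made up; the only thing it assumes about the report is the column order the script writes (name first, the missing count fourth).

```python
import csv
import io


def most_missing(csv_file, top=3):
    """Return (name, count) pairs for the students with the most missing work."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row, whatever its wording
    # Column 0 is the student name, column 3 is the missing count
    counts = [(row[0], int(row[3])) for row in reader if row]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)[:top]


# Made-up rows in the same column order as the report
sample = io.StringIO(
    "Name,Section,Last Activity,Count,Missing\n"
    '"Lovelace, Ada",Period 1,2020-04-01T12:00:00Z,2,"Essay 1, Quiz 2"\n'
    '"Hopper, Grace",Period 1,2020-04-02T09:30:00Z,0,\n'
    '"Turing, Alan",Period 2,2020-03-30T15:45:00Z,1,Quiz 2\n'
)
ranked = most_missing(sample, top=2)
print(ranked)
```

To run it against the real file, pass an open file instead of the sample: open('report.csv', newline='').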

 

Improvements

Currently, missing assignments are joined as a single string in the final cell, so those could be broken out into individual columns. I found that the resulting sheet is nicer when the number of columns is consistent, so I left them joined, but some additional processing could sort the assignments by name to keep the column order consistent from student to student.
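One way to break the assignments out while keeping the column count consistent is to give every assignment its own column and mark the cells where it is missing. This is only a sketch of that idea, not something the script does: to_wide and the sample data are hypothetical, and it assumes you have each student's missing list in hand before writing.

```python
def to_wide(rows):
    """Turn (name, [missing assignment names]) pairs into fixed-width rows."""
    # The sorted union of every assignment name gives a stable column order
    columns = sorted({name for _, missing in rows for name in missing})
    table = [["Name"] + columns]
    for student, missing in rows:
        marks = ["X" if column in missing else "" for column in columns]
        table.append([student] + marks)
    return table


sample = [
    ("Lovelace, Ada", ["Quiz 2", "Essay 1"]),
    ("Hopper, Grace", []),
    ("Turing, Alan", ["Quiz 2"]),
]
table = to_wide(sample)
for row in table:
    print(row)
```

Every row ends up the same width, and sorting the names means the columns line up the same way for each student.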

Canvas is also implementing GraphQL endpoints so you can request specific bits of data. The REST endpoints are helpful, but you get a lot of data back. Cutting down the amount of data returned would also help the script run faster.

1 Comment

I've been thinking about trying multi-threading in some of my Python scripts. Thanks for this clear example!