Community help

Neil47 · ‎05-18-2021

Hello,

I run a Python script to get a list of all enrollments for a course with 3800+ enrollments, using the api/v1/courses/{course_id}/enrollments API endpoint.

But on every run I get a different number of enrollments, and the count is never accurate. The script gave us consistent and correct output until last week. It still gives the expected output for courses with fewer than 500 enrollments.

Why does this happen and is there a fix for it?

import csv
import time
import logging
import datetime
import requests


def create_log():
    timestamp = datetime.datetime.now().strftime("%d-%m-%y_%H%M%S")
    logfile = f"koc_scriptlog_{timestamp}.log"
    logging.basicConfig(
        filename=logfile,
        filemode="w",
        level=logging.DEBUG,
        format="%(asctime)s - %(message)s",
    )


def main():

    create_log()
    course_id = 3010
    BASE_URL = f"https://companyname.org/api/v1/courses/{course_id}/enrollments"
    TOKEN = {"Authorization": "Bearer tokenkey"}
    page = "next"
    count = 0
    t_count = 0

    while page == "next":

        # Sleep for 5sec after 100 calls
        if t_count % 100 == 0:
            time.sleep(5)
        t_count += 1

        res = requests.get(BASE_URL, headers=TOKEN)
        if res.status_code == 200:
            count += 1
            print(f"{count}. ", end=" ")
            page_links = res.links
            logging.info(f"Call {count}. {res.url} | {page_links}")
            if "next" in page_links:
                if "url" in page_links.get("next"):
                    new_url = res.links["next"]["url"]
                    print(new_url)
                    BASE_URL = new_url
                    page = "next"
                    responses_json = res.json()
                    temp_list = []
                    for res_json in responses_json:
                        r = [
                            res_json["user_id"],
                            res_json["user"]["name"],
                            res_json["user"]["login_id"],
                            res_json["enrollment_state"],
                            res_json["role"],
                            res_json["user"]["sis_user_id"],
                        ]
                        temp_list.append(r)
                        logging.info(r)
                    with open(
                        "output.csv", mode="a", newline="", encoding="utf-8"
                    ) as csvfile:
                        canvas_writer = csv.writer(
                            csvfile, delimiter=",", quotechar='"'
                        )
                        canvas_writer.writerows(temp_list)
            else:
                print("Next page not found")
                page = "last"


main()
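
A side note on the loop above: rows are written only inside the `if "next" in page_links` branch, so the final page (which has no `next` link) never reaches output.csv. That would give a steady undercount rather than a count that changes between runs, so it is probably not the whole story. A minimal sketch of a loop that keeps every page, including the last, might look like the following. The names `iter_pages`, `fetch`, and `fake_fetch` are hypothetical and exist only so the loop can be run without a network call; in real use `fetch` would wrap `requests.get` with the Authorization header and return `res.status_code`, `res.json()`, and `res.links`.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# fetch(url) is a hypothetical helper returning (status_code, json_body, links),
# where links has the same shape as requests.Response.links.
FetchResult = Tuple[int, List[dict], Dict[str, Dict[str, str]]]


def iter_pages(start_url: str, fetch: Callable[[str], FetchResult]) -> Iterator[List[dict]]:
    """Yield the JSON body of every page, following the 'next' link."""
    url = start_url
    while url:
        status, body, links = fetch(url)
        if status != 200:
            raise RuntimeError(f"HTTP {status} while fetching {url}")
        # Yield the page before looking for 'next', so the last page is kept.
        yield body
        url = links.get("next", {}).get("url")


def fake_fetch(url: str) -> FetchResult:
    """Stand-in for the real API: three pages, the last without a 'next' link."""
    pages = {
        "p1": (200, [{"user_id": 1}, {"user_id": 2}], {"next": {"url": "p2"}}),
        "p2": (200, [{"user_id": 3}, {"user_id": 4}], {"next": {"url": "p3"}}),
        "p3": (200, [{"user_id": 5}], {}),
    }
    return pages[url]


rows = [row for page in iter_pages("p1", fake_fetch) for row in page]
print(len(rows))
```

Raising on a non-200 status is a design choice for this sketch: it stops the run instead of silently retrying the same URL, so a failed call cannot go unnoticed in the totals.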

chriscas · ‎05-18-2021

Hi @Neil47 ,

I'm wondering if you will get consistent output running the following code (mashed up between your code and some stuff I run on a consistent basis, hoping I got it all right if you put in your tokenkey). This code will just try getting the enrollments list, and then print the length (number of enrollments found). I'd expect it to give the same result each time you run it (assuming you're not constantly changing the enrollments in the course).

Without seeing your output, I'd guess a pagination issue or some kind of request error is happening, based on my own experience. I was running into a lot of random errors in our test environment in the last few weeks, so I had to add the retry method found in my script below (googled to find that, I'm by no means a Python expert). Even though it uses slightly more memory, my scripts all use the canvas_get_allpages function just to avoid recreating the same request loops over and over. I think I've accounted for all of the possible responses and headers for pagination in that function now, and I've gotten reliable results in all the calls I do lately with these improvements. Someone else might come in and offer some way better advice, but I thought I'd take a stab since I've been tinkering more with Python lately.

-Chris

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
import json
from datetime import datetime
import csv

# Requests call with retry on server error
def requestswithretry(retries=3, backoff_factor=0.3, status_forcelist=(500, 502, 504), session=None, ):
    session = session or requests.Session()
    retry = Retry(
        total=retries,
        read=retries,
        connect=retries,
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
        allowed_methods=frozenset({'DELETE', 'GET', 'HEAD', 'OPTIONS', 'PUT', 'TRACE', 'POST'})
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session

# Paginated API call to Canvas
# returns list of dictionary objects from json
def canvas_get_allpages(url, headers):
    if 'per_page=' not in url:
        if '?' in url:
            url=url+'&per_page=100'
        else:
            url=url+'?per_page=100'
    rl=[] #list of JSON items converted to python dictionaries
    session=requestswithretry() #reuse one session (and its retry settings) for every page
    repeat=True
    while repeat:
        r=session.get(url,headers=headers)
        if r.status_code!=200:
            print(datetime.now().isoformat()+': Error',r.status_code,'while retrieving get response from ',url)
            repeat=False
        else:
            rj=r.json()
            if type(rj) is dict:
                for key in rj:
                    rl=rl+rj[key]
            else:
                rl=rl+rj
            url=r.links.get('current',{}).get('url',url)
            last_url=r.links.get('last',{}).get('url','')
            if (url != last_url) and ('next' in r.links.keys()):
                url=r.links['next']['url']
            else:
                repeat=False
    return rl

course_id=3010
BASE_URL=f"https://companyname.org/api/v1/courses/{course_id}/enrollments"
TOKEN={'Authorization': 'Bearer tokenkey'}
enrollments_list=canvas_get_allpages(BASE_URL,TOKEN)
print(len(enrollments_list))

Neil47 · ‎05-18-2021

Hey Chris, thanks for sharing your script.

I am still getting a different enrollment count every time. This works for other courses with fewer than 500-800 enrollments, but returns inconsistent results for course 3010, which has 3800+ enrollments. In my script I am logging the status of every call, the pagination links from the response header, and the response object, so I know for a fact that none of my calls error out. Both our scripts handle pagination in a similar way, so I am thinking this might be an issue with the API.

I am able to get an accurate count of student enrollments when I use the Courses API - api/v1/courses/{course_id}/students. However, this simply returns the user object, so it's not really useful to me.
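
Since the students list comes back complete, it could serve as a reference to see which users are absent from a given enrollments result. A minimal sketch, assuming both responses have already been fetched into lists of dictionaries, that each user object carries `id`, and that each enrollment carries `user_id` (as in the script earlier in the thread). The function name and the sample data are made up for illustration.

```python
from typing import Iterable, List, Set


def missing_user_ids(students: Iterable[dict], enrollments: Iterable[dict]) -> List[int]:
    """Return the sorted ids of students that have no row in enrollments."""
    enrolled: Set[int] = {e["user_id"] for e in enrollments}
    return sorted({s["id"] for s in students} - enrolled)


# Made-up sample data, not real API output.
students = [{"id": 1}, {"id": 2}, {"id": 3}]
enrollments = [{"user_id": 1}, {"user_id": 3}, {"user_id": 3}]
missing = missing_user_ids(students, enrollments)
print(missing)
```

Running this against a few separate enrollments pulls would show whether the same users go missing each time or a different set on every run.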

chriscas · ‎05-19-2021

Hi @Neil47 ,

I think I'd file this as a bug with Canvas support, since the API call used to work and now doesn't. I looked at the API change log and don't see any recent enrollment changes mentioned there, but Instructure might have made a tweak and forgot to document it or didn't think it would have any meaningful impact. If you wanted to dig deeper yourself, I'd run the call a few times and compare the results page by page to see where the differences start. Maybe you'd be able to spot something in common (like a certain role not being included sometimes, or particular students being affected). On the other hand, maybe you'd just find that the results are completely random. In either case, it would probably help with support. You may also need to be persistent with support, as API-type issues like this usually need to be escalated a few levels to reach someone familiar with how they should work.

-Chris
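
The page-by-page comparison suggested above could be sketched roughly as follows. This assumes each run has been saved as a list of pages, each page being a list of enrollment dictionaries, and that every enrollment has a unique `id` field. The names `summarize_run` and `first_differing_page` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List

Run = List[List[dict]]  # one inner list of enrollment dicts per page


def summarize_run(run: Run) -> Dict[str, object]:
    """Count rows, unique ids, and ids that appear more than once in a run."""
    counts = Counter(row["id"] for page in run for row in page)
    return {
        "rows": sum(counts.values()),
        "unique": len(counts),
        "duplicates": sorted(i for i, n in counts.items() if n > 1),
    }


def first_differing_page(run_a: Run, run_b: Run) -> int:
    """Return the 1-based number of the first page whose ids differ, or 0 if none."""
    for number, (page_a, page_b) in enumerate(zip(run_a, run_b), start=1):
        if [r["id"] for r in page_a] != [r["id"] for r in page_b]:
            return number
    # All shared pages match; a different page count is still a difference.
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b)) + 1
    return 0


# Made-up sample data: the second run repeats id 2 and never returns id 3.
run_a = [[{"id": 1}, {"id": 2}], [{"id": 3}, {"id": 4}]]
run_b = [[{"id": 1}, {"id": 2}], [{"id": 2}, {"id": 4}]]
summary_b = summarize_run(run_b)
diff_page = first_differing_page(run_a, run_b)
print(summary_b, diff_page)
```

If the row total stays the same across runs while the unique count changes, that would point to records being repeated on one page and dropped from another, which is a useful detail to include in a support ticket.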

Neil47 · ‎05-20-2021

Hey Chris, I have opened a ticket with Canvas. Will update you with the resolution, hopefully soon! Thanks for your help!

Enrollments API giving a different enrollment count on every run
