Creating JSON objects with Canvas data from the API

Hey. I am trying to build JSON objects to work with down the line, but I am running into an issue and am hoping I can get some help.

I want to store all pages of a Canvas course under 'page_url' in my JSON, but I am only getting the last page from the iteration over a course. Any ideas on how to get ALL pages in a course stored in 'page_url' in my JSON file?

I've been wracking my brain for a solution but I'm lost, and every time I take 1 step forward, I take 3 steps back. 




import requests
import json
import time
import re

# Start timer for code to run :)
# Opens file with all courseIDs
# And creates the report file (myfile.json)

start = time.time()
DataFile = open("PATH_TO_FILE","r")

#secret token and course ID

secret_token = "MY_SECRET_KEY"
course_id =

htmlUrlVariable = None
titleVariable = None
bodyVariable = None

for c in course_id:
    ###### Code to GET course information

    Get_url = "https://INSTITUTION_URLapi/v1/courses/"+c+"/pages?per_page=9999"
    headers = {'Authorization' : 'Bearer ' + secret_token}
    r = requests.get(Get_url,headers = headers)
    User_dict = r.json()

    for i in User_dict:
        pageIdToStr = str(i['page_id'])
        # Make second API call to page body to get <HTML> <body> code
        Get_url = "https://INSTITUTION_URLapi/v1/courses/"+c+"/pages/"+pageIdToStr
        headers = {'Authorization' : 'Bearer ' + secret_token}
        x = requests.get(Get_url,headers = headers)
        pages_dict = x.json()
        htmlUrlVariable = pages_dict['html_url']
        titleVariable = pages_dict['title']
        bodyVariable = pages_dict['body']
        print(htmlUrlVariable)

    ###### Code to find sub-account ID

    Get_url = 'https://INSTITUTION_URLapi/v1/courses/'+c
    headers = {'Authorization' : 'Bearer ' + secret_token}
    e = requests.get(Get_url,headers = headers)
    subAccount_dict = e.json()
    courseNameVariable = subAccount_dict['name']
    subID = subAccount_dict["root_account_id"]
    subVariable = (str(subID)) # Variable needed to print to final report

    ###### Code to find course instructor with id role of 5020 (teacher role)

    Get_url = 'https://INSTITUTION_URLapi/v1/courses/'+c+'/enrollments?per_page=9999'
    headers = {'Authorization' : 'Bearer ' + secret_token}
    e = requests.get(Get_url,headers = headers)
    enroll_dict = e.json()    

    # Iterate through the JSON array
    for item in enroll_dict:
        teacher = item["role_id"]
        if (teacher == 5020):
            facultyIdVariable = item["role_id"]
            instructorVariable = (item['user']['name'])
            instructorUserNameVariable = item['user']['login_id']

    #### Writes one course detail to JSON object. Used for testing

    canvasData = {
        'instructor': instructorVariable,
        'course_Name': courseNameVariable,
        'page_url': htmlUrlVariable,
        # 'body' : bodyVariable
    }

    json_object = json.dumps(canvasData, indent=4)

    with open("PATH_TO_FILE", "a") as outfile:
        outfile.write(json_object)

print(">>>>Process complete! It took", time.time()-start, "seconds to complete.\n")




Once you get everything working, asking for 9999 items per page isn't going to get all of your pages. In most places there is a hard limit of 100 per request. You really need to pay attention to the pagination documentation if you want to make sure that you're getting all of the pages.
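
As a rough sketch of what following the pagination links could look like: Canvas sends a Link header on list endpoints, and the requests library exposes it already parsed as response.links. The function name get_all_items and the idea of passing in a session object are my own assumptions for illustration, not something from your script.

```python
def get_all_items(session, url, headers, per_page=100):
    """Collect every item from a paginated list endpoint.

    `session` is anything with a requests-style get() method,
    for example requests.Session(). The name and signature of
    this helper are hypothetical.
    """
    items = []
    params = {"per_page": per_page}
    while url:
        response = session.get(url, headers=headers, params=params)
        response.raise_for_status()
        items.extend(response.json())
        # requests parses the Link header into response.links;
        # the "next" entry is missing once the last page is reached.
        url = response.links.get("next", {}).get("url")
        # The "next" URL already carries its own query string.
        params = None
    return items
```

With the real library you would call it along the lines of get_all_items(requests.Session(), Get_url, headers) and loop over the returned list instead of User_dict.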

The reason you're only getting the last URL is that you are storing the page URL in a single string variable, which can only hold one value at a time. Each pass through the loop overwrites it, so it only remembers the last value you give it.
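
Here is a minimal sketch of collecting the URLs in a list instead. The helper name collect_page_urls and the sample data are made up for illustration; in your script the dictionaries would be the pages_dict you get back for each page.

```python
import json


def collect_page_urls(page_dicts):
    """Return the html_url of every page, in order (hypothetical helper)."""
    page_urls = []  # a list keeps every value, unlike a single string
    for page in page_dicts:
        page_urls.append(page["html_url"])
    return page_urls


# Made-up sample data standing in for the per-page API responses.
sample_pages = [
    {"html_url": "https://canvas.example/courses/1/pages/syllabus"},
    {"html_url": "https://canvas.example/courses/1/pages/week-1"},
]

canvas_data = {"page_url": collect_page_urls(sample_pages)}
print(json.dumps(canvas_data, indent=4))
```

In your loop that would mean creating an empty list right after "for c in course_id:" so each course starts fresh, appending pages_dict['html_url'] to it inside the inner loop, and then putting that list under 'page_url' in canvasData.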

