Updating Page content with Python and Canvas API?


Hello all - Relatively new member to Canvas and the API.  I have some programming background but am also new to Python.

We have a course template (not a Blueprint) that we use as the source for all our core course layout.  One of the Pages has content about our library.  That content needs to be updated in all our active courses.  (Managing the content in a more efficient way is a topic for another discussion - right now I'm just dealing with the situation at hand 😉 )

My thought was to use Python to call the Canvas API to:

  • Access a new copy of the source library page and grab the Body content
  • Access a target course and update the Page's Body content with the new content
  • Add appropriate looping

I've got the connection/token working and can access the source course's Page and get the Body content.  BUT, when I attempt to PUT the content into the target, I'm getting a strange "truncation" of the content at the first ampersand or semicolon.  No error; it just stops populating the content at that character and then closes the open HTML tags?!?  If it were a true issue with special/reserved characters, I would have expected the content to just end or some kind of error.

Here's a simplified example using just a string to represent the new source content.  Again, this is Python code.

import requests
import json

url = 'https://myschool.instructure.com/api/v1/courses/4699/pages/Library'
headers = {'Authorization' : 'Bearer ***HIDDEN***'}

sBody="<p>testing</p><p>checks & balances<a href=\"http://google.com\">a;b;c</a></p>"
p = "wiki_page[body]="+sBody

r = requests.put(url, params=p, headers=headers)

print("Response Status Code")
print(r.status_code)

print("Response - Body")
print(r.json()["body"])


**** OUTPUT ****

Response Status Code


Response - Body

<p>testing</p><p>checks </p>

If I change the "&" to "and" in the string (for giggles) and re-run, I get this output.  It "stops" at the first semicolon and, again, closes the tags???

Response Status Code


Response - Body

<p>testing</p><p>checks and balances<a href=\"http://google.com\">a</a></p>

And, as expected, the target Page's Body has just this partial content.

Ampersands and semicolons are so prevalent in HTML or page content I must be missing something.

I've tried a number of different ways to encode/escape the characters without luck.  I'm not even really sure at this point where in the flow the behavior is introduced so I'm struggling to troubleshoot.

I did try to use Canvas API Live but something's not working with it on my side - in general, not with this specific call.

Any thoughts/insights are appreciated.

1 Solution

Up front disclosure: I'm not a programmer.

I tried your string in an old script I had and it seemed to work. I'll post the function below; it accepts a page URL as an argument and replaces some existing text with some other text.


import requests

# API_URL (the school's base URL, ending in "/") and header (the dict holding
# the Authorization token) are defined elsewhere in the script.
def edit_page(page_url):
    # get page body
    url = page_url.replace(API_URL, API_URL + 'api/v1/')  # change URL to API URL
    r = requests.get(url, headers=header)
    old_body = r.json()['body']
    new_body = old_body.replace('<p>test</p>', '<p>testing</p><p>checks & balances<a href="http://google.com">a;b;c</a></p>')
    # edit page: the dict passed as data= is form-encoded by requests
    payload = {'wiki_page[body]': new_body}
    r2 = requests.put(url, headers=header, data=payload)
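
A likely reason this version works while the original truncates: the original passes a hand-built string as params=, which puts the HTML into the URL's query string with "&" and ";" left unescaped, while this function passes a dict as data=, which requests form-encodes into the request body. The sketch below makes no network call; it only prepares both requests and parses them back with Python's standard library, which stands in for the server here and is an assumption about how Canvas parses the request. Python's parser splits only on "&"; some server-side parsers have also treated ";" as a separator, which would fit the second truncation.

```python
import requests
from urllib.parse import urlsplit, parse_qs

URL = "https://myschool.instructure.com/api/v1/courses/4699/pages/Library"
BODY = '<p>testing</p><p>checks & balances<a href="http://google.com">a;b;c</a></p>'

# Original approach: a raw string given as params= is appended to the URL.
as_params = requests.Request("PUT", URL, params="wiki_page[body]=" + BODY).prepare()

# Approach from the reply: a dict given as data= is form-encoded into the body.
as_data = requests.Request("PUT", URL, data={"wiki_page[body]": BODY}).prepare()

query = urlsplit(as_params.url).query
print(query)         # "&" and ";" are still literal characters in the query string
print(as_data.body)  # here every reserved character is percent-encoded

# Parse both the way a form/query parser would and compare the recovered field.
from_params = parse_qs(query)["wiki_page[body]"][0]
from_data = parse_qs(as_data.body)["wiki_page[body]"][0]
print(repr(from_params))
print(from_data == BODY)
```

Printing the prepared URL and body like this is also a quick way to see where in the flow the content gets cut, before anything is sent to Canvas.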


It gave me this on the page:

[Screenshot: Edited page]
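
For the original goal (copying the Library page's Body from the source course into each active course), the same data= approach can be wrapped in a loop. This is a minimal sketch, not a tested Canvas script: the function name copy_page_body, its parameters, and the choice to pass in a requests.Session are all made up for illustration, and it assumes the page has the same URL slug in every course.

```python
import requests


def copy_page_body(session, base_url, source_course, target_courses, page_slug):
    """Copy one page's body from a source course to each target course.

    Returns a dict mapping each target course ID to the PUT status code.
    """
    # Read the current body from the source course's page.
    src = session.get(f"{base_url}/api/v1/courses/{source_course}/pages/{page_slug}")
    src.raise_for_status()
    body = src.json()["body"]

    results = {}
    for course_id in target_courses:
        # Pass a dict via data= so requests form-encodes "&", ";" and friends.
        resp = session.put(
            f"{base_url}/api/v1/courses/{course_id}/pages/{page_slug}",
            data={"wiki_page[body]": body},
        )
        results[course_id] = resp.status_code
    return results
```

To use it, create a requests.Session(), set its Authorization header to the Bearer token, and call copy_page_body(session, "https://myschool.instructure.com", 4699, target_ids, "Library"), where target_ids is your own list of course IDs. Checking the returned status codes shows which courses were not updated.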
