Hi, I am new to the Python requests library and I am trying to figure out how to make an API call to dump the data files from Canvas. I keep getting a 404 error, so I just want to make sure I'm forming the request correctly. I looked at the API guide that showed how a request is broken down, but I'm still not sure how to use those pieces in the Python library. The documentation on this seems to be pretty sparse for beginners. Here's the information I know; I'm just not sure how to form the request (and if what I have tried is correct, why might I be getting a 404 back?). Thanks for any help! I realize this is partially/mostly a Python question...
I am trying to figure out how to actually write the request code. Any help is appreciated!
| Key | Value |
|---|---|
| HTTP_METHOD | GET |
| Host_Header | utexas.instructure.com |
| Content-Type_header | text/html? |
| Content-MD5_header | ? |
| /path/to/resource | /api/account/self/dump |
| alphabetical=query&params=here | None? |
| Date_header | "Thur, 25 Jun 2015 08:12:31 GMT" format (but within 15 mins of request) |
| API_secret | From the Canvas Data Portal |
@millsw ,
I think a lot of people have the same question you do because the documentation is not clear. I did some playing around tonight and five or so hours later, I had success.
The 404 error is a file not found. That's because you're calling the wrong host. This is not part of your regular Canvas instance.
Using the example code, the required values in the reqOpts variable are method, host, and path.
If your API Key was 1234 and your HMAC signature computed from the above is 9876, then you would add the following authorization header
Authorization: HMACAuth 1234:9876
Add the Date header as indicated in the instructions
Then make a call to GET https://portal.inshosteddata.com/api/account/self/dump
The results are returned in JSON form with a content-type of application/json
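For readers working in Python, the two headers described above can be sketched as follows. The function name `build_headers` and its parameters are illustrative only and not part of any Canvas library; the key and signature are the placeholder values from the example above.

```python
def build_headers(api_key, signature, timestamp):
    """Return the two headers the Canvas Data API call needs."""
    return {
        # Format from the example above: HMACAuth <key>:<signature>
        "Authorization": "HMACAuth {}:{}".format(api_key, signature),
        # The same timestamp string that went into the signed message
        "Date": timestamp,
    }


headers = build_headers("1234", "9876", "Thu, 25 Jun 2015 08:12:31 GMT")
print(headers["Authorization"])
```

The resulting dictionary can be passed as the `headers` argument of a `requests` call.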
I can't help you with the Python part of it, but I was able to get it working in PHP. Hopefully the information provided will be enough to get you going.
Thanks James! This definitely gives me some help with what to try next. I'll post back here with a python solution if I get it working.
You're welcome @millsw .
This is something I've been meaning to figure out since I first got access to Canvas Data, but just couldn't find the time. I knew there were other people needing the same information from your post and one over in Big Data. There are a lot of people talking about ODBC and Redshift, but I wasn't going to sit there and click on 50 files (a day) to download them, so I focused on getting access through the API.
I may not have it perfect, but I have it working and I'll take that for now.
I supplied some source code (PHP and JavaScript) over in Re: How to use the API to download datafiles
The next major task for me on this project will be to actually download the files and get them into a database. The impetus for getting this working first was to give our IT department a heads up on how big of a database server we would need to store everything.
Update: Someone in my office (ladd) got a nodejs version to work with the updated information, but the Python version is getting an authentication error. Thanks to @James I am no longer getting a 404. For those who are looking for where to start, here is what I've got so far (both of these result in 401s):
My mimic of the node js, which gets a 401 Unauthorized response:
#imports
import datetime
import requests
import hashlib
import hmac
#Useful for converting day keys into appropriate string abbreviations
dayNameDict = {
    0 : 'Mon',
    1 : 'Tues',
    2 : 'Wed',
    3 : 'Thurs',
    4 : 'Fri',
    5 : 'Sat',
    6 : 'Sun'
}
#Get the current time, printed in the right format
def nowAsStr():
    currentTime = datetime.datetime.utcnow()
    weekdayName = dayNameDict[currentTime.weekday()]
    prettyTime = currentTime.strftime('%d %b %Y %H:%M:%S UTC')
    return weekdayName + ', ' + prettyTime
#Set up the request pieces
apiKey = 'key_here'
apiSecret = 'secret_here'
method = 'GET'
host = 'https://portal.inshosteddata.com'
path = '/api/account/self/file/latest'
timestamp = nowAsStr()
requestParts = [
    method,
    host,
    '', #content Type Header
    '', #content MD5 Header
    path,
    '', #alpha-sorted Query Params
    timestamp,
    apiSecret
]
#Build the request
requestMessage = '\n'.join(requestParts)
hmacObject = hmac.new(apiSecret, '', hashlib.sha256)
hmacObject.update(requestMessage)
headerDict = {
    'Authorization' : 'HMACAuth ' + apiKey + ':' + hmacObject.hexdigest(),
    'Date' : timestamp
}
#Submit the request/get a response
response = requests.request(method='GET', url=host+path, headers=headerDict, stream=True)
#Check to make sure the request was ok
if(response.status_code != 200):
    print ('Request response went bad. Got back a ', response.status_code, ' code, meaning the request was ', response.reason)
else:
    #Use the downloaded data
    jsonData = response.json()
My attempt at using the Boto library to deal with the authentication for me
conn = boto.s3.connection.S3Connection(host=hostStr, path=pathStr)
conn.get_all_buckets()
s=requests.Session()
s.get(host+path,headerDict)
@millsw ,
Another problem may be with your date and time.
The RFC 7231 format specifies a fixed-length date/time. Each day name is three letters long -- see section 7.1.1.1 for the details. So "Tues" and "Thurs" are not supported. However, it looks like you can just use the %a format code for the Python strftime() function to get the day name. Also, the specification calls for "GMT" instead of "UTC".
I haven't tested this, but from the documentation, it looks like the strftime() code you want is:
'%a, %d %b %Y %H:%M:%S %Z' or '%a, %d %b %Y %H:%M:%S GMT'
Edit: The original post had %s instead of %a
The first one is only valid if your datetime object has a time zone.
I'm not saying what you have won't work. It is entirely possible that Canvas just uses whatever you supply in the header and as long as it can parse it and it's within 15 minutes of the server's time, you're okay. But if you're having problems with something working, you'll have a better chance of success if you follow the specs.
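A minimal Python sketch of producing that fixed-length format. It uses the standard library's `email.utils.format_datetime` rather than `strftime()`, so the day and month names do not depend on the machine's locale. The helper name `http_date` is made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date(moment=None):
    """Format a UTC datetime like 'Thu, 25 Jun 2015 08:12:31 GMT'."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    # usegmt=True requires a timezone-aware datetime in UTC
    # and ends the string with 'GMT' instead of '+0000'
    return format_datetime(moment, usegmt=True)


print(http_date(datetime(2015, 6, 25, 8, 12, 31, tzinfo=timezone.utc)))
# Thu, 25 Jun 2015 08:12:31 GMT
```

Calling `http_date()` with no argument gives the current time, which is what the Date header and the signed message would both use.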
@millsw ,
I did more testing with the date and it appears that even though Tues and Thur are non-standard, they work. I can't test Thurs because it's only Tuesday and the timestamp has to be within 15 minutes of the server time.
Still, it's best to use the standard, which calls for a three-letter day name and a fixed-length string.
Have you made any progress on getting a python script functioning? Was the issue with the format of the timestamp?
@millsw ,
I don't speak python, so sorry if I missed this, but I don't see where you're base64 encoding your sha256 hash.
Also, the host is portal.inshosteddata.com not https://portal.inshosteddata.com
I've got to leave right now, but hopefully that helps for now.
Hi James,
Thanks for the tip on the day standardization. I have updated this in my code, and added a base64 encoding on the hexdigest of my hmac object (when building the headerDict object).
I also tried changing the host by removing the https://, but I get back an error saying my host is malformed when I do that (as opposed to the 401 response I get from the https:// version). Did you ever run into that issue with your trials?
Here's the error:
"requests.exceptions.MissingSchema: Invalid URL 'portal.inshosteddata.com/api/account/self/file/latest': No schema supplied. Perhaps you meant http://portal.inshosteddata.com/api/account/self/file/latest?"
@millsw , there are two places where portal.inshosteddata.com is used.
The host name that is used in the HMAC signature is just the host name and not a complete URL. portal.inshosteddata.com
The URL that you use to make the request is the full URL, https://portal.inshosteddata.com/api/account/self/file/latest
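In Python, one way to keep the two values from drifting apart is to start from the full URL and derive the signature's host and path from it. This is only a sketch using the standard library's `urllib.parse`; the variable names are illustrative.

```python
from urllib.parse import urlsplit

url = "https://portal.inshosteddata.com/api/account/self/file/latest"
parts = urlsplit(url)

host = parts.netloc  # goes into the signed message, without the scheme
path = parts.path    # goes into the signed message

print(host)
print(path)
# `url` itself, scheme included, is what gets passed to requests
```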
You may want to take a look at the document I wrote that explains things in detail and provides working code for JavaScript, PHP, and PERL. There are also codes that you can use to verify your signature before you make the actual API call. Canvas Data API Authentication
Hi @millsw ,
I get the same 401 Unauthorized response in my java implementation. Were you able to find a solution?
Answering my own question... I got it resolved. It was an extra "\n" at the end of my 8-item concatenated string.
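The same pitfall is easy to check for in Python. A sketch, assuming the eight-part layout used in the Python attempts in this thread, with placeholder values throughout: joining with `'\n'` puts a newline between items only, so the message contains seven newlines and does not end with one.

```python
parts = [
    "GET",                            # HTTP method
    "portal.inshosteddata.com",       # host name only, no scheme
    "",                               # Content-Type header (empty)
    "",                               # Content-MD5 header (empty)
    "/api/account/self/dump",         # path
    "",                               # alphabetically sorted query parameters (none)
    "Thu, 25 Jun 2015 08:12:31 GMT",  # Date header value (placeholder)
    "secret_here",                    # API secret (placeholder)
]

message = "\n".join(parts)

print(repr(message))
print(message.count("\n"))     # 7 separators for 8 items
print(message.endswith("\n"))  # False: no trailing newline
```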
Hi Abdul,
Unfortunately, no. We were able to get it to work with Node.js, so we just stuck with that. Sorry!
Thanks @millsw and @James for the pointers. I got some python code working. The trick was to use the regular digest method, rather than the hexdigest:
requestMessage = '\n'.join(requestParts)
hmacObject = hmac.new(apiSecret, '', hashlib.sha256)
hmacObject.update(requestMessage)
hmac_digest = hmacObject.digest()
sig = base64.b64encode(hmac_digest)
Cheers,
Damian
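To see why `hexdigest()` fails here, the sketch below signs a placeholder message both ways. Under Python 3 the key and message must be bytes; the secret and message values are made up. Base64-encoding the hex string encodes the 64 ASCII characters of the hex text rather than the 32 raw digest bytes, so the two results differ.

```python
import base64
import hashlib
import hmac

secret = b"secret_here"        # placeholder API secret
message = b"example message"   # placeholder for the joined request parts

mac = hmac.new(secret, message, hashlib.sha256)

# Raw digest: 32 bytes, which base64-encode to 44 characters
raw_sig = base64.b64encode(mac.digest()).decode("ascii")
# Hex digest: 64 ASCII characters, which base64-encode to 88 characters
hex_sig = base64.b64encode(mac.hexdigest().encode("ascii")).decode("ascii")

print(len(raw_sig), len(hex_sig))
print(raw_sig == hex_sig)
```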
Thanks damiansweeney and @James and @William! I left a fresh answer with working Python code at the bottom and on Stack Overflow
canvas lms - How can I use Candas Data REST API using python? - Stack Overflow
Thanks to both you and James for this thread.
A coworker and I are also struggling to use the Canvas Data API. Our goal is to retrieve our files and load them into an Oracle db. We are BI developers. We know a bit about dimensional models and using them, but next to nothing about APIs and API authentication, so this discussion is appreciated. The documentation was challenging for someone as inexperienced at this as myself.
I know it is a bit old-school, but Perl is the scripting language I know best. I'll be taking the suggestions made here and trying to incorporate them into a Perl script. If anyone has suggestions for doing this in Perl, they would be much appreciated,
Keep us posted on your progress so we can benefit from what you learn.
Thanks.
PERL and PHP are my two prevalent programming languages. Do you need help getting the authorization part working, making the actual API call, or some of both?
I haven't used PERL with the Canvas Data API, but I have with the regular Canvas API. There, I used the HTTP::Request::Common and LWP::UserAgent modules to make the calls and the JSON module to convert the output to a PERL data structure.
There are modules on CPAN for almost anything you need to do. I can't recommend a HMAC or SHA256 one without research, but the good ones usually tend to pop up towards the top of a Google search.
By the way, the Canvas Data API is much simpler than the Canvas API. The Canvas Data API only involves GETs (at this time) and no pagination, so the authentication and right locations were the toughest part.
Ah, thanks much, James.
When I say that Perl is the scripting language I know best, I have to confess that, even there, I'm self taught and quite rusty. But I did manage to use it successfully to pull down data from the regular Canvas API, using the JSON and DBI modules to convert the data and load it into some db tables.
That's my qualification on how much, or little, I understand in what I'm about to say. I think it is mostly the authentication piece that's the biggest challenge. If I could send the right request and get a reply, I think I could manage from there. But the truth is, I haven't even tried to do it in Perl yet because I was having such a hard time understanding what I need to send and how to construct it. That's what's giving me headaches--at least for now.
Your guidance in this discussion and the other one that you referenced contain many helpful suggestions.
Yes, I Googled HMAC SHA256 and I think that gave me some pretty good leads on which module to use for that piece.
Thanks
@s_moles ,
I wrote a PERL module this morning that would call the API. It works with limited testing (the dump) and it's not complete (I like to add methods for all of the API calls rather than making the end-user specify the URL themselves), but here's what I found so far:
You need the Digest::SHA module and, in particular, the hmac_sha256_base64 procedure.
use Digest::SHA qw{hmac_sha256_base64};
Assuming that you can generate the message as shown in the examples, then the line to generate the HMAC code is this:
my $hmac = hmac_sha256_base64( $message, $secret );
But then there's one additional gotcha. The PERL implementations don't automatically pad the code to the proper length. The length needs to be a multiple of 4 and you add equal signs to the end if it's not.
This can be fixed as such:
while ( length($hmac) % 4 ) {
$hmac .= '=';
}
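For comparison, Python's `base64.b64encode` already emits the padding, so this step only matters when a signature arrives unpadded. The sketch below mimics an unpadded value by stripping the `=` and then restores it with the same multiple-of-4 rule. `pad_base64` is a hypothetical helper name, and the key and message are placeholders.

```python
import base64
import hashlib
import hmac


def pad_base64(text):
    """Append '=' until the length is a multiple of 4."""
    return text + "=" * (-len(text) % 4)


digest = hmac.new(b"secret_here", b"example message", hashlib.sha256).digest()
padded = base64.b64encode(digest).decode("ascii")
unpadded = padded.rstrip("=")  # 43 characters, as an unpadded encoder would return

print(len(unpadded), len(pad_base64(unpadded)))
print(pad_base64(unpadded) == padded)
```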
Depending on how you want to dump your data, there is the issue that PERL has no true and false values; we use 1 and 0 instead. The JSON decoders commonly create objects for true and false and then make them act like 1 and 0. So you should be able to say something like if ($item->{'finished'}) { } to see if a dump has finished.
The following code snippets are other places you might want some help. They probably won't run alone, but incorporated into the bigger picture, it all works.
You can also use HTTP::Date to get the proper date format.
use HTTP::Date;
my $timestamp = time2str();
After reading the documentation, it appears that decode_json() is preferred to from_json.
use LWP::UserAgent;
use JSON qw{decode_json};
my $json;
my $ua = LWP::UserAgent->new;
my $headers = HTTP::Headers->new(
    'Authorization' => 'HMACAuth ' . $api_key . ':' . $hmac,
    'Date' => $timestamp,
);
my $request = HTTP::Request->new( $method, $uri, $headers );
my $response = $ua->request($request);
if ( $response->is_success ) {
    $json = decode_json( $response->decoded_content );
}
else {
    print( STDERR $response->status_line() );
}
I apologize for the mix of procedural and object-orientated programming. I am trying to switch most of my coding over to OO, but sometimes, when you just need one function from another module like Digest::SHA, it's quicker to use the procedural approach.
Hopefully you'll get something useful out of this. I'll try to get the code posted somewhere once it's done, but you may not want to wait that long.
Wow, this is awesome, James. Thanks much.
Among other things, I never would have figured out that the hmac code needs to be padded with equal signs to a multiple of 4.
It would have taken me days, if not weeks to get this far.
Not sure when I'll get to give it a try. We've got a Cognos upgrade going that we're testing for and I've got some other deadline-sensitive work to finish too. So, you might not hear back from me right away, but I'll definitely use this information.
When connecting to the regular Canvas API, I used the REST::Client module. I'm not familiar with LWP, but you've given me some good examples to follow.
I'll keep an eye out for additional posts to the community.
Thanks again
I haven't used the REST::Client myself, but that should be the easy part of the picture. Good luck with the upgrade - it's nearing the end of the semester so crazy for me, too.
I just wrote a blog post Canvas Data API Authentication that explains in excruciatingly painful detail how to create the HMAC signature for the API calls. There is working code for JavaScript, PHP, and PERL (sorry no Python). For those who make it all the way to the bottom, you are rewarded with links to my Canvancement site that has a complete API PERL module or PHP class that will do all of the API work for you.
I am also working on (working for me but not ready for publication) PHP code that will:
Brook, I decided not to pursue the PERL approach on those secondary files, I've got too much other stuff going on.
I know MySQL is probably not what someone wanting aggregate data or dashboards would want, but I don't have experience with OLAP and we're more interested in drilling down to generate reports right now. Looking at aggregate data and OLAP cubes may be something we pursue in the future.
I've been reading the blog post. Must be the most helpful and most generous community post(s) that I've ever come across.
I don't think my PERL skills are good enough to module-ize and share what we come up with for unzipping and importing the files into our Oracle db, but I hope I'll be at least considerate enough to let you know how we make out.
Thanks for the kind words, Brook.
I had your fear (I'm not an expert, I don't know what I'm doing, my code would be laughed at by anyone who knew what they were doing, etc). What I think you'll find is that most of the people here don't care about how bad (or good) your code is. As long as it works, and even sometimes when it doesn't, it gives them something to go off of. Simply put, most people (at least the vocal ones) are of the opinion that "something is better than nothing."
So, yes, please feel free to contribute back what you find.
Thanks to @James, @William Mills, and damiansweeney! I was able to make the Python request successfully. A draft of the code follows so the answer stays in one post.
It would be good to have a better tool to collaborate, though. I would also suggest that we help grow the canvas-lms tag on StackOverflow: http://stackoverflow.com/tags/canvas-lms/info
This answer is also available in canvas lms - How can I use Candas Data REST API using python? - Stack Overflow
Best,
#!/usr/bin/python
#imports
import datetime
import requests
import hashlib
import hmac
import base64
import json
#Get the current time, printed in the right format
def nowAsStr():
    #Timezone-aware UTC time; the string itself must end in GMT
    currentTime = datetime.datetime.now(datetime.timezone.utc)
    prettyTime = currentTime.strftime('%a, %d %b %Y %H:%M:%S GMT')
    return prettyTime
#Set up the request pieces
apiKey = 'your_key'
apiSecret = 'your_secret'
method = 'GET'
host = 'api.inshosteddata.com'
path = '/api/account/self/dump'
timestamp = nowAsStr()
requestParts = [
    method,
    host,
    '', #content Type Header
    '', #content MD5 Header
    path,
    '', #alpha-sorted Query Params
    timestamp,
    apiSecret
]
#Build the request
requestMessage = '\n'.join(requestParts)
print (repr(requestMessage))
#hmac needs bytes for both the key and the message under Python 3
hmacObject = hmac.new(apiSecret.encode('utf-8'), b'', hashlib.sha256)
hmacObject.update(requestMessage.encode('utf-8'))
hmac_digest = hmacObject.digest()
#Base64-encode the raw digest (not the hexdigest) and convert it back to a str
sig = base64.b64encode(hmac_digest).decode('ascii')
headerDict = {
    'Authorization' : 'HMACAuth ' + apiKey + ':' + sig,
    'Date' : timestamp
}
#Submit the request/get a response
uri = "https://"+host+path
print (uri)
print (headerDict)
response = requests.request(method='GET', url=uri, headers=headerDict, stream=True)
#Check to make sure the request was ok
if(response.status_code != 200):
    print ('Request response went bad. Got back a ', response.status_code, ' code, meaning the request was ', response.reason)
else:
    #Use the downloaded data
    jsonData = response.json()
    print (json.dumps(jsonData, indent=4))
This discussion post is outdated and has been archived. Please use the Community question forums and official documentation for the most current and accurate information.