New Member

AWS Canvas files query

Hello All and @msmith 


We have been using the Canvas API to GET analytics/activity/page_views/participations.

Since we have over 40K students, it takes a really long time to get each record one by one.

In addition, we recently obtained access to the AWS Canvas files; the same files are located at
We need to query those files from Amazon Redshift/Athena to get analytics/activity/page_views/participations, instead of using the Canvas API.

There are a bunch of different files. The question is how to query this data to get the same report:

analytics/activity/page_views/participations. What tables/files should be included for the student activity report?

Thank you

3 Replies
Community Champion

Hello @CanvasUser2020 -   With Canvas Data 2 availability being a few short months away and with it having a completely different schema (will have datasets vs. a star schema), I'd have a tough time justifying spending time right now developing such reports for the current Canvas Data.

But to start answering your questions, the last_activity_at date is found within the enrollment_dim table. Join the course_dim, enrollment_dim, user_dim and pseudonym_dim (unique_name is the email address) tables.
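A minimal sketch of that join, run here against an in-memory SQLite database so it can be tried without AWS access. The column names (`id`, `course_id`, `user_id`, `type`, `name`) and the sample rows are assumptions for illustration; only `last_activity_at` and `unique_name` are named above, so check the rest against your own Canvas Data schema. The SELECT itself is plain SQL and should carry over to Athena or Redshift with little change.

```python
import sqlite3

# Toy stand-in for the Canvas Data tables. Column names other than
# last_activity_at and unique_name are ASSUMED, and the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course_dim (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE user_dim (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE pseudonym_dim (id INTEGER PRIMARY KEY, user_id INTEGER, unique_name TEXT);
CREATE TABLE enrollment_dim (
    id INTEGER PRIMARY KEY, course_id INTEGER, user_id INTEGER,
    type TEXT, last_activity_at TEXT
);
INSERT INTO course_dim VALUES (1, 'Biology 101');
INSERT INTO user_dim VALUES (10, 'Ada'), (11, 'Ben'), (12, 'Cy');
INSERT INTO pseudonym_dim VALUES
    (1, 10, 'ada@example.edu'), (2, 11, 'ben@example.edu'), (3, 12, 'cy@example.edu');
INSERT INTO enrollment_dim VALUES
    (100, 1, 10, 'StudentEnrollment', '2023-01-15 10:00:00'),
    (101, 1, 11, 'StudentEnrollment', NULL),
    (102, 1, 12, 'TeacherEnrollment', '2023-01-16 08:30:00');
""")

# One row per student enrollment with course, user, login and last activity.
LAST_ACTIVITY_SQL = """
SELECT c.name AS course_name,
       u.name AS user_name,
       p.unique_name AS login,
       e.last_activity_at
FROM enrollment_dim e
JOIN course_dim c ON c.id = e.course_id
JOIN user_dim u ON u.id = e.user_id
JOIN pseudonym_dim p ON p.user_id = u.id
WHERE e.type = 'StudentEnrollment'
ORDER BY c.name, u.name
"""

rows = conn.execute(LAST_ACTIVITY_SQL).fetchall()
for row in rows:
    print(row)
```

Note that a user with more than one pseudonym would appear once per login in this result, so you may need to pick one login per user before reporting.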

Additional activity/ participation measures might be the # of discussion posts, # of messages, # of zeros, current course score, # of missing assignments, # of assignment scores below a certain range, etc. 

These use the discussion_dim, conversation_message_dim, submission_fact/dim, course_score_fact tables (and more).

I suppose it is somewhat possible to refer to the requests table to estimate page views once you have successfully excluded the noise.  But note that it isn't obvious if the student 'participated' or not as that field, while present in the API, is not found in Canvas Data.  Here is a great discussion on using the requests table to count.  And here is another.
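As a rough illustration of estimating page views from the requests table, here is a sketch using an in-memory SQLite table. The column names (`user_id`, `course_id`, `url`, `http_method`) and the noise filter (keep only GET requests, drop `/api/` URLs and rows with no user or course) are my assumptions for illustration, not an official definition of a page view, and the rows are made up.

```python
import sqlite3

# Toy requests table; column names and rows are ASSUMED for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE requests (
    id TEXT, timestamp TEXT, user_id INTEGER, course_id INTEGER,
    url TEXT, http_method TEXT
);
INSERT INTO requests VALUES
    ('r1', '2023-01-15 10:00:00', 10, 1, '/courses/1/assignments/5', 'GET'),
    ('r2', '2023-01-15 10:00:01', 10, 1, '/api/v1/courses/1/activity_stream', 'GET'),
    ('r3', '2023-01-15 10:05:00', 10, 1, '/courses/1/modules', 'GET'),
    ('r4', '2023-01-15 10:06:00', 10, 1, '/courses/1/assignments/5/submissions', 'POST'),
    ('r5', '2023-01-16 09:00:00', 11, 1, '/courses/1', 'GET'),
    ('r6', '2023-01-16 09:00:00', NULL, 1, '/courses/1', 'GET');
""")

# Hypothetical noise filter: count only GET requests that are not API calls
# and that are tied to both a user and a course.
PAGE_VIEW_SQL = """
SELECT user_id, course_id, COUNT(*) AS approx_page_views
FROM requests
WHERE http_method = 'GET'
  AND url NOT LIKE '/api/%'
  AND user_id IS NOT NULL
  AND course_id IS NOT NULL
GROUP BY user_id, course_id
ORDER BY user_id, course_id
"""

page_views = conn.execute(PAGE_VIEW_SQL).fetchall()
for row in page_views:
    print(row)
```

The counts are only an approximation of what the API reports, and what counts as noise will depend on your own data.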

And here I am asking for clarity on what changes may come to Requests with Canvas Data 2.  No answer on that as yet, but perhaps I've missed something.  Perhaps someone else knows what Canvas Data 2 will bring in terms of changes for Requests?

If you press on and get stuck, feel free to send a note.  I've used Requests for a few reports where aggregation estimates were acceptable.  For me, the disclaimer that the Requests data may be incomplete dissuades me from using it for anything that depends on statistical accuracy.



New Member

Some people at Harvard University created this process, which I use.  It creates processes which copy the Canvas Data (1) files into S3 and also sets up the environment for Athena.  I found out about it from a post in this forum about a year ago, but that post no longer works.

I run Athena queries against it, use Tableau, and also access the S3 data from Snowflake.