Hello All and @msmith
We have been using the Canvas API to GET analytics (activity, page_views, participations).
Since we have over 40K students, it takes a really long time to fetch each record one by one.
In addition, we recently obtained access to the Canvas files on AWS; the same files are located at https://mycampusname.instructure.com/accounts/myID/external_tools/766806
We need to query those files from Amazon Redshift/Athena to get analytics (activity, page_views, participations), instead of using the Canvas API.
There are a bunch of different files. The question is how to query this data to get the same report (analytics: activity, page_views, participations).
What tables/files should be included for the student activity report?
Hello @CanvasUser2020 - With Canvas Data 2 availability being a few short months away and with it having a completely different schema (will have datasets vs. a star schema), I'd have a tough time justifying spending time right now developing such reports for the current Canvas Data.
But to start answering your questions, the last_activity_at date is found within the enrollment_dim table. Join the course_dim, enrollment_dim, user_dim and pseudonym_dim (unique_name is the email address) tables.
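As a rough illustration of that join, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Athena/Redshift. The sample rows are made up, and the column names (id, course_id, user_id, type, name) are my assumptions about the Canvas Data schema, so please check them against the schema docs before reusing the SQL.

```python
import sqlite3

# In-memory stand-in for the Canvas Data tables. Column names are
# assumptions and should be verified against the published schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course_dim (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE user_dim (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE pseudonym_dim (id INTEGER PRIMARY KEY, user_id INTEGER, unique_name TEXT);
CREATE TABLE enrollment_dim (id INTEGER PRIMARY KEY, course_id INTEGER,
                             user_id INTEGER, type TEXT, last_activity_at TEXT);

-- Made-up sample rows, for illustration only.
INSERT INTO course_dim VALUES (1, 'Biology 101');
INSERT INTO user_dim VALUES (10, 'Ada'), (11, 'Ben'), (12, 'Cy');
INSERT INTO pseudonym_dim VALUES
    (100, 10, 'ada@example.edu'),
    (101, 11, 'ben@example.edu'),
    (102, 12, 'cy@example.edu');
INSERT INTO enrollment_dim VALUES
    (1000, 1, 10, 'StudentEnrollment', '2021-03-01 14:05:00'),
    (1001, 1, 11, 'StudentEnrollment', NULL),
    (1002, 1, 12, 'TeacherEnrollment', '2021-03-02 09:00:00');
""")

# One row per student enrollment, with the login (email) and last activity.
LAST_ACTIVITY_SQL = """
SELECT c.name        AS course_name,
       u.name        AS student_name,
       p.unique_name AS email,
       e.last_activity_at
FROM enrollment_dim e
JOIN course_dim    c ON c.id = e.course_id
JOIN user_dim      u ON u.id = e.user_id
JOIN pseudonym_dim p ON p.user_id = u.id
WHERE e.type = 'StudentEnrollment'
ORDER BY c.name, u.name
"""

rows = conn.execute(LAST_ACTIVITY_SQL).fetchall()
for row in rows:
    print(row)
```

A user can have more than one pseudonym, so in real data this join may return several rows per enrollment; you would probably want to pick one login per user before reporting.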
Additional activity/ participation measures might be the # of discussion posts, # of messages, # of zeros, current course score, # of missing assignments, # of assignment scores below a certain range, etc.
These use the discussion_dim, conversation_message_dim, submission_fact/dim, course_score_fact tables (and more).
I suppose it is somewhat possible to refer to the requests table to estimate page views once you have successfully excluded the noise. But note that it isn't obvious if the student 'participated' or not as that field, while present in the API, is not found in Canvas Data. Here is a great discussion on using the requests table to count. And here is another.
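To make the idea of counting from the requests table concrete, here is a small sketch, again using sqlite3 as a stand-in for Athena/Redshift. What counts as "noise" is a judgment call: the filters below (GET only, status 200, a known user, no /api/ URLs) are my own assumptions and not an official definition, and the column names should be checked against the schema docs. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-in for the requests table; column names are assumptions.
CREATE TABLE requests (id TEXT, timestamp TEXT, user_id INTEGER, course_id INTEGER,
                       http_method TEXT, http_status TEXT, url TEXT);
INSERT INTO requests VALUES
 ('a', '2021-03-01 10:00:00', 10,   1, 'GET',  '200', '/courses/1/modules'),
 ('b', '2021-03-01 10:00:05', 10,   1, 'GET',  '200', '/api/v1/courses/1/activity_stream'),
 ('c', '2021-03-01 10:01:00', 10,   1, 'POST', '200', '/courses/1/quizzes/5/submissions'),
 ('d', '2021-03-01 10:02:00', 10,   1, 'GET',  '404', '/courses/1/files/9'),
 ('e', '2021-03-01 11:00:00', 11,   1, 'GET',  '200', '/courses/1/assignments/3'),
 ('f', '2021-03-01 11:05:00', NULL, 1, 'GET',  '200', '/courses/1'),
 ('g', '2021-03-02 09:00:00', 10,   1, 'GET',  '200', '/courses/1/pages/syllabus');
""")

# Estimated page views per user and course, after dropping rows treated as noise.
PAGE_VIEWS_SQL = """
SELECT user_id, course_id, COUNT(*) AS est_page_views
FROM requests
WHERE user_id IS NOT NULL
  AND course_id IS NOT NULL
  AND http_method = 'GET'
  AND http_status = '200'
  AND url NOT LIKE '/api/%'
GROUP BY user_id, course_id
ORDER BY user_id, course_id
"""

page_views = conn.execute(PAGE_VIEWS_SQL).fetchall()
for row in page_views:
    print(row)
```

This only gives an estimate of views. It says nothing about participation, since that flag is not in Canvas Data.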
And here I am asking for clarity on what changes may come to Requests with Canvas Data 2. No answer on that as yet, but perhaps I've missed something. Perhaps someone else knows what Canvas Data 2 will bring in terms of changes for Requests?
If you press on and get stuck, feel free to send a note. I've used Requests for a few reports where aggregation estimates were acceptable. For me, the disclaimer that the Requests data may be incomplete dissuades me from relying on it for anything that requires statistical accuracy.
Some people at Harvard University created this process, https://github.com/Harvard-University-iCommons/canvas-data-aws, which I use. It sets up processes that copy the Canvas Data (1) files into S3 and also sets up the environment for Athena. I found out about it from a post in this forum about a year ago, but that post is no longer available.
I run Athena queries against it, use Tableau, and also access the S3 data from Snowflake.