Obtaining and using Access Report data for an entire course

James
Community Champion
203
147228

May 27, 2020, update (version 15):

Another week, another update. This time I installed a new linter to check for crud and subtle issues with the code. It turned up a lot of little things and at least one place where I was using an array when I thought I was using an object. Here are some of the changes for version 15

  • The file is now sorted by the sortable name, when present. Version 14 sorted it by the Canvas User ID.
  • Promises now work in the code. Version 14 relied on Bottleneck's idle function to kick in, but it might have triggered before all of the fetches were done.
  • The script can now handle students enrolled in multiple sections of the course. You can generate a delimited list of sections, duplicate the information for each section, or just return the first section found.
  • The script now only tries to fetch usage information when there is a last_activity_at date. That is no guarantee that there is an access report, but I never found an access report without having a last activity date.
  • The script can now hide empty columns with no data. Previously, this was only for SIS data unless you edited the code.
  • The script can be run correctly without reloading the page. Previously it appended the data, so if you ran it twice without refreshing the page, the data would be there twice.

May 19, 2020, update (version 14):

I've done a major rewrite of the script.

  • It now uses a throttling library to play nicer with Canvas. Large courses were getting inconsistent results because the requests were failing. Requests shouldn't fail, but if they do, I throttle it even more and then try to repeat the failed requests.
  • I added the Last Activity and Total Activity Time from the People page to the report.
  • I added the current course score and current course grade to the report (the grade is only available if you're using a grading scheme).
  • I modified the quiz views to match what Canvas gives. In the data supplied, they count taking a quiz as a view, but they subtract that off in the online access report. My data now matches theirs.
  • I check the permissions of the user to make sure they can generate a report and don't add the button if they cannot. Previously, students could add the access report, it just didn't give them any information. Now it doesn't even show the button.

This is a major rewrite and updating will show a lot of changes. There is a good chance that any local customization you have done to the script will be lost. Those using a non-instructure.com domain for Canvas may need to reset their // @include line on line 5 (see the custom URL section below).

Synopsis

Canvas provides an Access Report for each student, but you have to click on each student's name from the People page to get it. This document will show you how to install a button on the People page that will create a .CSV file of the Access Report data for all students enrolled in the course. It will then show some ways you can obtain useful information from that data using Microsoft Excel. Although much of this data could be obtained through Canvas Data, this script makes it available to the instructor in real-time.

Quick Install

For those power users who are impatient, here are the quick install steps.

  1. Install a browser userscript manager Tampermonkey for Chrome/Firefox/Safari
  2. Install the Access Report Data user script.
  3. Navigate to the People page and click on the "Access Report Data" button

If you run into problems, be sure to go back and read the instructions.

Note that the Tampermonkey extension for Safari requires payment. This is for the author of the extension that allows the script to run. I am not asking anything for the script itself.

Introduction

Canvas will give you an Access Report for each student that tells you how many times a student has viewed or participated in a particular content item. To obtain the list, you go to the Course Roster by clicking the People navigation link, then click on a student's name. Once you do that, you will get an item on the right-side navigation bar that says "Access Report for student's name". When you click on that, you get something like this:

Student Access Report

It gives you the type of content (as an icon), the name of the content, the number of times a student viewed the content, the number of times a student participated in the content, and the last time the student viewed (or participated) in the content. It is sorted by the time the content was last accessed so you can see what they have been working on most recently.

Unfortunately, this information is available for just one student at a time, so answering questions like "Who viewed the PowerPoint presentation I told them to read?" becomes difficult -- you need to go into each student individually to see that information.

I've written a User Script that solves that problem. It fetches a list of all of the students in the class and then obtains the Access Report for each one, compiling all of that data into a single Comma Separated Values (.CSV) file that can be opened with a spreadsheet like Microsoft Excel.

This document shows you how to install that script and then analyze the data that you get from it.

Installing the User Script

A user script is a JavaScript that is ran by the browser on the user's machine. Rather than a Canvas-supplied script, it's one that the user installs and runs on their own. The installation is per user and per machine, so you will need to install it on each machine that you want to use the script with.

Custom URLs

The script automatically runs on any page that matches https://*.instructure.com/courses/*/users. This is the main People page if your site is hosted by Instructure without a custom instance. If you have a custom URL, like canvas.university.edu, then you will need to modify the script to get it to work.

The specific steps to do this vary depending on your browser add-on. In Greasemonkey, click on the Greasemonkey pull-down, choose Manage User Scripts, find the Access Report Data, right click and choose Edit. In Tampermonkey, click on the Tampermonkey Icon, choose Dashboard, and then click on Access Report Data.

In either case, you need to change // @include statement on line 5 to match your instance. In the case of canvas.university.edu, you should change it to https://canvas.university.edu/courses/*/users. The * is a wildcard that will match any course.

Customization

There are two configuration variables that can be set within the source code to alter the functionality of the script.

  • showViewStudent will include the participations where a student clicked on a student's name from the People (roster) page and viewed their profile when it is set to true. The default is false, which will remove this information from report before the CSV file is downloaded. This information was included prior to February 3, 2017, but faculty were getting confused when they did a pivot table and the names of their students showed up as titles along with assignments and content pages. Now users will have to explicitly set it to true to get that information.
  • quizParticipation is a flag to make the data in this report match the data displayed in the Canvas Access Report. Canvas counts quiz participations as views, but then subtracts them off when displayed. When this is true, it makes this report match Canvas. If false, then it displays what was in the JSON data, which was the previous behavior.
  • enrollmentStates is an array that says what kind of enrollments should be included. The default is 'active' and 'completed', but you may also use 'invited', 'rejected', and 'inactive' (inactive probably won't help much). See the List users in a course endpoint for additional information.
  • analytics is an array that provides an easy way to include the student analytics (last activity and total activity) or grade (current grade and current score).
  • disableMissing is a Boolean value that will remove empty columns from the output when true.
  • headingSpaces will allow you to remove the space in the headings and replace it with something else (empty string or underscore are common). This can cause "User ID" to become "UserID" or "User_ID". Some users reported problems with programs, like R, when there were spaces in the variable names (headings). The default is ' ' (a space), which is to leave the spaces.
  • multipleSections determines how students enrolled in multiple sections are handled. Setting the value to 1 will duplicate the information for each section, so a student enrolled in three sections will have three times the data. This is important if you are using the section in your pivot table, but will throw off the counts if you are not. Setting the values to 2 will take the first section that it finds and use it and ignore all of the other sections so that each item is included only once with no duplication. Setting it to a non-numeric string will use that string as a delimiter but not duplicate any items. The default is ', ' (comma followed by space) so that multiple sections will show up as a comma delimited list.
  • maxConcurrent is the maximum number of concurrent calls to make to Canvas. This is part of the throttling system. The higher this is, the faster the calls can be made, but the more likely you will abuse Canvas in a way that causes them to block your requests.
  • minTime is a delay in milliseconds injected between each call. If you make a lot of calls that the same time, Canvas imposes a penalty that can quickly exhaust the available limit before the maxConcurrent is reached. It is likely that this value will have more impact that maxConcurrent will.
  • debug is a Boolean flag to output debugging information. Included in the debugging output is the minimum x-rate-limit-remaining value and the maximum x-request-cost value. Those can be used to help tweak the maxConcurrent and minTime settings.
  • csvFields contains the headings, source field, any special formatting required, and allows you to disable columns that you do not want. The three sources are u (user data), s (section data), and a (access report data). This is under the advanced configuration section, which typically means don't mess with it if you don't know what you're doing, but any user should be able to change the name property.

Export the Data

In the June 4, 2016, production release, Canvas consolidated the buttons that previously appeared on the People page under the administrative cog.

The script was updated on June 6, 2016, to reflect that change and move the Access Report Data into that same cog. Note that the demonstration videos have not been updated to reflect the change.

Open Settings

When you expand the menu by clicking on the cog, then you can choose the Access Report Data item.

Click Access Report Data

Click on this and wait.

There is now a progress bar that appears once the list of students has been downloaded and it is fetching the access reports.

It has to make an API call for the list of students and then generate the Access Report for each student in the course. This only took 8 seconds for my class with 46 students and up to 112 content items, but we're only one month into the semester and it will probably slow down later. It took 12 seconds on a course with 81 active students and 126 content items. It took 15 seconds for a class with 35 students but 470 content items.

Limitations

Early on, really large classes were having timeout issues in Chrome. Switching to Firefox helped. Now I wonder if the problems were related to the requests failing and so I hope the changes I made with version 14 have improved this so that it is no longer an issue. I've made some improvements to the code with version 14 and let Chrome run for 4.5 minutes in a class with 29,615 items without issues

If you have a really large class, you may need to tweak the settings. The call to get the enrollments is relatively expensive compared to the call to get the usage (access report data). In a class of 300 students, it was fine, but in a class of over 10,000 students, it was timing out. It is supposed to retry failed attempts, but there was some data loss that I haven't tracked down yet. It's best to avoid exhausting the x-rate-limit-remaining header. Increase the minTime or decrease the maxConcurrent values. Normally, the minTime is the more important one, but if your requests are taking a long time to process (like the enrollments were), then you may decrease the number of concurrent requests. Reducing maxConcurrent from 40 to 20 avoided the lost data issue in the 10k+ course and only added 4 seconds to the download time for the class with 300 students.

The course with 81 students actually had 233 students in it, but it was a resource course for faculty and only 81 had bothered to go into the course to do anything. That brings up an important note about the Access Report.

The Access Report only provides information on students who are doing something in the course.

Access Report data does not return information about who has not done something, but Excel has an option that will allow you to see this.

Another thing to note is that additional information is available through the Page View data, like the exact times when students did something and which browser they were using at the time. However, faculty don't generally have access to this information and you have to load the information for each student and sift through it to see which applies to the particular course. This takes way more than 8 seconds for little gain.

The Page View information is available through Canvas Data. However the data there is not current, running about a day or two behind real-time. It also has limited availability and requires additional resources to analyze. Page view data still occurs when a student does something. If they're not doing anything, there is nothing to record.

Finally, links to URLs that are contained within assignments or pages are not included here. The External URLs are just those that are linked to items in modules, not a link in a page full of instructions.

Raw Data

Okay, you've installed the script, clicked the button, and opened the file in Excel. Now what?

Well, you get a bunch of raw data that will need manipulated before you can tell anything useful. It looks pretty intimidating, you have a report that looks like this (and this is just the first 10 rows).

Excel Raw Data

Note that the SIS Login and SIS User ID may not be there, depending on the permissions you have within Canvas.

Where did all this come from?

It turns out that Canvas actually provides more information than it shows on the page you get when you ask for the Access Report. Canvas displays the HTML version with just the highlights and in a form that is easy for humans to read. It also provides the data in JavaScript Object Notation (JSON) format that is much easier for a computer to deal with. Canvas internally calls these items assets, and that gets reflected in the naming of the item.

Here's what that first entry looks like in JSON.

{
"id":123456789,
"asset_code":"discussion_topic_11319205",
"asset_group_code":"topics",
"user_id":1278402,
"context_id":1785810,
"context_type":"Course",
"last_access":"2016-02-12T21:31:11Z",
"created_at":"2016-02-10T20:12:45Z",
"updated_at":"2016-02-12T21:31:11Z",
"asset_category":"topics",
"view_score":2,
"participate_score":null,
"action_level":"view",
"summarized_at":null,
"display_name":"Discussion 4: Hypotheses and Errors",
"membership_type":"StudentEnrollment",
"readable_name":"Discussion 4: Hypotheses and Errors",
"asset_class_name":"discussion_topic"
}‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

What's being displayed in the HTML version of the report is contained are the asset_class_name, readable_name, view_score, participate_score, and last_access. The User's name is displayed at the top of the report, but the user_id (1278402) can be used to identify the student (in this case Cubic Dream) and the user script merges the user information and the access data together into a usable format for the user. By the way, all of the user data have been anonymized, but the assignments and access data are from my actual class. Cubic Dream isn't my student's real name and 1278402 isn't Cubic's real Canvas User ID.

Data Dictionary

In statistics, I emphasize the importance of having a good data dictionary. You need to know what the values represent.

  • User ID is the Canvas User ID. Everyone has one. Most people won't find it extremely interesting, but in the case of multiple students with the same name, it can help tell them apart.
  • Display Name is name of the student that is displayed in Canvas. It's normally in First Last format, like Cubic Dream or Skinny Record.
  • Sortable Name may not be available to you. There is a feature that your Customer Success Manager can enable that will allow you to list names in Last, First format like Dream, Cubic or Record, Skinny. If it's enabled, this provides you with a quick way to sort alphabetically by last name.
  • Category corresponds to the asset_category_type of the Access Report data. It includes things like announcements, assignments, collaborations, conferences, external_urls, files, grades, home, modules, quizzes, roster, topics, and wiki.
  • Class corresponds to the asset_class_name of the Access Report data. It includes things like announcement, assignment, attachment, content_tag, discussion_topic, google_docs_collaboration, quizzes/quiz, student_enrollment, teacher_enrollment, and wiki_page. Generally, there is a one-to-one correspondence between category and class, but some things like the roster category are broken into student_enrollment and teacher_enrollments for the class name.
  • Title is the name of the content. In the Access Report data, it's called the readable_name There was a display name as well, but it was sometimes blank and when it was there, it mostly matched the readable name.
  • Views is the number of times the student viewed the content. For discussions, this would be the number of times they went in and viewed the discussion. This is the closest most people will get to the number of messages they read, but it's the number of times they went in, there is no guarantee they actually did anything past opening the page.
  • Participations is the number of times the student participated. For discussions, this would be the number of posts they made.
  • Last Access is the last time the student viewed or participated. It is converted to the local time according to the browser.
  • First Access is the first time the student viewed or participated. It is converted to the local time according to the browser.
  • Action is either view or participate. I'm not exactly sure on this, but I think it corresponds to the last access.
  • Code is a unique identifier for each piece of content. It contains the type of content plus the Canvas ID. For example, assignment_8556874 identifies this as assignment with a Canvas ID of 8556874. That is not very useful for faculty, but if someone was trying to link all the data together for advanced reports, it may be.
  • Group Code is another way of categorizing the data that most people will ignore. The group doesn't refer to Groups in the normal sense, but is some way to organize the data. For example, there is an assignment_group_### group code field that appears to represent which assignment group the item belongs to. That means you could, with additional information, break it down by whether the assignment was homework, exams, projects, etc., depending on your assignment groups. On the other hand, every single code beginning with wiki_page_ belonged to the same wiki_ group code (at least for my course).
  • Context Type should be Course or Group and relates to the Context ID to determine exactly which course or group.
  • Context ID is the Canvas Course or Group ID for the course. It is used in conjunction with the Context Type.
  • Login ID is what the user uses to log into Canvas with. For us, it's their NetID, but it could be an email address.
  • Section Name is the name of the section the student is enrolled in. This may be a delimited list if the student is in more than one section. See the multipleSections configuration variable for more information.
  • Section ID is the Canvas ID for the section the student is enrolled in. This may be a delimited list if the student is in more than one section. See the multipleSections configuration variable for more information.
  • SIS Course ID is the Course ID supplied by your Student Information System (SIS). This column may not be there if the person requesting the Access Report doesn't have access to the SIS information from Canvas. It's a permissions issue and the script runs as the person calling it.
  • SIS Section ID is the Section ID supplied by your Student Information System (SIS). This column may not be there if the person requesting the Access Report doesn't have access to the SIS information from Canvas. It's a permissions issue and the script runs as the person calling it.
  • SIS User ID is what your SIS knows the person by. For us, it's an integer that uniquely identifies the user. This column may not be there if the instructor doesn't have access to the SIS information inside Canvas.
  • Last Activity is the last time the student accessed the course as explained in How do I use the People page in a course as an instructor?  It is converted to the local time according to the browser.
  • Total Activity is the total time in decimal hours as explained in the Canvas Instructor Guide. Instead of giving hours:minutes:seconds, 3:12:15 becomes 3.21 hours.
  • Current Score is the current numeric score for the student in the course.
  • Current Grade is the current letter grade for the student in the course. This is only available if a grading scheme has been set.
  • Final Score is the score the student would get if all ungraded assignments were given a 0. This is disabled by default.
  • Final Grade is the letter grade the student would get if all ungraded assignments were given a 0. This is disabled by default.

Note that section information was added after the original script was released and are not included in the videos.


Drilling Down to Specifics

Some questions can be answered using just the raw data.

Format as a Table

If you're going to examine the raw data, you will want to turn it into a table first. To do this, go to Home > Format as Table or Insert > Table. This allows you to filter or sort on a column, which will greatly increase your productivity later.

Who participated in a discussion?

Let's say we wanted to know who participated (and how many times) in Discussion 3.

I purposely started this one off in a novice way, to show what people who may not be familiar with Excel can do. There are more efficient ways of doing this.

  • You can sort by the title by clicking in the Title column and going to Data > Sort > A-Z. An easier way is to click on the down arrow on the Title heading at the top of the table and choose Sort A to Z.
  • Doing that may overwhelm you with information so you can use Ctrl-F to find the one you want. But once I did that, it turned out that Discussions are in there twice, once as a discussion and once as an assignment. This is a problem when you have multiple contents with the same name, so sorting by title may not be the best solution.
  • You can filter the Category (choose topics) or Class (choose discussion_topics). To do this, click on the appropriate heading at the top and un-check the ones you don't want.


Who viewed an external URL?

You'll want to sort by title to group the content together. You could use the Code to group, but it wouldn't be alphabetical.

I then used filters to select the external URLs. Those can be found under Category (external_urls) or Class (content_tag) and found the one I was looking for.

Show Me!

Here is a video walk-through of the three items mentioned here.

Summary Reports

Sometimes you want a broader picture than just what a student did on a single assignment. To accomplish these, you'll need to create a Pivot Table.

Create a Pivot Table

To create a pivot table, you need to go to Insert > Pivot Table and click OK.

Once you're there, you will probably want to drag the Display Name or Sortable Name down to the Rows so you can break things down by students. Every one of the items in this section starts off the same way.

Pivot tables allow you to insert slicers to quickly filter the data or time slicers to view data over a particular time period (you must have a date/time field to do this, but we have first access and last access to pick from)

Analyzing Discussions

This whole Access Report project grew out of a desire to know how many times students had gone into the discussions and at least viewed them.

What I'd like to know is how many times did a student view or participate in a discussion this semester.

  1. Choose Insert > Pivot Table
  2. Drag the student's name to the Rows
  3. Drag the Participations to the Values. Excel wants to do "Count of Participations" instead of "Sum of Participations". You can change this in several places, but double clicking on the heading of the table is probably the quickest. Change it from Count to Sum and then change "Sum of Participations" to something else. You might want to use "Participations", but that's already used, so you have to pick something else.
  4. That gives the participations for the entire course. You could add a filter on the right side, but a faster way is to choose Insert Slicer from the top. This gives a nicer interactive menu where you can immediately click and limit your data.
  5. After looking at Participations, you can do the same thing with Views. Drag it to the Values box, where, thankfully, it comes through as a sum. I would change the title from "Sum of Views" to something else, like "View"
  6. You can sort the data by clicking on the Row Labels pull down. The default A to Z and Z to A are for the data in that column, but you can choose More Sort Options and tell it to sort by another column
  7. To break the report down by the discussion, you can add another slicer for the Title and then look at participations and views one discussion at a time.
  8. If you decide you want to look at all of the discussions, then drag Title to the Columns selector on the right side. You'll need to turn off the filter on the Title if you do this.

Here is a video that shows all of that in action.

Filtering by Time Period

The Course Roster (People) page shows you when the last activity of a student was and the total amount of time spent in the course. It does this in alphabetical order. Unfortunately, last activity could be just logging into the course, it doesn't mean they did anything else once they got there.

As a side note, if you have installed my Sort a Roster Canvancement, then you can click at the top of any column to sort by that column. You can find sort by section, by the time they last accessed the course, or even by the total amount of time spent in the course.

You can filter the information in the Access Report Data spreadsheet by time.

  1. Choose Insert > Pivot Table
  2. Drag the student's name to the Rows, and both Views and Participations to the Values. As before, change the heading "Sum of Views" to just "View" and the "Count of Participations" to Sum instead of Count and then change the title to "Participation".
  3. Click on Insert Timeline and choose Last Access.
  4. Change the slider from Months to Days and then highlight the date range to restrict the report to.
  5. You could add filters (through the Slicers) to limit what kind of activity you want to look at.

Remember that only students who have data for that time period will show up. There is no easy way to get a list of students who aren't doing a particular thing.

Here is a video that walks you through the report.

Quick Tables

A quick table is a table that shows you information about a particular area. What we're going to do here is break down the information by the Class. You could just as easily choose the Category and some might find it more useful.

  1. Choose Insert > Pivot Table
  2. Drag the student's name to the Rows, the Title to the Columns, and Class (or Category) to the Filters.
  3. Draw Views and Participations to the Values. Rename Views to be V (yes, just a single letter). Change "Count of Participations" to Sum and then rename it to just be the letter P.
  4. Rotate the titles in Row 4 so that they are vertical. Do not just click on Row 4, it doesn't work; you need to select the cells and then do the rotation. To rotate the text, click on the Orientation icon from the Home screen and choose Rotate Text Up. You may also want to Right align all the text from columns B on, although it may not really matter if you use a single letter for Views and Participations.
  5. Be prepared to be wowed!
  6. Click inside the pivot table, then at the top click on PivotTable Tools > Analyze. On the left, choose the pulldown menu next to Pivot Table Options (don't click on the word Options). Then click Show Report Filter Pages (this won't be available if you forgot to put something in the Filters box on the right). Then choose the filter(s) to use and click OK.
  7. What you get is a page for each type of content. The name of the student is on the side and the name of the content is across the top. This allows you to quickly (hence the name Quick Table) look at discussions or quizzes or external URLs or anything else in the class that is available.

Here is a video that shows all of this in action.

Viewing Who Has Not Participated

As mentioned above, what we are looking at is the Access Report data and it doesn't include information about students who are not engaging in your course. Luckily, there is one checkbox in Excel that we can check to get that information. The student will have needed to do something, anything, so that their name is in the Access Report data, but then we can see what they have not done.

  1. Choose Insert > Pivot Table
  2. Drag the student's name to the Rows
  3. Drag the Participations to the Values. Excel wants to do "Count of Participations" instead of "Sum of Participations". You can change this in several places, but double clicking on the heading of the table is probably the quickest. Change it from Count to Sum and then change "Sum of Participations" to something else. You might want to use "Participations", but that's already used, so you have to pick something else.
  4. Choose Insert Slicer from the top and add slicers for Category (or class) and Title. Choose the content you want to view. What you currently have is a list of those who have participated.
  5. To get the list of those who have not participated, click on the student's name from the Rows field and choose Field Settings. Then click on Layout & Print. Check the "Show items with no data" and click OK.
  6. If you like, you can go to Data > Sort > A to Z to bring those who have not participated to the top.

Here's a video showing how it works.

Updating the Script

Tampermonkey should attempt to update this script automatically for you. In case you have turned off that functionality or would like to update sooner, you can click on the Tampermonkey icon and choose Check for userscritp updates.

The update process keeps your local settings and shows the changes to the code for you to decide whether to upgrade.

Canvancements

This script is a Canvancement -- a Canvas Enhancement. The links in the document point to an installable version of the code, but there is an Access Report Data project page as well that contains the source code as well as a version that you can use to anonymize the names like I did for the videos. Other projects, like the Roster Sorter that was mentioned here can be found on the Canvancement website as well.

203 Comments