We currently leverage the Canvas Analytics API to query page views as a measure for engagement.
The page views are typically lower than the per-user counts from the requests table, which makes logical sense. However, the magnitude of this difference bears investigation.
Driving Questions:
Some additional context from @rubyn :
I am looking at the page_views count. The results from Redshift don't match the Analytics API for the page_views value, nor what I see on the Course Analytics page in Canvas. Should I count something different, or add more filters to get the expected results?
Redshift query:
select u.canvas_id as "user id",
c.canvas_id as "course id",
count(r.id)
from requests r
inner join user_dim u on r.user_id = u.id
inner join course_dim c on r.course_id = c.id
inner join enrollment_fact ef on r.user_id = ef.user_id and r.course_id = ef.course_id
inner join enrollment_dim ed on ef.enrollment_id = ed.id
where c.canvas_id in ('565')
and ed.type = 'StudentEnrollment'
and ed.workflow_state <> 'deleted'
group by u.canvas_id, c.canvas_id
order by u.canvas_id
compared to Analytics API: https://umich.instructure.com//api/v1/courses/565/analytics/student_summaries
Thanks.
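To size the gap described above, the two sources can be lined up per user. The following is a minimal sketch, not an official method: it assumes the Redshift result has already been loaded into a dict keyed by Canvas user id, and that the student_summaries response has been parsed into a list of objects carrying "id" and "page_views" fields. The function name compare_page_views is hypothetical.

```python
from typing import Dict, List, Tuple


def compare_page_views(
    redshift_counts: Dict[int, int],
    api_summaries: List[dict],
) -> List[Tuple[int, int, int, int]]:
    """Return (user_id, redshift_count, api_count, difference) for every user.

    redshift_counts: request counts per Canvas user id (assumed already queried).
    api_summaries:   parsed JSON list from the student_summaries endpoint
                     (assumed to expose "id" and "page_views" per student).
    """
    api_counts = {int(s["id"]): int(s.get("page_views") or 0) for s in api_summaries}
    rows = []
    # Include users that appear in only one of the two sources.
    for user_id in sorted(set(redshift_counts) | set(api_counts)):
        from_requests = redshift_counts.get(user_id, 0)
        from_api = api_counts.get(user_id, 0)
        rows.append((user_id, from_requests, from_api, from_requests - from_api))
    return rows
```

A positive difference means the requests table shows more rows for that user than the API reports as page views.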
@jeff_longland , as I was reading this I made the assumption that the participants here would have found the resources in the User Group: Analytics Beta informative on that point, and this document specifically: Analytics Page Views and Participations . If that doesn't sufficiently document the Page View behavior, would you please post a comment under that lesson? Thanks!
Stephanie, let's be clear before I go on: I am not interested in any Beta stuff. I am interested in how Canvas is currently reporting Page Views, not future stuff. Your first link strongly seems to be about some future activity for "Analytics Beta." Your second link seems to go to the docs, but the breadcrumb at the top has "Beta" in it ("All Places > User Group: Analytics Beta > DocumentsActions"). Is this documenting what is currently happening in Canvas, or what will be happening?
Also, when you say "post a comment under that lesson," I have no clue what you mean. What is a "lesson"?
Hi, @richard-jerz , happy to address these:
Okay, Stefanie. You answered my question about User Group: Analytics Beta reference, that this is some current new stuff happening. Page View counts have been shown in my Canvas for some time. This is what I want to know about, so for now, I will ignore the User Group: Analytics Beta.
Yes, I now see that I can "comment" on a Canvas documentation web page. But commenting there appears not to generate dialog, as commenting here does. It appears to be one way, me just saying something. Well, I might do this when we are at that point.
You didn't address my question about the Analytics Page Views and Participations document that you provided to us. When I go to this web page, I see "All Places > User Group: Analytics Beta > DocumentsActions" in the breadcrumb at the top. I am trying to draw your attention to the word "Beta" on this web page. Is this Analytics Page Views and Participations document the documentation for Canvas at schools that have, or have not, opted into the Beta analytics?
Okay, I will stay tuned. This will give me some time to digest what the docs say and what Canvas Community folks, like Robert Carroll, James, and Stuart have said.
Stephanie,
While we wait for your “member of your data team to add some comments here” I thought that I would attempt to summarize the questions that require answers.
Question #1 – Exactly how is "Page Views" calculated?
This has been our question since the first post, and after lots of discussions, remains a mystery. The docs that you pointed us to (which we still don’t know if they apply to the current docs or beta docs) suggests using the Request table. However, the user community says this doesn’t work, and no one in the user community has yet been able to get the same "Page View" counts that Canvas shows to instructors. So, we have a mishmash of ideas and NO SOLUTION. It is imperative that we be able to reproduce these values to verify the Page View counts, and include details about “time” and “which page was viewed,” either by (SQL) querying the database or by running programming code. If the count, time, and what was viewed cannot be reproduced and verified, Page Views “analytics” become useless, meaningless, and “not analytics.”
Question #2 – Are Mobile App page views included?
Again, a mishmash of ideas between what the docs say (they are included) and what the user community says (they are not included.) If Mobile App page views are not included, Page View counts become flawed.
Question #3 – Exactly how is "Total Activity" time on the People web page calculated?
"Total Activity" time should be verifiable by SQL or program code. We seem to have an agreement that Activity Time is a bogus concept since computers don’t track activity time. However, even if it is a false concept, we should still be able to verify the Canvas reported Activity Time by reproducing these values with SQL or program code.
The critical point of all of this is that we do not want some general description, we need to be able to look at the data ourselves and reproduce the exact values that Canvas shows.
If someone else wants to correct or clarify what I have summarized, I am open to suggestions. Maybe I have missed something.
Hello @richard-jerz ,
I apologize for the lag in reply. I also appreciate your frustration regarding the lack of transparency around a few of the data points your team is trying to recreate. I will attempt to address the questions here, in separate replies to help make the discussion/questions easier to track. If there are still unanswered questions, please let me know and we will work to get them answered.
Question 1: How exactly is "Page Views" calculated?
The direct, technical answer as to how Page Views is calculated in the Access Report can be found by searching for "log_asset_access" in the instructure/canvas-lms github repository: https://github.com/instructure/canvas-lms/search?q=log_asset_access&unscoped_q=log_asset_access
We are using a combination of controller/action pair and route to categorize and log views and participations. Unfortunately, the code is fairly nested, and will take a bit of time for your team to comb through, though all of the logic for the Access Report can be found there.
One reason why your team may not be able to recreate the report exactly is that we are only tracking requests that were successful. For example, we only record a submission if that submission was saved (something that you wouldn't be able to see in the request logs). Additionally, it's important to know that while we cover a large number of requests, we do not track all of the routes. For example, we are not tracking the /courses/<course_id>/ping route since that is generated by the system to support tracking the 'Total Activity' metric.
The full details on what we do and don't track can be found in the above link search, though note that due to the constraint of the request needing to be successful, the request logs will not provide a 100% replication of the report. As such, the request tables should only be used to approximate the actual data in our reports. We understand that this is not the best-case scenario, and we are actively working on improving what we believe to be a critical data set for tracking student engagement and activity. Live Events work is part of this, as is the new Analytics 2.0 work currently in Beta.
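For teams approximating the count from request logs, the two exclusions described above can be sketched as a filter. This is only an illustration under stated assumptions: the row keys "url" and "http_status" are assumed column names, treating a 2xx status as "successful" is an assumption (the reply notes that real success, such as a saved submission, is not visible in the logs), and the function name is hypothetical. It does not reproduce the log_asset_access logic.

```python
import re
from typing import Iterable, Mapping
from urllib.parse import urlsplit

# System-generated route used for the 'Total Activity' metric; not a page view.
PING_ROUTE = re.compile(r"^/courses/\d+/ping/?$")


def approximate_page_views(requests: Iterable[Mapping[str, object]]) -> int:
    """Count request rows that plausibly correspond to tracked page views.

    Assumes each row has a "url" and an "http_status" value.
    """
    count = 0
    for row in requests:
        path = urlsplit(str(row["url"])).path
        if PING_ROUTE.match(path):
            continue  # skip the ping route, which is not tracked
        status = int(row.get("http_status") or 0)
        if not 200 <= status < 300:
            continue  # assumption: only 2xx responses count as successful
        count += 1
    return count
```

Because the tracked controller/action pairs are a subset of all routes, this count should be read as an upper-bound style approximation, not a replication of the report.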
I hope that this information provides some clarity. If your team would like help pulling the exact list of controller/action pairs and routes used for the Access Report, let me know and we will get that to you.
Thanks again for continuing the dialog and for asking the questions, and again, my apologies for the lag in reply. Please let me know if there is anything else we can do to help answer this question for you and your team.
I'll send a follow up with answers to the other two questions.
Kevin Turco
Director of Product - Data and Analytics
Canvas
Hi Kevin,
Thanks for your reply, and it is fine to tackle one question at a time, meaning Question #1. My analytics team might be able to make more sense of what you said than I.
At this point in time, I am not concerned with "submissions," only the Page View statistics. When you say "that you only track successful submissions" does this mean Page Views can also have both a "success" and "non-success" and that our records for Page Views will always be equal to or greater than those reported to the professor by Canvas? And then are you saying that there will be no way for us to know the difference between whether a reported Page View was or was not successful, meaning that there is no way to separate our results? And if you don't track "system" ping routes, to me, this seems correct. The Student Page View counts should only be what the student does and not what the system does.
At this point, I am not interested in "live" reports. I am interested in last semester courses.
So let's see if I understand what you are saying. If someone (let's say someone very official, like the government or a legal entity) says "I don't believe the Page View counts are correct, prove it to me, along with what was viewed," your response would be "We can't. We don't have any way to verify the Page View data." Or are you saying that you would never be able to verify Page Views because Canvas is only incrementing a "counter," and not keeping track of what was viewed?
My request is very simple. When Canvas says Bill Jones viewed 100 pages, I want to know what Bill Jones viewed.
What I will do is to try to connect our Data Analytics team with you and see where we end up.
Thanks again.
Question #2 – Are Mobile App page views included?
Mobile page view data was added to the Course Analytics reporting last year. It has not, however, been added to the Access Report. This may be why you see mixed messages on whether or not it is included. While our plan is to continue adding mobile data to our reports where appropriate, it is critical to understand the limitations of mobile page view data, and the reason for us to start with Course Analytics and not the Access Report.
Even more so than Canvas request data, mobile request data is best used in aggregate to identify trends of activity, and not for audit purposes or to get exact counts. Mobile page view data is inherently tricky to capture, since cell phones can lose service, apps can be force-quit, and phones can lose power and shut down mid-request. As such, mobile page view data might be delayed in getting to our servers, or even dropped. This is true for all mobile usage data, across all applications with mobile analytics data. It's a very tricky data set to work with. This is why we focused on including mobile data first in our aggregate reporting (Course Analytics), as this data is valuable in showing overall activity trends.
With the growing use of mobile devices, we understand the need to include mobile data wherever page view data is included. This is something we will continue to work towards. I appreciate and acknowledge that this is not an ideal answer to your immediate question; however, transparency is something I take seriously, so I want to make sure you have all the details.
Please let me know if you have any questions regarding this.
Kevin
Question #3 – Exactly how is "Total Activity" time on the People web page calculated?
Starting with the technical answer, Total Activity is based off of the field Interaction Seconds, which is defined in our API as: An approximation of how long the user spent on the page, in seconds.
The code for this can be found here: https://github.com/instructure/canvas-lms/blob/master/public/javascripts/page_views.js
Time spent data is inherently difficult to nail down in online programs due to the nature of web pages and how we interact with applications. Example: A user can click on a page, spend 2 minutes reading it, then walk away from their desk for 5 minutes, come back and shut down the computer manually without ever moving the mouse or touching the keyboard. In this scenario, we have no way of knowing that the user spent 2 minutes (and only 2 minutes) reading the page. Because of this uncertainty, we have taken a conservative approach and are only considering actual activity, or interactions with the page, in our calculations.
Here are the basics of how we calculate Total Activity:
- In every second in which there was an interaction with the page (mouse/keyboard movement), increment a counter
- Don't record open-ended idle time where the user is not active on the page. Idle time in this case is >10 seconds.
- If there were at least 30 un-saved (never made it to our server) interaction seconds when the page unloads, save them in the cookie
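The rules above can be illustrated with a small simulation. This is a sketch of one reading of the description, not the code in page_views.js: it assumes interaction events arrive as timestamps in seconds, counts each whole second that contains at least one event, and models the unload rule as a simple threshold. Under this reading, idle time is never counted, because seconds without events add nothing. Both function names are hypothetical.

```python
import math
from typing import Iterable


def interaction_seconds(event_times: Iterable[float]) -> int:
    """Count distinct whole seconds that contain at least one interaction.

    event_times: timestamps (in seconds) of mouse/keyboard events on the page.
    """
    return len({math.floor(t) for t in event_times})


def seconds_to_stash_on_unload(unsaved_seconds: int, minimum: int = 30) -> int:
    """Return how many unsaved seconds would be kept in the cookie at unload.

    Per the description, only batches of at least `minimum` unsaved
    interaction seconds are saved; smaller remainders are dropped.
    """
    return unsaved_seconds if unsaved_seconds >= minimum else 0
```

For example, events at 0.1, 0.9, 1.2, 15.0 and 15.5 seconds fall in three distinct seconds, so the counter would read 3 even though about 15 seconds of wall-clock time passed. This is one reason the value cannot be rebuilt from request timestamps alone.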
While it is not an exact representation of time spent, it does give an approximation of activity time based on what we can know for certain about the user's behavior. As others have mentioned, this data point can be improved upon. We have discussed ways of improving this metric to more closely capture actual time spent, and look forward to some day adding that to our system.
All that said, since we are incrementing a counter, and have conditions on when we store/record the data, this metric cannot be exactly reproduced via the request logs. Your team may be able to approximate time spent from the request log activity; however, you should not expect exact replication. I know this isn't the expectation you have; however, due to the way we are tracking this data point (relying on a counter, cookies in the browser, and sending data back to our servers in chunks), the data will be different.
I hope this helps with the issue of transparency you expressed earlier, and conveys an understanding that we value these metrics as an important part of understanding learning behaviors and will continue to evolve our metrics to further enhance our analytics and reporting offerings in the future.
Thanks again for taking the time to read through these comments, and to actively engage in the Community. Let us know if you have any further questions.
Best,
Kevin
Hi everyone,
I've been working on making the requests data more usable by deriving ids for all course elements rather than just the four provided in the raw requests data. I thought I would share some findings that have the potential to impact obtaining valid view counts.
The concept is to derive the ids from the url values and populate a cut-down fact table. If this is done on incremental data, it produces a table which will perform reasonably well. Attempting to do this with the full historical requests data on demand is not feasible. Deriving the ids can be a convoluted process.
For example to derive the wiki_page_id for a wiki page view, we decode the url of the form:
/courses/<course_canvas_id>/pages/<wiki_page_title>?module_item_id=<module_item_canvas_id>
We can use the module_item_canvas_id to join to the module item dimension where we can see that the content_type is WikiPage and the wiki_page_id and title are available. The title in the requests url is not valid because the spaces are replaced by dashes or %20. Sometimes the ?module_item_id= is missing from the url, so the only way to get to the wiki_page_id would be to match the title in the url with the title in the module item dimension.
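The decoding step described above might look something like the following. It is a minimal sketch under assumptions: the url is the path-plus-query form shown in this post, and the helper names parse_wiki_page_url and normalize_title are hypothetical. The title normalization is lossy, since a title that really contains a dash cannot be told apart from one where a dash replaced a space, so title matches should be treated as candidates only.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, unquote, urlsplit

WIKI_PATH = re.compile(r"^/courses/(?P<course_id>\d+)/pages/(?P<slug>[^/]+)/?$")


def parse_wiki_page_url(url: str) -> Optional[dict]:
    """Extract the course id, page slug and (if present) module item id."""
    parts = urlsplit(url)
    match = WIKI_PATH.match(parts.path)
    if match is None:
        return None  # not a wiki page view
    module_item = parse_qs(parts.query).get("module_item_id", [None])[0]
    return {
        "course_canvas_id": int(match.group("course_id")),
        "slug": unquote(match.group("slug")),
        # None signals the fallback: match on title instead of module item.
        "module_item_canvas_id": (
            int(module_item) if module_item and module_item.isdigit() else None
        ),
    }


def normalize_title(text: str) -> str:
    """Make a url slug and a stored title comparable (dashes and %20 become spaces)."""
    return " ".join(unquote(text).replace("-", " ").lower().split())
```

When module_item_canvas_id comes back as None, comparing normalize_title(slug) with normalize_title(title) from the module item dimension gives the title-based fallback.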
During this process, I discovered that the ids that are provided in the raw requests data are missing fairly frequently.
For example, the assignment_id is populated when the url is of the form:
/courses/<course_canvas_id>/assignments/<assignment_canvas_id>
However, when the url looks like this it isn't and needs to be derived from the assignment dimension using the assignment_canvas_id:
/courses/<course_canvas_id>/assignments/<assignment_canvas_id>?<module_item_canvas_id>
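One way to recover the assignment id in both cases is to parse only the path and ignore whatever follows the question mark. The sketch below assumes the url forms shown above; the function name assignment_ids_from_url is hypothetical, and the returned Canvas id would still need to be joined to the assignment dimension to obtain the warehouse id.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Matches only the assignment page itself, not sub-routes such as submissions.
ASSIGNMENT_PATH = re.compile(r"^/courses/(\d+)/assignments/(\d+)/?$")


def assignment_ids_from_url(url: str) -> Optional[Tuple[int, int]]:
    """Return (course_canvas_id, assignment_canvas_id), or None if not an assignment view."""
    match = ASSIGNMENT_PATH.match(urlsplit(url).path)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Because urlsplit separates the query string from the path, the presence or absence of a module item parameter no longer affects the result.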
I've also noticed examples where web_application_controller and web_application_action columns are \N when the rest of the record appears to indicate valid user activity.
Then there's group records...
If this is of any interest, I'll update as I make further progress.
Regards,
Stuart.
Hi Kevin,
I appreciate your detailed posts and explanations in response to these three questions; however, I am not sure that they directly answer the real questions, or perhaps the real questions were not communicated well. So, I want to verify that I have understood your replies correctly and that everyone understands the questions and answers. Follow-up posts are welcome, such as Stuart’s, but I don’t see his post changing my summary below; if anything, it reinforces it.
As an instructor, this “graphic” shows the statistics “Total Activity” and “Page Views” for one student in one course.
[The graphic that previously displayed here has been removed at the request of the school]
The three questions, simplified are:
1. Are you able to provide my school’s course analytics team the exact step-by-step method that they should use to reproduce and verify the exact “Page View” counts (shown in my graphic), and which pages were viewed?
(Yes or No?)
From your reply, I see the answer as “No.” Correct me if I misunderstood your reply.
2. Does the “Page View” statistic, as shown in the graphic, include all of the student’s mobile data?
(Yes or No?)
Your answer is “No.” Correct me if I misunderstood your reply.
3. Are you able to provide my school’s analytics team the exact method that they should use to reproduce and verify the exact “Total Activity” time that Canvas reports?
(Yes or No?)
Your answer appears to be a very solid “No.”
I don’t see described step-by-step methods to reproduce these statistics from what you have told us. The computer code is undocumented, “nested,” messy, and logic (like a flowchart) is missing. If you believe that your answers to any of these questions are a “yes,” then I need you to provide us the exact method that any school can follow to reproduce the exact data that Canvas is showing us.
This discussion post is outdated and has been archived. Please use the Community question forums and official documentation for the most current and accurate information.