Skip to main content
cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

Canvas Data 1 requests and Canvas Data 2 web_logs Comparison

Canvas Data 1 requests and Canvas Data 2 web_logs Comparison

This document provides a brief overview of the differences between and how to convert data from the Canvas Data 1 (CD1) requests table to the Canvas Data 2 (CD2) web_logs table.

Note: We recommend backing up your data before reformatting.

Format Changes

The following columns, available in the CD1 requests table, are not available in CD2 web_logs:

  • timestamp_year
  • timestamp_month
  • timestamp_day
  • root_account_id
  • course_account_id
  • user_agent

The CD2 web_logs table requires a new boolean column: participated.

The following requests columns can be converted using PostgreSQL type casting to use in web_logs:

  • http_status::integer (instead of varchar)
  • user_agent_id::integer (instead of bigint)
  • remote_ip::INET (instead of varchar)
  • web_application_context_id::bigint (instead of varchar)
  • session_id::UUID (instead of varchar)

The following requests columns are now enum types in web_logs:

  • web_application_controller
  • http_version
  • http_method
  • web_application_action
  • web_application_context_type

Removed Columns

timestamp_*

timestamp_* column values used in the requests table can be retrieved in web_logs using the timestamp column.

Here’s an example how to retrieve the values of timestamp_day:

SELECT concat(EXTRACT(YEAR from timestamp), '-', EXTRACT(MONTH from timestamp), '-', EXTRACT(DAY from timestamp)) FROM canvas_logs.web_logs_new;

root_account_id

Because the CD2 API returns data for a specific root account, the root_account_id is not included in the CD2 web_logs table. 

course_account_id

course_account_id can be retrieved by joining the canvas courses table using the course_id column:

SELECT canvas.courses.account_id FROM canvas_logs.web_logs, canvas.courses WHERE canvas_logs.web_logs.course_id=courses.id;

user_agent

user_agent data can be retrieved in the CD2 user_agents table. To retrieve the new user agents, you can join the table using user_agent_id. CD1 user_agent_id records are not migrated to the web_logs table. 

Handling schema changes

Participated column

The participated column has no counterpart in CD1.

http_version column

In the CD2 schema, the http_version column has no HTTP/ prefix. You can convert to the new format using the following PostgreSQL function:

regexp_replace(http_version, '(HTTP/)(\d+.\d+)', '\2', 'i')

Handling enums

If you'd like to convert columns that were varchars in CD1, but enums in CD2, you can locate the valid values in the CD2 schema definition. Other enum values found in your historical data can also be added to the enum type.

Globalized IDs

The following CD1 requests globalized IDs are localized in CD2 web_logs:

  • user_id
  • real_user_id
  • course_id
  • quiz_id
  • discussion_id
  • conversation_id
  • assignment_id

Convert CD1 values to their CD2 counterpart using this formula:

MOD(globalized_x_id,10000000000000) as local_x_id;

user_agent_id column

To retrieve new IDs, join the CD1 user_agent column values with the CD2 user_agents table.

Was this article helpful? Yes No
Have a question about Canvas? Ask in the Q&A forum:
Embed this guide in your Canvas course:

Note: You can only embed guides in Canvas courses. Embedding on other sites is not supported.