[2020-12-21 Update: For the latest updates, please visit the Canvas Roadmap]
[2020-09-18 Update: Canvas Data Early Access information can be found in the Canvas Data 2 Early Access information page]
Canvas Data is one of the Instructure data products built to provide Canvas customers with LMS transactional and web server log data.
We are excited to introduce the next generation of the Canvas Data product. It reflects years of continuous customer feedback and data research, and includes a number of cutting-edge data technologies and a rich selection of LMS ecosystem datasets.
The mission of Canvas Data 2 is to enable Canvas customers to easily find, filter, and understand the variety of Canvas data in a timely manner.
In Canvas Data 2, data is organized into datasets, which are more granular than the Canvas Data star schema.
* features not included in the initial beta release but will most likely be rolled out approximately 6 months afterward
| Features | Canvas Data | Canvas Data 2 |
| --- | --- | --- |
| Latency | 24–48 hours | ≤ 4 hours |
| Table snapshot | ✓ | ✓ |
| Table deltas/updates | | ✓ |
| CLI | ✓ | ✓ |
| API | ✓ | ✓ |
| UI downloads | ✓ | |
| Schema available in API Documentation page | | ✓ |
| Star Schema | ✓ | |
| Beta Schema | | ✓ |
| Schema versioning | ✓ | ✓ |
| Canvas LMS data | 65 dimensions | 89 unique datasets |
| Weblogs aka requests | ✓ | ✓ |
| Catalog data | ✓ | ✓ * |
| New Quizzes | | ✓ * |
| Outcomes data | | ✓ * |
| File format | tsv | json, csv, parquet * |
In Canvas Data 2, the following behaviors can be expected:
Canvas LMS supports the Canvas Data 2 authentication and authorization (authn & authz) mechanism, which means customers can use Canvas access tokens to access the Canvas Data 2 API and command line interface (CLI).
While we plan to support both the API and the CLI, we strongly recommend the CLI. It allows customers to quickly and efficiently filter data at the sub-command level before downloading it, which helps avoid complex API logic.
The unique datasets in Canvas Data 2 will answer the majority of the needs our customers voiced during the community survey conducted by our product management team. Here are some of them:
Canvas Data 2 documentation will be hosted in the Instructure API Documentation page: https://canvas.instructure.com/doc/api/.
The Canvas Data 2 public schema will be versioned; any updates (additions and deletions) will create a new version. Customers will also be able to view the beta version of Canvas Data 2 schema, which allows customers to view new changes prior to the changes being released to Canvas production. This behavior is not to be confused with accessing Canvas Data 2 directly in the beta environment, which will not be supported.
Canvas Data 2 will continue to offer access to the weblogs dataset, with a latency of ≤ 4 hours. Granular data filtering (e.g., by request_id, request timestamp, or user_id) prior to download has been identified as a highly desired feature and is currently undergoing additional research.
Canvas Data 2 will provide access to all changes occurring on a specific dataset within a default or custom timeline. A user will be able to provide the starting point date and time as a custom parameter. The Updates file will contain a log of transactions, each containing metadata.orderId and metadata.status. The metadata.orderId is a lexicographically sortable ULID that represents the order of change in the source database. A user could use a record's metadata.orderId to request all changes that happened since that record was updated. Updates files will only be available in Canvas Data 2 for 60 calendar days.
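To illustrate why a lexicographically sortable metadata.orderId is useful, here is a minimal Python sketch that orders transactions and selects everything newer than a known record using plain string comparison. The line-delimited JSON layout, the `id` field, the sample ULIDs, and the status values are assumptions made for illustration only; the actual Updates file format may differ.

```python
import json

# Hypothetical Updates-file content (one JSON object per line).
# The "id" field, the ULIDs, and the status values are illustrative assumptions.
UPDATES_JSONL = """\
{"id": 7, "metadata": {"orderId": "01EJ0000000000000000000003", "status": "updated"}}
{"id": 5, "metadata": {"orderId": "01EJ0000000000000000000001", "status": "inserted"}}
{"id": 5, "metadata": {"orderId": "01EJ0000000000000000000004", "status": "deleted"}}
{"id": 9, "metadata": {"orderId": "01EJ0000000000000000000002", "status": "inserted"}}
"""


def load_updates(text):
    """Parse line-delimited JSON into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def changes_since(records, order_id):
    """Return records whose orderId sorts strictly after order_id, oldest first.

    ULIDs of the same length and letter case sort chronologically as plain
    strings, so no decoding is needed.
    """
    newer = [r for r in records if r["metadata"]["orderId"] > order_id]
    return sorted(newer, key=lambda r: r["metadata"]["orderId"])


records = load_updates(UPDATES_JSONL)
later = changes_since(records, "01EJ0000000000000000000002")
for r in later:
    print(r["metadata"]["orderId"], r["id"], r["metadata"]["status"])
```

Because the comparison is done on strings, a consumer would only need to store the last orderId it processed in order to resume from that point on the next run.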
Canvas Data 2 introduces a new data schema that is closely aligned with Canvas API schema. The following additional schema details will be introduced in the new product:
Some data fields, such as student quiz responses, are stored as YAML in the source database. These fields will be released as JSON-formatted fields.
Canvas Data 2 Release Timelines
September 2020: Canvas Data 2 early preview [access to sandbox data]
Users in all regions can use Canvas Data 2 tools [CLI only] to learn the new schema and CLI commands. Access to customer-specific data will not be available.
Q3/Q4 2020: Canvas Data 2 public beta [access to production data]
Users in all regions can use Canvas Data 2 tools (API and CLI) to explore their own data.
Specific dates for both the early preview and the public beta will be provided when available.
As soon as Canvas Data 2 is released to beta and our team reviews feedback, we will announce the deprecation of Canvas Data. We are allocating six months for our customers to migrate from Canvas Data to Canvas Data 2 prior to us turning the old solution off. An official announcement will be made in advance to inform customers of the deprecation timeline.
Note: Canvas Data services hosted through Instructure Professional Services will be updated prior to Canvas Data end of life.
We anticipate the majority of our Canvas Data customers will create a plan for migrating Canvas Data ahead of time, which will depend on the complexity of the customer’s custom data warehouse and analytics implementation.
Customer attention will be required, as the two versions of Canvas Data include the following major differences:
Customer migrations could include the following options:
The following documentation will be made available to customers to assist with migration:
We know more questions may exist about the Canvas Data 2 migration. Questions may be asked using the Comments section of this blog post.
To request the schema for Canvas Data 2 and provide feedback about potential migration needs, please reach out to us via the Canvas Data 2 Request Form.