I recommend that the planned change to remove the user_agent_id column from web_logs be postponed. Instead, just provide null contents for the column.
Changing a table's structure affects those of us who maintain ongoing replicas of CD2 tables and breaks the attendant processing, especially if .csv format is used. Using parquet or json format reduces some of the immediate trauma; however, at some point, the reconstituted version of the table needs to be dropped and recreated.
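To illustrate why the file format matters here, below is a minimal sketch (not anyone's actual pipeline) of reading .csv rows by header name rather than by position, so that a file missing user_agent_id still loads, with the column filled as null. Only user_agent_id comes from this thread; the other column names and the sample rows are hypothetical.

```python
import csv
import io

# Hypothetical column list for illustration; only user_agent_id is from the thread.
EXPECTED_COLUMNS = ["id", "timestamp", "user_agent_id", "url"]

def read_rows(csv_text, expected=EXPECTED_COLUMNS):
    """Read CSV rows by header name; columns absent from the file become None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{col: row.get(col) for col in expected} for row in reader]

# Hypothetical files written before and after the column was dropped.
old_file = "id,timestamp,user_agent_id,url\n1,2024-10-01T00:00:00Z,42,/courses\n"
new_file = "id,timestamp,url\n2,2024-11-20T00:00:00Z,/courses\n"

rows = read_rows(old_file) + read_rows(new_file)
```

A loader that maps fields by position would instead shift every value after the dropped column, which is the kind of breakage described above.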
More importantly, the web_logs table is a special case. Some of us had been using the requests table for time-series analysis and have converted to using web_logs by adding converted requests table records to the 30-day web_logs snapshot. Approximately twenty steps are needed in my environment to add the additional columns and recreate the history. The 12/16/2023 change will probably need as many. Just remember that after the new snapshot is taken, we need to drop and recreate the incremental tables and change the procedures that apply the incremental records to the original snapshot.
Note that the elimination of the user_agents table would not be a significant issue.
It looks like the user_agent_id column was finally dropped in Nov 2024 instead of on 2023-12-16 as indicated in the release notes. This has caused a significant amount of turmoil for our data processing procedures, which has taken us weeks to untangle, and we're still not done. It would have been a non-issue if a null value had been provided instead, as @sor1 suggested. Has anyone else been negatively impacted by this change?
Hey, thanks for reporting your issue. I would like to improve our data schema rollout process so can you please help me understand some things?
We hadn't been sending incremental data for the user_agents table for a while. Why didn't you stop syncing it when the announcement was made?
The user_agents table was simply dropped, so no change was made to a table schema, such as adding or removing a column, which could be challenging in a huge database. What is the turmoil around that? Can you please describe it in more detail so I can understand it better?
@sgergely I'm referring to the removal of the web_logs.user_agent_id column referenced in the release notes I linked above, not the user_agents table. We store web_logs data in S3 so we had to invest a significant amount of time to identify when the schema change occurred (since it was scheduled for Dec 2023 and then arrived unexpectedly) and adjust our data files. We couldn't simply re-initialize the table without losing our web_logs history due to the 30-day retention policy.
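For anyone facing the same cleanup, here is a rough sketch of one way to locate the point where the schema changed: walk the stored files in date order and report the first one whose header no longer contains the column. It assumes .csv files with a header row; the file names and contents are hypothetical, and fetching the files from S3 is left out.

```python
import csv
import io

def first_file_missing_column(files, column):
    """Return the name of the earliest file whose header lacks `column`, or None.

    `files` is an iterable of (name, csv_text) pairs already sorted by date.
    """
    for name, text in files:
        header = next(csv.reader(io.StringIO(text)), [])
        if column not in header:
            return name
    return None

# Hypothetical file names and contents, for illustration only.
stored_files = [
    ("web_logs_2024-11-01.csv", "id,user_agent_id,url\n1,42,/courses\n"),
    ("web_logs_2024-11-02.csv", "id,url\n2,/courses\n"),
    ("web_logs_2024-11-03.csv", "id,url\n3,/courses\n"),
]

changed_at = first_file_missing_column(stored_files, "user_agent_id")
```

Files from that point on could then be rewritten with a null user_agent_id, or the earlier files trimmed, depending on which schema is kept.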
I just wanted to share some experience and thoughts. Note that our environment is Snowflake:
Our processing uses json (parquet would be just as good) to partially insulate us from column changes. We have also maintained the web_logs records since CD1 (we needed to convert Requests to web_logs), which gives us history since 2020. We are also considering maintaining a history of incremental update records of other tables to permit us to approximate the state of Canvas Data at times in the past. For processing performance, we refresh the table snapshots every term.