The Instructure Community will enter a read-only state on November 22, 2025 as we prepare to migrate to our new Community platform in early December.
Read our blog post for more info about this change.
Found this content helpful? Log in or sign up to leave a like!
Hi --
I'm having trouble getting dap syncdb to work in a reliable/predictable manner w/r/t the application of upstream table schema changes to my local database. For some reason it seems to be trying to execute ALTER TABLE statements every time, even though the table schemas are up to date as far as I can tell.
This is especially problematic for my setup because i have a number of views that are based on the CD2 tables; since postgres does not allow table alteration if there are dependent views, I have to drop them before altering the table and restore them afterwards.
For example, I'm trying to use dap syncdb to synchronize the assignments table. According to the instructure_dap.table_sync table, I have version 2 of the assignments table schema. This matches the version that I get from the dap schema command. I assume that this means that my local table's schema is up to date, and no ALTER TABLE statements are necessary. However, when I try to dap syncdb this table, it's trying to execute this:
ALTER TABLE "canvas"."assignments"
ALTER COLUMN "submission_types" SET DATA TYPE "canvas"."assignments__submission_types" ARRAY;
But this alteration is unnecessary -- the assignment.submission_types column already has the above data type.
How do I get the dap library to stop trying to execute these unnecessary ALTER TABLE statements? The documentation is thin on how schema synchronization works, but it sounds to me like it should only be altering if there is a mismatch between the source and the target. Am I misunderstanding something here?
Thanks --
Colin
Solved! Go to Solution.
The phantom change issue you reported has been fixed on the dev branch of the open-source library pysqlsync. Until this is included in the official DAP client library distribution, you can pull the latest version and install it locally with
pip3 install --upgrade git+https://github.com/hunyadi/pysqlsync.git@dev
DAP client library delegates all schema and data synchronization work to an open-source library called pysqlsync. If you open a GitHub issue, you can see the progress directly. Otherwise, I can open an issue for you. Specifically, I have also seen these phantom updates related to arrays of enumeration types. They seem to stem from PostgreSQL creating two internal types for arrays of enumeration types: one with the enumeration type name, and one with an underscore as a prefix. One denotes the flat type, the other denotes the array type. From what I recall, the target type is paired up with the wrong PostgreSQL database type, and this causes the phantom updates. While the updates do nothing (they aim to mutate the existing type to the same type), I agree they can cause trouble with views and other derived objects.
The phantom change issue you reported has been fixed on the dev branch of the open-source library pysqlsync. Until this is included in the official DAP client library distribution, you can pull the latest version and install it locally with
pip3 install --upgrade git+https://github.com/hunyadi/pysqlsync.git@dev
Fantastic -- thank you!
After upgrading to 1.0, 59 of the tables are failing with what I think is the same error (one example is below).
error executing query:\nALTER TABLE \"canvas\".\"assignment_groups\"\nALTER COLUMN \"id\" DROP DEFAULT,\nALTER COLUMN \"rules\" SET DATA TYPE jsonb;\nCOMMENT ON COLUMN \"canvas\".\"assignment_groups\".\"migration_id\" IS 'The unique identifier of the migration that imported this assignment group.';\nCOMMENT ON COLUMN \"canvas\".\"assignment_groups\".\"rules\" IS 'Type extracted from column `assignment_groups.rules`
We also have custom materialized views and so can't drop conveniently. Installing the dev branch is not convenient because we deploy via AWS SAM and the SAM builder seems to have some bugs in parsing requirements.txt files that point directly to GitHub branches (same requirements.txt file works fine in a local virtualenv but gives "Unable to retrieve name/version for package: pysqlsync" when doing sam build). So I guess I'll just hope that dev gets merged soon or roll back to 0.3.8.
ETA: Rolling back to 0.3.8 doesn't work because the metadata schema seems to have changed? I get the below error when I try to sync assignments_group with 0.3.8.
relation "canvas.dap_meta" does not exist\n[SQL: SELECT canvas.dap_meta.id, canvas.dap_meta.namespace, canvas.dap_meta.source_table, canvas.dap_meta.timestamp, canvas.dap_meta.schema_version, canvas.dap_meta.target_schema, canvas.dap_meta.target_table, canvas.dap_meta.schema_description_format, canvas.dap_meta.schema_description \nFROM canvas.dap_meta \nWHERE canvas.dap_meta.namespace = $1::VARCHAR AND canvas.dap_meta.source_table = $2::VARCHAR]\n[parameters: (\'canvas\', \'assignment_groups\')]
Hey @jwals --
I'm also using AWS SAM and I'm able to get it to work with a requirements file like:
aws-lambda-powertools==2.34.2
# temporary until new version is released:
pysqlsync @ https://github.com/hunyadi/pysqlsync/archive/refs/heads/dev.zip
instructure-dap-client[postgresql]==1.0.0
Also, I will mention this pgsql code that I have found to be immensely helpful in dealing with dependent views:
https://github.com/rvkulikov/pg-deps-management
In my workflow I try to synchronize each table first, and if the sync step fails with a QueryException that contains the string "ALTER TABLE", I use the pgsql functions from the repo above to drop the dependent views, re-run the sync, and restore the dependent views. I've only just started using it in my dev env, but so far so good.
--Colin
Do you have table called "dap_meta_backup"? We just tried installing v1 over v0 in our sandbox. It renamed the existing table to that.
Although it didn't create a new "dap_meta" table, so we can't get sync working...
I do have the "dap_meta_backup" table in my database -- you're correct about it having been renamed by the new version of the library. Two new tables, "database_version" and table_sync", should have been created to replace it -- in postgres, they are in a separate schema called "instructure_dap". (I gather that mysql does things differently and IIRC the tables are in the main schema but have "instructure_dap_" prefixed to their names.)
What error do you get when you try to sync for the first time with the new client?
Hey @ColinMurtaugh, unfortunately even with your requirements.txt verbatim I'm still getting "Error: PythonPipBuilder:ResolveDependencies - Unable to retrieve name/version for package: pysqlsync-dev." I'm still on SAM CLI 1.103 and I think the newest is 110 so maybe upgrading there will help. The pg-deps-management package does indeed look helpful, thanks for the tip.
My syncs have been failing since 1.0.0. However, they are logging no errors, even when using "debug" loglevel. I was going to open a case with Instructure, but if this is the issue (yes, I'm using PG database), I'll wait for a production update as the dropdb > initdb is working fine, even though it takes a LOT longer.
Community helpTo interact with Panda Bot, our automated chatbot, you need to sign up or log in:
Sign inTo interact with Panda Bot, our automated chatbot, you need to sign up or log in:
Sign in