The issue: we currently pull reports from api/v1/reports several times a day, and the person responsible for those jobs has complained to me on multiple occasions that they take a long time and often get stuck or fail because of the size of the reports.
It looks like all of the data in those reports is available via Canvas Data, so I thought I would write the needed queries and move to updating multiple times a day (I currently sync once per day). However, my workflow triggers a number of Kubernetes pods (1 per table plus the job that fetches the table list, so 91), which seems wasteful if the data hasn't been updated yet.
Is there a queryable data source, accessible through either the DAP CLI or the API, that is similar to the meta table syncdb uses? Basically, I'm looking for a way to get a timestamp for the last completed update from our production instance that I can use as a trigger for a workflow.
I had considered implementing a manual task that attempts an incremental snapshot on one of the bigger tables, starting from the last retrieved timestamp, and then triggering the entire workflow if it returned more than 0 records, but I'm not sure whether all the tables are updated at the same time.
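A minimal sketch of that probe idea, for illustration only. It assumes a hypothetical `count_changes(table, since)` callable that you would implement yourself on top of an incremental query made with the DAP CLI or API; nothing here is part of the DAP client. Because the tables may not all update together, it accepts several probe tables rather than one. The table names in the demo are placeholders.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable

# Hypothetical signature: returns how many records in `table` changed after `since`.
# In practice this would wrap an incremental query against Canvas Data 2.
ChangeCounter = Callable[[str, datetime], int]


def should_trigger_sync(
    probe_tables: Iterable[str],
    last_synced: datetime,
    count_changes: ChangeCounter,
) -> bool:
    """Return True as soon as any probe table reports at least one new record."""
    return any(count_changes(table, last_synced) > 0 for table in probe_tables)


if __name__ == "__main__":
    # Stand-in for real incremental queries: fixed counts per table.
    fake_counts = {"submissions": 0, "enrollments": 12}

    def fake_counter(table: str, since: datetime) -> int:
        return fake_counts[table]

    last = datetime(2025, 1, 1, tzinfo=timezone.utc)
    if should_trigger_sync(["submissions", "enrollments"], last, fake_counter):
        print("new data found, start the full workflow")
    else:
        print("nothing new, skip this run")
```

Only the probe job runs on every schedule tick; the per-table pods start only when the function returns True.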
I think this thread may be useful: Canvas Data 2 - Incremental "until" to maintain referential integrity
Since it's streamed for 'eventual consistency', you're not going to be able to infer that an update to one table means another has also been updated (i.e. it's not a read replica). The one exception is that data at least 4 hours old should be in the tables.
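One way to use that 4-hour rule is to pick a single cutoff timestamp and apply it as the "until" bound for every table in a run, so all tables are read up to the same point. The helper below only computes that timestamp; the function name and the default lag are assumptions for illustration, and how the bound is passed to a query is left to whichever DAP tool you use.

```python
from datetime import datetime, timedelta, timezone


def safe_until(now: datetime, lag: timedelta = timedelta(hours=4)) -> datetime:
    """Return a cutoff far enough in the past that the data should have arrived."""
    return now - lag


if __name__ == "__main__":
    cutoff = safe_until(datetime.now(timezone.utc))
    # Use the same cutoff for every table queried in this run.
    print(cutoff.isoformat())
```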
Great question!
The data behind CD2 is updated every 4 hours if there is any data to add. Since every Canvas instance is used differently, and features are used differently, there is no easy one-size-fits-all solution to offer.
What I would do is run incremental checks only on the tables that matter for my data needs.
I would also look at the history to see which tables return new data every 4 hours and which do not. That way I would know which tables to check frequently (every 4 hours) and which need only one incremental update check per day.
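As a rough sketch of that historical check, the function below assigns each table a cadence based on how often past 4-hourly incremental checks returned any records. The input shape, the 0.5 threshold, and the cadence labels are all assumptions made for illustration; you would record the counts yourself from your own sync runs.

```python
from typing import Dict, List


def classify_tables(
    history: Dict[str, List[int]],
    threshold: float = 0.5,
) -> Dict[str, str]:
    """Map each table to a check cadence.

    `history` maps a table name to the record counts returned by its past
    4-hourly incremental checks, one entry per check.
    """
    schedule: Dict[str, str] = {}
    for table, counts in history.items():
        if not counts:
            # No evidence yet: default to the cheaper daily cadence.
            schedule[table] = "daily"
            continue
        # Fraction of past checks that returned at least one record.
        hit_rate = sum(1 for c in counts if c > 0) / len(counts)
        schedule[table] = "every_4_hours" if hit_rate >= threshold else "daily"
    return schedule


if __name__ == "__main__":
    # Placeholder history; real values would come from your own logs.
    observed = {
        "submissions": [120, 80, 0, 45, 300, 10],
        "accounts": [0, 0, 0, 1, 0, 0],
    }
    for table, cadence in classify_tables(observed).items():
        print(f"{table}: {cadence}")
```

Tables that rarely change then drop to one check per day, which reduces the number of pods started on each 4-hour run.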