Submissions Incremental Query job run time
I am developing an AWS process that uses a Step Function to coordinate Lambda calls in order to get data from the DAP client for our data lake. However, I am running into an issue with the Submissions table: an Incremental query covering about 10 days of data takes almost an hour, while the Snapshot query completes in around 7 minutes.
Is there anything I can do to speed up the incremental call, or should I kick off the job, capture the ID and check the status until it is ready for download?
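For reference, the "kick off the job, capture the ID and check the status" option could look roughly like the sketch below: one short Lambda invocation per status check, with the Step Function's Wait and Choice states doing the looping so no single Lambda has to stay alive for the whole query. The `get_status` callable, the event keys and the status strings are all hypothetical placeholders, not the DAP client's actual API.

```python
from typing import Callable, Dict

# Assumed status strings, for illustration only; check what the DAP API really returns.
STATUS_COMPLETE = "complete"
STATUS_FAILED = "failed"


def check_job(event: Dict, get_status: Callable[[str], str]) -> Dict:
    """Do a single status check and report back to the Step Function.

    `event` is expected to carry the job ID captured when the query was started.
    `get_status` is a hypothetical wrapper around whatever call returns the
    job's current status for a given job ID.
    """
    job_id = event["job_id"]
    attempts = event.get("attempts", 0) + 1
    status = get_status(job_id)

    if status == STATUS_FAILED:
        # Let the Step Function's Catch/Retry configuration decide what to do.
        raise RuntimeError(f"query job {job_id} failed")

    # A Choice state can branch on `ready`: download when True, Wait and re-check when False.
    return {"job_id": job_id, "attempts": attempts, "ready": status == STATUS_COMPLETE}
```

A Choice state could also compare `attempts` against a limit so that a stuck job does not loop forever.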
We use a similar-sounding process to maintain our CD2 database (AWS Step Function; Lambdas) and once in a great while incremental queries can take a long time to process (and cause a failure on our end due to Lambda timeouts). We're syncing every 3-4 hours so these timeouts are very infrequent - maybe 2-3 times in the couple of years that we've been using this process. I've found that keeping the increments short reduces the likelihood of a timeout, but as far as I know there's no other way to speed up the incremental calls.
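As a rough illustration of keeping the increments short, a long catch-up range (such as the 10 days in the question) can be cut into smaller windows that are queried one after another. This is only a sketch: `split_window` is a made-up helper, and it assumes the incremental query accepts both a lower and an upper timestamp bound.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple


def split_window(
    since: datetime, until: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (start, end) pairs covering [since, until) in chunks of `step`.

    Hypothetical helper: each pair would become the bounds of one incremental query.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    start = since
    while start < until:
        end = min(start + step, until)  # the last window may be shorter than `step`
        yield start, end
        start = end


# Example: split a 10-day backlog into 4-hour increments.
since = datetime(2024, 1, 1, tzinfo=timezone.utc)
until = since + timedelta(days=10)
windows = list(split_window(since, until, timedelta(hours=4)))
```

Each window's end becomes the next window's start, so no gap is left between consecutive queries.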
When this happens, we typically use the dap CLI to manually sync that table (running on an EC2 instance, where the Lambda timeout isn't an issue). Once the table is synced again, our regular Step Function/Lambda process can resume.
BTW -- a slightly out-of-date version of the code/infra that we use to maintain our CD2 database is here: https://github.com/Harvard-University-iCommons/canvas-data-2-aws
--Colin