Setting up DAP Synchronization in AWS, Fargate or Lambda?

burkepk
Community Member

Hello all, our team is new to developing solutions to pull data from Canvas. I have been lurking in the forums for a bit, trying to gather as much information as I can about the tools we would need to be successful. The one part I am still wondering about is where to run the Python functions that handle the initialization and synchronization calls. We want to stay away from EC2 because of the administrative overhead involved, but we are also apprehensive about the 15-minute time limit on Lambda. Has anyone containerized their code to run on Fargate? Or is the 15-minute limit in Lambda enough for processing the synchronizations, so that we would only need a one-off process for the initializations?

1 Solution
ColinMurtaugh
Community Champion

Hi --

We've had success running our CD2 init/sync code in Lambda and orchestrating the process using Step Functions. Currently we're syncing everything in the canvas schema every three hours, and the process has been running for a couple of months without problems. A few of the tables (fewer than 5, IIRC) are large enough that the init step took longer than 15 minutes -- for those we just ran a one-off init outside of Lambda, and subsequent syncs have been fine.
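To make the per-table Lambda pattern concrete, here is a minimal sketch. It is not taken from the linked repository. It assumes Step Functions invokes the function once per table with an event like `{"table": "accounts"}`. The `sync_table` coroutine is a hypothetical placeholder for the real DAP synchronize call, and the 30-second safety margin is an arbitrary assumption. The only AWS-specific call used is `context.get_remaining_time_in_millis()`, which the Lambda Python runtime provides on the context object.

```python
import asyncio

NAMESPACE = "canvas"
SAFETY_MARGIN_SECONDS = 30  # assumed buffer so we can report back before Lambda kills us


async def sync_table(namespace: str, table: str) -> None:
    """Hypothetical placeholder: call the real DAP synchronize routine here."""
    await asyncio.sleep(0)


async def _run(table: str, budget_seconds: float, sync) -> dict:
    # Give the sync a time budget; report a timeout instead of being hard-killed.
    try:
        await asyncio.wait_for(sync(NAMESPACE, table), timeout=budget_seconds)
        return {"table": table, "status": "synced"}
    except asyncio.TimeoutError:
        return {"table": table, "status": "timed_out"}


def handler(event: dict, context, sync=sync_table) -> dict:
    # Lambda passes only (event, context); `sync` is injectable for local testing.
    remaining = context.get_remaining_time_in_millis() / 1000
    budget = max(remaining - SAFETY_MARGIN_SECONDS, 0)
    return asyncio.run(_run(event["table"], budget, sync))
```

A Step Functions state machine could branch on the returned `status` and flag any `timed_out` table for a one-off run outside Lambda. Whether it is safe to cancel a sync partway through depends on how the underlying client handles interruption, so treat the timeout branch as something to verify before relying on it.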

Here's a link to a work-in-progress version of our pipeline code. This is essentially a slightly simplified version of the process that we run ourselves; I have a little work to do to apply some of our recent updates to the public version of the code, but you can get a sense of how it works:

https://github.com/Harvard-University-iCommons/canvas-data-2-aws/tree/develop

--Colin

 
