univsys
Community Participant

Import Issues

Has anyone else had import issues with the FTP uploads over the last month? We run our uploads 4 times a day, and we have had imports stall twice in the last 30 days (end of June).

Subsequent imports then become "Pending" and stack up. We have to have Support abort the "running" import to get things moving again.

Support says there is nothing to indicate an issue with our files, and they do not know what causes the import to hang.

robotcars
Community Champion

 @univsys 

I don't have an answer to what causes your delay. I can share some experiences.

Sometimes I've had issues because of how I sent something, or because I sent imports too often, before the previous one had finished. I most recently ran into this when moving to a 20-minute differential: some of those differentials took longer than 20 minutes for Canvas to import. One cause may have been a shared file space, where new CSVs were being created while others were being collected, zipped, and sent to the API. I updated the code to skip a job when one is already running, which resolved the issue.
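
A minimal sketch of the "skip a job when one is already running" idea, assuming a lock file guards the job. This is one common way to do it, not necessarily how the code described above works; `LOCK_PATH`, `try_acquire_lock`, `release_lock`, and `run_job_once` are hypothetical names chosen for illustration.

```python
import os
import tempfile

# Hypothetical lock location; any path writable by the scheduler's user would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "sis_import_job.lock")


def try_acquire_lock(path=LOCK_PATH):
    """Return True if we created the lock file, False if it already exists."""
    try:
        # O_EXCL makes creation atomic: it fails if the file is already there.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True


def release_lock(path=LOCK_PATH):
    """Remove the lock file so the next scheduled run can start."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def run_job_once(job, path=LOCK_PATH):
    """Run job() unless a previous run still holds the lock."""
    if not try_acquire_lock(path):
        return "skipped"
    try:
        job()
        return "ran"
    finally:
        release_lock(path)
```

Calling `run_job_once(export_and_send)` from the scheduler, where `export_and_send` stands for whatever builds and posts the CSVs, would return `"skipped"` while an earlier run is still in progress. One caveat: if a run is killed before its `finally` block executes, the lock file stays behind and has to be cleared by hand.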

I know that when we send a weekly import, it can sometimes take a few hours.

I think Canvas distributes these jobs more now so they are faster.

You don't have to put in a ticket to abort these, if you want to get moving faster.

SIS Imports - Canvas LMS REST API Documentation 

It has endpoints to abort a single import by ID, or to abort all pending imports.

I use these in my API tool of choice (Postman, Paw) or in the Canvas Live API.
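
For anyone who would rather script it, here is a rough sketch of building the two abort requests with Python's standard library. The paths follow the SIS Imports API documentation as I understand it (`PUT .../sis_imports/:id/abort` and `PUT .../sis_imports/abort_all_pending`); `build_abort_request` is a hypothetical helper, and the host, account ID, and token in the usage note are placeholders.

```python
import urllib.request


def build_abort_request(base_url, account_id, token, import_id=None):
    """Build a PUT request that aborts one import, or all pending imports.

    With import_id set, targets .../sis_imports/<id>/abort.
    Without it, targets .../sis_imports/abort_all_pending.
    """
    path = f"/api/v1/accounts/{account_id}/sis_imports/"
    if import_id is not None:
        path += f"{import_id}/abort"
    else:
        path += "abort_all_pending"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        method="PUT",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Something like `urllib.request.urlopen(build_abort_request("https://example.instructure.com", 1, token, import_id=1234))` would then send the request. It is worth trying against a test or beta instance first.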

univsys
Community Participant

Hi Robert!

To give a bit of background, we upload these files to the Canvas SFTP site 4 times a day. These are full file imports, not differential. Our imports take about 25 minutes to process all 13 files that we upload.

We FTP these files to their server, and they're picked up on the 30 minute mark of the hour. They're usually finished processing and we receive an inbox/email notification of the status of the import around the 55 minute mark of the hour.

We also run a demographic-only import through the API on an hourly basis. This happens at the 00 mark of the hour, and typically takes a few minutes to process. 

If, let's say, the FTP process (13 files) is running and importing, and for some reason it's taking longer than normal, could starting the API demographic import (3 files) cause some sort of collision issue in Canvas?

When you say that you are skipping the job if one is already running, are you doing this all on your end through your script (i.e. checking to see if the script is running already)? Or are you querying the API to see if a job is running already, and then implementing a *wait* in your script?

I'm thinking because we're importing one way through FTP, and another way through direct API, it may be causing an issue?

babylon5
Community Member

@univsys - hello. I saw your post about sending files to Canvas via SFTP and wanted to ask if we can connect on how you were able to accomplish this. Our current process sends the .csv files to an on-site local SFTP server, and then Canvas pulls data from that SFTP server.

Thanks

robotcars
Community Champion

@univsys 

Sorry I missed getting back to you. The end of July had me rewriting our entire SIS integration to handle 4 instances of Canvas (pandemic problems).

If you're using the API and an import is already running, the API will generally accept your import request and return a workflow_state of 'pending'. If your FTP job is still running, your API import will begin after that one completes. SIS imports don't run concurrently, so they won't overlap, but you do need to send your data in the order you want it applied.

As for skipping jobs, our jobs now run every hour, but with the same process. If a job is still running at :00 when a new job starts, the cron script checks (using Linux commands) whether the previous job is still running, and emails us 'BusDriver is delayed, previous job still running'. The process starts again on the next hour. Typically our hourly imports take only a couple of *seconds for Canvas to import, so we only get this notification when the database on our end takes a long time to return data or is having connection problems, in which case our script keeps retrying until it gets the data. However, if you aren't initiating the first job yourself (because Canvas picks it up via FTP), then somewhere your code needs to check one of the following endpoints:

Check to see if any import is running

https://canvas.instructure.com/doc/api/sis_imports.html#method.sis_imports_api.index

Check the status of existing imports

https://canvas.instructure.com/doc/api/sis_imports.html#method.sis_imports_api.importing
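
A hedged sketch of querying the second endpoint above (`GET /api/v1/accounts/:account_id/sis_imports/importing`) before posting a file. The helper names are hypothetical, and the response shape is an assumption: the code accepts either a `{"sis_imports": [...]}` object or a bare list, so check it against what your instance actually returns.

```python
import json
import urllib.request


def fetch_json(url, token):
    """GET a Canvas API URL with a bearer token and decode the JSON body."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def running_imports(base_url, account_id, token, fetch=fetch_json):
    """Return the list of imports Canvas reports as currently processing."""
    url = f"{base_url.rstrip('/')}/api/v1/accounts/{account_id}/sis_imports/importing"
    body = fetch(url, token)
    # Assumption: the body is either {"sis_imports": [...]} or a bare list.
    if isinstance(body, dict):
        return body.get("sis_imports", [])
    return list(body)


def safe_to_post(base_url, account_id, token, fetch=fetch_json):
    """True when no import is reported as running for the account."""
    return len(running_imports(base_url, account_id, token, fetch)) == 0
```

The `fetch` parameter exists only so the logic can be exercised without a network call; in a real script the default `fetch_json` would be used.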

With these, you could decide to delay POSTing your file to the API until the running import has finished, or send it anyway so it sits as 'pending' and is imported after the others complete. It depends on whether you monitor the import status and wait for completion. Our workflow runs the import and monitors its progress to completion, both for archival/status reasons and to decide whether a retry is necessary.
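
A minimal sketch of the "monitor to completion, then decide about a retry" loop, not the actual workflow described above. It assumes `get_import` is a caller-supplied function that returns the import's JSON (for example from `GET /api/v1/accounts/:account_id/sis_imports/:id`) as a dict with a `workflow_state` key. The state names below are my reading of the API documentation, and treating failed or aborted imports as retry candidates is an assumption.

```python
import time

# Assumed terminal states; verify against the SIS Imports API documentation.
FINISHED = {"imported", "imported_with_messages"}
FAILED = {"aborted", "failed", "failed_with_messages"}


def wait_for_import(get_import, import_id, poll_seconds=30,
                    timeout_seconds=3 * 3600,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll one import until it finishes, fails, or the timeout passes.

    Returns a (decision, last_state) tuple, where decision is one of
    "done", "retry", or "timeout".
    """
    deadline = clock() + timeout_seconds
    while True:
        state = get_import(import_id).get("workflow_state")
        if state in FINISHED:
            return "done", state
        if state in FAILED:
            return "retry", state
        if clock() >= deadline:
            return "timeout", state
        sleep(poll_seconds)
```

The `sleep` and `clock` parameters are there so the loop can be tested without waiting; a real job would leave them at their defaults and log or archive the returned state.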

*We send all data in the nightly import at midnight, and leave a 3-hour cron window for it to complete before starting the hourly (delta) imports at 3 AM.