Since March 5, we have had extreme inconsistency in file retrieval times with some outright failures. We are bringing down only 19 of the tables.
We've had the same problem. We're using canvasDataCli sync to download all our files, but what used to be a 3 1/2 minute process is now very inconsistent, sometimes taking hours.
We are downloading using the command line tool, which defaults to 5 async downloads at a time. Some files come down in seconds and others can get stuck for hours. When we retry, the files that were having trouble can come down fast, and then other files slow down.
Looking in Wireshark, the packet payloads just get smaller and smaller for different files at different times.
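For anyone scripting their own downloads, here is a minimal Python sketch of one way to notice a transfer that has slowed to a trickle (the shrinking-payload symptom described above) so it can be aborted and retried instead of hanging for hours. Everything in it is hypothetical: `guard_against_stall`, `StalledTransfer`, and the threshold values are illustrative names and numbers, not part of canvasDataCli or any Instructure tool.

```python
import time
from typing import Callable, Iterable, Iterator


class StalledTransfer(Exception):
    """Raised when average throughput over a full window is below the floor."""


def guard_against_stall(
    chunks: Iterable[bytes],
    min_bytes_per_sec: float = 1024.0,   # assumed floor; tune for your files
    window_seconds: float = 30.0,        # assumed measurement window
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[bytes]:
    """Pass chunks through unchanged, raising if the transfer trickles.

    Throughput is averaged over consecutive windows of at least
    `window_seconds`. The check runs only when a chunk arrives, so a
    read that blocks forever still needs a socket timeout as well.
    """
    window_start = clock()
    window_bytes = 0
    for chunk in chunks:
        window_bytes += len(chunk)
        now = clock()
        elapsed = now - window_start
        if elapsed >= window_seconds:
            rate = window_bytes / elapsed
            if rate < min_bytes_per_sec:
                raise StalledTransfer(
                    f"only {rate:.0f} B/s over the last {elapsed:.0f} s"
                )
            # Healthy window: start measuring a fresh one.
            window_start = now
            window_bytes = 0
        yield chunk
```

In use, the iterable would be whatever yields the response body in pieces (for example, repeated `read()` calls on an HTTP response), and the caller would catch `StalledTransfer`, discard the partial file, and retry.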
Thanks, Tom and Mikhael. Your experiences are very much like our own. It's helpful to know that it's not just us.
For what it's worth, to retrieve the files we're using the Perl LWP::Simple module, which as I understand it issues a simple HTTP GET on the file address.
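As a point of comparison, here is a rough Python sketch of the same "plain HTTP GET to a file" approach, with a socket timeout and a few retries added. It is only an illustration under assumptions: `download_with_retry` and its parameters are made-up names, the defaults are arbitrary, and it is not how LWP::Simple or canvasDataCli work internally.

```python
import shutil
import time
import urllib.request
from typing import Callable


def download_with_retry(
    url: str,
    dest_path: str,
    attempts: int = 3,                 # assumed retry budget
    timeout_seconds: float = 60.0,     # assumed socket timeout
    backoff_seconds: float = 5.0,      # assumed base delay between tries
    opener: Callable = urllib.request.urlopen,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """GET `url` into `dest_path`; return the attempt number that succeeded.

    The last error is re-raised if every attempt fails. Each retry
    reopens the destination in "wb" mode, so partial data is overwritten.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            with opener(url, timeout=timeout_seconds) as response, \
                    open(dest_path, "wb") as out:
                shutil.copyfileobj(response, out)
            return attempt
        except OSError as err:  # URLError and socket timeouts are OSErrors
            last_error = err
            if attempt < attempts:
                sleep(backoff_seconds * attempt)  # simple linear backoff
    raise last_error
```

One caveat: the `timeout` passed to `urlopen` applies to individual blocking socket operations, not to the transfer as a whole, so a download that keeps delivering tiny payloads would not trip it on its own.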
We were having this problem (getting Canvas Data files via the canvasDataCli slowed or stopped), too, for about a week and a half. For us, it also started with the data dump produced on March 5th. We noticed that the problem only occurred when we were connecting from our university network; the download was as fast and complete as ever via residential ISPs. Our download problems ended this past weekend. We didn't download the data dumps produced on March 16th or 17th, but we haven't had any trouble since the March 18th data dump.
Are you all still having issues downloading? Are you also connecting via university networks?
Our issues seem to have cleared as well. We are connecting via our university network. Another part of our campus was having AWS issues that also seem to have cleared. We are continuing to monitor in case we have more issues.
We stopped the job in our production environment but have let it continue to run in QA, which points to the same set of Canvas Data files on S3. Consistent with what you've seen, the QA job has been running well since early morning March 16.
BTW, we opened a ticket with Instructure for this issue. We did that by sending an email to the support team. I don't have a ticket number. On March 19, we heard back that the information had been passed along to the product and engineering team. No further news at this point.
Thanks for keeping us posted on your experience.
My boss, who is the most skilled user of Google I know, found a blog post on "the mystery of the hanging S3 downloads." Much of it is beyond my ken, but the problem it describes sounds very much like what we experienced, particularly what Mikhael observed as "the packet payloads just get smaller and smaller for different files at different times."
Another member of our team has passed the post along to the Instructure support people.