Let me start by saying I am not in any way, shape or form an experienced DB admin, although I can stumble and fumble my way through a lot of things due to my 30 years of experience in IT and programming. We have a server set up with SQL 2012 and want to download Canvas Data and import it into the database so we can build and run some custom queries against the data.
In learning my way through this project, I have managed to get the CanvasDataCLI tool working, but trying to manually import the unpacked .txt files generates errors, or I get a success message but there isn't any data where I expect it to be. I could very easily be doing something wrong that is causing this process to fail.
I tried using CanvasDataViewer, and I have it working to the point that it will download all the files but nothing gets unpacked/unzipped and imported. I thought I had it fixed by correcting an oddly specific issue in the machine config files, but... no. The queries to download the data and import the data both indicate that they ran successfully but there is no data in the unpack folder, and no data in the database.
Next, I tried setting up the CanvasDataLoader. That process failed when I tried to do the "cargo build --release" step because it couldn't find the Visual C++ tool that I'd just installed.
I have no idea what to try next or how to fix any of these processes. Does anyone have any suggestions to point me in the right direction? I'm hoping I've done something really stupid to prevent CanvasDataViewer from working and that it would be an easy fix to get that option up and running because it seems that it would need the least amount of intervention once it gets going.
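For the manual import errors described above, one cheap check before involving SQL Server is to confirm that every line in an unpacked .txt file has the same number of fields. The sketch below is only an illustration. It assumes the files are tab-delimited and that NULL values are written as `\N`; both are assumptions to verify against your own files, and the function names are made up for this example.

```python
import io
from collections import Counter


def summarize_columns(lines, delimiter="\t"):
    """Count how many lines have each field count.

    More than one distinct field count usually points at a delimiter,
    quoting, or embedded-newline problem in the file.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\r\n").split(delimiter)
        counts[len(fields)] += 1
    return counts


def null_to_none(field, null_marker="\\N"):
    """Map the assumed NULL marker to None; leave other values untouched."""
    return None if field == null_marker else field


# Tiny in-memory sample standing in for an unpacked file (hypothetical data).
sample = io.StringIO("1\tAlice\t\\N\n2\tBob\t2017-01-01 00:00:00\n")
print(summarize_columns(sample))
```

To run it against a real file, pass an open file object instead of the sample, for example `summarize_columns(open(path, encoding="utf-8"))`. A "successful" import with missing data is sometimes just rows rejected for having the wrong field count.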
Hi Tess,
CD:CLI doesn't import data, it only handles downloading and syncing of downloaded files you store locally.
I stopped at the Windows system requirement for Canvas Data Viewer.
Canvas Data Loader only deals with MySQL and PostgreSQL.
I'm pushing into SQL Server 14, but I don't think there'd be an issue for 12.
I have worked with CD:CLI a bit to download the files but I'm not using it in production. I'm working on improving my import process before switching, but I've just finished another project to deal with Live Events.
The best option I see right now is probably Embulk (see the Embulk 0.8 documentation), with its SQL Server output plugin:
https://github.com/embulk/embulk-output-jdbc/tree/master/embulk-output-sqlserver
My current solution uses @James' canvas-data scripts (canvancement/canvas-data at master, jamesjonesmath/canvancement on GitHub), except I modified import.sh to use the bcp utility (bcp Utility, Microsoft Docs) installed on RHEL5 (Install SQL Server command-line tools on Linux, Microsoft Docs).
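As a rough idea of what a bcp-based import step looks like, here is a minimal sketch that only builds the command line and does not run it. It is not the modified import.sh; the table, file, server, and database names are placeholders, and the line-feed row terminator is an assumption about how the unpacked files end their lines.

```python
def build_bcp_command(table, data_file, server, database,
                      user=None, password=None, error_file=None):
    """Build an argument list for a bcp bulk import.

    -c       character mode (tab is the default field terminator)
    -r 0x0A  row terminator as a bare line feed (assumed file format)
    -T       trusted (Windows) authentication when no user is given
    """
    cmd = ["bcp", table, "in", data_file,
           "-S", server, "-d", database,
           "-c", "-r", "0x0A"]
    if user is None:
        cmd.append("-T")
    else:
        cmd += ["-U", user]
        if password is not None:
            cmd += ["-P", password]
    if error_file:
        # Rows bcp could not load are written here for inspection.
        cmd += ["-e", error_file]
    return cmd


# Placeholder names, for illustration only.
cmd = build_bcp_command("dbo.submission_dim", "submission_dim.txt",
                        "localhost", "CanvasData",
                        error_file="submission_dim.err")
print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` once the values are real. Keeping an error file per table makes it easier to see which rows were rejected.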
If Embulk doesn't do the magic (read FASTER FASTER) and simplicity... then I'll probably switch to CD:CLI with a modified version of James' script again.
I'd be happy to share my experience and help where I can if this looks like something you'd like to do. Otherwise I'm sure there are others around here using SSIS and Import Wizards or even fancier options, but I don't have access to these tools or a desire to use them.
Thank you for the response. We're a Windows shop, so no Linux servers.
I was able to figure out why CanvasDataViewer wasn't working. It was looking for 7-zip (used to unpack the downloaded data) in Program Files. I had it installed in a different location. I made an adjustment to where CanvasDataViewer was looking for 7-zip and I was able to get the process running and importing our data.
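For anyone hitting the same problem, a quick way to see where 7-Zip actually lives on a machine is a small lookup like the one below. This is only a sketch and is not how CanvasDataViewer itself locates 7-Zip; the function name and the search order (explicit folders, then PATH, then the usual Program Files locations) are assumptions for illustration.

```python
import os
import shutil


def find_7zip(extra_dirs=()):
    """Return the path to a 7-Zip executable, or None if none is found."""
    # 1. Folders supplied by the caller, e.g. a non-default install location.
    for folder in extra_dirs:
        exe = os.path.join(folder, "7z.exe")
        if os.path.isfile(exe):
            return exe
    # 2. Anything already on PATH.
    for name in ("7z", "7z.exe"):
        found = shutil.which(name)
        if found:
            return found
    # 3. The default install folders on Windows.
    for var in ("ProgramFiles", "ProgramFiles(x86)"):
        base = os.environ.get(var)
        if base:
            exe = os.path.join(base, "7-Zip", "7z.exe")
            if os.path.isfile(exe):
                return exe
    return None


print(find_7zip())
```

If this prints a path outside Program Files, that path is the one the tool's configuration needs to point at.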
Great! Glad you got something working.
Embulk can do Windows too.
https://github.com/embulk/embulk/tree/release_v0.8.27_3#windows
I had a pretty long reply with thoughts and a benchmark for Embulk but this reply trashed my saved post.
The gist of it was...
Embulk was twice as fast as BCP alone.
28 Submission files (27.9M submissions):
Embulk 35 minutes
BCP 68 minutes.
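To put those numbers in terms of throughput, here is the arithmetic as a short script. It uses only the figures quoted above (27.9M rows, 35 and 68 minutes); the rounding is mine.

```python
ROWS = 27_900_000  # submissions across the 28 files
MINUTES = {"Embulk": 35, "BCP": 68}


def rows_per_second(rows, minutes):
    """Average load rate over the whole run."""
    return rows / (minutes * 60)


for tool, minutes in MINUTES.items():
    print(f"{tool}: {rows_per_second(ROWS, minutes):,.0f} rows/s")

speedup = MINUTES["BCP"] / MINUTES["Embulk"]
print(f"Embulk speedup over BCP: {speedup:.2f}x")
```

That works out to roughly 13,300 rows/s for Embulk against roughly 6,800 rows/s for BCP, or about 1.94x.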
PS. It can also convert timestamps to your local timezone...
This discussion post is outdated and has been archived. Please use the Community question forums and official documentation for the most current and accurate information.