Contrary to what appears to be a common belief, not all Canvas admins are computer geeks. Most of us did not need those skills in the prior LMSs we migrated from - everything was done through the UI. Many of us, myself included, are becoming more geeky because those skills are helpful for Canvas admins, but we are not up to speed with the data management skills that appear to be needed for Canvas Data. It would be very nice if the Canvas Data Portal also included a UI to view and download the data files in a format we can easily work with like Excel.
At our campus, we are working with an IT person from our state system, and he had difficulty even accessing the text files in that obscure zip format and getting them into Excel. He still needs to apply the data definitions to create a useful report - and he is a geek.
I know that the Canvas Data Portal was tested in Beta, but I strongly suspect that the beta testers 1) had extra support from Instructure, and 2) were a tad bit more geeky than I am.
My new slogan: "Data for the rest of us!"
That is my question - will there be improvements to the Canvas Data Portal to make it more user friendly for the rest of us?
@kmeeusen , from what I've gathered from @James (and I'm sure he'll correct me if I'm wrong), a better way to work with Canvas Data is through the API. I don't think this is a magic wand that will fix everything, but it sounds like it might be a bit more manageable. Yet, I will 100% agree with you that when I created this feature idea, Canvas Data was not what I was imagining. My current hope is that at some point James will get time (and have server space) to build a simple front-end GUI that will allow me to do my own data queries from Canvas Data. Fingers crossed.
I love the idea of Canvas Data! I just wish it were more user-friendly for those of us who do not yet use the API.
A front-end GUI is exactly where I would like to see Canvas go with this - very unfortunately, not all of us have a James available!
@kmeeusen ,
One could argue that part of the reason for not having a front-end GUI is that Canvas is giving us access to our data and letting us do what we want with it. If they slap a front-end on top of it, then it limits us to what they've already created reports for. That was one of the complaints -- "we can't generate what we want because we don't have access to the data."
But I agree that there is a need for more reports within Canvas that are easily accessible for the masses. I also think that people at Canvas have been saying "If we give them access to their data, they won't need to ask us for as many obscure reports used only by one person at one university."
After InstructureCon's announcement, Canvas Data has been sold as the solution to everything, but it's not. It may be more helpful for those who can shell out the money to pay for access to Redshift so they can run queries directly without having to download the data in flat files. Tableau and other packages may be helpful for those who can afford the license. R is free and may be useful for generating statistical reports, but I'm not sure how useful it would be for drilling down to a list of faculty who are actively using Canvas. All of those have learning curves and if you're a small shop, you probably don't have the expertise to do it yourself. There is a market here for third parties to do the crunching and analysis for us. Maybe making it difficult for people to use the free version is part of the agenda. Unfortunately, some of us live in a state with a dysfunctional government, some state universities are saying they will have to shut down if something isn't done, and so we're relegated to free for right now. Of course, with free comes the reminder that "You get what you pay for."
I asked Kona last night to "Give me a use-case for Canvas Data." And she said "Knowing which students viewed a page." Great, that should be possible since the page view requests are available in Canvas Data. But when you look at it, the requests information is incremental (good) rather than a full dump of all page views going back to the beginning of time, and it's missing data. Each file holds less than a day's worth of requests, even though the files are generated daily. The Canvas Data Portal had 5 days' worth of downloads available and 2 of them were incomplete, so I had only a fraction of the actual page views for the last 5 days. Telling an instructor who viewed a page is going to be impossible from that. However, with the API, I can get that information by searching through the page views for each student in a course. It takes a while, but where Canvas Data is supposed to help, it doesn't.
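To make the "search through the page views for each student" idea concrete, here is a minimal sketch of just the filtering step. It assumes you have already fetched each student's page views through the API and stored them as a list of dictionaries with a "url" key; the function name, the data layout, and the sample URLs are all hypothetical, and the fetching itself (authentication, pagination) is left out.

```python
def students_who_viewed(page_views_by_student, url_fragment):
    """Return the students with at least one page view whose URL contains url_fragment.

    page_views_by_student: dict mapping a student name or id to a list of
    page view records (dicts). The "url" key is an assumption about how the
    fetched records are stored.
    """
    viewers = []
    for student, views in page_views_by_student.items():
        # A record with a missing or empty url simply never matches
        if any(url_fragment in (view.get("url") or "") for view in views):
            viewers.append(student)
    return sorted(viewers)


# Made-up sample data, only to show the shape the function expects
sample = {
    "student_a": [{"url": "https://example.edu/courses/1/pages/syllabus"}],
    "student_b": [{"url": "https://example.edu/courses/1/assignments/7"}],
    "student_c": [],
}
viewers = students_who_viewed(sample, "/courses/1/pages/syllabus")
```

The slow part in practice is collecting the page views for every student, not this filter, which matches the "it takes a while" experience described above.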
People using the flat files are going to have to download and archive some of the information since it's incremental. For other dumps, like the submissions, it's the full dump and not incremental, so I have to download the full (for us) 64 MB every day and that only gets bigger as time goes on.
I hope the kinks get worked out as it matures. Right now, it's out of beta, but I'd still consider it to be in its infancy. As early adopters, we're forging new ground and should expect all the difficulties that go with that.
Hi James! Your replies are spot on and I just wanted to add a few updates here -
1) I'm so glad folks here are bringing up the need for a data browser UI - this is something we have been thinking about as well. As you have highlighted, we are starting with the foundation of giving access to data and from here we have a few directions we want to go:
For (b), one of the ways we are considering doing this is by creating a data-browsing interface. We are also looking at common reports and queries that would be useful to provide out of the box. Finally, we think the best way to surface data in Canvas is to make it so students and teachers get information as part of the normal flow of working on assignments, quizzes, discussions, etc.
2) Last weekend, we did a load of historical page views (requests), so some schools may not have seen the full history of requests prior to that. Since then we have found an issue, so we plan to rerun the load in the next few days. We appreciate your patience and will keep the community updated here with the latest.
3) Lastly, we are considering improvements to our current Managed Data (Redshift) offering to enhance access capabilities, among other things. We welcome input from schools and would like to use this forum to validate our plans.
Thanks again for your interest!
Linda
Thank you, Linda. I greatly appreciate that work is continuing to improve the user experience with Canvas Data.
You have greatly reassured me.
Thanks, James! Very thoughtful reply, and helpful!
The API is only useful for downloading the data without sitting there and clicking on 50 different files. It also allows you to get the column and table definitions (schema) in a computer-readable (JSON) format. It does not help in the slightest with using it once you have it.
.ZIP has been around since 1989 with PKZIP. Windows has supported zipped folders since 1998. Mac has had built-in ZIP support since OS X 10.3 in 2003. ZIP has been extended and is even used for compression in the .DOCX and .XLSX files (among many others). So, calling ZIP obscure is not close to reality.
That said, ZIP is not the format used by the files downloaded from Canvas Data. Those are gzip files, which have been around since 1992. Pretty much every *nix-based operating system uses gzip because it was designed to work around the patents of other compression software. If you are operating on a command line, you would use gunzip filename. gzip got an added boost in popularity because a flag was included with the tar command so that you could compress several files together. There have been attempts to replace it (bzip2 is one that often achieves better compression), but none caught on as much as gzip did.
That's great for the geeks, but for Windows users, I recommend the 7-Zip program. It's open source and doesn't have any nag-ware in it. It has a GUI, but it also adds items to the right-click menu to extract files where they are or to folders.
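For anyone without gunzip on hand, the same decompression can be sketched with Python's standard library. This is only an illustration, not anything Canvas provides: the function name gunzip_file and the example filename are made up, and unlike plain gunzip it leaves the original .gz file in place.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src, dest=None):
    """Decompress a gzip file to dest and return the output path.

    If dest is not given, the trailing .gz is dropped from the name
    (or ".out" is appended when the name has no .gz extension).
    The compressed original is kept.
    """
    src = Path(src)
    if dest is None:
        if src.suffix == ".gz":
            dest = src.with_suffix("")
        else:
            dest = src.with_name(src.name + ".out")
    dest = Path(dest)
    # Stream the data so a large file never has to fit in memory at once
    with gzip.open(src, "rb") as compressed, open(dest, "wb") as plain:
        shutil.copyfileobj(compressed, plain)
    return dest
```

Calling gunzip_file("requests-part.gz") (a hypothetical filename) would write requests-part next to it, which you could then open in a text editor.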
I haven't had time to dig into the files much yet. But it appears that they are tab delimited files and so you could rename them with a .txt extension and then open them as a text file in Excel. I don't recommend that, though. Our first submission_fact-part file came in two parts, the first had 608,773 rows of data and the second had 476,573 rows of data. The submission_dim-part file has a row for all 1,085,346 of those. I don't know what that means yet because I haven't looked at what's in the files, but I do know that Excel is not the place to do the analysis because Excel 2013 and Excel 2016 have a limit of 1,048,576 rows and I've already exceeded that. Those three files, when uncompressed, make up 386 MB. That's for our information going back to August 2012, but we're a small school compared to a lot of places.
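Before trying Excel at all, it can help to count the rows without unzipping anything. The sketch below assumes what the post only suspects, namely that each file is gzip-compressed text with one record per line; the function names and the idea of passing in a list of part files are illustrative, not part of Canvas Data.

```python
import gzip

EXCEL_MAX_ROWS = 1_048_576  # worksheet row limit quoted above for Excel 2013/2016


def count_rows(path):
    """Count lines in a gzip-compressed text file without writing it to disk."""
    with gzip.open(path, "rb") as f:
        return sum(1 for _ in f)


def fits_in_excel(paths):
    """Return (total_rows, fits) for a set of part files taken together."""
    total = sum(count_rows(p) for p in paths)
    return total, total <= EXCEL_MAX_ROWS


# The two submission_fact parts from the post, added by hand
total_rows = 608_773 + 476_573
over_limit_by = total_rows - EXCEL_MAX_ROWS
```

With the numbers from the post, the two parts add up to 1,085,346 rows, which is 36,770 rows more than a single worksheet can hold.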
Excel really isn't useful for Big Data and so yes, it seems really out of reach for most of us normal folks. Being a computer geek or even an Office expert doesn't mean you've got any experience with big data or business intelligence. I would consider myself to be fairly capable, but I've got a steep learning curve along with everyone else. This really is out of the reach of most admins, especially with the lack of quality documentation. Big Data and Canvas Data have been sold as the solution to a lot of things in the ramp-up to its release, but if you can't use it, it doesn't do you any good.
Now, give me some time and I'll have some more tangible things to say. But right now, it's kind of like the Canvas API in that not everyone can use it but if you can, you can do some pretty amazing things.
I was looking for a way to increase the row limitation in Excel when I came across a built-in feature (built into Excel 2013 and up) called Power Pivot. Once this add-in is enabled, Excel is only limited by the amount of memory on the computer (64-bit version of Excel works best).
What’s new in Power Pivot in Microsoft Excel 2013 - Excel
I have not had a chance to try the add-in yet but it might be helpful when working with these large data files.
Hopefully you'll find Power Pivot easier and more helpful than I did. I wrote about it here: How to list teachers who have published/unpublished courses. For those wanting pivot tables, it might be awesome, but if you're after drilling down to things, like obtaining a list of students who met some criteria, you're limited to a one-page view that you then can't do much with after you have it.
Your article will be a great starting point. Thank you! The add-in caught my eye because it removes the row number limitation once enabled.
This discussion post is outdated and has been archived. Please use the Community question forums and official documentation for the most current and accurate information.