cancel
Showing results for 
Search instead for 
Did you mean: 
Highlighted
Community Member

What are people using to access the data from Canvas Data Portal?

Jump to solution

Hi all,

I have just gain access to the Canvas Data portal last week and have been exploring it. From what I can tell so far, it is a tab delimited file stored inside a gz file. 

Excel can to a certain point understand the Tab delimitation but it is not the best tool, so I am wondering what everyone else is using to access the data in those files. 

Thanks,

Sc.

Tags (1)
1 Solution

Accepted Solutions
Highlighted
Navigator

samson.chung@uoit.ca 

A relational database. Canvas uses Postgresql on the backend, but you can use MySQL, MSSQL, Oracle (has some problems with long field names), Informix, etc. Basically, whatever database system you're already using, but if you're talking "Excel" be prepared for the difficulties of big data. People who can afford it may use Amazon Redshift since that eliminates the need to host and manage the data locally.

Excel cannot handle the size of some of the tables. Excel also has issues with the IDs, which are bigInt unless they are imported as strings instead of numbers, but they aren't by default so it's problematic. Tableau likewise had problems with the bigInt fields (that may have been recently fixed, but I can't find the announcement I thought I read about it).

For additional information about getting Canvas Data loaded, please see these threads.

View solution in original post

4 Replies
Highlighted
Navigator

samson.chung@uoit.ca 

A relational database. Canvas uses Postgresql on the backend, but you can use MySQL, MSSQL, Oracle (has some problems with long field names), Informix, etc. Basically, whatever database system you're already using, but if you're talking "Excel" be prepared for the difficulties of big data. People who can afford it may use Amazon Redshift since that eliminates the need to host and manage the data locally.

Excel cannot handle the size of some of the tables. Excel also has issues with the IDs, which are bigInt unless they are imported as strings instead of numbers, but they aren't by default so it's problematic. Tableau likewise had problems with the bigInt fields (that may have been recently fixed, but I can't find the announcement I thought I read about it).

For additional information about getting Canvas Data loaded, please see these threads.

View solution in original post

Highlighted
Community Member

jamesjonespa@gmail.com‌, thanks for the reply...

but that is where my first problem was. As the documentation mentioned Postgres, I figured I could have dump the data into MySQL. the dump file created is tab delimited and MyQL workbench doesn't like that. I figured I may have to create a brand new schema and define my tables within the schema based on the definition before loading the file in via commandline...

However, I was wondering if that's what everyone else is doing, as it seems like a lot of repeating work for all administrators... but of course, there's always Redshift.... 

Well, I guess that's what i'll have to do... basically create a schema with all the tables an inject the data....

Thanks,

Sc.

0 Kudos
Highlighted
Learner II

Hi samson.chung@uoit.ca

As james@richland.edu‌ shared, when dealing with Big Data the issue is tools. The community tries to share what it can but some institutions have fancy tools that don't even need to define the schema, see the AWS article James linked above. Other's are stuck building a schema and trying to get the data into a database in a timely manner, see the Embulk thread James shared. This one has the schema built for 4 databases, even MySQL. This way you don't have to create another solution before you start.

Highlighted
Community Member

Thank you! I will take a look at your post that James have linked.... Smiley Wink

0 Kudos
Labels