When syncing Canvas Data using the CLI tool, I am finding that not all the tables are downloading. As I understand it, the tables that are "_dim" should be combined. Am I missing something about the sync process? It runs well from the CMD tool, but I'm just not getting all the data.
The canvasDataCli sync operation synchronises your local data store with the source. If you use the debug switch (canvasDataCli sync -c config.js -l debug), you'll get more detail. It first gets a file list from the source, then compares the local files with the list and downloads only the files that are different. The local data store is in the dataFiles directory, which has a subdirectory for each table; each subdirectory will contain one or more *.gz files.
The canvasDataCli unpack operation then unpacks and stitches these files together, adds a line containing field names at the head of the file, and outputs the resulting *.txt file to the unpackedFiles directory. Only tables which contain data will be downloaded, so if you don't use the functionality that feeds a particular table, there will be no data for it. For both fact and dim tables, each dump is a complete snapshot of the data at the point the dump was created. The requests table is the only exception to this.
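To make the "unpack and stitch" step concrete, here is a rough Python sketch of the idea. It is not how canvasDataCli is actually implemented. The function name stitch_table and the field_names argument are made up for illustration (the real tool gets the field names itself), and the tab-separated layout is an assumption about the dump files.

```python
import gzip
import shutil
from pathlib import Path


def stitch_table(table_dir, field_names, out_path):
    """Concatenate every *.gz part in table_dir into one text file.

    field_names is a hypothetical list of column names supplied by the
    caller; the layout is assumed to be tab-separated.
    Returns the number of parts that were stitched together.
    """
    parts = sorted(Path(table_dir).glob("*.gz"))
    with open(out_path, "wb") as out:
        # Write the header line first so the columns are named.
        out.write(("\t".join(field_names) + "\n").encode("utf-8"))
        for part in parts:
            # Decompress each part and append it to the output file.
            with gzip.open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return len(parts)
```

Something like stitch_table("dataFiles/account_dim", ["id", "name"], "account_dim.txt") would produce a single file with a header row followed by the rows from every part. A table with no subdirectory in dataFiles simply has nothing to stitch, which is why it never appears in unpackedFiles.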
Thanks Stuart – this worked like a champ!
Is there a quick way to unpack a whole directory without having to type in each sub-folder name, e.g. canvasDataCli unpack -c Users/username/Desktop/config.js -f account_dim course_dim requests?
I'm not aware of a way to unpack multiple files with a single unpack operation. From memory, the documentation indicated that this was a design decision due to the potential size of the unpacked files, specifically requests.txt.
We download to a Linux host and have written a shell script to unpack all of the files after a successful sync, so it's not an issue.
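For anyone who wants the same effect without writing the table names by hand, the approach can be sketched roughly as below. This is a Python illustration, not our actual shell script. It assumes the download directory is called dataFiles with one subdirectory per table, and it uses only the -c and -f options already shown in this thread. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def downloaded_tables(data_dir):
    """Return the names of table subdirectories holding at least one *.gz file."""
    return sorted(
        p.name
        for p in Path(data_dir).iterdir()
        if p.is_dir() and any(p.glob("*.gz"))
    )


def unpack_commands(config_path, tables):
    """Build one `canvasDataCli unpack` command line per table."""
    return [
        ["canvasDataCli", "unpack", "-c", str(config_path), "-f", table]
        for table in tables
    ]


def unpack_all(config_path, data_dir, run=subprocess.run):
    """Unpack every downloaded table, stopping at the first failure."""
    for cmd in unpack_commands(config_path, downloaded_tables(data_dir)):
        # check=True raises CalledProcessError on a non-zero exit code.
        run(cmd, check=True)
```

Calling unpack_all("config.js", "dataFiles") after a successful sync would then run one unpack per table. Running a separate command per table keeps a failure on a very large table such as requests from hiding which table caused it.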
That’s the next step: put the data in a MySQL DB for processing with Cognos. If you have any suggestions, or good documentation, please pass them along! I’ve seen some, but this will be a new process for me since I don't usually pull reports. Cool stuff!
Thanks for all your assistance!
If you're using MySQL then you should probably be looking at the Canvas Data Loader tool rather than canvasDataCli.
We don't use it because it doesn't support Oracle, but I understand that it is able to download the data and load it directly into a MySQL database. It also handles the periodic historical dumps of requests data. We load requests data into a table partitioned by month and compress partitions once they become inactive. Even then, the requests table is over 500 GB after three semesters.
Another benefit of Oracle is the ability to use bitmap indexes and star transformation to improve the performance of Cognos reporting.