I am using the grab command to get the latest data dump, but it does not download the schema file as well. Is there a way to download just the schema.json file so that I always have the latest version whenever it is updated?
Are you looking for just the json file, or a generated DDL?
I believe this will work. Thanks.
If you use the canvasDataCli sync command you maintain a local copy of the gzip files. Every time you invoke it you'll get the latest dump plus the current schema.json file. The other benefit of this approach is that the gzip files are written to individual directories per table and the canvasDataCli unpack command can then be used to unzip and concatenate the data and add a header record with field names to generate a set of text files. The grab command simply writes all of the gzip files into a single directory.
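For anyone curious what that unpack step amounts to, here is a minimal Python sketch of the idea, not the canvasDataCli implementation. It assumes one directory per table holding `.gz` parts, tab-delimited rows, and a list of column names taken from schema.json. The function name `unpack_table` and the file layout are made up for illustration.

```python
import gzip
import shutil
from pathlib import Path


def unpack_table(table_dir: Path, column_names: list[str], out_file: Path) -> None:
    """Concatenate every .gz part in table_dir into one text file with a header.

    Assumption: rows are tab-delimited, so the header is tab-delimited too.
    """
    with out_file.open("wb") as out:
        # Header record with the field names.
        out.write(("\t".join(column_names) + "\n").encode("utf-8"))
        # Sort so the parts are appended in a stable order.
        for part in sorted(table_dir.glob("*.gz")):
            with gzip.open(part, "rb") as src:
                # Stream the decompressed bytes instead of loading them in memory.
                shutil.copyfileobj(src, out)
```

Calling `unpack_table(Path("dataFiles/account_dim"), ["id", "name"], Path("account_dim.txt"))` would write a single text file for that table; the paths here are placeholders.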
We currently do sync every day but are looking to move everything into AWS and do not want to provision 300+ GB of storage to run this command. I have a working schema right now but realize that it may be outdated eventually so I was looking for a place to find the latest schema. It looks like the link Robert provided me with will work.
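If the worry is a working schema quietly going stale, one option is to keep the last schema.json you built from and compare it against a freshly fetched copy, however you fetch it. The sketch below is only an illustration. It assumes the file has a top-level `version` and a `schema` object keyed by table, each with a `columns` list of `name`/`type` entries; check that against your own copy. `diff_schemas` is a made-up helper name.

```python
def diff_schemas(old: dict, new: dict) -> dict:
    """Summarise table-level differences between two parsed schema.json files."""
    # Assumed layout: {"version": ..., "schema": {table: {"columns": [...]}}}
    old_tables = old.get("schema", {})
    new_tables = new.get("schema", {})

    def columns(table: dict) -> set:
        # Compare columns by (name, type) so a type change counts as a change.
        return {(c.get("name"), c.get("type")) for c in table.get("columns", [])}

    changed = sorted(
        name
        for name in old_tables.keys() & new_tables.keys()
        if columns(old_tables[name]) != columns(new_tables[name])
    )
    return {
        "version": (old.get("version"), new.get("version")),
        "added": sorted(new_tables.keys() - old_tables.keys()),
        "removed": sorted(old_tables.keys() - new_tables.keys()),
        "changed": changed,
    }
```

Loading both files with `json.load` and printing the result gives a quick answer to whether the tables need rebuilding, without keeping any of the dump files around.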
I've tried a few ways to parse that into MSSQL; it's a cumbersome task. I use James Jones's canvancement/schema_to_mysql.php on GitHub, which I modified for MSSQL.
I've updated that script many times locally and I need to update the version on GitHub. It can now add indices, and I've moved the exceptions out of the source code into a separate file.
That's exciting, any update for this task is delightful. The detail you have put into parsing and patching the docs, enums, etc. is extremely appreciated! I'll try and post an MSSQL fork after the update.
I attempted -- or at least I thought about attempting -- to make it extensible, so one could specify the flavor of SQL that one was using. It may not need a separate fork, but possibly a configuration option. I really was just throwing something together when I made it.
With the latest version, I've enumerated some of the fields that I know are enumerated even though the docs don't say so, or that I couldn't pick up with a scan of the docs. All the workflow_state fields are that way. There are some others that we could make that way.
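To make the idea of keeping those exceptions outside the source concrete, here is a rough Python sketch rather than the actual PHP script. The overrides mapping, the `enum_values` key, and the values in the usage note are all hypothetical placeholders. The schema layout (tables keyed by name, each with a `columns` list) is an assumption as well.

```python
import copy


def apply_enum_overrides(tables: dict, overrides: dict) -> dict:
    """Return a copy of the tables with the listed columns marked as enums.

    overrides is shaped like {table: {column: [allowed values]}} and could be
    loaded from a separate JSON file so nothing is hard-coded in the script.
    """
    patched = copy.deepcopy(tables)  # leave the caller's schema untouched
    for table, cols in overrides.items():
        for col in patched.get(table, {}).get("columns", []):
            if col.get("name") in cols:
                col["type"] = "enum"
                # "enum_values" is a made-up key for this sketch.
                col["enum_values"] = list(cols[col["name"]])
    return patched
```

For example, `apply_enum_overrides(tables, {"some_table": {"workflow_state": ["value_a", "value_b"]}})` would mark that one column; the table name and values are placeholders, not the real enumerations.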
That would probably work and be extremely beneficial. Maybe someone can contribute Redshift support, and we'd cover a lot of bases. I don't think I had to modify much: maybe some datatypes and the delete/table/create differences.
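A configuration option along those lines could look something like the Python sketch below, where each SQL flavor is just a lookup table for identifier quoting and datatypes. This is an illustration of the approach, not how schema_to_mysql.php works. The type names on the left, the chosen target types, and the default length of 255 are assumptions to adjust for your own schema.

```python
# Per-dialect settings; adding Redshift would mean adding one more entry.
DIALECTS = {
    "mysql": {
        "quote": "`{}`",
        "types": {
            "bigint": "BIGINT",
            "int": "INT",
            "varchar": "VARCHAR({length})",
            "text": "LONGTEXT",
            "boolean": "TINYINT(1)",
        },
    },
    "mssql": {
        "quote": "[{}]",
        "types": {
            "bigint": "BIGINT",
            "int": "INT",
            "varchar": "NVARCHAR({length})",
            "text": "NVARCHAR(MAX)",
            "boolean": "BIT",
        },
    },
}


def create_table_sql(table_name: str, columns: list, dialect: str) -> str:
    """Build a CREATE TABLE statement for one table in the chosen dialect."""
    cfg = DIALECTS[dialect]
    lines = []
    for col in columns:
        # Fill in the length only where the type template asks for one.
        sql_type = cfg["types"][col["type"]].format(length=col.get("length", 255))
        lines.append("  " + cfg["quote"].format(col["name"]) + " " + sql_type)
    header = "CREATE TABLE " + cfg["quote"].format(table_name) + " (\n"
    return header + ",\n".join(lines) + "\n);"
```

The same column list can then be rendered for either target by changing only the last argument, which keeps the differences between flavors in data rather than in a fork.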