simplifying schema.json

robotcars
Community Champion

I've been working on improving our Canvas Data import process and it involves parsing the latest schema into tablename.csv containing just the headers. I know there's been plenty of discussion about parsing the file and generating DDLs, docs, etc., most often by those just getting started with CD. Most require a considerable amount of recursive iteration and logic to compile something you'd want to use. I was on the #canvas-lms IRC channel awhile back, trying to parse JSON in Bash and someone pointed me to ./jq, a lightweight command-line JSON processor.

Anyway, after I tackled my task. I was curious about whether I could quickly populate a basic table/column file from the schema with little effort. Truthfully, I have to give more credit to the #jq IRC channel than myself.

I thought I'd share in case it helps anyone.

Here's a workable example on jq play 

#!/bin/bash
# requires https://stedolan.github.io/jq

wget "https://portal.inshosteddata.com/api/schema/latest" -O "latest.json"
jq '.schema | to_entries[] | "\(.value.tableName)", (.value.columns[] | " \(.name) \(.type), -- \(.description)")' "latest.json" >> schema.latest.txt
course_dim
id bigint, -- Unique surrogate id for a course
canvas_id bigint, -- Primary key for this course in the canvas courses table.
root_account_id bigint, -- The root account associated with this course.
account_id bigint, -- The parent account for this course.
enrollment_term_id bigint, -- Foreign key to enrollment term table
name varchar, -- The friendly name of the course.
code varchar, -- The code for the course (e.g. FA12 MATH 2000)
type varchar, -- [Deprecated] No longer used in Canvas, will always be NULL.
created_at timestamp, -- Timestamp when the course object was created in Canvas
start_at timestamp, -- Timestamp for when the course starts.
conclude_at timestamp, -- Timestamp for when the course finishes
publicly_visible boolean, -- True if the course is publicly visible
sis_source_id varchar, -- Correlated id for the record for this course in the SIS system (assuming SIS integration is configured)
workflow_state varchar, -- Workflow status indicating the current state of the course, valid values are: completed (course has been hard concluded), created (course has been created, but not published), deleted (course has been deleted), available (course is published, and not hard concluded), claimed (course has been undeleted, and is not published).
wiki_id bigint, -- Foreign key to the wiki_dim table.
syllabus_body text, -- Content of the syllabus for this course.
account_dim
id bigint, -- Unique surrogate id for an account
canvas_id bigint, -- Primary key for this entry in the Canvas accounts table
name varchar, -- Name of the account
depth int, -- Depth of the account in the hierarchy. The root node is at 0.
workflow_state varchar, -- Workflow status indicating that account is [deleted] or [active]
parent_account varchar, -- Name of this account's parent account. This field will be NULL for the root account.
parent_account_id bigint, -- Id of this account's parent account. This field will be NULL for the root account.
grandparent_account varchar, -- Name of this account's grand parent account. This field will be NULL for the root account and all accounts at level 1.
grandparent_account_id bigint, -- Id of this account's grand parent account. This field will be NULL for the root account and all subaccounts at level 1.‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

VSCode, has a nice RegEx replace feature.

308313_vs-code-regex-replace.png