We are working with the canvasCLI tool to integrate Canvas activity into our data warehouse.
We have successfully joined our student SIS data to Canvas using SIS_USER_ID in pseudonym_dim. However, when we try to get the GLOBAL_CANVAS_ID for those users from user_dim, there are no records for about half of them.
I'm joining on pseudonym_dim.user_id = user_dim.id.
As additional evidence of a problem with missing data, we have roughly 440k unique records in pseudonym_dim, and only 220k unique records in user_dim.
Is there a reason some users would not be included in user_dim? We have good evidence that the users missing from user_dim are students who are enrolled in courses.
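To make the join problem concrete, here is a minimal sketch of how the unmatched rows could be listed. It uses an in-memory SQLite database as a stand-in for the warehouse, models only the join columns named above, and the IDs and SIS values are made-up toy data, so treat it as an illustration rather than our actual schema or query.

```python
import sqlite3

# Toy stand-ins for the two tables; only the join columns are modelled,
# and every value here is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_dim (id INTEGER PRIMARY KEY);
    CREATE TABLE pseudonym_dim (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        sis_user_id TEXT
    );
    INSERT INTO user_dim (id) VALUES (101), (102);
    INSERT INTO pseudonym_dim (id, user_id, sis_user_id) VALUES
        (1, 101, 'S001'), (2, 102, 'S002'),
        (3, 103, 'S003'), (4, 104, 'S004');
""")

# Anti-join: keep pseudonym rows whose user_id finds no partner in user_dim.
orphans = conn.execute("""
    SELECT p.sis_user_id, p.user_id
    FROM pseudonym_dim AS p
    LEFT JOIN user_dim AS u ON p.user_id = u.id
    WHERE u.id IS NULL
    ORDER BY p.id
""").fetchall()

print(orphans)  # [('S003', 103), ('S004', 104)]
```

Looking at the user_id values that come back from a query like this is a quick way to spot whether the missing rows share a pattern.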
Just to update followers, the issue was actually that our warehousing software was scrubbing records with negative IDs, which had the effect of removing approximately half of the rows.
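For anyone who wants to check for the same thing, a rough sketch of the comparison is below. It assumes you can pull the ID column both from the raw export and from the loaded warehouse table; the function name `summarize_missing` and the sample IDs are hypothetical, not part of any Canvas tooling.

```python
def summarize_missing(source_ids, loaded_ids):
    """Count IDs present in the source but absent after loading,
    and how many of those missing IDs are negative."""
    missing = set(source_ids) - set(loaded_ids)
    negative = sum(1 for i in missing if i < 0)
    return {"missing": len(missing), "missing_negative": negative}


# Toy data: the loader kept only the non-negative IDs.
source_ids = [-4, -3, 1, 2]
loaded_ids = [1, 2]

summary = summarize_missing(source_ids, loaded_ids)
print(summary)  # {'missing': 2, 'missing_negative': 2}
```

If the two counts match, every dropped row had a negative ID, which points at the load step rather than at the source data.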
We learn something new every day!