We are using the canvasCLI tool to integrate Canvas activity into our data warehouse.
We have successfully joined our student SIS data to Canvas using SIS_USER_ID in pseudonym_dim. However, when we try to get the GLOBAL_CANVAS_ID for those users from user_dim, about half of them have no matching record.
I'm joining on pseudonym_dim.user_id = user_dim.id.
As additional evidence of a problem with missing data, we have roughly 440k unique records in pseudonym_dim, and only 220k unique records in user_dim.
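To make the check concrete, here is a minimal sketch of the join and the count comparison, run against an in-memory SQLite database rather than our actual warehouse. The table and column names follow the ones mentioned above; the rows, the toy ID values, and the helper function names are hypothetical and only illustrate the pattern. It also counts distinct user_id values in pseudonym_dim rather than rows, on the assumption that one user may have more than one pseudonym record.

```python
import sqlite3


def find_unmatched_pseudonyms(conn):
    """Return (sis_user_id, user_id) for pseudonym rows with no user_dim match."""
    query = """
        SELECT p.sis_user_id, p.user_id
        FROM pseudonym_dim AS p
        LEFT JOIN user_dim AS u ON p.user_id = u.id
        WHERE u.id IS NULL
        ORDER BY p.sis_user_id
    """
    return conn.execute(query).fetchall()


def count_distinct_users(conn):
    """Return (distinct user_ids in pseudonym_dim, distinct ids in user_dim)."""
    pseudonym_users = conn.execute(
        "SELECT COUNT(DISTINCT user_id) FROM pseudonym_dim"
    ).fetchone()[0]
    users = conn.execute("SELECT COUNT(DISTINCT id) FROM user_dim").fetchone()[0]
    return pseudonym_users, users


# Hypothetical toy data: four pseudonyms, but only two users present in user_dim.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_dim (id INTEGER PRIMARY KEY, global_canvas_id INTEGER);
    CREATE TABLE pseudonym_dim (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        sis_user_id TEXT
    );
    INSERT INTO user_dim VALUES (1, 1001), (2, 1002);
    INSERT INTO pseudonym_dim VALUES
        (10, 1, 'S001'), (11, 2, 'S002'), (12, 3, 'S003'), (13, 4, 'S004');
    """
)

unmatched = find_unmatched_pseudonyms(conn)
counts = count_distinct_users(conn)
print(unmatched)
print(counts)
```

The LEFT JOIN with the IS NULL filter lists exactly the pseudonym rows that have no user_dim record, which makes it easier to look for a pattern among the missing users.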
Is there a reason some users would not be included in user_dim? We have good evidence that the missing users are students who are enrolled in courses.