Why are there NULL user_ids in discussion_topic_fact?
Sharing some findings from testing we did to explore this question.
One of the schools we work with wanted to create a visual showing discussion activity, with details about who originally created each topic or thread. They discovered that a large number of discussion topics have a NULL user_id, which means there is no way to tell from the data who created the topic in the course.
Background on Discussion Topics and Entries
In Canvas, discussion topics are tied to discussion entries. Descriptive information about discussion topics, including the topic title, can be found in discussion_topic_dim. In addition, discussion_topic_fact contains identifiers for key related information, such as the person who created the thread (user_id) and the person who edited the thread (editor_id).
Information about discussion entries, or replies, can be found in the discussion_entry_fact table. It contains one row for each entry or reply. The person who created the associated topic for that entry is listed as topic_user_id, and the person who edited it is listed as topic_editor_id.
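To make the table relationships concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table and column names follow the ones mentioned above, but the join key (discussion_topic_id matching the dim table's id) is an assumption you should check against your own schema, and all rows are made-up sample data.

```python
import sqlite3

# In-memory stand-in for the two topic tables; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE discussion_topic_dim (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE discussion_topic_fact (
        discussion_topic_id INTEGER,  -- assumed to reference discussion_topic_dim.id
        user_id INTEGER,              -- creator of the topic (may be NULL)
        editor_id INTEGER             -- editor of the topic (may be NULL)
    );
    INSERT INTO discussion_topic_dim VALUES
        (1, 'Week 1 introductions'),
        (2, 'Imported case study');
    INSERT INTO discussion_topic_fact VALUES
        (1, 501, NULL),
        (2, NULL, 777);
""")

# List each topic title with its creator and editor identifiers.
topics = conn.execute("""
    SELECT d.title, f.user_id, f.editor_id
    FROM discussion_topic_dim AS d
    JOIN discussion_topic_fact AS f ON f.discussion_topic_id = d.id
    ORDER BY d.id
""").fetchall()

for title, user_id, editor_id in topics:
    print(title, user_id, editor_id)
```

In this sample, the second topic comes back with None as its user_id, which is how a NULL creator shows up on the Python side.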
Discussion topics can end up in a course in several ways:
1. Included in a course imported through a common cartridge import
2. Included in a course imported through Commons
3. Inherited through case 1 or 2 above, then copied into a new course
4. Newly created directly in the course
What we found
For cases 1-3 above, topic_user_id is NULL for those discussion topics.
This makes it difficult to trace discussion activity all the way back to the person who originally created the topic.
We also found that if the instructor imported the topic and then edited it, the correct user_id is logged as the topic_editor_id in the discussion_entry_fact table (which corresponds to the editor_id in discussion_topic_fact).
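As a rough illustration of how you might look for these cases, the sketch below finds entries whose topic has no recorded creator but does have a recorded editor. It uses Python's sqlite3 module with invented sample rows. The topic_id and discussion_entry_id column names are assumptions to verify against your schema. Note that the editor is only the person who edited the topic, not necessarily its original author.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE discussion_entry_fact (
        discussion_entry_id INTEGER,  -- assumed column name
        topic_id INTEGER,             -- assumed column name
        user_id INTEGER,              -- author of the reply
        topic_user_id INTEGER,        -- creator of the topic (may be NULL)
        topic_editor_id INTEGER       -- editor of the topic (may be NULL)
    );
    -- Made-up rows: topic 10 was created in the course, topic 20 was imported
    -- and later edited, topic 30 was imported and never edited.
    INSERT INTO discussion_entry_fact VALUES
        (1, 10, 900, 501, NULL),
        (2, 20, 901, NULL, 777),
        (3, 20, 902, NULL, 777),
        (4, 30, 903, NULL, NULL);
""")

# Topics with a NULL creator but a known editor, with their reply counts.
edited_imports = conn.execute("""
    SELECT topic_id, topic_editor_id, COUNT(*) AS replies
    FROM discussion_entry_fact
    WHERE topic_user_id IS NULL
      AND topic_editor_id IS NOT NULL
    GROUP BY topic_id, topic_editor_id
    ORDER BY topic_id
""").fetchall()

print(edited_imports)
```

With this sample data, only topic 20 is returned, along with editor 777 and its two replies. Topic 30 has neither a creator nor an editor, so it cannot be traced to anyone this way.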
In addition, as expected, discussion topics that were newly created in the course (not inherited) correctly reflected the user who created the post.
The school has decided to attribute all NULL topic_user_id values to instructors (as opposed to students). This enables their visualization to distinguish between discussion activity in response to instructor-initiated threads and activity in response to student-initiated threads.
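A minimal sketch of that attribution rule in Python is shown here. This is not the school's actual implementation. The INSTRUCTOR_IDS set is a hypothetical lookup (in practice you would derive it from enrollment data), and the topic_user_id values are made up.

```python
from collections import Counter

# Hypothetical: user_ids known to be instructors in the course.
INSTRUCTOR_IDS = {777}


def initiator(topic_user_id):
    """Label a topic as instructor- or student-initiated.

    A NULL (None) creator is attributed to an instructor, following the
    rule described above.
    """
    if topic_user_id is None or topic_user_id in INSTRUCTOR_IDS:
        return "instructor"
    return "student"


# Made-up topic_user_id values, one per discussion entry (reply).
entry_topic_user_ids = [None, None, 777, 501, 501, None]

# Count replies by who initiated the thread they belong to.
replies_by_initiator = Counter(initiator(uid) for uid in entry_topic_user_ids)
print(replies_by_initiator)
```

Here four replies are counted under instructor-initiated threads (three of them because the creator is NULL) and two under student-initiated threads.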
To clarify: we don't think this behavior is necessarily a bug. For cases where the discussion topic was not created within the course in question, it makes some sense that the topic_user_id is NULL. But we think it's important to be aware of this when trying to attribute topics to users for reporting purposes.