Redshift vs. Canvas Analytics

mlewis23
Community Champion

I'm trying to reconcile two different data sources for the Canvas numbers: Redshift and the Canvas Analytics page.  I looked at the Canvas Analytics for a sub-account (department in our case) and term (Winter 2014). 

The number of Discussion Topics  on the Analytics page for this dept/term is 479.

Using Redshift the numbers I'm able to get are Discussion Topics 31 and distinct count of Messages is 635.

Here is the query I've pulled out of Tableau, gives the same results when running it through Workbench:

SELECT enrollment_term_dim.name AS none_name__enrollment_term_dim__nk,

  enrollment_term_dim.sis_source_id AS none_sis_source_id__enrollment_term_dim__nk,

  account_dim.subaccount1 AS none_subaccount1_nk,

  COUNT(1) AS cnt_number_of_records_ok,

  COUNT(DISTINCT discussion_topic_dim.id) AS ctd_id__discussion_topic_dim__ok

FROM public.discussion_entry_dim discussion_entry_dim

  INNER JOIN public.discussion_entry_fact discussion_entry_fact ON (discussion_entry_dim.id = discussion_entry_fact.discussion_entry_id)

  INNER JOIN public.discussion_topic_dim discussion_topic_dim ON (discussion_entry_fact.topic_id = discussion_topic_dim.id)

  INNER JOIN public.discussion_topic_fact discussion_topic_fact ON (discussion_topic_dim.id = discussion_topic_fact.discussion_topic_id)

  INNER JOIN public.course_dim course_dim ON (discussion_entry_fact.course_id = course_dim.id)

  INNER JOIN public.enrollment_term_dim enrollment_term_dim ON (course_dim.enrollment_term_id = enrollment_term_dim.id)

  INNER JOIN public.account_dim account_dim ON (course_dim.account_id = account_dim.id)

WHERE ((enrollment_term_dim.name = 'Winter Quarter 2014') AND (discussion_topic_dim.workflow_state = 'active'))

GROUP BY 1,

  2,

  3

Note: there is no change in the data if discussion_topic_dim.workflow_state = 'active' is removed, this also seams odd