Issue with Downloading Discussion Posts/Replies

mk4261
Community Novice

I am having trouble downloading discussion board posts for the current semester, Summer 2020. I need to calculate the average post length for different discussion topics within different courses. For example, I have a discussion topic with 33 posts/replies, but when I download the data and query by discussion topic (joining discussion_entry_dim and discussion_entry_fact), only 5 posts appear (2 that were deleted and 3 that are active). I downloaded the Canvas data via a batch file developed by our data scientist and uploaded it to an AWS server to query the data. This same process has worked before with all discussion posts in Canvas appearing in the query for another discussion in March 2020. I tried another route of downloading, unpacking, and opening the discussion_entry_dim and discussion_entry_fact tables in Excel using the Canvas LTI Tool and only the same 5 posts appeared again. I also searched a couple random posts using wildcards (both within mySQL and Excel) to see if maybe the ID's were incorrect in the tables and I cannot find those posts within the data files. Do you have any idea where these posts could be stored? I would rather use a more robust method than just copying and pasting all the discussion posts. Thank you.

mySQL script for specific discussion topic ID: 

SELECT * FROM DISCUSSION_ENTRY_DIM INNER JOIN
DISCUSSION_ENTRY_FACT ON DISCUSSION_ENTRY_DIM.id=DISCUSSION_ENTRY_FACT.discussion_entry_id
WHERE topic_id LIKE "%XXXXX";

mySQL script for trying to find discussion post message:

SELECT * FROM DISCUSSION_ENTRY_DIM INNER JOIN
DISCUSSION_ENTRY_FACT ON DISCUSSION_ENTRY_DIM.id=DISCUSSION_ENTRY_FACT.discussion_entry_id
WHERE message LIKE "%INSERT QUOTE HERE%";