Visualize Discussions - Threadz

Learner II

At Eastern Washington University we've built an LTI called Threadz that is open source and available for anyone to install at their own institution. Threadz is a discussion visualization tool that adds graphs and statistics to Canvas discussions.

Online discussions provide valuable information about the dynamics of a course and its constituents. Much of this information is found within the content of the posts, but other elements are hidden within the social network connection and interactions between students and between students and instructors. Threadz is a tool that extracts this hidden information and puts it on display.

The visual representations created from social network connections and interactions between students and instructors in a discussion assist in identifying specific behaviors and characteristics within the course, such as learner isolation, non-integrated groups, instructor-centric discussions, and key integration (power) users and groups. By identifying these behaviors and characteristics, the instructor can effect change in these interactions to help make the discussions and classroom discourse more accessible to all.

More information can be found on the Threadz website.

The files for the LTI can be found on GitHub at: mcjelewis/threadz · GitHub


Community Team

What a fascinating tool! Am I correct in my assumption that only instructors would see these visualizations?

Community Coach
Community Coach, this is a pretty cool tool! I checked out the website and video and was wondering if there should have been sound with the video? Just wondering because it was sometimes hard to understand what was being shown or the significance of what was being shown in the video. The visuals tab (with explanation) helped, but I'm wondering if there's any additional discussion/explanation of the different parts of Threadz?

Thanks and thanks for sharing this!!

Learner II

Threadz will open for a discussion for anyone who has permissions to that discussion, so both students and teachers have access to it. This can of course be changed in the LTI config.xml file if you want different behavior.


Awesome job​!

This looks like some of the stuff I wanted to do with a Google Spreadsheet I was working on for analyzing discussions (not my one that analyzes all discussions) except that it's a visual and mine was essentially just data. I was working on grabbing the like information and the number of unread posts so you could see how the groups fell into place by likes (rather than just by responses), but that requires the ability to masquerade and involves an API call for each student participating. That severely limits who can run it and its usefulness.

Your product is much slicker and by making it an LTI, you get to control the interface as well. I've also been eyeing D3 for years and have never really been able to wrap my brain around it, so this is impressive on all accounts.

Thanks so much for putting this out there.

Learner II


I too originally started pulling all the data from discussions into a Google Spreadsheet.

Within Threadz there is a 'Statistics' tab that shows some of the basic counts (posts and word counts by user or thread).  I haven't yet explored the 'like' and 'unread' data that Canvas provides.  That is going to be an interesting idea to explore.  Maybe a portion of Threadz could be restricted by instructor role so more information can be displayed.

Community Team

As always Matt, I am in awe of your brilliance!  Strong Work!  Thank you so much for sharing!


The likes and unreads are going to be a challenge since it currently requires masquerading privileges to get at that information. I've got admin rights, so I can do it, but most teachers don't. Of course you're more familiar with it than I am, but I don't see where adding a section that only teachers can view would be beneficial. You know what decisions you went through when developing this, but it looked like everything you're displaying was already available on the page that anyone with view permissions could obtain (there could be a problem with duplicate names if you just read from the screen instead of obtaining through the API). I think you've got about everything that can be obtained by a normal user.

I saw the statistics page in the video. Although there didn't seem to be an export to .CSV button, those who want to use that data in a spreadsheet can select/copy or use a browser plugin that will copy a table for you.

I have Google Sheets code that currently isn't functional (it was but I decided to clean up the code and that broke it). It wrote an entry for each discussion post that you could then analyze using pivot tables. It did the word counts and also provided the first 50 (configurable) characters of a post as part of the data so the instructor had some way to find it. One way I envisioned it working is where a teacher and an admin could team up on the Sheet. The admin could go in, fetch the data using masquerading, and then share it with the teacher. As long as the teacher didn't decide to re-fetch the data and overwrite what was there, the information the admin gathered would be available.

Of course, a better approach from an admin perspective would be to write some kind of database archiving that kept a lot of this information and then write an interface that allowed faculty to pull up their own courses. Maybe it could be done through an LTI like you have so it was all internal to Canvas. I've not worked with developer keys, although I've written LTIs that pull information from external databases like I'm describing here. Canvas Data provides some of the information, but it didn't have anything on likes or unreads. It also provides the length of the message, but not the word count. I had stripped out html tags and hyperlinks when counting words and providing the first 50 characters.
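For anyone wanting to reproduce the word count and preview described above, here is a minimal Python sketch. It is not the Google Sheets code itself; the function names and the regex-based tag stripping are assumptions made for illustration, and a real HTML parser would be more robust on messy markup.

```python
import re
from html import unescape

# Simple patterns; good enough for a sketch, not for arbitrary HTML.
TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"https?://\S+")


def plain_text(message_html):
    """Strip HTML tags and bare hyperlinks, then collapse whitespace."""
    text = TAG_RE.sub(" ", message_html)   # drop tags, keep their inner text
    text = URL_RE.sub(" ", text)           # drop URLs left in the text
    text = unescape(text)                  # turn &amp; etc. back into characters
    return " ".join(text.split())


def summarize_post(message_html, preview_chars=50):
    """Return the word count and the first `preview_chars` characters of a post."""
    text = plain_text(message_html)
    return {"word_count": len(text.split()), "preview": text[:preview_chars]}


if __name__ == "__main__":
    sample = '<p>Hello <a href="http://example.com">http://example.com</a> world &amp; all</p>'
    print(summarize_post(sample))
```

The preview length is a parameter so it stays configurable, as in the spreadsheet version.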

Learner II

The export to CSV has been in my sights for a while; I just haven't gotten to it yet. I also need to output the graphs better. Currently the CSS stylesheet isn't being applied to the saved image, so while functional, it is not pretty.

Community Coach

Just like Kona, I got no audio track with the screencast.

This would be most helpful in understanding the resource.


Learner II

Yes, sorry about that. I didn't record any audio with the screencast. I just made a quick demo reel of what the pages and graphs look like. I'll work on getting an updated video.

Community Coach

Seeing the demo was pretty terrific, but hearing you talk through it would be AWESOME!

Learner II​,

I've made the statistics table data downloadable and also added the ability to download a full SNA dataset of the discussion. Changes have been made to the Threadz GitHub repository.

Thanks for the suggestion.



I threw that out as a work-around because, as a programmer wanna-be, I know that sometimes it can be difficult to find time to work on projects -- not to egg you into implementing it. But having it implemented in the code is definitely sweeter.

Learner II

I'd been planning on making the full SNA dataset available, but I needed that little nudge to get me to set that time aside.

Learner II​ I started to look at the read/unread and liked data coming out of the discussion API and have a couple questions you might have already explored.  Do you know the difference between 'rating_count' and 'rating_sum'?  When would these be different?

Also, even though I'm an admin, as a teacher in a course the data that is returned is specific to my role as teacher, so unread and likes are specific to me. But that would change if I'm not enrolled in the class. Does that sound correct?

Thanks for your thoughts.



I have a hunch about rating_count and rating_sum, although I hadn't verified it until just now. When I was looking at the entry_ratings value that was returned with the full view, I did see a difference.

"entry_ratings": {
   "17972826": 1,
   "17985232": 0,
   "17992501": 1,
   "18032556": 1,
   "18032897": 1,
   "18066892": 1,
   "18067503": 1,
   "18103759": 1,
   "18103855": 1,
   "18104221": 1
}

Notice that 17985232 has a rating of 0.

When you go to that post, you see this (among other things)

"rating_count": 1
"rating_sum": 0

That was a message that I had liked and then decided that I didn't like it so much and so I unliked it. It increments the count, but not the sum. So, if you're trying to get a count of the number of likes, I would use rating_sum rather than rating_count.
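To make the distinction concrete, here is a small Python sketch that totals both fields over a list of entries. The function name and the sample data are made up for illustration; only the rating_count and rating_sum field names come from the API output quoted above.

```python
def like_totals(entries):
    """Return (current_likes, rating_events) summed over discussion entries.

    rating_sum reflects likes that are still in place; rating_count also
    includes ratings that were later withdrawn, per the behavior seen above.
    """
    likes = sum(entry.get("rating_sum") or 0 for entry in entries)
    events = sum(entry.get("rating_count") or 0 for entry in entries)
    return likes, events


if __name__ == "__main__":
    # Hypothetical entries: the second one was liked and then unliked.
    sample_entries = [
        {"id": 1, "rating_count": 2, "rating_sum": 2},
        {"id": 2, "rating_count": 1, "rating_sum": 0},
        {"id": 3},  # never rated, so the fields may be missing
    ]
    print(like_totals(sample_entries))
```

A gap between the two totals would indicate ratings that were given and then taken back.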

For the second question, the unread_entries, forced_entries, and entry_ratings are specific to the user making the API call. So, if you were not in the class, they would be empty. If you were a teacher who didn't participate in their own discussions, they would be empty. The first two are arrays that contain entry_ids, the entry_ratings is an object containing an entry_id and the rating as shown above.

Since I'm an admin, I'm able to iterate through the class and obtain the ratings and unread information for everyone in the class. I use the as_user_id= parameter on the query string to do that. Most faculty probably do not have masquerading permissions, so they wouldn't be able to obtain useful statistics for their entire course with their access token. Since I'm setting mine up in Google Sheets, you could have an admin download the information and grant the faculty access to it. But it really sounds like a more robust tool would preclude its use by the common folks and require an admin to set up a local database to gather the information and then make it available to the faculty. Either that or we just forget about unreads and likes given for the normal user.
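As a rough illustration of that iteration, the Python sketch below only builds the request URLs, one per student, and leaves the actual HTTP calls and access token handling out. The host name, IDs, and function names are placeholders, and it assumes the full topic view endpoint is the one being queried.

```python
from urllib.parse import urlencode


def full_view_url(base_url, course_id, topic_id, as_user_id=None):
    """Build the URL for a discussion topic's full view, optionally masquerading."""
    path = f"/api/v1/courses/{course_id}/discussion_topics/{topic_id}/view"
    url = base_url.rstrip("/") + path
    if as_user_id is not None:
        # as_user_id only works for callers with masquerading permission.
        url += "?" + urlencode({"as_user_id": as_user_id})
    return url


def per_user_urls(base_url, course_id, topic_id, user_ids):
    """Map each user id to the URL that returns that user's unread/rating data."""
    return {uid: full_view_url(base_url, course_id, topic_id, uid) for uid in user_ids}


if __name__ == "__main__":
    # Placeholder host and ids, for illustration only.
    for uid, url in per_user_urls("https://canvas.example.edu", 101, 2002, [11, 12]).items():
        print(uid, url)
```

This is one request per student per discussion, which is why it gets slow for large classes.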

The unread_entries array may be misleading anyway. My class told me that when they use the mobile app, it doesn't automatically mark them as read like it does within a browser. As we all know, scrolling slowly enough that Canvas automarks them is not the same as actually reading them. And then there is the "Mark all as read" button under the cog that students could use without having to read any of them. I kind of took it that if you had over 50% of the messages unread, you probably weren't really participating in the discussion. But I would not say that someone with 0 unreads actually read them all.
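That rule of thumb could be written down as something like the following Python sketch. The function name, labels, and the default threshold are my own choices for illustration; note that it deliberately never claims a student read everything.

```python
def participation_flag(unread_entries, total_entries, threshold=0.5):
    """Classify a student from their unread_entries list for one discussion.

    More than `threshold` of the entries unread suggests the student is not
    really participating. A low unread count proves nothing, since entries
    can be marked read without being read.
    """
    if total_entries == 0:
        return "no posts"
    unread_fraction = len(unread_entries) / total_entries
    if unread_fraction > threshold:
        return "likely not participating"
    return "inconclusive"


if __name__ == "__main__":
    print(participation_flag([101, 102, 103], total_entries=4))
    print(participation_flag([], total_entries=4))
```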

Since we're sharing what we're working on ...

I was looking at getting the interactions from the User Access Report (internally called Roster User Usage), which has the number of times an asset was viewed and how many participations occurred, and adding that to my spreadsheet. From what I can tell, that information is not quickly available through the API. It is accessed through /courses/:course_id/users/:user_id/usage but without the /api/v1 in front of it. If you make it usage.json, you can get just the data in a usable form, but you would still have to iterate through all the students to get it. However, since it's not part of the API, you would have to be logged in to Canvas and run it from the browser as the user.

I could write a GreaseMonkey script that would grab every student's access report and then export it to a .CSV file, but I'm not sure how to automatically get it into the Google Sheet I'm working on.

I'm not sure how you would handle that with the LTI application or if you had even thought about going that route. As an admin, I can grab a list of page views for each student through the API, but that's going to take longer than people want to wait (there are a lot of page views) and I might run into the 5 minute script execution limit on a Google Sheet. It might be worthwhile to tie it into our Canvas Data system (which I'm still setting up) and then that information would be available but not current. I just thought knowing how many times the student even viewed the discussion would be helpful for my analysis of what grade to give them.

Learner II​, first off, thank you.  You always provide very thoughtful and thorough responses. 

Great catch on the 'unlike'.  That would have taken me a while to stumble upon.

While not ideal, the user-level data will, I think, work for part of what I'm looking for. It doesn't give any data for more research-type analytics, but it does provide the ability to customize the network display to the individual when they access Threadz. Not so much a comparison to other students in the class, but potentially showing some useful information for themselves.

One possibility to get to the higher level analytic data could be to have the LTI look at the role of the user. If role='Admin' then run the higher level analytics cycling through the API calls with 'as_user_id' (and again thank you, I was unaware of that parameter).
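A rough Python sketch of that role check, just to show the shape of the idea. Threadz itself runs on PHP, so this is not its code. It assumes the launch passes a comma-separated roles string in which an admin shows up as a value ending in "Administrator"; the exact values your Canvas instance sends should be checked before relying on this.

```python
def is_admin_launch(roles_param):
    """Return True if any role in a comma-separated roles string is an administrator.

    Accepts short names ("Administrator") as well as longer identifiers
    whose last path segment is "Administrator".
    """
    roles = [role.strip() for role in roles_param.split(",") if role.strip()]
    return any(role.rsplit("/", 1)[-1] == "Administrator" for role in roles)


def analytics_mode(roles_param):
    """Pick which analytics to run for this launch (labels are hypothetical)."""
    return "course-wide (as_user_id loop)" if is_admin_launch(roles_param) else "current user only"


if __name__ == "__main__":
    print(analytics_mode("Instructor,urn:lti:instrole:ims/lis/Administrator"))
    print(analytics_mode("Learner"))
```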

All great stuff.  Thanks again.


One of my faculty just brought Threadz to my attention. While the demo seems pretty interesting, does anyone have any experience in its usage on active courses? Is it pretty much ready right out of the box, or will it require a bunch of customization work? Mostly, as the installation of Threadz requires more time and money than a traditional LTI, I wanted to get a bit more info before I get too far into this project.

Learner II​, Threadz does require some customization. There are a couple of places on the launch page that require some specific information about your institution, and you will need to set up your own Developer Key on your Canvas installation. What I'm finding so far from those who have installed Threadz is that the most difficult part of the setup is getting the PHP server environment established. But once that gets created there isn't much other customization.

The reason we went with the "self-hosted" version of the LTI is because access tokens and private course content are being passed into the tool.  By being self-hosted we removed any FERPA issues of sharing content.

Learner II

We just released a new version of Threadz. Besides some boring back-end upgrades, a new set of network diagrams has been added as well as a list of all non-participating students for each discussion.

I've recorded a new demo video (with sound this time) and uploaded some new images that can be found at: Threadz