Instructors who have been using Canvas for a long time haven't necessarily been responsible for optimizing files. Uploaded files such as images and other instructional content that should have been part of a course's file storage area seem to stick around long after a course is over. Is there a way to get a report of when a file in one's user space is ready to be deleted? I have an instructor who has blown up their personal storage space by quite a bit and needs a fresh start, as they cannot embed any more images in discussion posts. We need a strategy for deciding which files are ready to be deleted based on usage. This information will help me figure out where to begin.
I understand what you're trying to do, but am missing your definition of when a file is ready to be deleted. Is it when the file is no longer used in any current course? When the file hasn't been used in a course for a fixed period of time? Do you delete course content after a certain period of time?
The next question is what tools you have available. Do you have access to Canvas Data 2, only the REST API, just the web interface???
Some people have said that CIDI labs has a tool with file management capability, but I'm not familiar with it. It would still need some way of knowing what is no longer needed (unless perhaps you delete course content and so the file is not used anywhere).
I would do a little preliminary work in the Web UI.
I would start with the Account > Files > My Files > conversation attachments and make sure those are cleared out as much as possible. If your instructor sent a lot of attachments to students (I've had faculty send their notes via attachments rather than posting them in Canvas), that can definitely fill up quickly and they likely aren't needed anymore. I've also had faculty provide media comments to their students and had that fill up their quota really quickly. Even though that counted against their quota, it didn't stop them from recording more media, but it might have kept them from uploading other things (I didn't check, and this was several years ago).
Then we move to the Account > Files > My Files, which contains all of the images the user has uploaded. You can sort by date to get the oldest files first. That may be enough to jog the instructor's mind of "I don't need that" or "I still need that".
Determining where a file is used becomes tricky. There's no API for that.
This is where Canvas Data 2 would be beneficial.
The file information is stored in the attachments table. For example, this will give me all of the files that I've uploaded. It takes a while because none of the columns are indexed, but it comes back with a list of 295 files that I've uploaded to my personal space. Note that my DAP client prefixes the table name with the namespace, so the table's name for me is canvas__attachments.
SELECT *
FROM canvas__attachments
WHERE context_type = 'User'
AND context_id = 2175488;
You probably don't need to worry about files that are already deleted so you could check the file_state. This brings me down to 39 files.
SELECT *
FROM canvas__attachments
WHERE context_type = 'User'
AND context_id = 2175488
AND file_state = 'available';
Some of the properties include size and modified_at. This could allow you to search for large files or old files (or some combination of both).
In particular, I found a file called "health_map.PNG" that is used in a discussion. Its file_id (attachment_id in CD2) is 133128306.
Now there is a really tempting table called attachment_associations. It "Links user files to an assignment to allow grader to see the student’s submission." However, it does not link to discussions, only conversation messages, submissions, courses, or groups. The attachment_associations table has an attachment_id column, but because my file is only used in discussions, it isn't in that table.
That means that we need to look through the discussions. In a discussion entry, the image might be part of an attachment or it might be part of the message. The message has type bounded_str. For MySQL, that comes through as MEDIUMTEXT, which is capped at 16 MB. If the link appears after 16 MB, it may be difficult to find. Hopefully, most professors don't post messages that long.
Just searching for the file (attachment) ID in the discussion is going to take a while. For me, it took 28.6 s (my MySQL interface times out after 30 seconds).
SELECT *
FROM canvas__discussion_entries
WHERE message LIKE '%133128306%';
Since the discussion_entries table has a user_id column and it's unlikely (potentially possible, though) that anyone other than me would be posting using my file, I could specify that in the search to speed it up. I also want to make sure the discussion is still active and that I'm not getting a partial match on the file ID (if the ID was 1234, I want to make sure I don't pick up 12345 or 61234). This took my request down to 8.5 s (although caching lowered the original request to 19.6 s, so it's not as much of an improvement as indicated, but still faster).
SELECT *
FROM canvas__discussion_entries
WHERE user_id = 2175488
AND workflow_state = 'active'
AND message LIKE '%/files/133128306/%';
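To see why the slashes around the ID matter, here is a small sketch using Python's built-in sqlite3 module rather than MySQL. The table is a stripped-down stand-in for discussion_entries and the four messages are made up; only the two LIKE patterns mirror the queries above.

```python
import sqlite3

def matching_entry_ids(conn, pattern):
    """Return ids of discussion entries whose message matches a LIKE pattern."""
    rows = conn.execute(
        "SELECT id FROM discussion_entries WHERE message LIKE ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in rows]

# Toy table with hypothetical messages; real CD2 rows have many more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE discussion_entries (id INTEGER PRIMARY KEY, message TEXT)")
conn.executemany(
    "INSERT INTO discussion_entries (id, message) VALUES (?, ?)",
    [
        (1, '<img src="/users/1/files/1234/preview">'),   # the file we want
        (2, '<img src="/users/1/files/12345/preview">'),  # longer ID, same prefix
        (3, '<img src="/users/1/files/61234/preview">'),  # longer ID, same suffix
        (4, "<p>No images here.</p>"),
    ],
)

loose = matching_entry_ids(conn, "%1234%")           # bare ID
strict = matching_entry_ids(conn, "%/files/1234/%")  # ID anchored by slashes
print(loose)   # [1, 2, 3]
print(strict)  # [1]
```

The bare ID picks up both of the longer IDs, while the anchored pattern only matches the intended file.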
To check for attachments of the file, you could add the ID there.
SELECT *
FROM canvas__discussion_entries
WHERE user_id = 2175488
AND workflow_state = 'active'
AND (message LIKE '%/files/133128306/%'
OR attachment_id = 133128306);
You probably want to combine the two queries together, rather than searching for a specific attachment ID and repeating the query.
Here is some MySQL using common table expressions to accomplish this. I set the user_id once at the top so I didn't have to specify it multiple places.
-- Set the user ID once so it isn't repeated below.
SELECT 2175488 INTO @user_id;

WITH cte1 AS (
  -- Available files in the user's personal space, plus the LIKE pattern for each
  SELECT
    context_id AS user_id,
    id AS attachment_id,
    filename,
    size,
    COALESCE(modified_at, created_at) AS modified_at,
    CONCAT('%/files/', id, '/%') AS link_text
  FROM canvas__attachments
  WHERE context_type = 'User'
    AND context_id = @user_id
    AND file_state = 'available'
),
cte2 AS (
  -- Count the active discussion entries that embed or attach each file
  SELECT
    a.attachment_id,
    COUNT(1) AS n
  FROM cte1 a
  JOIN canvas__discussion_entries de
    ON de.user_id = a.user_id
  WHERE de.workflow_state = 'active'
    AND (de.message LIKE a.link_text
      OR de.attachment_id = a.attachment_id)
  GROUP BY a.attachment_id
)
-- Keep only the files with no matching discussion entry
SELECT
  a.attachment_id,
  a.filename,
  a.size,
  a.modified_at
FROM cte1 a
LEFT JOIN cte2 b USING (attachment_id)
WHERE b.n IS NULL;
This gives me a list of 28 files that are not used in discussions.
However, some of them are used for profile pictures, conversation attachments, and other places. I would need additional queries to make sure that they weren't found in places I wanted to keep.
But anyway, that might give you a place to start from if you had Canvas Data 2. If you're limited to the API, then it's going to take a lot more work.
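For anyone who wants to try the "files not used in discussions" logic without a CD2 database, here is a rough sketch using Python's built-in sqlite3 module. The table and column names follow the queries above (without the canvas__ prefix), but the rows, the user ID 7, and the file names other than health_map.PNG are made up, and SQLite's || operator stands in for MySQL's CONCAT.

```python
import sqlite3

UNUSED_FILES_SQL = """
WITH user_files AS (
    SELECT id AS attachment_id, filename, size,
           '%/files/' || id || '/%' AS link_text
    FROM attachments
    WHERE context_type = 'User'
      AND context_id = :user_id
      AND file_state = 'available'
),
used AS (
    SELECT DISTINCT f.attachment_id
    FROM user_files f
    JOIN discussion_entries de
      ON de.user_id = :user_id
     AND de.workflow_state = 'active'
     AND (de.message LIKE f.link_text OR de.attachment_id = f.attachment_id)
)
SELECT f.attachment_id, f.filename, f.size
FROM user_files f
LEFT JOIN used u ON u.attachment_id = f.attachment_id
WHERE u.attachment_id IS NULL
ORDER BY f.size DESC
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attachments (id INTEGER, context_type TEXT, context_id INTEGER,
                          filename TEXT, size INTEGER, file_state TEXT);
CREATE TABLE discussion_entries (id INTEGER, user_id INTEGER, workflow_state TEXT,
                                 message TEXT, attachment_id INTEGER);
""")
conn.executemany("INSERT INTO attachments VALUES (?, ?, ?, ?, ?, ?)", [
    (101, "User", 7, "health_map.PNG", 500000, "available"),   # embedded in a post
    (102, "User", 7, "old_diagram.png", 900000, "available"),  # never referenced
    (103, "User", 7, "attached.pdf", 200000, "available"),     # attached to a post
    (104, "User", 7, "gone.png", 100000, "deleted"),           # already deleted
    (105, "Course", 7, "syllabus.pdf", 300000, "available"),   # not a user file
    (106, "User", 7, "draft.png", 50000, "available"),         # only in a deleted post
])
conn.executemany("INSERT INTO discussion_entries VALUES (?, ?, ?, ?, ?)", [
    (1, 7, "active", '<img src="/users/7/files/101/preview">', None),
    (2, 7, "active", "See attached", 103),
    (3, 7, "deleted", '<img src="/users/7/files/106/preview">', None),
])

unused = conn.execute(UNUSED_FILES_SQL, {"user_id": 7}).fetchall()
print(unused)  # [(102, 'old_diagram.png', 900000), (106, 'draft.png', 50000)]
```

As with the MySQL version, this only checks discussion entries. Files used as profile pictures, conversation attachments, or anywhere else would still show up as "unused" here.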
Thanks, @James
I don't think my access to CD2 is the best route to pursue at the present, but if nothing else works I'll look into it.
I think my question is more about how a course 'folder' within the user's files area relates to what other roles might lose access to if deleted. Also how to delete those files in bulk via the UI to free up some space? If we know a course is no longer relevant but the user has large files in their user space that seem to be contributing to their user file quota, what is the relationship to the folder in the user's files area? So far I've realized that if the course is deleted so goes the associated folder, but we wouldn't want to do this for a course that is active.
The instructor in question is not proficient in compressing the images that she likes to post to discussions, for instance. She also teaches multiple sections of the same course in different modalities (mostly online), and there can be as many as 6 sections of the same course each term in her active course load. Our institution regularly archives courses after 2 calendar years; with Spring and Fall alone, that could mean at least as many as 24 sections (including summer) with multiple large image uploads each time.
I appreciate the 2nd pair of eyes. I know from interacting with you in the past you're super smart with respect to automation of things via the API, and I don't mind putting in the work if that's how it must be done.
I'm not sure I follow. Let me explain what I'm seeing and maybe you can steer me to the right path.
Files and folders within the "My Files" (except for submissions) count against your user file quotas. There may be a list of folders corresponding to courses after My Files, but those do not count against your quota. Deleting those won't help the quota.
Under My Files, I have conversation attachments, profile pictures, Submissions (locked so that I cannot delete them), and unfiled. Looking at another user's files, I also see Uploaded Media. Since those, except for submissions, are not tied to a course, the only way to tell if they have been used is to look for them in the content somewhere.
For me, those folder-courses correspond to courses I'm currently enrolled in. If I'm the teacher, I see the files. If I'm a student and the teacher has the Files navigation menu item enabled, then I see the files for that course. You don't want to delete any of those folders.
There may be files within those courses that are not used within the course. Unfortunately, the only way to tell if they're used is to scan through everything looking for where they are attached. That's also where some people offer solutions to de-clutter or identify unused files. I don't have control over a budget, so I try to make do with free whenever possible and that's why I end up writing a bunch of my own solutions.
Those files, however, would count against the course quota (typically 500 MB) rather than the individual quota (typically 50 MB). Another thing to consider is that when you copy a course within Canvas (as opposed to file upload), it doesn't make duplicate copies of the file, it just updates pointers and links to the file. That is, it counts against you when you upload a file, so as long as you upload once and reuse multiple times, it's not hurting the quota for those subsequent re-uses.
This can be verified by going to your Course Settings > Course Statistics > File Storage page. My trigonometry class has 924.45 kb in 6 files. Except there are a lot more than 6 files (at least one slideshow for each section), it's just 6 that have been updated since I copied the course over for the new semester.
Now that I've written that and reread what you wrote, I think I'm catching on.
The Course Files that show up after My Files are the same Files section that the instructor would see in the course. Don't delete any of those. They don't count against the user quota and they should automatically disappear from the user's file list once the course has ended (assuming you're using dates on courses or terms). Deleting files from there may cause you to need to re-upload them in the future.
If the instructor is using the "Copy a Canvas Course" option to create courses and not uploading a downloaded course export, then there's no problem to be addressed.
The quota that is showing up at the bottom of the Files page is tied to user files, not to course files. Those are the files under My Files (except for Submissions and possibly Uploaded Media Files). Getting rid of those is what my original response was about.
If you delete images that were uploaded to the user files for inclusion in discussions, then they will disappear from the discussion. The post shows a broken file image and then the name of the file. If the course is old or inactive, no one will notice. It may impact your archiving process, but it should be okay to delete.
If the instructor is doing this for each course, then teach her how to upload the files to the course files rather than the user files. Then it won't count against her personal quota. If she has a master course that she copies content from, then upload the files into that course and then they'll be available in all of the derivative courses -- whether in the same term or in future terms.
If the files exist in the user space and you want to help with the process, you can use the File navigation capability within Canvas to Move a file to a course. When you move within a course, it moves the file. If you move between courses (or personal space to a course), it creates a copy and leaves the original.
This means that you can copy all of the files that she's going to upload to discussions from her personal files to the course files from the Files page. Then you can remove them from the personal file space and clear up the quota.
When I moved (copied) a file from my personal files into my trig class, it did not change the file usage for the course. That's good, because it means you could copy the files she's going to use into each of her courses and it doesn't change the course file storage. I then deleted the file from My Files, which reduced my usage against my personal quota, but it didn't change the course usage.
I did verify that uploading a new file affects the course file storage immediately. I wanted to make sure that the file I copied from my files and then deleted wouldn't show up later after some processing.
You do want to show some caution. If you copy a file from your user files to a course, then delete it from the course, it breaks any existing links to it. If you copy it back to your user files, it does not restore the links. It gets a new file ID, but not an extra copy as Canvas still just stores the file once.
You can actually use that as a hack to clear up a quota issue, but it will break any existing links. I had three conversation attachments that I didn't need and my personal usage was at 18% of the quota. I copied them to a course, then deleted them from my user space. My file usage went to 10% of the quota. I then copied them back from the course to my conversation attachments folder and the usage stayed at 10% of the quota. Again, that breaks all the links, but it's a quick way to lower the file usage.
All of what I've written here can be done through the web interface except determining whether a file is currently being used. That's where dates come into play. If you know the instructor has been doing this for years and they are exclusively for use in discussions, the modified at date can get you a long way towards cleaning things up.
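If you end up pulling the file list through the API instead of the web interface, the date-based triage could look something like this sketch. It assumes each file record carries display_name, size, and modified_at fields. The sample files are made up, and the 730-day cutoff is only an assumption based on the two-year archiving window mentioned earlier in the thread.

```python
from datetime import datetime, timedelta, timezone

def parse_canvas_time(value):
    """Parse an ISO 8601 timestamp such as '2019-09-03T15:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def stale_files(files, now, max_age_days=730):
    """Return files not modified within max_age_days, largest first."""
    cutoff = now - timedelta(days=max_age_days)
    old = [f for f in files if parse_canvas_time(f["modified_at"]) < cutoff]
    return sorted(old, key=lambda f: f["size"], reverse=True)

# Hypothetical records; real ones would come from the user's file listing.
files = [
    {"display_name": "map_2019.png", "size": 4_200_000, "modified_at": "2019-09-03T15:00:00Z"},
    {"display_name": "chart_2021.png", "size": 6_100_000, "modified_at": "2021-02-11T09:30:00Z"},
    {"display_name": "recent.png", "size": 9_000_000, "modified_at": "2025-01-15T12:00:00Z"},
]
now = datetime(2025, 6, 1, tzinfo=timezone.utc)

candidates = [f["display_name"] for f in stale_files(files, now)]
print(candidates)  # ['chart_2021.png', 'map_2019.png']
```

This only tells you which files are old and large. It says nothing about whether they are still linked somewhere, so the instructor should still look the list over before deleting.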
Thanks, I will pursue this as a training issue. In the meantime, I cannot seem to find a way to determine which files are the culprit for the absolutely bizarre display of an overblown user quota, which I'm fairly certain has never been changed administratively to be more than the default 50 MB. I kid you not when I say Canvas shows me that the user is using 119% of a supposed 2.1 GB quota:
To me this spells large media files such as images or even video files, but I can't seem to find anything more than a few MB in size when digging through her 'My Files' directory. I see a 'Submissions' directory which contains folders for several courses (some recent and some VERY old) but they all have lock icons on them:
Clicking on any one of these folders displays a blank screen with 0 items selected:
In the meantime I'm going to ask via an escalated support case where the ridiculous 119% of 2.1 gigabytes numbers are coming from as it seems like a glitch.
Hi @themidiman,
Do you happen to have a trust relationship set up between multiple Canvas instances? If you do, the quota number will be the sum of the user quota for all trusted instances (in your example, it could be two instances with a 50 MB user quota each and then another with a 2 GB quota). I'm just suggesting this possibility because we ran into it a few years ago when we set up a second Canvas instance for our campus. The calculations back then were way off, which did get fixed, but I'm still not super thrilled with the way they have this configured. I'd personally rather have the user quota from the "home" instance of the user enforced.
We have each of our instances set up to allow 500MB of user space, but the screenshot for my own account shows 1GB as it's adding those together.
-Chris
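The arithmetic behind that suggestion can be checked with a few lines of Python. This is only an illustration: the 50 MB + 50 MB + 2 GB split is the hypothetical from the post above, and decimal units are assumed, which may not be how Canvas counts.

```python
MB = 1_000_000  # assumption: decimal megabytes

def combined_quota(instance_quotas):
    """Sum the per-instance user quotas, as the trust setup appears to do."""
    return sum(instance_quotas)

def usage_bytes(percent_used, quota_bytes):
    """Convert a displayed percentage back into bytes used."""
    return quota_bytes * percent_used // 100

# Hypothetical split that would produce the 2.1 GB figure in the screenshot.
quota = combined_quota([50 * MB, 50 * MB, 2000 * MB])
print(quota / MB)                    # 2100.0
print(usage_bytes(119, quota) / MB)  # 2499.0

# Two instances at 500 MB each display as 1 GB.
print(combined_quota([500 * MB, 500 * MB]) / MB)  # 1000.0
```

So a display of 119% of 2.1 GB would correspond to roughly 2.5 GB of counted files, if the displayed numbers are taken at face value.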
@chriscas We do have a trust instance, so I'll look into that as well. Thanks for the idea.
You can check the quota setting under Admin > Settings > Quota. Google's AI tells me how to change it for a specific user, but I think it is hallucinating because that's not an option.
The submissions do not count against the quota.
A file embedded or attached to a discussion is not a submission. You won't be able to look there to tell which files are in use. Stop focusing on submissions, that's not the problem or the solution.
When I looked into this several years ago, Media Files counted in the visible quota but they didn't count against the quota. That is, you may be at more than 100% because of Media Files but can still upload more because the media files don't count against you (even though they appear to). I had a teacher with something like 2000% of the 50 MB quota because of the media files, but she was able to keep uploading things.
Roughly speaking (since I discovered the copy elsewhere, copy back hack yesterday), it's the My Files folder and anything under it except submissions and media uploads. There might be some other folders that don't count. For your screenshot, you only have My Files, conversation attachments, profile pictures, and unfiled that you should look at.
I don't know anything about the trust relationship that @chriscas mentioned. That might explain things, but since you keep looking in the wrong spot (submissions), I still do not know that it's not just files. Given what you've described about her uploading large discussion files over and over, I would expect the My Files folder (not a subfolder) to be the main spot to look.
@James ,
I'm seriously not finding any files larger than a few MB in all those areas in the My Files folder and sub-folders. I wish there was just a function that would reveal what the largest files in a user's storage were and attack it from that angle. I'm going to do some API calls to see what that reveals and if that doesn't work I'll reach out to support.
I appreciate your perspective and attempts at helping to resolve this issue.
@themidiman, there is an API call for that.
GET /api/v1/users/:user_id/files?sort=size&order=desc
You can add per_page=100 to that if you like, but it lists the user files with the largest file first.
It appears to include all files under My Files, including the Submissions folder. You may need to grab the folders so you can exclude certain files, but that should help identify where the large files are.
Again, I don't know anything about the trust instance, so it may not give you what you're wanting.
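Here is a rough Python sketch of how that call could be scripted, using only the standard library. It follows the Link response header to page through the results and drops files that sit in folders whose full_name starts with a given prefix. The "my files/Submissions" prefix and the sample records are assumptions for illustration, so check the full_name values your own folder listing returns before relying on it.

```python
import json
import re
import urllib.request

def next_link(link_header):
    """Return the rel="next" URL from a Link header, or None on the last page."""
    for url, rel in re.findall(r'<([^>]*)>\s*;\s*rel="([^"]*)"', link_header or ""):
        if rel == "next":
            return url
    return None

def fetch_all(url, token):
    """Follow Link headers until every page of a list endpoint has been read."""
    items = []
    while url:
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {token}"}
        )
        with urllib.request.urlopen(request) as response:
            items.extend(json.load(response))
            url = next_link(response.headers.get("Link"))
    return items

def files_outside(files, folders, excluded_prefix):
    """Drop files whose folder's full_name starts with excluded_prefix."""
    excluded = {f["id"] for f in folders if f["full_name"].startswith(excluded_prefix)}
    return [f for f in files if f["folder_id"] not in excluded]

# Offline demonstration with made-up records shaped like file and folder objects.
folders = [
    {"id": 10, "full_name": "my files"},
    {"id": 11, "full_name": "my files/Submissions"},
    {"id": 12, "full_name": "my files/Submissions/Trigonometry"},
]
files = [
    {"display_name": "lecture.mp4", "size": 240_000_000, "folder_id": 12},
    {"display_name": "health_map.PNG", "size": 3_500_000, "folder_id": 10},
]
kept = files_outside(files, folders, "my files/Submissions")
print([f["display_name"] for f in kept])  # ['health_map.PNG']
```

Against a real instance, the two lists would come from something like fetch_all(base_url + "/api/v1/users/" + user_id + "/files?sort=size&order=desc&per_page=100", token) and the matching /folders endpoint, where base_url, user_id, and token are placeholders for your own values.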
@James ,
I'm indebted to you for pointing me in the direction I was just about to head. Using this endpoint I was able to get the list of the first 100 files. Strangely enough, the largest was 240 MB. While I'm waiting for Instructure Support to explain those absolutely ridiculous quota numbers, I can now iterate through all of the files to see whether the total of all the sizes matches the number shown in the UI, which, by the way, is a pain in the butt to navigate through to find this same information.
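Totalling the sizes could be sketched like this in Python, assuming the file records have already been downloaded and each one has size and folder_id fields. The three records and the 2.1 GB quota below are placeholders, not real data from this case.

```python
from collections import defaultdict

def size_by_folder(files):
    """Total the size of the files in each folder, keyed by folder_id."""
    totals = defaultdict(int)
    for f in files:
        totals[f["folder_id"]] += f["size"]
    return dict(totals)

def percent_of_quota(total_bytes, quota_bytes):
    """Express a byte total as a percentage of the quota, to one decimal place."""
    return round(100 * total_bytes / quota_bytes, 1)

# Made-up records standing in for the downloaded file list.
files = [
    {"folder_id": 10, "size": 240_000_000},
    {"folder_id": 10, "size": 12_500_000},
    {"folder_id": 11, "size": 3_250_000},
]
per_folder = size_by_folder(files)
total = sum(per_folder.values())

print(per_folder)                              # {10: 252500000, 11: 3250000}
print(total)                                   # 255750000
print(percent_of_quota(total, 2_100_000_000))  # 12.2
```

Comparing the per-folder totals with the percentage in the UI should show fairly quickly whether a particular folder accounts for the number Canvas is reporting.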