My server admin has a few questions related to disk space usage. My understanding is that:
- The requests file is a brand new, appended file each day and previous files are not updated each day.
- The other files are all complete re-writes of the previous data since some could have been modified.
So, technically, syncing with the Canvas-Data-CLI tool would replace all of the older Canvas tables except requests with updated versions if they exist. For requests, it would only download the new files, since the older ones already exist locally (and are not overwritten).
So, is there a reason to keep storing the gz files for all of the previous days' requests? Right now our GZ files are over 30GB and have doubled in size in the last 2 months. If we don't need to store the historic requests files, or could move them to "cheaper" storage, that would be a huge help. But right now, if I delete one of the older requests files and then sync, it downloads it again...
I did find a possible hack to address his concerns - move the older files elsewhere, create a text file named the same thing as the gz file. But I wanted to make sure that there would be nothing new in the requests file that this could mess up. And I can always write this into my process.
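For reference, here is a rough sketch of how I might script that workaround in Python. It assumes the sync only checks whether a file with the same name already exists locally (which matches what I'm seeing, but I haven't confirmed it), and that an empty placeholder is enough. The function name `archive_requests_files` and the `*requests*.gz` pattern are my own placeholders, so adjust them to match your actual file names.

```python
import shutil
from pathlib import Path


def archive_requests_files(data_dir, archive_dir, pattern="*requests*.gz"):
    """Move matching gz files to archive_dir, leaving empty placeholders behind.

    Assumption: the sync tool skips any file whose name already exists locally.
    """
    data_dir = Path(data_dir)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for gz in sorted(data_dir.glob(pattern)):
        # Zero-byte files are placeholders left by an earlier run; skip them.
        if not gz.is_file() or gz.stat().st_size == 0:
            continue
        shutil.move(str(gz), str(archive_dir / gz.name))
        gz.touch()  # empty file with the original name
        moved.append(gz.name)
    return moved
```

Running it a second time should be harmless, since placeholders are zero bytes and get skipped. Anything that later reads the data directory (for example an unpack step) would need to ignore the empty placeholders.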