I'm trying to set up my database and I see a lot of TEXT fields in the schema. Do I need to make these TEXT, MEDIUMTEXT or LONGTEXT?
Short Answer - LONGTEXT
Long Answer -
The PostgreSQL TEXT field allows up to 1 GB.
In MySQL, TINYTEXT is 255 bytes, TEXT is 64K, MEDIUMTEXT is 16MB, and LONGTEXT is 4GB. So, if you want to be absolutely positive that you could hold the entire thing, then LONGTEXT will cover it.
That said, most of the fields that use TEXT are short, like URLs or or descriptions and TEXT will be sufficient. Only a few items, like the page body or possibly some really, really, really long quiz questions would exceed that 64 KB limit.
But, if you have plenty of space, the TEXT takes 2 bytes per entry to store the length, MEDIUMTEXT takes 3, and LONGTEXT takes 4. I wouldn't really worry about 2 extra bytes when you're looking at fields possibly containing megabytes of information, but I guess you could save space if you wanted to.
I ran a command to see how long the longest line was in each of my dump files. Granted, this isn't the length of the longest field since there are multiple fields on each line, but it would be a good indicator of what tables had fields more than 64K.
Here is a list of the tables from our instance that had "long" lines and would be good candidates for LONGTEXT
The longest line in any of the other flat files was 471 characters. Granted, we may not use all of the features that return information in Canvas Data, but that should give you a good idea of which ones might need to be bigger than TEXT (65535 bytes).
But, if you don't want to have to worry - go with the short answer and just use LONGTEXT. A quick search online suggests that there isn't any performance hit, just that 1 extra byte for the size with each step up in size.
Note: I'm using character and byte interchangeably, but it really depends on the encoding of your MySQL tables (you should support utf-8) and the type of characters (ascii, unicode, ???) that you have in your file, but you're not going to exceed the 4GB mark, even if it's full of unicode characters.
James Jones, perfect, thank you! I wasn't sure how much of a big deal that consideration was anymore - I SQL classes way too long ago when disk space and memory were probably more of a worry!
Retrieving data ...