Glen Parker

How are you cleaning up invalid data after loading (mysql)

Discussion created by Glen Parker on Jun 3, 2016
Latest reply on Jun 3, 2016 by James Jones


Howdy,

 

After much fun I've got a process to load/reload the data nightly.   Now that this is stable, it's time to make use of the data.   This involves queries, and queries are better with indexes.   

 

In every tables I've tried so far, I attempt to create an index, but get an error message about

 

mysql> create index course_dim_idx1 on course_dim (canvas_id);

ERROR 1292 (22007): Incorrect datetime value: '0000-00-00 00:00:00' for column 'conclude_at' at row 47160

mysql> create index assignment_idk1 on assignment_dim(course_id);

ERROR 1292 (22007): Incorrect datetime value: '0000-00-00 00:00:00' for column 'due_at' at row 13632

and so on.

 

I can envision a few solutions to this in MySQL.

- Change the mysql parameter to ALLOW_INVALID_DATES

- Disallow zero dates using NO_ZERO_DATE mode

- exclude rows with bad dates before loading data

- Attempt to cleanup the bad dates before loading using a preload process

- Cleanup or delete the bad rows from canvas_data after loading

- Attempt to cleanup the source data in Canvas so that I only get good dates in canvas_data

- Use a different date type for MySQL (datetime as opposed to the current Timestamp)

 

How are you handling this?

 

Outcomes