I have been intending to add automated monitoring of our institution's SIS processing, which is based on transfers of CSV files for terms, accounts, users, courses, sections, and enrollments. There are a couple of API endpoints (SIS Imports and SIS Import Errors) that provide a wealth of info. I am currently working on turning the Python scripts I have developed into a Xymon extension. The scripts currently look at the start and end times of processing to make sure the processing is occurring on schedule and completing within a defined maximum duration. The scripts also look at the counts of records processed for courses, enrollments, sections, etc., to make sure each count is within the expected range. The code which looks at errors/warnings in processing makes decisions about when to ignore error messages - we typically get batches of enrollment errors related to some of the independent study courses, since their underlying schedule information is prone to change (cancellations in particular). I anticipate new error-catching conditions will need to be added to the scripts as new kinds of errors come to our attention, so I am writing the code which detects error cases as simple methods in classes that can "chain" together, following a design pattern that is inspired by (but not quite the same as) the GoF Chain of Responsibility pattern. The benefit is that new pieces of error-handling logic can be added to the chain without having to "splice into" a big hairball of conditional logic. Each piece of logic is stand-alone and does not affect other parts of the chain.
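To make the "chain" idea concrete, here is a minimal sketch of one way such checks could be linked together. It is not my actual code: the class names, the `file`/`row`/`message` dictionary keys, the `INDSTUDY` course prefix, the sample message text, and the three verdict strings are all assumptions made up for illustration.

```python
class ErrorCheck:
    """One stand-alone link in the chain (hypothetical base class)."""

    def __init__(self, next_check=None):
        self.next_check = next_check

    def matches(self, error):
        raise NotImplementedError

    def verdict(self, error):
        raise NotImplementedError

    def handle(self, error):
        # The first link that recognises the error decides what happens to it.
        if self.matches(error):
            return self.verdict(error)
        if self.next_check is not None:
            return self.next_check.handle(error)
        # No link claimed the error, so treat it as unexpected.
        return "alarm"


class IndependentStudyEnrollmentCheck(ErrorCheck):
    """Ignore enrollment errors for independent study courses.

    The "INDSTUDY" marker is an invented naming convention.
    """

    def matches(self, error):
        return (error.get("file") == "enrollments.csv"
                and "INDSTUDY" in error.get("row", ""))

    def verdict(self, error):
        return "ignore"


class MissingUserCheck(ErrorCheck):
    """Downgrade errors about unknown users (message text is assumed)."""

    def matches(self, error):
        return "non-existent user" in error.get("message", "").lower()

    def verdict(self, error):
        return "warn"


def build_chain():
    # Adding a new condition means adding one class and one link here;
    # the existing links are left untouched.
    return IndependentStudyEnrollmentCheck(MissingUserCheck())
```

Calling `build_chain().handle(error)` on each error dictionary then yields `"ignore"`, `"warn"`, or `"alarm"`. Note that the order of the links matters: an error that matches more than one check gets the verdict of the earliest matching link.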
I am sure others have done similar work already, and we would benefit from sharing ideas.
Almost done! I think my one new idea - the one that was not completely obvious when starting this project - is how to derive a consistent identity for all of the errors that Canvas returns. At our institution we get a bunch of warnings/errors around enrollments in various courses and sections that are independent study courses. The SIS data around this stuff is somewhat messy at our institution - not too surprising, since there are many, many independent study courses in each department, each one subject to cancellation. If I did not ignore errors and warnings related to independent study courses, alarm bells in our monitoring system would be going off all the time. However, I do not want to throw away the information I am getting about these errors. Nor do I want to naively throw the same errors, over and over, into a log (noise drowning out signal). What I have decided to do is create an identity for the enrollment errors by associating each error/warning with a (hopefully unique and consistent) key. I derive a composite key from the interesting attributes in the row data returned for each error. In my implementation, the composite key is a Python tuple used as a dictionary key (the dictionary is made durable with a Python pickle). There are some useful fields to pull out of the row data we get for each error, from which I derive the key. BTW, the fact that we get row data back is cool; that the row data is given to us as a string representation of a Ruby Hash ... not so cool. Anyway, since I create an identity for each of these errors/warnings, I am not continually logging the same error over and over again. Happy to talk about this more -- since, like I said, this is my one (perhaps novel, hopefully good) idea.
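For anyone who wants to see the shape of the composite-key idea, here is a rough sketch. It assumes the row data has already been parsed into a Python dict (the Ruby Hash parsing is left out), and the field names in `KEY_FIELDS`, the function names, and the pickle file path are placeholders I made up for this example rather than what my scripts actually use.

```python
import pickle
from pathlib import Path

# Hypothetical fields that together identify one enrollment error.
KEY_FIELDS = ("course_id", "section_id", "user_id", "role")


def error_key(file_name, row):
    """Build a hashable composite key (a tuple) from the interesting fields."""
    return (file_name,) + tuple(row.get(field, "") for field in KEY_FIELDS)


def load_seen(path):
    """Load the dictionary of already-seen errors, or start with an empty one."""
    path = Path(path)
    if not path.exists():
        return {}
    with path.open("rb") as fh:
        return pickle.load(fh)


def save_seen(seen, path):
    """Persist the dictionary so it survives between monitoring runs."""
    with Path(path).open("wb") as fh:
        pickle.dump(seen, fh)


def record_new_errors(errors, seen):
    """Return only the errors whose key is not yet in `seen`.

    `errors` is an iterable of (file_name, row_dict, message) triples.
    `seen` is updated in place with every new key.
    """
    fresh = []
    for file_name, row, message in errors:
        key = error_key(file_name, row)
        if key not in seen:
            seen[key] = message
            fresh.append((key, message))
    return fresh
```

A run would then be roughly: `seen = load_seen(path)`, log whatever `record_new_errors(errors, seen)` returns, then `save_seen(seen, path)`. Repeated errors for the same course/section/user combination produce the same tuple and so are logged only the first time.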