The CSC experienced unexpected downtime on Sunday (09/09), starting at roughly 5AM and lasting more than 15 hours, on the servers glomag, caffeine, and mail. The following services were affected:
- www
- mail delivery
- mailman
- smtp
- imap
- mysql
- psql
- ssh
- webmail
At this time, all of these services have been fully restored. If you have a problem using any of them, please e-mail syscom with your inquiry.
Minor issues may exist for some of our users who use the mbox delivery format on /var/mail. If you are one of these users, please note that your /var/mail has been archived to:
- /var/mail-20120909 from before the crash on Sunday
- /var/mail-20120512 from the mail host migration that occurred earlier this year
We are working to consolidate all of these resources eventually, so please bear with us at this time.
If you have no idea what mbox is, or you are currently using the Maildir delivery format, then your mail is unaffected.
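If you are unsure which format applies to you, the following is a minimal sketch for guessing the format of a mailbox path. It relies only on the general conventions that a Maildir is a directory containing cur/, new/ and tmp/, and that an mbox is a single file whose first message begins with a "From " line. The function name and the example paths (~/Maildir and /var/mail/<username>) are assumptions for illustration, not a description of how CSC mail delivery is actually configured.

```python
import getpass
import os


def mailbox_format(path):
    """Guess whether `path` is a Maildir, an mbox file, or neither."""
    if os.path.isdir(path):
        # A Maildir is a directory with cur/, new/ and tmp/ subdirectories.
        subdirs = ("cur", "new", "tmp")
        if all(os.path.isdir(os.path.join(path, s)) for s in subdirs):
            return "maildir"
        return "unknown"
    if os.path.isfile(path):
        # An mbox is one file; a non-empty mbox starts with a "From " line.
        with open(path, "rb") as f:
            head = f.read(5)
        if head in (b"", b"From "):
            return "mbox"
        return "unknown"
    return "missing"


if __name__ == "__main__":
    # Hypothetical locations; adjust to wherever your mail is delivered.
    candidates = [
        os.path.expanduser("~/Maildir"),
        os.path.join("/var/mail", getpass.getuser()),
    ]
    for candidate in candidates:
        print(candidate, "->", mailbox_format(candidate))
```

A result of "mbox" for a path under /var/mail would suggest that the archived copies mentioned above may be relevant to you; "maildir" suggests your mail is unaffected.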
Our preliminary investigation indicates that the outage was caused by resource exhaustion on one or more of our servers.
It is important to note that caffeine is the machine you will be on if you ssh directly to csclub.uwaterloo.ca. We therefore remind our members that caffeine primarily acts as our webserver, and that members should avoid running resource-intensive tasks on this host unless the task is a public-facing web service. We have many other machines that are better equipped for resource-intensive tasks, such as taurine, corn-syrup, and hfcs. A full list of our available machines can be found on our wiki: http://wiki.csclub.uwaterloo.ca/Machine_List
Please be considerate: CSC systems are a shared resource, so the actions you take on a system affect all other users on that system.