72-hour Outage

I apologize for a 72-hour outage that affected all * services: DNS, Web, Blog, etc. Everything was down from Aug 12 0436 UTC to Aug 15 0447 UTC. My services are hosted on a single server that suffered multiple fan failures, which led to the processor overheating and the motherboard failing. This is really my fault. I was intentionally running a full row of fans in the 1U chassis on 5V instead of 12V to reduce noise (no, you do not want to know where the server is hosted), and they ran fine for more than a year until the day they stopped spinning due to expected wear and tear of the ball bearings. They are now only able to spin on 12V.

This reminds me acutely that I need to, and still want to, implement the high-availability blog CMS I described in my very first post. Finally, I will stop running fans out of their voltage spec :-)


Erkki wrote: Just out of curiosity, what do you think is the mechanism causing more wear to the fan bearings when driving the fans with a lower voltage than with PWM?

The only connection I can see is that the bearings wear out more because the CFM of the fans is lower and thus cooling is worse. That means the fans draw more heat (i.e. hotter air) through them, which, as far as I can tell, affects the bearings more than the voltage does.

If one silences a machine by lowering the fan throughput by whatever means, then either the fans' exposure to heat should be reduced or the machine's heat generation should be lowered as well.
16 Aug 2011 06:41 UTC

mrb wrote: I don't think the lower voltage is responsible for increasing wear and tear.

It is just that normal wear and tear slightly increased friction over time. Now 5V is not enough to overcome the friction.
17 Aug 2011 05:27 UTC

TooMeeK wrote: What does it look like?
Did it shut down when it was overheating? It is supposed to.
I have the same problem with my switch: it has a metal case and gets VERY hot on very hot days...
I'm also thinking about high availability for web and database servers and other services. I was thinking about a cheap VPS :)
06 Nov 2011 23:29 UTC