Musings on monitoring/alert options

Ray Wong rayw at rayw.net
Fri Aug 19 14:30:35 PDT 2005



Hi folks, I haven't been paying enough attention to the list, but I haven't
noticed this subject come up recently, so I figure it's a good time to bring
it up again...

I'm curious if anyone's found a compelling reason to CHANGE monitoring from
something that previously "basically" worked?  There are always the classics,
BigBrother/BigSister, Nagios/NetSaint, etc., but I haven't noticed any really
good comparisons of their configuration/complexity options...

e.g. if I've got more than 12/100/1000 redundant servers, I probably don't
need to get a pager/SMS alert until at least 5% (or some arbitrary number)
are down...  Does anything open source support this?  To this point, I've
always made do with messages (not alerts) about individual down servers,
and raised criticals/alerts only when the aggregated (i.e. load-balanced)
function gets slower/intermittent.  Failing that, I've just dealt with large
numbers of alerts to roll over and go back to sleep from. :)  Of course,
having a 24/7 NOC staff screen them out before calling would be nice, but I
doubt too many of us are holding our breath for those days.
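
To make the threshold idea concrete, here's a rough sketch in Python.  It
isn't taken from any of the tools above; the function name, the 5% default,
and the host names are all made up for illustration, and it assumes something
else has already collected an up/down flag per server.

```python
def alert_level(status_by_host, page_percent=5):
    """Decide how loudly to complain about a pool of redundant servers.

    status_by_host: dict of host name -> True (up) / False (down).
    page_percent:   hypothetical threshold; page only once at least this
                    percentage of the pool is down.
    Returns "ok", "notify" (message only, no pager), or "page".
    """
    total = len(status_by_host)
    down = sum(1 for up in status_by_host.values() if not up)
    if total == 0 or down == 0:
        return "ok"
    # Integer comparison avoids floating-point surprises at the boundary.
    if down * 100 >= total * page_percent:
        return "page"
    return "notify"


if __name__ == "__main__":
    # Made-up pool of 100 web servers, all up to start with.
    pool = {"web%03d" % i: True for i in range(100)}
    pool["web007"] = False          # 1 of 100 down
    print(alert_level(pool))
    for i in range(10, 14):         # 5 of 100 down
        pool["web%03d" % i] = False
    print(alert_level(pool))
```

Note that with a pool as small as 12, a single down server is already over
5%, so a real setup would probably want a minimum count as well as a
percentage.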

So, what's everyone else doing?  Had a chance to do anything new/interesting
with it?

Ray




More information about the Baylisa mailing list