Alert notifications
Revision as of 15:24, 13 December 2006
This is a generic page about reporting all kinds of misbehaviour from a server.
This is a draft, to be implemented :-)
Data collection
What to filter for what kind of alert?
Mail alerts
- Syslog -> Logcheck
- We should also send by mail at least everything we want to report via Jabber/SMS
Jabber/SMS alerts
- Hardware damage
  - temperature, fans
  - RAID
- Software damage
  - HD capacity
  - CPU load 100% for more than X mins
    The easiest way is to check the third field of /proc/loadavg, which is the load average over the last 15 mins. A value above the number of CPUs suggests the machine has been saturated for a while; here with 2 CPUs:
    awk '$3 > 2 {print "alert"}' /proc/loadavg
  - network load > X for more than Y mins
  - exim load > X mails sent per min
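
The load check from the list above hardcodes 2 CPUs. A minimal sketch of the same test with the CPU count passed in could look like the following. The function name `check_load` is hypothetical, and reading the CPU count with `getconf _NPROCESSORS_ONLN` is an assumption that should hold on common Linux systems but is not something this page prescribes.

```shell
#!/bin/sh
# check_load FILE NCPU (hypothetical helper):
# print "alert" if the 15-minute load average, i.e. the third field
# of FILE, is greater than NCPU. Prints nothing otherwise.
check_load() {
    awk -v ncpu="$2" '$3 > ncpu {print "alert"}' "$1"
}

# Only run the real check where /proc/loadavg exists (Linux).
if [ -r /proc/loadavg ]; then
    # Assumption: getconf reports the number of online CPUs here.
    ncpu=$(getconf _NPROCESSORS_ONLN)
    check_load /proc/loadavg "$ncpu"
fi
```

The output could then be piped to whatever ends up sending the Jabber/SMS message. Keep in mind that the load average counts processes waiting to run, so it only approximates "CPU at 100%".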