Difference between revisions of "Archive:System Monitoring"

From Tardis
Revision as of 22:59, 12 October 2013

This page is out of date and needs rewriting.
The content is likely to be incomplete or incorrect.

Nagios

Munin

This service does not currently exist. If you would like to reinstate it, please contact us. The following information is of historical interest only.

We're running munin (http://munin.projects.linpro.no/) for some out-of-the-box monitoring fun, which can be seen at https://www.tardis.ed.ac.uk/restricted/munin/.

To add a new host to this, on the box you wish to monitor:

colin:~# apt-get install munin-node
...
colin:~# vim /etc/munin/munin-node.conf

and add...

allow ^193\.62\.81\.11$

and don't forget to...

colin:~# /etc/init.d/munin-node restart

Also on davros, edit /etc/munin/munin.conf to add:

[colin.tardis.ed.ac.uk]
    address 193.62.81.8
    use_node_name yes

Custom plugins

Some custom plugins have been written to monitor:

  • The number of items in the support inbox
  • The number of people logged into gallifrey

WOTAN uses the IPMI monitoring plugin from here. It has been changed to suit the idiosyncrasies of WOTAN's IPMI implementation. These changes should be documented some time...

Piper uses the standard sensors_ plugin, which relies on data from lm-sensors.
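
For anyone reviving the custom plugins, here is a minimal sketch of what a plugin like the "people logged in" one might look like. It is not the plugin that was actually used: the field name "users", the graph labels and the helper function names are all assumptions made for illustration. It relies only on the general Munin plugin convention that the plugin prints graph settings when called with "config" and a "<field>.value <number>" line otherwise.

```shell
#!/bin/sh
# Hypothetical Munin plugin sketch: graphs the number of login sessions.
# The field name "users" and the graph labels are illustrative assumptions,
# not taken from the plugin actually used on gallifrey.

# Count non-blank lines of who(1)-style output read from stdin.
count_logins() {
    awk 'NF { n++ } END { print n + 0 }'
}

# Munin calls the plugin with "config" to learn how to draw the graph.
print_config() {
    echo "graph_title Logged-in users"
    echo "graph_vlabel users"
    echo "graph_category system"
    echo "users.label users"
}

# With no argument, Munin expects one "<field>.value <number>" line per field.
print_value() {
    echo "users.value $(who | count_logins)"
}

case "${1:-}" in
    config) print_config ;;
    *)      print_value ;;
esac
```

On a Debian-style install a script like this would probably be made executable, placed or symlinked under /etc/munin/plugins/, and picked up after restarting munin-node as shown above. Running it by hand with and without the "config" argument is a quick way to check the output before Munin sees it.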

Malcolm

  • Munin occasionally sends reports to Malcolm, bung's IRC bot, for example when CPU usage exceeds a threshold. These alerts can be seen in #tardismon.