
Adding a new plugin to Icinga 2

If your monitoring use case is not covered by existing plugins (the official list is here and here; also check out this git repository), you may want to create your own. This guide explains how to create a new plugin and how to use it on monitoring.fsfe.org.

Creating a new plugin

Icinga 2 plugins follow the Nagios plugin conventions. Icinga 2 uses the return code of the plugin as the status and its stdout (not stderr) as the notification message. The return code may take four different values: 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN).

The notification message should be in the format STATUS: message. Example: OK: docker is running.

Plugins can therefore be written in any programming language. When writing the plugin, you have to define the criteria for what is acceptable, what is worthy of a warning, and what is deemed a critical state.

A dummy Python plugin may look like this:

import sys
from random import choice

# Map each return code to its notification message
states = {
    0: "OK: everything is fine!",
    1: "WARNING: things may go wrong!",
    2: "CRITICAL: something is definitely wrong!",
    3: "UNKNOWN: we don't know what's happening!",
}

# Pick a random status
status = choice(list(states.keys()))
# Print the status message to stdout
print(states[status])
# Quit with the corresponding return code
sys.exit(status)

For the next section, let's call this plugin check_dummy.
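
A real check replaces the random choice with the criteria mentioned above. The following is only a minimal sketch of that idea: the function name evaluate, the two numeric thresholds and the message wording are assumptions made for illustration, not part of the FSFE setup.

```python
# Return codes follow the convention used by the dummy plugin above.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(value, warning, critical):
    """Map a measured value to (return_code, message) using two thresholds.

    `evaluate`, `warning` and `critical` are hypothetical names chosen for
    this sketch. A value of None means the measurement failed.
    """
    if value is None:
        return UNKNOWN, "UNKNOWN: could not measure the value"
    if value > critical:
        return CRITICAL, f"CRITICAL: value is {value} (threshold {critical})"
    if value > warning:
        return WARNING, f"WARNING: value is {value} (threshold {warning})"
    return OK, f"OK: value is {value}"
```

A plugin built on this would print the returned message to stdout and exit with the returned code, exactly as the dummy plugin does.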

Using the new plugin

Clone the Icinga 2 configuration repository

git clone https://git.fsfe.org/fsfe-system-hackers/monitoring.git
cd monitoring/

Add the plugin to the user_plugins_clients/ folder

cp check_dummy user_plugins_clients/

Define an Icinga 2 command that uses the plugin

If the servers on which you want to use the plugin are not already monitored by Icinga 2, see the page ../AddNewServer first.

Add the following content to conf.d/commands.conf:

object CheckCommand "by_ssh_dummy" {
  import "by_ssh"
  vars.by_ssh_identity = "/etc/icinga2/id_rsa"
  vars.by_ssh_command = "/usr/lib/nagios/plugins/check_dummy"
  vars.by_ssh_logname = "icinga"
}

If your plugin makes use of variables defined at the host level, you can pass them to the plugin like so:

vars.by_ssh_command = "/usr/lib/nagios/plugins/check_dummy -c $critical_value$"

Here, critical_value is a variable that sets the threshold above which a CRITICAL state is triggered.
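
On the plugin side, the -c option then has to be read from the command line. A minimal sketch using Python's argparse might look like this; the long option name --critical, the function names and the message wording are assumptions for illustration, and only the -c flag comes from the command definition above.

```python
import argparse


def parse_args(argv=None):
    """Parse the threshold that Icinga 2 passes as `-c <value>`."""
    parser = argparse.ArgumentParser(
        description="Dummy check with a critical threshold"
    )
    # `--critical` is a hypothetical long alias; only `-c` is used above
    parser.add_argument(
        "-c", "--critical", type=float, required=True,
        help="value above which the check is CRITICAL",
    )
    return parser.parse_args(argv)


def check(value, critical):
    """Return (return_code, message) for a measured value."""
    if value > critical:
        return 2, f"CRITICAL: {value} is above {critical}"
    return 0, f"OK: {value} is within {critical}"
```

Note that argparse exits with status 2 when the arguments are invalid, which Icinga 2 would read as CRITICAL; a plugin may prefer to catch that case and exit with 3 (UNKNOWN) instead.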

Define the host variables used by the plugin

Go to the conf.d/hosts.conf file and add the variables used by the plugin for the hosts you wish to monitor:

object Host "host1" {
  vars.dummy["dummy"] = {
    critical_value = 80
  }
}

object Host "host2" {
  vars.dummy["dummy check"] = {
    critical_value = 95
  }
}

Assign the command to the hosts you wish to monitor with the plugin

Go to conf.d/services.conf and add the following content:

apply Service for (service => config in host.vars.dummy) {
  import "generic-service"
  check_command = "by_ssh_dummy"
  vars += config
}

This creates one service using the by_ssh_dummy command for each entry in the dummy dictionary of a host's variables; hosts that do not define dummy are not affected. Please see the official documentation for more details about how to assign commands.
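
To make the effect of the apply rule more concrete, here is a rough Python model of how it expands. This is not Icinga 2 code and only approximates its behaviour; the function name expand_apply_for and the host host3 are hypothetical.

```python
def expand_apply_for(hosts):
    """Model `apply Service for (service => config in host.vars.dummy)`."""
    services = []
    for host_name, host_vars in hosts.items():
        # Hosts without a `dummy` dictionary get no service at all
        for service_name, config in host_vars.get("dummy", {}).items():
            services.append({
                "name": f"{host_name}!{service_name}",
                "check_command": "by_ssh_dummy",
                "vars": dict(config),  # models `vars += config`
            })
    return services


hosts = {
    "host2": {"dummy": {"dummy check": {"critical_value": 95}}},
    "host3": {},  # hypothetical host that does not define `dummy`
}
services = expand_apply_for(hosts)
```

In this model, host2 ends up with a single service named "dummy check" whose variables include critical_value, while host3 gets none.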

Deploy

To deploy the changed configuration, commit your changes to the git repository and run the Ansible playbook by following the instructions in the README.

TechDocs/TechnicalProcesses/Monitoring/AddNewPlugin (last edited 2020-04-01 17:40:26 by vincent)