
This page has been moved to docs.fsfe.org with the rest of the sysadmin documentation.

Adding a new plugin to Icinga 2

If your monitoring use case is not covered by existing plugins (the official list is here and here; also check out this git repository), you may want to create your own. This guide explains how to create a new plugin and how to use it on monitoring.fsfe.org.

Please note that some plugins run on the monitoring server, while others run on the monitored clients. Depending on which kind you write, both the command and the repository in which the plugin is defined may differ.

Client plugins can be found in the baseline repo, while server plugins are in the monitoring repo. This has technical reasons.

Creating a new plugin

Icinga 2 plugins are compatible with Nagios conventions. Icinga 2 uses the return code of the plugin as the status and stdout (not stderr) as the notification message. The return code may have 4 different values: 0 (OK), 1 (WARNING), 2 (CRITICAL), and 3 (UNKNOWN).

The notification message should be in the format STATUS: message. Example: OK: docker is running.

Plugins can therefore be written in any programming language. When writing the plugin, you have to define criteria for what is acceptable, what is worthy of a warning, and what is deemed a critical state.

A dummy Python plugin may look like this:

#!/usr/bin/env python3
import sys
from random import choice

# Map each Nagios-style return code to its notification message
states = {
    0: "OK: everything is fine!",
    1: "WARNING: things may go wrong!",
    2: "CRITICAL: something is definitely wrong!",
    3: "UNKNOWN: we don't know what's happening!",
}

# Pick a random status
status = choice(list(states.keys()))
# Print the status to stdout, where Icinga 2 reads the message
print(states[status])
# Quit with the corresponding return code
sys.exit(status)

For the next section, let's call this plugin check_dummy.
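
Before wiring check_dummy into Icinga 2, it can help to run a plugin by hand and see how its return code and stdout would be read. The following is a minimal sketch, not part of the FSFE setup: run_check is a hypothetical helper, and treating return codes outside 0 to 3 as UNKNOWN is an assumption of this sketch. The inline command stands in for a real plugin so that the example is self-contained.

```python
import subprocess
import sys

# Nagios-style return codes and their state names
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(command):
    """Run a plugin and return (state name, first line of its stdout)."""
    result = subprocess.run(command, capture_output=True, text=True)
    # Assumption of this sketch: unexpected return codes count as UNKNOWN
    state = STATES.get(result.returncode, "UNKNOWN")
    # Only stdout is considered; stderr is ignored on purpose
    lines = result.stdout.strip().splitlines()
    message = lines[0] if lines else ""
    return state, message


# Stand-in plugin: prints a message and exits with return code 1
fake_plugin = [
    sys.executable,
    "-c",
    "import sys; print('WARNING: things may go wrong!'); sys.exit(1)",
]
state, message = run_check(fake_plugin)
print(f"state={state} message={message!r}")
```

To try it against the dummy plugin, pass something like ["python3", "check_dummy"] instead of the inline command.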

Using the new plugin

Clone the Icinga 2 configuration repository

git clone https://git.fsfe.org/fsfe-system-hackers/monitoring.git
cd monitoring/

Add the plugin to the user_plugins/ folder

cp check_dummy user_plugins_clients/

Define an Icinga 2 command that uses the plugin

If the servers on which you want to use the plugin are not already monitored by Icinga 2, head over to the page ../AddNewServer.

Add the following content to conf.d/commands.conf if it's a client plugin:

object CheckCommand "by_ssh_dummy" {
  import "by_ssh"
}

If your plugin makes use of variables defined at the host level, you can reference them like so:

vars.by_ssh_command = "/usr/lib/nagios/plugins/check_dummy -c $critical_value$"

Here, critical_value is a variable that sets the threshold above which a CRITICAL state is triggered.
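
On the plugin side, the value passed with -c has to be parsed and compared against a measurement. A minimal sketch of how that might look, assuming a plugin that measures the disk usage of / in percent. The measured metric, the optional -w flag, and the function names evaluate and main are illustrative choices, not taken from the FSFE plugins.

```python
import argparse
import shutil


def evaluate(value, critical, warning=None):
    """Map a measured value to a (return code, message) pair."""
    if value > critical:
        return 2, f"CRITICAL: usage is {value:.1f}% (threshold {critical:.1f}%)"
    if warning is not None and value > warning:
        return 1, f"WARNING: usage is {value:.1f}% (threshold {warning:.1f}%)"
    return 0, f"OK: usage is {value:.1f}%"


def main(argv=None):
    parser = argparse.ArgumentParser(description="Hypothetical threshold check")
    parser.add_argument("-c", "--critical", type=float, required=True)
    parser.add_argument("-w", "--warning", type=float, default=None)
    args = parser.parse_args(argv)

    try:
        # Assumed metric for this sketch: used space on / in percent
        usage = shutil.disk_usage("/")
        value = 100 * usage.used / usage.total
    except OSError as err:
        # A failed measurement is reported as UNKNOWN, not as CRITICAL
        print(f"UNKNOWN: could not read disk usage: {err}")
        return 3

    code, message = evaluate(value, args.critical, args.warning)
    print(message)  # stdout carries the notification message
    return code
```

Saved as an executable plugin, the file would end with sys.exit(main()) (after import sys) so that the return code reaches Icinga 2.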

Define the host variables used by the plugin

Go to the conf.d/hosts.conf file and add the variables used by the plugin for the hosts you wish to monitor:

object Host "host1" {
  vars.dummy["dummy"] = {
    critical_value = 80
  }
}

object Host "host2" {
  vars.dummy["dummy check"] = {
    critical_value = 95
  }
}

Assign the command to the hosts you wish to monitor with the plugin

Go to conf.d/services.conf and add the following content:

apply Service for (service => config in host.vars.dummy) {
  import "generic-service"
  check_command = "by_ssh_dummy"
  vars += config
}

This assigns the by_ssh_dummy command to every host that defines vars.dummy, creating one service per entry in that dictionary. Please go to the official documentation for more details about how to assign commands.
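
To make the effect of the apply rule more concrete, here is a toy model in Python. It is only an illustration of the looping logic, not how Icinga 2 evaluates its configuration: the hosts dictionary, the host3 entry, and the apply_dummy_services function are hypothetical.

```python
# Hypothetical host variables, mirroring the shape used in conf.d/hosts.conf
hosts = {
    "host2": {"dummy": {"dummy check": {"critical_value": 95}}},
    "host3": {},  # hypothetical host without vars.dummy
}


def apply_dummy_services(hosts):
    """Return one service per entry in each host's vars.dummy dictionary."""
    services = []
    for host_name, host_vars in hosts.items():
        # Hosts without a "dummy" dictionary simply produce no services
        for service_name, config in host_vars.get("dummy", {}).items():
            service_vars = {}
            service_vars.update(config)  # plays the role of `vars += config`
            services.append(
                {
                    "host": host_name,
                    "name": service_name,
                    "check_command": "by_ssh_dummy",
                    "vars": service_vars,
                }
            )
    return services


for service in apply_dummy_services(hosts):
    print(service)
```

In this model host2 ends up with a single service named "dummy check" whose critical_value is 95, while host3 gets none.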

Deploy

To deploy the changed configuration, add your changes to the git repository and run the Ansible playbook by following the instructions in the README.

In case of a client plugin, you'll have to make the change in the baseline repo.

TechDocs/TechnicalProcesses/Monitoring/AddNewPlugin (last edited 2023-01-18 12:19:28 by tobiasd)