Scope
Unused Round Robin Database (*.rrd) files for deleted hosts or services can accumulate in OP5 and may need to be removed manually from time to time.
Cause
OP5 takes the safest option with RRDs that appear to be no longer needed: it leaves them on the monitor. This guards against the case where two peered masters are temporarily out of sync. While a sync issue persists, keeping the RRDs prevents historical data loss, whereas deleting them immediately would risk removing historical data for a host that is still valid.
Solution
The manual process to remove these files is fairly simple and uses the following two one-line commands. Please note that these commands were developed on a CentOS 7.5 operating system with OP5 release 7.4.8.
The first command shows how much space can be reclaimed by removing RRDs that have not been modified in more than X days. Make sure to replace X with a number of days before running it:
{ find /opt/monitor/op5/pnp/perfdata/ -iname "*rrd" -mtime +X -printf "%s+"; echo 0; }|bc|numfmt --to=si
In this example, we look for RRDs that have not been modified in more than 2 years:
[root@op5 ~]# { find /opt/monitor/op5/pnp/perfdata/ -iname "*rrd" -mtime +$((365*2)) -printf "%s+"; echo 0; }|bc|numfmt --to=si
30G
[root@op5 ~]#
In the above example, there are 30G of RRDs that have not been modified in more than 2 years and should be safe to delete. Note that we deliberately chose modification time rather than creation time. An RRD may have been created longer ago than the specified period and still be actively used by OP5. An RRD in use is modified every time a new set of performance data is written to it, so any RRD that has not been modified in a long time should belong to a host or service that is no longer being monitored.
The second command deletes these files. Again, make sure to replace X with a number of days before running the command. It is also advisable to run it in a screen or tmux session so that it keeps running even if you lose your connection to the OP5 server.
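To see how the modification-time filter and the size sum behave before pointing them at real data, the following minimal sketch reproduces the idea on throwaway files. The scratch directory, host names and file sizes are made up for illustration, and GNU find and touch are assumed (as on CentOS). Shell arithmetic stands in for bc here only to keep the sketch free of extra dependencies.

```shell
#!/usr/bin/env bash
# Sketch only: runs against a temporary directory, not the OP5 perfdata path.
set -euo pipefail

scratch=$(mktemp -d)   # hypothetical scratch location
mkdir -p "$scratch/old-host" "$scratch/live-host"

# Two "stale" RRDs (1024 and 512 bytes) and one "live" RRD (2048 bytes).
head -c 1024 /dev/zero > "$scratch/old-host/ping.rrd"
head -c 512  /dev/zero > "$scratch/old-host/load.rrd"
head -c 2048 /dev/zero > "$scratch/live-host/ping.rrd"

# Backdate the stale files; the live file keeps its current mtime,
# just as an RRD that OP5 still writes to would.
touch -d "3 years ago" "$scratch/old-host/ping.rrd" "$scratch/old-host/load.rrd"

days=$((365*2))

# Files the article's filter would select.
stale_files=$(find "$scratch" -iname "*rrd" -mtime +"$days" -print)

# find prints "<size>+" for each match and "echo 0" closes the expression,
# giving something like "1024+512+0" that can then be summed.
expr=$({ find "$scratch" -iname "*rrd" -mtime +"$days" -printf "%s+"; echo 0; })
stale_bytes=$(( $expr ))

echo "Stale files:"
echo "$stale_files"
echo "Reclaimable bytes: $stale_bytes"

rm -rf "$scratch"
```

Only the two backdated files are listed and counted. The file under live-host is skipped even though it sits in the same tree, because its modification time is recent.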
find /opt/monitor/op5/pnp/perfdata/ -mtime +X -iname "*rrd" -exec rm {} \;
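Because the deletion cannot be undone, it can help to review the list of matching files first. The sketch below wraps the same find filter in a small helper that only lists files by default and deletes them when asked explicitly. The function name purge_stale_rrds and its --delete argument are hypothetical, not part of OP5. The "-type f" test and the "*.rrd" pattern are a slightly stricter variant of the command above, and "-delete" is GNU find's built-in alternative to "-exec rm".

```shell
#!/usr/bin/env bash
# Sketch only: purge_stale_rrds is a hypothetical helper, not an OP5 tool.
set -euo pipefail

# Usage: purge_stale_rrds DIR DAYS [--delete]
# Without --delete, matching files are only listed (dry run).
# With --delete, each matching file is printed and then removed.
purge_stale_rrds() {
    local dir=$1 days=$2 mode=${3:-}
    if [ "$mode" = "--delete" ]; then
        find "$dir" -type f -iname "*.rrd" -mtime +"$days" -print -delete
    else
        find "$dir" -type f -iname "*.rrd" -mtime +"$days" -print
    fi
}
```

A possible workflow is to run purge_stale_rrds /opt/monitor/op5/pnp/perfdata/ 730 first, check the listed paths, and then repeat the call with --delete appended once the list looks right.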