The messages log file grows to fill the /var/log volume, preventing services from starting on RSA NetWitness Host
Issue
The /var/log/messages log file grows until it occupies all available space in the /var/log partition, which prevents services such as nwlogcollector from starting.
For logrotate failures affecting other logs, refer to the following KBs:
#000030086 RabbitMQ in NetWitness 10.4.0.2 - The /var/log partition becomes full on an RSA Security Analytics Log Collector due to rabbitmq log files not rotating
#000037185 logstash in NetWitness 11.x - RSA NetWitness 11.x /var/log mount is full due to logstash directory
Cause
This issue is most often seen in RSA NetWitness Hybrid and All-In-One (AIO) appliances and Virtual Log Collectors (VLCs) due to the volume of entries that the nwlogcollector service writes to /var/log/messages. VLCs often have a smaller /var/log volume (e.g. 3.9G) than physical appliances (e.g. 9.8G).
To detect the problem, log into the affected host using SSH and run the following commands. The outputs in the examples below were taken from a VLC.
#
df -hP
Example Output:
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 9.9G 827M 8.6G 9% /
tmpfs 7.8G 0 7.8G 0% /dev/shm
/dev/mapper/VolGroup00-usr 3.9G 1.3G 2.4G 36% /usr
/dev/mapper/VolGroup00-usrhome 2.0G 3.1M 1.9G 1% /home
/dev/mapper/VolGroup00-var 3.9G 278M 3.4G 8% /var
/dev/mapper/VolGroup00-log 3.9G 3.9G 0 100% /var/log
/dev/mapper/VolGroup00-tmp 5.8G 12M 5.5G 1% /tmp
/dev/mapper/VolGroup00-vartmp 2.0G 3.0M 1.9G 1% /var/tmp
/dev/mapper/VolGroup00-opt 3.9G 468M 3.2G 13% /opt
/dev/mapper/VolGroup00-rabmq 10G 38M 10G 1% /var/lib/rabbitmq
/dev/mapper/VolGroup00-nwhome 12G 858M 12G 7% /var/netwitness
/dev/mapper/VolGroup01-logcoll 104G 1.4G 103G 2% /var/netwitness/logcollector
As the /dev/mapper/VolGroup00-log line in the output above shows, the /var/log volume has reached 100% utilization.
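The same check can be scripted so that a nearly full volume is noticed before services fail. The following is a minimal sketch, not part of the product: the function names usage_pct and check_volume and the 90% threshold are assumptions chosen for illustration, and it assumes the POSIX column layout produced by df -P.

```shell
#!/bin/bash
# usage_pct MOUNTPOINT
# Reads `df -P` output on stdin and prints the Use% value (without the
# percent sign) for the line whose last field is MOUNTPOINT.
usage_pct() {
    awk -v mp="$1" 'NR > 1 && $NF == mp { sub(/%/, "", $(NF-1)); print $(NF-1) }'
}

# check_volume MOUNTPOINT THRESHOLD
# Hypothetical helper: warns (and returns 1) when MOUNTPOINT is at or
# above THRESHOLD percent full.
check_volume() {
    local mp=$1 threshold=$2 pct
    pct=$(df -P "$mp" | usage_pct "$mp")
    if [ -z "$pct" ]; then
        echo "UNKNOWN: $mp is not a separate mount point"
        return 2
    fi
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: $mp is ${pct}% full"
        return 1
    fi
    echo "OK: $mp is ${pct}% full"
}
```

For example, check_volume /var/log 90 would print a warning on the VLC shown above.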
To locate which files and directories are occupying the most space, run the following command:
#
du -ahx /var/log | sort -h | tail
Example Output:
96M /var/log/rabbitmq
123M /var/log/maillog-20181019.gz
169M /var/log/netwitness/logcollector/NwServerLog-000000055.log
251M /var/log/netwitness/logcollector/NwServerLog-000000052.log
251M /var/log/netwitness/logcollector/NwServerLog-000000053.log
251M /var/log/netwitness/logcollector/NwServerLog-000000054.log
940M /var/log/netwitness/logcollector
941M /var/log/netwitness
2.7G /var/log/messages
3.9G /var/log
The output above identifies the problem: /var/log/messages (2.7G) accounts for most of the space used on the 3.9G /var/log volume.
Alternatively, use the 'ls' command, sorted by file size, to examine the files in the /var/log directory (hint: the -R switch recurses into subdirectories, but the -S switch still sorts files only within each directory):
#
ls -AhlSr /var/log
Example Output:
-rw-------. 1 root root 9.7M Dec 16 00:01 messages-20181019.gz
-rw-------. 1 root root 22M Dec 16 16:57 cron
-rw-------. 1 root root 50M Dec 16 16:57 secure
-rw-------. 1 root root 123M Dec 15 20:01 maillog-20181019.gz
-rw-------. 1 root root 2.7G Dec 16 16:52 messages
Note: If the usage reported by 'df -hP' and 'du -ahx' does not match, the likely cause is a logrotate failure in which a rotated log file was deleted while a process still holds it open, so its space has not been released. Run the following command to check for deleted but unreleased log files:
#
lsof -X /var/log 2>/dev/null | grep -E "(^COMMAND|\(deleted\))"
Example Output:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
rsyslogd 3104 root 1w REG 253,5 2516186901 58 /var/log/messages-20190212 (deleted)
To release the space taken by the deleted file (which rsyslogd is still holding through an open file handle), either reboot the OS or restart the syslog service.
#
service rsyslog restart
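To estimate how much space a restart would release, the sizes of the deleted but still open files can be totalled from the lsof output. This is a minimal sketch: the function name deleted_bytes is hypothetical, and it assumes the column layout shown in the example above, where SIZE/OFF is the seventh field.

```shell
#!/bin/bash
# deleted_bytes
# Reads lsof output on stdin and prints the total SIZE/OFF, in bytes, of
# all entries marked "(deleted)". Prints 0 when there are none.
deleted_bytes() {
    awk '/\(deleted\)$/ { total += $7 } END { printf "%.0f\n", total }'
}
```

It could be used as: lsof -X /var/log 2>/dev/null | deleted_bytes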
Workaround
BEFORE: The current configuration of logrotate for syslog services in 10.6.x is as follows:
#
cat /etc/logrotate.d/syslog
/var/log/cron
/var/log/maillog
/var/log/messages
/var/log/secure
/var/log/spooler
{
sharedscripts
postrotate
/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
endscript
}
AFTER: Edit the file so that it reads as follows:
/var/log/cron
/var/log/maillog
/var/log/messages
/var/log/secure
/var/log/spooler
{
weekly
rotate 4
maxsize 250M
dateext
notifempty
sharedscripts
postrotate
/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
endscript
}
With this configuration, /var/log/messages is rotated weekly (retaining 4 rotated logs) or when the file exceeds 250 MB, whichever comes first. Note that logrotate only checks the file size when it runs, so the size limit takes effect at the next logrotate run rather than at the moment the file reaches 250 MB.
The dateext option appends the rotation date to the filename, e.g. messages-20190212.
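Because logrotate evaluates the size limit only when it is invoked, it can be useful to check between runs whether a log has already passed the limit. The sketch below is an illustration only: the function name over_limit is hypothetical, and it assumes that the M suffix in the logrotate configuration means units of 1024 x 1024 bytes.

```shell
#!/bin/bash
# over_limit FILE LIMIT_MB
# Succeeds (exit 0) if FILE is larger than LIMIT_MB megabytes,
# fails (exit 1) if it is not, and returns 2 if FILE cannot be read.
over_limit() {
    local file=$1 limit_mb=$2 size_bytes
    size_bytes=$(wc -c < "$file") || return 2
    [ "$size_bytes" -gt $(( limit_mb * 1024 * 1024 )) ]
}
```

For example: over_limit /var/log/messages 250 && echo "/var/log/messages is due for rotation"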
Test that the configuration is correct by running logrotate manually using the following command. Because the -d (debug) flag is included, this is a dry run and no log files are actually rotated:
#
logrotate --force -vd /etc/logrotate.d/syslog
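The verbose output of the dry run can be long, so it may help to filter it down to the lines that report a problem. This is a rough sketch under two assumptions: that logrotate prefixes configuration problems with "error:", and that the helper names config_errors and check_rotation_config (both hypothetical) suit your environment.

```shell
#!/bin/bash
# config_errors
# Reads logrotate output on stdin and prints only the lines that start
# with "error:" (assumed to be how logrotate reports problems).
config_errors() {
    grep '^error:'
}

# check_rotation_config [CONF]
# Hypothetical wrapper: performs a dry run (-d makes no changes) against
# CONF and returns 1 if any error lines were reported.
check_rotation_config() {
    local conf=${1:-/etc/logrotate.d/syslog} errors
    errors=$(logrotate -d "$conf" 2>&1 | config_errors)
    if [ -n "$errors" ]; then
        echo "$errors"
        return 1
    fi
    echo "no errors reported for $conf"
}
```

Running check_rotation_config with no arguments checks /etc/logrotate.d/syslog.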
If you are unsure of any of the steps above or experience any issues, contact RSA Customer Support and reference this article for further assistance.
Resolution
The logrotate configuration needs to be adjusted by editing /etc/logrotate.d/syslog to allow the normal rotation of /var/log/messages.
Notes
If logrotate is still not working after applying the above steps, the syslog service may need to be restarted as shown below.
#
service rsyslog restart
Note: Other non-standard packages installed on the host, such as syslog-ng, may also cause logrotate to fail because they hold additional file handles on /var/log/messages. RSA Support recommends removing these non-standard packages. You may be able to find these processes using the following command:
#
lsof +D /var/log | grep messages
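To narrow the lsof output down to processes other than rsyslogd, a small filter can be applied. This is a sketch only: the function name other_holders is hypothetical, and it assumes the standard nine-column lsof layout (COMMAND, PID, USER, FD, TYPE, DEVICE, SIZE/OFF, NODE, NAME) with NAME as the ninth field.

```shell
#!/bin/bash
# other_holders
# Reads lsof output on stdin and prints a unique "COMMAND PID" pair for
# every process other than rsyslogd that has /var/log/messages open.
other_holders() {
    awk '$1 != "COMMAND" && $1 != "rsyslogd" && $9 == "/var/log/messages" { print $1, $2 }' | sort -u
}
```

It could be used as: lsof +D /var/log | other_holders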
Internal Comments
Lee McCotter -- 13 Mar 2019: Unfortunately, versions of this KB prior to today may not work if /var/log/messages manages to fill the volume within a single day (despite the KB's extensive application on customer sites).
Editing /etc/logrotate.d/syslog to the following and restarting the OS does not prevent logrotate from failing due to lack of space when creating the new /var/log/messages:
/var/log/cron
/var/log/maillog
/var/log/messages {
daily
rotate 5
size=300M
dateformat -%Y%m%d-%s
sharedscripts
postrotate
/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
endscript
}
/var/log/secure
/var/log/spooler
{
sharedscripts
postrotate
/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
endscript
}
Product Details
RSA Product Set: NetWitness Logs & Network/Security Analytics
RSA Product/Service Type: NetWitness Appliances (including Hybrid & All-in-One appliances), VLC hosts
RSA Version/Condition: 10.4.x, 10.5.x, 10.6.x
Platform: CentOS
O/S Version: 6