Fix an issue with RabbitMQ lacking consumers on log queues after upgrading to 11.5.3 leading to log collection issues

Issue

The issue is related to a corrupt version of the nw_admin.ez plug-in, which assists RabbitMQ communication between the queues on the Remote and Local Log Collectors.


After upgrade, one or more event types are no longer being passed from a Remote Log Collector to a Local Log Collector, resulting in:
  1. Missing logs in Investigate
  2. H&W Alerts for "LogCollector Event Processor Queue with No Consumer" or similar
  3. H&W Alerts for "Critical RabbitMQ Queue Message Count"
  4. H&W Alerts for disk space due to a buildup of RDQ files on the Collectors


Cause

This behavior has been observed on upgrades to 11.5.3 and potentially 11.5.2 and may be attributable to a corrupt nw_admin.ez plug-in.


Workaround

Confirm the problem:
  1. RabbitMQ shovels show green on either the RLC (push) or the Local LC (pull). Green indicates only that the shovel is up and running; it does not highlight missing consumers.
  2. Run the following command on the Remote Log Collector and the Local Log Collector. When the problem is present, it shows one or more event queues with no consumers.
Example output with all consumers present:
[root@NW11-LOG-HYBRID ~]# rabbitmqctl list_queues -p logcollection name messages consumers
Timeout: 60.0 seconds ...
Listing queues for vhost logcollection ...
name    messages    consumers
LogDecoder.logdecoder.windows    0    1
LogDecoder.logdecoder.checkpoint    0    1
LogDecoder.logdecoder.syslog    0    1
LogDecoder.logdecoder.file    0    1
LogDecoder.logdecoder.netflow    0    1
LogDecoder.logdecoder.sdee    0    1
LogDecoder.logdecoder.snmptrap    0    1
LogDecoder.logdecoder.vmware    0    1
rabbitmq.log    0    1
LogDecoder.logdecoder.cmdscript    0    1
LogDecoder.logdecoder.windowslegacy    0    1
LogDecoder.logdecoder.odbc    0    1

Example output with missing consumers (note the windows and syslog queues):
[root@NW11-LOG-HYBRID ~]# rabbitmqctl list_queues -p logcollection name messages consumers
Timeout: 60.0 seconds ...
Listing queues for vhost logcollection ...
name    messages    consumers
LogDecoder.logdecoder.windows    0     0
LogDecoder.logdecoder.checkpoint    0    1
LogDecoder.logdecoder.syslog    0     0
LogDecoder.logdecoder.file    0    1
LogDecoder.logdecoder.netflow    0    1
LogDecoder.logdecoder.sdee    0    1
LogDecoder.logdecoder.snmptrap    0    1
LogDecoder.logdecoder.vmware    0    1
rabbitmq.log    0    1
LogDecoder.logdecoder.cmdscript    0    1
LogDecoder.logdecoder.windowslegacy    0    1
LogDecoder.logdecoder.odbc    0    1
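
With many queues, a zero in the consumers column is easy to miss by eye. As a minimal sketch, the output of the command above could be piped through a small filter that prints only the queues with no consumers. The function name `queues_without_consumers` is hypothetical and not part of RabbitMQ or NetWitness, and the filter assumes queue names contain no whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not a vendor tool): reads the output of
#   rabbitmqctl list_queues -p logcollection name messages consumers
# on stdin and prints the name of every queue whose consumer count is 0.
queues_without_consumers() {
    # Data rows have exactly three fields: name, messages, consumers.
    # Status lines ("Timeout: ...", "Listing queues ...") have more fields,
    # and the header row has a non-numeric second field, so both are skipped.
    awk 'NF == 3 && $2 ~ /^[0-9]+$/ && $3 ~ /^0$/ { print $1 }'
}

# Example usage on a Log Collector (commented out so that sourcing this
# file does not call rabbitmqctl):
# rabbitmqctl list_queues -p logcollection name messages consumers \
#     | queues_without_consumers
```

Run against the second example above, this would print LogDecoder.logdecoder.windows and LogDecoder.logdecoder.syslog. Empty output means every listed queue has at least one consumer.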


Apply the workaround:
  • Download the provided nw_admin-11.5.3.0.ez
  • SCP it to /root/ on all Log Collectors (remote and local)
  • Run sha256sum on it to confirm that it is the correct version and was not corrupted in transit.
[root@NW11-LOG-HYBRID ~]# sha256sum nw_admin-11.5.3.0.ez 
b3efc16dee21d2df97f859fdb6eecb3995597671fccd30cc38396d4c4c1712b3  nw_admin-11.5.3.0.ez
  • Stop rabbitmq-server: systemctl stop rabbitmq-server
  • Back up the nw_admin.ez plug-in currently on the filesystem by moving it out of the plugins directory:
    • mv /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.3/plugins/nw_admin.ez /root/nw_admin.ez.BAK
  • Copy the new version to the appropriate location, taking care to rename it to simply "nw_admin.ez":
    • cp /root/nw_admin-11.5.3.0.ez /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.3/plugins/nw_admin.ez
  • Change the permissions:
[root@NW11-LOG-HYBRID ~]# cd /usr/lib/rabbitmq/lib/rabbitmq_server-3.8.3/plugins/
[root@NW11-LOG-HYBRID plugins]# chmod 644 nw_admin.ez
  • Verify the sha256sum and permissions once more:
[root@NW11-LOG-HYBRID plugins]# sha256sum nw_admin.ez 
b3efc16dee21d2df97f859fdb6eecb3995597671fccd30cc38396d4c4c1712b3  nw_admin.ez
[root@NW11-LOG-HYBRID plugins]# ls -lrth nw_admin.ez 
-rw-r--r--. 1 root root 49K Apr 20 19:56 nw_admin.ez
  • Back up the original plug-in in the Chef reference directory, then copy the new plug-in there for potential future use:
[root@NW11-LOG-HYBRID ~]# cp /opt/netwitness/nw_admin-11.5.3.0.ez /opt/netwitness/nw_admin-11.5.3.0.ez-ORIG
[root@NW11-LOG-HYBRID ~]# cp /root/nw_admin-11.5.3.0.ez /opt/netwitness/nw_admin-11.5.3.0.ez
cp: overwrite ‘/opt/netwitness/nw_admin-11.5.3.0.ez’? y
  • Verify the sha256sum of the backup and patched version once again:
[root@NW11-LOG-HYBRID ~]# sha256sum /opt/netwitness/nw_admin-11.5.3.0.ez*
b3efc16dee21d2df97f859fdb6eecb3995597671fccd30cc38396d4c4c1712b3  /opt/netwitness/nw_admin-11.5.3.0.ez
ef7569e292be011ef130c3e7b838026f87792afe90ab6cf7a738deac924ec65d  /opt/netwitness/nw_admin-11.5.3.0.ez-ORIG
  • Start rabbitmq-server: systemctl start rabbitmq-server
  • Restart the nwlogcollector service: systemctl restart nwlogcollector
    • Do this on both the local and remote LCs.
  • Verify that rabbitmq-server now recognizes the patched version (it should show as "11.5.1.0") by running "rabbitmq-plugins list":
[root@NW11-LOG-HYBRID ~]# rabbitmq-plugins list | grep nw_admin
[E*] nw_admin                          11.5.1.0
  • Then verify that the consumers have returned:
[root@NW11-LOG-HYBRID plugins]# rabbitmqctl list_queues -p logcollection name messages consumers
Timeout: 60.0 seconds ...
Listing queues for vhost logcollection ...
name    messages    consumers
LogDecoder.logdecoder.windows    0     1
LogDecoder.logdecoder.checkpoint    0    1
LogDecoder.logdecoder.syslog    0     1
LogDecoder.logdecoder.file    0    1
LogDecoder.logdecoder.netflow    0    1
LogDecoder.logdecoder.sdee    0    1
LogDecoder.logdecoder.snmptrap    0    1
LogDecoder.logdecoder.vmware    0    1
rabbitmq.log    0    1
LogDecoder.logdecoder.cmdscript    0    1
LogDecoder.logdecoder.windowslegacy    0    1
LogDecoder.logdecoder.odbc    0    1
  • After 15 minutes, check in the UI whether normal log flow has returned for each lc.cid.
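
The checksum comparisons in the steps above are done by eye. As a minimal sketch, the same check could be scripted with `sha256sum -c`, which exits non-zero on a mismatch, so that a corrupted copy is caught before rabbitmq-server is stopped. The function name `verify_plugin_checksum` and the variable `EXPECTED_SHA256` are hypothetical; the hash is the one quoted in this article for the patched plug-in.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not a vendor tool): compare a file's
# SHA-256 digest against an expected value.
# Returns 0 when the digest matches, non-zero otherwise.
verify_plugin_checksum() {
    local file="$1" expected="$2"
    # sha256sum -c reads "<hash>  <filename>" lines (here from stdin) and
    # checks each named file; --status suppresses output so that only the
    # exit code reports the result.
    printf '%s  %s\n' "$expected" "$file" | sha256sum -c --status -
}

# Digest of the patched plug-in as published in this article.
EXPECTED_SHA256="b3efc16dee21d2df97f859fdb6eecb3995597671fccd30cc38396d4c4c1712b3"

# Example usage on each Log Collector (commented out so that sourcing this
# file has no side effects):
# if verify_plugin_checksum /root/nw_admin-11.5.3.0.ez "$EXPECTED_SHA256"; then
#     echo "checksum OK"
# else
#     echo "checksum MISMATCH, do not install this file" >&2
# fi
```

The same function can be reused after the copy step by pointing it at nw_admin.ez in the plugins directory.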

Additional Instructions for Windows Legacy Collectors (having issues with rabbitmq crashing after upgrade to 11.5.3.0):

  1. Stop rabbitmq and nwlogcollector services from "services.msc"
  2. Back up and remove the existing nw_admin plug-in from the two locations below. (Note: there may be previous versions in addition to 11.5.3.0; remove all nw_admin plug-ins in the following directories.)
    1. C:\Program Files\RabbitMQ Server\rabbitmq_server-3.8.3\plugins
    2. C:\Program Files\NwLogCollector
  3. Copy the nw_admin.ez plug-in attached to this KB (the same one used on the CentOS appliances) to the following directories:
    1. C:\Program Files\RabbitMQ Server\rabbitmq_server-3.8.3\plugins
    2. C:\Program Files\NwLogCollector
  4. Start rabbitmq and nwlogcollector services from "services.msc"


Resolution

Install the patched version of the nw_admin.ez plug-in and restart the appropriate services, as described in the workaround above.


Internal Comments

See the following JIRAs for more information:
  • https://bedfordjira.na.rsa.net/browse/SACE-15832
  • https://bedfordjira.na.rsa.net/browse/ASOC-109851


Product Details

  • RSA Product Set: NetWitness Platform
  • RSA Product/Service Type: Log Collector, Log Decoder
  • RSA Version/Condition: 11.5.2, 11.5.3
  • Platform: CentOS
  • O/S Version: 7


Summary

After upgrading to 11.5.3, logs may unexpectedly stop flowing from remote to local collectors. Running rabbitmqctl list_queues shows that some (but not necessarily all) queues have no consumers. This occurs in both push and pull environments.


Approval Reviewer Queue

Technical approval queue