Performance of Concentrator Service on Hybrid Appliance is slow in RSA Security Analytics

Issue

After upgrading to Security Analytics 10.5.X or 10.6.X, performance of the Concentrator service on the hybrid appliance degrades.

Indications of Concentrator Performance Issues:

A large number of sessions behind, which indicates that Concentrator aggregation of Decoder sessions is falling behind.
Recurring Health & Wellness alarms of 'Concentrator Meta Rate Zero' that periodically clear on their own.
Slow Investigation queries against the Concentrator, even for relatively small time ranges.

Note: Hybrid appliances in this article refer to All-In-One appliances as well.

Cause

Some of the default settings in the SA 10.5.X and 10.6.X releases are not optimal for Hybrid appliances.

Workaround

In the SA UI, use the Explore view of the Concentrator service to make the following changes (shown as existing value => new value):
/sdk/config/max.concurrent.queries=13 => 8
/sdk/config/parallel.values (Parallelize all values operations)=16 => 8
/database/config/session.files=auto => 50
/database/config/meta.files=auto => 50
The above values can also be applied to the Log Decoder or Packet Decoder service.

A Decoder service also has the additional setting:
/database/config/packet.files=auto => 50

Concentrator Index Checks
* Check the number of slices (should be 400 or less)
/index/stats/slices.total (Index Slice Total)

* Check size of index slices
cd /var/netwitness/concentrator
du -h index

Note: Index slices on a hybrid appliance should each be <= 10G.
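The size check above can be automated with a small filter. This is a minimal sketch, not an RSA tool: oversized_slices is a hypothetical helper, and the usage line assumes that each index slice is a separate entry directly under the index directory, which may not match every installation.

```shell
# Hypothetical helper: reads `du -k` output (size in KiB, a tab, then the path)
# on stdin and prints the entries larger than the given limit in GiB.
oversized_slices() {
    limit_gib=${1:-10}
    awk -F'\t' -v limit="$limit_gib" \
        '$1 > limit * 1024 * 1024 { printf "%.1fG\t%s\n", $1 / 1048576, $2 }'
}

# Usage on the appliance (assumed layout, one entry per slice):
# du -sk /var/netwitness/concentrator/index/* | oversized_slices 10
```

No output means that no entry exceeded the limit.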

* Check /index/config/save.session.count (600000000 by default in 10.5.X and later)
If /index/config/save.session.count=0, index slice creation is still controlled by the service scheduler.
Expand the scheduler node, which will contain an entry similar to:
/sys/config/scheduler/351 = hours=8 pathname=/index msg=save

If /index/config/save.session.count=0 and the index save schedule is every 8 hours, 3 index slices are created per day, or at least 21 per week.
Assuming that the majority of queries cover 7 days or less, set the open slice count to 21 + 1 (the current index slice):
/index/config/index.slices.open (Index Open Slice Count) = 0 => 22
This change should reduce the maximum amount of memory the Concentrator service can use for queries.

Note: Change is immediate and does not require a service restart.
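The calculation above generalizes to other save intervals. The following is a minimal sketch of that arithmetic; slices_open is a hypothetical helper name, and the inputs are the hours= value from the scheduler entry and the longest typical query window in days.

```shell
# Hypothetical helper: suggested index.slices.open for a scheduler-driven save.
slices_open() {
    save_hours=$1    # hours= value from the /index save scheduler entry
    query_days=$2    # longest typical query window, in days
    # Ceiling division, so a partial save interval still counts as a slice.
    per_window=$(( (query_days * 24 + save_hours - 1) / save_hours ))
    # Add one for the current (still open) index slice.
    echo $(( per_window + 1 ))
}

slices_open 8 7    # prints 22
```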

If /index/config/save.session.count=600000000, then you will need to calculate how many days 600M sessions represent and reduce the value to a number that corresponds to 1 - 7 days.
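As a rough illustration of that calculation, the sketch below converts between a session count and days of traffic. Both function names are hypothetical, and the rate of 100000000 sessions per day is an illustrative assumption; substitute the rate observed on your own appliance.

```shell
# Hypothetical helpers, not RSA tools.

# Days of traffic covered by one index slice at a given save.session.count.
days_per_slice() {
    save_count=$1
    sessions_per_day=$2
    awk -v c="$save_count" -v r="$sessions_per_day" \
        'BEGIN { printf "%.1f\n", c / r }'
}

# save.session.count value that makes one slice span the target number of days.
count_for_days() {
    sessions_per_day=$1
    target_days=$2
    echo $(( sessions_per_day * target_days ))
}

days_per_slice 600000000 100000000   # prints 6.0
count_for_days 100000000 2           # prints 200000000
```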

From 10.5.X onward, a manifest file is created in each index slice. A CSV summarizing the slices can be generated from these manifests using the following 2 commands:
echo 'Index Slice,1st Session,Last Session,Time Start,Time End' > /root/concentrator.index.slices.csv
find /var/netwitness/concentrator* -name "managed-values-*.manifest" -print0 | xargs -0 -I % grep -E "(filename|id|time)" % | cut -d: -f2 | sed -r 's/\"managed-values-([0-9]+)\"/\1/g' | sed -r 's/([0-9]+)$/\1*/g' | tr "\n" " " | sed 's/ //g' | sed 's/*/\n/g' | sort -n >> /root/concentrator.index.slices.csv

Once /index/config/save.session.count has been lowered (for example, to 200000000), /index/config/index.slices.open needs to be adjusted to reflect the normal query time range.
For example, if 200M sessions span 2 days, only 4 index slices may be needed for queries covering the last 7 days.
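The example above is a ceiling division of the query window by the time one slice spans. A minimal sketch, assuming a hypothetical helper name and an illustrative rate of 100000000 sessions per day (so that 200M sessions span 2 days):

```shell
# Hypothetical helper: index slices needed to cover a query window when
# slices are cut by session count.
slices_for_window() {
    save_count=$1          # /index/config/save.session.count
    sessions_per_day=$2    # measured on the appliance (assumed value below)
    query_days=$3          # normal query time range, in days
    needed=$(( query_days * sessions_per_day ))
    # Ceiling division: a partly used slice still has to be open.
    echo $(( (needed + save_count - 1) / save_count ))
}

slices_for_window 200000000 100000000 7   # prints 4
```

This sketch does not add an extra slice for the current one; add 1 to the result if you want to mirror the scheduler-based calculation.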

Product Details

RSA Product Set: Security Analytics, RSA NetWitness Logs & Network
RSA Product/Service Type: All-In-One for Logs Appliance, All-In-One for Packets Appliance, SA Log Hybrid and SA Packet Hybrid
RSA Version/Condition: 10.5.X, 10.6.X
Platform: CentOS
O/S Version: EL6

Summary

How to tune services on hybrid appliances for better performance.
