How to migrate an existing core appliance to a new nw-node-zero in NetWitness 11.x
Issue
Moving a core appliance to a new nw-node-zero leaves its certificates stale, and its services will not come back online after installation on the new nw-node-zero. Errors similar to the examples below appear in the /var/log/messages file on the component host being migrated.
Jun 12 14:53:52 phybrid1 NwDecoder[1238]: [Login] [audit] Failed login attempt for nonexistent user 'escalateduser' from 192.168.2.102:36800
Jun 12 22:36:58 phybrid1 NwConcentrator[1218]: {"deviceVendor":"RSA","deviceProduct":"NetWitness","deviceService":"CONCENTRATOR","deviceVersion"...ailure"}
Jun 12 22:36:58 phybrid1 NwConcentrator[1218]: [Login] [audit] Failed login attempt for nonexistent user 'escalateduser' from 192.168.2.102:39178
The errors above are due to stale certificates, truststores and/or trustpeers from the old nw-node-zero.
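To confirm that a host is showing this symptom, the failed-login lines can be summarized by user and source address. The following is a minimal sketch, assuming the log lines match the format of the examples above; the function name failed_login_sources is hypothetical and not part of NetWitness.

```shell
# Read syslog lines on stdin and print each unique "user source-ip" pair
# that appears in a "Failed login attempt for nonexistent user" message.
# Assumption: lines follow the format shown in the examples above.
failed_login_sources() {
    grep -F "Failed login attempt for nonexistent user" |
        sed -E "s/.*nonexistent user '([^']+)' from ([0-9.]+):[0-9]+.*/\1 \2/" |
        sort -u
}
```

For example, failed_login_sources < /var/log/messages on the component host. If the source address is the old nw-node-zero, that suggests the host is still bound to the old environment.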
Tasks
PLEASE READ: This KB has a high success rate for Core Devices and was originally designed for those devices only. These include Decoders, Log Decoders, Log Collectors, Concentrators, Brokers, and Archivers. Hybrids made up solely of these variants will also work. Endpoint Hybrids are a special case that may not be properly covered here.
Aaron Martin has included the steps he has used for ESA devices and Endpoint Hybrids in the past, but at the time of this writing he does not have enough examples to say with high confidence that those steps work as well as they do for Core Devices. For these devices, take the steps with a grain of salt. In many cases it is easier to reimage the device, and in some scenarios reimaging may be the only solution. Reimaging also avoids unforeseen problems that may arise from moving the device into the new environment, as none of this is tested.
Moving these devices can also cause many unexpected problems that will need to be corrected. This article is not a complete list of them, as new ones are still being observed with each successful migration.
Resolution
Follow the steps below to move the component host from the OLD nw-node-zero to the NEW nw-node-zero. A script is provided that does much of the heavy lifting, but it is still a work in progress and may need to be modified for the scenario at hand. If the script fails partway, it is important to understand what the manual steps do so that you can resolve the failure. Additional troubleshooting information is included at the bottom, because some problems can occur whether or not you perform the steps manually. Proceed with caution, and let Aaron Martin know of any problems with the script so that they can be addressed.
Automatic Steps
The script is a work in progress, but it works pretty well for at least Core devices. A script should be attached to this article, but the latest version that I have uploaded should also be available at the following GitHub link. Send any feedback or bug reports about the script to Aaron Martin, and note that your mileage may vary.
https://github.com/martina3203/RSA-NetWitness-Scripts/blob/master/MoveToNewNodeZero.sh
Manual Steps
From the component host to be migrated:
- Get the UUID of the host by running the command below.
cat /etc/salt/minion
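The UUID can also be pulled out of the file directly instead of reading it by eye. This is a minimal sketch, assuming the minion config stores the UUID on a line of the form "id: <UUID>"; the function name minion_id is hypothetical.

```shell
# Print the value of the "id" key from a salt minion config file.
# Assumption: the UUID is stored on a line of the form "id: <UUID>".
# Usage: minion_id [file]   (defaults to /etc/salt/minion)
minion_id() {
    local file="${1:-/etc/salt/minion}"
    awk -F': *' '$1 == "id" { print $2; exit }' "$file"
}
```

Running minion_id with no argument reads /etc/salt/minion; the printed value is what the later orchestration-cli-client --remove-key step expects as <UUID>.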
From the OLD or existing nw-node-zero:
- Remove the component host from the Hosts view in the RSA NetWitness UI.
- Remove the UUID of the component host by running the command below.
orchestration-cli-client --remove-key <UUID>
From the NEW nw-node-zero:
- Issue the command below to show all keys on the RSA NetWitness Admin Server and note any denied, rejected, or unaccepted keys.
salt-key
- If necessary, issue the command below to remove and re-add any UUID identified in the previous step.
orchestration-cli-client --remove-key <UUID>
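When many hosts are registered, the keys that need attention can be filtered out of the salt-key listing. The sketch below assumes the usual salt-key layout of section headings ("Accepted Keys:", "Denied Keys:", "Unaccepted Keys:", "Rejected Keys:") each followed by one key per line; the function name non_accepted_keys is hypothetical.

```shell
# Read `salt-key` style output on stdin and print every key that is NOT
# listed under "Accepted Keys:", prefixed by the name of its section.
# Assumption: each section starts with a heading line ending in "Keys:".
non_accepted_keys() {
    awk '
        /Keys:$/ { section = $1; next }   # "Denied Keys:" -> section "Denied"
        NF && section != "" && section != "Accepted" { print section, $1 }
    '
}
```

A typical invocation would be salt-key --no-color | non_accepted_keys, where --no-color keeps terminal color codes out of the piped text. Each printed UUID is a candidate for the --remove-key step above.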
From the component host. This is any host that is NOT the Admin Server:
- Move the /etc/salt/pki/minion/minion_master.pub file to the /tmp directory.
mv /etc/salt/pki/minion/minion_master.pub /tmp
- Restart salt-minion with the command below.
systemctl restart salt-minion
- Move /etc/netwitness/platform to the /tmp directory.
mv /etc/netwitness/platform /tmp
- Move /etc/netwitness/security-cli to the /tmp directory.
mv /etc/netwitness/security-cli /tmp
- Move /etc/netwitness/orchestration-client to the /tmp directory. This will only exist on a few types of devices.
mv /etc/netwitness/orchestration-client /tmp
- Move /etc/netwitness/ng/appliance to the /tmp directory.
mv /etc/netwitness/ng/appliance /tmp
- Stop any services on the device that are considered core or tied to NetWitness in general.
systemctl stop <service>
Services to consider:
- Any Core Services: nwdecoder, nwarchiver, nwconcentrator, nwappliance, nwlogcollector, nwworkbench, nwbroker, nwlogdecoder
- Launch Services: rsa-nw-correlation-server, rsa-nw-contexthub-server, rsa-nw-esa-analytics-server, rsa-nw-endpoint-server
- RabbitMQ: rabbitmq-server
- Mongo Database: mongod
- Move /etc/netwitness/ng/<service> to the /tmp directory (for example, /etc/netwitness/ng/decoder, /etc/netwitness/ng/concentrator, and so forth).
mv /etc/netwitness/ng/<service> /tmp
- Move /etc/pki/nw to the /tmp directory.
mv /etc/pki/nw /tmp
- If you are on 11.4, clean up the node-infra-server service information for it to recreate new truststores.
mv /etc/netwitness/node-infra-server /tmp
mv /etc/systemd/system/rsa-nw-node-infra-server.service.d /tmp
systemctl daemon-reload
- If you are on an ESA Primary or ESA Secondary device, you must do the following:
mv /etc/netwitness/contexthub-server /etc/netwitness/correlation-server /etc/netwitness/esa-analytics-server /tmp
#The purpose of these two lines is so that Chef knows not to try to reconfigure an already configured mongo database.
mkdir -p /etc/netwitness/platform/mongo
touch /etc/netwitness/platform/mongo/mongo.registered
mv /etc/systemd/system/rsa-nw-contexthub-server.service.d /etc/systemd/system/rsa-nw-correlation-server.service.d /etc/systemd/system/rsa-nw-esa-analytics-server.service.d /tmp
systemctl daemon-reload
- If you are on an Endpoint Device, you can attempt to do the following:
mv /etc/netwitness/endpoint-server /tmp
#The purpose of these two lines is so that Chef knows not to try to reconfigure an already configured mongo database.
mkdir -p /etc/netwitness/platform/mongo
touch /etc/netwitness/platform/mongo/mongo.registered
mv /etc/systemd/system/rsa-nw-endpoint-server.service.d /tmp
systemctl daemon-reload
- Run the nwsetup-tui command on the component host.
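The repeated mv steps above can be wrapped in a small helper that skips paths that do not exist on a given device type. This is only a sketch and not the attached script: the function name stash_paths and the dedicated backup directory are assumptions. A dedicated directory is used here rather than /tmp itself so that generic names such as platform or appliance cannot collide with something already in /tmp.

```shell
# Move each existing path into a backup directory; skip paths that are absent.
# Usage: stash_paths <backup_dir> <path>...
stash_paths() {
    local dest="$1"; shift
    mkdir -p "$dest" || return 1
    local p
    for p in "$@"; do
        if [ -e "$p" ]; then
            mv -- "$p" "$dest"/ && echo "moved $p"
        else
            echo "skipped $p (not present)"
        fi
    done
}
```

For a Core device this might be called as stash_paths /tmp/old-node-zero /etc/netwitness/platform /etc/netwitness/security-cli /etc/netwitness/orchestration-client /etc/netwitness/ng/appliance /etc/pki/nw, after the services have been stopped. Keeping the old files together also makes it easier to compare them later if trust problems appear.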
From the NEW nw-node-zero:
- Discover the migrated component host in the RSA NetWitness UI.
- Select Install Correct Service for the component host.
- After following the instructions above, watch or tail the chef-solo.log file on the component host while orchestrating/installing to confirm that the chef runs completed successfully.
tailf /var/log/netwitness/config-management/chef-solo.log
- Confirm that the new component host has been added to the RSA NetWitness UI and that its services are online. You may need to wait a while for the services to show as online.
- Finally, configure the component host as necessary on the new nw-node-zero environment.
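Instead of watching chef-solo.log by eye, the outcome of the most recent run can be summarized. The sketch below assumes that a failed run logs a line containing FATAL and that a successful run logs "Chef Run complete"; these are standard Chef log messages, but they have not been confirmed against every NetWitness version. The function name chef_run_status is hypothetical.

```shell
# Summarize the latest outcome recorded in a chef-solo log.
# Prints "failed", "complete", or "in-progress".
# Assumption: failures log "FATAL" and successes log "Chef Run complete".
chef_run_status() {
    local log="${1:-/var/log/netwitness/config-management/chef-solo.log}"
    awk '
        /FATAL/             { status = "failed" }
        /Chef Run complete/ { status = "complete" }
        END { print (status ? status : "in-progress") }
    ' "$log"
}
```

Because the last matching line wins, a log that contains an earlier failed run followed by a later successful one reports complete.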
Things to look out for before/after moving the ESA to another node zero.
If the ESA that was moved was a primary, Alert and Incident data still exist on it. If it was moved to another environment to become that environment's primary, the Incident counter on the Admin Server (INC-1, INC-2, ...) may be off. You will need to correct this in mongo on the Admin Server; the easiest method is to take the highest INC number, add 1 to it, and update the value with the result.
[root@NW11-Admin ~]# mongo admin -u deploy_admin -p NetWitness
MongoDB shell version v4.0.13
connecting to: mongodb://127.0.0.1:27017/admin?gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("629f1ca2-261c-46a9-88c8-242c1056625e") }
MongoDB server version: 4.0.13
> use respond-server
switched to db respond-server
> show collections
aggregation_rule
categories
risk_rule
setting
tracking_id_sequence
> db.tracking_id_sequence.find()
{ "_id" : "incident", "lastId" : NumberLong(1) }
> db.tracking_id_sequence.update({},{"lastId": NumberLong(7783)})
Failure to do the above may prevent incidents from being generated due to a duplicate index key.
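The value to write can be computed from a list of existing incident IDs. This is a small arithmetic sketch, assuming the IDs are available one per line in the form INC-<number>; how that list is obtained is left open, and the function name next_incident_id is hypothetical.

```shell
# Read incident IDs (one per line, e.g. "INC-7782") on stdin and print
# the highest number plus 1. Prints 1 when no IDs are supplied.
next_incident_id() {
    awk -F'-' '$1 == "INC" && $2 + 0 > max { max = $2 + 0 } END { print max + 1 }'
}
```

With INC-7782 as the highest existing incident, this prints 7783, which is the value used in the update shown above.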
Things to look out for before/after moving the Endpoint Server to another node zero.
A new environment means a new certificate chain, so any deployed agents will probably need to be redeployed. In the one circumstance where Aaron Martin got this to work, after getting the service reinstalled, he had to reconfigure Meta Forwarding and perform a cert reissue on the device once it was in the environment. Bear this in mind before going down this road; I do not know whether the device can continue to be used, as agents would need to be redeployed. In most cases it is probably better to reimage it, especially if you do not like to gamble.
Notes
Related knowledge articles:
- RSA NetWitness 11.x Admin Server does not discover new hosts
- How to add hosts or services back to the UI in RSA NetWitness Logs & Packets 11.0
When running nwsetup-tui, it complains about missing cookbooks shortly after finishing the prompts in the nwsetup-tui:
sample error:
[2020-05-07T09:06:35+00:00] <8708> (ERROR) Could not locate RSA cookbooks: '/var/lib/netwitness/config-management'
[2020-05-07T09:06:35+00:00] <8635> (ERROR) [nwsetup] Installation failed [error 20: bootstrap]!
Sometimes when performing my steps or because of other issues, you may need to reinstall the config-management rpm to bring back the cookbooks.
yum reinstall rsa-nw-config-management
When running nwsetup-tui, it complains about a missing /var/lib/netwitness/component-descriptor/bin/descriptor-generator.sh:
sample error:
[2020-05-07T10:19:25+00:00] <12049> (INFO) Configuring node-x node...
/usr/bin/bootstrap: line 653: /var/lib/netwitness/component-descriptor/bin/descriptor-generator.sh: No such file or directory
[2020-05-07T10:19:25+00:00] <12049> (ERROR) Failed to generate node.json at: '/etc/netwitness/config-management'
[2020-05-07T10:19:25+00:00] <11972> (ERROR) [nwsetup] Installation failed [error 20: bootstrap]!
Sometimes when performing my steps or because of other issues, you may need to reinstall the component-descriptor rpm.
yum reinstall rsa-nw-component-descriptor
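Both of the failures above come down to a path that nwsetup-tui expects being absent, so the paths can be checked before the tool is run. This is a minimal sketch; the function name missing_paths is hypothetical, and the paths in the usage note are taken from the error messages above.

```shell
# Print a line for every argument that does not exist on disk.
# Returns 0 when everything is present and 1 when anything is missing.
# Usage: missing_paths <path>...
missing_paths() {
    local p rc=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p"
            rc=1
        fi
    done
    return "$rc"
}
```

For example: missing_paths /var/lib/netwitness/config-management /var/lib/netwitness/component-descriptor/bin/descriptor-generator.sh. A non-zero return indicates that the corresponding yum reinstall is probably needed first.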
Failure to find the device upon Discovery:
Ensure that the salt minion is running on the host and can communicate on the appropriate ports (4505 and 4506 from component host to Admin Server). Review the logs in /var/log/salt/minion on the node itself and /var/log/salt/master on the head unit.
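Connectivity on the two salt ports can be checked from the component host without extra tools. The sketch below relies on bash's /dev/tcp pseudo-device and the coreutils timeout command; the function name check_ports and the three-second limit are assumptions.

```shell
# Report whether each TCP port on a host accepts a connection.
# Usage: check_ports <host> <port>...
# Relies on bash's /dev/tcp pseudo-device and coreutils `timeout`.
check_ports() {
    local host="$1"; shift
    local port
    for port in "$@"; do
        if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            echo "$host:$port open"
        else
            echo "$host:$port closed or filtered"
        fi
    done
}
```

Run from the component host as check_ports <Admin Server IP> 4505 4506. A "closed or filtered" result points to a firewall or routing problem, or to the salt master not listening, rather than to a problem with the minion itself.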
Upon Service Installation, the core services show up as red in the UI yet appear to be running on the command line.
Check in the logs for the reason for this as you may find that the trust between these devices is broken. It is possible that the services have not restarted since the rediscovery, thus they are running with the old certificate set. Restart these services to see if this improves the situation. Otherwise, you may need to review and compare the /etc/netwitness/ng/
Product Details
RSA Product Set: NetWitness Platform
RSA Product/Service Type: Core Appliance, ESA, Endpoint Servers
RSA Version/Condition: 11.1.x, 11.2.x, 11.3.x, 11.4.x, 11.5.x, 11.6.x, 11.7.x
Platform: CentOS 7
Summary
Sometimes, we would want to remove an existing core appliance from its current nw-node-zero and add it to a new nw-node-zero.