Symptoms:
Unable to publish changes on NSX environment components.
– The ESXi kernel log (/var/log/vmkernel.log) reports the error below:
2019-04-13T11:29:54.985Z error netcpa[F5279CA700] [Originator@6876 sub=Default] Connection error with controller <controller IP> on source port 15539, triggering reconnect
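To see how often a host is logging this error, a search along the following lines may help. This is only a sketch: `count_netcpa_errors` is a hypothetical helper name, not an ESXi or NSX command, and the log path is taken as an argument (defaulting to the file named above) because the agent's messages may be written elsewhere on some builds.

```shell
# Count netcpa "Connection error with controller" entries in a log file.
# count_netcpa_errors is a hypothetical helper name used for illustration.
count_netcpa_errors() {
    # Default to the log file named in this article; pass another path if needed.
    logfile="${1:-/var/log/vmkernel.log}"
    # grep -c prints the number of matching lines (and exits non-zero when that number is 0).
    grep -c 'netcpa.*Connection error with controller' "$logfile"
}
```

Calling `count_netcpa_errors /var/log/vmkernel.log` on the host prints the number of matching entries. A count that keeps growing suggests the agent is still failing to reconnect.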
Cause:
– The issue occurs when communication is lost between the netcpa agent on the host and the NSX controller node.
– If only a single host is affected, the likely cause is the netcpa agent service on that host.
– If all hosts report the same error, check that the controller status is normal.
Solution:
If the issue persists on a single host, try restarting the netcpa agent on that host.
1. Log in to the host with root credentials.
2. Execute the command "esxcli network ip connection list | grep 1234"
3. From the output of the command, verify that the NSX controller IP addresses are connected to the netcpa-worker process with a state of Established.
4. Execute the command below to restart the netcpa agent.
/etc/init.d/netcpad restart
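Steps 2 to 4 can be combined into a small check, sketched below. It assumes that `esxcli network ip connection list` prints one connection per line with the remote address ending in `:1234`, the state text, and the `netcpa-worker` process name on that line. The exact column layout and the spelling of the state can vary between ESXi versions, so the state match is case-insensitive. `count_controller_sessions` is a hypothetical helper name.

```shell
# Count established netcpa connections to controller port 1234.
# Reads `esxcli network ip connection list` output from stdin.
# count_controller_sessions is a hypothetical helper name used for illustration.
count_controller_sessions() {
    # Keep lines with an address ending in :1234 (not :12345), owned by netcpa,
    # then count those whose state is "established" (case-insensitive).
    grep -E ':1234([[:space:]]|$)' | grep 'netcpa' | grep -ci 'established'
}

# Usage on the host: restart the agent only when no controller session is established.
#   n=$(esxcli network ip connection list | count_controller_sessions)
#   [ "$n" -gt 0 ] || /etc/init.d/netcpad restart
```

Reading from stdin keeps the parsing separate from the restart, so the count can be inspected before any service is touched.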
If the issue persists on all hosts, check the NSX controller state.
1. Log in to the vSphere Web Client.
2. Navigate to Home -> Networking & Security.
3. Go to Installation -> NSX Controller nodes and check the status of each node.
4. If the status is not Normal or is shown as Unknown, right-click the controller and sync it.
5. If the issue still persists, connect to the NSX controller appliance and check its status by running the commands below:
show control-cluster status
show control-cluster startup-nodes
show control-cluster connections
6. Validate the status of the controllers and restart the appliance if necessary.
