You see the ncp process restarting continuously when running the watch monit summary command on a primary node.
You see messages similar to the following in the /var/vcap/sys/log/ncp/ncp.stderr.log file:
long_project_name, long_service_name, int(port_num), lb_pool,
ValueError: invalid literal for int() with base 10: 'http'
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
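The ValueError in the excerpt is the standard Python error raised when int() is given a string that is not a number. The following minimal sketch reproduces the same message. The parse_port function is a hypothetical stand-in for illustration only, not NCP's actual code.

```python
def parse_port(service_port):
    """Convert a servicePort value to an integer, the way a plain int() call would.

    Hypothetical helper for illustration; it only mirrors the int(port_num)
    call visible in the log excerpt.
    """
    return int(service_port)


# A numeric port converts cleanly.
print(parse_port("80"))

# A named port cannot be converted and raises ValueError.
try:
    parse_port("http")
except ValueError as err:
    print(err)  # invalid literal for int() with base 10: 'http'
```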
The issue is caused by a named servicePort in an Ingress. As the log excerpt shows, ncp passes the port value to int(), which fails for a name such as http. Use the following steps to validate this:
Issue the following command to list all ingress resources:
kubectl get ing --all-namespaces
Issue a command similar to the following to inspect each ingress returned by the previous command, looking for a named servicePort:
kubectl get ing <ingress name> -n <namespace> -o yaml
Note: You will see output similar to the following:
backend:
  serviceName: test
  servicePort: http
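On clusters with many ingresses, the manual inspection above can be scripted. The following is a minimal sketch, assuming the ingress list has been exported as JSON (for example with kubectl get ing --all-namespaces -o json) and uses the serviceName/servicePort backend layout shown above. The function name find_named_service_ports and the sample data are hypothetical.

```python
def find_named_service_ports(ingress_list):
    """Return (namespace, ingress, serviceName, servicePort) for every
    backend whose servicePort is a name rather than a number."""
    findings = []
    for item in ingress_list.get("items", []):
        meta = item.get("metadata", {})
        spec = item.get("spec", {})

        # Collect the default backend plus every per-path backend.
        backends = []
        if spec.get("backend"):
            backends.append(spec["backend"])
        for rule in spec.get("rules") or []:
            for path in (rule.get("http") or {}).get("paths") or []:
                if path.get("backend"):
                    backends.append(path["backend"])

        for backend in backends:
            port = backend.get("servicePort")
            # In JSON output a numeric port is an integer; a named port is a string.
            if isinstance(port, str) and not port.isdigit():
                findings.append(
                    (meta.get("namespace"), meta.get("name"),
                     backend.get("serviceName"), port)
                )
    return findings


# Hypothetical sample data shaped like the kubectl JSON output.
sample = {
    "items": [
        {
            "metadata": {"name": "web", "namespace": "default"},
            "spec": {
                "rules": [
                    {"http": {"paths": [
                        {"backend": {"serviceName": "test", "servicePort": "http"}},
                        {"backend": {"serviceName": "api", "servicePort": 8080}},
                    ]}}
                ]
            },
        }
    ]
}

for namespace, name, service, port in find_named_service_ports(sample):
    print(f"{namespace}/{name}: service {service} uses named port {port}")
```

To run it against a live cluster, save the kubectl JSON output to a file and pass the result of json.load to the function in place of the sample data.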
This is a known issue with VMware NCP plugin 2.3.2 affecting Enterprise PKS. Currently there is no resolution.
To work around this issue, use a port number for servicePort instead of a port name. You can issue a command similar to the following to edit the ingress:
kubectl edit ing <ingress name> -n <namespace>
backend:
  serviceName: test
  servicePort: 80   # instead of http
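If it is not obvious which number a named port corresponds to, the name can be looked up in the ports list of the Service that the ingress references. The sketch below assumes the Service has been exported as JSON (for example with kubectl get svc <service name> -n <namespace> -o json). The function name resolve_port_number and the sample Service, including its port mapping, are hypothetical.

```python
def resolve_port_number(service, port_name):
    """Return the numeric port of the Service port called port_name,
    or None if the Service has no port with that name."""
    for port in service.get("spec", {}).get("ports") or []:
        if port.get("name") == port_name:
            return port.get("port")
    return None


# Hypothetical Service definition; the name-to-number mapping is assumed.
sample_service = {
    "metadata": {"name": "test", "namespace": "default"},
    "spec": {
        "ports": [
            {"name": "http", "port": 80, "targetPort": 8080},
            {"name": "metrics", "port": 9090, "targetPort": 9090},
        ]
    },
}

print(resolve_port_number(sample_service, "http"))  # 80
```

The value returned is the number to enter for servicePort in the ingress. A result of None means the Service has no port with that name, and the ingress backend should be checked against the Service definition.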
Note: After editing the ingress, ncp should become stable. Run the following command to verify ncp status on the primary node:
watch monit summary