
At times certain controller APIs could fail due to cleanup of API server reference files (2151719)


Symptoms

Important: When upgrading from NSX for vSphere 6.3.3 to a later release, perform the steps in the Resolution section before proceeding with the NSX Manager upgrade. Failure to perform the steps may result in the NSX Controller status showing as Disconnected and the Controller upgrade failing.

  • The NSX Manager GUI displays an NSX Controller node as Disconnected.

  • In the NSX Manager logs, you see entries similar to:

    ERROR http-nio-127.0.0.1-7441-exec-5 BaseRestController:452 - REST API failed : 'I/O error on POST request for "https://x.x.x.x/ws.v1/login": Remote host closed connection during handshake; nested exception is javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake'
    org.springframework.web.client.ResourceAccessException: I/O error on POST request for "https://x.x.x.x/ws.v1/login": Remote host closed connection during handshake; nested exception is javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake

  • Traceflow fails with a warning in the user interface similar to:

    Controller: X.X.X.X communication error, details: I/O error on POST request for "https://x.x.x.x/ws.v1/login": Remote host closed connection during handshake.

  • Central CLI commands that query NSX Controller information fail with an error similar to:

    nsx-mgr> show logical-switch controller <controller-id> vni <vni-id> brief

    Error: 100: I/O error on POST request for "https://controller-ip-address/ws.v1/login": Connection reset; nested exception is java.net.SocketException: Connection reset

    Note: One or all of the above symptoms may be seen. Each failure has a corresponding error entry in the NSX Manager logs.
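The symptoms above share one signature: a failed POST to /ws.v1/login on a controller. The following is a minimal Python sketch, not a VMware tool, that scans log lines for that signature. The regular expression is an assumption derived only from the sample messages in this article, and the function name is hypothetical; actual log lines may be formatted differently.

```python
import re

# Pattern derived from the sample messages in this article (an assumption);
# adjust it if your NSX Manager log lines are formatted differently.
LOGIN_FAILURE = re.compile(
    r'I/O error on POST request for\s*"\s*https://(?P<host>[^/\s"]+)/ws\.v1/login\s*"\s*:\s*'
    r'(?P<reason>Remote host closed connection during handshake|Connection reset)'
)


def find_controller_login_failures(lines):
    """Return (controller_host, reason) pairs for lines that match the symptom."""
    hits = []
    for line in lines:
        match = LOGIN_FAILURE.search(line)
        if match:
            hits.append((match.group("host"), match.group("reason")))
    return hits


# Sample input modeled on the log entry shown above; 192.0.2.10 is a placeholder.
sample = [
    "ERROR http-nio-127.0.0.1-7441-exec-5 BaseRestController:452 - REST API failed : "
    "'I/O error on POST request for \"https://192.0.2.10/ws.v1/login\": "
    "Remote host closed connection during handshake; nested exception is "
    "javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake'",
    "INFO unrelated line",
]
print(find_controller_login_failures(sample))
```

The function takes any iterable of strings, so it can be fed an open log file or lines pasted from a support bundle.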

Cause

The NSX for vSphere 6.3.3 controller node has a periodic clean-up task that deletes a status file required by the API server when API server activity is sufficiently low. After the file is deleted, any new connections from the NSX Manager to the API server fail.

NSX Manager monitors and updates the controller cluster using REST API calls to the controller-cluster members.

NSX Manager maintains a persistent connection to each controller's API server for this purpose. Until these connections are disrupted (by physical network issues or a restart of NSX Manager), NSX Manager continues to have access to the controllers for cluster monitoring and for creating and modifying NSX logical switches and routers. Only operations that do not use this persistent connection, such as Traceflow and Central CLI, fail. If external events disrupt the persistent TCP connections, NSX Manager loses the ability to make API connections to the controllers.
 
Note: The controller API server is used only for management plane access between the controller and NSX Manager. Disruptions to the API server have no impact on controller-cluster operations or on the control plane and data plane states of NSX. Due to the fault-tolerant design of the distributed controller cluster, NSX Manager can continue to update the entire controller cluster as long as it has API connectivity to at least one of the controller nodes.
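The behavior described in this section can be illustrated with a toy Python model. This is not the controller's implementation: every class and method name below is hypothetical, and the model only mirrors the statement that established connections keep working while new connections fail once the status file is gone.

```python
class ToyConnection:
    """Hypothetical established connection to the API server."""

    def request(self, operation):
        return f"{operation}: ok"


class ToyApiServer:
    """Hypothetical stand-in for a controller API server (not VMware code)."""

    def __init__(self):
        self.status_file_present = True

    def periodic_cleanup(self, low_activity):
        # Models the 6.3.3 behavior described above: the status file is
        # deleted when API server activity is low.
        if low_activity:
            self.status_file_present = False

    def open_connection(self):
        # Only new connections depend on the status file in this model.
        if not self.status_file_present:
            raise ConnectionError("Remote host closed connection during handshake")
        return ToyConnection()


server = ToyApiServer()
persistent = server.open_connection()       # opened before the cleanup runs
server.periodic_cleanup(low_activity=True)  # status file is removed

# The persistent connection still serves cluster monitoring.
print(persistent.request("cluster monitoring"))

# Traceflow needs a new connection, which now fails.
try:
    server.open_connection().request("traceflow")
except ConnectionError as err:
    print("traceflow failed:", err)
```

In this model, dropping the `persistent` object corresponds to a network disruption or NSX Manager restart: after that point no connection can be re-established until the status file is recreated.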

Resolution

This issue is resolved in VMware NSX for vSphere 6.3.4, available at VMware Downloads.

To work around the issue on NSX for vSphere 6.3.3, and to avoid encountering this issue while upgrading from NSX for vSphere 6.3.3 to a later version, VMware developed a signed script that recreates the status file required by the API server.

The workaround requires two signed scripts to be executed sequentially using REST API calls to NSX Manager.

Download the attached signed_bsh_download_jar.encoded and signed_bsh_recreate_status_file.encoded files.

Note: While you perform the steps below, the data path is not impacted; however, the Controller status will be Disconnected for a brief period.
 
Perform the following steps, which include two POST calls to NSX Manager:
  1. Confirm IP connectivity from NSX Manager to all the controllers using ping.
    Proceed only after IP connectivity is established.

  2. Method: POST
    URL: https://nsxmgr_ip/api/1.0/services/debug/script
    Authentication: Basic authentication (username: admin)
    Headers: Content-Type: application/xml
    Body: the contents of the attached file signed_bsh_download_jar.encoded
    Expected Response: 200

    Note: When you copy and paste the contents into the body, make sure no extra lines or characters are added at the end; otherwise the API call may not run successfully. Proceed to Step 3 only if the response is 200.

    File a support request with VMware support if the API call fails after multiple attempts.

  3. Method: POST
    URL: https://nsxmgr_ip/api/1.0/services/debug/script
    Authentication: Basic authentication (username: admin)
    Headers: Content-Type: application/xml
    Body: the contents of the attached file signed_bsh_recreate_status_file.encoded
    Expected Response: 200

    As part of Step 3, the API service on each Controller is restarted sequentially; after a successful restart, the Controller status should show as Connected in the NSX Manager Installation UI.

    Note: If any or all of the Controllers are re-deployed, repeat the preceding steps.
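The steps above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not a VMware-supplied tool: the function names are hypothetical, the ping flag assumes a Unix-like system, trailing line breaks are stripped from the file contents on the assumption that this satisfies the note about extra characters, and certificate verification is left at Python's defaults (an NSX Manager with a self-signed certificate would need its certificate trusted first).

```python
import base64
import subprocess
import urllib.request


def controller_reachable(ip):
    # Step 1. Assumes a Unix-like `ping` that accepts -c (count); Windows uses -n.
    result = subprocess.run(["ping", "-c", "1", ip],
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def build_script_request(nsxmgr_ip, password, encoded_body, username="admin"):
    """Build the POST request used in steps 2 and 3 without sending it."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{nsxmgr_ip}/api/1.0/services/debug/script",
        # Assumption: dropping trailing line breaks avoids the "extra line" problem.
        data=encoded_body.rstrip(b"\r\n"),
        method="POST",
        headers={"Content-Type": "application/xml",
                 "Authorization": f"Basic {token}"},
    )


def run_workaround(nsxmgr_ip, password, controller_ips, script_paths):
    """Run step 1, then post each script file in order, stopping on any failure."""
    if not all(controller_reachable(ip) for ip in controller_ips):
        raise RuntimeError("Step 1 failed: not every controller answers ping")
    for path in script_paths:
        with open(path, "rb") as handle:
            request = build_script_request(nsxmgr_ip, password, handle.read())
        # urlopen raises HTTPError on 4xx/5xx, so the second script is never
        # sent if the first call fails.
        with urllib.request.urlopen(request) as response:
            if response.status != 200:
                raise RuntimeError(f"{path}: expected 200, got {response.status}")
```

A call might look like `run_workaround("nsxmgr_ip", password, controller_ips, ["signed_bsh_download_jar.encoded", "signed_bsh_recreate_status_file.encoded"])`, where the password and the list of controller IP addresses come from your environment. The file order matters because the scripts must run sequentially.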
