At times certain controller APIs could fail due to cleanup of API server reference files

Article ID: 321093

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • The NSX Manager GUI displays an NSX Controller node as Disconnected.

  • In the NSX Manager logs, you see entries similar to:

    ERROR http-nio-127.0.0.1-7441-exec-5 BaseRestController:452 - REST API failed : 'I/O error on POST request for "https://x.x.x.x/ws.v1/login": Remote host closed connection during handshake; nested exception is javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake'
    org.springframework.web.client.ResourceAccessException: I/O error on POST request for "https://x.x.x.x/ws.v1/login": Remote host closed connection during handshake; nested exception is javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake

  • Traceflow fails with a warning in the User Interface similar to:

    Controller: X.X.X.X communication error, details: I/O error on POST request for "https://x.x.x.x/ws.v1/login": Remote host closed connection during handshake.

  • Hardware Gateway status shows as 'Down' in the vCenter UI.

  • Central CLI commands that query NSX Controller information fail with an error similar to:

    nsx-mgr> show logical-switch controller <controller-id> vni <vni-id> brief

    Error: 100: I/O error on POST request for "https://controller-ip-address/ws.v1/login": Connection reset; nested exception is java.net.SocketException: Connection reset

    Note: One or all of the above symptoms may be seen. Whenever a failure occurs, the NSX Manager logs contain a corresponding error entry.
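
Because every failure leaves a matching entry in the NSX Manager logs, an exported log can be scanned for the two error signatures quoted above. The following is a minimal sketch, not a VMware tool: the function name `find_controller_api_failures` and the regular expression are illustrative assumptions, and the pattern covers only the message shapes shown in this article.

```python
import re

# Signatures copied from the symptom messages in this article.
# Optional whitespace inside the quotes is tolerated.
FAILURE_PATTERN = re.compile(
    r'I/O error on POST request for "\s*https://(?P<host>[^/\s"]+)/ws\.v1/login\s*"\s*:'
    r"\s*(?P<reason>Remote host closed connection during handshake|Connection reset)"
)


def find_controller_api_failures(lines):
    """Yield (controller_host, reason) for each log line that matches."""
    for line in lines:
        match = FAILURE_PATTERN.search(line)
        if match:
            yield match.group("host"), match.group("reason")
```

The function accepts any iterable of strings, so an open log file can be passed directly. Only the first match on each line is reported.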
 


Environment

VMware NSX for vSphere 6.3.x

Cause

The NSX for vSphere 6.3.3 controller node has a periodic clean-up task that deletes a status file required by the API server when API server activity is sufficiently low. After the file is deleted, any new connections from the NSX Manager to the API server fail.

NSX Manager monitors and updates the controller cluster using REST API calls to the controller-cluster members.
 
NSX Manager maintains a persistent connection to each controller's API server for this purpose. Until those connections are disrupted (for example, by physical network issues or a restart of NSX Manager), NSX Manager continues to have access to the controllers for cluster monitoring and for creating and modifying NSX logical switches and routers. During this time, only operations that do not use the persistent connection, such as Traceflow and Central CLI, fail. If external events disrupt the persistent TCP connections, NSX Manager loses the ability to make API connections to the controllers.
 
Note: The controller API server is used only for management plane access between the controller and NSX Manager. Disruptions to the API server do not affect controller-cluster operations or the control plane and data plane state of NSX. Due to the fault-tolerant design of the distributed controller cluster, NSX Manager can continue to update the entire controller cluster as long as it has API connectivity to at least one of the controller nodes.
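
Since the failure affects only new connections to the controller API server, one way to observe it is to attempt a fresh TLS handshake from a host that can reach the controller. The sketch below is illustrative only. The function name `can_open_new_tls_session` is hypothetical, port 443 is assumed from the `https://` URLs in the error messages, and certificate validation is skipped in the same way as `curl -k`. A failed handshake can have other causes, such as a TLS version mismatch between the client and the controller, so treat a negative result as a hint and not as proof of this issue.

```python
import socket
import ssl


def can_open_new_tls_session(host, port=443, timeout=5.0):
    """Return (True, "") if a fresh TLS handshake completes, else (False, reason)."""
    context = ssl.create_default_context()
    # Assumption: skip certificate validation, as `curl -k` does.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # The handshake runs inside wrap_socket(); reaching the return
            # statement means it completed.
            with context.wrap_socket(raw, server_hostname=host):
                return True, ""
    except OSError as error:  # ssl.SSLError and timeouts are OSError subclasses
        return False, f"{type(error).__name__}: {error}"
```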

Resolution

This issue is resolved in VMware NSX for vSphere 6.3.4, available at VMware Downloads.

Note: The password expiration does not affect the Hardware Gateway itself, but it does affect status reporting, because NSX Manager cannot reach the controller to retrieve the correct status. To fix the password expiration, follow the workaround below.
 
To work around the issue on NSX for vSphere 6.3.3, and to avoid encountering this issue while upgrading from NSX for vSphere 6.3.3 to a later version, VMware developed a signed script that recreates the status file required by the API server.

The workaround requires two signed scripts to be executed sequentially using REST API calls to NSX Manager.
 
Download the attached signed_bsh_download_jar.encoded and signed_bsh_passwd_expiry_napi.encoded files.

Notes:
  • The same scripts are also mentioned in the workaround section of KB article Deploying NSX Controller fails in NSX-v 6.3.3 and 6.3.4 (51144). Running the scripts applies the same workaround for both KB articles.
  • While performing the steps below, the data path is not impacted; however, the Controller status will be in the disconnected state for a brief period.

Run the following POST calls on NSX Manager:
  1. Confirm IP connectivity from NSX Manager to all the controllers using ping.
    Proceed only after IP connectivity is established.
     
  2. Method: POST
    URL: https://nsxmgr_ip/api/1.0/services/debug/script
    Authentication: Basic authentication (Username : admin)
    Headers: content-type - application/xml
    Body : copy contents of the attached file signed_bsh_download_jar.encoded.
    Expected Response: 200

    Note: When you copy and paste the contents into the body, make sure that no extra lines or characters are added at the end; otherwise the API call does not run successfully. Proceed to Step 3 only if the response is 200.

    File a support request with VMware support if the API call fails after multiple attempts.

    Alternatively, you can use cURL:

    curl -k -X POST -H "Content-Type: application/xml" -d "@signed_bsh_download_jar.encoded" -u user:password https://nsxmgr_ip/api/1.0/services/debug/script
     
  3. Method: POST
    URL: https://nsxmgr_ip/api/1.0/services/debug/script
    Authentication: Basic authentication (Username : admin)
    Headers: content-type - application/xml
    Body : copy contents of the attached file signed_bsh_passwd_expiry_napi.encoded.
    Expected Response: 200

    As part of Step 3, the API service on each Controller is restarted sequentially; after a successful restart, the Controller status should show as Connected in the NSX Manager Installation UI.

    Note: If any or all of the Controllers are redeployed, repeat the preceding steps.

    Alternatively, you can use cURL:

    curl -k -X POST -H "Content-Type: application/xml" -d "@signed_bsh_passwd_expiry_napi.encoded" -u user:password https://nsxmgr_ip/api/1.0/services/debug/script
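
Steps 2 and 3 above can also be scripted so that the second script is sent only if the first call returns 200. The following is a minimal sketch under stated assumptions, not a VMware-supplied tool. The helper names `load_body`, `post_script`, and `run_workaround` are hypothetical. Trailing whitespace is stripped from each file to follow the note about extra lines at the end of the body, and certificate validation is skipped as with `curl -k`. Step 1 (the ping check) is not covered.

```python
import base64
import ssl
import urllib.error
import urllib.request

# Endpoint and file names are taken from the steps above.
SCRIPT_PATH = "/api/1.0/services/debug/script"
SCRIPT_FILES = [
    "signed_bsh_download_jar.encoded",
    "signed_bsh_passwd_expiry_napi.encoded",
]


def load_body(path):
    """Read a signed script file and drop trailing whitespace and newlines."""
    with open(path, "rb") as handle:
        return handle.read().rstrip()


def post_script(manager_ip, user, password, body):
    """POST one script body to NSX Manager and return the HTTP status code."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"https://{manager_ip}{SCRIPT_PATH}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Authorization": f"Basic {token}",
        },
    )
    # Assumption: skip certificate validation, mirroring `curl -k`.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(request, context=context) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # non-2xx responses are raised as HTTPError


def run_workaround(send, files=SCRIPT_FILES):
    """Send each script in order; stop at the first response that is not 200."""
    results = []
    for path in files:
        status = send(load_body(path))
        results.append((path, status))
        if status != 200:
            break
    return results
```

A call might look like `run_workaround(lambda body: post_script("nsxmgr_ip", "admin", password, body))`, where `password` is supplied by the operator. Passing the sender as a callable keeps the ordering logic separate from the network call, so it can be exercised without contacting NSX Manager.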


Additional Information

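Simplified Chinese version: At times certain controller APIs could fail due to cleanup of API server reference files
Japanese version: At times certain controller APIs could fail due to cleanup of API server reference files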

Attachments

signed_bsh_passwd_expiry_napi.encoded
signed_bsh_download_jar.encoded