vCenter Server Appliance 7.0.x /storage/log partition runs out of space due to VMware Analytics service log file (analytics-runtime.log.stderr)

Article ID: 318203

Products

VMware vCenter Server

Issue/Introduction

Symptoms:
  • The /storage/log partition in vCenter Server Appliance is 99% or 100% full.
  • vCenter shows a warning with the text "Log Disk Exhaustion".
  • /var/log/vmware/analytics/analytics-runtime.log.stderr is large in size (more than 1 GB).
  • /var/log/vmware/analytics/analytics-runtime.log.stderr has a high number of entries similar to:
    WARNING: Ignoring unsupported entry in 'jdk.certpath.disabledAlgorithms': SHA1 jdkCA & usage TLSServer
    Jun 05, 2021 7:42:58 PM org.bouncycastle.jsse.provider.ProvTrustManagerFactorySpi getDefaultTrustStore
    INFO: Initializing with trust store at path: /usr/java/jre-vmware/lib/security/cacerts
    Jun 05, 2021 7:43:01 PM org.bouncycastle.jsse.provider.ProvTlsClient notifyAlertRaised
    INFO: Client raised fatal(2) certificate_unknown(46) alert: Failed to read record
    org.bouncycastle.tls.TlsFatalAlert: certificate_unknown(46)
            at org.bouncycastle.jsse.provider.ProvSSLSocketWrap.checkServerTrusted(Unknown Source)
            at org.bouncycastle.jsse.provider.ProvTlsClient$1.notifyServerCertificate(Unknown Source)
            at org.bouncycastle.tls.TlsUtils.processServerCertificate(Unknown Source)
            at org.bouncycastle.tls.TlsClientProtocol.handleServerCertificate(Unknown Source)
            at org.bouncycastle.tls.TlsClientProtocol.handleHandshakeMessage(Unknown Source)
            at org.bouncycastle.tls.TlsProtocol.processHandshakeQueue(Unknown Source)
 
  • /var/log/vmware/analytics/analytics.log has a high number of entries similar to:
    2021-07-27T17:10:45.790Z data-app-collector-vsphere.adc.7_0U2  WARN  ph.phservice.collector.scheduler.DefaultCollectorLoop There are no scheduled collections to run.
    2021-07-27T17:10:47.472Z phStageLogDrainerTaskExecutor-1  WARN  vmware.ph.upload.rest.ProxyAwareHttpExecutor Failed 3 of 3 attempts to get manifest from VMware server. The reason for the last failure was (enable 'debug' level logging to see the error of every failed attempt): com.vmware.ph.upload.exception.ConnectionException: org.bouncycastle.tls.TlsFatalAlert: certificate_unknown(46).
    2021-07-27T17:10:47.473Z phStageLogDrainerTaskExecutor-1  ERROR ph.phservice.push.telemetry.DefaultTelemetryLevelService Unexpected error during telemetry level retrieval for CollectorAgent: {collectorId:vSphere.vpxd.switchOps.provisioning.7.0.1, collectorInstanceId:ph-vpxd-f7f1bc5c-d68a-4e62-bd47-c29d13a8dada}
    java.util.concurrent.CompletionException: com.vmware.ph.phservice.common.manifest.ManifestContentProvider$ManifestException: com.vmware.ph.upload.exception.ConnectionException: org.bouncycastle.tls.TlsFatalAlert: certificate_unknown(46)
            at com.github.benmanes.caffeine.cache.LocalLoadingCache.lambda$newMappingFunction$2(LocalLoadingCache.java:147)
            at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2379)
            at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
            at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2377)
            at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2360)
            at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
            at com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:54)       
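The symptoms above can be checked quickly from the appliance shell. The following is a minimal sketch, not an official tool: the function name count_tls_alerts is hypothetical, and it only counts lines containing the certificate_unknown(46) alert shown in the log excerpts.

```shell
# Hypothetical helper: count lines in a log file that contain the
# TLS alert seen in the excerpts above.
count_tls_alerts() {
    # -c prints the number of matching lines; -F treats the pattern literally.
    grep -cF 'certificate_unknown(46)' "$1"
}

# Example on the appliance (paths taken from the Symptoms section):
# df -h /storage/log
# count_tls_alerts /var/log/vmware/analytics/analytics-runtime.log.stderr
# count_tls_alerts /var/log/vmware/analytics/analytics.log
```

A count that keeps growing between two runs suggests the Analytics service is still failing to reach the VMware servers.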


Environment

VMware vCenter Server 7.0.x

Cause

This issue occurs when vCenter Server is configured with a proxy that prevents it from communicating with the VMware online depots on port 443. The resulting errors fill the analytics-runtime.log.stderr log, which in turn causes the /storage/log partition to reach its maximum size.

For more information, see vCenter Server IP, DNS and Proxy Configuration

Resolution

Part 1 - Confirm that analytics-runtime.log.stderr has grown to an abnormal size:

  1. Connect to vCenter Server Appliance using SSH
  2. Browse to the log file location:
    # cd /var/log/vmware/analytics/
  3. Confirm the file size:
    # ls -lh
    The command lists all files with human-readable sizes. The affected file should look similar to this:
    -rw------- 1 root root 3.5G Aug  4 15:03 analytics-runtime.log-0.stderr
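As an alternative to reading the ls output by eye, the size check in Part 1 can be scripted. This is a sketch under assumptions: the function name list_large_logs is hypothetical, and the 1 GiB threshold in the usage comment simply mirrors the "more than 1 GB" symptom.

```shell
# Hypothetical helper: list regular files directly inside a directory
# that are larger than a given number of bytes.
list_large_logs() {
    # $1 = directory, $2 = threshold in bytes.
    # -size +Nc matches files larger than N bytes (no unit rounding).
    find "$1" -maxdepth 1 -type f -size "+${2}c"
}

# Example on the appliance: files over 1 GiB in the Analytics log directory.
# list_large_logs /var/log/vmware/analytics 1073741824
```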



Part 2 - Confirm there is no connectivity between vCenter Server and VMware Servers:

  1. Connect to vCenter Server Appliance using SSH
  2. Run the following command:
    # curl -iv https://vcsa.vmware.com
  • A successful connection shows "Connected to vcsa.vmware.com...port 443" along with some certificate information. In this case, DO NOT proceed with the resolution section of this article; check the related articles under Additional Information instead.
Note: The text "403 Forbidden" in the output is also expected for a successful connection. Connecting to https://vcsa.vmware.com/ requires authentication parameters that curl does not pass, so the request returns "403 Forbidden", but this still confirms that vCenter can reach https://vcsa.vmware.com.
  • If the result is an error message, HTTPS access on port 443 is likely not enabled.
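The interpretation rules for the curl test in Part 2 can be sketched as a small function. The name interpret_curl is hypothetical; it relies only on curl's exit status and on the %{http_code} write-out variable, which is 000 when no HTTP response was received.

```shell
# Hypothetical helper: classify a curl result.
# $1 = curl exit status, $2 = HTTP status code printed by -w '%{http_code}'.
interpret_curl() {
    if [ "$1" -eq 0 ] && [ "$2" != "000" ]; then
        # Any HTTP response, including 403, means the connection itself worked.
        echo "reachable (HTTP $2)"
    else
        echo "unreachable (curl exit status $1)"
    fi
}

# Example on the appliance:
# http_code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 15 https://vcsa.vmware.com)
# interpret_curl "$?" "$http_code"
```

A "reachable (HTTP 403)" result corresponds to the successful case described in the note above.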
 

Part 3 - Confirm the vCenter Server Appliance Proxy settings and change them accordingly:

  1. Connect to vCenter Server Appliance using SSH
  2. Browse to the proxy file and show contents:
    # cd /etc/sysconfig
    # less proxy
  3. You should see an output as below:
    # Enable a generation of the proxy settings to the profile.
    # This setting allows to turn the proxy on and off while
    # preserving the particular proxy setup.
    #
    PROXY_ENABLED="yes"
    # Some programs (e.g. wget) support proxies, if set in
    # the environment.
    # Example: HTTP_PROXY=http://proxy.provider.de:3128/
    HTTP_PROXY=http://x.x.x.x:8080 
    # Example: HTTPS_PROXY=https://proxy.provider.de:3128/
    HTTPS_PROXY=https://x.x.x.x:8443 ---> your proxy server IP address
    # Example: FTP_PROXY=http://proxy.provider.de:3128/
    FTP_PROXY=""
    # Example: GOPHER_PROXY=http://proxy.provider.de:3128/
    GOPHER_PROXY=""
    # Example: SOCKS_PROXY="socks://proxy.example.com:8080"
    SOCKS_PROXY=""
    # Example: SOCKS5_SERVER="office-proxy.example.com:8881"
Note: If your proxy server uses the same TCP port for both HTTP and HTTPS (e.g. 3128), configure both the HTTP_PROXY and HTTPS_PROXY options with an http:// URL, as shown below:
HTTP_PROXY=http://x.x.x.x:3128
HTTPS_PROXY=http://x.x.x.x:3128
  4. Change the https to http in the HTTPS_PROXY line:
    1. Open the file in a text editor:
      # vi proxy
    2. Press i to enter Insert Mode.
    3. Change the HTTPS_PROXY line from https://x.x.x.x to http://x.x.x.x
    4. Press ESC to exit Insert Mode.
    5. Enter :wq! to save and exit.
  5. Restart the Analytics Service using the command:
    # vmon-cli -r analytics
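
Before and after editing, the HTTPS_PROXY entry can be checked without opening the file. This is a read-only sketch; the function name https_proxy_uses_https_scheme is hypothetical, and it assumes the entry has the HTTPS_PROXY=... form shown in the sample output, with or without double quotes.

```shell
# Hypothetical helper: succeed if HTTPS_PROXY in the given file
# is set to a URL that starts with https://.
https_proxy_uses_https_scheme() {
    # -E extended regex, -q quiet (exit status only).
    grep -Eq '^HTTPS_PROXY="?https://' "$1"
}

# Example on the appliance:
# if https_proxy_uses_https_scheme /etc/sysconfig/proxy; then
#     echo "HTTPS_PROXY still uses https://"
# else
#     echo "HTTPS_PROXY does not use https://"
# fi
```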

Workaround:

Note: Create a fresh backup or offline snapshot (in powered-off state) of the vCenter Server Appliance before implementing the steps below. If the affected vCenter Server Appliance is part of an Enhanced Linked Mode (ELM) replication group, backups or offline snapshots must be created for all of its replication partners as well. When restoring an ELM vCenter, all members of the ELM replication group need to be restored too; otherwise there will be inconsistencies in the VMDirectory LDAP database.


As a workaround that does not change the VCSA proxy settings, you can use the steps below to enforce log rotation through a vmware-analytics.lr logrotate configuration:

  1. Run the following command to create a new file "vmware-analytics.lr"
    # vi /etc/logrotate.d/vmware-analytics.lr
  2. Press i to enter Insert Mode
  3. Copy and paste the following contents into the file:
    /var/log/vmware/analytics/analytics-runtime.log.stderr
    /var/log/vmware/analytics/analytics-runtime.log.stdout
    {
        size 10M
        nodateext
        rotate 10
        missingok
        notifempty
        compress
        copytruncate
        nocreate
        prerotate
            #Should not exist since it should be compressed to .gz
            filestderr1=/var/log/vmware/analytics/analytics-runtime.log.stderr.1
            if [ -f $filestderr1 ]; then
                rm $filestderr1
            fi
        endscript
        postrotate
           /usr/sbin/vmon-cli --stop analytics -c /storage/vmware-vmon/vmonserver > /dev/null
    
           #In rare cases timing issue occurs where the file is written with the previous offset and this fills log-0 instead of truncating it
           #Logrotate moves log-0.stderr in log-1.stderr and compresses it before that
           filestderr=/var/log/vmware/analytics/analytics-runtime.log-0.stderr
           if [ -n "$(find $filestderr -prune -size +1000c)" ]; then
               echo "Truncating $filestderr"
               >$filestderr
           fi
    
           /usr/sbin/vmon-cli --start analytics -c /storage/vmware-vmon/vmonserver > /dev/null
        endscript
    }
  4. Press ESC to exit Insert Mode.
  5. Enter :wq! to save and exit.
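
Pasting a multi-line block into vi can drop or mangle lines, so it may be worth sanity-checking the saved file. The sketch below is an assumption-labelled helper, not part of logrotate: lr_scripts_balanced is a hypothetical name, and it only verifies that every prerotate or postrotate line has a matching endscript line.

```shell
# Hypothetical helper: succeed if the number of prerotate/postrotate lines
# equals the number of endscript lines in a logrotate configuration file.
lr_scripts_balanced() {
    # grep -c exits non-zero when it counts 0 matches, hence "|| true".
    opens=$(grep -cE '^[[:space:]]*(prerotate|postrotate)[[:space:]]*$' "$1" || true)
    closes=$(grep -cE '^[[:space:]]*endscript[[:space:]]*$' "$1" || true)
    [ "$opens" -eq "$closes" ]
}

# Example on the appliance:
# lr_scripts_balanced /etc/logrotate.d/vmware-analytics.lr && echo "script blocks look complete"
# logrotate -d /etc/logrotate.d/vmware-analytics.lr   # debug mode: parses the file, changes nothing
```

The logrotate -d run in the usage comment is a dry run; it reports how the configuration would be applied without rotating any files.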


Additional Information

How to clear space on VCSA /storage/log partition (83070)

vCenter Server Appliance /storage/log partition runs out of space due to VMware Analytics service dump files (analytics/java_pidxxxx.hprof) (82483)

vSphere Health Service - Internet Connectivity Check (77563)