VMware vCenter Analytics Service restarts frequently with out-of-memory errors and vSphere Skyline Health fails with the "Unable to query vSphere health information" error

Article ID: 318580

Products

VMware vCenter Server

Issue/Introduction

Symptoms:
  • Analytics service is restarting frequently
  • Skyline Health returns the "Unable to query vSphere health information" error
  • /var/log/vmware/analytics/analytics.log contains many entries that mention a VDDK collector ID, such as:
phProdLogDrainerTaskExecutor-2 ERROR ph.phservice.push.telemetry.DefaultTelemetryLevelService Unexpected error during telemetry level retrieval for CollectorAgent: {collectorId:VDDK.7_0, collectorInstanceId:521a4b85-5f67-a38b-5124-13eee57427b}
java.util.concurrent.CompletionException: com.vmware.ph.phservice.common.manifest.ManifestContentProvider$ManifestException: com.vmware.ph.client.api.exceptions.PhClientException: The PhoneHome buffering feature is supposed to work with a single instanceId value for concrete collectorId. You are trying to create a client with collectorId=VDDK.7_0 and instanceId=521a4b85-5f67-a38b-5124-13eee57427b but you have already created a client with collectorId=VDDK.7_0 and instanceId=52a2ef0c-8c87-4618-1537-320e32f7fe5
  • There are 3-4 .hprof files in /var/log/vmware/analytics/, and analysis of them shows that the memory is consumed by 9 thread entries like these:
phProdLogDrainerTaskExecutor-5 120 11,214,496 sun.misc.Launcher$AppClassLoader @ 0xf1200000 false java.lang.Thread @ 0xf5625d08
phProdLogDrainerTaskExecutor-1 120 10,978,120 sun.misc.Launcher$AppClassLoader @ 0xf1200000 false java.lang.Thread @ 0xf5624c20
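
The log and heap dump symptoms above can be checked from the vCenter Server shell. The following is a minimal sketch, not part of the original article: the function names count_vddk_errors and count_heap_dumps are hypothetical, and the search pattern is an assumption based on the sample log entries shown above.

```shell
# Count log lines that mention a VDDK collector ID.
# The pattern matches both "collectorId:VDDK" and "collectorId=VDDK",
# the two forms that appear in the sample log entries.
count_vddk_errors() {
    # $1: path to analytics.log
    grep -c 'collectorId[:=]VDDK' "$1"
}

# Count heap dump (.hprof) files directly inside a directory.
count_heap_dumps() {
    # $1: directory to inspect
    find "$1" -maxdepth 1 -type f -name '*.hprof' | wc -l
}

LOG_DIR=/var/log/vmware/analytics
if [ -f "$LOG_DIR/analytics.log" ]; then
    echo "VDDK collector log lines: $(count_vddk_errors "$LOG_DIR/analytics.log")"
    echo "Heap dumps: $(count_heap_dumps "$LOG_DIR")"
fi
```

A large number of matching log lines together with several heap dumps is consistent with the symptoms listed above, but the counts alone do not confirm the cause.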


Environment

VMware vCenter Server 6.7.x
VMware vCenter Server 7.0.x

Cause

This occurs because a push telemetry process has sent too many requests with different IDs, which creates too many files in /var/log/vmware/analytics/prod. Cases with a total of 185k to 375k files have been observed.
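
To see whether the directory has grown to a comparable size, the files in it can be counted. This is a minimal sketch; the helper name count_files is hypothetical, and the path is the one given above.

```shell
# Count regular files under a directory, including subdirectories.
count_files() {
    # $1: directory to inspect
    find "$1" -type f | wc -l
}

PROD_DIR=/var/log/vmware/analytics/prod
if [ -d "$PROD_DIR" ]; then
    echo "Files in $PROD_DIR: $(count_files "$PROD_DIR")"
fi
```

The sketch uses find rather than ls because listing and sorting several hundred thousand entries with ls can be slow.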

Resolution

To fix this issue, update vCenter Server to 7.0 U2d (Build 18455184) or 6.7 U3p (Build 18831133).
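
The installed build can be compared against the builds listed above. The sketch below rests on two assumptions that are not stated in this article: that `vpxd -v` prints a line of the form "VMware VirtualCenter 7.0.2 build-18455184", and that a higher build number within the same release line also contains the fix. The function name check_build is hypothetical.

```shell
# Compare a version line against the fixed builds from the Resolution section.
check_build() {
    # $1: version line, assumed to end in "build-<number>"
    line=$1
    build=${line##*build-}
    # Reject anything where the build number could not be parsed.
    case "$build" in
        ''|*[!0-9]*) echo "unknown"; return 0 ;;
    esac
    case "$line" in
        *" 7.0."*) min=18455184 ;;   # 7.0 U2d
        *" 6.7."*) min=18831133 ;;   # 6.7 U3p
        *) echo "unknown"; return 0 ;;
    esac
    if [ "$build" -lt "$min" ]; then
        echo "update needed"
    else
        echo "fixed build"
    fi
}

# Run the check only where the vpxd binary is available.
if command -v vpxd >/dev/null 2>&1; then
    check_build "$(vpxd -v)"
fi
```

If the output is "unknown", read the version and build number from the vSphere Client instead.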

Workaround:

If you are unable to update your vCenter Server at the moment, download the attached script, "cleaning_vddk_script.sh". The script greatly reduces the number of VDDK files and prevents further errors.

To run the script:

1. Upload the script to vCenter Server via SCP. You can upload the script to any location (for example, /tmp).
2. Go to the location where you saved the file.
3. Make sure the file has "execute" permissions by running this command:

chmod 555 cleaning_vddk_script.sh

4. Run the script:

./cleaning_vddk_script.sh

Note: By default, the script applies changes to files older than 180 days. You can pass a parameter to the script to use a different value, such as 90 days. For example:

./cleaning_vddk_script.sh 90
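
Before choosing a threshold, it can help to see how many files are older than a given age. The sketch below only counts files by modification time and changes nothing. It does not reproduce the logic of cleaning_vddk_script.sh, whose file selection is not described in this article. The function name count_older_than is hypothetical, and the use of /var/log/vmware/analytics/prod as the target directory is an assumption based on the Cause section.

```shell
# Count regular files whose modification time is more than N days ago.
count_older_than() {
    # $1: directory to inspect
    # $2: age threshold in days
    find "$1" -type f -mtime +"$2" | wc -l
}

PROD_DIR=/var/log/vmware/analytics/prod
DAYS=180   # default threshold mentioned in the note above
if [ -d "$PROD_DIR" ]; then
    echo "Files older than $DAYS days: $(count_older_than "$PROD_DIR" "$DAYS")"
fi
```

Running it with both 180 and 90 gives a rough idea of how much a lower threshold would change the result.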


Attachments

cleaning_vddk_script