Cell "FAIL" status for Multi-Cell Log Collection

Article ID: 321445

Updated On:

Products

VMware Cloud Director

Issue/Introduction

Symptoms:
  • When the vmware-vcd-support script is run with the -l -m options to gather logs from all cells in the server group, the tool's "Tool Executed Status" summary may report a "FAIL" status for one or more of the cells. For example:

    Tool Executed Status
    LogCollector(MultiCell)

    CELLID#

    <CELL_UUID> FAIL



Cause

When the vmware-vcd-support script is run with the "-l -m" options to gather logs from all cells in the server group, the tool may attempt to gather logs from cells that have been decommissioned and removed from the server group inventory.

Consider a VCD server group with 3 cells, two of which were recently decommissioned and replaced by two new cells. During decommissioning, the old cells were removed from the VCD inventory using the delete option in Flex UI > System > Manage & Monitor > Cloud Resources > Cloud Cells.

In this case, running "/opt/vmware/vcloud-director/bin/vmware-vcd-support -l -m" produces output similar to the following:

YYYY-MM-DD HH:MM:SS | INFO | Support directory created successfully
YYYY-MM-DD HH:MM:SS | INFO | Output directory created successfully.
YYYY-MM-DD HH:MM:SS | INFO | Collecting multi cell logs...
Collecting logs... [Time to timeout: 30 seconds] 
Log bundle created: /opt/vmware/vcloud-director/data/transfer/vmware-vcd-support/bundles/logs_YYYY-MM-DD_HHMMSSUTC.tgz
Logs from following cells may not have been included in the bundle: <CELL4_UUID> <CELL5_UUID>

***VMware vCD Support Execution Summary***

Tool Executed Status

LogCollector(MultiCell)

CELLID#

<CELL1_UUID> PASS
<CELL2_UUID> PASS
<CELL3_UUID> PASS
<CELL4_UUID> FAIL
<CELL5_UUID> FAIL
----------------------------------------------------------------------------
YYYY-MM-DD HH:MM:SS | INFO | Support information has been collected and saved as /opt/vmware/vcloud-director/data/transfer/vmware-vcd-support/vmware-vcd-support-2019-07-11.27223.tgz.

FAIL is reported for support bundle generation on two VCD cells: <CELL4_UUID> and <CELL5_UUID>.
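When the server group has many cells, the failed IDs can be pulled out of the summary rather than read by eye. The following is a minimal sketch, assuming the console output of the tool was saved to a text file and that summary rows have the two-field form shown above (cell ID followed by PASS or FAIL). The function name `failed_cells` is hypothetical and not part of the tool.

```shell
#!/usr/bin/env bash
# Print the cell IDs that the execution summary marks as FAIL.
# Reads the saved console output of vmware-vcd-support on stdin.
failed_cells() {
    # Summary rows have exactly two fields: "<cell id> <PASS|FAIL>".
    # Log lines and headings have a different field count and are skipped.
    awk 'NF == 2 && $2 == "FAIL" { print $1 }'
}
```

For example, `failed_cells < saved-output.txt` (where saved-output.txt is a file the operator created) prints one failed cell ID per line.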

These UUIDs belong to the cells that have been decommissioned. To confirm, either check the cells table in the vcloud database, or check the "vcloud.cell.uuid" field in the /opt/vmware/vcloud-director/etc/global.properties file on each remaining cell in the server group. Neither UUID should be found in either location.
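The global.properties check can be done from a shell on each remaining cell. This is a sketch only: it assumes the property is written as a standard "key = value" line, and the function name `cell_uuid` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the value of vcloud.cell.uuid from a properties file.
# $1: path to the properties file (defaults to the path named in this article)
cell_uuid() {
    local props="${1:-/opt/vmware/vcloud-director/etc/global.properties}"
    # Assumption: the line has the form "vcloud.cell.uuid = <value>".
    # The first sed strips the key and separator; the second trims trailing spaces.
    sed -n 's/^[[:space:]]*vcloud\.cell\.uuid[[:space:]]*=[[:space:]]*//p' "$props" \
        | sed 's/[[:space:]]*$//'
}
```

Running `cell_uuid` on each active cell yields the set of UUIDs to compare against the IDs reported as FAIL.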

The vmware-vcd-support tool does not use the current entries in the cells table to determine which cells to collect support bundles from. Instead, it takes its list of cell IDs from the cells directory in the transfer share, /opt/vmware/vcloud-director/data/transfer/cells/. Each file in that directory maps to an existing or decommissioned VCD cell in the server group.

Here is a sample listing of that directory:

ls -ltr
total 0
rwxr-x-- 1 1003 1002 0 Jan 1 1970 <CELL1_UUID>
rwxr-x-- 1 1003 1002 0 Jan 2 1970 <CELL2_UUID>
rwxr-x-- 1 1003 1002 0 Jan 2 1970 <CELL3_UUID>
rw------ 1 1003 1002 0 Jan 2 1970 <CELL4_UUID>
rwxr-x-- 1 1003 1002 0 Jan 2 1970 <CELL5_UUID>

In this case, the support bundle tool successfully collected support bundles for 3 cells and only 3 cells are active, so the two FAIL entries are benign and can be safely ignored.

Resolution

To work around the issue, identify the UUIDs of all currently active VCD cells in the server group, and delete the files in /opt/vmware/vcloud-director/data/transfer/cells that are NOT part of that active set.

Note: When the VCD cell service is restarted, the cell recreates its unique file in /opt/vmware/vcloud-director/data/transfer/cells if it does not already exist.
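
The cleanup can be approached cautiously by first listing the candidate files instead of deleting them. The sketch below assumes the operator has prepared a text file with one active cell UUID per line; that file and the function name `stale_cell_files` are hypothetical. The function only prints paths and removes nothing.

```shell
#!/usr/bin/env bash
# List files in the cells directory whose names are not in the active UUID list.
# $1: file containing one active cell UUID per line (prepared by the operator)
# $2: cells directory (defaults to the transfer share path named in the Cause section)
stale_cell_files() {
    local active_list="$1"
    local cells_dir="${2:-/opt/vmware/vcloud-director/data/transfer/cells}"
    local f
    for f in "$cells_dir"/*; do
        # Skip anything that is not a regular file (also covers an empty directory).
        [ -f "$f" ] || continue
        # -x: match the whole line, -F: fixed string, -q: no output
        if ! grep -qxF -- "$(basename "$f")" "$active_list"; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, `stale_cell_files active-uuids.txt` prints the files that do not correspond to an active cell. Review the list against the cells table before removing any of them.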