Error "Failed to calculate VCDB size" during backup in vCenter Server Appliance 6.0/6.5

Article ID: 320210

Products

VMware vCenter Server

Issue/Introduction

Symptoms:
  • Attempting to back up the VCSA using the VAMI file-based backup and restore mechanism results in an error:
Timeout failed to complete in 72000 seconds.
  • In /var/log/vmware/applmgmt/backup.log, you see messages similar to:
2019-09-20T11:40:03.605 [MainProcess:PID-10287] INFO: Executing command: ['/opt/vmware/vpostgres/current/bin/psql', '-U', 'postgres', '-d', 'VCDB', '-p', '5432', '-t', '-A', '-c', "SELECT nspname, relname, reltuples AS row_count, pg_catalog.pg_table_size(C.oid) AS table_size, 100.0 * pg_catalog.pg_table_size(C.oid) / sum(pg_catalog.pg_table_size(C.oid)) over () AS percentage FROM pg_catalog.pg_class C LEFT JOIN pg_catalog.pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND C.relkind = 'r' AND C.relname !~ '^vpx_hist_stat' AND C.relname !~ '^vpx_sample_time' AND C.relname !~ '^vpx_event' AND C.relname !~ '^vpx_task' AND C.relname !~ '^vpx_topn' AND C.relname !~ '^vpxv_hist_stat' AND C.relname !~ '^vpxv_event' ORDER BY table_size DESC"].
2019-09-20T11:40:03.682 [MainProcess:PID-10287] INFO: Sample table size record: vc|vpx_text_array|7.73666e+06|4694687744|80.8458111497648330
2019-09-20T11:40:03.682 [MainProcess:PID-10287] INFO: Sample table: name="vc.vpx_text_array",
Rows count: 7.73666e+06
size="4694687744",
percent="80.8458111497648330%".
2019-09-20T11:40:03.682 [MainProcess:PID-10287] ERROR: Failed to calculate VCDB size.
Traceback (most recent call last):
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py", line 398, in VCDBSize
    original_size = _estimate_size_partial_vcdb(db_name='VCDB')
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py", line 301, in _estimate_size_partial_vcdb
    int(reltuples),
ValueError: invalid literal for int() with base 10: '7.73666e+06'
.
.
.
2019-09-20T11:40:03.291 [MainProcess:PID-10286] ERROR: Failed to calculate VCDB size.
Traceback (most recent call last):
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py", line 398, in VCDBSize
    original_size = _estimate_size_partial_vcdb(db_name='VCDB')
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py", line 301, in _estimate_size_partial_vcdb
    int(reltuples),
ValueError: invalid literal for int() with base 10: '7.73666e+06'
.
.
.
2019-09-20T12:10:46.197 [VCDBBackup:PID-11966] ERROR: Failed to backup VCDB.
Traceback (most recent call last):
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py", line 1951, in BackupVCDB
    args.dispatch, full_db_file)
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py", line 1119, in _full_database_backup
    status)
util.Common.BackupRestoreError: Failed to dispatch full database files.
Underlying process status. rc: 255

Note: The errors above indicate that there was an issue calculating the table size. The tables referenced in your logs may differ from the ones shown in this example.

The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
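
To check a copy of backup.log for this signature, a short script along the following lines may help. It is only a sketch: the function name find_vcdb_size_errors and the sample text are illustrative and not part of the appliance, and the matching relies solely on the two message strings shown in the excerpt above.

```python
import re

# Message strings copied from the log excerpt in this article.
SIZE_ERROR = "ERROR: Failed to calculate VCDB size."
VALUE_ERROR = re.compile(
    r"ValueError: invalid literal for int\(\) with base 10: '([^']+)'"
)
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T")


def find_vcdb_size_errors(log_text):
    """Return the row-count literals that int() rejected after a VCDB size error."""
    literals = []
    in_error = False
    for line in log_text.splitlines():
        if SIZE_ERROR in line:
            in_error = True
        elif in_error:
            match = VALUE_ERROR.search(line)
            if match:
                literals.append(match.group(1))
                in_error = False
            elif TIMESTAMP.match(line):
                # A new timestamped entry started, so the traceback is over.
                in_error = False
    return literals


# Hypothetical sample, shortened from the excerpt above.
sample = """2019-09-20T11:40:03.682 [MainProcess:PID-10287] ERROR: Failed to calculate VCDB size.
Traceback (most recent call last):
    int(reltuples),
ValueError: invalid literal for int() with base 10: '7.73666e+06'
"""

print(find_vcdb_size_errors(sample))
```

A non-empty result containing a value in scientific notation suggests the log matches the symptom described in this article.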



Environment

VMware vCenter Server Appliance 6.5.x
VMware vCenter Server Appliance 6.0.x

Cause

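The backup script runs an SQL query that returns the row count (reltuples) for each table without casting it to bigint. PostgreSQL stores reltuples as a floating-point value, so for large tables the value is returned in scientific notation (for example, 7.73666e+06). The script cannot convert this string to an integer, so the VCDB size calculation fails and the backup eventually times out and fails.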
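
The conversion failure can be reproduced outside the appliance with the value from the log excerpt. The snippet below is an illustration only: parse_row_count is a hypothetical helper, not code from VCDB.py, and the fix described in this article is the SQL cast in the Resolution section, not a Python-side change.

```python
def parse_row_count(value):
    """Convert a row-count string to int, tolerating scientific notation.

    Hypothetical helper for illustration; not part of VCDB.py.
    """
    try:
        return int(value)
    except ValueError:
        # int() rejects strings such as '7.73666e+06'; float() accepts them.
        return int(float(value))


raw = "7.73666e+06"  # row count as it appears in the log excerpt

try:
    int(raw)
except ValueError as exc:
    # Same error type and message as in the traceback.
    print("int() fails:", exc)

print(parse_row_count(raw))  # 7736660
```

Casting to bigint in the query means the database returns a plain integer string, so the script's existing int() call succeeds.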

Resolution


This issue is resolved in vCenter Server Appliance 6.7P01.

Workaround:
To work around this issue:
  1. SSH into the vCenter Server Appliance.
  2. Back up the file by running the command:
cp /usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py /some/backup/path/for/original
  3. Open the following file for editing:
/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/VCDB.py

Locate the line below in the VCDB.py file and change it as shown:

From:
sql = ('SELECT nspname, relname, reltuples AS row_count, ...

To:
sql = ('SELECT nspname, relname, reltuples::bigint AS row_count, '...

  4. Run the backup manually to confirm the fix.
  5. To monitor the backup progress, run the following command to follow the backup log live:
tail -f /var/log/vmware/applmgmt/backup.log
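
To confirm that the edit was saved, the file contents can be checked for the cast. The following is a minimal sketch under two assumptions: that "reltuples AS row_count" sits on a single line in VCDB.py, as the workaround snippet suggests, and that the hypothetical function patch_status is given the file's text (for example, read with open(path).read() on the appliance).

```python
import re

# Patterns derived from the "From" and "To" lines in the workaround.
UNPATCHED = re.compile(r"\breltuples\s+AS\s+row_count")
PATCHED = re.compile(r"\breltuples::bigint\s+AS\s+row_count")


def patch_status(source_text):
    """Classify VCDB.py text as 'unpatched', 'patched', or 'unknown'."""
    if UNPATCHED.search(source_text):
        # At least one query still returns reltuples without the cast.
        return "unpatched"
    if PATCHED.search(source_text):
        return "patched"
    return "unknown"


# Hypothetical inputs modeled on the workaround lines.
before = "sql = ('SELECT nspname, relname, reltuples AS row_count, '"
after = "sql = ('SELECT nspname, relname, reltuples::bigint AS row_count, '"

print(patch_status(before))  # unpatched
print(patch_status(after))   # patched
```

A result of "unknown" means neither pattern was found, in which case the file should be inspected manually.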