VMware Postgres Archiver Service changes status to "stopped"

Article ID: 318472

Products

VMware vCenter Server

Issue/Introduction

The VMware Postgres Archiver Service changes status to "stopped" after scheduled backups are configured for the vCenter Server Appliance.

In /var/log/vmware/vpostgres/pg_archiver.log-[n].stderr, you see an error similar to:

2018-05-22T10:27:36.133Z ERROR  pg_archiver could not receive data from WAL stream: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.


In /var/log/vmware/vpostgres/postgresql-[nn].log, you see an error similar to:

2018-05-22 10:27:36.133 UTC 5ae81b07.c7c9 0 [unknown] archiver LOG:  terminating walsender process due to replication timeout
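
To check whether an appliance is hitting this issue, the two log signatures above can be searched for in one pass. The following is a minimal sketch, not a VMware-provided tool: the function name find_archiver_timeouts is hypothetical, and the wildcard file patterns are an assumption standing in for the [n] and [nn] placeholders in the file names.

```shell
# Hypothetical helper: print every log line that matches one of the two
# error signatures quoted in this article, prefixed with its file name.
# The default directory is the one named in this article; the wildcard
# patterns are an assumption standing in for [n] and [nn].
find_archiver_timeouts() {
    log_dir="${1:-/var/log/vmware/vpostgres}"
    grep -H -E \
        -e 'could not receive data from WAL stream' \
        -e 'terminating walsender process due to replication timeout' \
        "$log_dir"/pg_archiver.log-*.stderr \
        "$log_dir"/postgresql-*.log 2>/dev/null
}

# Usage on the appliance:
# find_archiver_timeouts
```

If the function prints nothing, neither signature was found in the files that matched the patterns.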

Environment

VMware vCenter Server Appliance 6.7.x

Resolution

This issue occurs because the wal_sender_timeout value is too low. When the vCenter Server Appliance is under high CPU load, the archiver may not report back within the default timeout, so Postgres terminates the walsender process and the archiver stops.

To work around this issue, increase the timeout from the default of 60 seconds to 10 minutes (600 seconds).

Change the value in the following file: /storage/db/vpostgres/postgresql.conf

Change the following line:

#wal_sender_timeout = 60s

to 

wal_sender_timeout = 600s
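
The same edit can be scripted so that a backup of the file is kept before anything changes. This is only a sketch under stated assumptions: the function name set_wal_sender_timeout and the backup file naming are hypothetical, and the function replaces the whole wal_sender_timeout line, including any trailing comment on it.

```shell
# Hypothetical helper (not part of the product): set wal_sender_timeout in a
# postgresql.conf-style file, keeping a timestamped backup copy first.
set_wal_sender_timeout() {
    conf_file="$1"
    new_value="$2"

    # Keep a copy of the original file next to it.
    cp -p "$conf_file" "$conf_file.bak.$(date +%Y%m%d%H%M%S)" || return 1

    # Rewrite the setting whether or not the line is commented out.
    # Writing back with cat preserves the file's owner and permissions.
    tmp_file="$(mktemp)" || return 1
    sed -E "s/^[[:space:]]*#?[[:space:]]*wal_sender_timeout[[:space:]]*=.*/wal_sender_timeout = ${new_value}/" \
        "$conf_file" > "$tmp_file" && cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"

    # Print the resulting line; returns non-zero if the setting is absent.
    grep -n '^wal_sender_timeout' "$conf_file"
}

# Usage on the appliance (path from this article):
# set_wal_sender_timeout /storage/db/vpostgres/postgresql.conf 600s
```

The function prints the active wal_sender_timeout line with its line number so the result can be confirmed by eye. If the file contains no wal_sender_timeout line at all, nothing is printed and the file content is left as it was.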

Make sure no tasks are running on the vCenter Server before you restart the services. After saving the change, restart the services:
service-control --stop vmware-postgres-archiver vmware-vpostgres
service-control --start vmware-postgres-archiver vmware-vpostgres

Additional Information


How to stop, start, or restart vCenter Server 6.x services (2109881)
https://kb.vmware.com/s/article/2109881