High CPU utilization is observed on worker and primary nodes in Enterprise PKS

Article ID: 319523

Products

VMware

Issue/Introduction

Symptoms:

You see high CPU utilization on worker and primary nodes when running bosh vms --vitals.
vRealize Log Insight is configured in the PKS tile without a rate limit under the Logging section.
After connecting to a worker node with a command similar to bosh -d <service-instance_xxxx> ssh <worker/xxxx>, you see output similar to the following when running top:
<worker/xxxx># top

Tasks: 152 total, 1 running, 118 sleeping, 0 stopped, 0 zombie
%Cpu(s): 91.4 us, 8.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem: 2041816 total, 251604 free, 1444388 used, 345824 buff/cache
KiB Swap: 2040828 total, 1324040 free, 716788 used. 348904 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
10529 root      10 -10 1692720 826192   5600 S 90.4 40.5  24927:17 ruby
 9899 root      10 -10  829624  63144  12636 S  3.7  3.1   1608:37 kubelet
 9322 root      10 -10  378708  30128   1376 S  1.3  1.5 645:01.23 dockerd
Running ps ax | grep ruby on the same node shows output similar to the following, which identifies the ruby process as Fluentd:
<worker/xxxx># ps ax | grep ruby

10543 ? S<l 17553:36 ruby /var/vcap/packages/fluentd/bin/fluentd -c /var/vcap/jobs/fluentd/config/fluent.conf -p /var/vcap/packages/fluentd/plugins/ --no-supervisor
28356 pts/0 S+ 0:00 grep --color=auto ruby
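
The two checks above can be combined into one step. The following is a minimal sketch, not part of PKS or BOSH: fluentd_cpu_hogs is a hypothetical helper name, and the 50 percent default threshold is an assumption chosen for illustration. It reads ps output on stdin and prints the PID and %CPU of any Fluentd process at or above the threshold.

```shell
# Hypothetical helper (not shipped with PKS or BOSH).
# Reads `ps -eo pid,pcpu,args` output on stdin and prints "PID %CPU"
# for every Fluentd process at or above the given threshold.
fluentd_cpu_hogs() {
  threshold="${1:-50}"   # assumed default: 50 %CPU
  awk -v t="$threshold" '
    # Skip the header row by requiring a numeric PID in the first column.
    $1 ~ /^[0-9]+$/ && $2 + 0 >= t && /fluentd/ { print $1, $2 }
  '
}

# Usage on the worker or primary node:
#   ps -eo pid,pcpu,args | fluentd_cpu_hogs 50
```

Note that ps reports %CPU averaged over the lifetime of the process, whereas top shows usage over its most recent sampling interval, so the two values can differ for the same process.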

Environment

VMware PKS 1.x

Cause

This issue occurs when the Worker VM Type or Primary VM Type in the plan in the PKS tile is set to small (cpu: 1, ram: 2 GB, disk: 8 GB) and vRealize Log Insight is configured in the PKS tile without a rate limit under the Logging section. As the output above shows, the Fluentd process then consumes most of the single available CPU.

Resolution

This is a known issue affecting Enterprise PKS.

Workaround:

Use one of the following options to work around this issue:

Apply a rate limit to the vRealize Log Insight (vRLI) settings in the PKS tile under the Logging section. See the instructions at Installing PKS on vSphere with NSX-T for more information. Apply the changes and validate that CPU utilization decreases.
Increase the CPU count of the worker or primary nodes by changing the VM Type for the plan in the PKS tile.
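
To validate that CPU utilization has dropped after applying either workaround, you can compare the overall busy percentage on the node before and after the change. The following is a minimal sketch under stated assumptions: cpu_busy_pct is a hypothetical helper name, and it assumes the summary line format shown in the top output in the Symptoms section ("%Cpu(s): ... id, ..."). It computes 100 minus the idle percentage.

```shell
# Hypothetical helper (not shipped with PKS or BOSH).
# Reads `top -bn1` output on stdin and prints the overall busy CPU
# percentage, computed as 100 minus the idle ("id") value.
cpu_busy_pct() {
  awk '/^%Cpu\(s\):/ {
    for (i = 2; i <= NF; i++) {
      # The idle value is the field immediately before the "id," label.
      if ($i ~ /^id,?$/) { printf "%.1f\n", 100 - $(i - 1); exit }
    }
  }'
}

# Usage on the worker or primary node:
#   top -bn1 | cpu_busy_pct
```

A single reading is only a snapshot, so take several readings over a few minutes before concluding that the workaround has taken effect.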
