While backing up a container in VZFS format, all write operations to the
/vz partition can be blocked.
Many processes on the node are in the "D" (uninterruptible sleep) state.
A list of "D"-state processes can be collected with the following command:
~# vzps axww -eLo veid,ppid,pid,tid,wchan:32,rsz,vsz,state,cmd | awk '$8~/D/' > dproc.log
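To see where the blocked threads are waiting, the collected log can be grouped by container ID and wait channel. The following is a minimal sketch, not part of the product: the function name `summarize_dproc` is hypothetical, and the column positions (veid in column 1, wchan in column 5) are taken from the `-o` list in the command above.

```shell
#!/bin/sh
# summarize_dproc FILE
# Hypothetical helper: counts the lines of a D-state log per
# (container ID, wait channel) pair, largest group first.
# Assumes column 1 is veid and column 5 is wchan, as in the vzps -o list.
summarize_dproc() {
    log=${1:-dproc.log}
    awk '{ count[$1 " " $5]++ }
         END { for (k in count) print count[k], k }' "$log" | sort -rn
}
```

Running `summarize_dproc dproc.log` prints one line per group in the form "count veid wchan". A single wait channel that dominates the list suggests that most threads are blocked on the same resource.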
The Load Average (LA) on the node is very high:
~# uptime
03:27:03 up 1 day, 59 min, 3 users, load average: 1148.69, 1142.22, 1123.87
A command that creates a file does not finish within a few minutes and remains in the "D" state:
~# touch /vz/somefile
As one stage of creating a container backup, the container's processes and its file system are frozen. Normally this is a quick operation that takes a few seconds at most. However, the file system lock can be held for a very long time, which blocks services on the node and all running containers from writing to the partition that holds the Virtuozzo containers.
To verify that this is the case, try to create a file and check whether the operation finishes in an acceptable amount of time. If it has not completed after a few minutes, the partition is affected.
~# touch /vz/somefile
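Because a blocked `touch` also ties up the terminal it runs in, the same test can be run in the background and polled. The following is a minimal sketch under stated assumptions: `probe_write` is a hypothetical helper, not a Virtuozzo tool, and the probe file name and the 60-second default are arbitrary choices.

```shell
#!/bin/sh
# probe_write DIR [TIMEOUT_SECONDS]
# Hypothetical helper: starts a small write in the background and polls
# the writer, so the calling shell never blocks on the file system.
# Returns 0 if the write finished, 2 if it failed, 1 if it is still pending.
probe_write() {
    dir=$1
    timeout=${2:-60}
    probe="$dir/.write_probe.$$"
    touch "$probe" &
    pid=$!
    i=0
    while [ "$i" -lt "$timeout" ]; do
        state=$(ps -o state= -p "$pid" 2>/dev/null | tr -d ' ')
        # Empty: already reaped by the shell. Z: exited. Either way the
        # write call has returned.
        if [ -z "$state" ] || [ "$state" = "Z" ]; then
            if wait "$pid"; then
                rm -f "$probe"
                echo "write to $dir completed"
                return 0
            fi
            echo "write to $dir failed"
            return 2
        fi
        sleep 1
        i=$((i + 1))
    done
    echo "write to $dir still pending after ${timeout}s (state: $state)"
    return 1
}
```

For example, `probe_write /vz 120` reports after at most two minutes. If the write is still pending, the background `touch` stays in the "D" state and cannot be killed, but the shell that ran the check remains usable.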
If it hangs, check the /vz/lock/CTID.lck file(s) to see whether a backup operation is holding a lock on any container and, if so, which process (PID) holds it. In this example, a backup of container #101 is being created by the process with PID 43767:
~# cat /vz/lock/101.lck
43767 backing-up
Then we can see that the process "43767" is indeed in the "D" state:
~# ps axlwwf | grep 43767
0 0 39474 38129 20 0 103248 824 pipe_w S+ pts/2 0:00 \_ grep 43767
4 0 43767 4391 20 0 556048 62728 sb_wai Dl ? 0:12 \_ /opt/pva/agent/bin/vzlpl /var/opt/pva/agent/tmp.aqUL22
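The lock-file check and the process-state check above can be combined for all containers at once. The following is a rough sketch: `list_backup_locks` is a hypothetical helper, and the lock-file layout it assumes (a PID followed by a status word, separated by any whitespace) is inferred from the example above rather than from product documentation.

```shell
#!/bin/sh
# list_backup_locks [LOCKDIR]
# Hypothetical helper: for every CTID.lck file whose status is
# "backing-up", prints the container ID, the locking PID and the current
# state of that process as reported by ps.
# Assumed file layout: PID, whitespace, status word.
list_backup_locks() {
    lockdir=${1:-/vz/lock}
    for f in "$lockdir"/*.lck; do
        [ -e "$f" ] || continue
        ctid=$(basename "$f" .lck)
        # Collapse newlines and spaces so one-line and two-line files
        # are parsed the same way.
        tokens=$(tr -s '[:space:]' ' ' < "$f")
        pid=${tokens%% *}
        rest=${tokens#* }
        status=${rest%% *}
        [ "$status" = "backing-up" ] || continue
        state=$(ps -o state= -p "$pid" 2>/dev/null | tr -d ' ')
        echo "CT $ctid: PID $pid, state ${state:-gone}"
    done
}
```

A line such as "CT 101: PID 43767, state D" matches the situation described here. A state of "gone" means that no process with the recorded PID exists any more.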
The following requests were submitted to address this complex product issue:
- PCLIN-32011 and PCLIN-32055 for Parallels Virtuozzo Containers for Linux 4.7
- PSBM-22710 for Parallels Cloud Server 6.0
The fixes for these requests are available in the following product updates:
- PSBM-22710 has been fixed in Parallels Cloud Server 6.0 Update 4 Hotfix 2 (build 6.0.0-1631).
- PCLIN-32011 has been fixed in the kernel update CU-2.6.32-042stab081.3 for Parallels Virtuozzo Containers for Linux 4.7.
- PCLIN-32055 has been fixed in the kernel update CU-2.6.32-042stab081.5. It is available for Parallels Virtuozzo Containers for Linux 4.7, for Parallels Server Bare Metal 5.0, and for Parallels Cloud Server 6.0 with Update 4 Hotfix 2 installed.
If the issue has already occurred on a kernel version earlier than 2.6.32-042stab081.5, the only safe way to resolve it is to reboot the server.
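To check whether a node already runs a kernel at or above 2.6.32-042stab081.5, the kernel release strings can be compared with a version sort. The following is a minimal sketch: `kernel_has_fix` is a hypothetical helper, it relies on `sort -V` from GNU coreutils, and it assumes that `uname -r` reports the release in the same "2.6.32-042stabNNN.N" form used in this article. It checks the kernel version only, not the installed product update.

```shell
#!/bin/sh
# kernel_has_fix CURRENT [FIXED]
# Hypothetical helper: succeeds if CURRENT is the same as or newer than
# FIXED according to GNU sort's version ordering.
kernel_has_fix() {
    current=$1
    fixed=${2:-2.6.32-042stab081.5}
    # The lower of the two versions sorts first. If that is the fixed
    # version, the current kernel is at least as new.
    lowest=$(printf '%s\n%s\n' "$current" "$fixed" | sort -V | head -n 1)
    [ "$lowest" = "$fixed" ]
}
```

For example, `kernel_has_fix "$(uname -r)" || echo "kernel update required"` prints the message on a node that still runs an older kernel.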