Virtualization and optimization for node operations, with an example of practical use.
Proxmox is a virtual environment where a host Debian distro runs virtual machines on top of it. This article won't go into all the details of Proxmox itself, but will focus on optimizations and tricks learned for node operations, such as on the Cardano network or, in the future, the World Mobile network. Keep in mind that virtualization solutions such as Proxmox are just one of many paths towards running your own node operations. There are pros and cons to virtualization, but it can be quite efficient if you tune it a bit, as I hope to show in this article. The focus will be on disk IO, since this and network latency are the two most important tuning areas for node operations.
If you like this article consider delegating to ADA North Pool Cardano staking pool.
Host layer tweaks
Hardware-wise, it is recommended that you have 4–16GB of RAM to spare on the host computer (on top of the RAM you will use for the virtual machines) to run your hard disks and NVMe drives efficiently, especially if you use the ZFS file system (and there are many good reasons you should). If you decide to use deduplication, the amount of RAM needed skyrockets (rule of thumb: 1GB per 1TB of storage). If you leave the host with less RAM than this, your file IO will start to suffer, as ZFS uses the RAM to cache both write and read operations.
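If you want to stop the ZFS cache (the ARC) from eating into the RAM you have reserved for your VMs, you can cap it explicitly. This is a sketch; the 8GiB value below is an assumption, size it to the spare RAM you actually have:

```shell
# cap the ZFS ARC at 8 GiB (the value is in bytes: 8 * 1024^3)
echo "options zfs zfs_arc_max=8589934592" | sudo tee /etc/modprobe.d/zfs.conf

# rebuild the initramfs so the limit is applied at boot
sudo update-initramfs -u
```

After a reboot you can check the current limit in /sys/module/zfs/parameters/zfs_arc_max.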
To install, see this guide. I recommend using the ZFS filesystem for the host layer if you are considering expanding disk size in the future, and ext4 on the virtual machines, so that is one change from that guide. If you know you won't expand, then LVM could be a good choice since it has a bit less metadata overhead. Ignore the section on ZFS settings if you use LVM. The hostname FQDN could be a public domain, or you could use a local address such as #yourservername#.local.
The first thing to do after installation is to tune the host system. I implement all of the steps from the two previous articles (server hardening and optimization, as well as a small guide on securing login with Secure Shell (SSH)). For tuned-adm, set the profile to virtual-host. If you can't afford the enterprise version of Proxmox, then there are some nice scripts out there (Proxmox VE 7 Post Install) to handle post-installation tweaks, such as switching you from the enterprise repos to the community repos.
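Assuming tuned is not already present on the host, switching to the profile mentioned above is a one-liner; virtual-host is one of the stock profiles shipped with tuned:

```shell
# install tuned if it is not already present
sudo apt install tuned

# activate the profile optimized for hosts running virtual machines
sudo tuned-adm profile virtual-host

# confirm which profile is now active
tuned-adm active
```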
ZFS can also be tuned, and since this will be the base layer of our disk performance, we should give it some attention. Here is a good guide on this subject. In short, these commands can help a bit (replace mypool with your ZFS pool name):
sudo zfs set atime=off mypool
(safe to use in practically all cases)
sudo zfs set redundant_metadata=most mypool
(consider this if you accept slightly less safety for slightly more speed)
It is also a good idea to check whether your NVMe drives can be put into 4K (or larger) sector mode. This can be done with both LVM and ZFS:
lsblk (will show you /dev/yournvme)
sudo nvme id-ns -H /dev/yournvme
sudo nvme format --lbaf=X /dev/yournvme (choose the lbaf number with no metadata and the largest block size per operation, such as 4K)
Make sure you also set the blocksize in storage to the same as your ZFS pool (if 4K, set the blocksize to 4k in Datacenter -> Storage for the name of your disk).
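The same blocksize change can also be made from the Proxmox shell with pvesm; the storage name local-zfs below is an assumption, replace it with whatever your ZFS-backed storage is called:

```shell
# list configured storages to find the right name
pvesm status

# set the volume blocksize for a ZFS-backed storage (here assumed to be named local-zfs)
pvesm set local-zfs --blocksize 4k
```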
For background, see this article. Basically, you need to install nvme-cli (sudo apt install nvme-cli) to check which modes the NVMe can be put into. This is a format operation, so it should be done before you add anything to the NVMe disk. For a more lengthy treatment of this and other related operations, see this article.
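To illustrate what you are looking for, the output of nvme id-ns -H ends with a list of supported LBA formats. The device name and the values below are an example, not from any specific drive:

```shell
sudo nvme id-ns -H /dev/nvme0n1 | grep "LBA Format"
# Example output (values vary per drive):
# LBA Format  0 : Metadata Size: 0 bytes - Data Size: 512 bytes  - Relative Performance: 0x2 Good (in use)
# LBA Format  1 : Metadata Size: 0 bytes - Data Size: 4096 bytes - Relative Performance: 0x1 Better

# Here format 1 has no metadata and 4096-byte blocks, so you would format with:
sudo nvme format --lbaf=1 /dev/nvme0n1
```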
This article has some more tips on tuning, and in particular on benchmarking performance with fio (sudo apt install fio):
4K random read
fio --filename=test --sync=1 --rw=randread --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
4K random write
fio --filename=test --sync=1 --rw=randwrite --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
Sequential read
fio --filename=test --sync=1 --rw=read --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
Sequential write
fio --filename=test --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
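The tests above go through the page cache, so the numbers partly reflect RAM rather than the disk. If you want to measure the device itself, a common variation is to add --direct=1 and an async IO engine; this is a sketch mirroring the 4K random read test above:

```shell
# 4K random read bypassing the page cache (O_DIRECT) using libaio
fio --filename=test --direct=1 --ioengine=libaio --rw=randread --bs=4k \
    --numjobs=1 --iodepth=4 --group_reporting --name=test \
    --filesize=10G --runtime=300 && rm test
```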
Virtual machine layer tweaks
Now we have a good understanding of performance on the host layer. Next up is tweaking the virtual machines. This article assumes you are installing Ubuntu 22.04 server images. During installation, make sure not to add LVM to the disks: you are basically already doing this at the host layer with either ZFS or LVM, and doing it twice just hurts performance unless you have a good reason to do so.
After installation, do the same two steps from the previous articles (server hardening and optimization, as well as a small guide on securing login with Secure Shell (SSH)). For tuned-adm, the profile could be virtual-guest. Now we are going to modify the kernel to improve its real-time latency performance, as this is likely important in a network of nodes where you want information to propagate as fast as possible. You can find the guide on installing a zen kernel here. This installation also adds a lot of background tweaks, such as using a disk scheduler (kyber) that is well suited for node operations and low latency. Basically it is two steps. (In general this kernel is great, with one exception: don't use it on a VM that has your graphics card drivers, such as Nvidia/AMD, as they will have to be reinstalled.)
sudo add-apt-repository ppa:damentz/liquorix && sudo apt-get update
sudo apt-get install linux-image-liquorix-amd64 linux-headers-liquorix-amd64
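After rebooting into the new kernel, you can verify which IO scheduler is active for a disk; nvme0n1 below is an assumption, substitute your own device name from lsblk:

```shell
# the scheduler shown in square brackets is the active one, e.g. [kyber]
cat /sys/block/nvme0n1/queue/scheduler
```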
Apart from tuning the kernel, we can edit /etc/fstab and add noatime where it says defaults for your disk. For example, with ext4 on an installed Ubuntu 22.04:
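A sketch of what the root filesystem line could look like with noatime added; the UUID is a placeholder, keep the one already present in your own fstab:

```shell
# /etc/fstab - example root entry with noatime added (UUID is a placeholder)
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx / ext4 defaults,noatime 0 1
```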
At this point it can be a good idea to reboot. Please make sure your fstab is correct, or the machine will fail to boot. A good way to check is to run:
sudo findmnt --verify --verbose
One final tweak you can do, if your VM has a lot of memory attached to it, is to adjust a few memory options in /etc/sysctl.conf to improve disk IO latency. See this guide. However, this has pros and cons, so only do it if your node operations need lower latency and you have adequate IO to handle the loss of throughput.
vm.dirty_ratio = 10 (default is 20; good for latency but not for IO throughput)
vm.dirty_background_ratio = 5 (should be half of vm.dirty_ratio)
vm.vfs_cache_pressure = 500 (reclaims caches more aggressively, so latency stays more consistent)
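The three settings above can be appended and applied without a reboot; the values are the ones from this article:

```shell
# append the settings to /etc/sysctl.conf
cat <<'EOF' | sudo tee -a /etc/sysctl.conf
vm.dirty_ratio = 10
vm.dirty_background_ratio = 5
vm.vfs_cache_pressure = 500
EOF

# apply immediately without rebooting
sudo sysctl -p
```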
Why bother with virtualization in the first place?
To end the article, I want to show why Proxmox or similar software can be useful. While it is not true "bare metal" in the sense of running directly on the host layer's hardware, you are as close to bare metal as you can be with these types of tweaks, and in general you could lose only 1–3% of performance (according to Proxmox). You still own all your hardware and are still the one in control of it, as you are the single tenant of the computer. To illustrate some of the benefits, here is an example of a server I run:
ANPSERVER5–1 runs 2x Cardano nodes and 1x GETH node; it is the node-operations VM and has priority in resources.
ANPSERVER5–2 runs Zabbix to keep an overview of the network.
ANPSERVER5–3 runs Portainer (controls Docker containers), Chia (proof-of-storage crypto) and Komga (comic book / book reader).
ANPSERVER5–4 has the graphics card (Nvidia) passed through via PCI and is mining Ergo.
I also have a dashboard where I gather resources from my cluster of Proxmox servers for one click access (cluster operations is maybe another article if you would like more information on Proxmox).
As you can see, you can utilize the resources of your computer quite efficiently, while also adding safety layers such as only having mining on a single VM. It is also much easier to spin up new VMs, close down old ones, and do upgrades or other tasks that would be more tedious with a single host OS installation.