Virtualization and optimization for node operations, with an example of practical use

ADA North Pool
7 min read · Jul 31, 2022

Proxmox is a virtual environment in which a Debian host runs virtual machines on top of it. This article won't go into all the details but will focus on optimizations and tricks learned for node operations, such as on the Cardano network or, in the future, the World Mobile network. Keep in mind that virtualization solutions such as Proxmox are just one of many paths towards running your own node operations. There are pros and cons to virtualization, but it can be quite efficient if you tune it a bit, as I hope to show in this article. The focus will be on disk IO, as this together with network latency are the two most important tuning areas for node operations.

If you like this article, consider delegating to the ADA North Pool Cardano staking pool.

Host layer tweaks

Hardware-wise, it is recommended that you have 4–16GB of RAM to spare on the host computer (on top of the RAM you will use for the virtual machines) to run your hard disks and NVMe drives efficiently, especially if you use the ZFS file system (and there are many good reasons you should). If you decide to use compression, the amount of RAM needed skyrockets (rule of thumb: 1GB per 1TB). If you leave the host with less RAM, your file IO will start to suffer, as the host uses RAM to cache both write and read operations.
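
As a worked example of that budgeting, here is a small sketch that simply adds up the numbers. The function name and the sample figures (64GB for VMs, 8GB spare, a 4TB pool) are illustrative assumptions only; the 1GB per 1TB term is the rule of thumb mentioned above.

```shell
#!/bin/sh
# host_ram_gb VM_GB SPARE_GB [POOL_TB]
# Rough host RAM budget: RAM for the VMs + spare for the host
# + 1GB per 1TB of pool (rule of thumb; omit or pass 0 if not needed).
host_ram_gb() {
    vm_gb="$1"
    spare_gb="$2"
    pool_tb="${3:-0}"
    echo $(( vm_gb + spare_gb + pool_tb ))
}

# Hypothetical example: 64GB for VMs, 8GB spare, 4TB pool.
host_ram_gb 64 8 4
```

With these sample numbers the host would want about 76GB in total.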

To install, see this guide. I recommend using the ZFS file system for the host layer if you are considering expanding disk size in the future, and ext4 on the virtual machines, so that is one change from that guide. If you know you won't expand, then LVM could be a good choice since it has a bit less metadata overhead. Ignore the section on ZFS settings if you use LVM. The hostname FQDN could be a real online domain, or you could use a local address such as #yourservername#.local.

The first thing to do after installation is to tune the host system. I implement all of the steps from the two previous articles (server hardening and optimization, as well as a small guide on securing login with Secure Shell (SSH)). For tuned-adm, set the profile to virtual-host. If you can't afford the enterprise version of Proxmox, there are some nice scripts out there (Proxmox VE 7 Post Install) to handle post-installation tweaks such as switching you from the enterprise repos to the community repos.
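
Since the tuned-adm profile differs between the Proxmox host (virtual-host) and the VMs later in this article (virtual-guest), a small sketch like the following can suggest the right one. It assumes systemd-detect-virt is available; the function name is my own, and treating an empty result as bare metal is an assumption you may want to change.

```shell
#!/bin/sh
# pick_tuned_profile VIRT
# VIRT is the output of systemd-detect-virt: "none" on bare metal,
# otherwise the hypervisor name such as "kvm".
# Assumption: an empty value (tool missing) is treated like bare metal.
pick_tuned_profile() {
    case "$1" in
        none|"") echo virtual-host ;;
        *)       echo virtual-guest ;;
    esac
}

virt="$(systemd-detect-virt 2>/dev/null || true)"
echo "detected: ${virt:-unknown}"
echo "suggested: sudo tuned-adm profile $(pick_tuned_profile "$virt")"
# Afterwards, `tuned-adm active` shows which profile is in use.
```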

ZFS can also be tuned, and since this will be the base layer of our disk performance we should give it some attention. Here is a good guide on this subject. In short, these commands can help a bit (replace mypool with your ZFS pool name):

sudo zfs set atime=off mypool
(almost always a safe choice)
sudo zfs set redundant_metadata=most mypool
(consider this if you accept slightly less safety for slightly more speed)
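
Before changing anything it can help to see which of the two properties actually differ from the values above. The sketch below is one way to do that; the function name is hypothetical, and it only prints the zfs set commands that would still be needed rather than running them.

```shell
#!/bin/sh
# pending_zfs_changes POOL
# Reads "property<TAB>value" lines on stdin (the format produced by
# `zfs get -H -o property,value ...`) and prints the `zfs set`
# commands still needed for the two settings above.
pending_zfs_changes() {
    awk -F '\t' -v pool="$1" '
        $1 == "atime"              && $2 != "off"  { print "zfs set atime=off " pool }
        $1 == "redundant_metadata" && $2 != "most" { print "zfs set redundant_metadata=most " pool }
    '
}

# On the Proxmox host (mypool is a placeholder):
#   zfs get -H -o property,value atime,redundant_metadata mypool | pending_zfs_changes mypool
# Stand-in input so the sketch runs without a pool:
printf 'atime\ton\nredundant_metadata\tmost\n' | pending_zfs_changes mypool
```

With the stand-in input only atime still needs changing, so a single command is printed.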

It is also a good idea to see if your NVMe drives can be put into 4K mode or larger (this can be done for both LVM and ZFS):

lsblk (shows the device name, e.g. /dev/yournvme)
nvme id-ns -H /dev/yournvme (lists the LBA formats the drive supports)
nvme format --lbaf=X /dev/yournvme (X is the LBA format number with no metadata and the largest data size, such as 4K)
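
Picking X by eye works, but the choice can also be sketched as a small filter. The function name is my own, and the parsing assumes that nvme id-ns -H prints lines of the form shown in the sample below, which may differ between nvme-cli versions, so check the raw output first.

```shell
#!/bin/sh
# best_lbaf: reads `nvme id-ns -H` output on stdin and prints the index
# of the LBA format with no metadata and the largest data size.
best_lbaf() {
    sed -n 's/^LBA Format *\([0-9][0-9]*\) *: *Metadata Size: *\([0-9][0-9]*\) *bytes *- *Data Size: *\([0-9][0-9]*\) *bytes.*/\1 \2 \3/p' |
        awk '$2 == 0 && $3 > max { max = $3; idx = $1 } END { if (idx != "") print idx }'
}

# On a real system: nvme id-ns -H /dev/yournvme | best_lbaf
# Illustrative sample input (not taken from a real drive):
best_lbaf <<'EOF'
LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0x2 Good (in use)
LBA Format  1 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0x1 Better
EOF
```

For the sample input this prints 1, which would be the value for --lbaf. Remember that nvme format erases the drive.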

Also make sure the block size set for the storage matches your ZFS pool (if the pool is 4K, set the block size to 4k under Datacenter, Storage, then the name of your disk).

For background see this article. Basically, you need to install nvme-cli (sudo apt install nvme-cli) to check which modes the NVMe drive can be put in. This is a format operation, so it should be done before you put anything on the NVMe disk. For a lengthier explanation of this and other related operations, see this article.

This article has some more tips on tuning, in particular on benchmarking performance with fio (sudo apt install fio):

4K Random read
fio --filename=test --sync=1 --rw=randread --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
4K Random write
fio --filename=test --sync=1 --rw=randwrite --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
4K Sequential read
fio --filename=test --sync=1 --rw=read --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
4K Sequential write
fio --filename=test --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
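
Since the four commands differ only in the --rw value, they can be wrapped in a loop. This is just a convenience sketch with a hypothetical function name. By default it only prints the commands, and it runs them when called with the argument run.

```shell
#!/bin/sh
# fio_suite [run]
# Without arguments it only prints the four fio commands (a dry run).
# With "run" it executes them one after another and removes the test
# file after each successful run, as in the commands above.
fio_suite() {
    action="${1:-plan}"
    for mode in randread randwrite read write; do
        set -- fio --filename=test --sync=1 --rw="$mode" --bs=4k \
            --numjobs=1 --iodepth=4 --group_reporting --name=test \
            --filesize=10G --runtime=300
        if [ "$action" = "run" ]; then
            "$@" && rm test
        else
            echo "$*"
        fi
    done
}

fio_suite        # dry run: prints the commands
# fio_suite run  # uncomment to benchmark (writes a 10G file in the current directory)
```

Run it from a directory on the disk you want to measure, once on the host and once inside a VM, to compare the two layers.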

Virtual machine layer tweaks

Now we have a good understanding of performance on the host layer. Next up is tweaking the virtual machines. This article assumes you are installing Ubuntu 22.04 server images. During installation, make sure not to add LVM to the disks: you are already doing this at the host layer with either ZFS or LVM, and doing it twice just hurts performance unless you have a good reason to do so.

Also uncheck the firewall option, as it has a penalty of around 0.1 ms and we can manually set up a firewall inside the virtual machine with UFW, unless you have a good reason to keep the Proxmox firewall and think the performance penalty is negligible.
Use VirtIO for the hard disks/NVMe drives as it is faster than the other options. Discard can be on if the disk sits on storage with thin provisioning capabilities, such as ZFS.
For CPU, use host as the type for added performance.
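
To review these three settings on an existing VM, the output of qm config can be checked with a small filter. This is a sketch with a hypothetical function name. The sample config below (VM disks on local-zfs, the MAC address) is made up for illustration.

```shell
#!/bin/sh
# check_vm_config: reads `qm config <vmid>` output on stdin and reports
# the CPU type, network devices with the firewall enabled, and disks
# that do not have discard=on.
check_vm_config() {
    awk -F ': ' '
        $1 == "cpu" { cpu = $2 }
        $1 ~ /^net[0-9]+$/ && $2 ~ /firewall=1/ { fw = fw " " $1 }
        $1 ~ /^(virtio|scsi|sata|ide)[0-9]+$/ && $2 !~ /media=cdrom/ && $2 !~ /discard=on/ { nd = nd " " $1 }
        END {
            print "cpu type: " (cpu == "" ? "not set" : cpu)
            print "firewall enabled on:" (fw == "" ? " none" : fw)
            print "disks without discard=on:" (nd == "" ? " none" : nd)
        }
    '
}

# On the Proxmox host: qm config <vmid> | check_vm_config
# Made-up sample config:
check_vm_config <<'EOF'
cpu: host
net0: virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr0,firewall=1
scsihw: virtio-scsi-pci
virtio0: local-zfs:vm-100-disk-0,discard=on,size=32G
virtio1: local-zfs:vm-100-disk-1,size=100G
ide2: none,media=cdrom
EOF
```

For the sample it reports the CPU type host, the firewall still enabled on net0, and virtio1 as the only disk without discard.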

After installation, do the same two steps from the previous articles (server hardening and optimization, as well as a small guide on securing login with Secure Shell (SSH)). For tuned-adm, the profile could be virtual-guest. Now we are going to replace the kernel to improve real-time latency performance, as this is likely important in a network of nodes where you want information to propagate as fast as possible. You can find the guide on installing a zen kernel here. This installation also adds a lot of background tweaks, such as using a disk scheduler (kyber) that is well suited for node operations / low latency. In general this kernel is great, with one exception: don't use it on a VM that has your graphics card drivers, such as nvidia/amd, as they will have to be reinstalled. Basically it is two steps:

sudo add-apt-repository ppa:damentz/liquorix && sudo apt-get update
sudo apt-get install linux-image-liquorix-amd64 linux-headers-liquorix-amd64
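
Once the VM has been rebooted into the new kernel, a quick check like the following shows which kernel is running and which disk scheduler is active. It is a sketch that relies on the standard sysfs layout, where the active scheduler is the name in square brackets. The helper name is my own.

```shell
#!/bin/sh
# active_scheduler: reads a line such as "mq-deadline [kyber] none"
# on stdin and prints the name inside the square brackets.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Running kernel; after the install it should normally mention liquorix.
uname -r

# Active scheduler for every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev="${f#/sys/block/}"
    dev="${dev%%/*}"
    printf '%s: %s\n' "$dev" "$(active_scheduler < "$f")"
done
```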

Apart from tuning the kernel, we can edit /etc/fstab and add noatime next to defaults on the line for your disk, for example on an ext4 install of Ubuntu 22.04:
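
A minimal sketch of that change, using a made-up UUID and working on sample text rather than the real /etc/fstab. The function name is hypothetical. Note that awk rewrites a modified line with single spaces between the fields.

```shell
#!/bin/sh
# add_noatime: reads fstab content on stdin and appends noatime to the
# options (4th field) of every ext4 entry that does not have it yet.
# Comment lines and other file systems are passed through unchanged.
add_noatime() {
    awk '$1 !~ /^#/ && NF >= 4 && $3 == "ext4" && $4 !~ /(^|,)noatime(,|$)/ { $4 = $4 ",noatime" }
         { print }'
}

# Sample input with a placeholder UUID:
add_noatime <<'EOF'
# <file system> <mount point> <type> <options> <dump> <pass>
UUID=00000000-0000-0000-0000-000000000000 / ext4 defaults 0 1
EOF
```

To use it for real, one option is to write the result to a scratch file (add_noatime < /etc/fstab > /tmp/fstab.new), compare it with diff, and only then copy it into place as root.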

At this point it can be a good idea to reboot. Please make sure your fstab is correct first, or the machine will fail to boot. A good way to check is to run

sudo findmnt --verify --verbose

One final tweak, if your VM has a lot of memory attached to it, is to adjust a few memory options in /etc/sysctl.conf to improve disk IO. See this guide. However, this has pros and cons, so only do it if your node operations need lower latency and you have adequate IO to handle the loss of throughput.

vm.dirty_ratio = 10 (default is 20; this is good for latency but not for IO throughput)
vm.dirty_background_ratio = 5 (should be half of vm.dirty_ratio)
vm.vfs_cache_pressure = 500 (flushes the cache often so as not to get inconsistencies in latency)
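
Before changing these, it is worth noting the current values so you can go back. The sketch below prints the current values and the three settings in sysctl file format. As an alternative to editing /etc/sysctl.conf directly, it suggests a drop-in file whose name (99-node-latency.conf) is just a hypothetical choice.

```shell
#!/bin/sh
# node_latency_sysctl: prints the three settings above in the format
# used by /etc/sysctl.conf and /etc/sysctl.d/*.conf.
node_latency_sysctl() {
    cat <<'EOF'
vm.dirty_ratio = 10
vm.dirty_background_ratio = 5
vm.vfs_cache_pressure = 500
EOF
}

# Show the current values for comparison (readable without root).
for key in dirty_ratio dirty_background_ratio vfs_cache_pressure; do
    f="/proc/sys/vm/$key"
    if [ -r "$f" ]; then
        printf 'current vm.%s = %s\n' "$key" "$(cat "$f")"
    fi
done

# To apply (needs root; the file name is a hypothetical choice):
#   node_latency_sysctl | sudo tee /etc/sysctl.d/99-node-latency.conf
#   sudo sysctl --system
```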

Why bother with virtualization in the first place?

To end the article, I want to show why Proxmox or similar software can be useful. While it is not true “bare metal” in the sense of running directly on the host layer with the hardware, you are as close to bare metal as you can be with these types of tweaks, and in general you might lose only 1–3% of performance (according to Proxmox). You still own all your hardware and you are still in control of it, as you are the single tenant of the computer. To illustrate some of the benefits, here is an example of a server I run:

ANPSERVER5 with 4 virtual machines

ANPSERVER5–1 runs 2x Cardano nodes and 1x Geth node. It is the node operations VM and has priority in resources.

64GB of RAM as well as two fast NVMe drives are assigned to the node operations VM.

ANPSERVER5–2 runs Zabbix to have an overview of the network.

Only 2GB of RAM and 2 CPU cores are assigned to Zabbix, as well as a smaller NVMe drive, but that is plenty for this service.

ANPSERVER5–3 runs Portainer (controls Docker containers), Chia (proof of storage crypto) and Komga (comic book / book reader).

Plenty of disks for ANPSERVER5–3 due to Chia farming.
Komga with ebooks. Bundles of ebooks, for example on cyber security, can be found on pages such as https://www.fanatical.com/en/bundle/cybersecurity-bundle-3-rd-edition
Portainer allowing control of multiple Docker containers.

ANPSERVER5–4 has the graphics card (nvidia) passed through via PCI passthrough and is mining Ergo.

I removed the network device ID for safety, but apart from that you can see that this is a low-resource VM that is only there to run mining software for Ergo.
NBMiner mining with a GeForce 3060 LHR on ANPSERVER5–4.

I also have a dashboard where I gather resources from my cluster of Proxmox servers for one-click access (cluster operations may be the subject of another article, if you would like more information on Proxmox).

As you can see, you can utilize the resources of your computer quite efficiently while also adding safety layers, such as confining mining to a single VM. It is also much easier to start up new VMs or shut down old ones, and to do upgrades or other tasks that would be more tedious with a single host OS installation.

ADA North Pool

http://adanorthpool.com 0100000101000100010000010010000001001110010011110101001001010100010010000010000001010000010011110100111101001100