tech_documents:virtualization:virtualization_host_centos8_zfs

Partitioning

If 2 SSD + 2 HDD then mirror /boot, / and swap on HDD, leave rest blank for ZFS
If 2 SSD/HDD then mirror /boot, / and swap on SSD, leave rest blank for ZFS
If 4 SSD/HDD then mirror /boot, R10 / and swap, leave rest blank for ZFS or…
If 4 SSD/HDD then mirror /boot, / and swap on 2 of SSD/HDD and leave rest blank for ZFS

For systems with 4 drives or more you can choose between using all disks for the OS and virtual guests or leave a pair of disks empty that will be configured as a mirrored backup destination later. If you don't have another network location to backup your guests to then a local mirror is highly recommended.

/boot should be 2GB XFS on R1 or R10
/ should be 20GB + (GB of system RAM) XFS on R1 or R10
swap should be 0.5x of your RAM if RAM is 32GB or more, 1x your RAM if under, on R1 or R10

Note: if using a software R1, install grub2 on both drives that will participate in the R1 array or the R1 part of the R10 array. (grub2-install /dev/sda and grub2-install /dev/sdb or whatever device name you have; also make sure both drives are in the BIOS boot order list if possible).

The reason for adding the amount of RAM in GB to your / partition is that the virtual guests will save their RAM states here on suspend.
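The sizing rules above can be worked out with a few lines of shell. This is only a sketch: the helper names root_size_gb and swap_size_gb and the 64 GB example value are assumptions for illustration, not part of the installer.

```shell
#!/bin/sh
# Sketch: partition sizes derived from installed RAM (in GB),
# following the sizing rules stated above.
# root_size_gb and swap_size_gb are hypothetical helper names.

# / = 20 GB + RAM in GB (room for saved guest RAM states)
root_size_gb() {
    echo $(( 20 + $1 ))
}

# swap = 0.5 x RAM when RAM is 32 GB or more, otherwise 1 x RAM
swap_size_gb() {
    if [ "$1" -ge 32 ]; then
        echo $(( $1 / 2 ))
    else
        echo "$1"
    fi
}

ram_gb=64   # example value, replace with your host's RAM in GB
echo "/boot: 2 GB, /: $(root_size_gb "$ram_gb") GB, swap: $(swap_size_gb "$ram_gb") GB"
```

With 64 GB of RAM this gives an 84 GB / partition and 32 GB of swap.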

Network

Set up the network as you see fit; DHCP is fine to start with. It must be working and enabled for NTP to be configured. Ideally use 1 NIC for the access/management bridge and 1+ NICs as network bridges for virtual guests, or use bonding, VLANs and bridges (configure the bridge NICs later).

Date/Time

Select your timezone, enable NTP

Software selection

Minimal

Begin install and create your root password.

1st Login

Update OS and install basic utils

dnf update
dnf install vim rsyslog
systemctl enable rsyslog
systemctl start rsyslog
shutdown -r now
Create limited user account and add to wheel group for sudo
useradd example_user && passwd example_user
usermod -aG wheel example_user

Logout of root and login using sudo user

Disallow root login over SSH
sudo vim /etc/ssh/sshd_config

then set

PermitRootLogin no
Restart SSHD
sudo systemctl restart sshd

Read more about ZFS to fully realize all of its utility: ZFS on CentOS 8

!!!ZFS stops loading between point releases, read the ZFS article listed above for a better understanding and to plan updates for point releases!!!

sudo dnf install epel-release wget
sudo wget http://download.zfsonlinux.org/epel/zfs-release.el8_3.noarch.rpm
sudo dnf install zfs-release.el8_3.noarch.rpm
sudo gpg --quiet --with-fingerprint /etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux

Install:

sudo dnf install zfs

Limit the amount of RAM it uses:

sudo vim /etc/modprobe.d/zfs.conf

Add: (This will use 4GB of your system RAM so adjust down to 2048MB max if needed.)

# Min 2048MB / Max 4096 MB Limit
options zfs zfs_arc_min=2147483648
options zfs zfs_arc_max=4294967296
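
The zfs_arc_min and zfs_arc_max values are given in bytes. If you want different limits, a small sketch like the following converts MiB to bytes and prints the two option lines; mib_to_bytes is a hypothetical helper name and the 2048/4096 values are simply the ones used above.

```shell
#!/bin/sh
# Sketch: convert ARC limits from MiB to the byte values zfs.conf expects.
# mib_to_bytes is a hypothetical helper name.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

arc_min_mib=2048   # example values taken from the text above
arc_max_mib=4096

# Print the lines to paste into /etc/modprobe.d/zfs.conf
printf 'options zfs zfs_arc_min=%s\n' "$(mib_to_bytes "$arc_min_mib")"
printf 'options zfs zfs_arc_max=%s\n' "$(mib_to_bytes "$arc_max_mib")"
```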

Load ZFS module

sudo /sbin/modprobe zfs
Create Zpools using Partitions

Do this on the drives that already have the /boot, / and swap partitions. If you have 4 or more drives, read through the whole ZFS section first, as your layout might be different from this 2 drive example.

Get your storage device info, here we'll assume you have 2 drives, sda and sdb

sudo lsblk
Create Partitions
sudo parted -a optimal /dev/sda
print free

Take the Start and End values of your free space and use them to create the partition

mkpart primary 72.0GB 500GB
print
quit

Repeat for each disk that will have free space added to the zpool. Run

lsblk

to get a list of the partitions that will be used in the zpool. If you created /boot, / and swap partitions then the ZFS partitions should be sda4 and sdb4.

Mirrored Zpool

Create the zpool. Here we are using HDDs and mirroring them, mounting them at /var/lib/libvirt/images, and naming the zpool vhsrv01_vg_images, where vhsrv01 is the name of the virtualization host. Note: Use ashift=13 on any Samsung SSD 850 era and newer.

sudo zpool create -f -o ashift=12 -m /var/lib/libvirt/images vhsrv01_vg_images mirror /dev/sda4 /dev/sdb4
Mirrored and Striped Zpool

Or the equivalent of RAID10 with 4 HDDs

sudo zpool create -f -o ashift=12 -m /var/lib/libvirt/images vhsrv01_vg_images mirror /dev/sda4 /dev/sdb4 mirror /dev/sdc4 /dev/sdd4
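
The ashift value used in the zpool create commands above is the base-2 logarithm of the sector size the pool should assume, so ashift=12 means 4096-byte sectors and ashift=13 means 8192-byte sectors. A minimal sketch of that relationship, with ashift_for_sector as a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: ashift is log2 of the sector size in bytes.
# ashift_for_sector is a hypothetical helper name; it assumes the
# argument is a power of two.
ashift_for_sector() {
    size=$1
    shift_bits=0
    while [ "$size" -gt 1 ]; do
        size=$(( size / 2 ))
        shift_bits=$(( shift_bits + 1 ))
    done
    echo "$shift_bits"
}

for sector in 512 4096 8192; do
    echo "sector size $sector bytes -> ashift=$(ashift_for_sector "$sector")"
done
```

The sector sizes a drive reports can be listed with lsblk -o NAME,PHY-SEC,LOG-SEC, though some SSDs report a smaller size than they use internally, which is why the note above recommends ashift=13 for certain Samsung models.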
ZFS Optimization

Here vhsrv01_vg_images is the pool name

sudo zfs set xattr=sa vhsrv01_vg_images
sudo zfs set acltype=posixacl vhsrv01_vg_images
sudo zfs set compression=lz4 vhsrv01_vg_images
sudo zfs set atime=off vhsrv01_vg_images
sudo zfs set relatime=off vhsrv01_vg_images
ZFS TRIM on SSDs

If you are using SSDs please enable TRIM
To run trim:

sudo zpool trim vhsrv01_vg_images

To check trim status:

sudo zpool status -t vhsrv01_vg_images

To make it automatic

sudo zpool set autotrim=on vhsrv01_vg_images
Resilvering/Scrubbing

This will verify the integrity of the data on the drives and repair any errors.

You should do this weekly if you are using cheap drives like I am:

sudo vim /etc/crontab

Add: (this will scrub every Sunday at 2am; be sure it isn't scheduled alongside other disk-intensive activities. If you have another zpool on a separate set of drives you can schedule those at the same time)

0 2 * * 0 root /sbin/zpool scrub vhsrv01_vg_images

If you have 4 or more drives and aren't using them all in a R10 array then use 2 of them for backups of the virtual guest disk images. Your /boot / and swap should already be on the larger/slower HDDs if going this route and the other 2 drives should be SSDs dedicated to /var/lib/libvirt/images.

In this case you'll have sda4 and sdb4 mounted at /vg_backups with a zpool name of vhsrv01_vg_backups

sudo zpool create -f -o ashift=12 -m /vg_backups vhsrv01_vg_backups mirror /dev/sda4 /dev/sdb4
sudo zfs set xattr=sa vhsrv01_vg_backups
sudo zfs set acltype=posixacl vhsrv01_vg_backups
sudo zfs set compression=lz4 vhsrv01_vg_backups
sudo zfs set atime=off vhsrv01_vg_backups
sudo zfs set relatime=off vhsrv01_vg_backups
sudo zpool set autotrim=on vhsrv01_vg_backups

And will use sdc and sdd for /var/lib/libvirt/images

sudo zpool create -f -o ashift=13 -m /var/lib/libvirt/images vhsrv01_vg_images mirror /dev/sdc /dev/sdd
sudo zfs set xattr=sa vhsrv01_vg_images
sudo zfs set acltype=posixacl vhsrv01_vg_images
sudo zfs set compression=lz4 vhsrv01_vg_images
sudo zfs set atime=off vhsrv01_vg_images
sudo zfs set relatime=off vhsrv01_vg_images
sudo zpool set autotrim=on vhsrv01_vg_images

Omit the autotrim=on line if the drive is a HDD.
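
If you are not sure whether a device is an SSD or an HDD, the kernel exposes a rotational flag in sysfs (1 for spinning disks, 0 for SSDs). The sketch below reads that flag and suggests an autotrim value; autotrim_setting is a hypothetical helper name, and the optional second argument (an alternative sysfs root) exists only to make the function easy to try out.

```shell
#!/bin/sh
# Sketch: suggest an autotrim value from the kernel's rotational flag.
# autotrim_setting is a hypothetical helper name.
# Usage: autotrim_setting DEVICE [SYSFS_ROOT]
autotrim_setting() {
    flag_file="${2:-/sys}/block/$1/queue/rotational"
    if [ ! -r "$flag_file" ]; then
        echo unknown          # device not found, decide manually
    elif [ "$(cat "$flag_file")" = "0" ]; then
        echo on               # non-rotational (SSD): TRIM is useful
    else
        echo off              # rotational (HDD): leave autotrim off
    fi
}

# Example device names from this document, adjust to your system
for dev in sda sdb sdc sdd; do
    echo "$dev: autotrim=$(autotrim_setting "$dev")"
done
```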

Add the scrub cron job for both in /etc/crontab

0 2 * * 0 root /sbin/zpool scrub vhsrv01_vg_images
0 2 * * 0 root /sbin/zpool scrub vhsrv01_vg_backups
Permissions

Give permission to your wheel group (which your sudo user is a part of) to the ZFS datasets/pools

sudo zfs allow -g wheel compression,clone,create,destroy,hold,promote,receive,rollback,send,snapshot,mount,mountpoint vhsrv01_vg_images
sudo zfs allow -g wheel compression,clone,create,destroy,hold,promote,receive,rollback,send,snapshot,mount,mountpoint vhsrv01_vg_backups
ZFS Send/Receive and Visudo

If you want to “zfs send” from another host to this host using a sudoer then you'll need to add the following permissions via visudo, otherwise sudo will try to prompt for a password and the transfer won't work.

sudo visudo

Add the following (omit the @syncoid lines if you aren't using sanoid/syncoid):

%wheel ALL=(ALL) NOPASSWD:/sbin/zfs get *
%wheel ALL=(ALL) NOPASSWD:/sbin/zfs snapshot *
%wheel ALL=(ALL) NOPASSWD:/sbin/zfs receive *
%wheel ALL=(ALL) NOPASSWD:/sbin/zfs rollback *@syncoid*
%wheel ALL=(ALL) NOPASSWD:/sbin/zfs destroy *@syncoid*
Datasets

Create datasets in the ZFS pool for each virtual guest or virtual guest image if you want very granular snapshot control.

This will also create the directory /var/lib/libvirt/images/vm_guest_name

sudo zfs create vhsrv01_vg_images/vm_guest_name

Create RAW images via qemu-img instead of the Virt-gui as it defaults to falloc allocation which will take forever.

sudo qemu-img create -f raw /var/lib/libvirt/images/vm_guest_name/VM_GUEST_NAME.img 50G -o preallocation=off

When using Virt-Manager to create a virtual guest, select the disk image you just created via qemu-img with “browse local → /var/lib/libvirt/images/vm_guest_name/VM_GUEST_NAME.img”, as it won't be listed in the default storage pool.

Install basic Gnome + Utilities:

sudo dnf install gnome-classic-session gnome-terminal nautilus-open-terminal control-center liberation-mono-fonts vim tar gnome-disk-utility gnome-system-monitor firefox

If you want Gnome to load on reboot run the commands below (though you don’t need to if you are only going to use VNC for remote management)

sudo unlink /etc/systemd/system/default.target
sudo ln -sf /lib/systemd/system/graphical.target /etc/systemd/system/default.target

Install Virtualization packages (virt-manager for a nice GUI for your virtual clients)

  
sudo dnf install libvirt virt-manager virt-top
Add your sudo user to libvirt group

https://computingforgeeks.com/use-virt-manager-as-non-root-user/

sudo usermod -aG libvirt example_user

Edit libvirtd.conf

sudo vim /etc/libvirt/libvirtd.conf

Uncomment/edit the following:

unix_sock_group = "libvirt"
unix_sock_ro_perms = "0770"
unix_sock_rw_perms = "0770"

You may also want to give the libvirt group r/w access to the /var/lib/libvirt/images directory so that you can scp remotely to and from it using your sudo user, since root ssh is disabled. If you are going to use Sanoid/Syncoid this will be needed, and if you copy a guest image from another host, run these commands again to update its permissions; maybe even put this in a daily script…

sudo chown -R root:libvirt /var/lib/libvirt/images
sudo chmod -R 771 /var/lib/libvirt/images
sudo chmod -R g+s /var/lib/libvirt/images

Restart libvirtd

sudo systemctl restart libvirtd

SSD Settings

If you used SSD drives on MDADM (linux software RAID) then enable FSTRIM service for cleanup

sudo systemctl enable fstrim.timer
sudo systemctl start fstrim.timer

Check status of timer by showing systemd timers:

systemctl list-timers

Check trim support by:

lsblk --discard

Stop writing a timestamp every time a file is accessed: edit the /etc/fstab file and replace all the defaults strings with defaults,noatime.

sudo vim /etc/fstab

For example:

/dev/mapper/rhel-root   /         xfs     defaults        1 1

becomes:

/dev/mapper/rhel-root   /         xfs     defaults,noatime        1 1

If using LVM enable Trim function by editing /etc/lvm/lvm.conf and change

issue_discards = 0

to

issue_discards = 1

Other notes for using SSDs: https://www.certdepot.net/rhel7-extend-life-ssd/

Performance Settings

Set the proper performance profile via tuned-adm:

sudo tuned-adm profile virtual-host

then check to make sure:

sudo tuned-adm list

This should adjust the swappiness, change to the deadline scheduler and other things.

Manually Specify Swappiness

By default swappiness is set to 10 with the virtual-host profile. If you really want to avoid swapping, set it to 1, though make sure you have enough RAM for all of your guests. Avoiding swapping is good if your swap is on an SSD; otherwise the default of 10 should be fine for spinning disks. You might want to set the same value on your Linux virtual guests so they avoid swapping if possible.

sudo vim /etc/sysctl.conf

Add the following:

vm.swappiness = 1

Disable KSM (Kernel Samepage Merging, which merges identical memory pages so memory can be oversubscribed across similar virtual guests)

sudo systemctl stop ksmtuned
sudo systemctl stop ksm
sudo systemctl disable ksm
sudo systemctl disable ksmtuned

Fix for Win10/2016+ BSOD or crashes

I'm not sure of the cause but this is the fix, research and determine if a better option exists. This is needed only for some architectures.

sudo vim /etc/modprobe.d/kvm.conf

Add the line:

options kvm ignore_msrs=1

VNC Server

Install a VNC server so you can quickly manage the VMs remotely:

sudo dnf install tigervnc-server

Edit the VNC user file and add your user/port

sudo vim /etc/tigervnc/vncserver.users

Example (this runs a server on port 5908 for user vadmin)

:8=vadmin
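
The TCP port follows from the display number: VNC display :N listens on port 5900 + N, which is why display :8 ends up on port 5908. A tiny sketch, with vnc_port as a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: VNC display number to TCP port (5900 + display).
# vnc_port is a hypothetical helper name.
vnc_port() {
    echo $(( 5900 + $1 ))
}

echo "display :8 listens on port $(vnc_port 8)"
```

If you pick a different display number in vncserver.users, use the matching port in the firewall rule and in your VNC viewer.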

Edit the VNC config file and add gnome as your session manager

sudo vim /etc/tigervnc/vncserver-config-defaults

Add the following

session=gnome

Set VNC password, set a password different from your sudo user password, say no to view only password.

vncpasswd

Connect on port 5908 using your VNC viewer (preferably TigerVNC)

Note: If you connect via TigerVNC viewer it will show that your connection is insecure. This is because the certificates used aren't trusted, however TLS encryption should be active, you can verify this by pressing F8 when using TigerVNC viewer and checking connection info.

Firewall

Allow VNC server access

sudo firewall-cmd --permanent --zone=public --add-port=5908/tcp
sudo firewall-cmd --reload
sudo systemctl daemon-reload
sudo systemctl enable vncserver@:8.service
sudo systemctl start vncserver@:8.service

*Note: on older Dell servers DSU isn't supported anymore, definitely not R710/R410/R210 generation*

If you are using a Dell server it is recommended that you download and install Dell's OMSA and use Dell System Update (DSU) to update things. Dell: http://linux.dell.com/repo/hardware/dsu/ Set up the Dell OpenManage repository like this:

curl -O https://linux.dell.com/repo/hardware/dsu/bootstrap.cgi
sudo bash bootstrap.cgi
sudo dnf install srvadmin-all dell-system-update

Note: RHEL 8 has removed libssh2, you need to use epel to get it and to update the BIOS on some Dell servers you'll need libstdc++.i686

sudo dnf install epel-release
sudo dnf install libssh2 libstdc++.i686

run dsu to update firmware/bios/etc

sudo dsu

Note: you can login to OMSA from the local computer at: https://localhost:1311
Login as root otherwise you won't be able to change things.

Network bridge for virtual clients: it is recommended to have one or more dedicated bridges for your clients. If you have more than one bridge, the bridges need to be on separate VLANs or go to separate networks or you’ll create a loop. LAG groups for better throughput are good too. Also, use a separate network card for management and file transfers that won’t interfere with bridged network traffic:

Creating Network Initscripts

Use a consistent bridge name across hosts so that moving VMs is easy, and remember that the names are case sensitive too! Recommend naming them BR0, BR1, BR2, BRX. Please try to have consistency across hosts, so if you have two bridges on 1 host, have 2 bridges on all others configured the same way, connected to the same switched network.

To find the HWADDR do this:

ethtool -P <if-name>

or

ip link

if-name is the name of the Ethernet interface, normally eth0 or em0 or eno1.

In the /etc/sysconfig/network-scripts directory it is necessary to create 2 config files. The first (ifcfg-eth0) (or ifcfg-em1 or em0 or eth0 etc) defines your physical network interface, and says that it will be part of a bridge:

 
sudo vim /etc/sysconfig/network-scripts/ifcfg-eno1

Configure as so:

DEVICE=eno1
# Use your actual HWADDR (MAC address) here
HWADDR=00:16:76:D6:C9:45
ONBOOT=yes
BRIDGE=br0

The second config file (ifcfg-br0) defines the bridge device:

 
sudo vim /etc/sysconfig/network-scripts/ifcfg-br0

Configure as so:

DEVICE=br0
TYPE=Bridge
BOOTPROTO=none
ONBOOT=yes
DELAY=2

WARNING: The line TYPE=Bridge is case-sensitive - it must have uppercase 'B' and lower case 'ridge'

Also, if you have only 1 Ethernet adapter you will want to give the bridge device an IP on your LAN for management; see the static IP example below. After changing this, restart networking (or simply reboot).

sudo nmcli connection reload && sudo systemctl restart NetworkManager

Example of ifcfg-br0 for static IP:

DEVICE=br0
TYPE=Bridge
BOOTPROTO=none
ONBOOT=yes
DELAY=2
IPADDR=10.222.190.249
NETWORK=10.222.190.0
NETMASK=255.255.255.0
GATEWAY=10.222.190.250
DNS1=208.67.220.220
DNS2=208.67.222.222
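
The NETWORK value in the static example is the bitwise AND of IPADDR and NETMASK, octet by octet. If you change the address or mask, a sketch like this recomputes it; network_addr is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: derive the NETWORK value from IPADDR and NETMASK.
# network_addr is a hypothetical helper name.
# Usage: network_addr IPADDR NETMASK
network_addr() {
    # Split both dotted quads on "." so $1..$4 hold the address octets
    # and $5..$8 hold the mask octets.
    old_ifs=$IFS
    IFS=.
    set -- $1 $2
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Values from the static IP example above
echo "NETWORK=$(network_addr 10.222.190.249 255.255.255.0)"
```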

This is used if you are going to copy virtual guests between hosts

NFS Setup

Install NFS packages and enable services

sudo dnf install nfs-utils libnfsidmap
sudo systemctl enable rpcbind
sudo systemctl enable nfs-server
sudo systemctl start rpcbind
sudo systemctl start nfs-server
sudo systemctl start rpc-statd
sudo systemctl start nfs-idmapd

Make a directory called VG_BACKUPS in /var/lib/libvirt/images

sudo mkdir /var/lib/libvirt/images/VG_BACKUPS

Modify the “/etc/exports” file to add an entry for the directory “/var/lib/libvirt/images” that you want to share.

sudo vim /etc/exports

Example of exports file

/var/lib/libvirt/images 172.21.21.0/24(rw,sync,no_root_squash,fsid=<somenumber>)

/var/lib/libvirt/images: this is the shared directory
172.21.21.0/24: this is the subnet that we want to allow access to the NFS share
rw: read/write permission to shared folder
sync: all changes to the according filesystem are immediately flushed to disk; the respective write operations are being waited for
no_root_squash: By default, any file request made by user root on the client machine is treated as by user nobody on the server.(Exactly which UID the request is mapped to depends on the UID of user “nobody” on the server, not the client.) If no_root_squash is selected, then root on the client machine will have the same level of access to the files on the system as root on the server.
fsid=somenumber gives the mount a unique id so that mounts are more easily managed by hosts. I recommend using the first and last octets of the host static IP as the “somenumber”
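
One way to build the fsid from the host IP is sketched below. It assumes "first and last octets" means the two octets written one after the other (10.222.190.249 becomes 10249); that reading, and the helper name fsid_from_ip, are assumptions rather than anything NFS requires.

```shell
#!/bin/sh
# Sketch: build an fsid from the first and last octets of the host IP.
# Assumption: the two octets are simply concatenated.
# fsid_from_ip is a hypothetical helper name.
fsid_from_ip() {
    first=${1%%.*}    # everything before the first dot
    last=${1##*.}     # everything after the last dot
    echo "${first}${last}"
}

host_ip=10.222.190.249   # example address from the bridge section
echo "/var/lib/libvirt/images 172.21.21.0/24(rw,sync,no_root_squash,fsid=$(fsid_from_ip "$host_ip"))"
```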

Export the NFS share

sudo exportfs -r

We need to configure firewall on NFS server to allow client servers to access NFS shares. To do that, run the following commands on the NFS server.

sudo firewall-cmd --permanent --zone public --add-service mountd
sudo firewall-cmd --permanent --zone public --add-service rpc-bind
sudo firewall-cmd --permanent --zone public --add-service nfs
sudo firewall-cmd --reload

Configure NFS Clients

sudo dnf install nfs-utils libnfsidmap
sudo systemctl enable rpcbind
sudo systemctl enable nfs-server
sudo systemctl start rpcbind
sudo systemctl start nfs-server
sudo systemctl start rpc-statd
sudo systemctl start nfs-idmapd

Set SELinux options

sudo setsebool -P nfs_export_all_rw 1
sudo setsebool -P virt_use_nfs=on

Create mount points for NFS shares

sudo mkdir -p /mnt/VHSRV02/VG_IMAGES

(where VHSRV02 is the remote computer name, make one for each mount you will have). Client FSTAB entry to mount NFS share:

172.18.18.24:/var/lib/libvirt/images     /mnt/VHSRV02/VG_IMAGES         nfs4    noauto,nofail,x-systemd.automount,_netdev,x-systemd.device-timeout=14,proto=tcp,rsize=131072,wsize=131072 0 0
192.168.21.14:/VG_BACKUPS     /mnt/VHSRV02/VG_BACKUPS               nfs4    noauto,nofail,x-systemd.automount,_netdev,x-systemd.device-timeout=14,wsize=131072,rsize=131072 0 0

VNC Server

If power is lost or the server isn't shut down cleanly, the VNC server service might not restart on boot, and manually restarting the VNC service fails. Normally the fix is to delete the lock files in the /tmp folder.

https://access.redhat.com/discussions/1149233 Example:

[root@xxx ~]# ls /tmp/.X
.X0-lock   .X1-lock   .X11-unix/ .X2-lock
[root@xxx ~]# rm -Rf /tmp/.X0-lock
[root@xxx ~]# rm -Rf /tmp/.X1-lock
[root@xxx ~]# rm -Rf /tmp/.X11-unix
[root@xxx ~]# rm -Rf /tmp/.X2-lock
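
A narrower variant of the cleanup above removes only the stale files for one display instead of everything. For display :N the lock file is /tmp/.XN-lock and the socket is /tmp/.X11-unix/XN. This is a sketch that assumes the VNC service for that display is stopped; clear_x_lock is a hypothetical helper name, and the demonstration runs in a scratch directory so nothing under /tmp is touched.

```shell
#!/bin/sh
# Sketch: remove the stale lock file and socket for a single X display.
# clear_x_lock is a hypothetical helper name.
# Usage: clear_x_lock DISPLAY_NUMBER [TMP_DIR]
clear_x_lock() {
    display=$1
    tmp_dir=${2:-/tmp}
    rm -f "$tmp_dir/.X${display}-lock" "$tmp_dir/.X11-unix/X${display}"
}

# Demonstration in a scratch directory
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/.X11-unix"
touch "$demo_dir/.X8-lock" "$demo_dir/.X11-unix/X8" "$demo_dir/.X0-lock"
clear_x_lock 8 "$demo_dir"
ls -A "$demo_dir"    # the display :0 lock and the .X11-unix directory remain
rm -rf "$demo_dir"
```

On the real host you would stop the service, run clear_x_lock 8 with sudo rights on /tmp, then start vncserver@:8.service again.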

And when connecting be sure you're connecting via port 5908 if you followed the setup according to this document, so… ip.add.r.ess:5908 (otherwise it defaults to 5900).

Network

If your virtual host becomes unplugged from a network switch then all network interfaces (bonds, bridges, vlans and vnets) will go down. On plugging it back in, the bonds, bridges and vlans will come back up automatically but the vnets won't. This means your virtual guests won't have network access until you shut them down and start them again. Using

ip link set vnet<x> up

seems like it brings the interface up and the guest can ping out but devices on the other side of the vnet interface can't seem to get in. Still working on an automated way to fix this. Nice IP command cheatsheet from Redhat: https://access.redhat.com/sites/default/files/attachments/rh_ip_command_cheatsheet_1214_jcs_print.pdf

Shutting Down, Boot, Startup

I am still unclear if the guests cleanly shutdown when the host is issued a shutdown -r now, there is a config file

/etc/sysconfig/libvirt-guests

where options can be set on what to do but I haven't tested them. Here is a link to some info from Redhat: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/sect-shutting_down_rebooting_and_force_shutdown_of_a_guest_virtual_machine-manipulating_the_libvirt_guests_configuration_settings

Also, if you shut down your virtual guests to do dnf updates on the host, any guests set to autostart at boot will automatically start after an update to libvirt is installed. They will also do this if you restart libvirtd.

  • tech_documents/virtualization/virtualization_host_centos8_zfs.txt
  • Last modified: 2021/08/29 00:16
  • by jacob.hydeman