Now here’s something more like what I was originally expecting the content on this blog to look like. I’m in the process of moving all of our FreeBSD servers (about 30 in total) from 11.3 to 12.1. We have our own local build of the OS, and until “packaged base” gets to a state where it’s reliably usable, we’re stuck doing upgrades the old-fashioned way. I created a set of notes for myself while cranking through these upgrades and I wanted to share them since they are not really work-specific and this process isn’t very well documented for people who haven’t been doing this sort of upgrade process for 25 years.

Our source and object trees are read-only exported from the build server over NFS, which causes things to be slow. /etc/make.conf and /etc/src.conf are symbolic links on all of our servers to the master copies in /usr/src so that make installworld can find the configuration parameters the system was built with. The first phase, because this is a major version upgrade, is to install the new kernel:

    # zfs snapshot -r tank@before-12.1
    # mount /usr/src
    # mount /usr/obj
    # cd /usr/src
    # make -s installkernel
    # shutdown -r now

(If this were a minor version upgrade, it would be a lot simpler.) We then boot single-user and get the server back on the network:

    OK boot -s
    # mount -u /
    # /etc/rc.d/zfs start
    # sysctl net.inet.icmp.icmplim=50000
    # /etc/netstart
    # /usr/local/etc/rc.d/unbound start

I might stop here and run some tests to make sure that /etc/netstart has actually brought the system back up with full connectivity. If we’re using CARP, this is also the time to down the CARP interfaces so that we don’t unintentionally become a non-functional CARP master for whatever service would normally be running. IPv6 can be a sticking point, because the build server has AAAA records but not all of the other machines have IPv6 connectivity. Now it’s time for the userland part of the OS upgrade:
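To see which CARP virtual hosts a machine is participating in before deciding what to take down, the ifconfig output can be summarized. This is a minimal sketch, assuming the FreeBSD 10-and-later output format in which each vhid appears as an indented "carp:" line under its parent interface; the function name carp_vhids is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: read `ifconfig -a` output on stdin and print
# "<interface> <vhid> <state>" for every CARP vhid found.
# Assumes indented lines of the form "carp: MASTER vhid 1 advbase 1 advskew 0".
carp_vhids() {
    awk '
        # An interface header starts in column 1, e.g. "em0: flags=..."
        /^[^ \t]/ { ifname = $1; sub(/:$/, "", ifname) }
        # A CARP line is indented: $2 is the state, $4 is the vhid number
        /^[ \t]+carp: / { print ifname, $4, $2 }
    '
}

# On a real system:
#   ifconfig -a | carp_vhids
```

Anything reported as MASTER here is a candidate for being taken down before continuing.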

    # mount /usr/src
    # mount /usr/obj
    # cd /usr/src
    # etcupdate -p -t /usr/obj/ref-12.1-etcupdate    # much faster to use a tarball built once
    # make -s installworld
    # etcupdate -t /usr/obj/ref-12.1-etcupdate
    # unset EDITOR    # probably not necessary in single-user
    # etcupdate resolve

Next we clean up old crap. This would be a lot simpler except that make delete-old wants to delete configuration files that are actually required by packages we install and that are managed by our configuration management (which we don’t put in /usr/local because we don’t want to have to hack our Puppet modules to figure out whether these services are running from packages or the base install). /etc/ssh is a symlink to /usr/local/etc/ssh on these systems, but make delete-old has no way to avoid traversing the symlink.

    # make check-old-files | sed -e '1d; /^\/etc\/ntp\.conf/d; /^\/etc\/ssh/d' | \
        xargs rm -v
    # make delete-old    # (say "no" to /etc/ntp.conf and /etc/ssh/*)
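Since the output of that pipeline goes straight to rm, it can be worth previewing what survives the filter first. A minimal sketch, assuming (as the 1d in the sed expression implies) that the first line of make check-old-files output is a header rather than a path; the function name filter_old_files is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: read candidate paths on stdin, drop the header line
# and everything under the protected paths, and print what is left.
# It only filters text; it does not delete anything.
filter_old_files() {
    sed -e '1d' -e '/^\/etc\/ntp\.conf/d' -e '/^\/etc\/ssh/d'
}

# Dry run on a real system:
#   cd /usr/src && make check-old-files | filter_old_files
```

Once the list looks right, the same output can be piped to xargs rm -v as above.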

Unmounting the NFS trees at this point just cleans up the mountd database on the build server:

    # cd
    # umount /usr/obj
    # umount /usr/src

Next we update the boot blocks. Our servers vary a lot in terms of which devices are the boot drives, but always use GPT and have the boot meta-loader on partition 1. The most typical case is mirrored SATA, but newer devices may have a single SSD for booting instead:

    # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
    # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
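Because the set of boot drives differs from server to server, it can help to generate the commands from a list of disks and look them over before running anything. A minimal sketch, assuming the layout described above (GPT, boot partition at index 1); the function name bootcode_cmds is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: print (not run) one gpart bootcode command per disk
# named on the command line. Assumes GPT with the boot partition at index 1.
bootcode_cmds() {
    for disk in "$@"; do
        echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $disk"
    done
}

# Review the output first, then pipe it to sh to actually run it:
#   bootcode_cmds ada0 ada1
#   bootcode_cmds ada0 ada1 | sh
```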

Next is to handle the package upgrades. Before doing the OS upgrade, all servers will have been upgraded to our latest package build for 11.3, so this should be a practical no-op other than switching from the 11.x ABI to the 12.x ABI, but in reality we’ve found a number of things that require manual intervention. In particular, if the python2 package is still installed, pkg upgrade -f crashes. We manually install the rcs package on systems where it’s likely to be needed (servers with some local data still managed in local RCS files) because RCS was removed from 12.x. Note that our pkg.conf has HANDLE_RC_SCRIPTS enabled by default, but in this specific case, since we’re still running single-user, it’s important that the startup scripts not be run because they would start services prematurely.

    # pkg-static install -f -y pkg
    # pkg remove -y python2
    # HANDLE_RC_SCRIPTS=NO pkg upgrade -f
    # pkg install -y rcs
    # pkg query '%n %q' | fgrep -v :12:    # consider whether any of these outdated packages should be deleted
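The last command in that list works because %q prints each package’s ABI string, so anything without ":12:" in it was not rebuilt for 12.x. A slightly more explicit variant is sketched below, assuming ABI strings of the form FreeBSD:12:amd64; the function name stale_packages is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: read "name abi" pairs on stdin (as printed by
# pkg query '%n %q') and print the names of packages whose ABI major
# version differs from the one given as the first argument.
stale_packages() {
    awk -v major="$1" '{
        split($2, abi, ":")          # e.g. FreeBSD:12:amd64 -> abi[2] is "12"
        if (abi[2] != major) print $1
    }'
}

# On a real system:
#   pkg query '%n %q' | stale_packages 12
```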

Finally, some minor configuration tweaks to handle features that were introduced in 12.x (or suppress misfeatures that were introduced in 12.x, as the case may be):

    # pkg install -y devcpu-data
    # sysctl hw.model
    (check /boot/firmware to see if there is an appropriate microcode file for this machine's CPU type; this is correct for current devcpu-data on Intel processors)
    # echo 'cpu_microcode_load="YES"' >> /boot/loader.conf
    # echo 'cpu_microcode_name="/boot/firmware/intel-ucode.bin"' >> /boot/loader.conf
    # echo 'kern.cryptodev_warn_interval=0x7fffffff' >> /etc/sysctl.conf
    # echo 'ntpd_flags="-p /var/run/ntpd.pid"' >> /etc/rc.conf
    # reboot
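One hazard with appending via echo is that re-running the notes on a machine that was already partly done leaves duplicate lines behind. A minimal sketch of an idempotent append follows; the function name append_once is hypothetical and not part of the procedure above.

```sh
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so the step is safe to repeat.
append_once() {
    line=$1
    file=$2
    # -x matches whole lines, -F treats the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# On a real system:
#   append_once 'cpu_microcode_load="YES"' /boot/loader.conf
#   append_once 'kern.cryptodev_warn_interval=0x7fffffff' /etc/sysctl.conf
```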

For jails, the process is much the same but with the appropriate DESTDIR or chroot flags to allow the updates to be installed into each jail from the jail-host side.
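As a rough illustration of what “the appropriate flags” might look like, the sketch below prints one possible command sequence for a single jail. It is an assumption rather than the exact procedure used here: it relies on DESTDIR for make installworld, the -D option of etcupdate, and the -c (chroot) option of pkg. The jail path in the example and the function name jail_upgrade_cmds are hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: print (not run) a possible userland upgrade sequence
# for one jail, given the jail's root directory and the etcupdate tarball.
jail_upgrade_cmds() {
    root=$1
    ref=$2
    echo "etcupdate -p -D $root -t $ref"
    echo "make -s installworld DESTDIR=$root"
    echo "etcupdate -D $root -t $ref"
    echo "pkg -c $root upgrade -f"
}

# Example (the jail path is made up for illustration):
#   jail_upgrade_cmds /jails/example /usr/obj/ref-12.1-etcupdate
```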

After a few weeks, I’ll go back into all the machines, delete the before-12.1 snapshot, run make delete-old-libs to clean up obsolete shared libraries, and zpool upgrade to enable the latest ZFS features.

I would really have liked this to be the point at which I set up ZFS boot environments for all the servers, but since the actual requirements seem to be totally undocumented, I was rather stymied on that and will have to do it in a year when we’re going to 12.2. Perhaps by then “packaged base” will be in a usable state as well.