2019-12-07 bsda2: 0.3.0 Release, the Return of pkg_validate

The 0.3.0 release of bsda2 reintroduces the pkg_validate command, which provides the same functionality as running pkg check -s (see pkg-check(8)). The first BSD Administration Scripts collection provided pkg_validate because at the time this functionality was missing. With bsda2 it was considered obsolete, but given the current state of multi-core computing and fast SSDs there is an opportunity for significant performance gains.

A Comparison

The output of pkg_validate is very similar to that of pkg check -s.

Progress

An obvious difference is how progress is indicated. pkg check shows a percentage (based on the number of packages, not on the amount of actual work), whereas pkg_validate lets you know what it is currently working on.

root# pkg check -s
Checking all packages: 68%
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
Checking all packages: 85%

The progress of pkg check -s.

kamikaze# pkg_validate
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
Checking package 772 of 944: subversion-1.13.0

The progress of pkg_validate.

Output Capturing

Something that pkg_validate supports much better than pkg check is redirecting output:

root# pkg check -s | tee issues
Checking all packages: ......py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
Checking all packages....... done
root# cat issues
Checking all packages: ......
Checking all packages....... done
root#

Capture pkg check -s output with tee(1).

So what happened here? The interesting output apparently goes to /dev/stderr and the progress goes to /dev/stdout, so we end up capturing the progress instead of the interesting data. This can be fixed by exchanging the outputs:

root# ((pkg check -s 1>&3) 2>&1) 3>&2 | tee issues
Checking all packages: 68%
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
Checking all packages: 100%
root# cat issues
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
root#

Capture pkg check -s output with tee(1), for real this time.
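The descriptor swap is easier to follow on a stand-in command. The sketch below uses a hypothetical emit function (not part of pkg or bsda2) that writes one line to each stream. It wraps the same nested redirection as above in a helper and, for comparison, shows the flat three-step form of the same swap. The names emit, swap_nested and swap_flat are made up for this illustration.

```sh
#!/bin/sh
# emit is a hypothetical stand-in for `pkg check -s`:
# "progress" goes to stdout, "finding" goes to stderr.
emit() {
	echo "progress"
	echo "finding" >&2
}

# The nested form used above: fd 3 becomes a copy of the original
# stderr, stderr is pointed at the original stdout (the pipe), and
# stdout is pointed at fd 3.
swap_nested() {
	( ( "$@" 1>&3 ) 2>&1 ) 3>&2
}

# The same swap in one step: save stdout in fd 3, point stdout at
# stderr, point stderr at the saved stdout, then close fd 3.
swap_flat() {
	"$@" 3>&1 1>&2 2>&3 3>&-
}

# Only the line emit wrote to stderr travels through the pipe;
# "progress" ends up on the terminal via stderr.
swap_nested emit | sed 's/^/captured: /'
```

Both helpers take the command to run as their arguments, so swap_flat pkg check -s | tee issues would be the equivalent of the command line shown above.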

The pkg_validate output goes directly to /dev/stdout, error messages go to /dev/stderr and the progress goes to /dev/tty. The latter is removed when pkg_validate exits. This makes output redirection much easier:

kamikaze# pkg_validate | tee issues
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
kamikaze# cat issues
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
kamikaze#

Capture pkg_validate output with tee(1).
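The separation of findings and progress can be sketched in a few lines of sh. This is an illustration of the idea and makes no claim about how pkg_validate implements it. The function check_all and the rule that any name containing "bad" produces a finding are made up for the example, and the fallback to /dev/null when no terminal is available is an assumption added so the sketch also runs without one.

```sh
#!/bin/sh
# Open fd 3 on the controlling terminal. Probe in a subshell first,
# because a failing redirection on `exec` would terminate the script.
if (: >/dev/tty) 2>/dev/null; then
	exec 3>/dev/tty
else
	exec 3>/dev/null   # assumption: silently drop progress without a tty
fi

# check_all is a hypothetical checker: findings go to stdout, the
# status line goes to fd 3 and never enters a pipe or a file.
check_all() {
	total=$#
	count=0
	for item in "$@"; do
		count=$((count + 1))
		# \r returns to the start of the line, ESC [ K clears it.
		printf '\r\033[KChecking item %d of %d: %s' \
		       "$count" "$total" "$item" >&3
		case $item in
		*bad*) echo "$item: checksum mismatch" ;;   # made-up rule
		esac
	done
	# Remove the status line before returning.
	printf '\r\033[K' >&3
}
```

With this arrangement check_all good-1 bad-2 good-3 | tee issues captures only the finding, while the status line stays on the terminal.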

Running Unprivileged

One of the drawbacks of pkg check is that it cannot run without root privileges:

kamikaze# pkg check -s
pkg: Insufficient privileges
kamikaze#

Running pkg check -s without root privileges.

This is not an issue with pkg_validate. However, it should be noted that it ignores files it cannot check for lack of the necessary permissions. The reason is that in the vast majority of cases these files are not relevant to the user running the application.

Nonetheless, pkg_validate can report these files:

kamikaze# pkg_validate -m
cups-2.2.12: user kamikaze cannot access /usr/local/libexec/cups/backend/dnssd
cups-2.2.12: user kamikaze cannot access /usr/local/libexec/cups/backend/ipp
cups-2.2.12: user kamikaze cannot access /usr/local/sbin/cupsd
cups-2.2.12: user kamikaze cannot access /usr/local/libexec/cups/backend/lpd
cups-2.2.12: user kamikaze cannot access /usr/local/etc/cups/cups-files.conf.sample
cups-2.2.12: user kamikaze cannot access /usr/local/etc/cups/cupsd.conf.sample
cups-2.2.12: user kamikaze cannot access /usr/local/etc/cups/snmp.conf.sample
dbus-1.12.16: user kamikaze cannot access /usr/local/libexec/dbus-daemon-launch-helper
gutenprint-5.3.3: user kamikaze cannot access /usr/local/libexec/cups/backend/gutenprint53+usb
hplip-3.19.11: user kamikaze cannot access /usr/local/libexec/cups/backend/hp
polkit-0.114_3: user kamikaze cannot access /usr/local/etc/polkit-1/rules.d(/50-default.rules)
py36-pycparser-2.19: checksum mismatch for /usr/local/lib/python3.6/site-packages/pycparser/__pycache__/c_ast.cpython-36.pyc
rxvt-unicode-9.22_1: user kamikaze cannot access /usr/local/bin/urxvt
rxvt-unicode-9.22_1: user kamikaze cannot access /usr/local/bin/urxvtd
trousers-0.3.14_2: user kamikaze cannot access /usr/local/etc/tcsd.conf.sample
vpnc-0.5.3_13: user kamikaze cannot access /usr/local/etc/vpnc.conf.sample
kamikaze#

Running pkg_validate --no-filter.

A noteworthy example is the following line:

polkit-0.114_3: user kamikaze cannot access /usr/local/etc/polkit-1/rules.d(/50-default.rules)

Missing file?

This line is unusual because part of the path is wrapped in parentheses. This indicates that the file /usr/local/etc/polkit-1/rules.d/50-default.rules could not be checked because the directory /usr/local/etc/polkit-1/rules.d is not accessible.
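The notation can be reproduced with a short sh sketch. It is a guess at the idea and not taken from the pkg_validate source. The helpers first_blocked, mark_below and annotate are hypothetical names, and the sketch treats a directory as blocking when test -x fails for it, which is also the case for a directory that does not exist.

```sh
#!/bin/sh
# Print the outermost ancestor directory of $1 that the current user
# cannot search, or nothing if every ancestor is searchable.
first_blocked() {
	dir=${1%/*}
	blocked=
	# Walk from the file's directory up towards /; the last failure
	# seen is the one closest to the root.
	while [ -n "$dir" ]; do
		[ -x "$dir" ] || blocked=$dir
		dir=${dir%/*}
	done
	printf '%s' "$blocked"
}

# Print path $2 with the part below directory $1 in parentheses.
mark_below() {
	rest=${2#"$1"}
	printf '%s(%s)\n' "$1" "$rest"
}

# Combine both: annotate a path only if an ancestor blocks access.
annotate() {
	hit=$(first_blocked "$1")
	if [ -n "$hit" ]; then
		mark_below "$hit" "$1"
	else
		printf '%s\n' "$1"
	fi
}
```

Called as mark_below /usr/local/etc/polkit-1/rules.d /usr/local/etc/polkit-1/rules.d/50-default.rules, the formatter reproduces the path shown in the polkit line above.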

Runtime Measurements

Of course, none of these differences are what pkg_validate was written for; it was meant to be fast.

The test setup is an Intel Core i7-9750H with 32 GiB of RAM running FreeBSD 12.1-STABLE on a RAID-Z1 with geli full disk encryption over two 1 TB ADATA SX8200PNP NVMe SSDs.

2.5 GHz fixed clock
Single package (texlive-texmf) [2pt/s]    15.20   15.23   15.12   15.38   15.24
TeX Live packages (-x texlive) [2pt/s]    31.73   31.97   31.69   31.74   32.03
All 943 packages (-a) [2pt/s]            187.64  186.82  187.32  188.22  186.70

Turbo enabled (max performance)
Single package (texlive-texmf) [2pt/s]     8.77    8.81    8.64    8.68    8.67
TeX Live packages (-x texlive) [2pt/s]    18.07   18.05   18.04   18.01   18.04
All 943 packages (-a) [2pt/s]            104.57  104.72  104.45  104.71  105.98



Closing Thoughts

Because a few large packages contribute the majority of files, per-package dispatch as in pkg_libchk was not satisfactory. Especially when checking a single package, performance was abysmal until per-file dispatch was introduced. There is still room for improvement, because workers compete for access to the single job queue. For now, with pkg check as the baseline, this is pretty good.
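The difference between the two dispatch strategies can be illustrated with find(1) and xargs(1). This is only a rough analogy under stated assumptions and not how pkg_validate is implemented: directories stand in for packages, cksum(1) stands in for the real checksum, and the function names per_file and per_dir are made up. The -print0, -0 and -P options are extensions found in the FreeBSD and GNU tools, not part of POSIX.

```sh
#!/bin/sh
# Per-file dispatch: every file is its own job (-n 1), handed to the
# next free one of up to $1 parallel workers. A single large
# directory is spread across all workers.
per_file() {
	workers=$1; shift
	find "$@" -type f -print0 |
		xargs -0 -n 1 -P "$workers" cksum
}

# Per-directory dispatch for contrast: one job per directory, so the
# largest directory bounds the total runtime and a single directory
# uses only one worker.
per_dir() {
	workers=$1; shift
	printf '%s\0' "$@" |
		xargs -0 -n 1 -P "$workers" \
		      sh -c 'find "$1" -type f -exec cksum {} +' sh
}
```

Both functions take the worker count followed by one or more directories, e.g. per_file 8 /usr/local/share/texmf-dist. They produce the same set of lines, only the order and the distribution of work differ.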

References