Yay, GTK has reftests. Finally it’s as easy to write tests as it should be. (Actually, it should be even easier, but I think it’s bearable now.) What are reftests? Robert O’Callahan has an explanation. Why are there reftests? My post to the GTK mailing list has the motivation. How do they work in GTK? The commit message explains it in detail. The only question that remains: How are people actually supposed to write reftests?

While I was looking for an easy-to-demonstrate example for this case, I came across this on IRC:

<kklimonda> hmm, why can’t I center horizontally a GtkLabel that wraps its text? There is no problem alligning it vertically

This looked like a perfect way to write a reftest and to actually fix the problem.

Verify the bug

First I had to make sure there actually is a bug. So what I did was open Glade and do a little test. I created a window, added a label, set it to wrap and center and resized the window. Lo and behold, the text was not centered at all:



Create the test case

Ok, so this is broken. Probably right justified is broken, too, so let’s create a test case. I want to actually use multiline text, because I want to see that GtkLabel makes Pango correctly justify the text. This will make the test a bit more complex. Here we go:



A bunch of things I had to think about while writing the test (which you will notice looks broken, as it should):

I made the window a popup window. That way the test runner can run faster, as the window manager doesn’t need to be kept in the loop. Always use popup windows, unless you must use a normal window to reproduce a bug.

I used GtkGrid as the container. GtkGrid is the container for future Gtk. Box and table are on their way out.

container for future Gtk. Box and table are on their way out. It’s not necessary for the label to actually wrap to show this bug. It just needs to be set to wrapping. And it makes writing the reference test a lot easier.

I used Monospace text. This is to make writing the reference easier.

I added a fourth label to increase the width of the window. This is necessary, because the test runner will use a window’s default size. The size you use for the window in Glade has nothing to do with the final size.

I can now throw the test runner at the problem by running (in the toplevel GTK directory) tests/reftests/gtk-reftest reftest.ui . This will fail because there is no reference image to compare it to, but it will already create a rendering in /tmp/reftest.out.png :



Create the reference file

Ok, now to create the reference file from the bug-showing Glade file. The rendering of the reference should look pixel-perfect exactly as we’d expect the correctly justified label to look, but without actually running into the bug. This is where the monospaced font comes in to help – I can use it to align the text with spaces, but keep it left-justified.



Now I have a reference file. If I now run tests/reftests/gtk-reftest reftest.ui again, I’ll get 3 files in the temp directory, the rendering from the test as reftest.out.png , the rendering from the reference as reftest.ref.png and, if those two files differ, the differences between the two as reftest.diff.png . In this case they look like this:



As you can see, the difference file is highlighting where the two renderings differ to make it easy for us to find what needs to be fixed.

Done

That’s a full testcase. Now it just needs to be moved to the test directories, be properly named, get added to Makefile.am and committed. Of course, a lot more complex test cases are possible, and it’s possible to use custom CSS to achieve certain visual effects.

Keep in mind that a test should not just look identical to the reference on your current setup and your computer, but also on computers with different fonts, different settings and a different theme. That makes the test independent of a lot of GTK internals and allows us to change around other parts of the code without breaking this test. And that is awesome.

Oh, and of course after all this work, one thing still remains: the bugfix. Verifying that the fix actually works is incredibly easy now compared to olden times. Where one had to restart the application reproducing the bug, possibly properly LD_PRELOAD’ing libgtk and navigate to where the bug was visible, now it’s just a simple tests/reftests/gtk-reftest reftest.ui . Couldn’t be easier. And after every run, one can check the images to see what is still wrong. Until it finally says “OK”.

And if you’re not a GTK developer

If you’re not a GTK developer and don’t have the testsuite handy, you can still help a lot in getting your bugs fixed. Because as you saw above, most bugs are visible in Glade. And Glade is available on your computer. So you can just use it to produce a file that reproduces the bug. And if you’re really nice to us, you’ll even create a reference file. And then we can just add it to the GTK testsuite and fix it. And then you will know that this bug will never come back, because the testsuite will make sure it doesn’t.

Of course, this test runner only works for rendering bugs, so there is still no easy way to check event emissions, API functionality, reactions to keyboard and mouse input and so forth. But that was not the goal of this test runner. And I’m sure that if this test runner is successful, we will write a lot more test runners for a lot more use cases.