So for Christmas a few days ago I got a 1080p, 144Hz monitor, partly because I needed a new monitor in general, but also because I wanted to mirror my DK2 to it, since I planned on using it as part of the senior project I’m working on for school. However, I was extremely disappointed to find that ovrd (the background daemon which manages the Rift and provides a lot of functionality to the SDK itself) segfaulted shortly after starting. After looking up the stack trace I got from gdb for the segfault, I found that only a handful of people had hit this exact issue, and there was no fix, because Linux support was dropped before the issue could be addressed.

For a while, I was actually quite irritated at Oculus because, in addition to having dropped support for Linux fairly quickly after adding it, the last of their software that I could even develop for was entirely broken with my new hardware. I did what I could at the time and just threw up a “Hey, I just recently had this same issue too” on the Oculus forums, but my issue report went unanswered and got buried.

However, I decided today that I wanted to get to the bottom of the actual issue and see why on earth upgrading my monitor had broken ovrd. In short, I’m still really not sure, but I fixed it regardless, so whatever. I’m sure these details will probably be enough to put a fix into ovrd when it finally returns to Linux.

Naturally, the first thing I decided to try was some basic tests: Was it actually my monitor that broke it? What else could have possibly caused the issue? To do this, I pulled my previous 900p monitor out of my parts bin and put it back temporarily as my main monitor, removing my 144Hz monitor entirely from my PC. Upon starting ovrd, it worked fine, so it wasn’t some weird software upgrade that had broken it. After that test, I unplugged the 900p monitor and plugged in the 1080p one without rebooting. I had to adjust the resolution back, and Xorg ended up with an absurdly low DPI, but for some reason ovrd continued to work with my 144Hz monitor, even after killing it and starting it again. Wat?

Again, I still wasn’t sure exactly what broke it, but it must have been some combination of what Xorg was telling ovrd, refresh rates, and DPI. That still didn’t help in actually figuring out where it dies, so it was time to go into IDA. I hooked ovrd up to gdbserver, connected IDA to debug it, let it run until it died, and it died right here:

From an initial look, it seems there’s a read off a null pointer, which is definitely a problem. To get a better idea about what was actually going on (and what was even null), I threw it into Hex Rays:

If you take a look, you can see where the problem might be. s1 (a char*) is being read from, but its initial value is 0, so if it’s never assigned, the read at v26 += s1[v25++] will fault, since nothing checks that s1 isn’t NULL. Hex-Rays didn’t show where s1 actually gets set, but as far as I can tell it’s filled in by XRRGetOutputProperty. If s1 never gets set, though, the entire function is basically a segfault death trap: XFree(s1) if XRRGetOutputProperty doesn’t return 0, v26 += s1[v25++], strncmp on s1; so many things go wrong in here because they didn’t think s1 could possibly be NULL. So how can I fix it?
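To make the failure mode a bit more concrete, here is a rough Python sketch of the logic as described above. It is not the decompiled code: None stands in for the NULL s1, the function names are hypothetical, and the 128-byte length and 8-bit accumulator are assumptions read off the loop bound (0x80) and the byte-sized add in the bytes the patch later replaces. A 128-byte sum compared against zero looks a lot like an EDID-style checksum, though that is only a guess.

```python
# Hypothetical stand-ins for the loop in ovrd; not the real code.
BLOCK_LEN = 128  # assumption: loop bound 0x80 seen in the patched bytes


def checksum_unguarded(s1):
    """Mirrors `v26 += s1[v25++]` with no NULL check."""
    total = 0
    for i in range(BLOCK_LEN):
        # With s1 = None this raises TypeError, the Python analogue
        # of dereferencing a NULL pointer in the C original.
        total = (total + s1[i]) & 0xFF
    return total


def checksum_guarded(s1):
    """Same loop, but bail out early when the property was never filled in."""
    if s1 is None:
        return None  # caller should skip both the loop and the free
    return checksum_unguarded(s1)
```

The guarded variant is the shape of fix the rest of this post approximates at the machine-code level: test the pointer first, and take the early exit if it was never set.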

Since there was the weird while loop to make double sure s1 was entirely 00, even though they compare it to '\0' in the strncmp, I figured that would probably be my best bet for inserting some fix-up code. Specifically, I decided to target this portion of the while loop as a possible exit vector in the event of failure:

loc_40750D is referenced several times throughout the display info fetching and seemed to be a sort of “if stuff goes bad, abandon ship and go here” location. So let’s see what goes on there, I guess?

…gosh darn it. The first thing that happens there is that our NULL s1 string gets freed, so I can’t just jump there. However, since I’m not the best at x86, I took a guess and assumed I could do something a bit different: check whether s1 is 0 and, if it is, skip past the XFree and continue the exit as normal. So I drafted up a quick patch by overwriting the bytes at 0x73A0 in the ovrd binary with 48 85 C9 0F 84 6C 01 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90, and suddenly:

So, I guess that solves that problem? While I’m still not 100% sure what exactly went wrong in ovrd, or what was actually the problem with my monitor, I still fixed it I guess. Time to start developing again.
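For anyone who wants to sanity-check the patch bytes, the jump arithmetic can be sketched roughly as follows. Several things here are assumptions rather than facts from the binary: the 0x400000 load base (chosen because it makes file offset 0x73A0 line up with the loc_40750D label), the reading of 0F 85 / 0F 84 as near jnz / jz with a 32-bit relative offset, and the helper name near_jcc_target, which is made up for this example. The original byte sequence is the one the sed command in the edit below searches for.

```python
import struct

PATCH_FILE_OFFSET = 0x73A0    # from the post
ASSUMED_LOAD_BASE = 0x400000  # assumption: typical non-PIE x86-64 base

# Bytes being replaced (taken from the sed pattern) and the patch itself.
ORIGINAL = bytes.fromhex("021401 4883C001 483D80000000 75F1 84D2 0F8556010000")
PATCH = bytes.fromhex("4885C9 0F846C010000") + b"\x90" * 14


def near_jcc_target(code, insn_offset, start_va):
    """Target of a 6-byte `0F 8x rel32` conditional jump inside `code`."""
    assert code[insn_offset] == 0x0F and 0x80 <= code[insn_offset + 1] <= 0x8F
    (rel32,) = struct.unpack_from("<i", code, insn_offset + 2)
    # rel32 is relative to the address of the *next* instruction.
    return start_va + insn_offset + 6 + rel32


start_va = ASSUMED_LOAD_BASE + PATCH_FILE_OFFSET
old_target = near_jcc_target(ORIGINAL, 17, start_va)  # jnz at end of loop
new_target = near_jcc_target(PATCH, 3, start_va)      # jz after test rcx, rcx

print(hex(old_target), hex(new_target), new_target - old_target)
# prints: 0x40750d 0x407515 8
```

Under those assumptions the original jump lands exactly on loc_40750D, and the patched one lands 8 bytes past it, which fits the idea of skipping the XFree call at the top of that block. Both sequences are 23 bytes long, so nothing after the patch shifts.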

EDIT: For those wanting to patch their ovrd binary, you can also use sed to replace the occurrence (thanks to /u/haagch for pointing this out to me):

sed -i 's/\x02\x14\x01\x48\x83\xC0\x01\x48\x3D\x80\x00\x00\x00\x75\xF1\x84\xD2\x0F\x85\x56\x01\x00\x00/\x48\x85\xC9\x0F\x84\x6C\x01\x00\x00\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90/g' ovrd

I went ahead and tested it and it works just as well, so this is a good option if you don’t have a hex editor ready (or are lazy or don’t want to potentially mess anything up).
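If you would rather have a safety net than a one-liner, the same replacement can be sketched in Python. This is only an illustration, not something that was tested against ovrd: the function names and the .orig backup suffix are made up for this example, and unlike the sed command it refuses to touch the file unless the byte pattern appears exactly once.

```python
import shutil
from pathlib import Path

# Same byte sequences as in the sed command above.
OLD = bytes.fromhex("021401 4883C001 483D80000000 75F1 84D2 0F8556010000")
NEW = bytes.fromhex("4885C9 0F846C010000") + b"\x90" * 14


def patch_bytes(data: bytes) -> bytes:
    """Return `data` with OLD swapped for NEW, insisting on a single match."""
    count = data.count(OLD)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return data.replace(OLD, NEW)


def patch_file(path) -> None:
    """Patch the binary in place, keeping a copy next to it as <name>.orig."""
    path = Path(path)
    patched = patch_bytes(path.read_bytes())  # raises before anything is written
    shutil.copy2(path, path.with_name(path.name + ".orig"))
    path.write_bytes(patched)
```

Calling patch_file("ovrd") from the directory containing the binary would apply it. Because the two sequences are the same length, the file size stays unchanged, and running it a second time fails cleanly since the old pattern is no longer there.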