The situation at this point is that we're looking at a common issue with acting on static analysis reports, which is making a change to fix a theoretical bug at the risk of introducing a real bug in the process. Any changes to a code base have risk, as the poor guy who added a comment with a backslash at the end knows. As it turns out, this code usually only executes on the image border, so any failures in the field would have been harder to detect, and I couldn't really justify fixing this on the stable branch. I will admit that I have less of an excuse for not fixing it on the dev branch, but honestly that's the least of the problems with that code.

That leaves the question of why I over-indexed the array. It's been so long that I don't remember why I did this. It was likely either a result of rewriting the asm routine back into C/C++ -- back from when I used to prototype directly in asm -- or from refactoring the structure to replace a 10-long array with a 9-long coefficient array and a named bias field. Indexing the tenth element is likely a violation of the C/C++ standard and there's no reason the code couldn't reference the bias field, which is the correct fix. Problem is, the code works as written: the field is guaranteed to be at the correct address and the most likely source of breakage would be the compiler doing aggressive load/store optimizations on individual structure fields. As it happens, the store and load are very far apart -- the struct is initialized in the filter start phase and read much later in the per-frame filter loop -- and the Visual C++ compiler that I use does not do anything of the sort here, so the generated code works.

This code is from the general convolution filter, which is one of the oldest filters in VirtualDub. It computes a new image based on the application of a 3x3 grid of coefficients and a bias value. What this code is doing is initializing the color accumulators for the windowing operation with the bias value. The structure in question here is special in that it has a fixed layout that is referenced by many pieces of code, some written in assembly language and some dynamically generated (JITted) code, and so it is known -- and required -- that the element after the coefficient array (m) is the bias value. As such, this code works as intended, and if someone were to correct the array index to 8 thinking it was an off-by-one error, it would break the code.

Okay, Mr. Karpov decided to use VirtualDub again as an example of a detected code defect in his article , and while I respect him and his software, I resent the implication that I don't understand how C/C++ arrays work and that he included this example again without noting that the code actually works. I'd like to clarify this here.

Comments

Comments posted:

Hello,Thanks for interest to our article. You must understand that in article we provide only some examples of code that seems like contains an errors. For detailed code analysis developer must know own code, data structure, etc. We can give a free license key for you if you want to check your code with our static analyzer.

Evgeniy Ryzhkov - 02 11 11 - 17:30

You could have done the same with example code fragments. Since you have referred to the projects directly it would have been prudent to be more respectful in your discussion, particularly since you are using their reputation in order to boost your article.

Phaeron - 02 11 11 - 19:09

When I saw that article yesterday and read that part, I immediately (and as it turns out: correctly) thought you were intentionally accessing the "bias" member. That said, if it had been the code of a lesser programmer, I would have assumed that it was an error. The code is still unclear and should probably be fixed. Even though it's not an error in this case, I do think it's a good thing that the analyser points it out so it can be cleared up (as a matter of style), even though the interpretation in the article was wrong. Static analysis is a great tool and a useful guide as an addition, but not a replacement for a capable programmer's eyes.

Steven (link) - 02 11 11 - 21:13

You're right, any code is permitted to have his legacy, but in the eyes of people who do not know the history of the code, things as such looks like an error in coding style. So, they are right too.

PastorGL - 02 11 11 - 22:57

Looking at the definition of the struct, it is fairly obvious that m[9] actually points to bias. However, without documentation you can't easily determine whether this is what the developer intended. If it were my code, I'd link to this article right there. ;)

Haploid - 03 11 11 - 00:11

I don't think the properties you rely on are guaranteed. I think the C99 standard says that undefined behavior results if you use an out-of-range array index, even if it "works". You may be lucky with your current compiler and optimization settings, but it's not guaranteed.In the end, I can understand your slight frustration that someone says the code has a bug while you know it has a long history of working, but please consider adjusting the code so it doesn't rely on undefined behavior. At the very least, consider adding a static assertion to make sure the "bias" member has the offset you expect.One more link about undefined behavior (highly recommended, no matter what you decide to do): http://blog.llvm.org/2011/05/what-every-.. :-)

Paweł - 03 11 11 - 01:14

Note that I didn't say that the existing code was good style: it certainly isn't. The issue I have is with people who look at this code and say, "oh, that's obviously on off-by-one, it should have been [8]." That's a really bad way to approach static analysis warnings. There is a *vast* amount of source code that intentionally or unintentionally relies on implementation defined or undefined behavior, and the danger is that the fixing the warning can introduce an error. Indexing past the end of an array at the end of a structure is actually a common idiom, often used to define extensible structures.The issue of structure layout is an interesting one. I believe the compiler is not allowed to reorder entries in a POD struct, but I don't know if C allows padding to be inserted between an array of elements and a single element after it. This would break a lot of code that is mapping fields to a fixed layout, especially ones that use a union to mirror a struct to named fields. Also, I'm pretty sure this falls firmly in implementation defined, which is much safer to depend on than undefined behavior if you know your target platforms. I don't know many programmers who avoid doing right shifts on signed numbers.

Phaeron - 03 11 11 - 03:38

A very good discussion about this has popped up on Reddit: http://www.reddit.com/r/programming/comm.. Many people there point out that compilers are free to add padding between 'm' and 'bias'. So while the code does work on all major compilers, it is not standard-compliant. Whether you believe that is a bug is up to you.Other people believe that this is unnecessarily obfuscated and that is a bug -- not a bug in functionality per se, but still a bug.

Kevin - 03 11 11 - 04:56

That's a wierd mixture of types in that struct, I mean long and uint32, one of them has a fuzzy size (4 or 8 bytes, depending on the compiler), the other is clearly defined. Generally, I never ever use "long" as a type.

Gabest - 03 11 11 - 08:05

@Kevin:Many people pointing out the same thing does not always mean they are right -- the compiler cannot add padding if you specify structure packing using either /Zp[n] or #pragma pack(n) or __attribute__((__packed__)), not to mention that there is no reason to add padding between elements of the same size.Besides, most of you are missing the point -- static analysis without full code understanding can only introduce more bugs.

Igor Levicki (link) - 03 11 11 - 13:38

I understand your wider point about not blindly following static analysers, but I think, on balance, they make a code base more robust. I think removing potential bugs is a good idea because they may become actual bugs if you port to a different compiler, platform or CPU.More obviously though, they can become actual bugs if you change code that interfaces with them. How far away are those two statements in the code? If someone DID change that structure around, it would break the later statement. If it referenced bias instead of m[9], it would work fine.Personally I think this whole thing is ridiculous. Doing something like changing a field name in the structure would be a refactoring operation affecting lots of other code. But this change wouldn't even affect any other code...

NeXus - 03 11 11 - 19:49

Isn't this the case that the C++ standard allows accessing a one-past-the-size index of an array just to make arrays work with algorithms from (so that pointers inside the array can act as the iterators obtained from usual begin() and end())?Even so, I say that accessing the bias element not by name but by abusing the array indexing is bad style. It works OK, but the code does not show the intent.Regarding static analysis tools, as every tool you need to know how to use it. You say that you cannot rely on static analysis if you do not know the source code and its intent. That's why static analysis works best if you use it on your own code. This should help you get rid of potential errors, reliance on undefined behaviour and so on.With all the above, I still have great respect for Phaeron, his work on VirtualDub, and the most interesting articles he posts in this blog.

Krzysiek - 03 11 11 - 21:34

It seems angle brackets were removed from my previous post. In the first sentence I meant "(...) with algorithms from standard "algorithm" header (so (...)".

Krzysiek - 03 11 11 - 21:37

After some reading, I stand corrected: one is able to produce a one-past-array-size pointer, but is not allowed to dereference it.

Krzysiek - 03 11 11 - 22:10

I think that the offer of a free license is very appropriate, and would surely appreciate it myself. But additionaly a correction of the original article is in order (just a link to this article in the description of Example 5 would suffice).

Jorge - 04 11 11 - 04:01

Origin article was fixed, URL for this discussion is provided in article.

Evgeniy Ryzhkov - 06 11 11 - 17:13

No Evgeniy, the article has not been fixed. You need to correct your error explanation.

Igor Levicki (link) - 09 11 11 - 22:36

Hi,And about the lexer.cpp "case" ?

heep - 11 11 11 - 00:55

That one appears to be a valid bug, but I haven't analyzed the impact of it yet. It doesn't affect the main program as it's in the build tool that's used to compile part of the UI. From what I can tell, it could cause WEOF to be pushed into the backtrack stack if a backtrack occurred at the end of a file, which could then cause a second false WEOF. For this to have an effect it would have to occur at the end of an include file that was included from a second include file. I don't believe I have this case and so I would have to create a test case for it.

Phaeron - 12 11 11 - 08:22

Ah O.K. thank you for making this so clear.And big thank you for your hard work.

heep - 12 11 11 - 10:07

Interesting discussion. I can say that I have seen structure padding, but this was on a non-intel embedded platform. In this case the compiler needed to ensure that word objects were aligned correctly as the cpu could not access a word at an odd address. This would occur where a struct used a combination of byte and word types, but would only really be a problem if you over-indexed an array, did raw pointer-indexing into the base address of a struct, or refused to SIZEOF the struct when allocating memory.I would not be too surprised if I heard compilers were padding structs in order to optimise copy/move operations or pipelines. I remember hearing that keeping everything 32 or 64 bits wide can be faster at or despite the cost of greater size. Phaeron isn't that likely to suffer from padding causing trouble here since the array is an even size and both struct elements are the same type (long) which itself should be the default width most of the time.

John Mills - 17 11 11 - 03:03

Could you have uses a "union" to make the code correct?So that you could address something likelong mu[10];which would be union-ified withlong m[9];long bias;...I'm not even sure if "union" is in C++, I read about them in an ANSI C book but don't recall ever using them... but they would be seem to be made for this sort of thing, if I remember their intent correctly.

Bajan13K (link) - 22 12 11 - 02:49

Unions are primarily meant to allow storage sharing, not aliasing. The C++ standard says that you can only cross-access parts of union members that are common initial sequences of layout compatible members. Problem is, layout compatible requires compatible types, and long[10] is not compatible with long[9]. Interestingly the C++ standard doesn't say what happens if you violate this, but the C89 standard says it is implementation defined and so I assume that is true for C++ as well. The failure mode would be similar: the compiler would decouple accesses to the different members and the reader of the union would get the wrong value.That having been said, I believe this is a fairly common pattern when writing hardware drivers, and so the compiler writers would need to make this work at least for volatile accesses to avoid pissing off a lot of driver writers. Those who write the compiler have a lot of power, but even they must obey those who work on the kernel....