First, I'll note that although I only mention "C" here, the same applies about equally to C++.

The comment mentioning Gödel was partly (but only partly) on point.

When you get down to it, undefined behavior in the C standards is largely just pointing out the boundary between what the standard attempts to define, and what it doesn't.

Gödel's theorems (there are two) basically say that it's impossible to define a mathematical system rich enough to express ordinary arithmetic (the case he dealt with was the "normal" rules for the natural numbers) that can be proven, by its own rules, to be both complete and consistent. You can give up completeness, or you can give up a proof of consistency from within the system, but you can't have both.

In the case of something like C, that doesn't apply directly -- for the most part, "provability" of the completeness or consistency of the system isn't a high priority for most language designers. At the same time, yes, they probably were influenced (to at least some degree) by knowing that it's provably impossible to define a "perfect" system -- one that's provably complete and consistent. Knowing that such a thing is impossible may have made it a bit easier to step back, breathe a little, and decide on the bounds of what they would try to define.

At the risk of (yet again) being accused of arrogance, I'd characterize the C standard as being governed (in part) by two basic ideas:

1. The language should support as wide a variety of hardware as possible (ideally, all "sane" hardware down to some reasonable lower limit).
2. The language should support writing as wide a variety of software as possible for the given environment.

The first means that if somebody defines a new CPU, it should be possible to provide a good, solid, usable implementation of C for it, as long as the design falls at least reasonably close to a few simple guidelines -- basically, if it follows something on the general order of the von Neumann model and provides at least some reasonable minimum amount of memory, that should be enough to allow a C implementation. For a "hosted" implementation (one that runs on an OS), you need to support some notion that corresponds reasonably closely to files, and have a character set with a certain minimum set of characters (91 are required).

The second means it should be possible to write code that manipulates the hardware directly, so you can write things like boot loaders, operating systems, embedded software that runs without any OS, etc. There are ultimately some limits in this respect, so nearly any practical operating system, boot loader, etc., is likely to contain at least a little bit of code written in assembly language. Likewise, even a small embedded system is likely to include at least some sort of pre-written library routines to give access to devices on the host system. Although a precise boundary is difficult to define, the intent is that the dependency on such code should be kept to a minimum.

The undefined behavior in the language is largely driven by the intent for the language to support these capabilities. For example, the language allows you to convert an arbitrary integer to a pointer, and access whatever happens to be at that address. The standard makes no attempt at saying what will happen when you do (e.g., even reading from some addresses can have externally visible effects). At the same time, it makes no attempt at preventing you from doing such things, because you need to do them for some kinds of software you're supposed to be able to write in C.

There is some undefined behavior driven by other design elements as well. For example, one other intent of C is to support separate compilation. This means (for example) that it's intended that you can "link" pieces together using a linker that follows roughly what most of us see as the usual model of a linker. In particular, it should be possible to combine separately compiled modules into a complete program without the linker having any knowledge of the semantics of the language.
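A two-file sketch (the file names are invented for illustration) shows why this forces some behavior to be left undefined: a traditional linker matches symbols by name alone, so it will happily join these two translation units even though declaring `x` with mismatched types is undefined behavior that no single compilation can see:

```c
/* file1.c */
extern int x;                  /* declared here as int */
int read_x(void) { return x; }

/* file2.c -- compiled separately */
double x = 3.14;               /* defined as double: undefined behavior,
                                  but the linker only sees the name "x" */
```

Each file compiles cleanly on its own; diagnosing the mismatch would require the toolchain to carry C type information through the link step, which the standard deliberately doesn't demand.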

There is another type of undefined behavior (much more common in C++ than in C) that is present simply because of the limits of compiler technology -- things we basically know are errors, and would probably like the compiler to diagnose as errors, but that can't realistically be diagnosed under all circumstances. Many of these are driven by the other requirements, such as separate compilation, so it's largely a matter of balancing conflicting requirements. In those cases the committee has generally opted to support greater capabilities, even if that means some possible problems go undiagnosed, rather than limiting the capabilities to ensure that all possible problems are diagnosed.

These differences in intent drive most of the differences between C and something like Java or Microsoft's CLI-based systems. The latter are fairly explicitly limited to working with a much more limited set of hardware, or else require software to emulate the more specific hardware they target. They also specifically intend to prevent any direct manipulation of hardware, instead requiring that you use something like JNI or P/Invoke (and code written in something like C) to even make such an attempt.

Going back to Gödel's theorems for a moment, we can draw something of a parallel: Java and CLI have opted for the "internally consistent" alternative, while C has opted for the "complete" alternative. Of course, this is a very rough analogy -- I doubt anybody's attempting a formal proof of either internal consistency or completeness in either case. Nonetheless, the general notion does fit fairly closely with the choices they've made.