
There will not be a continuous optimum, at least for $d = 1$. Assume that we have a solution $g$ --- we will see how to improve upon it. Write $C = C_1 + C_2 + C_3$ for the three terms of the functional.

Let $i$ be such that $\operatorname{sign}(g(x_i)) \neq \operatorname{sign}(g(x_{i+1}))$, and write $(a,b) = (x_i, x_{i+1})$. For concreteness, we assume that $g(a) > 0$ and that $\int_a^b g\,dx \geq 0$ (the other cases can be handled similarly).

What does $g$ look like on this interval? Typically, it will be decreasing from $g(a)$ to $g(b)$, in which case this interval contributes $g(a) - g(b)$ to $C_3(g)$. However, it could be that $\bar{g} := \max_{[a,b]} g(x) > g(a)$, in order to make $\int_a^b g\,dx$ sufficiently large (to help make $\int_\Omega g\,dx = 0$). In this case the contribution is $(\bar{g} - g(a)) + (\bar{g} - g(b))$. Note that we will never have $\min_{[a,b]} g(x) < g(b)$, otherwise $g$ could be improved.
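In both cases, then, the variation that $g$ contributes on $[a,b]$ (call it $C_3(g)\big|_{[a,b]}$) can be written uniformly as
$$
C_3(g)\big|_{[a,b]} = (\bar{g} - g(a)) + (\bar{g} - g(b)),
$$
which reduces to $g(a) - g(b)$ when $\bar{g} = g(a)$.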

We let $f = g$ everywhere except on $(a,b)$, so $C_1(f) = C_1(g)$. We let $k > 0$ be a large positive number and construct $f$ on $(a,b)$ as follows: i) grow linearly with rate $k$ until reaching $\bar{g}$, ii) stay constant at $\bar{g}$ for a length $h$, iii) decrease linearly with rate $k$ to $0$, iv) stay at $0$, v) decrease linearly with rate $k$ to $g(b)$ at the end of the interval. Given $k$, we can always choose $h$ so that $\int_a^b f\,dx = \int_a^b g\,dx$. (In the case where $\bar{g} \leq g(a)$ step i) becomes void, and in the case $\int_a^b g\,dx = 0$ step ii) becomes void.)

The construction gives $C_3(f) = C_3(g)$: the variation contributed by $(a,b)$ is again $(\bar{g} - g(a)) + (\bar{g} - g(b))$, and $f = g$ elsewhere. But as $k$ increases, the contribution of $(a,b)$ to $C_2(f)$, namely $\int_a^b |f|\,dx$, decreases towards $\int_a^b f\,dx = \int_a^b g\,dx$, which is strictly smaller than $\int_a^b |g|\,dx$ because $g$ changes sign on the interval. For $k$ large enough we therefore get $C(f) < C(g)$, contradicting the optimality of $g$.
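If it helps, here is a rough numerical sketch of the construction (not part of the argument itself; the piecewise-linear sample $g$, the endpoints, the grid, and helper names like `make_f` are all arbitrary illustrative choices, and $C_2$, $C_3$ are taken as the interval contributions $\int_a^b|\cdot|\,dx$ and the total variation, as above):

```python
import numpy as np

# Rough numerical check of the construction above.  Everything concrete here
# (the interval, the sample g, the grid size) is an arbitrary illustration.
a, b = 0.0, 1.0
g_a, g_b = 1.0, -0.5          # g(a) > 0 > g(b): a sign change on (a, b)
g_bar = 1.2                   # max of g on [a, b]; here g_bar > g(a)

x = np.linspace(a, b, 200_001)
dx = x[1] - x[0]

def integral(y):              # trapezoidal rule on the uniform grid
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))

def C2(y):                    # contribution of (a, b) to C_2
    return integral(np.abs(y))

def C3(y):                    # contribution of (a, b) to C_3 (total variation)
    return np.abs(np.diff(y)).sum()

# A sample g: rises linearly to g_bar, then falls linearly to g(b).
g = np.where(x < 0.2, g_a + (g_bar - g_a) * x / 0.2,
             g_bar + (g_b - g_bar) * (x - 0.2) / 0.8)
I = integral(g)               # f must reproduce this integral

def make_f(k):
    """Steps i)-v): slopes +-k, plateau length h chosen so that int f = int g."""
    t1 = (g_bar - g_a) / k                      # i)   rise to g_bar
    t3 = g_bar / k                              # iii) fall to 0
    t5 = -g_b / k                               # v)   fall to g(b)
    h = (I - (g_a + g_bar) * t1 / 2 - g_bar * t3 / 2 - g_b * t5 / 2) / g_bar
    t4 = (b - a) - (t1 + h + t3 + t5)           # iv)  time at 0 (needs k large)
    knots_t = np.cumsum([a, t1, h, t3, t4, t5])
    knots_y = [g_a, g_bar, g_bar, 0.0, 0.0, g_b]
    return np.interp(x, knots_t, knots_y)

print(f"g:             C2 = {C2(g):.4f}   C3 = {C3(g):.4f}   int = {I:.4f}")
for k in (10, 100, 1000, 10_000):
    f = make_f(k)
    print(f"f (k = {k:>6}): C2 = {C2(f):.4f}   C3 = {C3(f):.4f}   "
          f"int = {integral(f):.4f}")
```

The $C_3$ column should stay fixed while the $C_2$ column decreases towards $\int_a^b g\,dx$ as $k$ grows, matching the argument above.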

The "problem" is that $C(f)$ does not penalise higher derivatives, which the construction above exploits. From a statistical learning perspective, this is usually solved by restricting the minimization of $C$-like functionals to some class of functions (the simplest examples would be ridge or lasso regressions).