
Let me focus on the question of your title, and mention that there is another quite robust way to understand what it means to say that a random Turing machine has such-and-such property.

Specifically, we use the concept of asymptotic density as the size of the program increases. For any natural number $n$, there are only finitely many Turing machine programs using $n$ states. The asymptotic density or asymptotic probability of a set $A$ of Turing machine programs is the limit (if it exists)

$$\lim_{n\to\infty} \frac{|A\cap P_n|}{|P_n|},$$

where $P_n$ is the finite set of Turing machine programs with exactly $n$ states. Thus, the asymptotic probability of a set $A$ of Turing machine programs is simply the limit of the proportion of $n$-state programs in $A$. In particular, if a set $A$ has asymptotic density $1$, then more than $99\%$ of Turing machine programs are in $A$, more than $99.9\%$, and so on, with the proportion as close to $1$ as desired as the number of states increases. In this case, we would seem to be justified in saying that almost every Turing machine program is in $A$.

One can interpret your title question this way as inquiring: what is the asymptotic density of the set of Turing machine programs that decide languages that are also decided by some DFA?

To give an elementary sample calculation, a Turing machine program $p$ over a finite alphabet $\Sigma$ with state set $S$ (not counting the halt state) is a function $\Sigma\times S\to \Sigma\times (S\cup\{\text{halt}\})\times\{L,R\}$. For example, if the alphabet has $2$ symbols and there are $n$ states, then there are $(4(n+1))^{2n}$ programs. The number of programs that never transition to the halt state, however, is $(4n)^{2n}$, a proportion of $(\frac{n}{n+1})^{2n}$, which tends to $\frac{1}{e^2}$ as $n\to\infty$. Thus, the density of programs that never halt at all, because they can never transition to the halt state, is $\frac{1}{e^2}$, or about $13.5\%$. Of course, all such programs decide the empty language, which is also decided by a DFA, so this gives a lower bound on the title question.
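The counting in this paragraph is easy to check directly. Here is a small sketch (the function names are my own, chosen for illustration) that tabulates the exact proportion of never-halting programs with rational arithmetic and watches it approach $1/e^2$:

```python
from fractions import Fraction
import math

def total_programs(n):
    # Programs on a 2-symbol alphabet with n states are functions from
    # Sigma x S (2n inputs) to Sigma x (S union {halt}) x {L, R},
    # a codomain of size 2 * (n + 1) * 2 = 4(n + 1).
    return (4 * (n + 1)) ** (2 * n)

def never_halting_programs(n):
    # Same count, but the halt state is never the target of a transition,
    # shrinking the codomain to 4n.
    return (4 * n) ** (2 * n)

def proportion(n):
    # Exact proportion (n / (n+1))^(2n) of programs that can never halt.
    return Fraction(never_halting_programs(n), total_programs(n))

# The proportion tends to 1/e^2 ~ 0.1353... as n grows.
for n in (1, 10, 100, 1000):
    print(n, float(proportion(n)))
```

For instance, at $n=1$ the proportion is exactly $16/64 = 1/4$, and by $n=1000$ it already agrees with $1/e^2 \approx 0.1353$ to about three decimal places.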

This way of thinking is the foundation of the topic of generic case complexity. A central concern of this topic is the fact that many undecidable or infeasible decision problems admit a black hole, a very small region where the problem is difficult, outside of which it is easy. For example, it is not good to base a financial encryption scheme on a problem whose difficulty is confined to a black hole: a robber is, after all, satisfied to rob the bank even only $90\%$ of the time, or even only $1\%$ of the time. Alexei Miasnikov inquired whether the halting problem itself admits a black hole, and it turned out that for one of the standard models of computability, the answer is yes:

Theorem. (Hamkins, Miasnikov) For the Turing machine model with one-way infinite tapes, there is a set $A$ of Turing machine programs such that:

- $A$ has asymptotic density $1$, so almost every program is in $A$;
- $A$ is polynomial-time decidable; and
- the halting problem is polynomial-time decidable for programs in $A$.

Thus, for this model of computation, the halting problem is decidable with probability $1$. The reason is that for the one-way infinite tape Turing machine model, it turns out that almost every Turing machine program, like Pólya's drunken man, falls off the tape before repeating a state, and this is something that can be detected in linear time. It follows that with asymptotic probability one, a Turing machine program computes a finite set. Since all finite sets are DFA decidable, we conclude:
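One can observe the drunken-man phenomenon empirically. The following Monte Carlo sketch is an illustration only, not the argument from the paper, and the program encoding and helper names are my own conventions: it draws random $n$-state programs over the alphabet $\{0,1\}$, runs each on a one-way infinite tape of zeros, and checks whether the head steps off the left edge before any control state repeats.

```python
import random

def random_program(n):
    # Encode a program as a dict (state, symbol) -> (write, next_state, move),
    # where next_state == n denotes the halt state and move is -1 (L) or +1 (R).
    return {(s, sym): (random.randint(0, 1),
                       random.randint(0, n),
                       random.choice((-1, 1)))
            for s in range(n) for sym in (0, 1)}

def falls_off_before_repeat(prog, n):
    # Run from state 0 on an all-zero one-way tape; return True if the
    # head moves left off the tape before any control state is visited twice.
    tape, pos, state, seen = {}, 0, 0, set()
    while state not in seen and state != n:   # stop on state repeat or halt
        seen.add(state)
        write, nxt, move = prog[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        if pos < 0:
            return True                        # fell off the left edge
        state = nxt
    return False                               # repeated a state or halted

def estimate(n, trials=4000):
    # Fraction of sampled n-state programs that fall off before repeating.
    random.seed(0)  # reproducible sampling
    return sum(falls_off_before_repeat(random_program(n), n)
               for _ in range(trials)) / trials
```

Since the very first instruction moves left half the time, the estimate starts above one half and creeps upward with $n$; the convergence to $1$ is real but slow, which is consistent with the head performing a short random walk near the tape's edge.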

Corollary. For the standard one-way infinite tape model of Turing machines, with asymptotic probability one, a random Turing machine computes a set that is DFA decidable.

See J. D. Hamkins and A. Miasnikov, "The halting problem is decidable on a set of asymptotic probability one," Notre Dame Journal of Formal Logic 47 (2006), 515–524.

See also this MO answer, which mentions similar ideas.