
There are two ways to approach this question: a historical one, which pertains to how the concepts were discovered, and a technical one, which explains why certain concepts were adopted and others abandoned or even forgotten.

Historically, the Turing machine is perhaps the most intuitive of the several models developed in trying to answer the Entscheidungsproblem. This effort is intimately related to the push in the first decades of the 20th century to completely axiomatize mathematics. The hope was that once a small, trusted set of axioms had been established (which would itself require substantial effort), you could then use a systematic method to derive a proof of any logical statement you were interested in. Even if someone had considered finite automata in this context, they would have been dismissed quickly, since they fail to compute even simple functions: for example, no finite automaton can recognize the language $\{0^n 1^n \mid n \ge 0\}$, let alone carry out arbitrary-precision arithmetic.

Technically, the statement that all computers are finite automata is false. A finite automaton has a fixed, constant amount of memory that cannot grow with the size of the input. There is no limitation, either in mathematics or in practice, that prevents us from attaching additional tape, hard disks, RAM, or other forms of memory once the machine's existing memory is exhausted. I believe this was common in the early days of computing, when even simple calculations could fill the available memory; today, with far larger memories and much more efficient memory management, it is rarely an issue for most problems.
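To make the contrast concrete, here is a minimal sketch (in Python, with a made-up `run_turing_machine` helper and hypothetical state names, not anyone's canonical implementation) of a Turing machine whose tape is a dictionary that allocates cells only when the head first visits them. The finite control is fixed, just as in a finite automaton, but the memory grows on demand, which is the formal counterpart of plugging in more tape, disk, or RAM. The example machine accepts $\{0^n 1^n\}$, a language no finite automaton can recognize.

```python
def run_turing_machine(transitions, tape_input, start_state, accept_states, max_steps=10_000):
    """transitions: dict mapping (state, symbol) -> (new_state, write_symbol, move),
    where move is -1 (left) or +1 (right). The tape is a dict, so cells are
    allocated only when first visited -- the analogue of adding more tape or disk."""
    tape = {i: s for i, s in enumerate(tape_input)}
    state, head = start_state, 0
    for _ in range(max_steps):
        if state in accept_states:
            return True
        symbol = tape.get(head, '_')          # '_' is the blank symbol
        if (state, symbol) not in transitions:
            return False                      # no applicable rule: halt and reject
        state, write, move = transitions[(state, symbol)]
        tape[head] = write                    # the tape grows as the head wanders
        head += move
    raise RuntimeError("step limit reached")

# Example machine for 0^n 1^n: repeatedly mark the leftmost 0 as X, the first
# unmarked 1 as Y, then return; accept when every symbol has been matched.
transitions = {
    ('q0', '0'): ('q1', 'X', +1),   # mark leftmost 0
    ('q0', 'Y'): ('q4', 'Y', +1),   # no 0s left: verify only Ys remain
    ('q0', '_'): ('acc', '_', +1),  # empty input is accepted
    ('q1', '0'): ('q1', '0', +1),   # scan right over remaining 0s
    ('q1', 'Y'): ('q1', 'Y', +1),   # ... and over already-marked 1s
    ('q1', '1'): ('q2', 'Y', -1),   # mark the matching 1
    ('q2', '0'): ('q2', '0', -1),   # scan back left
    ('q2', 'Y'): ('q2', 'Y', -1),
    ('q2', 'X'): ('q0', 'X', +1),   # return to the first unmarked 0
    ('q4', 'Y'): ('q4', 'Y', +1),
    ('q4', '_'): ('acc', '_', +1),  # everything matched: accept
}

print(run_turing_machine(transitions, "000111", 'q0', {'acc'}))  # True
print(run_turing_machine(transitions, "00111",  'q0', {'acc'}))  # False (unmatched 1)
```

Strip the tape away and what remains is exactly a finite automaton, which is why the distinction matters: the control is finite in both models, but only one of them can keep acquiring storage as the input demands.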

EDIT: I considered both points raised in the comments but elected not to include them, both for brevity and because of the time I had available to write the answer. This is my reasoning as to why these points do not diminish the effectiveness of Turing machines in simulating modern computers, especially when compared to finite automata: