First published Tue Aug 20, 2013; substantive revision Thu Jan 19, 2017

The philosophy of computer science is concerned with those ontological, methodological, and ethical issues that arise from within the academic discipline of computer science as well as from the practice of software development. Thus, the philosophy of computer science shares the same philosophical goals as the philosophy of mathematics and the many subfields of the philosophy of science, such as the philosophy of biology or the philosophy of the social sciences. The philosophy of computer science also considers the analysis of computational artifacts, that is, human-made computing systems, and it focuses on methods involved in the design, specification, programming, verification, implementation, and testing of those systems. The abstract nature of computer programs and the resulting complexity of implemented artifacts, coupled with the technological ambitions of computer science, ensure that many of the conceptual questions of the philosophy of computer science have analogues in the philosophy of mathematics, the philosophy of empirical sciences, and the philosophy of technology. Other issues characterize the philosophy of computer science only. We shall concentrate on three tightly related groups of topics that form the spine of the subject. First, we discuss topics related to the ontological analysis of computational artifacts, in Sections 1–5 below. Second, we discuss topics involved in the methodology and epistemology of software development, in Sections 6–9 below. Third, we discuss ethical issues arising from computer science practice, in Section 10 below. Applications of computer science are briefly considered in Section 11.

1. Computational Artifacts

Computational artifacts underpin our Facebook pages, control air traffic around the world, and ensure that we will not be too surprised when it snows. They have been applied in algebra, car manufacturing, laser surgery, banking, gastronomy, astronomy, and astrology. Indeed, it is hard to find an area of life that has not been fundamentally changed and enhanced by their application. But what is it that is applied? What are the things that give substance to such applications? The trite answer is the entities that computer scientists construct, the artifacts of computer science, computational artifacts, if you will. Much of the philosophy of computer science is concerned with their nature, specification, design, and construction.

1.1 Duality

Folklore has it that computational artifacts fall into two camps: hardware and software. Presumably, software includes compilers and natural language understanding systems, whereas laptops and tablets are hardware. But how is this distinction drawn: How do we delineate what we take to be software and what we take to be hardware?

A standard way identifies the distinction with the abstract-physical one (see the entry on abstract objects), where hardware is taken to be physical and software to be abstract. Unfortunately, this does not seem quite right. As Moor (1978) points out, programs, which are normally seen as software, and therefore under this characterization abstract, may also be physical devices. In particular, programs were once identified with sequences of physical lever pulls and pushes. There are different reactions to this observation. Some have suggested there is no distinction. In particular, Suber (1988) argues that hardware is a special case of software, and Moor (1978) argues that the distinction is ontologically insignificant. On the other hand, Duncan (2011) insists that there is an important difference but it is one that can only be made within an ontological framework that supports finer distinctions than the simple abstract-physical one (e.g., B. Smith 2012). Irmak (2012) also thinks that software and hardware are different: software is an abstract artifact, but apparently not a standard one, because it has temporal properties.

Whether or not the software-hardware distinction can be made substantial, most writers agree that, although a program can be taken as an abstract thing, it may also be cashed out as a sequence of physical operations. Consequently, they (e.g., Colburn 2000; Moor 1978) insist that programs have a dual nature: they have both an abstract guise and a physical one. Indeed, once this is conceded, it would seem to apply to the majority of computational artifacts. On the one hand, they seem to have an abstract guise that enables us to reflect and reason about them independently of any physical manifestation. This certainly applies to abstract data types (Cardelli & Wegner 1985). For example, the list abstract data type consists of the carrier type together with operations that support the formation and manipulation of lists. Even if not made explicit, these are determined by several axioms that fix their properties: e.g., if one adds an element to the head of a list to form a new list, and then removes the head, the old list is returned. Similarly, an abstract stack is determined by axioms that govern push and pop operations. Using such properties, one may reason about lists and stacks in a mathematical way, independently of any concrete implementation. And one needs to. One can neither design nor program without such reasoning; one cannot construct correct programs without reasoning about what the programs are intended to do. If this is right, computational artifacts have an abstract guise that is separable from their physical realization or implementation. Indeed, this requirement to entertain abstract devices to support reasoning about physical ones is not unique to computer science.
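The list and stack axioms mentioned above can be made concrete in a small sketch. The representation (nested tuples) and the names `cons`, `head`, `tail`, `push`, `pop`, and `list_axioms_hold` are illustrative assumptions, not part of any cited account; the point is only that the axioms can be stated and checked without reference to a physical store.

```python
# A minimal sketch, assuming lists are modelled as nested tuples.
# EMPTY is the empty list; cons(x, xs) is the pair (x, xs).
EMPTY = ()


def cons(x, xs):
    """Add element x to the head of list xs, forming a new list."""
    return (x, xs)


def head(xs):
    """Return the first element of a non-empty list."""
    if xs == EMPTY:
        raise ValueError("head of empty list")
    return xs[0]


def tail(xs):
    """Remove the head of a non-empty list, returning the rest."""
    if xs == EMPTY:
        raise ValueError("tail of empty list")
    return xs[1]


# An abstract stack can reuse the same carrier: push is cons, pop is tail.
def push(x, stack):
    return cons(x, stack)


def pop(stack):
    return tail(stack)


def list_axioms_hold(x, xs):
    """Check the two axioms: head(cons(x, xs)) = x and tail(cons(x, xs)) = xs."""
    return head(cons(x, xs)) == x and tail(cons(x, xs)) == xs
```

Reasoning with these equations does not depend on how tuples are laid out in memory, which is the sense in which the abstract guise is separable from any implementation.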

On the other hand, they must have a physical implementation that enables them to be used as things in the physical world. This is obviously true of machines, but it is equally so for programs: Programmers write programs to control physical devices. A program or abstract machine that has no physical realization is of little use as a practical device for performing humanly intractable computations. For instance, a program that monitors heart rate must be underpinned by a physical device that actually performs the task. The computer scientist Dijkstra puts it as follows.

A programmer designs algorithms, intended for mechanical execution, intended to control existing or conceivable computer equipment. (Dijkstra 1974: 1)

On the duality view, computer science is not an abstract mathematical discipline that is independent of the physical world. To be used, these things must have physical substance. And once this observation is made, there is a clear link with a central notion in the philosophy of technology (Kroes 2010; Franssen et al. 2010), to which we now turn.

1.2 Technical Artifacts

Technical artifacts include all the common objects of everyday life such as toilets, paper clips, tablets, and dog collars. They are intentionally produced things. This is an essential part of being a technical artifact. For example, a physical object that accidentally carries out arithmetic is not by itself a calculator. This teleological aspect distinguishes them from other physical objects, and has led philosophers to argue that technical artifacts have a dual nature fixed by two sets of properties (e.g., Kroes 2010; Meijers 2001; Thomasson 2007; Vermaas & Houkes 2003): functional properties and structural properties.

Functional properties say what the artifact does. For example, a kettle is for boiling water, and a car is for transportation. On the other hand, structural properties pertain to its physical makeup. They include its weight, color, size, shape, chemical constitution, etc. For example, we might say that our car is red and has white seats.

The notion of a technical artifact will help to conceptualize and organize some of the central questions and issues in the philosophy of computer science. We begin with a concept that underpins much of the activity of the subject. Indeed, it is the initial expression of functional properties.

2. Specification and Function

In computer science, the function of an artifact is initially laid out in a (functional) specification (Sommerville 2016 [1982]; Vliet 2008). Indeed, on the way to a final device, a whole series of specification-artifact pairs of varying degrees of abstractness come into existence. The activities of specification, implementation and correctness raise a collection of overlapping conceptual questions and problems (B.C. Smith 1985; Turner 2011; Franssen et al. 2010).

2.1 Definition

Specifications are expressed in a variety of ways, including ordinary vernacular. But the trend in computer science has been towards more formal and precise forms of expression. Indeed, specialized languages have been developed that range from those designed primarily for program specification (e.g., VDM, Jones 1990 [1986]; Z, Woodcock & Davies 1996; B, Abrial 1996) and wide spectrum languages such as UML (Fowler 2003), to specialized ones that are aimed at architectural description (e.g., Rapide, Luckham 1998; Darwin, Distributed Software Engineering 1997; Wright, Allen 1997). They differ with respect to their underlying ontologies and their means of articulating requirements.

Z is based upon predicate logic and set theory. It is largely employed for the specification of suites of individual program modules or simple devices. UML (Fowler 2003) has a very rich ontology and a wide variety of expression mechanisms. For example, its class language allows the specification of software patterns (Gamma et al. 1994). In general, an architectural description language is used to precisely specify the architecture of a software system (Bass et al. 2003 [1997]). Typically, these languages employ an ontology that includes notions such as components, connectors, interfaces and configurations. In particular, architectural descriptions written in Rapide, Darwin, or Wright are precise expressions in formalisms that are defined using an underlying mathematical semantics.

But what is the logical function of the expressions of these languages? On the face of it, they are just expressions in a formal language. However, when the underlying ontology is made explicit, each of these languages reveals itself to be a formal ontology that may be naturally cast as a type theory (Turner 2009a). Under this interpretation, these expressions are stipulative definitions (Gupta 2012). As such, each defines a new abstract object within the formal ontology of its system.

2.2 Definitions as Specifications

However, taken by itself a definition need not be a specification of anything; it may just form part of a mathematical exploration. So when does a definition act as a specification? Presumably, just in case the definition is taken to point beyond itself to the construction of an artifact. It is the intentional act of giving governance of the definition over the properties of a device or system that turns a mere definition into a specification. The definition then determines whether or not the device or system has been built correctly. It provides the criteria of correctness and malfunction. From this perspective, the role of specification is a normative one. If one asks whether the device works, it is the definition functioning as a specification that tells us whether it does. Indeed, without it, the question would be moot. At any level of abstraction (see §8.1), the logical role of specification is always the same: It provides a criterion for correctness and malfunction. This is the perspective argued for by Turner (2011). Indeed, this normative role is taken to be part of any general theory of function (Kroes 2012).
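The normative role of a definition can be illustrated with a small sketch. Sorting is a hypothetical example chosen here for brevity, and the names `meets_sort_spec`, `candidate_sort`, and `faulty_sort` are assumptions introduced for illustration. The definition says what counts as a sorted rearrangement; once it is given governance over a device, it settles whether that device works or malfunctions.

```python
from collections import Counter


def meets_sort_spec(inp, out):
    """The definition acting as specification: out is in ascending order
    and contains exactly the elements of inp, with the same multiplicities."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    same_elements = Counter(inp) == Counter(out)
    return ordered and same_elements


def candidate_sort(xs):
    """An artifact intended to satisfy the specification."""
    return sorted(xs)


def faulty_sort(xs):
    """An artifact that malfunctions: it silently drops duplicates."""
    return sorted(set(xs))
```

Note that `meets_sort_spec` says nothing about how the output is to be produced; it only supplies the criterion against which any proposed artifact is judged.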

It should go without saying that this is an idealization. A specification is not fixed throughout the design and construction process. It may have to be changed because a client changes her mind about the requirements. Furthermore, it may turn out for a variety of reasons that the artifact is impossible to build. The underlying physical laws may prohibit matters. There may also be cost limitations that prevent construction. Indeed, the underlying definition may be logically absurd. In these cases, the current specification will have to be given up. But the central normative role of specification remains intact.

Unlike functional descriptions, specifications are taken to be prescribed in advance of the artifact construction; they guide the implementer. This might be taken to suggest a more substantive role for specification, i.e., to provide a method for the construction of the artifact. However, the method by which we arrive at the artifact is a separate issue from its specification. The latter dictates no such method. There is no logical difference between a functional specification and a functional description; logically they both provide a criterion of correctness.

2.3 Abstract Artifacts

Software is produced in a series of layers of decreasing levels of abstraction, where in the early layers both specification and artifact are abstract (Brooks 1995; Sommerville 2016 [1982]; Irmak 2012). For example, a specification written in logical notation might be taken to be a specification of a linguistic program. In turn, the linguistic program, with its associated semantics, might be taken as the specification of a physical device. In other words, we admit abstract entities as artifacts. This is a characteristic feature of software development (Vliet 2008). It distinguishes it from technology in general. The introduction of abstract intermediate artifacts is essential (Brooks 1995; Sommerville 2016 [1982]). Without them, logically complex computational artifacts would be impossible to construct.

So what happens to the duality thesis? It still holds good, but now the structural description does not necessarily provide physical properties but another abstract device. For example, an abstract stack can act as the specification of a more concrete one that is now given a structural description in a programming language as an array. But the array is itself not a physical thing, it is an abstract one. Its structural description does not use physical properties but abstract ones, i.e., axioms. Of course, eventually, the array will get implemented in a physical store. However, from the perspective of the implementer who is attempting to implement stacks in a programming language with arrays as a data type, the artifact is the abstract array of the programming language. Consequently, the duality thesis must be generalized to allow for abstract artifacts.
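The stack-as-array case can be sketched as follows. The class `ArrayStack`, its fixed capacity, and the helper `satisfies_push_pop_axiom` are assumptions made for illustration; a Python list of fixed length stands in for the array data type of the programming language. The array supplies the structural description, while the stack axiom supplies the criterion against which the construction is checked.

```python
class ArrayStack:
    """A stack given a structural description as a fixed-size array
    plus a count of occupied cells."""

    def __init__(self, capacity=8):
        self._cells = [None] * capacity  # the abstract array
        self._size = 0                   # number of occupied cells

    def push(self, x):
        if self._size == len(self._cells):
            raise OverflowError("stack is full")
        self._cells[self._size] = x
        self._size += 1

    def pop(self):
        if self._size == 0:
            raise IndexError("pop from empty stack")
        self._size -= 1
        return self._cells[self._size]

    def is_empty(self):
        return self._size == 0

    def contents(self):
        """The occupied prefix of the array, bottom of the stack first."""
        return self._cells[:self._size]


def satisfies_push_pop_axiom(x):
    """Check the stack axiom on one instance: pushing x and then popping
    returns x and leaves the stack as it was."""
    s = ArrayStack()
    s.push("a")
    before = s.contents()
    s.push(x)
    popped = s.pop()
    return popped == x and s.contents() == before
```

Nothing here is physical: both the stack and the array are abstract, and the relation between them is still that of specification to artifact.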

2.4 Theories of Function

Exactly how the physical and intentional conceptualizations of our world are related remains a vexing problem to which the long history of the mind-body problem in philosophy testifies. This situation also affects our understanding of technical artifacts: a conceptual framework that combines the physical and intentional (functional) aspects of technical artifacts is still lacking. (Kroes & Meijers 2006: 2)

The literature on technical artifacts (e.g., Kroes 2010; Meijers 2001; Thomasson 2007; Vermaas & Houkes 2003) contains two main theories about how the two conceptualizations are related: causal-role theories and intentional ones.

Causal-role theories insist that actual physical capacities determine function. Cummins’s theory of functional analysis (Cummins 1975) is an influential example of such a theory. The underlying intuition is that, without the physical thing and its actual properties, there can be no artifact. The main criticism of these theories concerns the location of any correctness criteria. If all we have is the physical device, we have no independent measure of correctness (Kroes 2010): The function is fixed by what the device actually does.

Causal role theories… have the tendency to let functions coincide with actual physical capacities: structure and function become almost identical. The main drawback of this approach is that it cannot account for the malfunctioning of technical artifacts: an artifact that lacks the actual capacity for performing its intended function by definition does not have that function. The intentions associated with the artifact have become irrelevant for attributing a function. (Kroes 2010: 3)

This criticism has the same flavor as that made by Kripke (1982) in his discussion of rule following.

Intentional theories insist that it is agents who ascribe functions to artifacts. Objects and their components possess functions only insofar as they contribute to the realization of a goal. Good examples of this approach are McLaughlin (2001) and Searle (1995).

But how exactly does the function get fixed by the desires of an agent? One interpretation has it that the function is determined by the mental states of the agents, i.e., the designers and users of technical artifacts. In their crude form such theories have difficulty accounting for how they impose any constraints upon the actual thing that is the artifact.

If functions are seen primarily as patterns of mental states, on the other hand, and exist, so to speak, in the heads of the designers and users of artifacts only, then it becomes somewhat mysterious how a function relates to the physical substrate in a particular artifact. (Kroes 2010: 2)

For example, how can the mental states of an agent fix the function of a device that is intended to perform addition? This question is posed in a rather different context by Kripke.

Given … that everything in my mental history is compatible both with the conclusion that I meant plus and with the conclusion that I meant quus, it is clear that the skeptical challenge is not really an epistemological one. It purports to show that nothing in my mental history of past behavior—not even what an omniscient God would know—could establish whether I meant plus or quus. But then it appears to follow that there was no fact about me that constituted my having meant plus rather than quus. (Kripke 1982: 21)
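Kripke's example can be made concrete. In his presentation, quus agrees with plus whenever both arguments are less than 57 and otherwise returns 5. The sketch below, whose function and variable names are illustrative assumptions, shows that any finite history of computations confined to small arguments is compatible with both functions.

```python
def plus(x, y):
    """Ordinary addition."""
    return x + y


def quus(x, y):
    """Kripke's deviant function: addition below 57, otherwise 5."""
    if x < 57 and y < 57:
        return x + y
    return 5


# A finite "history" of past computations, all with arguments below 57.
history = [(a, b) for a in range(57) for b in range(57)]

# Every past case is compatible with both hypotheses about what was meant.
agree_on_history = all(plus(a, b) == quus(a, b) for a, b in history)
```

The two functions come apart only on a new case such as 68 and 57, where plus gives 125 and quus gives 5; the past record alone does not determine which function the device, or the agent, was computing.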

Of course, one might also insist that the artifact is actually in accord with the specification, but this does not help if the expression of the function is only located in the mental states of an agent. This version of the intentional theory is really a special case of a causal theory where the agent’s head is the physical device in which the function is located.

However, there is an alternative interpretation of the intentional approach. In his commentary on Wittgenstein’s notion of acting intentionally (Wittgenstein 1953), David Pears suggests that anyone who acts intentionally must know two things. Firstly, she must know what activity she is engaged in. Secondly, she must know when she has succeeded (Pears 2006). According to this perspective, establishing correctness is an externally observable, rule-based activity. The relation between the definition and the artifact is manifest in using the definition as a canon of correctness for the device. I must be able to justify my reasons for thinking that it works: If I am asked if it works I must be able to justify that it does with reference to the abstract definition. The content of the function is laid out in the abstract definition, but the intention to take it as a specification is manifest in using it as one (§2.2).

3. Implementation

Broadly speaking, an implementation is a realization of a specification. Examples include the implementation of a UML specification in Java, the implementation of an abstract algorithm as a program in C, the implementation of an abstract data type in Miranda, or the implementation of a whole programming language. Moreover, implementation is often an indirect process that involves many stages before physical bedrock is reached, and each stage involves a specification-artifact pairing and a notion of implementation. But what is an implementation? Is there just one notion or many?

3.1 What Is Implementation?

The most detailed philosophical study of implementation is given by Rapaport (1999, 2005). He argues that implementation involves two domains: a syntactic one (the abstraction) and a semantic one (the implementation). Indeed, he suggests that a full explication of the notion requires a third hidden term, a medium of implementation: \(I\) is an implementation of \(A\) in medium \(M\). Here \(I\) is the semantic component, \(A\) is the abstraction, and \(M\) is the medium of implementation. He allows for the target medium to be abstract or physical. This is in line with the claim that artifacts may be abstract or concrete.

Superficially, this seems right. In all the examples cited, there is a medium of implementation in which the actual thing that is the implementation is carved out. Perhaps the clearest example is the implementation of a programming language. Here, the syntactic domain is the actual language and the semantic one its interpretation on an abstract machine: the medium of interpretation. He suggests that we implement an algorithm when we express it in a computer programming language, and we implement an abstract data type when we express it as a concrete one. Examples that he does not mention might include the UML definition of design patterns implemented in Java (Gamma et al. 1994).

He further argues that there is no intrinsic difference between which of the domains is semantic and which is syntactic. This is determined by the asymmetry of the implementation mapping. For example, a physical computer process that implements a program plays the role of the semantics to the linguistic program, while the same linguistic program can play the role of semantic domain to an algorithm. This asymmetry is parallel to that of the specification-artifact connection. On the face of it, there is little to cause any dissension. It is a straightforward description of the actual use of the term implementation. However, there is an additional conceptual claim that is less clear.

3.2 Implementation as Semantic Interpretation

Apparently, the semantic domain, as its name suggests, is always taken to be a semantic representation of the syntactic one; it closes a semantic gap between the abstraction and the implementation in that the implementation fills in details. This is a referential view of semantics in that the syntactic domain refers to another domain that provides its meaning. Indeed, there is a strong tradition in computer science that takes referential or denotational semantics as fundamental (Stoy 1977; Milne & Strachey 1976; Gordon 1979). We shall examine this claim later when we consider the semantics of programming languages in more detail (§4). For the moment, we are only concerned with the central role of any kind of semantics.

One view of semantics insists that it must be normative. Although the exact form of the normative constraint (Glüer & Wikforss 2015; Miller & Wright 2002) is debated, there is a good deal of agreement on a minimal requirement: a semantic account must fix what it is to use an expression correctly.

The fact that the expression means something implies that there is a whole set of normative truths about my behavior with that expression; namely, that my use of it is correct in application to certain objects and not in application to others…. The normativity of meaning turns out to be, in other words, simply a new name for the familiar fact that, regardless of whether one thinks of meaning in truth-theoretic or assertion-theoretic terms, meaningful expressions possess conditions of correct use. Kripke’s insight was to realize that this observation may be converted into a condition of adequacy on theories of the determination of meaning: any proposed candidate for the property in virtue of which an expression has meaning, must be such as to ground the “normativity” of meaning—it ought to be possible to read off from any alleged meaning constituting property of a word, what is the correct use of that word. (Boghossian 1989: 513)

On the assumption that this minimal requirement has to be satisfied by any adequate semantic theory, is implementation always, or even ever, semantic interpretation? Are these two notions at odds with each other?

One standard instance of implementation concerns the interpretation of one language in another. Here the abstraction and the semantic domain are both languages. Unfortunately, this does not provide a criterion of correctness unless we have already fixed the semantics of the target language. While translating between languages is taken to be implementation, indeed a paradigm case, it is not, on the present criterion, semantic interpretation. It only satisfies the correctness criterion when the target language has an independently given notion of correctness. This may be achieved in an informal or in a mathematical way. But it must not end in another uninterpreted language. So this paradigm case of implementation does not appear to satisfy the normative constraints required for semantic interpretation. On the other hand, Rapaport (1995) argues that providing a recursive definition of implementation requires a base case, that is, the process must end in an uninterpreted language. However, such a language can be interpreted on itself, mapping each symbol either on itself or on different symbols of that language.

Next consider the case where the abstraction is a language and the semantic medium is set theory. This would be the case with denotational semantics (Stoy 1977). This does provide a notion of correctness. Our shared and agreed understanding of set theory provides this. Unfortunately, it would not normally be taken as an implementation. Certainly, it would not, if an implementation is something that is eventually physically realizable.

Now consider the case where the syntactic component is an abstract stack and the semantic one is an array. Here we must ask what it means to say that the implementation is correct. Does the medium of the array fix the correct use of stacks? It would seem not: The array does not provide the criteria for deciding whether we have the correct axioms for stacks or whether we have used them correctly in a particular application. Rather, the stack provides the correctness criteria for the implementation that is the array, and it is the axioms that provide the fundamental meanings of the constructs. While the array is an implementation of the stack, it does not provide it with a notion of correctness: The cart and the horse have been interchanged.

Finally, suppose the semantic domain is a physical machine and the syntactic one is an abstract one. The suggestion is that the physical machine provides a semantic interpretation of the abstract one. But again, a semantic interpretation must provide us with a notion of correctness and malfunction, and there are compelling arguments against this that are closely related to the causal theories of function (§2.4). This issue will be more carefully examined in §4, where we consider programming language semantics.

Given that a semantic account of a language must supply correctness criteria, and that the term semantics is to have some bite, these are serious obstacles for the view that implementation is semantic interpretation. There are several phenomena all rolled into one. If these objections are along the right lines, then the relationship between the source and target is not semantic interpretation. Of course, one may counter all this by arguing against the correctness requirement for semantic theory.

3.3 Specification and Implementation

An alternative analysis of implementation is implicit in Turner (2014, 2012). Consider the case where the data type of finite sets is implemented in the data type of lists. Each of these structures is governed by a few simple axioms. The implementation represents finite sets as lists, the union operation on sets as list concatenation, equality between sets as extensional equality on lists, and so on. This is a mathematical relationship where the axioms for sets act as a specification of the artifact, which in this case is implemented in the medium of lists. It would appear that the logical connection between the two is that of specification and artifact. The mapping does not have to be direct, i.e., there does not have to be a simple operation-to-operation correspondence, but the list properties of the implemented operations must satisfy the given set axioms. In standard mathematical terms, the list medium must provide a mathematical model (in the sense of model theory, W. Hodges 2013) of the set axioms. The case where one language is implemented in another is similar, and fleshed out by the semantic definitions of the two languages.
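A minimal sketch of this implementation follows. The function names are assumptions chosen for illustration, and Python lists stand in for the list data type. Union is list concatenation and set equality is extensional: two lists represent the same set when each contains every member of the other.

```python
def member(x, s):
    """Set membership, implemented as list membership."""
    return x in s


def union(s, t):
    """Set union, implemented as list concatenation."""
    return s + t


def set_equal(s, t):
    """Extensional equality: the two lists have exactly the same members."""
    return all(member(x, t) for x in s) and all(member(x, s) for x in t)


def union_axiom_holds(x, s, t):
    """The set axiom for union: x is in the union of s and t
    if and only if x is in s or x is in t."""
    return member(x, union(s, t)) == (member(x, s) or member(x, t))
```

For `a = [1, 2]` and `b = [2, 3]`, the lists `union(a, b)` and `union(b, a)` differ as lists, yet `set_equal` identifies them. The correspondence between set operations and list operations is therefore not one of identity; what matters is that the implemented operations satisfy the set axioms, which here act as the specification.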

Finally, consider the case where the medium of implementation is a physical device, e.g., an abstract stack is implemented as a physical one. Once again the abstract stack must provide the correctness criteria for the physical device. This is what happens in practice. We check that the physical operations satisfy the abstract demands given by the axioms for stacks. There are issues here that have to do with the adequacy of this notion of correctness. We shall discuss these when we more carefully consider the computer science notion of correctness (§7.4).

If this analysis is along the right lines, implementation is best described as a relation between specification and artifact. Implementation is not semantic interpretation; indeed, it requires an independent semantic account in order to formulate a notion of implementation correctness. So, what is taken to be semantic interpretation in computer science?

4. Semantics

How is a semantic account of a programming language to be given? What are the main conceptual issues that surround the semantic enterprise? There are many different semantic candidates in the literature (Gordon 1979; Gunter 1992; Fernández 2004; Milne & Strachey 1976). One of the most important distinctions centers upon the difference between operational and denotational semantics (Turner 2007; White 2003).

4.1 Two Kinds of Semantic Theory

Operational semantics began life with Landin (1964). In its logical guise (Fernández 2004) it provides a mechanism of evaluation where, in its simplest form, the evaluation relation is represented as follows. \[P \Downarrow c\] This expresses the idea that the program \(P\) converges to the canonical form given by \(c\). The classical case of such a reduction process occurs in the lambda calculus where reduction is given by the reduction rules of the calculus, and canonical forms are its normal forms i.e., where none of the reduction rules apply. The following is a simple example: \[z(\lambda x.y)\]

This is usually called big step semantics. It is normally given in terms of rules that provide the evaluation of a complex program in terms of the evaluation of its parts. For example, a simple rule for sequencing (\(\circ\)) would take the form

\[ \frac{P\Downarrow c \quad Q\Downarrow d} {P\circ Q \Downarrow c\circ d} \]

These canonical or normal forms are other terms in the programming language which cannot be further reduced by the given rules. But they are terms of the language. For this reason, this operational approach is often said to be unsatisfactory. According to this criticism, at some point in the interpretation process, the semantics for a formal language must be mathematical.
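The point that canonical forms are themselves terms of the language can be seen in a toy big step evaluator. The tiny language below (numerals, addition, and sequencing, encoded as tagged tuples) is a hypothetical illustration and is not drawn from the literature cited; the sequencing clause mirrors the rule given above, evaluating each part and recombining the results.

```python
def evaluate(term):
    """Big step evaluation: relate a term to its canonical form.

    Terms are tagged tuples:
      ("num", n)     a numeral, already canonical
      ("add", P, Q)  addition of two terms
      ("seq", P, Q)  sequencing of two terms
    """
    tag = term[0]
    if tag == "num":
        # Numerals evaluate to themselves.
        return term
    if tag == "add":
        c = evaluate(term[1])
        d = evaluate(term[2])
        if c[0] != "num" or d[0] != "num":
            raise TypeError("addition expects numerals")
        return ("num", c[1] + d[1])
    if tag == "seq":
        # If P evaluates to c and Q evaluates to d,
        # then the sequence of P and Q evaluates to the sequence of c and d.
        return ("seq", evaluate(term[1]), evaluate(term[2]))
    raise ValueError(f"unknown term: {term!r}")
```

Every value returned by `evaluate` is again a tagged tuple of the same language: the evaluator relates syntax to syntax, which is exactly the feature to which the criticism objects.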

We can apparently get quite a long way expounding the properties of a language with purely syntactic rules and transformations… One such language is the Lambda Calculus and, as we shall see, it can be presented solely as a formal system with syntactic conversion rules … But we must remember that when working like this all we are doing is manipulating symbols-we have no idea at all of what we are talking about. To solve any real problem, we must give some semantic interpretation. We must say, for example, “these symbols represent the integers”. (Stoy 1977: 9)

In contrast, operational semantics is taken to be syntactic. In particular, even if one of them is in canonical form, the relation \(P\Downarrow c\) relates syntactic objects. This does not get at what we are talking about. Unless the constants of the language have themselves an independently given mathematical meaning, at no point in this process do we reach semantic bedrock: we are just reducing one syntactic object to another, and this does not yield a normative semantics. This leads to the demand for a more mathematical approach.

Apparently, programming languages refer to (or are notations for) abstract mathematical objects, not syntactic ones (Strachey 2000; McGettrick 1980; Stoy 1977). In particular, denotational semantics provides, for each syntactic object \(P\), a mathematical one. Moreover, it generally does this in a compositional way: Complex programs have their denotations fixed in terms of the denotations of their syntactic parts. These mathematical objects might be set theoretic, category theoretic, or type theoretic. But whichever method is chosen, programs are taken to refer to abstract mathematical things. However, this position relies on a clear distinction between syntactic and mathematical objects.
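Compositionality can be illustrated with a small sketch. It assumes a toy language of assignments and sequencing encoded as nested tuples; the names `denote_expr`, `denote_cmd`, and `State` are hypothetical, and Python functions merely stand in for the mathematical functions a denotational semantics would supply. Each command is mapped to a function from states to states, and the denotation of a compound command is built only from the denotations of its parts.

```python
from typing import Callable, Dict

State = Dict[str, int]


def denote_expr(expr) -> Callable[[State], int]:
    """Map an expression to a function from states to integers."""
    kind = expr[0]
    if kind == "num":
        n = expr[1]
        return lambda state: n
    if kind == "var":
        name = expr[1]
        return lambda state: state[name]
    if kind == "times":
        left, right = denote_expr(expr[1]), denote_expr(expr[2])
        return lambda state: left(state) * right(state)
    raise ValueError(f"unknown expression: {expr!r}")


def denote_cmd(cmd) -> Callable[[State], State]:
    """Map a command to a function from states to states."""
    kind = cmd[0]
    if kind == "assign":
        name, value = cmd[1], denote_expr(cmd[2])
        # The new state agrees with the old one except at `name`.
        return lambda state: {**state, name: value(state)}
    if kind == "seq":
        first, second = denote_cmd(cmd[1]), denote_cmd(cmd[2])
        # Sequencing denotes function composition.
        return lambda state: second(first(state))
    raise ValueError(f"unknown command: {cmd!r}")


program = ("seq",
           ("assign", "a", ("times", ("num", 13), ("num", 74))),
           ("assign", "b", ("var", "a")))
meaning = denote_cmd(program)
```

Applying `meaning` to the empty state `{}` gives `{"a": 962, "b": 962}`. The design choice to treat sequencing as composition is the standard textbook one for simple imperative languages; richer languages need richer mathematical objects.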

4.2 Programming Languages as Axiomatic Theories

Mathematical theories such as set theory and category theory are axiomatic theories, and it is this that makes them mathematical. This is implicit in the modern axiomatic treatment of mathematics encouraged by Bourbaki (1968) and championed by Hilbert (1931).

It is worth pointing out that the axiomatic account, as long as it is precise and supports mathematical reasoning, does not need to be formal. If one accepts this as a necessary condition for mathematical status, does it rule out operational accounts? Prima facie it would seem so. Apparently, programs are reduced to canonical constants with no axiomatic definitions. But Turner (2009b, 2010) argues this is to look in the wrong place for the axiomatization: the latter resides not in the interpreting constants but in the rules of evaluation, i.e., in the theory of reduction given by the axiomatic relation \(\Downarrow\).

Given that both denotational and operational semantics define matters axiomatically, it should not matter which we take to define the language as a formal mathematical theory. Unfortunately, they don’t always agree: The notion of equality provided by the operational account, although preserved by the denotational one, is often more fine grained. This has led to very special forms of denotational semantics based upon games (Abramsky & McCusker 1995; Abramsky et al. 1994). However, it is clear that practitioners take the operational account as fundamental, and this is witnessed by the fact that they seek to devise denotational accounts that are in agreement with the operational ones.

Not only is there no metaphysical difference between the set theoretic account and the operational one, but the latter is taken to be the definitive one. This view of programming languages is the perspective of theoretical computer science: Programming languages, via their operational definitions, are mathematical theories of computation.

However, programming languages are very combinatorial in nature. They are working tools, not elegant mathematical theories; it is very hard to explore them mathematically. Does this prevent them from being mathematical theories? There has been very little discussion of this issue in the literature; Turner (2010) and Strachey (2000) are exceptions. On the face of it, Strachey sees them as mathematical objects pure and simple. Turner is a little more cautious and argues that actual programming languages, while often too complex to be explored as mathematical theories, contain a core theory of computation that may be conservatively extended to the full language.

4.3 The Implementation of Programming Languages

However, Turner (2014) further argues that programming languages, even at their core, are not just mathematical objects. He argues that they are best conceptualized as technical artifacts. While their axiomatic definition provides their function, they also require an implementation. In the language of technical artifacts, a structural description of the language must say how this is to be achieved: It must spell out how the constructs of the language are to be implemented. To illustrate the simplest case, consider the assignment instruction. \[x := E\] A physical implementation might take the following form.

Physically compute the value of \(E\).

Place the (physical token for) the value of \(E\) in the physical location named \(x\); any existing token of value is to be replaced.

This is a description of how assignment is to be physically realized. It is a physical description of the process of evaluation. Of course, a complete description will spell out more, but presumably not what the actual machine is made of; one assumes that this would be part of the structural description of the underlying computer, the medium of implementation. The task of the structural description is only to describe the process of implementation on a family of similarly structured physical machines. Building on this, we stipulate how the complex constructs of the language are to be implemented. For example, to execute commands in sequence we could add a physical stack that arranges them for processing in sequence. Of course, matters are seldom this straightforward. Constructs such as iteration and recursion require more sophisticated treatment. Indeed, interpretation and compilation may involve many layers and processes. However, in the end there must be some interpretation into the medium of a physical machine. Turner (2014) concludes that a programming language is a complex package of syntax and semantics (function) together with the implementation as structure.
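The two-step description of assignment given above can be mimicked in software. The sketch below is a simulation only, and so remains on the abstract side of the divide under discussion: the class `Machine`, its numbered cells, and the helper `run` are hypothetical names, and a plain list of commands stands in for the stack mentioned above.

```python
from typing import Callable, Dict, List, Tuple


class Machine:
    """A toy store: named locations mapped onto numbered memory cells."""

    def __init__(self, names: List[str]) -> None:
        self.address: Dict[str, int] = {name: i for i, name in enumerate(names)}
        self.cells: List[int] = [0] * len(names)

    def read(self, name: str) -> int:
        return self.cells[self.address[name]]

    def assign(self, name: str, expression: Callable[["Machine"], int]) -> None:
        value = expression(self)                # step 1: compute the value of E
        self.cells[self.address[name]] = value  # step 2: overwrite the token held at x


def run(machine: Machine,
        commands: List[Tuple[str, Callable[[Machine], int]]]) -> None:
    """Execute (name, expression) assignments one after another."""
    for name, expression in commands:
        machine.assign(name, expression)
```

For example, running the commands `[("x", lambda m: 13 * 74), ("y", lambda m: m.read("x") + 1)]` on `Machine(["x", "y"])` leaves 962 in the cell for `x` and 963 in the cell for `y`.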

Some have suggested that a physical implementation actually defines the semantics of the language. Indeed, this is a common perspective in the philosophy of computer science literature. We have already seen that Rapaport (1999) sees implementation as a semantic interpretation. Fetzer (1988) observes that programs have a different semantic significance from theorems. In particular, he asserts:

…programs are supposed to possess a semantic significance that theorems seem to lack. For the sequences of lines that compose a program are intended to stand for operations and procedures that can be performed by a machine, whereas the sequences of lines that constitute a proof do not. (Fetzer 1988: 1059)

This seems to say that the physical properties of the implementation contribute to the meaning of programs written in the language. Colburn (2000) is more explicit when he writes that the simple assignment statement \(A := 13\times 74\) is semantically ambiguous between something like the abstract account we have given, and the physical one given as:

physical memory location \(A\) receives the value of physically computing 13 times 74. (Colburn 2000: 134)

The phrase “physically computing” seems to imply that what the physical machine actually does is semantically significant, i.e., that what it actually does determines or contributes to the meaning of assignment. Is this to be taken to imply that to fix what assignment means we have to carry out a physical computation? However, if an actual physical machine is taken to contribute in any way to the meaning of the constructs of the language, then their meaning is dependent upon the contingencies of the physical device. In particular, the meaning of the simple assignment statement may well vary with the physical state of the device and with contingencies that have nothing to do with the semantics of the language, e.g., power cuts. Under this interpretation, multiplication does not mean multiplication but rather what the physical machine actually does when it simulates multiplication. This criticism parallels that for causal theories of function (§2.4).

5. The Ontology of Programs

The nature of programs has been the subject of a good amount of philosophical and legal reflection. What kinds of things are they? Are they abstract (perhaps mathematical or symbolic) objects or concrete physical things? Indeed, the legal literature even contains a suggestion that programs constitute a new kind of (legal) entity (§10.1).

The exact nature of computer programs is difficult to determine. On the one hand, they are related to technological matters. On the other hand, they can hardly be compared to the usual type of inventions. They involve neither processes of a physical nature, nor physical products, but rather methods of organization and administration. They are thus reminiscent of literary works even though they are addressed to machines. Neither industrial property law nor copyright law in their traditional roles seems to be the appropriate instrument for the protection of programs, because both protections were designed for and used to protect very different types of creations. The unique nature of the computer program has led to broad support for the creation of sui generis legislation. (Loewenheim 1989: 1)

This highlights the curious legal status of programs. Indeed, it raises tricky ontological questions about the nature of programs and software: they appear to be abstract, even mathematical objects with a complex structure, and yet they are aimed at physical devices. In this section, we examine some of the philosophical issues that have arisen regarding the nature of programs and software.

5.1 Programs as Mathematical Objects

What is the content of the claim that programs are mathematical objects? In the legal literature, the debate seems to center on the notion that programs are symbolic objects that can be formally manipulated (Groklaw 2012a, 2012b; see Other Internet Resources). Indeed, there is a branch of theoretical computer science called formal language theory that treats grammars as objects of mathematical study (Hopcroft & Ullman 1969). While this does give some substance to the claim, it is not the most important sense in which programs are mathematical. The more important sense pertains to their semantics, where programming languages are taken to be axiomatic theories (§4.2). This perspective locates programs as elements in a theory of computation (Turner 2007, 2010).

5.2 Programs as Technical Artifacts

While agreeing that programs have an abstract guise, much of the philosophical literature (e.g., Colburn 2000; Moor 1978) has it that they also possess a concrete physical manifestation that facilitates their use as the cause of computations in physical machines. For example, Moor observes:

It is important to remember that computer programs can be understood on the physical level as well as the symbolic level. The programming of early digital computers was commonly done by plugging in wires and throwing switches. Some analogue computers are still programmed in this way. The resulting programs are clearly as physical and as much a part of the computer system as any other part. Today digital machines usually store a program internally to speed up the execution of the program. A program in such a form is certainly physical and part of the computer system. (Moor 1978: 215)

The following is of more recent origin, and more explicitly articulates the duality thesis in its claim that software has both abstract and physical guises.

Many philosophers and computer scientists share the intuition that software has a dual nature (Moor 1978; Colburn 2000). It appears that software is both an algorithm, a set of instructions, and a concrete object or a physical causal process. (Irmak 2012: 3)

5.3 Abstract and Concrete

Anyone persuaded by the abstract-physical duality for programs is under an obligation to say something about the relationship between these two forms of existence. This is the major philosophical concern and parallels the question for technical artifacts in general.

One immediate suggestion is that programs, as textual objects, cause mechanical processes. The idea seems to be that somehow the textual object physically causes the mechanical process. Colburn (1999, 2000) denies that the symbolic text itself has any causal effect; it is its physical manifestation, the thing on the disk, which has such an effect. For him, software is a concrete abstraction that has a medium of description (the text, the abstraction) and a medium of execution (e.g., a concrete implementation in semiconductors). The duality is unpacked in a way that is parallel to that found in the philosophy of mind (see the entry on dualism), where the physical device is taken as a semantic interpretation of the abstract one. This is close to the perspective of Rapaport (1999). However, we have already alluded to problems with this approach (§3.3).

A slightly different account can be found in Fetzer (1988). He suggests that abstract programs are something like scientific theories: A program is to be seen as a theory of its physical implementation, that is, programs are causal models. In particular, the simple assignment statement and its semantics is a theory about a physical store and how it behaves. If this is right, and a program turns out not to be an accurate description of the physical device that is its implementation, then the program must be changed, just as any theory that does not fit its subject matter should be revised. But this does not seem to be what happens in practice. While the program may have to be changed, this is not instigated by any lack of accord with its physical realization, but by an independent abstract semantics for assignment. If this is correct, the abstract semantics appears not to be a theory of its concrete implementation.

The alternative picture has it that the abstract program (determined by its semantics) provides the function of the artifact, and the physical artifact, or rather its description, provides its structure. It is the function of the program, expressed in its semantics, that fixes the physical implementation and provides the criteria of correctness and malfunction. Programs as computational artifacts have both an abstract aspect that somehow fixes what they do and a physical aspect that enables them to cause physical things to happen.

5.4 Programs and Specifications

What is the difference between programming and specification? One suggestion is that a specification tells us what a program is to do without actually saying how to do it. For instance, the following is a specification written in VDM (Jones 1990 [1986]).

SQRTP \((x\):real, \(y\):real)
Pre: \(x \ge 0\)
Post: \(y* y = x\) and \(y \ge 0\)

This is a specification of a square root function with the precondition that the input is non-negative. It is a functional description in that it says what the program must do without saying how this is to be achieved. One way to unpack this what-how difference is in terms of the descriptive-imperative distinction. Programs are imperative and say how to achieve the goal, whereas specifications are declarative and only describe the input/output behavior of the intended program. Certainly, in the imperative programming paradigm, this seems to capture a substantive difference. But it is not appropriate for all paradigms. For example, logic and functional programming languages (Thompson 2011) are not obviously governed by it. The problem is that programming languages have evolved to a point where this way of describing the distinction is not marked by the style or paradigm of the programming language. Indeed, in practice, a program written in Haskell (Thompson 2011) could act as a specification for a program written in C (Huss 1997, Other Internet Resources).
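The what-how contrast for the square root specification can be made concrete with a short sketch. It rests on assumptions that are not part of the VDM text: real numbers are approximated by floating-point values, so the equality in the postcondition is relaxed to a tolerance, and the names `pre`, `post`, `sqrt_newton`, and `satisfies_spec` are hypothetical. The first two functions say what is required; `sqrt_newton` is one of many possible ways of achieving it.

```python
import math


def pre(x: float) -> bool:
    """Precondition of the specification: x >= 0."""
    return x >= 0


def post(x: float, y: float, tol: float = 1e-9) -> bool:
    """Postcondition: y * y = x and y >= 0 (equality relaxed for floats)."""
    return y >= 0 and math.isclose(y * y, x, rel_tol=tol, abs_tol=tol)


def sqrt_newton(x: float) -> float:
    """One possible 'how': Newton's iteration, capped at 200 steps."""
    if x == 0:
        return 0.0
    y = max(x, 1.0)  # start at or above the root
    for _ in range(200):
        nxt = 0.5 * (y + x / y)
        if nxt == y:
            break
        y = nxt
    return y


def satisfies_spec(program, x: float) -> bool:
    """The program meets the specification at x if pre(x) implies post(x, program(x))."""
    return (not pre(x)) or post(x, program(x))
```

Both `sqrt_newton` and the library function `math.sqrt` pass `satisfies_spec` on inputs such as 0.0, 0.25, 2.0, and 1e6, although they compute their results differently. The specification is silent on that difference, and it rejects the negative root: `post(4.0, -2.0)` is false.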

A more fundamental difference concerns the direction of governance, i.e., which is the normative partner in the relationship and which is the submissive one. In the case of the specification of the square root function, the artifact is the linguistic program. When the program is taken as the specification, the artifact is the next level of code, and so on down to a concrete implementation. This is in accord with Rapaport (2005) and his notion of the asymmetry of implementation.

6. Verification

One of the crucial parts of the software development process is verification: After computational artifacts have been specified, instantiated into some high-level programming language, and implemented in hardware, developers evaluate whether those artifacts are correct with respect to the provided program specifications. Correctness evaluation methods can be roughly sorted into two main groups: formal verification and testing. Formal verification (Monin 2003) involves some mathematical proof of correctness; software testing (Ammann & Offutt 2008) instead involves running the implemented program and observing whether the performed executions comply with the specifications advanced for the behaviors of that program. In many practical cases, formal methods and testing are used together for verification purposes (see for instance Callahan et al. 1996).

6.1 Models and Theories

Formal verification methods include the construction of representations of the piece of software to be verified against some set of program specifications. In theorem proving (see Van Leeuwen 1990), programs are represented in terms of axiomatic systems and a set of rules of inference for programs’ transition conditions; a proof of correctness is provided by deriving suitably formalized specifications from that set of axioms. In model checking (Baier & Katoen 2008), a program is represented in terms of some state transition system, the program’s property specifications are represented in terms of temporal logic formulas (Kröger & Merz 2008), and a proof of correctness is achieved by a depth-first search algorithm that checks whether those temporal logic formulas hold of the state transition system.

Axiomatic systems and state transition systems used to evaluate whether the executions of the represented computational artifacts conform or do not conform with the behaviors prescribed by their specifications can be understood as theories of the represented systems in that they are used to predict and explain the future behaviors of those systems. In particular, state transition systems in model checking can be compared, on a methodological basis, with scientific models in empirical sciences (Angius & Tamburrini 2011). For instance, Kripke Structures are in compliance with Suppes’ (1960) definition of scientific models as set-theoretic structures establishing proper mapping relations with models of data collected by means of experiments on the target empirical system (see also the entry on models in science). A Kripke Structure \(M = (S\), \(S_0\), \(R, L)\) is a set-theoretic model composed of a non-empty set of states \(S\), together with a non-empty set of initial states \(S_0\), a total state transition relation \(R \subseteq S \times S\), and a function \(L: S \rightarrow 2^{\textit{AP}}\) labeling each state in \(S\) with subsets of a set of atomic propositions AP.
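The definition of a Kripke Structure translates directly into a data structure, and the simplest kind of check on it can be sketched in a few lines. The fragment below is an illustration under assumptions: the class `Kripke`, the function `find_violation`, and the three-state example are hypothetical; the requirement that initial states belong to \(S\) is added here as the usual convention; and only an invariant ("the proposition holds in every reachable state") is checked, which is a small fragment of what temporal logic can express.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Set, Tuple


@dataclass(frozen=True)
class Kripke:
    states: FrozenSet[str]                    # S
    initial: FrozenSet[str]                   # S_0
    transitions: FrozenSet[Tuple[str, str]]   # R, a subset of S x S
    labels: Dict[str, FrozenSet[str]]         # L: S -> 2^AP

    def is_well_formed(self) -> bool:
        nonempty = bool(self.states) and bool(self.initial)
        initial_ok = self.initial <= self.states  # assumed convention
        closed = all(s in self.states and t in self.states
                     for s, t in self.transitions)
        # R is total: every state has at least one successor.
        total = all(any(src == s for src, _ in self.transitions)
                    for s in self.states)
        labelled = all(s in self.labels for s in self.states)
        return nonempty and initial_ok and closed and total and labelled


def find_violation(model: Kripke, prop: str) -> Optional[str]:
    """Depth-first search for a reachable state not labelled with `prop`."""
    stack = list(model.initial)
    seen: Set[str] = set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        if prop not in model.labels[state]:
            return state  # counterexample state
        stack.extend(t for s, t in model.transitions if s == state)
    return None  # the invariant holds in every reachable state


example = Kripke(
    states=frozenset({"s0", "s1", "s2"}),
    initial=frozenset({"s0"}),
    transitions=frozenset({("s0", "s1"), ("s1", "s0"), ("s2", "s2")}),
    labels={"s0": frozenset({"safe"}),
            "s1": frozenset({"safe"}),
            "s2": frozenset()},
)
```

In `example`, the unlabelled state `s2` is not reachable from `s0`, so `find_violation(example, "safe")` returns `None`. Adding the transition `("s1", "s2")` makes `s2` reachable, and the search then returns it as a counterexample.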

Kripke Structures and other state transition systems utilized in formal verification methods are often called system specifications. They are distinguished from common specifications, also called property specifications. The latter specify some required behavioral properties the artifact to be encoded must instantiate, while the former specify (in principle) all potential executions of an already encoded program, thus allowing for algorithmic checks on its traces (Clarke et al. 1999). In order to achieve this goal, system specifications are to be considered as abductive structures hypothesizing the set of potential executions of a target computational artifact on the basis of the program’s code and the allowed state transitions (Angius 2013b). Indeed, once some temporal logic formula has been checked to hold or not to hold of the modeled Kripke Structure, the represented program is empirically tested against the behavioral property corresponding to the checked formula to evaluate whether the model-hypothesis is an adequate representation of the target artifact. Accordingly, property specifications and system specifications differ also in their intentional stance (Turner 2011): Property specifications are requirements \(on\) the program to be encoded, whereas system specifications are (hypothetical) descriptions \(of\) the encoded program. The descriptive and abductive character of state transition systems in model checking is an additional and essential feature putting state transition systems on a par with scientific models.

6.2 Testing and Experiments

The so-called “agile methods” in software development make extensive use of software testing to evaluate the dependability of the implemented computational artifacts. Testing is the more “empirical” process of launching a program and observing its executions to evaluate whether they comply or do not comply with the supplied property specifications. Philosophers and philosophically-minded computer scientists have analyzed software testing techniques in the light of traditional methodological approaches in scientific discovery (Snelting 1998; Gagliardi 2007; Northover et al. 2008; Angius 2014) and questioned whether software tests can be regarded as scientific experiments evaluating the correctness of programs (Schiaffonati & Verdicchio 2014; Schiaffonati 2015; Tedre 2015).

Dijkstra’s well-known dictum “Program testing can be used to show the presence of bugs, but never to show their absence” (Dijkstra 1970: 7) introduces Popper’s (1959) principle of falsifiability into computer science (Snelting 1998). Testing a program against an advanced property specification for a given interval of time may exhibit some failures, but if no failure occurs while observing the running program, one cannot conclude that the program is correct. An incorrect execution might be observed at the very next test of the system. The reason is that testers can only launch the program with a finite subset of the program’s potential input set and for a finite interval of time; accordingly, not all potential executions of the artifact to be tested can be empirically observed. For this reason, the aim of software testing is to detect programs’ faults and not to guarantee their absence (Ammann & Offutt 2008: 11). A program is falsifiable in that tests can reveal its faults (Northover et al. 2008). Given a computational artifact and a property specification, a test is akin to a scientific experiment which, by observing the system’s behaviors, tries to falsify the hypothesis that the program is correct with respect to the specification of interest.
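The asymmetry between showing the presence and the absence of faults can be seen in a deliberately contrived sketch. Everything in it is hypothetical: `buggy_abs` is a made-up program with a planted fault at a single input, `meets_spec` is an assumed property specification for absolute value, and `run_tests` is a minimal test harness.

```python
from typing import Callable, Iterable, List


def buggy_abs(n: int) -> int:
    """Intended to return |n|, but wrong for exactly one input."""
    if n == -7919:  # hypothetical planted fault
        return n
    return n if n >= 0 else -n


def meets_spec(n: int, result: int) -> bool:
    """Property specification: the result is non-negative and equals n or -n."""
    return result >= 0 and result in (n, -n)


def run_tests(program: Callable[[int], int], inputs: Iterable[int]) -> List[int]:
    """Return the inputs on which the program violates the specification."""
    return [n for n in inputs if not meets_spec(n, program(n))]
```

A test campaign over the 2001 inputs in `range(-1000, 1001)` reports no failures, yet the program is incorrect: the single additional input `-7919` falsifies the hypothesis of correctness. No finite extension of the passing campaign could have established that hypothesis.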

However, one should be careful to note that other methodological and epistemological traits characterizing scientific experiments are not shared by software tests. A first methodological distinction can be recognized in that a falsifying test leads to the revision of the artifact, not of the hypothesis, as in the case of testing scientific hypotheses. This is due to the difference in the intentional stance of specifications and empirical hypotheses in science (Turner 2011). Specifications are requirements whose violation demands program revisions until the program becomes a correct instantiation of the specifications.

Accordingly, the notion of scientific experiments, as it has been traditionally examined by the philosophy of empirical sciences, needs to be somehow “stretched” in order to be applied to software testing activities (Schiaffonati 2015). Theory-driven experiments, characterizing most experimental sciences, find no counterpart in actual computer science practice. Indeed, if one excludes the cases wherein testing is combined with formal methods, most experiments performed by software engineers are rather explorative. An experiment is explorative when it is aimed at “exploring”

the realm of possibilities pertaining to the functioning of an artefact and its interaction with the environment in the absence of a proper theory or theoretical background. (Schiaffonati 2015: 662)

Software testers often do not have theoretical control on the experiments they perform; exploration on the behaviors of the artifacts interacting with users and environments rather provides testers with theoretical generalizations on the observed behaviors. Explorative experiments in computer science are also characterized by the fact that programs are often tested in a real-like environment wherein testers play the role of users. However, it is an essential feature of theory-driven experiments that experimenters do not take part in the experiment to be carried out.

As a result, some software testing activities are closer to the experimental activities one finds in empirical sciences, while others define a new typology of experiment that belongs to the software development process. Five typologies of experiments can be distinguished in the process of specifying, implementing, and evaluating computing artifacts (Tedre 2015). Feasibility experiments are performed to evaluate whether an artifact of interest performs the functions specified by users and stakeholders; trial experiments are more specific experiments carried out to evaluate isolated capabilities of the system given some set of initial conditions; field experiments are performed in real environments and not in simulated ones; comparison experiments test similar artifacts, instantiating in different ways the same function, to evaluate which instantiation better performs the desired function both in real-like and real environments; finally, controlled experiments are used to appraise advanced hypotheses on the behaviors of the artifact under test. Only controlled experiments are on a par with scientific theory-driven experiments in that they are carried out on the basis of some theoretical hypotheses under evaluation.

6.3 Explanation

A software test is considered successful when miscomputations are detected (assuming that no computational artifact is 100% correct). The next step is to find out what caused the execution to be incorrect rather than correct, that is, to trace back the fault (more familiarly named “bug”), before proceeding to the debugging phase and then testing the system again. In other words, an explanation of the observed miscomputation is to be advanced.

Efforts have been spent in analyzing explanations in computer science (Piccinini 2007; Piccinini & Craver 2011; Piccinini 2015; Angius & Tamburrini forthcoming) in relation to the different models of explanations elaborated in the philosophy of science. In particular, computational explanations can be understood as a specific kind of mechanist explanations (Glennan 1996; Machamer et al. 2000; Bechtel & Abrahamsen 2005), insofar as computing processes can be analyzed as mechanisms (Piccinini 2007, 2015; see also the entry on computation in physical systems). A mechanism can be defined in terms of “entities and activities organized such that they are productive of regular changes from start or set-up to finish or termination condition” (Machamer et al. 2000: 3), in other words, as a set of components, their functional capabilities, and their organization enabling them to bring about an empirical phenomenon. And a mechanistic explanation of such a phenomenon turns out to be the description of the mechanism that brings about that phenomenon, that is, the description of the involved components and functional organization. A computing mechanism is defined as a mechanism whose functional organization brings about computational processes. A computational process is to be understood here, in general terms, as a manipulation of strings, leading from input strings to output strings by means of operations on intermediate strings.

Consider a processor executing an instruction. The involved process can be understood as a mechanism whose components are state and combinatory elements in the processor instantiating the functions prescribed by the relevant hardware specifications (specifications for registers, for the Arithmetic Logic Unit etc.), organized in such a way that they are capable of carrying out the observed execution. Accordingly, providing the description of such a mechanism or, in other words, describing the functional organization of hardware components, counts as advancing a mechanist explanation of the observed computation, such as the explanation of an operational malfunction.

For every type of miscomputation defined in §7.5, a corresponding mechanist explanation can be defined at the adequate level of abstraction and with respect to the set of specifications characterizing that level of abstraction. Indeed, abstract descriptions of mechanisms still supply one with a mechanist explanation in the form of a mechanism schema, defined as “a truncated abstract description of a mechanism that can be filled with descriptions of known component parts and activities” (Machamer et al. 2000: 15). For instance, suppose the very common case in which a machine miscomputes by executing a program containing syntax errors, called slips (§7.5). The computing machine is unable to correctly implement the functional requirements provided by the program specifications. However, for explanatory purposes, it would be redundant to provide an explanation of the slip at the hardware level of abstraction, by advancing the detailed description of the hardware components and their functional organization. In such cases, a satisfactory explanation may consist in showing that the program’s code is not a correct instantiation of the provided program specifications (Angius & Tamburrini forthcoming). In these cases, in order to explain mechanistically a miscomputation that has occurred, it may be sufficient to provide the description of the incorrect program, abstracting from the rest of the computing mechanism (Piccinini & Craver 2011). Abstraction is a virtue not only in software development and specification, but also in the explanation of computational artifacts’ behaviors.

7. Correctness

One of the earliest philosophical disputes in computer science centers upon the nature of program correctness. The overall dispute was set in motion by two papers (De Millo et al. 1979; Fetzer 1988) and was carried on in the discussion forum of the ACM (e.g., Ashenhurst 1989; Technical Correspondence 1989). The pivotal issue derives from the duality of programs, and what exactly is being claimed to be correct relative to what. Presumably, if a program is taken to be a mathematical thing, then it has only mathematical properties. But seen as a technical artifact it has physical ones.

7.1 Mathematical Correctness

On the face of it, Hoare seems to be committed to what we shall call the mathematical perspective, i.e., that correctness is a mathematical affair: establishing that a program is correct relative to a specification involves only a mathematical proof.

Computer programming is an exact science in that all the properties of a program and all the consequences of executing it in any given environment can, in principle, be found out from the text of the program itself by means of purely deductive reasoning. (Hoare 1969: 576)

Consider our specification of a square root function. What does it mean for a program \(P\) to satisfy it? Presumably, relative to its abstract semantics, every program \(P\) carves out a relationship \(R_P\) between its input and output, its extension. The correctness condition insists that this relation satisfies the above specification, i.e.,

(C) \( \forall x: \textit{Real}\cdot \forall y:\textit{Real}\cdot x \ge 0 \rightarrow (R_P(x, y) \rightarrow y* y = x \textrm{ and } y \ge 0)\)

This demands that the abstract program, determined by the semantic interpretation of its language, satisfies the specification. The statement (C) is a mathematical assertion between two abstract objects and so, in principle, the correctness may be established mathematically. A mathematical relationship of this kind is surely what Hoare has in mind, and in terms of the abstract guise of the program, there is little to disagree with. However, there are two concerns here. One has to do with the complexity of modern software (the complexity challenge), and the other with the nature of physical correctness (the empirical challenge).

7.2 The Complexity Challenge

Programmers are always surrounded by complexity; we cannot avoid it. Our applications are complex because we are ambitious to use our computers in ever more sophisticated ways. Programming is complex because of the large number of conflicting objectives for each of our programming projects. If our basic tool, the language in which we design and code our programs, is also complicated, the language itself becomes part of the problem rather than part of its solution. (Hoare 1981: 10)

Within the appropriate mathematical framework, proving the correctness of any linguistic program, relative to its specification, is theoretically possible. However, real software is complex, and in such cases proving correctness might be practically infeasible. One might attempt to gain some ground by advocating that classical correctness proofs should be carried out by a theorem prover, or at least that one should be employed somewhere in the process. However, the theorem prover must itself be proven correct. While this may reduce the correctness problem to that of a single program, it still means that we are left with the correctness problem for a large program. Moreover, in itself this does not completely solve the problem. For both theoretical and practical reasons, human involvement is not completely eliminated in practice. In most cases, proofs are constructed by hand with the aid of interactive proof systems. Even so, a rigorous proof of correctness is rarely forthcoming. One might only require that individual correctness proofs be checked by a computer rather than a human. But of course the proof-checker is itself in need of checking. Arkoudas and Bringsjord (2007) argue that since there is only one correctness proof that needs to be checked, namely that of the proof checker itself, the possibility of mistakes is significantly reduced.
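Part of the appeal of a proof checker is how small it can be compared with the proofs it checks. As a hedged illustration, the sketch below checks proofs in a toy system whose only inference rule is modus ponens. The formula encoding and the name `check_proof` are assumptions made for this example and do not correspond to any particular proof assistant.

```python
# Formulas: an atom is a string; an implication A -> B is the tuple ("->", A, B).

def check_proof(premises, steps, goal) -> bool:
    """Accept iff every step is a premise or follows by modus ponens
    from two earlier steps, and the last step is the goal."""
    derived = []
    for formula in steps:
        is_premise = formula in premises
        by_modus_ponens = any(
            ("->", a, formula) in derived for a in derived
        )
        if not (is_premise or by_modus_ponens):
            return False
        derived.append(formula)
    return bool(derived) and derived[-1] == goal


premises = ["p", ("->", "p", "q"), ("->", "q", "r")]
good_proof = ["p", ("->", "p", "q"), "q", ("->", "q", "r"), "r"]
bad_proof = ["p", "r"]  # "r" is asserted without justification

print(check_proof(premises, good_proof, "r"))  # True
print(check_proof(premises, bad_proof, "r"))   # False
```

Every proof accepted by such a checker is only as trustworthy as the dozen or so lines of the checker itself, which is the single program whose correctness remains to be established.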

This is very much a practical issue. However, there is a deeper conceptual one. Are proofs of program correctness genuine mathematical proofs, i.e., are such proofs on a par with standard mathematical ones? De Millo et al. (1979) claim that correctness proofs are unlike proofs in mathematics. The latter are conceptually interesting, compelling and attract the attention of other mathematicians who want to study and build upon them. This argument parallels the graspability arguments made in the philosophy of mathematics. Proofs that are long, cumbersome, and uninteresting cannot be the bearers of the kind of certainty that is attributed to standard mathematical proofs. The nature of the knowledge obtained from correctness proofs is said to be different to the knowledge that may be gleaned from standard proofs in mathematics. In order to be taken in, proofs must be graspable. Indeed, Wittgenstein would have it that proofs that are not graspable cannot act as norms, and so are not mathematical proofs (Wittgenstein 1956).

Mathematical proofs such as the proof of Gödel’s incompleteness theorem are also long and complicated. But they can be grasped. What renders such complicated proofs transparent, interesting, and graspable involves the use of modularity techniques (e.g., lemmas), and the use of abstraction in the act of mathematical creation. The introduction of new concepts enables a proof to be constructed gradually, thereby making the proofs surveyable. Mathematics progresses by inventing new mathematical concepts that facilitate the construction of proofs that would be far more complex and even impossible without them. Mathematics is not just about proof; it also involves the abstraction and creation of new concepts and notation. In contrast, formal correctness proofs do not seem to involve the creation of new concepts and notations. While computer science does involve abstraction, it is not quite in the same way.

One way of addressing the complexity problem is to change the nature of the game. The classical notion of correctness links the formal specification of a program to its formal semantic representation. It is at one end of the mathematical spectrum. However, chains of specification-artifact pairings, positioned at varying degrees of abstraction, are governed by different notions of correctness. For example, in the object-oriented approach, the connection between a UML specification and a Java program is little more than type checking. The correctness criteria involve structural similarities and identities (Gamma et al. 1994). Here, we do not demand that one infinite mathematical relation is extensionally governed by another. At higher levels of abstraction, we may have only connections of structure. These are still mathematical relationships. However, such methods, while they involve less work, and may even be automatically verified, establish much less.
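How little a purely structural notion of correctness establishes can be sketched with a structural type check. The example uses Python's `typing.Protocol` as a stand-in for the UML-to-Java case discussed above, which is an assumption of this illustration. The classes `Newton` and `AlwaysZero` are hypothetical.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SquareRoot(Protocol):
    """Structural specification: only the shape of the method is fixed."""

    def sqrt(self, x: float) -> float: ...


class Newton:
    def sqrt(self, x: float) -> float:
        y = x if x > 0 else 1.0
        for _ in range(50):  # Newton iteration for y*y = x
            y = 0.5 * (y + x / y)
        return y


class AlwaysZero:
    def sqrt(self, x: float) -> float:
        return 0.0  # right structure, wrong behavior


# The runtime structural check only asks whether a `sqrt` method exists.
print(isinstance(Newton(), SquareRoot))      # True
print(isinstance(AlwaysZero(), SquareRoot))  # True
```

Both classes conform to the structural specification, although only one of them computes square roots. The check is cheap and automatic, but it says nothing about the input-output relation.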

7.3 The Empirical Challenge

The notion of program verification appears to trade upon an equivocation. Algorithms, as logical structures, are appropriate subjects for deductive verification. Programs, as causal models of those structures, are not. The success of program verification as a generally applicable and completely reliable method for guaranteeing program performance is not even a theoretical possibility. (Fetzer 1988: 1)

In fact, this issue is alluded to by Hoare in the very text that Fetzer employs to characterize Hoare’s mathematical stance on correctness.

When the correctness of a program, its compiler, and the hardware of the computer have all been established with mathematical certainty, it will be possible to place great reliance on the results of the program, and predict their properties with a confidence limited only by the reliability of the electronics. (Hoare 1969: 579)

All parties seem to agree that computational systems are at bottom physical systems, and that some unpredictable behavior may arise because of the causal connections. Indeed, even when theorem provers and proof checkers are used, the results still only yield empirical knowledge. A proof checker is a program running on a physical machine. It is a program that has been implemented and its results depend upon a physical computation. Consequently, at some level, we shall need to show that some physical machine operations meet their specification. Testing and verification seem only to yield empirical evidence. Indeed, the complexity of program proving has led programmers to take physical testing to be evidence that the abstract program meets its specification. Here, the assumption is that the underlying implementation is correct. But prima facie, it is only empirical evidence.

In apparent contrast, Burge (1998) argues that knowledge of such computer proofs can be taken as a priori knowledge. According to Burge, a priori knowledge does not depend for its justification on any sensory experience. However, he allows that a priori knowledge may depend for its possibility on sensory experience; e.g., knowledge that red is a color may be a priori even though having this knowledge requires having sensory experience of red in order to have the concepts required to even formulate the idea. If correct, this closes the gap between a priori and a posteriori claims about computer-assisted correctness proofs, but only by redrawing the boundary between a priori and a posteriori knowledge so that some empirical assertions can fall into the former category. For more discussion on the nature of the use of computers in mathematical proofs, see Hales 2008; Harrison 2008; Tymoczko 1979, 1980.

Unfortunately, practice often does not even get this far. Generally, software engineers do not construct classical correctness proofs by hand or even automatically. Testing of software against its specification on suites of test cases is the best that is normally achieved. Of course, this never yields correctness in the mathematical sense. Test cases can never be exhaustive (Dijkstra 1974). Furthermore, there is a hidden assumption that the underlying implementation is correct: at best, these empirical methods tell us something about the whole system. Indeed, the size of the state space of a system may be so large and complex that even direct testing is infeasible. In practice, the construction of mathematical models that approximate the behavior of complex systems is the best we can do.
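The gap between passing a test suite and being correct can be shown with a small hypothetical case. In the sketch below, the function `is_leap_year` and the test suite are invented for illustration. The intended specification is the Gregorian leap-year rule.

```python
def is_leap_year(year: int) -> bool:
    """Faulty candidate: ignores the century rules of the Gregorian calendar."""
    return year % 4 == 0


# Hypothetical test suite: every case passes.
test_suite = {1996: True, 2000: True, 2023: False, 2024: True}
print(all(is_leap_year(y) == expected for y, expected in test_suite.items()))  # True

# An input outside the suite exposes the fault: 1900 was not a leap year.
print(is_leap_year(1900))  # True, although the specification requires False
```

The suite provides evidence of correctness on four inputs and no information about the rest of the input space.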

The whole correctness debate carried out in the forum of the ACM (e.g., Ashenhurst 1989; Technical Correspondence 1989) is put into some perspective when programs are considered as technical artifacts. But this leaves one further topic: When we have reached physical structure, what notion of correctness operates?

7.4 Physical Correctness

What is it for a physical device to meet its specification? What is it for it to be a correct physical implementation? The starting point for much contemporary analysis is often referred to as the simple mapping account.

According to the simple mapping account, a physical system \(S\) performs as a correct implementation of an abstract specification \(C\) just in case (i) there is a mapping from the states ascribed to \(S\) by a physical description to the states defined by the abstract specification \(C\), such that (ii) the state transitions between the physical states mirror the state transitions between the abstract states. Clause (ii) requires that for any abstract state transition of the form \(s_1 \rightarrow s_2\), if the system is in the physical state that maps onto \(s_1\), it then goes into the physical state that maps onto \(s_2\).

To illustrate what the simple mapping account amounts to, we consider the example of our abstract machine (§2.1) where we employ an instance of the machine that has only two locations \(l\) and \(r\), and two possible values 0 and 1. Consequently, we have only four possible states (0, 0), (0, 1), (1, 1), and (1, 0). The computation table for the update operation may be easily computed by hand, and takes the form of a table with input-output pairings. For example, Update\((r,1)\) sends the state (0, 0) to the state (0, 1). The simple mapping account only demands that the physical system can be mapped onto the abstract one in such a way that the abstract state transitions are duplicated in the physical version.

Unfortunately, such a device is easy to come by: Almost anything with enough things to play the role of the physical states will satisfy this quite weak demand of what it is to be an implementation. For example, any collection of colored stones arranged as the update table will be taken to implement the table. The simple mapping account only demands extensional agreement. It is a de facto demand. This leads to a form of pancomputationalism where almost any physical system implements any computation.
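The two-location machine and the colored-stones objection can be sketched together. In the code below, the encoding of states as pairs (value at \(l\), value at \(r\)), the stone colors, and the function `implements` are assumptions of this illustration. The check encodes clauses (i) and (ii) of the simple mapping account and nothing more.

```python
from itertools import product

LOCATIONS = ("l", "r")
VALUES = (0, 1)
STATES = list(product(VALUES, repeat=2))  # (value at l, value at r)


def update(state, location, value):
    """Abstract Update(location, value) on a two-location store."""
    l, r = state
    return (value, r) if location == "l" else (l, value)


# Abstract transition table: (state, location, value) -> next state.
abstract_table = {
    (s, loc, v): update(s, loc, v)
    for s in STATES for loc in LOCATIONS for v in VALUES
}

# A hypothetical "physical system": four colored stones and an
# arrangement that was simply laid out to copy the table.
stone_of = dict(zip(STATES, ("red", "green", "blue", "white")))
stone_table = {
    (stone_of[s], loc, v): stone_of[t] for (s, loc, v), t in abstract_table.items()
}
mapping = {stone: s for s, stone in stone_of.items()}  # physical -> abstract


def implements(physical_table, mapping, abstract_table) -> bool:
    """Simple mapping account: mapped physical transitions mirror abstract ones."""
    return all(
        mapping[physical_table[(p, loc, v)]] == abstract_table[(mapping[p], loc, v)]
        for (p, loc, v) in physical_table
    )


print(abstract_table[((0, 0), "r", 1)])                  # (0, 1)
print(implements(stone_table, mapping, abstract_table))  # True
```

The stones pass the check by construction, since their arrangement was read off the abstract table. Nothing in the check asks whether the stones would have behaved this way under other circumstances.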

The danger of pancomputationalism has driven some authors (D.J. Chalmers 1996; Egan 1992; Sprevak 2012) to attempt to provide an account of implementation that somehow restricts the class of possible interpretations. In particular, certain authors (D.J. Chalmers 1996; Copeland 1996) seek to impose causal constraints on such interpretations. One suggestion is that we replace the material conditional (if the system is in the physical state \(S_1\) …) by a counterfactual one. In contrast, the semantic account insists that a computation must be associated with a semantic aspect which specifies what the computation is to achieve (Sprevak 2012). For example, a physical device could be interpreted as an AND gate or an OR gate. It would seem to depend upon what we take to be the definition of the device. Without such there is no way of fixing what the artifact is. The syntactic account demands that only physical states that qualify as syntactic may be mapped onto computational descriptions, thereby qualifying as computational states. If a state lacks syntactic structure, it is not computational. Of course, what remains to be seen is what counts as a syntactic state. A good overview can be found in (Piccinini 2015; see also the entry on computation in physical systems).
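The AND/OR example can be spelled out as follows. This is a sketch under stated assumptions: the voltage labels, the function `device`, and the two interpretation mappings are hypothetical, and the physical behavior is stipulated rather than measured.

```python
from itertools import product

LOW, HIGH = "0V", "5V"  # hypothetical physical states of the wires


def device(a: str, b: str) -> str:
    """Physical behavior: output is HIGH only when both inputs are HIGH."""
    return HIGH if (a == HIGH and b == HIGH) else LOW


def truth_table(interpretation):
    """Read the device through a mapping from voltages to truth values."""
    return {
        (interpretation[a], interpretation[b]): interpretation[device(a, b)]
        for a, b in product((LOW, HIGH), repeat=2)
    }


high_is_true = {LOW: False, HIGH: True}
low_is_true = {LOW: True, HIGH: False}

and_table = {(p, q): p and q for p, q in product((False, True), repeat=2)}
or_table = {(p, q): p or q for p, q in product((False, True), repeat=2)}

print(truth_table(high_is_true) == and_table)  # True
print(truth_table(low_is_true) == or_table)    # True
```

The same physical behavior counts as conjunction under one reading and as disjunction under the other, so the physics alone does not settle which function the device computes.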

Turner (2012) argues that abstract structure and physical structure are linked, not just by being in agreement, but also by the intention to take the former as having normative governance over the latter. On this account, computations are technical artifacts whose function is fixed by an abstract specification. This relationship is neither that of theory to physical object nor that of syntactic thing to semantic interpretation.

But there is an ambiguity here that is reflected in the debate between those who argue for semantic interpretation (Sprevak 2012), and those who argue against it (Piccinini 2008). Consider programs. What is the function of a program? Is it fixed by its semantic interpretation, or is it fixed by its specification? The ambiguity here concerns the function of a program as part of a programming language or its role as part of a larger system. As a program in a language, it is fixed by the semantics of the language as a whole. However, to use a program as part of a larger system, one only needs to know what it does. The function of the program, as part of a larger system, is given by its specification. When a computation is picked out by a specification, exactly how the program achieves its specification is irrelevant to the system designer. The specification acts as an interface, and the level of abstraction employed by the system designer is central.

7.5 Miscomputations

It follows from what has been said so far that the correctness of implemented programs does not automatically establish the well-functioning of a computational artifact. Turing (1950) already distinguished between errors of functioning and errors of conclusion. The former are caused by a faulty implementation that is unable to execute the instructions of some high-level language program. Errors of conclusion characterize correct abstract machines that nonetheless fail to carry out the tasks they were supposed to accomplish. This may happen in those cases in which the specifications a program is correctly instantiating do not properly express users’ requirements on such a program. In both cases, machines implementing correct programs can still be said to miscompute.

Turing’s distinction between errors of functioning and errors of conclusion has been expanded into a complete taxonomy of miscomputations (Fresco & Primiero 2013). The classification is based on the different levels of abstraction one may identify in the software development process. The functional specification level refers to the functional requirements a computational artifact should fulfill, which are advanced by users, companies, software architects, or other stakeholders expressing constraints on the allowed behaviors of the system to be realized. At the design specification level, those requirements are more formally expressed in terms of a system design description detailing the system’s states and the conditions allowing for transitions among those states. A design specification is, in turn, instantiated in a proper algorithm, usually using some high-level programming language, at the algorithm design level. At the algorithm implementation level, algorithms can be implemented either in software, by means of assembly language and machine code instructions, or directly in hardware, the latter being the case for many special purpose machines. Finally, the algorithm execution level refers to runtime executions.

Errors can be conceptual, material, or performable. Conceptual errors violate validity conditions requiring consistency for specifications expressed in propositional conjunctive normal form; material errors violate the correctness requirements of programs with respect to the set of their specifications; and performable errors arise when physical constraints are breached by some faulty implementing hardware.

Performable errors clearly emerge only at the algorithm execution level, and they correspond with Turing’s (1950) error of functioning, also called operational malfunctions. Conceptual and material errors may arise at any level of abstraction from functional specification level down to the algorithm implementation level. Conceptual errors engender mistakes, while material errors can induce failures. For instance, a mistake at the functional specification level consists of an inconsistent set of requirements, or at the algorithm implementation level it may correspond to an invalid hardware design (such as in the choice of the logic gates for the truth-functional connectives). And failures occurring at the design specification level may be due to a design that is deemed to be incomplete with respect to the set of functional requirements expressed at the functional specification level while a failure at the algorithm design level occurs in those frequent cases in which a program is found not to fulfill its specifications. Beyond mistakes, failures, and operational malfunctions, slips are a source of miscomputations at the algorithm implementation level. Slips may be conceptual or material errors due to, respectively, a syntactic or a semantic flaw in the software implementation of algorithms. Conceptual slips appear in all those cases in which the syntactical rules of the programming languages are violated; material slips involve the violation of the semantic rules of programming languages, such as when a variable is used but not initialized.
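The two kinds of slip can be illustrated in a few lines. This is only a sketch: the functions `area` and `total` are hypothetical, and Python happens to detect the uninitialized variable at run time, whereas other languages reject such a program before execution.

```python
# Conceptual slip: the text violates the syntactic rules of the language,
# so it is rejected before anything is executed.
try:
    compile("def area(w, h) return w * h", "<slip>", "exec")
    conceptual_slip = None
except SyntaxError as err:
    conceptual_slip = type(err).__name__


# Material slip: the text is well formed, but a variable is used before
# it has been given a value.
def total(prices):
    for p in prices:
        result = result + p  # `result` was never initialized
    return result


try:
    total([1, 2, 3])
    material_slip = None
except UnboundLocalError as err:
    material_slip = type(err).__name__

print(conceptual_slip)  # SyntaxError
print(material_slip)    # UnboundLocalError
```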

Abstract machines […] are incapable of errors of functioning. In this sense we can truly say that “machines can never make mistakes”. Errors of conclusion can only arise when some meaning is attached to the output signals from the machines. (Turing 1950: 449)

On the basis of Turing’s remark, a distinction can be made between dysfunctions and misfunctions of technical artifacts (Floridi, Fresco, & Primiero 2015). Software can only misfunction but cannot ever dysfunction. An artifact token dysfunctions when it is not able to perform the task(s) it was designed for; and an artifact token misfunctions in case it is able to perform the required task(s) but is prone to manifest some undesired side-effects.

Software development is characterized by more levels of abstraction than one can find in any other artifact’s production cycle. The production of typical artifacts only involves the functional specification level and the design specification level; after design, technical artifacts are physically implemented. As seen above, software development is also characterized by the algorithm implementation level, that is, the designed algorithm has to be instantiated in some high-level language program before hardware implementation. An artifact token can dysfunction in case the physical implementation fails to satisfy functional specifications or design specifications. Dysfunctions only apply to single tokens since a token dysfunctions in that it does not behave as the other tokens of the same type do with respect to the implemented functions. For this reason, dysfunctions do not apply to the functional specification level and the design specification level. By contrast, both artifact types and tokens can misfunction, since misfunctions do not depend on comparisons with tokens of the same type being able to perform some implemented function or not. Misfunction of tokens usually depends on the dysfunction of some other component, while misfunction of types is often due to poor design.

A software token cannot dysfunction, because all tokens of a given type implement functions specified at functional specification level and design specification level in the very same way. This is due to the fact that those functions are implemented at algorithm implementation level before being performed at the algorithm execution level; in case of correct implementation, all tokens will behave correctly at the algorithm execution level (provided that no operational malfunction occurs). For the very same reason, software tokens cannot misfunction, since they are equal implementations of the same design and specifications at algorithm implementation level. Only software types can misfunction in case of poor design; misfunctioning software types are able to correctly perform their functions but may also produce some undesired side-effect.

8. Abstraction

Abstraction facilitates computer science. Without it we would not have progressed from the programming of numerical algorithms to the software sophistication of air traffic control systems, interactive proof development frameworks, and computer games. It is manifested in the rich type structure of contemporary programming and specification languages, and underpins the design of these languages with their built-in mechanisms of abstraction. It has driven the invention of notions such as polymorphism, data abstraction, classes, schema, design patterns, and inheritance. But what is the nature of abstraction in computer science? Is there just one form of it? Is it the same notion that we find in mathematics?

8.1 Abstraction in Computer Science

Computer science abstraction takes many different forms. We shall not attempt to describe these in any systematic way here. However, Goguen (Goguen & Burstall 1985) describes some of this variety, of which the following examples are instances.

One kind involves the idea of repeated code: A program text, possibly with a parameter, is given a name (procedural abstraction). In Skemp’s terms, the procedure brings a new concept into existence, where the similarity of structure is the common code. Formally, this is the abstraction of the lambda calculus (see the entry on the lambda calculus). The parameter might even be a type, and this leads to the various mechanisms of polymorphism, which may be formalized in mathematical theories such as the second order lambda calculus (Hankin 2004).
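Both steps, naming a repeated program text and then abstracting over a type, can be sketched briefly. The function `twice` and the type variable `T` are hypothetical names chosen for this illustration.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Repeated code before abstraction:
#   y1 = (3 + 1) + 1
#   y2 = (7 + 1) + 1
# Procedural abstraction names the common text and abstracts the varying part.
def twice(f: Callable[[T], T], x: T) -> T:
    """Apply f two times; T is a type parameter, so the code is polymorphic."""
    return f(f(x))


print(twice(lambda n: n + 1, 3))         # 5
print(twice(lambda s: s + "!", "stop"))  # stop!!
```

The same definition serves integers and strings because nothing in its body depends on what `T` is.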

Recursion is an early example of operation or mechanism abstraction: It abstracts away from the mechanisms of the underlying machine. In turn, this facilitates the solution of complex problems without having to be aware of the operations of the machine. For example, recursion is implemented in devices such as stacks, but in principle the user of recursion does not need to know this.
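The point can be seen by writing the same function twice, once recursively and once with the stack made explicit. This is a minimal sketch in which nested Python lists stand in for trees. The function names are hypothetical.

```python
def depth(tree) -> int:
    """Recursive definition: no mention of any stack."""
    if not isinstance(tree, list):
        return 0
    return 1 + max((depth(child) for child in tree), default=0)


def depth_with_stack(tree) -> int:
    """Same function with the bookkeeping made explicit."""
    deepest = 0
    stack = [(tree, 0)]  # pending (node, number of lists enclosing it)
    while stack:
        node, level = stack.pop()
        if isinstance(node, list):
            deepest = max(deepest, level + 1)
            stack.extend((child, level + 1) for child in node)
    return deepest


nested = [1, [2, [3, 4]], [5]]
print(depth(nested), depth_with_stack(nested))  # 3 3
```

The recursive version states what is to be computed. The second version also records how the pending work is stored, which is the detail that recursion lets the programmer ignore.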

The type structure of a programming or specification language determines the ontology of the language: the kinds of entity that we have at our disposal for representation and problem solving. To a large extent, types determine the level of abstraction of the language. A rich set of type constructors provides an expressive system of representation. Abstract and recursive types are common examples.
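A recursive type can be sketched as follows. The names `Leaf`, `Node`, `Tree`, and `sum_leaves` are hypothetical and serve only to show how a type constructor introduces a new kind of entity into the language.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    value: int


@dataclass(frozen=True)
class Node:
    left: "Tree"
    right: "Tree"


# Recursive type: a Tree is a Leaf or a Node built from two Trees.
Tree = Union[Leaf, Node]


def sum_leaves(tree: Tree) -> int:
    """Functions over the type follow the shape of its definition."""
    if isinstance(tree, Leaf):
        return tree.value
    return sum_leaves(tree.left) + sum_leaves(tree.right)


print(sum_leaves(Node(Leaf(1), Node(Leaf(2), Leaf(3)))))  # 6
```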

In object-oriented design, patterns (Gamma et al. 1994) are abstracted from the common structures that are found in software systems. Here, abstraction is the means of interfacing: It dissociates the implementation of an object from its specification. For example, abstract classes act as interfaces by providing nothing but the type structure of their methods.
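An abstract class used as an interface might look as follows. This is a sketch under stated assumptions: `Stack`, `ListStack`, and `reverse` are hypothetical names, and Python's `abc` module stands in for whatever mechanism a given object-oriented language provides.

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """Interface only: the type structure of the methods, no implementation."""

    @abstractmethod
    def push(self, item: int) -> None: ...

    @abstractmethod
    def pop(self) -> int: ...


class ListStack(Stack):
    """One possible implementation, hidden behind the interface."""

    def __init__(self) -> None:
        self._items = []

    def push(self, item: int) -> None:
        self._items.append(item)

    def pop(self) -> int:
        return self._items.pop()


def reverse(values: list, stack: Stack) -> list:
    """Client code written against the interface alone."""
    for v in values:
        stack.push(v)
    return [stack.pop() for _ in values]


print(reverse([1, 2, 3], ListStack()))  # [3, 2, 1]
```

The client `reverse` depends only on the signatures declared in `Stack`. The abstract class itself cannot be instantiated, since it supplies no behavior of its own.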

In addition, in mathematics (Mitchelmore & White 2004), computer science, and philosophy (Floridi 2008) there are levels of abstraction. Abstractions in mathematics are piled upon each other in a never-ending search for more and more abstract concepts. Likewise, computer science deals with the design and construction of artifacts through a complex process involving sequences of artifacts of decreasing levels of abstractness, until one arrives at the actual physical device.

8.2 Information Hiding

In mathematics, once the abstraction is established, the physical device is left behind. On this account, the abstraction is self-contained: An abstract mathematical object takes its meaning only from the system within which it is defined. The only constraint is that the new objects be related to each other in a consistent system that can be operated on without reference to their previous meaning. Self-containment is paramount. There are no leaks.

Some argue that, in this respect at least, abstraction in computer science is fundamentally different to abstraction in mathematics (Colburn & Shute 2007). They claim that computational abstraction must leave behind an implementation trace. Information is hidden but not destroyed. Any details that are ignored at one level of abstraction (e.g., programmers need not worry about the precise location in memory associated with a particular variable) must be handled by at least one of the lower levels of abstraction (e.g., the virtual machine handles all memory allocations). At all levels, computational artifacts crucially depend upon the existence of an implementation. For example, even though classes hide the implementation details of their methods, those methods, except for abstract ones, must have implementations. This is in keeping with the view that computational artifacts have both function and structure: Computational abstractions have both an abstract guise and an implementation.

However, matters are not quite so clean cut. While it is true that abstraction in mathematics generates objects whose meaning is defined by their relationships, the same is so in computer science. Abstract notions could not have a normative function unless they had such independent meanings. Moreover, certain forms of constructive mathematics resemble computer science in that there has to be an implementation trace: one must always be able to recover implementation information from proofs by reading between the lines. Of course, this is not the case for classical mathematics.

Moreover, many would argue that mathematical abstractions do not completely leave behind their physical roots.

One aspect of the usefulness of mathematics is the facility with which calculations can be made: You do not need to exchange coins to calculate your shopping bill, and you can simulate a rocket journey without ever firing one. Increasingly powerful mathematical theories (not to mention the computer) have led to steady gains in efficiency and reliability. But a calculational facility would be useless if the results did not predict reality. Predictions are successful to the extent that mathematical models appropriate aspects of reality and whether they are appropriate can be validated by experience. (Mitchelmore & White 2004: 330)

How is it that the axiomatic method has been so successful in this way? The answer is, in large part, because the axioms do indeed capture meaningful and correct patterns. … There is nothing to prevent anyone from writing down some arbitrary list of postulates and proceeding to prove theorems from them. But the chance of those theorems having any practical application [is] slim indeed. … Many fundamental mathematical objects (especially the more elementary ones, such as numbers and their operations) clearly model reality. Later developments (such as combinatorics and differential equations) are built on these fundamental ideas and so also reflect reality even if indirectly. Hence all mathematics has some link back to reality. (Devlin 1994: 54–55)

It would appear that the difference between abstraction in computer science and abstraction in mathematics is not so sharp. However, there appears to be an important conceptual difference. If Turner (2011) is right, in computer science, the abstract partner is the dominant one in the relationship: It determines correctness. In the case of (applied) mathematics, things are reversed: The mathematics is there to model the world, and it must model it accurately. In computer science, the relationship between the abstraction and its source is the specification-artifact relationship; in mathematics, it is between, on the one hand, model or theory, and, on the other hand, reality. When things go wrong the blame is laid at a different place: with the artifact in computer science but with the model in mathematics.

9. The Epistemological Status of Computer Science

The problem of defining the epistemological status of computer science arose as soon as computer science became an independent discipline, distinct from mathematics, between the 1960s and the 1970s (Tedre 2011). Since the 1970s it has been clear that computer science has to be considered partially as a mathematical discipline, partially as a scientific discipline, and partially as an engineering discipline, insofar as it makes use of mathematical, empirical, and engineering methods (Tedre & Sutinen 2008). Nonetheless, a debate took place concerning whether computer science should be considered primarily a mathematical discipline, a branch of engineering, or a scientific discipline.

9.1 Computer Science as a Mathematical Discipline

Each epistemological characterization of computer science is based on ontological, methodological, and epistemological commitments, that is, on assumptions about the nature of computational artifacts, the methods involved in the software development process, and the kind of reasoning thereby involved, whether deductive, inductive, or a combination of them (Eden 2007).

Proponents of the mathematical nature of computer science assume that programs are mathematical entities about which one can pursue purely deductive reasoning provided by the formal methods of theoretical computer science. As examined in §4.2 and §5.1, Dijkstra (1974) and Hoare (1986) were very explicit in stating that programs’ instructions can be acknowledged as mathematical sentences and that a formal semantics for programming languages can be given in terms of an axiomatic system (Hoare 1969). Provided that program specifications be advanced in a formal language, and provided that a program’s code be represented in the same formal language, formal semantics provide a means by which to prove correctness. Accordingly, knowledge about behaviors of computational artifacts is acquired by the deductive reasoning involved in mathematical proofs of correctness.

The basis for such rationalist optimism (Eden 2007) about what can be known about computing systems is that they are artifacts, that is, human-made systems, and that, as such, one can predict their behaviors with certainty (Knuth 1974b).

The original motivation for a mathematical analysis of computation came from mathematical logic. Its origins are to be found in Hilbert’s question concerning the decidability of predicate calculus (Hilbert & Ackermann 1928): could there be an algorithm, a procedure, for deciding of an arbitrary sentence of the logic whether it is provable (The Entscheidungsproblem)? In order to address this question, a rigorous model of the informal concept of an effective or mechanical method in logic and mathematics was required. Providing this is first and foremost a mathematical endeavor: one has to develop a mathematical analogue of the informal notion.

Although a central concern of theoretical computer science, the topics of computability and complexity are covered in existing entries on the “Church-Turing thesis”, “computational complexity theory”, and “recursive functions”.

9.2 Computer Science as an Engineering Discipline

In the 1970s, the growing complexity of programs, the increasing number of applications of software systems in everyday contexts, and the consequent boom in market demand shifted the interests of computer scientists, both academics and practitioners, from proofs of programs’ correctness to methods for managing the complexity of those systems and evaluating their reliability (Wegner 1976). Indeed, providing formal specifications of modular programs, representing highly complex programs in the same formal language, and providing inputs for systems that are often embedded and interacting with users is practically impossible. It turned out that providing mathematical proofs of correctness was mostly infeasible. Computer science research instead developed toward testing techniques able to provide a statistical evaluation of correctness, often called reliability (Littlewood & Strigini 2000), in terms of estimations of distributions of errors in a program’s code.
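A statistical evaluation of this kind can be sketched as random testing against an assumed operational profile. Everything in the example is hypothetical: the faulty `program`, the uniform input distribution, and the helper `estimate_failure_rate`. Real reliability assessment uses considerably more refined models.

```python
import random


def program(x: int) -> int:
    """Hypothetical program under test: should double x, but has a fault."""
    return 2 * x + (1 if x % 50 == 0 else 0)


def estimate_failure_rate(trials: int, seed: int = 0) -> float:
    """Fraction of randomly drawn inputs on which the output is wrong."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        x = rng.randrange(1000)  # assumed operational profile: uniform on 0..999
        if program(x) != 2 * x:
            failures += 1
    return failures / trials


rate = estimate_failure_rate(10_000)
print(f"estimated failure rate: {rate:.3f}")  # close to the true rate of 0.020
```

The result is an estimate of how often the program fails under the assumed profile. It is not a claim that the program is correct, and it changes if the profile changes.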

Computer science evaluates the reliability of computing systems in the same way that civil engineering does for bridges or that aerospace engineering does for airplanes (DeMillo et al. 1979). In particular, whereas empirical sciences examine what exists, computer science focuses on what can exist, that is, on how to produce artifacts, and it should therefore be acknowledged as an “engineering of mathematics” (Hartmanis 1981). Similarly, whereas scientific inquiries are involved in discovering laws concerning the studied phenomena, one cannot identify proper laws in computer science practice, insofar as the latter is instead involved in the production of the phenomena to be studied, that is, those concerning computational artifacts (Brooks 1996).

9.3 Computer Science as a Scientific Discipline

Software testing and reliability measuring techniques are nonetheless known for their inability to assure the absence of code faults (Dijkstra 1970). In many cases, and especially in the evaluation of so-called safety-critical systems (such as controllers of airplanes, rockets, nuclear plants, etc.), both formal methods and empirical testing are used to evaluate the correctness and the dependability of computational artifacts. Computer science can accordingly be understood as a scientific discipline in that it makes use of both deductive and inductive probabilistic reasoning to examine computational artifacts (Denning et al. 1981; Denning 2005, 2007; Tichy 1998; Colburn 2000). Indeed, as examined in §6, verification and testing methods are often jointly involved in advancing hypotheses on the behaviors of implemented computing systems, and providing evidence (either algorithmically or empirically) in support of those hypotheses.
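The contrast between testing and deductive evidence can be made concrete with a small, purely illustrative sketch. The function `saturating_add`, its seeded fault, and the test suite are hypothetical. The exhaustive check over a finite input domain stands in here for a formal argument against the specification; it is an assumption of the example, not a method described in the sources cited above.

```python
def saturating_add(a: int, b: int) -> int:
    """Hypothetical 8-bit saturating adder with a single seeded fault."""
    if (a, b) == (200, 55):
        return 0  # fault: the specification requires 255 here
    return min(a + b, 255)

def specification(a: int, b: int) -> int:
    """Intended behavior: the sum, capped at 255."""
    return min(a + b, 255)

# A plausible but finite test suite: (a, b, expected result).
test_suite = [
    (0, 0, 0),
    (1, 2, 3),
    (100, 100, 200),
    (255, 1, 255),
    (200, 100, 255),
]

def passes(tests) -> bool:
    """Inductive evidence: every sampled case agrees with its expected value."""
    return all(saturating_add(a, b) == expected for a, b, expected in tests)

def counterexamples():
    """Check all 256 * 256 inputs against the specification."""
    return [
        (a, b)
        for a in range(256)
        for b in range(256)
        if saturating_add(a, b) != specification(a, b)
    ]

print("test suite passes:", passes(test_suite))
print("counterexamples found exhaustively:", counterexamples())
```

The passing test suite supports the hypothesis that the program is correct without establishing it, whereas the exhaustive check refutes the hypothesis with a specific input. For realistic programs the input domain is too large for such enumeration, which is one reason both kinds of evidence are used together.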

The thesis that computer science is, from a methodological viewpoint, on a par with empirical sciences traces back to Newell, Perlis, and Simon’s 1967 letter to Science (Newell et al. 1967) and dominated all the 1980s (Wegner 1976). In the 1975 Turing award lecture, Newell and Simon argued:

Computer science is an empirical discipline. We would have called it an experimental science, but like astronomy, economics, and geology, some of its unique forms of observation and experience do not fit a narrow stereotype of the experimental method. Nonetheless, they are experiments. Each new machine that is built is an experiment. Actually constructing the machine poses a question to nature; and we listen for the answer by observing the machine in operation and analyzing it by all analytical and measurement means available. (Newell & Simon 1976: 114)

Since Newell and Simon’s Turing award lecture, it has been clear that computer science can be understood as an empirical science but of a special sort, and this is related to the nature of experiments in computing. Indeed, much current debate on the epistemological status of computer science concerns the problem of defining what kind of science it is (Tedre 2011) and, in particular, the nature of experiments in computer science (Schiaffonati & Verdicchio 2014), the nature, if any, of laws and theorems in computing (Hartmanis 1993; Rombach & Seelish 2008), and the methodological relation between computer science and software engineering (Gruner 2011).

10. Computer Ethics

Computer ethics is the analysis of the nature and social impact of computer t