For about 60 years, sociologists who draw on Garfinkel’s ethnomethodology have explored “structures of social action” (Atkinson and Heritage 1985). These sociologists began their research with detailed inspections of talk. At the forefront of this research was Harvey Sacks who, initially together with Garfinkel, developed and applied conversation analysis to reveal the organization of action (Sacks 1966, 1992; Sacks et al. 1974). Conversation analysts consider utterances in talk, however small they might be, as action, and with Garfinkel they assume that the meaning of an action is constituted in, and through, the sequential organization of action. Thus, they further develop Garfinkel’s notion of the recursive relationship between actions by arguing that such an organization is the basis for meaning to arise moment by moment. In this view, meaning is not intrinsic to action; it arises retrospectively in the context that is provided by the previous action and prospectively in that it creates the context for the next action (see Cicourel 1973; Heritage 1984).

The detailed examination of talk relies on audio-recordings as principal data. Whilst the recording of actions creates distance between the ethnomethodologist and the field, the possibility of reviewing short fragments of data repeatedly and of examining still frames provides ethnomethodologists with unprecedented closeness to the action. Recordings cannot replace the researcher’s “existential engagement” (Honer and Hitzler 2015) with the field, but they allow ethnomethodologists to reconstruct the prospective and retrospective orientation of action. Thus, the data enable the recovery of the participants’ attitude to the situation at hand.

The analysis necessitates a focus on short fragments of talk that are transcribed to aid the uncovering of the organization of utterances. The transcription helps the researcher to make intelligible the sequential organization of utterances as participants produced them (Hepburn and Bolden 2017; Jefferson 1984). Ethnomethodologists pursuing conversation analysis consider the participants themselves as conversation analysts who inspect each other’s actions as a basis for the production and design of their own actions. They do not observe the action as distant scientific observers but take the perspective of the participants and ask why an action has been produced in a particular moment, and why it has been designed in a particular way (Have 1998; Heath et al. 2010; vom Lehn 2018a, b).

Already in the 1970s Harvey Sacks realized that “[b]ody behavior in interaction also seems to be, in many respects, sequentially organized” (Sacks and Schegloff 2002: 136) and began to develop a system for the transcription of non-vocal action. It took, however, until the 1980s for the analysis of video-recorded interaction to become widely used by ethnomethodologists. The analysis of talk and bodily action was pioneered by Charles Goodwin (1981) and Christian Heath (1986). Since then, a burgeoning body of research on the organization of vocal, visual and bodily action has emerged, including studies of interaction in workplaces (Heath and Luff 2000; Szymanski and Whalen 2011) and public places like coffee shops and museums (Heath and vom Lehn 2004; Laurier and Philo 2007) as well as the analysis of interaction of mobile participants (see Haddington et al. 2013).

The studies examine interaction by exploiting the opportunities offered by video-recordings, including the possibility to repeatedly view fragments of interaction, the slow-motion function and the inspection of still frames (Heath et al. 2010; Knoblauch et al. 2015; vom Lehn 2018a, b). They are grounded in ethnomethodology and use the methodological tools developed by conversation analysts, in particular the use of transcripts as an aid to uncover the sequential organization of action. The analysis mostly begins with the transcription of talk and then maps participants’ bodily, visual and material action onto the talk in order to reveal how utterances are interwoven with non-vocal action and with aspects of the material and visual environment. The analysis results in detailed descriptions of how participants accomplish a sense of intersubjectivity in, and through, the organization of their action. In the following section, I will briefly discuss video-recorded fragments of interaction to reveal the interactional production of intersubjectivity when participants are concerned with what the other is seeing.

The Interactional Achievement of Visual Phenomena

People often meet in situations where the concerted seeing of events is important for them. Examples are all kinds of situations where multiple people witness the same situation or object, for example as an audience. When two people stand or sit next to each other they view the same object or event assuming they are seeing it in the same way. Or if they assume they have not seen it in the same way they often begin to engage in conversation. In this section I will discuss two fragments of interaction, one video-recorded in an art museum and the other in an optometric consultation. The analysis of both fragments is concerned with how the participants create a sense of intersubjectivity in and through their interaction.

Aligning Perspectives: Achieving Intersubjectivity

The following fragment was recorded in an exhibition that shows, amongst other works, Rubens’ painting of the family of Jan Brueghel the Younger (Fragment 1). We join the interaction after the older lady on the right, Eva, who has previously inspected the painting and voiced her admiration for it to a companion, has turned away from the painting only to return a short moment later. At this moment, another of her companions, Maggie, who has read the label to the left of the piece and briefly glanced at the painting, begins to leave the exhibit (Image 1.1). As Maggie turns to her right, Eva arrives near her and encourages her companion to return to the painting by saying, “But I like that that’s Rubens with the Brueghel family” while gesturing with her stretched out right arm and index finger to the piece (Transcript 1, line 1, Image 1.2).

Fragment 1: Two ladies, Eva and Maggie

A moment later, Eva and Maggie stand next to each other, both looking at Rubens’ painting. Eva continues her description of the exhibit without obtaining an audible response from her companion. By virtue of a brief pause after highlighting the quality of the painting, “before photographs were there” (line 2), Eva offers Maggie an opportunity to respond. Yet, when a response is not forthcoming, Eva expands on her description further by saying, “to bring them to life another painting you see painted” and puts additional emphasis on it by pointing to it (lines 2–3; Image 1.3). Only then, after another short pause, does Maggie display a response, “>yah<” (line 5), which brings the joint looking at the painting to a close.

Although both participants have looked at the same painting, their responses to it are very different. Maggie has read the label and only briefly glanced at the piece. Her companion appears to notice Maggie’s lack of captivation or excitement about the quality of the piece. She produces an expanded description that highlights the painting’s quality but still fails to elicit a response to the piece from Maggie. Even when the two participants stand next to each other and look at the painting, Maggie does not display a response that reflects an experience of similar quality to Eva’s. She stands still and looks at the painting but remains silent throughout her companion’s description, which is interspersed with pauses that provide her with opportunities to interject and voice her response to the piece. Only at the end of the encounter at the painting does Maggie say “yah” and thus display agreement with Eva’s evaluation of the piece. This is a moment in which the two participants achieve intersubjectivity; their perspectives on the work of art are, or appear to be, in alignment. However, because we do not know how the artist or the curator imagined an audience to see and experience the piece, it remains unclear if the participants also achieved “imagined intersubjectivity,” i.e., if their perspective on the work of art is in alignment with the perspective the artist or curator imagined viewers to adopt (see vom Lehn 2018a, b).

Although Maggie eventually produces a response that displays agreement with her companion’s description, Eva cannot be sure that Maggie has seen Rubens’ painting in the same way as she did. For the participants the uncertainty about an actual alignment of perspectives is unproblematic, and they rarely scrutinize each other about their experience. Instead, a relatively blunt display of agreement is sufficient to bring the joint examination of the exhibit to a close. If we have an interest in how people manage to see the world together in the same way, it might be worthwhile to look for a profession with expertise in assessing what and how clearly other people can see.

Co-producing Optometric Intersubjectivity

In the United Kingdom, optometrists are a profession that specializes in uncovering the quality of their clients’ ability to see. They undertake a series of tests that allow them to progressively determine a quantitative score or metric that describes a client’s quality of seeing and, once the series of tests has been completed, to prescribe, if necessary, particular lenses to correct their vision. One of the tests undertaken as part of optometric examinations is the Distance Vision Test. Many readers will know this test, which involves a standard chart showing rows of letters that become smaller from top to bottom. To the right of each line a figure like 3/6 or 6/6 is printed that indicates the visual acuity score, i.e., the metric that describes how clearly a client can see in the distance. A client who is able to read letters up to the line marked 6/6 has the same clarity of vision at the distance of six meters as a standard client; the visual acuity score of 3/6, by contrast, suggests that the client’s distance vision is half as clear as that of a standard client.

In Fragment 2 we join a consultation when the optometrist has completed the interview and now moves to begin the Distance Vision Test. From the start of the test the optometrist uses formulations that allow the client to make mistakes or to say he is unable to read any letters, “if you can read anything on the middle li:ne” (Fragment 2, line 11).

Fragment 2: Distance vision test, optometric consultation

As the optometrist encourages the client to read out letters from the middle line she leans slightly toward the client and holds an occluder over his left eye. While she completes her utterance the optometrist turns her eyes from the client to the monitor attached to the wall overhead and pulls up the letter chart she wants the client to read (Fragment 2, Image 1). The client, who sits upright with his face oriented to the screen where the letters have appeared, immediately responds to the optometrist’s request and produces a token “eh::” that prefaces the reading of the letters in an even rhythm, “eFf eNn Peeh Dee yoUh↓”. The optometrist acknowledges the client’s reading by turning from the screen to him, lowering her head to a nod and by saying, “thats great” (Image 2, line 13). She then does not bring the test to a close but continues it by encouraging the client to read with another formulation that allows him to make mistakes or to say that he is unable to see any of the letters clearly enough to read them out, “the bottom line at all?” (line 13). After a short hesitation (line 14) the client reads, “Peeh >eitcH or eNn< Deeh whY Zed”. The reading is accomplished quickly with a change in rhythm after the first letter. Through this change in rhythm and the particular vocalization the client displays uncertainty about one letter in the row, “eitcH or eNn” (line 16), before bringing the reading of the line to a close. Save for the display of uncertainty and mistaking a ‘V’ for a ‘Y’, the client displays confidence in reading out this smaller line of letters, encouraging the optometrist to acknowledge the reading as “great”. She then says she will pose the client “a bit of a challenge” (line 17) and changes the chart to show a set of smaller letters for the client to read. She asks him if he can read any letters from the smallest line in this chart, “so anything at all on that bottom line?” (line 18). Knowing that the client might have difficulty reading this line of letters, the optometrist further qualifies the request by saying that, “you might not got (.) might not get very many” (line 18). Already in the middle of the production of the optometrist’s request the client begins a vocalization, “n::::::”, that might be treated as the beginning of an attempt to read the line. He then firmly states that he is unable to read any letters on the line, “n:::::::::n::::::::::::::NO” (line 19). Subsequently the optometrist brings the test of the right eye to a close (line 20).

The fragment reveals the organization of actions through which the Distance Vision Test is undertaken. It involves the optometrist working to encourage the client to read rows of letters from the chart even if they have displayed difficulties in reading out a previous row. In formulating their request optometrists do take clients’ prior reading performance into account and allow them to make mistakes. Their interest is in identifying the line of letters that clients are unable to read or where they can read only a few letters. This allows them to transfer the visual acuity score from the chart to the client record form and add how many letters the client was unable to read from this line (vom Lehn et al. 2013). The score written in the record form enables optometrists to compare the client’s ability to see in the distance with a standard client defined in textbooks and by the creators of the vision chart.

Discussion: Two Kinds of Intersubjectivity

The analysis of these two fragments shows how participants organize their actions in two institutional settings. In both settings the participants’ actions are oriented to a particular visual object. Participants standing at a painting in a museum display through their bodily positions and visual orientation that they are looking at a particular work of art together. For the Distance Vision Test in optometric consultations, optometrist and client focus their actions on the letter chart. What the participants in both settings actually look at emerges in interaction between them. At the painting, talk and gesture are used to reference aspects of the piece. Similarly, the optometrist uses talk and referential practice to highlight particular lines of letters for the client to read out. In the situation at the painting it was sufficient for the co-participant to voice a confirmation or agreement with her companion. The procedure of the Distance Vision Test, by contrast, puts certain demands on the client, who is requested to read out lines of letters to display that he is actually able to see what the optometrist asks him to look at. Unlike the voicing of a confirmatory “>yah<” in a museum, the reading out of letters makes seeing accountable. The optometrist can use the reading performance to produce a score that reflects the ability of the client to see in the distance.

Coupled with these differences in the local order of the interaction in the two settings is the observation that in both cases the participants align their visual orientations and establish a sense of intersubjectivity that serves the purposes at hand. As part of the Distance Vision Test, the client’s reading performance is assigned a visual acuity score that embodies a theoretical standard of a client’s clarity of vision. The test therefore results in an optometric intersubjectivity (vom Lehn 2018a, b) that defines the distance from which the client can see the letters as clearly as the standard client. The score indicates how close or far the client has to come to the letter chart to be able to maintain the assumption that her/his standpoint in principle is interchangeable with that of the standard client (and that s/he adopts the same system of relevances to the situation as the standard client).

The detailed scrutiny of fragments suggests that the local order of the organization of the participants’ actions underpins the possibility of the emergence of intersubjectivity. If we return to Garfinkel’s (2006/1948) metaphor of “experiment in miniature” for a moment, we can argue that in museums through each vocal or bodily action participants “test” each other’s seeing of an object. The participants mutually entertain the expectation that others will respond to each other’s action in a particular way. By virtue of vocal and/or bodily action participants display how they orient to and experience the work of art in light of the co-participant’s action. Thus, they generate a sense of a mutually aligned orientation to the painting.

In the Distance Vision Test we can take Garfinkel’s metaphor literally. The optometrist literally tests if the client sees the letters on the chart. The purpose of the sight test, however, is not to establish intersubjectivity between the client and the optometrist, but between the client and the standard client. By turning to the chart and reading out the requested lines of letters the client aligns with the optometrist’s orientation, and practical intersubjectivity is achieved. As the reading from the chart is brought to a close, the optometrist uses the information gleaned from the client’s actions to establish the visual acuity score and, with it, optometric intersubjectivity between the client and a standard client. Optometric intersubjectivity implies that the geographic locations of client and standard client are interchangeable and that, in an optometric consultation, they could both approach the sight test with the same system of relevances.