The insight that animals' cognitive abilities are linked to their evolutionary history, and hence their ecology, provides the framework for the comparative approach. Despite primates' renowned dietary complexity and social cognition, including cooperative abilities, we here demonstrate that cleaner wrasse outperform three primate species, capuchin monkeys, chimpanzees and orang-utans, in a foraging task involving a choice between two actions, both of which yield identical immediate rewards, but only one of which yields an additional delayed reward. The foraging task decisions involve partner choice in cleaners: they must service visiting client reef fish before resident clients to access both; otherwise the former switch to a different cleaner. Wild-caught adult, but not juvenile, cleaners learned to solve the task quickly and relearned the task when it was reversed. The majority of primates failed to perform above chance after 100 trials, which is in sharp contrast to previous studies showing that primates easily learn to choose an action that yields immediate double rewards compared to an alternative action. In conclusion, the adult cleaners' ability to choose a superior action with initially neutral consequences is likely due to repeated exposure in nature, which leads to specific learned optimal foraging decision rules.

Although evidence suggests that the primates will excel in tasks that involve future consequences in the context of cooperation and foraging, the specifics of our task may favor cleaners. For example, cooperation and foraging are intertwined in cleaners in a way that is absent in primates; most importantly, cleaners cooperate with their food sources. In addition, primates encounter ephemeral food sources (e.g., insects, small vertebrates) unpredictably and opportunistically, and thus the ecological constraints are quite different from those of the fish, for whom the interaction with ephemeral sources is predictable. Based on this, we predicted that unlike the cleaner wrasse, the primates would not perceive the task as a social interaction but just as an optimal foraging task. Thus, our experiment offered us the opportunity to test the ecological intelligence hypothesis in a quite specific way. We expect that if ecology is the driving force that helps to solve the problem, then cleaners should individually learn to solve the tasks faster than any of the primate species. Conversely, if the general context and brain size (relative or absolute) prepare better for the task than rather specific ecological conditions do, then the primates should learn to solve the task faster than the cleaner wrasse. We also considered an additional way to test the role of learning for the cleaners' decision-making process: reversing the roles of the two plates once an individual reached the learning criterion. The former permanent plate now became the ephemeral plate and vice versa. Although cleaners are able to discriminate between different client categories, including resident and visitor, and can even individually recognize clients [13], reversal of roles does not occur under natural conditions, i.e. a visitor individual/species never turns into a resident. Therefore, it appears to be highly unlikely that reversal learning could be aided by the adaptation of an innate program.
For the primates, we included this task only to see whether, once the task had been solved, they understood its general principle. We predicted that if primates found the task initially difficult but solving it triggered a more general understanding, then their performance would greatly improve during the reversal.

An important aspect of the ecological approach is to test whether other species that do not engage in cleaning interactions are less able to solve the task. We decided to use primates – capuchin monkeys, chimpanzees and orang-utans – for the comparison for several reasons. First, the general circumstances of the cleaners' decisions involve social interactions and foraging, which matches the two contexts that have been proposed to select for large brains in primates [19]–[22]. Second, primates, and in particular our three study species, have been shown to possess a large array of cognitive mechanisms in the context of social behavior and foraging. Specifically, all three species have a complex diet and have been classified as extractive foragers [21]. In addition, at least chimpanzees and capuchins hunt for meat and catch mobile insects and reptiles [23]–[25], and in doing so, encounter ephemeral food sources. Moreover, all three species are able to solve some cooperation tasks in the laboratory [26]–[32], and capuchins and chimpanzees do so in the wild [23]–[25], [33]. Also, our task involved the ability to take not only immediate but also future consequences into consideration, an ability that primates have repeatedly demonstrated in foraging experiments (delayed rewards experiments: [34], [35]; planning experiments: [36]–[39]). Finally, all three of our primate study species have large brains compared to other species, and large relative brain sizes (e.g., brain-to-body or neocortex-to-body ratios) even compared to other primates [40], again indicating high general cognitive abilities.

Here, we provide the first test of the hypothesis that cleaner wrasse foraging decisions are the result of specific cognitive abilities. Our laboratory experiment involved two identical food sources – two plates differing in colour and patterns to allow discrimination, but providing exactly the same food – where one source (plate) was ephemeral and the other one permanent. This mimicked the simultaneous visit of a resident and a visitor to the cleaning station. Accordingly, the food-maximizing solution involved eating from the ephemeral food source first and only then from the permanent one. The potential difficulty of the task is due to the fact that no matter which plate an individual chooses first, it will receive exactly the same immediate reward, and only then will it (possibly) have the chance to perform a second act that would yield an additional reward. Thus, the initial decision may not lead to reinforcement learning unless an animal is somehow able to integrate the future consequences into its immediate decision. Despite theoretical considerations indicating that the task is not trivial to solve, a previous study suggested that cleaners could quickly solve it, though individual learning was not investigated [16]. In order to test whether the ability to solve the task is linked to its ecological relevance and whether the solution by cleaners reflects specific learning rules, we subjected both adult and juvenile wild-caught cleaners as well as three primate species to the task. The comparison between adult and juvenile cleaners allowed us to address the potential role of individual experience. Client composition shifts during ontogeny, with adult cleaners interacting about three times more frequently with visitors than do juveniles (comparing data published in [17], [18]). Thus, juveniles rarely experience the situation in which a visitor and a resident seek cleaning simultaneously.
Therefore, if adult cleaners perform better than juveniles, that would indicate that individual experience in the field helps to solve the abstract laboratory task.
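The payoff structure of the two-plate task can be sketched in a few lines of Python. This is an illustrative model only: the names `food_items` and `expected_food` are hypothetical, and the sketch assumes one food item per plate and that the ephemeral plate is always withdrawn when the permanent plate is chosen first.

```python
def food_items(first_choice: str) -> int:
    """Total food items obtained in one trial of the two-plate task.

    Assumption (illustrative): each plate holds one item, and the
    ephemeral plate is withdrawn if the permanent plate is chosen first.
    """
    if first_choice == "ephemeral":
        return 2  # immediate item, and the permanent plate is still available
    if first_choice == "permanent":
        return 1  # same immediate item, but the ephemeral plate is gone
    raise ValueError("first_choice must be 'ephemeral' or 'permanent'")


def expected_food(p_ephemeral_first: float, trials: int = 10) -> float:
    """Expected items per session for a subject that chooses the ephemeral
    plate first with probability p_ephemeral_first on every trial."""
    if not 0.0 <= p_ephemeral_first <= 1.0:
        raise ValueError("p_ephemeral_first must lie in [0, 1]")
    per_trial = (p_ephemeral_first * food_items("ephemeral")
                 + (1.0 - p_ephemeral_first) * food_items("permanent"))
    return trials * per_trial


if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        print(p, expected_food(p))
```

Under these assumptions the first choice always yields one item, so the only signal that distinguishes the two plates is whether a second item follows.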

The ecological approach to cognition proposes that a species' ability to solve a particular problem is tightly linked to its evolutionary history and, hence, to the ecological conditions under which it was selected [1]–[3]. A classic example is the tight link between spatial memory abilities and the dependency on food caching in corvids [4]. The ecological approach provides a general functional theoretical framework which allows for the integration of studies on any animal species, including invertebrates, such as the demonstration of sophisticated spatial orientation skills of bees [5], and the ability of jumping spiders to plan where to go in order to attack prey [6]. The ecological approach has led to a great diversification of animals studied, and in particular to the appreciation that animal clades that lack particularly large and complexly structured brains may provide examples of impressive cognitive abilities. This is in particular true for fishes [7], which have provided some excellent examples of complex social strategies. Male cichlids (Astatotilapia burtoni) use transitive inference to predict fighting abilities of competitors [8] and sticklebacks (Pungitius pungitius) employ so-called hill climbing social learning strategies [9], in which they compare their own foraging success with the success of observed individuals to update foraging decisions. Another example involves the foraging decisions of cleaner wrasse, Labroides dimidiatus. These cleaner fish occupy small territories (so-called ‘cleaning stations’) in which they interact with a variety of reef fish species (so-called ‘clients’) from which they remove ectoparasites, but also mucus and scales [10]. Conflict occurs because cleaners prefer to eat mucus over ectoparasites [11], where eating the former constitutes cheating (for a review of cleaners' decision rules, see [12], [13]).
Cleaners adjust levels of cooperation to the strategic options available to clients to react to cheating by cleaners. Predatory clients typically receive the highest service quality, whereas non-predatory resident clients, who lack choice options, punish cleaners for cheating. Visiting clients who have access to alternative cleaning stations receive faster service than resident clients that have access to only one cleaning station. This is because visiting clients represent an ephemeral food source: they may swim off and visit another cleaner for their next inspection if not inspected immediately. In contrast, resident clients must wait for inspection because of a lack of alternatives. Furthermore, cleaners pay attention to the presence of potential clients and are more cooperative to current clients if that allows them to access bystanders [14]. Thus, cleaner wrasse show high adaptation to the specifics of an interaction in their foraging decisions, which are at the same time linked to interspecific social behavior. The precision with which cleaners adapt current service quality to current conditions may be predicted by their ecology: cleaners have over 2000 interactions per day with a great variety of clients and fully depend on cleaning for their diet [15]; thus, their performance during the interactions has a major impact on their fitness. However, the ecological approach is rather nonspecific with respect to the cognitive processes that underlie the performance. Hence, we cannot infer from the precision and flexibility in cleaner foraging decisions that they warrant much learning, memory or comprehension and hence, ultimately, any adaptive changes in corresponding brain areas. In addition, we do not know whether reaching their food-maximizing decisions involves widespread learning rules or whether rather specific abilities must be evolved or developed.
Thus, the question of interest is whether any (vertebrate) species could easily behave like a cleaner wrasse if it switched its diet to ectoparasites and mucus of fishes, or whether specific selection pressures on cleaner wrasses have produced specific abilities. And if specific abilities do exist in cleaner wrasses, what is the role of cognition?

For this component of the experiment, the previously ephemeral plate/tray became the permanent plate/tray, and vice versa. All the adult fish developed a significant preference for the new ephemeral plate within 10 sessions (median: 7; range: 6–9; Fig. 2). With one exception, it took individuals slightly longer to re-learn the task after the plates suddenly reversed their behavior (reversal learning phase) as compared to learning the initial behavior of the plates (exact Wilcoxon signed rank test, n = 6, W = −13, p>0.05). The one juvenile that succeeded in the initial task after only 20 trials apparently had had a preference for the initial ephemeral plate: it failed to alter its preference over the next 100 reversal trials. Seven out of eight capuchins learned the reversal task in 6–9 sessions, yielding similar results to the adult cleaners. In contrast, only one orang-utan out of three and neither of the two chimpanzees learned the reversal task within 10 sessions. Overall, there was a non-significant difference in learning speed between the species (Kruskal-Wallis Test excluding the one juvenile cleaner, df = 3, H = 6.8, p = 0.078). If the few chimpanzee and orang-utan individuals were pooled as ‘apes’, the differences between species became significant (Kruskal-Wallis Test excluding the one juvenile cleaner, df = 2, H = 6.5, p = 0.038). Post-hoc comparisons revealed that both adult cleaners and capuchin monkeys performed significantly better than the apes (Student Newman-Keuls, both p<0.05). Both chimpanzees that failed the test developed a significant side bias, whereas the orang-utans did not develop a discernible bias. The apes' unexpected lack of success appeared to be due to frustration with the task [41], [42].
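Kruskal-Wallis p-values are commonly obtained by referring H to a chi-square distribution with the stated degrees of freedom. The sketch below applies that approximation to the H values reported for the reversal task. It is an illustration only: `chi2_sf` is a hypothetical helper written with the standard library, and the authors may have used exact tables or statistical software, so small rounding differences from the reported values are to be expected.

```python
from math import erfc, exp, pi, sqrt


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable
    with integer degrees of freedom df >= 1 (closed-form series)."""
    if x < 0 or df < 1:
        raise ValueError("need x >= 0 and integer df >= 1")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return exp(-half) * total
    # Odd df: erfc(sqrt(x/2))
    #   + sqrt(2/pi) * exp(-x/2) * sum_{k=1}^{(df-1)/2} x^(k-1/2) / (1*3*...*(2k-1))
    term, total = sqrt(x), 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return erfc(sqrt(half)) + sqrt(2.0 / pi) * exp(-half) * total


if __name__ == "__main__":
    # H and df as reported for the reversal task (four groups, then three).
    print("df = 3, H = 6.8:", chi2_sf(6.8, 3))
    print("df = 2, H = 6.5:", chi2_sf(6.5, 2))
```

Under the chi-square approximation, both calls return values close to the reported p = 0.078 and p = 0.038.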

All six adult cleaner fish individuals learned to eat first from the ephemeral plate, which was smoothly withdrawn if the cleaner foraged on the permanent plate first. Individuals took 3–10 sessions (of 10 trials each) to reach the criterion of significance with a median of 4.5 sessions. In contrast to the adult cleaners, only one juvenile cleaner and two out of four chimpanzees solved the task within 10 sessions, and all other subjects failed (Fig. 1). Thus, there was a significant difference in learning speed between the species/age classes (Kruskal-Wallis Test: df = 4, H = 18.4, p = 0.001). Post-hoc comparisons revealed that adult cleaners performed better than juvenile cleaners or any of the three primate species (Student Newman-Keuls, all p<0.05). Most of the primates that failed to learn the task developed a strong side preference (7/8 capuchin monkeys, 3/4 orang-utans and 1/2 chimpanzees). Juvenile cleaners that failed the task developed a preference for the permanent plate. All primate subjects that had failed to learn the task within 10 sessions (100 trials) were then exposed to changes in the experimental design to help them learn the solution. The details varied between species and are described in Information S1. Under the altered conditions, all capuchin monkeys and three out of four orang-utans eventually developed a significant preference for the ephemeral plate, while the two remaining chimpanzees failed to learn the task at all. We then included the capuchins and the three orang-utans in the reversal learning task.
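The exact learning criterion is not spelled out in this section. As a rough illustration of what performing above chance means with 10-trial sessions, the sketch below computes one-sided binomial tail probabilities under a 50% chance level. The function name `chance_tail` is hypothetical, and the thresholds shown are examples, not the authors' criterion.

```python
from math import comb


def chance_tail(correct: int, trials: int = 10, p: float = 0.5) -> float:
    """P(at least `correct` ephemeral-first choices in `trials` trials)
    for a subject that chooses at random with probability p per trial."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must lie between 0 and trials")
    return sum(
        comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(correct, trials + 1)
    )


if __name__ == "__main__":
    for k in (8, 9, 10):
        print(f"P(>= {k}/10 by chance) = {chance_tail(k):.4f}")
```

With a single 10-trial session, 9 or more correct choices fall below the conventional 5% level, whereas 8 correct choices do not, so a criterion based on 8/10 would need to hold over more than one session.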

Discussion

A key conclusion from our experiment is that the sophisticated foraging decisions which cleaner wrasses demonstrate during interactions with client reef fish are not easily achieved by other species with larger and more complexly organized brains. The ability to choose between an ephemeral and a more permanent food source of otherwise identical quality is apparently far from simple, as the vast majority of individuals from three primate species that otherwise excel in cognitive tasks failed to learn the task within 100 trials, as did juvenile cleaners. However, adult cleaners consistently solved the task. Thus, our task differs from experiments that demonstrate extremely fast learning of solutions if individuals are placed into a key stimulus-response context, in which even invertebrates like bees may outperform primates, including humans [42]–[43].

Why the task may be difficult to solve

When confronted with a choice that directly yields two different amounts of food, primates can easily discriminate outcomes with one reward from those with two [44]–[47] (for that matter, fish can do the same [48], [49]), even in cases in which the quantity to be received is indicated symbolically (e.g., via tokens or Arabic numerals [50]–[52]). Thus, there must be another explanation for the decrement in performance in the primates as compared to the adult wrasses. We consider several possibilities for why this task may be difficult to learn. First, assuming that both species saw the task as a sequence of two tasks (rather than one task with two steps, a reasonable assumption since they got fed after their first choice, and hence before their second), then the difficulty of the task may relate to known reinforcement mechanisms; in this case, no matter which plate an individual chooses first, it will receive exactly the same immediate reward, and only then will it (possibly) have the chance to perform a second act that would yield an additional reward. That is, our design, compared to classic associative learning designs (i.e. go to A, then to B, then collect reward), adds the complication of requiring animals to go to B to collect a second reward after A already has been rewarded. Thus, it is possible that the first plate chosen becomes a conditioned stimulus that is stronger, as it is always the first stimulus to be rewarded (and thus may result in the greatest satisfaction). After this, there may be little novelty or information value left for the second plate, lowering the incentive as compared to the first plate/reward. Thus, phenomena like blocking (i.e., little conditioning is occurring) or overshadowing (i.e., less conditioning is occurring to this weaker conditional stimulus) might explain why there seems to be little learning about the second plate if the first plate already has been rewarded.
Second, it is possible that the fish experienced the removal of the plate as a stronger punishment than did the primates. Both the fish and the primates presumably reacted to the removal of the second plate, containing food, as a negative reinforcer (e.g., punishment). However, fish may have additionally experienced it as a social punishment; one indication that they indeed perceive the task as a cleaning situation is that they respond with tactile stimulation when the plate returns, apparently to encourage it to stay this time. Cleaners use this behaviour to reconcile and to make clients stay longer under natural conditions [53]. Hence, negative social reinforcement (or: social punishment) would make the task more aversive, and hence easier to learn, for the adult fish as compared to the primates and juvenile fish, both of which have far less experience with this situation. Finally, a more cognitive mechanism than associative learning that would allow subjects to solve the task is insight based on backwards induction. In backwards induction, one has to start with the desired endpoint and then figure out which steps lead to that endpoint. Evidence for backwards induction has been demonstrated in a chimpanzee, Julia, who had to open up to 10 Plexiglas boxes with specific tools inside in the right sequence to finally obtain food in the last box [36]. However, the primates in our study apparently failed to use backwards induction, despite a large number of trials. Given the evidence for insight learning in our primate species, why did they fail to use this ability? One possibility again relates to reinforcement; Julia was not rewarded for each step of her process, while in our experiment the subjects were. As discussed above, it is possible that the receipt of intermediate rewards interferes with learning mechanisms in that it lowers the incentive value of the second reward [54].
We finally note that the apes' unexpectedly low performance on the reversal task was likely due to frustration with the procedure. Apes – including some of these subjects – are typically very good at reversal learning tasks [41], [42]. Moreover, within the primates, reversal learning performance is associated with brain size [55], and apes typically outperform capuchins [56]. However, our subjective impressions indicated that the apes found this task very frustrating. Despite there being only 10 trials in a session, we initially had to change the inter-trial interval (ITI) from 5 minutes to 90 seconds in order to get them to complete a session (see Methodological considerations, below, for a more detailed discussion of this). Even with the 90-second ITI, by the later sessions, apes were hitting or grabbing the choice trays rather than choosing a reward, and often refused to participate. We believe it was this frustration with the task that caused the unexpectedly low performance on reversal learning. What is perhaps more notable is that the fish did so well. Their behavior is counter to that predicted by the primates' association between reversal learning and brain size [55] and deserves far more attention as a potential area in which fish cognition equals that of the larger-brained primates (see Methodological considerations, below, for other areas in which fish cognition appears to equal that of far larger-brained species).

Why adult cleaner wrasses may have been able to solve the task

We propose two non-mutually exclusive explanations for why adult cleaners learned to solve the task. First, the cleaners may have developed the decision rule to preferentially approach ephemeral food under natural conditions and then applied the same rule to this task. In contrast, the primates were born in captivity, where sufficient food is provided multiple times per day (at all facilities) and they rarely catch ephemeral food like invertebrates. Second, as discussed above, the cleaners may have perceived the task as a social interaction. In that case they would have perceived the removal of the ephemeral plate as the loss of a cooperation partner and hence as a negative reinforcer that reduced the likelihood that the subject would choose the permanent plate again on future trials. The aversion to losing any client would make the ephemeral plate more attractive to cleaners, whereas primates are not selected to experience either the negative reinforcement of a missed opportunity or social reinforcement for interacting with their foraging substrate. Thus, we consider it likely that cleaners, but not the primates, simultaneously experienced a positive and a negative reinforcer, which would explain why they learned to solve the task rather quickly as compared to the primates. If that was the case, a change in protocol for the primates that let them perceive the interaction as social (for example by replacing the trays with a human partner) should yield much faster learning. If our hypotheses are correct, then one would also predict that even individuals of the closely related cleaner wrasse species L. bicolor should have problems solving the task. This is because adult bicolor individuals rove over large areas and typically approach the clients they want to interact with rather than having to wait for them at a cleaning station [57].
Thus, the distinction between residents and visitors is not crucial to them, and they can follow clients that are about to leave in order to prolong interactions. For bicolor individuals, it appears to be mainly important where an interaction takes place within their home range: they are more cooperative in their core area than in the periphery [58]. To explore this hypothesis, we additionally collected preliminary data on L. bicolor. We tested two individuals in May 2009 at the University of Neuchâtel following exactly the same protocol as we used for adult L. dimidiatus. One bicolor failed to learn the initial task within 200 trials. The other one learned the initial task in 70 trials but failed at the reversal: after a short period of random choices it redeveloped a preference for the initially ephemeral plate. Taken together, there is thus a significant difference in overall performance (in trials to complete the entire experiment) between the two species (Mann-Whitney U test, m = 6, n = 2, U = 0, p<0.05). Clearly, more bicolor individuals should be tested (unfortunately, they are very difficult to obtain from licensed commercial pet shops; three individuals were all we managed to obtain over a six-week search period, with one not willing to participate in the experiment). Nevertheless, the preliminary results suggest that the ability of L. dimidiatus individuals to solve the task is linked to very specific ecological conditions that are not met in L. bicolor.
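The Mann-Whitney comparison above involves only eight individuals, so its exact null distribution can be enumerated directly. The sketch below is a minimal illustration that assumes no tied values; `exact_u_pvalues` is a hypothetical helper, not the authors' analysis code. With m = 6, n = 2 and U = 0 there are 28 equally likely rank arrangements, giving a one-sided exact probability of 1/28 (about 0.036) and a two-sided probability of 2/28 (about 0.071); this section does not state which variant the reported p<0.05 refers to.

```python
from fractions import Fraction
from itertools import combinations


def exact_u_pvalues(m: int, n: int, u_obs: int) -> tuple[Fraction, Fraction]:
    """Exact one- and two-sided p-values for a Mann-Whitney U statistic.

    Enumerates every way the n members of the smaller group can occupy
    ranks 1..m+n (no ties assumed) and counts the arrangements whose U
    is at least as extreme as u_obs.
    """
    total = one_sided = two_sided = 0
    for ranks in combinations(range(1, m + n + 1), n):
        # U for the group of size n: rank sum minus its minimum possible value.
        u = sum(ranks) - n * (n + 1) // 2
        total += 1
        if u <= u_obs:
            one_sided += 1
        if min(u, m * n - u) <= u_obs:
            two_sided += 1
    return Fraction(one_sided, total), Fraction(two_sided, total)


if __name__ == "__main__":
    one, two = exact_u_pvalues(m=6, n=2, u_obs=0)
    print("one-sided:", one, float(one))
    print("two-sided:", two, float(two))
```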

A comparison between juvenile and adult cleaners

There are various potential explanations for why juveniles failed to solve the task while adults managed. One possibility is that maturation processes in the brain preclude juveniles from solving the problem at hand. Second, there were small differences in the experimental protocol due to different research sites and in turn testing possibilities: juveniles – kept on Lizard Island for the period of the experiment – experienced longer time intervals between subsequent trials as compared to the adults, which were housed and tested in Neuchâtel, Switzerland. However, in an earlier experiment adult cleaner fish that were trained on a similar task (i.e. “one plate remains until inspected while the other does not”, p. 132), but with 30 min intervals between trials, significantly chose to first clean the plate that would not wait until being inspected [16]. Thus, we doubt that the differences in the inter-trial interval (ITI) are reason enough for the differences in learning performance. While maturation and (to a lesser extent if at all) experimental design may have affected the results, we consider it likely that individual experience plays a major role; juveniles have fewer visiting clients and are therefore rarely in a situation that calls for this discrimination. The situation changes for adults; in a field study, adults had to make choices between a visitor and a resident client more than twice per hour (120 times in 52 hours of observation [13]; our subjects were wild caught). It has long been known that maturation and experience combine to determine performance [59]. But only recently has it been shown that, for example, guppies possess numerical abilities from birth (discrimination of small numbers), which unfold as a result of both maturation and social learning (discrimination of larger numbers) [60].
A logical follow-up experiment should therefore test adult cleaners that have been kept in captivity without simultaneous exposure to residents and visitors.