Abstract

Division of labor is ubiquitous in biological systems, as evidenced by various forms of complex task specialization observed in both animal societies and multicellular organisms. Although clearly adaptive, the way in which division of labor first evolved remains enigmatic, as it requires the simultaneous co-occurrence of several complex traits to achieve the required degree of coordination. Recently, evolutionary swarm robotics has emerged as an excellent test bed to study the evolution of coordinated group-level behavior. Here we use this framework for the first time to study the evolutionary origin of behavioral task specialization among groups of identical robots. The scenario we study involves an advanced form of division of labor, common in insect societies and known as “task partitioning”, whereby two sets of tasks have to be carried out in sequence by different individuals. Our results show that task partitioning is favored whenever the environment has features that, when exploited, reduce switching costs and increase the net efficiency of the group, and that an optimal mix of task specialists is achieved most readily when the behavioral repertoires aimed at carrying out the different subtasks are available as pre-adapted building blocks. Nevertheless, we also show for the first time that self-organized task specialization could be evolved entirely from scratch, starting only from basic, low-level behavioral primitives, using a nature-inspired evolutionary method known as Grammatical Evolution. Remarkably, division of labor was achieved merely by selecting on overall group performance, and without providing any prior information on how the global object retrieval task was best divided into smaller subtasks. We discuss the potential of our method for engineering adaptively behaving robot swarms and interpret our results in relation to the likely path that nature took to evolve complex sociality and task specialization.

Author Summary

Many biological systems execute tasks by first dividing them into finer sub-tasks. This is seen, for example, in the advanced division of labor of social insects like ants, bees and termites. One of the unsolved mysteries in biology is how a blind process of Darwinian selection could have led to such highly complex forms of sociality. To answer this question, we used simulated teams of robots and artificially evolved them to achieve maximum performance in a foraging task. We found that, as in social insects, this favored controllers that caused the robots to display a self-organized division of labor, in which the different robots automatically specialized in carrying out different subtasks in the group. Remarkably, such a division of labor could be achieved even if the robots were not told beforehand how the global task of retrieving items back to their base could best be divided into smaller subtasks. This is the first time that a self-organized division of labor mechanism has been evolved entirely de-novo. In addition, these findings shed significant new light on the question of how natural systems managed to evolve complex sociality and division of labor.

Citation: Ferrante E, Turgut AE, Duéñez-Guzmán E, Dorigo M, Wenseleers T (2015) Evolution of Self-Organized Task Specialization in Robot Swarms. PLoS Comput Biol 11(8): e1004273. https://doi.org/10.1371/journal.pcbi.1004273
Editor: Olaf Sporns, Indiana University, UNITED STATES
Received: January 27, 2015; Accepted: April 8, 2015; Published: August 6, 2015
Copyright: © 2015 Ferrante et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data are available in the Dryad repository (accession URL: http://dx.doi.org/10.5061/dryad.7pn80).
Funding: EF, AET and TW acknowledge the European Science Foundation "H2Swarm" program (http://www.esf.org/) and the KU Leuven (www.kuleuven.be) for the IDO-BioCo3 project and KU Leuven Excellence Center project PF/2010/007. EDG acknowledges the KU Leuven for the Grant F+/11/033. MD acknowledges the Fonds de la Recherche Scientifique—FNRS (F.R.S.–FNRS—www.fnrs.be) and the European Research Council (ERC—erc.europa.eu) ERC Advanced Grant “E-SWARM: Engineering Swarm Intelligence Systems” (grant 246939). AET acknowledges the Scientific and Technological Research Council of Turkey (Tubitak—www.tubitak.gov.tr/en) grant TUBITAK-2219. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.

Materials and Methods

The task and the environment

Our experimental setup is inspired by the type of task partitioning observed in Atta leafcutter ants [42,43], which collect leaves and other plant material as a substrate for a fungus that is farmed as food (Fig 1A). In these insects, particularly in species that harvest leaves from trees, leaf fragments are retrieved in a task-partitioned way, whereby some ants (“droppers”) specialize in cutting and dropping leaf fragments to the ground, thereby forming a leaf cache, and others (“collectors”) specialize in collecting leaves from the cache and bringing them back to the nest [42,43]. In addition, another strategy is known whereby the whole leaf-cutting and retrieval task is carried out by single individuals (“generalists”), without any task partitioning [42,43]. Task partitioning in this scenario is thought to be favored particularly in situations where the ants forage on leaves from trees, because the leaf fragments can then be transported purely by gravity, which saves the ants the time to climb up and down the tree, and because there are few or no costs associated with material loss thanks to the large supply of leaves [7,43,48] (Fig 1A). This theory is supported by the fact that species living in more homogeneous grassland usually retrieve leaf fragments in an unpartitioned way, without first dropping the leaves (Fig 1C), particularly at close range to the nest [43,49]. In the corresponding robotic setup, we substituted the tree with a slope area and the leaves with cylindrical items. A team of robots then had to collect these items from what we call the source area and bring them back to what we refer to as the nest area (Fig 1B). Simulations were carried out using the realistic, physics-based simulator ARGoS [45]. As demonstrated in the past, controllers developed within ARGoS can be transferred directly to real robots with minimal or no intervention [50,51].
The robots used in the experiments were simulated versions of the foot-bot, a variant of the MarXbot robot [44], which is a differential-drive, non-holonomic mobile robot (Fig 2A). A screenshot of a simulation instant is shown in Fig 2B. We used a setup in which 5 items were always present in the source area: each time a robot picked up an item, a replacement was placed at a random position within the source area. This is justified by the fact that leaf availability in the natural environment is often virtually unlimited. A light source was placed at a height of 500 m, 500 m away from the nest, in the direction of the source area. The light allowed the robots to navigate in the environment: phototaxis allowed them to move towards the item source, whereas anti-phototaxis allowed them to return to the nest. The slope area had an inclination of about 8 degrees. The linear velocity of the robots on the flat part of the arena was 0.15 m/s, but this was reduced to a maximum of 0.015 m/s when they climbed up the slope, and increased to 0.23 m/s when they came down the slope. If an item was dropped in the slope area, it slid down the slope at a speed of 1 m/s until it reached the cache area, where it was stopped by friction and by the impact with other items in the cache. This was done to simulate leaves being dropped from the tree, as in Fig 1A. In addition, in some of the experiments, we considered a flat environment of the same length and width as the one described above (Fig 1D), to mirror the case in nature where ants forage in a flat, homogeneous environment (Fig 1C).
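The locomotion model just described can be summarized in a small sketch. The speed values are those given in the text; the function and zone names are our own and not part of the original controllers:

```python
# Sketch of the zone-dependent speed model described above.
# Speed values come from the text; all names are illustrative only.

ROBOT_SPEED = {
    "flat": 0.15,        # m/s on the flat part of the arena
    "slope_up": 0.015,   # m/s maximum when climbing the slope
    "slope_down": 0.23,  # m/s when descending the slope
}

ITEM_SLIDE_SPEED = 1.0   # m/s: a dropped item slides down to the cache area


def robot_speed(zone: str, heading_uphill: bool) -> float:
    """Return the linear speed (m/s) of a robot given its zone and heading."""
    if zone != "slope":
        return ROBOT_SPEED["flat"]
    return ROBOT_SPEED["slope_up"] if heading_uphill else ROBOT_SPEED["slope_down"]
```

The asymmetry between the three speeds is exactly what makes dropping items onto the slope profitable: gravity moves items downhill much faster than any robot can carry them uphill.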
Evolution of task-partitioning from pre-adapted building blocks

In a first set of experiments, we assumed that the behavioral strategies required to carry out each of the subtasks (dropper or collector behavior, as well as generalist, solitary foraging) were available to the robots as pre-adapted behavioral building blocks, and then determined the optimal mix of the strategies [12]. This setup, therefore, matched some evolutionary scenarios proposed for the origin of division of labor in biological systems based on co-opting pre-adapted behavioral patterns [2,17–22]. In addition, this scenario allowed us to determine under which environmental conditions task partitioning is favored, and provided a fitness benchmark for the second scenario below, where task partitioning was evolved entirely de-novo. In this first set of experiments, the dropper, collector and generalist foraging strategies were implemented as follows:

Dropper strategy: A dropper robot climbs the slope area and never descends it again, continuously collecting items from the source area and dropping them onto the slope area.

Collector strategy: A collector robot never climbs the slope area. Instead, it continuously collects items from the cache (when present) and brings them back to the nest. If it cannot find any items, it keeps exploring the cache area by performing a random walk until an item is found.

Generalist strategy: A generalist robot performs a standard foraging task: it climbs the slope, explores the source area, collects items, and brings them all the way back to the nest. It does not explore the cache area, but if it finds an item at the cache while going towards the source, it collects it and brings it back to the nest.

The rules that we employed to implement these strategies are shown in S1 Table.
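A rough sketch of the three strategies as condition-to-action rules, together with the allocation space they induce for a team of 4 robots. All names and conditions here are our own simplifications; the authoritative rule definitions are in S1 Table:

```python
# Illustrative condition->action sketches of the three pre-adapted strategies.
# These are simplifications of the rules in S1 Table; names are ours.

def dropper_step(zone, carrying):
    """Stays in the upper part: picks up at the source, drops onto the slope."""
    if carrying:
        return "drop" if zone == "slope" else "go_to_slope"
    return "pick_up" if zone == "source" else "go_to_source"

def collector_step(zone, carrying, cache_has_items):
    """Never climbs: shuttles items from the cache back to the nest."""
    if carrying:
        return "go_to_nest"            # anti-phototaxis
    if zone == "cache" and cache_has_items:
        return "pick_up"
    return "random_walk"               # keep exploring the cache area

def generalist_step(zone, carrying):
    """Does the whole task alone, from source to nest."""
    if carrying:
        return "go_to_nest"
    return "pick_up" if zone == "source" else "go_to_source"

# With 4 robots and three roles, the allocation space is tiny:
# all (droppers, collectors, generalists) with d + c + g = 4.
mixes = [(d, c, 4 - d - c) for d in range(5) for c in range(5 - d)]
```

There are only 15 such mixes, which is what makes the exhaustive fitness-landscape analysis described below tractable without an evolutionary algorithm.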
We also assumed that the robots would specialize for life in one of these available strategies, according to a particular evolved allocation ratio. This is equivalent to assuming that, in nature, these behavioral strategies would already have evolved through selection in the ancestral environment, and that natural selection would favor a particular hard-wired individual allocation between the different sets of tasks, e.g. through fine-tuning of the probability of expression of the gene-regulatory networks coding for the different behavioral patterns. For these experiments, we used teams of 4 robots, to match the evolutionary experiments with fine-grained building blocks (cf. next section). Subsequently, a fitness landscape analysis was used to determine the optimal mix of the three strategies in one of two possible environments, a flat or a sloped one (Fig 1B and 1D). This was done via exhaustive search, that is, by testing all possible ratio combinations and determining the corresponding fitness values in the two environments, rather than by using an evolutionary algorithm. This was possible due to the relatively small search space, which gave access to the full fitness landscape. Group performance, defined as the total number of items retrieved to the nest over a period of 5,000 simulated seconds, was measured for each possible mix of the three strategies in 10 simulated runs and then averaged.

Evolution of task-partitioning from first principles

In a second set of experiments, we considered an alternative scenario in which both task specialization and task allocation could evolve entirely de-novo, starting only from basic, low-level behavioral primitives. These primitives were simply navigational behaviors allowing robots to go either towards the source or towards the nest, as well as a random walk behavior:

PHOTOTAXIS: uses the light sensor to make the robot move towards the direction with the highest perceived light intensity.
ANTI-PHOTOTAXIS: uses the light sensor to make the robot move towards the direction with the lowest perceived light intensity.

RANDOM WALK: makes the robot move forward for a random amount of time and then turn by a random angle, repeating this process while the block is activated, without using any sensors.

In addition, obstacle avoidance, based on the robot’s range-and-bearing and proximity sensors, was switched on at all times to prevent the robots from driving into each other or into the walls of the foraging arena. Finally, two instantaneous actions were allowed, namely picking up and dropping an item. To be able to evolve adequate behavioral switching mechanisms, we allowed the robots to perceive their position in space (whether they were in the source, slope, cache or nest area), based on sensory input from the ground and light sensors, as well as whether or not they were currently holding an item. The fine-grained behavioral building blocks were combined using a method known as grammatical evolution [52], as implemented in GESwarm [41], in order to evolve rule-based behaviors representing more complex strategies. GESwarm was developed for the automatic synthesis of individual rule-based behaviors that lead to a desired collective behavior in swarm robotics. These rules were represented by strings, which in turn were generated by a formal grammar. The space of strings of this formal grammar was used as a behavioral search space, and mutation, crossover and selection were then used to favor controllers that displayed high group performance.
The individual behavior of a given robot was expressed by a set composed of an arbitrary number n_R of rules R_i. Each rule was composed of three components, R_i = ⟨B_i, A_i, P_i⟩, where B_i denotes a subset of all possible fine-grained behavioral building blocks (phototaxis, anti-phototaxis and random walk), A_i denotes a subset of all possible instantaneous actions (pick up, drop, change behavior or change an internal state variable) and P_i denotes a subset of all possible preconditions. The preconditions were specified as logical conditions on the current values of a number of state variables, which included both sensory input (the area the robot was in and whether or not it was carrying an item) and internal state variables (a state variable that specified whether the robot wanted to pick up an item or not, and two memory state variables with evolvable meaning). If all the preconditions in P_i were met, and if a given robot was executing any of the low-level behaviors in B_i, all actions contained in A_i were executed with an evolvable probability p_l. In this way, we could allow the evolution of probabilistic behaviors, which have been used extensively both in the swarm robotics literature [53,54] and as microscopic models of the behavior of some social animals [55,56]. Finally, each robot executed all rules and actions in their order of occurrence. To generate the rules above, we devised a grammar in Extended Backus-Naur Form notation [57]. Within the framework of grammatical evolution [41,52], a genotype represented a sequence of production rules to be followed to produce a valid string (in our case, a set of rules) starting from that grammar. Mutation and crossover acted at the level of this genotype, modifying the sequence of production rules. The full grammar of GESwarm is described in [41].
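A minimal sketch of this rule structure and its execution semantics, with the three components written as explicit fields. Names and implementation details are our own; the authoritative definition of GESwarm is in [41]:

```python
import random
from dataclasses import dataclass

# Sketch of the rule structure described above: a rule fires its actions with
# an evolvable probability when its preconditions hold AND the robot is
# currently running one of the listed low-level behaviors. Names are ours.

@dataclass
class Rule:
    behaviors: set        # B_i: subset of {phototaxis, anti-phototaxis, random-walk}
    actions: list         # A_i: e.g. pick up, drop, change behavior or state
    preconditions: list   # P_i: predicates over sensory input + internal state
    p: float = 1.0        # evolvable firing probability p_l

def step(rules, state, rng=random):
    """Execute all rules in their order of occurrence (one control step)."""
    fired = []
    for r in rules:
        if (state["behavior"] in r.behaviors
                and all(pre(state) for pre in r.preconditions)
                and rng.random() < r.p):
            for action in r.actions:
                action(state)
                fired.append(action.__name__)
    return fired
```

With p below 1, the same genotype can express different behaviors in different robots at different times, which is what opens the door to the probabilistic task switching discussed later.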
Biologically speaking, our GESwarm rule-based controllers can be considered analogous to gene-regulatory networks or to logic circuits in the brain, and the internal memory state variables in our model can be seen as analogous to epigenetic states or to memory states in the brain. Furthermore, as in biological systems, we used a generative encoding (a string coding for a series of conditional rules, similar to a DNA sequence coding for conditionally expressed gene-regulatory networks) and evolved our system using mutation and crossover. One departure of our setup from biological reality, however, was that we used genetically homogeneous teams, as is common in evolutionary swarm robotics [58] but different from the situation in most social insects, where sexual reproduction tends to be the norm. This choice was made because homogeneous groups combined with team-level selection has been shown to be the most efficient approach to evolve tasks that require coordination [28]. Nevertheless, this setup can still be considered analogous to the genetically identical cells of multicellular organisms [59] or to the clonal societies of some asexually reproducing ants [60], both of which display complex forms of division of labor. We executed a total of 22 evolutionary runs on a computer cluster, using 100 to 200 nodes in parallel. The number 22 was chosen to fit the total amount of computational resources we had at our disposal (3 months of cluster time) and was, statistically speaking, more than adequate. All evolutionary runs were carried out for 2,000 generations using a population of 100 groups of 4 robots, with each group evaluated 3 times. This relatively small number of robots was chosen to limit the computational burden of the evolutionary runs. Nevertheless, we also verified whether the evolved controllers could be scaled to larger teams of 20 robots each. In this case, the foraging arena was scaled in direct proportion to the number of robots.
We used single-point crossover with crossover probability 0.3 and mutation probability 0.05. We chose generational replacement with 5% elitism, in order to exploit the parallel evaluation of multiple individuals on a computer cluster. We used roulette-wheel selection, that is, the probability of selecting a given genotype was proportional to its fitness relative to the average fitness of all genotypes in the population. As fitness criterion we used group performance, measured as the total number of items retrieved to the nest over a period of 5,000 simulated seconds. During post-evaluation, this same fitness criterion was used to evaluate the evolved controllers. We also assessed the average absolute linear speed of the robots along the long axis of the arena, measured as a percentage of the theoretical maximum speed, and the degree of task specialization, measured as the proportion of items that were retrieved through the action of multiple robots (i.e., by task specialists).
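One generation of the evolutionary loop described above can be sketched as follows. This is a generic sketch assuming integer-valued genotypes of fixed length and strictly positive fitnesses; the actual implementation details may differ:

```python
import random

# Sketch of one generation: 5% elitism, roulette-wheel (fitness-proportionate)
# selection, single-point crossover (prob. 0.3), per-codon mutation (prob. 0.05).
# Genotypes are modeled as fixed-length integer lists; names are illustrative.

P_CROSS, P_MUT, ELITE_FRAC = 0.3, 0.05, 0.05

def roulette(population, fitnesses, rng):
    """Pick one genotype with probability proportional to its fitness."""
    total = sum(fitnesses)
    r = rng.uniform(0, total)
    acc = 0.0
    for geno, fit in zip(population, fitnesses):
        acc += fit
        if acc >= r:
            return geno
    return population[-1]

def next_generation(population, fitnesses, rng=random):
    ranked = sorted(zip(fitnesses, population), key=lambda t: t[0], reverse=True)
    n_elite = max(1, int(ELITE_FRAC * len(population)))
    new_pop = [list(g) for _, g in ranked[:n_elite]]          # elitism: keep best
    while len(new_pop) < len(population):
        a = list(roulette(population, fitnesses, rng))
        b = list(roulette(population, fitnesses, rng))
        if rng.random() < P_CROSS:                            # single-point crossover
            cut = rng.randrange(1, len(a))
            a = a[:cut] + b[cut:]
        # per-codon mutation: replace a codon with a fresh random value
        a = [g if rng.random() >= P_MUT else rng.randrange(256) for g in a]
        new_pop.append(a)
    return new_pop
```

Because whole groups share one genotype, each fitness value here corresponds to the performance of an entire team, i.e. selection acts at the team level.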

Discussion

One of the unsolved mysteries in biology is how a blind process of Darwinian selection could have led to the hugely complex forms of sociality and division of labor observed in insect societies [4]. In the present paper, we used simulated teams of robots and artificially evolved them to achieve maximum team performance in a foraging task. Remarkably, we found that, as in social insects, this could favor the evolution of a self-organized division of labor, in which the different robots automatically specialized in carrying out different subtasks in the group. Furthermore, such a division of labor could be achieved merely by selecting on overall group performance and without pre-specifying how the global task of retrieving items would best be divided into smaller subtasks. This is the first time that a fully self-organized division of labor mechanism could be evolved entirely de-novo. Overall, these findings have several important implications. First, from a biological perspective, they yield novel evidence for the adaptive benefits of division of labor and the environmental conditions that select for it [4], provide a possible mechanistic underpinning for effective task specialization and task allocation [4], and suggest possible evolutionary pathways to complex sociality. Second, from an engineering perspective, our nature-inspired evolutionary method of Grammatical Evolution clearly has significant potential as a method for the automated design of adaptively behaving teams of robots. In terms of the adaptive benefits of division of labor and the environmental conditions that select for it, our results demonstrated that task partitioning was favored only when features in the environment (in our case, a slope) could be exploited to achieve more economic transport and reduce switching costs, thereby causing specialization to increase the net efficiency of the group.
Previous theoretical work has attributed the evolution of task specialization to several ultimate factors, some of which are hard to test empirically [61]. Duarte et al. [4], for example, reviewed modeling studies showing that the adaptive benefits of a behaviorally defined division of labor could be linked to reduced switching costs between different tasks or locations in the environment, increased individual efficiency due to specialization, increased behavioral flexibility, or reduced mortality when only older individuals engage in more risky tasks (“age polyethism”). Of these, there is widespread agreement on the role of switching costs and positional effects as key factors in promoting task specialization [4,10,47,62], and our work confirms this hypothesis. Indeed, in our setup, task partitioning greatly reduced the amount of costly switching required between environmental locations. Furthermore, our work also confirms the economic transport hypothesis, i.e. that task partitioning results in more economical transport, which in our case was due to the fact that gravity acted as a helping hand in transporting the items. Previously, this hypothesis had also found significant empirical support [7,43,46,48], e.g. from the fact that in leafcutter ants, species that collect leaves from trees tend to engage in task-partitioned leaf retrieval, whereas species living in more homogeneous grassland usually retrieve leaf fragments in an unpartitioned way, without first dropping the leaves, particularly at close range to the nest [43,49]. A surprising result in our evolutionary experiments was that adaptive task specialization was achieved despite the fact that the robots in each team all had identical controllers encoded by the same genotype.
This implies that a combination of individual experience, stigmergy and stochastic switching alone was able to generate adaptive task specialization, akin to some of the documented mechanisms involved in behavioral task specialization in some asexually reproducing ants [63] and in cell differentiation in multicellular organisms and clonal bacterial lineages [59,64,65]. The choice of using homogeneous, clonal groups of robots with identical morphologies precluded the evolution of other mechanisms of division of labor observed in nature, based, for instance, on morphological [4,12] or genetic [4] role specialization. Such mechanisms, however, could be considered in the future by allowing genetically heterogeneous robot teams [28] or evolvable robot morphologies. Lastly, the grammar we used did not specifically allow for recruitment signals to evolve, such as those observed in leafcutter ants, where both trail pheromones and stridulation are used as mechanisms to recruit leaf cutters [66,67], or in honeybees, where the tremble dance is used to regulate the balance between the number of foragers and nectar receivers inside the colony [68,69]. Nevertheless, including low-level primitives for communication behavior in the grammar, which we plan to do in future work, would readily allow such mechanisms to evolve, and would likely boost the performance of the evolved controllers even further (cf. [26,27]). In terms of the mechanisms of task specialization and task allocation evolved, our work is important in that it alleviates one of the limitations of existing models of the evolution of task specialization, namely that they normally take pre-specified subtasks and an existing task allocation model (e.g. the response threshold model) as their point of departure [4], thereby greatly constraining the path of evolution.
Our work is an important cornerstone in establishing, to the best of our knowledge, the first model that bridges the gap between self-organization and evolution without significantly constraining the behavioral strategies and coordination mechanisms that can be obtained to achieve optimal task specialization and task allocation. In fact, compared to previous studies on the evolution of task specialization [47,62,70–72], our work is the first to consider non-predefined sub-tasks that could evolve de-novo and combine into complex individual behavioral patterns. Although our experiments demonstrate that division of labor and behavioral specialization in teams of identical robots could evolve in both of the scenarios we considered, fitness landscape analyses showed that optimal task allocation could be achieved more easily if optimized behaviors capable of carrying out the different subtasks were available as pre-adapted behavioral building blocks. This leads us to suggest that when building blocks are solidified in earlier stages of evolution, complex coordination strategies such as task specialization are more likely to evolve, as the fitness landscape becomes smoother and easier to explore due to its greatly reduced size. In addition, it lends further support to the hypothesis that, in nature, the evolution of division of labor in social groups and other transitions in the evolution of sociality tend to be based on the co-option of pre-existing behavioral patterns, as opposed to requiring the de-novo evolution of many entirely new social traits [17]. Our results, therefore, match and can be integrated with the available evidence on the importance of preadaptations in the origin of advanced forms of sociality [2,17–22,73].
For example, reproductive division of labor and worker task specialization are thought to be derived from mechanisms that initially regulated reproduction and foraging in solitary ancestors [17,20–22], sibling care is thought to be derived from ancestral parental care [19], and reproductive altruism (i.e., a sterile soma) in some multicellular organisms evolved via the co-option of a reproduction-inhibiting gene expressed under adverse environmental conditions [73]. Furthermore, our work confirms other studies that have examined the building-block hypothesis in various digital systems, for example in the context of genetic algorithms [74], the evolution of single-robot morphologies [75] and the open-ended evolution of simple computer programs [76]. From an engineering perspective, our study is the first to achieve a complex form of division of labor using an evolutionary swarm robotics approach, and the first to use the method of Grammatical Evolution to evolve complex, non-trivial behavioral patterns. This result is novel in the field of evolutionary swarm robotics, where, few exceptions aside, most studies have used non-incremental and non-modular approaches, e.g. based on monolithic neural networks [38,77]. In fact, the only previous studies to evolve a rudimentary form of task allocation in swarms of robots were those of Tuci et al. [78], who used a neural network controller combined with a fitness function favoring a required preset task allocation, of Duarte et al. [40], who used evolved neural network controllers capable of carrying out particular subtasks, which were then combined with a manually engineered decision tree, and of refs. [79–81], which used open-ended evolution and a simplified robotic scenario to evolve heterogeneous behaviors for collective construction [79,80] and pursuit [81] tasks in the presence of a pre-specified set of three sub-tasks.
Typically, the behavioral complexity that could be reached in these artificial neural network-based studies was quite limited, putting the evolution of self-organized task specialization in homogeneous groups out of reach for these methods. In fact, the evolution of self-organized task specialization would clearly require a non-standard neural network approach, involving recurrent neural connections to keep track of internal state (e.g. the current direction of motion, to be able to perform phototaxis), a mechanism to achieve modularity, and a mechanism to switch stochastically between these modules. Extending the neural network approach used in evolutionary swarm robotics to this level of complexity would be an interesting task for the future. Other studies on task allocation and task partitioning in swarm robotics have typically used traditional, manually engineered approaches [82–88] (reviewed in [89]). All these methods are significantly less general than ours, given that we used a nature-inspired automatic design method with a single fitness criterion, group performance, without any pre-engineered decision-making mechanisms, and simultaneously evolved a self-organized task decomposition and task allocation mechanism as well as optimized behaviors to carry out each of the evolved subtasks. We therefore believe that GESwarm and grammatical evolution will play a key role in the future of evolutionary swarm robotics. In conclusion, our work and the results we obtained are important both to explain the origin of division of labor and complex social traits in nature and to advance the field of evolutionary swarm robotics, as we showed that the novel methodological and experimental tools we developed were able to synthesize controllers beyond the level of complexity achieved to date in the field.

Author Contributions Conceived and designed the experiments: EF AET EDG TW. Performed the experiments: EF. Analyzed the data: EF AET TW. Contributed reagents/materials/analysis tools: EF AET MD TW. Wrote the paper: EF AET MD TW.