Experimental protocol

Once the computer had provided a recipe, whether generated by a genetic algorithm, a lattice search or any other method, the 3D-printed device was programmed to prepare the mixture, execute five experiments and clean the arena ready for the next recipe.

To mix the oils through the serpentine channel, the oil with the highest value in the recipe was assigned a volume of 400 μl, and every other oil was assigned a volume scaled from 400 μl by its ratio in the recipe. These volumes were pumped into the device and through the serpentine channel at flow rates derived from the same ratios: the oil with the highest volume was pumped at 2.5 μl per second, and the other oils at proportionally slower rates. The mixture was dispensed into the device arena; once mixing was complete, the device removed the contents into a waste drum and the arena was washed with 2.5 ml of acetone. This process served both to prepare the new mixture and to flush out the remains of the previous one. The mixing process was repeated three times.

After this, a series of cleaning cycles was executed to ensure that the arena was fully clean. The arena was first washed with 3 ml of acetone and emptied, then washed with 2.5 ml of aqueous phase and emptied. It was then washed with 10 ml of acetone and emptied; this step was executed twice. Finally, it was washed with 3.75 ml of aqueous phase and emptied; this step was also executed twice.

To perform the experiment, the arena was filled with 3.75 ml of aqueous phase, and five oil injections of 10 μl each were then made to generate the oil droplets. The experiment was recorded using a camera and lasted 1 min.

To clean the arena for the next experiment, it was filled with 7.5 ml of acetone and emptied, then filled with 3.5 ml of aqueous phase and emptied. This last step was repeated twice. The device then either refilled the arena with aqueous phase to run the same recipe again (each recipe was tested five times) or, if a new recipe had been sent from the computer, started again with the mixing procedure.
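The scaling of volumes and flow rates described in this protocol can be illustrated with a short sketch. It is not the firmware used on the platform; the function name `pump_settings` and the dictionary layout are hypothetical, and the sketch assumes that "proportional" means a linear scaling relative to the largest recipe value.

```python
MAX_VOLUME_UL = 400.0   # volume assigned to the oil with the highest recipe value
MAX_RATE_UL_S = 2.5     # flow rate assigned to that same oil


def pump_settings(recipe):
    """Map each oil to a (volume in ul, flow rate in ul/s) pair.

    `recipe` maps oil names to their ratio in the mixture. The oil with the
    highest ratio gets the maximum volume and rate; the others are scaled
    linearly (an assumption about how "proportional" was implemented).
    """
    top = max(recipe.values())
    if top <= 0:
        raise ValueError("at least one oil must have a positive ratio")
    settings = {}
    for oil, ratio in recipe.items():
        scale = ratio / top
        settings[oil] = (MAX_VOLUME_UL * scale, MAX_RATE_UL_S * scale)
    return settings


example = pump_settings(
    {"1-octanol": 0.5, "DEP": 0.25, "octanoic acid": 0.25, "1-pentanol": 0.0}
)
```

Under this reading, volume and rate are scaled by the same factor, so every active pump would finish dispensing at the same time (400 μl at 2.5 μl per second is 160 s).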

Preparation of solutions

To prepare the aqueous phase, NaOH (20.0 g) was dissolved in distilled water (ca. 4.8 l), then TTAB (33.65 g) was added. Finally, the pH was adjusted to 13 using 6 M NaOH solution, and the volume was adjusted to 5 l. The pH meter used was calibrated between pH 7 and 10. The oils 1-octanol, 1-pentanol and DEP were prepared in 200 ml aliquots in reagent bottles, while octanoic acid was diluted with 1-pentanol (20% octanoic acid, 80% 1-pentanol) and also prepared as a 200 ml aliquot in a reagent bottle. DEP and 1-octanol were dyed with 0.25 mg ml−1 Sudan II blue and vortexed to mix. 1-Pentanol and octanoic acid were dyed with 0.25 mg ml−1 Sudan III red and vortexed to mix.
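As a rough check on the quantities above, the nominal concentrations in the final 5 l can be estimated as follows. This sketch assumes that TTAB is tetradecyltrimethylammonium bromide (molar mass 336.39 g/mol), which the text does not spell out, and it ignores the small amount of 6 M NaOH added during the final pH adjustment. The pH estimate is the ideal dilute-solution value at 25 °C, not a measurement.

```python
import math

NAOH_MOLAR_MASS = 40.00    # g/mol
TTAB_MOLAR_MASS = 336.39   # g/mol, assuming tetradecyltrimethylammonium bromide


def molarity(mass_g, molar_mass_g_mol, volume_l):
    """Concentration in mol/l for a given mass dissolved in a given volume."""
    return mass_g / molar_mass_g_mol / volume_l


naoh_molar = molarity(20.0, NAOH_MOLAR_MASS, 5.0)    # 0.1 mol/l
ttab_molar = molarity(33.65, TTAB_MOLAR_MASS, 5.0)   # about 0.020 mol/l

# Ideal estimate: pH = 14 - pOH, with pOH = -log10([OH-])
ideal_ph = 14.0 + math.log10(naoh_molar)
```

The estimate of about pH 13 from the weighed NaOH alone is consistent with the stated target pH.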

3D-printed device manufacturing

The devices were designed using the CAD software “Rhinoceros 5”. The 3D models were exported as STL files, which were converted into “G-code” using the software “Bits from Bytes Axon 2” (see Supplementary Fig. 1 for the configuration used). The devices were printed using the 3D printer “Bits from Bytes 3D touch” with PP filament of 3 mm diameter. Both transparent and white filament were used without any significant difference. Supplementary Fig. 2 shows the base design of the device used during the experiments. Variations of this device differed only in the obstacles added to the arena. Once a device was 3D printed, the only manual operation required was to tap its inlets with an M6 thread so that they could be connected to tube connectors.

3D-printed fluidic platform

A fully automated fluidic device capable of producing droplets in a Petri dish-like arena with an aqueous subphase was constructed. The device was a monolithic piece printed on a commercial 3D printer in its standard configuration. It had a series of inlets and outlets connected to commercial liquid pumps. The pumps used were defective “Tricontinent C-Series Syringe Pumps”. Their stepper motors were connected to “Pololu a4988” drivers, which were controlled using an “Arduino Due” board. A homemade PCB shield was used to connect the stepper drivers to the Arduino board (Supplementary Fig. 5), and custom firmware was written using the Arduino suite in order to control the pumps. The syringe pumps used 500 μl syringes for the four oil phases and 5 ml syringes for all the other pumps. The pumps used three-way PEEK valves, as provided by Tricontinent. FEP tubing was used to connect all the liquid components. A camera above the arena was used for video recording and image analysis. Supplementary Fig. 6 shows an overall picture of the platform with the main parts highlighted.

Droplet generation calibration

Although the devices were always printed in the same way, the printed result varied slightly from device to device. This variation mattered at the droplet generator outlet, as can be seen in Supplementary Fig. 3. To obtain homogeneous droplet generation across devices with potentially different outlet sizes, the speed at which the pumps generated pulses was calibrated so that droplets were generated reliably when only 1-octanol was present in the mixture.

Image processing and droplet detection

A Microsoft LifeCam Cinema web camera was positioned above the arena to record the experiment. During the experiment, the camera stream was fed into a running Python (2.7.11) OpenCV (2.4.12) script, which performed the image analysis and returned the number of droplets active at the end of the experiment. The video was configured to 800 × 600 pixels and 30 frames per second (FPS). The first step consisted of defining a circular area with a radius of 275 pixels. This area overlapped with the circular experimental arena, and only the pixels inside it were considered for image processing. Background subtraction was performed using a mixture of Gaussians (MoG) model; OpenCV’s MoG implementation was used with its default configuration values, and the model was reinitialised before each experiment. Because everything in the scene remained constant except the oil droplets, all the pixels marked as foreground were taken to be droplets. The extracted foreground was then passed to OpenCV’s find contours operation in order to delineate the droplets. At the end of the experiment, the number of active droplets was returned as a fitness value. Because the MoG used a small window, droplets that remained static for a few seconds were absorbed into the background and discarded. In this way, only the droplets that kept moving were considered part of the foreground. See Supplementary Fig. 4 for pictures of how the different steps were performed.
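The region-of-interest and counting steps can be illustrated with a dependency-free sketch. It does not reproduce the MoG background subtraction, and it replaces OpenCV’s find contours operation with a plain flood fill over 8-connected foreground pixels. The function names are hypothetical, and the position of the circle centre is left as a parameter because the text does not state it.

```python
from collections import deque


def inside_roi(x, y, cx, cy, radius):
    """True if pixel (x, y) lies within the circular region of interest."""
    return (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2


def count_blobs(foreground, cx, cy, radius):
    """Count 8-connected groups of foreground pixels inside the circular ROI.

    `foreground` is a list of rows of 0/1 values (a binary foreground mask).
    This is a stand-in for contour detection, not the OpenCV routine itself.
    """
    height, width = len(foreground), len(foreground[0])
    seen = set()
    blobs = 0
    for y in range(height):
        for x in range(width):
            if (not foreground[y][x] or (x, y) in seen
                    or not inside_roi(x, y, cx, cy, radius)):
                continue
            blobs += 1
            seen.add((x, y))
            queue = deque([(x, y)])
            while queue:  # flood fill the whole blob
                px, py = queue.popleft()
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        nx, ny = px + dx, py + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and (nx, ny) not in seen
                                and foreground[ny][nx]
                                and inside_roi(nx, ny, cx, cy, radius)):
                            seen.add((nx, ny))
                            queue.append((nx, ny))
    return blobs
```

For the 800 × 600 frames described above, a circle of radius 275 pixels centred on the frame (an assumption) would leave a 25-pixel margin at the top and bottom.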

Lattice search

For the lattice search, combinations of the four oils, taken individually, in pairs, in threes or all four together, were tested with a granularity of 10%. When the oils were tested individually, only one experiment was performed per oil, in which that oil was active and all the others were inactive. In all the other cases, a sequential and evenly spaced search was performed, where every step represented 10%. For example, in the case of two oils, there would be nine possible combinations (ignoring the extremes where only one of the oils is active): 0.1/0.9, 0.2/0.8, 0.3/0.7, 0.4/0.6, 0.5/0.5, 0.6/0.4, 0.7/0.3, 0.8/0.2 and 0.9/0.1. The same procedure was applied to combinations of three or four oils. In total, there were 11 different combinations of oils: six pairs, four triples and one quadruple. Therefore, our lattice search consisted of 282 different oil formulations. As before, each formulation was tested five times, generating 1410 videos. Supplementary Fig. 7 shows the most interesting results.
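The lattice can be enumerated with a short sketch, which also serves as a check on the stated totals. The function names are hypothetical; the sketch assumes that, within each group of oils, every oil takes a non-zero multiple of 10% and the values sum to 100%.

```python
from itertools import combinations

OILS = ("1-octanol", "DEP", "octanoic acid", "1-pentanol")
STEPS = 10  # 10% granularity


def compositions(total, parts):
    """Yield tuples of `parts` positive integers that sum to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def lattice(group_size):
    """List every recipe in which exactly `group_size` oils are active."""
    recipes = []
    for group in combinations(OILS, group_size):
        for tenths in compositions(STEPS, group_size):
            recipe = dict.fromkeys(OILS, 0.0)
            for oil, tenth in zip(group, tenths):
                recipe[oil] = tenth / STEPS
            recipes.append(recipe)
    return recipes


counts = {size: len(lattice(size)) for size in (2, 3, 4)}
total_formulations = sum(counts.values())
```

Under these assumptions the pairs contribute 6 × 9 = 54 formulations, the triples 4 × 36 = 144 and the quadruple 84, which gives the 282 formulations and 1410 videos quoted above. The four single-oil experiments appear to be counted separately from this total.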

Genetic algorithm and fitness function

A genetic algorithm (GA) was programmed using LabView 2015 standard libraries. Each GA run, except where described otherwise, consisted of 10 generations, and each generation had a population of 20 individuals. Each individual was defined by its recipe, which was the ratio between the four oils used (1-octanol, DEP, octanoic acid and 1-pentanol). Each oil was assigned a real number between 0 and 1, and the four values for a given recipe always summed to 1. The first generation was constructed by assigning randomly generated recipes to each individual. Each individual was tested five times using the protocol described, and the average of these five executions was returned as its fitness value. Once all the individuals of a generation had been given a fitness value, a new generation was constructed by choosing 10 parents from the generation just finished using the roulette wheel algorithm, in which individuals were selected with probability directly proportional to their fitness value. A given parent could appear only once in the following generation. The other 10 individuals were constructed by crossing the parents in randomly chosen pairs (each individual was chosen twice as a parent), applying a one-point crossover at a random position and a Gaussian 10% mutation (noise sampled from a Gaussian distribution of mean 0 and variance 0.1). The final recipe was then normalised to sum to 1.
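One generation step of the scheme described above might look like the following sketch. It is written in Python for illustration only; the original was implemented in LabView and is not reproduced here. Several details are assumptions not stated in the text: noise is added to every oil value, negative values are clipped to zero before normalisation, a parent may be paired with itself, and selection falls back to a uniform draw if all remaining fitness values are zero.

```python
import random

N_OILS = 4
N_PARENTS = 10


def normalise(recipe):
    """Scale a recipe so that its entries sum to 1."""
    total = sum(recipe)
    if total <= 0:
        # Assumption: use an even mixture if every value was clipped to zero.
        return [1.0 / len(recipe)] * len(recipe)
    return [value / total for value in recipe]


def roulette_select(population, fitness, k, rng):
    """Pick k distinct individuals, each draw proportional to fitness."""
    pool = list(range(len(population)))
    chosen = []
    for _ in range(k):
        weights = [fitness[i] for i in pool]
        if sum(weights) > 0:
            pick = rng.choices(pool, weights=weights, k=1)[0]
        else:
            pick = rng.choice(pool)  # assumption: uniform fallback
        pool.remove(pick)  # a parent can be selected only once
        chosen.append(population[pick])
    return chosen


def make_child(a, b, rng, sigma=0.1 ** 0.5):
    """One-point crossover, Gaussian mutation (variance 0.1), normalisation."""
    cut = rng.randint(1, N_OILS - 1)  # crossover point between two genes
    child = a[:cut] + b[cut:]
    child = [max(0.0, value + rng.gauss(0.0, sigma)) for value in child]
    return normalise(child)


def next_generation(population, fitness, rng):
    """Return 10 surviving parents followed by 10 children."""
    parents = roulette_select(population, fitness, N_PARENTS, rng)
    first, second = parents[:], parents[:]
    rng.shuffle(first)
    rng.shuffle(second)  # each parent appears once in each list, so twice overall
    children = [make_child(a, b, rng) for a, b in zip(first, second)]
    return parents + children
```

Passing an explicit `random.Random` instance keeps a run reproducible, for example `next_generation(population, fitness, random.Random(0))`.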

Fitness function

Given a frame as provided by the camera stream, the droplets in that frame were detected using the algorithm described in “Image processing and droplet detection” within this “Methods” section. Given an experiment, the fitness function calculated its fitness value as the number of droplets detected in the last frame of the provided camera stream. Because the experiments ran for 1 min at 30 frames per second, our fitness function can be defined exactly as the number of droplets detected by our algorithm in frame 1800. Because our droplet detection algorithm is based on background subtraction in which the droplets are treated as foreground, a droplet that did not move for a set period of time (roughly 3 s) was marked as part of the background and removed from the count.
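The frame arithmetic and the effect of the static timeout can be expressed as a simplified model. This is not the MoG-based detector itself: it assumes, hypothetically, that for each droplet the frame of its last detected movement is known, and it treats the "roughly 3 s" timeout as exactly 3 s.

```python
FPS = 30
DURATION_S = 60
STATIC_TIMEOUT_S = 3  # "roughly 3 s" in the text; exact value is an assumption

FINAL_FRAME = FPS * DURATION_S          # frame on which fitness is evaluated
STATIC_WINDOW = FPS * STATIC_TIMEOUT_S  # frames before a still droplet is dropped


def fitness(last_moved_frames):
    """Count droplets that moved within the static window before the final frame.

    `last_moved_frames` holds, for each droplet, the frame index at which it
    was last seen moving (a hypothetical input for this simplified model).
    """
    return sum(1 for frame in last_moved_frames
               if FINAL_FRAME - frame < STATIC_WINDOW)


example_fitness = fitness([1800, 1750, 1700, 900])
```

In this example, the two droplets that last moved at frames 1800 and 1750 are counted, while those that stopped at frames 1700 and 900 are treated as background.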

Experimental reproducibility and control tests

The experiment shown in Fig. 5 was repeated four times with similar results, see Supplementary Figs. 8, 9, 10 and 11. The same experiment using an arena with pillars was repeated twice, see Supplementary Figs. 12 and 13. The same experiment using an arena with a procedurally generated environment was also repeated twice, see Supplementary Figs. 14 and 15. The experiment described in Fig. 6 was repeated once, with similar results, see Supplementary Fig. 16.

In order to study the drop in the evolutionary trajectories, the same hybrid GA run was performed, but the first empty arena device was swapped for another empty arena device. In this case, the evolutionary trajectory kept a similar value, see Supplementary Figs. 20 and 21. The opposite experiment was also performed, in which an initial GA run using the pillars arena was followed by a swap to the empty arena. The evolutionary trajectories grew slightly, see Supplementary Fig. 22. The experiment in which an empty environment was swapped for a procedurally generated one was also performed, see Supplementary Figs. 23 and 24. In both cases, the evolutionary trajectories dropped, but not as much as before. This reinforces the results seen in Fig. 6.

Arena obstacles specification

The pillars used both in the “pillars environment” and in the “cave environment” had a diameter of 2 mm. Their height varied with their position, because the base of the arena had a slope. The shortest ones had a height of around 4.7 mm, while the tallest ones had a height of around 6.2 mm. In the “pillars environment”, the pillars were placed in a grid, so that each pillar had a neighbour in each of the four directions (north, south, east and west) wherever possible. The distance between pillars was 3 mm. See Supplementary Fig. 25.

The second custom arena used the same pillars as before, but instead of placing them manually in a grid, an algorithm was used to generate a pattern. The algorithm used was a Lindenmayer system. Examples of the patterns generated using this approach can be seen in Supplementary Fig. 26. The actual device used can be seen in Supplementary Fig. 27.
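For readers unfamiliar with Lindenmayer systems, the core mechanism is a parallel string rewriting, sketched below. The rules used to generate the arena are not given in the text, so the example uses Lindenmayer's classic "algae" rules purely for illustration; turning the resulting string into pillar positions would require a further drawing step that is not shown.

```python
def expand(axiom, rules, iterations):
    """Apply the rewriting rules to every symbol in parallel, `iterations` times.

    Symbols without a rule are copied unchanged.
    """
    state = axiom
    for _ in range(iterations):
        state = "".join(rules.get(symbol, symbol) for symbol in state)
    return state


# Classic textbook example, not the rules used for the arena.
ALGAE_RULES = {"A": "AB", "B": "A"}

generations = [expand("A", ALGAE_RULES, n) for n in range(5)]
# A, AB, ABA, ABAAB, ABAABABA
```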

Fitness landscape model

The fitness landscapes shown in Fig. 6 were generated using support vector regression (SVR) with a radial basis function kernel. The experiment consisted of the described GA run in which the first 10 generations used an empty environment, the following 10 used an environment populated with pillars and the last 10 used an environment with cave-like structures. For each of the three environments, data from a full GA run were collected, resulting in a data set of 200 experiments each. Because each experiment was repeated five times, each fitness landscape represents 1000 data points. To find the most accurate representation, the best parameters for the SVR were estimated using 10-fold cross-validation with respect to the mean squared error on the training set. The C parameter (the penalty parameter of the error term) was searched within [0.01, 0.1, 1, 10, 100]. The best parameters were C = 100 and gamma = 10. Fitness landscapes are shown as ternary plots, a way to represent on a 2D plane three coupled variables that sum to a constant. In our case, we represent three of the four components, with the fourth held constant at zero. Our parameters represent the fraction of each oil, so the sum is constrained to 1. Supplementary Fig. 28 shows all the fitness landscapes produced this way with the collected data. The third row of this figure contains the fitness landscapes shown in Fig. 6.
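The ternary representation mentioned above maps three fractions that sum to a constant onto a point inside an equilateral triangle. A minimal sketch of one common convention follows; the corner assignment and the function name `ternary_to_xy` are illustrative choices, not taken from the plotting code used for the figures.

```python
import math


def ternary_to_xy(a, b, c):
    """Project three non-negative components onto 2D ternary coordinates.

    The components are first normalised to sum to 1. With this convention,
    pure `a` maps to (0, 0), pure `b` to (1, 0) and pure `c` to the apex
    (0.5, sqrt(3)/2) of a unit equilateral triangle.
    """
    total = a + b + c
    if total <= 0:
        raise ValueError("components must have a positive sum")
    b, c = b / total, c / total
    x = b + c / 2.0
    y = c * math.sqrt(3.0) / 2.0
    return x, y


centre = ternary_to_xy(1.0, 1.0, 1.0)  # equal parts of the three oils
```

An equal mixture of the three plotted oils lands at the centroid of the triangle, and any recipe with one oil at zero lands on the opposite edge.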

Each of the environments was also tested individually, meaning that isolated GA runs were performed for each environment with a randomly generated first generation. Supplementary Fig. 29 shows the fitness landscapes generated with these data, which were also produced using an SVR. For each environment (empty, pillars and caves), data from two full GA runs were collated, resulting in a data set of 400 experiments each. Because every experiment was repeated five times, 2000 points were used. The fitness landscapes were then generated using the method just described. For the empty environment, the best parameters were {‘C’: 100, ‘gamma’: 10} for an average mean square error of 6.83 (std = 4.00). For the pillars environment, the best parameters were {‘C’: 10, ‘gamma’: 100} for an average mean square error of 21.30 (std = 10.68). For the caves environment, the best parameters were {‘C’: 10, ‘gamma’: 100} for an average mean square error of 11.50 (std = 4.50).

Supplementary Movie 3 describes the evolution of the fitness landscapes through the different environments. Its fitness landscapes were calculated from the same data set as Fig. 6 in the main manuscript, but using kernel ridge regression, again with a radial basis function kernel. The main difference is that in this case the same kernel was used to calculate all the fitness landscapes. The parameters used here were {‘alpha’: 10, ‘gamma’: 100}.

Computer code availability

All relevant computer code is available from the authors on reasonable request.

Data availability

All relevant data are available from the authors on reasonable request.