Fifteen grey wolves (11 males, 4 females; age: 2 to 8 years) and 12 mixed-breed dogs (7 males, 5 females; age: 2 to 7 years) housed at the Wolf Science Center (WSC) in Ernstbrunn, Austria, participated in loose-string experiments together with a human partner (Fig. 1). The partner worked as a ‘trainer and keeper’ at the WSC and thus had extensive daily contact with the animals in a variety of settings (leash walking, tests, animal care, hand-raising, etc.), thereby establishing a close affiliative bond with them. Each animal was tested with the trainer it had the best relationship with. The trainer was kept constant across all sessions for a given animal, but trainers varied between animals.

Figure 1 Illustration of a wolf (A) and a dog (B) working with a human cooperation partner (photographs by Rooobert Bayer).

All animal-human dyads were first exposed to 6 sessions of 6 trials each of the ‘classic’ version of the loose-string paradigm [19–24], in which a single tray was presented (spontaneous condition). At the start of each test trial, the animal subject (either wolf or dog) and the human cooperation partner were positioned in the shifting system facing the test tray at a distance of 40 m. In 50% of the trials, we released the animal when the human partner had reached the middle of the enclosure, so that the animal reached the tray approximately 3 seconds before the human cooperation partner. In these trials, the animal could choose the side on which it wanted to pull the rope and then wait for the human partner to arrive and pull the other end. In the other 50% of the trials, the animal was released when the human cooperation partner was only 3 meters away from the tray, so that the human cooperation partner arrived at the tray approximately 3 seconds before the animal and could choose which rope end to pull. The human cooperation partner was not allowed to talk to, look at or gesture to the animal in any way during a trial and followed a strict protocol (see SI Materials and Methods for details on the procedures and protocol), so that the animal had to coordinate with the human partner and not vice versa.

Since all subjects except 3 wolves had previously participated in the string-pulling paradigm with conspecific partners and thus had different experiences [8], we included in the analyses two variables that captured the more salient aspects of their previous experience with the task: (a) training, i.e. whether animals had received individual training to hold both ends of the rope in their mouth and pull the tray forward before being tested with a conspecific partner; and (b) previous success, i.e. the average percentage of trays successfully solved across testing in all conditions with all previous conspecific partners. For example, if an individual was tested with 3 different partners in the one-tray condition, with 2 of these partners in the two-apparatus condition, and once in the delay condition where the partner was released 10 seconds later, the individual’s score was calculated as the mean percentage of success across these 6 partners/conditions (see [8]; see ST9 for a list of the animals with their respective experiences).
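The ‘previous success’ score described above is a plain mean over partner/condition blocks. A minimal sketch in Python, assuming one success percentage per block; the partner labels and percentages are hypothetical and are not taken from the study. How the 3 wolves without prior experience were scored is not stated here, so the sketch simply refuses an empty input.

```python
from statistics import mean

def previous_success(percent_by_block):
    """Mean percentage of successfully solved trays across partner/condition blocks."""
    if not percent_by_block:
        raise ValueError("need at least one partner/condition block")
    return mean(percent_by_block)

# Hypothetical individual: 3 partners in the one-tray condition, 2 of them in the
# two-apparatus condition, and one delay-condition block (6 blocks in total).
blocks = {
    ("one-tray", "partner A"): 80.0,
    ("one-tray", "partner B"): 60.0,
    ("one-tray", "partner C"): 70.0,
    ("two-apparatus", "partner A"): 50.0,
    ("two-apparatus", "partner B"): 40.0,
    ("delay", "partner A"): 60.0,
}

score = previous_success(list(blocks.values()))
print(f"previous success: {score:.1f}%")
```

Each block counts equally in this sketch, regardless of how many trials it contained; whether the original scoring weighted blocks by trial number is not specified in the text.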

We found that when tested with a human partner in the spontaneous condition, wolves succeeded above chance level, on average in 61.5% of trials (range: 25–91%; binomial test: p < 0.001; see ST 11 for the means and SDs of the coded behaviours). Dogs, on the other hand, performed at chance level, succeeding on average in 49% of trials (range: 0–88%; binomial test: p = 0.32). Only one animal never succeeded in solving the task with the human partner (Imara, a dog). We ran a Generalized Linear Mixed Model (GLMM; proportional data, binomial distribution) with the number of successes in each session as the dependent variable (success being defined as getting access to the food by pulling the rope simultaneously with the partner), species, session, previous success and training as the explanatory factors, and animal ID as the random factor. This was followed by a model comparison approach based on AICc (MuMIn package in R; see Supplementary Information for details of the statistical analyses). The best model showed a strong effect of session, with success increasing across sessions (for model comparisons see Supplementary Table 1 (ST1)). Furthermore, having successfully cooperated with conspecifics in the previous experiment (‘previous success’) also weighed on the likelihood of success, but to a lesser extent (Table 1). Species and prior training on the tray had very little impact on the animals’ rate of success.
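To illustrate the comparison against chance, the following is a sketch of an exact two-sided binomial test against p = 0.5, written with the Python standard library. The pooled success count is hypothetical: the text reports mean percentages across animals, not pooled counts, and it does not state whether trials were pooled for the test. Only the number of trials (15 wolves, 6 sessions, 6 trials each) follows from the text.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials, for 0 < p < 1."""
    log_coef = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coef + k * math.log(p) + (n - k) * math.log(1 - p))

def binom_test_two_sided(k, n, p=0.5):
    """Exact two-sided p-value: summed probability of all outcomes
    that are no more likely than the observed one."""
    cutoff = binom_pmf(k, n, p) * (1 + 1e-7)  # tolerance for floating-point ties
    total = sum(
        prob
        for prob in (binom_pmf(i, n, p) for i in range(n + 1))
        if prob <= cutoff
    )
    return min(1.0, total)

N_WOLF_TRIALS = 15 * 6 * 6      # 15 wolves x 6 sessions x 6 trials = 540
HYPOTHETICAL_SUCCESSES = 332    # assumption: roughly 61.5% of 540, for illustration only

p_value = binom_test_two_sided(HYPOTHETICAL_SUCCESSES, N_WOLF_TRIALS)
print(f"two-sided p = {p_value:.2e}")
```

Pooling trials treats them as independent, which ignores repeated measures within an animal; the mixed models with animal ID as a random factor address that dependence.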

Table 1 Estimated effect size, adjusted standard error (SE), z-value and relative variable importance (RVI) estimated by a generalized linear mixed model to determine the effects of session (each session compared to session 1), training, previous success and species on the likelihood of success in the spontaneous condition.

Animals were significantly more successful in trials in which they arrived at the tray before the human did (wolves: mean animal first 88% vs. human first 50%; dogs: mean animal first 70% vs. human first 40%; Wilcoxon: V = 263, p = 0.0002), highlighting the difficulty for the animals of giving up their own preferences and adjusting to the human. Furthermore, we ran a number of GLMMs to evaluate whether wolves and dogs differed in their behaviors with the human partner during the trial, namely (a) the frequency of gazing at the human partner and (b) the likelihood of stealing the rope from the human partner, which usually happened when the human was at the tray first and chose the animal’s preferred rope side. Session and previous success were included as explanatory factors in all models, and animal ID as the random factor. Frequency of gazing was normalized by the time spent within one body length of the tray (using the offset function), since the behavior was recorded during the time the animals were in front of the tray (and hence within view of the camera).
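The normalization by time near the tray can be read as modelling a gazing rate rather than a raw count. As a sketch only, assuming a count model with a log link (the text names neither the distribution nor the link for this model), let $y_{ij}$ be the number of gazes by animal $i$ in trial $j$, $t_{ij}$ the time spent within one body length of the tray, $\mathbf{x}_{ij}$ the explanatory factors, and $u_i$ the random intercept for animal ID. All symbols are introduced here for illustration.

```latex
\log \mathbb{E}[y_{ij}]
  = \log t_{ij} + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i
\quad\Longleftrightarrow\quad
\log \frac{\mathbb{E}[y_{ij}]}{t_{ij}}
  = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i
```

Under this assumption the offset term $\log t_{ij}$ enters with a fixed coefficient of 1, so the remaining coefficients describe effects on gazes per unit of time at the tray.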

The average frequency of looking at the human partner per trial was 0.3 times (range: 0–8) for wolves and 0.7 times (range: 0–10) for dogs. An interaction between species and who arrived first at the tray (human vs. animal) emerged for the frequency of looking at the human partner (GLMM: Chisq = 4.989, df = 1, p = 0.026). In trials in which the human arrived first, there was no effect of species, but there was an effect of previous success and session (Table 2, model comparisons: ST2): animals that had less experience looked more at the human, and looking at the human partner increased overall across sessions.

Table 2 Estimated effect size, adjusted standard error (SE), z-value and relative variable importance (RVI) estimated by a generalized linear mixed model to determine the effects of session, species and previous success on gazing at the human partner in trials in which the human arrived first at the tray.

In trials in which the animal arrived first at the tray, there was no effect of previous success or session, but wolves looked at the partner significantly less frequently than dogs (mean looking at partner: wolf: 0.54, range 0–8; dog: 1.2, range 0–10) (Table 3, model comparisons: ST3).

Table 3 Estimated effect size, adjusted standard error (SE), z-value and relative variable importance (RVI) estimated by a generalized linear mixed model to determine the effects of session, species and previous success on gazing at the human partner in trials in which the animal arrived first at the tray.

When animals arrived at the tray first, the person was instructed to operate the opposite (free) rope, so no opportunity for stealing the rope from the human arose in these trials. We therefore limited our analyses of rope-stealing to trials in which the human arrived first. We found that 14 out of 15 wolves stole the rope in at least one trial (mean: 6.9; range 0–18), while only two out of 12 dogs stole the rope (one dog in two and one in five trials). Overall, wolves stole the rope in 19% of trials and thus were more likely than dogs to steal the rope from the human; however, this behavior decreased across sessions and was not affected by previous success (Table 4, model comparisons: ST4).

Table 4 Estimated effect size, adjusted standard error (SE), z-value and relative variable importance (RVI) estimated by a generalized linear mixed model to determine the effects of session, species and previous success on likelihood of the animals stealing the rope from the human partner.

Nine wolves (7 males, 2 females; age: 2 to 8 years) and 7 dogs (4 males, 3 females; age: 2 to 7 years) solved the single tray with the human partner on at least 4/6 trials in 2 consecutive sessions and thus qualified for the dual tray condition. Two of these wolves, despite reaching criterion, were not tested further because they were uncomfortable with unfamiliar people, which would have required trainers to run the entire testing. In this condition, the human was released first and approached the trays from the mid-line, only starting to directly approach the a priori designated tray when reaching the second half of the enclosure. The animal was always released when the human had reached the middle of the enclosure. Thus, while it had to adjust to the human with regard to which tray to solve first, the animal could, given its greater speed, arrive first and choose which rope end to pull. If the animal was the first to reach the designated tray, the human partner took the rope end not chosen by the animal. If the human cooperation partner reached the first tray before the animal, she took the rope end on the animal’s non-preferred side, as determined from the side preferences observed in the previous condition. Since in this condition we were primarily interested in whether the animals would coordinate their behaviour with the human partner and approach the same tray, we opted not to force them to also use their non-preferred side to pull the rope.

If the human cooperation partner and the animal solved the first tray successfully or if the rope was pulled out or stolen by one of the partners, the human cooperation partner either (1) followed the animal, if it moved towards the second tray or (2) waited for 5 seconds before approaching the second tray. Apart from these differences, the same protocol was followed as for the first experimental condition (see details of the experimental procedure in the Suppl. Material and ST 12 for the means and SDs of the coded behaviours).

Wolves successfully solved both trays in 76% (range 44–94%) and dogs in 67% (range 55–77%) of trials (both binomial tests: p < 0.001). A Generalized Linear Mixed Model (GLMM) with the number of successes (i.e. solving both trays in a trial) in each session as the dependent variable, species and session as the explanatory factors, and animal ID as the random factor, followed by a model comparison approach based on AICc (MuMIn package in R), revealed an effect of session but not species on the likelihood of success in the dual tray condition (see Table 5, model comparison: ST5), with animals becoming more successful across sessions.
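The AICc-based model comparison and the relative variable importance (RVI) reported in the tables can be illustrated with the standard definitions: AICc adds a small-sample correction to AIC, Akaike weights rescale the AICc differences, and the RVI of a term is the summed weight of the candidate models that contain it. The sketch below is not the authors' analysis; the candidate set, log-likelihoods, parameter counts and sample size are all hypothetical.

```python
import math

def aicc(log_lik, k, n):
    """AIC with small-sample correction; k = estimated parameters, n = observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def akaike_weights(scores):
    """Relative support for each model, normalized to sum to 1."""
    best = min(scores)
    rel = [math.exp(-(s - best) / 2.0) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]

def relative_importance(model_terms, weights):
    """Sum of Akaike weights over all models that include each term."""
    rvi = {}
    for terms, w in zip(model_terms, weights):
        for term in terms:
            rvi[term] = rvi.get(term, 0.0) + w
    return rvi

N_OBS = 42  # hypothetical number of observations
# (terms in the model, hypothetical log-likelihood, hypothetical parameter count)
candidates = [
    ((), -70.0, 2),
    (("session",), -60.0, 3),
    (("species",), -69.5, 3),
    (("session", "species"), -59.8, 4),
]

scores = [aicc(ll, k, N_OBS) for _, ll, k in candidates]
weights = akaike_weights(scores)
rvi = relative_importance([terms for terms, _, _ in candidates], weights)
for term, value in sorted(rvi.items()):
    print(f"{term}: RVI = {value:.2f}")
```

With these made-up inputs the term that improves the fit most (session) collects nearly all of the weight, which mirrors the qualitative pattern described in the text without reproducing any reported value.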

Table 5 Estimated effect size, adjusted standard error (SE), z-value and relative variable importance (RVI) estimated by a generalized linear mixed model to determine the effects of session and species on success in the dual tray condition.

Since the human partner was released first and hence chose which tray to go to, following the human was a crucial aspect of the task. Following to tray 1 was defined as moving with the partner (within 1 body length). However, the animals could still arrive at and solve tray 1 successfully without following the person (for example, by visiting tray 2 first, or by lagging behind and then quickly joining the human at tray 1), or they could follow and then not be successful (for example, by not pulling the rope with the partner). Overall, tray 1 was successfully solved in 190 of a total of 252 trials, and in 140 of these 190 trials the animals followed the partner. They followed but failed in 10 trials. We ran a GLMM with ‘following the human to tray 1’ as the dependent variable. We found no effect of species (wolves followed the human partner in 62% and dogs in 60% of trials), but an effect of session (with an increase across sessions) on the likelihood of following the human partner to the first tray (Table 6, model comparisons: ST6).

Table 6 Estimated effect size, adjusted standard error (SE), z-value and relative variable importance (RVI) estimated by a generalized linear mixed model to determine the effects of session and species on the likelihood of following the human partner to the first tray in the dual tray condition.

Having (successfully or unsuccessfully) completed tray one, two scenarios could emerge: the animal could wait (for 5 seconds) for the human partner to move towards the other tray or take the lead itself. We therefore ran a second analysis with the likelihood of ‘leading to tray 2’ as the dependent variable. Finally, if the human (rather than the animal) took the lead, the subject could choose to follow (defined as staying within 1 body length of the human cooperation partner) or not (for example lingering longer at tray 1 and/or sniffing around the enclosure). Neither leading nor following the partner to tray 2 equated with success, since once the animals reached the tray, they still needed to coordinate their pulling actions with the partner. At the same time, the dyad could still be successful if the animal did something else while the human moved to the second apparatus, but then ran over to pull the second end of the rope just in time.

Overall, the animals solved at least tray 2 in 220 of 252 trials. Of these 220 trials, animals led in 119 and followed in 67 trials. Of all the trials in which they led (120), they failed in only one. They followed the human in 78 trials in total and failed in 11 of these. We ran a final GLMM on a subset of the data considering only trials in which the human led the way from the first to the second tray, with ‘following the human to tray 2’ as the dependent variable. Wolves were more likely than dogs to take the lead in moving from the first to the second tray (wolves took the lead in 60% and dogs in 35% of trials) (Fig. 2). While taking on the leading role increased across sessions (Table 7, model comparisons: ST7), the difference between wolves and dogs remained significant in the last session as well (see Supplementary Results). Moreover, previous experience of the wolves with the two-tray condition with conspecifics did not influence the likelihood that a wolf took over the leading role from the human (see Supplementary Results).

Figure 2 Differences between wolves and dogs in leading from the first to the second apparatus (animal leads) and in whether or not they followed a human leading from the first to the second apparatus. Asterisks indicate statistically significant differences.

Table 7 Estimated effect size, adjusted standard error (SE), z-value and relative variable importance (RVI) estimated by a generalized linear mixed model to determine the effects of session and species on the likelihood of the animal leading from the first to the second tray in the dual tray condition.

Finally, in trials in which the human led from the first to the second tray (N = 50 for wolves and N = 82 for dogs), we found that the dogs were more likely to follow the human partner (doing so in 70% of trials) than wolves (which followed in 42% of trials). Again, while the likelihood of following increased across sessions (Table 8, model comparison ST8), the difference between wolves and dogs also remained significant in the final session and was independent of previous success in the wolves (see Supplementary Analyses).
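The trial counts reported for the dual tray condition can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text, plus two labelled assumptions: that every tested animal contributed the same number of trials, and that each trial counted as either animal-led or human-led for the move to the second tray.

```python
# Reported in the text
ANIMALS = {"wolf": 9 - 2, "dog": 7}        # 9 wolves reached criterion, 2 were not tested further
TOTAL_TRIALS = 252
HUMAN_LED = {"wolf": 50, "dog": 82}        # trials in which the human led to tray 2
FOLLOW_RATE = {"wolf": 0.42, "dog": 0.70}  # share of human-led trials in which the animal followed

# Assumption: equal number of trials per animal
trials_per_animal = TOTAL_TRIALS // sum(ANIMALS.values())

animal_led = {}
lead_rate = {}
followed = {}
for species, n_animals in ANIMALS.items():
    species_trials = n_animals * trials_per_animal
    # Assumption: a trial is either animal-led or human-led
    animal_led[species] = species_trials - HUMAN_LED[species]
    lead_rate[species] = animal_led[species] / species_trials
    followed[species] = FOLLOW_RATE[species] * HUMAN_LED[species]

for species in ANIMALS:
    print(
        f"{species}: led in {animal_led[species]} trials "
        f"({lead_rate[species]:.0%}), followed in about {followed[species]:.0f}"
    )
print("total animal-led trials:", sum(animal_led.values()))
```

Under these assumptions the derived figures agree with the reported lead rates (60% and 35%), the 120 animal-led trials and the 78 trials in which the animals followed the human, after rounding.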