In this study, we aimed to evaluate the diagnostic accuracy of pooled nursing staff estimations of the level of consciousness in patients with DoC. Through their clinical practice, nursing staff (ie, nurses and nursing assistants) accumulate extended observation time of patients’ behaviour. Interacting with patients through standardised procedures (such as nursing care, medication administration and blood sampling), they spontaneously generate a subjective estimation of the level of consciousness of the patient. Pooling the opinions of several individuals has been shown to outperform individual judgements in specific settings (an effect known as ‘wisdom of the crowds’). 14 15 We hypothesised that pooling individual nursing staff estimations of the level of consciousness can help in the detection of MCS.

Accurate diagnosis of the level of consciousness in a brain-damaged patient is of great importance to better predict recovery. The taxonomy of disorders of consciousness (DoC) has recently been challenged 1–3 but schematically includes the unresponsive wakefulness syndrome (UWS, also termed vegetative state) and the minimally conscious state (MCS). The detection of MCS has a major prognostic impact since the functional outcome is dramatically better for patients with MCS. 4–8 However, assessing consciousness in patients with DoC can be challenging and, in such cases, clinicians may need dedicated clinical tools and brain-imaging techniques specifically designed to probe consciousness. 9 Even when using dedicated clinical tools such as the Coma Recovery Scale-Revised (CRS-R 10 ), a single assessment remains associated with a high frequency of diagnostic error. 11 This can be due to fluctuations of consciousness level over time. To circumvent this limitation, repeated clinical assessments have been proposed, but this can be limited by the availability of trained clinicians. 12 13

We next pooled the individual ratings obtained for each patient using the median to obtain the DoC-feeling score (index test). We thus obtained a DoC-feeling score as well as a reference standard label (UWS or MCS) for each patient. We compared the scores between the two populations directly using a Wilcoxon-Mann-Whitney test. To assess the diagnostic accuracy of DoC-feeling scores for detecting MCS (target condition), we computed the area under the receiver operating characteristic (ROC) curve (AUC) and reported sensitivities and specificities for several cut-offs of DoC-feeling scores. All statistical tests were two-sided with a type I error rate of 5%. Categorical variables were expressed as numbers (percentage) and quantitative variables as median (IQR). Analyses were performed using the R statistical software V.3.4.1. 19 The linear mixed model (LMM) was fitted using the lme4 package. 20 AUC, sensitivity and specificity were computed using the pROC package; 21 their 95% CIs were obtained from 2000 stratified bootstrap replicates (AUC) and from the binomial test (sensitivity and specificity).
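
The accuracy metrics described above can be illustrated with a short sketch. The study itself used R and the pROC package; the Python below is only an illustration using the standard library. The function names, the example scores and the convention that a score at or above the cut-off is read as MCS are assumptions made here, and the percentile bootstrap may differ in detail from pROC's implementation.

```python
import random

def auc(scores_mcs, scores_uws):
    """AUC as the proportion of (MCS, UWS) pairs in which the MCS score is higher.

    Ties count one half. This equals the Mann-Whitney U statistic divided by
    the number of pairs.
    """
    wins = 0.0
    for m in scores_mcs:
        for u in scores_uws:
            if m > u:
                wins += 1.0
            elif m == u:
                wins += 0.5
    return wins / (len(scores_mcs) * len(scores_uws))

def sens_spec(scores_mcs, scores_uws, cutoff):
    """Sensitivity and specificity, assuming score >= cutoff is read as MCS."""
    true_pos = sum(s >= cutoff for s in scores_mcs)
    true_neg = sum(s < cutoff for s in scores_uws)
    return true_pos / len(scores_mcs), true_neg / len(scores_uws)

def bootstrap_auc_ci(scores_mcs, scores_uws, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the AUC from stratified bootstrap replicates.

    Stratified means each group is resampled separately, so every replicate
    keeps the original number of MCS and UWS patients.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        m = rng.choices(scores_mcs, k=len(scores_mcs))
        u = rng.choices(scores_uws, k=len(scores_uws))
        replicates.append(auc(m, u))
    replicates.sort()
    low = replicates[int((alpha / 2) * n_boot)]
    high = replicates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return low, high

# Hypothetical pooled scores in mm, for illustration only (not study data).
mcs_scores = [60.0, 30.0, 75.0, 10.0]
uws_scores = [5.0, 12.0, 2.0, 20.0]

print(auc(mcs_scores, uws_scores))            # 14 of 16 pairs favour MCS: 0.875
print(sens_spec(mcs_scores, uws_scores, 15))  # (0.75, 0.75)
print(bootstrap_auc_ci(mcs_scores, uws_scores))
```

Writing the AUC as a proportion of concordant pairs makes its link to the Wilcoxon-Mann-Whitney comparison explicit: both rest on the same rank statistic.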

First, to evaluate the association of individual DoC-feeling ratings with the reference standard, we fitted a linear mixed model (LMM) using individual DoC-feeling ratings as the dependent variable, the state of consciousness as the fixed-effect explanatory variable and patients as well as raters as random effects. Normality of the distribution of residuals was assessed by visual inspection. An LMM is well suited to account for the non-independence between DoC-feeling ratings that arises from repeated measurements over time at both the patient level (same patient rated by several raters) and the rater level (several ratings per rater).
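
One way to write the model described above is given below. It assumes random intercepts only for patients and raters, since the text does not specify the random-effects structure further. The symbols are notation introduced here for illustration: $y_{ijk}$ is the $k$-th rating given to patient $i$ by rater $j$, and $\mathrm{MCS}_i$ equals 1 if patient $i$ is labelled MCS by the reference standard and 0 otherwise.

```latex
y_{ijk} = \beta_0 + \beta_1\,\mathrm{MCS}_i + u_i + v_j + \varepsilon_{ijk},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
v_j \sim \mathcal{N}(0, \sigma_v^2),\quad
\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)
```

Here $\beta_1$ is the mean difference in individual ratings between MCS and UWS, while $u_i$ and $v_j$ absorb the correlation among ratings that share a patient or a rater. Under the same assumption, the lme4 formula would look something like `rating ~ state + (1 | patient) + (1 | rater)`, where the variable names are hypothetical.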

Demographics, aetiology and delay since the acute brain injury (ABI) were collected. In addition to CRS-R and DoC-feeling ratings, we also collected complementary metrics (such as the classical distinction between wakefulness and awareness, and interaction during nursing and/or painful care) using the same VAS approach, as well as the best FOUR score observed during each shift 18 (online supplementary material).

Disorders of consciousness (DoC)-feeling score. Each patient was evaluated around three times by DoC experts using the Coma Recovery Scale-Revised (CRS-R). In parallel, nursing staff members reported their daily observations using the DoC-feeling Visual Analogue Scale. The reference standard was defined as the best state of consciousness observed across the CRS-R assessments, and the patient was coded as being in an unresponsive wakefulness syndrome (UWS) or a minimally conscious state (MCS) accordingly. All individual DoC-feeling ratings obtained during the whole hospital stay were pooled and the median value (represented by the vertical dashed line) of the pooled results was defined as the DoC-feeling score (index test).

Nursing staff members (nurses and nursing assistants) taking care of a patient with DoC were asked to fill in a form at the end of their shift containing a scale called ‘DoC-feeling’. DoC-feeling was designed as a 100 mm Visual Analogue Scale (VAS) aiming to quantify caregivers’ subjective reports of the patient’s best consciousness level observed during the shift. We specifically asked caregivers to rate their ‘gut feeling’ about the best level of consciousness observed during the shift or the ‘présence’ (presence), using the French idiom ‘le patient est-il là?’, which is very close to the English ‘Is there anybody home?’ (figure 1; see online supplementary material for the original VAS and its English translation). This wording reproduced the language commonly used among caregivers to communicate observations about a patient’s level of consciousness. Individual DoC-feeling ratings were collected prospectively. Caregivers were blinded to the previous caregivers’ ratings and to the reference standard (the CRS-R), and expert physicians were blinded to the index test. To obtain a final global metric for each patient, all individual ratings were pooled using the median to obtain the DoC-feeling score, which constituted the index test of this study.
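
The pooling step described above amounts to taking, for each patient, the median of all VAS ratings regardless of who gave them. A minimal Python sketch follows; the tuple layout, identifiers and example values are hypothetical and are not study data.

```python
from collections import defaultdict
from statistics import median

def pool_ratings(ratings):
    """Pool individual VAS ratings (mm) into one DoC-feeling score per patient.

    `ratings` is an iterable of (patient_id, rater_id, vas_mm) tuples. The
    rater is ignored at this stage: every rating counts equally in the median.
    """
    by_patient = defaultdict(list)
    for patient_id, _rater_id, vas_mm in ratings:
        if not 0 <= vas_mm <= 100:
            raise ValueError(f"VAS rating outside 0-100 mm: {vas_mm}")
        by_patient[patient_id].append(vas_mm)
    return {pid: median(values) for pid, values in by_patient.items()}

# Hypothetical ratings, for illustration only.
example_ratings = [
    ("P1", "nurse_a", 5.0), ("P1", "nurse_b", 12.0), ("P1", "assistant_c", 8.0),
    ("P2", "nurse_a", 70.0), ("P2", "nurse_d", 40.0),
]
scores = pool_ratings(example_ratings)
print(scores)  # {'P1': 8.0, 'P2': 55.0}
```

The median is less sensitive than the mean to a single outlying rating, which matters when a patient has only a handful of ratings.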

Patients were hospitalised in the neurointensive care unit (neuro-ICU) and were observed for at least 1 week, during which they underwent multiple neurological assessments and brain imaging such as high-density electroencephalogram, event-related potentials, magnetic resonance imaging and [18F]-fluorodeoxyglucose positron emission tomography. Clinical assessments consisted of repeated neurological examinations which included the CRS-R, 17 performed by expert clinicians (BH, BR, FF, LN) belonging to an external team with expertise in patients with DoC. CRS-R scoring ranges from 0 to 23 and is based on the presence or absence of responses on a set of hierarchically ordered items testing auditory, visual, motor, oromotor, communication and arousal function. State of consciousness (ie, UWS, MCS) is determined by specific key behaviours probed during the CRS-R assessment. For instance, visual pursuit, reproducible movement to command and/or complex motor behaviour each indicate MCS. 17 Since consciousness level can fluctuate over time, we used the highest level of consciousness among all the CRS-R assessments performed on a given patient as the reference standard. Following this procedure, each patient was thus labelled as being in a UWS or an MCS. MCS was the target condition.
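
The labelling rule above (keep the highest state seen across all CRS-R assessments) can be sketched as follows. This is an illustration only: the state ranking dictionary, function name and example data are assumptions, and deriving UWS or MCS from the CRS-R subscale items is outside the scope of the sketch.

```python
# Ranking assumed for this sketch: a higher rank is a higher level of consciousness.
STATE_RANK = {"UWS": 0, "MCS": 1}

def reference_standard(crsr_states):
    """Return the best state observed across one patient's CRS-R assessments."""
    if not crsr_states:
        raise ValueError("at least one CRS-R assessment is required")
    return max(crsr_states, key=STATE_RANK.__getitem__)

# Hypothetical assessment results, for illustration only.
assessments = {
    "P1": ["UWS", "UWS", "UWS"],
    "P2": ["UWS", "MCS", "UWS"],  # one MCS observation is enough
}
labels = {pid: reference_standard(states) for pid, states in assessments.items()}
print(labels)  # {'P1': 'UWS', 'P2': 'MCS'}
```

Because the rule keeps the maximum, a patient is labelled UWS only if every assessment found UWS.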

No patients or patients’ relatives were involved in the study design or the management of this study. Results of the study have been released as a preprint on a public repository 16 and the dataset of this study is available on Dryad ( https://doi.org/10.5061/dryad.1m03145 ).

All patients referred for evaluation of consciousness at the Department of Neurology of La Pitié-Salpêtrière Hospital, Paris, between February 2016 and October 2017, were screened prospectively. On hospital admission, patients’ relatives were approached to give consent for participation in the study. All patients in a UWS or an MCS for whom consent was obtained were eligible.

DoC-feeling scores. DoC-feeling scores were obtained by pooling individual ratings obtained for each patient. DoC-feeling scores were lower for patients with UWS than for patients with MCS (A, B) and also correlated with the CRS-R score (A). Area under the ROC curve (C), sensitivity (Se) and specificity (Sp) for several cut-offs (D) revealed very good performance in identifying MCS. ***P<0.001. CRS-R, Coma Recovery Scale-Revised; DoC, disorders of consciousness; MCS, minimally conscious state; ROC, receiver operating characteristic; UWS, unresponsive wakefulness syndrome.

Overall, patients underwent 12 (9–19) individual DoC-feeling ratings, performed by 7 (5–10) different raters. All DoC-feeling ratings obtained for a given patient were summarised using the median to obtain the pooled metric called DoC-feeling score (index test, figure 4A). DoC-feeling scores were lower for patients with UWS than for patients with MCS (7.2 mm (2.4–11.4) vs 59.2 mm (27.3–77.3), respectively; p<0.001; figure 4B). The ROC curve revealed excellent accuracy in detecting MCS (AUC=0.92 (95% CI 0.84 to 0.99); figure 4C) with, for instance, a sensitivity of 89% (95% CI 71% to 98%) and a specificity of 85% (95% CI 62% to 97%) when using a DoC-feeling score cut-off of 16.7 mm (figure 4D). Note that this cut-off is only used to give the reader an idea of the diagnostic performance using the more intuitive sensitivity and specificity metrics (see the Discussion section). The six patients misclassified using this cut-off are described in the online supplementary material. Simulations of AUCs using varying numbers of ratings per patient suggested that a minimum of 4 ratings is needed to reach an AUC of 0.9 (online supplementary material). Of note, the DoC-feeling score also helped discriminate UWS patients from MCS ‘minus’ patients (patients with non-reflexive behaviours but no signs of language at bedside) 23 (see online supplementary material for additional details).
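
The confidence intervals quoted for sensitivity and specificity can be checked with an exact binomial (Clopper-Pearson) interval, which is the interval reported by R's `binom.test`. The sketch below is Python rather than the R used in the study. The counts are an inference made here, not figures stated in the text: 24 of 27 MCS patients correctly identified and 17 of 20 UWS patients correctly identified would match the reported 89% and 85%, the group sizes and the six misclassified patients.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _solve_decreasing(f, target, iters=100):
    """Find p in [0, 1] with f(p) = target, for f decreasing in p (bisection)."""
    low, high = 0.0, 1.0
    for _ in range(iters):
        mid = (low + high) / 2
        if f(mid) > target:
            low = mid   # f is still too large, so the root lies to the right
        else:
            high = mid
    return (low + high) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided CI for a proportion with k successes out of n."""
    # Lower bound: the p at which P(X >= k) = alpha / 2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: the p at which P(X <= k) = alpha / 2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

# Counts inferred from the reported percentages (an assumption, see above).
sens_ci = clopper_pearson(24, 27)
spec_ci = clopper_pearson(17, 20)
print(f"sensitivity {24 / 27:.0%}, 95% CI {sens_ci[0]:.0%} to {sens_ci[1]:.0%}")
print(f"specificity {17 / 20:.0%}, 95% CI {spec_ci[0]:.0%} to {spec_ci[1]:.0%}")
```

With these inferred counts the intervals should come out close to the reported 71% to 98% and 62% to 97%. Their width reflects the small group sizes rather than any property of the scale.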

Individual disorders of consciousness (DoC)-feeling ratings. DoC-feeling ratings tended to be smaller in patients with unresponsive wakefulness syndrome (UWS) when compared with patients with minimally conscious state (MCS). All individual ratings are shown (dots, n=692), alongside boxplots helping to visualise the median and the IQR for both UWS (on the left in red) and MCS (on the right in blue) patients.

Six hundred and ninety-two individual DoC-feeling ratings were obtained (median of 12 (9–19) ratings per patient). Eighty-three caregivers, 57 nurses and 26 nursing assistants (composed of 47 neuro-ICU regular staff members and 36 float staff members), participated in the study. Each nursing staff member completed a median of 4 (1–12) evaluations. The median delay between the first and the last individual rating was 6 (5–9) days. No statistical differences were found between UWS and MCS in the number of DoC-feeling ratings per patient, the number of raters per patient or the number of ratings per rater (table 1).

One hundred and forty-seven CRS-R assessments were performed, with a median of 3 (2–4) per patient (ranging from 2 to 6). According to the best CRS-R, 27 patients (57%) were diagnosed as being in an MCS and 20 (43%) were classified as being in a UWS. Patients with MCS less frequently suffered from anoxia and had a longer delay between the ABI and the study inclusion (see table 1). No differences were found in the number of CRS-R assessments per patient or brain-imaging explorations between patients with UWS and MCS.

Seventy-two patients were eligible during the inclusion period; 23 were not included because of a lack of informed consent from a legal representative. Two patients were excluded because they had been diagnosed as conscious (‘Exit-MCS’). Forty-seven patients were included in the analysis (see figure 2).

Discussion

In the present study, we developed and assessed a new behavioural tool called DoC-feeling to help diagnose MCS. This score, which pools multiple subjective reports obtained from several caregivers over several days of evaluation, showed very good accuracy in diagnosing MCS.

DoC-feeling is not intended to replace the clinical examination or the current CRS-R gold standard. However, by taking advantage of valuable information collected by all caregivers involved in the care of a patient with DoC, the implementation of DoC-feeling could improve the overall diagnostic accuracy in patients with DoC. Caregivers are trained to evaluate pain and suffering in patients during all delivered procedures. These procedures constitute standardised interactions that can allow the generation of very reliable heuristic processes for forming a percept of a patient’s pain, suffering and also consciousness.

Pooling the opinions of several individuals has previously been shown to outperform individual judgements in specific settings. Recently, there has been growing interest in this kind of approach (called collective intelligence or ‘wisdom of the crowds’) in the medical field, especially in diagnostic procedures (diagnosis of skin cancer, mammography screening, etc). 14 24–26

In that perspective, quantifying expertise that is not restricted to physicians might be of prime interest. Capitalising on assessments of consciousness gathered at any hour of the day and through multiple observers may also increase our ability to detect signs of consciousness in these patients, who usually show large fluctuations of cognitive state and arousal. 12 DoC-feeling may also help to better describe and quantify these fluctuations. Additionally, it acknowledges the expertise of the caregiver group and may increase the care team’s attention through a coherent and cumulative set of observational data.

The good accuracy of DoC-feeling obtained in our setting is likely to be generalisable elsewhere. First, as the distribution of CRS-R scores obtained in this cohort spanned most of the possible CRS-R scores, it is unlikely that the good accuracy of DoC-feeling results from two easily discernible clusters of patients. Second, as all the patients included in this study, whether at an acute or a chronic stage, were specifically referred to our institution for expertise, it is most likely that our cohort was actually representative of patients for whom the diagnosis is the most difficult. However, we would like to emphasise that the cut-off used in the Results section might vary across teams and across time for a given team. This is why DoC-feeling should only be used in addition to, and not instead of, the CRS-R.

Our study has some limitations inherent to the aim of developing a pragmatic tool that is easy to implement in daily clinical practice. First, as in all studies on disorders of consciousness, we faced the typical situation of an imperfect gold standard. Although the CRS-R is still the most widely accepted reference, the optimal number of assessments remains unknown. 13 According to a recent study, using three CRS-R assessments can lead to a 17% rate of misdiagnoses. 12 It is worth noting that this is exactly the reason why we developed DoC-feeling. The CRS-R requires specialised expertise that is not available everywhere and can be extremely time-consuming, especially now that multiple assessments are recommended to take into account fluctuations of consciousness over time. 13 In sharp contrast, the DoC-feeling scale could be implemented in any team, is much faster and allows multiple observations to be gathered per day. Second, caregivers might have been influenced by other factors that would have been very difficult to control. For instance, they might have been influenced by insights from other caregivers or, in case of multiple ratings by a given rater, by their own previous ratings. However, the variability of individual ratings for a given patient (which tended to increase over time, see online supplementary material) suggests that caregivers did report their own perception independently of each other and of any previous ratings of their own. 
Moreover, interactions among small groups of people could, in fact, have had a positive effect since the aggregation of small groups’ insights has been shown to outperform the overall judgement of the whole group. 27 This kind of tool might thus be less prone to individual subjective bias, which is frequent during decision-making under a high degree of uncertainty such as the assessment of patients with DoC. 28 Staff members could also have been biased by classical predictors of consciousness recovery such as aetiology or delay from ABI, or by the perception of patients’ relatives, although it is commonly acknowledged that relatives frequently lack objectivity (in both directions) in such dramatic situations. 29 Finally, although the number of float staff members involved and the result of a preliminary survey assessing prior knowledge of regular nursing staff on DoC (online supplementary material) together suggest that DoC-feeling should be accurate in other settings, the monocentric design of this study calls for external validation.

Despite these limitations, we think that the implementation of the DoC-feeling score can significantly improve diagnostic accuracy and confidence in the diagnosis when it supports other metrics (ie, CRS-R and functional brain imaging at rest or during cognitive tasks). Moreover, even when incongruent with other metrics, the DoC-feeling score could still be useful. Indeed, such incongruence could suggest that key clinical elements have been missed by physicians while performing one-off CRS-R assessments. Alternatively, in case of discrepancy with all the other elements (clinical and brain imaging), it could reveal a possible misperception of a patient’s consciousness level that needs to be acknowledged and considered in any further medical decision processes. This last point could be crucial in bridging the gap between the care team and the patient’s relatives in situations of conflict.

In conclusion, we propose a new behavioural tool, called DoC-feeling, based on the ‘wisdom of the crowds’ effect (or, in our case, the ‘wisdom of the caregivers’), which can help to improve the diagnosis of MCS and thus promote better prognostication and decision-making in patients suffering from DoC.