General access metrics

This section presents the results obtained from analysis of the collected web observations for the 1682 unique journal articles authored by the 297 ethicists included in the study.

The annual publication output (2010–2015) is presented in Table 2 together with the annual share of publications for which at least one copy could be retrieved for free. The annual volume of journal articles ranged between 250 and 305, and the share of articles available for free between 52 and 61%. These high-level results show no consistent tendency for either more recent or older articles to be available more frequently. In total, a free copy could be retrieved for 948 of the 1682 articles, giving an overall open access share of 56%.

Table 2 Annual publication volumes and share of annual publications with at least 1 copy available online for free

The high-level results in Table 2 give only a simple outline of the complexity found within the dataset. Since we collected web observations for up to 8 freely available copies per article, the observed web location types and document versions varied greatly among the 948 articles for which a copy could be found.

In order to summarise the collected data as comprehensively as possible, Table 3 provides a breakdown of every recorded observation per web location category, subdivided by document version, for all of the 948 articles for which between one and eight free copies were observed. The three most frequent providers of access to free copies were, in descending order, ASNs, subject repositories, and personal webpages. In all three of these categories the most frequent document version was the publisher’s version.

Table 3 Breakdown of all observations (2183) of free copies, grouped by year of original article publication, web location category, and document version

The results so far have not explored the extent to which article access overlaps across multiple web locations. Figure 1 visualizes the distribution of article access, with particular focus on the shares of articles available either nowhere or, at the other extreme, across six different web location categories, which was the maximum observed in the dataset. Note that this only shows the spread across unique categories: an article could appear multiple times within the same location category, e.g. in two different institutional repositories, but that is not conveyed here. Of the 1682 articles, 726 (43%) were not available anywhere, 454 (27%) were available through only one web location category, 280 (17%) through two different categories, 126 (7%) through three, 64 (4%) through four, 26 (2%) through five, and 6 (less than 1%) through six categories.
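The shares quoted above follow directly from the reported counts. As a minimal sketch (the names used here are illustrative assumptions, not the tooling used in the study), the percentages can be recomputed like this:

```python
# Number of articles by count of distinct web location categories that
# provided a free copy, as reported in the text (0 = not available anywhere).
# All names in this sketch are illustrative assumptions.
ARTICLES_BY_CATEGORY_COUNT = {0: 726, 1: 454, 2: 280, 3: 126, 4: 64, 5: 26, 6: 6}


def shares_in_percent(counts: dict) -> dict:
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


TOTAL_ARTICLES = sum(ARTICLES_BY_CATEGORY_COUNT.values())
SHARES = shares_in_percent(ARTICLES_BY_CATEGORY_COUNT)

if __name__ == "__main__":
    print("total articles:", TOTAL_ARTICLES)
    for n_categories, share in SHARES.items():
        print(f"{n_categories} categories: {share}%")
```

The counts sum to the full population of 1682 articles, and the six-category group comes to roughly 0.4% before rounding to whole percentages.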

Fig. 1 Distribution of article access across different web location categories

Having articles available through more than one web location arguably increases their resilience against becoming completely unavailable. However, some web locations can be assumed to be more future-proof than others in providing sustained access. Table 4 takes a closer look at the 454 articles that were available on only one type of web location. ASNs were found to be the leading category for providing unique free access to articles (98 articles), followed by publisher webpages (87 articles), and personal webpages (77 articles).

Table 4 Web locations providing unique access to one or more copies of a single article

Continuing with the ways in which unique access to content is provided, Table 5 gives a breakdown of the articles for which only a single document version was available, similar to the treatment of unique web location categories above. A clear majority of the 774 articles with only one document version available were publisher versions (500 articles, or 64.6%). This result has implications for the volatility of access, as only a very small minority of publishers allow distribution of the publisher version.

Table 5 Document version distribution for copies where only one type of document version was recorded

While the institutional level is not a primary focus of this study, a high-level comparison grouped by institution can help in discovering access patterns that relate to institutional environments, and particularly the degree to which the institutional repository is used. Table 6 lists the institutional affiliations included in the study, sorted in descending order by the total number of ethicists identified from each institution. The higher an institution is on the list, the more useful and reliable the conclusions drawn from its numbers, since more ethicists and articles are included in the calculations. What is apparent is that UK-based institutions have a higher share of copies available through institutional repositories, something which likely stems from the strong open access policies that have been implemented within the country. The relationship between ASNs and institutional repositories is interesting to look at from this perspective, as authors affiliated with UK-based institutions are also among the ones with the highest proportion of copies available through ASNs.

Table 6 List of included institutions from which the ethicists were identified

The publication activity among the ethicists included in the study varied considerably in terms of volume (1 article at the minimum, 92 at the maximum). To convey the spread, Table 7 categorizes ethicists by their publication activity during 2010–2015, placing them into one of four categories. Most ethicists published between 1 and 3 articles (120), followed by the category of 4–6 articles (92), the 7–9 article category (53), and finally the category of authors with ten or more articles published. The category comparison consistently suggests that higher publication activity is related to a higher proportion of open access. The share of authors without any article available open access also drops as more publications are produced, from 44% in the 1–3 article category to 0% in the category of ten or more publications.
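The grouping into publication activity categories can be sketched as a simple binning step. The category boundaries below follow the text; the function name and the example article counts are hypothetical and serve only as illustration.

```python
from collections import Counter


def activity_category(n_articles: int) -> str:
    """Map an ethicist's article count (2010-2015) to one of four categories."""
    if n_articles < 1:
        # Every ethicist in the study population has at least one article.
        raise ValueError("article count must be at least 1")
    if n_articles <= 3:
        return "1-3"
    if n_articles <= 6:
        return "4-6"
    if n_articles <= 9:
        return "7-9"
    return "10+"


# Hypothetical article counts per ethicist, for illustration only.
example_counts = [1, 2, 5, 8, 12, 92, 3]
category_sizes = Counter(activity_category(n) for n in example_counts)

if __name__ == "__main__":
    for category in ("1-3", "4-6", "7-9", "10+"):
        print(category, category_sizes[category])
```

Per-category open access shares would then be computed within each group, using the same kind of count-over-total ratio as elsewhere in this section.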

Table 7 Analysis of proportion of articles available open access based on individual publication activity

Through our data collection we recorded 234 unique articles as being available directly through publisher websites. To provide a better understanding of the exact open access mechanism through which these articles were made available, we returned to the collected URLs and manually classified these observations into more granular categories. In cases where an individual article was available through multiple web locations classified under the ‘publisher website’ category, the open access mechanism was derived from the information related to the copy available through the primary journal website. Table 8 provides a summary of the results. What stands out is that 51% of the publisher website observations were within full open access journals, which provide all of their content openly on the web immediately on publication, and that 19% of the publisher website copies were provided as hybrid open access, i.e. articles individually made open access within subscription journals.

Table 8 Distribution of the 234 unique articles with copies found on publisher websites

Another category warranting a closer look beyond the top-level web location category is the observations made on ASNs. Table 9 contains a breakdown of all observations made within this web location category. While both Academia.edu and ResearchGate could be considered well-represented, Academia.edu provided access to more than double the number of articles compared to ResearchGate in this population of articles (351 vs. 164). In terms of the distribution of article versions across the two platforms, Academia.edu has a higher relative representation of versions other than the publisher version (46% non-publisher versions) than ResearchGate (29% non-publisher versions).

Table 9 Copies found on academic social networks, i.e. ResearchGate and Academia.edu

Continuing with the focus on ASNs, Table 10 provides further insight into the exclusivity and overlap in providing access to individual articles between Academia.edu and ResearchGate. This perspective suggests that Academia.edu provides access to almost three times as many articles as ResearchGate (15.4% Academia.edu vs. 5.6% ResearchGate). Rarely explored at this level of detail is the overlap between access provided through the two services, which here amounted to 3.9% of all articles in the population.
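Exclusivity and overlap of this kind reduce to set differences and intersections over article identifiers. The following is a minimal sketch under that assumption; the function name, the dictionary keys, and the toy identifiers are hypothetical and are not taken from the study's data.

```python
def exclusivity_and_overlap(academia_ids: set, researchgate_ids: set,
                            population_size: int) -> dict:
    """Return the share (in percent of the whole article population) of
    articles available only on Academia.edu, only on ResearchGate, or on both."""
    academia_only = academia_ids - researchgate_ids
    researchgate_only = researchgate_ids - academia_ids
    both = academia_ids & researchgate_ids

    def pct(ids: set) -> float:
        return round(100 * len(ids) / population_size, 1)

    return {
        "academia_only": pct(academia_only),
        "researchgate_only": pct(researchgate_only),
        "both": pct(both),
    }


# Toy example: a population of 10 articles with hypothetical identifiers.
example = exclusivity_and_overlap({1, 2, 3, 4}, {4, 5}, population_size=10)

if __name__ == "__main__":
    print(example)
```

Dividing by the size of the full population, rather than by the number of articles found on either platform, matches the way the shares are expressed in the text ("of all articles in the population").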

Table 10 Exclusivity and overlap in access provided by academic social networks

The web location category we labelled as ‘aggregators’ consists of web locations where access to copies is provided through a secondary mechanism, in which content is automatically cached and mirrored after first being openly available somewhere else. They provide little insight into author behaviour, since individual action is not needed. However, they play a substantial part in contributing towards availability resilience should the original copy be removed. Table 11 shows a breakdown of the observations made within this category: Semantic Scholar (170 copies), CiteSeerX (47 copies), and Core (27 copies). Table 4 showed earlier that locations belonging to this category provided unique access to 8 articles, so while redundancy is high there are a handful of articles which were mirrored by these services before being removed from their original location.

Table 11 Breakdown of observations in the aggregators web location category

Table 12 provides a closer look at the breakdown of copies found in subject repositories, where PhilPapers constitutes 43% of all observations in this category (159 copies). Second, third, and fourth on the list are subject repositories belonging to the PubMedCentral network, which focus on biomedical and life sciences content (168 copies in total, spread across the US, European and Canadian platforms).

Table 12 Breakdown of observations from subject repositories

Regarding copies found within the web location category of ‘other website’, no individual domain reached even 10 observations, so no detailed analysis of these domains is provided.

Compliance analysis

The first part of the results section was dedicated to providing a comprehensive picture of access to all of the articles included in the population. The remainder of the results section is dedicated to investigating the degree to which copies are aligned with the distribution instructions set out by journals as part of the self-archiving instructions provided to authors. The 1682 journal articles of the sample were published by a total of 481 different journals. Since detailed information about journal self-archiving policies needs to be collected and coded on a per-journal basis, the compliance analysis is limited to articles belonging to the twenty most frequent journal outlets in the dataset. The policies were collected during the summer of 2017, and the compliance of articles published during 2010–2015 was interpreted through that information. Please see the methodology section for more discussion about the potential implications of this methodological limitation.

Table 13 provides an overview of the journals included, together with the article count for each journal, which ranges from 100 articles for Philosophical Studies to 14 for Erkenntnis.

Table 13 Journals included in compliance analysis, with overview of available copies

The total number of articles included in the compliance analysis was 597, and these were authored by 217 of the 297 ethicists included in the full population. Since the focus of this analysis was on studying author behaviour in providing access in light of journal policies, observations of copies found directly on publisher websites, through aggregators, and as JSTOR read-only copies are not included, since they are not reliant on journal self-archiving policies and provide little opportunity for authors to influence their availability. Of the 597 articles included in the analysis, our data collection had retrieved at least one copy for 293 of the articles with the previously mentioned limitations in place. Document versions where the exact version status could not be established were compared to the publisher's policy for allowing dissemination of accepted manuscripts.

As with the earlier overview of the complete dataset, a single exhaustive table or visualisation of the contents is not possible without losing important information, because of the way that observations overlap. Starting with an overview of the policy alignment over all observations, Table 14 gives insight into the policy status of copies found at the five web location categories included in the analysis. Of the 487 copies observed, 258 were non-compliant, 166 compliant, and 63 had an unclear status, where the combination of web location category and document version was neither explicitly prohibited nor explicitly permitted in the publisher policy. Most of the non-compliance is due to use of the publisher version across all location categories, and to few journals allowing copies to be distributed on commercial platforms (i.e. ASNs) in any form.
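The three-way status described above can be thought of as a lookup of each copy's combination of web location category and document version against the journal's policy. The sketch below is only an illustration of that decision rule: the policy dictionary, the string labels, and the function name are hypothetical and do not reproduce the coding scheme used in the study. The handling of copies with an undetermined version follows the rule stated earlier, namely comparing them against the policy for accepted manuscripts.

```python
from enum import Enum


class Status(Enum):
    COMPLIANT = "compliant"
    NON_COMPLIANT = "non-compliant"
    UNCLEAR = "unclear"


# Hypothetical policy for a single journal: maps (web location category,
# document version) to True (explicitly permitted) or False (explicitly
# prohibited). Combinations that are absent are not addressed by the policy.
EXAMPLE_POLICY = {
    ("institutional repository", "accepted"): True,
    ("institutional repository", "publisher"): False,
    ("personal webpage", "accepted"): True,
    ("ASN", "publisher"): False,
}


def classify_copy(policy: dict, location: str, version: str) -> Status:
    """Classify one observed copy against a journal policy."""
    # Copies whose version could not be established are compared against
    # the policy for accepted manuscripts.
    effective_version = "accepted" if version == "unknown" else version
    permitted = policy.get((location, effective_version))
    if permitted is None:
        return Status.UNCLEAR
    return Status.COMPLIANT if permitted else Status.NON_COMPLIANT


if __name__ == "__main__":
    print(classify_copy(EXAMPLE_POLICY, "ASN", "publisher"))
    print(classify_copy(EXAMPLE_POLICY, "ASN", "accepted"))
    print(classify_copy(EXAMPLE_POLICY, "personal webpage", "unknown"))
```

Keeping "not addressed" distinct from "prohibited" is what produces the separate unclear group rather than folding those copies into either of the other two.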

Table 14 Overview of all copies found and their policy compliancy related to the 597 articles included in the compliancy analysis

Table 14 does not shed light on overlap, where multiple combinations of version-location copies could be observed per original article in the sample, and is thus of little aid for understanding policy alignment at a deeper level. Figure 2 helps to remedy this by showing the complete per-article policy distribution of the population of 597 articles included in the compliance analysis. Of the 293 articles for which at least one copy could be found, the journal policy status of 211 articles belonged to just one policy category, while the copies retrieved for the remaining 82 articles produced mixes of aligned, infringing, and unclear policy status.
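Moving from per-copy to per-article status amounts to collecting the distinct statuses among each article's copies and checking whether there is more than one. A minimal sketch, using hypothetical article identifiers and status labels rather than the study's data:

```python
def is_mixed(copy_statuses: list) -> bool:
    """True if an article's copies fall into more than one policy status."""
    return len(set(copy_statuses)) > 1


# Hypothetical per-article lists of copy statuses, for illustration only.
example_articles = {
    "article-a": ["compliant", "compliant"],
    "article-b": ["compliant", "non-compliant"],
    "article-c": ["unclear"],
}

single_status = [a for a, s in example_articles.items() if not is_mixed(s)]
mixed_status = [a for a, s in example_articles.items() if is_mixed(s)]

if __name__ == "__main__":
    print("single status:", single_status)
    print("mixed status:", mixed_status)
```

Applied to the 293 articles with at least one copy, this kind of split is what separates the 211 single-status articles from the 82 with mixed status.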

Fig. 2 Compliancy overlap of copies found. This figure visualises the policy status and overlap for the 293 articles to which one or more free copies could be found (overlap = multiple copies with multiple policy statuses found per one article). Diagram produced by using eulerAPE open source software (Micallef and Rodgers 2014)

Conclusions regarding undersharing, i.e. the degree to which research that could be made open access is not, can be grounded by observing Fig. 2 in conjunction with the journal policies. All but one of the twenty journals included in the compliance analysis explicitly allow self-archiving of the accepted version on institutional and subject repositories, and that one journal merely leaves those locations unclear while explicitly allowing self-archiving on a personal webpage. As such, the theoretical maximum that could be made available within reasonable effort on the author side is 100%. The current utilization of policy-compliant self-archiving is 22.1%, while the utilization when policy alignment is disregarded is 49.1%.

The final component of the compliance analysis is the perspective of author publication activity in relation to policy alignment. Table 15 provides publication activity categories similar to those in the full overview of the dataset (Table 7). However, the scope is now limited to articles published in the 20 journals which were part of the compliance analysis. From the comparison between the categories it is possible to discern that it is more common for authors to have at least one policy-infringing copy of one of their articles available than to have at least one policy-aligned copy available for at least one article, and this relationship held across all publication activity categories.

Table 15 Ethicist publication activity categorization and policy alignment comparison as part of compliancy analysis

This concludes the presentation of results. In the following section we interpret the results further in terms of their potential implications and describe how they relate to previous work within this area of research.