Contributorship and content guidance

Our team is strongly committed to Team Science, and to recognising the deep technical and methodological contribution of research software engineers and other non-academic professions to research outputs.

Contributorship in academia is a complex issue. It affects funding, and therefore the sustainability of projects and teams. Historically, software developers who make substantial practical and creative contributions to the delivery of data science projects have often been excluded in many teams (wrongly, in our view), which affects recruitment, the quality of software engineering on projects, and what can be delivered and sustained. We therefore want to ensure that contributions to research outputs, whether direct or indirect, are appropriately recognised through authorship, acknowledgements, or citations, as set out in our contributorship guidance.

We also want to ensure that the OpenSAFELY Service is accurately described, to mitigate misunderstandings and maintain public trust. Our content guidance provides suggested text to include for published papers, reports and presentations, to be used if required or considered useful context for readers.

We want to emphasise that this guidance on contributorship and content is recommended, not mandatory. We aim to resolve grey areas through discussion. Please get in touch if you have any questions.

Authorship

In the earlier phases of OpenSAFELY, projects typically reflected a deep collaboration between users and the platform team: the datasets and the tools were being developed and used for the first time; for a smaller group of users; and there was substantial crossover between platform work, and analysis. Now, the platform is expanding to a wider user-community, and specifically to teams that are more “users” than “collaborators”. In this new phase, we are aiming to reduce the involvement of co-pilots, data experts and developers in delivering academic work or specific new features for projects in general, unless these projects have been clearly identified as collaborations. We are also reflecting this in our authorship guidance. Nonetheless – because the OpenSAFELY service and tools deliver more than a simple data download service – there are situations where co-pilots or developers might contribute more than neutral non-academic “customer service” to the development and delivery of a research project. Below, we have tried to set out guidance on authorship in this grey area: we are aware that other platforms and teams have adopted more or less inclusive models; we are happy to receive feedback; and we expect this section to be refined through experience over time.

Our authorship guidance is as follows:

  • Authorship should be discussed at the introductory project meeting with your co-pilot. More generally, we recommend discussing authorship at an early stage with all project collaborators.
  • The OpenSAFELY co-pilot(s) for the project, if any, should be offered authorship.
  • If your project requires or uses substantial new contributions from other OpenSAFELY team members – for example new technical features, data acquisition, curation of data sources, development of tools to manage or analyse data in a standardised way, or development of codelists with or for your team – we strongly encourage offering authorship to these members. This is particularly likely during the pilot phase of new platform features or data ingests when projects are delivered in close collaboration.
  • We strongly encourage inclusion of a CRediT statement describing how each person contributed to the paper, including a contribution from “The OpenSAFELY collaborative”. A template is provided, along with guidance on how OpenSAFELY platform related activities can be considered to have contributed to your project within CRediT.
  • As with standard authorship guidance, named authors should usually participate in reading, revising, and approving the final manuscript before submission or publication.
  • Any contributors who are offered authorship should be able to decline the offer if preferred, and either be included in the acknowledgements instead or not included at all.

The above guidance relates to platform or service related activities by the OpenSAFELY team that have directly or indirectly contributed to the output. If any activities are undertaken by members of the OpenSAFELY team in addition to platform or service related activities, such as study conceptualisation, statistical study design, or writing or revising analysis code, then they should be recognised in the usual way via a CRediT statement, including authorship where appropriate.

Acknowledgements

NHS England and GP IT System Suppliers should be acknowledged. Suggested content:

We acknowledge all the support received from the [Optum Technical Operations team] [TPP Technical Operations team] [Optum and TPP Technical Operations teams] throughout this work, and for generous assistance from the information governance and database teams at NHS England and the NHS England Transformation Directorate.

We acknowledge members of the OpenSAFELY Collaborative, who have contributed to the development and maintenance of the OpenSAFELY platform.

Please note additional suggested acknowledgement text if your project uses certain linked data sources.

Citations

We request that you cite the following materials in your work when describing OpenSAFELY:

Please note additional suggested citations if your project uses certain linked data sources.

Funders

OpenSAFELY funders should be acknowledged. Suggested content:

The OpenSAFELY platform is principally funded by grants from:

  • NHS England [2023-2026];
  • The Wellcome Trust (222097/Z/20/Z [2020-2024] and 311535/Z/24/Z [2025-2031]);
  • The Medical Research Council (MRC) (MR/V015737/1 [2020-2021]).

Additional contributions to OpenSAFELY have been funded by grants from:

  • Medical Research Council (MRC) via the National Core Study programme Longitudinal Health and Wellbeing strand (MC_PC_20030, MC_PC_20059 [2020-2022]) and the Data and Connectivity strand (MC_PC_20058 [2021-2022]);
  • The National Institute for Health Research (NIHR) and the Medical Research Council (MRC) via the CONVALESCENCE programme (COV-LT-0009, MC_PC_20051 [2021-2024]);
  • NHS England via the Primary Care Medicines Analytics Unit [2021-2024].

The views expressed are those of the authors and not necessarily those of the NIHR, NHS England, Wellcome Trust, the Department of Health and Social Care, or other funders. Funders had no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication.

Approvals

When describing the OpenSAFELY project approvals process, suggested content:

This work was conducted with the approval of NHS England [project XXX and link], and approved for publication following review by the Bennett Institute and the Department of Health and Social Care.

The ethics committee or board responsible for ethical oversight of the study should be referenced. Suggested content:

This study was approved by the XXX Research Ethics Committee [reference XXX].

Or:

This study did not require ethical approval as confirmed by XXX Research Ethics Committee.

Linked data sources

If the High Cost Drug dataset was used, required acknowledgement content:

North East Commissioning Support Unit provided support on behalf of all Commissioning Support Units to aggregate the high cost drugs data for use in OpenSAFELY studies.

If the ISARIC data was used, required acknowledgement content:

This report is independent research which used data provided by the MRC funded ISARIC 4C Consortium and which the Consortium collected under a research contract funded by the National Institute for Health Research. The views expressed in this publication are those of the author(s) and not necessarily those of the ISARIC 4C consortium.

If the ONS-CIS dataset was used, required acknowledgement content:

The Coronavirus (Covid-19) infection survey is delivered by the Office for National Statistics in partnership with the University of Oxford, University of Manchester, UK Health Security Agency and Wellcome Trust. The study is funded by the Department of Health and Social Care with in-kind support from the Welsh Government, the Department of Health on behalf of the Northern Ireland Government and the Scottish Government. The collection and testing of samples is carried out by the Lighthouse laboratory. Genome sequencing is funded by the COVID-19 Genomics UK (COG-UK) consortium. COG-UK is supported by funding from the Medical Research Council (MRC) part of UK Research and Innovation (UKRI), the National Institute of Health Research (NIHR), and Genome Research Limited operating as the Wellcome Sanger Institute.

If the UKRR dataset was used, required acknowledgement content:

This project includes data from the UKRR derived from patient-level information collected by the NHS as part of the care and support of kidney patients. We thank all kidney patients and kidney centres involved. The data are collated, maintained, and quality assured by the UKRR, which is part of the UK Kidney Association. The interpretation and reporting of these data are the responsibility of the authors and in no way should be seen as an official policy or interpretation of the UK Kidney Association. Access to the data was facilitated by the UKRR’s Data Release Group. UKRR data are used within OpenSAFELY to address a number of critical research, audit and service delivery questions related to the impact of COVID-19 on patients with kidney disease.

Methods - Data Sharing, Data Sources, software, and reproducibility

When describing the OpenSAFELY platform, Service, and data, suggested content:

All data were linked, stored and analysed securely using the OpenSAFELY platform, https://www.opensafely.org/, as part of the NHS England OpenSAFELY Analytics Service. Primary care records managed by the GP software provider [TPP/Optum] were linked to [ONS death data, etc.] through OpenSAFELY [For more information on data linkage, see the Data Provenance, Access, and Verification section]. [If your project is ID 156 or above: No data from patients who registered a Type-1 Opt Out with their GP surgery were included in this study]. More information on the patients available for selection in the Service is available on the data page of the OpenSAFELY documentation.

The analysis dataset(s) was defined and created using ehrQL / Python [X.X]. Analysis was executed using [Stata 16.1 / Python X.X / R X.X]. All analysis code is shared openly for review and re-use under the MIT open license [LINK TO GITHUB REPO].

All iterations of the pre-specified study protocol are archived with version control https://github.com/opensafely/xxxxxx/protocol.

When describing federated analyses across TPP and Optum databases (for example https://doi.org/10.3399/BJGP.2021.0376), suggested content:

This was an analysis delivered using federated analysis through the OpenSAFELY platform. A federated analysis involves conducting patient-level analyses across multiple secure datasets, then later combining them: codelists and code for data management and data analysis were specified once using the OpenSAFELY tools; then transmitted securely from the OpenSAFELY jobs server to the OpenSAFELY-TPP platform within TPP’s secure environment, and separately to the OpenSAFELY-Optum platform within Optum’s secure environment, where they were each executed separately against local patient data; summary results were then reviewed for disclosiveness, released, and combined for the final outputs.
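As a rough illustration of the final "combined" step described above, the sketch below pools released summary counts from two backends into a single rate. It is a minimal example, not OpenSAFELY code: the `BackendSummary` class, the `combine_summaries` function, and all figures are hypothetical, and it assumes each summary has already passed output review and release.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BackendSummary:
    """Aggregate results released from one secure backend (hypothetical structure)."""
    backend: str
    events: int
    population: int


def combine_summaries(summaries: List[BackendSummary]) -> Dict[str, float]:
    """Pool released aggregate counts; no patient-level data is involved at this stage."""
    total_events = sum(s.events for s in summaries)
    total_population = sum(s.population for s in summaries)
    return {
        "events": total_events,
        "population": total_population,
        "rate_per_1000": 1000 * total_events / total_population,
    }


# Made-up figures standing in for released outputs from each backend.
released = [
    BackendSummary(backend="TPP", events=120, population=40_000),
    BackendSummary(backend="Optum", events=80, population=10_000),
]
combined = combine_summaries(released)
print(combined)
```

Only the aggregate counts cross the boundary of each secure environment in this sketch, which mirrors the separation the suggested text describes.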

Patient and Public Involvement and Engagement (PPIE)

Consider describing PPIE activities for the OpenSAFELY Service, in addition to any other project-specific PPIE. Suggested content:

OpenSAFELY has involved patients and the public in various ways: we developed a public website that provides a detailed description of the platform in language suitable for a lay audience (https://opensafely.org); we have participated in two citizen juries exploring public trust in OpenSAFELY; we have co-developed an explainer video (https://www.opensafely.org/about/); we have patient representatives, who are experts by experience, on our OpenSAFELY Oversight Board; we have partnered with Understanding Patient Data to produce lay explainers on the importance of large datasets for research; we have presented at various online public engagement events to key communities (e.g., Healthcare Excellence Through Technology; Faculty of Clinical Informatics annual conference; NHS Assembly; HDRUK symposium); and more. To ensure the patient voice is represented, we are working closely with appropriate medical research charities (e.g., Association of Medical Research Charities) to decide on language choices. We will share information and interpretation of our findings through press releases, social media channels, and plain language summaries.

Information governance

If required, suggested content on information governance for the OpenSAFELY Service:

NHS England is the data controller of the NHS England OpenSAFELY Analytics Service; [TPP is the data processor] [Optum is the data processor] [Optum and TPP are the data processors]; all study authors using OpenSAFELY have the approval of NHS England [1, 2]. This implementation of OpenSAFELY is hosted within the [Optum environment which is] [TPP environment which is] [Optum and TPP environments which are] accredited to the ISO 27001 information security standard and [is][are] NHS IG Data Security and Information Governance Toolkit compliant [3].

Patient data has been pseudonymised for analysis and linkage using industry standard cryptographic hashing techniques; all pseudonymised datasets transmitted for linkage onto OpenSAFELY are encrypted; access to the NHS England OpenSAFELY Analytics Service is via a virtual private network (VPN) connection; the researchers hold contracts with NHS England and only access the platform to initiate database queries and statistical models; all database activity is logged; only aggregate statistical outputs leave the platform environment following best practice for anonymisation of results such as statistical disclosure control for low cell counts [4].

The service adheres to the obligations of the UK General Data Protection Regulation (UK GDPR) and the Data Protection Act 2018. The Secretary of State has requested that NHS England operates the Service under the COVID-19 Directions 2020 [5] and the NHS OpenSAFELY Data Analytics Service Pilot Directions 2025, which came into force on 17 June 2025 [6].

The Pilot Directions enable the Service to be operated for purposes wider than COVID-19. Taken together, these provide the legal bases to link patient datasets using the service.

  1. The NHS England OpenSAFELY COVID-19 service - privacy notice. NHS Digital (Now NHS England). https://digital.nhs.uk/coronavirus/coronavirus-covid-19-response-information-governance-hub/the-nhs-england-opensafely-covid-19-service-privacy-notice (accessed 23 February 2026).
  2. The NHS England OpenSAFELY Data Analytics Service pilot - privacy notice. NHS Digital (Now NHS England). https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/secretary-of-state-directions/nhs-opensafely-data-analytics-service-pilot-directions-2025/nhs-opensafely-privacy-notice (accessed 23 February 2026).
  3. Data Security and Protection Toolkit. NHS Digital (Now NHS England). https://digital.nhs.uk/services/data-security-and-protection-toolkit (accessed 23 February 2026). Archived at: https://web.archive.org/web/20250405100349/https://digital.nhs.uk/services/data-security-and-protection-toolkit/data-security-and-protection-toolkit.
  4. ISB1523: Anonymisation Standard for Publishing Health and Social Care Data. NHS Digital (Now NHS England). https://digital.nhs.uk/data-and-information/information-standards/information-standards-and-data-collections-including-extractions/publications-and-notifications/standards-and-collections/isb1523-anonymisation-standard-for-publishing-health-and-social-care-data (accessed 23 February 2026).
  5. Secretary of State for Health and Social Care. UK Government. COVID-19 Public Health Directions 2020: notification to NHS Digital. https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/secretary-of-state-directions/covid-19-public-health-directions-2020 (accessed 23 February 2026). Archived at: https://web.archive.org/web/20250414105355/https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/data-provision-notices-dpns/opensafely-covid-19-service-data-provision-notice.
  6. NHS OpenSAFELY Data Analytics Service Pilot Directions 2025. NHS England. https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/secretary-of-state-directions/nhs-opensafely-data-analytics-service-pilot-directions-2025 (accessed 23 February 2026).
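
The suggested text above mentions statistical disclosure control for low cell counts. The sketch below shows one common form of it: small counts are redacted and the remaining counts are rounded. It is illustrative only. The function name `apply_disclosure_control`, the threshold of 7, and rounding to the nearest 5 are assumptions made for this example; the rules that apply to a given output are those set by the Service's output checking process.

```python
from typing import Dict, Optional


def apply_disclosure_control(
    count: int, threshold: int = 7, base: int = 5
) -> Optional[int]:
    """Redact small counts and round the rest (threshold and base are assumed values)."""
    if count <= threshold:
        # Small cells are suppressed entirely rather than reported.
        return None
    # Round to the nearest multiple of `base` using integer arithmetic
    # (exact halves, which can only occur for an even `base`, round up).
    return base * ((count + base // 2) // base)


# Made-up counts for three hypothetical age bands.
raw_counts: Dict[str, int] = {"18-39": 3, "40-64": 123, "65+": 48}
safe_counts = {
    band: apply_disclosure_control(n) for band, n in raw_counts.items()
}
print(safe_counts)
```

In this example the "18-39" cell is redacted because it falls at or below the assumed threshold, and the other two cells are reported as rounded values.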

Data provenance, access and verification

If required, suggested content on provenance, access and verification of data in the OpenSAFELY Service:

The data in the NHS England OpenSAFELY Analytics Service is drawn from General Practice data across England where Optum and TPP are the data processors. This study uses data from [Optum only / TPP only / both Optum and TPP]. Linked data from hospitals and death registrations are also available.

Access to the underlying identifiable and potentially re-identifiable pseudonymised electronic health record data is tightly governed by various legislative and regulatory frameworks, and restricted by best practice. [Optum][TPP][Optum and TPP] developers initiate an automated process to create pseudonymised records in the core OpenSAFELY database, which are copies of key structured data tables in the identifiable records. These pseudonymised records are linked onto key external data resources that have also been pseudonymised via SHA-512 one-way hashing of NHS numbers using a shared salt. University of Oxford, Bennett Institute for Applied Data Science developers and PIs, who hold contracts with NHS England, have access to the OpenSAFELY pseudonymised data tables to develop the OpenSAFELY tools.

These tools in turn enable researchers with OpenSAFELY data access agreements to write and execute code for data management and data analysis without direct access to the underlying raw pseudonymised patient data, and to review the outputs of this code. All OpenSAFELY platform software is shared for inspection on GitHub: https://github.com/opensafely-core.

All code for the full data management pipeline for this analysis, from raw data to completed results, is available for review and reuse for all OpenSAFELY analyses [LINK TO GITHUB REPO]. The data management and analysis code for this paper was led by (XX) and contributed to by (XX, XX).
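
For readers unfamiliar with the linkage approach named above (SHA-512 one-way hashing of NHS numbers using a shared salt), here is a minimal sketch of the general technique using Python's standard `hashlib` module. It is not the Service's implementation. The function name, the salt value, the placeholder identifier, and the choice to prepend the salt before hashing are all assumptions made for illustration.

```python
import hashlib


def pseudonymise_nhs_number(nhs_number: str, salt: str) -> str:
    """Return a salted SHA-512 hex digest of an identifier (illustrative only)."""
    # Normalise formatting so differently spaced copies of the same
    # number produce the same digest.
    normalised = nhs_number.replace(" ", "").replace("-", "")
    # Assumption: the salt is prepended to the identifier before hashing.
    return hashlib.sha512((salt + normalised).encode("utf-8")).hexdigest()


# Hypothetical salt and a made-up placeholder, not a real NHS number.
SHARED_SALT = "example-shared-salt"
gp_record_key = pseudonymise_nhs_number("123 456 7890", SHARED_SALT)
external_key = pseudonymise_nhs_number("1234567890", SHARED_SALT)

# Both datasets derive the same key, so records can be joined on the
# digest without either side exchanging the underlying identifier.
print(gp_record_key == external_key)
```

Because both sides use the same salt, the same identifier always maps to the same digest, which is what allows pseudonymised datasets to be linked; a different salt would produce unrelated digests.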

Open access

We strongly encourage that all papers using OpenSAFELY be preprinted prior to peer review. The vast majority of OpenSAFELY papers have been preprinted on medRxiv, though other preprint servers may be suitable. If you are unsure whether a particular preprint server is suitable for an OpenSAFELY paper, your co-pilot can advise.

Outputs published in academic journals should comply with Wellcome’s Plan S requirements for journal publication.