Open Research Primer
This primer introduces the principles and practices of Open Research, offering a reference to help improve the transparency, integrity, and effectiveness of scientific work. By adopting Open Research approaches, researchers can maximise the quality, credibility, reach, and societal impact of their work.
What is Open Research?
Open Research is a set of practices and principles that prioritise transparency and inclusivity throughout the research process — from study design and methodology to outputs and data sharing.
Open Research can be thought of as a ‘continuum within a buffet’ (or a ‘multi-axis openness framework’), with each dimension of research practice — design, data, methods, outputs, participation — ranging from closed to open. Open Research principles can be applied in our own research practice, modelled for trainees and collaborators, and championed when reviewing manuscripts and funding proposals. The adoption of Open Research practice varies widely between scientific disciplines.
A useful maxim is to strive to be “as open as possible and as closed as necessary”. While the benefits of Open Research practice are considerable, good research can take many forms and “rigour” is context-dependent. Open Research should not be conflated with reproducible research, as reproducibility is not epistemologically universal. Good science should use the right tools to answer the right questions, and some valuable forms of knowledge production cannot be reproduced (e.g., phenomena altered by the act of observation, or qualitative approaches to knowledge creation). While Open Research cannot guarantee good research, it provides a vital safeguard against bad science ([1]).
Benefits of Open Research
- Publication: Increasingly, journals and funders require researchers to meet core Open Research standards (e.g., sharing reproducible code and data) when submitting manuscripts.
- Integrity: Errors are more easily detected and corrected when research is open. Science is a self-correcting endeavour, but openness is essential to enable these corrections ([2]), and openness helps safeguard against (rare) deliberate malpractice.
- Uptake: Open access outputs enjoy wider distribution, greater visibility, and often attract more citations ([3]).
- Collaboration: Sharing datasets and methodologies fosters new and sometimes unexpected collaborations.
- Inclusivity: Reducing barriers to participation in research ecosystems enables underrepresented groups to contribute to, and benefit from, the production of formal knowledge ([4]).
Open Research and the Reproducibility Crisis
Open Research is often championed as a remedy for the reproducibility crisis ([5])—the growing recognition that many published findings cannot be replicated when studies are repeated. The crisis arises from interacting factors such as:
- Selective reporting: Favouring statistically significant or “positive” results while neglecting null findings (the file drawer problem).
- Poor research practices: (Usually) unintentionally compromising validity through flawed design, specification searching, inappropriate tools, or p-hacking.
- Lack of transparency and detail: Omitting critical methodological information needed for replication.
- Pressure to publish: Incentives to publish quickly in prestigious journals can compromise quality.
- Insufficient appreciation of replication studies: Prestige journals often prioritise novelty over verification, discouraging replication work.
Consequences of poor reproducibility include:
- Producing erroneous insights with real-world consequences.
- Damaging the credibility of science; if findings cannot be replicated, they cannot be fully trusted.
- Wasting resources on studies built on unreliable results.
- Eroding public trust in science and scientific institutions.
Checklist of Open Research Practice
This non-exhaustive checklist is designed to support the advancement of Open Research within the TESS Lab, with a focus on geospatial ecology but broad applicability across Geography and Ecology. It covers practices that are foundational and usually essential, others that are frequently valuable, and some that are worth applying depending on the project or context. Think of it as a flexible guide to help us all make our research more transparent, collaborative, and impactful.
Collaboration and study design
Protect Academic Freedoms and Integrity
We should avoid situations where any party gains the power to suppress or distort research findings they find inconvenient. In applied research, it is common for data controllers or funders to request that their approval be obtained after an analysis is completed but before research findings are shared. This can be extremely problematic, as it can be challenging or even impossible to detect after the fact that evidence has been suppressed or distorted.
Therefore, it is important to avoid commitments that allow stakeholders to restrict research findings. However, these stakeholders often possess valuable insights that can enhance interpretation and communication of results. Inviting them to provide additional reviews can be constructive—allowing us to consider and incorporate their suggestions where appropriate.
Engaging in open dialogue with such stakeholders can also help raise awareness of the unintended consequences of outdated policies and encourage more progressive practices. For example, we have successfully encouraged some data controllers to shift their approval process earlier—from approving final findings to agreeing on the research design upfront. This means data is shared for investigation based on predefined questions and methods rather than subjecting findings to post hoc approval. While not fully open, this approach represents a positive progression along the Open Research continuum supporting better outcomes.
Preregister Experimental Designs
Preregistration involves specifying and committing in advance to research questions, hypotheses, and analytical methods, and registering this plan with a trusted authority such as the Open Science Framework – Registrations. Carefully planning research in advance offers substantial, and often underestimated, value for scientific rigour.
Registrations help protect and demonstrate integrity by guarding against practices like ‘secretly hypothesising after the results are known’ (SHARKing), the file drawer problem (withholding negative results), and specification searching. They enable final reports to clearly distinguish between pre-planned hypotheses and exploratory findings, the latter of which should be interpreted with appropriate caution despite potential novelty.

Preregistration is effectively mandatory for hypothesis-testing studies in fields like medical research but remains optional—and surprisingly rare—in disciplines such as geography and ecology. While preregistration is less suited to ‘learning by doing’ approaches common in doctoral training, it is a useful tool for more pre-planned projects, such as studies with comprehensive experimental designs set out in funded grant proposals.
For more guidance on whether preregistration is right for a particular project, see the ReproducibiliTea ‘introduction to preregistration’ reading list.
Managing data, code and analytical pipelines
Use Free and Open Source Software
To maximise the reusability and reproducibility of our workflows—both for ourselves and others—we should preferentially use Free and Open Source Software (FOSS) whenever suitable options exist. This is especially important considering global inequities: many dominant software companies are based in the Global North, and licence fees can reinforce economic imbalances by transferring wealth away from under-resourced communities.
Our choice of tools has long-term implications. Familiarity bias means that software learned early in one’s career tends to be used throughout, and these habits spread as skills are passed on. Too often, methodological decisions are driven by familiarity rather than suitability—a dynamic well illustrated in this review. To support better choices, we’ve compiled a list of FOSS options useful for TESS Lab research.
For example, Zotero is our preferred bibliographic manager: it’s fully featured, open source, and offers powerful collaboration and customisation options. In contrast, proprietary tools like Mendeley, owned by Elsevier, further concentrate power within large academic publishers.
Similarly, for statistical analysis, R is a free, powerful, and widely accessible option. Proprietary alternatives like SPSS or Minitab require costly subscriptions, limiting the ability for others to replicate or build on work (or even you to re-use your own code when you move organisations!).

Develop Reproducible Workflows
Using reproducible workflows—such as scripted data processing and analysis—is fundamental to Open Research in most, though not all, scientific fields. These workflows transparently document every analytical step, making it easier to replicate, update, upscale, and detect or correct errors.

In TESS Lab, we extensively use R for its versatility and power in building reproducible workflows. The University of Exeter offers support through its Coding for Reproducible Research (CfRR) Training Programme, helping researchers improve skills for efficient and transparent research. The Research Software and Analytics Group at Exeter is also happy to support conversations on best practices for data openness and code reproducibility.
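As a minimal illustration of the idea, the sketch below (written in Python for brevity; in TESS Lab such a pipeline would usually be an R script) shows a fully scripted analysis step with a fixed random seed, so the same inputs always yield the same output. The data, function names, and the three steps are hypothetical.

```python
# Hypothetical sketch of a scripted, reproducible pipeline step.
import random
import statistics

SEED = 42  # fixing the seed makes any resampling step repeatable


def run_pipeline(raw_rows, seed=SEED):
    """Each transformation is explicit, so the path from raw data to
    result can be re-run, audited, and updated when data change."""
    rng = random.Random(seed)
    cleaned = [r for r in raw_rows if r["value"] is not None]  # 1: filter
    sample = rng.sample(cleaned, k=min(100, len(cleaned)))     # 2: resample
    return statistics.mean(r["value"] for r in sample)         # 3: summarise


rows = [{"value": v} for v in range(1, 201)]
assert run_pipeline(rows) == run_pipeline(rows)  # same seed, same answer
```

Because every step lives in the script rather than in undocumented point-and-click operations, a collaborator (or a future you) can rerun, inspect, or scale up the analysis exactly.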

Share Reproducible Code
Sharing reproducible code is crucial for building trust, improving research quality, and accelerating scientific progress. It enables others to verify findings, detect errors, and build upon previous work, fostering a more robust and efficient research ecosystem. Sharing code also documents the research process beyond what’s possible in textual paper descriptions, helping others fully understand our work and its implications ([6]).
While many journals require analytical code sharing, enforcement is uneven. Science is a self-correcting endeavour but the literature is replete with errors of varying magnitudes, some of which have had serious consequences in the real world. Adopting a practice of always sharing code transparently alongside publications is a vital safeguard.
For instance, a PhD student discovered that an extremely influential paper used to justify controversial austerity cuts in the 2010s contained major errors due to Excel misuse, invalidating its findings (media report). In another example, shared code enabled researchers replicating a high-profile study to efficiently demonstrate that four of its five key results were sensitive to a contentious statistical modelling choice, tempering media claims of ‘insect Armageddon’.
In TESS Lab, we leverage R, Git, and GitHub to develop and share reproducible code. Our GitHub organisation hosts both private and public repositories for collaboration and sharing. We link our public repositories (currently ≥ 28) to Zenodo — a general-purpose, open-access repository developed by CERN — which issues persistent DOIs to make code citable and trackable. Zenodo integrates seamlessly with GitHub, supports versioning, and preserves older code versions alongside updates. More details are on our “Using Computers for Research” page.

When sharing code, we use licences to clarify usage rights. For most TESS Lab projects, the GNU GPL v3 licence is the best option. For an overview of common licences, including options limiting commercial reuse, visit choosealicense.com.

Share Research Data
Sharing research data properly is vital for (i) reproducing or verifying research, (ii) sharing publicly-funded work with the wider community, (iii) enabling new questions to be asked of existing data, and (iv) advancing scientific innovation. This accelerates progress by allowing researchers to build on prior work, creating a more reliable and efficient research ecosystem. Research shows that sharing data increases citations by about 9% ([8]) and can lead to unexpected collaboration opportunities.
The default assumption throughout the research life cycle should be that data will be shared in a FAIR-compliant way—Findable, Accessible, Interoperable, and Reusable ([7]). Interoperability requires standardised data formats (e.g., those codified by the Open Geospatial Consortium). It is much easier to share data if sharing is considered throughout collection and processing pipelines.
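To make the interoperability point concrete, here is a hedged sketch (Python, for illustration) that writes point records in GeoJSON, an OGC-standardised format most geospatial tools can read. The file name, site IDs, and attribute values are made up.

```python
# Hypothetical example: exporting site records as GeoJSON, an
# OGC-standardised, widely supported interchange format.
import json

features = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [-5.10, 50.17]},
        "properties": {"site_id": "T01", "species_richness": 14},  # illustrative
    }
]
collection = {"type": "FeatureCollection", "features": features}

with open("sites.geojson", "w") as fh:
    json.dump(collection, fh, indent=2)  # readable by any GeoJSON-aware tool
```

A file like this can be opened directly in QGIS, R, or Python without any bespoke parsing, which is exactly what the "Interoperable" in FAIR asks for.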
We cannot rely on outdated commitments such as “data shared upon reasonable request”, as 93% of such requests are ignored ([9], [10], [11]). While most journals require data sharing, enforcement varies.
Responsible research must also respect ethical and commercial constraints that can limit data sharing—such as protecting the locations of vulnerable wildlife, ensuring respondent anonymity, or respecting commercial sensitivities of partners. In many cases, sharing suitably processed data or metadata is appropriate ([12]). Examples include reducing location precision, anonymising surveys, or sharing data subsets to enable code reproducibility—as we did when sharing a 5% subset of a sensitive dataset to aid reproducibility of our machine learning pipeline.
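Two of the safeguards above (coarsening location precision and releasing a fixed subset) can be sketched as follows; the 5% fraction echoes the example in the text, while the data, field names, and one-decimal precision are illustrative assumptions.

```python
# Hypothetical sketch: preparing a shareable version of sensitive point data.
import random


def coarsen(lon, lat, decimals=1):
    """Round coordinates so precise locations (e.g. nests or roosts) are
    not recoverable; 1 decimal degree is roughly 11 km of latitude."""
    return round(lon, decimals), round(lat, decimals)


def reproducibility_subset(records, fraction=0.05, seed=1):
    """Draw a fixed, seeded subset so the released sample is itself
    reproducible."""
    rng = random.Random(seed)
    k = max(1, int(len(records) * fraction))
    return rng.sample(records, k)


records = [{"lon": -5.1234 + i * 0.01, "lat": 50.1712} for i in range(200)]
released = []
for r in reproducibility_subset(records):
    lon, lat = coarsen(r["lon"], r["lat"])
    released.append({"lon": lon, "lat": lat})
```

Fixing the sampling seed matters here: anyone rerunning the release script obtains exactly the same subset, so the shared data and the shared code stay in step.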
For TESS Lab, Zenodo is generally our preferred option for archiving and sharing research datasets. Zenodo supports individual dataset uploads of up to 50 GB; larger datasets can be divided into multiple submissions. Exeter’s institutional repository (ORE) is available for smaller datasets.
Embargoing data temporarily is sometimes appropriate to protect early career researchers or data collectors, allowing them to publish first. Many repositories (e.g., Zenodo) offer embargo options for set periods or automatic unlocking of data upon publication of an associated paper.

Researchers handling sensitive Indigenous datasets should consult Operationalizing the CARE and FAIR Principles for Indigenous data futures for accessible introductory guidance. While valuable in some contexts, the CARE principles are still evolving and are not yet universally applicable, partly due to unresolved tensions between accessibility and data sovereignty.
Writing and publication
Create and Use an ORCID
All researchers producing scholarly outputs should obtain and use an ORCID (Open Researcher and Contributor ID). ORCID is a free, unique, and persistent identifier for individuals engaged in research, scholarship, and innovation. It enhances transparency in authorship by ensuring unambiguous attribution of contributions.
ORCID also connects many parts of the research ecosystem — including funding, peer review, and data repositories — allowing researchers to spend more time on research and less on administrative tasks. We should encourage trainees and co-authors to register and use ORCIDs for all outputs, supporting research efficiency and integrity.

Write Accessibly
Accessible writing is a key but often overlooked aspect of Open Research. Concise and clear communication helps ensure that scientific ideas can be understood, replicated, and built upon. Overly complex writing laden with acronyms can obscure meaning and limit reach.
One can think of this as writing defensively — minimising opportunities for misinterpretation. You can find more guidance on clear writing and accessible figures on other TESS Lab pages.
It is valuable to seek feedback on your writing from the perspective of a non-specialist reader. One option is the judicious use of Large Language Models: ask them to propose refinements to existing prose, then evaluate each suggestion word by word to check that it aids clarity while retaining full accuracy and nuance (remaining aware of the limitations of these tools). The University of Exeter also offers support from Royal Literary Fund Fellows, who can provide high-quality feedback on your writing.

Include Author Contribution Statements
For multi-authored works, include an Author Contribution Statement using the Contributor Role Taxonomy (CRediT) framework. Authorship confers credit and accountability with academic, social and financial implications; however, authorship order alone poorly reflects individual contributions — especially as conventions vary across disciplines.
Recording contributions explicitly (typically at the end of a manuscript) matters most in large collaborations, where a core group typically contributes most of the intellectual and practical load. Fair recognition of contributions helps safeguard early career researchers’ development and provides evidence for research evaluations such as REF and recruitment and promotion panels.

Open Access Research Outputs
Publishing research outputs Open Access (OA) ensures that the knowledge we create is freely available to anyone with internet access. This expands readership to include policymakers, practitioners, and other audiences without subscription access.

Exeter’s agreements with many publishers allow outputs to be published OA, aligning with requirements from most funders and research assessment exercises. However, OA must be seen as part of a broader Open Research approach — not its end point. We should also remain mindful that article processing charges can be prohibitive for many researchers, perpetuating inequities in who can contribute to the formal scholarly record.
Preprints
A preprint is a manuscript shared publicly before journal peer review. Preprints are most valuable for rapidly disseminating time-sensitive work in a credible and citable way. Useful contexts can include:
- Demonstrating nearly completed work — e.g., for grant applications or CVs — to give assessors confidence in recent outputs that may otherwise be delayed in the publication process, which can be particularly valuable for early career researchers, provided that preprints are clearly differentiated from accepted articles.
- Sharing data, ideas, or code with multiple external stakeholders.
- When there is a necessity to establish priority in competitive research areas.
Relevant preprint servers for TESS Lab include EarthArXiv (geospatial ecology) and EcoEvoRxiv (ecology), both hosted on the OSF. While preprints can slightly accelerate the exposure of work, they should be used with care ([13]): prematurely shared provisional findings can be misunderstood by journalists and other stakeholders. Some Open Access journals automatically make all submitted manuscripts available as preprints, and the vast majority of journals accept manuscripts that have been shared as preprints, but a few will not accept submissions of preprinted manuscripts.
The University of Exeter’s transformative publishing agreements with most major publishers already enable Open Access publication, removing the need to use preprints to bypass paywalls and weakening the contention that preprints are an essential part of Open Research.
Further Resources
Links to training programmes, handbooks, tools etc. for Open Research practice.
The Turing Way offers a handbook for practices supporting reproducible, ethical and collaborative data science, including great resources on reproducible code, analysis workflows and data management.
Open Science Framework is a free, open-source platform for managing and sharing all parts of the research lifecycle — data, code, protocols, preprints, and manuscripts. It’s widely used in ecology and earth sciences.
Framework for Open and Reproducible Research Training (FORRT) – Community-generated resources supporting Open Science teaching in both quantitative and qualitative research, with summaries of open science literature, self-assessment tools, seminars, and teaching resources.
ReproducibiliTea is an international grassroots initiative supporting Open Science journal clubs to discuss diverse issues, papers and ideas about improving science, reproducibility and the Open Science movement. There are monthly webinars, an Exeter-based ReproducibiliTea journal club, and curated reading lists providing great introductions to the underlying issues.
The Coding for Reproducible Research (CfRR) training programme at the University of Exeter and the linked Exeter Data Analytics Hub for Promoting Open and Reproducible Science, based on Exeter’s Penryn Campus, offer a useful community of practice with introductory resources and peer support (although beware that some of their resources may be outdated).
Open Research via Open Access Outputs at the University of Exeter (useful for checking the status of publishing agreements facilitating Open Access publications).
Qualitative Open Research
Qualitative epistemological approaches have unique requirements when it comes to Open Research practices. This section highlights a selection of valuable resources designed to help qualitative and mixed-methods researchers navigate these challenges more effectively. An aspiration here is reimagining Openness to help ensure that Open Research is properly open to all (with thanks to Madeleine Pownall, whose work is making important contributions to advancing qualitative open research!).
- Framework for Open and Reproducible Research Training (FORRT) – Community-generated resources supporting Open Science teaching in both quantitative and qualitative research, with summaries of open science literature, self-assessment tools, seminars, and teaching resources.
- Rethinking Transparency and Rigor from a Qualitative Open Science Perspective by Steltenpohl et al. (2023)
- Integrating Qualitative Methods and Open Science: Five Principles for More Trustworthy Research by Humphreys et al. (2021)
- Preregistering qualitative research by Haven and Van Grootel (2019)
- Three steps to open science for qualitative research in psychology by Branney et al. (2023)
Further Reading
Academic research into Open Research practice.
Easing Into Open Science: A Guide for Graduate Students and Their Advisors by Kathawalla et al. (2021)
Promoting Open Science: A Holistic Approach to Changing Behaviour by Robson et al. (2021)
Open code for open science? by Easterbrook (2014)
Data sharing: An open mind on open data by Gewin (2016)
The Buffet Approach to Open Science by Bergmann (2016)
Promoting trust in research and researchers: How open science and research integrity are intertwined by Haven et al. (2022)
Harking, Sharking, and Tharking: Making the Case for Post Hoc Analysis of Scientific Data by Hollenbeck and Wright (2017)
Exploring the reproducibility of research publications: A Reproducibility Evaluation Pilot – a stimulating video unpacking some aspects of research reproducibility
Violation of research integrity principles occurs more often than we think by Li et al. (2022)
The effect of Open Data on Citations by Traag (2024)
Data Sharing Practices and Data Availability upon Request Differ across Scientific Disciplines by Tedersoo et al. (2021)
The Conundrum of Sharing Research Data by Borgman (2012)
Why Most Published Research Findings Are False by Ioannidis (2005)
Trust in scientists and their role in society across 68 countries by Cologna et al. (2025)
Philosophy of Open Science Open Access book by Leonelli (2023)
