The open science and research handbook
- The text on this page is from The open science and research handbook, December 2014, English version
This handbook aims to help researchers, research organisations, decision-makers, financiers, and the general public promote the adoption and use of open science and research (OSR).
OSR opens new opportunities for participating in scientific research and applying research results, increasing the global contribution and social impact of scientific activities.
Finland seeks to become a leader in OSR, applying its principles to accelerate Finnish scientific research and boost its impact.
Achieving this goal will require the extensive participation of the research community and the internalisation of new working methods, but with the aid of OSR, we aim to boost the competitiveness and quality of Finland's research system, promoting reliability and transparency in scientific research.
This handbook aims to facilitate this process, and will be edited and enhanced by key stakeholders as we progress. Handbook versions approved by the strategy team will be published on the openscience.fi website.
Currently, the handbook comprises the following sections:
The key premises for OSR | Given to the strategy team, 2 June 2014 |
Promoting OSR | Given to the strategy team, 2 June 2014 |
General principles | |
Promoting openness in the research process | |
Measures for science policymakers | |
Measures for research financiers | |
Measures for research administration | |
Measures for research organisations and research teams | |
Measures for researchers | |
Putting OSR into practice | Additions from working groups and open comments during 2014; to be published in the forum on 25 November |
Open research publications | |
Open research data | |
Open research methodologies | |
Open research infrastructures | |
Open research support services | |
International situation | Additions from working groups during 2014 |
Glossary | Additions from working groups and cooperation with the Ministry of Finance during 2014 |
The first version of this handbook includes Chapter 1, short versions of Chapters 2 and 4, and Chapter 5. Over the course of 2014, additions will be made by working groups from the OSR Initiative (ATT), from open comments, and from the road map.
Detailed guidelines and model processes will be inserted into Chapters 2 and 3 in particular.
Openness: The key premise of open science and research
Openness is a key principle of science and research, creating new opportunities for participation by researchers, decision-makers and the general public.
OSR is now a globally accepted way of promoting science and its social impact. In adopting OSR, we generate increased opportunity for transparency, repeatability and confirmation of research results.
For OSR to thrive, the research community needs wide-ranging access to the publications, data, methods, expertise, and support services generated by, and required for, scientific research.
Scientific knowledge is by nature open and responsible, and this openness should be reflected not only in data collection but also in research and evaluation methods. Others' desire to access and use research results indicates a study's significance and, because its results are widely used, openness also increases the impact of that research.
Science can be democratised through the use of new operating models for OSR. OSR is not just a collection of methods and recommendations — it's a way of thinking, of promoting open-mindedness and networking. Digitalisation has opened up the research process, creating new opportunities for cooperation and communication.
Openness provides potential for change. A key aspect of this is ensuring that emerging researchers working outside of established research infrastructures have equal opportunities for data access, something especially relevant in emerging countries.
Openness gives everyone, in principle, the opportunity to research, criticise or affirm research results in accordance with their abilities. This increases trust in science, and boosts corporate activity. Intangible commodities such as data, publications, methods, and services also benefit from increased openness in science. Making materials more widely, or freely, available can also increase the potential for innovation.
The aim of science has always been to promote high-quality research and best practice, and to prevent poor research and falsification. OSR by its nature supports this, requiring efficient, open methods for managing research data. OSR can be successfully adopted only if those responsible for research systems are motivated and trained to apply OSR principles. Every research system should routinely employ open methods for managing research data. The first practical requirement is that each actor must have clearly structured, up-to-date descriptions of how to promote open science. With the aid of OSR, we can promote sustainability, usability, access, and trust in research.
OSR depends on openness in the research process and working culture. Models for openness can be used to create opportunities for rich dialogue, and to preserve and increase diversity.
Research data, along with its associated expertise and understanding, are distributed across a variety of research actors, networks and communities. In these circumstances, openness must be increased:
- inwardly: to bring new ideas to the research process.
- outwardly: to enable others to harness ideas in new ways.
The recipe for achieving OSR is simple: we make our research offerings openly available, and use this openness to increase our impact.
- We will choose to make our research offerings, including publications, data and methods, openly available in accordance with the principles of research ethics and the legal environment, thus benefiting from the opportunities afforded by open access, open peer reviews, and parallel archiving. We will publish research results with an open licence (recommendation: CC BY 4.0) and use support services that facilitate openness.
- We will make full use of research offerings made openly available by others, ensuring we have the required expertise, open‐source software, and information about open standards and interfaces, as well as the solutions to implement them.
Benefits of OSR
OSR brings with it benefits including:
- Increased efficacy: We can use existing materials and methods to update our processes, resulting in faster development thanks to shared resources.
- Increased awareness of the scientific model: We can promote awareness of scientific methods and ways of working.
- Improved focus and better quality research results: We can confirm and validate data more quickly, improving quality and repeatability of results through greater transparency in research practices.
- Faster generation of new research ideas: We can more easily apply research results in real time.
- Increased commitment to science and improved scientific literacy: The general public can more easily access scientific results and methods.
- Increased economic and social impact: Businesses and decision-makers can more easily access and harness research results and methods.
OSR creates opportunities for a variety of stakeholders to participate in brainstorming research topics, conducting research, evaluating results, and developing software.
The basic logic of science and research dictates that science, along with its methods and results, should be as open as possible. Using OSR, research results and new information can be confirmed and validated independently and without bias, and support structures are required for this validation.
In making data and research results open, we enable and further new businesses and innovations, such as the creation of new services and software.
Openness is also economical and effective: previously collected data and information generated from it become globally, efficiently and equally accessible to all. According to a study conducted by the European Commission, open data projects are expected to generate 140 billion per annum. The economic impact of open research data is less straightforward to measure, but a recent study estimated that investments in availability services would generate income of 2 to 10 times the amount invested.
Openness also improves the quality of research: results and data enable scientific observations to be verified or challenged, so the global body of scientific knowledge can develop and correct itself faster and without redundancy.
Openness also promotes the faster transfer of information for use by all. It guarantees equal access to research data, regardless of geography.
Providing open access to research results and data in an information network increases the visibility of researchers, research results, and research institutions, and improves impact.
Different stakeholders benefit from OSR in different ways (Figure 1), including increased visibility (citations, mentions in social and other media), increased credits (references to publications, data and methods; awards for openness), increased funding (rewards for openness, awards for clear definitions of copyright/proprietary rights), and improved networking (new opportunities, better workload distribution, better results analyses). To harness these benefits, we must implement the measures that are covered in more detail in the chapter Promoting OSR.
Enabling OSR
Barriers to OSR include:
- The narrow reward culture in current academia: there is no real incentive to promote and reward openness
- Lack of infrastructure to support openness: there is widespread uncertainty about how the costs of openness will be covered. For example, business models may prevent increased openness in some research institutions
- Fear that raw data will be misinterpreted, methods misused, or data published too early
- Uncertainty over the ownership of data and methods
- Lack of expertise in promoting openness
These barriers are not insurmountable, and change can be supported by, for example:
- Developing incentives to promote cultural change: we need to clearly define the rewards and requirements for openness using indicators, metrics, and career impact, for example
- Promoting cooperation and interoperability: we need to build platforms for enabling and rewarding cooperation
- Planning for the sustainable development of research services and infrastructures: we need to develop with interoperability in mind, using open-source software, and open interfaces and standards whenever possible
- Identifying OSR expertise and supporting its growth
- Developing clear OSR policies and guidelines for every party involved
Openness is not black and white. In practice, we operate between the extremes of the open-closed scale (Figure 2), seeking a functional combination of availability, machine readability, costs, and legality. For example, research that reuses materials subject to the Personal Data Act requires a data‐protection guarantee. In such instances, researchers are urged to be open with regard to their publications.
Promoting OSR
- a. Basic principles and best practices
- b. Promoting openness in the research process
- Measures for science policymakers
- Measures for research financiers
- Measures for research administration
- Measures for research organisations and research teams
- Measures for researchers
Basic principles and good practice in OSR
Best practice in promoting OSR includes the development of
- common, binding ethical guidelines for research
- good methods for managing research data
- basic principles for openness, including recommendations for their expansion.
In 2012, the Finnish Advisory Board on Research Integrity (TENK) — appointed by the Ministry of Education and Culture — worked with the Finnish research community to update its ethical guidelines on good scientific practice and how to handle suspected violations (see the HTK guidelines).
Openness and well-managed research results lead to high-quality operations and results: honesty, diligence, and accuracy. The characteristics of high quality include usability and availability, integrity and accuracy, openness and confidentiality.
High-quality data is a central requirement for good data management. When research data is well managed, data is not altered without reason, is not corrupted or lost during processing, and can be relied on for its accuracy and quality.
Quality assurance and accuracy in research data management are absolutes that cannot be compromised without justification. However, measures to ensure data quality and accuracy should always be scaled to the data's significance and importance. The greater the data’s importance, the more effort should go into ensuring its quality and accuracy.
Availability and usability are essential and closely related concepts. Usability is generally used when referring to information systems, while availability is used in connection with data, publications and methods. Availability has two dimensions: the concrete availability of data and methods, such as the physical location of files, and the technical and other requirements for making data available.
Data confidentiality and protection is a key aspect of data management. Data security covers the usability, integrity and confidentiality of other parties' data. Research system providers, the general public, decision-makers and companies must be safe in the knowledge that sensitive or confidential material surrendered for research purposes does not end up in the possession of third parties.
Figure 3 shows the process of promoting OSR as a whole.
In practice, OSR is promoted at four levels:
- general policy
- working culture
- working methods
- services and infrastructure
The following shows how OSR is promoted at each level.
The basic policy for OSR in the Finnish research system
The Finnish research system operates under a few basic policies:
- Research results — including publications, data, methods and the tools required to publish — should be openly available in data networks via an open interface in accordance with ethical principles and respecting legal operating environments.
- Openness within research infrastructures should always be pursued when it is legally and contractually possible.
- The further use of research results should not be unnecessarily restricted (See, for example, The Act on the Openness of Government Activities), and the terms and conditions of their use should be clearly defined.
- The scientific and research community should work with financiers to define quality criteria and indicators for OSR that can be used alongside other indicators used in research.
Promoting openness in the research process
Achieving openness requires an open approach at all stages of the research process, with different challenges at each stage (Figure 4).
To develop a working culture that supports OSR, Finland requires best practice and clear policies from financiers and organisations within the research system. This means supporting and considering openness in, for example, contracts, funding and credits.
The research process alternates between planning, action and evaluation. We need to adapt our working methods not only with regard to research structures and processes, but also values, attitudes and behaviour.
Finland's data infrastructure service offering should be built on a sustainable base, and should provide services for the storage, retrieval and preservation of data, methods and publications. Infrastructure interoperability must be ensured when materials or methods cannot be made freely available. Services should be designed comprehensively and cooperatively.
Promoting openness requires the cooperation of all parties, each with their own measures and responsibilities (Figure 5).
Measures for science policymakers
OSR often requires amendments to existing structures. When drawing up scientific policies to promote openness, policymakers need to consider:
- mixed policies
- measures for both providers and users
- methods for obtaining the required expertise and making use of networked expertise
- ways to promote an open approach throughout the innovation system
- both pre-emptive measures and those dictated by change
- ecosystem architecture and services
Measures for research financiers
When making the decision to fund research, financiers should require:
- swift publication of research results
- clear contracts on copyrights and proprietary rights concerning research results
- the open licensing of results (the OSR Initiative ATT recommends a CC BY 4.0 licence)
- data management planning
- an open use policy for funded research infrastructures
Research financiers should support:
- open access publishing (financially)
- open cooperation
- an operating model for the long-term preservation of research results
- the building of service infrastructures
Research financiers should state:
- what their recommendations are concerning alternatives for open access publication
- how openness will be rewarded in career development
- how the financier would like copyrights and proprietary rights to be managed
- what quality criteria they stipulate for research
- the metrics and indicators to be used in evaluations
Measures for research organisations
As part of their operation, research organisations should:
- develop clear data policies
- develop clear publication policies
- develop clear licensing policies
- clearly describe researchers’ rights regarding openness
- increase and maintain expertise
- encourage the use of a common service infrastructure, providing any required local infrastructure services
- collect the required indicator data
- provide quality systems
Measures for research teams
When collaborating, research teams should:
- agree on copyrights, proprietary rights and licensing of results
- recommend and use open-source software, and open standards and interfaces
- produce descriptive metadata
- use quality systems
- ensure the repeatability of research results
- promote interoperability
Measures for researchers
As part of their research process, researchers should:
- agree on copyrights, proprietary rights and licensing of results in accordance with guidelines
- use citations
- choose a publishing channel
- use open evaluation whenever possible
- embrace open access publishing (articles, data, methods)
- ensure the repeatability of research results
- produce descriptive metadata
Figure 6 summarises the measures different parties should take when supporting OSR.
Putting OSR into practice
(Additions from working groups during 2014)
- Open research publications
- Open research data
- Open research methods
- Open research infrastructures
- Open research support services
International trends
(Additions from working groups during 2014)
A large number of Finnish and international research organisations and financiers have set targets for increasing the openness of research results. For example, the European Commission Recommendation on Access to and Preservation of Scientific Information sets out policies and targets for openness in European research.
The openness of research data has improved thanks to a report published by the OECD in 2007, entitled 'Principles and Guidelines for Access to Research Data from Public Funding'.
The OECD Council's 2008 recommendation for Enhanced Access and More Effective Use of Public Sector Information had a major impact on the PSI Directive. Open science stakeholders (such as financiers, institutions of higher education and their libraries, datacentres, scientific societies and backers) are looking at the major challenges and opportunities raised by the reuse of data items.
An increasing proportion of scientific publications are appearing in open access journals. One such journal, PLoS One, has unquestionably become the world's biggest scientific journal. Other significant scientific periodicals, such as Nature and Science, are publishing articles on open science with increasing frequency. Publishers are developing strategies to distribute research data in tandem with publications.
The variety of public and private sector stakeholders taking part in these activities, and the range of projects involved, indicate that OSR is becoming more widespread and increasingly important across the globe. Example projects include DataCite; the pre-print service arXiv; OpenAIRE, which collates publications on EU projects; Zenodo, which collates data; and field-specific repositories such as PANGAEA and Dryad. Many major research infrastructures (such as the ESFRI projects) are also sharing their results via open services.
However, sharing of research data has not yet become the norm. Although the publication of research data is of global interest, the key to successfully embracing OSR lies in understanding the associated challenges facing a variety of different stakeholders.
Glossary
(Additions from working groups and cooperation with the Ministry of Finance during 2014)
Authority services
- An authority file supports the identification of individuals or associations by differentiating similarly named individuals with the aid of dates, professions, etc. However, the authority file also collates the individual/association's name variants to ensure 'access' to the same person/organisation's work regardless of the form in which their name appears. The authority file also organises the database by linking a person's various 'public identities', such as real name and pen name. In the same way, new and old names resulting from personal or organisational name changes are also linked. National libraries create name authority files for works published in their country. Search terms in accordance with national regulations are stored in a target's authority file along with any relevant IDs (ISNI, ORCID) and name variants. The information is collated in international databases (such as the ISNI and ORCID databases and the Virtual International Authority File VIAF), which are freely available online. The current public administration recommendation is to also publish this information as open linked data.
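The linking described above can be pictured as a small data structure. The following is a minimal sketch only, assuming a hypothetical in-memory `AuthorityFile` class; the names, dates, professions and the ORCID value are invented placeholders, and real services such as ISNI, ORCID or VIAF do not work through this code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def normalise(name: str) -> str:
    """Lower-case a name and collapse whitespace so variants compare equal."""
    return " ".join(name.lower().split())


@dataclass
class AuthorityRecord:
    preferred_name: str
    variants: List[str] = field(default_factory=list)        # pen names, old names
    identifiers: Dict[str, str] = field(default_factory=dict)  # e.g. ISNI, ORCID
    qualifiers: Dict[str, str] = field(default_factory=dict)   # dates, profession


class AuthorityFile:
    """Hypothetical index from every known name form to its record(s)."""

    def __init__(self) -> None:
        self._index: Dict[str, List[AuthorityRecord]] = {}

    def add(self, record: AuthorityRecord) -> None:
        # Index the preferred name and every variant under the same record.
        for name in [record.preferred_name, *record.variants]:
            self._index.setdefault(normalise(name), []).append(record)

    def lookup(self, name: str) -> List[AuthorityRecord]:
        return list(self._index.get(normalise(name), []))


# Invented example: two people share a name; one also uses a pen name.
authority = AuthorityFile()
authority.add(AuthorityRecord(
    preferred_name="Virtanen, Maija",
    variants=["Maija Joki"],                       # pen name (invented)
    identifiers={"ORCID": "0000-0000-0000-0000"},  # placeholder, not a real ID
    qualifiers={"born": "1950", "profession": "historian"},
))
authority.add(AuthorityRecord(
    preferred_name="Virtanen, Maija",
    qualifiers={"born": "1982", "profession": "physicist"},
))

# The shared name returns both records; qualifiers tell them apart.
for rec in authority.lookup("virtanen,  maija".replace(",  ", ", ")):
    print(rec.preferred_name, rec.qualifiers)
```

Looking up the pen name returns only the first record, which is how the same person's work stays reachable regardless of the name form used.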
Availability
- Availability determines whether information is available in accordance with its purpose, in principle both technically and in accordance with other operational requirements.
Citation
- A citation is used to refer to a source. It can be placed as an in-line citation within the text, in the footer as a footnote, as an endnote at the end of a publication or section thereof, or as a reference in a bibliography, for example. Sources have typically been research publications, but research data stored in a repository can also be a source. Repositories usually contain both the actual item (for example, one or more data files, a code-book or a questionnaire) and the metadata associated with the item. All these can be cited. When citing research data, the important differentiating elements are the author, the item's title, the version number, the repository ID (usually the item number), the collection date, the collector, producer and distributor, and (if available) a PID. The Finnish national standard SFS 5989 (Lähde- ja tekstiviitteitä koskevat ohjeet) provides guidelines on using citations and references. Many institutions of higher education use citation software based on the aforementioned standard. The Finnish Social Science Data Archive also provides guidance on citing research data.
- EXAMPLE 1 SOSIAALIBAROMETRI 1994 [digital item]. FSD1129, version 1.0 (2002-03-11). Helsinki: Sosiaaliturvan keskusliitto [producer], 1994. Tampere: Yhteiskuntatieteellinen tietoarkisto [distributor], 2002.
- EXAMPLE 2 INTERNATIONAL SOCIAL SURVEY PROGRAMME 2007: Leisure Time and Sports [digital item]. ZA4850, version 2.0.0 (2009-10-29). Cologne: GESIS [producer, distributor], 2009. doi:10.4232/1.10079.
- EXAMPLE 3 EKHOLM, Peter, Karina JUTILA and Pentti KILJUNEN: Käsitykset ilmastonmuutoksesta 2006 [digital item]. FSD2262, version 1.0 (2007-05-10). Lempäälä: Yhdyskuntatutkimus [data collection], 2006. Helsinki: Ajatuspaja e2 [producer], 2006. Tampere: Yhteiskuntatieteellinen tietoarkisto [distributor], 2007.
- According to the HTK guidelines: Researchers should acknowledge the work and achievements of other researchers in an appropriate manner, showing respect for others' work by citing other researchers' publications in an appropriate manner and giving others' achievements due value and significance both in their research and the publication of their results.
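The layout of the data citations in the examples above can be sketched in code. This is an illustration only: the `DataCitation` and `Party` classes and the `format_citation` function are hypothetical names introduced here, the punctuation pattern is inferred from the three examples rather than taken from SFS 5989, and authors are passed as an already formatted string.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Party:
    place: str
    name: str
    roles: List[str]   # e.g. ["producer"], ["producer", "distributor"]
    year: int


@dataclass
class DataCitation:
    title: str
    repository_id: str           # item number in the repository
    version: str
    version_date: str            # ISO date of the cited version
    parties: List[Party]
    authors: Optional[str] = None  # preformatted, e.g. "EKHOLM, Peter, ..."
    pid: Optional[str] = None      # e.g. "doi:10.4232/1.10079"


def format_citation(c: DataCitation) -> str:
    """Assemble a citation string in the layout used by the examples above."""
    head = f"{c.authors}: {c.title}" if c.authors else c.title
    parts = [
        f"{head} [digital item].",
        f"{c.repository_id}, version {c.version} ({c.version_date}).",
    ]
    for p in c.parties:
        parts.append(f"{p.place}: {p.name} [{', '.join(p.roles)}], {p.year}.")
    if c.pid:
        parts.append(f"{c.pid}.")
    return " ".join(parts)


# Rebuilds EXAMPLE 1 from its elements.
example_1 = DataCitation(
    title="SOSIAALIBAROMETRI 1994",
    repository_id="FSD1129",
    version="1.0",
    version_date="2002-03-11",
    parties=[
        Party("Helsinki", "Sosiaaliturvan keskusliitto", ["producer"], 1994),
        Party("Tampere", "Yhteiskuntatieteellinen tietoarkisto", ["distributor"], 2002),
    ],
)
print(format_citation(example_1))
```

Keeping the elements as separate fields, rather than storing only the finished string, means the same record can be rendered in another citation style later.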
Data integrity
- Data integrity means 1. (Data or a data system that is) genuine, authentic, free from internal conflicts, comprehensive, up-to-date, legal, and usable. 2. A characteristic by which information or a message has not been altered without authority, and any potential changes can be traced with an audit trail. (Glossary of Government Information Security, 1/2000).
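The second sense of the definition above (no unauthorised alteration, and changes traceable through an audit trail) is often supported in practice with checksums. The sketch below is one possible illustration, not a procedure prescribed by this handbook: the `AuditedItem` class and the user names are hypothetical, and SHA-256 from Python's standard `hashlib` module is simply one common choice of checksum.

```python
import hashlib
from datetime import datetime, timezone
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


class AuditedItem:
    """Hypothetical data item that records a checksum for every change."""

    def __init__(self, data: bytes, user: str) -> None:
        self._data = data
        self.trail: List[Dict[str, str]] = []
        self._log("created", user)

    def _log(self, action: str, user: str) -> None:
        self.trail.append({
            "action": action,
            "user": user,
            "time": datetime.now(timezone.utc).isoformat(),
            "sha256": sha256_hex(self._data),
        })

    def update(self, new_data: bytes, user: str) -> None:
        # Every authorised change adds an entry to the audit trail.
        self._data = new_data
        self._log("updated", user)

    def verify(self, copy: bytes) -> bool:
        # A copy is intact if its checksum matches the latest logged one.
        return sha256_hex(copy) == self.trail[-1]["sha256"]


item = AuditedItem(b"id,value\n1,42\n", user="researcher_a")
print(item.verify(b"id,value\n1,42\n"))   # unchanged copy
print(item.verify(b"id,value\n1,43\n"))   # silently altered copy
```

A copy whose checksum no longer matches the last audit entry has been changed outside the logged process.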
Open access publishing
- In its simplest form, open access publishing (articles, reports, monographs) means uploading a research publication to a data network and granting rights to read, copy, print and link to entire scientific publications. Open access publishing means free dissemination of scientific information. A scientific publication is openly available when both the scientific community and the general public have unrestricted access via the Internet without charge.
- In simple terms, Golden OA (the Gold Road) means open journals, while Green OA (the Green Road) means self-archiving. More detailed information about the alternatives for open access publishing is available from, for example, Ilva & Lilja: Kotimaiset tieteelliset lehdet ja avoin julkaiseminen (Finnish scientific journals and open access publishing). 2014. URN:NBN:fi-fe2014050725729.
Open data
- Open data refers to unprocessed information accumulated by research organisations, researchers, public administration, companies or private persons that is made freely accessible to third parties for use without charge.
Open interfaces
- An open interface refers to well-documented, free-to-use means of transferring data between software (PERA definitions). For example, a database will provide software developers with an interface for queries.
Open knowledge
- Open knowledge refers to unrestricted access to digital content and data that users may use, amend and distribute without charge. To meet the criteria for open knowledge, items must be available in full, in a usable and amendable format, via the Internet. Items must also be licensed for unrestricted use, amendment and distribution.
Open licences
- Research data and publications are usually protected by copyright. However, agreements can be signed to enable open use of these materials. Creative Commons licences can be used to grant selected rights and freedoms to users, readers or experiencers. By combining different terms and conditions in your choice of licence, you can handle your rights in a way that suits both you and the situation. The OSR Initiative (ATT) recommends CC0, CC BY 4.0 or, if necessary, other generally recognised licences.
Openness and confidentiality of research results
- Confidentiality is 1. Maintaining the confidentiality of data, and protecting the rights associated with data, data processing and communications from violation. 2. The extent to which confidentiality is considered important. (Glossary of Government Information Security, 1/2000).
Open research environments or infrastructures
- A research environment or infrastructure comprises the tools, equipment, materials and services that enable research to be carried out. Research infrastructures can be used to strengthen research communities and increase capacity. Research infrastructures can be located in one place, be decentralised, or be virtual. An open research infrastructure provides access to a comprehensive package — the research process via which the results will be produced. For a research infrastructure to be considered open, results, publications and background materials must be freely available to the research community.
Open science
- Open science means the promotion of an open operating model in scientific research. The key objective is to publish research results, along with the data and methods used, so they can be examined and used by any interested party.
- Open science includes practices such as promoting open access publishing, open access publishing itself, harnessing open-source software and open standards, and the public documentation of research processes with 'memoing'.
Open source
- Open source is a way to develop and share software. The software's source code is freely available to be used, copied, altered, and shared. In the open source development world, both ideas and finished products are available for all to see and use. No single company manages development: it is carried out by a global community consisting of private persons and companies. Everyone can participate in development work, and bugs can be quickly found and fixed. This often leads to high software quality, good data security, and software interoperability.
Open standards
- Open standard means adherence to established, commonly agreed standards, so developers can create replica or compatible software. Open standards are available to all, so anyone can find out about and adhere to them.
Peer review
- Peer review, or 'refereeing', comes from the custom in which scientific articles sent to journals or other publications are evaluated by the editorial team and selected external experts. Peer reviewers examine the content and scientific significance of the submitted article, along with its linguistic form and textual structures, ensuring that each article adheres to principles of scientific writing (such as succinctness, clarity in diagrams and tables, source citations).
Persistent identifiers
- Identifiers for publications and research data are used to, for example, search for, identify and link materials. Identifiers are also mandatory for long-term preservation. Depending on the type of publication involved, the identifier may be an ISBN (monographs) or a variety of persistent identifiers (PIDs). A Handle ID is used in repositories, a DOI in commercial publishers' systems, and a URN in national libraries' digital collections. A PID is almost exclusively used for research data in national and international projects, while the National Research Data Project (TTA) and OSR Initiative (ATT) use a URN. IDs are also required for researchers and other juridical bodies participating in the research process (universities and other institutions of higher education, research institutes, scientific communities and their institutions, research teams). These identifiers are always separately allocated in Finland.
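The identifier types named above can usually be told apart by their syntax. The following is a rough sketch under stated assumptions: the function names are hypothetical, the regular expressions are simplified approximations rather than full implementations of the relevant standards, and the `doi:` and `hdl:` prefixes are assumed conventions for how the identifiers are written. Only the ISBN-13 check-digit rule (alternating weights 1 and 3, sum divisible by 10) is applied as a real validity check.

```python
import re

# Simplified patterns; real-world identifiers may need stricter parsing.
DOI_RE = re.compile(r"^(?:doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE)
URN_RE = re.compile(r"^urn:[a-z0-9][a-z0-9-]{0,31}:\S+$", re.IGNORECASE)
HANDLE_RE = re.compile(r"^hdl:\S+/\S+$", re.IGNORECASE)


def is_isbn13(value: str) -> bool:
    """Validate an ISBN-13 by its check digit (weights 1, 3, 1, 3, ...)."""
    digits = value.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def classify_identifier(value: str) -> str:
    """Return 'DOI', 'URN', 'Handle', 'ISBN' or 'unknown' for a string."""
    v = value.strip()
    if DOI_RE.match(v):
        return "DOI"
    if URN_RE.match(v):
        return "URN"
    if HANDLE_RE.match(v):
        return "Handle"
    if is_isbn13(v):
        return "ISBN"
    return "unknown"


# The DOI and URN below appear elsewhere in this handbook;
# the Handle value is an invented placeholder.
for pid in ["doi:10.4232/1.10079", "URN:NBN:fi-fe2014050725729", "hdl:00000/example"]:
    print(pid, "->", classify_identifier(pid))
```

Because DOIs are built on the Handle System, an unprefixed Handle cannot always be distinguished from a DOI by syntax alone, which is why this sketch relies on explicit prefixes.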
Quality system
- A quality system consists of procedures and processes for assuring the quality of training, research, social dialogue and impact, human resources, services, and management.
Usability
- According to the Glossary of Government Information Security (1/2000), usability means 1) a characteristic of data, an information system or service, by which it is available to those with the right to use it, and can be used at the desired time and in the required manner, and 2) ease of use.