Researchers' attitudes and perceptions towards data sharing and data reuse in the field of food science and technology

This work analyses the perception and practice of sharing, reusing, and facilitating access to research data in the field of food science and technology. The study involved the coordination of a focus group discussion and an online survey, to understand and evince the behaviour of researchers regarding data management in that field. Both the discussion group and the survey were performed with researchers from several institutes of the Spanish National Research Council. The lack of a data sharing culture, the fear of being scooped, and confusion between the concepts of the working plan and the data management plan were some of the issues that emerged in the focus group. Respondents' previous experience with sharing their research data has been mainly in the form of appendices to peer‐reviewed publications. From the survey (101 responses), the most important motivations for publishing research data were found to be facilitating the reproducibility of the research, increasing the likelihood of citations of the article, and compliance with funding body mandates. Legal constraints, intellectual property, data ownership, data rights, potential commercial exploitation, and misuse of data were the main barriers to publishing data as open data. Citation in publications, certification, compliance with standards, and the reputation of the data providers were the most relevant factors affecting the use of other researchers' data. Being recent or recently updated, well documented, with quality metadata and ease of access were the most valued attributes of open research data.


INTRODUCTION
The term 'research data' refers to quantitative or qualitative information collected by researchers in the course of their work. Open research data policies are important drivers, but implementation and compliance differ depending on their provenance and disciplinary approach. Institutional and funder policies are considered very important for social sciences and chemical sciences; however, in the field of economics, policies issued by a scientific society are considered more important (Schmidt, Gemeinholzer, & Treloar, 2016).
In Europe, the European Commission and the European Research Council (2019) describes the management of the life-cycle of data gathered, processed, and/or created in a project. It is a living document that identifies and describes issues such as the data-gathering process, the metadata standards used in their description, and data preservation, while also reflecting changes or modifications made during the research project. In summary, a DMP provides comprehensive information on the data and the context in which they were created. There are numerous resources available to support the creation of a DMP, such as those created by the Digital Curation Center. Data reuse practices are facilitated by aspects such as efficiency, perceived effectiveness, and the importance of developing the research (Curty, Crowston, Specht, Grant, & Dalton, 2017).
However, problems arising from the interoperability of standards and formats, or a lack of understanding of intellectual property issues, may constrain their implementation (Wallis, Rolando, & Borgman, 2013).
Despite the benefits that sharing and reusing data can have for researchers (Costello, 2009; Michener, 2015; Piwowar, 2011; Wiley, 2018), both practices face technical barriers related to the management of data (Michener, 2015) and to social aspects associated with the attitudes and habits of researchers. Genomics, for example, is a field in which technology has overcome the barriers of distance and facilitated data sharing, but there are still other barriers arising from the human factor (resistance, competition, habits) that are slowing down such sharing (Fusi, Manzella, Louafi, & Welch, 2018). In the field of public health, in addition to technical, economic, and motivational difficulties that need structural solutions, there are also political, legal, and ethical considerations that require international consensus to promote fluid data exchange (Van-Panhuis et al., 2014). Fecher, Friesike, Hebing, Linek, and Sauermann (2015) surveyed individual academic researchers across all disciplines on their handling of data, their publication practices, and motivators for sharing or withholding research data. The survey found that the main impediments to making data available were the need to publish a paper before sharing the data that supported it, and the efforts required to share them. Eighty percent of respondents cited the fear that others could publish their data before them as a reason for these reservations, evidencing the need for training and information on how to share and reuse data. Fear of losing control over their data was the main reason for the reluctance; however, respondents said they would openly share their data if they had the power to decide how and when their data was reused, and by whom. A big driver for making data available to others was recognition, that is, data citation.
Researchers' behaviour related to the exchange and use of data also varies between disciplines. According to one of the first studies carried out in this regard (Tenopir et al., 2011), researchers in atmospheric and environmental sciences and ecology expressed a willingness to share their data (87%). Most authors in other disciplines who participated in the study were also willing to share, but with significant differences among them; in particular, respondents in the fields of computer science and medicine were more reluctant. In biology, data discovery is fundamental and it is also the area where the highest percentage of shared data was reported (Stuart et al., 2018).

Key points
• Researchers are often confused about or unaware of data lifecycles and the exact definition of data, and are also unsure of intellectual property rights.
• Funder policies and European Union norms are the most important motivators for sharing data among food science and technology researchers.
• Researchers want citation of their data sets and report being far more willing to share if citation is assured.
• Researchers believe that open data must be well formatted, updated, and easy to use and interpret.
Researchers in the areas of engineering and computer science considered it very important to adhere to the FAIR data principles (findable, accessible, interoperable, and reusable; Wilkinson et al., 2016) for the advancement of their discipline, while researchers in earth sciences and economics supported the idea of adhering to the FAIR data principles, but with some reservations (Schmidt et al., 2016). Researchers in social sciences and political science tend to be influenced by their colleagues, so that data sharing is likely to become common practice if their community advocates and practices it (Zenk-Möltgen, Akdeniz, Katsanidou, Naßhoven, & Balaban, 2018).
The Spanish National Research Council (2019) (CSIC, for its initials in Spanish) is the largest public institution in Spain dedicated to research and one of the most prominent in the European Commission (2019b). CSIC operates a large number of major research centres where its scientific and technical research activity is generally conducted. Each of these centres is thematically integrated into one of the eight scientific areas of the CSIC, one of which is food science and technology (www.csic.es/en/investigation/institutes-centres-units).
The aim of this study was to examine and analyse the perception and practice of sharing, reusing, and facilitating access to research data in the field of food science and technology among researchers working at CSIC research centres. The study data was collected by means of a focus group discussion and an online survey. This is the first subject-oriented study performed at CSIC, and it could lead to further analysis in other research areas.

Focus group
A focus group (FG) was organized to discuss topics related to the research data life-cycle and the practice of creating DMPs. The purpose of the discussion was to gather information about perceptions and awareness of the meaning of data management and data sharing, and attitudes towards research data reuse and sharing. The FG was formed in June 2018, with 7 researchers from IATA-CSIC (Institute of Agrochemistry and Food Technology, 2019) with previous experience in applying for and conducting European research projects.

Online survey
Based on information collected from previous studies (Berghmans et al., 2017; Schmidt et al., 2016; Tenopir et al., 2015; Teperek et al., 2018), a questionnaire was developed with 43 questions organized into seven content blocks (Supporting information, Appendix S1): demographic data; format, volume, and storage of data; funding organization; sharing and reuse of research data; reasons to share open data; obstacles to publishing research data in open access; and citation, licensing, and training.
The initial questionnaire was shared and tested with three researchers who gave us their opinion about the clarity of the questions. After their comments were received, the final version was developed using the Google Forms tool. Email addresses of senior scientists, scientific researchers, research professors, and post-doctoral fellows were retrieved from the web sites of eight CSIC research institutes in food science, nutrition, and agrochemistry. We collected 465 email addresses, but 62 were found to have issues, resulting in a total of 403 researchers to be contacted. Emails were sent individually explaining the aim of the study with a link to the online survey. The questionnaire was available from July to October 2018, during which time three reminders were sent. It was not mandatory to answer all the questions. The survey data were exported to an Excel sheet for descriptive analysis and visual representation. The questions with Likert-type scales (extent of agreement/disagreement) and rating scales (very important/not at all important) with 5 or 4 points, respectively, were coded by assigning a value from the lowest to the highest. They were then grouped into four categories: barriers, motivators, conditions, and quality. A factor analysis was applied to identify which underlying factors were measured by the observed variables within these four categories. Then, a cluster analysis was applied to find, if any, groups of cases that could represent respondents as a whole. Both statistical techniques were performed using the statistics package SPSS v24 (Armonk, NY: IBM). The Kaiser-Meyer-Olkin (KMO) statistic was used to determine how suitable the data were for factor analysis (sampling adequacy), which indicates the proportion of variance in the variables that might be caused by underlying factors (values >0.7 generally indicate that a factor analysis may be useful with the obtained data). 
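As an illustration of the sampling-adequacy check described above, the KMO statistic can be obtained from the correlation matrix of the items and the partial correlations derived from its inverse: the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations, both taken over distinct pairs of items. The following is a minimal sketch in plain Python, not the SPSS procedure used in the study; the function names and the synthetic responses are assumptions made only for this example.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlation matrix for a list of respondent rows (one value per item)."""
    n, k = len(rows), len(rows[0])
    cols = [[row[j] for row in rows] for j in range(k)]
    means = [sum(c) / n for c in cols]
    dev = [[x - m for x in c] for c, m in zip(cols, means)]
    norm = [math.sqrt(sum(d * d for d in dv)) for dv in dev]
    return [[sum(a * b for a, b in zip(dev[i], dev[j])) / (norm[i] * norm[j])
             for j in range(k)] for i in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (assumes a non-singular matrix)."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def kmo(rows):
    """Overall KMO: squared correlations relative to squared (partial) correlations."""
    corr = correlation_matrix(rows)
    inv = invert(corr)
    k = len(corr)
    r2 = p2 = 0.0
    for i in range(k):
        for j in range(k):
            if i != j:
                # Partial correlation of items i and j given all other items.
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                p2 += partial ** 2
    return r2 / (r2 + p2)


# Synthetic data (hypothetical): 200 respondents, 6 items driven by one latent trait.
random.seed(1)
responses = []
for _ in range(200):
    latent = random.gauss(0, 1)
    responses.append([latent + random.gauss(0, 0.5) for _ in range(6)])

kmo_value = kmo(responses)
print(f"KMO = {kmo_value:.3f}")
```

Items that share a strong common factor, as in this synthetic example, give a value close to 1, whereas largely unrelated items push the statistic towards 0.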
Cronbach's α was used as a measure of reliability, or internal consistency. This metric indicates how consistently the items grouped in a scale measure the same underlying variable of interest.
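The reliability measure can be sketched in a few lines. The example below uses the usual formula for Cronbach's α, k/(k - 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score; the function name and the small matrix of 5-point answers are hypothetical and are not taken from the survey.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondent rows (one coded answer per item)."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point Likert answers: 6 respondents x 4 items.
answers = [
    [5, 4, 5, 4],
    [4, 4, 4, 3],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [1, 2, 1, 2],
    [4, 5, 4, 4],
]

alpha = cronbach_alpha(answers)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Values of α closer to 1 indicate that the items of a scale vary together, which is what the reported coefficients are meant to convey.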

Focus group outputs
Seven researchers from the IATA-CSIC were part of the focus group. The session lasted approximately 1.5 h and participants discussed two issues raised by the facilitator:
• What do you understand a data life-cycle to be?
• What do you understand a DMP to be? Have you had any experience creating DMPs?
Out of these two topics, the following issues emerged. Each issue is followed by a brief summary based on the group's comments, and some perceptions from the moderator:
project and ends when the results are published; or the time during which data are gathered and become ready to use.
Moderator's perception: Lack of understanding of the concepts.
• It is also necessary to clarify what exactly is meant by 'data' in order to define more precisely what their life-cycle is.
Moderator's perception: Definition of data not clear.

Issue 2: Publication of research data
• The option of publishing raw research data was unknown to participants.
Moderator's perception: New venues for data publication unknown.
• Doubts were raised about the usefulness of publishing data, considering that all the information has already been used in the published articles.
Moderator's perception: Lack of culture of acceptance of open raw data before publication.
Issue 3: Data lifespan
• Participants considered it necessary to determine data lifespan to be able to preserve data correctly during their life-cycle.
Data are considered to be still useful as long as current technology has not surpassed the method originally used to collect them. Another concern, beyond their obsolescence, was with the validity of data in a different context. There were also participants who felt that data lose their validity once a research project is finished.
Moderator's perception: The obsolescence of research data is not always clear.
Issue 4: Dissemination, sharing, and publication of data
• The idea of data sharing was not widely accepted because data are created/obtained within the context of a project, which renders them understandable; however, outside that context they could be misinterpreted or misunderstood. To avoid this, sharing data requires prior cleaning and a description of how the data were collected, treated, formatted, analysed, and so on, which entails an increased workload.
Moderator's perception: Fear of misinterpretation hinders data sharing.
• Data are collected/generated for the purpose of publication, but not for sharing and reuse.
Moderator's perception: Lack of culture of data sharing.
• Participants thought that information is already being shared once the outputs are published, whether through publications, conferences, or other channels.
Moderator's perception: Publishing is misunderstood as being the same as sharing.
• Data sharing could be more easily adopted by the next generation of researchers, while the current generation continues with the traditional model in which 'sharing' is done after publication and not before. There is also a resistance to data sharing because of the hard work involved in obtaining the data, from which others would be able to benefit without any effort.

Issue 5: Data management plans
• Participants said they have never created a DMP associated with their projects. Usually, they design a work plan to achieve their project research aims.
Moderator's perception: Confusion about the difference between a work plan and a DMP.
• Participants understood a DMP to be a document drawn up at the end of the research, when the outputs are already available, as a final report rather than a living document drafted during the project.
• Participants agreed that DMPs should contain a description of the conditions for obtaining the data, data formats, and data licensing. They also pointed to the importance of following international standards and formats, although there was some concern about the additional workload that this represents.
Moderator's perception: Standardization/formatting of data means more work for researchers.
Survey-block 2: Format, volume, and storage of data
Formats of research data used or created by researchers in their most recent project were textual, numerical, and images, while audio-visual formats were almost non-existent (Fig. 1). As was expected given the working areas concerned, the most common types were experimental data (53%), followed by observational or empirical (21%), reference or canonical (12%), derived or compiled (10%), and simulation (4%).
In relation to the volume of data collected as part of the most recent research project, files were mostly small to medium size, although the 'I don't know' response represents nearly 23% of total answers to this question, suggesting that volume does not represent a limitation or that respondents are not aware of it (Fig. 2A,B).
The preferred location for storing data files (Fig. 3) is the respondents' personal computer, despite the risk this option represents. Worthy of note is the low use of cloud services to store data files. Backups were performed on personal computers (63% performing one backup and 37% performing two), although some respondents said they lost some files during the past year (13%).
On the question of how long data should be stored, the largest proportion of respondents considered that most data files should be stored indefinitely (35%), followed by 5-10 years (26%), 1-5 years (17%), and more than 10 years (16%).

Survey-block 3: Funding source and data ownership
According to the survey data, the biggest source of funding for research projects was the Spanish government (57%), followed by corporate private/commercial organizations based on agreements or R&D contracts with businesses in the sector (21%), regional research funding from projects funded mainly by the European Commission (17%), funding from their own institution (3%), and funding from a charity or other third party (2%).
It should be noted that when respondents were asked about their awareness of funding requirements for FAIR data, the most common response was 'I don't know' (>50%), and when they were asked whether the primary funding agency requires a DMP, 60% said it did not while 38% said it did. This reflects a lack of awareness about how to make data findable, accessible, interoperable, and reusable, and/or about the funders' requirements for DMPs, despite the fact that some respondents were working under the H2020 Open Research Data Pilot.
Intellectual property and data ownership are issues researchers often misunderstand and misinterpret, and many do not even know how to respond to questions about who owns their data. Indeed, when asked who they thought the research data belonged to before and after publication (Fig. 4), respondents offered very different answers, with 40% more respondents assigning ownership to the publisher after publication. According to the previous literature, this uncertainty about who owns research data is a very widespread problem (Stuart et al., 2018).
Misunderstandings could also arise in relation to what is meant by data when a paper is submitted. Authors might think of data as referring to tables, figures, or images, rather than to the raw data they represent. Authors might not be aware that when a publisher is assigned copyright over a paper, that copyright does not extend to the underlying raw data that support the results or the paper. On the other hand, it was surprising to find that some respondents attributed data ownership to the funder, because, in general, grant agreements do not have clauses regarding ownership but only regarding dissemination and data sharing.

Survey-block 4: Sharing and reuse of research data
There is no doubt that open data is important in terms of research progress, science communication (public education and information, supporting applications of science to societal problems), and technology and knowledge transfer, and respondents scored all of these areas highly. This response might be obvious, but what the respondents say is not always what they practice.
This behaviour, whereby data sharing is understood but is not translated into action, has also been described previously (Fecher et al., 2015).
More than 50% of respondents said they shared any or all of the research data used or created as part of their last research project directly with others (i.e. person-to-person); however, only 24% reused data shared directly by others (i.e. person-to-person).
Respondents rated the importance of data sharing before and after publication with other colleagues from different institutions or with the public at large (Fig. 5) considerably higher after publication. The idea that making data publicly available before publication could jeopardize their future outputs could be one of the main reasons for this. Nearly 50% of respondents were aware of data repositories, but only 29% had used them to deposit datasets. Respondents were willing to directly share data collected during a research project; however, only 24% said they had used data shared by others. This might be related to a lack of trust in the data of others, or a lack of familiarity with the practice. When asked what they had done with research data used or collected as part of their last research project, 62% of respondents said they shared them directly with researchers working on the same research project in a research collaboration, 22% shared them directly with project partners, and very few (3%) shared them directly with researchers not working on the same research project whom they did not know personally (see Table 1). This lack of trust and the fear that data may be misinterpreted or misused are attitudes also described in other fields, such as geophysics (Tenopir, Christian, Allard, & Borycz, 2018) and other disciplines (Fecher, Friesike, Hebing, & Linek, 2017). Most respondents (69%) believe that a lot of effort is required to make their research data reusable by others (Table 1).
This is a significant barrier to the reuse of research data. Familiarity with metadata, archiving venues, rights, and licensing, or with how to create a DMP, is necessary for the production of FAIR data and deserves attention. Recognition of this effort (learning about metadata, archiving venues, IP issues, licensing, etc) could transform this barrier into a motivation (Schmidt et al., 2016).
Respondents were asked to state their agreement or disagreement with different statements regarding the conditions for allowing other researchers to reuse their data (Fig. 6). Acknowledgement, recognition, citation, and providing reprints of articles using the data to the data provider, were the most common statements selected by respondents, which are directly related to collaboration and feedback from data users. Recovery of part of the cost of data acquisition does not seem as important when exchanging data is driven by prestige and reputation rather than money (Fecher et al., 2015, 2017).
Citation in formal venues, certification, compliance with guidelines and standards, and the reputation of the data providers are the most important factors related to the use of other researchers' data. Regarding factors for determining whether to use others' data (Fig. 7), more than 90% of respondents identified aspects related to the documentation and metadata provided to explain how the data were developed and formatted as important or very important. The provenance and reputation of data providers were also important factors in deciding on their reuse.

TABLE 1 (% of respondents)
• Yes, I am aware of research data repositories, but I have not used them: 47
• Yes, I am already using them to find existing datasets or to share my own data: 29
Have you shared directly (i.e. person-to-person sharing) with others any or all of the research data that you used or created as part of your last research project?
• No: 37
• I do not know: 5
• Yes: 57
Was any or all of the research data that you used or created as part of your last research project shared with you by others, either directly (i.e. person-to-person)?
• No: 71
• I do not know: 4
• Yes: 24
How much of your data do you make available to others?
• Some: 55
• Most: 29
• None: 13
• All: 3
Irrespective of public sharing (e.g. archiving your research data in a repository accessible to others), have you done any of the following with any or all of the research data that you used or created as part of your last research project?
• Shared directly with researchers working on the same research project in a research collaboration: 62
• Shared directly with researchers NOT working on the same research project who I know personally: 14
• Shared directly with researchers NOT working on the same research project who I DON'T know personally: 2
• Shared directly with research project partners (e.g. funders): 22
Prior to sharing, research data typically needs to be formatted and often has documentation and/or metadata added to make it re-useable by others. How would you describe the effort typically required to make your research data re-useable by others?
• A lot of effort: 69
• Some effort: 22
• No effort: 1
• I would not share: 8

Survey-block 5: Reasons for using and sharing open data
The most common situation in which respondents had previously shared their research data (66%) was as an appendix to a peer-reviewed research publication (journal or book), followed by a standalone peer-reviewed data publication (23%) and archiving in a data repository (17%).
All the reasons for sharing data presented in the survey were identified by most respondents as important or very important, particularly those related to funder policy compliance (although, curiously, scientific and professional society policies did not score highly) (see Fig. 8). Conversely, when respondents were asked whether they were familiar with funders' expectations for FAIR data and funder policy requirements, the rates were low: 12% and 4%, respectively. Figure 9 presents the survey data regarding respondents' agreement or disagreement with certain conditions for reusing and sharing research data. Proper citation when datasets are used, ease of access, and willingness to share data across a broad group of researchers were the top three conditions selected by respondents. Researchers would not share all their data in data repositories without restrictions of use; however, paradoxically, they would be willing to reuse others' data if they were easy to access. Citation of datasets when they are reused was an important issue, as seen previously. In summary, data providers do not want to lose control of who is able to access and reuse their data, but are open to new collaborations as a result of sharing research data. This same question was asked in a survey distributed to members of the American Geophysical Union (Tenopir et al., 2018); our results are qualitatively in agreement with this previous study, although quantitatively there are some differences in the percentages of respondents who disagreed with most of the statements except for the importance of citation.
The most important attributes of open research data identified by respondents were the following: data should be recent or updated, well documented with quality metadata to facilitate their interpretation, and easily accessible but potentially protected or restricted in relation to their use (Fig. 10).
The most important benefits of publishing research data together with the article were rated as follows: it facilitates reproducibility of research (62.4%); the article is more likely to be cited (54.5%); it complies with funding body mandates (52.5%); there are more possibilities for collaboration (49.5%); it complies with journal or publisher requirements (38.6%); data can be reused (37.6%); the article is more likely to be accepted for publication (32.7%); it encourages other researchers to make their data publicly available (32.7%); and it results in research aggregation (e.g. for meta-analysis, 29.7%). These benefits are directly related to the researchers' own activities and rewards, rather than to changing the way research data is shared.

Survey-block 6: Barriers
Legal constraints, intellectual property, data rights, potential commercial exploitation, and misuse of data are the main barriers to publishing data as open data (Fig. 11). A lack of trust and misunderstandings about data ownership are the main factors behind these barriers. However, these barriers could be overcome by the establishment of effective data management and the promotion of best practices by policymakers and academic institutions.
Poor quality of data or metadata that could lead to a misinterpretation or misuse of data is also an obstacle to the use of research data. Lack of access to data generated by other researchers was also mentioned as a major impediment to the progress of science (Fig. 12).
Data quality, formats, the time needed to understand how to interpret data, and how to access data are also major obstacles identified by respondents (Fig. 13).

Survey-block 7: Miscellaneous (citation, licence types, and training)
There is a growing need to cite research data that are not included in traditional publishing venues, but that have been shared directly (i.e., person-to-person sharing) or indirectly (downloaded from a repository). However, the culture of sharing, using, and citing research data has not grown as quickly as desired. Standards for citation exist, but awareness of those standards appears to be poor:
• There are no agreed standards for citing research data in my field (62%);
• There are agreed standards for citing research data in my field, and most researchers follow them (21%);
• There are agreed standards for citing research data in my field, but most researchers do not follow them (6%);
• There is no need for standards for the citation of research data (1%).
When asked about the licences that the researchers would use to distribute and share their data, only 8% indicated CC BY, 27% indicated CC BY-NC, 19% indicated CC BY-NC-ND, and 39% responded 'I don't know'. Lack of knowledge about the meaning of licences might be the reason behind these results, especially because one of the motives for data sharing is the creation of potentially new results/outputs, which means that the use of restrictive licences makes no sense.
More than 70% of respondents recognized the need for training in several areas: copyright and intellectual property (88%); the use of data repositories and open access (84%);
FIGURE 9 Agreement/disagreement with statements related to sharing scientific data (number of responses displayed in the bars).
In relation to who should provide this training, the most popular response was a professional in data management (38.6%), followed by the research support unit(s) (26%), and the library (16.5%).

Factor and cluster analyses
As mentioned in the Methodology section, questions with rating or Likert-type scales in which respondents expressed their preference/importance or agreement/disagreement were subjected to a factor analysis to assess the correlations between variables, reduce their number, and explain the biggest variance with the smallest number of factors. Factor variables were used to group individuals in a cluster analysis to identify similarities among members of the same group.
The variables considered fall into four categories according to the information gathered: (1) motivators for using, sharing, and publishing research data; (2) barriers to sharing research data; (3) conditions for accessing and reusing research data; and (4) aspects related to the quality of research data and metadata. The results obtained under these four categories are discussed in the following sections.
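For reference, the share of variance attributed to a set of retained factors can be written as below. This is the standard textbook expression for standardized variables and an orthogonal factor solution, offered as an assumption about how such percentages are usually obtained rather than as the exact SPSS computation used in the study. The symbols are introduced only for this illustration: p is the number of observed variables, m the number of retained factors, and l_ij the loading of variable i on factor j.

```latex
\[
\text{Proportion of variance explained by } m \text{ factors}
  \;=\; \frac{1}{p} \sum_{j=1}^{m} \sum_{i=1}^{p} \ell_{ij}^{2}
\]
```

Each standardized variable contributes one unit of variance, so the denominator p is the total variance, and the double sum is the part of it reproduced by the retained factors.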

Questions related to motivations for sharing and publishing research data
The category grouping survey questions associated with the importance of open data and the motivators for sharing research are shown in Table 2. In the factor analysis, a rotated matrix with the factor loadings was produced. Only factor loadings higher than 0.6 were considered, resulting in a reduction to three factors that accounted for 62.8% of the variance, and a value of 0.748 in the KMO and Bartlett's test. The importance of data sharing and the motivation for it are represented by these three factors. Cronbach's α, which measures the degree of reliability, ranged between 0.7 and 0.84, confirming the scale adopted.
Factor 2 grouped items V65, V68, V66, and V67 (Cronbach's α = 0.854), related to aspects to support open data motivated by institutional, funder, or publisher mandates. Factor 3 grouped items V17, V18, V16, and V15 (Cronbach's α = 0.743), related to more pragmatic aspects: the importance of data for technology transfer and to support applications of science to societal problems.
The cluster analysis placed the respondents into four groups. Cluster 1 grouped 38 individuals, comprising researchers with high scores for the philanthropic motivators (Factor1). Cluster 2 grouped 45 individuals who assigned a high mean value to all three factors. Clusters 3 and 4 grouped 7 and 11 individuals with low mean values for Factor 2 and Factor 1, respectively. In summary, more than threequarters of respondents agreed on the importance of open data to their commitment with society and for compliance with mandates.
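The Cronbach's α values reported for each factor can be illustrated with a minimal sketch of the standard formula, α = k/(k − 1) · (1 − Σ item variances / variance of the total score), where k is the number of items in the factor. The function name `cronbach_alpha` and the ratings below are hypothetical and are not taken from the survey; the sketch only shows how such a coefficient is typically obtained from a respondents-by-items matrix.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item ratings."""
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("at least two items are needed")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point ratings: six respondents, four items of one factor.
ratings = [
    [5, 4, 5, 4],
    [4, 4, 4, 3],
    [2, 1, 2, 2],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [1, 2, 1, 1],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Values close to 1 indicate that the items of a factor vary together; a threshold of about 0.7 is a common rule of thumb for acceptable reliability.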

Questions related to barriers to sharing research data
The factor analysis of questions classified under the barriers category (Table 3) produced seven factors. The first three factors accounted for 50% of the variance with a KMO and Bartlett's test score of 0.785. The other factors did not result in a reduction of variables. These factors represent barriers to publishing, accessing, and reusing research data.
Factor 1 grouped items related to legal issues (IP, legal responsibility, misuse).
Factor 2 included aspects associated with understanding the meaning of the data (content and metadata).
Factor 3 encompassed aspects related to access to data and understanding their terms of use, licensing, and citation.
Cronbach's α, which measures the degree of reliability, ranged between 0.86 and 0.88, confirming the scale. The cluster analysis performed with the previous three factors, which accounted for 50% of the variance, produced three groups. Cluster 1 comprised 43 participants who scored all three factors highly, especially Factors 1 and 2, with mean values greater than 4 out of 5 points. Cluster 2 grouped individuals who stressed the importance of metadata quality and of specifying what research data were gathered, and how, to facilitate their understanding. Cluster 3 comprised 15 individuals who considered legal issues related to data to be very important.
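For reference, the KMO values quoted above (0.748 and 0.785) refer to the Kaiser-Meyer-Olkin measure of sampling adequacy, which statistical packages usually report together with Bartlett's test of sphericity although the two are separate diagnostics. In its standard overall form, with r_ij the correlation between items i and j and p_ij their partial correlation controlling for all other items (notation introduced here for illustration):

```latex
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} p_{ij}^{2}}
```

The measure lies between 0 and 1; values closer to 1 mean that the partial correlations are small relative to the raw correlations, so the items share enough common variance for a factor analysis to be meaningful.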

Questions related to the conditions for accessing and reusing research data
The items of the questions associated with the conditions under which respondents would share their data or facilitate access to them are displayed in Table 4.
The factor analysis produced two factors that accounted for 58% of the variance (KMO and Bartlett's test = 0.642).
The first factor comprised items associated with permissions and control by the creator for the reuse of datasets. As has been noted in the previous sections, fear of misuse is another reason why data creators may be reluctant to share their data. Factor 2 refers to the ease/willingness with which they could access or deposit datasets in centralized repositories.
The cluster analysis resulted in three clusters of 25, 63, and 11 participants, respectively. Individuals included in cluster 1 seem not to rely on permissions for the use of datasets, and they have a moderate opinion regarding facilitating access to their data through centralized repositories. Clusters 2 and 3 exhibit a moderate and similar attitude with respect to Factor 1 and Factor 2.
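The step of grouping respondents by their factor scores can be sketched as follows. This section does not state which clustering algorithm was used, so plain k-means (Lloyd's algorithm) is assumed here purely for illustration; the function name `kmeans`, the starting centroids, and the factor scores are all hypothetical.

```python
from collections import Counter
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical mean scores (Factor 1, Factor 2) for six respondents.
scores = [(4.5, 4.0), (4.2, 4.4), (1.5, 3.8), (1.2, 4.1), (4.0, 1.0), (4.3, 1.4)]
labels, centroids = kmeans(scores, [scores[0], scores[2], scores[4]])
sizes = Counter(labels)
print(labels, dict(sizes))
```

The centroid of each cluster is the mean factor score of its members, which is the kind of value the text compares when it describes a cluster as scoring a factor "highly" or "moderately".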

Questions related to the quality of datasets
In this section, issues affecting the quality of the metadata and attributes of the research data (Table 5) are taken into account based on the degree of importance assigned to them by the respondents.
The factor analysis of questions classified in this category [...] Factor 2 grouped V55, V56, and V54 (Cronbach's α = 0.827), associated with the use of recognized standards for data and metadata. Factor 3 grouped items V53, V51, and V52 (Cronbach's α = 0.716), associated with the reliability of the data providers (reputation and trust). Factor 4 grouped items V87, V82, and V81 (Cronbach's α = 0.680), related to unrestricted access to research data.
If we consider Factors 1 and 4 formed by outliers, we are left with two groups with approximately the same number of people whose attitudes follow the same trend with respect to the five dimensions, except for some individuals in cluster 3 who rated Factor 2 items (quality of content and metadata) more highly. Almost all respondents agreed on the importance of access, reputation, and certification of data and metadata.

Table 3. Items classified under the barriers category.
Scale: Major barrier (5); High barrier (4); Half barrier (3); Low barrier (2); No barrier (1)
V104: The potential use and commercial exploitation
V107: Loss of control over intellectual property
V106: Misinterpretation or misuse of data
V105: Loss of credit or recognition of original work
V108: Difficulty of clarifying data rights for work involving multiple inputs or authors
V103: Concerns about legal liability for data or release of data
V109: Concerns about impact of data release (e.g., on endangered species, cultural artefacts, or vulnerable populations, unwanted disclosure, etc.)
V124: Varying data formats
V123: Varying standards in how data have been gathered
V122: Varying degrees of data quality in different datasets
V121: Understanding how to interpret and reuse data
V119: Access information on how to cite the datasets
V118: Accessing and understanding terms of use/licences
V117: Needing to register (e.g., on a website) to access data
V120: Understanding how to access the data

Table 4. Conditions for sharing data or facilitating access to them.
Scale: (1); No (0)
V41: Citation of the data providers in all disseminated work making use of the data
V42: Sharing facilitates the opportunity to collaborate on a project using the data
V43: The results based (at least in part) on the data could not be disseminated in any format without the data provider's approval
V44: At least part of the costs of data acquisition, retrieval or provision must be recovered
V45: The results based (at least in part) on the data could not be disseminated without the data provider having the opportunity to review the results and make suggestions or comments, but approval not required
V46: Reprints of articles that make use of the data must be provided to the data provider
V47: The data provider is given a complete list of all products that make use of the data, including articles, presentations, educational materials, and the rest
V48: Legal permission for data use is obtained
V49: There is a mutual agreement on reciprocal sharing of data
V50: The data provider provides and accepts an agreement for its use

Motivators to using and sharing data: The following statements relate to sharing scientific data.

CONCLUSIONS
The existing literature includes studies of researchers' habits and attitudes regarding the sharing, use, access, and reuse of research data in areas such as psychology, sociology, physics, and astronomy, as well as from a multidisciplinary perspective. This study focuses on a field that is important not only from a research point of view but also because of the value its data could have for other fields such as health, nutrition, and consumer attitudes. Reliable data sharing could therefore represent an important contribution to the scientific community and to society in general.
Reasons given by researchers for data sharing reveal that many view it as a means of promoting and disseminating research results, although mandates play an important role in their participation. A lack of awareness of legal aspects related to data and a reluctance to share data for fear of losing exclusive control over it are problems that have emerged in various studies, while a lack of knowledge of data management constitutes a barrier to data sharing. All this points to the need for ongoing training and education in the emerging field of data management to make authors aware of the potential of sharing data safely and legally. There is also a need to create institutional policies on research data rather than just on publications, and to ensure that their terms are understood. The work of data managers and librarians should not be voluntary but supported by academic authorities. Another very important issue to explore is how the work involved in sharing and managing research data is encouraged and rewarded.
Some of these aspects have been reported before in other disciplines, which suggests that they are not discipline-specific but related to intrinsic research habits that need a cultural change to create a real data sharing ecosystem.
This study can serve not only as a snapshot of the attitudes of a particular scientific community but also as a guide for the types of measures needed to accompany the provision of services, training, incentives and recognition for researchers, measures that could be extrapolated to other communities.

Table 5. Items affecting the quality of the metadata and attributes of the research data.

Question: How important were the following factors when deciding to use others' research data? That the data:
Scale: Very important (4); Important (3); Somewhat important (2); Not at all important (1)
V51: Are from someone I know personally
V52: Are from someone I do not know personally but who has a good reputation
V53: Are from someone at an institute with a good reputation
V54: Are from a repository with a good reputation
V55: Comply with guidelines around collection and/or formatting
V56: Have good documentation and/or metadata
V57: Are certified or accredited by a third party
V58: Are commonly used by others
V59: Have been cited in a publication in the formal literature (journals, books, etc.)

Question: Which attributes do you think are most important to open data?
Scale: Very important (4); Important (3); Somewhat important (2); Not at all important (1)
V81: Access and use of data without restrictions
V82: Possibility of reusing data for academic/research purposes
V83: Well-defined terms of use licence
V84: The data are clearly defined in a useful way to interpret them
V85: The metadata is complete, structured according to a standard, and is useful for interpreting the data
V86: The potential value of the data for easy access and low-cost reuse
V87: That access to the data has no cost
V88: Data are updated
V89: Data are recent