The data examined in this study show both similarities and distinguishable differences between the information organization habits and approaches of information professionals and those of scientists. These similarities and differences can be seen in the way descriptive metadata is created and subject terms are applied. Below is a review of the five focused questions, highlighting findings presented earlier in this dissertation.
Question 1: What types of formal/standard metadata are currently being applied by both groups?
Findings from the PIM-influenced portion of this study show that information professionals are more likely to use formal/standard metadata in their everyday work than scientists. Information professionals use standards like Dublin Core and the Ecological Metadata Language.
Surrogate creation was not a typical part of the scientist workflow. When prompted during the Dryad scenario, scientist participants did create metadata, but only within the Dublin Core-based application profile metadata form they were given. Software served as a guideline or standard for scientist participants when creating metadata for both shared and personal use. For example, 14 scientist participants reported using a software program to organize the data sets. Those 14 participants listed six specific types of software and two unnamed programs. Based on the data collected in this study, software use appeared to be a central part of the scientists’ organizing processes.
Question 2: What types of personal metadata are currently being applied by both groups?
Personal metadata schemes were used by both groups. Personal metadata schemes are defined above in Section 8: Descriptive Metadata. Information professionals used personal schemes that reflected their training in formalized metadata standards used in libraries. Compared with the information professionals’ systems, the scientists’ systems were more specific and focused less on “aboutness”. The PIM distinction between “for my use” and “to share with others” was also present during this study.
Question 3: Which controlled vocabularies map best to subject terms created by both groups?
Of the four sub-types of subject terms examined in this study (spatial, temporal, topical, and scientific), Library of Congress Subject Headings (LCSH) had the best coverage for topical terms. For scientific terms, the Integrated Taxonomic Information System (ITIS) represented scientific names very well and had the highest average score of all the vocabularies. Determining a strong vocabulary choice for spatial and temporal terms was not possible during this study because those types of terms were used less frequently (if at all) by participants.
Question 4: What is the extent of overlap in subject term application between the two groups?
Information professionals and scientists did show an overlap in subject term application. For data set 1, both groups applied 12 of the same terms, or 27% of the terms used for that data set. For data set 2, both groups also applied 12 of the same terms, or 24% of the terms used for that data set. Compared to foundational inter-indexer consistency studies (Hooper, 1965; Leonard, 1977), these numbers are within range for multiple indexers describing the same information objects.
Question 5: What is the extent of divergence in subject term application between the two groups?
Information professionals and scientists also had some differences in the subject terms they applied. As a group, information professionals applied fewer terms than scientists. Information professionals had a total of 15 terms that were different from what scientists applied for data set 1. For data set 2, information professionals applied 11 unique terms that were different from what scientists applied. Scientists applied 18 terms that were different from what information professionals applied for data set 1. For data set 2, scientists applied 26 terms that were different from what information professionals applied.
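The overlap percentages reported under Question 4 can be checked against the divergence counts above. The following is a minimal sketch, assuming that the denominator is the total number of distinct terms applied to a data set (the shared terms plus the terms unique to each group). The text does not state the denominator explicitly, and the function name `overlap_share` is illustrative only.

```python
def overlap_share(shared, only_a, only_b):
    """Return shared terms as a fraction of all distinct terms applied.

    Assumption: the total is shared + terms unique to group A + terms
    unique to group B. The source does not state its denominator.
    """
    total = shared + only_a + only_b
    return shared / total


# (shared, information-professional-only, scientist-only) counts from the text
counts = {
    "data set 1": (12, 15, 18),
    "data set 2": (12, 11, 26),
}

for name, (shared, ip_only, sci_only) in counts.items():
    total = shared + ip_only + sci_only
    share = overlap_share(shared, ip_only, sci_only)
    print(f"{name}: {shared} of {total} terms shared ({share:.1%})")
```

Under this assumption the totals are 45 terms for data set 1 and 49 terms for data set 2, and the resulting shares round to the reported 27% and 24%.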
Summary
The results of this study provide insight into how two different communities create metadata and apply subject terms. Based on these findings, it is recommended that repository designers examine the information organization processes of their chosen user group before creating underlying information structures. The findings suggest that who will be creating the metadata could affect both the choice of metadata fields and the controlled vocabularies used. Deciding which metadata approach to take during repository development could also affect the types of metadata guidelines that can be created.
This dissertation research was heavily influenced by the perspectives found in the Personal Information Management (PIM) community. Typically, PIM-related studies use naturalistic approaches to examine how people work within their own environments. This study relied more on control by introducing artificial elements into daily workflows. Comments from individual participants showed awareness of the artificial nature of the two preselected data sets. The same participants explained their rationales for how they worked with the data sets in this simulation and how they would have worked in a more “naturalistic” situation. The PIM-influenced portion of the study supports the recommendation that the PIM practices of scientists be considered during the repository planning phase.
Results from this study, outlined in more detail earlier, indicate that the software packages scientists use to create and analyze data sets have an impact on the process of “science” itself. Considering the diversity of software packages represented by even the small sample studied in this dissertation, it is recommended that designers of a repository intended to represent an interdisciplinary domain take this diversity into account before making metadata decisions.
While researchers such as Salo have pointed to problems of information organization in managing scientific data sets, very little research has been done on this topic specifically. This study was meant as a step in the right direction. Metadata creation and subject term application are only one portion of the data life cycle. More studies will need to be done in order to fully understand the impact that these two types of information organization have on the larger process.