Eugene Garfield (see Garfield, 2002, home page) is considered the ‘grandfather’ of bibliometrics. Garfield set up the Science Citation Index (see subsection 3.2.1, page 32), from which the Journal Impact Factor (JIF) is calculated – the de facto performance indicator for the publishing and research community. Journals compete for library subscriptions largely on their JIF (Garfield, 1972), and researchers may be evaluated by the JIF of the journals in which they have published (see the next section for a more in-depth look at the JIF).

Garfield (1955) put forward the idea of a citation index for the sciences as a way to improve the scholarly process (there had been such indexes for law reports for some time). A citation index is a list of papers along with the papers that cite them. Garfield also suggested the possibility of using citations as a measure of the impact of an article within its research field. Counting citations would provide a better indication of performance than the existing method of simply counting the number of publications an author had written.

Garfield and Sher (1963) presented results of research into the citation behaviour (“bibliometrics”) of research literature published in 1961. They found that, when citation frequency (the number of times something is cited, be that a paper, author or journal) is plotted, a small subset receives the majority of citations. For example, 60 of the 5000 journals studied accounted for 60% of all citations. This leaves the majority of papers receiving few or no citations after they are published.

of an estimated 24,000 (Harnad et al., 2004). However, Garfield (1990) pointed out that “no matter how many journals are in the market, only a small fraction account for most of the articles that are published and cited in a given year.” Thomson ISI (2004) estimated “that a core of approximately 2,000 journals now accounts for about 85% of published articles and 95% of cited articles.”

4.2.1 The Impact Factor

The journal impact factor is the number of citations to a journal normalised by the number of papers in that journal (Equation 4.2 gives the mathematical definition). Garfield first put forward the idea of an impact factor in Garfield (1955) – “when one is trying to evaluate the significance of a particular work and its impact on the literature and thinking of the period . . . such an ‘impact factor’ may be much more indicative than an absolute count of the number of a scientist’s publications.” While Garfield put forward the idea of an impact factor as a means of evaluating research, it has been most widely used as a method of comparing the importance of journals. Garfield wasn’t the first person to use citations as a quantitative measure of the importance of journals: Gross and Gross (1927) proposed counting citations as a means of collection management for journals (unlike the journal impact factor, Gross and Gross didn’t normalise for the number of papers in a journal). However, the increasing use of research evaluation has led to the JIF becoming the most widespread research metric used in evaluation.

\[ I_j = \frac{C_{t-2}^{t}}{P_{t-2}^{t}} \tag{4.2} \]

where $I_j$ is the impact factor of a journal $j$, $C$ is the total number of citations to papers in that journal over the two preceding years and $P$ is the total number of research papers published in that journal over the two preceding years.
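As a minimal sketch of the ratio described above, the following Python function divides a journal's citation count by its paper count over the same two-year window. The function name `impact_factor` and the example figures are illustrative assumptions, not values taken from any real journal.

```python
def impact_factor(citations: int, papers: int) -> float:
    """Return the impact factor I_j = C / P for one journal.

    `citations` is C (citations to the journal's papers in the two-year
    window) and `papers` is P (research papers published in that window).
    """
    if papers <= 0:
        raise ValueError("the journal must have published at least one paper")
    if citations < 0:
        raise ValueError("citation count cannot be negative")
    return citations / papers


# Hypothetical journal: 150 papers in the window, cited 450 times in total.
example_jif = impact_factor(citations=450, papers=150)
print(example_jif)  # 3.0
```

Because the count is normalised by the number of papers, a small journal and a large journal with the same average citations per paper would receive the same value.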

Garfield (2005) likens his creation to nuclear energy – “the impact factor is a mixed blessing. I expected it to be used constructively while recognizing that in the wrong hands it might be abused.” As Garfield acknowledges, the JIF has been transposed from a measure of journal impact to a proxy for the impact of authors publishing in that journal. This has had real economic consequences for some researchers, e.g. in Spain the use of the ISI JIF to award researchers’ bonuses is enshrined in law (Jiménez-Contreras et al., 2002).

The use of the journal impact factor to evaluate authors is, however, deeply flawed. Seglen (1994, 1997) argued against the use of the JIF as an evaluation tool for authors, as there is a huge range in the number of citations to papers within a journal (Seglen found for three biochemical journals that “15% of the [papers] account for 50% of the citations, and the most cited 50% of the [papers] account for 90% of the citations”). In effect, an author of a low-impact paper gets the same rating as an author of a high-impact paper, as long as both are published in the same journal. While journals vary in the quality and type of papers they accept, within a journal individual papers will have a wide range of quality and hence impact.
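The kind of within-journal skew described above can be measured by ranking papers by citation count and computing the share of citations held by the most-cited fraction. The sketch below is one way to do that; the function name `top_share` and the ten citation counts are invented for illustration and are not Seglen's data.

```python
def top_share(citation_counts, fraction):
    """Share of all citations received by the most-cited `fraction` of papers."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    ranked = sorted(citation_counts, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    # Always include at least one paper in the "top" group.
    top_n = max(1, round(len(ranked) * fraction))
    return sum(ranked[:top_n]) / total


# Hypothetical citation counts for ten papers in a single journal.
counts = [50, 20, 10, 8, 5, 3, 2, 1, 1, 0]

print(top_share(counts, 0.1))  # 0.5  (the top paper holds half the citations)
print(top_share(counts, 0.5))  # 0.93 (the top five papers hold 93%)
```

In this invented journal the mean is 10 citations per paper, yet seven of the ten papers fall below it, which is why a journal-level average says little about any individual paper.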

The use of quantitative evaluation inevitably affects the subject of evaluation. If the evaluation system did not result in the subjects changing (hopefully improving), the evaluation would not be having any effect. This evaluation pressure on journals has led to accusations of cheating the system to maximise a journal’s JIF. One mechanism a journal can use to increase its JIF is to encourage authors to cite papers previously published in that journal. Fassoulaki et al. (2000) found a very strong correlation of r = 0.899 between the amount of journal ‘self-citation’ and its JIF. While there may be entirely reasonable motivations to do this, e.g. to tie papers into the historical record of the journal, the net result is to increase that journal’s JIF.
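For readers unfamiliar with the statistic quoted above, the following sketch computes a Pearson correlation coefficient between a per-journal self-citation rate and JIF. It assumes the reported r is a Pearson coefficient, which the text does not state; the function name `pearson_r` and both data series are invented for illustration and are not the data of Fassoulaki et al.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical journals: fraction of self-citations and the journal's JIF.
self_citation_rate = [0.05, 0.10, 0.15, 0.20, 0.30]
jif = [0.8, 1.1, 1.9, 2.0, 3.2]

r = pearson_r(self_citation_rate, jif)
print(round(r, 3))
```

A value of r close to 1 indicates that journals with more self-citation tend to have a higher JIF; it does not by itself show that the self-citation was encouraged by the journal.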

4.3 Conclusion

Bibliometrics is the general umbrella field for the quantitative analysis of research publications. Reviewing the general laws of bibliometrics is useful as a background to the analysis of open access literature, because (to date) open access shares the same characteristics (the distribution of citations, authors and journals).

The Journal Impact Factor (JIF) was developed to compare journals’ importance within a field (the more highly cited the journal, the more important it is). Because citation counts are recognised as a useful quantitative measure of importance, using citation data to evaluate the effect of open access is an obvious step (by comparing the citation impact of open access vs. non-open access papers). But the limitations of the JIF must also be kept in mind when drawing conclusions from any comparison based on citation counting: citations are highly skewed, ‘self-citation’ can distort results, and citations are essentially a measure of popularity and not necessarily of quality.

Performing bibliometrics at all requires access to bibliometric data. The Open Archives Initiative (section 3.4) provides me with metadata (authors, title etc.) for open access papers, but getting citation data for these papers requires building a citation database, which is discussed in the next chapter and in Citebase Search (chapter 7).

An Analysis of OAI Repositories and Harvesting Support

5.1 Introduction

Distributed systems are advantageous because they share the costs across many providers, improve reliability through removing single points of failure and, for many applications, improve performance by distributing load across multiple providers. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a distributed system – many data providers manage the acquisition and cataloguing of resources, providing a common interface for many services to harvest and aggregate those resources. However, OAI service providers have encountered some problems when harvesting from data providers.

Given the global nature of OAI-PMH – and its very low implementation cost – many OAI data providers are on low-bandwidth networks or exhibit minor errors in their ‘home-brew’ implementation. Some very large collections of material exist that present a very large amount of data to re-harvest should an error occur during harvesting.

Two problems encountered while harvesting from OAI data providers are errors in text data (character encoding issues), which cause the XML-based responses to be unparseable by XML parsing tools, and incorrect implementation of the required parts of the OAI-PMH protocol (e.g. only accepting the optional seconds-based datestamps, when OAI-PMH requires at least support for day-based resolution).
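The two problems above can each be worked around on the harvester side. The sketch below is an illustrative assumption about how that might look, not the approach taken by any particular tool: `parse_lenient` replaces undecodable bytes and strips characters that XML 1.0 forbids before parsing, assuming the response is meant to be UTF-8, and `to_day_granularity` reduces a seconds-based datestamp to the day-based form that every OAI-PMH repository must accept. Both function names and the sample inputs are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

# Characters outside the ranges permitted by the XML 1.0 `Char` production.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def parse_lenient(raw: bytes) -> ET.Element:
    """Parse a response that may contain encoding errors.

    Assumes the response is intended to be UTF-8. Undecodable bytes become
    U+FFFD and characters that are illegal in XML 1.0 are dropped.
    """
    text = raw.decode("utf-8", errors="replace")
    text = _INVALID_XML_CHARS.sub("", text)
    return ET.fromstring(text.encode("utf-8"))


def to_day_granularity(datestamp: str) -> str:
    """Reduce 'YYYY-MM-DDThh:mm:ssZ' (or 'YYYY-MM-DD') to 'YYYY-MM-DD'."""
    day = datestamp[:10]
    date.fromisoformat(day)  # raises ValueError if the day part is malformed
    return day


# Hypothetical damaged record: an invalid UTF-8 byte and two control characters.
damaged = b"<record><title>Caf\xff\x00 paper\x0b</title></record>"
root = parse_lenient(damaged)
print(repr(root.find("title").text))

print(to_day_granularity("2004-05-17T12:30:00Z"))  # 2004-05-17
```

Cleaning the text loses the damaged characters, so this trades fidelity for the ability to harvest the rest of the record; a stricter harvester could log the affected identifiers instead.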

To help with these challenges I wrote a tool that harvests records from data providers, handles OAI-implementation errors and stores the record metadata in a database for use by other services. This tool is called ‘Celestial’.
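To make the idea of storing harvested metadata for other services concrete, here is a minimal sketch using SQLite from the Python standard library. It is not Celestial's implementation: the table layout, the function names `open_store`, `store_record` and `records_since`, and the example repository and identifiers are all assumptions made for illustration.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical store for harvested records."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS records (
               repository   TEXT NOT NULL,
               identifier   TEXT NOT NULL,
               datestamp    TEXT NOT NULL,
               metadata_xml TEXT NOT NULL,
               PRIMARY KEY (repository, identifier)
           )"""
    )
    return conn


def store_record(conn, repository, identifier, datestamp, metadata_xml):
    """Insert a record, replacing any earlier copy with the same identifier."""
    conn.execute(
        "INSERT OR REPLACE INTO records VALUES (?, ?, ?, ?)",
        (repository, identifier, datestamp, metadata_xml),
    )
    conn.commit()


def records_since(conn, repository, datestamp):
    """Identifiers from one repository with a datestamp on or after `datestamp`."""
    rows = conn.execute(
        "SELECT identifier FROM records "
        "WHERE repository = ? AND datestamp >= ? ORDER BY identifier",
        (repository, datestamp),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical harvest: two records, one of which is later re-harvested.
conn = open_store()
store_record(conn, "example.org", "oai:example.org:1", "2004-01-10", "<dc/>")
store_record(conn, "example.org", "oai:example.org:2", "2004-02-01", "<dc/>")
store_record(conn, "example.org", "oai:example.org:1", "2004-03-05", "<dc/>")

print(records_since(conn, "example.org", "2004-02-15"))
```

Keying records on repository and identifier means a re-harvest after an error overwrites existing rows rather than duplicating them, and ISO-format datestamps compare correctly as plain strings.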

Celestial does not help to resolve the semantic ambiguity associated with Dublin Core metadata (see subsection 3.4.2) – or other problems with metadata interpretation. The purpose of Celestial is to at least provide a consistent (correct) mechanism to obtain metadata, by abstracting over the wide range of OAI-PMH implementations.