
Towards quality measures for bibliometric indicators

Bibliometric data are a representation of formal communication in science. They represent the results of scientific practices, facilitate analyses of the structure and dynamics of scientific knowledge, and provide insights into the social conditions under which science is produced. This potential of bibliometric data and indicators is tempting. But one must not ignore the fact that both have numerous shortcomings and, even though they are frequently presented as such, are far from being objective representations. As with all social science data, bibliometric data and indicators depend strongly on the methods and mechanisms adopted for their production. Consequently, they contain incomplete or imprecise information and are, in some cases, simply erroneous (e.g. Glänzel/Debackere 2003, Hornbostel et al. 2008).

In contrast to other fields of social science research, the debate on this problem within the bibliometric community is far from exhaustive. It often remains limited to an eclectic description of issues. In many cases there is neither an indication of the scope of the problems and uncertainties nor any conclusion as to the reliability of the resulting indicators. Hence, statements on the quality and limitations of bibliometric knowledge are difficult to sustain. It is left to the user of the data and indicators to decide in which contexts the indicators are reliable enough to be applied.

This is the starting point of the project “Towards quality measures for bibliometric indicators”. It aims at improving the quality of bibliometric analyses by providing readily available information on potential biases and errors in bibliometric indicators.

The project is divided into a number of modules, each dedicated to a particular aspect of the quality of bibliometric indicators. In the first stage of the project, research will focus on the following topics:

Reference Matching
Information on citations is generated by linking references to target documents. There is evidence that the methods for doing this can, in some cases, strongly influence citation counts (Moed 2005). However, very little is known about how the specific methods applied at this stage of producing bibliometric data actually influence bibliometric indicators, or how potential changes are distributed among the different objects of citation analysis. The aim of this module is to compare the results of different methods for matching reference lists with target documents in order to provide a reliability measure for citation rates.
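To make the matching step concrete, it can be sketched as a similarity comparison between normalized reference strings. The function names, the similarity measure and the threshold value below are illustrative assumptions, not the methods investigated in the project:

```python
from difflib import SequenceMatcher

def normalize(ref):
    """Lower-case and collapse whitespace so trivial variants still match."""
    return " ".join(ref.lower().split())

def match_reference(ref, targets, threshold=0.8):
    """Return the best-matching target record, or None if no candidate
    reaches the similarity threshold. A stricter threshold yields fewer
    but more reliable citation links; a looser one recovers more links
    at the risk of false matches."""
    best, best_score = None, 0.0
    for target in targets:
        score = SequenceMatcher(None, normalize(ref), normalize(target)).ratio()
        if score > best_score:
            best, best_score = target, score
    return best if best_score >= threshold else None

targets = ["Moed H F, 2005, Citation Analysis in Research Evaluation"]
# A cited reference with slightly different punctuation still matches:
print(match_reference("Moed, H.F. (2005): Citation Analysis in Research Evaluation", targets))
```

The choice of threshold is exactly the kind of parameter whose effect on citation counts this module sets out to measure.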

Citation Windows
Citations are used as a proxy for the attention a publication attracts in the scientific community. However, every field has its own pace of processing knowledge, so information on how citations evolve over time is crucial for the reliability of comparisons between fields: when comparing citation rates across fields, these differences have to be taken into account. Taking this as a starting point, we will analyse how field-specific citation patterns can best be described and integrated into bibliometric analyses.
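The effect of a fixed citation window can be illustrated with a small sketch. The citing-year data are invented, and the three-year window is a common but arbitrary choice:

```python
def window_count(pub_year, citing_years, window=3):
    """Count citations received within `window` years of publication,
    counting the publication year itself as year 0."""
    return sum(1 for y in citing_years if pub_year <= y < pub_year + window)

# Invented citing years for two papers published in 2005, mimicking a
# fast-moving and a slow-moving field: the same total of six citations
# yields very different counts under a three-year window.
fast_field = [2005, 2005, 2006, 2006, 2007, 2010]
slow_field = [2007, 2008, 2009, 2010, 2011, 2012]
print(window_count(2005, fast_field))  # 5
print(window_count(2005, slow_field))  # 1
```

Comparing the two papers on a three-year window would penalize the slow field even though both eventually attract the same attention, which is why window lengths need to reflect field-specific citation patterns.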

Another important issue with regard to citation analyses is the role of self-citations across fields. Aside from the question of whether an analysis should include or exclude self-citations, the basic question of what constitutes a self-citation is far from resolved. In this project we will analyse different methods for identifying self-citations and provide guidelines on which methods are best suited for which kinds of analysis.
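Two common operationalizations can be sketched as follows; the "any shared author" and "shared first author" rules are just two of the definitions discussed in the literature, and the function is an illustrative sketch rather than the project's method:

```python
def is_self_citation(citing_authors, cited_authors, mode="any"):
    """Classify a citation link as a self-citation.

    mode="any":   citing and cited paper share at least one author
                  (the broadest common definition);
    mode="first": only a shared first author counts (a stricter rule).
    """
    if mode == "first":
        return citing_authors[0] == cited_authors[0]
    return bool(set(citing_authors) & set(cited_authors))

citing = ["Smith, J", "Lee, K"]
cited = ["Miller, A", "Lee, K"]
print(is_self_citation(citing, cited))           # True: "Lee, K" appears on both
print(is_self_citation(citing, cited, "first"))  # False: first authors differ
```

The same citation link is classified differently under the two rules, which is precisely why the choice of definition matters for the resulting indicators.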

Author disambiguation
An important task in most bibliometric research endeavours is identifying authors by their name and their publication set. Particularly where authors have common last names (such as “Smith” or “Miller” in English-speaking countries), more information than just the name is required to determine which publications can actually be attributed to one specific author. In this sub-project we will compare different methods for grouping publications by author. The overall aim is to provide estimates for the correct identification of authors.
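One simple family of such methods groups a name's publications by shared coauthors. The greedy single-link clustering below is an illustrative baseline under invented data, not one of the methods compared in the sub-project:

```python
def cluster_publications(pubs):
    """Greedy single-link clustering for one ambiguous author name:
    publications that share at least one coauthor are assumed to belong
    to the same person, and overlapping clusters are merged.

    pubs: dict mapping publication id -> set of coauthor names."""
    clusters = []  # list of (publication ids, accumulated coauthors)
    for pid, coauthors in pubs.items():
        hits = [c for c in clusters if c[1] & coauthors]
        merged_ids, merged_co = {pid}, set(coauthors)
        for ids, co in hits:
            merged_ids |= ids
            merged_co |= co
            clusters.remove((ids, co))
        clusters.append((merged_ids, merged_co))
    return [ids for ids, _ in clusters]

# Three papers signed with the same name string: two share the coauthor
# "Chen, W" and are grouped together; the third is attributed to a
# (presumably) different person.
pubs = {
    "p1": {"Chen, W"},
    "p2": {"Chen, W", "Garcia, M"},
    "p3": {"Novak, P"},
}
print(cluster_publications(pubs))
```

Single-link merging is deliberately permissive; stricter rules (e.g. requiring several shared coauthors or matching affiliations) trade recall for precision, and quantifying that trade-off is what an error estimate for author identification amounts to.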

Coverage of bibliometric databases
The emergence of bibliometric databases such as Elsevier’s Scopus or Google Scholar has led to competition in the market for bibliometric data, which was long dominated by Thomson Reuters products such as the Web of Science. It is widely known that each data source has different features, particularly with regard to coverage, and different policies for including data. However, very little is known about what this actually means for the potential use of the data. Which kinds of literature are covered by the respective databases? How do the producers select publications? How does the coverage differ according to field? Those are the questions addressed in another sub-project, which aims at providing information on the field-specific coverage of bibliometric databases.

Counting methods
Over the last several years, a debate has emerged on different methods for counting citations and publications when a publication can be attributed to more than one unit of analysis (e.g. Gauffriau et al. 2008). Nevertheless, only a relatively small number of studies have so far addressed the question of to what extent the use of different counting methods actually makes a difference. The aim of this module is to conduct a number of case studies comparing the results of different methods for counting publications and citations.
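The difference between two widespread counting methods, full and fractional counting, can be illustrated with a small sketch; the country codes and paper data are invented:

```python
from collections import defaultdict

def count_publications(papers, method="full"):
    """Aggregate publication counts per unit of analysis (here: countries).

    method="full":       every contributing unit receives a whole count,
                         so a co-publication is counted several times;
    method="fractional": each paper's weight of 1 is split equally among
                         its units, so the totals sum to the number of papers.
    """
    totals = defaultdict(float)
    for units in papers:
        weight = 1.0 if method == "full" else 1.0 / len(units)
        for unit in units:
            totals[unit] += weight
    return dict(totals)

papers = [["DE", "NL"], ["DE"], ["NL", "BE", "DE"]]
print(count_publications(papers, "full"))        # DE appears on all three papers
print(count_publications(papers, "fractional"))  # totals sum to 3.0
```

Under full counting the totals exceed the number of papers whenever co-publications occur, while fractional counting preserves the total; which behaviour is appropriate depends on the question being asked, and that is what the case studies are meant to probe.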

The project is integrated into the bibliometric research conducted at iFQ. As part of the Competence Centre for Bibliometrics, the project will be implemented in cooperation with the IWT Bielefeld, ISI Fraunhofer and FIZ Karlsruhe. It explicitly contributes to improving the quality of the bibliometric databases currently developed within the Competence Centre for Bibliometrics and used in all analyses conducted within this framework. All resulting methods will be implemented in the in-house databases and made available to potential users of the databases.

Selected References

Gauffriau, Marianne / Larsen, Peder Olesen / Maye, Isabelle / Roulin-Perriard, Anne / von Ins, Markus, 2008: Comparisons of results of publication counting using different methods. Scientometrics 77 (1), 47-176.
Glänzel, Wolfgang / Debackere, Koenraad, 2003: On the opportunities and limitations using bibliometric indicators in a policy relevant context, in: Forschungszentrum Jülich: Bibliometric analysis in science and research. Applications, benefits and limitations; 2nd conference of the central library. Jülich: Schriften des Forschungszentrums Jülich, 225-236.
Glänzel, Wolfgang / Schöpflin, Urs, 1994: Little Scientometrics, Big Scientometrics … and beyond? Scientometrics, 30 (2-3), 375-384.
Hornbostel, Stefan / Klingsporn, Bernd / von Ins, Markus, 2008: Messung von Forschungsleistungen – eine Vermessenheit? in: Alexander von Humboldt-Stiftung (Hg.): Publikationsverhalten in unterschiedlichen Disziplinen. Beiträge zur Beurteilung von Forschungsleistungen, Discussion Paper 12, 2008. Bonn: Alexander von Humboldt-Stiftung, 11-32.
Moed, Henk F. / Glänzel, Wolfgang / Schmoch, Ulrich, 2004: Handbook of quantitative science and technology research. The use of publication and patent statistics in studies of S&T systems. Dordrecht: Kluwer Acad. Publ.
Moed, Henk F., 2005: Citation Analysis in Research Evaluation. Dordrecht: Springer.
Weingart, Peter, 2005: Impact of bibliometrics upon the science system: inadvertent consequences? Scientometrics 62 (1), 117-131.

Contact person: Jeffrey Demaine