Paul Donner
Department: Research System and Science Dynamics
Researcher
- 030 2064177-21
- 030 2064177-99
- Google Scholar
- Orcid
Research areas
Bibliometrics, information visualization
List of projects
List of publications
Donner, P. (2024). Data inaccuracy quantification and uncertainty propagation for bibliometric indicators. Research Evaluation, 33. https://doi.org/10.1093/reseval/rvae047
Abstract: This study introduces an approach to estimate the uncertainty in bibliometric indicator values that is caused by data errors. This approach utilizes Bayesian regression models, estimated from empirical data samples, which are used to predict error-free data. Through direct Monte Carlo simulation (drawing many replicates of predicted data from the estimated regression models for the same input data), probability distributions for indicator values can be obtained which provide the information on their uncertainty due to data errors. [...]
Fenton, A., Donner, P., Ambrasat, J., Fabian, G., & Heger, C. (2024). How accurate are Scopus publication counts of researchers? A survey-bibliometric comparison for Germany. In STI2024 (Ed.), Proceedings of the 28th International Conference on Science, Technology and Innovation Indicators (STI2024). Berlin: STI2024.
Abstract: The number of researchers' publications is a widely used proxy measure for scientific output, individual achievement, and performance. Despite well-known criticism from the bibliometric community, the use of bibliometric databases as a basis for measuring publication output is widespread. At the same time, there are established survey instruments that also measure the publication output per researcher. We use survey-bibliometric matching with Scopus publication records to compare the alternative publication counts. A Scopus author ID match could be found for 70% of the respondent researchers. The number of publications per researcher varies greatly between these data sources. [...]
Donner, P. (2024). Are peer review duration and publication delay research quality signals? In STI2024 (Ed.), Proceedings of the 28th International Conference on Science, Technology and Innovation Indicators (STI2024). Berlin: STI2024.
Abstract: Here we study how the lengths of the periods from submission to acceptance (review duration) and from acceptance to publication (publication delay) relate to research quality, as operationalized by F1000Prime recommendations, for a large dataset of publications from the life and health sciences. We find a statistically detectable relationship between shorter peer review duration and recommendations, but its effect size is negligibly small.
Donner, P. (2024). Open and overlooked: the penalties for preprint open access papers and translated journals in citation analysis. In STI2024 (Ed.), Proceedings of the 28th International Conference on Science, Technology and Innovation Indicators (STI2024). Berlin: STI2024.
Abstract: The citations to open access preprint versions of papers and citations to papers in journals translated to English are not regularly counted in major proprietary citation index databases or in ordinary bibliometric research assessment even though they arguably reflect a true part of a work’s scientific impact. Here we explore the extent of these phenomena using Web of Science data.
Donner, P. (2024). Investigation of the external validity of the 2004 German Science Foundation author contribution calculation recommendation for medical schools' performance-based funding systems. In STI2024 (Ed.), Proceedings of the 28th International Conference on Science, Technology and Innovation Indicators (STI2024). Berlin: STI2024.
Abstract: This study examines how well an idiosyncratic authorship counting rule for co-authored publications recommended by the German Science Foundation (DFG) for medical schools and widely used in performance-based funding systems aligns with the empirical evidence. The DFG rule and two other co-author credit rules are compared with empirical data of percentage contribution statements of authors of co-authored papers in medicine.
Donner, P., & Blümel, C. (2024). Researcher mobility, co-authorship and individual research agendas. In STI2024 (Ed.), Proceedings of the 28th International Conference on Science, Technology and Innovation Indicators (STI2024). Berlin: STI2024.
Abstract: This study investigates whether researchers whose published research record is more thematically broad (covering more, and more semantically distant, topics) are also characterized by specific patterns in their mobility between research organizations and countries and their co-authorship patterns. We study a large sample of productive authors in STEM fields who have been active in Germany. Our results show that specific types of international mobility go together with slightly elevated epistemic breadth. But scientists with larger co-author networks have relatively greater epistemic breadth. [...]
Donner, P. (2024). Remarks on modified fractional counting. Journal of Informetrics, 18, 101585. https://doi.org/10.1016/j.joi.2024.101585
Donner, P. (2022). Drawbacks of Normalization by Percentile Ranks in Citation Impact Studies. Journal of Library and Information Studies, 20(2), 75-93.
Abstract: This paper discusses drawbacks of the percentile rank method for citation impact normalization which have hitherto been neglected in the bibliometrics literature. The transformation of citation counts to percentile ranks changes ratio scale data into ordinal scale data, for which the notions of the ratio between two values and of the magnitude of a difference between two values are not defined – a substantial loss of information. This distorts citation data particularly severely because the differences between citation counts adjacent in order in publication sets are greater for more highly cited publications and because highly cited publications are more scarce than non-highly cited ones. [...]
Donner, P. (2022). Algorithmic identification of Ph.D. thesis-related publications: a proof-of-concept study. Scientometrics (online first).
Abstract: In this study we propose and evaluate a method to automatically identify the journal publications that are related to a Ph.D. thesis using bibliographical data of both items. We build a manually curated ground truth dataset from German cumulative doctoral theses that explicitly list the included publications, which we match with records in the Scopus database. We then test supervised classification methods on the task of identifying the correct associated publications among high numbers of potential candidates using features of the thesis and publication records. The results indicate that this approach results in good match quality in general and with the best results attained by the “random forest” classification algorithm.
Donner, P. (2021). Citation analysis of Ph.D. theses with data from Scopus and Google Books. Scientometrics, 126, 9431-9456. https://doi.org/10.1007/s11192-021-04173-w
Abstract: This study investigates the potential of citation analysis of Ph.D. theses to obtain valid and useful early career performance indicators at the level of university departments. For German theses from 1996 to 2018 the suitability of citation data from Scopus and Google Books is studied and found to be sufficient to obtain quantitative estimates of early career researchers’ performance at departmental level in terms of scientific recognition and use of their dissertations as reflected in citations. Scopus and Google Books citations complement each other and have little overlap. Individual theses’ citation counts are much higher for those awarded a dissertation award than others. Departmental level estimates of citation impact agree ...
Donner, P. (2021). Identifying constitutive articles of cumulative dissertation theses by bilingual text similarity. Evaluation of similarity methods on a new short text task. Quantitative Science Studies, 2(3). https://doi.org/10.1162/qss_a_00152
Abstract: Cumulative dissertations are doctoral theses comprised of multiple published articles. For studies of publication activity and citation impact of early career researchers it is important to identify these articles and link them to their associated theses. Using a new benchmark data set, this paper reports on experiments of measuring the bilingual textual similarity between, on the one hand, titles and keywords of doctoral theses, and, on the other hand, articles’ titles and abstracts. The tested methods are cosine similarity and L1 distance in the Vector Space Model (VSM) as baselines, the language-indifferent methods Latent Semantic Analysis (LSA) and trigram similarity, and the language-aware methods fastText and Random Indexing (RI)...
Donner, P. (2020). Validation of the Astro dataset clustering solutions with external data. Scientometrics, 126, 1619–1645. https://doi.org/10.1007/s11192-020-03780-3
Abstract: We conduct an independent cluster validation study on published clustering solutions of a research testbed corpus, the Astro dataset of publication records from astronomy and astrophysics. We extend the dataset by collecting external validation data serving as proxies for the latent structure of the corpus. Specifically, we collect (1) grant funding information related to the publications, (2) data on topical special issues, (3) on specific journals’ internal topic classifications and (4) usage data from the main online bibliographic database of the discipline. The latter three types of data are newly introduced for the purpose of clustering validation and the rationale for using them for this task is set out.
Donner, P., & Schmoch, U. (2020). The implicit preference of bibliometrics for basic research. Scientometrics, 124, 1411-1419. https://doi.org/10.1007/s11192-020-03516-3
Abstract: By individually associating articles to basic or applied research, it is shown that basic articles are cited more frequently than applied ones. Dividing the subject categories of the Web of Science into a basic and an applied part, the mean field-normalization rate is referred to the applied or basic part depending on the research orientation of the paper analysed. By this approach, a distinct difference of the citations for the applied and basic parts of most subject categories is found. However, differences of the citation scores of applied and basic research organisations are found as well, but are less clear. The explanation is that applied and basic research organisations generally publish a mix of basic and applied articles. [...]
Donner, P. (2020). A validation of coauthorship credit models with empirical data from the contributions of PhD candidates. Quantitative Science Studies, 1(2), 551-564.
Abstract: A perennial problem in bibliometrics is the appropriate distribution of authorship credit for coauthored publications. Several credit allocation methods and formulas have been introduced, but there has been little empirical validation as to which method best reflects the typical contributions of coauthors. This paper presents a validation of credit allocation methods using a new data set of author-provided percentage contribution figures obtained from the coauthored publications in cumulative PhD theses by authors from three countries that contain contribution statements. [...]
List of talks and conferences
Since 2016
Researcher at the Deutsches Zentrum für Hochschul- und Wissenschaftsforschung (German Centre for Higher Education Research and Science Studies), Department 2, Research System and Science Dynamics
08/2013 - 12/2015
Researcher at iFQ Berlin, bibliometrics unit
2012
Master of Arts in Library and Information Science, Humboldt-Universität zu Berlin