Empirical Study on Citation Count Prediction of Research Articles

Journal of Scientometric Research, 2022, 11, 2, 155-163.
Published: September 2022
Type: Research Article
Author(s) affiliations:

Murali Krishna Enduri¹,*, V Udaya Sankar², Koduru Hajarathaiah¹

¹Department of Computer Science and Engineering, SRM University, Amaravati, Andhra Pradesh, INDIA.

²Department of Electronics and Communications Engineering, SRM University, Amaravati, Andhra Pradesh, INDIA.


Citation counts are a measure of the impact of a researcher or research article and of a journal's quality. Investigating the citations of articles and researchers is an important task in the research community, so understanding and predicting the citation patterns of research articles has become popular across scientific fields. In this work, we present a machine learning approach that predicts the citations of research articles from their keywords. We study citation impact based on the keywords mentioned in the articles, using a dataset of publications that appeared in the various Physical Review journals from 1985 to 2012. In this dataset, the authors of each publication assign it PACS codes (keywords), each of which represents a sub-field of physics. We investigate the impact of an article's PACS codes on its citations, analysing the first level (sub-field of physics), the second level (sub-area of a sub-field of physics) and the third level of the PACS codes. We observe that, compared to the first level, every pair of citation patterns at the second level is highly correlated. We also obtain a universal approximation curve for the third level that matches the average value of the first level. This curve resembles a shifted and scaled Gaussian function and is right-skewed. Using this universal curve, citations can also be predicted from the keywords.
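The level-wise analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several points are assumptions made for illustration: the level-to-prefix mapping (first digit, first two digits, first four digits of a PACS code), the record layout with `year`, `pacs` and `citations` fields, the choice of publication year as the axis of a "citation pattern", and the toy records themselves, which are not taken from the Physical Review dataset. The helper names `pacs_prefix`, `citation_pattern` and `pearson` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Assumed mapping from PACS level to number of leading digits (illustrative only).
LEVEL_WIDTH = {1: 1, 2: 2, 3: 4}

def pacs_prefix(code, level):
    """Return the leading digits of a PACS code such as '05.45.-a' for a level."""
    digits = code.replace(".", "")
    return digits[:LEVEL_WIDTH[level]]

def citation_pattern(papers, level):
    """Map each PACS prefix to {publication year: mean citations}."""
    buckets = defaultdict(lambda: defaultdict(list))
    for paper in papers:
        # Use a set so a paper counts once per prefix even if two codes share it.
        for prefix in {pacs_prefix(c, level) for c in paper["pacs"]}:
            buckets[prefix][paper["year"]].append(paper["citations"])
    return {p: {y: mean(v) for y, v in years.items()}
            for p, years in buckets.items()}

def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Toy records, invented for this sketch.
papers = [
    {"year": 1990, "pacs": ["05.45.-a", "03.65.Ta"], "citations": 10},
    {"year": 1990, "pacs": ["05.70.Ln"], "citations": 20},
    {"year": 1991, "pacs": ["03.67.Lx"], "citations": 30},
    {"year": 1991, "pacs": ["05.45.-a"], "citations": 4},
]

level1 = citation_pattern(papers, 1)
level2 = citation_pattern(papers, 2)

# Compare two second-level patterns over the years they share.
years = sorted(set(level2["05"]) & set(level2["03"]))
r = pearson([level2["05"][y] for y in years],
            [level2["03"][y] for y in years])
```

On real data, the pairwise correlations would be computed for every pair of prefixes at a given level and over many more years than the two used here; with only two shared years the correlation is trivially plus or minus one.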

Citations of papers by considering the first level of PACS codes.