Abstract: Measuring the impact of scientific articles is important for evaluating the research output of individual scientists, academic institutions, and journals. Citations are the raw data for constructing impact measures, but those measures can be biased if factors affecting citation patterns are not properly accounted for. In this work, we address the problem of field variation and introduce an article-level metric for evaluating the visibility of individual articles. The measure derives from joint probabilistic modeling of the articles’ content and the citations among them, using latent Dirichlet allocation (LDA) and the mixed membership stochastic blockmodel (MMSB). The proposed model provides a visibility metric for individual articles adjusted for field variation in citation rates, a structural understanding of citation behavior in different fields, and article recommendations that take into account article visibility and citation patterns.
Bio: Tian Zheng is Professor and Department Chair of Statistics at Columbia University. She develops novel methods for exploring and understanding patterns in complex data from different application domains. Her current projects are in the fields of statistical machine learning, spatiotemporal modeling, and social network analysis. Professor Zheng’s research has been recognized by the 2008 Outstanding Statistical Application Award from the American Statistical Association (ASA), the Mitchell Prize from ISBA, and a Google research award. She became a Fellow of the American Statistical Association in 2014. Professor Zheng is the recipient of Columbia’s 2017 Presidential Award for Outstanding Teaching. From 2018 to 2020, she served successively as chair-elect, chair, and past chair of the ASA’s Section on Statistical Learning and Data Science.