Research Article
Bestseller Prediction and Influencing Factor Analysis Using Explainable Machine Learning
1 Sapyeong Publishing, 2 Sungkyunkwan University
Published: January 2025 · Vol. 54, No. 1 · pp. 81-108
DOI: https://doi.org/10.17287/kmr.2025.54.1.81
Abstract
Research on predicting sales volumes or identifying bestsellers in online bookstores often relies on post-publication data, such as sales records and customer reviews, which poses a cold start problem for new books. This study addresses this issue by developing a machine learning model based solely on metadata from newly released literary books. Among the tested models, LightGBM exhibits the best predictive performance. Using feature importance analysis and the SHAP method, we identify key factors influencing bestseller prediction, including author frequency, publisher frequency, category frequency, price, and publication month. Our findings provide a solution to the cold start problem and offer actionable insights for online bookstores to anticipate a book’s success potential and refine marketing strategies.
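
The abstract names author, publisher, and category frequency among the key features but does not say how they are computed. As a minimal sketch, assuming "frequency" means the number of times a value appears among previously released books, a frequency encoder might look like the following. The function names and sample data are hypothetical and are not taken from the study.

```python
from collections import Counter


def fit_frequency_encoder(values):
    """Count how often each category value appears in the training data."""
    return Counter(values)


def frequency_encode(values, counts):
    """Map each value to its training-set count; unseen values get 0."""
    return [counts.get(v, 0) for v in values]


# Hypothetical author column for previously released books (illustrative only).
train_authors = ["Kim", "Lee", "Kim", "Park", "Kim", "Lee"]
author_counts = fit_frequency_encoder(train_authors)

# A newly released book has no sales or review history, but its author
# may already appear in the catalogue, so the feature is available at launch.
new_authors = ["Kim", "Choi"]
encoded = frequency_encode(new_authors, author_counts)
print(encoded)  # [3, 0]
```

Because the counts are fitted only on earlier releases, a feature of this kind can be computed for a new title at publication time, which is what a metadata-only approach to the cold start problem requires. Whether the study uses raw counts, relative frequencies, or another variant is not stated in the abstract.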
