News/Social Media Text and Investor Expectations
Copyright 2011 THE KOREAN ACADEMIC SOCIETY OF BUSINESS ADMINISTRATION
This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
This paper analyzes whether the investor expectations implied by the text of news articles and social media forum posts affect stock returns in the Korean market. Our model, trained on 640,457 articles and forum posts, classifies each post as positive or negative using Word2Vec word embeddings and a bidirectional long short-term memory (LSTM) network, and these classifications are used to construct investor expectation indices. We find that both the expectation index constructed from news articles and the index constructed from social media forums can explain stock return movements. Notably, the investor expectation extracted from social media forums outperforms the expectation extracted from news articles.
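As a rough illustration of the last step described above, per-post positive/negative labels can be aggregated into a daily index. The sketch below is not the authors' construction: the abstract does not state the aggregation formula, so the (positive - negative) / (positive + negative) ratio, the function name `daily_expectation_index`, and the sample dates and labels are all hypothetical.

```python
from collections import defaultdict


def daily_expectation_index(labeled_posts):
    """Aggregate per-post labels into a daily index in [-1, 1].

    labeled_posts: iterable of (date_string, label) pairs, where
    label 1 means "positive" and label 0 means "negative".
    The (pos - neg) / (pos + neg) formula is an illustrative
    assumption, not the formula used in the paper.
    """
    # date -> [negative count, positive count]
    counts = defaultdict(lambda: [0, 0])
    for date, label in labeled_posts:
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")
        counts[date][label] += 1
    # Sort by date string so the result is in chronological order
    # (ISO-formatted dates sort correctly as plain strings).
    return {
        date: (pos - neg) / (pos + neg)
        for date, (neg, pos) in sorted(counts.items())
    }


# Hypothetical classifier output: two trading days of labeled posts.
posts = [
    ("2020-01-02", 1), ("2020-01-02", 1), ("2020-01-02", 0),
    ("2020-01-03", 0), ("2020-01-03", 0),
]
index = daily_expectation_index(posts)
```

A value near 1 indicates a day dominated by positive posts and a value near -1 a day dominated by negative posts. Computing one such series from news articles and another from forum posts would give two indices that can be compared against stock returns.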
Keywords:
Interdisciplinarity, Investor Expectation, Machine Learning, News Media Text, Social Media Text, Stock Market
Acknowledgments
This paper is an extended version of Lee’s dissertation. The authors are grateful for the helpful comments and suggestions from Keun-Yeong Lee, Young-Han Kim, Shu-Chin Lin, Jinyoung Yu, Karam Kim, and Sohee Shin.
∙Juhwa Lee graduated from the School of Business Administration, College of Business & Economics, Chung-Ang University, and holds a master's degree in economics from Sungkyunkwan University. His current research interests are machine learning, big data analysis, financial management, and behavioral finance.
∙Doojin Ryu is a full/tenured professor of economics at Sungkyunkwan University. He graduated from Seoul National University (School of Electrical Engineering) and holds a Ph.D. from KAIST. He was a research fellow at the National Pension Service, an assistant professor at Hankuk University of Foreign Studies, and a full/tenured professor at Chung-Ang University. Prof. Ryu is currently an editor of Investment Analysts Journal (SSCI) and a subject editor of Emerging Markets Review (SSCI), Journal of Multinational Financial Management (SSCI), and Emerging Markets Finance & Trade (SSCI). He is an editorial board member of Journal of Futures Markets (SSCI) and Asian Business & Management (SSCI).