The purpose of this study is to investigate the effectiveness of qualitative information extracted from firm’s annual report in predicting corporate credit rating. Qualitative information represented by published reports or management interview has been known as an important source in addition to quantitative information represented by financial values in assigning corporate credit rating in practice. Nevertheless, prior studies have room for further research in that they rarely employed qualitative information in developing prediction model of corporate credit rating.
This study adopted three document vectorization methods, Bag-Of-Words (BOW), Word to Vector (Word2Vec) and Document to Vector (Doc2Vec), to transform an unstructured textual data into a numeric vector, so that Machine Learning (ML) algorithms accept it as an input. For the experiments, we used the corpus of Management’s Discussion and Analysis (MD&A) section in 10-K financial reports as well as financial variables and corporate credit rating data.
Experimental results from a series of multi-class classification experiments show the predictive models trained by both financial variables and vectors extracted from MD&A data outperform the benchmark models trained only by traditional financial variables.
This study proposed a new approach for corporate credit rating prediction by using qualitative information extracted from MD&A documents as an input to ML-based prediction models. Also, this research adopted and compared three textual vectorization methods in the domain of corporate credit rating prediction and showed that BOW mostly outperformed Word2Vec and Doc2Vec.
This research was supported by the Korea Univeristy Business School Research Grant.
Choi, J., Suh, Y. and Jung, N. (2020), "Predicting corporate credit rating based on qualitative information of MD&A transformed using document vectorization techniques", Data Technologies and Applications, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/DTA-08-2019-0127Download as .RIS
Emerald Publishing Limited
Copyright © 2020, Emerald Publishing Limited