Interdisciplinary integration of economics and linguistics: Corpus-based vocabulary modelling for Financial English
Abstract
This study addresses the need for linguistically grounded, domain-sensitive vocabulary modelling in Financial English by combining corpus-linguistic methods with insights from financial discourse analysis. Although financial communication is central to global economic systems, the vocabulary that structures this domain remains under-examined, particularly with respect to how lexical items function across genres such as journalism, corporate reporting, and regulatory texts. The gap is especially evident in English for Specific Purposes (ESP) instruction, where existing materials often lack empirical grounding and fail to reflect the complexity of real-world financial language use. The study aims to develop a corpus-based vocabulary model that captures the semantic, collocational, and discourse functions of Financial English vocabulary. An 878,000-word Financial English Corpus (FEC) was constructed from 220 texts drawn from financial journalism, corporate financial reports, and institutional regulatory documents. The methodological framework comprises frequency analysis, mutual information (MI)-based collocational profiling, semantic tagging with the UCREL USAS system, and qualitative concordance analysis. The results show that core financial terms such as inflation, capital, risk, and yield differ across genres in frequency, collocational partners, and discourse function. For instance, inflation is highly evaluative in regulatory discourse, while capital is linked to strategy in corporate texts. Semantic tagging revealed a significant presence of evaluative and affective language, challenging assumptions of objectivity in financial writing. The study concludes that a corpus-based model offers a replicable and pedagogically valuable approach to specialised vocabulary. The model has practical implications for ESP instruction, terminology management, and financial NLP, and contributes to a more nuanced understanding of how financial meaning is constructed in institutional contexts.
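To make the frequency and MI-based collocational steps named in the abstract concrete, the following is a minimal Python sketch, not the study's actual pipeline. It assumes a pre-tokenised text, a symmetric collocation window, and the common window-adjusted form of the MI score, MI = log2(O / E) with E = f(node) × f(collocate) × window size / N. The function names, the span value, and the toy sentence are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def per_million(count, corpus_size):
    """Normalise a raw frequency to occurrences per million tokens."""
    return count * 1_000_000 / corpus_size


def mi_score(tokens, node, collocate, span=4):
    """Mutual information of `collocate` within +/- `span` tokens of `node`.

    Assumption: expected frequency is adjusted for window size,
    E = f(node) * f(collocate) * (2 * span) / N.
    Returns None when the pair never co-occurs in the window.
    """
    n = len(tokens)
    freq = Counter(tokens)
    observed = 0
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        # Tokens to the left and right of the node, excluding the node itself.
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + span + 1]
        observed += window.count(collocate)
    if observed == 0:
        return None
    expected = freq[node] * freq[collocate] * (2 * span) / n
    return math.log2(observed / expected)


# Toy example (hypothetical data, not drawn from the FEC).
toy = "inflation risk rises as inflation risk persists while capital yield falls".split()
print(per_million(878, 878_000))                    # raw count of 878 in an 878,000-token corpus
print(mi_score(toy, "inflation", "risk", span=1))   # MI for the pair in the toy text
```

Comparing genres would amount to running the same two functions over each sub-corpus separately and comparing the normalised frequencies and the top-ranked collocates of each node word.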