Application of the Latent Dirichlet Allocation method to determine news text topics

Authors

  • Sarif Surorejo, STMIK YMI Tegal, Indonesia
  • M Taufik Fajar Maulana, STMIK YMI Tegal, Indonesia
  • Wresti Andriani, STMIK YMI Tegal, Indonesia
  • Gunawan Gunawan, STMIK YMI Tegal, Indonesia

DOI:

https://doi.org/10.35335/mandiri.v13i1.306

Keywords:

Indonesian News, Latent Dirichlet Allocation, Media Analysis, Text Analysis, Text Mining

Abstract

This study applies the Latent Dirichlet Allocation (LDA) method to determine the topics of news texts, contributing new insights to media content analysis. It aims to develop a model that improves the accuracy and efficiency of topic identification in Indonesian news texts. The study takes a quantitative, experimental approach in which news text data are processed and analyzed with LDA and the resulting model is validated. The results show that the developed model identifies news topics accurately and improves significantly on existing methods. These findings matter for practitioners and researchers in journalism and media analysis: they offer more efficient and effective strategies for managing and understanding large flows of information, and they open new directions for advanced research in news text analysis.
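
The abstract does not say how LDA was implemented, so the following is only a minimal sketch of the general technique, not the authors' pipeline. It fits LDA with collapsed Gibbs sampling, one standard inference method, using the Python standard library alone. The toy tokenized documents, the function names `fit_lda` and `top_words`, and the hyperparameters `alpha`, `beta`, and `iterations` are all assumptions chosen for illustration.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    ranked = []
    for row in topic_word:
        order = sorted(range(len(vocab)), key=lambda v: -row[v])
        ranked.append([vocab[v] for v in order[:n]])
    return ranked


# Hypothetical, already tokenized Indonesian news snippets (illustration only).
docs = [
    ["pemilu", "partai", "presiden", "pemilu", "partai"],
    ["presiden", "partai", "pemilu", "kampanye"],
    ["gol", "liga", "pemain", "gol", "liga"],
    ["pemain", "liga", "gol", "pelatih"],
]

vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"Topic {k}: {words}")
```

Each row of `doc_topic` counts how many tokens of one document were assigned to each topic, so normalizing a row gives a rough topic mixture for that document. A real corpus would also need preprocessing (tokenization, stop-word removal, possibly stemming) and a principled choice of the number of topics, neither of which this sketch covers.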

Published

2024-06-19

How to Cite

Surorejo, S., Maulana, M. T. F., Andriani, W., & Gunawan, G. (2024). Application of the latent Dirichlet allocation method to determine news text topics. Jurnal Mandiri IT, 13(1), 106–115. https://doi.org/10.35335/mandiri.v13i1.306
