[1] S. Chitta, S. Thota, S. Manoj Yellepeddi, A. Kumar Reddy, and A. K. P. Venkata, “Multimodal Deep Learning: Integrating Vision and Language for Real-World Applications,” Asian J. Multi. Res. Rev., vol. 1, no. 2, pp. 262–282, Nov. 2020, Accessed: Jun. 07, 2025. [Online]. Available: https://ajmrr.org/journal/article/view/211