AHCI RESEARCH GROUP
Publications
Papers published in international journals, conference and workshop proceedings, and books.
OUR RESEARCH
Scientific Publications
How to
Here you can find the complete list of our publications.
You can use the tag cloud to select only the papers dealing with specific research topics.
You can expand the Abstract, Links and BibTeX record for each paper.
2024
Masasi de Oliveira, E. A.; Silva, D. F. C.; Filho, A. R. G.
Improving VR Accessibility Through Automatic 360 Scene Description Using Multimodal Large Language Models (Proceedings Article)
In: ACM Int. Conf. Proc. Ser., pp. 289–293, Association for Computing Machinery, 2024, ISBN: 979-8-4007-0979-1.
Tags: 3D Scene, 3D scenes, Accessibility, Computer simulation languages, Descriptive information, Digital elevation model, Immersive, Language Model, Multi-modal, Multimodal large language model, Multimodal Large Language Models (MLLMs), Scene description, Virtual environments, Virtual Reality, Virtual Reality (VR), Virtual reality technology
@inproceedings{masasi_de_oliveira_improving_2024,
title = {Improving VR Accessibility Through Automatic 360 Scene Description Using Multimodal Large Language Models},
author = {E. A. Masasi de Oliveira and D. F. C. Silva and A. R. G. Filho},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85206580797&doi=10.1145%2f3691573.3691619&partnerID=40&md5=6e80800fce0e6b56679fbcbe982bcfa7},
doi = {10.1145/3691573.3691619},
isbn = {979-8-4007-0979-1},
year = {2024},
date = {2024-01-01},
booktitle = {ACM Int. Conf. Proc. Ser.},
pages = {289–293},
publisher = {Association for Computing Machinery},
abstract = {Advancements in Virtual Reality (VR) technology hold immense promise for enriching immersive experiences. Despite these advancements, there remains a significant gap in addressing accessibility concerns, particularly in automatically providing descriptive information for VR scenes. This paper explores the potential of leveraging Multimodal Large Language Models (MLLMs) to automatically generate text descriptions for 360 VR scenes according to Speech-to-Text (STT) prompts. As a case study, we conduct experiments in educational settings in VR museums, improving dynamic experiences across various contexts. Despite minor challenges in adapting MLLMs to VR scenes, the experiments demonstrate that they can generate high-quality descriptions. Our findings provide insights for enhancing VR experiences and ensuring accessibility for individuals with disabilities or diverse needs. © 2024 Copyright held by the owner/author(s).},
keywords = {3D Scene, 3D scenes, Accessibility, Computer simulation languages, Descriptive information, Digital elevation model, Immersive, Language Model, Multi-modal, Multimodal large language model, Multimodal Large Language Models (MLLMs), Scene description, Virtual environments, Virtual Reality, Virtual Reality (VR), Virtual reality technology},
pubstate = {published},
tppubtype = {inproceedings}
}
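The pipeline this abstract outlines (a spoken visitor prompt transcribed via STT, then sent together with the current 360 scene view to an MLLM that returns a description) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper does not name the models it uses, so OpenAI's Whisper and GPT-4o act as stand-ins, and visitor_question.wav / current_viewport.jpg are hypothetical inputs captured elsewhere in the VR application.

import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def transcribe_prompt(audio_path: str) -> str:
    # STT step: turn the visitor's spoken request into a text prompt.
    with open(audio_path, "rb") as f:
        return client.audio.transcriptions.create(model="whisper-1", file=f).text

def describe_scene(image_path: str, user_prompt: str) -> str:
    # MLLM step: describe the captured view of the 360 scene for the prompt.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4o",  # stand-in MLLM; the paper does not specify a model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "You are an accessibility assistant in a VR museum. "
                         "Describe the scene to answer: " + user_prompt},
                {"type": "image_url",
                 "image_url": {"url": "data:image/jpeg;base64," + image_b64}},
            ],
        }],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    prompt = transcribe_prompt("visitor_question.wav")  # e.g. "What is in front of me?"
    print(describe_scene("current_viewport.jpg", prompt))

The captured frame could equally be an equirectangular snapshot of the whole 360 scene rather than the visitor's current viewport; the paper's VR-museum case study suggests descriptions are generated on demand, once per spoken request.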