AHCI RESEARCH GROUP
Publications
Papers published in international journals, conference and workshop proceedings, and books.
2024
Villalobos, W.; Kumar, Y.; Li, J. J.
The Multilingual Eyes Multimodal Traveler’s App (Proceedings Article)
In: Yang, X.-S.; Sherratt, S.; Dey, N.; Joshi, A. (Ed.): Lecture Notes in Networks and Systems, vol. 1004, pp. 565–575, Springer Science and Business Media Deutschland GmbH, 2024, ISSN: 2367-3370; ISBN: 978-981-97-3304-0.
@inproceedings{villalobos_multilingual_2024,
title = {The Multilingual Eyes Multimodal Traveler’s App},
author = {W. Villalobos and Y. Kumar and J. J. Li},
editor = {X.-S. Yang and S. Sherratt and N. Dey and A. Joshi},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85201104509&doi=10.1007%2f978-981-97-3305-7_45&partnerID=40&md5=91f94aa091c97ec3ad251e07b47fa06e},
doi = {10.1007/978-981-97-3305-7_45},
issn = {2367-3370},
isbn = {978-981-97-3304-0},
year = {2024},
date = {2024-01-01},
booktitle = {Lecture Notes in Networks and Systems},
volume = {1004},
pages = {565–575},
publisher = {Springer Science and Business Media Deutschland GmbH},
abstract = {This paper presents an in-depth analysis of “The Multilingual Eyes Multimodal Traveler’s App” (MEMTA), a novel application in the realm of travel technology, leveraging advanced Artificial Intelligence (AI) capabilities. The core of MEMTA’s innovation lies in its integration of multimodal Large Language Models (LLMs), notably ChatGPT-4-Vision, to enhance navigational assistance and situational awareness for tourists and visually impaired individuals in diverse environments. The study rigorously evaluates how the incorporation of OpenAI’s Whisper and DALL-E 3 technologies augments the app’s proficiency in real-time, multilingual translation, pronunciation, and visual content generation, thereby significantly improving the user experience in various geographical settings. A key focus is placed on the development and impact of a custom GPT model, Susanin, designed specifically for the app, highlighting its advancements in Human-AI interaction and accessibility over standard LLMs. The paper thoroughly explores the practical applications of MEMTA, extending its utility beyond mere travel assistance to sectors such as robotics, virtual reality, and military operations, thus underscoring its multifaceted significance. Through this exploration, the study contributes novel insights into the fields of AI-enhanced travel, assistive technologies, and the broader scope of human-AI interaction. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.},
keywords = {AI in travel, Artificial intelligence in travel, Assistive navigation technologies, Assistive navigation technology, Assistive navigations, Human-AI interaction in tourism, Human-artificial intelligence interaction in tourism, Language Model, Military applications, Military operations, Multi-modal, Multilingual translations, Multimodal large language model, Multimodal LLMs, Navigation technology, Real-time, Real-time multilingual translation, Robots, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}