AHCI RESEARCH GROUP

Publications

Papers published in international journals,
conference and workshop proceedings, and books.

Scientific Publications

2025

Guo, H.; Liu, Z.; Tang, C.; Zhang, X.

An Interactive Framework for Personalized Navigation Based on Metacosmic Cultural Tourism and Large Model Fine-Tuning Journal Article

In: IEEE Access, vol. 13, pp. 81450–81461, 2025, ISSN: 2169-3536.

Tags: Cultural informations, Digital Cultural Heritage, Digital cultural heritages, Digital guide, Fine tuning, fine-tuning, Historical monuments, Language Model, Large language model, Leisure, Metacosmic cultural tourism, Multimodal Interaction, Tourism, Virtual tour

@article{guo_interactive_2025,

title = {An Interactive Framework for Personalized Navigation Based on Metacosmic Cultural Tourism and Large Model Fine-Tuning},

author = {H. Guo and Z. Liu and C. Tang and X. Zhang},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105004059236&doi=10.1109%2fACCESS.2025.3565359&partnerID=40&md5=45d328831c5795fa31e7e033299912b5},

doi = {10.1109/ACCESS.2025.3565359},

issn = {2169-3536},

year  = {2025},

date = {2025-01-01},

journal = {IEEE Access},

volume = {13},

pages = {81450--81461},

abstract = {With the wide application of large language models (LLMs) and the rapid growth of metaverse tourism demand, the digital tour and personalized interaction of historical sites have become the key to improving users’ digital travel experience. Creating an environment where users can access rich cultural information and enjoy personalized, immersive experiences is a crucial issue in the field of digital cultural travel. To this end, we propose a tourism information multimodal generation personalized question-answering interactive framework TIGMI (Tourism Information Generation and Multimodal Interaction) based on LLM fine-tuning, which aims to provide a richer and more in-depth experience for virtual tours of historical monuments. Taking Qutan Temple as an example, the framework combines LLM, retrieval augmented generation (RAG), and auto-prompting engineering techniques to retrieve accurate information related to the historical monument from external knowledge bases and seamlessly integrates it into the generated content. This integration mechanism ensures the accuracy and relevance of the generated answers. Through TIGMI’s LLM-driven command interaction mechanism in the 3D digital scene of Qutan Temple, users are able to interact with the building and scene environment in a personalized and real-time manner, successfully integrating historical and cultural information with modern digital technology. This integration significantly enhances the naturalness of interaction and personalizes the user experience, thereby improving user immersion and information acquisition efficiency. Evaluation results show that TIGMI excels in question-answering and multimodal interactions, significantly enhancing the depth and breadth of services provided by the personalized virtual tour. 
We conclude by addressing the limitations of TIGMI and briefly discuss how future research will focus on further improving the accuracy and user satisfaction of the generated content to adapt to the dynamically changing tourism environment. © 2013 IEEE.},

keywords = {Cultural informations, Digital Cultural Heritage, Digital cultural heritages, Digital guide, Fine tuning, fine-tuning, Historical monuments, Language Model, Large language model, Leisure, Metacosmic cultural tourism, Multimodal Interaction, Tourism, Virtual tour},

pubstate = {published},

tppubtype = {article}

}


Dongye, X.; Weng, D.; Jiang, H.; Tian, Z.; Bao, Y.; Chen, P.

Personalized decision-making for agents in face-to-face interaction in virtual reality Journal Article

In: Multimedia Systems, vol. 31, no. 1, 2025, ISSN: 0942-4962.

Tags: Decision making, Decision-making process, Decisions makings, Design frameworks, Face-to-face interaction, Feed-back based, Fine tuning, Human-agent interaction, Human–agent interaction, Integrated circuit design, Intelligent virtual agents, Language Model, Large language model, Multi agent systems, Multimodal Interaction, Virtual environments, Virtual Reality

@article{dongye_personalized_2025,

title = {Personalized decision-making for agents in face-to-face interaction in virtual reality},

author = {X. Dongye and D. Weng and H. Jiang and Z. Tian and Y. Bao and P. Chen},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85212947825&doi=10.1007%2fs00530-024-01591-7&partnerID=40&md5=d969cd926fdfd241399f2f96dbf42907},

doi = {10.1007/s00530-024-01591-7},

issn = {0942-4962},

year  = {2025},

date = {2025-01-01},

journal = {Multimedia Systems},

volume = {31},

number = {1},

abstract = {Intelligent agents for face-to-face interaction in virtual reality are expected to make decisions and provide appropriate feedback based on the user’s multimodal interaction inputs. Designing the agent’s decision-making process poses a significant challenge owing to the limited availability of multimodal interaction decision-making datasets and the complexities associated with providing personalized interaction feedback to diverse users. To overcome these challenges, we propose a novel design framework that involves generating and labeling symbolic interaction data, pre-training a small-scale real-time decision-making network, collecting personalized interaction data within interactions, and fine-tuning the network using personalized data. We develop a prototype system to demonstrate our design framework, which utilizes interaction distances, head orientations, and hand postures as inputs in virtual reality. The agent is capable of delivering personalized feedback from different users. We evaluate the proposed design framework by demonstrating the utilization of large language models for data labeling, emphasizing reliability and robustness. Furthermore, we evaluate the incorporation of personalized data fine-tuning for decision-making networks within our design framework, highlighting its importance in improving the user interaction experience. The design principles of this framework can be further explored and applied to various domains involving virtual agents. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.},

keywords = {Decision making, Decision-making process, Decisions makings, Design frameworks, Face-to-face interaction, Feed-back based, Fine tuning, Human-agent interaction, Human–agent interaction, Integrated circuit design, Intelligent virtual agents, Language Model, Large language model, Multi agent systems, Multimodal Interaction, Virtual environments, Virtual Reality},

pubstate = {published},

tppubtype = {article}

}