AHCI RESEARCH GROUP
Publications
Papers published in international journals, proceedings of conferences, workshops, and books.
2025
Chen, J.; Wu, X.; Lan, T.; Li, B.
LLMER: Crafting Interactive Extended Reality Worlds with JSON Data Generated by Large Language Models Journal Article
In: IEEE Transactions on Visualization and Computer Graphics, vol. 31, no. 5, pp. 2715–2724, 2025, ISSN: 1077-2626.
@article{chen_llmer_2025,
title = {LLMER: Crafting Interactive Extended Reality Worlds with JSON Data Generated by Large Language Models},
author = {J. Chen and X. Wu and T. Lan and B. Li},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105003825793&doi=10.1109%2fTVCG.2025.3549549&partnerID=40&md5=da4681d0714548e3a7e0c8c3295d2348},
doi = {10.1109/TVCG.2025.3549549},
issn = {1077-2626},
year = {2025},
date = {2025-01-01},
journal = {IEEE Transactions on Visualization and Computer Graphics},
volume = {31},
number = {5},
pages = {2715–2724},
abstract = {The integration of Large Language Models (LLMs) like GPT-4 with Extended Reality (XR) technologies offers the potential to build truly immersive XR environments that interact with human users through natural language, e.g., generating and animating 3D scenes from audio inputs. However, the complexity of XR environments makes it difficult to accurately extract relevant contextual data and scene/object parameters from an overwhelming volume of XR artifacts. This leads not only to increased costs under pay-per-use models, but also to elevated levels of generation errors. Moreover, existing approaches focusing on coding script generation are often prone to generation errors, resulting in flawed or invalid scripts, application crashes, and ultimately a degraded user experience. To overcome these challenges, we introduce LLMER, a novel framework that creates interactive XR worlds using JSON data generated by LLMs. Unlike prior approaches focusing on coding script generation, LLMER translates natural language inputs into JSON data, significantly reducing both the likelihood of application crashes and the processing latency. It employs a multi-stage strategy to supply only the essential contextual information adapted to the user's request and features multiple modules designed for various XR tasks. Our preliminary user study reveals the effectiveness of the proposed system, with over 80% reduction in consumed tokens and around 60% reduction in task completion time compared to state-of-the-art approaches. The analysis of users' feedback also illuminates a series of directions for further optimization.},
keywords = {% reductions, 3D modeling, algorithm, Algorithms, Augmented Reality, Coding errors, Computer graphics, Computer interaction, computer interface, Computer simulation languages, Extended reality, generative artificial intelligence, human, Human users, human-computer interaction, Humans, Imaging, Immersive, Language, Language Model, Large language model, large language models, Metadata, Natural Language Processing, Natural language processing systems, Natural languages, procedures, Script generation, Spatio-temporal data, Three dimensional computer graphics, Three-Dimensional, three-dimensional imaging, User-Computer Interface, Virtual Reality},
pubstate = {published},
tppubtype = {article}
}
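The LLMER abstract above describes translating natural language into JSON data rather than executable scripts. The minimal Python sketch below is not from the paper: call_llm and the JSON schema are invented placeholders. It illustrates the general idea that the model's output is parsed and validated as data, so a malformed generation degrades gracefully instead of crashing the application.

import json

SYSTEM_PROMPT = (
    "You are a scene generator for an XR application. Respond ONLY with JSON "
    'of the form {"objects": [{"type": str, "position": [x, y, z], "scale": float}]}.'
)

def call_llm(system_prompt: str, user_request: str) -> str:
    """Hypothetical stand-in for any chat-completion client; not a real API."""
    raise NotImplementedError

def build_scene(user_request: str) -> list:
    raw = call_llm(SYSTEM_PROMPT, user_request)
    try:
        data = json.loads(raw)  # structured data instead of executable scripts
    except json.JSONDecodeError:
        return []  # malformed output is dropped, not executed, so it cannot crash the app
    objects = data.get("objects", [])
    # Keep only entries matching the expected schema before handing them to the XR engine.
    return [o for o in objects if isinstance(o, dict) and "type" in o and "position" in o]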
2024
Kang, Z.; Liu, Y.; Zheng, J.; Sun, Z.
Revealing the Difficulty in Jailbreak Defense on Language Models for Metaverse Proceedings Article
In: Gong, Q.; He, X. (Eds.): SocialMeta - Proc. Int. Workshop Soc. Metaverse Comput., Sens. Netw., Part: ACM SenSys, pp. 31–37, Association for Computing Machinery, Inc, 2024, ISBN: 979-8-4007-1299-9.
@inproceedings{kang_revealing_2024,
title = {Revealing the Difficulty in Jailbreak Defense on Language Models for Metaverse},
author = {Z. Kang and Y. Liu and J. Zheng and Z. Sun},
editor = {Q. Gong and X. He},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85212189363&doi=10.1145%2f3698387.3699998&partnerID=40&md5=673326728c3db35ffbbaf807eb7f003c},
doi = {10.1145/3698387.3699998},
isbn = {979-8-4007-1299-9},
year = {2024},
date = {2024-01-01},
booktitle = {SocialMeta - Proc. Int. Workshop Soc. Metaverse Comput., Sens. Netw., Part: ACM SenSys},
pages = {31–37},
publisher = {Association for Computing Machinery, Inc},
abstract = {Large language models (LLMs) have demonstrated exceptional capabilities in natural language processing tasks, fueling innovations in emerging areas such as the metaverse. These models enable dynamic virtual communities, enhancing user interactions and revolutionizing industries. However, their increasing deployment exposes vulnerabilities to jailbreak attacks, where adversaries can manipulate LLM-driven systems to generate harmful content. While various defense mechanisms have been proposed, their efficacy against diverse jailbreak techniques remains unclear. This paper addresses this gap by evaluating the performance of three popular defense methods (Backtranslation, Self-reminder, and Paraphrase) against different jailbreak attack strategies (GCG, BEAST, and Deepinception) across three distinct models. Our findings reveal that while defenses are highly effective against optimization-based jailbreak attacks and reduce the attack success rate by 79% on average, they struggle to defend against attacks that alter attack motivations. Additionally, methods relying on self-reminding perform better when integrated with models featuring robust safety guardrails. For instance, Llama2-7b shows a 100% reduction in Attack Success Rate, while Vicuna-7b and Mistral-7b, lacking safety alignment, exhibit a lower average reduction of 65.8%. This study highlights the challenges in developing universal defense solutions for securing LLMs in dynamic environments like the metaverse. Our results also show that the three models differ in their initial defense performance against different jailbreak attack strategies, underscoring the complexity of effectively securing LLMs.},
keywords = {% reductions, Attack strategies, Computer simulation languages, Defense, Digital elevation model, Guard rails, Jailbreak, Language Model, Large language model, Metaverse Security, Metaverses, Natural languages, Performance, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
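The abstract above reports relative reductions in attack success rate (ASR) for each defense, averaging 79% against optimization-based attacks. The short Python sketch below is not from the paper; the function names and numbers are illustrative assumptions. It shows how such a relative reduction can be computed from undefended and defended ASR measurements.

def attack_success_rate(outcomes):
    """Fraction of jailbreak attempts that succeeded (outcomes is a list of booleans)."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def asr_reduction(asr_undefended, asr_defended):
    """Relative drop in attack success rate once a defense is applied, in percent."""
    if asr_undefended == 0:
        return 0.0
    return 100.0 * (asr_undefended - asr_defended) / asr_undefended

# Hypothetical example: a defense that cuts ASR from 0.50 to 0.10 yields an 80% reduction,
# comparable in spirit to the ~79% average reported for optimization-based attacks.
print(asr_reduction(0.50, 0.10))  # 80.0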