AHCI RESEARCH GROUP
Publications
Papers published in international journals, conference and workshop proceedings, and books.
2025
Liu, G.; Du, H.; Wang, J.; Niyato, D.; Kim, D. I.
Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse Journal Article
In: IEEE Transactions on Mobile Computing, 2025, ISSN: 1536-1233.
@article{liu_contract-inspired_2025,
title = {Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse},
author = {G. Liu and H. Du and J. Wang and D. Niyato and D. I. Kim},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105000066834&doi=10.1109%2fTMC.2025.3550815&partnerID=40&md5=3cb5a2143b9ce4ca7f931a60f1bf239c},
doi = {10.1109/TMC.2025.3550815},
issn = {1536-1233},
year = {2025},
date = {2025-01-01},
journal = {IEEE Transactions on Mobile Computing},
abstract = {The rapid advancement of immersive technologies has propelled the development of the Metaverse, where the convergence of virtual and physical realities necessitates the generation of high-quality, photorealistic images to enhance user experience. However, generating these images, especially through Generative Diffusion Models (GDMs), in mobile edge computing environments presents significant challenges due to the limited computing resources of edge devices and the dynamic nature of wireless networks. This paper proposes a novel framework that integrates contract-inspired contest theory, Deep Reinforcement Learning (DRL), and GDMs to optimize image generation in these resource-constrained environments. The framework addresses the critical challenges of resource allocation and semantic data transmission quality by incentivizing edge devices to efficiently transmit high-quality semantic data, which is essential for creating realistic and immersive images. The use of contest and contract theory ensures that edge devices are motivated to allocate resources effectively, while DRL dynamically adjusts to network conditions, optimizing the overall image generation process. Experimental results demonstrate that the proposed approach not only improves the quality of generated images but also achieves superior convergence speed and stability compared to traditional methods. This makes the framework particularly effective for optimizing complex resource allocation tasks in mobile edge Metaverse applications, offering enhanced performance and efficiency in creating immersive virtual environments. © 2002-2012 IEEE.},
keywords = {Contest Theory, Deep learning, Deep reinforcement learning, Diffusion Model, Generative adversarial networks, Generative AI, High quality, Image generation, Image generations, Immersive technologies, Metaverses, Mobile edge computing, Reinforcement Learning, Reinforcement learnings, Resource allocation, Resources allocation, Semantic data, Virtual addresses, Virtual environments, Virtual Reality},
pubstate = {published},
tppubtype = {article}
}
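The abstract does not spell out the contest mechanism, but its core idea can be illustrated with a standard Tullock contest success function: each edge device commits effort (e.g., resources spent on transmitting high-quality semantic data) and receives a share of a prize proportional to that effort, minus its cost. The Python sketch below is a minimal illustration under that assumption, not the authors' implementation; all names and parameter values are hypothetical.

# Minimal Tullock-style contest sketch: edge devices commit "effort"
# and split a prize in proportion to effort^r. Illustrative only;
# the paper's actual mechanism may differ.

def contest_payoffs(efforts, prize=100.0, r=1.0, unit_cost=0.5):
    """Return each device's payoff: prize share minus effort cost."""
    weights = [e ** r for e in efforts]
    total = sum(weights)
    if total == 0:
        return [0.0] * len(efforts)
    return [prize * w / total - unit_cost * e
            for w, e in zip(weights, efforts)]

if __name__ == "__main__":
    # Three edge devices committing different resource levels.
    efforts = [10.0, 25.0, 40.0]
    for device, payoff in enumerate(contest_payoffs(efforts)):
        print(f"device {device}: effort={efforts[device]:.0f}, payoff={payoff:.2f}")

In the paper's setting, a DRL agent would presumably tune quantities such as the prize against observed network conditions; that loop is omitted here.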
Kim, Y.; Aamir, Z.; Singh, M.; Boorboor, S.; Mueller, K.; Kaufman, A. E.
Explainable XR: Understanding User Behaviors of XR Environments Using LLM-Assisted Analytics Framework Journal Article
In: IEEE Transactions on Visualization and Computer Graphics, vol. 31, no. 5, pp. 2756–2766, 2025, ISSN: 1077-2626.
@article{kim_explainable_2025,
title = {Explainable XR: Understanding User Behaviors of XR Environments Using LLM-Assisted Analytics Framework},
author = {Y. Kim and Z. Aamir and M. Singh and S. Boorboor and K. Mueller and A. E. Kaufman},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105003815583&doi=10.1109%2fTVCG.2025.3549537&partnerID=40&md5=1085b698db06656985f80418cb37b773},
doi = {10.1109/TVCG.2025.3549537},
issn = {1077-2626},
year = {2025},
date = {2025-01-01},
journal = {IEEE Transactions on Visualization and Computer Graphics},
volume = {31},
number = {5},
pages = {2756–2766},
abstract = {We present Explainable XR, an end-to-end framework for analyzing user behavior in diverse eXtended Reality (XR) environments by leveraging Large Language Models (LLMs) for data interpretation assistance. Existing XR user analytics frameworks face challenges in handling cross-virtuality (AR, VR, MR) transitions, multi-user collaborative application scenarios, and the complexity of multimodal data. Explainable XR addresses these challenges by providing a virtuality-agnostic solution for the collection, analysis, and visualization of immersive sessions. We propose three main components in our framework: (1) a novel user data recording schema, called User Action Descriptor (UAD), that can capture the users' multimodal actions, along with their intents and contexts; (2) a platform-agnostic XR session recorder; and (3) a visual analytics interface that offers LLM-assisted insights tailored to the analysts' perspectives, facilitating the exploration and analysis of the recorded XR session data. We demonstrate the versatility of Explainable XR through five use-case scenarios, in both individual and collaborative XR applications across virtualities. Our technical evaluation and user studies show that Explainable XR provides a highly usable analytics solution for understanding user actions and delivering multifaceted, actionable insights into user behaviors in immersive environments. © 1995-2012 IEEE.},
keywords = {adult, Agnostic, Article, Assistive, Cross Reality, Data Analytics, Data collection, data interpretation, Data recording, Data visualization, Extended reality, human, Language Model, Large language model, large language models, Multi-modal, Multimodal Data Collection, normal human, Personalized assistive technique, Personalized Assistive Techniques, recorder, Spatio-temporal data, therapy, user behavior, User behaviors, Virtual addresses, Virtual environments, Virtual Reality, Visual analytics, Visual languages},
pubstate = {published},
tppubtype = {article}
}
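The User Action Descriptor (UAD) is described only at a high level in the abstract (multimodal actions plus intents and contexts). As a rough illustration, a record of that shape might look like the following Python dataclass; the field names are assumptions inferred from the abstract, not the published schema.

# Hypothetical shape for a User Action Descriptor (UAD) record.
# Field names are illustrative guesses, not the published schema.
from dataclasses import dataclass, field
from typing import Any

@dataclass
class UserActionDescriptor:
    session_id: str
    actor_id: str            # supports multi-user sessions
    timestamp: float         # seconds since session start
    virtuality: str          # "AR", "VR", or "MR"
    action_type: str         # e.g., "gaze", "grab", "teleport", "speech"
    intent: str              # annotated purpose of the action
    context: dict[str, Any]  # scene object, location, collaborators
    payload: dict[str, Any] = field(default_factory=dict)  # raw modality data

record = UserActionDescriptor(
    session_id="s01", actor_id="userA", timestamp=12.4,
    virtuality="MR", action_type="grab", intent="inspect artifact",
    context={"object": "vase_03", "position": [0.2, 1.1, -0.7]},
)
print(record.action_type, record.context["object"])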
Kurai, R.; Hiraki, T.; Hiroi, Y.; Hirao, Y.; Perusquia-Hernandez, M.; Uchiyama, H.; Kiyokawa, K.
MagicItem: Dynamic Behavior Design of Virtual Objects With Large Language Models in a Commercial Metaverse Platform Journal Article
In: IEEE Access, vol. 13, pp. 19132–19143, 2025, ISSN: 2169-3536.
@article{kurai_magicitem_2025,
title = {MagicItem: Dynamic Behavior Design of Virtual Objects With Large Language Models in a Commercial Metaverse Platform},
author = {R. Kurai and T. Hiraki and Y. Hiroi and Y. Hirao and M. Perusquia-Hernandez and H. Uchiyama and K. Kiyokawa},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85216011970&doi=10.1109%2fACCESS.2025.3530439&partnerID=40&md5=7a33b9618af8b4ab79b43fb3bd4317cf},
doi = {10.1109/ACCESS.2025.3530439},
issn = {2169-3536},
year = {2025},
date = {2025-01-01},
journal = {IEEE Access},
volume = {13},
pages = {19132–19143},
abstract = {To create rich experiences in virtual reality (VR) environments, it is essential to define the behavior of virtual objects through programming. However, programming in 3D spaces requires a wide range of background knowledge and programming skills. Although Large Language Models (LLMs) have provided programming support, they are still primarily aimed at programmers. In metaverse platforms, where many users inhabit VR spaces, most users are unfamiliar with programming, making it difficult for them to easily modify the behavior of objects in the VR environment. Existing LLM-based script generation methods for VR spaces require multiple lengthy iterations to implement the desired behaviors and are difficult to integrate into the operation of metaverse platforms. To address this issue, we propose a tool that generates behaviors for objects in VR spaces from natural language within Cluster, a metaverse platform with a large user base. By integrating LLMs with the Cluster Script provided by this platform, we enable users with limited programming experience to freely define object behaviors within the platform. We integrated our tool into a commercial metaverse platform and conducted online experiments with 63 general users of the platform. The experiments show that even users with no programming background can successfully generate behaviors for objects in VR spaces and find the resulting system highly satisfying. Our research contributes to democratizing VR content creation by enabling non-programmers to design dynamic behaviors for virtual objects in metaverse platforms. © 2013 IEEE.},
keywords = {Behavior design, Code programming, Computer simulation languages, Dynamic behaviors, Language Model, Large-language model, Low-code programming, Metaverse platform, Metaverses, Virtual addresses, Virtual environments, Virtual objects, Virtual Reality, Virtual-reality environment},
pubstate = {published},
tppubtype = {article}
}
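The pipeline in the abstract is natural language in, Cluster Script out. A minimal sketch of that flow is shown below, with call_llm standing in for whatever LLM client the authors actually used; the prompt wording and the validation step are assumptions, not the paper's implementation.

# Sketch of the natural-language -> object-behavior flow described in
# the abstract. call_llm is a placeholder for a real LLM API client.

SYSTEM_PROMPT = (
    "You write Cluster Script for the Cluster metaverse platform. "
    "Given a description of an object's desired behavior, output only "
    "the script, with no commentary."
)

def call_llm(system: str, user: str) -> str:
    raise NotImplementedError("stand-in for a real LLM API call")

def behavior_script(description: str) -> str:
    """Turn a user's plain-language request into an object script."""
    script = call_llm(SYSTEM_PROMPT, description)
    # A production tool would validate and sandbox the generated script
    # before attaching it to the in-world object; elided here.
    return script

# Example request a non-programmer might type in-world:
# behavior_script("When a player touches this lamp, make it glow for 5 seconds")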
2024
Ma, H.; Yao, X.; Wang, X.
Metaverses for Parallel Transportation: From General 3D Traffic Environment Construction to Virtual-Real I2TS Management and Control Proceedings Article
In: Proceedings of the IEEE International Conference on Digital Twins and Parallel Intelligence (DTPI), pp. 598–603, Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 979-8-3503-4925-2.
@inproceedings{ma_metaverses_2024,
title = {Metaverses for Parallel Transportation: From General 3D Traffic Environment Construction to Virtual-Real I2TS Management and Control},
author = {H. Ma and X. Yao and X. Wang},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85214916181&doi=10.1109%2fDTPI61353.2024.10778876&partnerID=40&md5=94a6bf4b06a2a45f7c483936beee840f},
doi = {10.1109/DTPI61353.2024.10778876},
isbn = {979-8-3503-4925-2},
year = {2024},
date = {2024-01-01},
booktitle = {Proceedings of the IEEE International Conference on Digital Twins and Parallel Intelligence (DTPI)},
pages = {598–603},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Metaverse technologies have enabled the creation of highly realistic artificial traffic systems via real-time multi-source data fusion, while generative artificial intelligence (GAI) has facilitated the construction of large-scale traffic scenarios and the evaluation of strategies. This integration allows for the modeling of traffic environments that blend virtual and real-world interactions, providing digital proving grounds for the management and control (M&C) of intelligent transportation systems (ITS). This paper comprehensively reviews the evolution of traffic modeling tools, from traditional 2D and 3D traffic simulations to the construction of generative 3D traffic environments based on digital twin (DT) technologies and the metaverse. Furthermore, to address the challenges posed by social diversity and uncertainty in mixed traffic, as well as the limitations of traditional methods, we propose a virtual-real interaction M&C strategy based on GAI. This strategy integrates the metaverse into parallel traffic systems (PTS), enabling bidirectional interaction and collaboration between virtual and physical environments. Through specific case studies, this research demonstrates the potential of combining the metaverse with PTS to enhance the efficiency of mixed traffic systems. © 2024 IEEE.},
keywords = {Advanced traffic management systems, Data fusion, generative artificial intelligence, Highway administration, Information Management, Intelligent transportation systems, Interactive Intelligent Transportation System, Metaverses, Mixed Traffic, Parallel Traffic System, Social Diversity and Uncertainty, Traffic control, Traffic Metaverse, Traffic systems, Uncertainty, Virtual addresses, Virtual environments},
pubstate = {published},
tppubtype = {inproceedings}
}
Sehgal, V.; Sekaran, N.
Virtual Recording Generation Using Generative AI and Carla Simulator Proceedings Article
In: SAE Technical Papers, SAE International, 2024, ISSN: 0148-7191.
@inproceedings{sehgal_virtual_2024,
title = {Virtual Recording Generation Using Generative AI and Carla Simulator},
author = {V. Sehgal and N. Sekaran},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85213320680&doi=10.4271%2f2024-28-0261&partnerID=40&md5=37a924cf9beda31f2c23b3a2cdf575d2},
doi = {10.4271/2024-28-0261},
issn = {0148-7191},
year = {2024},
date = {2024-01-01},
booktitle = {SAE Technical Papers},
publisher = {SAE International},
abstract = {To establish and validate new systems incorporated into next-generation vehicles, it is important to understand the actual scenarios that autonomous vehicles are likely to encounter, and therefore to run Field Operational Tests (FOT). FOT is undertaken with many vehicles over large acquisition areas, ensuring the capability and suitability of a continuous function and thus guaranteeing the randomization of test conditions. Capturing FOT and use-case scenario recordings (a use case is a software testing technique designed to ensure that the system under test meets and exceeds the stakeholders' expectations) is very expensive, due to the amount of material required (vehicles, measurement equipment and objectives, headcount, data storage capacity and complexity, trained drivers and professionals); a robust working vehicle setup is not always available; mileage is directly proportional to time; and recording cannot be scaled up due to physical limitations. During the early development phase, ground truth data is not available, and data reused from other projects may not fully match current project requirements. Not all event scenarios and weather conditions can be ensured during recording capture. In such cases, synthetic/virtual recordings, which can accurately mimic real conditions on a test bench, address the constraints mentioned above. Car Learning to Act (CARLA) [1], an open-source driving simulator used for the development, training, and validation of autonomous driving systems, is extended for the generation of synthetic/virtual data and recordings by integrating Generative Artificial Intelligence (Gen AI), particularly Generative Adversarial Networks (GANs) [2] and Retrieval Augmented Generation (RAG) [3], which are deep learning models. The process of creating synthetic data using vehicle models becomes more efficient and reliable, as Gen AI can hold and reproduce much more data in scenario development than a developer or tester. A Large Language Model (LLM) [4] takes user prompts as input and generates scenarios that are used to produce a vast amount of high-quality, distinct, and realistic driving scenarios that closely resemble real-world driving data. Gen AI [5] empowers the user to generate not only dynamic environment conditions (such as different weather and lighting conditions) but also dynamic elements such as the behavior of other vehicles and pedestrians. Synthetic/virtual recordings [6] generated using Gen AI can be used to train and validate virtual vehicle models and FOT/use-case data, which indirectly demonstrates the real-world performance of tasks such as object detection, object recognition, image segmentation, and decision-making in autonomous vehicles. Augmenting the LLM with CARLA involves training generative models on real-world driving data using RAG, which allows the model to generate new, synthetic instances that resemble real-world conditions and scenarios. © 2024 SAE International. All Rights Reserved.},
keywords = {Access control, Air cushion vehicles, Associative storage, Augmented Reality, Automobile driver simulators, Automobile drivers, Automobile simulators, Automobile testing, Autonomous Vehicles, benchmarking, Computer testing, Condition, Continuous functions, Dynamic random access storage, Formal concept analysis, HDCP, Language Model, Luminescent devices, Network Security, Operational test, Operational use, Problem oriented languages, Randomisation, Real-world drivings, Sailing vessels, Ships, Test condition, UNIX, Vehicle modelling, Virtual addresses},
pubstate = {published},
tppubtype = {inproceedings}
}
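One concrete piece of such a pipeline, applying a scenario's environment conditions to a running simulator, can be sketched against CARLA's Python API. The condition values below are hard-coded where the paper's pipeline would obtain them from an LLM, and a CARLA server on localhost:2000 is assumed; this is an illustrative sketch, not the authors' code.

# Minimal sketch: apply a scenario's environment conditions in CARLA.
# In the paper's pipeline these values would come from an LLM prompt;
# here they are hard-coded. Assumes a CARLA server on localhost:2000.
import carla

def apply_conditions(world: carla.World, conditions: dict) -> None:
    """Set weather from a plain dict of WeatherParameters fields."""
    weather = world.get_weather()
    for name, value in conditions.items():
        setattr(weather, name, value)
    world.set_weather(weather)

def main() -> None:
    client = carla.Client("localhost", 2000)
    client.set_timeout(10.0)
    world = client.get_world()
    # Stand-in for an LLM-generated scenario: heavy rain at dusk.
    apply_conditions(world, {
        "cloudiness": 90.0,
        "precipitation": 80.0,
        "precipitation_deposits": 60.0,  # puddles on the road
        "sun_altitude_angle": 5.0,       # low sun / dusk
    })

if __name__ == "__main__":
    main()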
Chen, X.; Gao, W.; Chu, Y.; Song, Y.
Enhancing interaction in virtual-real architectural environments: A comparative analysis of generative AI-driven reality approaches Journal Article
In: Building and Environment, vol. 266, 2024, ISSN: 0360-1323.
@article{chen_enhancing_2024,
title = {Enhancing interaction in virtual-real architectural environments: A comparative analysis of generative AI-driven reality approaches},
author = {X. Chen and W. Gao and Y. Chu and Y. Song},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85205298350&doi=10.1016%2fj.buildenv.2024.112113&partnerID=40&md5=8c7d4f5477e25b021dfc5e013a851620},
doi = {10.1016/j.buildenv.2024.112113},
issn = {0360-1323},
year = {2024},
date = {2024-01-01},
journal = {Building and Environment},
volume = {266},
abstract = {The architectural environment is expanding into digital, virtual, and informational dimensions, introducing challenges in virtual-real space interaction. Traditional design methods struggle with real-time interaction, integration with existing workflows, and rapid space modification. To address these issues, we present a generative design method that enables symbiotic interaction between virtual and real spaces using Mixed Reality (MR) and Generative Artificial Intelligence (AI) technologies. We developed two approaches: one using the Rhino modeling platform and the other based on the Unity3D game engine, tailored to different application needs. User experience testing in exhibition, leisure, and residential spaces evaluated our method's effectiveness. Results showed significant improvements in design flexibility, interactive efficiency, and user satisfaction. In the exhibition scenario, the Unity3D-based method excelled in rapid design modifications and immersive experiences. Questionnaire data indicated that MR offers good visual comfort and higher immersion than VR, effectively supporting architects in interface and scale design. Clustering analysis of participants' position and gaze data revealed diverse behavioral patterns in the virtual-physical exhibition space, providing insights for optimizing spatial layouts and interaction methods. Our findings suggest that the generative AI-driven MR method simplifies traditional design processes by enabling real-time modification and interaction with spatial interfaces through simple verbal and motion interactions. This approach streamlines workflows by reducing steps like measuring, modeling, and rendering, while enhancing user engagement and creativity. Overall, this method offers new possibilities for experiential exhibition and architectural design, contributing to future environments where virtual and real spaces coexist seamlessly. © 2024},
keywords = {Architectural design, Architectural environment, Architectural environments, Artificial intelligence, cluster analysis, Comparative analyzes, comparative study, Computational design, Generative adversarial networks, Generative AI, generative artificial intelligence, Mixed reality, Real time interactions, Real-space, Unity3d, Virtual addresses, Virtual environments, Virtual Reality, Virtual spaces, Work-flows},
pubstate = {published},
tppubtype = {article}
}