AHCI RESEARCH GROUP
Publications
Papers published in international journals,
proceedings of conferences, workshops and books.
2025
Liu, G.; Du, H.; Wang, J.; Niyato, D.; Kim, D. I.
Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse Journal Article
In: IEEE Transactions on Mobile Computing, 2025, ISSN: 1536-1233.
@article{liu_contract-inspired_2025,
title = {Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse},
author = {G. Liu and H. Du and J. Wang and D. Niyato and D. I. Kim},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105000066834&doi=10.1109%2fTMC.2025.3550815&partnerID=40&md5=3cb5a2143b9ce4ca7f931a60f1bf239c},
doi = {10.1109/TMC.2025.3550815},
issn = {1536-1233},
year = {2025},
date = {2025-01-01},
journal = {IEEE Transactions on Mobile Computing},
abstract = {The rapid advancement of immersive technologies has propelled the development of the Metaverse, where the convergence of virtual and physical realities necessitates the generation of high-quality, photorealistic images to enhance user experience. However, generating these images, especially through Generative Diffusion Models (GDMs), in mobile edge computing environments presents significant challenges due to the limited computing resources of edge devices and the dynamic nature of wireless networks. This paper proposes a novel framework that integrates contract-inspired contest theory, Deep Reinforcement Learning (DRL), and GDMs to optimize image generation in these resource-constrained environments. The framework addresses the critical challenges of resource allocation and semantic data transmission quality by incentivizing edge devices to efficiently transmit high-quality semantic data, which is essential for creating realistic and immersive images. The use of contest and contract theory ensures that edge devices are motivated to allocate resources effectively, while DRL dynamically adjusts to network conditions, optimizing the overall image generation process. Experimental results demonstrate that the proposed approach not only improves the quality of generated images but also achieves superior convergence speed and stability compared to traditional methods. This makes the framework particularly effective for optimizing complex resource allocation tasks in mobile edge Metaverse applications, offering enhanced performance and efficiency in creating immersive virtual environments. © 2002-2012 IEEE.},
keywords = {Contest Theory, Deep learning, Deep reinforcement learning, Diffusion Model, Generative adversarial networks, Generative AI, High quality, Image generation, Image generations, Immersive technologies, Metaverses, Mobile edge computing, Reinforcement Learning, Reinforcement learnings, Resource allocation, Resources allocation, Semantic data, Virtual addresses, Virtual environments, Virtual Reality},
pubstate = {published},
tppubtype = {article}
}
Shen, Y.; Li, B.; Huang, J.; Wang, Z.
GaussianShopVR: Facilitating Immersive 3D Authoring Using Gaussian Splatting in VR Proceedings Article
In: Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW, pp. 1292–1293, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 979-8-3315-1484-6.
@inproceedings{shen_gaussianshopvr_2025,
title = {GaussianShopVR: Facilitating Immersive 3D Authoring Using Gaussian Splatting in VR},
author = {Y. Shen and B. Li and J. Huang and Z. Wang},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105005138672&doi=10.1109%2fVRW66409.2025.00292&partnerID=40&md5=9b644bd19394a289d3027ab9a2dfed6a},
doi = {10.1109/VRW66409.2025.00292},
isbn = {979-8-3315-1484-6},
year = {2025},
date = {2025-01-01},
booktitle = {Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW},
pages = {1292–1293},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Virtual reality (VR) applications require massive high-quality 3D assets to create immersive environments. Generating mesh-based 3D assets typically involves a significant amount of manpower and effort, which makes VR applications less accessible. 3D Gaussian Splatting (3DGS) has attracted much attention for its ability to quickly create digital replicas of real-life scenes and its compatibility with traditional rendering pipelines. However, it remains a challenge to edit 3DGS in a flexible and controllable manner. We propose GaussianShopVR, a system that leverages VR user interfaces to specify target areas to achieve flexible and controllable editing of reconstructed 3DGS. In addition, selected areas can provide 3D information to generative AI models to facilitate the editing. GaussianShopVR integrates object hierarchy management while keeping the backpropagated gradient flow to allow local editing with context information. © 2025 IEEE.},
keywords = {3D authoring, 3D modeling, Digital replicas, Gaussian distribution, Gaussian Splatting editing, Gaussians, Graphical user interfaces, High quality, Immersive, Immersive environment, Interactive computer graphics, Rendering (computer graphics), Rendering pipelines, Splatting, Three dimensional computer graphics, User profile, Virtual Reality, Virtual reality user interface, Virtualization, VR user interface},
pubstate = {published},
tppubtype = {inproceedings}
}
Tong, Y.; Qiu, Y.; Li, R.; Qiu, S.; Heng, P.-A.
MS2Mesh-XR: Multi-Modal Sketch-to-Mesh Generation in XR Environments Proceedings Article
In: Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR, pp. 272–276, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 979-8-3315-2157-8.
@inproceedings{tong_ms2mesh-xr_2025,
title = {MS2Mesh-XR: Multi-Modal Sketch-to-Mesh Generation in XR Environments},
author = {Y. Tong and Y. Qiu and R. Li and S. Qiu and P.-A. Heng},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105000423684&doi=10.1109%2fAIxVR63409.2025.00052&partnerID=40&md5=caeace6850dcbdf8c1fa0441b98fa8d9},
doi = {10.1109/AIxVR63409.2025.00052},
isbn = {979-8-3315-2157-8},
year = {2025},
date = {2025-01-01},
booktitle = {Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR},
pages = {272–276},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {We present MS2Mesh-XR, a novel multimodal sketch-to-mesh generation pipeline that enables users to create realistic 3D objects in extended reality (XR) environments using hand-drawn sketches assisted by voice inputs. In specific, users can intuitively sketch objects using natural hand movements in mid-air within a virtual environment. By integrating voice inputs, we devise ControlNet to infer realistic images based on the drawn sketches and interpreted text prompts. Users can then review and select their preferred image, which is subsequently reconstructed into a detailed 3D mesh using the Convolutional Reconstruction Model. In particular, our proposed pipeline can generate a high-quality 3D mesh in less than 20 seconds, allowing for immersive visualization and manipulation in runtime XR scenes. We demonstrate the practicability of our pipeline through two use cases in XR settings. By leveraging natural user inputs and cutting-edge generative AI capabilities, our approach can significantly facilitate XR-based creative production and enhance user experiences. Our code and demo will be available at: https://yueqiu0911.github.io/MS2Mesh-XR/. © 2025 IEEE.},
keywords = {3D meshes, 3D object, ControlNet, Hand-drawn sketches, Hands movement, High quality, Image-based, immersive visualization, Mesh generation, Multi-modal, Pipeline codes, Realistic images, Three dimensional computer graphics, Virtual environments, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
Mao, H.; Xu, Z.; Wei, S.; Quan, Y.; Deng, N.; Yang, X.
LLM-powered Gaussian Splatting in VR interactions Proceedings Article
In: Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW, pp. 1654–1655, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 979-8-3315-1484-6.
@inproceedings{mao_llm-powered_2025,
title = {LLM-powered Gaussian Splatting in VR interactions},
author = {H. Mao and Z. Xu and S. Wei and Y. Quan and N. Deng and X. Yang},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105005148017&doi=10.1109%2fVRW66409.2025.00472&partnerID=40&md5=ee725f655a37251ff335ad2098d15f22},
doi = {10.1109/VRW66409.2025.00472},
isbn = {979-8-3315-1484-6},
year = {2025},
date = {2025-01-01},
booktitle = {Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW},
pages = {1654–1655},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Recent advances in radiance field rendering, particularly 3D Gaussian Splatting (3DGS), have demonstrated significant potential for VR content creation, offering both high-quality rendering and an efficient production pipeline. However, current physics-based interaction systems for 3DGS are limited to either simplistic, unrealistic simulations or require substantial user input for complex scenes, largely due to the lack of scene comprehension. In this demonstration, we present a highly realistic interactive VR system powered by large language models (LLMs). After object-aware GS reconstruction, we prompt GPT-4o to analyze the physical properties of objects in the scene, which then guide physical simulations that adhere to real-world phenomena. Additionally, we design a GPT-assisted GS inpainting module to complete the areas occluded by manipulated objects. To facilitate rich interaction, we introduce a computationally efficient physical simulation framework through a PBD-based unified interpolation method, which supports various forms of physical interactions. In our research demonstrations, we reconstruct a variety of scenes enhanced by the LLM's understanding, showcasing how our VR system can support complex, realistic interactions without additional manual design or annotation. © 2025 IEEE.},
keywords = {3D Gaussian Splatting, 3D reconstruction, Content creation, Digital elevation model, Gaussians, High quality, Language Model, material analysis, Materials analysis, Physical simulation, Quality rendering, Rendering (computer graphics), Splatting, Virtual Reality, Volume Rendering, VR systems},
pubstate = {published},
tppubtype = {inproceedings}
}
2024
Ling, T.; Yanan, L.; Lei, Z.; Shuzhi, J.; Lixin, H.; Xiaoqun, Y.
Modeling the Competitive Content Service Market from the Perspective of Consumer Preferences: A Game Theory Approach Proceedings Article
In: Int. Conf. Comput. Artif. Intell. Technol., CAIT, pp. 326–332, Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 979-8-3315-3089-1.
@inproceedings{ling_modeling_2024,
title = {Modeling the Competitive Content Service Market from the Perspective of Consumer Preferences: A Game Theory Approach},
author = {T. Ling and L. Yanan and Z. Lei and J. Shuzhi and H. Lixin and Y. Xiaoqun},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105004661601&doi=10.1109%2fCAIT64506.2024.10963074&partnerID=40&md5=005a1cd46af2b7e6613fc521a4ed1c18},
doi = {10.1109/CAIT64506.2024.10963074},
isbn = {979-8-3315-3089-1},
year = {2024},
date = {2024-01-01},
booktitle = {Int. Conf. Comput. Artif. Intell. Technol., CAIT},
pages = {326–332},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Deploying interaction-intensive AI-generated content (AIGC) services on mobile edge networks enables mobile AIGC to deliver personalized, high-quality content efficiently and cost-effectively. These services are designed to automatically generate content based on user inputs or requirements, positioning mobile AIGC as a promising solution for content creation in immersive and dynamic Metaverse environments. However, the current implementation of edge devices as AIGC Service Providers (ASPs) suffers from a lack of incentives, which impedes the sustainable delivery of high-quality edge AIGC services. This paper designs an incentive strategy by modeling the competition among AIGC service providers (ASPs) as a duopoly game, which mathematically describes the competition among ASPs based on their service quality, cost, and prices. In this content service market, social planners issue policies to improve social welfare while providers maximize their profits according to the consumption preferences of users. These consumption preferences are described by a consumer preference model that captures consumers' demands for different QoS services when different content service providers serve them simultaneously. The simulation shows that 1) competition among providers produces greater differences in content service quality than social regulation does; and 2) competition among different ASPs may engender more social welfare. © 2024 IEEE.},
keywords = {Artificial intelligence-generated content, consumer preference model, Consumer preference modeling, Content service market, Content services, Duopoly game, High quality, Service industry, Service markets, Service provider, Service Quality, Social welfare},
pubstate = {published},
tppubtype = {inproceedings}
}
Paweroi, R. M.; Koppen, M.
Framework for Integration of Generative AI into Metaverse Asset Creation Proceedings Article
In: Int. Conf. Intell. Metaverse Technol. Appl., iMETA, pp. 27–33, Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 979-8-3503-5151-4.
@inproceedings{paweroi_framework_2024,
title = {Framework for Integration of Generative AI into Metaverse Asset Creation},
author = {R. M. Paweroi and M. Koppen},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85216024340&doi=10.1109%2fiMETA62882.2024.10808057&partnerID=40&md5=00373291c3d224b53759dc39ed9fd65c},
doi = {10.1109/iMETA62882.2024.10808057},
isbn = {979-8-3503-5151-4},
year = {2024},
date = {2024-01-01},
booktitle = {Int. Conf. Intell. Metaverse Technol. Appl., iMETA},
pages = {27–33},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {The Metaverse, a virtual world, is developing rapidly and is widely used across multiple sectors, with the number of users projected to increase year over year. As metaverse platforms develop, demand for digital asset creation grows, and creating high-quality, diverse 3D digital objects remains challenging. This study proposes a framework for integrating generative AI into metaverse asset creation to produce diverse 3D assets. We study different approaches to asset creation, i.e., generative 3D model-based, generative image projection-based, and generative language script-based. Creators can use this workflow to optimize the creation of 3D assets. Moreover, this study compares the results of generative AI and procedural generation in generating diverse 3D objects. The results show that generative AI can simplify 3D creation and generate more diverse objects. © 2024 IEEE.},
keywords = {3D Asset Creation, 3D Asset Diversity, 3D models, 3d-modeling, Digital assets, Digital Objects, Generative adversarial networks, Generative AI, High quality, Metaverse, Metaverses, Virtual worlds},
pubstate = {published},
tppubtype = {inproceedings}
}
Si, J.; Yang, S.; Song, J.; Son, S.; Lee, S.; Kim, D.; Kim, S.
Generating and Integrating Diffusion Model-Based Panoramic Views for Virtual Interview Platform Proceedings Article
In: IEEE Int. Conf. Artif. Intell. Eng. Technol., IICAIET, pp. 343–348, Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 979-8-3503-8969-2.
@inproceedings{si_generating_2024,
title = {Generating and Integrating Diffusion Model-Based Panoramic Views for Virtual Interview Platform},
author = {J. Si and S. Yang and J. Song and S. Son and S. Lee and D. Kim and S. Kim},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85209663031&doi=10.1109%2fIICAIET62352.2024.10730450&partnerID=40&md5=a52689715ec912c54696948c34fc0263},
doi = {10.1109/IICAIET62352.2024.10730450},
isbn = {979-8-3503-8969-2},
year = {2024},
date = {2024-01-01},
booktitle = {IEEE Int. Conf. Artif. Intell. Eng. Technol., IICAIET},
pages = {343–348},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {This paper presents a new approach to improving virtual interview platforms in education, which are gaining significant attention. This study aims to simplify the complex manual process of equipment setup to enhance the realism and reliability of virtual interviews. To this end, it proposes a method for automatically constructing 3D virtual interview environments using diffusion technology in generative AI. We exploit a diffusion model capable of generating high-quality panoramic images and, via refined text prompts, generate images of interview rooms capable of delivering immersive interview experiences. The resulting imagery is then reconstituted into 3D VR content using the Unity engine, facilitating enhanced interaction and engagement within virtual environments. This research compares and analyzes various methods presented in related work and proposes a new process for efficiently constructing 360-degree virtual environments. When experiencing the virtual environment created using the proposed method through an Oculus Quest 2, users felt a high sense of immersion, similar to an actual interview environment. © 2024 IEEE.},
keywords = {AI, Deep learning, Diffusion, Diffusion Model, Diffusion technology, Digital elevation model, High quality, Manual process, Model-based OPC, New approaches, Panorama, Panoramic views, Virtual environments, Virtual Interview, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
Diaz, T. G.; Lee, X. Y.; Zhuge, H.; Vidyaratne, L.; Sin, G.; Watanabe, T.; Farahat, A.; Gupta, C.
AI+AR based Framework for Guided Visual Equipment Diagnosis Proceedings Article
In: Kulkarni, C. S.; Orchard, M. E. (Ed.): Proc. Annu. Conf. Progn. Health Manag. Soc., PHM, vol. 16, Prognostics and Health Management Society, 2024, ISSN: 2325-0178, ISBN: 978-1-936263-05-9.
@inproceedings{diaz_aiar_2024,
title = {AI+AR based Framework for Guided Visual Equipment Diagnosis},
author = {T. G. Diaz and X. Y. Lee and H. Zhuge and L. Vidyaratne and G. Sin and T. Watanabe and A. Farahat and C. Gupta},
editor = {C. S. Kulkarni and M. E. Orchard},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85210227167&doi=10.36001%2fphmconf.2024.v16i1.3909&partnerID=40&md5=897ac8045a48e2e80aa7522870c2004f},
doi = {10.36001/phmconf.2024.v16i1.3909},
issn = {2325-0178},
isbn = {978-1-936263-05-9},
year = {2024},
date = {2024-01-01},
booktitle = {Proc. Annu. Conf. Progn. Health Manag. Soc., PHM},
volume = {16},
publisher = {Prognostics and Health Management Society},
abstract = {Automated solutions for effective support services, such as failure diagnosis and repair, are crucial to maintaining customer satisfaction and loyalty. However, providing consistent, high-quality, and timely support is a difficult task. In practice, customer support usually requires technicians to perform onsite diagnosis, but service quality is often adversely affected by limited expert technicians, high turnover, and minimal automated tools. To address these challenges, we present a novel solution framework for aiding technicians in performing visual equipment diagnosis. We envision a workflow where the technician reports a failure and prompts the system to automatically generate a diagnostic plan that includes parts, areas of interest, and necessary tasks. The plan is used to guide the technician with augmented reality (AR), while a perception module analyzes and tracks the technician’s actions to recommend next steps. Our framework consists of three components: planning, tracking, and guiding. The planning component automates the creation of a diagnostic plan by querying a knowledge graph (KG). We propose to leverage Large Language Models (LLMs) for the construction of the KG to accelerate the extraction of parts, tasks, and relations from manuals. The tracking component enhances 3D detections by using perception sensors with a 2D nested object detection model. Finally, the guiding component reduces process complexity for technicians by combining 2D models and AR interactions. To validate the framework, we performed multiple studies to: 1) determine an effective prompt method for the LLM to construct the KG; and 2) demonstrate the benefits of our 2D nested object detection model combined with AR. © 2024 Prognostics and Health Management Society. All rights reserved.},
keywords = {Augmented Reality, Automated solutions, Customer loyalty, Customer satisfaction, Customers' satisfaction, Diagnosis, Equipment diagnosis, Failure Diagnosis, Failure repairs, High quality, Knowledge graphs, Language Model, Quality of Service, Query languages, Sales, Support services},
pubstate = {published},
tppubtype = {inproceedings}
}