AHCI RESEARCH GROUP
Publications
Papers published in international journals,
proceedings of conferences, workshops and books.
OUR RESEARCH
Scientific Publications
How to
You can use the tag cloud to select only the papers dealing with specific research topics.
You can expand the Abstract, Links and BibTeX record for each paper.
2025
Tong, Y.; Qiu, Y.; Li, R.; Qiu, S.; Heng, P. -A.
MS2Mesh-XR: Multi-Modal Sketch-to-Mesh Generation in XR Environments Proceedings Article
In: Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR, pp. 272–276, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 979-833152157-8 (ISBN).
Abstract | Links | BibTeX | Tags: 3D meshes, 3D object, ControlNet, Hand-drawn sketches, Hands movement, High quality, Image-based, immersive visualization, Mesh generation, Multi-modal, Pipeline codes, Realistic images, Three dimensional computer graphics, Virtual environments, Virtual Reality
@inproceedings{tong_ms2mesh-xr_2025,
  title     = {{MS2Mesh-XR}: Multi-Modal Sketch-to-Mesh Generation in {XR} Environments},
  author    = {Tong, Y. and Qiu, Y. and Li, R. and Qiu, S. and Heng, P.-A.},
  url       = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105000423684&doi=10.1109%2fAIxVR63409.2025.00052&partnerID=40&md5=caeace6850dcbdf8c1fa0441b98fa8d9},
  doi       = {10.1109/AIxVR63409.2025.00052},
  isbn      = {979-833152157-8},
  year      = {2025},
  date      = {2025-01-01},
  booktitle = {Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR},
  pages     = {272--276},
  publisher = {Institute of Electrical and Electronics Engineers Inc.},
  abstract  = {We present MS2Mesh-XR, a novel multimodal sketch-to-mesh generation pipeline that enables users to create realistic 3D objects in extended reality (XR) environments using hand-drawn sketches assisted by voice inputs. In specific, users can intuitively sketch objects using natural hand movements in mid-air within a virtual environment. By integrating voice inputs, we devise ControlNet to infer realistic images based on the drawn sketches and interpreted text prompts. Users can then review and select their preferred image, which is subsequently reconstructed into a detailed 3D mesh using the Convolutional Reconstruction Model. In particular, our proposed pipeline can generate a high-quality 3D mesh in less than 20 seconds, allowing for immersive visualization and manipulation in runtime XR scenes. We demonstrate the practicability of our pipeline through two use cases in XR settings. By leveraging natural user inputs and cutting-edge generative AI capabilities, our approach can significantly facilitate XR-based creative production and enhance user experiences. Our code and demo will be available at: https://yueqiu0911.github.io/MS2Mesh-XR/. © 2025 IEEE.},
  keywords  = {3D meshes, 3D object, ControlNet, Hand-drawn sketches, Hands movement, High quality, Image-based, immersive visualization, Mesh generation, Multi-modal, Pipeline codes, Realistic images, Three dimensional computer graphics, Virtual environments, Virtual Reality},
  pubstate  = {published},
  tppubtype = {inproceedings}
}
Leininger, P.; Weber, C. J.; Rothe, S.
Understanding Creative Potential and Use Cases of AI-Generated Environments for Virtual Film Productions: Insights from Industry Professionals Proceedings Article
In: IMX - Proc. ACM Int. Conf. Interact. Media Experiences, pp. 60–78, Association for Computing Machinery, Inc, 2025, ISBN: 979-840071391-0 (ISBN).
Abstract | Links | BibTeX | Tags: 3-D environments, 3D reconstruction, 3D Scene Reconstruction, 3d scenes reconstruction, AI-generated 3d environment, AI-Generated 3D Environments, Computer interaction, Creative Collaboration, Creatives, Digital content creation, Digital Content Creation., Filmmaking workflow, Filmmaking Workflows, Gaussian distribution, Gaussian Splatting, Gaussians, Generative AI, Graphical user interface, Graphical User Interface (GUI), Graphical user interfaces, Human computer interaction, human-computer interaction, Human-Computer Interaction (HCI), Immersive, Immersive Storytelling, Interactive computer graphics, Interactive computer systems, Interactive media, Mesh generation, Previsualization, Real-Time Rendering, Splatting, Three dimensional computer graphics, Virtual production, Virtual Production (VP), Virtual Reality, Work-flows
@inproceedings{leininger_understanding_2025,
  title     = {Understanding Creative Potential and Use Cases of {AI}-Generated Environments for Virtual Film Productions: Insights from Industry Professionals},
  author    = {Leininger, P. and Weber, C. J. and Rothe, S.},
  url       = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105007976841&doi=10.1145%2f3706370.3727853&partnerID=40&md5=0d4cf7a2398d12d04e4f0ab182474a10},
  doi       = {10.1145/3706370.3727853},
  isbn      = {979-840071391-0},
  year      = {2025},
  date      = {2025-01-01},
  booktitle = {IMX - Proc. ACM Int. Conf. Interact. Media Experiences},
  pages     = {60--78},
  publisher = {Association for Computing Machinery, Inc},
  abstract  = {Virtual production (VP) is transforming filmmaking by integrating real-time digital elements with live-action footage, offering new creative possibilities and streamlined workflows. While industry experts recognize AI's potential to revolutionize VP, its practical applications and value across different production phases and user groups remain underexplored. Building on initial research into generative and data-driven approaches, this paper presents the first systematic pilot study evaluating three types of AI-generated 3D environments - Depth Mesh, 360° Panoramic Meshes, and Gaussian Splatting - through the participation of 15 filmmaking professionals from diverse roles. Unlike commonly used 2D AI-generated visuals, our approach introduces navigable 3D environments that offer greater control and flexibility, aligning more closely with established VP workflows. Through expert interviews and literature research, we developed evaluation criteria to assess their usefulness beyond concept development, extending to previsualization, scene exploration, and interdisciplinary collaboration. Our findings indicate that different environments cater to distinct production needs, from early ideation to detailed visualization. Gaussian Splatting proved effective for high-fidelity previsualization, while 360° Panoramic Meshes excelled in rapid concept ideation. Despite their promise, challenges such as limited interactivity and customization highlight areas for improvement. Our prototype, EnVisualAIzer, built in Unreal Engine 5, provides an accessible platform for diverse filmmakers to engage with AI-generated environments, fostering a more inclusive production process. By lowering technical barriers, these environments have the potential to make advanced VP tools more widely available. This study offers valuable insights into the evolving role of AI in VP and sets the stage for future research and development. © 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.},
  keywords  = {3-D environments, 3D reconstruction, 3D Scene Reconstruction, 3d scenes reconstruction, AI-generated 3d environment, AI-Generated 3D Environments, Computer interaction, Creative Collaboration, Creatives, Digital content creation, Digital Content Creation., Filmmaking workflow, Filmmaking Workflows, Gaussian distribution, Gaussian Splatting, Gaussians, Generative AI, Graphical user interface, Graphical User Interface (GUI), Graphical user interfaces, Human computer interaction, human-computer interaction, Human-Computer Interaction (HCI), Immersive, Immersive Storytelling, Interactive computer graphics, Interactive computer systems, Interactive media, Mesh generation, Previsualization, Real-Time Rendering, Splatting, Three dimensional computer graphics, Virtual production, Virtual Production (VP), Virtual Reality, Work-flows},
  pubstate  = {published},
  tppubtype = {inproceedings}
}
2024
Weng, S. C. -C.; Chiou, Y. -M.; Do, E. Y. -L.
Dream Mesh: A Speech-to-3D Model Generative Pipeline in Mixed Reality Proceedings Article
In: Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR, pp. 345–349, Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 979-835037202-1 (ISBN).
Abstract | Links | BibTeX | Tags: 3D content, 3D modeling, 3D models, 3d-modeling, Augmented Reality, Digital assets, Generative AI, generative artificial intelligence, Intelligence models, Mesh generation, Mixed reality, Modeling, Speech-to-3D, Text modeling, Three dimensional computer graphics, User interfaces
@inproceedings{weng_dream_2024,
  title     = {{Dream Mesh}: A {Speech-to-3D} Model Generative Pipeline in Mixed Reality},
  author    = {Weng, S. C.-C. and Chiou, Y.-M. and Do, E. Y.-L.},
  url       = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85187218106&doi=10.1109%2fAIxVR59861.2024.00059&partnerID=40&md5=5bfe206e841f23de6458f88a0824bd4d},
  doi       = {10.1109/AIxVR59861.2024.00059},
  isbn      = {979-835037202-1},
  year      = {2024},
  date      = {2024-01-01},
  booktitle = {Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR},
  pages     = {345--349},
  publisher = {Institute of Electrical and Electronics Engineers Inc.},
  abstract  = {Generative Artificial Intelligence (AI) models have risen to prominence due to their unparalleled ability to craft and generate digital assets, encompassing text, images, audio, video, and 3D models. Leveraging the capabilities of diffusion models, such as Stable Diffusion and Instruct pix2pix, users can guide AI with specific prompts, streamlining the creative journey for graphic designers. However, the primary application of these models has been to graphic content within desktop interfaces, prompting professionals in interior and architectural design to seek more tailored solutions for their daily operations. To bridge this gap, Augmented Reality (AR) and Mixed Reality (MR) technologies offer a promising solution, transforming traditional 2D artworks into engaging 3D interactive realms. In this paper, we present "Dream Mesh,"a MR application MR tool that combines a Speech-to-3D generative workflow besed on DreamFusion model without relying on pre-existing 3D content libraries. This innovative system empowers users to express 3D content needs through natural language input, promising transformative potential in real-time 3D content creation and an enhanced MR user experience. © 2024 IEEE.},
  keywords  = {3D content, 3D modeling, 3D models, 3d-modeling, Augmented Reality, Digital assets, Generative AI, generative artificial intelligence, Intelligence models, Mesh generation, Mixed reality, Modeling, Speech-to-3D, Text modeling, Three dimensional computer graphics, User interfaces},
  pubstate  = {published},
  tppubtype = {inproceedings}
}
Scott, A. J. S.; McCuaig, F.; Lim, V.; Watkins, W.; Wang, J.; Strachan, G.
Revolutionizing Nurse Practitioner Training: Integrating Virtual Reality and Large Language Models for Enhanced Clinical Education Proceedings Article
In: G., Strudwick; N.R., Hardiker; G., Rees; R., Cook; R., Cook; Y.J., Lee (Ed.): Stud. Health Technol. Informatics, pp. 671–672, IOS Press BV, 2024, ISBN: 09269630 (ISSN); 978-164368527-4 (ISBN).
Abstract | Links | BibTeX | Tags: 3D modeling, 3D models, 3d-modeling, adult, anamnesis, clinical decision making, clinical education, Clinical Simulation, Computational Linguistics, computer interface, Computer-Assisted Instruction, conference paper, Curriculum, Decision making, E-Learning, Education, Health care education, Healthcare Education, human, Humans, Language Model, Large language model, large language models, Mesh generation, Model animations, Modeling languages, nurse practitioner, Nurse Practitioners, Nursing, nursing education, nursing student, OSCE preparation, procedures, simulation, Teaching, therapy, Training, Training program, User-Computer Interface, Virtual Reality, Virtual reality training
@inproceedings{scott_revolutionizing_2024,
  title     = {Revolutionizing Nurse Practitioner Training: Integrating Virtual Reality and Large Language Models for Enhanced Clinical Education},
  author    = {Scott, A. J. S. and McCuaig, F. and Lim, V. and Watkins, W. and Wang, J. and Strachan, G.},
  editor    = {Strudwick, G. and Hardiker, N. R. and Rees, G. and Cook, R. and Cook, R. and Lee, Y. J.},
  url       = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85199593781&doi=10.3233%2fSHTI240272&partnerID=40&md5=90c7bd43ba978f942723e6cf1983ffb3},
  doi       = {10.3233/SHTI240272},
  isbn      = {978-164368527-4},
  issn      = {0926-9630},
  year      = {2024},
  date      = {2024-01-01},
  booktitle = {Stud. Health Technol. Informatics},
  volume    = {315},
  pages     = {671--672},
  publisher = {IOS Press BV},
  abstract  = {This project introduces an innovative virtual reality (VR) training program for student Nurse Practitioners, incorporating advanced 3D modeling, animation, and Large Language Models (LLMs). Designed to simulate realistic patient interactions, the program aims to improve communication, history taking, and clinical decision-making skills in a controlled, authentic setting. This abstract outlines the methods, results, and potential impact of this cutting-edge educational tool on nursing education. © 2024 The Authors.},
  keywords  = {3D modeling, 3D models, 3d-modeling, adult, anamnesis, clinical decision making, clinical education, Clinical Simulation, Computational Linguistics, computer interface, Computer-Assisted Instruction, conference paper, Curriculum, Decision making, E-Learning, Education, Health care education, Healthcare Education, human, Humans, Language Model, Large language model, large language models, Mesh generation, Model animations, Modeling languages, nurse practitioner, Nurse Practitioners, Nursing, nursing education, nursing student, OSCE preparation, procedures, simulation, Teaching, therapy, Training, Training program, User-Computer Interface, Virtual Reality, Virtual reality training},
  pubstate  = {published},
  tppubtype = {inproceedings}
}