AHCI RESEARCH GROUP
Publications
Papers published in international journals,
proceedings of conferences, workshops and books.
OUR RESEARCH
Scientific Publications
2025
Dong, Y.
Enhancing Painting Exhibition Experiences with the Application of Augmented Reality-Based AI Video Generation Technology Proceedings Article
In: Zaphiris, P.; Ioannou, A.; Ioannou, A.; Sottilare, R.A.; Schwarz, J.; Rauterberg, M. (Eds.): Lect. Notes Comput. Sci., pp. 256–262, Springer Science and Business Media Deutschland GmbH, 2025, ISSN: 0302-9743; ISBN: 978-3-031-76814-9.
Abstract | Links | BibTeX | Tags: 3D modeling, AI-generated art, Art and Technology, Arts computing, Augmented Reality, Augmented reality technology, Digital Exhibition Design, Dynamic content, E-Learning, Education computing, Generation technologies, Interactive computer graphics, Knowledge Management, Multi dimensional, Planning designs, Three dimensional computer graphics, Video contents, Video generation
@inproceedings{dong_enhancing_2025,
title = {Enhancing Painting Exhibition Experiences with the Application of Augmented Reality-Based AI Video Generation Technology},
author = {Y. Dong},
editor = {Zaphiris P. and Ioannou A. and Ioannou A. and Sottilare R.A. and Schwarz J. and Rauterberg M.},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85213302959&doi=10.1007%2f978-3-031-76815-6_18&partnerID=40&md5=35484f5ed199a831f1a30f265a0d32d5},
doi = {10.1007/978-3-031-76815-6_18},
issn = {0302-9743},
isbn = {978-3-031-76814-9},
year = {2025},
date = {2025-01-01},
booktitle = {Lect. Notes Comput. Sci.},
volume = {15378 LNCS},
pages = {256–262},
publisher = {Springer Science and Business Media Deutschland GmbH},
abstract = {Traditional painting exhibitions often rely on flat presentation methods, such as walls and stands, limiting their impact. Augmented Reality (AR) technology presents an opportunity to transform these experiences by turning static, flat artwork into dynamic, multi-dimensional presentations. However, creating and integrating video or dynamic content can be time-consuming and challenging, requiring meticulous planning, design, and production. In the context of urban renewal and community revitalization, particularly in China’s first-tier cities where real estate development has saturated the market, there is a growing trend to repurpose traditional commercial and office spaces with cultural and artistic exhibitions. These exhibitions not only enhance the spatial quality but also elevate the user experience, making the spaces more competitive. However, these non-traditional exhibition venues often lack the amenities of professional galleries, relying on walls, windows, and corners for displays, and requiring quick setup times. For visitors, who are often office workers or shoppers with limited time, the use of personal mobile devices for interaction is common. WeChat, China’s most widely used mobile application, provides a platform for convenient digital interactive experiences through mini-programs, which can support lightweight AR applications. AI video generation technologies, such as Conditional Generative Adversarial Networks (ControlNet) and Latent Consistency Models (LCM), have seen significant advancements. These technologies now allow for the creation of 3D models and video content from text and images. Tools like Meshy and Pika provide the ability to generate various video styles and offer precise control over video content. New AI video applications like Stable Video further expand the possibilities by rapidly converting static images into dynamic videos, facilitating easy adjustments and edits. This paper explores the application of AR-based AI video generation technology in enhancing the experience of painting exhibitions. By integrating these technologies, traditional paintings can be transformed into interactive, engaging displays that enrich the viewer’s experience. The study demonstrates the potential of these innovations to make art exhibitions more appealing and competitive in various public spaces, thereby improving both artistic expression and audience engagement. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.},
keywords = {3D modeling, AI-generated art, Art and Technology, Arts computing, Augmented Reality, Augmented reality technology, Digital Exhibition Design, Dynamic content, E-Learning, Education computing, Generation technologies, Interactive computer graphics, Knowledge Management, Multi dimensional, Planning designs, Three dimensional computer graphics, Video contents, Video generation},
pubstate = {published},
tppubtype = {inproceedings}
}
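The entry above mentions converting static paintings into short video clips with tools such as Stable Video. As a rough, hedged illustration of that image-to-video step (not the authors' WeChat/Meshy/Pika pipeline), the following Python sketch uses the open-source diffusers library; the model id, file names and parameters are assumptions for demonstration only.

import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load an image-to-video diffusion model (model id is an assumption, not from the paper).
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16, variant="fp16",
).to("cuda")

# A scanned painting (hypothetical file) resized to the resolution the model expects.
painting = load_image("painting.jpg").resize((1024, 576))

# Generate a short clip that animates the static artwork, then save it for AR playback.
frames = pipe(painting, decode_chunk_size=8).frames[0]
export_to_video(frames, "painting_animated.mp4", fps=7)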
Shawash, J.; Thibault, M.; Hamari, J.
Who Killed Helene Pumpulivaara?: AI-Assisted Content Creation and XR Implementation for Interactive Built Heritage Storytelling Proceedings Article
In: IMX - Proc. ACM Int. Conf. Interact. Media Experiences, pp. 377–379, Association for Computing Machinery, Inc, 2025, ISBN: 979-840071391-0 (ISBN).
Abstract | Links | BibTeX | Tags: Artificial intelligence, Augmented Reality, Built heritage, Content creation, Digital heritage, Digital Interpretation, Extended reality, Human computer interaction, Human engineering, Industrial Heritage, Interactive computer graphics, Interactive computer systems, Mobile photographies, Narrative Design, Narrative designs, Production pipelines, Uncanny valley, Virtual Reality
@inproceedings{shawash_who_2025,
title = {Who Killed Helene Pumpulivaara?: AI-Assisted Content Creation and XR Implementation for Interactive Built Heritage Storytelling},
author = {J. Shawash and M. Thibault and J. Hamari},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105008003446&doi=10.1145%2f3706370.3731703&partnerID=40&md5=bc8a8d221abcf6c560446979fbd06cbc},
doi = {10.1145/3706370.3731703},
isbn = {979-840071391-0 (ISBN)},
year = {2025},
date = {2025-01-01},
booktitle = {IMX - Proc. ACM Int. Conf. Interact. Media Experiences},
pages = {377–379},
publisher = {Association for Computing Machinery, Inc},
abstract = {This demo presents "Who Killed Helene Pumpulivaara?", an innovative interactive heritage experience that combines crime mystery narrative with XR technology to address key challenges in digital heritage interpretation. Our work makes six significant contributions: (1) the discovery of a "Historical Uncanny Valley"effect where varying fidelity levels between AI-generated and authentic content serve as implicit markers distinguishing fact from interpretation; (2) an accessible production pipeline combining mobile photography with AI tools that democratizes XR heritage creation for resource-limited institutions; (3) a spatial storytelling approach that effectively counters decontextualization in digital heritage; (4) a multi-platform implementation strategy across web and VR environments; (5) a practical model for AI-assisted heritage content creation balancing authenticity with engagement; and (6) a pathway toward spatial augmented reality for future heritage interpretation. Using the historic Finlayson Factory in Tampere, Finland as a case study, our implementation demonstrates how emerging technologies can enrich the authenticity of heritage experiences, fostering deeper emotional connections between visitors and the histories embedded in place. © 2025 Copyright held by the owner/author(s).},
keywords = {Artificial intelligence, Augmented Reality, Built heritage, Content creation, Digital heritage, Digital Interpretation, Extended reality, Human computer interaction, Human engineering, Industrial Heritage, Interactive computer graphics, Interactive computer systems, Mobile photographies, Narrative Design, Narrative designs, Production pipelines, Uncanny valley, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
Shen, Y.; Li, B.; Huang, J.; Wang, Z.
GaussianShopVR: Facilitating Immersive 3D Authoring Using Gaussian Splatting in VR Proceedings Article
In: Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW, pp. 1292–1293, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 9798331514846 (ISBN).
Abstract | Links | BibTeX | Tags: 3D authoring, 3D modeling, Digital replicas, Gaussian distribution, Gaussian Splatting editing, Gaussians, Graphical user interfaces, High quality, Immersive, Immersive environment, Interactive computer graphics, Rendering (computer graphics), Rendering pipelines, Splatting, Three dimensional computer graphics, User profile, Virtual Reality, Virtual reality user interface, Virtualization, VR user interface
@inproceedings{shen_gaussianshopvr_2025,
title = {GaussianShopVR: Facilitating Immersive 3D Authoring Using Gaussian Splatting in VR},
author = {Y. Shen and B. Li and J. Huang and Z. Wang},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105005138672&doi=10.1109%2FVRW66409.2025.00292&partnerID=40&md5=2290016d250649f8d7f262212b1f59cb},
doi = {10.1109/VRW66409.2025.00292},
isbn = {9798331514846 (ISBN)},
year = {2025},
date = {2025-01-01},
booktitle = {Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW},
pages = {1292–1293},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Virtual reality (VR) applications require massive high-quality 3D assets to create immersive environments. Generating mesh-based 3D assets typically involves a significant amount of manpower and effort, which makes VR applications less accessible. 3D Gaussian Splatting (3DGS) has attracted much attention for its ability to quickly create digital replicas of real-life scenes and its compatibility with traditional rendering pipelines. However, it remains a challenge to edit 3DGS in a flexible and controllable manner. We propose GaussianShopVR, a system that leverages VR user interfaces to specify target areas to achieve flexible and controllable editing of reconstructed 3DGS. In addition, selected areas can provide 3D information to generative AI models to facilitate the editing. GaussianShopVR integrates object hierarchy management while keeping the backpropagated gradient flow to allow local editing with context information. © 2025 Elsevier B.V., All rights reserved.},
keywords = {3D authoring, 3D modeling, Digital replicas, Gaussian distribution, Gaussian Splatting editing, Gaussians, Graphical user interfaces, High quality, Immersive, Immersive environment, Interactive computer graphics, Rendering (computer graphics), Rendering pipelines, Splatting, Three dimensional computer graphics, User profile, Virtual Reality, Virtual reality user interface, Virtualization, VR user interface},
pubstate = {published},
tppubtype = {inproceedings}
}
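GaussianShopVR's key idea, as described above, is to restrict edits of a reconstructed 3D Gaussian Splatting scene to a user-selected region while keeping the backpropagated gradient flow. A minimal PyTorch sketch of that general idea follows; it is not the authors' implementation, and the toy scene, selection box and colour-matching loss are assumptions.

import torch

# Toy stand-in for a reconstructed 3DGS scene: per-Gaussian centres (fixed) and colours (editable).
positions = torch.randn(10_000, 3)
colors = torch.rand(10_000, 3, requires_grad=True)

# Region the user selected in VR (hypothetical axis-aligned box in scene units).
box_min = torch.tensor([-0.5, -0.5, -0.5])
box_max = torch.tensor([0.5, 0.5, 0.5])
inside = ((positions > box_min) & (positions < box_max)).all(dim=1)

# Zero the gradients of Gaussians outside the selection, so backpropagation only edits
# the target area while the loss is still computed with the rest of the scene as context.
colors.register_hook(lambda grad: grad * inside.unsqueeze(1).float())

optimizer = torch.optim.Adam([colors], lr=1e-2)
edit_target = torch.zeros_like(colors)        # stand-in editing objective
loss = ((colors - edit_target) ** 2).mean()   # a real system would use a rendering loss
loss.backward()
optimizer.step()                              # only colours inside the selected box change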
Logothetis, I.; Diakogiannis, K.; Vidakis, N.
Interactive Learning Through Conversational Avatars and Immersive VR: Enhancing Diabetes Education and Self-Management Proceedings Article
In: Fang, X. (Ed.): Lect. Notes Comput. Sci., pp. 415–429, Springer Science and Business Media Deutschland GmbH, 2025, ISSN: 0302-9743; ISBN: 978-3-031-92577-1.
Abstract | Links | BibTeX | Tags: Artificial intelligence, Chronic disease, Computer aided instruction, Diabetes Education, Diagnosis, E-Learning, Education management, Engineering education, Gamification, Immersive virtual reality, Interactive computer graphics, Interactive learning, Large population, Learning systems, NUI, Self management, Serious game, Serious games, simulation, Virtual Reality
@inproceedings{logothetis_interactive_2025,
title = {Interactive Learning Through Conversational Avatars and Immersive VR: Enhancing Diabetes Education and Self-Management},
author = {I. Logothetis and K. Diakogiannis and N. Vidakis},
editor = {Fang X.},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105008266480&doi=10.1007%2f978-3-031-92578-8_27&partnerID=40&md5=451274dfa3ef0b3f1b39c7d5a665ee3b},
doi = {10.1007/978-3-031-92578-8_27},
issn = {0302-9743},
isbn = {978-3-031-92577-1},
year = {2025},
date = {2025-01-01},
booktitle = {Lect. Notes Comput. Sci.},
volume = {15816 LNCS},
pages = {415–429},
publisher = {Springer Science and Business Media Deutschland GmbH},
abstract = {Diabetes is a chronic disease affecting a large population of the world. Education and self-management of diabetes are crucial. Technologies such as Virtual Reality (VR) have presented promising results in healthcare education, while studies suggest that Artificial Intelligence (AI) can help in learning by further engaging the learner. This study aims to educate users on the entire routine of managing diabetes. The serious game utilizes VR for realistic interaction with diabetes tools and generative AI through a conversational avatar that acts as an assistant instructor. In this way, it allows users to practice diagnostic and therapeutic interventions in a controlled virtual environment, helping to build their understanding and confidence in diabetes management. To measure the effects of the proposed serious game, presence, and perceived agency were measured. Preliminary results indicate that this setup aids in the engagement and immersion of learners, while the avatar can provide helpful information during gameplay. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.},
keywords = {Artificial intelligence, Chronic disease, Computer aided instruction, Diabetes Education, Diagnosis, E-Learning, Education management, Engineering education, Gamification, Immersive virtual reality, Interactive computer graphics, Interactive learning, Large population, Learning systems, NUI, Self management, Serious game, Serious games, simulation, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
López-Ozieblo, R.; Jiandong, D. S.; Techanamurthy, U.; Geng, H.; Nurgissayeva, A.
Enhancing AI Literacy through Immersive VR: Evaluating Pedagogical Design and GenAI Integration Proceedings Article
In: pp. 718–723, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 9798331511661 (ISBN).
Abstract | Links | BibTeX | Tags: AI Literacy, Artificial intelligence, Behavioral Research, Classlet platform, E-Learning, Educational settings, Emerging technologies, Engineering education, Experiential learning, GenAI avatar, GenAI Avatars, Immersive virtual reality, Interactive computer graphics, Pedagogical designs, Pedagogical Innovation, Regression analysis, Teaching, Virtual Reality, Virtual-reality environment
@inproceedings{lopez-ozieblo_enhancing_2025,
title = {Enhancing AI Literacy through Immersive VR: Evaluating Pedagogical Design and GenAI Integration},
author = {R. López-Ozieblo and D. S. Jiandong and U. Techanamurthy and H. Geng and A. Nurgissayeva},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105013538409&doi=10.1109%2FCSTE64638.2025.11092268&partnerID=40&md5=a963d754ceaa73f360d9678d346a7686},
doi = {10.1109/CSTE64638.2025.11092268},
isbn = {9798331511661 (ISBN)},
year = {2025},
date = {2025-01-01},
pages = {718–723},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {As AI continues to reshape industries, enhancing AI literacy is crucial for empowering learners to interact confidently and critically with emerging technologies. Virtual Reality (VR) offers a way to bridge theoretical knowledge with practical application but integrating VR into educational settings struggles with technical and pedagogical challenges. This study investigates how immersive VR environments can be optimized to enhance AI literacy and identifies key factors driving students' intent to adopt these technologies. Using Classlet - a VR platform that integrates interactive multimodal tasks, narrative-driven activities, and GenAI avatar interactions - we created a virtual office where learners engaged in research tasks and simulation scenarios with instructor-customized prompts. Our mixed-methods approach, involving participants from Hong Kong and Malaysia, focused on AI literacy within contexts such as Fast Fashion and European society. Regression analyses revealed that overall intent is strongly predicted by composite enjoyment, perceived performance, and behavioral control (R2 = 0.803). Post-AI literacy self-assessments were predicted by AI self-efficacy and enjoyment ( R2 = 0.421). However, female participants reported lower scores on AI efficacy (p = 0.042), suggesting baseline differences that warrant further investigation. Qualitative insights show the immersive and engaging nature of the experience while highlighting the need for further GenAI prompt designs for elaborative and bidirectional interactions. © 2025 Elsevier B.V., All rights reserved.},
keywords = {AI Literacy, Artificial intelligence, Behavioral Research, Classlet platform, E-Learning, Educational settings, Emerging technologies, Engineering education, Experiential learning, GenAI avatar, GenAI Avatars, Immersive virtual reality, Interactive computer graphics, Pedagogical designs, Pedagogical Innovation, Regression analysis, Teaching, Virtual Reality, Virtual-reality environment},
pubstate = {published},
tppubtype = {inproceedings}
}
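The regression analysis reported above (overall intent predicted by enjoyment, perceived performance and behavioral control, R2 = 0.803) can be reproduced in form with ordinary least squares. A minimal Python sketch follows; the CSV file and column names are assumptions, not the study's instrument.

import pandas as pd
import statsmodels.api as sm

# Hypothetical survey export; the column names are placeholders.
df = pd.read_csv("classlet_survey.csv")

X = sm.add_constant(df[["enjoyment", "perceived_performance", "behavioral_control"]])
y = df["intent_to_adopt"]

model = sm.OLS(y, X).fit()
print(model.summary())  # reports coefficients and R-squared, analogous to the figures above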
Mendoza, A. P.; Barrios Quiroga, K. J.; Solano Celis, S. D.; Quintero M., C. G.
NAIA: A Multi-Technology Virtual Assistant for Boosting Academic Environments—A Case Study Journal Article
In: IEEE Access, vol. 13, pp. 141461–141483, 2025, ISSN: 21693536 (ISSN), (Publisher: Institute of Electrical and Electronics Engineers Inc.).
Abstract | Links | BibTeX | Tags: Academic environment, Artificial intelligence, Case-studies, Computational Linguistics, Computer vision, Digital avatar, Digital avatars, Efficiency, Human computer interaction, Human-AI Interaction, Interactive computer graphics, Language Model, Large language model, large language model (LLM), Learning systems, Natural language processing systems, Personal digital assistants, Personnel training, Population statistics, Speech communication, Speech processing, Speech to text, speech to text (STT), Text to speech, text to speech (TTS), user experience, User interfaces, Virtual assistant, Virtual assistants, Virtual Reality
@article{mendoza_naia_2025,
title = {NAIA: A Multi-Technology Virtual Assistant for Boosting Academic Environments—A Case Study},
author = {A. P. Mendoza and K. J. Barrios Quiroga and S. D. Solano Celis and C. G. Quintero M.},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105013598763&doi=10.1109%2FACCESS.2025.3597565&partnerID=40&md5=7ad6b037cfedb943fc026642c4854284},
doi = {10.1109/ACCESS.2025.3597565},
issn = {21693536 (ISSN)},
year = {2025},
date = {2025-01-01},
journal = {IEEE Access},
volume = {13},
pages = {141461–141483},
abstract = {Virtual assistants have become essential tools for improving productivity and efficiency in various domains. This paper presents NAIA (Nimble Artificial Intelligence Assistant), an advanced multi-role and multi-task virtual assistant enhanced with artificial intelligence, designed to serve a university community case study. The system integrates AI technologies including Large Language Models (LLM), Computer Vision, and voice processing to create an immersive and efficient interaction through animated digital avatars. NAIA features five specialized roles: researcher, receptionist, personal skills trainer, personal assistant, and university guide, each equipped with specific capabilities to support different aspects of academic life. The system’s Computer Vision capabilities enable it to comment on users’ physical appearance and environment, enriching the interaction. Through natural language processing and voice interaction, NAIA aims to improve productivity and efficiency within the university environment while providing personalized assistance through a ubiquitous platform accessible across multiple devices. NAIA is evaluated through a user experience survey involving 30 participants with different demographic characteristics, this is the most accepted way by the community to evaluate this type of solution. Participants give their feedback after using one role of NAIA after using it for 30 minutes. The experiment showed that 90% of the participants considered NAIA-assisted tasks of higher quality and, on average, NAIA has a score of 4.27 out of 5 on user satisfaction. Participants particularly appreciated the assistant’s visual recognition, natural conversation flow, and user interaction capabilities. Results demonstrate NAIA’s capabilities and effectiveness across the five roles. © 2025 Elsevier B.V., All rights reserved.},
note = {Publisher: Institute of Electrical and Electronics Engineers Inc.},
keywords = {Academic environment, Artificial intelligence, Case-studies, Computational Linguistics, Computer vision, Digital avatar, Digital avatars, Efficiency, Human computer interaction, Human-AI Interaction, Interactive computer graphics, Language Model, Large language model, large language model (LLM), Learning systems, Natural language processing systems, Personal digital assistants, Personnel training, Population statistics, Speech communication, Speech processing, Speech to text, speech to text (STT), Text to speech, text to speech (TTS), user experience, User interfaces, Virtual assistant, Virtual assistants, Virtual Reality},
pubstate = {published},
tppubtype = {article}
}
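NAIA, as described above, exposes five specialized roles on top of an LLM. A minimal sketch of role-based prompting with a chat-completion API follows; the role wording, model name and client setup are assumptions, and the real system additionally combines computer vision, voice processing and animated avatars.

from openai import OpenAI

# Role-specific system prompts; the five roles mirror the paper, the wording is invented.
ROLES = {
    "researcher": "You help find and summarise academic literature.",
    "receptionist": "You answer visitors' questions about the university.",
    "skills_trainer": "You coach the user on presentation and soft skills.",
    "personal_assistant": "You manage reminders and simple planning tasks.",
    "university_guide": "You explain programmes, buildings and services.",
}

client = OpenAI()  # assumes OPENAI_API_KEY is set; the model name below is an assumption

def ask(role: str, user_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": ROLES[role]},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content

print(ask("university_guide", "Where is the engineering library?"))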
Tovias, E.; Wu, L.
Leveraging Virtual Reality and AI for Enhanced Vocabulary Learning Proceedings Article
In: pp. 308, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 9798331521646 (ISBN).
Abstract | Links | BibTeX | Tags: Avatar, Avatars, E-Learning, Immersive, Interactive computer graphics, Interactive learning, Language Model, Large language model, large language models, Learning experiences, Real time interactions, Text-based methods, user experience, Users' experiences, Virtual environments, Virtual Reality, Vocabulary learning
@inproceedings{tovias_leveraging_2025,
title = {Leveraging Virtual Reality and AI for Enhanced Vocabulary Learning},
author = {E. Tovias and L. Wu},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105017563813&doi=10.1109%2FICHMS65439.2025.11154184&partnerID=40&md5=7b79f93d6f8ec222b25a4bfeac408d3a},
doi = {10.1109/ICHMS65439.2025.11154184},
isbn = {9798331521646 (ISBN)},
year = {2025},
date = {2025-01-01},
pages = {308},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {This study examines the integration of virtual reality (VR) and Artificial Intelligence (AI) to create more immersive, interactive learning experiences. By combining VR's engaging user experience with AI-powered avatars, this research explores how these tools can enhance vocabulary learning compared to traditional text-based methods. Utilizing a Meta Quest 3 headset, Unity for development, and OpenAI's API & ElevenLabs for dynamic dialogues, this system offers personalized, real-time interactions (Fig. 1). The integration of these technologies fosters a bright future, driving significant advancements in the development of highly immersive and effective learning environments. © 2025 Elsevier B.V., All rights reserved.},
keywords = {Avatar, Avatars, E-Learning, Immersive, Interactive computer graphics, Interactive learning, Language Model, Large language model, large language models, Learning experiences, Real time interactions, Text-based methods, user experience, Users' experiences, Virtual environments, Virtual Reality, Vocabulary learning},
pubstate = {published},
tppubtype = {inproceedings}
}
Vachha, C.; Kang, Y.; Dive, Z.; Chidambaram, A.; Gupta, A.; Jun, E.; Hartmann, B.
Dreamcrafter: Immersive Editing of 3D Radiance Fields Through Flexible, Generative Inputs and Outputs Proceedings Article
In: Conf Hum Fact Comput Syst Proc, Association for Computing Machinery, 2025, ISBN: 9798400713958 (ISBN); 9798400713941 (ISBN).
Abstract | Links | BibTeX | Tags: 3D modeling, 3D scenes, AI assisted creativity tool, Animation, Computer vision, Direct manipulation, Drawing (graphics), Gaussian Splatting, Gaussians, Generative AI, Graphic, Graphics, High level languages, Immersive, Interactive computer graphics, Splatting, Three dimensional computer graphics, Virtual Reality, Worldbuilding interface
@inproceedings{vachha_dreamcrafter_2025,
title = {Dreamcrafter: Immersive Editing of 3D Radiance Fields Through Flexible, Generative Inputs and Outputs},
author = {C. Vachha and Y. Kang and Z. Dive and A. Chidambaram and A. Gupta and E. Jun and B. Hartmann},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105005725679&doi=10.1145%2F3706598.3714312&partnerID=40&md5=57926f0265e5174a774a67d9013bb2cb},
doi = {10.1145/3706598.3714312},
isbn = {9798400713958 (ISBN); 9798400713941 (ISBN)},
year = {2025},
date = {2025-01-01},
booktitle = {Conf Hum Fact Comput Syst Proc},
publisher = {Association for Computing Machinery},
abstract = {Authoring 3D scenes is a central task for spatial computing applications. Competing visions for lowering existing barriers are (1) focus on immersive, direct manipulation of 3D content or (2) leverage AI techniques that capture real scenes (3D Radiance Fields such as, NeRFs, 3D Gaussian Splatting) and modify them at a higher level of abstraction, at the cost of high latency. We unify the complementary strengths of these approaches and investigate how to integrate generative AI advances into real-time, immersive 3D Radiance Field editing. We introduce Dreamcrafter, a VR-based 3D scene editing system that: (1) provides a modular architecture to integrate generative AI algorithms; (2) combines different levels of control for creating objects, including natural language and direct manipulation; and (3) introduces proxy representations that support interaction during high-latency operations. We contribute empirical findings on control preferences and discuss how generative AI interfaces beyond text input enhance creativity in scene editing and world building. © 2025 Elsevier B.V., All rights reserved.},
keywords = {3D modeling, 3D scenes, AI assisted creativity tool, Animation, Computer vision, Direct manipulation, Drawing (graphics), Gaussian Splatting, Gaussians, Generative AI, Graphic, Graphics, High level languages, Immersive, Interactive computer graphics, Splatting, Three dimensional computer graphics, Virtual Reality, Worldbuilding interface},
pubstate = {published},
tppubtype = {inproceedings}
}
Cao, X.; Ju, K. P.; Li, C.; Jain, D.
SceneGenA11y: How can Runtime Generative tools improve the Accessibility of a Virtual 3D Scene? Proceedings Article
In: Conf Hum Fact Comput Syst Proc, Association for Computing Machinery, 2025, ISBN: 9798400713958 (ISBN); 9798400713941 (ISBN).
Abstract | Links | BibTeX | Tags: 3D application, 3D modeling, 3D scenes, Accessibility, BLV, DHH, Discrete event simulation, Generative AI, Generative tools, Interactive computer graphics, One dimensional, Runtimes, Three dimensional computer graphics, Video-games, Virtual 3d scene, virtual 3D scenes, Virtual environments, Virtual Reality
@inproceedings{cao_scenegena11y_2025,
title = {SceneGenA11y: How can Runtime Generative tools improve the Accessibility of a Virtual 3D Scene?},
author = {X. Cao and K. P. Ju and C. Li and D. Jain},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105005772656&doi=10.1145%2F3706599.3720265&partnerID=40&md5=163b27affb24972a076fcd3beac3defc},
doi = {10.1145/3706599.3720265},
isbn = {9798400713958 (ISBN); 9798400713941 (ISBN)},
year = {2025},
date = {2025-01-01},
booktitle = {Conf Hum Fact Comput Syst Proc},
publisher = {Association for Computing Machinery},
abstract = {With the popularity of virtual 3D applications, from video games to educational content and virtual reality scenarios, the accessibility of 3D scene information is vital to ensure inclusive and equitable experiences for all. Previous work include information substitutions like audio description and captions, as well as personalized modifications, but they could only provide predefined accommodations. In this work, we propose SceneGenA11y, a system that responds to the user’s natural language prompts to improve accessibility of a 3D virtual scene in runtime. The system primes LLM agents with accessibility-related knowledge, allowing users to explore the scene and perform verifiable modifications to improve accessibility. We conducted a preliminary evaluation of our system with three blind and low-vision people and three deaf and hard-of-hearing people. The results show that our system is intuitive to use and can successfully improve accessibility. We discussed usage patterns of the system, potential improvements, and integration into apps. We ended with highlighting plans for future work. © 2025 Elsevier B.V., All rights reserved.},
keywords = {3D application, 3D modeling, 3D scenes, Accessibility, BLV, DHH, Discrete event simulation, Generative AI, Generative tools, Interactive computer graphics, One dimensional, Runtimes, Three dimensional computer graphics, Video-games, Virtual 3d scene, virtual 3D scenes, Virtual environments, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
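SceneGenA11y, as summarised above, lets users request accessibility changes in natural language and applies only verifiable modifications at runtime. A Python sketch of that verify-then-apply loop follows; call_llm, scene.apply and the allowed action names are hypothetical placeholders, not the paper's API.

import json

# Modifications the runtime is willing to apply; anything else returned by the model is rejected.
ALLOWED = {"set_caption_size", "add_audio_description", "boost_object_contrast"}

def call_llm(prompt: str) -> str:
    """Hypothetical helper: forward the prompt to any chat LLM and return its raw text reply."""
    raise NotImplementedError

def handle_request(user_prompt: str, scene) -> None:
    reply = call_llm(
        "Return JSON {\"action\": ..., \"params\": {...}} choosing one accessibility "
        f"modification from {sorted(ALLOWED)} for this request: {user_prompt}"
    )
    request = json.loads(reply)
    if request.get("action") not in ALLOWED:   # verification step before touching the scene
        raise ValueError(f"Unsupported modification: {request}")
    scene.apply(request["action"], **request.get("params", {}))  # hypothetical scene API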
Leininger, P.; Weber, C. J.; Rothe, S.
Understanding Creative Potential and Use Cases of AI-Generated Environments for Virtual Film Productions: Insights from Industry Professionals Proceedings Article
In: IMX - Proc. ACM Int. Conf. Interact. Media Experiences, pp. 60–78, Association for Computing Machinery, Inc, 2025, ISBN: 9798400713910 (ISBN).
Abstract | Links | BibTeX | Tags: 3-D environments, 3D reconstruction, 3D Scene Reconstruction, 3d scenes reconstruction, AI-generated 3d environment, AI-Generated 3D Environments, Computer interaction, Creative Collaboration, Creatives, Digital content creation, Digital Content Creation., Filmmaking workflow, Filmmaking Workflows, Gaussian distribution, Gaussian Splatting, Gaussians, Generative AI, Graphical user interface, Graphical User Interface (GUI), Graphical user interfaces, Human computer interaction, human-computer interaction, Human-Computer Interaction (HCI), Immersive, Immersive Storytelling, Interactive computer graphics, Interactive computer systems, Interactive media, Mesh generation, Previsualization, Real-Time Rendering, Splatting, Three dimensional computer graphics, Virtual production, Virtual Production (VP), Virtual Reality, Work-flows
@inproceedings{leininger_understanding_2025,
title = {Understanding Creative Potential and Use Cases of AI-Generated Environments for Virtual Film Productions: Insights from Industry Professionals},
author = {P. Leininger and C. J. Weber and S. Rothe},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105007976841&doi=10.1145%2F3706370.3727853&partnerID=40&md5=e74b2fa9e7644ddee1b51d3fc34b4af2},
doi = {10.1145/3706370.3727853},
isbn = {9798400713910 (ISBN)},
year = {2025},
date = {2025-01-01},
booktitle = {IMX - Proc. ACM Int. Conf. Interact. Media Experiences},
pages = {60–78},
publisher = {Association for Computing Machinery, Inc},
abstract = {Virtual production (VP) is transforming filmmaking by integrating real-time digital elements with live-action footage, offering new creative possibilities and streamlined workflows. While industry experts recognize AI's potential to revolutionize VP, its practical applications and value across different production phases and user groups remain underexplored. Building on initial research into generative and data-driven approaches, this paper presents the first systematic pilot study evaluating three types of AI-generated 3D environments - Depth Mesh, 360° Panoramic Meshes, and Gaussian Splatting - through the participation of 15 filmmaking professionals from diverse roles. Unlike commonly used 2D AI-generated visuals, our approach introduces navigable 3D environments that offer greater control and flexibility, aligning more closely with established VP workflows. Through expert interviews and literature research, we developed evaluation criteria to assess their usefulness beyond concept development, extending to previsualization, scene exploration, and interdisciplinary collaboration. Our findings indicate that different environments cater to distinct production needs, from early ideation to detailed visualization. Gaussian Splatting proved effective for high-fidelity previsualization, while 360° Panoramic Meshes excelled in rapid concept ideation. Despite their promise, challenges such as limited interactivity and customization highlight areas for improvement. Our prototype, EnVisualAIzer, built in Unreal Engine 5, provides an accessible platform for diverse filmmakers to engage with AI-generated environments, fostering a more inclusive production process. By lowering technical barriers, these environments have the potential to make advanced VP tools more widely available. This study offers valuable insights into the evolving role of AI in VP and sets the stage for future research and development. © 2025 Elsevier B.V., All rights reserved.},
keywords = {3-D environments, 3D reconstruction, 3D Scene Reconstruction, 3d scenes reconstruction, AI-generated 3d environment, AI-Generated 3D Environments, Computer interaction, Creative Collaboration, Creatives, Digital content creation, Digital Content Creation., Filmmaking workflow, Filmmaking Workflows, Gaussian distribution, Gaussian Splatting, Gaussians, Generative AI, Graphical user interface, Graphical User Interface (GUI), Graphical user interfaces, Human computer interaction, human-computer interaction, Human-Computer Interaction (HCI), Immersive, Immersive Storytelling, Interactive computer graphics, Interactive computer systems, Interactive media, Mesh generation, Previsualization, Real-Time Rendering, Splatting, Three dimensional computer graphics, Virtual production, Virtual Production (VP), Virtual Reality, Work-flows},
pubstate = {published},
tppubtype = {inproceedings}
}
Masasi de Oliveira, E. A.; Sousa, R. T.; Bastos, A. A.; Martins de Freitas Cintra, L.; Galvão Filho, A. R. G.
Immersive Virtual Museums with Spatially-Aware Retrieval-Augmented Generation Proceedings Article
In: IMX - Proc. ACM Int. Conf. Interact. Media Experiences, pp. 437–440, Association for Computing Machinery, Inc, 2025, ISBN: 9798400713910 (ISBN).
Abstract | Links | BibTeX | Tags: Association reactions, Behavioral Research, Generation systems, Geographics, Human computer interaction, Human engineering, Immersive, Information Retrieval, Interactive computer graphics, Language Model, Large language model, large language models, Museums, Retrieval-Augmented Generation, Search engines, Spatially aware, User interfaces, Virtual environments, Virtual museum, Virtual museum., Virtual Reality, Visual Attention, Visual languages
@inproceedings{masasi_de_oliveira_immersive_2025,
title = {Immersive Virtual Museums with Spatially-Aware Retrieval-Augmented Generation},
author = {E. A. Masasi de Oliveira and R. T. Sousa and A. A. Bastos and L. Martins de Freitas Cintra and A. R. G. Galvão Filho},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105007979183&doi=10.1145%2F3706370.3731643&partnerID=40&md5=47a47f3408a0e6cb35c16dd6101a15b0},
doi = {10.1145/3706370.3731643},
isbn = {9798400713910 (ISBN)},
year = {2025},
date = {2025-01-01},
booktitle = {IMX - Proc. ACM Int. Conf. Interact. Media Experiences},
pages = {437–440},
publisher = {Association for Computing Machinery, Inc},
abstract = {Virtual Reality has significantly expanded possibilities for immersive museum experiences, overcoming traditional constraints such as space, preservation, and geographic limitations. However, existing virtual museum platforms typically lack dynamic, personalized, and contextually accurate interactions. To address this, we propose Spatially-Aware Retrieval-Augmented Generation (SA-RAG), an innovative framework integrating visual attention tracking with Retrieval-Augmented Generation systems and advanced Large Language Models. By capturing users' visual attention in real time, SA-RAG dynamically retrieves contextually relevant data, enhancing the accuracy, personalization, and depth of user interactions within immersive virtual environments. The system's effectiveness is initially demonstrated through our preliminary tests within a realistic VR museum implemented using Unreal Engine. Although promising, comprehensive human evaluations involving broader user groups are planned for future studies to rigorously validate SA-RAG's effectiveness, educational enrichment potential, and accessibility improvements in virtual museums. The framework also presents opportunities for broader applications in immersive educational and storytelling domains. © 2025 Elsevier B.V., All rights reserved.},
keywords = {Association reactions, Behavioral Research, Generation systems, Geographics, Human computer interaction, Human engineering, Immersive, Information Retrieval, Interactive computer graphics, Language Model, Large language model, large language models, Museums, Retrieval-Augmented Generation, Search engines, Spatially aware, User interfaces, Virtual environments, Virtual museum, Virtual museum., Virtual Reality, Visual Attention, Visual languages},
pubstate = {published},
tppubtype = {inproceedings}
}
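The core of SA-RAG, as summarised above, is retrieval conditioned on what the visitor is looking at. A minimal Python sketch of gaze-conditioned retrieval over a toy exhibit corpus follows; the encoder model, corpus and labels are assumptions, and the real system tracks visual attention inside an Unreal Engine VR museum.

import numpy as np
from sentence_transformers import SentenceTransformer

# Toy exhibit corpus; in the paper, retrieval is driven by real-time gaze targets.
DOCS = {
    "sunflowers": "Oil on canvas, 1888, part of the painter's Arles series of still lifes.",
    "amphora": "Attic black-figure amphora, ca. 530 BC, depicting a mythological scene.",
}

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # model choice is an assumption
doc_ids = list(DOCS)
doc_vecs = encoder.encode([DOCS[d] for d in doc_ids], normalize_embeddings=True)

def retrieve(gazed_exhibit_label: str, k: int = 1) -> list[str]:
    """Return the k most relevant descriptions for whatever the visitor is looking at."""
    q = encoder.encode([gazed_exhibit_label], normalize_embeddings=True)
    scores = (doc_vecs @ q.T).ravel()
    return [DOCS[doc_ids[i]] for i in np.argsort(scores)[::-1][:k]]

context = retrieve("a large painting of sunflowers")
prompt = "Answer the visitor using only this context:\n" + "\n".join(context)
# `prompt` would then be sent to an LLM together with the visitor's spoken question.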
Coronado, A.; Carvalho, S. T.; Berretta, L. Oliveira
See Through My Eyes: Using Multimodal Large Language Model for Describing Rendered Environments to Blind People Proceedings Article
In: IMX - Proc. ACM Int. Conf. Interact. Media Experiences, pp. 451–457, Association for Computing Machinery, Inc, 2025, ISBN: 9798400713910 (ISBN).
Abstract | Links | BibTeX | Tags: Accessibility, Behavioral Research, Blind, Blind people, Helmet mounted displays, Human engineering, Human rehabilitation equipment, Interactive computer graphics, Interactive computer systems, Language Model, LLM, Multi-modal, Rendered environment, rendered environments, Spatial cognition, Virtual Reality, Vision aids, Visual impairment, Visual languages, Visually impaired people
@inproceedings{coronado_see_2025,
title = {See Through My Eyes: Using Multimodal Large Language Model for Describing Rendered Environments to Blind People},
author = {A. Coronado and S. T. Carvalho and L. Oliveira Berretta},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105007991842&doi=10.1145%2F3706370.3731641&partnerID=40&md5=7eb509d2ac724af78ec04575a8c71085},
doi = {10.1145/3706370.3731641},
isbn = {9798400713910 (ISBN)},
year = {2025},
date = {2025-01-01},
booktitle = {IMX - Proc. ACM Int. Conf. Interact. Media Experiences},
pages = {451–457},
publisher = {Association for Computing Machinery, Inc},
abstract = {Extended Reality (XR) is quickly expanding "as the next major technology wave in personal computing". Nevertheless, this expansion and adoption could also exclude certain disabled users, particularly people with visual impairment (VIP). According to the World Health Organization (WHO) in their 2019 publication, there were at least 2.2 billion people with visual impairment, a number that is also estimated to have increased in recent years. Therefore, it is important to include disabled users, especially visually impaired people, in the design of Head-Mounted Displays and Extended Reality environments. Indeed, this objective can be pursued by incorporating Multimodal Large Language Model (MLLM) technology, which can assist visually impaired people. As a case study, this study employs different prompts that result in environment descriptions from an MLLM integrated into a virtual reality (VR) escape room. Therefore, six potential prompts were engineered to generate valuable outputs for visually impaired users inside a VR environment. These outputs were evaluated using the G-Eval, and VIEScore metrics. Even though, the results show that the prompt patterns provided a description that aligns with the user's point of view, it is highly recommended to evaluate these outputs through "expected outputs"from Orientation and Mobility Specialists, and Sighted Guides. Furthermore, the subsequent step in the process is to evaluate these outputs by visually impaired people themselves to identify the most effective prompt pattern. © 2025 Elsevier B.V., All rights reserved.},
keywords = {Accessibility, Behavioral Research, Blind, Blind people, Helmet mounted displays, Human engineering, Human rehabilitation equipment, Interactive computer graphics, Interactive computer systems, Language Model, LLM, Multi-modal, Rendered environment, rendered environments, Spatial cognition, Virtual Reality, Vision aids, Visual impairment, Visual languages, Visually impaired people},
pubstate = {published},
tppubtype = {inproceedings}
}
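The study above sends rendered VR frames plus engineered prompts to a multimodal LLM to obtain environment descriptions for blind users. A hedged Python sketch of that single step follows; the model name, file name and prompt wording are assumptions, not one of the paper's six prompt patterns.

import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; the model name below is an assumption

def describe_frame(image_path: str, prompt: str) -> str:
    """Send one rendered frame plus a prompt pattern to a multimodal LLM and return its description."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

print(describe_frame("escape_room_frame.png",
                     "Describe this room for a blind player: layout, obstacles and exits."))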
Peter, K.; Makosa, I.; Auala, S.; Ndjao, L.; Maasz, D.; Mbinge, U.; Winschiers-Theophilus, H.
Co-creating a VR Narrative Experience of Constructing a Food Storage Following OvaHimba Traditional Practices Proceedings Article
In: IMX - Proc. ACM Int. Conf. Interact. Media Experiences, pp. 418–423, Association for Computing Machinery, Inc, 2025, ISBN: 9798400713910 (ISBN).
Abstract | Links | BibTeX | Tags: 3D Modelling, 3D models, 3d-modeling, Co-designs, Community-based, Community-Based Co-Design, Computer aided design, Cultural heritage, Cultural heritages, Food storage, Human computer interaction, Human engineering, Indigenous Knowledge, Information Systems, Interactive computer graphics, Interactive computer systems, IVR, Namibia, OvaHimba, Ovahimbum, Photogrammetry, Sustainable development, Virtual environments, Virtual Reality
@inproceedings{peter_co-creating_2025,
title = {Co-creating a VR Narrative Experience of Constructing a Food Storage Following OvaHimba Traditional Practices},
author = {K. Peter and I. Makosa and S. Auala and L. Ndjao and D. Maasz and U. Mbinge and H. Winschiers-Theophilus},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105007984089&doi=10.1145%2F3706370.3731652&partnerID=40&md5=10c67ae9849b2b9093515e04828d423d},
doi = {10.1145/3706370.3731652},
isbn = {9798400713910 (ISBN)},
year = {2025},
date = {2025-01-01},
booktitle = {IMX - Proc. ACM Int. Conf. Interact. Media Experiences},
pages = {418–423},
publisher = {Association for Computing Machinery, Inc},
abstract = {As part of an attempt to co-create a comprehensive virtual environment in which one can explore and learn traditional practices of the OvaHimba people, we have co-designed and implemented a VR experience to construct a traditional food storage. In collaboration with the OvaHimba community residing in Otjisa, we have explored culturally valid representations of the process. We have further investigated different techniques such as photogrammetry, generative AI and manual methods to develop 3D models. Our findings highlight the importance of context, process, and community-defined relevance in co-design, the fluidity of cultural realities and virtual representations, as well as technical challenges. © 2025 Elsevier B.V., All rights reserved.},
keywords = {3D Modelling, 3D models, 3d-modeling, Co-designs, Community-based, Community-Based Co-Design, Computer aided design, Cultural heritage, Cultural heritages, Food storage, Human computer interaction, Human engineering, Indigenous Knowledge, Information Systems, Interactive computer graphics, Interactive computer systems, IVR, Namibia, OvaHimba, Ovahimbum, Photogrammetry, Sustainable development, Virtual environments, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
Wan, X.; Luo, Y.
A Study of Anti-war Memorial Hall of Leshan City based on Virtual Museum Technology Proceedings Article
In: pp. 493–497, Association for Computing Machinery, Inc, 2025, ISBN: 9798400712432 (ISBN).
Abstract | Links | BibTeX | Tags: 3d modeling technologies, 3D reconstruction, Anti-war, Artificial intelligence, Augmented Reality, Digital researches, Historic Preservation, Human engineering, Interactive computer graphics, Knowledge graph, Knowledge graphs, Language Model, Localization and mappings, Metaverses, Model knowledge, Museum technology, Museums, Restoration, Three dimensional computer graphics, Virtual museum, Virtual Reality
@inproceedings{wan_study_2025,
title = {A Study of Anti-war Memorial Hall of Leshan City based on Virtual Museum Technology},
author = {X. Wan and Y. Luo},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105011594066&doi=10.1145%2F3732801.3732887&partnerID=40&md5=ac25032b46edf5a9d5949b8ceb5a41e1},
doi = {10.1145/3732801.3732887},
isbn = {9798400712432 (ISBN)},
year = {2025},
date = {2025-01-01},
pages = {493–497},
publisher = {Association for Computing Machinery, Inc},
abstract = {This study adopted augmented reality (AR), virtual reality (VR), artificial intelligence (AI), metaverse (META), large language models (LLM), knowledge graphs (KG), and synchronous localization and mapping (SLAM) technologies to create a virtual museum (VM) with the theme of the history of Leshan anti-Japanese war. Its aim is to enrich the digital research of this area, and to restore and vividly reflect the significance of Leshan’s contributions during the anti-Japanese war. This study combines 3D modeling technology with historical scene restoration to create a method of field investigation of local history and anti-Japanese war sites, which constructed six unique exhibition areas to describe historical events. The virtual museum integrates lots of historical sites, stories, achievements, and cultural aspects into a unique cultural interaction center. Through diverse technological approaches, this study aims to enable the public to contemplate history, cultivate national pride and patriotism, and deliver novel strategies for the digital protection of historical heritage. © 2025 Elsevier B.V., All rights reserved.},
keywords = {3d modeling technologies, 3D reconstruction, Anti-war, Artificial intelligence, Augmented Reality, Digital researches, Historic Preservation, Human engineering, Interactive computer graphics, Knowledge graph, Knowledge graphs, Language Model, Localization and mappings, Metaverses, Model knowledge, Museum technology, Museums, Restoration, Three dimensional computer graphics, Virtual museum, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
El Saddik, A.; Ahmad, J.; Khan, M.; Abouzahir, S.; Gueaieb, W.
Unleashing Creativity in the Metaverse: Generative AI and Multimodal Content Journal Article
In: ACM Transactions on Multimedia Computing, Communications and Applications, vol. 21, no. 7, pp. 1–43, 2025, ISSN: 15516857 (ISSN); 15516865 (ISSN), (Publisher: Association for Computing Machinery).
Abstract | Links | BibTeX | Tags: Adversarial networks, Artificial intelligence, Content generation, Context information, Creatives, Diffusion Model, diffusion models, Generative adversarial networks, Generative AI, Human engineering, Information instructions, Interactive computer graphics, Interactive computer systems, Interactive devices, Interoperability, Metaverse, Metaverses, Multi-modal, multimodal, Simple++, Three dimensional computer graphics, user experience, User interfaces, Virtual Reality
@article{el_saddik_unleashing_2025,
title = {Unleashing Creativity in the Metaverse: Generative AI and Multimodal Content},
author = {A. El Saddik and J. Ahmad and M. Khan and S. Abouzahir and W. Gueaieb},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105011860002&doi=10.1145%2F3713075&partnerID=40&md5=20064843ced240c42e9353d747672cb3},
doi = {10.1145/3713075},
issn = {15516857 (ISSN); 15516865 (ISSN)},
year = {2025},
date = {2025-01-01},
journal = {ACM Transactions on Multimedia Computing, Communications and Applications},
volume = {21},
number = {7},
pages = {1–43},
abstract = {The metaverse presents an emerging creative expression and collaboration frontier where generative artificial intelligence (GenAI) can play a pivotal role with its ability to generate multimodal content from simple prompts. These prompts allow the metaverse to interact with GenAI, where context information, instructions, input data, or even output indications constituting the prompt can come from within the metaverse. However, their integration poses challenges regarding interoperability, lack of standards, scalability, and maintaining a high-quality user experience. This article explores how GenAI can productively assist in enhancing creativity within the contexts of the metaverse and unlock new opportunities. We provide a technical, in-depth overview of the different generative models for image, video, audio, and 3D content within the metaverse environments. We also explore the bottlenecks, opportunities, and innovative applications of GenAI from the perspectives of end users, developers, service providers, and AI researchers. This survey commences by highlighting the potential of GenAI for enhancing the metaverse experience through dynamic content generation to populate massive virtual worlds. Subsequently, we shed light on the ongoing research practices and trends in multimodal content generation, enhancing realism and creativity and alleviating bottlenecks related to standardization, computational cost, privacy, and safety. Last, we share insights into promising research directions toward the integration of GenAI with the metaverse for creative enhancement, improved immersion, and innovative interactive applications. © 2025 Elsevier B.V., All rights reserved.},
note = {Publisher: Association for Computing Machinery},
keywords = {Adversarial networks, Artificial intelligence, Content generation, Context information, Creatives, Diffusion Model, diffusion models, Generative adversarial networks, Generative AI, Human engineering, Information instructions, Interactive computer graphics, Interactive computer systems, Interactive devices, Interoperability, Metaverse, Metaverses, Multi-modal, multimodal, Simple++, Three dimensional computer graphics, user experience, User interfaces, Virtual Reality},
pubstate = {published},
tppubtype = {article}
}
Huang, J.; Wang, C.; Li, L.; Huang, C.; Dai, Q.; Xu, W.
BuildingBlock: A Hybrid Approach for Structured Building Generation Proceedings Article
In: Spencer, S. N. (Ed.): Association for Computing Machinery, Inc, 2025, ISBN: 9798400715402 (ISBN).
Abstract | Links | BibTeX | Tags: 3D modeling, 3D models, 3d-modeling, Architecture, benchmarking, Building blockes, Construction, Data-driven model, Generative 3D Modeling, Hierarchical systems, Hybrid approach, Interactive computer graphics, Language Model, Layout generations, Procedural & Data-driven Modeling, Procedural content generations, Three dimensional computer graphics
@inproceedings{huang_buildingblock_2025,
title = {BuildingBlock: A Hybrid Approach for Structured Building Generation},
author = {J. Huang and C. Wang and L. Li and C. Huang and Q. Dai and W. Xu},
editor = {S. N. Spencer},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105013956460&doi=10.1145%2F3721238.3730705&partnerID=40&md5=a0815a6742f5e1d072f0f559410ce28b},
doi = {10.1145/3721238.3730705},
isbn = {9798400715402 (ISBN)},
year = {2025},
date = {2025-01-01},
publisher = {Association for Computing Machinery, Inc},
abstract = {Three-dimensional building generation is vital for applications in gaming, virtual reality, and digital twins, yet current methods face challenges in producing diverse, structured, and hierarchically coherent buildings. We propose BuildingBlock, a hybrid approach that integrates generative models, procedural content generation (PCG), and large language models (LLMs) to address these limitations. Specifically, our method introduces a two-phase pipeline: the Layout Generation Phase (LGP) and the Building Construction Phase (BCP). LGP reframes box-based layout generation as a point-cloud generation task, utilizing a newly constructed architectural dataset and a Transformer-based diffusion model to create globally consistent layouts. With LLMs, these layouts are extended into rule-based hierarchical designs, seamlessly incorporating component styles and spatial structures. The BCP leverages these layouts to guide PCG, enabling local-customizable, high-quality structured building generation. Experimental results demonstrate BuildingBlock ’s effectiveness in generating diverse and hierarchically structured buildings, achieving state-of-the-art results on multiple benchmarks, and paving the way for scalable and intuitive architectural workflows. © 2025 Elsevier B.V., All rights reserved.},
keywords = {3D modeling, 3D models, 3d-modeling, Architecture, benchmarking, Building blockes, Construction, Data-driven model, Generative 3D Modeling, Hierarchical systems, Hybrid approach, Interactive computer graphics, Language Model, Layout generations, Procedural & Data-driven Modeling, Procedural content generations, Three dimensional computer graphics},
pubstate = {published},
tppubtype = {inproceedings}
}
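BuildingBlock's two-phase pipeline (a Layout Generation Phase followed by a Building Construction Phase, with an LLM refining layouts into rule-based hierarchies) can be pictured as the skeleton below. This is a structural Python sketch only; all data, labels and stub bodies are invented stand-ins for the paper's diffusion, LLM and PCG components.

from dataclasses import dataclass

@dataclass
class Box:
    center: tuple[float, float, float]
    size: tuple[float, float, float]
    label: str                      # e.g. "tower", "wing"

def layout_generation_phase(prompt: str) -> list[Box]:
    """LGP stand-in: the paper uses a Transformer-based diffusion model over point clouds."""
    return [Box((0, 0, 0), (10, 10, 20), "tower"),
            Box((12, 0, 0), (8, 8, 6), "wing")]

def refine_with_llm(boxes: list[Box]) -> dict:
    """Stand-in for the LLM step that extends box layouts into rule-based hierarchical designs."""
    return {"building": {b.label: {"style": "brick", "floors": int(b.size[2] // 3)} for b in boxes}}

def building_construction_phase(hierarchy: dict) -> None:
    """BCP stand-in: procedural content generation would realise geometry from each rule."""
    for part, rules in hierarchy["building"].items():
        print(f"constructing {part} with {rules}")

building_construction_phase(refine_with_llm(layout_generation_phase("a small campus hall")))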
Sun, Y.; Cheng, C.; Xu, C.; Lee, C. H.; Asadipour, A.
Hyborg Agency: Fostering AI Agents through Community Conversations in a Digital Forest Journal Article
In: Proceedings of the ACM on Computer Graphics and Interactive Techniques, vol. 8, no. 3, 2025, ISSN: 25776193 (ISSN), (Publisher: Association for Computing Machinery).
Abstract | Links | BibTeX | Tags: 3-D environments, Agents, Artificial intelligence, Communication platforms, Computational ecosystems, Ecosystems, Forestry, Human relationships, Human society, Immersive, Interactive computer graphics, Language evolution, Language Model, Mechanical, Social aspects, Thematic analysis, Virtual Reality
@article{sun_hyborg_2025,
title = {Hyborg Agency: Fostering AI Agents through Community Conversations in a Digital Forest},
author = {Y. Sun and C. Cheng and C. Xu and C. H. Lee and A. Asadipour},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105017974726&doi=10.1145%2F3736778&partnerID=40&md5=4685f589ec5dadb35d73a3ce853d7111},
doi = {10.1145/3736778},
issn = {25776193 (ISSN)},
year = {2025},
date = {2025-01-01},
journal = {Proceedings of the ACM on Computer Graphics and Interactive Techniques},
volume = {8},
number = {3},
abstract = {This paper presents Hyborg Agency, a computational ecosystem that explores how AI agents can meaningfully coexist with human society through the metaphor of a digital forest. By reimagining AI agents as mechanical deer, mutated from discarded electronics, the project defamiliarizes common perceptions of anthropomorphized AI agents, transforming them into non-human creatures with conversational abilities. The system employs Large Language Models (LLMs) to process community conversations, treating them as nutrients for AI growth, and features a dual-platform structure that connects an immersive 3D environment with Discord, a widely used communication platform. Based on community chats, Hyborgs generate daily summaries of their observations and share them with one another, fostering consistent memories and perceptions of the world. Through thematic analysis of public exhibitions and structured interviews with fourteen experts (seven pairs), we identified three key themes: social expansion in human relationships, language evolution through code-infused communication, and creative engagement through defamiliarized interaction. These findings highlight how AI agents can enrich human social relationships while maintaining transparency about their artificial nature. This work contributes to ongoing discussions on AI participation in society through a speculative scenario: human discourse serves as the foundation for constructing a digital forest that nurtures AI agents. In this virtual forest, humans and AI grow together in a symbiotic relationship, shaping one another and their shared environment, ultimately achieving harmonious coexistence. © 2025 Elsevier B.V., All rights reserved.},
note = {Publisher: Association for Computing Machinery},
keywords = {3-D environments, Agents, Artificial intelligence, Communication platforms, Computational ecosystems, Ecosystems, Forestry, Human relationships, Human society, Immersive, Interactive computer graphics, Language evolution, Language Model, Mechanical, Social aspects, Thematic analysis, Virtual Reality},
pubstate = {published},
tppubtype = {article}
}
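Hyborg Agency, as summarised above, feeds community conversations to LLM-driven agents that keep daily summaries as shared memory. A small hedged sketch of that digest step follows; call_llm is a hypothetical placeholder and the persona wording is invented.

from collections import defaultdict
from datetime import date

def call_llm(prompt: str) -> str:
    """Hypothetical helper: forward the prompt to any chat LLM and return its reply."""
    raise NotImplementedError

memories: dict[str, list[str]] = defaultdict(list)  # per-Hyborg long-term memory

def digest_day(hyborg_name: str, community_messages: list[str]) -> str:
    """Turn one day of community chat (the agent's 'nutrients') into a short stored observation."""
    summary = call_llm(
        f"You are {hyborg_name}, a mechanical deer living in a digital forest. "
        "Summarise today's community chat in two sentences:\n" + "\n".join(community_messages)
    )
    memories[hyborg_name].append(f"{date.today()}: {summary}")
    return summary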
Pardo B., C. E.; Iglesias R., O. I.; León A., M. D.; Quintero M., C. G.
EverydAI: Virtual Assistant for Decision-Making in Daily Contexts, Powered by Artificial Intelligence Journal Article
In: Systems, vol. 13, no. 9, 2025, ISSN: 20798954 (ISSN), (Publisher: Multidisciplinary Digital Publishing Institute (MDPI)).
Abstract | Links | BibTeX | Tags: Artificial intelligence, Augmented Reality, Behavioral Research, Decision making, Decisions makings, Digital avatar, Digital avatars, Information overloads, Informed decision, Interactive computer graphics, Language Model, Large language model, large language models, Natural language processing systems, Natural languages, Object Detection, Object recognition, Objects detection, recommendation systems, Recommender systems, Three dimensional computer graphics, Virtual assistants, Virtual Reality, web scraping, Web scrapings
@article{pardo_b_everydai_2025,
title = {EverydAI: Virtual Assistant for Decision-Making in Daily Contexts, Powered by Artificial Intelligence},
author = {C. E. Pardo B and O. I. Iglesias R and M. D. León A and C. G. Quintero M.},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105017115803&doi=10.3390%2Fsystems13090753&partnerID=40&md5=475327fffcdc43ee3466b4a65111866a},
doi = {10.3390/systems13090753},
issn = {20798954 (ISSN)},
year = {2025},
date = {2025-01-01},
journal = {Systems},
volume = {13},
number = {9},
abstract = {In an era of information overload, artificial intelligence plays a pivotal role in supporting everyday decision-making. This paper introduces EverydAI, a virtual AI-powered assistant designed to help users make informed decisions across various daily domains such as cooking, fashion, and fitness. By integrating advanced natural language processing, object detection, augmented reality, contextual understanding, digital 3D avatar models, web scraping, and image generation, EverydAI delivers personalized recommendations and insights tailored to individual needs. The proposed framework addresses challenges related to decision fatigue and information overload by combining real-time object detection and web scraping to enhance the relevance and reliability of its suggestions. EverydAI is evaluated through a two-phase survey, each one involving 30 participants with diverse demographic backgrounds. Results indicate that on average, 92.7% of users agreed or strongly agreed with statements reflecting the system’s usefulness, ease of use, and overall performance, indicating a high level of acceptance and perceived effectiveness. Additionally, EverydAI received an average user satisfaction score of 4.53 out of 5, underscoring its effectiveness in supporting users’ daily routines. © 2025 Elsevier B.V., All rights reserved.},
note = {Publisher: Multidisciplinary Digital Publishing Institute (MDPI)},
keywords = {Artificial intelligence, Augmented Reality, Behavioral Research, Decision making, Decisions makings, Digital avatar, Digital avatars, Information overloads, Informed decision, Interactive computer graphics, Language Model, Large language model, large language models, Natural language processing systems, Natural languages, Object Detection, Object recognition, Objects detection, recommendation systems, Recommender systems, Three dimensional computer graphics, Virtual assistants, Virtual Reality, web scraping, Web scrapings},
pubstate = {published},
tppubtype = {article}
}
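As a rough illustration of the pipeline sketched in the abstract above (object detection feeding web scraping and an LLM-based recommender), the snippet below shows one plausible composition. All helper functions (detect_objects, scrape_candidates, call_llm) are hypothetical placeholders, not the published implementation.

# Illustrative sketch of an EverydAI-style recommendation step: detect what the
# user has on hand, scrape candidate options in real time, then ask an LLM to
# pick and justify one recommendation.
from typing import List

def detect_objects(image_path: str) -> List[str]:
    """Hypothetical detector (e.g. a YOLO-style model) returning object labels."""
    raise NotImplementedError

def scrape_candidates(query: str) -> List[str]:
    """Hypothetical web scraper returning candidate items (recipes, outfits, ...)."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    raise NotImplementedError

def recommend(image_path: str, domain: str = "cooking") -> str:
    objects = detect_objects(image_path)               # e.g. ["tomato", "pasta", "basil"]
    candidates = scrape_candidates(" ".join(objects))  # real-time web results
    prompt = (
        f"Domain: {domain}. The user has: {', '.join(objects)}.\n"
        f"Candidate options from the web: {'; '.join(candidates[:5])}.\n"
        "Recommend one option and justify it briefly."
    )
    return call_llm(prompt)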
Kammari, K. S.; Annambhotla, Y. L.; Khanna, M.
ProWGAN a hybrid generative adversarial network for automated landscape generation in media and video games Journal Article
In: Discover Artificial Intelligence, vol. 5, no. 1, 2025, ISSN: 27310809 (ISSN), (Publisher: Springer Nature).
Abstract | Links | BibTeX | Tags: 3D modeling, Adversarial networks, Generative AI, Hybrid model, Image production, Interactive computer graphics, Landscape, Landscapes, Motion pictures, Progressive GAN, Video-games, Videogame environment, Videogames environments, Virtual Reality, Wasserstein GAN
@article{kammari_prowgan_2025,
title = {ProWGAN a hybrid generative adversarial network for automated landscape generation in media and video games},
author = {K. S. Kammari and Y. L. Annambhotla and M. Khanna},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105017739103&doi=10.1007%2Fs44163-025-00512-5&partnerID=40&md5=5e48fe6941113d9196abb4308cf5db0f},
doi = {10.1007/s44163-025-00512-5},
issn = {27310809 (ISSN)},
year = {2025},
date = {2025-01-01},
journal = {Discover Artificial Intelligence},
volume = {5},
number = {1},
abstract = {Current approaches to creating realistic, high-quality landscape imagery mostly depend on labor-intensive manual design procedures. In an effort to simplify image production for video games, virtual reality, and motion pictures, a new hybrid model called ProWGAN, combining ProGAN and WGAN approaches, is employed for automated landscape synthesis. Five models (FCGAN, DCGAN, ProGAN, WGAN, and ProWGAN) were trained on a dataset of landscape images and compared using multiple evaluation metrics. Compared to traditional models, ProWGAN produces 128×128 images with the best FID score (29.67), IS (5.11), and lowest critic loss (0.2), fully capturing landscape features in just 5 h of training and 50 epochs. The layered image-production method and progressive learning of ProGAN, combined with the stability of WGAN’s Wasserstein distance, showed superior ability to generate realistic landscape images. The results demonstrate how ProWGAN can revolutionize landscape image production by reducing manual work, lowering production time and effort, and how a 2D image can be converted into a 3D model via Meshroom. © 2025 Elsevier B.V., All rights reserved.},
note = {Publisher: Springer Nature},
keywords = {3D modeling, Adversarial networks, Generative AI, Hybrid model, Image production, Interactive computer graphics, Landscape, Landscapes, Motion pictures, Progressive GAN, Video-games, Videogame environment, Videogames environments, Virtual Reality, Wasserstein GAN},
pubstate = {published},
tppubtype = {article}
}
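For readers unfamiliar with the WGAN component referenced above, the PyTorch fragment below shows the standard Wasserstein critic and generator objectives that a ProWGAN-style hybrid builds on, with the ProGAN side reduced to a resolution schedule. It is a generic sketch, not the paper's training code.

# Minimal PyTorch sketch of the Wasserstein objectives borrowed from WGAN; the
# progressive-growing (ProGAN) idea appears only as a resolution schedule.
import torch

def critic_loss(critic, real: torch.Tensor, fake: torch.Tensor) -> torch.Tensor:
    # Approximate the Wasserstein distance: drive the critic's mean score on
    # real images above its mean score on generated images.
    return critic(fake).mean() - critic(real).mean()

def generator_loss(critic, fake: torch.Tensor) -> torch.Tensor:
    # The generator tries to raise the critic's score on its samples.
    return -critic(fake).mean()

# Progressive schedule (ProGAN idea): train at 4x4, then fade in 8x8, ... 128x128.
resolutions = [4, 8, 16, 32, 64, 128]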
2024
Gottsacker, M.; Bruder, G.; Welch, G. F.
rlty2rlty: Transitioning Between Realities with Generative AI Proceedings Article
In: Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW, pp. 1160–1161, Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 9798350374490 (ISBN).
Abstract | Links | BibTeX | Tags: Human computer interaction, Human computer interaction (HCI), Human-centered computing, Interaction paradigm, Interaction paradigms, Interactive computer graphics, Liminal spaces, Mixed / augmented reality, Mixed reality, Real environments, System use, User interfaces, Virtual worlds
@inproceedings{gottsacker_rlty2rlty_2024,
title = {rlty2rlty: Transitioning Between Realities with Generative AI},
author = {M. Gottsacker and G. Bruder and G. F. Welch},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85195556960&doi=10.1109%2FVRW62533.2024.00374&partnerID=40&md5=cef1bfa9489c71c9e134cd9dc2326b42},
doi = {10.1109/VRW62533.2024.00374},
isbn = {9798350374490 (ISBN)},
year = {2024},
date = {2024-01-01},
booktitle = {Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW},
pages = {1160–1161},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {We present a system for visually transitioning a mixed reality (MR) user between two arbitrary realities (e.g., between two virtual worlds or between the real environment and a virtual world). The system uses artificial intelligence (AI) to generate a 360° video that transforms the user's starting environment to another environment, passing through a liminal space that could help them relax between tasks or prepare them for the ending environment. The video can then be viewed on an MR headset. © 2024 Elsevier B.V., All rights reserved.},
keywords = {Human computer interaction, Human computer interaction (HCI), Human-centered computing, Interaction paradigm, Interaction paradigms, Interactive computer graphics, Liminal spaces, Mixed / augmented reality, Mixed reality, Real environments, System use, User interfaces, Virtual worlds},
pubstate = {published},
tppubtype = {inproceedings}
}
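A minimal sketch of the transition idea described above, assuming a hypothetical text-to-360°-image generator: prompts are interpolated from the starting environment through a liminal space to the target environment, and one equirectangular frame is generated per step.

# Rough sketch (hypothetical helpers, not the authors' system) of generating a
# 360-degree transition video frame by frame.
from typing import List

def generate_360_frame(prompt: str) -> object:
    """Hypothetical text-to-360-image call returning one equirectangular frame."""
    raise NotImplementedError

def transition_prompts(start: str, liminal: str, end: str, steps: int = 30) -> List[str]:
    prompts = []
    for i in range(steps):
        t = i / (steps - 1)
        if t < 0.5:
            prompts.append(f"equirectangular 360 view, {start} dissolving into {liminal}")
        else:
            prompts.append(f"equirectangular 360 view, {liminal} resolving into {end}")
    return prompts

# Example usage (environments are made up):
# frames = [generate_360_frame(p) for p in transition_prompts(
#     "a sunlit office", "a calm foggy liminal corridor", "a medieval fantasy village")]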
Nebeling, M.; Oki, M.; Gelsomini, M.; Hayes, G. R.; Billinghurst, M.; Suzuki, K.; Graf, R.
Designing Inclusive Future Augmented Realities Proceedings Article
In: Conf Hum Fact Comput Syst Proc, Association for Computing Machinery, 2024, ISBN: 9798400703317 (ISBN).
Abstract | Links | BibTeX | Tags: Accessible and inclusive design, Augmented Reality, Augmented reality technology, Display technologies, Generative AI, Inclusive design, Interactive computer graphics, Mixed reality, Mixed reality technologies, Rapid prototyping, Rapid-prototyping, Sensing technology, Spatial computing
@inproceedings{nebeling_designing_2024,
title = {Designing Inclusive Future Augmented Realities},
author = {M. Nebeling and M. Oki and M. Gelsomini and G. R. Hayes and M. Billinghurst and K. Suzuki and R. Graf},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85194176929&doi=10.1145%2F3613905.3636313&partnerID=40&md5=298fb08ec3634b0ac98be592366ef03f},
doi = {10.1145/3613905.3636313},
isbn = {9798400703317 (ISBN)},
year = {2024},
date = {2024-01-01},
booktitle = {Conf Hum Fact Comput Syst Proc},
publisher = {Association for Computing Machinery},
abstract = {Augmented and mixed reality technology is rapidly advancing, driven by innovations in display, sensing, and AI technologies. This evolution, particularly in the era of generative AI with large language and text-to-image models such as GPT and Stable Diffusion, has the potential, not only to make it easier to create, but also to adapt and personalize, new content. Our workshop explores the pivotal role of augmented and mixed reality to shape a user's interactions with their physical surroundings. We aim to explore how inclusive future augmented realities can be designed, with increasing support for automation, such that environments can welcome users with different needs, emphasizing accessibility and inclusion through layers of augmentations. Our aim is not only to remove barriers by providing accommodations, but also to create a sense of belonging by directly engaging users. Our workshop consists of three main activities: (1) Through brainstorming and discussion of examples provided by the workshop organizers and participants, we critically review the landscape of accessible and inclusive design and their vital role in augmented and mixed reality experiences. (2) Through rapid prototyping activities including bodystorming and low-fidelity, mixed-media prototypes, participants explore how augmented and mixed reality can transform physical space into a more personal place, enhancing accessibility and inclusion based on novel interface and interaction techniques that are desirable, but not necessarily technically feasible just yet. In the workshop, we plan to focus on physical space to facilitate rapid prototyping without technical constraints, but techniques developed in the workshop are likely applicable to immersive virtual environments as well. (3) Finally, we collaborate to outline a research agenda for designing future augmented realities that promote equal opportunities, benefiting diverse user populations. Our workshop inspires innovation in augmented and mixed reality, reshaping physical environments to be more accessible and inclusive through immersive design. © 2025 Elsevier B.V., All rights reserved.},
keywords = {Accessible and inclusive design, Augmented Reality, Augmented reality technology, Display technologies, Generative AI, Inclusive design, Interactive computer graphics, Mixed reality, Mixed reality technologies, Rapid prototyping, Rapid-prototyping, Sensing technology, Spatial computing},
pubstate = {published},
tppubtype = {inproceedings}
}
De La Torre, F.; Fang, C. M.; Huang, H.; Banburski-Fahey, A.; Fernandez, J. A.; Lanier, J.
LLMR: Real-time Prompting of Interactive Worlds using Large Language Models Proceedings Article
In: Conf Hum Fact Comput Syst Proc, Association for Computing Machinery, 2024, ISBN: 979-840070330-0 (ISBN).
Abstract | Links | BibTeX | Tags: Artificial intelligence, Computational Linguistics, Design goal, Interactive computer graphics, Interactive worlds, Internal dynamics, Language Model, Large language model, Mixed reality, Novel strategies, Real- time, Spatial Reasoning, Training data
@inproceedings{de_la_torre_llmr_2024,
title = {LLMR: Real-time Prompting of Interactive Worlds using Large Language Models},
author = {F. De La Torre and C. M. Fang and H. Huang and A. Banburski-Fahey and J. A. Fernandez and J. Lanier},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85194848276&doi=10.1145%2f3613904.3642579&partnerID=40&md5=14969e96507a1f0110262021e5b1172d},
doi = {10.1145/3613904.3642579},
isbn = {979-840070330-0 (ISBN)},
year = {2024},
date = {2024-01-01},
booktitle = {Conf Hum Fact Comput Syst Proc},
publisher = {Association for Computing Machinery},
abstract = {We present Large Language Model for Mixed Reality (LLMR), a framework for the real-time creation and modification of interactive Mixed Reality experiences using LLMs. LLMR leverages novel strategies to tackle difficult cases where ideal training data is scarce, or where the design goal requires the synthesis of internal dynamics, intuitive analysis, or advanced interactivity. Our framework relies on text interaction and the Unity game engine. By incorporating techniques for scene understanding, task planning, self-debugging, and memory management, LLMR outperforms the standard GPT-4 by 4x in average error rate. We demonstrate LLMR's cross-platform interoperability with several example worlds, and evaluate it on a variety of creation and modification tasks to show that it can produce and edit diverse objects, tools, and scenes. Finally, we conducted a usability study (N=11) with a diverse set of participants that revealed they had positive experiences with the system and would use it again. © 2024 Copyright held by the owner/author(s)},
keywords = {Artificial intelligence, Computational Linguistics, Design goal, Interactive computer graphics, Interactive worlds, Internal dynamics, Language Model, Large language model, Mixed reality, Novel strategies, Real- time, Spatial Reasoning, Training data},
pubstate = {published},
tppubtype = {inproceedings}
}
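The abstract above mentions scene understanding, task planning, self-debugging, and memory management components. The sketch below illustrates only the generate-and-self-debug portion of such a pipeline under assumed interfaces (call_llm, try_compile); it is not the published LLMR code.

# Schematic sketch of an LLM drafting Unity C# for a requested scene change,
# with compiler errors fed back for another attempt (self-debugging).
def call_llm(prompt: str) -> str:
    raise NotImplementedError

def try_compile(code: str) -> str:
    """Return an empty string on success, otherwise the compiler errors."""
    raise NotImplementedError

def build_scene_edit(user_request: str, scene_summary: str, max_attempts: int = 3) -> str:
    prompt = (
        f"Scene summary: {scene_summary}\n"
        f"User request: {user_request}\n"
        "Write Unity C# that performs this change."
    )
    code = call_llm(prompt)
    for _ in range(max_attempts):
        errors = try_compile(code)
        if not errors:
            return code  # ready to hand off to the engine
        # Self-debugging: show the model its own errors and ask for a fix.
        code = call_llm(
            f"The following code failed to compile:\n{code}\nErrors:\n{errors}\nFix it."
        )
    raise RuntimeError("Could not produce compiling code within the attempt budget")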
He, K.; Yao, K.; Zhang, Q.; Yu, J.; Liu, L.; Xu, L.
DressCode: Autoregressively Sewing and Generating Garments from Text Guidance Journal Article
In: ACM Transactions on Graphics, vol. 43, no. 4, 2024, ISSN: 07300301 (ISSN).
Abstract | Links | BibTeX | Tags: 3D content, 3d garments, autoregressive model, Autoregressive modelling, Content creation, Digital humans, Embeddings, Fashion design, Garment generation, Interactive computer graphics, Sewing pattern, sewing patterns, Textures, Virtual Reality, Virtual Try-On
@article{he_dresscode_2024,
title = {DressCode: Autoregressively Sewing and Generating Garments from Text Guidance},
author = {K. He and K. Yao and Q. Zhang and J. Yu and L. Liu and L. Xu},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85199257820&doi=10.1145%2f3658147&partnerID=40&md5=8996e62e4d9dabb5a7034f8bf4df5a43},
doi = {10.1145/3658147},
issn = {07300301 (ISSN)},
year = {2024},
date = {2024-01-01},
journal = {ACM Transactions on Graphics},
volume = {43},
number = {4},
abstract = {Apparel's significant role in human appearance underscores the importance of garment digitalization for digital human creation. Recent advances in 3D content creation are pivotal for digital human creation. Nonetheless, garment generation from text guidance is still nascent. We introduce a text-driven 3D garment generation framework, DressCode, which aims to democratize design for novices and offer immense potential in fashion design, virtual try-on, and digital human creation. We first introduce SewingGPT, a GPT-based architecture integrating cross-attention with text-conditioned embedding to generate sewing patterns with text guidance. We then tailor a pre-trained Stable Diffusion to generate tile-based Physically-based Rendering (PBR) textures for the garments. By leveraging a large language model, our framework generates CG-friendly garments through natural language interaction. It also facilitates pattern completion and texture editing, streamlining the design process through user-friendly interaction. This framework fosters innovation by allowing creators to freely experiment with designs and incorporate unique elements into their work. With comprehensive evaluations and comparisons with other state-of-the-art methods, our method showcases superior quality and alignment with input prompts. User studies further validate our high-quality rendering results, highlighting its practical utility and potential in production settings. Copyright © 2024 held by the owner/author(s).},
keywords = {3D content, 3d garments, autoregressive model, Autoregressive modelling, Content creation, Digital humans, Embeddings, Fashion design, Garment generation, Interactive computer graphics, Sewing pattern, sewing patterns, Textures, Virtual Reality, Virtual Try-On},
pubstate = {published},
tppubtype = {article}
}
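To make the "GPT-based architecture integrating cross-attention with text-conditioned embedding" concrete, here is a toy PyTorch decoder that autoregressively emits sewing-pattern tokens while cross-attending to a text embedding. The vocabulary size, dimensions, and text encoder are assumptions for illustration, not the released SewingGPT.

# Toy autoregressive decoder: pattern tokens attend to a text embedding supplied
# as the cross-attention "memory"; greedy decoding builds the token sequence.
import torch
import torch.nn as nn

VOCAB, D, BOS = 1024, 256, 0  # hypothetical pattern-token vocabulary and dims

class PatternDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D)
        layer = nn.TransformerDecoderLayer(d_model=D, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=4)
        self.head = nn.Linear(D, VOCAB)

    def forward(self, tokens: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        x = self.embed(tokens)
        # Cross-attention to the text condition happens inside the decoder
        # layers, with text_emb of shape (1, seq_len, D) as the memory.
        x = self.decoder(tgt=x, memory=text_emb)
        return self.head(x)

@torch.no_grad()
def generate(model: PatternDecoder, text_emb: torch.Tensor, steps: int = 64) -> torch.Tensor:
    tokens = torch.full((1, 1), BOS, dtype=torch.long)
    for _ in range(steps):
        logits = model(tokens, text_emb)
        nxt = logits[:, -1].argmax(dim=-1, keepdim=True)
        tokens = torch.cat([tokens, nxt], dim=1)
    return tokens  # decoded downstream into panels and stitches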
He, K.; Lapham, A.; Li, Z.
Enhancing Narratives with SayMotion's text-to-3D animation and LLMs Proceedings Article
In: Spencer, S. N. (Ed.): Proc. - SIGGRAPH Real-Time Live!, Association for Computing Machinery, Inc, 2024, ISBN: 9798400705267 (ISBN).
Abstract | Links | BibTeX | Tags: 3D animation, AI-based animation, Animation, Animation editing, Deep learning, Film production, Human motions, Interactive computer graphics, Interactive media, Language Model, Motion models, Physics simulation, Production medium, Simulation platform, Three dimensional computer graphics
@inproceedings{he_enhancing_2024,
title = {Enhancing Narratives with SayMotion's text-to-3D animation and LLMs},
author = {K. He and A. Lapham and Z. Li},
editor = {S. N. Spencer},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85200655076&doi=10.1145%2F3641520.3665309&partnerID=40&md5=16af33ce451919f43d1ba2ccab63f1af},
doi = {10.1145/3641520.3665309},
isbn = {9798400705267 (ISBN)},
year = {2024},
date = {2024-01-01},
booktitle = {Proc. - SIGGRAPH Real-Time Live!},
publisher = {Association for Computing Machinery, Inc},
abstract = {SayMotion, a generative AI text-to-3D animation platform, utilizes deep generative learning and advanced physics simulation to transform text descriptions into realistic 3D human motions for applications in gaming, extended reality (XR), film production, education and interactive media. SayMotion addresses challenges due to the complexities of animation creation by employing a Large Language Model (LLM) fine-tuned to human motion with further AI-based animation editing components including spatial-temporal Inpainting via a proprietary Large Motion Model (LMM). SayMotion is a pioneer in the animation market by offering a comprehensive set of AI generation and AI editing functions for creating 3D animations efficiently and intuitively. With an LMM at its core, SayMotion aims to democratize 3D animations for everyone through language and generative motion. © 2024 Elsevier B.V., All rights reserved.},
keywords = {3D animation, AI-based animation, Animation, Animation editing, Deep learning, Film production, Human motions, Interactive computer graphics, Interactive media, Language Model, Motion models, Physics simulation, Production medium, Simulation platform, Three dimensional computer graphics},
pubstate = {published},
tppubtype = {inproceedings}
}
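A hedged sketch of the authoring flow implied above: a text prompt produces a motion clip, and spatial-temporal inpainting regenerates only an edited time window. Both functions are hypothetical placeholders rather than SayMotion's actual API.

# Hypothetical text-to-motion and motion-inpainting calls; only the editing
# window is regenerated, the rest of the clip is kept.
from typing import Any, Tuple

def text_to_motion(prompt: str) -> Any:
    """Hypothetical call returning a motion clip (e.g. per-frame joint rotations)."""
    raise NotImplementedError

def inpaint_motion(clip: Any, window: Tuple[float, float], edit_prompt: str) -> Any:
    """Hypothetical spatial-temporal inpainting over `window` seconds of the clip."""
    raise NotImplementedError

def author_shot() -> Any:
    clip = text_to_motion("a character sprints, leaps over a crate, and lands in a roll")
    # Keep the sprint and landing; regenerate only the mid-air portion.
    return inpaint_motion(clip, window=(1.2, 2.0), edit_prompt="tuck the knees tighter mid-air")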
Leong, C. W.; Jawahar, N.; Basheerabad, V.; Wortwein, T.; Emerson, A.; Sivan, G.
Combining Generative and Discriminative AI for High-Stakes Interview Practice Proceedings Article
In: ACM Int. Conf. Proc. Ser., pp. 94–96, Association for Computing Machinery, 2024, ISBN: 9798400704635 (ISBN).
Abstract | Links | BibTeX | Tags: AI systems, College admissions, Continuous improvements, End to end, Interactive computer graphics, Interactive dialog system, interactive dialogue systems, Language Model, Modeling languages, Multi-modal, Multimodal computing, Video interview, video interviews, Virtual avatar, Virtual environments, Virtual Reality
@inproceedings{leong_combining_2024,
title = {Combining Generative and Discriminative AI for High-Stakes Interview Practice},
author = {C. W. Leong and N. Jawahar and V. Basheerabad and T. Wortwein and A. Emerson and G. Sivan},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85211135262&doi=10.1145%2F3686215.3688377&partnerID=40&md5=6d4d229efe1cf8fee7ef701ff758bc87},
doi = {10.1145/3686215.3688377},
isbn = {9798400704635 (ISBN)},
year = {2024},
date = {2024-01-01},
booktitle = {ACM Int. Conf. Proc. Ser.},
pages = {94–96},
publisher = {Association for Computing Machinery},
abstract = {We present a demo comprising an end-to-end AI pipeline for practicing video interviews for a high-stakes scenario (i.e., college admissions) with personalized, actionable feedback for continuous improvement of the user. Utilizing large language models (LLMs), we generate questions and responses for a virtual avatar interviewer. Our focus on key qualities - such as concise responses with low latency, empathy, and smooth topic navigation - led to a comparative evaluation of several prominent LLMs, each undergoing evolutionary development. We also discuss the integration of avatar technology to create an immersive, virtual environment for naturalistic dyadic conversations. © 2024 Elsevier B.V., All rights reserved.},
keywords = {AI systems, College admissions, Continuous improvements, End to end, Interactive computer graphics, Interactive dialog system, interactive dialogue systems, Language Model, Modeling languages, Multi-modal, Multimodal computing, Video interview, video interviews, Virtual avatar, Virtual environments, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
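One way to picture the generative-plus-discriminative combination described above is the turn loop below: an LLM plays the avatar interviewer, a separate discriminative model scores the candidate's answer, and the score conditions the feedback. The helpers are hypothetical and do not reflect the authors' implementation.

# Sketch of a single interview turn: score the answer, generate feedback, and
# produce the next question for the avatar interviewer.
def call_llm(prompt: str) -> str:
    raise NotImplementedError

def score_response(question: str, answer: str) -> float:
    """Hypothetical discriminative model returning a 0-1 quality score."""
    raise NotImplementedError

def interview_turn(history: list, answer: str) -> dict:
    question = history[-1]
    quality = score_response(question, answer)
    feedback = call_llm(
        f"Question: {question}\nAnswer: {answer}\nScore: {quality:.2f}\n"
        "Give one concise, actionable suggestion for improvement."
    )
    next_q = call_llm(
        f"Conversation so far: {history + [answer]}\n"
        "Ask the next admissions interview question, keeping it brief and empathetic."
    )
    return {"feedback": feedback, "next_question": next_q, "score": quality}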