AHCI RESEARCH GROUP
Publications
Papers published in international journals, proceedings of conferences, workshops and books.
OUR RESEARCH
Scientific Publications
2025
Tian, Y.; Li, X.; Cheng, Z.; Huang, Y.; Yu, T.
Design of Realistic and Artistically Expressive 3D Facial Models for Film AIGC: A Cross-Modal Framework Integrating Audience Perception Evaluation Journal Article
In: Sensors, vol. 25, no. 15, 2025, ISSN: 1424-8220, (Publisher: Multidisciplinary Digital Publishing Institute (MDPI)).
Abstract | Links | BibTeX | Tags: 3D faces, 3d facial model, 3D facial models, 3D modeling, adaptation, adult, Article, Audience perception evaluation, benchmarking, controlled study, Cross-modal, Face generation, Facial modeling, facies, Feature extraction, feedback, feedback system, female, Geometry, High-fidelity, human, illumination, Immersive media, Lighting, male, movie, Neural radiance field, Neural Radiance Fields, perception, Quality control, Rendering (computer graphics), Semantics, sensor, Three dimensional computer graphics, Virtual production, Virtual Reality
@article{tian_design_2025,
title = {Design of Realistic and Artistically Expressive 3D Facial Models for Film AIGC: A Cross-Modal Framework Integrating Audience Perception Evaluation},
author = {Y. Tian and X. Li and Z. Cheng and Y. Huang and T. Yu},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105013137724&doi=10.3390%2Fs25154646&partnerID=40&md5=8508a27b693f0857ce7cb58e97a2705c},
doi = {10.3390/s25154646},
issn = {1424-8220},
year = {2025},
date = {2025-01-01},
journal = {Sensors},
volume = {25},
number = {15},
abstract = {The rise of virtual production has created an urgent need for both efficient and high-fidelity 3D face generation schemes for cinema and immersive media, but existing methods are often limited by lighting–geometry coupling, multi-view dependency, and insufficient artistic quality. To address this, this study proposes a cross-modal 3D face generation framework based on single-view semantic masks. It uses a Swin Transformer for multi-level feature extraction and combines it with NeRF for illumination-decoupled rendering. We use physical rendering equations to explicitly separate surface reflectance from ambient lighting, achieving robust adaptation to complex lighting variations. In addition, to address geometric errors across illumination scenes, we construct a geometric prior constraint network that, with the help of semantic masks, maps 2D facial features into the 3D parameter space as regularization terms. On the CelebAMask-HQ dataset, this method achieves a leading score of SSIM = 0.892 (a 37.6% improvement over the baseline) with FID = 40.6. The generated faces excel in symmetry and detail fidelity, with realism and aesthetic scores of 8/10 and 7/10, respectively, in a perceptual evaluation with 1000 viewers. By combining physical-level illumination decoupling with semantic geometric priors, this paper establishes a quantifiable feedback mechanism between objective metrics and human aesthetic evaluation, providing a new paradigm for aesthetic quality assessment of AI-generated content.},
note = {Publisher: Multidisciplinary Digital Publishing Institute (MDPI)},
keywords = {3D faces, 3d facial model, 3D facial models, 3D modeling, adaptation, adult, Article, Audience perception evaluation, benchmarking, controlled study, Cross-modal, Face generation, Facial modeling, facies, Feature extraction, feedback, feedback system, female, Geometry, High-fidelity, human, illumination, Immersive media, Lighting, male, movie, Neural radiance field, Neural Radiance Fields, perception, Quality control, Rendering (computer graphics), Semantics, sensor, Three dimensional computer graphics, Virtual production, Virtual Reality},
pubstate = {published},
tppubtype = {article}
}
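The record above reports quantitative results (SSIM = 0.892, FID = 40.6 on CelebAMask-HQ) alongside a perceptual study. As a purely illustrative aside, not the authors' code, the sketch below shows how an SSIM score of this kind can be computed with scikit-image; the file names and the 8-bit data range are assumptions.

```python
# Hedged sketch: computing SSIM between a generated face render and a reference
# image, i.e. the metric reported in the entry above. Paths and the data range
# are illustrative assumptions, not taken from the paper.
import numpy as np
from skimage.io import imread
from skimage.metrics import structural_similarity

generated = imread("generated_face.png")   # hypothetical output of the face pipeline
reference = imread("reference_face.png")   # hypothetical ground-truth view

# channel_axis=-1 treats the last axis as RGB; data_range=255 matches 8-bit images.
score = structural_similarity(generated, reference, channel_axis=-1, data_range=255)
print(f"SSIM = {score:.3f}")   # the paper reports SSIM = 0.892 on CelebAMask-HQ
```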
Zhao, Y.; Dasari, M.; Guo, T.
CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality Journal Article
In: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 9, no. 3, 2025, ISSN: 2474-9567, (Publisher: Association for Computing Machinery).
Abstract | Links | BibTeX | Tags: Augmented Reality, Color computer graphics, Environment lighting, Estimation results, Generative model, High quality, Human engineering, Immersive, Lighting, Lighting conditions, Lighting estimation, Mobile augmented reality, Real-time refinement, Rendering (computer graphics), Statistical tests, Virtual objects, Virtual Reality
@article{zhao_clear_2025,
title = {CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality},
author = {Y. Zhao and M. Dasari and T. Guo},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105015452988&doi=10.1145%2F3749535&partnerID=40&md5=ed970d47cbf7f547555eca43b32cd7e7},
doi = {10.1145/3749535},
issn = {2474-9567},
year = {2025},
date = {2025-01-01},
journal = {Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies},
volume = {9},
number = {3},
abstract = {High-quality environment lighting is essential for creating immersive mobile augmented reality (AR) experiences. However, achieving visually coherent estimation for mobile AR is challenging due to several key limitations in AR device sensing capabilities, including a narrow camera field of view and limited pixel dynamic range. Recent advancements in generative AI, which can generate high-quality images from different types of prompts, including text and images, present a potential solution for high-quality lighting estimation. Still, to use generative image diffusion models effectively, we must address two key limitations: content quality and slow inference. In this work, we design and implement a generative lighting estimation system called CleAR that can produce high-quality, diverse environment maps in the format of 360° HDR images. Specifically, we design a two-step generation pipeline guided by AR environment context data to ensure the output aligns with the physical environment's visual context and color appearance. To improve estimation robustness under different lighting conditions, we design a real-time refinement component to adjust lighting estimation results on AR devices. To train and test our generative models, we curate a large-scale environment lighting estimation dataset with diverse lighting conditions. Through a combination of quantitative and qualitative evaluations, we show that CleAR outperforms state-of-the-art lighting estimation methods in estimation accuracy, latency, and robustness, and is rated by 31 participants as producing better renderings for most virtual objects. For example, CleAR achieves a 51% to 56% accuracy improvement on virtual object renderings across objects with three distinctive types of materials and reflective properties. CleAR produces lighting estimates of comparable or better quality in just 3.2 seconds, over 110× faster than state-of-the-art methods. Moreover, CleAR supports real-time refinement of lighting estimation results, ensuring robust and timely updates for AR applications.},
note = {Publisher: Association for Computing Machinery},
keywords = {Augmented Reality, Color computer graphics, Environment lighting, Estimation results, Generative model, High quality, Human engineering, Immersive, Lighting, Lighting conditions, Lighting estimation, Mobile augmented reality, Real-time refinement, Rendering (computer graphics), Statistical tests, Virtual objects, Virtual Reality},
pubstate = {published},
tppubtype = {article}
}
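CleAR's abstract describes an on-device, real-time refinement step that adjusts the generated 360° HDR environment map to the physical environment's color appearance. The NumPy sketch below is an assumption-level illustration of that general idea (a per-channel gain matching the panorama's mean color to the live camera frame's); it is not CleAR's actual algorithm, and all names and data are hypothetical.

```python
# Hedged sketch of color-appearance refinement for a generated HDR environment map.
# This is an illustrative stand-in for the paper's on-device refinement component.
import numpy as np

def refine_env_map(env_map: np.ndarray, camera_frame: np.ndarray) -> np.ndarray:
    """Scale an HxWx3 linear HDR panorama so its mean color matches an hxwx3 camera frame."""
    frame_mean = camera_frame.reshape(-1, 3).mean(axis=0)   # observed scene color
    env_mean = env_map.reshape(-1, 3).mean(axis=0)          # generated panorama color
    gain = frame_mean / np.maximum(env_mean, 1e-6)          # per-channel correction
    return env_map * gain

# Usage with synthetic data standing in for a generated panorama and a camera frame.
env = np.random.rand(256, 512, 3).astype(np.float32) * 2.0   # fake HDR panorama
frame = np.random.rand(480, 640, 3).astype(np.float32)       # fake linear RGB frame
refined = refine_env_map(env, frame)
print(refined.shape, refined.mean(axis=(0, 1)))
```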
2023
Vincent, B.; Ayyar, K.
Roblox Generative AI in action Proceedings Article
In: Spencer, S. N. (Ed.): Proc. - SIGGRAPH Real-Time Live!, Association for Computing Machinery, Inc, 2023, ISBN: 9798400701580.
Abstract | Links | BibTeX | Tags: AI techniques, Complex model, Creation process, Education, Game, Games, Interactive computer graphics, Interactive objects, Lighting, Metaverse, Metaverses, Modeling, Modeling languages, Natural languages, Object and scenes, Pipeline, Real-Time Rendering, Rendering (computer graphics)
@inproceedings{vincent_roblox_2023,
title = {Roblox Generative AI in action},
author = {B. Vincent and K. Ayyar},
editor = {S. N. Spencer},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85167946022&doi=10.1145%2F3588430.3597250&partnerID=40&md5=40f0284036e544eeb5c6c825849d5466},
doi = {10.1145/3588430.3597250},
isbn = {9798400701580},
year = {2023},
date = {2023-01-01},
booktitle = {Proc. - SIGGRAPH Real-Time Live!},
publisher = {Association for Computing Machinery, Inc},
abstract = {Roblox is investing in generative AI techniques to revolutionize the creation process on its platform. By leveraging natural language and other intuitive expressions of intent, creators can build interactive objects and scenes without complex modeling or coding. The use of AI image generation services and large language models aims to make creation faster and easier for every user on the platform.},
keywords = {AI techniques, Complex model, Creation process, Education, Game, Games, Interactive computer graphics, Interactive objects, Lighting, Metaverse, Metaverses, Modeling, Modeling languages, Natural languages, Object and scenes, Pipeline, Real-Time Rendering, Rendering (computer graphics)},
pubstate = {published},
tppubtype = {inproceedings}
}