AHCI RESEARCH GROUP
Publications
Papers published in international journals, in the proceedings of conferences and workshops, and in books.
OUR RESEARCH
Scientific Publications
How to
You can use the tag cloud to select only the papers dealing with specific research topics.
You can expand the Abstract, Links and BibTeX record for each paper.
2025
Tian, Y.; Li, X.; Cheng, Z.; Huang, Y.; Yu, T.
In: Sensors, vol. 25, no. 15, 2025, ISSN: 1424-8220, (Publisher: Multidisciplinary Digital Publishing Institute (MDPI)).
Abstract | Links | BibTeX | Tags: 3D faces, 3d facial model, 3D facial models, 3D modeling, adaptation, adult, Article, Audience perception evaluation, benchmarking, controlled study, Cross-modal, Face generation, Facial modeling, facies, Feature extraction, feedback, feedback system, female, Geometry, High-fidelity, human, illumination, Immersive media, Lighting, male, movie, Neural radiance field, Neural Radiance Fields, perception, Quality control, Rendering (computer graphics), Semantics, sensor, Three dimensional computer graphics, Virtual production, Virtual Reality
@article{tian_design_2025,
  title         = {Design of Realistic and Artistically Expressive {3D} Facial Models for Film {AIGC}: A Cross-Modal Framework Integrating Audience Perception Evaluation},
  author        = {Tian, Y. and Li, X. and Cheng, Z. and Huang, Y. and Yu, T.},
  url           = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105013137724&doi=10.3390%2Fs25154646&partnerID=40&md5=8508a27b693f0857ce7cb58e97a2705c},
  doi           = {10.3390/s25154646},
  issn          = {1424-8220},
  year          = {2025},
  date          = {2025-01-01},
  journal       = {Sensors},
  volume        = {25},
  number        = {15},
  abstract      = {The rise of virtual production has created an urgent need for both efficient and high-fidelity 3D face generation schemes for cinema and immersive media, but existing methods are often limited by lighting–geometry coupling, multi-view dependency, and insufficient artistic quality. To address this, this study proposes a cross-modal 3D face generation framework based on single-view semantic masks. It utilizes Swin Transformer for multi-level feature extraction and combines with NeRF for illumination decoupled rendering. We utilize physical rendering equations to explicitly separate surface reflectance from ambient lighting to achieve robust adaptation to complex lighting variations. In addition, to address geometric errors across illumination scenes, we construct geometric a priori constraint networks by mapping 2D facial features to 3D parameter space as regular terms with the help of semantic masks. On the CelebAMask-HQ dataset, this method achieves a leading score of SSIM = 0.892 (37.6% improvement from baseline) with FID = 40.6. The generated faces excel in symmetry and detail fidelity with realism and aesthetic scores of 8/10 and 7/10, respectively, in a perceptual evaluation with 1000 viewers. By combining physical-level illumination decoupling with semantic geometry a priori, this paper establishes a quantifiable feedback mechanism between objective metrics and human aesthetic evaluation, providing a new paradigm for aesthetic quality assessment of AI-generated content. © 2025 Elsevier B.V., All rights reserved.},
  note          = {Publisher: Multidisciplinary Digital Publishing Institute (MDPI)},
  internal-note = {Abstract ends with an Elsevier copyright notice on an MDPI paper; likely scraper boilerplate — verify against the publisher record. Authors only available as initials in the source export; expand to full given names if known.},
  keywords      = {3D faces, 3d facial model, 3D facial models, 3D modeling, adaptation, adult, Article, Audience perception evaluation, benchmarking, controlled study, Cross-modal, Face generation, Facial modeling, facies, Feature extraction, feedback, feedback system, female, Geometry, High-fidelity, human, illumination, Immersive media, Lighting, male, movie, Neural radiance field, Neural Radiance Fields, perception, Quality control, Rendering (computer graphics), Semantics, sensor, Three dimensional computer graphics, Virtual production, Virtual Reality},
  pubstate      = {published},
  tppubtype     = {article}
}
2023
Vaidhyanathan, V.; Radhakrishnan, T. R.; López, J. L. G.
Spacify: A Generative Framework for Spatial Comprehension, Articulation and Visualization using Large Language Models (LLMs) and eXtended Reality (XR) Proceedings Article
In: Crawford, A.; Diniz, N. M.; Beckett, R.; Vanucchi, J.; Swackhamer, M. (Ed.): Habits Anthropocene: Scarcity Abundance Post-Mater. Econ. - Proc. Annu. Conf. Assoc. Comput. Aided Des. Archit., ACADIA, pp. 430–443, Association for Computer Aided Design in Architecture, 2023, ISBN: 9798986080598 (ISBN); 9798986080581 (ISBN).
Abstract | Links | BibTeX | Tags: 3D data processing, 3D spaces, Architectural design, Built environment, C (programming language), Computational Linguistics, Computer aided design, Computer architecture, Data handling, Data users, Data visualization, Immersive media, Interior designers, Language Model, Natural languages, Spatial design, Three dimensional computer graphics, Urban designers, User interfaces, Visualization
@inproceedings{vaidhyanathan_spacify_2023,
  title         = {Spacify: A Generative Framework for Spatial Comprehension, Articulation and Visualization using Large Language Models ({LLMs}) and {eXtended} Reality ({XR})},
  author        = {Vaidhyanathan, V. and Radhakrishnan, T. R. and L{\'o}pez, J. L. G.},
  editor        = {Crawford, A. and Diniz, N. M. and Beckett, R. and Vanucchi, J. and Swackhamer, M.},
  url           = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85192831586&partnerID=40&md5=996906de0f5ef1e6c88b10bb65caabc0},
  isbn          = {9798986080598; 9798986080581},
  year          = {2023},
  date          = {2023-01-01},
  booktitle     = {Habits Anthropocene: Scarcity Abundance Post-Mater. Econ. - Proc. Annu. Conf. Assoc. Comput. Aided Des. Archit., ACADIA},
  volume        = {2},
  pages         = {430--443},
  publisher     = {Association for Computer Aided Design in Architecture},
  abstract      = {Spatial design, the thoughtful planning and creation of built environments, typically requires advanced technical knowledge and visuospatial skills, making it largely exclusive to professionals like architects, interior designers, and urban designers. This exclusivity limits non-experts' access to spatial design, despite their ability to describe requirements and suggestions in natural language. Recent advancements in generative artificial intelligence (AI), particularly large language models (LLMs), and extended reality, (XR) offer the potential to address this limitation. This paper introduces Spacify (Figure 1), a framework that utilizes the generalizing capabilities of LLMs, 3D data-processing, and XR interfaces to create an immersive medium for language-driven spatial understanding, design, and visualization for non-experts. This paper describes the five components of Spacify: External Data, User Input, Spatial Interface, Large Language Model, and Current Spatial Design; which enable the use of generative AI models in a) question/ answering about 3D spaces with reasoning, b) (re)generating 3D spatial designs with natural language prompts, and c) visualizing designed 3D spaces with natural language descriptions. An implementation of Spacify is demonstrated via an XR smartphone application, allowing for an end-to-end, language-driven interior design process. User survey results from non-experts redesigning their spaces in 3D using this application suggest that Spacify can make spatial design accessible using natural language prompts, thereby pioneering a new realm of spatial design that is naturally language-driven. © 2024 Elsevier B.V., All rights reserved.},
  internal-note = {Booktitle is a truncated Scopus export of the ACADIA 2023 proceedings title — verify the full title against the publisher. Abstract's trailing Elsevier copyright notice looks like scraper boilerplate. Title colon restored (source export ran title and subtitle together).},
  keywords      = {3D data processing, 3D spaces, Architectural design, Built environment, C (programming language), Computational Linguistics, Computer aided design, Computer architecture, Data handling, Data users, Data visualization, Immersive media, Interior designers, Language Model, Natural languages, Spatial design, Three dimensional computer graphics, Urban designers, User interfaces, Visualization},
  pubstate      = {published},
  tppubtype     = {inproceedings}
}