AHCI RESEARCH GROUP

Publications

Papers published in international journals,
proceedings of conferences, workshops and books.

OUR RESEARCH

Scientific Publications

2025

Lv, J.; Słowik, A.; Rani, S.; Kim, B.-G.; Chen, C.-M.; Kumari, S.; Li, K.; Lyu, X.; Jiang, H.

Multimodal Metaverse Healthcare: A Collaborative Representation and Adaptive Fusion Approach for Generative Artificial-Intelligence-Driven Diagnosis Journal Article

In: Research, vol. 8, 2025, ISSN: 2096-5168; 2639-5274 (Publisher: American Association for the Advancement of Science).

Tags: Adaptive fusion, Collaborative representations, Diagnosis, Electronic health record, Generative adversarial networks, Health care application, Healthcare environments, Immersive, Learning frameworks, Metaverses, Multi-modal, Multi-modal learning, Performance

@article{lv_multimodal_2025,

title = {Multimodal Metaverse Healthcare: A Collaborative Representation and Adaptive Fusion Approach for Generative Artificial-Intelligence-Driven Diagnosis},

author = {J. Lv and A. Słowik and S. Rani and B.-G. Kim and C.-M. Chen and S. Kumari and K. Li and X. Lyu and H. Jiang},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-86000613924&doi=10.34133%2Fresearch.0616&partnerID=40&md5=ce118b548f94bde494051760a217c33c},

doi = {10.34133/research.0616},

issn = {2096-5168; 2639-5274},

year  = {2025},

date = {2025-01-01},

journal = {Research},

volume = {8},

abstract = {The metaverse enables immersive virtual healthcare environments, presenting opportunities for enhanced care delivery. A key challenge lies in effectively combining multimodal healthcare data and generative artificial intelligence abilities within metaverse-based healthcare applications, which is a problem that needs to be addressed. This paper proposes a novel multimodal learning framework for metaverse healthcare, MMLMH, based on collaborative intra- and intersample representation and adaptive fusion. Our framework introduces a collaborative representation learning approach that captures shared and modality-specific features across text, audio, and visual health data. By combining modality-specific and shared encoders with carefully formulated intrasample and intersample collaboration mechanisms, MMLMH achieves superior feature representation for complex health assessments. The framework’s adaptive fusion approach, utilizing attention mechanisms and gated neural networks, demonstrates robust performance across varying noise levels and data quality conditions. Experiments on metaverse healthcare datasets demonstrate MMLMH’s superior performance over baseline methods across multiple evaluation metrics. Longitudinal studies and visualization further illustrate MMLMH’s adaptability to evolving virtual environments and balanced performance across diagnostic accuracy, patient–system interaction efficacy, and data integration complexity. The proposed framework has a unique advantage in that a similar level of performance is maintained across various patient populations and virtual avatars, which could lead to greater personalization of healthcare experiences in the metaverse. MMLMH’s successful functioning in such complicated circumstances suggests that it can combine and process information streams from several sources. They can be successfully utilized in next-generation healthcare delivery through virtual reality. 
},

note = {Publisher: American Association for the Advancement of Science},

keywords = {Adaptive fusion, Collaborative representations, Diagnosis, Electronic health record, Generative adversarial networks, Health care application, Healthcare environments, Immersive, Learning frameworks, Metaverses, Multi-modal, Multi-modal learning, Performance},

pubstate = {published},

tppubtype = {article}

}

2023

Feng, Y.; Zhu, H.; Peng, D.; Peng, X.; Hu, P.

RONO: Robust Discriminative Learning with Noisy Labels for 2D-3D Cross-Modal Retrieval Proceedings Article

In: Proc IEEE Comput Soc Conf Comput Vision Pattern Recognit, pp. 11610–11619, IEEE Computer Society, 2023, ISSN: 1063-6919.

Tags: 3D content, 3D data, 3D modeling, Adversarial machine learning, Contrastive Learning, Cross-modal, Discriminative learning, Federated learning, Heterogeneous structures, Learning mechanism, Learning performance, Metaverses, Multi-modal learning, Noisy labels, Spatio-temporal data

@inproceedings{feng_rono_2023,

title = {RONO: Robust Discriminative Learning with Noisy Labels for 2D-3D Cross-Modal Retrieval},

author = {Y. Feng and H. Zhu and D. Peng and X. Peng and P. Hu},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85170845124&doi=10.1109%2fCVPR52729.2023.01117&partnerID=40&md5=2eee285207ff3ea8e774480e29d96ec1},

doi = {10.1109/CVPR52729.2023.01117},

issn = {1063-6919},

year  = {2023},

date = {2023-01-01},

booktitle = {Proc IEEE Comput Soc Conf Comput Vision Pattern Recognit},

volume = {2023-June},

pages = {11610--11619},

publisher = {IEEE Computer Society},

abstract = {Recently, with the advent of Metaverse and AI Generated Content, cross-modal retrieval becomes popular with a burst of 2D and 3D data. However, this problem is challenging given the heterogeneous structure and semantic discrepancies. Moreover, imperfect annotations are ubiquitous given the ambiguous 2D and 3D content, thus inevitably producing noisy labels to degrade the learning performance. To tackle the problem, this paper proposes a robust 2D-3D retrieval framework (RONO) to robustly learn from noisy multimodal data. Specifically, one novel Robust Discriminative Center Learning mechanism (RDCL) is proposed in RONO to adaptively distinguish clean and noisy samples for respectively providing them with positive and negative optimization directions, thus mitigating the negative impact of noisy labels. Besides, we present a Shared Space Consistency Learning mechanism (SSCL) to capture the intrinsic information inside the noisy data by minimizing the cross-modal and semantic discrepancy between common space and label space simultaneously. Comprehensive mathematical analyses are given to theoretically prove the noise tolerance of the proposed method. Furthermore, we conduct extensive experiments on four 3D-model multimodal datasets to verify the effectiveness of our method by comparing it with 15 state-of-the-art methods. © 2023 IEEE.},

keywords = {3D content, 3D data, 3D modeling, Adversarial machine learning, Contrastive Learning, Cross-modal, Discriminative learning, Federated learning, Heterogeneous structures, Learning mechanism, Learning performance, Metaverses, Multi-modal learning, Noisy labels, Spatio-temporal data},

pubstate = {published},

tppubtype = {inproceedings}

}