AHCI RESEARCH GROUP
Publications
Papers published in international journals, proceedings of conferences, workshops and books.
2025
Hu, Y. -H.; Matsumoto, A.; Ito, K.; Narumi, T.; Kuzuoka, H.; Amemiya, T.
Avatar Motion Generation Pipeline for the Metaverse via Synthesis of Generative Models of Text and Video. Proceedings Article.
In: Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW, pp. 767–771, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 979-833151484-6.
@inproceedings{hu_avatar_2025,
title = {Avatar Motion Generation Pipeline for the Metaverse via Synthesis of Generative Models of Text and Video},
author = {Y. -H. Hu and A. Matsumoto and K. Ito and T. Narumi and H. Kuzuoka and T. Amemiya},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105005158851&doi=10.1109%2fVRW66409.2025.00155&partnerID=40&md5=2bc9a6390e1cf710206835722ca8dbbf},
doi = {10.1109/VRW66409.2025.00155},
isbn = {979-833151484-6},
year = {2025},
date = {2025-01-01},
booktitle = {Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW},
pages = {767--771},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Efforts to integrate AI avatars into the metaverse to enhance interactivity have progressed in both research and commercial domains. AI avatars in the metaverse are expected to exhibit not only verbal responses but also avatar motions, such as non-verbal gestures, to enable seamless communication with users. Large Language Models (LLMs), known for their advanced text processing capabilities, can represent user input, avatar actions, and even entire virtual environments as text, making them a promising approach for planning avatar motions. However, generating avatar motions solely from textual information often requires extensive training data and challenging configuration, and the results often lack diversity and fail to match user expectations. On the other hand, AI technologies for generating videos have progressed to the point where they can depict diverse and natural human movements based on prompts. Therefore, this paper introduces a novel pipeline, TVMP, that synthesizes LLMs, with their advanced text processing capabilities, and video generation models, which can produce videos containing a variety of motions. The pipeline first generates videos from text input, then estimates the motions from the generated videos, and lastly exports the estimated motion data onto the avatars in the metaverse. Feedback on the TVMP prototype suggests further refinements are needed, such as speed control, display of progress, and direct editing, for contextual relevance and usability enhancements. The proposed method enables AI avatars to perform highly adaptive and diverse movements that fulfill user expectations and contributes to developing a more immersive metaverse. © 2025 IEEE.},
keywords = {Ambient intelligence, Design and evaluation methods, Distributed computer systems, Human-centered computing, Language Model, Metaverses, Processing capability, Text-processing, Treemap, Treemaps, Visualization, Visualization design and evaluation method, Visualization design and evaluation methods, Visualization designs, Visualization technique, Visualization techniques},
pubstate = {published},
tppubtype = {inproceedings}
}
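The abstract above describes TVMP as a three-stage pipeline: generate a video from a text prompt, estimate motion from that video, then export the motion onto a metaverse avatar. A minimal sketch of that flow is below; the paper's actual interfaces are not given in this record, so every function name and data shape here is an illustrative assumption, with each real model replaced by a placeholder stub.

```python
from dataclasses import dataclass


@dataclass
class MotionClip:
    """Estimated motion data; frames hold per-frame joint poses (assumed shape)."""
    frames: list


def generate_video(prompt: str) -> str:
    """Stage 1 (hypothetical): a text-to-video model renders a clip of the motion."""
    return f"video_for({prompt})"  # stand-in for a rendered video handle


def estimate_motion(video: str) -> MotionClip:
    """Stage 2 (hypothetical): a pose estimator recovers motion from the video."""
    return MotionClip(frames=[{"root": (0.0, 0.0, 0.0, 1.0)}])  # stub output


def export_to_avatar(clip: MotionClip) -> dict:
    """Stage 3 (hypothetical): retarget the estimated motion onto an avatar."""
    return {"avatar_motion": clip.frames, "format": "retargeted"}


def tvmp(prompt: str) -> dict:
    """Chain the stages: text -> video -> motion -> avatar, as the abstract describes."""
    return export_to_avatar(estimate_motion(generate_video(prompt)))


result = tvmp("wave hello enthusiastically")
print(result["format"])  # prints "retargeted"
```

The staged design mirrors the abstract's motivation: the video model supplies motion diversity that pure text-to-motion generation lacks, while the final export step adapts the result to the avatar's rig.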