AHCI RESEARCH GROUP

Publications

Papers published in international journals,
proceedings of conferences, workshops and books.

OUR RESEARCH

Scientific Publications

Here you can find the complete list of our publications.

2025

Behravan, M.; Matković, K.; Gračanin, D.

Generative AI for Context-Aware 3D Object Creation Using Vision-Language Models in Augmented Reality Proceedings Article

In: Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR, pp. 73–81, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 9798331521578.

Tags: 3D object, 3D Object Generation, Artificial intelligence systems, Augmented Reality, Capture images, Context-Aware, Generative adversarial networks, Generative AI, generative artificial intelligence, Generative model, Language Model, Object creation, Vision language model, vision language models, Visual languages

@inproceedings{behravan_generative_2025,

title = {Generative AI for Context-Aware 3D Object Creation Using Vision-Language Models in Augmented Reality},

author = {M. Behravan and K. Matković and D. Gračanin},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105000292700&doi=10.1109%2FAIxVR63409.2025.00018&partnerID=40&md5=0a11897a4f37fd8ebaa257498cb92eb7},

doi = {10.1109/AIxVR63409.2025.00018},

isbn = {9798331521578},

year  = {2025},

date = {2025-01-01},

booktitle = {Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR},

pages = {73--81},

publisher = {Institute of Electrical and Electronics Engineers Inc.},

abstract = {We present a novel Artificial Intelligence (AI) system that functions as a designer assistant in augmented reality (AR) environments. Leveraging Vision Language Models (VLMs) like LLaVA and advanced text-to-3D generative models, users can capture images of their surroundings with an Augmented Reality (AR) headset. The system analyzes these images to recommend contextually relevant objects that enhance both functionality and visual appeal. The recommended objects are generated as 3D models and seamlessly integrated into the AR environment for interactive use. Our system utilizes open-source AI models running on local systems to enhance data security and reduce operational costs. Key features include context-aware object suggestions, optimal placement guidance, aesthetic matching, and an intuitive user interface for real-time interaction. Evaluations using the COCO 2017 dataset and real-world AR testing demonstrated high accuracy in object detection and contextual fit rating of 4.1 out of 5. By addressing the challenge of providing context-aware object recommendations in AR, our system expands the capabilities of AI applications in this domain. It enables users to create personalized digital spaces efficiently, leveraging AI for contextually relevant suggestions.},

keywords = {3D object, 3D Object Generation, Artificial intelligence systems, Augmented Reality, Capture images, Context-Aware, Generative adversarial networks, Generative AI, generative artificial intelligence, Generative model, Language Model, Object creation, Vision language model, vision language models, Visual languages},

pubstate = {published},

tppubtype = {inproceedings}

}

Behravan, M.; Gračanin, D.

From Voices to Worlds: Developing an AI-Powered Framework for 3D Object Generation in Augmented Reality Proceedings Article

In: Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW, pp. 150–155, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 9798331514846.

Tags: 3D modeling, 3D object, 3D Object Generation, 3D reconstruction, Augmented Reality, Cutting edges, Generative AI, Interactive computer systems, Language Model, Large language model, large language models, matrix, Multilingual speech interaction, Real-time, Speech enhancement, Speech interaction, Volume Rendering

@inproceedings{behravan_voices_2025,

title = {From Voices to Worlds: Developing an AI-Powered Framework for 3D Object Generation in Augmented Reality},

author = {M. Behravan and D. Gračanin},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105005153589&doi=10.1109%2FVRW66409.2025.00038&partnerID=40&md5=34311e63349697801caf849bc231e879},

doi = {10.1109/VRW66409.2025.00038},

isbn = {9798331514846},

year  = {2025},

date = {2025-01-01},

booktitle = {Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW},

pages = {150--155},

publisher = {Institute of Electrical and Electronics Engineers Inc.},

abstract = {This paper presents Matrix, an advanced AI-powered framework designed for real-time 3D object generation in Augmented Reality (AR) environments. By integrating a cutting-edge text-to-3D generative AI model, multilingual speech-to-text translation, and large language models (LLMs), the system enables seamless user interactions through spoken commands. The framework processes speech inputs, generates 3D objects, and provides object recommendations based on contextual understanding, enhancing AR experiences. A key feature of this framework is its ability to optimize 3D models by reducing mesh complexity, resulting in significantly smaller file sizes and faster processing on resource-constrained AR devices. Our approach addresses the challenges of high GPU usage, large model output sizes, and real-time system responsiveness, ensuring a smoother user experience. Moreover, the system is equipped with a pre-generated object repository, further reducing GPU load and improving efficiency. We demonstrate the practical applications of this framework in various fields such as education, design, and accessibility, and discuss future enhancements including image-to-3D conversion, environmental object detection, and multimodal support. The open-source nature of the framework promotes ongoing innovation and its utility across diverse industries.},

keywords = {3D modeling, 3D object, 3D Object Generation, 3D reconstruction, Augmented Reality, Cutting edges, Generative AI, Interactive computer systems, Language Model, Large language model, large language models, matrix, Multilingual speech interaction, Real-time, Speech enhancement, Speech interaction, Volume Rendering},

pubstate = {published},

tppubtype = {inproceedings}

}

Kurai, R.; Hiraki, T.; Hiroi, Y.; Hirao, Y.; Perusquía-Hernández, M.; Uchiyama, H.; Kiyokawa, K.

An implementation of MagicCraft: Generating Interactive 3D Objects and Their Behaviors from Text for Commercial Metaverse Platforms Proceedings Article

In: Proc. - IEEE Conf. Virtual Real. 3D User Interfaces Abstr. Workshops, VRW, pp. 1284–1285, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 9798331514846.

Tags: 3D modeling, 3D models, 3D object, 3D Object Generation, 3d-modeling, AI-Assisted Design, Generative AI, Immersive, Metaverse, Metaverses, Model skill, Object oriented programming, Programming skills

Kurai, R.; Hiraki, T.; Hiroi, Y.; Hirao, Y.; Perusquía-Hernández, M.; Uchiyama, H.; Kiyokawa, K.

MagicCraft: Natural Language-Driven Generation of Dynamic and Interactive 3D Objects for Commercial Metaverse Platforms Journal Article

In: IEEE Access, vol. 13, pp. 132459–132474, 2025, ISSN: 2169-3536 (Publisher: Institute of Electrical and Electronics Engineers Inc.).

Tags: 3D models, 3D object, 3D Object Generation, 3d-modeling, AI-Assisted Design, Artificial intelligence, Behavioral Research, Content creation, Generative AI, Immersive, Metaverse, Metaverses, Natural language processing systems, Natural languages, Object oriented programming, Three dimensional computer graphics, user experience, User interfaces

@article{kurai_magiccraft_2025,

title = {MagicCraft: Natural Language-Driven Generation of Dynamic and Interactive 3D Objects for Commercial Metaverse Platforms},

author = {R. Kurai and T. Hiraki and Y. Hiroi and Y. Hirao and M. Perusquía-Hernández and H. Uchiyama and K. Kiyokawa},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105010187256&doi=10.1109%2FACCESS.2025.3587232&partnerID=40&md5=9b7a8115c62a8f9da4956dbbbb53dc4e},

doi = {10.1109/ACCESS.2025.3587232},

issn = {2169-3536},

year  = {2025},

date = {2025-01-01},

journal = {IEEE Access},

volume = {13},

pages = {132459--132474},

abstract = {Metaverse platforms are rapidly evolving to provide immersive spaces for user interaction and content creation. However, the generation of dynamic and interactive 3D objects remains challenging due to the need for advanced 3D modeling and programming skills. To address this challenge, we present MagicCraft, a system that generates functional 3D objects from natural language prompts for metaverse platforms. MagicCraft uses generative AI models to manage the entire content creation pipeline: converting user text descriptions into images, transforming images into 3D models, predicting object behavior, and assigning necessary attributes and scripts. It also provides an interactive interface for users to refine generated objects by adjusting features such as orientation, scale, seating positions, and grip points. Implemented on Cluster, a commercial metaverse platform, MagicCraft was evaluated by 7 expert CG designers and 51 general users. Results show that MagicCraft significantly reduces the time and skill required to create 3D objects. Users with no prior experience in 3D modeling or programming successfully created complex, interactive objects and deployed them in the metaverse. Expert feedback highlighted the system's potential to improve content creation workflows and support rapid prototyping. By integrating AI-generated content into metaverse platforms, MagicCraft makes 3D content creation more accessible.},

note = {Publisher: Institute of Electrical and Electronics Engineers Inc.},

keywords = {3D models, 3D object, 3D Object Generation, 3d-modeling, AI-Assisted Design, Artificial intelligence, Behavioral Research, Content creation, Generative AI, Immersive, Metaverse, Metaverses, Natural language processing systems, Natural languages, Object oriented programming, Three dimensional computer graphics, user experience, User interfaces},

pubstate = {published},

tppubtype = {article}

}

2024

Behravan, M.; Gračanin, D.

Generative Multi-Modal Artificial Intelligence for Dynamic Real-Time Context-Aware Content Creation in Augmented Reality Proceedings Article

In: Spencer, S. N. (Ed.): Proc. ACM Symp. Virtual Reality Softw. Technol. VRST, Association for Computing Machinery, 2024, ISBN: 9798400705359.

Tags: 3D object, 3D Object Generation, Augmented Reality, Content creation, Context-Aware, Generative adversarial networks, Generative AI, generative artificial intelligence, Language Model, Multi-modal, Real-time, Time contexts, Vision language model, vision language models, Visual languages