AHCI RESEARCH GROUP

Publications

Papers published in international journals,
proceedings of conferences, workshops and books.

Scientific Publications

2025

Behravan, M.; Matković, K.; Gračanin, D.

Generative AI for Context-Aware 3D Object Creation Using Vision-Language Models in Augmented Reality Proceedings Article

In: Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR, pp. 73–81, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 9798331521578.

Tags: 3D object, 3D Object Generation, Artificial intelligence systems, Augmented Reality, Capture images, Context-Aware, Generative adversarial networks, Generative AI, generative artificial intelligence, Generative model, Language Model, Object creation, Vision language model, vision language models, Visual languages

@inproceedings{behravan_generative_2025,

title = {Generative AI for Context-Aware 3D Object Creation Using Vision-Language Models in Augmented Reality},

author = {M. Behravan and K. Matković and D. Gračanin},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105000292700&doi=10.1109%2FAIxVR63409.2025.00018&partnerID=40&md5=0a11897a4f37fd8ebaa257498cb92eb7},

doi = {10.1109/AIxVR63409.2025.00018},

isbn = {9798331521578},

year  = {2025},

date = {2025-01-01},

booktitle = {Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR},

pages = {73--81},

publisher = {Institute of Electrical and Electronics Engineers Inc.},

abstract = {We present a novel Artificial Intelligence (AI) system that functions as a designer assistant in augmented reality (AR) environments. Leveraging Vision Language Models (VLMs) like LLaVA and advanced text-to-3D generative models, users can capture images of their surroundings with an Augmented Reality (AR) headset. The system analyzes these images to recommend contextually relevant objects that enhance both functionality and visual appeal. The recommended objects are generated as 3D models and seamlessly integrated into the AR environment for interactive use. Our system utilizes open-source AI models running on local systems to enhance data security and reduce operational costs. Key features include context-aware object suggestions, optimal placement guidance, aesthetic matching, and an intuitive user interface for real-time interaction. Evaluations using the COCO 2017 dataset and real-world AR testing demonstrated high accuracy in object detection and contextual fit rating of 4.1 out of 5. By addressing the challenge of providing context-aware object recommendations in AR, our system expands the capabilities of AI applications in this domain. It enables users to create personalized digital spaces efficiently, leveraging AI for contextually relevant suggestions.},

keywords = {3D object, 3D Object Generation, Artificial intelligence systems, Augmented Reality, Capture images, Context-Aware, Generative adversarial networks, Generative AI, generative artificial intelligence, Generative model, Language Model, Object creation, Vision language model, vision language models, Visual languages},

pubstate = {published},

tppubtype = {inproceedings}

}

2024

Behravan, M.; Gračanin, D.

Generative Multi-Modal Artificial Intelligence for Dynamic Real-Time Context-Aware Content Creation in Augmented Reality Proceedings Article

In: Spencer, S. N. (Ed.): Proc. ACM Symp. Virtual Reality Softw. Technol. VRST, Association for Computing Machinery, 2024, ISBN: 9798400705359.

Tags: 3D object, 3D Object Generation, Augmented Reality, Content creation, Context-Aware, Generative adversarial networks, Generative AI, generative artificial intelligence, Language Model, Multi-modal, Real-time, Time contexts, Vision language model, vision language models, Visual languages