AHCI RESEARCH GROUP

Publications

Papers published in international journals,
proceedings of conferences, workshops and books.

OUR RESEARCH

Scientific Publications

2025

Monjoree, U.; Yan, W.

Assessing AI Models' Spatial Visualization in PSVT:R and Augmented Reality: Towards Enhancing AI's Spatial Intelligence
Proceedings Article

In: pp. 727–734, Institute of Electrical and Electronics Engineers Inc., 2025, ISBN: 9798331524005.

Tags: 3D modeling, Architecture engineering, Artificial intelligence, Augmented Reality, Construction science, Engineering education, Engineering science, Generative AI, generative artificial intelligence, Image processing, Intelligence models, Linear transformations, Medicine, Rotation, Rotation process, Spatial Intelligence, Spatial rotation, Spatial visualization, Three dimensional computer graphics, Three dimensional space, Visualization

@inproceedings{monjoree_assessing_2025,

title = {Assessing AI Models' Spatial Visualization in PSVT:R and Augmented Reality: Towards Enhancing AI's Spatial Intelligence},

author = {U. Monjoree and W. Yan},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105011255775&doi=10.1109%2FCAI64502.2025.00131&partnerID=40&md5=0bd551863839b3025898e55265403969},

doi = {10.1109/CAI64502.2025.00131},

isbn = {9798331524005 (ISBN)},

year = {2025},

date = {2025-01-01},

pages = {727--734},

publisher = {Institute of Electrical and Electronics Engineers Inc.},

abstract = {Spatial intelligence is important in many fields, such as Architecture, Engineering, and Construction (AEC), Science, Technology, Engineering, and Mathematics (STEM), and Medicine. Understanding three-dimensional (3D) spatial rotations can involve verbal descriptions and visual or interactive examples, illustrating how objects move and change orientation in 3D space. Recent studies show that artificial intelligence (AI) with language and vision capabilities still faces limitations in spatial reasoning. In this paper, we studied the capability of advanced generative AI to understand the rotations of objects in 3D space using its image processing and language processing features. We examined how well three generative AI models (GPT-4, Gemini 1.5 Pro, and Llama 3.2) understand the spatial rotation process, using spatial rotation diagrams based on the Revised Purdue Spatial Visualization Test: Visualization of Rotations (Revised PSVT:R). Furthermore, we added a layer of coordinate system axes to the Revised PSVT:R diagrams to study the variations in the generative AI models' performance. We additionally examined the generative AI models' understanding of 3D rotations in Augmented Reality (AR) scene images that visualize spatial rotations of a physical object in 3D space, and observed increased accuracy when additional textual information depicting the rotation process, or mathematical representations of the rotation (e.g., matrices), was superimposed on the object. The results indicate that while GPT-4, Gemini 1.5 Pro, and Llama 3.2, as the main current generative AI models, lack an understanding of the spatial rotation process, they have the potential to understand it with additional information that can be provided by methods such as AR.
AR can superimpose textual information or mathematical representations of rotations on spatial transformation diagrams and create a more intelligible input for AI to comprehend or for training AI's spatial intelligence. Furthermore, by combining AI's potential in spatial intelligence with AR's interactive visualization abilities, we expect to offer enhanced guidance for students' spatial learning activities. Such spatial guidance can greatly benefit the understanding of spatial transformations and additionally support processes like assembly, construction, and manufacturing, as well as learning in AEC, STEM, and Medicine that require precise 3D spatial understanding.},

keywords = {3D modeling, Architecture engineering, Artificial intelligence, Augmented Reality, Construction science, Engineering education, Engineering science, Generative AI, generative artificial intelligence, Image processing, Intelligence models, Linear transformations, Medicine, Rotation, Rotation process, Spatial Intelligence, Spatial rotation, Spatial visualization, Three dimensional computer graphics, Three dimensional space, Visualization},

pubstate = {published},

tppubtype = {inproceedings}

}

Spatial intelligence is important in many fields, such as Architecture, Engineering, and Construction (AEC), Science, Technology, Engineering, and Mathematics (STEM), and Medicine. Understanding three-dimensional (3D) spatial rotations can involve verbal descriptions and visual or interactive examples, illustrating how objects move and change orientation in 3D space. Recent studies show that artificial intelligence (AI) with language and vision capabilities still faces limitations in spatial reasoning. In this paper, we studied the capability of advanced generative AI to understand the rotations of objects in 3D space using its image processing and language processing features. We examined how well three generative AI models (GPT-4, Gemini 1.5 Pro, and Llama 3.2) understand the spatial rotation process, using spatial rotation diagrams based on the Revised Purdue Spatial Visualization Test: Visualization of Rotations (Revised PSVT:R). Furthermore, we added a layer of coordinate system axes to the Revised PSVT:R diagrams to study the variations in the generative AI models' performance. We additionally examined the generative AI models' understanding of 3D rotations in Augmented Reality (AR) scene images that visualize spatial rotations of a physical object in 3D space, and observed increased accuracy when additional textual information depicting the rotation process, or mathematical representations of the rotation (e.g., matrices), was superimposed on the object. The results indicate that while GPT-4, Gemini 1.5 Pro, and Llama 3.2, as the main current generative AI models, lack an understanding of the spatial rotation process, they have the potential to understand it with additional information that can be provided by methods such as AR.
AR can superimpose textual information or mathematical representations of rotations on spatial transformation diagrams and create a more intelligible input for AI to comprehend or for training AI's spatial intelligence. Furthermore, by combining AI's potential in spatial intelligence with AR's interactive visualization abilities, we expect to offer enhanced guidance for students' spatial learning activities. Such spatial guidance can greatly benefit the understanding of spatial transformations and additionally support processes like assembly, construction, and manufacturing, as well as learning in AEC, STEM, and Medicine that require precise 3D spatial understanding.