AHCI RESEARCH GROUP

Publications

Papers published in international journals,
proceedings of conferences, workshops and books.

OUR RESEARCH

Scientific Publications

How to

Here you can find the complete list of our publications.
You can use the tag cloud to select only the papers dealing with specific research topics.
You can expand the Abstract, Links and BibTex record for each paper.

Show all

2025

Zhou, J.; Weber, R.; Wen, E.; Lottridge, D.

Real-Time Full-body Interaction with AI Dance Models: Responsiveness to Contemporary Dance Proceedings Article

In: Int Conf Intell User Interfaces Proc IUI, pp. 1177–1187, Association for Computing Machinery, 2025, ISBN: 9798400713064 (ISBN).

Abstract | Links | BibTeX | Tags: 3D modeling, Chatbots, Computer interaction, Deep learning, Deep-Learning Dance Model, Design of Human-Computer Interaction, Digital elevation model, Generative AI, Input output programs, Input sequence, Interactivity, Motion capture, Motion tracking, Movement analysis, Output sequences, Problem oriented languages, Real- time, Text mining, Three dimensional computer graphics, User input, Virtual environments, Virtual Reality

@inproceedings{zhou_real-time_2025,

title = {Real-Time Full-body Interaction with AI Dance Models: Responsiveness to Contemporary Dance},

author = {J. Zhou and R. Weber and E. Wen and D. Lottridge},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105001922427&doi=10.1145%2F3708359.3712077&partnerID=40&md5=3eab32d7c5b4708f4005393b2db25291},

doi = {10.1145/3708359.3712077},

isbn = {9798400713064 (ISBN)},

year  = {2025},

date = {2025-01-01},

booktitle = {Int Conf Intell User Interfaces Proc IUI},

pages = {1177–1187},

publisher = {Association for Computing Machinery},

abstract = {Interactive AI chatbots put the power of Large-Language Models (LLMs) into people's hands; it is this interactivity that fueled explosive worldwide influence. In the generative dance space, however, there are few deep-learning-based generative dance models built with interactivity in mind. The release of the AIST++ dance dataset in 2021 led to an uptick of capabilities in generative dance models. Whether these models could be adapted to support interactivity and how well this approach will work is not known. In this study, we explore the capabilities of existing generative dance models for motion-to-motion synthesis on real-time, full-body motion-captured contemporary dance data. We identify an existing model that we adapted to support interactivity: the Bailando++ model, which is trained on the AIST++ dataset and was modified to take music and a motion sequence as input parameters in an interactive loop. We worked with two professional contemporary choreographers and dancers to record and curate a diverse set of 203 motion-captured dance sequences as a set of "user inputs"captured through the Optitrack high-precision motion capture 3D tracking system. We extracted 17 quantitative movement features from the motion data using the well-established Laban Movement Analysis theory, which allowed for quantitative comparisons of inter-movement correlations, which we used for clustering input data and comparing input and output sequences. A total of 10 pieces of music were used to generate a variety of outputs using the adapted Bailando++ model. We found that, on average, the generated output motion achieved only moderate correlations to the user input, with some exceptions of movement and music pairs achieving high correlation. The high-correlation generated output sequences were deemed responsive and relevant co-creations in relation to the input sequences. We discuss implications for interactive generative dance agents, where the use of 3D joint coordinate data should be used over SMPL parameters for ease of real-time generation, and how the use of Laban Movement Analysis could be used to extract useful features and fine-tune deep-learning models. © 2025 Elsevier B.V., All rights reserved.},

keywords = {3D modeling, Chatbots, Computer interaction, Deep learning, Deep-Learning Dance Model, Design of Human-Computer Interaction, Digital elevation model, Generative AI, Input output programs, Input sequence, Interactivity, Motion capture, Motion tracking, Movement analysis, Output sequences, Problem oriented languages, Real- time, Text mining, Three dimensional computer graphics, User input, Virtual environments, Virtual Reality},

pubstate = {published},

tppubtype = {inproceedings}

}

Interactive AI chatbots put the power of Large-Language Models (LLMs) into people's hands; it is this interactivity that fueled explosive worldwide influence. In the generative dance space, however, there are few deep-learning-based generative dance models built with interactivity in mind. The release of the AIST++ dance dataset in 2021 led to an uptick of capabilities in generative dance models. Whether these models could be adapted to support interactivity and how well this approach will work is not known. In this study, we explore the capabilities of existing generative dance models for motion-to-motion synthesis on real-time, full-body motion-captured contemporary dance data. We identify an existing model that we adapted to support interactivity: the Bailando++ model, which is trained on the AIST++ dataset and was modified to take music and a motion sequence as input parameters in an interactive loop. We worked with two professional contemporary choreographers and dancers to record and curate a diverse set of 203 motion-captured dance sequences as a set of "user inputs"captured through the Optitrack high-precision motion capture 3D tracking system. We extracted 17 quantitative movement features from the motion data using the well-established Laban Movement Analysis theory, which allowed for quantitative comparisons of inter-movement correlations, which we used for clustering input data and comparing input and output sequences. A total of 10 pieces of music were used to generate a variety of outputs using the adapted Bailando++ model. We found that, on average, the generated output motion achieved only moderate correlations to the user input, with some exceptions of movement and music pairs achieving high correlation. The high-correlation generated output sequences were deemed responsive and relevant co-creations in relation to the input sequences. We discuss implications for interactive generative dance agents, where the use of 3D joint coordinate data should be used over SMPL parameters for ease of real-time generation, and how the use of Laban Movement Analysis could be used to extract useful features and fine-tune deep-learning models. © 2025 Elsevier B.V., All rights reserved.

2024

Jayaraman, S.; Bhavya, R.; Srihari, V.; Rajam, V. Mary Anita

TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions Proceedings Article

In: IEEE Int. Conf. Comput. Vis. Mach. Intell., CVMI, Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 9798350376876 (ISBN).

Abstract | Links | BibTeX | Tags: Adversarial networks, Computer simulation languages, Deep learning, Depth Estimation, Depth perception, Diffusion Model, diffusion models, Digital elevation model, Generative adversarial networks, Generative model, Generative systems, Language Model, Motion capture, Stereo image processing, Text-to-image, Training data, Video analysis, Video-clips, Virtual environments, Virtual Reality

@inproceedings{jayaraman_texavi_2024,

title = {TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions},

author = {S. Jayaraman and R. Bhavya and V. Srihari and V. Mary Anita Rajam},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85215265234&doi=10.1109%2FCVMI61877.2024.10782691&partnerID=40&md5=21e6ecfcc0710c036ba93e39b5fcd30d},

doi = {10.1109/CVMI61877.2024.10782691},

isbn = {9798350376876 (ISBN)},

year  = {2024},

date = {2024-01-01},

booktitle = {IEEE Int. Conf. Comput. Vis. Mach. Intell., CVMI},

publisher = {Institute of Electrical and Electronics Engineers Inc.},

abstract = {While generative models such as text-to-image, large language models and text-to-video have seen significant progress, the extension to text-to-virtual-reality remains largely unexplored, due to a deficit in training data and the complexity of achieving realistic depth and motion in virtual environments. This paper proposes an approach to coalesce existing generative systems to form a stereoscopic virtual reality video from text. Carried out in three main stages, we start with a base text-to-image model that captures context from an input text. We then employ Stable Diffusion on the rudimentary image produced, to generate frames with enhanced realism and overall quality. These frames are processed with depth estimation algorithms to create left-eye and right-eye views, which are stitched side-by-side to create an immersive viewing experience. Such systems would be highly beneficial in virtual reality production, since filming and scene building often require extensive hours of work and post-production effort. We utilize image evaluation techniques, specifically Fréchet Inception Distance and CLIP Score, to assess the visual quality of frames produced for the video. These quantitative measures establish the proficiency of the proposed method. Our work highlights the exciting possibilities of using natural language-driven graphics in fields like virtual reality simulations. © 2025 Elsevier B.V., All rights reserved.},

keywords = {Adversarial networks, Computer simulation languages, Deep learning, Depth Estimation, Depth perception, Diffusion Model, diffusion models, Digital elevation model, Generative adversarial networks, Generative model, Generative systems, Language Model, Motion capture, Stereo image processing, Text-to-image, Training data, Video analysis, Video-clips, Virtual environments, Virtual Reality},

pubstate = {published},

tppubtype = {inproceedings}

}

2013

Franchini, Silvia; Gentile, Antonio; Sorbello, Filippo; Vassallo, Giorgio; Vitabile, Salvatore

Design and Implementation of an Embedded Coprocessor with Native Support for 5D, Quadruple-Based Clifford Algebra Journal Article

In: IEEE Transactions on Computers, vol. 62, no. 12, pp. 2366–2381, 2013, ISSN: 0018-9340.

Abstract | Links | BibTeX | Tags: Application-specific processors, Clifford algebra, Computational geometry, Computer graphics, Embedded coprocessors, Field Programmable Gate Arrays, FPGA prototyping, Geometric algebra, inverse kinematics, Motion capture, Raytracing, robotic arm, Robotics

@article{franchiniDesignImplementationEmbedded2013,

title = {Design and Implementation of an Embedded Coprocessor with Native Support for 5D, Quadruple-Based Clifford Algebra},

author = { Silvia Franchini and Antonio Gentile and Filippo Sorbello and Giorgio Vassallo and Salvatore Vitabile},

doi = {10.1109/TC.2012.225},

issn = {0018-9340},

year  = {2013},

date = {2013-01-01},

journal = {IEEE Transactions on Computers},

volume = {62},

number = {12},

pages = {2366--2381},

abstract = {Geometric or Clifford algebra (CA) is a powerful mathematical tool that offers a natural and intuitive way to model geometric facts in a number of research fields, such as robotics, machine vision, and computer graphics. Operating in higher dimensional spaces, its practical use is hindered, however, by a significant computational cost, only partially addressed by dedicated software libraries and hardware/software codesigns. For low-dimensional algebras, several dedicated hardware accelerators and coprocessing architectures have been already proposed in the literature. This paper introduces the architecture of CliffordALU5, an embedded coprocessing core conceived for native execution of up to 5D CA operations. CliffordALU5 exploits a novel, hardware-oriented representation of the algebra elements that allows for faster execution of Clifford operations. In this paper, a prototype implementation of a complete system-on-chip (SOC) based on CliffordALU5 is presented. This prototype integrates an embedded processing soft-core based on the PowerPC 405 and a CliffordALU5 coprocessor on a Xilinx XUPV2P Field Programmable Gate Array (FPGA) board. Test results show a 5texttimes average speedup for 4D Clifford products and a 4texttimes average speedup for 5D Clifford products against the same operations in Gaigen 2, a CA software library generator running on the general-purpose PowerPC processor. This paper also presents an execution analysis of three different applications in three diverse domains, namely, inverse kinematics of a robot, optical motion capture, and raytracing, showing an average speedup between 3texttimes and 4texttimes with respect to the baseline Gaigen 2 implementation. Finally, a multicore approach to higher dimensional CA based on CliffordALU5 is discussed. textcopyright 1968-2012 IEEE.},

keywords = {Application-specific processors, Clifford algebra, Computational geometry, Computer graphics, Embedded coprocessors, Field Programmable Gate Arrays, FPGA prototyping, Geometric algebra, inverse kinematics, Motion capture, Raytracing, robotic arm, Robotics},

pubstate = {published},

tppubtype = {article}

}