AHCI RESEARCH GROUP

Publications

Papers published in international journals,
proceedings of conferences, workshops and books.

2025

Wei, Q.; Huang, J.; Gao, Y.; Dong, W.

One Model to Fit Them All: Universal IMU-based Human Activity Recognition with LLM-assisted Cross-dataset Representation Journal Article

In: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 9, no. 3, 2025, ISSN: 2474-9567 (Publisher: Association for Computing Machinery).

Tags: Broad application, Contrastive Learning, Cross-dataset, Data collection, Human activity recognition, Human activity recognition systems, Human computer interaction, Intelligent interactions, Language Model, Large datasets, Large language model, large language models, Learning systems, Neural-networks, Pattern recognition, Spatial relationships, Ubiquitous computing, Virtual Reality

@article{wei_one_2025,

title = {One Model to Fit Them All: Universal IMU-based Human Activity Recognition with LLM-assisted Cross-dataset Representation},

author = {Q. Wei and J. Huang and Y. Gao and W. Dong},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105015431117&doi=10.1145%2F3749509&partnerID=40&md5=2a6f26a05856c48ba3aaaf356b375dc0},

doi = {10.1145/3749509},

issn = {2474-9567},

year  = {2025},

date = {2025-01-01},

journal = {Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies},

volume = {9},

number = {3},

abstract = {Human Activity Recognition (HAR) is essential for pervasive computing and intelligent interaction, with broad applications across various fields. However, there is still no single model capable of fitting various HAR datasets, severely limiting the applicability of HAR in practical scenarios. To address this, we propose oneHAR, an LLM-assisted universal IMU-based HAR system designed to achieve "one model to fit them all": just one model that can adapt to diverse HAR datasets without any dataset-specific operation. In particular, we propose Cross-Dataset neural network (CDNet) for the "one model," which models both the temporal context and spatial relationships of IMU data to capture cross-dataset representations, encompassing differences in device, participant, data collection position, environment, etc. Additionally, we introduce LLM-driven data synthesis, which enhances the training process by generating virtual IMU data through three carefully designed strategies. Furthermore, LLM-assisted adaptive position processing optimizes the inference process by flexibly handling a variable combination of positional inputs. Our model demonstrates strong generalization across five public IMU-based HAR datasets, outperforming the best baselines by up to 46.9\% in the unseen-dataset scenario, and 6.5\% in the cross-dataset scenario.},

note = {Publisher: Association for Computing Machinery},

keywords = {Broad application, Contrastive Learning, Cross-dataset, Data collection, Human activity recognition, Human activity recognition systems, Human computer interaction, Intelligent interactions, Language Model, Large datasets, Large language model, large language models, Learning systems, Neural-networks, Pattern recognition, Spatial relationships, Ubiquitous computing, Virtual Reality},

pubstate = {published},

tppubtype = {article}

}

2023

Wang, Z.; Joshi, A.; Zhang, G.; Ren, W.; Jia, F.; Sun, X.

Elevating Perception: Unified Recognition Framework and Vision-Language Pre-Training Using Three-Dimensional Image Reconstruction Proceedings Article

In: Proc. - Int. Conf. Artif. Intell., Human-Comput. Interact. Robot., AIHCIR, pp. 592–596, Institute of Electrical and Electronics Engineers Inc., 2023, ISBN: 9798350360363.

Tags: 3D Model LLM, 3D modeling, 3D models, 3D Tech, 3d-modeling, Augmented Reality, Character recognition, Component, Computer aided design, Computer vision, Continuous time systems, Data handling, Generative AI, Image enhancement, Image Reconstruction, Image to Text Generation, Medical Imaging, Pattern recognition, Pre-training, Reconstructive Training, Text generations, Three dimensional computer graphics, Virtual Reality

@inproceedings{wang_elevating_2023,

title = {Elevating Perception: Unified Recognition Framework and Vision-Language Pre-Training Using Three-Dimensional Image Reconstruction},

author = {Z. Wang and A. Joshi and G. Zhang and W. Ren and F. Jia and X. Sun},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85192837757&doi=10.1109%2FAIHCIR61661.2023.00105&partnerID=40&md5=c6d3192a2e88ebadfe1c591c6625aefe},

doi = {10.1109/AIHCIR61661.2023.00105},

isbn = {9798350360363},

year  = {2023},

date = {2023-01-01},

booktitle = {Proc. - Int. Conf. Artif. Intell., Human-Comput. Interact. Robot., AIHCIR},

pages = {592--596},

publisher = {Institute of Electrical and Electronics Engineers Inc.},

abstract = {This research project explores a paradigm shift in perceptual enhancement by integrating a Unified Recognition Framework and Vision-Language Pre-Training in three-dimensional image reconstruction. Through the synergy of advanced algorithms from computer vision and language processing, the project tries to enhance the precision and depth of perception in reconstructed images. This innovative approach holds the potential to revolutionize fields such as medical imaging, virtual reality, and computer-aided design, providing a comprehensive perspective on the intersection of multimodal data processing and perceptual advancement. The anticipated research outcomes are expected to significantly contribute to the evolution of technologies that rely on accurate and contextually rich three-dimensional reconstructions. Moreover, the research aims to reduce the constant need for new datasets by improving pattern recognition through 3D image patterning on backpropagation. This continuous improvement of vectors is envisioned to enhance the efficiency and accuracy of pattern recognition, contributing to the optimization of perceptual systems over time.},

keywords = {3D Model LLM, 3D modeling, 3D models, 3D Tech, 3d-modeling, Augmented Reality, Character recognition, Component, Computer aided design, Computer vision, Continuous time systems, Data handling, Generative AI, Image enhancement, Image Reconstruction, Image to Text Generation, Medical Imaging, Pattern recognition, Pre-training, Reconstructive Training, Text generations, Three dimensional computer graphics, Virtual Reality},

pubstate = {published},

tppubtype = {inproceedings}

}

Leng, Z.; Kwon, H.; Ploetz, T.

Generating Virtual On-body Accelerometer Data from Virtual Textual Descriptions for Human Activity Recognition Proceedings Article

In: ISWC - Proc. Int. Symp. Wearable Comput., pp. 39–43, Association for Computing Machinery, Inc, 2023, ISBN: 9798400701993.

Tags: Activity recognition, Computational Linguistics, E-Learning, Human activity recognition, Language Model, Large language model, large language models, Motion estimation, Motion Synthesis, On-body, Pattern recognition, Recognition models, Textual description, Training data, Virtual IMU Data, Virtual Reality, Wearable Sensors

@inproceedings{leng_generating_2023,

title = {Generating Virtual On-body Accelerometer Data from Virtual Textual Descriptions for Human Activity Recognition},

author = {Z. Leng and H. Kwon and T. Ploetz},

url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85175788497&doi=10.1145%2F3594738.3611361&partnerID=40&md5=c6c4d291ee8aa88d4e48589b0740ebac},

doi = {10.1145/3594738.3611361},

isbn = {9798400701993},

year  = {2023},

date = {2023-01-01},

booktitle = {ISWC - Proc. Int. Symp. Wearable Comput.},

pages = {39--43},

publisher = {Association for Computing Machinery, Inc},

abstract = {The development of robust, generalized models for human activity recognition (HAR) has been hindered by the scarcity of large-scale, labeled data sets. Recent work has shown that virtual IMU data extracted from videos using computer vision techniques can lead to substantial performance improvements when training HAR models combined with small portions of real IMU data. Inspired by recent advances in motion synthesis from textual descriptions and connecting Large Language Models (LLMs) to various AI models, we introduce an automated pipeline that first uses ChatGPT to generate diverse textual descriptions of activities. These textual descriptions are then used to generate 3D human motion sequences via a motion synthesis model, T2M-GPT, and later converted to streams of virtual IMU data. We benchmarked our approach on three HAR datasets (RealWorld, PAMAP2, and USC-HAD) and demonstrate that the use of virtual IMU training data generated using our new approach leads to significantly improved HAR model performance compared to only using real IMU data. Our approach contributes to the growing field of cross-modality transfer methods and illustrates how HAR models can be improved through the generation of virtual training data that do not require any manual effort.},

keywords = {Activity recognition, Computational Linguistics, E-Learning, Human activity recognition, Language Model, Large language model, large language models, Motion estimation, Motion Synthesis, On-body, Pattern recognition, Recognition models, Textual description, Training data, Virtual IMU Data, Virtual Reality, Wearable Sensors},

pubstate = {published},

tppubtype = {inproceedings}

}