AHCI RESEARCH GROUP
Publications
Papers published in international journals,
proceedings of conferences, workshops and books.
OUR RESEARCH
Scientific Publications
How to
You can use the tag cloud to select only the papers dealing with specific research topics.
You can expand the Abstract, Links and BibTeX record for each paper.
2025
Azzarelli, A.; Anantrasirichai, N.; Bull, D. R.
Intelligent Cinematography: a review of AI research for cinematographic production Journal Article
In: Artificial Intelligence Review, vol. 58, no. 4, 2025, ISSN: 02692821 (ISSN).
Abstract | Links | BibTeX | Tags: Artificial intelligence research, Computer vision, Content acquisition, Creative industries, Holistic view, machine learning, Machine-learning, Mergers and acquisitions, Review papers, Three dimensional computer graphics, Video applications, Video processing, Video processing and applications, Virtual production, Virtual Reality, Vision research
@article{azzarelli_intelligent_2025,
title = {Intelligent Cinematography: a review of AI research for cinematographic production},
author = {A. Azzarelli and N. Anantrasirichai and D. R. Bull},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85217373428&doi=10.1007%2fs10462-024-11089-3&partnerID=40&md5=360923b5ba8f63b6edfa1b7fd135c926},
doi = {10.1007/s10462-024-11089-3},
issn = {02692821 (ISSN)},
year = {2025},
date = {2025-01-01},
journal = {Artificial Intelligence Review},
volume = {58},
number = {4},
abstract = {This paper offers the first comprehensive review of artificial intelligence (AI) research in the context of real camera content acquisition for entertainment purposes and is aimed at both researchers and cinematographers. Addressing the lack of review papers in the field of intelligent cinematography (IC) and the breadth of related computer vision research, we present a holistic view of the IC landscape while providing technical insight, important for experts across disciplines. We provide technical background on generative AI, object detection, automated camera calibration and 3-D content acquisition, with references to assist non-technical readers. The application sections categorize work in terms of four production types: General Production, Virtual Production, Live Production and Aerial Production. Within each application section, we (1) sub-classify work according to research topic and (2) describe the trends and challenges relevant to each type of production. In the final chapter, we address the greater scope of IC research and summarize the significant potential of this area to influence the creative industries sector. We suggest that work relating to virtual production has the greatest potential to impact other mediums of production, driven by the growing interest in LED volumes/stages for in-camera virtual effects (ICVFX) and automated 3-D capture for virtual modeling of real world scenes and actors. We also address ethical and legal concerns regarding the use of creative AI that impact on artists, actors, technologists and the general public. © The Author(s) 2025.},
keywords = {Artificial intelligence research, Computer vision, Content acquisition, Creative industries, Holistic view, machine learning, Machine-learning, Mergers and acquisitions, Review papers, Three dimensional computer graphics, Video applications, Video processing, Video processing and applications, Virtual production, Virtual Reality, Vision research},
pubstate = {published},
tppubtype = {article}
}
Suzuki, R.; Gonzalez-Franco, M.; Sra, M.; Lindlbauer, D.
Everyday AR through AI-in-the-Loop Proceedings Article
In: Conf Hum Fact Comput Syst Proc, Association for Computing Machinery, 2025, ISBN: 979-840071395-8 (ISBN).
Abstract | Links | BibTeX | Tags: Augmented Reality, Augmented reality content, Augmented reality hardware, Computer vision, Content creation, Context-Aware, Generative AI, generative artificial intelligence, Human-AI Interaction, Human-artificial intelligence interaction, Language Model, Large language model, large language models, machine learning, Machine-learning, Mixed reality, Virtual Reality, Virtualization
@inproceedings{suzuki_everyday_2025,
title = {Everyday AR through AI-in-the-Loop},
author = {R. Suzuki and M. Gonzalez-Franco and M. Sra and D. Lindlbauer},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105005752990&doi=10.1145%2f3706599.3706741&partnerID=40&md5=56b5e447819dde7aa4a29f8e3899e535},
doi = {10.1145/3706599.3706741},
isbn = {979-840071395-8 (ISBN)},
year = {2025},
date = {2025-01-01},
booktitle = {Conf Hum Fact Comput Syst Proc},
publisher = {Association for Computing Machinery},
abstract = {This workshop brings together experts and practitioners from augmented reality (AR) and artificial intelligence (AI) to shape the future of AI-in-the-loop everyday AR experiences. With recent advancements in both AR hardware and AI capabilities, we envision that everyday AR—always-available and seamlessly integrated into users’ daily environments—is becoming increasingly feasible. This workshop will explore how AI can drive such everyday AR experiences. We discuss a range of topics, including adaptive and context-aware AR, generative AR content creation, always-on AI assistants, AI-driven accessible design, and real-world-oriented AI agents. Our goal is to identify the opportunities and challenges in AI-enabled AR, focusing on creating novel AR experiences that seamlessly blend the digital and physical worlds. Through the workshop, we aim to foster collaboration, inspire future research, and build a community to advance the research field of AI-enhanced AR. © 2025 Copyright held by the owner/author(s).},
keywords = {Augmented Reality, Augmented reality content, Augmented reality hardware, Computer vision, Content creation, Context-Aware, Generative AI, generative artificial intelligence, Human-AI Interaction, Human-artificial intelligence interaction, Language Model, Large language model, large language models, machine learning, Machine-learning, Mixed reality, Virtual Reality, Virtualization},
pubstate = {published},
tppubtype = {inproceedings}
}
Banafa, A.
Artificial intelligence in action: Real-world applications and innovations Book
River Publishers, 2025, ISBN: 978-877004619-0 (ISBN); 978-877004620-6 (ISBN).
Abstract | Links | BibTeX | Tags: 5G, Affective Computing, AGI, AI, AI alignments, AI Ethics, AI hallucinations, AI hype, AI models, Alexa, ANI, ASI, Augmented Reality, Autoencoders, Autonomic computing, Autonomous Cars, Autoregressive models, Big Data, Big Data Analytics, Bitcoin, Blockchain, C3PO, Casual AI, Causal reasoning, ChatGPT, Cloud computing, Collective AI, Compression engines, Computer vision, Conditional Automation, Convolutional neural networks (CNNs), Cryptocurrency, Cybersecurity, Deceptive AI, Deep learning, Digital transformation, Driver Assistance, Driverless Cars, Drones, Elon Musk, Entanglement, Environment and sustainability, Ethereum, Explainable AI, Facebook, Facial Recognition, Feedforward Neural Networks, Fog Computing, Full Automation, Future of AI, General AI, Generative Adversarial Networks (GANs), Generative AI, Google, Green AI, High Automation, Hybrid Blockchain, IEEE, Industrial Internet of Things (IIoT), Internet of things (IoT), Jarvis, Java, JavaScript, Long Short-Term Memory Networks, LTE, machine learning, Microsoft, MultiModal AI, Narrow AI, Natural disasters, Natural Language Generation (NLG), Natural Language Processing (NLP), NetFlix, Network Security, Neural Networks, Nuclear, Nuclear AI, NYTimes, Objective-driven AI, Open Source, Partial Automation, PayPal, Perfect AI, Private Blockchain, Private Cloud Computing, Programming languages, Python, Quantum Communications, Quantum Computing, Quantum Cryptography, Quantum internet, Quantum Machine Learning (QML), R2D2, Reactive machines, limited memory, Recurrent Neural Networks, Responsible AI, Robots, Sci-Fi movies, Self-Aware, Semiconductor's, Sensate AI, Siri, Small Data, Smart Contracts, Hybrid Cloud Computing, Smart Devices, Sovereign AI, Super AI, Superposition, TensorFlow, Theory of Mind, Thick Data, Twitter, Variational Autoencoders (VAEs), Virtual Reality, Voice user interface (VUI), Wearable computing devices (WCD), Wearable Technology, Wi-Fi, XAI, Zero-Trust Model
@book{banafa_artificial_2025,
title = {Artificial intelligence in action: Real-world applications and innovations},
author = {A. Banafa},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105000403587&partnerID=40&md5=4b0d94be48194a942b22bef63f36d3bf},
isbn = {978-877004619-0 (ISBN); 978-877004620-6 (ISBN)},
year = {2025},
date = {2025-01-01},
publisher = {River Publishers},
series = {Artificial Intelligence in Action: Real-World Applications and Innovations},
abstract = {This comprehensive book dives deep into the current landscape of AI, exploring its fundamental principles, development challenges, potential risks, and the cutting-edge breakthroughs that are propelling it forward. Artificial intelligence (AI) is rapidly transforming industries and societies worldwide through groundbreaking innovations and real-world applications. Starting with the core concepts, the book examines the various types of AI systems, generative AI models, and the complexities of machine learning. It delves into the programming languages driving AI development, data pipelines, model creation and deployment processes, while shedding light on issues like AI hallucinations and the intricate path of machine unlearning. The book then showcases the remarkable real-world applications of AI across diverse domains. From preventing job displacement and promoting environmental sustainability, to enhancing disaster response, drone technology, and even nuclear energy innovation, it highlights how AI is tackling complex challenges and driving positive change. The book also explores the double-edged nature of AI, recognizing its tremendous potential while cautioning about the risks of misuse, unintended consequences, and the urgent need for responsible development practices. It examines the intersection of AI and fields like operating system design, warfare, and semiconductor technology, underscoring the wide-ranging implications of this transformative force. As the quest for artificial general intelligence (AGI) and superintelligent AI systems intensifies, the book delves into cutting-edge research, emerging trends, and the pursuit of multimodal, explainable, and causally aware AI systems. It explores the symbiotic relationship between AI and human creativity, the rise of user-friendly "casual AI," and the potential of AI to tackle open-ended tasks. This is an essential guide for understanding the profound impact of AI on our world today and its potential to shape our future. From the frontiers of innovation to the challenges of responsible development, this book offers a comprehensive and insightful exploration of the remarkable real-world applications and innovations driving the AI revolution. © 2025 River Publishers. All rights reserved.},
keywords = {5G, Affective Computing, AGI, AI, AI alignments, AI Ethics, AI hallucinations, AI hype, AI models, Alexa, ANI, ASI, Augmented Reality, Autoencoders, Autonomic computing, Autonomous Cars, Autoregressive models, Big Data, Big Data Analytics, Bitcoin, Blockchain, C3PO, Casual AI, Causal reasoning, ChatGPT, Cloud computing, Collective AI, Compression engines, Computer vision, Conditional Automation, Convolutional neural networks (CNNs), Cryptocurrency, Cybersecurity, Deceptive AI, Deep learning, Digital transformation, Driver Assistance, Driverless Cars, Drones, Elon Musk, Entanglement, Environment and sustainability, Ethereum, Explainable AI, Facebook, Facial Recognition, Feedforward Neural Networks, Fog Computing, Full Automation, Future of AI, General AI, Generative Adversarial Networks (GANs), Generative AI, Google, Green AI, High Automation, Hybrid Blockchain, IEEE, Industrial Internet of Things (IIoT), Internet of things (IoT), Jarvis, Java, JavaScript, Long Short-Term Memory Networks, LTE, machine learning, Microsoft, MultiModal AI, Narrow AI, Natural disasters, Natural Language Generation (NLG), Natural Language Processing (NLP), NetFlix, Network Security, Neural Networks, Nuclear, Nuclear AI, NYTimes, Objective-driven AI, Open Source, Partial Automation, PayPal, Perfect AI, Private Blockchain, Private Cloud Computing, Programming languages, Python, Quantum Communications, Quantum Computing, Quantum Cryptography, Quantum internet, Quantum Machine Learning (QML), R2D2, Reactive machines, limited memory, Recurrent Neural Networks, Responsible AI, Robots, Sci-Fi movies, Self-Aware, Semiconductor's, Sensate AI, Siri, Small Data, Smart Contracts, Hybrid Cloud Computing, Smart Devices, Sovereign AI, Super AI, Superposition, TensorFlow, Theory of Mind, Thick Data, Twitter, Variational Autoencoders (VAEs), Virtual Reality, Voice user interface (VUI), Wearable computing devices (WCD), Wearable Technology, Wi-Fi, XAI, Zero-Trust Model},
pubstate = {published},
tppubtype = {book}
}
Ademola, A.; Sinclair, D.; Koniaris, B.; Hannah, S.; Mitchell, K.
NeFT-Net: N-window extended frequency transformer for rhythmic motion prediction Journal Article
In: Computers and Graphics, vol. 129, 2025, ISSN: 00978493 (ISSN).
Abstract | Links | BibTeX | Tags: Cosine transforms, Discrete cosine transforms, Human motions, Immersive, machine learning, Machine-learning, Motion analysis, Motion prediction, Motion processing, Motion sequences, Motion tracking, Real-world, Rendering, Rendering (computer graphics), Rhythmic motion, Three dimensional computer graphics, Virtual environments, Virtual Reality
@article{ademola_neft-net_2025,
title = {NeFT-Net: N-window extended frequency transformer for rhythmic motion prediction},
author = {A. Ademola and D. Sinclair and B. Koniaris and S. Hannah and K. Mitchell},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105006724723&doi=10.1016%2fj.cag.2025.104244&partnerID=40&md5=08fd0792837332404ec9acdd16f608bf},
doi = {10.1016/j.cag.2025.104244},
issn = {00978493 (ISSN)},
year = {2025},
date = {2025-01-01},
journal = {Computers and Graphics},
volume = {129},
abstract = {Advancements in prediction of human motion sequences are critical for enabling online virtual reality (VR) users to dance and move in ways that accurately mirror real-world actions, delivering a more immersive and connected experience. However, latency in networked motion tracking remains a significant challenge, disrupting engagement and necessitating predictive solutions to achieve real-time synchronization of remote motions. To address this issue, we propose a novel approach leveraging a synthetically generated dataset based on supervised foot anchor placement timings for rhythmic motions, ensuring periodicity and reducing prediction errors. Our model integrates a discrete cosine transform (DCT) to encode motion, refine high-frequency components, and smooth motion sequences, mitigating jittery artifacts. Additionally, we introduce a feed-forward attention mechanism designed to learn from N-window pairs of 3D key-point pose histories for precise future motion prediction. Quantitative and qualitative evaluations on the Human3.6M dataset highlight significant improvements in mean per joint position error (MPJPE) metrics, demonstrating the superiority of our technique over state-of-the-art approaches. We further introduce novel result pose visualizations through the use of generative AI methods. © 2025 The Authors},
keywords = {Cosine transforms, Discrete cosine transforms, Human motions, Immersive, machine learning, Machine-learning, Motion analysis, Motion prediction, Motion processing, Motion sequences, Motion tracking, Real-world, Rendering, Rendering (computer graphics), Rhythmic motion, Three dimensional computer graphics, Virtual environments, Virtual Reality},
pubstate = {published},
tppubtype = {article}
}
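Illustrative sketch (not the authors' code): the abstract above describes encoding motion with a discrete cosine transform (DCT) and attenuating high-frequency components to remove jitter before prediction. A minimal Python version of that frequency-domain smoothing step is sketched below, assuming SciPy; the array shapes and the cutoff value are arbitrary choices for illustration only.

# Minimal sketch of DCT-based smoothing of a joint trajectory, in the spirit of
# the frequency-domain motion encoding described in the abstract above.
# This is NOT the NeFT-Net code; shapes and the cutoff are illustrative.
import numpy as np
from scipy.fft import dct, idct

def smooth_trajectory(traj, keep=16):
    """traj: (T, J*3) array of T frames of flattened 3D key-points.
    Keeps only the `keep` lowest-frequency DCT coefficients per channel,
    which suppresses high-frequency jitter before prediction."""
    coeffs = dct(traj, axis=0, norm="ortho")   # temporal DCT per channel
    coeffs[keep:, :] = 0.0                     # drop high-frequency components
    return idct(coeffs, axis=0, norm="ortho")  # back to the time domain

# toy usage: a noisy sine-like motion over 120 frames and 17 joints
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 120)[:, None]
motion = np.sin(t) + 0.05 * rng.standard_normal((120, 17 * 3))
smoothed = smooth_trajectory(motion, keep=16)
print(motion.shape, smoothed.shape)            # (120, 51) (120, 51)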
2024
Chaccour, C.; Saad, W.; Debbah, M.; Poor, H. V.
Joint Sensing, Communication, and AI: A Trifecta for Resilient THz User Experiences Journal Article
In: IEEE Transactions on Wireless Communications, vol. 23, no. 9, pp. 11444–11460, 2024, ISSN: 15361276 (ISSN).
Abstract | Links | BibTeX | Tags: Artificial intelligence, artificial intelligence (AI), Behavioral Research, Channel state information, Computer hardware, Cramer-Rao bounds, Extended reality (XR), Hardware, Joint sensing and communication, Learning systems, machine learning, machine learning (ML), Machine-learning, Multi agent systems, reliability, Resilience, Sensor data fusion, Tera Hertz, Terahertz, terahertz (THz), Terahertz communication, Wireless communications, Wireless sensor networks, X reality
@article{chaccour_joint_2024,
title = {Joint Sensing, Communication, and AI: A Trifecta for Resilient THz User Experiences},
author = {C. Chaccour and W. Saad and M. Debbah and H. V. Poor},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85190170739&doi=10.1109%2fTWC.2024.3382192&partnerID=40&md5=da12c6f31faacaa08118b26e4570843f},
doi = {10.1109/TWC.2024.3382192},
issn = {15361276 (ISSN)},
year = {2024},
date = {2024-01-01},
journal = {IEEE Transactions on Wireless Communications},
volume = {23},
number = {9},
pages = {11444–11460},
abstract = {In this paper a novel joint sensing, communication, and artificial intelligence (AI) framework is proposed so as to optimize extended reality (XR) experiences over terahertz (THz) wireless systems. Within this framework, active reconfigurable intelligent surfaces (RISs) are incorporated as pivotal elements, serving as enhanced base stations in the THz band to enhance Line-of-Sight (LoS) communication. The proposed framework consists of three main components. First, a tensor decomposition framework is proposed to extract unique sensing parameters for XR users and their environment by exploiting the THz channel sparsity. Essentially, the THz band's quasi-opticality is exploited and the sensing parameters are extracted from the uplink communication signal, thereby allowing for the use of the same waveform, spectrum, and hardware for both communication and sensing functionalities. Then, the Cramér-Rao lower bound is derived to assess the accuracy of the estimated sensing parameters. Second, a non-autoregressive multi-resolution generative AI framework integrated with an adversarial transformer is proposed to predict missing and future sensing information. The proposed framework offers robust and comprehensive historical sensing information and anticipatory forecasts of future environmental changes, which are generalizable to fluctuations in both known and unforeseen user behaviors and environmental conditions. Third, a multi-agent deep recurrent hysteretic Q-neural network is developed to control the handover policy of RIS subarrays, leveraging the informative nature of sensing information to minimize handover cost, maximize the individual quality of personal experiences (QoPEs), and improve the robustness and resilience of THz links. Simulation results show a high generalizability of the proposed unsupervised generative artificial intelligence (AI) framework to fluctuations in user behavior and velocity, leading to a 61% improvement in instantaneous reliability compared to schemes with known channel state information. © 2002-2012 IEEE.},
keywords = {Artificial intelligence, artificial intelligence (AI), Behavioral Research, Channel state information, Computer hardware, Cramer-Rao bounds, Extended reality (XR), Hardware, Joint sensing and communication, Learning systems, machine learning, machine learning (ML), Machine-learning, Multi agent systems, reliability, Resilience, Sensor data fusion, Tera Hertz, Terahertz, terahertz (THz), Terahertz communication, Wireless communications, Wireless sensor networks, X reality},
pubstate = {published},
tppubtype = {article}
}
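Illustrative sketch (not from the paper): the abstract mentions a multi-agent deep recurrent hysteretic Q-neural network for controlling RIS subarray handover. The tabular update below only illustrates the "hysteretic" ingredient, i.e. two learning rates with faster learning from positive TD errors; the states, actions, rewards and numeric values are invented for the example and the paper's deep recurrent model is not reproduced.

# Tabular sketch of a hysteretic Q-learning update: the agent learns faster
# from positive TD errors (alpha) than from negative ones (beta), which helps
# stabilise cooperative multi-agent learning. States/actions are toy integers.
import numpy as np

def hysteretic_update(Q, s, a, r, s_next, alpha=0.1, beta=0.01, gamma=0.95):
    td_error = r + gamma * np.max(Q[s_next]) - Q[s, a]
    lr = alpha if td_error >= 0 else beta      # hysteresis: two learning rates
    Q[s, a] += lr * td_error
    return Q

Q = np.zeros((4, 2))                           # 4 toy states, 2 handover actions
Q = hysteretic_update(Q, s=0, a=1, r=1.0, s_next=2)
print(Q[0, 1])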
Gujar, P.; Paliwal, G.; Panyam, S.
Generative AI and the Future of Interactive and Immersive Advertising Proceedings Article
In: Rivas-Lalaleo, D.; Maita, S.L.S. (Ed.): ETCM - Ecuador Tech. Chapters Meet., Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 979-835039158-9 (ISBN).
Abstract | Links | BibTeX | Tags: Ad Creation, Adversarial machine learning, Advertising Technology (AdTech), Advertizing, Advertizing technology, Augmented Reality, Augmented Reality (AR), Generative adversarial networks, Generative AI, Immersive, Immersive Advertising, Immersive advertizing, Interactive Advertising, Interactive advertizing, machine learning, Machine-learning, Marketing, Mixed reality, Mixed Reality (MR), Personalization, Personalizations, User Engagement, Virtual environments, Virtual Reality, Virtual Reality (VR)
@inproceedings{gujar_generative_2024,
title = {Generative AI and the Future of Interactive and Immersive Advertising},
author = {P. Gujar and G. Paliwal and S. Panyam},
editor = {Rivas-Lalaleo D. and Maita S.L.S.},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85211805262&doi=10.1109%2fETCM63562.2024.10746166&partnerID=40&md5=179c5ceeb28ed72e809748322535c7ad},
doi = {10.1109/ETCM63562.2024.10746166},
isbn = {979-835039158-9 (ISBN)},
year = {2024},
date = {2024-01-01},
booktitle = {ETCM - Ecuador Tech. Chapters Meet.},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Generative AI is revolutionizing interactive and immersive advertising by enabling more personalized, engaging experiences through advanced technologies like VR, AR, and MR. This transformation is reshaping how advertisers create, deliver, and optimize content, allowing for two-way communication and blurring lines between digital and physical worlds. AI enhances user engagement through predictive analytics, real-time adaptation, and natural language processing, while also optimizing ad placement and personalization. Future trends include integration with emerging technologies like 5G and IoT, fully immersive experiences, and hyper-personalization. However, challenges such as privacy concerns, transparency issues, and ethical considerations must be addressed. As AI continues to evolve, it promises to create unprecedented opportunities for brands to connect with audiences in meaningful ways, potentially blurring the line between advertising and interactive entertainment. The industry must proactively address these challenges to ensure AI-driven advertising enhances user experiences while respecting privacy and maintaining trust. © 2024 IEEE.},
keywords = {Ad Creation, Adversarial machine learning, Advertising Technology (AdTech), Advertizing, Advertizing technology, Augmented Reality, Augmented Reality (AR), Generative adversarial networks, Generative AI, Immersive, Immersive Advertising, Immersive advertizing, Interactive Advertising, Interactive advertizing, machine learning, Machine-learning, Marketing, Mixed reality, Mixed Reality (MR), Personalization, Personalizations, User Engagement, Virtual environments, Virtual Reality, Virtual Reality (VR)},
pubstate = {published},
tppubtype = {inproceedings}
}
Weerasinghe, K.; Janapati, S.; Ge, X.; Kim, S.; Iyer, S.; Stankovic, J. A.; Alemzadeh, H.
Real-Time Multimodal Cognitive Assistant for Emergency Medical Services Proceedings Article
In: Proc. - ACM/IEEE Conf. Internet-of-Things Des. Implement., IoTDI, pp. 85–96, Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 979-835037025-6 (ISBN).
Abstract | Links | BibTeX | Tags: Artificial intelligence, Augmented Reality, Cognitive Assistance, Computational Linguistics, Decision making, Domain knowledge, Edge computing, Emergency medical services, Forecasting, Graphic methods, Language Model, machine learning, Machine-learning, Multi-modal, Real- time, Service protocols, Smart Health, Speech recognition, State of the art
@inproceedings{weerasinghe_real-time_2024,
title = {Real-Time Multimodal Cognitive Assistant for Emergency Medical Services},
author = {K. Weerasinghe and S. Janapati and X. Ge and S. Kim and S. Iyer and J. A. Stankovic and H. Alemzadeh},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85197769304&doi=10.1109%2fIoTDI61053.2024.00012&partnerID=40&md5=a3b7cf14e46ecb2d4e49905fb845f2c9},
doi = {10.1109/IoTDI61053.2024.00012},
isbn = {979-835037025-6 (ISBN)},
year = {2024},
date = {2024-01-01},
booktitle = {Proc. - ACM/IEEE Conf. Internet-of-Things Des. Implement., IoTDI},
pages = {85–96},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Emergency Medical Services (EMS) responders often operate under time-sensitive conditions, facing cognitive overload and inherent risks, requiring essential skills in critical thinking and rapid decision-making. This paper presents CognitiveEMS, an end-to-end wearable cognitive assistant system that can act as a collaborative virtual partner engaging in the real-time acquisition and analysis of multimodal data from an emergency scene and interacting with EMS responders through Augmented Reality (AR) smart glasses. CognitiveEMS processes the continuous streams of data in real-time and leverages edge computing to provide assistance in EMS protocol selection and intervention recognition. We address key technical challenges in real-time cognitive assistance by introducing three novel components: (i) a Speech Recognition model that is fine-tuned for real-world medical emergency conversations using simulated EMS audio recordings, augmented with synthetic data generated by large language models (LLMs); (ii) an EMS Protocol Prediction model that combines state-of-the-art (SOTA) tiny language models with EMS domain knowledge using graph-based attention mechanisms; (iii) an EMS Action Recognition module which leverages multimodal audio and video data and protocol predictions to infer the intervention/treatment actions taken by the responders at the incident scene. Our results show that for speech recognition we achieve superior performance compared to SOTA (WER of 0.290 vs. 0.618) on conversational data. Our protocol prediction component also significantly outperforms SOTA (top-3 accuracy of 0.800 vs. 0.200) and the action recognition achieves an accuracy of 0.727, while maintaining an end-to-end latency of 3.78s for protocol prediction on the edge and 0.31s on the server. © 2024 IEEE.},
keywords = {Artificial intelligence, Augmented Reality, Cognitive Assistance, Computational Linguistics, Decision making, Domain knowledge, Edge computing, Emergency medical services, Forecasting, Graphic methods, Language Model, machine learning, Machine-learning, Multi-modal, Real- time, Service protocols, Smart Health, Speech recognition, State of the art},
pubstate = {published},
tppubtype = {inproceedings}
}
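Illustrative sketch (not the paper's evaluation script): the speech-recognition results above are reported as word error rate (WER), e.g. 0.290 vs. 0.618. A minimal reference implementation of WER as a word-level Levenshtein distance is given below; the example sentences are invented.

# Minimal word error rate (WER) computation via Levenshtein distance over
# word tokens. Illustrative only.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[-1][-1] / max(len(ref), 1)

print(wer("administer one dose of epinephrine", "administer dose of epi"))  # 0.4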
Liu, M.; M'Hiri, F.
Beyond Traditional Teaching: Large Language Models as Simulated Teaching Assistants in Computer Science Proceedings Article
In: SIGCSE - Proc. ACM Tech. Symp. Comput. Sci. Educ., pp. 743–749, Association for Computing Machinery, Inc, 2024, ISBN: 979-840070423-9 (ISBN).
Abstract | Links | BibTeX | Tags: Adaptive teaching, ChatGPT, Computational Linguistics, CS education, E-Learning, Education computing, Engineering education, GPT, Language Model, LLM, machine learning, Machine-learning, Novice programmer, novice programmers, Openai, Programming, Python, Students, Teaching, Virtual Reality
@inproceedings{liu_beyond_2024,
title = {Beyond Traditional Teaching: Large Language Models as Simulated Teaching Assistants in Computer Science},
author = {M. Liu and F. M'Hiri},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85189289344&doi=10.1145%2f3626252.3630789&partnerID=40&md5=44ec79c8f005f4551c820c61f5b5d435},
doi = {10.1145/3626252.3630789},
isbn = {979-840070423-9 (ISBN)},
year = {2024},
date = {2024-01-01},
booktitle = {SIGCSE - Proc. ACM Tech. Symp. Comput. Sci. Educ.},
volume = {1},
pages = {743–749},
publisher = {Association for Computing Machinery, Inc},
abstract = {As the prominence of Large Language Models (LLMs) grows in various sectors, their potential in education warrants exploration. In this study, we investigate the feasibility of employing GPT-3.5 from OpenAI, as an LLM teaching assistant (TA) or a virtual TA in computer science (CS) courses. The objective is to enhance the accessibility of CS education while maintaining academic integrity by refraining from providing direct solutions to current-semester assignments. Targeting Foundations of Programming (COMP202), an undergraduate course that introduces students to programming with Python, we have developed a virtual TA using the LangChain framework, known for integrating language models with diverse data sources and environments. The virtual TA assists students with their code and clarifies complex concepts. For homework questions, it is designed to guide students with hints rather than giving out direct solutions. We assessed its performance first through a qualitative evaluation, then a survey-based comparative analysis, using a mix of questions commonly asked on the COMP202 discussion board and questions created by the authors. Our preliminary results indicate that the virtual TA outperforms human TAs on clarity and engagement, matching them on accuracy when the question is non-assignment-specific, for which human TAs still proved more reliable. These findings suggest that while virtual TAs, leveraging the capabilities of LLMs, hold great promise towards making CS education experience more accessible and engaging, their optimal use necessitates human supervision. We conclude by identifying several directions that could be explored in future implementations. © 2024 ACM.},
keywords = {Adaptive teaching, ChatGPT, Computational Linguistics, CS education, E-Learning, Education computing, Engineering education, GPT, Language Model, LLM, machine learning, Machine-learning, Novice programmer, novice programmers, Openai, Programming, Python, Students, Teaching, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
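Illustrative sketch (not the authors' system): the paper builds its virtual TA with the LangChain framework on GPT-3.5 and restricts homework help to hints rather than solutions. The reduced sketch below uses the plain OpenAI Python client instead of LangChain as a simplification; the model name, prompt wording and function names are assumptions, not taken from the paper.

# Much-reduced "hints, not solutions" virtual TA. The OpenAI client is used
# here in place of the paper's LangChain setup; everything is illustrative.
from openai import OpenAI

SYSTEM_PROMPT = (
    "You are a teaching assistant for an introductory Python course. "
    "For homework questions, guide the student with hints and clarifying "
    "questions; never provide the full solution code."
)

def ask_virtual_ta(client: OpenAI, question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",          # illustrative model choice
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        temperature=0.3,
    )
    return response.choices[0].message.content

# usage (requires OPENAI_API_KEY in the environment):
# print(ask_virtual_ta(OpenAI(), "Why does my for loop skip the last element?"))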
Dunaeva, Y.
Digital Evolution of Universities: Neural Networks in Education Book Section
In: Springer Geography, vol. Part F3974, pp. 453–463, Springer Science and Business Media Deutschland GmbH, 2024, ISSN: 2194315X (ISSN).
Abstract | Links | BibTeX | Tags: Artificial general intelligence (AGI), Cybersecurity, Cybersecurity threat, Digital exceptionalism, Dipfake, Large language models (LLM), machine learning, Virtual and augmented reality
@incollection{dunaeva_digital_2024,
title = {Digital Evolution of Universities: Neural Networks in Education},
author = {Y. Dunaeva},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85216787264&doi=10.1007%2f978-3-031-70886-2_38&partnerID=40&md5=9d11413a8dc4aa487c161f0746117169},
doi = {10.1007/978-3-031-70886-2_38},
issn = {2194315X (ISSN)},
year = {2024},
date = {2024-01-01},
booktitle = {Springer Geography},
volume = {Part F3974},
pages = {453–463},
publisher = {Springer Science and Business Media Deutschland GmbH},
abstract = {The study articulates the main achievements and opportunities, as well as threats and potential risks of using neural networks in education. Artificial intelligence is the top research topic of 2022–2023 and the period of emergence of new terms such as large language models, dipfake, virtual and augmented reality, and so on. The paper evaluates the tremendous potential of artificial intelligence in education, among which methods such as using AI to create personalized educational programs, analyzing big data on learning success, automating the assessment and grading process, developing new teaching methods, and creating a new knowledge assessment system such as text analysis and pattern recognition. However, artificial intelligence has significantly increased risks in scientific and educational environments: the paper analyzes threats such as cyberstalking (online stalking), phishing (malicious URLs to access accounts), and others. Methodologically, the article is based on the concept of digital participation. Considering university professors as passive recipients or consumers of the services of artificial intelligence technologies, the digital inclusion strategy emphasizes only technological progress, linking it to the younger generation, and underestimates the concept of digital participation of scientists, based on systems thinking, scientific outlook, life wisdom, and devotion to moral ideals. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.},
keywords = {Artificial general intelligence (AGI), Cybersecurity, Cybersecurity threat, Digital exceptionalism, Dipfake, Large language models (LLM), machine learning, Virtual and augmented reality},
pubstate = {published},
tppubtype = {incollection}
}
Cronin, I.
Understanding Generative AI Business Applications: A Guide to Technical Principles and Real-World Applications Book
Apress Media LLC, 2024, ISBN: 979-886880282-9 (ISBN); 979-886880281-2 (ISBN).
Abstract | Links | BibTeX | Tags: Artificial intelligence, Augmented Reality, Autonomous system, Autonomous systems, Business applications, Computer vision, Decision making, Gaussian Splatting, Gaussians, Generative AI, Language processing, Learning algorithms, Learning systems, machine learning, Machine-learning, Natural Language Processing, Natural Language Processing (NLP), Natural language processing systems, Natural languages, Splatting
@book{cronin_understanding_2024,
title = {Understanding Generative AI Business Applications: A Guide to Technical Principles and Real-World Applications},
author = {I. Cronin},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-105001777571&doi=10.1007%2f979-8-8688-0282-9&partnerID=40&md5=c0714ff3e1ad755596426ea092b830d6},
doi = {10.1007/979-8-8688-0282-9},
isbn = {979-886880282-9 (ISBN); 979-886880281-2 (ISBN)},
year = {2024},
date = {2024-01-01},
publisher = {Apress Media LLC},
series = {Understanding Generative AI Business Applications: A Guide to Technical Principles and Real-World Applications},
abstract = {This guide covers the fundamental technical principles and various business applications of Generative AI for planning, developing, and evaluating AI-driven products. It equips you with the knowledge you need to harness the potential of Generative AI for enhancing business creativity and productivity. The book is organized into three sections: text-based, senses-based, and rationale-based. Each section provides an in-depth exploration of the specific methods and applications of Generative AI. In the text-based section, you will find detailed discussions on designing algorithms to automate and enhance written communication, including insights into the technical aspects of transformer-based Natural Language Processing (NLP) and chatbot architecture, such as GPT-4, Claude 2, Google Bard, and others. The senses-based section offers a glimpse into the algorithms and data structures that underpin visual, auditory, and multisensory experiences, including NeRF, 3D Gaussian Splatting, Stable Diffusion, AR and VR technologies, and more. The rationale-based section illuminates the decision-making capabilities of AI, with a focus on machine learning and data analytics techniques that empower applications such as simulation models, agents, and autonomous systems. In summary, this book serves as a guide for those seeking to navigate the dynamic landscape of Generative AI. Whether you’re a seasoned AI professional or a business leader looking to harness the power of creative automation, these pages offer a roadmap to leverage Generative AI for your organization’s success. © 2024 by Irena Cronin.},
keywords = {Artificial intelligence, Augmented Reality, Autonomous system, Autonomous systems, Business applications, Computer vision, Decision making, Gaussian Splatting, Gaussians, Generative AI, Language processing, Learning algorithms, Learning systems, machine learning, Machine-learning, Natural Language Processing, Natural Language Processing (NLP), Natural language processing systems, Natural languages, Splatting},
pubstate = {published},
tppubtype = {book}
}
Federico, G.; Carrara, F.; Amato, G.; Di Benedetto, M.
Spatio-Temporal 3D Reconstruction from Frame Sequences and Feature Points Proceedings Article
In: ACM Int. Conf. Proc. Ser., pp. 52–64, Association for Computing Machinery, 2024, ISBN: 979-840071794-9 (ISBN).
Abstract | Links | BibTeX | Tags: 3D reconstruction, Adversarial machine learning, Artificial intelligence, Color motion pictures, Color photography, Contrastive Learning, De-noising, Deep learning, Denoising Diffusion Probabilistic Model, Frame features, machine learning, Machine-learning, Probabilistic models, Signed Distance Field, Signed distance fields, Spatio-temporal, Video Reconstruction, Video streaming
@inproceedings{federico_spatio-temporal_2024,
title = {Spatio-Temporal 3D Reconstruction from Frame Sequences and Feature Points},
author = {G. Federico and F. Carrara and G. Amato and M. Di Benedetto},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85203128613&doi=10.1145%2f3672406.3672415&partnerID=40&md5=2a0dc51baa15f0dcd7f9d2cca708ec15},
doi = {10.1145/3672406.3672415},
isbn = {979-840071794-9 (ISBN)},
year = {2024},
date = {2024-01-01},
booktitle = {ACM Int. Conf. Proc. Ser.},
pages = {52–64},
publisher = {Association for Computing Machinery},
abstract = {Reconstructing a large real environment is a fundamental task to promote eXtended Reality adoption in industrial and entertainment fields. However, the short range of depth cameras, the sparsity of LiDAR sensors, and the huge computational cost of Structure-from-Motion pipelines prevent scene replication in near real time. To overcome these limitations, we introduce a spatio-temporal diffusion neural architecture, a generative AI technique that fuses temporal information (i.e., a short temporally-ordered list of color photographs, like sparse frames of a video stream) with an approximate spatial resemblance of the explored environment. Our aim is to modify an existing 3D diffusion neural model to produce a Signed Distance Field volume from which a 3D mesh representation can be extracted. Our results show that the hallucination approach of diffusion models is an effective methodology where a fast reconstruction is a crucial target. © 2024 Owner/Author.},
keywords = {3D reconstruction, Adversarial machine learning, Artificial intelligence, Color motion pictures, Color photography, Contrastive Learning, De-noising, Deep learning, Denoising Diffusion Probabilistic Model, Frame features, machine learning, Machine-learning, Probabilistic models, Signed Distance Field, Signed distance fields, Spatio-temporal, Video Reconstruction, Video streaming},
pubstate = {published},
tppubtype = {inproceedings}
}
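Illustrative sketch (the diffusion network is not reproduced): the abstract describes producing a Signed Distance Field volume and extracting a 3D mesh representation from it. That final extraction step can be illustrated with marching cubes from scikit-image on a toy sphere SDF, as below; the grid size and radius are arbitrary.

# Extracting a triangle mesh from a Signed Distance Field (SDF) volume at the
# zero level set, using marching cubes on a toy sphere SDF. Illustrative only.
import numpy as np
from skimage.measure import marching_cubes

# toy SDF of a sphere of radius 0.5 sampled on a 64^3 grid over [-1, 1]^3
grid = np.linspace(-1.0, 1.0, 64)
x, y, z = np.meshgrid(grid, grid, grid, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.5

verts, faces, normals, values = marching_cubes(sdf, level=0.0)
print(verts.shape, faces.shape)   # (N, 3) vertices and (M, 3) triangles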
Otoum, Y.; Gottimukkala, N.; Kumar, N.; Nayak, A.
Machine Learning in Metaverse Security: Current Solutions and Future Challenges Journal Article
In: ACM Computing Surveys, vol. 56, no. 8, 2024, ISSN: 03600300 (ISSN).
Abstract | Links | BibTeX | Tags: 'current, Block-chain, Blockchain, digital twin, E-Learning, Extended reality, Future challenges, Generative AI, machine learning, Machine-learning, Metaverse Security, Metaverses, Security and privacy, Spatio-temporal dynamics, Sustainable development
@article{otoum_machine_2024,
title = {Machine Learning in Metaverse Security: Current Solutions and Future Challenges},
author = {Y. Otoum and N. Gottimukkala and N. Kumar and A. Nayak},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85193466017&doi=10.1145%2f3654663&partnerID=40&md5=b35485c5f2e943ec105ea11a80712cbe},
doi = {10.1145/3654663},
issn = {03600300 (ISSN)},
year = {2024},
date = {2024-01-01},
journal = {ACM Computing Surveys},
volume = {56},
number = {8},
abstract = {The Metaverse, positioned as the next frontier of the Internet, has the ambition to forge a virtual shared realm characterized by immersion, hyper-spatiotemporal dynamics, and self-sustainability. Recent technological strides in AI, Extended Reality, 6G, and blockchain propel the Metaverse closer to realization, gradually transforming it from science fiction into an imminent reality. Nevertheless, the extensive deployment of the Metaverse faces substantial obstacles, primarily stemming from its potential to infringe on privacy and be susceptible to security breaches, whether inherent in its underlying technologies or arising from the evolving digital landscape. Metaverse security provisioning is poised to confront various foundational challenges owing to its distinctive attributes, encompassing immersive realism, hyper-spatiotemporally, sustainability, and heterogeneity. This article undertakes a comprehensive study of the security and privacy challenges facing the Metaverse, leveraging machine learning models for this purpose. In particular, our focus centers on an innovative distributed Metaverse architecture characterized by interactions across 3D worlds. Subsequently, we conduct a thorough review of the existing cutting-edge measures designed for Metaverse systems while also delving into the discourse surrounding security and privacy threats. As we contemplate the future of Metaverse systems, we outline directions for open research pursuits in this evolving landscape. © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.},
keywords = {'current, Block-chain, Blockchain, digital twin, E-Learning, Extended reality, Future challenges, Generative AI, machine learning, Machine-learning, Metaverse Security, Metaverses, Security and privacy, Spatio-temporal dynamics, Sustainable development},
pubstate = {published},
tppubtype = {article}
}
2023
Fuchs, A.; Appel, S.; Grimm, P.
Immersive Spaces for Creativity: Smart Working Environments Proceedings Article
In: Yunanto, A.A.; Ramadhani, A.D.; Prayogi, Y.R.; Putra, P.A.M.; Ruswiansari, M.; Ridwan, M.; Gamar, F.; Rahmawati, W.M.; Rusli, M.R.; Humaira, F.M.; Adila, A.F. (Ed.): IES - Int. Electron. Symp.: Unlocking Potential Immersive Technol. Live Better Life, Proceeding, pp. 610–617, Institute of Electrical and Electronics Engineers Inc., 2023, ISBN: 979-835031473-1 (ISBN).
Abstract | Links | BibTeX | Tags: Artificial intelligence, Generative AI, Human computer interaction, Immersive, Innovative approaches, Intelligent systems, Interactive Environments, Language Model, Language processing, Large language model, large language models, Learning algorithms, machine learning, Natural language processing systems, Natural languages, User behaviors, User interfaces, Virtual Reality, Working environment
@inproceedings{fuchs_immersive_2023,
title = {Immersive Spaces for Creativity: Smart Working Environments},
author = {A. Fuchs and S. Appel and P. Grimm},
editor = {Yunanto A.A. and Ramadhani A.D. and Prayogi Y.R. and Putra P.A.M. and Ruswiansari M. and Ridwan M. and Gamar F. and Rahmawati W.M. and Rusli M.R. and Humaira F.M. and Adila A.F.},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85173627291&doi=10.1109%2fIES59143.2023.10242458&partnerID=40&md5=6ab1796f68c29d7747574272314a2e9d},
doi = {10.1109/IES59143.2023.10242458},
isbn = {979-835031473-1 (ISBN)},
year = {2023},
date = {2023-01-01},
booktitle = {IES - Int. Electron. Symp.: Unlocking Potential Immersive Technol. Live Better Life, Proceeding},
pages = {610–617},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {This paper presents an innovative approach to designing an immersive space that dynamically supports users (inter-)action based on users' behavior, voice, and mood, providing a personalized experience. The objective of this research is to explore how a space can communicate with users in a seamless, engaging, and interactive environment. Therefore, it integrates natural language processing (NLP), generative artificial intelligence applications and human computer interaction that utilizes a combination of sensors, microphones, and cameras to collect real-time data on users' behavior, voice, and mood. This data is then processed and analyzed by an intelligent system that employs machine learning algorithms to identify patterns and adapt the environment accordingly. The adaptive features include changes in lighting, sound, and visual elements to facilitate creativity, focus, relaxation, or socialization, depending on the user's topics and emotional state. The paper discusses the technical aspects of implementing such a system. Additionally, it highlights the potential applications of this technology in various domains such as education, entertainment, and workplace settings. In conclusion, the immersive creative space represents a paradigm shift in human-environment interaction, offering a dynamic and personalized space that caters to the diverse needs of users. The research findings suggest that this innovative approach holds great promise for enhancing user experiences, fostering creativity, and promoting overall well-being. © 2023 IEEE.},
keywords = {Artificial intelligence, Generative AI, Human computer interaction, Immersive, Innovative approaches, Intelligent systems, Interactive Environments, Language Model, Language processing, Large language model, large language models, Learning algorithms, machine learning, Natural language processing systems, Natural languages, User behaviors, User interfaces, Virtual Reality, Working environment},
pubstate = {published},
tppubtype = {inproceedings}
}
Yeo, J. Q.; Wang, Y.; Tanary, S.; Cheng, J.; Lau, M.; Ng, A. B.; Guan, F.
AICRID: AI-Empowered CR For Interior Design Proceedings Article
In: Bruder, G.; Olivier, A.H.; Cunningham, A.; Peng, E.Y.; Grubert, J.; Williams, I. (Ed.): Proc. - IEEE Int. Symp. Mixed Augment. Real. Adjunct, ISMAR-Adjunct, pp. 837–841, Institute of Electrical and Electronics Engineers Inc., 2023, ISBN: 979-835032891-2 (ISBN).
Abstract | Links | BibTeX | Tags: 3D modeling, 3D models, 3d-modeling, Architectural design, Artificial intelligence, Artificial intelligence technologies, Augmented Reality, Augmented reality technology, Interior Design, Interior designs, machine learning, Machine-learning, Model generation, Novel design, Text images, User need, Visualization
@inproceedings{yeo_aicrid_2023,
title = {AICRID: AI-Empowered CR For Interior Design},
author = {J. Q. Yeo and Y. Wang and S. Tanary and J. Cheng and M. Lau and A. B. Ng and F. Guan},
editor = {Bruder G. and Olivier A.H. and Cunningham A. and Peng E.Y. and Grubert J. and Williams I.},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85180375829&doi=10.1109%2fISMAR-Adjunct60411.2023.00184&partnerID=40&md5=b14d89dbd38a4dfe3f85b90800d42e78},
doi = {10.1109/ISMAR-Adjunct60411.2023.00184},
isbn = {979-835032891-2 (ISBN)},
year = {2023},
date = {2023-01-01},
booktitle = {Proc. - IEEE Int. Symp. Mixed Augment. Real. Adjunct, ISMAR-Adjunct},
pages = {837–841},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Augmented Reality (AR) technologies have been utilized for interior design for years. Normally 3D furniture models need to be created manually or by scanning with specialized devices and this is usually a costly process. Additionally, users need controllers or hands for manipulating the virtual furniture which may lead to fatigue for long-time usage. Artificial Intelligence (AI) technologies have made it possible to generate 3D models from texts, images or both and show potential to automate interactions through the user's voice. We propose a novel design, AICRID in short, which aims to automate the 3D model generation and to facilitate the interactions for interior design AR by leveraging on AI technologies. Specifically, our design will allow the users to directly generate 3D furniture models with generative AI, enabling them to directly interact with the virtual objects through their voices. © 2023 IEEE.},
keywords = {3D modeling, 3D models, 3d-modeling, Architectural design, Artificial intelligence, Artificial intelligence technologies, Augmented Reality, Augmented reality technology, Interior Design, Interior designs, machine learning, Machine-learning, Model generation, Novel design, Text images, User need, Visualization},
pubstate = {published},
tppubtype = {inproceedings}
}
2021
Franchini, Silvia; Terranova, Maria Chiara; Lo Re, Giuseppe; Galia, Massimo; Salerno, Sergio; Midiri, Massimo; Vitabile, Salvatore
A Novel System for Multi-level Crohn’s Disease Classification and Grading Based on a Multiclass Support Vector Machine Book Section
In: Esposito, Anna; Faundez-Zanuy, Marcos; Morabito, Francesco Carlo; Pasero, Eros (Ed.): Progresses in Artificial Intelligence and Neural Systems, pp. 185–197, Springer, Singapore, 2021, ISBN: 9789811550935.
Abstract | Links | BibTeX | Tags: Bayesian optimization, Crohn’s disease multi-level classification and grading, Feature extraction, Feature reduction, K-fold cross-validation, machine learning, Magnetic Resonance Enterography, Medical Imaging, multi-level classifiers, Multiclass support vector machines, Supervised learning
@incollection{franchini_novel_2021,
title = {A Novel System for Multi-level Crohn’s Disease Classification and Grading Based on a Multiclass Support Vector Machine},
author = {Silvia Franchini and Maria Chiara Terranova and Giuseppe Lo Re and Massimo Galia and Sergio Salerno and Massimo Midiri and Salvatore Vitabile},
editor = {Anna Esposito and Marcos Faundez-Zanuy and Francesco Carlo Morabito and Eros Pasero},
url = {https://doi.org/10.1007/978-981-15-5093-5_18},
doi = {10.1007/978-981-15-5093-5_18},
isbn = {9789811550935},
year = {2021},
date = {2021-01-01},
urldate = {2023-03-20},
booktitle = {Progresses in Artificial Intelligence and Neural Systems},
pages = {185–197},
publisher = {Springer},
address = {Singapore},
series = {Smart Innovation, Systems and Technologies},
abstract = {Crohn’s disease (CD) is a chronic inflammatory condition of the gastrointestinal tract that can highly alter patient’s quality of life. Diagnostic imaging, such as Enterography Magnetic Resonance Imaging (E-MRI), provides crucial information for CD activity assessment. Automatic learning methods play a fundamental role in the classification of CD and allow to avoid the long and expensive manual classification process by radiologists. This paper presents a novel classification method that uses a multiclass Support Vector Machine (SVM) based on a Radial Basis Function (RBF) kernel for the grading of CD inflammatory activity. To validate the system, we have used a dataset composed of 800 E-MRI examinations of 800 patients from the University of Palermo Policlinico Hospital. For each E-MRI image, a team of radiologists has extracted 20 features associated with CD, calculated a disease activity index and classified patients into three classes (no activity, mild activity and severe activity). The 20 features have been used as the input variables to the SVM classifier, while the activity index has been adopted as the response variable. Different feature reduction techniques have been applied to improve the classifier performance, while a Bayesian optimization technique has been used to find the optimal hyperparameters of the RBF kernel. K-fold cross-validation has been used to enhance the evaluation reliability. The proposed SVM classifier achieved a better performance when compared with other standard classification methods. Experimental results show an accuracy index of 91.45% with an error of 8.55% that outperform the operator-based reference values reported in literature.},
keywords = {Bayesian optimization, Crohn’s disease multi-level classification and grading, Feature extraction, Feature reduction, K-fold cross-validation, machine learning, Magnetic Resonance Enterography, Medical Imaging, multi-level classifiers, Multiclass support vector machines, Supervised learning},
pubstate = {published},
tppubtype = {incollection}
}
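Illustrative sketch (synthetic data, not the Policlinico dataset): the entry above describes a multiclass RBF-kernel SVM over 20 radiologist-extracted E-MRI features, with hyperparameters tuned by Bayesian optimization and k-fold cross-validation. The scikit-learn sketch below mirrors that pipeline on random data and substitutes a plain grid search where the paper used Bayesian optimization (scikit-optimize's BayesSearchCV could be dropped in instead).

# Multiclass RBF-kernel SVM with hyperparameter search and k-fold CV.
# Synthetic data stands in for the 20 E-MRI features and 3 activity classes.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=800, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)

pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {"svc__C": [0.1, 1, 10, 100], "svc__gamma": ["scale", 0.01, 0.1, 1]}
search = GridSearchCV(pipeline, param_grid, cv=StratifiedKFold(n_splits=5))

search.fit(X, y)
print(search.best_params_, f"cv accuracy: {search.best_score_:.3f}")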
2020
Franchini, Silvia; Terranova, Maria Chiara; Lo Re, Giuseppe; Salerno, Sergio; Midiri, Massimo; Vitabile, Salvatore
Evaluation of a Support Vector Machine Based Method for Crohn’s Disease Classification Book Section
In: Esposito, Anna; Faundez-Zanuy, Marcos; Morabito, Francesco Carlo; Pasero, Eros (Ed.): Neural Approaches to Dynamics of Signal Exchanges, pp. 313–327, Springer, Singapore, 2020, ISBN: 9789811389504.
Abstract | Links | BibTeX | Tags: Crohn’s disease classification, Feature extraction, Feature reduction, K-fold cross-validation, machine learning, Magnetic Resonance Enterography, Medical Imaging, Supervised learning, Support vector machines
@incollection{franchini_evaluation_2020,
title = {Evaluation of a Support Vector Machine Based Method for Crohn’s Disease Classification},
author = {Silvia Franchini and Maria Chiara Terranova and Giuseppe Lo Re and Sergio Salerno and Massimo Midiri and Salvatore Vitabile},
editor = {Anna Esposito and Marcos Faundez-Zanuy and Francesco Carlo Morabito and Eros Pasero},
url = {https://doi.org/10.1007/978-981-13-8950-4_29},
doi = {10.1007/978-981-13-8950-4_29},
isbn = {9789811389504},
year = {2020},
date = {2020-01-01},
urldate = {2023-03-20},
booktitle = {Neural Approaches to Dynamics of Signal Exchanges},
pages = {313–327},
publisher = {Springer},
address = {Singapore},
series = {Smart Innovation, Systems and Technologies},
abstract = {Crohn’s disease (CD) is a chronic, disabling inflammatory bowel disease that affects millions of people worldwide. CD diagnosis is a challenging issue that involves a combination of radiological, endoscopic, histological, and laboratory investigations. Medical imaging plays an important role in the clinical evaluation of CD. Enterography magnetic resonance imaging (E-MRI) has been proven to be a useful diagnostic tool for disease activity assessment. However, the manual classification process by expert radiologists is time-consuming and expensive. This paper proposes the evaluation of an automatic Support Vector Machine (SVM) based supervised learning method for CD classification. A real E-MRI dataset composed of 800 patients from the University of Palermo Policlinico Hospital (400 patients with histologically proved CD and 400 healthy patients) has been used to evaluate the proposed classification technique. For each patient, a team of radiology experts has extracted a vector composed of 20 features, usually associated with CD, from the related E-MRI examination, while the histological specimen results have been used as the ground-truth for CD diagnosis. The dataset composed of 800 vectors has been used to train and validate the SVM classifier. Automatic techniques for feature space reduction have been applied and validated by the radiologists to optimize the proposed classification method, while K-fold cross-validation has been used to improve the SVM classifier reliability. The measured indexes (sensitivity: 97.07%, specificity: 96.04%, negative predictive value: 97.24%, precision: 95.80%, accuracy: 96.54%, error: 3.46%) are better than the operator-based reference values reported in the literature. Experimental results also show that the proposed method outperforms the main standard classification techniques.},
keywords = {Crohn’s disease classification, Feature extraction, Feature reduction, K-fold cross-validation, machine learning, Magnetic Resonance Enterography, Medical Imaging, Supervised learning, Support vector machines},
pubstate = {published},
tppubtype = {incollection}
}
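Illustrative sketch (toy labels, not the paper's 800-patient dataset): the abstract above reports sensitivity, specificity, negative predictive value, precision and accuracy for a binary CD/healthy classifier. The snippet below shows how those indexes follow from a binary confusion matrix using scikit-learn.

# Evaluation indexes from a binary confusion matrix. The labels are toy values.
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]    # 1 = histologically proved CD
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)               # recall on CD patients
specificity = tn / (tn + fp)
npv         = tn / (tn + fn)
precision   = tp / (tp + fp)
accuracy    = (tp + tn) / (tp + tn + fp + fn)
print(sensitivity, specificity, npv, precision, accuracy)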