AHCI RESEARCH GROUP
Publications
Papers published in international journals,
proceedings of conferences, workshops and books.
2024
Clocchiatti, A.; Fumero, N.; Soccini, A. M.
Character Animation Pipeline based on Latent Diffusion and Large Language Models Proceedings Article
In: Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR, pp. 398–405, Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 979-8-3503-7202-1.
@inproceedings{clocchiatti_character_2024,
title = {Character Animation Pipeline based on Latent Diffusion and Large Language Models},
author = {A. Clocchiatti and N. Fumero and A. M. Soccini},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85187217072&doi=10.1109%2fAIxVR59861.2024.00067&partnerID=40&md5=d88b9ba7c80d49b60fd0d7acd5e7c4f0},
doi = {10.1109/AIxVR59861.2024.00067},
isbn = {979-8-3503-7202-1},
year = {2024},
date = {2024-01-01},
booktitle = {Proc. - IEEE Int. Conf. Artif. Intell. Ext. Virtual Real., AIxVR},
pages = {398–405},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Artificial intelligence and deep learning techniques are revolutionizing the film production pipeline. The majority of current screenplay-to-animation pipelines focus on understanding the screenplay through natural language processing techniques and on generating the animation through custom engines, missing the possibility of customizing the characters. To address these issues, we propose a high-level pipeline for generating 2D characters and animations starting from screenplays, through a combination of Latent Diffusion Models and Large Language Models. Our approach uses ChatGPT to generate character descriptions starting from the screenplay. Then, using that data, it generates images of custom characters with Stable Diffusion and animates them according to their actions in different scenes. The proposed approach avoids well-known problems in generative AI tools such as temporal inconsistency and lack of control over the outcome. The results suggest that the pipeline is consistent and reliable, benefiting industries ranging from film production to virtual, augmented and extended reality content creation. © 2024 IEEE.},
keywords = {Animation, Animation pipeline, Artificial intelligence, Augmented Reality, Character animation, Computational Linguistics, Computer animation, Deep learning, Diffusion, E-Learning, Extended reality, Film production, Generative art, Language Model, Learning systems, Learning techniques, Natural language processing systems, Pipelines, Production pipelines, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
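The two-stage idea in the abstract above, an LLM that turns a screenplay into character descriptions and a latent diffusion model that renders them, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the openai Node SDK and a locally running AUTOMATIC1111 Stable Diffusion WebUI started with its --api flag; the model name, prompts, and file names are placeholders.

// Sketch of the screenplay -> description -> image chain from the record
// above. Assumes `npm install openai` and an AUTOMATIC1111 WebUI serving
// its HTTP API on port 7860; both are stand-ins for the paper's tooling.
import OpenAI from "openai";
import { writeFileSync } from "node:fs";

const llm = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Stage 1: ask the LLM for a visual description of one character.
async function describeCharacter(screenplay: string, name: string): Promise<string> {
  const res = await llm.chat.completions.create({
    model: "gpt-4o-mini", // illustrative; the paper names ChatGPT
    messages: [
      { role: "system", content: "Return a concise visual description of the named character, usable as a text-to-image prompt." },
      { role: "user", content: `Character: ${name}\n\nScreenplay:\n${screenplay}` },
    ],
  });
  return res.choices[0].message.content ?? "";
}

// Stage 2: render the character with Stable Diffusion (txt2img endpoint).
async function renderCharacter(description: string): Promise<void> {
  const res = await fetch("http://127.0.0.1:7860/sdapi/v1/txt2img", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt: `2D character sheet, ${description}`, steps: 30, width: 512, height: 768 }),
  });
  const { images } = (await res.json()) as { images: string[] }; // base64 PNGs
  writeFileSync("character.png", Buffer.from(images[0], "base64"));
}

const screenplay = "INT. LIGHTHOUSE - NIGHT. MIRA, a weathered keeper, trims the lamp.";
describeCharacter(screenplay, "MIRA").then(renderCharacter);

Keeping the description as an explicit intermediate value is what gives the pipeline control over the outcome, the shortcoming the abstract attributes to end-to-end generative tools.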
Si, J.; Yang, S.; Song, J.; Son, S.; Lee, S.; Kim, D.; Kim, S.
Generating and Integrating Diffusion Model-Based Panoramic Views for Virtual Interview Platform Proceedings Article
In: IEEE Int. Conf. Artif. Intell. Eng. Technol., IICAIET, pp. 343–348, Institute of Electrical and Electronics Engineers Inc., 2024, ISBN: 979-8-3503-8969-2.
@inproceedings{si_generating_2024,
title = {Generating and Integrating Diffusion Model-Based Panoramic Views for Virtual Interview Platform},
author = {J. Si and S. Yang and J. Song and S. Son and S. Lee and D. Kim and S. Kim},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85209663031&doi=10.1109%2fIICAIET62352.2024.10730450&partnerID=40&md5=a52689715ec912c54696948c34fc0263},
doi = {10.1109/IICAIET62352.2024.10730450},
isbn = {979-8-3503-8969-2},
year = {2024},
date = {2024-01-01},
booktitle = {IEEE Int. Conf. Artif. Intell. Eng. Technol., IICAIET},
pages = {343–348},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {This paper presents a new approach to improving virtual interview platforms in education, which are gaining significant attention. This study aims to simplify the complex manual process of equipment setup to enhance the realism and reliability of virtual interviews. To this end, it proposes a method for automatically constructing 3D virtual interview environments using diffusion technology in generative AI. We exploit a diffusion model capable of generating high-quality panoramic images and generate images of interview rooms that deliver immersive interview experiences via refined text prompts. The resulting imagery is then reconstituted as 3D VR content using the Unity engine, facilitating enhanced interaction and engagement within virtual environments. This research compares and analyzes various methods presented in related research and proposes a new process for efficiently constructing 360-degree virtual environments. When participants wore an Oculus Quest 2 and experienced the virtual environment created using the proposed method, they reported a high sense of immersion similar to that of an actual interview environment. © 2024 IEEE.},
keywords = {AI, Deep learning, Diffusion, Diffusion Model, Diffusion technology, Digital elevation model, High quality, Manual process, Model-based OPC, New approaches, Panorama, Panoramic views, Virtual environments, Virtual Interview, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
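The panorama step in the record above can be sketched the same way. The record does not name the exact panorama model, so as a stand-in the request below asks an AUTOMATIC1111-style txt2img endpoint for a 2:1 equirectangular image; the endpoint, prompt, and sizes are illustrative.

// Sketch of the text -> 360-degree interview room idea. A 2:1 image is
// requested because that is the aspect ratio an equirectangular
// projection expects; Unity can apply it via a Skybox/Panoramic material.
import { writeFileSync } from "node:fs";

async function generateInterviewRoom(prompt: string): Promise<void> {
  const res = await fetch("http://127.0.0.1:7860/sdapi/v1/txt2img", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      prompt: `equirectangular 360 panorama, ${prompt}`,
      width: 2048, // 2:1 aspect ratio for equirectangular mapping
      height: 1024,
      steps: 30,
    }),
  });
  const { images } = (await res.json()) as { images: string[] }; // base64 PNGs
  writeFileSync("interview_room.png", Buffer.from(images[0], "base64"));
}

generateInterviewRoom("modern corporate interview room, daylight, two chairs, photorealistic");

Assigning the saved texture to a Unity Skybox/Panoramic material surrounds the camera with the generated room, which corresponds to the abstract's "reconstituted as 3D VR content using the Unity engine" step.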
Chamola, V.; Bansal, G.; Das, T. K.; Hassija, V.; Sai, S.; Wang, J.; Zeadally, S.; Hussain, A.; Yu, F. R.; Guizani, M.; Niyato, D.
Beyond Reality: The Pivotal Role of Generative AI in the Metaverse Journal Article
In: IEEE Internet of Things Magazine, vol. 7, no. 4, pp. 126–135, 2024, ISSN: 2576-3180.
@article{chamola_beyond_2024,
title = {Beyond Reality: The Pivotal Role of Generative AI in the Metaverse},
author = {V. Chamola and G. Bansal and T. K. Das and V. Hassija and S. Sai and J. Wang and S. Zeadally and A. Hussain and F. R. Yu and M. Guizani and D. Niyato},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85198004913&doi=10.1109%2fIOTM.001.2300174&partnerID=40&md5=03c679195e42e677de596d7a38df0333},
doi = {10.1109/IOTM.001.2300174},
issn = {2576-3180},
year = {2024},
date = {2024-01-01},
journal = {IEEE Internet of Things Magazine},
volume = {7},
number = {4},
pages = {126–135},
abstract = {The Metaverse, an interconnected network of immersive digital realms, is poised to reshape the future by seamlessly merging physical reality with virtual environments. Its potential to revolutionize diverse aspects of human existence, from entertainment to commerce, underscores its significance. At the heart of this transformation lies Generative AI, a branch of artificial intelligence focused on creating novel content. Generative AI serves as a catalyst, propelling the Metaverse's evolution by enhancing it with immersive experiences. The Metaverse comprises three pivotal domains, namely text, visual, and audio, and its fabric intertwines with Generative AI models, ushering in innovative interactions. Within the visual domain, the triad of image, video, and 3D object generation sets the stage for engaging virtual landscapes. Key to this evolution are five generative models: Transformers, Diffusion, Autoencoders, Autoregressive models, and Generative Adversarial Networks (GANs). These models empower the Metaverse, enhancing it with dynamic and diverse content. Notably, technologies like BARD, Point-E, Stable Diffusion, DALL-E, GPT, and AIVA, among others, wield these models to enrich the Metaverse across domains. By discussing the technical issues and real-world applications, this study reveals the intricate tapestry of AI's role in the Metaverse. Anchoring these insights is a case study illuminating Stable Diffusion's role in metamorphosing the virtual realm. Collectively, this exploration illuminates the symbiotic relationship between Generative AI and the Metaverse, foreshadowing a future where immersive, interactive, and personalized experiences redefine human engagement with digital landscapes. © 2018 IEEE.},
keywords = {Catalyst, 3D object, Diffusion, Generative adversarial networks, Generative model, Image objects, Immersive, Interconnected network, Metaverses, Physical reality, Video objects, Virtual landscapes, Virtual Reality},
pubstate = {published},
tppubtype = {article}
}
2023
Si, J.; Yang, S.; Kim, D.; Kim, S.
Metaverse Interview Room Creation With Virtual Interviewer Generation Using Diffusion Model Proceedings Article
In: Proc. IEEE Asia-Pacific Conf. Comput. Sci. Data Eng., CSDE, Institute of Electrical and Electronics Engineers Inc., 2023, ISBN: 979-8-3503-4107-2.
@inproceedings{si_metaverse_2023,
title = {Metaverse Interview Room Creation With Virtual Interviewer Generation Using Diffusion Model},
author = {J. Si and S. Yang and D. Kim and S. Kim},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85190586380&doi=10.1109%2fCSDE59766.2023.10487677&partnerID=40&md5=9ea374e1fef25598abf12d7636054d89},
doi = {10.1109/CSDE59766.2023.10487677},
isbn = {979-8-3503-4107-2},
year = {2023},
date = {2023-01-01},
booktitle = {Proc. IEEE Asia-Pacific Conf. Comput. Sci. Data Eng., CSDE},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
abstract = {Virtual interviews are an effective way to respond quickly to the changing trends of our time and adapt flexibly to the hiring processes of various organizations. Through this method, applicants have the opportunity to practice their interview skills and receive feedback, greatly aiding their job preparation. Additionally, experiencing a virtual interview environment that is similar to an actual one enables them to adapt more easily to a variety of new interview situations. This paper delves deeply into the virtual interview environment implemented by combining cutting-edge metaverse technology and generative AI. Specifically, it focuses on creating an environment utilizing realistic Diffusion models to generate interviewers, enabling the provision of scenarios that are similar to actual interviews. © 2023 IEEE.},
keywords = {Changing trends, Cutting edges, Diffusion, Diffusion Model, Generative AI, Hiring process, Interview skills, It focus, Metaverse, Metaverses, Unity, Virtual Interview, Virtual Reality},
pubstate = {published},
tppubtype = {inproceedings}
}
2018
Scianna, Andrea; La Guardia, Marcello
3D virtual CH interactive information systems for a smart web browsing experience for desktop PCs and mobile devices Proceedings Article
In: Remondino, F.; Toschi, I.; Fuse, T. (Eds.): International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives, pp. 1053–1059, International Society for Photogrammetry and Remote Sensing, 2018, (Issue: 2).
@inproceedings{scianna_3d_2018,
title = {3D virtual CH interactive information systems for a smart web browsing experience for desktop PCs and mobile devices},
author = {Andrea Scianna and Marcello La Guardia},
editor = {F. Remondino and I. Toschi and T. Fuse},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85048377074&doi=10.5194%2fisprs-archives-XLII-2-1053-2018&partnerID=40&md5=1c2badcf13bc32fcc40282085af9a980},
doi = {10.5194/isprs-archives-XLII-2-1053-2018},
year = {2018},
date = {2018-01-01},
booktitle = {International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives},
volume = {42},
pages = {1053–1059},
publisher = {International Society for Photogrammetry and Remote Sensing},
abstract = {Recently, the diffusion of knowledge on Cultural Heritage (CH) has become an element of primary importance for its valorization. At the same time, the diffusion of surveys based on Unmanned Aerial Vehicle (UAV) technologies and new methods of photogrammetric reconstruction have opened new possibilities for 3D CH representation. Furthermore, the recent development of faster and more stable internet connections leads people to increase their use of mobile devices. In light of all this, the development of Virtual Reality (VR) environments applied to CH is strategic for the diffusion of knowledge in a smart solution. In particular, the present work shows how, starting from a basic survey and the subsequent photogrammetric reconstruction of a cultural good, it is possible to build a 3D CH interactive information system usable on desktop and mobile devices. For this experimentation, the Arab-Norman church of the Trinity of Delia (in Castelvetrano, Sicily, Italy) has been adopted as the case study. The survey operations have been carried out considering different rapid methods of acquisition (UAV camera, SLR camera and smartphone camera). The web platform to publish the 3D information has been built using the HTML5 markup language and WebGL JavaScript libraries (the Three.js library). This work presents the construction of a 3D navigation system for web browsing of a virtual CH environment, with the integration of first-person controls and 3D popup links. This contribution adds a further step to enrich the possibilities of open-source technologies applied to the world of CH valorization on the web. © Authors 2018. CC BY 4.0 License.},
note = {Issue: 2},
keywords = {3-d modeling, 3D Modelling, Antennas, Cameras, Cultural heritage, Cultural heritages, Diffusion, Diffusion of knowledge, HTML, HTML5, Image Reconstruction, Information Systems, Information use, Interactive informations, Internet connection, Libraries, Navigation systems, Open-source technology, Photogrammetry, Surveys, Unmanned aerial vehicles (UAV), Virtual Reality, WebGL},
pubstate = {published},
tppubtype = {inproceedings}
}
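The browser-side pattern this abstract describes, a photogrammetric model browsed in first person with clickable 3D popup links, maps onto a short Three.js sketch. Assumptions: the three npm package with example-module import paths (newer releases expose the same classes under three/addons/), a hypothetical glTF export of the survey, and a hypothetical userData.infoUrl field carrying each popup's link.

// Sketch of a first-person CH viewer with popup links, after the record above.
import * as THREE from "three";
import { GLTFLoader } from "three/examples/jsm/loaders/GLTFLoader.js";
import { PointerLockControls } from "three/examples/jsm/controls/PointerLockControls.js";

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(60, innerWidth / innerHeight, 0.1, 100);
const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(innerWidth, innerHeight);
document.body.appendChild(renderer.domElement);
scene.add(new THREE.AmbientLight(0xffffff, 1));

// Photogrammetric mesh exported to glTF (hypothetical file name).
new GLTFLoader().load("trinity_of_delia.glb", (gltf) => scene.add(gltf.scene));

// First-person look controls: click the canvas to capture the mouse
// (keyboard movement omitted for brevity).
const controls = new PointerLockControls(camera, renderer.domElement);
renderer.domElement.addEventListener("click", () => controls.lock());

// 3D popup links: raycast on pointer-down and open the hit object's metadata.
const raycaster = new THREE.Raycaster();
addEventListener("pointerdown", (e) => {
  const ndc = new THREE.Vector2((e.clientX / innerWidth) * 2 - 1, -(e.clientY / innerHeight) * 2 + 1);
  raycaster.setFromCamera(ndc, camera);
  const hit = raycaster.intersectObjects(scene.children, true)[0];
  if (hit?.object.userData.infoUrl) window.open(hit.object.userData.infoUrl);
});

renderer.setAnimationLoop(() => renderer.render(scene, camera));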