AHCI RESEARCH GROUP
Publications
Papers published in international journals,
proceedings of conferences, workshops and books.
OUR RESEARCH
Scientific Publications
How to
Here you can find the complete list of our publications.
You can use the tag cloud to select only the papers dealing with specific research topics.
You can expand the Abstract, Links and BibTeX record for each paper.
2024
Lee, S.; Park, W.; Lee, K.
Building Knowledge Base of 3D Object Assets Using Multimodal LLM AI Model Proceedings Article
In: Int. Conf. ICT Convergence, pp. 416–418, IEEE Computer Society, 2024, ISSN: 2162-1233, ISBN: 979-8-3503-6463-7.
Tags: 3D object, Asset management, Content services, Information Management, Knowledge Base, Large language model, LLM, Multi-Modal AI, Reusability, Visual effects, XR
@inproceedings{lee_building_2024,
title = {Building Knowledge Base of 3D Object Assets Using Multimodal LLM AI Model},
author = {S. Lee and W. Park and K. Lee},
url = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-85217636269&doi=10.1109%2fICTC62082.2024.10827434&partnerID=40&md5=581ee8ca50eb3dae15dc9675971cf428},
doi = {10.1109/ICTC62082.2024.10827434},
issn = {2162-1233},
isbn = {979-8-3503-6463-7},
year = {2024},
date = {2024-01-01},
booktitle = {Int. Conf. ICT Convergence},
pages = {416–418},
publisher = {IEEE Computer Society},
abstract = {The proliferation of various XR (eXtended Reality) services and the increasing incorporation of visual effects into existing content services have led to an exponential rise in the demand for 3D object assets. This paper describes an LLM (Large Language Model)-based multimodal AI model pipeline that can be applied to a generative AI model for creating new 3D objects or restructuring the asset management system to enhance the reusability of existing 3D objects. By leveraging a multimodal AI model, we derived descriptive text for assets such as 3D object, 2D image at a human-perceptible level, rather than mere data, and subsequently used an LLM to generate knowledge triplets for constructing an asset knowledge base. The applicability of this pipeline was verified using actual 3D objects from a content production company. Future work will focus on improving the quality of the generated knowledge triplets themselves by training the multimodal AI model with real-world content usage assets. © 2024 IEEE.},
keywords = {3D object, Asset management, Content services, Exponentials, Information Management, Knowledge Base, Language Model, Large language model, LLM, Multi-modal, Multi-Modal AI, Reusability, Visual effects, XR},
pubstate = {published},
tppubtype = {inproceedings}
}