• Title/Abstract/Keywords: Image Generative AI

Search results: 45 items (processing time 0.022 s)

CINEMAPIC : Generative AI-based movie concept photo booth system (시네마픽 : 생성형 AI기반 영화 컨셉 포토부스 시스템)

  • Seokhyun Jeong;Seungkyu Leem;Jungjin Lee
    • Journal of the Korea Computer Graphics Society
    • /
    • Vol. 30, No. 3
    • /
    • pp.149-158
    • /
    • 2024
  • Photo booths have traditionally provided a fun and easy way to capture and print photos to cherish memories. These booths allow individuals to capture their desired poses and props, sharing memories with friends and family. To enable diverse expressions, generative AI-powered photo booths have emerged. However, existing AI photo booths face challenges such as difficulty in taking group photos, an inability to accurately reflect users' poses, and the challenge of applying different concepts to individual subjects. To tackle these issues, we present CINEMAPIC, a photo booth system that allows users to freely choose poses, positions, and concepts for their photos. The system workflow includes three main steps: pre-processing, generation, and post-processing to apply individualized concepts. To produce high-quality group photos, the system generates a transparent image for each character and enhances the backdrop-composited image through a small number of denoising steps. The workflow is accelerated by applying an optimized diffusion model and GPU parallelization. The system was implemented as a prototype, and its effectiveness was validated through a user study and a large-scale pilot operation involving approximately 400 users. The results showed a significant preference for the proposed system over existing methods, confirming its potential for real-world photo booth applications. The proposed CINEMAPIC photo booth is expected to lead the way in a more creative and differentiated market, with potential for widespread application in various fields.
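The compositing step the abstract describes, placing per-character transparent images over a backdrop before the final denoising refinement, can be sketched with plain alpha compositing. This is an illustrative sketch, not the paper's implementation; the pixel data and helper name are assumptions.

```python
# Hedged sketch: alpha-composite RGBA character layers over an RGB backdrop,
# as in the pre-/post-processing stage of a system like CINEMAPIC.

def composite(backdrop, layers):
    """Composite RGBA layers (in order) over an RGB backdrop, per pixel."""
    out = [row[:] for row in backdrop]          # copy the RGB backdrop
    for layer in layers:                        # each layer: rows of RGBA pixels
        for y, row in enumerate(layer):
            for x, (r, g, b, a) in enumerate(row):
                alpha = a / 255.0               # 0 = fully transparent
                br, bg, bb = out[y][x]
                out[y][x] = (
                    round(r * alpha + br * (1 - alpha)),
                    round(g * alpha + bg * (1 - alpha)),
                    round(b * alpha + bb * (1 - alpha)),
                )
    return out

# 2x2 gray backdrop; one character layer covering only the top-left pixel.
backdrop = [[(128, 128, 128)] * 2 for _ in range(2)]
layer = [[(255, 0, 0, 255), (0, 0, 0, 0)],
         [(0, 0, 0, 0),     (0, 0, 0, 0)]]
result = composite(backdrop, [layer])
```

In the described system, the composited result would then pass through a few denoising steps of a diffusion model to blend the characters into the backdrop.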

Empirical Research on the Interaction between Visual Art Creation and Artificial Intelligence Collaboration (시각예술 창작과 인공지능 협업의 상호작용에 관한 실증연구)

  • Hyeonjin Kim;Yeongjo Kim;Donghyeon Yun;Hanjin Lee
    • The Journal of the Convergence on Culture Technology
    • /
    • Vol. 10, No. 1
    • /
    • pp.517-524
    • /
    • 2024
  • Generative AI, exemplified by models like ChatGPT, has revolutionized human-machine interactions in the 21st century. As these advancements permeate various sectors, their intersection with the arts is both promising and challenging. Despite the arts' historical resistance to AI replacement, recent developments have sparked active research in AI's role in artistry. This study delves into the potential of AI in visual arts education, highlighting the necessity of swift adaptation amidst the Fourth Industrial Revolution. This research, conducted at a 4-year global higher education institution located in Gyeongbuk, involved 70 participants who took part in a creative convergence module course project. The study aimed to examine the influence of AI collaboration in visual arts, analyzing distinctions across majors, grades, and genders. The results indicate that creative activities with AI positively influence students' creativity and digital media literacy. Based on these findings, there is a need to further develop effective educational strategies and directions that incorporate AI.

Technical Trends in Hyperscale Artificial Intelligence Processors (초거대 인공지능 프로세서 반도체 기술 개발 동향)

  • W. Jeon;C.G. Lyuh
    • Electronics and Telecommunications Trends
    • /
    • Vol. 38, No. 5
    • /
    • pp.1-11
    • /
    • 2023
  • The emergence of generative hyperscale artificial intelligence (AI) has enabled new services, such as image-generating AI and conversational AI based on large language models. Such services likely lead to the influx of numerous users, who cannot be handled using conventional AI models. Furthermore, the exponential increase in training data, computations, and high user demand of AI models has led to intensive hardware resource consumption, highlighting the need to develop domain-specific semiconductors for hyperscale AI. In this technical report, we describe development trends in technologies for hyperscale AI processors pursued by domestic and foreign semiconductor companies, such as NVIDIA, Graphcore, Tesla, Google, Meta, SAPEON, FuriosaAI, and Rebellions.

Game Character Image Generation Using GAN (GAN을 이용한 게임 캐릭터 이미지 생성)

  • Jeoung-Gi Kim;Myoung-Jun Jung;Kyung-Ae Cha
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • Vol. 18, No. 5
    • /
    • pp.241-248
    • /
    • 2023
  • A GAN (Generative Adversarial Network) learns from real images or text, infers their common features, and can thereby produce highly sophisticated synthetic imitations. It is therefore useful in fields that require the creation of large numbers of images or graphics. In this paper, we implement a GAN-based game character creation AI that can dramatically reduce illustration design costs by expanding and automating game character image creation. This is highly efficient for game development, as it allows mass production of varied character images at low cost.
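The adversarial objective behind a GAN like the one described can be sketched with the standard binary cross-entropy losses. This is a minimal illustration of the loss computation, not the paper's model; the logit values are invented for the example.

```python
import math

def sigmoid(x):
    # logistic function mapping a discriminator logit to a probability
    return 1.0 / (1.0 + math.exp(-x))

def bce(p, label):
    # binary cross-entropy for a single predicted probability p
    return -math.log(p) if label == 1 else -math.log(1.0 - p)

# Discriminator objective: score real samples toward 1 and generated ones
# toward 0. Here the discriminator is doing well (logits +2 / -2).
d_loss = 0.5 * (bce(sigmoid(2.0), 1)      # real image scored high: good
                + bce(sigmoid(-2.0), 0))  # fake image scored low: good

# Non-saturating generator objective: push D's score on fakes toward 1.
# A large value here means the generator is currently being caught.
g_loss = bce(sigmoid(-2.0), 1)
```

Training alternates gradient steps on these two losses: the generator improves until its outputs lower `g_loss`, which in turn raises the discriminator's task difficulty.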

Best Practice on Automatic Toon Image Creation from JSON File of Message Sequence Diagram via Natural Language based Requirement Specifications

  • Hyuntae Kim;Ji Hoon Kong;Hyun Seung Son;R. Young Chul Kim
    • International journal of advanced smart convergence
    • /
    • Vol. 13, No. 1
    • /
    • pp.99-107
    • /
    • 2024
  • In AI image generation tools, most general users must craft an effective prompt, in the form of queries or statements, to elicit the desired response (image) from the AI model. We, however, are software engineers who focus on software processes. At the early stage of the process, we use informal and formal requirement specifications. Here, we adapt the natural language approach to requirement engineering and toon engineering. Most generative AI tools do not produce the same image for the same query, because the same data assets are not used each time. To solve this problem, we apply informal requirement engineering and linguistics to toon creation. We therefore propose a sequence diagram and image generation mechanism that analyzes key objects and attributes as an informal natural-language requirement analysis. Morphemes and semantic roles are identified by analyzing the natural language with linguistic methods. Based on the analysis results, a sequence diagram is generated, and an image is then generated from the diagram. Through the proposed mechanism, we expect consistent image generation that uses the same image element assets for the same query.
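The pipeline from a natural-language requirement to a message-sequence JSON can be sketched with a naive subject-verb-object extraction. This toy parser and its sentence pattern are assumptions for illustration; the paper's approach uses proper morphological and semantic-role analysis rather than a regular expression.

```python
import json
import re

def parse_requirement(sentence):
    """Naive SVO extraction for sentences shaped 'The <actor> <verb>s the <object>.'
    A real system would identify morphemes and semantic roles linguistically."""
    m = re.match(r"The (\w+) (\w+)s the (\w+)\.?$", sentence)
    if not m:
        raise ValueError("unparsed requirement: " + sentence)
    actor, verb, obj = m.groups()
    # One message arrow of a message sequence diagram.
    return {"from": actor, "message": verb, "to": obj}

reqs = ["The user sends the request.", "The server returns the image."]
diagram = {"messages": [parse_requirement(r) for r in reqs]}
js = json.dumps(diagram)  # JSON handed to the downstream image generator
```

Because the same JSON (and hence the same image-element assets) is produced for the same requirement text, downstream image generation can stay consistent across runs.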

Enhanced ACGAN based on Progressive Step Training and Weight Transfer

  • Jinmo Byeon;Inshil Doh;Dana Yang
    • Journal of the Korea Society of Computer and Information
    • /
    • Vol. 29, No. 3
    • /
    • pp.11-20
    • /
    • 2024
  • Among generative models in Artificial Intelligence (AI), the Generative Adversarial Network (GAN) in particular has been successful in various applications such as image processing, density estimation, and style transfer. While GAN models including Conditional GAN (CGAN), CycleGAN, and BigGAN have been extended and improved, researchers still face challenges in real-world applications in specific domains such as disaster simulation, healthcare, and urban planning, owing to data scarcity and unstable training that causes image distortion. This paper proposes a new progressive learning methodology called Progressive Step Training (PST), based on the Auxiliary Classifier GAN (ACGAN), which discriminates class labels, and leveraging the progressive learning approach of the Progressive Growing of GAN (PGGAN). Compared with conventional methods, the PST model achieves 70.82% faster stabilization, a 51.3% lower standard deviation, stable convergence of loss values in the later high-resolution stages, and a 94.6% faster loss reduction.
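The weight-transfer idea in progressive training, carrying trained parameters from a low-resolution stage into the larger next-stage network while newly added layers keep their initialization, can be sketched with plain parameter dictionaries. The layer names and values are illustrative assumptions, not the paper's architecture.

```python
def transfer_weights(src, dst):
    """Copy parameters for layer names shared between the stage-N model (src)
    and the larger stage-(N+1) model (dst); layers new to dst keep their init."""
    out = dict(dst)
    for name, w in src.items():
        if name in out:
            out[name] = list(w)   # reuse the already-trained weights
    return out

# Stage 1 trained at low resolution; stage 2 adds a higher-resolution block.
stage1 = {"conv_4x4": [0.5, -0.2], "to_rgb": [0.1]}
stage2_init = {"conv_4x4": [0.0, 0.0], "conv_8x8": [0.0, 0.0], "to_rgb": [0.0]}
stage2 = transfer_weights(stage1, stage2_init)
```

Starting each stage from the previous stage's weights is what lets progressive schemes stabilize faster than training every resolution from scratch.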

Non-pneumatic Tire Design System based on Generative Adversarial Networks (적대적 생성 신경망 기반 비공기압 타이어 디자인 시스템)

  • JuYong Seong;Hyunjun Lee;Sungchul Lee
    • Journal of Platform Technology
    • /
    • Vol. 11, No. 6
    • /
    • pp.34-46
    • /
    • 2023
  • The design of non-pneumatic tires, in which the space between the wheel and the tread is filled with elastomeric compounds or polygonal spokes, has become an important research topic in the automotive and aerospace industries. In this study, a system for designing non-pneumatic tires was built around a generative adversarial network. We specifically examined factors that could impact the design, including the type of non-pneumatic tire, its intended usage environment, manufacturing techniques, distinctions from pneumatic tires, and how spoke design affects load distribution. Using OpenCV, various shapes and spoke configurations were generated as images, and a GAN model (Projected GAN) was trained on them to generate shapes and spokes for non-pneumatic tire designs. The generated non-pneumatic tire designs were labeled as usable or not, and a Vision Transformer image classification model was trained on these labels. Evaluation of the classification model shows convergence to a near-zero loss and a 99% accuracy rate, confirming the validity of the generated non-pneumatic tire designs.
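The procedural generation of spoke-pattern training images can be sketched without OpenCV: a pixel belongs to a spoke when its polar angle around the image center is close to one of n evenly spaced spoke axes. The function name and parameters are assumptions for illustration, not the paper's generator.

```python
import math

def spoke_mask(size, n_spokes, spoke_width=0.15):
    """Binary size x size image of n radial spokes: a pixel is 1 when its
    polar angle is within spoke_width radians of the nearest spoke axis."""
    c = (size - 1) / 2            # image center
    period = 2 * math.pi / n_spokes
    img = []
    for y in range(size):
        row = []
        for x in range(size):
            ang = math.atan2(y - c, x - c) % period
            near = min(ang, period - ang)   # distance to nearest axis
            row.append(1 if near < spoke_width else 0)
        img.append(row)
    return img

mask = spoke_mask(32, 6)   # a 6-spoke wheel cross-section mask
```

Varying `n_spokes` and `spoke_width` (and, in a fuller version, spoke curvature and rim thickness) yields a labeled family of shapes on which a GAN can then be trained.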


Research on Generative AI for Korean Multi-Modal Montage App (한국형 멀티모달 몽타주 앱을 위한 생성형 AI 연구)

  • Lim, Jeounghyun;Cha, Kyung-Ae;Koh, Jaepil;Hong, Won-Kee
    • Journal of Service Research and Studies
    • /
    • Vol. 14, No. 1
    • /
    • pp.13-26
    • /
    • 2024
  • Multi-modal generation is the process of generating results from a variety of information, such as text, images, and audio. With the rapid development of AI technology, a growing number of multi-modal systems synthesize different types of data to produce results. In this paper, we present an AI system that uses speech and text recognition to describe a person and generate a montage image. While existing montage generation technology is based on the appearance of Westerners, the montage generation system developed in this paper learns a model based on Korean facial features. It is therefore possible to create more accurate and effective Korean montage images from Korean-specific multi-modal voice and text. Since the developed montage generation app can be used to produce a draft montage, it can dramatically reduce the manual labor of montage production personnel. For this purpose, we utilized persona-based virtual person montage data provided by the AI-Hub of the National Information Society Agency. AI-Hub is an AI integration platform aimed at providing a one-stop service by building the artificial intelligence training data necessary for the development of AI technology and services. The image generation system was implemented using VQGAN, a deep learning model for generating high-resolution images, and KoDALLE, a Korean language-based image generation model. We confirmed that the trained AI model creates a montage image of a face very similar to the one described via voice and text. To verify the practicality of the developed montage generation app, 10 testers used it, and more than 70% responded that they were satisfied. The montage generator can be used in various fields, such as criminal investigation, to turn descriptions of facial features into images.

GAN-based research for high-resolution medical image generation (GAN 기반 고해상도 의료 영상 생성을 위한 연구)

  • Ko, Jae-Yeong;Cho, Baek-Hwan;Chung, Myung-Jin
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • Korea Information Processing Society 2020 Spring Conference
    • /
    • pp.544-546
    • /
    • 2020
  • Frequently encountered problems when conducting AI and machine-learning research with medical data include data imbalance and data scarcity; in particular, obtaining sufficient curated data is a major difficulty. To address this, this study develops a framework that generates high-resolution medical images based on a GAN (Generative Adversarial Network). By learning the gradients at every scale simultaneously, the framework can generate high-resolution images quickly. We designed a neural network for generating high-resolution images and, through performance comparisons with PGGAN and StyleGAN, confirmed that the proposed model can generate high-quality, high-resolution medical images faster. The framework can thus be applied to research such as data augmentation and anomaly detection, helping to resolve the data-scarcity and data-imbalance problems of medical imaging in AI and machine-learning research.

An Image-to-Image Translation GAN Model for Dental Prosthesis Design (치아 보철물 디자인을 위한 이미지 대 이미지 변환 GAN 모델)

  • Tae-Min Kim;Jae-Gon Kim
    • Journal of Information Technology Services
    • /
    • Vol. 22, No. 5
    • /
    • pp.87-98
    • /
    • 2023
  • Traditionally, tooth restoration has been carried out by replicating teeth using plaster-based materials. However, recent technological advances have simplified the production process through the introduction of computer-aided design(CAD) systems. Nevertheless, dental restoration varies among individuals, and the skill level of dental technicians significantly influences the accuracy of the manufacturing process. To address this challenge, this paper proposes an approach to designing personalized tooth restorations using Generative Adversarial Network(GAN), a widely adopted technique in computer vision. The primary objective of this model is to create customized dental prosthesis for each patient by utilizing 3D data of the specific teeth to be treated and their corresponding opposite tooth. To achieve this, the 3D dental data is converted into a depth map format and used as input data for the GAN model. The proposed model leverages the network architecture of Pixel2Style2Pixel, which has demonstrated superior performance compared to existing models for image conversion and dental prosthesis generation. Furthermore, this approach holds promising potential for future advancements in dental and implant production.