• Title/Summary/Keyword: Text-to-Image Generative AI

Transforming Text into Video: A Proposed Methodology for Video Production Using the VQGAN-CLIP Image Generative AI Model

  • SukChang Lee
    • International Journal of Advanced Culture Technology / v.11 no.3 / pp.225-230 / 2023
  • With the development of AI technology, discussion of text-to-image generative AI is growing. We present a generative AI video production method and outline a methodology for producing personalized AI-generated videos, with the aim of broadening the video domain. We examined the procedural steps involved in AI-driven video production and directly implemented a video creation approach using the VQGAN-CLIP model. The outputs of the VQGAN-CLIP model had a relatively moderate resolution and frame rate and were predominantly abstract images. These characteristics suggest potential applicability to OTT-based video content and the visual arts. AI-driven video production techniques are expected to see wider use in the future.

Generative AI-based Exterior Building Design Visualization Approach in the Early Design Stage - Leveraging Architects' Style-trained Models - (생성형 AI 기반 초기설계단계 외관디자인 시각화 접근방안 - 건축가 스타일 추가학습 모델 활용을 바탕으로 -)

  • Yoo, Youngjin;Lee, Jin-Kook
    • Journal of KIBIM / v.14 no.2 / pp.13-24 / 2024
  • This research proposes a novel visualization approach that uses generative AI to render photorealistic images of architectural alternatives in the early design phase. Photorealistic rendering describes alternatives intuitively and facilitates clear communication between stakeholders. However, the conventional rendering process, which relies on 3D modelling and rendering engines, demands sophisticated models and processing time. In this context, the paper proposes a rendering approach that employs the text-to-image method to generate a broader range of intuitive and relevant reference images. It also employs a text-to-image method focused on producing a diverse array of alternatives that reflect architects' styles when visualizing the exteriors of residential buildings from mass model images. To achieve this, fine-tuning for architects' styles was conducted using the Low-Rank Adaptation (LoRA) method. This approach, supported by fine-tuned models, allows not only alternatives with a single style applied but also the fusion of two or more styles to generate new alternatives. Using the proposed approach, we generated more than 15,000 meaningful images, with each image taking only about 5 seconds to produce. This demonstrates that the generative AI-based visualization approach significantly reduces the labour and time required in conventional visualization processes and holds significant potential for transforming abstract ideas into tangible images, even in the early stages of design.
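
The LoRA idea mentioned in this abstract can be illustrated with a small sketch. LoRA keeps a pretrained weight matrix W0 frozen and learns a low-rank update, so the adapted weight is W0 + (alpha / r) * B A. The abstract does not say how two or more styles are fused; the weighted sum of per-style updates below is only one common way to combine adapters and is an assumption here, as are all function names and the toy matrices.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product (a is m x k, b is k x n)."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_delta(b: Matrix, a: Matrix, alpha: float, rank: int) -> Matrix:
    """Low-rank update (alpha / rank) * B @ A, with B of shape d x r and A of shape r x k."""
    scale = alpha / rank
    return [[scale * v for v in row] for row in matmul(b, a)]


def apply_styles(w0: Matrix, deltas: List[Matrix], weights: List[float]) -> Matrix:
    """Hypothetical fusion rule: W = W0 + sum_i weights[i] * deltas[i]."""
    out = [row[:] for row in w0]  # copy so the frozen base weights stay untouched
    for delta, weight in zip(deltas, weights):
        for i, row in enumerate(delta):
            for j, v in enumerate(row):
                out[i][j] += weight * v
    return out


# Toy 2 x 2 base weight and two rank-1 "style" adapters (illustrative values only).
w0 = [[1.0, 0.0], [0.0, 1.0]]
style_a = lora_delta([[1.0], [0.0]], [[0.0, 1.0]], alpha=1.0, rank=1)
style_b = lora_delta([[0.0], [1.0]], [[1.0, 0.0]], alpha=1.0, rank=1)
fused = apply_styles(w0, [style_a, style_b], [0.5, 0.5])
```

Setting one weight to 1.0 and the other to 0.0 reduces this to applying a single style, which corresponds to the single-style case described in the abstract.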

A Design and Implementation of Generative AI-based Advertising Image Production Service Application

  • Chang Hee Ok;Hyun Sung Lee;Min Soo Jeong;Yu Jin Jeong;Ji An Choi;Young-Bok Cho;Won Joo Lee
    • Journal of the Korea Society of Computer and Information / v.29 no.5 / pp.31-38 / 2024
  • In this paper, we propose ASAP (AI-driven Service for Advertisement Production), an application that provides a generative AI-based automatic advertising image production service. The application uses GPT-3.5 Turbo Instruct to generate a suitable background mood and promotional copy from user-entered keywords. It then uses OpenAI's DALL·E 3 model and Stability AI's SDXL model to generate background images and text images from these inputs. OCR technology is employed to improve the accuracy of the text images, and all generated outputs are composited to create the final advertisement. In addition, text boxes implemented with the Pillow and OpenCV libraries insert details such as phone numbers and business hours at the edges of the promotional material. The application offers a simple and cost-effective solution for small business owners who have difficulty producing advertisements.
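
The edge-anchored text boxes described above come down to a small layout calculation before anything is drawn. The following is a minimal sketch of that calculation only, not the paper's implementation: the `Box` type, the `edge_box` function, the margin default, and the centring rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Pixel rectangle in (left, top, right, bottom) order."""
    left: int
    top: int
    right: int
    bottom: int


def edge_box(image_w: int, image_h: int, box_w: int, box_h: int,
             edge: str = "bottom", margin: int = 10) -> Box:
    """Return a box_w x box_h rectangle centred along the chosen image edge."""
    if box_w + 2 * margin > image_w or box_h + 2 * margin > image_h:
        raise ValueError("text box does not fit inside the image")
    if edge in ("top", "bottom"):
        left = (image_w - box_w) // 2  # centre horizontally
        top = margin if edge == "top" else image_h - margin - box_h
    elif edge in ("left", "right"):
        top = (image_h - box_h) // 2  # centre vertically
        left = margin if edge == "left" else image_w - margin - box_w
    else:
        raise ValueError(f"unknown edge: {edge!r}")
    return Box(left, top, left + box_w, top + box_h)


# Example: a 400 x 60 box for business hours on the bottom edge of a 1024 x 1024 image.
hours_box = edge_box(1024, 1024, 400, 60, edge="bottom", margin=20)
```

The resulting coordinates could then be handed to a drawing routine, for example Pillow's `ImageDraw.Draw(image).rectangle(...)` followed by a text call, to render the phone number or business hours inside the box.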

A Study on User Experience through Analysis of the Creative Process of Using Image Generative AI: Focusing on User Agency in Creativity (이미지 생성형 AI의 창작 과정 분석을 통한 사용자 경험 연구: 사용자의 창작 주체감을 중심으로)

  • Daeun Han;Dahye Choi;Changhoon Oh
    • The Journal of the Convergence on Culture Technology / v.9 no.4 / pp.667-679 / 2023
  • The advent of image generative AI has made it possible for people who are not experts in art and design to create finished artworks through text input. With the increasing availability of generated images and their impact on the art industry, there is a need for research on how users perceive the process of co-creating with AI. In this study, we conducted an experimental study to investigate the expected and experienced processes of image generative AI creation among general users and to find out which processes affect users' sense of creative agency. The results showed that there was a gap between the expected and experienced creative process, and users tended to perceive a low sense of creative agency. We recommend eight ways that AI can act as an enabler to support users' creative intentions so that they can experience a higher sense of creative agency. This study can contribute to the future development of image-generating AI by considering user-centered creative experiences.

Research on Core patent mining methods based on key components of Generative AI (생성형 인공지능 기술의 핵심 구성 요소 기반 주요 특허 발굴 방법에 관한 연구)

  • Gayun Kim;Beom-Seok Kim;Jinhong Yang
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.16 no.5 / pp.292-300 / 2023
  • This paper proposes a method and strategy for discovering generative AI-related patents using qualitative evaluation indicators based on the core components of the technology. Patent quality is currently evaluated with quantitative indicators, but existing quantitative indicators cannot represent the characteristics of generative AI technology, which makes accurate evaluation difficult. Additional qualitative indicators are therefore needed that consider technical characteristics based on patent claims, which can reveal the actual strength of a patent. In this paper, we propose a new evaluation index that considers the technical characteristics of generative AI. Core patents were selected using the proposed index, and the appropriateness of the index was verified by applying the existing quantitative evaluation method to the selected core patents.

Analysis and Forecast of Venture Capital Investment on Generative AI Startups: Focusing on the U.S. and South Korea (생성 AI 스타트업에 대한 벤처투자 분석과 예측: 미국과 한국을 중심으로)

  • Lee, Seungah;Jung, Taehyun
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.18 no.4 / pp.21-35 / 2023
  • Expectations for generative AI technology and its impact are spreading across industries. Because the startup ecosystem is expected to play a pivotal role in the use and advancement of generative AI, it is important to understand the current state and distinctive characteristics of venture capital (VC) investment in this domain. This study examines VC investment deals in South Korea and forecasts future VC investment by comparing them with those in the United States, the frontrunner in the generative AI industry and its ecosystem. For the analysis, new datasets were constructed from 286 investment deals in 117 U.S. generative AI startups between 2008 and 2023 and 144 investment deals in 42 South Korean generative AI startups between 2011 and 2023. The results show that the number of VC investment deals has risen in both the U.S. and South Korea in recent years and that these deals are concentrated in early-stage investment. Notable differences between the two countries also emerged. In the U.S., unlike in South Korea, the amount of recent VC deals has increased, by 285% to 488% in the corresponding investment stage. The interval between investment stages was slightly longer in South Korea than in the U.S., but the difference was not statistically significant. The share of generative AI deals among all VC deals was higher in South Korea than in the U.S. A sectoral breakdown of generative AI shows that in the U.S. 59.2% of total deals were concentrated in the text and model sectors, whereas in South Korea 61.9% of deals were in the video, image, and chat sectors. VC investment in South Korea from 2023 to 2029 was forecast using four models, yielding an estimated average requirement of 3.4 trillion Korean won (ranging from a minimum of 2.408 trillion won to a maximum of 5.919 trillion won). The practical significance of this research is that it systematically analyzes VC investment in generative AI in both the U.S. and South Korea and presents an estimated VC investment projection for the latter. Its academic significance lies in laying the groundwork for future research by analyzing the current landscape of generative AI VC investment, an area that has so far lacked rigorous academic investigation supported by empirical data. The study also introduces two new methods for predicting VC investment amounts. With broader integration, application, and refinement in other studies, these methods could improve the ability to predict VC investment costs.

Analysis of Generative AI Technology Trends Based on Patent Data (특허 데이터 기반 생성형 AI 기술 동향 분석)

  • Seongmu Ryu;Taewon Song;Minjeong Lee;Yoonju Choi;Soonuk Seol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.17 no.1 / pp.1-9 / 2024
  • This paper analyzes the trends in generative AI technology based on patent application documents. To achieve this, we selected 5,433 generative AI-related patents filed in South Korea, the United States, and Europe from 2003 to 2023, and analyzed the data by country, technology category, year, and applicant, presenting it visually to find insights and understand the flow of technology. The analysis shows that patents in the image category account for 36.9%, the largest share, with a continuous increase in filings, while filings in the text/document and music/speech categories have either decreased or remained stable since 2019. Although the company with the highest number of filings is a South Korean company, four out of the top five filers are U.S. companies, and all companies have filed the majority of their patents in the U.S., indicating that generative AI is growing and competing centered around the U.S. market. The findings of this paper are expected to be useful for future research and development in generative AI, as well as for formulating strategies for acquiring intellectual property.

Research on Generative AI for Korean Multi-Modal Montage App (한국형 멀티모달 몽타주 앱을 위한 생성형 AI 연구)

  • Lim, Jeounghyun;Cha, Kyung-Ae;Koh, Jaepil;Hong, Won-Kee
    • Journal of Service Research and Studies / v.14 no.1 / pp.13-26 / 2024
  • Multi-modal generation is the process of generating results from a variety of information, such as text, images, and audio. With the rapid development of AI technology, a growing number of multi-modal systems synthesize different types of data to produce results. In this paper, we present an AI system that uses speech and text recognition to take a description of a person and generate a montage image. Whereas existing montage generation technology is based on the appearance of Westerners, the system developed in this paper trains a model on Korean facial features. It can therefore create more accurate and effective Korean montage images from multi-modal voice and text input specific to Korean. Because the developed montage generation app can be used to produce a draft montage, it can dramatically reduce the manual labor of existing montage production personnel. For this purpose, we used persona-based virtual person montage data provided by the AI-Hub of the National Information Society Agency. AI-Hub is an AI integration platform that aims to provide a one-stop service by building the AI training data needed to develop AI technology and services. The image generation system was implemented using VQGAN, a deep learning model used to generate high-resolution images, and KoDALLE, a Korean-based image generation model. The trained AI model was confirmed to create a montage image of a face that is very similar to the one described by voice and text. To verify the practicality of the developed app, 10 testers used it, and more than 70% responded that they were satisfied. The montage generator can be used in various fields, such as criminal investigation, to describe and visualize facial features.