• Title/Summary/Keyword: deep learning framework


Evaluating the Effectiveness of an Artificial Intelligence Model for Classification of Basic Volcanic Rocks Based on Polarized Microscope Image (편광현미경 이미지 기반 염기성 화산암 분류를 위한 인공지능 모델의 효용성 평가)

  • Sim, Ho;Jung, Wonwoo;Hong, Seongsik;Seo, Jaewon;Park, Changyun;Song, Yungoo
    • Economic and Environmental Geology / v.55 no.3 / pp.309-316 / 2022
  • In order to minimize the human effort and time required for rock classification, research on rock classification using artificial intelligence (AI) has recently advanced. In this study, basic volcanic rocks were subdivided using polarizing-microscope thin-section images. A convolutional neural network (CNN) model based on the Tensorflow and Keras libraries was built in-house for rock classification. A total of 720 images of olivine basalt, basaltic andesite, olivine tholeiite, and trachytic olivine basalt reference specimens, photographed under open nicol, crossed nicols, and with an added gypsum plate, were used at a training : test ratio of 7 : 3. As a result of machine learning, the classification accuracy reached 80-90%. Examination of the classification accuracy of each AI model suggests that this model's rock classification does not differ greatly from the classification process of a geologist. Furthermore, if models that subdivide more diverse rock types are produced and integrated with this one, an AI model satisfying both the speed of data classification and accessibility for non-experts can be developed, providing a new framework for basic petrology research.
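
As a rough illustration of the data handling described above, the sketch below partitions a hypothetical set of 720 labeled thin-section images per class at the stated training : test = 7 : 3 ratio. The filenames, class encoding, and per-class balance are assumptions for the example, not details from the paper:

```python
import random

def stratified_split(samples, labels, train_frac=0.7, seed=42):
    """Split (sample, label) pairs into train/test sets class by class,
    mirroring the training : test = 7 : 3 ratio used in the study."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    train, test = [], []
    for y, items in by_class.items():
        rng.shuffle(items)
        cut = round(len(items) * train_frac)
        train += [(s, y) for s in items[:cut]]
        test += [(s, y) for s in items[cut:]]
    return train, test

# 720 images across 4 rock classes (180 each, a hypothetical balance)
samples = [f"img_{i}.png" for i in range(720)]
labels = [i % 4 for i in range(720)]
train, test = stratified_split(samples, labels)
print(len(train), len(test))  # 504 216
```

Splitting per class rather than globally keeps each rock type represented in the same proportion in both sets.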

Cox Model Improvement Using Residual Blocks in Neural Networks: A Study on the Predictive Model of Cervical Cancer Mortality (신경망 내 잔여 블록을 활용한 콕스 모델 개선: 자궁경부암 사망률 예측모형 연구)

  • Nang Kyeong Lee;Joo Young Kim;Ji Soo Tak;Hyeong Rok Lee;Hyun Ji Jeon;Jee Myung Yang;Seung Won Lee
    • The Transactions of the Korea Information Processing Society / v.13 no.6 / pp.260-268 / 2024
  • Cervical cancer is the fourth most common cancer in women worldwide; in 2020 alone, more than 604,000 new cases were reported, resulting in approximately 341,831 deaths. The Cox regression model is widely adopted in cancer research, but its linearity assumption limits it where nonlinear associations exist. To address this problem, this paper proposes ResSurvNet, a new model that improves the accuracy of cervical cancer mortality prediction using ResNet's residual learning framework. ResSurvNet outperformed the DNN, CPH, CoxLasso, Cox Gradient Boost, and RSF models compared in this study. This predictive performance demonstrates great value for early diagnosis and for establishing treatment strategies in the management of cervical cancer patients, and represents significant progress in the field of survival analysis.
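
The residual-learning idea behind ResSurvNet can be sketched as a block that adds a learned correction to its own input. The toy forward pass below (plain NumPy, with made-up shapes and random weights, not the authors' architecture) shows the identity shortcut that lets the network learn deviations from a Cox-style linear predictor rather than replacing it outright:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """One residual block: output = x + F(x), where F is a small MLP.
    The identity shortcut means the block only has to learn a
    correction (residual) on top of its input."""
    h = relu(x @ W1)
    return x + h @ W2  # skip connection plus learned residual

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))          # 5 patients, 8 covariates (hypothetical)
W1 = rng.normal(size=(8, 16)) * 0.1
W2 = rng.normal(size=(16, 8)) * 0.1
out = residual_block(x, W1, W2)
print(out.shape)  # (5, 8)
```

With all weights at zero the block reduces exactly to the identity, which is what makes deep stacks of such blocks easy to optimize.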

Automatic Sagittal Plane Detection for the Identification of the Mandibular Canal (치아 신경관 식별을 위한 자동 시상면 검출법)

  • Pak, Hyunji;Kim, Dongjoon;Shin, Yeong-Gil
    • Journal of the Korea Computer Graphics Society / v.26 no.3 / pp.31-37 / 2020
  • Identification of the mandibular canal path in Computed Tomography (CT) scans is important in dental implantology. Typically, prior to implant planning, dentists find the sagittal plane in which the mandibular canal path is maximally observed, in order to identify the canal manually. However, this is time-consuming and requires extensive experience. In this paper, we propose a deep-learning-based framework that detects the desired sagittal plane automatically. This is accomplished by combining two main techniques: 1) a modified version of the iterative transformation network (ITN) method for obtaining initial planes, and 2) a fine-searching method based on a convolutional neural network (CNN) classifier for detecting the desired sagittal plane. This combination facilitates accurate plane detection, which is a limitation of the stand-alone ITN method. Tests on a number of CT datasets demonstrate that the proposed method achieves more satisfactory results than the ITN method. This allows dentists to identify the mandibular canal path efficiently and provides a foundation for future research into more efficient, automatic mandibular canal detection methods.
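
The fine-searching stage can be caricatured as probing candidate planes near the ITN initialization and keeping the one the classifier scores highest. In the sketch below a plane is reduced to a single rotation angle and `score_fn` stands in for the CNN classifier; all names, spans, and values are hypothetical, not the paper's parameterization:

```python
def fine_search(initial_angle, score_fn, span=10.0, step=1.0):
    """Toy coarse-to-fine refinement: starting from the plane proposed
    by the initialization stage, probe nearby candidates and keep the
    one the scoring function (a stand-in for the CNN classifier)
    rates highest."""
    best_angle, best_score = initial_angle, score_fn(initial_angle)
    angle = initial_angle - span
    while angle <= initial_angle + span:
        s = score_fn(angle)
        if s > best_score:
            best_angle, best_score = angle, s
        angle += step
    return best_angle

# Hypothetical classifier score peaking at 3 degrees from the initial plane
best = fine_search(0.0, lambda a: -(a - 3.0) ** 2)
print(best)  # 3.0
```

The real method searches over full plane parameters (position and orientation) rather than one angle, but the probe-and-rescore structure is the same.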

Visualization of Korean Speech Based on the Distance of Acoustic Features (음성특징의 거리에 기반한 한국어 발음의 시각화)

  • Pok, Gou-Chol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.3 / pp.197-205 / 2020
  • In Korean, the pronunciation of phoneme units such as vowels and consonants is fixed, and the pronunciation associated with a given notation does not change, so foreign learners can approach the language relatively easily. However, when one pronounces words, phrases, or sentences, the pronunciation changes widely and in complex ways at syllable boundaries, and the association between notation and pronunciation no longer holds. Consequently, it is very difficult for foreign learners to learn standard Korean pronunciation. Despite these difficulties, systematic analysis of pronunciation errors in Korean words is believed to be possible, because, unlike in other languages including English, the relationship between Korean notation and pronunciation can be described by a set of firm rules without exceptions. In this paper, we propose a visualization framework that shows the differences between standard and erroneous pronunciations as quantitative measures on the computer screen. Previous research only shows color representations and 3D graphics of speech properties, or an animated view of the changing shapes of the lips and mouth cavity; moreover, the features used in such analyses are only point data, such as the average over a speech range. In this study, we propose a method that can use time-series data directly instead of summarized or distorted data. This was realized with a deep-learning-based technique combining a self-organizing map, a variational autoencoder model, and a Markov model, and we achieved a substantial performance improvement over the point-data-based method.
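
A minimal sketch of the core idea, comparing a learner's pronunciation against a standard one directly on time-series features rather than on point summaries, is a frame-by-frame distance over per-frame feature vectors. The tiny 2-dimensional vectors below are illustrative placeholders, not actual acoustic features:

```python
import math

def framewise_distance(ref, test):
    """Frame-by-frame Euclidean distance between two equal-length
    feature sequences (e.g. per-frame spectral vectors). Each entry
    localizes *where* in time the learner's pronunciation deviates
    from the standard one, instead of collapsing to one average."""
    assert len(ref) == len(test)
    return [math.dist(r, t) for r, t in zip(ref, test)]

standard = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
learner  = [(1.0, 0.0), (0.5, 1.5), (0.0, 1.0)]
dists = framewise_distance(standard, learner)
print(dists)  # [0.0, 1.0, 0.0]
```

A per-frame distance profile like this is what a visualization can plot over time, which a single averaged point value cannot convey.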

Prospect of future water resources in the basins of Chungju Dam and Soyang-gang Dam using a physics-based distributed hydrological model and a deep-learning-based LSTM model (물리기반 분포형 수문 모형과 딥러닝 기반 LSTM 모형을 활용한 충주댐 및 소양강댐 유역의 미래 수자원 전망)

  • Kim, Yongchan;Kim, Youngran;Hwang, Seonghwan;Kim, Dongkyun
    • Journal of Korea Water Resources Association / v.55 no.12 / pp.1115-1124 / 2022
  • The impact of climate change on water resources was evaluated for the Chungju Dam and Soyang-gang Dam basins by constructing an integrated modeling framework consisting of a dam inflow prediction model based on the Variable Infiltration Capacity (VIC) model, a distributed hydrologic model, and an LSTM-based dam outflow prediction model. Considering the uncertainty of future climate data, four CMIP6 GCMs were used as input data for the VIC model over the future period (2021-2100). Under the future climate data, the average inflow increased as the future progressed, and the inflow in the far future (2070-2100) increased by up to 22% compared to that of the observation period (1986-2020). The minimum dam discharge sustained over 4-50 days was significantly lower than the observed value. This indicates that droughts may occur over longer periods than observed in the past, meaning that citizens of the Seoul metropolitan area may experience severe water shortages in future droughts. In addition, compared to the near and middle futures, water storage changed rapidly in the far future, suggesting that the difficulties of water resource management may increase.
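
The LSTM building block of the dam-outflow model can be sketched as a single cell step in NumPy. The gate packing, sizes, and random weights below are illustrative assumptions for the example, not the study's configuration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step. W, U, b pack the input (i), forget (f),
    output (o), and candidate (g) gates along the last axis; the
    hidden size is h.size. Returns the new hidden and cell states."""
    n = h.size
    z = x @ W + h @ U + b
    i = sigmoid(z[:n])          # input gate
    f = sigmoid(z[n:2 * n])     # forget gate
    o = sigmoid(z[2 * n:3 * n])  # output gate
    g = np.tanh(z[3 * n:])      # candidate cell state
    c_new = f * c + i * g       # blend old memory with new candidate
    h_new = o * np.tanh(c_new)  # expose a gated view of the memory
    return h_new, c_new

rng = np.random.default_rng(1)
n_in, n_hid = 3, 4              # e.g. 3 forcing variables, 4 hidden units
x = rng.normal(size=n_in)
h = np.zeros(n_hid); c = np.zeros(n_hid)
W = rng.normal(size=(n_in, 4 * n_hid)) * 0.1
U = rng.normal(size=(n_hid, 4 * n_hid)) * 0.1
b = np.zeros(4 * n_hid)
h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (4,)
```

Applying this step across a sequence of daily inflows and releases is what lets the model carry reservoir-operation memory forward in time.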

A modified U-net for crack segmentation by Self-Attention-Self-Adaption neuron and random elastic deformation

  • Zhao, Jin;Hu, Fangqiao;Qiao, Weidong;Zhai, Weida;Xu, Yang;Bao, Yuequan;Li, Hui
    • Smart Structures and Systems / v.29 no.1 / pp.1-16 / 2022
  • Despite recent breakthroughs in deep learning and computer vision, the pixel-wise identification of tiny objects in high-resolution images with complex disturbances remains challenging. This study proposes a modified U-net for tiny crack segmentation in real-world steel-box-girder bridges. The modified U-net adopts the common U-net framework and a novel Self-Attention-Self-Adaption (SASA) neuron as the fundamental computing element. The Self-Attention module applies softmax and gate operations to obtain the attention vector, enabling the neuron to focus on the most significant receptive fields when processing large-scale feature maps. The Self-Adaption module consists of a multilayer perceptron subnet and achieves deeper feature extraction inside a single neuron. For data augmentation, a grid-based crack random elastic deformation (CRED) algorithm is designed to enrich the diversity and irregular shapes of distributed cracks. Grid-based uniform control nodes are first set on both input images and binary labels, random offsets are then applied to these control nodes, and bilinear interpolation is performed for the remaining pixels. The proposed SASA neuron and CRED algorithm are deployed together to train the modified U-net. 200 raw images with a high resolution of 4928 × 3264 are collected, 160 for training and the remaining 40 for testing. 512 × 512 patches are generated from the original images as inputs by a sliding window with an overlap of 256. Results show that the average IoU between the recognized and ground-truth cracks reaches 0.409, which is 29.8% higher than that of the regular U-net. A five-fold cross-validation study verifies that the proposed method is robust to different training and test images. Ablation experiments further demonstrate the effectiveness of the proposed SASA neuron and CRED algorithm: the average-IoU gains obtained by the SASA and CRED modules individually add up to the final gain of the full model, indicating that the two modules contribute to different stages of the model and data in the training process.
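
The reported metric, intersection-over-union between the recognized and ground-truth crack masks, can be computed pixel-wise as below. The tiny flattened 0/1 masks are toy data for illustration only:

```python
def iou(pred, truth):
    """Pixel-wise intersection-over-union between a predicted and a
    ground-truth binary mask (flattened to 0/1 sequences): the number
    of pixels both masks mark as crack, divided by the number of
    pixels either mask marks as crack."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0  # both empty: perfect match

pred  = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
print(iou(pred, truth))  # 2 overlapping / 4 in union = 0.5
```

Because crack pixels are a tiny fraction of each 4928 × 3264 image, IoU is a far stricter score here than plain pixel accuracy, which a model could inflate by predicting "no crack" almost everywhere.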

AI-Based Object Recognition Research for Augmented Reality Character Implementation (증강현실 캐릭터 구현을 위한 AI기반 객체인식 연구)

  • Seok-Hwan Lee;Jung-Keum Lee;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1321-1330 / 2023
  • This study addresses 3D pose estimation for multiple human objects from a single image generated during the character development process for augmented reality. In the existing top-down method, all objects in the image are first detected and then each is reconstructed independently; the problem is that inconsistent results may occur due to overlap or depth-order mismatch between the reconstructed objects. The goal of this study is to solve these problems and develop a single network that provides consistent 3D reconstruction of all humans in a scene. A key design choice was integrating a human body model based on the SMPL parametric system into a top-down framework. On this basis, two losses were introduced: a collision loss based on distance fields and a loss that considers depth order. The first loss prevents overlap between reconstructed people, and the second adjusts the depth ordering of people so that renderings are consistent with occlusion inference and annotated instance segmentation. This method provides depth information to the network without explicit 3D annotation of the images. Experimental results show that the methodology performs better than existing methods on standard 3D pose benchmarks, and the proposed losses enable more consistent reconstruction from natural images.
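
The depth-ordering loss can be caricatured as a hinge penalty on annotated occlusion pairs: whenever person i is annotated as occluding person j, the predicted depth of i should be smaller than that of j. The margin value and pair encoding below are illustrative assumptions, not the paper's exact formulation:

```python
def depth_order_loss(pred_depths, annotated_order, margin=0.1):
    """Toy depth-ordering loss. Each pair (i, j) means person i occludes
    (is closer than) person j, so we penalize predictions where depth
    d_i is not at least `margin` smaller than d_j."""
    loss = 0.0
    for i, j in annotated_order:
        violation = pred_depths[i] - pred_depths[j] + margin
        loss += max(0.0, violation)  # hinge: zero once ordering is respected
    return loss

depths = [1.0, 2.0, 1.5]                 # predicted distances to the camera
order = [(0, 1), (0, 2)]                 # person 0 occludes persons 1 and 2
print(depth_order_loss(depths, order))   # 0.0: ordering already satisfied
```

Because the supervision comes from 2D occlusion annotations rather than measured depths, a loss of this shape injects depth information without any explicit 3D labels.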

Spontaneous Speech Emotion Recognition Based On Spectrogram With Convolutional Neural Network (CNN 기반 스펙트로그램을 이용한 자유발화 음성감정인식)

  • Guiyoung Son;Soonil Kwon
    • The Transactions of the Korea Information Processing Society / v.13 no.6 / pp.284-290 / 2024
  • Speech emotion recognition (SER) analyzes a speaker's voice patterns, including vibration, intensity, and tone, to determine their emotional state. Interest in artificial intelligence (AI) techniques has grown, and they are now widely used in medicine, education, industry, and the military. Nevertheless, existing research has attained impressive results mainly by using acted speech from skilled actors recorded in controlled environments for various scenarios. There is a mismatch between acted and spontaneous speech, since acted speech includes more explicit emotional expression; for this reason, spontaneous speech emotion recognition remains a challenging task. This paper conducts emotion recognition and improves performance using spontaneous speech data. To this end, we implement deep-learning-based speech emotion recognition using a VGG (Visual Geometry Group) network after converting 1-dimensional audio signals into 2-dimensional spectrogram images. The experimental evaluations are performed on the Korean spontaneous emotional speech database from AI-Hub, consisting of 7 emotions: joy, love, anger, fear, sadness, surprise, and neutral. As a result, we achieved average accuracies of 83.5% and 73.0% for adults and young people, respectively, using time-frequency 2-dimensional spectrograms. In conclusion, our findings demonstrate that the suggested framework outperforms current state-of-the-art techniques on spontaneous speech and shows promising performance despite the difficulty of quantifying spontaneous emotional expression.
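
The 1-D-to-2-D conversion step can be sketched as a short-time Fourier transform: window each audio frame, take the FFT magnitude, and stack the frames into a time-frequency image. The frame length and hop size below are common illustrative values, not the study's settings:

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Convert a 1-D audio signal into a 2-D magnitude spectrogram:
    slide a Hann-windowed frame along the signal, take the real FFT
    magnitude of each frame, and stack the results."""
    window = np.hanning(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames).T  # shape: (freq_bins, time_frames)

t = np.arange(16000) / 16000.0           # 1 s of audio at 16 kHz
sig = np.sin(2 * np.pi * 440.0 * t)      # 440 Hz test tone
spec = spectrogram(sig)
print(spec.shape)  # (129, 124): 256//2+1 frequency bins, 124 frames
```

The resulting 2-D array is what gets treated as an image and fed to a CNN such as VGG; for a pure 440 Hz tone the energy concentrates near bin 440 / (16000/256) ≈ 7 in every frame.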

Understanding the Artificial Intelligence Business Ecosystem for Digital Transformation: A Multi-actor Network Perspective (디지털 트랜스포메이션을 위한 인공지능 비즈니스 생태계 연구: 다행위자 네트워크 관점에서)

  • Yoon Min Hwang;Sung Won Hong
    • Information Systems Review / v.21 no.4 / pp.125-141 / 2019
  • With the advent of deep learning technology, represented by AlphaGo, artificial intelligence (A.I.) has quickly emerged as a key theme of digital transformation for securing competitive advantage in business. To understand the trends of A.I.-based digital transformation, a clear comprehension of the A.I. business ecosystem must come first. This study therefore analyzed the A.I. business ecosystem from a multi-actor network perspective and identified A.I. platform strategy types. Within the three internal layers of the A.I. business ecosystem (infrastructure & hardware, software & application, and service & data), this study identified four types of A.I. platform strategy (Tech. vertical × Biz. horizontal, Tech. vertical × Biz. vertical, Tech. horizontal × Biz. horizontal, Tech. horizontal × Biz. vertical). Outside the A.I. platform, this study presented five actors (users, investors, policy makers, consortiums & innovators, CSOs/NGOs) and their roles in supporting a sustainable A.I. business ecosystem in symbiosis with humans. The study identified an A.I. business ecosystem framework and platform strategy types, and also suggested the roles of government and academia in creating a sustainable A.I. business ecosystem. These results will help find the proper strategic direction for the A.I. business ecosystem and digital transformation.