• Title/Summary/Keyword: Deep Learning based System

Search Result 1,206, Processing Time 0.028 seconds

Estimation of Manhattan Coordinate System using Convolutional Neural Network (합성곱 신경망 기반 맨하탄 좌표계 추정)

  • Lee, Jinwoo;Lee, Hyunjoon;Kim, Junho
    • Journal of the Korea Computer Graphics Society
    • /
    • v.23 no.3
    • /
    • pp.31-38
    • /
    • 2017
  • In this paper, we propose a system which estimates Manhattan coordinate systems for urban scene images using a convolutional neural network (CNN). Estimating the Manhattan coordinate system from an image under the Manhattan world assumption is the basis for solving computer graphics and vision problems such as image adjustment and 3D scene reconstruction. We construct a CNN that estimates Manhattan coordinate systems based on GoogLeNet [1]. To train the CNN, we collect about 155,000 images under the Manhattan world assumption by using the Google Street View APIs and calculate Manhattan coordinate systems using existing calibration methods to generate dataset. In contrast to PoseNet [2] that trains per-scene CNNs, our method learns from images under the Manhattan world assumption and thus estimates Manhattan coordinate systems for new images that have not been learned. Experimental results show that our method estimates Manhattan coordinate systems with the median error of $3.157^{\circ}$ for the Google Street View images of non-trained scenes, as test set. In addition, compared to an existing calibration method [3], the proposed method shows lower intermediate errors for the test set.

A LSTM Based Method for Photovoltaic Power Prediction in Peak Times Without Future Meteorological Information (미래 기상정보를 사용하지 않는 LSTM 기반의 피크시간 태양광 발전량 예측 기법)

  • Lee, Donghun;Kim, Kwanho
    • The Journal of Society for e-Business Studies
    • /
    • v.24 no.4
    • /
    • pp.119-133
    • /
    • 2019
  • Recently, the importance prediction of photovoltaic power (PV) is considered as an essential function for scheduling adjustments, deciding on storage size, and overall planning for stable operation of PV facility systems. In particular, since most of PV power is generated in peak time, PV power prediction in a peak time is required for the PV system operators that enable to maximize revenue and sustainable electricity quantity. Moreover, Prediction of the PV power output in peak time without meteorological information such as solar radiation, cloudiness, the temperature is considered a challenging problem because it has limitations that the PV power was predicted by using predicted uncertain meteorological information in a wide range of areas in previous studies. Therefore, this paper proposes the LSTM (Long-Short Term Memory) based the PV power prediction model only using the meteorological, seasonal, and the before the obtained PV power before peak time. In this paper, the experiment results based on the proposed model using the real-world data shows the superior performance, which showed a positive impact on improving the PV power in a peak time forecast performance targeted in this study.

Methodology for Developing a Predictive Model for Highway Traffic Information Using LSTM (LSTM을 활용한 고속도로 교통정보 예측 모델 개발 방법론)

  • Yoseph Lee;Hyoung-suk Jin;Yejin Kim;Sung-ho Park;Ilsoo Yun
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.22 no.5
    • /
    • pp.1-18
    • /
    • 2023
  • With the recent developments in big data and deep learning, a variety of traffic information is collected widely and used for traffic operations. In particular, long short-term memory (LSTM) is used in the field of traffic information prediction with time series characteristics. Since trends, seasons, and cycles differ due to the nature of time series data input for an LSTM, a trial-and-error method based on characteristics of the data is essential for prediction models based on time series data in order to find hyperparameters. If a methodology is established to find suitable hyperparameters, it is possible to reduce the time spent in constructing high-accuracy models. Therefore, in this study, a traffic information prediction model is developed based on highway vehicle detection system (VDS) data and LSTM, and an impact assessment is conducted through changes in the LSTM evaluation indicators for each hyperparameter. In addition, a methodology for finding hyperparameters suitable for predicting highway traffic information in the transportation field is presented.

AI-Based Object Recognition Research for Augmented Reality Character Implementation (증강현실 캐릭터 구현을 위한 AI기반 객체인식 연구)

  • Seok-Hwan Lee;Jung-Keum Lee;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.6
    • /
    • pp.1321-1330
    • /
    • 2023
  • This study attempts to address the problem of 3D pose estimation for multiple human objects through a single image generated during the character development process that can be used in augmented reality. In the existing top-down method, all objects in the image are first detected, and then each is reconstructed independently. The problem is that inconsistent results may occur due to overlap or depth order mismatch between the reconstructed objects. The goal of this study is to solve these problems and develop a single network that provides consistent 3D reconstruction of all humans in a scene. Integrating a human body model based on the SMPL parametric system into a top-down framework became an important choice. Through this, two types of collision loss based on distance field and loss that considers depth order were introduced. The first loss prevents overlap between reconstructed people, and the second loss adjusts the depth ordering of people to render occlusion inference and annotated instance segmentation consistently. This method allows depth information to be provided to the network without explicit 3D annotation of the image. Experimental results show that this study's methodology performs better than existing methods on standard 3D pose benchmarks, and the proposed losses enable more consistent reconstruction from natural images.

Development of a Simulator for Optimizing Semiconductor Manufacturing Incorporating Internet of Things (사물인터넷을 접목한 반도체 소자 공정 최적화 시뮬레이터 개발)

  • Dang, Hyun Shik;Jo, Dong Hee;Kim, Jong Seo;Jung, Taeho
    • Journal of the Korea Society for Simulation
    • /
    • v.26 no.4
    • /
    • pp.35-41
    • /
    • 2017
  • With the advances in Internet over Things, the demand in diverse electronic devices such as mobile phones and sensors has been rapidly increasing and boosting up the researches on those products. Semiconductor materials, devices, and fabrication processes are becoming more diverse and complicated, which accompanies finding parameters for an optimal fabrication process. In order to find the parameters, a process simulation before fabrication or a real-time process control system during fabrication can be used, but they lack incorporating the feedback from post-fabrication data and compatibility with older equipment. In this research, we have developed an artificial intelligence based simulator, which finds parameters for an optimal process and controls process equipment. In order to apply the control concept to all the equipment in a fabrication sequence, we have developed a prototype for a manipulator which can be installed over an existing buttons and knobs in the equipment and controls the equipment communicating with the AI over the Internet. The AI is based on the deep learning to find process parameters that will produce a device having target electrical characteristics. The proposed simulator can control existing equipment via the Internet to fabricate devices with desired performance and, therefore, it will help engineers to develop new devices efficiently and effectively.

Object Detection Based on Hellinger Distance IoU and Objectron Application (Hellinger 거리 IoU와 Objectron 적용을 기반으로 하는 객체 감지)

  • Kim, Yong-Gil;Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.22 no.2
    • /
    • pp.63-70
    • /
    • 2022
  • Although 2D Object detection has been largely improved in the past years with the advance of deep learning methods and the use of large labeled image datasets, 3D object detection from 2D imagery is a challenging problem in a variety of applications such as robotics, due to the lack of data and diversity of appearances and shapes of objects within a category. Google has just announced the launch of Objectron that has a novel data pipeline using mobile augmented reality session data. However, it also is corresponding to 2D-driven 3D object detection technique. This study explores more mature 2D object detection method, and applies its 2D projection to Objectron 3D lifting system. Most object detection methods use bounding boxes to encode and represent the object shape and location. In this work, we explore a stochastic representation of object regions using Gaussian distributions. We also present a similarity measure for the Gaussian distributions based on the Hellinger Distance, which can be viewed as a stochastic Intersection-over-Union. Our experimental results show that the proposed Gaussian representations are closer to annotated segmentation masks in available datasets. Thus, less accuracy problem that is one of several limitations of Objectron can be relaxed.

Multidimensional data generation of water distribution systems using adversarially trained autoencoder (적대적 학습 기반 오토인코더(ATAE)를 이용한 다차원 상수도관망 데이터 생성)

  • Kim, Sehyeong;Jun, Sanghoon;Jung, Donghwi
    • Journal of Korea Water Resources Association
    • /
    • v.56 no.7
    • /
    • pp.439-449
    • /
    • 2023
  • Recent advancements in data measuring technology have facilitated the installation of various sensors, such as pressure meters and flow meters, to effectively assess the real-time conditions of water distribution systems (WDSs). However, as cities expand extensively, the factors that impact the reliability of measurements have become increasingly diverse. In particular, demand data, one of the most significant hydraulic variable in WDS, is challenging to be measured directly and is prone to missing values, making the development of accurate data generation models more important. Therefore, this paper proposes an adversarially trained autoencoder (ATAE) model based on generative deep learning techniques to accurately estimate demand data in WDSs. The proposed model utilizes two neural networks: a generative network and a discriminative network. The generative network generates demand data using the information provided from the measured pressure data, while the discriminative network evaluates the generated demand outputs and provides feedback to the generator to learn the distinctive features of the data. To validate its performance, the ATAE model is applied to a real distribution system in Austin, Texas, USA. The study analyzes the impact of data uncertainty by calculating the accuracy of ATAE's prediction results for varying levels of uncertainty in the demand and the pressure time series data. Additionally, the model's performance is evaluated by comparing the results for different data collection periods (low, average, and high demand hours) to assess its ability to generate demand data based on water consumption levels.

Conventional Versus Artificial Intelligence-Assisted Interpretation of Chest Radiographs in Patients With Acute Respiratory Symptoms in Emergency Department: A Pragmatic Randomized Clinical Trial

  • Eui Jin Hwang;Jin Mo Goo;Ju Gang Nam;Chang Min Park;Ki Jeong Hong;Ki Hong Kim
    • Korean Journal of Radiology
    • /
    • v.24 no.3
    • /
    • pp.259-270
    • /
    • 2023
  • Objective: It is unknown whether artificial intelligence-based computer-aided detection (AI-CAD) can enhance the accuracy of chest radiograph (CR) interpretation in real-world clinical practice. We aimed to compare the accuracy of CR interpretation assisted by AI-CAD to that of conventional interpretation in patients who presented to the emergency department (ED) with acute respiratory symptoms using a pragmatic randomized controlled trial. Materials and Methods: Patients who underwent CRs for acute respiratory symptoms at the ED of a tertiary referral institution were randomly assigned to intervention group (with assistance from an AI-CAD for CR interpretation) or control group (without AI assistance). Using a commercial AI-CAD system (Lunit INSIGHT CXR, version 2.0.2.0; Lunit Inc.). Other clinical practices were consistent with standard procedures. Sensitivity and false-positive rates of CR interpretation by duty trainee radiologists for identifying acute thoracic diseases were the primary and secondary outcomes, respectively. The reference standards for acute thoracic disease were established based on a review of the patient's medical record at least 30 days after the ED visit. Results: We randomly assigned 3576 participants to either the intervention group (1761 participants; mean age ± standard deviation, 65 ± 17 years; 978 males; acute thoracic disease in 472 participants) or the control group (1815 participants; 64 ± 17 years; 988 males; acute thoracic disease in 491 participants). The sensitivity (67.2% [317/472] in the intervention group vs. 66.0% [324/491] in the control group; odds ratio, 1.02 [95% confidence interval, 0.70-1.49]; P = 0.917) and false-positive rate (19.3% [249/1289] vs. 18.5% [245/1324]; odds ratio, 1.00 [95% confidence interval, 0.79-1.26]; P = 0.985) of CR interpretation by duty radiologists were not associated with the use of AI-CAD. Conclusion: AI-CAD did not improve the sensitivity and false-positive rate of CR interpretation for diagnosing acute thoracic disease in patients with acute respiratory symptoms who presented to the ED.

Dynamic forecasts of bankruptcy with Recurrent Neural Network model (RNN(Recurrent Neural Network)을 이용한 기업부도예측모형에서 회계정보의 동적 변화 연구)

  • Kwon, Hyukkun;Lee, Dongkyu;Shin, Minsoo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.139-153
    • /
    • 2017
  • Corporate bankruptcy can cause great losses not only to stakeholders but also to many related sectors in society. Through the economic crises, bankruptcy have increased and bankruptcy prediction models have become more and more important. Therefore, corporate bankruptcy has been regarded as one of the major topics of research in business management. Also, many studies in the industry are in progress and important. Previous studies attempted to utilize various methodologies to improve the bankruptcy prediction accuracy and to resolve the overfitting problem, such as Multivariate Discriminant Analysis (MDA), Generalized Linear Model (GLM). These methods are based on statistics. Recently, researchers have used machine learning methodologies such as Support Vector Machine (SVM), Artificial Neural Network (ANN). Furthermore, fuzzy theory and genetic algorithms were used. Because of this change, many of bankruptcy models are developed. Also, performance has been improved. In general, the company's financial and accounting information will change over time. Likewise, the market situation also changes, so there are many difficulties in predicting bankruptcy only with information at a certain point in time. However, even though traditional research has problems that don't take into account the time effect, dynamic model has not been studied much. When we ignore the time effect, we get the biased results. So the static model may not be suitable for predicting bankruptcy. Thus, using the dynamic model, there is a possibility that bankruptcy prediction model is improved. In this paper, we propose RNN (Recurrent Neural Network) which is one of the deep learning methodologies. The RNN learns time series data and the performance is known to be good. Prior to experiment, we selected non-financial firms listed on the KOSPI, KOSDAQ and KONEX markets from 2010 to 2016 for the estimation of the bankruptcy prediction model and the comparison of forecasting performance. In order to prevent a mistake of predicting bankruptcy by using the financial information already reflected in the deterioration of the financial condition of the company, the financial information was collected with a lag of two years, and the default period was defined from January to December of the year. Then we defined the bankruptcy. The bankruptcy we defined is the abolition of the listing due to sluggish earnings. We confirmed abolition of the list at KIND that is corporate stock information website. Then we selected variables at previous papers. The first set of variables are Z-score variables. These variables have become traditional variables in predicting bankruptcy. The second set of variables are dynamic variable set. Finally we selected 240 normal companies and 226 bankrupt companies at the first variable set. Likewise, we selected 229 normal companies and 226 bankrupt companies at the second variable set. We created a model that reflects dynamic changes in time-series financial data and by comparing the suggested model with the analysis of existing bankruptcy predictive models, we found that the suggested model could help to improve the accuracy of bankruptcy predictions. We used financial data in KIS Value (Financial database) and selected Multivariate Discriminant Analysis (MDA), Generalized Linear Model called logistic regression (GLM), Support Vector Machine (SVM), Artificial Neural Network (ANN) model as benchmark. The result of the experiment proved that RNN's performance was better than comparative model. The accuracy of RNN was high in both sets of variables and the Area Under the Curve (AUC) value was also high. Also when we saw the hit-ratio table, the ratio of RNNs that predicted a poor company to be bankrupt was higher than that of other comparative models. However the limitation of this paper is that an overfitting problem occurs during RNN learning. But we expect to be able to solve the overfitting problem by selecting more learning data and appropriate variables. From these result, it is expected that this research will contribute to the development of a bankruptcy prediction by proposing a new dynamic model.

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.205-225
    • /
    • 2018
  • Convolutional Neural Network (ConvNet) is one class of the powerful Deep Neural Network that can analyze and learn hierarchies of visual features. Originally, first neural network (Neocognitron) was introduced in the 80s. At that time, the neural network was not broadly used in both industry and academic field by cause of large-scale dataset shortage and low computational power. However, after a few decades later in 2012, Krizhevsky made a breakthrough on ILSVRC-12 visual recognition competition using Convolutional Neural Network. That breakthrough revived people interest in the neural network. The success of Convolutional Neural Network is achieved with two main factors. First of them is the emergence of advanced hardware (GPUs) for sufficient parallel computation. Second is the availability of large-scale datasets such as ImageNet (ILSVRC) dataset for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains, it is difficult and requires lots of effort to gather large-scale dataset to train a ConvNet. Moreover, even if we have a large-scale dataset, training ConvNet from scratch is required expensive resource and time-consuming. These two obstacles can be solved by using transfer learning. Transfer learning is a method for transferring the knowledge from a source domain to new domain. There are two major Transfer learning cases. First one is ConvNet as fixed feature extractor, and the second one is Fine-tune the ConvNet on a new dataset. In the first case, using pre-trained ConvNet (such as on ImageNet) to compute feed-forward activations of the image into the ConvNet and extract activation features from specific layers. In the second case, replacing and retraining the ConvNet classifier on the new dataset, then fine-tune the weights of the pre-trained network with the backpropagation. In this paper, we focus on using multiple ConvNet layers as a fixed feature extractor only. However, applying features with high dimensional complexity that is directly extracted from multiple ConvNet layers is still a challenging problem. We observe that features extracted from multiple ConvNet layers address the different characteristics of the image which means better representation could be obtained by finding the optimal combination of multiple ConvNet layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single ConvNet layer representation. Overall, our primary pipeline has three steps. Firstly, images from target task are given as input to ConvNet, then that image will be feed-forwarded into pre-trained AlexNet, and the activation features from three fully connected convolutional layers are extracted. Secondly, activation features of three ConvNet layers are concatenated to obtain multiple ConvNet layers representation because it will gain more information about an image. When three fully connected layer features concatenated, the occurring image representation would have 9192 (4096+4096+1000) dimension features. However, features extracted from multiple ConvNet layers are redundant and noisy since they are extracted from the same ConvNet. Thus, a third step, we will use Principal Component Analysis (PCA) to select salient features before the training phase. When salient features are obtained, the classifier can classify image more accurately, and the performance of transfer learning can be improved. To evaluate proposed method, experiments are conducted in three standard datasets (Caltech-256, VOC07, and SUN397) to compare multiple ConvNet layer representations against single ConvNet layer representation by using PCA for feature selection and dimension reduction. Our experiments demonstrated the importance of feature selection for multiple ConvNet layer representation. Moreover, our proposed approach achieved 75.6% accuracy compared to 73.9% accuracy achieved by FC7 layer on the Caltech-256 dataset, 73.1% accuracy compared to 69.2% accuracy achieved by FC8 layer on the VOC07 dataset, 52.2% accuracy compared to 48.7% accuracy achieved by FC7 layer on the SUN397 dataset. We also showed that our proposed approach achieved superior performance, 2.8%, 2.1% and 3.1% accuracy improvement on Caltech-256, VOC07, and SUN397 dataset respectively compare to existing work.