• Title/Summary/Keyword: learning through the image

Keypoint-based Deep Learning Approach for Building Footprint Extraction Using Aerial Images

  • Jeong, Doyoung;Kim, Yongil
    • Korean Journal of Remote Sensing / v.37 no.1 / pp.111-122 / 2021
  • Building footprint extraction is an active topic in the domain of remote sensing, since buildings are a fundamental unit of urban areas. Deep convolutional neural networks successfully perform footprint extraction from optical satellite images. However, semantic segmentation produces coarse results in the output, such as blurred and rounded boundaries, which are caused by the use of convolutional layers with large receptive fields and pooling layers. The objective of this study is to generate visually enhanced building objects by directly extracting the vertices of individual buildings by combining instance segmentation and keypoint detection. The target keypoints in building extraction are defined as points of interest based on the local image gradient direction, that is, the vertices of a building polygon. The proposed framework follows a two-stage, top-down approach that is divided into object detection and keypoint estimation. Keypoints between instances are distinguished by merging the rough segmentation masks and the local features of regions of interest. A building polygon is created by grouping the predicted keypoints through a simple geometric method. Our model achieved an F1-score of 0.650 with an mIoU of 62.6 for building footprint extraction using the Open Cities AI dataset. The results demonstrated that the proposed framework using keypoint estimation exhibited better segmentation performance when compared with Mask R-CNN in terms of both qualitative and quantitative results.
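The F1-score and IoU mentioned above can be illustrated with a minimal sketch on binary masks. The function names and the toy masks are illustrative assumptions; the paper's exact evaluation protocol (for example, how predicted and reference buildings are matched) is not given in the abstract.

```python
def iou(pred, truth):
    """Intersection over union of two binary masks given as lists of rows."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return inter / union if union else 1.0


def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example: the two masks overlap in one column of a 3x3 grid.
pred_mask = [[1, 1, 0],
             [1, 1, 0],
             [0, 0, 0]]
true_mask = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
overlap = iou(pred_mask, true_mask)
```

A mean IoU would average `iou` over all evaluated buildings or images; which of the two the paper averages over is not stated in the abstract.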

Development of Color Recognition Algorithm for Traffic Lights using Deep Learning Data (딥러닝 데이터 활용한 신호등 색 인식 알고리즘 개발)

  • Baek, Seoha;Kim, Jongho;Yi, Kyongsu
    • Journal of Auto-vehicle Safety Association / v.14 no.2 / pp.45-50 / 2022
  • Vehicle motion in an urban environment is determined by the surrounding traffic flow, so understanding that flow is a dominant factor in the vehicle's motion planning. Traffic flow in an urban environment is assessed using various kinds of urban infrastructure information. This paper presents a color recognition algorithm for traffic lights to perceive the traffic condition, which is a key piece of that infrastructure information. An open-source, deep learning based vision module detects the positions of traffic lights around the host vehicle. The detections are processed into input data according to whether each light lies on the route of the ego vehicle. The colors of the traffic lights are estimated from pixel values in the camera image. The proposed algorithm is validated in intersection situations with traffic lights on a test track. The results show that the proposed algorithm precisely recognizes the traffic lights associated with the ego vehicle path in urban intersection scenarios.
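As a rough illustration of estimating a light's color from pixel values, the sketch below votes over the hues of bright, saturated pixels inside a detected traffic-light box. The hue ranges, the thresholds, and the function name `classify_light` are assumptions made for this example, not values taken from the paper.

```python
import colorsys
from collections import Counter

# Hypothetical hue ranges on colorsys's 0..1 hue scale (red wraps around 0).
HUE_RANGES = {
    "red": [(0.0, 0.05), (0.95, 1.0)],
    "yellow": [(0.10, 0.20)],
    "green": [(0.25, 0.50)],
}


def classify_light(pixels, min_sat=0.5, min_val=0.5):
    """Return 'red', 'yellow', 'green', or 'unknown' for a list of (r, g, b) pixels."""
    votes = Counter()
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        # Skip dark or washed-out pixels such as the lamp housing.
        if s < min_sat or v < min_val:
            continue
        for color, ranges in HUE_RANGES.items():
            if any(lo <= h <= hi for lo, hi in ranges):
                votes[color] += 1
    return votes.most_common(1)[0][0] if votes else "unknown"


# Toy crop: a lit red lamp surrounded by dark housing pixels.
crop = [(250, 20, 20)] * 10 + [(30, 30, 30)] * 20
state = classify_light(crop)
```

In a full pipeline the `pixels` argument would come from the bounding box that the detector reports for a light on the ego vehicle's route.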

Defects Detection System on Injection Molded Part (사출성형 제품의 결함검출 시스템)

  • Park, In-Kyu;Lee, Wan-Bum;Choi, Gyoo-Seok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.11 no.4 / pp.99-104 / 2011
  • This paper proposes a neural network approach that detects a variety of defects in molded parts. To improve the response of the system, it is designed to minimize memory use through a lookup table in software. The goal of these methods is to extract the features of samples during neural network learning, improving on existing defect detection and classification algorithms. After learning 500 sample patterns of molded parts, the system detected defects in 3% of the molded parts and classified them as incorrect-diameter parts. We expect the proposed approach to be an effective alternative that saves test time and cost when detecting defects in fine patterns within molded parts.

Development of Access Management System based on Face Recognition using ResNet (ResNet을 이용한 얼굴 인식 기반 출입관리시스템 개발)

  • Rhyou, Se-Yeol;Kim, Hye-Jin;Cha, Kyung-Ae
    • Journal of Korea Multimedia Society / v.22 no.8 / pp.823-831 / 2019
  • In recent years, systems such as surveillance and access control systems have been developed that use face recognition instead of a password or an RFID chip, thereby reducing the risk of falsification. Moreover, deep learning has been applied to real-time face recognition in video, which makes it possible to develop access control systems with improved recognition accuracy and management efficiency. In this paper, we propose a real-time access management system based on face recognition using ResNet. The system is built on a web server, which makes it possible to manage access by recognizing the person in the camera image and using the access information stored in the database. It can be accessed by a user application to receive various information. The implemented system identifies a person in real time and allows access control by accurately distinguishing whether they are members or not; in tests, recognition took 0.2 seconds. The recognition accuracy is up to about 97%, depending on the experimental environment. With this system, access can be managed quickly and effectively, even when many people arrive at once.

Deep Learning based Singing Voice Synthesis Modeling (딥러닝 기반 가창 음성합성(Singing Voice Synthesis) 모델링)

  • Kim, Minae;Kim, Somin;Park, Jihyun;Heo, Gabin;Choi, Yunjeong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.127-130 / 2022
  • This paper studies singing voice synthesis modeling using a generator loss function. It analyzes various factors that may arise when BEGAN, a deep learning algorithm optimized for image generation, is applied to the audio domain, and conducts experiments to derive optimal quality. We focus on the problem that the L1 loss proposed in BEGAN-based models weakens the meaning of the hyperparameter gamma (𝛾), which was defined to control the diversity and quality of generated audio samples. Our experiments show that the proposed method, combined with finding optimal values through tuning, can contribute to improving the quality of the synthesized singing.
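For context, the role of 𝛾 in the original BEGAN formulation can be sketched as below. This shows the standard BEGAN balance update as it is commonly described, not the modified generator loss proposed in the paper; the function name `began_step` and the default values are assumptions for illustration.

```python
def began_step(loss_real, loss_fake, k, gamma=0.5, lambda_k=0.001):
    """One BEGAN bookkeeping step.

    loss_real: autoencoder reconstruction loss L(x) on real samples
    loss_fake: reconstruction loss L(G(z)) on generated samples
    k:         current balance term k_t, kept in [0, 1]
    gamma:     target ratio between loss_fake and loss_real (diversity vs. quality)
    """
    d_loss = loss_real - k * loss_fake       # discriminator objective
    g_loss = loss_fake                       # generator objective
    balance = gamma * loss_real - loss_fake  # zero when the target ratio is met
    k_next = min(max(k + lambda_k * balance, 0.0), 1.0)
    convergence = loss_real + abs(balance)   # global convergence measure
    return d_loss, g_loss, k_next, convergence
```

A lower `gamma` pushes training toward reconstruction quality and a higher one toward sample diversity, which is why a loss term that interferes with this balance is worth examining.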

Development of Python-based Annotation Tool Program for Constructing Object Recognition Deep-Learning Model (물체인식 딥러닝 모델 구성을 위한 파이썬 기반의 Annotation 툴 개발)

  • Lim, Song-Won;Park, Goo-man
    • Journal of Broadcast Engineering / v.25 no.3 / pp.386-398 / 2020
  • We developed an integrated annotation program that performs the data labeling process for deep learning models in object recognition. The program uses the basic GUI library of Python and provides crawler functions that allow data collection in real time. RetinaNet was used to implement an automatic annotation function. In addition, the program generates the different data labeling formats used by Pascal VOC, YOLO, and RetinaNet. In an experiment with the proposed method, a domestic vehicle image dataset was built and applied to RetinaNet and YOLO as the training and test set. The proposed system classified the vehicle model with an accuracy of about 94%.
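To make the difference between labeling formats concrete, the sketch below converts a box between the Pascal VOC convention (absolute corner coordinates) and the YOLO convention (center and size normalized by the image dimensions). The function names are hypothetical and this is not the tool's own code; the abstract does not describe how the tool stores or converts its annotations.

```python
def voc_to_yolo(box, img_w, img_h):
    """(xmin, ymin, xmax, ymax) in pixels -> (cx, cy, w, h) normalized to 0..1."""
    xmin, ymin, xmax, ymax = box
    cx = (xmin + xmax) / 2 / img_w
    cy = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return cx, cy, w, h


def yolo_to_voc(box, img_w, img_h):
    """(cx, cy, w, h) normalized -> (xmin, ymin, xmax, ymax) in pixels."""
    cx, cy, w, h = box
    xmin = (cx - w / 2) * img_w
    ymin = (cy - h / 2) * img_h
    xmax = (cx + w / 2) * img_w
    ymax = (cy + h / 2) * img_h
    return xmin, ymin, xmax, ymax


# A 200x200 pixel box inside a 400x500 image.
yolo_box = voc_to_yolo((100, 50, 300, 250), img_w=400, img_h=500)
```

In a YOLO label file each such box is written on its own line after an integer class index, while Pascal VOC stores the corner coordinates in an XML file per image.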

Rubber O-ring defect detection using adaptive binarization, Convex Hull preprocessing, and convolutional neural network learning method (적응형 이진화와 Convex Hull 전처리 및 합성곱 신경망 학습 방법을 적용한 고무 오링 불량 판별)

  • Seong, Eun-San;Kim, Hyun-Tae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.623-625 / 2021
  • Rubber O-rings are produced by conventional injection molding, and products that are not molded properly are judged to be defective. However, when the images acquired for image-based inspection are used as they are, accuracy is poor. We therefore trained a convolutional neural network on images pre-processed with adaptive binarization and the convex hull algorithm, which extract only the rubber O-ring region from the original images. In testing, the defect detection performance of the learning method with pre-processing was confirmed to exceed the suggested standard.
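The two pre-processing steps named above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the O-ring is taken to be darker than its background, the window radius and offset are arbitrary, and the function names are hypothetical; the paper's actual parameters and library calls are not given in the abstract.

```python
def adaptive_binarize(img, radius=1, offset=0):
    """Mark a pixel as foreground (1) if it is darker than its local mean."""
    height, width = len(img), len(img[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Local window, clipped at the image border.
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            window = [img[j][i] for j in rows for i in cols]
            local_mean = sum(window) / len(window)
            if img[y][x] < local_mean - offset:
                mask[y][x] = 1
    return mask


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def object_hull(img, radius=1, offset=0):
    """Binarize the image, then return the convex hull of the foreground pixels."""
    mask = adaptive_binarize(img, radius, offset)
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    return convex_hull(points)
```

The hull gives a tight outline of the part, which can be used to crop or mask the original image before it is passed to the network.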

The Development of Subject-matter Knowledge and Pedagogical Content Knowledge in Function Instruction (함수개념의 교수.학습과정에서 나타난 subject-matter knowledge와 pedagogical content knowledge 능력의 발전에 관한 연구)

  • Yoon, Suk-Im
    • Communications of Mathematical Education / v.21 no.4 / pp.575-596 / 2007
  • This study investigates preservice teachers' development of subject-matter knowledge and pedagogical content knowledge in teaching the function concept. This development takes place in pedagogical mathematics courses in which the theory of constructivism and cooperative learning theory are aligned. Pre- and post-course tests were administered to examine the development, and follow-up interviews were conducted to gain more detail. Analysis of the written questionnaire results and interview transcripts reveals that the teachers' limited concept image can be extended and developed in depth through pedagogical mathematics courses that apply reformed teaching methods.

The Audience Behavior-based Emotion Prediction Model for Personalized Service (고객 맞춤형 서비스를 위한 관객 행동 기반 감정예측모형)

  • Ryoo, Eun Chung;Ahn, Hyunchul;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.73-85 / 2013
  • In today's information society, the importance of knowledge services that use information to create value is growing day by day. In addition, with the development of IT, it has become easy to collect and use information, and many companies in a variety of industries actively use customer information for marketing. In the 21st century, companies have been actively using culture and the arts to manage their corporate image and for marketing closely linked to their commercial interests. However, it is difficult for companies to attract or maintain consumers' interest through technology alone. For that reason, many firms now carry out cultural activities as a tool of differentiation. Many firms have also used the customer's experience in new marketing strategies in order to respond effectively to a competitive market. Accordingly, there is a rapidly emerging need for personalized services that provide a new experience based on personal profile information containing the characteristics of the individual. Personalized service using a customer's individual profile information, such as language, symbols, behavior, and emotions, is therefore very important today. Through it, we can judge the interaction between people and content and maximize the customer's experience and satisfaction. Various related works provide customer-centered services, and emotion recognition research in particular has emerged recently. Existing studies have mostly performed emotion recognition using bio-signals, and most are voice and face studies, where emotional changes are pronounced. However, predicting people's emotions is difficult because of the limitations of equipment and service environments. In this paper, we therefore develop an emotion prediction model based on a vision-based interface to overcome these limitations. Emotion recognition based on people's gestures and postures has been pursued by several researchers.
This paper develops a model that recognizes people's emotional states from body gesture and posture using the difference image method, and identifies the best-performing model for predicting four kinds of emotions. The proposed model aims to automatically determine and predict four human emotions (sadness, surprise, joy, and disgust). To build the model, an event booth was installed in the KOCCA lobby, and we showed participants suitable stimulus movies to collect their body gestures and postures as their emotions changed. We then extracted body movements using the difference image method and refined the collected data to build the proposed model with a neural network. The proposed emotion prediction model used three time-frame sets (20 frames, 30 frames, and 40 frames), and we adopted the model with the best performance among them. Before building the three kinds of models, the entire set of 97 data samples was divided into learning, test, and validation sets. The model was constructed using an artificial neural network. We used the back-propagation algorithm as the learning method, with the learning rate set to 10% and the momentum rate set to 10%. The sigmoid function was used as the transfer function, and we designed a three-layer perceptron neural network with one hidden layer and four output nodes. Based on the test data set, learning was stopped when it reached 50,000 after reaching the minimum error, in order to explore the point of learning. Finally, we evaluated each model's accuracy and found the best model for predicting each emotion. The results showed prediction accuracies of 100% for sadness and 96% for joy with the 20-frame model, and 88% for surprise and 98% for disgust with the 30-frame model. The findings of our research are expected to be useful in providing an effective algorithm for personalized service in various industries such as advertising, exhibition, and performance.
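Two of the building blocks described above, the difference image and a back-propagation weight update with momentum, can be sketched as follows. The pixel threshold, the motion feature, and all function names are assumptions made for illustration; the abstract states only the learning rate (0.1), the momentum rate (0.1), and the use of the sigmoid function.

```python
import math


def difference_image(prev_frame, curr_frame, threshold=30):
    """Binary mask of pixels whose gray level changed by more than `threshold`."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def motion_ratio(mask):
    """Fraction of changed pixels, a simple (hypothetical) per-frame motion feature."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total if total else 0.0


def motion_features(frames, threshold=30):
    """One motion ratio per consecutive frame pair, e.g. over a 20-frame window."""
    return [
        motion_ratio(difference_image(a, b, threshold))
        for a, b in zip(frames, frames[1:])
    ]


def sigmoid(x):
    """Logistic transfer function used by each network node."""
    return 1.0 / (1.0 + math.exp(-x))


def momentum_step(weight, grad, velocity, lr=0.1, momentum=0.1):
    """Gradient-descent update with momentum; returns (new_weight, new_velocity)."""
    new_velocity = momentum * velocity - lr * grad
    return weight + new_velocity, new_velocity
```

A window of 20, 30, or 40 frames would yield 19, 29, or 39 motion values with this feature; how the paper actually encodes the difference images as network inputs is not stated in the abstract.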

A Comparative Study on Deepfake Detection using Gray Channel Analysis (Gray 채널 분석을 사용한 딥페이크 탐지 성능 비교 연구)

  • Son, Seok Bin;Jo, Hee Hyeon;Kang, Hee Yoon;Lee, Byung Gul;Lee, Youn Kyu
    • Journal of Korea Multimedia Society / v.24 no.9 / pp.1224-1241 / 2021
  • Recent developments in deep learning techniques for image generation have made it straightforward to generate sophisticated deepfakes. As a result, however, privacy violations through deepfakes have also increased. To solve this issue, a number of techniques for deepfake detection have been proposed, which mainly focus on RGB channel-based analysis. Although existing studies have suggested the effectiveness of analysis based on other color models (i.e., Grayscale), that effectiveness has not yet been quantitatively validated. Thus, in this paper, we compare the effectiveness of Grayscale channel-based analysis with RGB channel-based analysis in deepfake detection. Based on the selected CNN-based models and deepfake datasets, we measured the performance of each color model-based analysis in terms of accuracy and time. The evaluation results confirmed that Grayscale channel-based analysis performs better than RGB channel-based analysis in several cases.
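The difference between the two inputs being compared can be illustrated with a minimal grayscale conversion. The sketch uses the widely used ITU-R BT.601 luma weights as an assumption; the abstract does not say which conversion the authors applied, and the function names are hypothetical.

```python
def rgb_to_gray(pixel):
    """Weighted sum of one (r, g, b) pixel using BT.601 luma weights."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def image_to_gray(image):
    """Convert rows of (r, g, b) tuples into rows of single gray values."""
    return [[rgb_to_gray(px) for px in row] for row in image]


# A 1x3 RGB image becomes a 1x3 single-channel image.
rgb_image = [[(255, 255, 255), (0, 0, 0), (255, 0, 0)]]
gray_image = image_to_gray(rgb_image)
```

Because the grayscale input has one channel instead of three, the first layer of a CNN receives a third of the values per image, which is one plausible reason for the time differences the study measures.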