• Title/Summary/Keyword: learning through the image

Deep Learning OCR based document processing platform and its application in financial domain (금융 특화 딥러닝 광학문자인식 기반 문서 처리 플랫폼 구축 및 금융권 내 활용)

  • Dongyoung Kim;Doohyung Kim;Myungsung Kwak;Hyunsoo Son;Dongwon Sohn;Mingi Lim;Yeji Shin;Hyeonjung Lee;Chandong Park;Mihyang Kim;Dongwon Choi
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.143-174 / 2023
  • With the development of deep learning technologies, Artificial Intelligence powered Optical Character Recognition (AI-OCR) has evolved to read multiple languages accurately from various forms of images. For the financial industry, where a large number of diverse documents are processed manually, the potential for using AI-OCR is great. In this study, we present the configuration and design of an AI-OCR modality for use in the financial industry and discuss the platform construction with application cases. Since the use of financial domain data is prohibited under the Personal Information Protection Act, we developed a deep learning-based data generation approach and used it to train the AI-OCR models. The AI-OCR models are trained for image preprocessing, text recognition, and language processing and are configured as a microservice-architected platform to process a broad variety of documents. We have demonstrated the AI-OCR platform by applying it to the financial domain tasks of document sorting, document verification, and typing assistance. The demonstrations confirm improved work efficiency and convenience.

Development and Application of Scientific Model Co-construction Program about Image Formation by Convex Lens (볼록렌즈가 상을 만드는 원리에 대한 과학적 모형의 사회적 구성 프로그램 개발 및 적용)

  • Park, Jeongwoo
    • Korean Journal of Optics and Photonics / v.28 no.5 / pp.203-212 / 2017
  • A scientific model refers to a conceptual system that can describe, explain, and predict a particular physical phenomenon. The co-construction of scientific models is attracting attention as a new teaching and learning strategy in the field of science education and in various studies. Evaluating and modifying models by comparing their predictions with data from the real world is the core of the modeling strategy. However, in many modeling studies, students compared the predictions of their own models with only limited data provided by the teacher. Most of the students were not given the opportunity to evaluate the suitability of the model with data from the real world. The purpose of this study was to develop a scientific model co-construction program in which students can evaluate a model by directly comparing its predictions with observed data from the real world. Through a collaborative discussion between teachers and researchers over 6 months, a 5-session scientific model co-construction program on the subject 'image formation by convex lenses' was developed for second grade middle school students. Eighty (80) students in 3 classes and a science teacher with 20 years of service from a general public co-educational middle school in Gyeonggi-do participated in this 2-week program. After the class, students were asked about the helpfulness and difficulty of the class, and whether they would like to recommend this class to a friend. After the class, 95.8% of the students constructed the scientific model more than the model using the construction rule. Students had difficulty identifying principles or understanding their friends, but the results showed that they could reach an understanding through the model evaluation experiment. 92.5% of the students said that they would be more than willing to recommend this program to their friends.
It is expected that the developed program will be applied in schools and contribute to the improvement of students' modeling ability and co-construction ability.

An Analysis of the Teacher Librarian's Duties and Competencies Embedded in the IB International School Job Advertisement (IB 국제학교 구인광고에 담긴 사서교사의 직무 및 역량 분석)

  • Eun-Hae, Kim;Gi-Ho, Song
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.33 no.4 / pp.5-25 / 2022
  • The purpose of this study is to analyze the duties and competencies of the teacher librarian required by schools, as consumers, to operate the curriculum, and to suggest ways to improve their professionalism. To this end, the duties and competencies included in 20 job advertisements posted by IB schools to select teacher librarians were analyzed based on the IFLA School Library Guidelines. The analysis found that the duties and competencies required by IB schools are based on the IB curriculum guidelines, and that these guidelines are based on the educational philosophy and learner image that the IBO curriculum aims for. The job that schools want most from the teacher librarian is teaching through library collection management and collaboration, and the main competencies for this are communication and collaboration skills; the design and operation of teaching-learning, curriculum, and education; and digital and media literacy. The results of this analysis show that professionalism should be based on the vision of the educated person and the learner capabilities presented in the curriculum. Based on these results, this study presents ways to develop teacher librarians' professionalism in the following respects: first, including the educational responsibilities of the school library in the Arrangement and Implementation Guideline of National Level Curriculum; second, classifying human resources' duties through revision of the Enforcement Decree of the School Library Promotion Act; third, reorganizing the basic courses for acquiring teacher librarian qualifications and introducing a demonstration of collaborative teaching in the educational practice and the certification examination.

Hyperparameter Optimization for Image Classification in Convolutional Neural Network (합성곱 신경망에서 이미지 분류를 위한 하이퍼파라미터 최적화)

  • Lee, Jae-Eun;Kim, Young-Bong;Kim, Jong-Nam
    • Journal of the Institute of Convergence Signal Processing / v.21 no.3 / pp.148-153 / 2020
  • In order to obtain high accuracy with a convolutional neural network (CNN), it is necessary to set the optimal hyperparameters. However, the exact hyperparameter values that yield high performance are not known in advance, and the optimal values differ depending on the type of dataset, so they must be found through various experiments. In addition, since the range of hyperparameter values is wide and the number of combinations is large, it is necessary to find the optimal values of the hyperparameters after an experimental design in order to save time and computational cost. In this paper, we suggest an algorithm that uses the design of experiments and a grid search algorithm to determine the optimal hyperparameters for a classification problem. This algorithm determines the optimal values of the hyperparameters that yield high performance using the factorial design of experiments. It is shown that the computational time can be efficiently reduced and the accuracy can be improved by performing a grid search after reducing the search range of each hyperparameter through the experimental design. Moreover, the experimental results show that the learning rate is the hyperparameter with the greatest effect on the performance of the model.
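
The two-stage idea in this abstract (screen with a factorial design, then grid-search a reduced range) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: `toy_score` is a hypothetical stand-in for training a CNN and measuring validation accuracy, the two factors and their levels are invented, and halving each range toward the better level is just one simple way to "reduce the search range".

```python
from itertools import product


def toy_score(lr, batch_size):
    """Hypothetical stand-in for 'train a CNN, return validation accuracy'."""
    # Invented response surface that peaks near lr=0.01, batch_size=64.
    return 1.0 - 50.0 * abs(lr - 0.01) - 0.001 * abs(batch_size - 64)


def factorial_screen(score, levels):
    """Run a two-level full factorial design; return each factor's main effect."""
    names = list(levels)
    runs = []
    for combo in product(*(levels[n] for n in names)):
        params = dict(zip(names, combo))
        runs.append((params, score(**params)))
    effects = {}
    for n in names:
        low, high = levels[n]
        at_low = [s for p, s in runs if p[n] == low]
        at_high = [s for p, s in runs if p[n] == high]
        # Main effect: mean score at the high level minus mean at the low level.
        effects[n] = sum(at_high) / len(at_high) - sum(at_low) / len(at_low)
    return effects


def narrow_ranges(levels, effects):
    """Keep the half of each range that lies toward the better-scoring level."""
    narrowed = {}
    for n, (low, high) in levels.items():
        mid = (low + high) / 2
        narrowed[n] = (mid, high) if effects[n] > 0 else (low, mid)
    return narrowed


def grid_search(score, ranges, points=5):
    """Exhaustively evaluate an evenly spaced grid over the given ranges."""
    names = list(ranges)
    axes = []
    for n in names:
        low, high = ranges[n]
        step = (high - low) / (points - 1)
        axes.append([low + i * step for i in range(points)])
    best_params, best_score = None, float("-inf")
    for combo in product(*axes):
        params = dict(zip(names, combo))
        s = score(**params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score


levels = {"lr": (0.001, 0.1), "batch_size": (16, 256)}  # assumed low/high levels
effects = factorial_screen(toy_score, levels)
narrowed = narrow_ranges(levels, effects)
best_params, best_score = grid_search(toy_score, narrowed)
print(effects, narrowed, best_params, best_score)
```

In this toy setup the screening stage costs 4 runs and the grid stage 25, and the factor with the largest absolute main effect is the learning rate, but only because the invented score function was built that way.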

A Study on the Development of Design Support Program based upon Academic-Industrial Collaboration -Concentrated on furniture industry in Kimpo area- (산학협동 디자인 지원 프로그램 개발 연구 -김포지역 가구 산업체를 중심으로-)

  • 김국선
    • Archives of design research / v.15 no.1 / pp.59-67 / 2002
  • In accordance with fast-changing circumstances, universities are now moving beyond being places for the simple delivery of knowledge and exchange of information, toward developing specialized programs related to their region and industry in order to achieve a competitive edge in education. Through such academic-industrial collaboration programs, the specialized knowledge and human resources of the university are utilized in society, which may contribute to the development of industry and to the training of professionals grounded in real job sites. In addition, through the practical operation of such programs, the university may contribute to enhancing the competitiveness of corporations, to activating and accelerating the regional economy, and finally to enhancing the competitiveness of national industry at the international level. This study tries to develop 3 technical instruction and support programs related to the region and industry which conform to the ideals of the university and its educational goals, provide a platform for education that is closely related to the regional industry and real job sites, and can cope actively with the upcoming knowledge society and fast-changing regional circumstances. This study researches and analyzes the design support needs of industry in order to develop an academic-industrial cooperative design support program for the furniture industry which conforms to regional characteristics, and develops and presents the program contents in three areas: development of furniture design technology, build-up of a furniture design information system, and establishment of an order-based education system. The proposed program is expected to be operated practically and effectively and to contribute to the development of new products, to the enhancement of company image, and finally to the maximization of corporate profit.
Also, this study is expected to be used as important material for establishing an order-based education system that conforms to job site needs, which may be analyzed from feedback on product results, and for practical learning.

The Educational Effect of the Visualization of Heat Conduction with a Thermal Imaging Camera on Elementary School Students in Small Group Activity - Focusing on the Change of the Mental Model of Why Metal Feels Cold - (열화상 사진기로 열전도 현상을 시각화한 자료가 소집단 활동에서 초등학생에게 미치는 교육적 효과 - 금속이 차갑게 느껴지는 이유에 대한 정신모형 변화를 중심으로 -)

  • Lee, Ga Ram;Ju, Eunjeong;Park, Il-Woo
    • Journal of Korean Elementary Science Education / v.41 no.3 / pp.569-591 / 2022
  • This study aims to investigate the educational effects of the visualization of heat conduction using a thermal imaging camera on elementary school students through small group activities. It endeavors to explain why metal feels cold. The scholars conducted in-depth interviews with four fifth-grade students in Seoul before and after they learned the unit "Temperature and Heat". Recorded video and audio materials of the activities, their outputs, and journals of scholars were collected, reviewed, and analyzed. The results demonstrated that visualizing heat conduction using the thermal imaging camera aroused curiosity and provided an opportunity for sophisticated observation and integrated thinking. In addition, the visualization of the heat conduction phenomenon was used as the basis for interpretation and rebuttal in active communication during the small group activities of the students. Consequently, the students changed their non-scientific beliefs, refined their knowledge, and developed their mental models through a small group discussion based on a thermal image video.

Threat Situation Determination System Through AWS-Based Behavior and Object Recognition (AWS 기반 행위와 객체 인식을 통한 위협 상황 판단 시스템)

  • Ye-Young Kim;Su-Hyun Jeong;So-Hyun Park;Young-Ho Park
    • KIPS Transactions on Software and Data Engineering / v.12 no.4 / pp.189-198 / 2023
  • As crimes frequently occur on the street, the spread of CCTV is increasing. However, due to the shortcomings of passively operated CCTV, the need for intelligent CCTV is attracting attention. Because such intelligent CCTV systems are heavy, high-performance devices are required, which makes it expensive to replace general CCTV. To solve this problem, an intelligent CCTV system that recognizes low-quality images and operates even on low-performance devices is required. Therefore, this paper proposes a Saying CCTV system that can detect threats in real time by using the AWS cloud platform to lighten the system and by converting images into text. Based on the data extracted using YOLO v4 and OpenPose, it is implemented to determine the risk object, threat behavior, and threat situation, and to calculate the risk using machine learning. As a result, the system can be operated anytime and anywhere as long as the network is connected, and it can be used even with devices that have only the minimal performance needed for video shooting and image upload. Furthermore, by analyzing the video and using the data stored as text, meaningful crime statistics can be produced automatically, making it possible to prevent crime quickly.

The Performance Improvement of U-Net Model for Landcover Semantic Segmentation through Data Augmentation (데이터 확장을 통한 토지피복분류 U-Net 모델의 성능 개선)

  • Baek, Won-Kyung;Lee, Moung-Jin;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1663-1676 / 2022
  • Recently, a number of deep-learning based land cover segmentation studies have been introduced. Some studies noted that the performance of land cover segmentation deteriorated due to insufficient training data. In this study, we verified the improvement of land cover segmentation performance through data augmentation. U-Net was implemented as the segmentation model, and a 2020 satellite-derived landcover dataset was used as the study data. The pixel accuracies were 0.905 and 0.923 for U-Net trained on the original and augmented data, respectively. The mean F1 scores of those models were 0.720 and 0.775, respectively, indicating the better performance with data augmentation. In addition, the F1 scores for the building, road, paddy field, upland field, forest, and unclassified area classes were 0.770, 0.568, 0.433, 0.455, 0.964, and 0.830 for the U-Net trained on the original data. Data augmentation is verified to be effective in that the F1 scores of every class improved, to 0.838, 0.660, 0.791, 0.530, 0.969, and 0.860, respectively. Although we applied data augmentation without considering class balance, the comparison between the two models shows that data augmentation can mitigate the biased segmentation performance caused by data imbalance. It is expected that this study will help to demonstrate the importance and effectiveness of data augmentation in various image processing fields.
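
For readers unfamiliar with the two metrics reported above, the following minimal Python sketch shows how pixel accuracy and per-class F1 are commonly computed from flattened label maps. The tiny label lists are invented for illustration only and are not the study's data; the function names are likewise hypothetical.

```python
def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted class equals the reference class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_f1(y_true, y_pred, classes):
    """F1 = 2TP / (2TP + FP + FN), computed separately for each class."""
    scores = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def mean_f1(scores):
    """Unweighted mean of the per-class F1 scores."""
    return sum(scores.values()) / len(scores)


# Invented six-pixel example (flattened reference and predicted label maps).
y_true = ["forest", "forest", "road", "building", "road", "forest"]
y_pred = ["forest", "forest", "road", "road", "road", "building"]

accuracy = pixel_accuracy(y_true, y_pred)
f1 = per_class_f1(y_true, y_pred, ["forest", "road", "building"])
print(accuracy, f1, mean_f1(f1))
```

Because the mean F1 weights every class equally, a rare class that is segmented poorly pulls it down far more than it pulls down pixel accuracy, which is why the two metrics can move by different amounts.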

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly. As video data increase, the requirements for analysis and utilization are increasing. Due to the lack of skilled manpower to analyze videos in many industries, machine learning and artificial intelligence are actively used to assist manpower. In this situation, the demand for various computer vision technologies such as object detection and tracking, action detection, emotion detection, and Re-ID has also increased rapidly. However, object detection and tracking technology faces many difficulties that degrade performance, such as re-appearance after the object's departure from the video recording location, and occlusion. Accordingly, action and emotion detection models based on object detection and tracking models also have difficulties in extracting data for each object. In addition, deep learning architectures consisting of various models suffer from performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5 based DeepSORT object tracking model, a SlowFast based action recognition model, a Torchreid based Re-ID model, and AWS Rekognition, which is an emotion recognition service. The proposed model uses single-linkage hierarchical clustering based Re-ID and processing methods that maximize hardware throughput. It has higher accuracy than a re-identification model using simple metrics, achieves near real-time processing performance, and prevents tracking failure due to object departure and re-emergence, occlusion, etc. By continuously linking the action and facial emotion detection results of each object to the same object, it is possible to analyze videos efficiently.
The re-identification model extracts a feature vector from the bounding box of the object image detected by the object tracking model for each frame, and applies single-linkage hierarchical clustering to the feature vectors extracted from past frames to identify the same object whose tracking failed. Through the above process, it is possible to re-track an object whose tracking has failed in the case of re-appearance or occlusion after leaving the video location. As a result, the action and facial emotion detection results of an object newly recognized because of a tracking failure can be linked to those of the object that appeared in the past. On the other hand, as a way to improve processing performance, we introduce the Bounding Box Queue by Object and Feature Queue method that can reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which allows facial emotion recognized through AWS Rekognition to be linked with object tracking information. The academic significance of this study is that, thanks to the processing techniques, the two-stage re-identification model can achieve real-time performance even in a high-cost environment that performs action and facial emotion detection, without sacrificing accuracy by resorting to simple metrics. The practical implication of this study is that various industrial fields that require action and facial emotion detection but face many difficulties due to object tracking failures can analyze videos effectively through the proposed model. The proposed model, which has high re-tracking accuracy and processing performance, can be used in various fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where the integration of tracking information and extracted metadata creates great industrial and business value.
In the future, in order to measure the object tracking performance more precisely, there is a need to conduct an experiment using the MOT Challenge dataset, which is used by many international conferences. We will investigate the problems that the IoF algorithm cannot solve in order to develop an additional complementary algorithm. In addition, we plan to conduct additional research to apply this model to datasets from various fields related to intelligent video analysis.
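
To make two ideas from this abstract concrete, here is a minimal Python sketch under stated assumptions. The first function groups appearance feature vectors by single-linkage clustering cut at a distance threshold, which is equivalent to taking connected components of the graph that links any two vectors closer than the threshold. The second function is only a guess at what "Intersection over Face" means (overlap area divided by the face box area); the abstract names the algorithm but does not define it. The Euclidean distance, the threshold, the box format `(x1, y1, x2, y2)`, and all sample values are hypothetical.

```python
import math


def single_linkage_clusters(features, threshold):
    """Label vectors so that any chain of pairwise distances <= threshold
    ends up in one cluster (single linkage cut at `threshold`)."""
    parent = list(range(len(features)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if math.dist(features[i], features[j]) <= threshold:
                parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(features))]


def intersection_over_face(face_box, person_box):
    """Assumed definition: overlap area divided by the face box area."""
    fx1, fy1, fx2, fy2 = face_box
    px1, py1, px2, py2 = person_box
    overlap_w = max(0, min(fx2, px2) - max(fx1, px1))
    overlap_h = max(0, min(fy2, py2) - max(fy1, py1))
    face_area = (fx2 - fx1) * (fy2 - fy1)
    return overlap_w * overlap_h / face_area if face_area > 0 else 0.0


# Invented 2-D "feature vectors": three crops of one person (the third after a
# tracking failure), followed by two crops of a different person.
features = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 5.0]]
identities = single_linkage_clusters(features, threshold=0.3)

inside = intersection_over_face((10, 10, 20, 20), (0, 0, 50, 100))
partial = intersection_over_face((45, 10, 55, 20), (0, 0, 50, 100))
print(identities, inside, partial)
```

Under this assumed definition, a face box would be attached to the tracked person box with the highest ratio, so a small face fully inside a large body box scores 1.0 even though its ordinary IoU would be tiny.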

Improved Method of License Plate Detection and Recognition using Synthetic Number Plate (인조 번호판을 이용한 자동차 번호인식 성능 향상 기법)

  • Chang, Il-Sik;Park, Gooman
    • Journal of Broadcast Engineering / v.26 no.4 / pp.453-462 / 2021
  • A large amount of license plate data is required for car number recognition. The data need to be balanced across license plates from past designs to the latest ones. However, it is difficult to obtain real data covering past to latest license plates. To solve this problem, license plate recognition studies using deep learning are being conducted by creating synthetic license plates. Since synthetic data differ from real data, various data augmentation techniques are used to address the difference. Existing data augmentation simply used methods such as brightness, rotation, affine transformation, blur, and noise. In this paper, we apply a style transformation method that transforms synthetic data into real-world data styles as a data augmentation method. In addition, real license plate data are noisy when captured from a distance and in a dark environment. If we simply recognize characters from such input data, the chance of misrecognition is high. To improve character recognition, we applied the DeblurGANv2 method as a quality improvement method, increasing the accuracy of license plate recognition. YOLO-V5 was used as the deep learning method for license plate detection and license plate number recognition. To determine the performance of the synthetic license plate data, we constructed a test set from license plates we collected ourselves. License plate detection without style conversion recorded 0.614 mAP. After applying the style transformation, license plate detection performance improved to 0.679 mAP. In addition, the successful detection rate was 0.872 without image enhancement and 0.915 after image enhancement, confirming that the performance improved.
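
The "existing" augmentations that the abstract lists (brightness, blur, noise) can be sketched in a few lines of plain Python. This is an illustrative sketch only: it is not the paper's pipeline, it does not cover the style transformation or DeblurGANv2 steps, and the tiny grayscale "plate" and all function names are invented.

```python
import random


def adjust_brightness(img, delta):
    """Shift every pixel by `delta`, clamped to the 0..255 range."""
    return [[min(255, max(0, p + delta)) for p in row] for row in img]


def add_gaussian_noise(img, sigma, seed=0):
    """Add zero-mean Gaussian noise per pixel (seeded for repeatability)."""
    rng = random.Random(seed)
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in img]


def box_blur(img):
    """Replace each pixel by the mean of its 3x3 neighbourhood (edges cropped)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neigh = [img[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(round(sum(neigh) / len(neigh)))
        out.append(row)
    return out


# Invented 4x6 grayscale "synthetic plate": dark strokes on a white background.
plate = [
    [255, 255, 255, 255, 255, 255],
    [255, 0, 0, 255, 0, 255],
    [255, 0, 0, 255, 0, 255],
    [255, 255, 255, 255, 255, 255],
]

darker = adjust_brightness(plate, -40)
noisy = add_gaussian_noise(plate, sigma=10)
blurred = box_blur(plate)
print(darker[0][0], blurred[1][1], len(noisy), len(noisy[0]))
```

Each function returns a new image of the same size, so the transforms can be chained in any order to produce several degraded variants of one synthetic plate.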