• Title/Summary/Keyword: Automatic Extraction


Indoor Scene Classification based on Color and Depth Images for Automated Reverberation Sound Editing (자동 잔향 편집을 위한 컬러 및 깊이 정보 기반 실내 장면 분류)

  • Jeong, Min-Heuk;Yu, Yong-Hyun;Park, Sung-Jun;Hwang, Seung-Jun;Baek, Joong-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.3 / pp.384-390 / 2020
  • Reverberation is a very important factor in the realism and liveliness of sound when producing movies or VR content. The recommended reverberation time for a given space is specified by a standard called RT60 (Reverberation Time 60 dB). In this paper, we propose a scene recognition technique for automatic reverberation editing. To this end, we devised a classification model that independently trains on color images and predicted depth images within the same model. Indoor scene classification trained only on color information is limited because of the similarity of internal structures. To make use of spatial depth information, a deep learning based depth extraction technique is used. Based on RT60, 10 scene classes were constructed, and model training and evaluation were conducted. Finally, the proposed SCR + DNet (Scene Classification for Reverb + Depth Net) classifier achieves 92.4% accuracy, a higher performance than conventional CNN classifiers.

Database Generation and Management System for Small-pixelized Airborne Target Recognition (미소 픽셀을 갖는 비행 객체 인식을 위한 데이터베이스 구축 및 관리시스템 연구)

  • Lee, Hoseop;Shin, Heemin;Shim, David Hyunchul;Cho, Sungwook
    • Journal of Aerospace System Engineering / v.16 no.5 / pp.70-77 / 2022
  • This paper proposes a database generation and management system for small-pixelized airborne target recognition. The proposed system has five main features: 1) image extraction from in-flight test video frames, 2) automatic image archiving, 3) image data labeling and metadata annotation, 4) virtual image data generation based on color channel conversion and seamless cloning, and 5) HOG/LBP-based augmented image data for tiny-pixelized targets. The proposed framework is built in Python with PyQt5 and has an interface that includes OpenCV. Using video files collected from flight tests as system input, an image dataset for airborne target recognition is generated with the proposed system.

A Study on Speechreading about the Korean 8 Vowels (한국어 8모음 자동 독화에 관한 연구)

  • Lee, Kyong-Ho;Yang, Ryong;Kim, Sun-Ok
    • Journal of the Korea Society of Computer and Information / v.14 no.3 / pp.173-182 / 2009
  • In this paper, we studied parameter extraction and the implementation of a speechreading system to recognize the 8 Korean vowels. Face features are detected by amplifying and reducing the image values and comparing the image values as represented in various color spaces. The eye positions, the nose position, the inner boundary of the lips, the outer boundary of the upper lip, and the outer line of the teeth are found as features. From their analysis, the area of the inner lip, the height and width of the inner lip, the ratio of the outer tooth line length to the inner mouth area, and the distance between the nose and the outer boundary of the upper lip are used as parameters. In total, 2400 data samples were gathered and analyzed. Based on this analysis, a neural network was constructed and recognition experiments were performed. In the experiments, 5 normal persons were sampled. The observational error between samples was corrected using a normalization method. The experiments show very encouraging results regarding the usefulness of the parameters.

Classification of Torso Shapes of Men Aged 40-64 - Based on Measurements Extracted from the 8th Size Korea Scans - (40-64세 남성의 토르소 형태 분류에 관한 연구 - 제8차 Size Korea 인체형상으로부터 추출한 측정값을 이용하여 -)

  • Guo Tingyu;Eun Joo Ryu;Hwa Kyung Song
    • Fashion & Textile Research Journal / v.25 no.1 / pp.92-103 / 2023
  • Because the change in body shape that occurs after middle age is the main factor affecting the fit of ready-to-wear clothes, this study was designed to classify and analyze the torso shapes of middle-aged men. This study sorted 3D body scans of 200 men aged 40-64 from the 8th Size Korea (2021) database and extracted 47 measurement values using the Grasshopper algorithm for automatic extraction of landmarks and measurements developed in previous research (Ryu & Song, 2022). Eight principal components (torso length, shoulder size, overall body size, abdomen prominence, back protrusion, neck inclination, upper body slope, and hip prominence) were identified and four torso shapes were classified. Shape 1 (28.5%) exhibited the shortest torso length, the narrowest shoulders, and the most protruding back. Shape 2 (21.0%) exhibited the skinniest body and the largest backward inclination of the upper body. Hence, the back appeared to be protruding, and the abdomen looked prominent. Shape 3 (25.5%) had the largest overall body size. Thus, the abdomen looked the least protruding, and it exhibited the flattest back. Shape 4 (25.0%) had the longest torso, widest shoulders, straightest neck, and the least protruding hips. This study suggested three discriminant functions to identify a new person's torso type.

Correlation Extraction from KOSHA to enable the Development of Computer Vision based Risks Recognition System

  • Khan, Numan;Kim, Youjin;Lee, Doyeop;Tran, Si Van-Tien;Park, Chansik
    • International conference on construction engineering and project management / 2020.12a / pp.87-95 / 2020
  • Occupational safety in general, and construction safety in particular, is an intricate phenomenon. Over the last three decades, industry professionals have devoted considerable attention to enforcing Occupational Safety and Health (OHS) to enhance safety management in construction. Despite the efforts of safety professionals and government agencies, current safety management still relies on manual inspections, which are infrequent, time-consuming, and prone to error. Extensive research has been carried out to deal with the high fatality rates confronting the construction industry. Sensor systems, visualization-based technologies, and tracking techniques have been deployed by researchers in the last decade. Recently, computer vision has attracted significant attention in the construction industry worldwide. However, the literature reveals that computer vision technology has been applied to safety management only within a narrow scope; hence, broader research on safety monitoring is needed to attain fully automatic job site monitoring. In this regard, a broader-scope computer vision-based risk recognition system that detects correlations between construction entities needs to be developed. For this purpose, a detailed analysis was conducted, and related rules depicting the correlations (positive and negative) between construction entities were extracted. The deep learning supported Mask R-CNN algorithm is applied to train the model. As a proof of concept, a prototype is developed based on real scenarios. The proposed approach is expected to enhance the effectiveness of safety inspection and reduce the burden on safety managers. It is anticipated that this approach may enable a reduction in injuries and fatalities by implementing the exact relevant safety rules and will contribute to enhancing overall safety management and monitoring performance.


Prerequisite Research for the Development of an End-to-End System for Automatic Tooth Segmentation: A Deep Learning-Based Reference Point Setting Algorithm (자동 치아 분할용 종단 간 시스템 개발을 위한 선결 연구: 딥러닝 기반 기준점 설정 알고리즘)

  • Kyungdeok Seo;Sena Lee;Yongkyu Jin;Sejung Yang
    • Journal of Biomedical Engineering Research / v.44 no.5 / pp.346-353 / 2023
  • In this paper, we propose an approach that leverages deep learning to find optimal reference points for precise tooth segmentation in three-dimensional tooth point cloud data. A dataset consisting of 350 aligned maxillary and mandibular point cloud data was used as input, and the coordinates of both ends of each individual tooth were used as the ground truth. A two-dimensional image was created by projecting the rendered point cloud data along the Z-axis, and images of individual teeth were then created using an object detection algorithm. The proposed algorithm adds various modules to the Unet model that allow effective learning of a narrow range, and it detects both end points of the tooth using the generated tooth image. In the evaluation using DSC, Euclidean distance, and MAE as indicators, we achieved superior performance compared to other Unet-based models. In future research, we will develop an algorithm that finds the reference point in the point cloud by back-projecting the reference point detected in the image into three dimensions and, based on this, an algorithm that divides the teeth individually in the point cloud through image processing techniques.
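
The three indicators named in the abstract above (DSC, Euclidean distance, MAE) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, and the choice to represent masks as flat 0/1 lists and points as coordinate tuples, are assumptions made only for illustration.

```python
import math


def dice_coefficient(pred, truth):
    """Dice similarity coefficient (DSC) between two binary masks (flat 0/1 lists)."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def euclidean_distance(p, q):
    """Straight-line distance between a predicted and a reference end point."""
    return math.dist(p, q)


def mean_absolute_error(pred, truth):
    """Mean absolute error between predicted and reference values."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)
```

A higher DSC indicates better mask overlap, while lower Euclidean distance and MAE indicate end points closer to the reference.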

Generating Radiology Reports via Multi-feature Optimization Transformer

  • Rui Wang;Rong Hua
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.10 / pp.2768-2787 / 2023
  • As an important research direction in the application of computer science to the medical field, automatic radiology report generation has attracted wide attention in the academic community. Because the proportion of normal regions in radiology images is much larger than that of abnormal regions, words describing diseases are often masked by other words, resulting in significant feature loss during the calculation process, which affects the quality of generated reports. In addition, the large difference between visual features and semantic features causes traditional multi-modal fusion methods to fail to generate the long narrative structures consisting of multiple sentences that medical reports require. To address these challenges, we propose a multi-feature optimization Transformer (MFOT) for generating radiology reports. In detail, a multi-dimensional mapping attention (MDMA) module is designed to encode the visual grid features from different dimensions to reduce the loss of primary features in the encoding process; a feature pre-fusion (FP) module is constructed to enhance the interaction between multi-modal features, so as to generate a reasonably structured radiology report; and a detail enhanced attention (DEA) module is proposed to enhance the extraction and utilization of key features and reduce their loss. Finally, we evaluate the performance of the proposed model against prevailing mainstream models on two widely recognized radiology report datasets, IU X-Ray and MIMIC-CXR. The experimental outcomes demonstrate that our model achieves SOTA performance on both datasets. Compared with the base model, the average improvement across six key indicators is 19.9% and 18.0%, respectively. These findings substantiate the efficacy of our model in the domain of automated radiology report generation.

Performance Comparison of CNN-Based Image Classification Models for Drone Identification System (드론 식별 시스템을 위한 합성곱 신경망 기반 이미지 분류 모델 성능 비교)

  • YeongWan Kim;DaeKyun Cho;GunWoo Park
    • The Journal of the Convergence on Culture Technology / v.10 no.4 / pp.639-644 / 2024
  • Recent developments in the use of drones on battlefields, extending beyond reconnaissance to firepower support, have greatly increased the importance of technologies for early automatic drone identification. In this study, to identify an effective image classification model that can distinguish drones from other aerial targets of similar size and appearance, such as birds and balloons, we utilized a dataset of 3,600 images collected from the internet. We adopted a transfer learning approach that combines the feature extraction capabilities of three pre-trained convolutional neural network models (VGG16, ResNet50, InceptionV3) with an additional classifier. Specifically, we conducted a comparative analysis of the performance of these three pre-trained models to determine the most effective one. The results showed that the InceptionV3 model achieved the highest accuracy at 99.66%. This research represents a new endeavor in utilizing existing convolutional neural network models and transfer learning for drone identification, which is expected to make a significant contribution to the advancement of drone identification technologies.

GIS based Development of Module and Algorithm for Automatic Catchment Delineation Using Korean Reach File (GIS 기반의 하천망분석도 집수구역 자동 분할을 위한 알고리듬 및 모듈 개발)

  • PARK, Yong-Gil;KIM, Kye-Hyun;YOO, Jae-Hyun
    • Journal of the Korean Association of Geographic Information Studies / v.20 no.4 / pp.126-138 / 2017
  • Recently, national interest in the environment has been increasing, and demand is growing for GIS-based analysis of water environment data so that water environment issues can be dealt with swiftly and accurately. To meet this demand, a stream network analysis map based on spatial network data (Korean Reach File; KRF), which supports spatial analysis of water environment data, was developed and is being provided. However, it is difficult to delineate catchment areas, which are the basis for supplying spatial data, including the related information that users frequently require for tasks such as establishing remediation measures against water pollution accidents. Therefore, in this study, a computer program was developed. The development process included designing a delineation method and developing an algorithm and modules. DEM (Digital Elevation Model) and FDR (Flow Direction) data were used as the major inputs to automatically delineate catchment areas. The algorithm for delineating catchment areas was developed in three stages: catchment area grid extraction, boundary point extraction, and boundary line division. In addition, an add-in catchment area delineation module based on ArcGIS from ESRI was developed in consideration of the productivity and utility of the program. Using the developed program, catchment areas were delineated and compared to the catchment areas currently used by the government. The results showed that the catchment areas were delineated efficiently using the digital elevation data. In particular, in regions with clear topographical slopes, they were delineated accurately and swiftly. Although the catchment areas were not segmented accurately in some regions with flat paddy fields, downtown areas, or well-organized drainage facilities, the program clearly reduced the processing time needed to delineate existing catchment areas.
In the future, more effort should be made to enhance the current algorithm so that it can use higher-precision digital elevation data, and to reduce the calculation time for processing large data volumes.
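
The first of the three stages named above, catchment area grid extraction from a flow direction (FDR) grid, can be sketched as an upstream search from an outlet cell. This is a minimal illustration, not the authors' ArcGIS add-in: it assumes the D8 direction codes used by ArcGIS flow direction rasters, and the function name `catchment_cells` and the list-of-lists grid are hypothetical.

```python
from collections import deque

# D8 direction codes in the ArcGIS convention: code -> (row offset, column offset)
D8_OFFSETS = {
    1: (0, 1),     # east
    2: (1, 1),     # southeast
    4: (1, 0),     # south
    8: (1, -1),    # southwest
    16: (0, -1),   # west
    32: (-1, -1),  # northwest
    64: (-1, 0),   # north
    128: (-1, 1),  # northeast
}


def catchment_cells(fdr, outlet):
    """Return the set of (row, col) cells whose flow path reaches `outlet`."""
    rows, cols = len(fdr), len(fdr[0])
    catchment = {outlet}
    queue = deque([outlet])
    while queue:
        r, c = queue.popleft()
        # Inspect all eight neighbours and keep those that drain into (r, c).
        for dr, dc in D8_OFFSETS.values():
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in catchment:
                continue
            offset = D8_OFFSETS.get(fdr[nr][nc])  # None for cells with no valid code
            if offset is not None and (nr + offset[0], nc + offset[1]) == (r, c):
                catchment.add((nr, nc))
                queue.append((nr, nc))
    return catchment
```

The boundary point extraction and boundary line division stages would then operate on the edge of the returned cell set; they are not sketched here because the abstract gives no detail on them.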

A Dynamic Management Method for FOAF Using RSS and OLAP cube (RSS와 OLAP 큐브를 이용한 FOAF의 동적 관리 기법)

  • Sohn, Jong-Soo;Chung, In-Jeong
    • Journal of Intelligence and Information Systems / v.17 no.2 / pp.39-60 / 2011
  • Since the introduction of Web 2.0 technology, social network services have been recognized as an important foundation of future information technology. The advent of Web 2.0 has changed who creates content: on the earlier web, content creators were service providers, whereas on the recent web they are service users. Users share experiences with other users and thereby improve content quality, which has increased the importance of social networks. As a result, diverse forms of social network service have emerged from the relations and experiences of users. A social network is a network that constructs and expresses social relations among people who share interests and activities. Today's social network services are not confined to showing user interactions; they have developed to a level at which content generation and evaluation interact with each other. As the volume of content generated by social network services and the number of connections between users have drastically increased, social network extraction has become more complicated. Consequently, the following problems arise for social network extraction. The first problem is the insufficient representational power of objects in the social network. The second is the inability to express the diverse connections among users. The third is the difficulty of reflecting dynamic change in the social network caused by changes in user interests. The last is the lack of a method capable of integrating and processing data efficiently in a heterogeneous distributed computing environment. The first and last problems can be solved by using FOAF, a tool for describing ontology-based user profiles for the construction of social networks. However, solving the second and third problems requires a novel technology that reflects dynamic changes in user interests and relations.
In this paper, we propose a novel method to overcome the above problems of existing social network extraction methods by applying FOAF (a tool for describing user profiles) and RSS (a mechanism for publishing web content) to an OLAP system in order to dynamically update and manage FOAF. We rely on data interoperability, which is an important characteristic of FOAF. We then use RSS to reflect changes such as the passage of time and shifts in user interests. RSS provides a standard vocabulary, in the form of RDF/XML, for distributing web sites and their content. In this paper, we collect the personal information and relations of users by utilizing FOAF, and we collect user content by utilizing RSS. The collected data is then inserted into the database using a star schema. The proposed system generates an OLAP cube from the data in the database, and the 'Dynamic FOAF Management Algorithm' processes the generated OLAP cube. The Dynamic FOAF Management Algorithm consists of two functions: find_id_interest() and find_relation(). find_id_interest() extracts user interests during the input period, and find_relation() extracts users with matching interests. Finally, the proposed system reconstructs FOAF by reflecting the extracted relationships and interests of users. To justify the suggested idea, we present the implemented result together with its analysis. We used the C# language and an MS-SQL database, and used FOAF and RSS data collected from livejournal.com as input. The implemented result shows that users' foaf:interest increased by an average of 19 percent over four weeks. In proportion to the increased foaf:interest change, the number of users' foaf:knows grew by an average of 9 percent over four weeks.
Because we use FOAF and RSS, which are widely supported in Web 2.0 and social network services, as basic data, we have a definite advantage in utilizing user data distributed across diverse web sites and services regardless of language and type of computer. Using the method suggested in this paper, we can provide better services that cope with rapid changes in user interests through the automatic application of FOAF.
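
The roles of find_id_interest() and find_relation() described above can be illustrated with a minimal sketch. The original system was written in C# over an OLAP cube; here a plain in-memory list of (user, keyword, date) rows stands in for the cube, and the row layout, the `top_n` parameter, the sample data, and the rule "shares at least one interest" are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical fact rows standing in for the OLAP cube: (user id, keyword, post date)
POSTS = [
    ("alice", "semantic web", date(2011, 1, 3)),
    ("alice", "semantic web", date(2011, 1, 10)),
    ("alice", "cooking", date(2010, 6, 1)),
    ("bob", "semantic web", date(2011, 1, 5)),
    ("carol", "cooking", date(2011, 1, 7)),
]


def find_id_interest(posts, user_id, start, end, top_n=3):
    """Return the user's most frequent keywords among posts dated within [start, end]."""
    counts = Counter(
        keyword
        for uid, keyword, day in posts
        if uid == user_id and start <= day <= end
    )
    return [keyword for keyword, _ in counts.most_common(top_n)]


def find_relation(posts, user_id, start, end):
    """Return other users sharing at least one interest with `user_id` in the period."""
    mine = set(find_id_interest(posts, user_id, start, end))
    others = {uid for uid, _, _ in posts if uid != user_id}
    return sorted(
        uid
        for uid in others
        if mine & set(find_id_interest(posts, uid, start, end))
    )
```

In this sketch, the output of find_id_interest() would correspond to refreshed foaf:interest entries and the output of find_relation() to candidate foaf:knows links for the reconstructed profile.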