• Title/Abstract/Keyword: Image Extraction


LSTM(Long Short-Term Memory)-Based Abnormal Behavior Recognition Using AlphaPose (AlphaPose를 활용한 LSTM(Long Short-Term Memory) 기반 이상행동인식)

  • Bae, Hyun-Jae;Jang, Gyu-Jin;Kim, Young-Hun;Kim, Jin-Pyung
    • KIPS Transactions on Software and Data Engineering / v.10 no.5 / pp.187-194 / 2021
  • Human behavior recognition identifies what a person is doing from the movement of their joints, and relies on computer vision techniques from image processing. Combined with deep learning and CCTV, it can serve as a safety accident response service at safety management sites. Existing studies, however, have paid relatively little attention to behavior recognition based on deep-learning extraction of human joint keypoints, and it has been difficult to manage workers continuously and systematically at such sites. To address these problems, this paper proposes a method that recognizes risky behavior using only joint keypoints and joint motion information. AlphaPose, a pose estimation method, was used to extract the body's joint keypoints. The extracted keypoints were fed sequentially into a Long Short-Term Memory (LSTM) model so that it could learn from them as continuous data. In the accuracy evaluation, recognition of the "Lying Down" behavior was confirmed to be highly accurate.

A Study on Seasonal Color Image of Flower Display in Commercial Spaces (상업공간 플라워 디스플레이의 계절별 색채이미지에 관한 연구)

  • Yang, Hee Sun;Wang, Kyung Hee;Kim, Jung Min
    • Journal of the Korean Society of Floral Art and Design / no.43 / pp.3-17 / 2020
  • This study analyzes the colors of seasonal displays that use flower materials in department stores, hotels, and retailers, which are representative commercial spaces in Korea and abroad, and surveys responses to them. Its aim is to establish the need for color planning that applies seasonal colors and emotional adjectives in flower display, moving away from the traditional approach of relying on the season, shape, and texture of the materials. The colors used in 48 collected domestic and foreign commercial flower display cases were analyzed. Based on this analysis, a first questionnaire of experts collected the adjectives and seasonal coordinates evoked by each case, and a second survey of the general public examined the suitability of the extracted emotional adjectives. The results identified typical colors and tones for spring, summer, fall, and winter, along with seasonal emotional adjectives. They indicate that flower display, which used to depend solely on the shape or texture of the flower material, should have its color scheme planned in advance to produce the intended emotional design.

Development of Fast Posture Classification System for Table Tennis Robot (탁구 로봇을 위한 빠른 자세 분류 시스템 개발)

  • Jin, Seongho;Kwon, Yongwoo;Kim, Yoonjeong;Park, Miyoung;An, Jaehoon;Kang, Hosun;Choi, Jiwook;Lee, Inho
    • The Journal of Korea Robotics Society / v.17 no.4 / pp.463-476 / 2022
  • In this paper, we propose a table tennis posture classification system for a cooperative robot, with the goal of developing a table tennis robot that allows training as in a real game. The ideal table tennis robot would have both a high joint driving speed and a high degree of freedom. We therefore use a cooperative robot, which has sufficient degrees of freedom but the disadvantage of a slow joint driving speed. This shortcoming is expected to be overcome through quick recognition, so we try to classify the opponent's posture quickly. To this end, models were trained on dynamic postures using image data as input; three classification models were created, and comparative experiments and evaluations were performed on the designated dynamic postures. The comparative experiments show that the classification model using an MLP (Multi-Layer Perceptron) achieves the highest classification accuracy and the fastest classification speed, demonstrating the validity of the proposed algorithm.

Morphological Analysis of Hydraulically Stimulated Fractures by Deep-Learning Segmentation Method (딥러닝 기반 균열 추출 기법을 통한 수압 파쇄 균열 형상 분석)

  • Park, Jimin;Kim, Kwang Yeom;Yun, Tae Sup
    • Journal of the Korean Geotechnical Society / v.39 no.8 / pp.17-28 / 2023
  • Laboratory-scale hydraulic fracturing experiments were conducted on granite specimens at various viscosities and injection rates of the fracturing fluid. A series of cross-sectional computed tomography (CT) images of the fractured specimens was obtained via a three-dimensional X-ray CT imaging method. Pixel-level fracture segmentation of the CT images was conducted using a convolutional neural network (CNN)-based Nested U-Net model structure. Compared with traditional image processing methods, the CNN-based model performed better in extracting thin and complex fractures. The extracted fractures were reconstructed in three dimensions and morphologically analyzed based on their fracture volume, aperture, tortuosity, and surface roughness. The fracture volume and aperture increased with the viscosity of the fracturing fluid, while the tortuosity and roughness of the fracture surface decreased. The findings also confirmed the anisotropic tortuosity and roughness of the fracture surface. In this study, a CNN-based model was used to perform accurate fracture segmentation, and quantitative analysis of hydraulically stimulated fractures was conducted successfully.

FE-CBIRS Using Color Distribution for Cut Retrieval in IPTV (IPTV에서 컷 검색을 위한 색 분포정보를 이용한 FE-CBIRS)

  • Koo, Gun-Seo
    • Journal of the Korea Society of Computer and Information / v.14 no.1 / pp.91-97 / 2009
  • This paper proposes a novel FE-CBIRS that finds the best position of a cut to be retrieved, based on the color feature distribution in the digital content of IPTV. Conventional CBIRS either use color and shape information together to classify images, or search using both feature information from the entire region and feature information from a partial region extracted by segmentation. For color features, these algorithms use the mean, standard deviation, and skewness of the hue, saturation, and intensity values. When partial regions are used, only a few major colors are considered, and for shape features the invariant moment is mainly computed on the extracted partial regions. For these reasons, CBIRS have so far suffered from problems in processing time and accuracy. To tackle these problems, this paper proposes the FE-CBIRS, which speeds up searching by classifying and indexing the extracted color information by class and by using several cuts restricted in range as comparison images.
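The color features named in this abstract (mean, standard deviation, and skewness per hue, saturation, and intensity channel) can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `color_moments` and `feature_vector` are hypothetical, and the skewness here follows one common color-moment convention (the signed cube root of the third central moment), which is an assumption.

```python
import math

def color_moments(values):
    """Return (mean, standard deviation, skewness) for one channel.

    Assumption: skewness is the signed cube root of the third central
    moment, a common convention for color-moment features.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    third = sum((v - mean) ** 3 for v in values) / n
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, std, skew

def feature_vector(pixels):
    """Build a 9-value feature from a list of (hue, saturation, intensity) pixels."""
    features = []
    for channel in range(3):
        features.extend(color_moments([p[channel] for p in pixels]))
    return features

# Hypothetical usage: three pixels given as (H, S, I) triples.
vec = feature_vector([(0.1, 0.5, 0.2), (0.2, 0.5, 0.4), (0.3, 0.5, 0.9)])
print(len(vec))
```

Two images or cuts could then be compared by a distance between their feature vectors; which distance the paper uses is not stated in the abstract.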

Hierarchical Flow-Based Anomaly Detection Model for Motor Gearbox Defect Detection

  • Younghwa Lee;Il-Sik Chang;Suseong Oh;Youngjin Nam;Youngteuk Chae;Geonyoung Choi;Gooman Park
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.6 / pp.1516-1529 / 2023
  • In this paper, a motor gearbox fault-detection system based on a hierarchical flow-based model is proposed. The proposed system is used for the anomaly detection of a motion sound-based actuator module. The proposed flow-based model, which is a generative model, learns by directly modeling a data distribution function. As the objective function is the maximum likelihood value of the input data, the training is stable and simple to use for anomaly detection. The operation sound of a car's side-view mirror motor is converted into a Mel-spectrogram image, consisting of a folding signal and an unfolding signal, and used as training data in this experiment. The proposed system is composed of an encoder and a decoder. The data extracted from the layer of the pretrained feature extractor are used as the decoder input data in the encoder. This information is used in the decoder by performing an interlayer cross-scale convolution operation. The experimental results indicate that the context information of various dimensions extracted from the interlayer hierarchical data improves the defect detection accuracy. This paper is notable because it uses acoustic data and a normalizing flow model to detect outliers based on the features of experimental data.

Generating Radiology Reports via Multi-feature Optimization Transformer

  • Rui Wang;Rong Hua
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.10 / pp.2768-2787 / 2023
  • As an important research direction in the application of computer science to the medical field, the automatic generation of radiology reports has attracted wide attention in the academic community. Because the proportion of normal regions in radiology images is much larger than that of abnormal regions, words describing diseases are often masked by other words, resulting in significant feature loss during computation, which affects the quality of the generated reports. In addition, the large gap between visual features and semantic features causes traditional multi-modal fusion methods to fail to generate the long, multi-sentence narrative structures that medical reports require. To address these challenges, we propose a multi-feature optimization Transformer (MFOT) for generating radiology reports. Specifically, a multi-dimensional mapping attention (MDMA) module is designed to encode the visual grid features from different dimensions and reduce the loss of primary features during encoding; a feature pre-fusion (FP) module is constructed to enhance the interaction between multi-modal features so as to generate a reasonably structured radiology report; and a detail enhanced attention (DEA) module is proposed to enhance the extraction and utilization of key features and reduce their loss. We evaluate the proposed model against prevailing mainstream models on two widely recognized radiology report datasets, IU X-Ray and MIMIC-CXR. The experimental results show that our model achieves SOTA performance on both datasets; compared with the base model, the average improvement across six key indicators is 19.9% and 18.0%, respectively. These findings substantiate the efficacy of our model for automated radiology report generation.

Analysis of the Influence of Atmospheric Turbulence on the Ground Calibration of a Star Sensor

  • Xian Ren;Lingyun Wang;Guangxi Li;Bo Cui
    • Current Optics and Photonics / v.8 no.1 / pp.38-44 / 2024
  • Under the influence of atmospheric turbulence, a star's point image will shake back and forth erratically, and after exposure the originally small star point will spread into a large spot, which affects the ground calibration of the star sensor. To analyze the impact of atmospheric turbulence on the positioning accuracy of the star's center of mass, this paper simulates the atmospheric turbulence phase screen using a method based on a sparse spectrum. The phase screen is added to the static-star-simulation device to study the transmission characteristics of atmospheric turbulence in star-point simulation, and to analyze the changes in star points under different atmospheric refractive-index structure constants. The simulation results show that the structure function of the atmospheric turbulence phase screen simulated by the sparse spectral method has an average error of 6.8% compared to the theoretical value, while the classical Fourier-transform method can have an error of up to 23% at low frequencies. In a simulation in which the phase screen causes errors in the center-of-mass position of the star point, 100 consecutive images are selected and the average drift variance is obtained for each turbulence scenario; the stronger the turbulence, the larger the drift variance. This study can provide a basis for subsequent improvement of the ground-calibration accuracy of a star sensor, and for analyzing and evaluating the effect of atmospheric turbulence on the beam.
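The center-of-mass and drift-variance quantities mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define "drift variance", so the code assumes it is the mean squared distance of each frame's centroid from the average centroid, and the names `centroid` and `drift_variance` are hypothetical.

```python
def centroid(image):
    """Intensity-weighted center of mass (x, y) of a 2D image given as a list of rows."""
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(image):
        for x, intensity in enumerate(row):
            total += intensity
            sum_x += x * intensity
            sum_y += y * intensity
    if total == 0:
        raise ValueError("image has no intensity")
    return sum_x / total, sum_y / total

def drift_variance(centroids):
    """Mean squared distance of the centroids from their average position.

    Assumption: this is one plausible definition of drift variance,
    not necessarily the one used in the paper.
    """
    n = len(centroids)
    mean_x = sum(c[0] for c in centroids) / n
    mean_y = sum(c[1] for c in centroids) / n
    return sum((c[0] - mean_x) ** 2 + (c[1] - mean_y) ** 2 for c in centroids) / n

# Hypothetical usage: two tiny frames of a star point.
frames = [[[0, 0], [0, 4]], [[1, 1], [1, 1]]]
print(drift_variance([centroid(f) for f in frames]))
```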

Neural network with occlusion-resistant and reduced parameters in stereo images (스테레오 영상에서 폐색에 강인하고 축소된 파라미터를 갖는 신경망)

  • Kwang-Yeob Lee;Young-Min Jeon;Jun-Mo Jeong
    • Journal of IKEEE / v.28 no.1 / pp.65-71 / 2024
  • This paper proposes a neural network for stereo matching that reduces the number of parameters while also reducing matching errors in occluded regions, thereby increasing the accuracy of depth maps. Stereo matching-based object recognition is used in many fields to recognize situations more accurately from images. When a complex image contains many objects, occluded areas arise from overlap between objects and occlusion by the background, which lowers the accuracy of the depth map. Existing methods that address this by creating context information and combining it with the cost volume or RoIselect in the occluded area increase the complexity of the neural network, making it difficult to train and expensive to implement. In this paper, we build a depthwise separable neural network that enhances regional feature extraction before cost volume generation, which reduces the number of parameters and makes the network robust to occlusion errors. Compared to PSMNet, the proposed neural network reduces the number of parameters by 30% and improves color error by 5.3% and test loss by 3.6%.
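Why a depthwise separable layer has fewer parameters than a standard convolution can be shown with a short count. This is a generic sketch for a single layer with biases ignored, not the paper's network; the channel sizes and function names are illustrative assumptions, and the per-layer saving is not the same quantity as the 30% overall reduction reported above.

```python
def standard_conv_params(kernel, in_channels, out_channels):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * in_channels * out_channels

def depthwise_separable_params(kernel, in_channels, out_channels):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = kernel * kernel * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 32 input channels, 64 output channels.
std = standard_conv_params(3, 32, 64)
sep = depthwise_separable_params(3, 32, 64)
print(std, sep, round(sep / std, 3))
```

For this example layer the separable form needs 2,336 weights against 18,432 for the standard form.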

Research on damage detection and assessment of civil engineering structures based on DeepLabV3+ deep learning model

  • Chengyan Song
    • Structural Engineering and Mechanics / v.91 no.5 / pp.443-457 / 2024
  • At present, traditional concrete surface inspection methods based on manual visual inspection suffer from high cost and safety risks, while conventional computer vision methods rely on manually selected features, are sensitive to environmental changes, and are difficult to deploy widely. To solve these problems, this paper introduces deep learning technology from the field of computer vision to achieve automatic feature extraction of structural damage, with excellent detection speed and strong generalization ability. The main contents of this study are as follows: (1) A method based on the DeepLabV3+ convolutional neural network model is proposed for surface detection of post-earthquake structural damage, including surface damage such as concrete cracks, spalling, and exposed steel bars. The key semantic information is extracted by different backbone networks, and data sets containing various types of surface damage are used for training, testing, and evaluation. The intersection ratios of 54.4%, 44.2%, and 89.9% on the test set demonstrate the network's capability to identify different types of structural surface damage in pixel-level segmentation, highlighting its effectiveness in varied testing scenarios. (2) A semantic segmentation model based on the DeepLabV3+ convolutional neural network is proposed for the detection and evaluation of post-earthquake structural components. Using a dataset that includes building structural components and their damage degrees for training, testing, and evaluation, semantic segmentation detection accuracies of 98.5% and 56.9% were recorded. To provide a comprehensive assessment that considers both false positives and false negatives, the Mean Intersection over Union (Mean IoU) was employed as the primary evaluation metric. This choice ensures that the network's performance in detecting and evaluating pixel-level damage in post-earthquake structural components is evaluated uniformly across all experiments.
By incorporating deep learning technology, this study not only offers an innovative solution for accurately identifying post-earthquake damage in civil engineering structures but also contributes significantly to empirical research in automated detection and evaluation within the field of structural health monitoring.
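As a reminder of how the Mean IoU metric named in this abstract accounts for both false positives and false negatives, here is a minimal sketch computed from a pixel-level confusion matrix. The function name `mean_iou` and the example matrix are hypothetical, and skipping classes that never occur is an assumption rather than the paper's stated procedure.

```python
def mean_iou(confusion):
    """Mean Intersection over Union from a square confusion matrix.

    confusion[t][p] counts pixels of true class t predicted as class p.
    Per class: IoU = TP / (TP + FP + FN).
    Assumption: classes absent from both truth and prediction are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true otherwise
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    if not ious:
        raise ValueError("confusion matrix is empty")
    return sum(ious) / len(ious)

# Hypothetical two-class example (background, damage).
print(mean_iou([[3, 1], [1, 5]]))
```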