• Title/Summary/Keyword: Gray Network


Classification of Whale Sounds using LPC and Neural Networks (신경망과 LPC 계수를 이용한 고래 소리의 분류)

  • An, Woo-Jin;Lee, Eung-Jae;Kim, Nam-Gyu;Chong, Ui-Pil
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.18 no.2
    • /
    • pp.43-48
    • /
    • 2017
  • Underwater transient signals are complex, time-varying, nonlinear, and of short duration, so it is very hard to model them with reference patterns. In this paper we divide the full-length signals into short frames of constant length with frame-by-frame overlap. The 20th-order LPC (Linear Predictive Coding) coefficients are extracted from the original signals using the Durbin algorithm and fed to a neural network. 65% of the signals were used for training and 35% for testing a neural network with two hidden layers. The whale types classified by sound are the Blue whale, Dulsae whale, Gray whale, Humpback whale, Minke whale, and Northern Right whale. Finally, we obtained a classification rate of more than 83% on the test signals.
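
Since the abstract names a concrete pipeline (overlapping constant-length frames, 20th-order LPC via the Durbin algorithm), a minimal sketch of that feature step may help. This assumes NumPy; the frame length and hop size are illustrative guesses, since the paper does not state them.

```python
import numpy as np

def lpc(frame, order=20):
    """20th-order LPC coefficients of one frame via the Levinson-Durbin recursion."""
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([frame[:n - k] @ frame[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0]
    for i in range(1, order + 1):
        k = -(r[i] + a[1:i] @ r[i - 1:0:-1]) / err  # reflection coefficient
        a[1:i] += k * a[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k                          # remaining prediction error
    return a[1:]                                    # 20 predictor coefficients

def frames(signal, length=1024, hop=512):
    """Split a signal into overlapping constant-length windows (sizes illustrative)."""
    return [signal[s:s + length] * np.hamming(length)
            for s in range(0, len(signal) - length + 1, hop)]

# Feature matrix for one recording: one 20-dimensional LPC vector per frame.
# features = np.array([lpc(f) for f in frames(whale_signal)])
```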


Application of deep convolutional neural network for short-term precipitation forecasting using weather radar-based images

  • Le, Xuan-Hien;Jung, Sungho;Lee, Giha
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2021.06a
    • /
    • pp.136-136
    • /
    • 2021
  • In this study, a deep convolutional neural network (DCNN) model is proposed for short-term precipitation forecasting using weather radar-based images. The DCNN model combines convolutional neural networks, autoencoder neural networks, and the U-net architecture. The weather radar-based image data used here come from a rainfall-forecasting competition in Korea (AI Contest for Rainfall Prediction of Hydroelectric Dam Using Public Data), organized by Dacon under the sponsorship of the Korean Water Resources Association in October 2020. The data were collected from rain events during the rainy season (April to October) from 2010 to 2017, and were preprocessed from weather radar data into grayscale images before being released for the competition. Each grayscale image covers 120×120 pixels at a temporal resolution of 10 minutes, with each pixel corresponding to a 4km×4km grid cell. The DCNN model is designed to predict the radar image 10 minutes ahead; precipitation information is then obtained from these forecast images through empirical conversion formulas. Model performance is assessed with a Score index defined as the ratio of MAE (mean absolute error) to CSI (critical success index). The competition results demonstrated the impressive performance of the DCNN model: a Score of 0.530 against the competition's best value of 0.500, ranking 16th out of 463 participating teams. These findings show the potential of the DCNN model for short-term rainfall prediction from weather radar-based images, and the model can be applied to other areas with different spatiotemporal resolutions.
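
A minimal sketch of a U-net-style convolutional encoder-decoder for 120×120 grayscale radar frames, assuming Keras; the layer widths and depth are illustrative, not the authors' exact design.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_dcnn(size=120):
    inp = keras.Input(shape=(size, size, 1))           # one grayscale radar frame
    c1 = layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
    p1 = layers.MaxPooling2D(2)(c1)                    # 120 -> 60
    c2 = layers.Conv2D(64, 3, activation="relu", padding="same")(p1)
    p2 = layers.MaxPooling2D(2)(c2)                    # 60 -> 30
    b = layers.Conv2D(128, 3, activation="relu", padding="same")(p2)
    u2 = layers.UpSampling2D(2)(b)                     # 30 -> 60
    u2 = layers.Concatenate()([u2, c2])                # U-net skip connection
    c3 = layers.Conv2D(64, 3, activation="relu", padding="same")(u2)
    u1 = layers.UpSampling2D(2)(c3)                    # 60 -> 120
    u1 = layers.Concatenate()([u1, c1])
    out = layers.Conv2D(1, 1, activation="relu")(u1)   # frame 10 minutes ahead
    model = keras.Model(inp, out)
    model.compile(optimizer="adam", loss="mae")        # MAE enters the Score index
    return model
```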


Automatic Extraction of Eye and Mouth Fields from Face Images using MultiLayer Perceptrons and Eigenfeatures (고유특징과 다층 신경망을 이용한 얼굴 영상에서의 눈과 입 영역 자동 추출)

  • Ryu, Yeon-Sik;O, Se-Yeong
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.37 no.2
    • /
    • pp.31-43
    • /
    • 2000
  • This paper presents a novel algorithm for extraction of the eye and mouth fields (facial features) from 2D gray-level face images. First, it was found that eigenfeatures, derived from the eigenvalues and eigenvectors of the binary edge data set constructed from the eye and mouth fields, are very good features for locating these fields. The eigenfeatures, extracted from positive and negative training samples of the facial features, are used to train a MultiLayer Perceptron (MLP) whose output indicates the degree to which a particular image window contains the eye or the mouth. Second, to ensure robustness, an ensemble network consisting of multiple MLPs is used instead of a single MLP; the output of the ensemble network is the average of the field locations found by the constituent MLPs. Finally, to reduce computation time, a coarse search region for the eyes and mouth is extracted using prior information about face images. The advantages of the proposed approach are that only a small number of frontal faces is sufficient to train the nets, and that the nets generalize well to non-frontal poses and even to other people's faces. It was also experimentally verified that the proposed algorithm is robust against slight variations in facial size and pose, owing to the generalization characteristics of neural networks.
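
A minimal sketch of the eigenfeature-plus-ensemble idea, assuming scikit-learn; the component count, hidden-layer size, and ensemble size are illustrative, and the binary edge windows are taken as already extracted.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

def train_ensemble(edge_windows, labels, n_nets=5, n_components=10):
    """labels: 1 if a window contains the eye (or mouth), else 0."""
    flat = edge_windows.reshape(len(edge_windows), -1)
    pca = PCA(n_components=n_components).fit(flat)       # eigenfeatures
    feats = pca.transform(flat)
    mlps = [MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                          random_state=i).fit(feats, labels)
            for i in range(n_nets)]                      # differently seeded MLPs
    return pca, mlps

def ensemble_locate(pca, mlps, windows, positions):
    """positions: (row, col) center of each candidate window.
    Average the best position chosen by each constituent MLP."""
    feats = pca.transform(windows.reshape(len(windows), -1))
    picks = [positions[np.argmax(m.predict_proba(feats)[:, 1])] for m in mlps]
    return np.mean(picks, axis=0)
```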


Classification of Liver Cirrhosis Stages using Magnetic Resonance Images (자기공명영상을 이용한 간경변 단계별 분류에 관한 연구)

  • Park, Byung-Rae;Jeon, Gye-Rok
    • Journal of radiological science and technology
    • /
    • v.26 no.1
    • /
    • pp.71-82
    • /
    • 2003
  • In this paper, we propose a classifier of liver cirrhosis stage using T1-weighted MRI (magnetic resonance imaging) and a hierarchical neural network. The data sets for classifying each stage (normal, type 1, type 2, and type 3) were obtained at Pusan National University Hospital from June 2001 to December 2001, 46 cases in total. We extracted the liver region and nodule regions from the T1-weighted MR liver images and built an objective classifier of cirrhosis stage from them. The classifier was implemented as a hierarchical neural network that uses gray-level analysis and texture feature descriptors to distinguish normal liver from the three types of cirrhosis, and it was trained with the error back-propagation algorithm. The classification results show recognition rates of 100% for normal, 82.3% for type 1, 86.7% for type 2, and 83.7% for type 3. The recognition ratio is very high when the quantified results are compared with the doctors' decisions. Given enough data and additional parameters, we expect the neural network to perform as well as human experts and to be useful as a clinical decision-support tool for liver cirrhosis patients.
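
A minimal sketch of one plausible reading of the hierarchical idea, splitting a normal-vs-cirrhotic decision from the type-staging decision; this assumes scikit-learn MLPs (trained by gradient back-propagation) on pre-extracted gray-level and texture features, with illustrative layer sizes.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

stage1 = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000)  # normal vs cirrhotic
stage2 = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000)  # type 1 / 2 / 3

def fit_hierarchy(X, y):
    """X: feature vectors per image; y: 0 = normal, 1..3 = cirrhosis type."""
    stage1.fit(X, y > 0)               # first decide normal vs diseased
    stage2.fit(X[y > 0], y[y > 0])     # then stage the diseased cases

def predict_hierarchy(X):
    out = np.zeros(len(X), dtype=int)
    sick = stage1.predict(X).astype(bool)
    if sick.any():
        out[sick] = stage2.predict(X[sick])
    return out
```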


Development of Facial Expression Recognition System based on Bayesian Network using FACS and AAM (FACS와 AAM을 이용한 Bayesian Network 기반 얼굴 표정 인식 시스템 개발)

  • Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.19 no.4
    • /
    • pp.562-567
    • /
    • 2009
  • As a key mechanism of human emotional interaction, facial expression is a powerful tool in interfaces such as HRI (Human-Robot Interaction) and HCI (Human-Computer Interaction). By reading a facial expression, a system can produce reactions that correspond to the user's emotional state, and service agents such as intelligent robots can infer which services to offer the user. In this article, we address expressive face modeling using an advanced active appearance model for facial emotion recognition. We consider the six universal emotion categories defined by Ekman. In the human face, emotions are most widely expressed through the eyes and mouth, so recognizing emotion from a facial image requires extracting feature points such as Ekman's Action Units (AU). The Active Appearance Model (AAM) is one of the most commonly used methods for facial feature extraction and can be applied to construct AUs. Because the traditional AAM depends on the initial parameter settings of the model, this paper introduces a facial emotion recognition method that combines an advanced AAM with a Bayesian network. First, we obtain the reconstructive parameters of a new gray-scale image by sample-based learning, use them to reconstruct the shape and texture of the new image, and calculate the initial AAM parameters from the reconstructed facial model. We then reduce the distance error between the model and the target contour by adjusting the model parameters. Finally, after several iterations, we obtain a model matched to the facial feature outline and use it to recognize the facial emotion with the Bayesian network.
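
A minimal sketch of the final inference step, modeling emotion -> Action Units as a naive Bayesian network; the AU set and all probabilities below are illustrative placeholders, not the paper's learned values.

```python
import numpy as np

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]
AUS = ["AU4", "AU6", "AU12", "AU15"]        # brow lowerer, cheek raiser, etc.

prior = np.full(len(EMOTIONS), 1 / len(EMOTIONS))
# p_au[e, a] = P(AU a active | emotion e); illustrative numbers only.
p_au = np.array([
    [0.9, 0.1, 0.1, 0.3],   # anger
    [0.7, 0.2, 0.1, 0.5],   # disgust
    [0.8, 0.1, 0.1, 0.2],   # fear
    [0.1, 0.9, 0.9, 0.1],   # happiness
    [0.6, 0.1, 0.1, 0.8],   # sadness
    [0.2, 0.3, 0.2, 0.1],   # surprise
])

def infer(observed):
    """observed: dict AU name -> 0/1, e.g. from the AAM fitting stage."""
    post = prior.copy()
    for j, au in enumerate(AUS):
        post *= p_au[:, j] if observed[au] else 1 - p_au[:, j]
    return dict(zip(EMOTIONS, post / post.sum()))

print(infer({"AU4": 0, "AU6": 1, "AU12": 1, "AU15": 0}))  # happiness most likely
```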

Non-alcoholic Fatty Liver Disease Classification using Gray Level Co-Occurrence Matrix and Artificial Neural Network on Non-alcoholic Fatty Liver Ultrasound Images (비알콜성 지방간 초음파 영상에 GLCM과 인공신경망을 적용한 비알콜성 지방간 질환 분류)

  • Ji-Yul Kim;Soo-Young Ye
    • Journal of the Korean Society of Radiology
    • /
    • v.17 no.5
    • /
    • pp.735-742
    • /
    • 2023
  • Non-alcoholic fatty liver disease is an independent risk factor for cardiovascular disease, diabetes, hypertension, and kidney disease, and its clinical importance has recently been increasing. In this study, we extract feature values by applying GLCM, a texture analysis method, to ultrasound images of patients with non-alcoholic fatty liver disease, and we apply an artificial neural network model to the extracted features to classify the degree of fat deposition as normal liver, mild fatty liver, moderate fatty liver, or severe fatty liver. After applying the GLCM algorithm, the mean values of the Autocorrelation, Sum of squares, Sum average, and Sum variance parameters tended to increase as the disease progressed from mild to moderate to severe fatty liver. These four parameters, extracted by applying the GLCM algorithm to the ultrasound images, were used as inputs to the artificial neural network model. Classification accuracy, evaluated by applying the extracted GLCM features to the artificial neural network, reached a high 92.5%. We present these results as baseline data for future GLCM texture analysis studies of ultrasound images of patients with non-alcoholic fatty liver disease.
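
A minimal sketch of the four GLCM features named above, assuming scikit-image for the co-occurrence matrix; the distance, angle, and gray-level quantization are illustrative choices.

```python
import numpy as np
from skimage.feature import graycomatrix

def glcm_features(roi, levels=64):
    """Autocorrelation, sum of squares, sum average, sum variance of one ROI."""
    q = (roi.astype(float) / roi.max() * (levels - 1)).astype(np.uint8)
    p = graycomatrix(q, distances=[1], angles=[0], levels=levels,
                     symmetric=True, normed=True)[:, :, 0, 0]
    i, j = np.indices(p.shape)
    mu = np.sum(i * p)                                   # mean gray level
    autocorr = np.sum(i * j * p)
    sum_sq = np.sum((i - mu) ** 2 * p)                   # sum of squares: variance
    k = np.arange(2 * levels - 1)
    p_xy = np.array([p[(i + j) == s].sum() for s in k])  # distribution of i + j
    sum_avg = np.sum(k * p_xy)
    sum_var = np.sum((k - sum_avg) ** 2 * p_xy)
    return autocorr, sum_sq, sum_avg, sum_var
```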

Hybrid Neural Classifier Combined with H-ART2 and F-LVQ for Face Recognition

  • Kim, Do-Hyeon;Cha, Eui-Young;Kim, Kwang-Baek
• Proceedings of the Institute of Control, Robotics and Systems Conference
    • /
    • 2005.06a
    • /
    • pp.1287-1292
    • /
    • 2005
  • This paper presents an effective pattern classification model, designing artificial neural network based pattern classifiers for face recognition. First, an RGB image captured from a frame grabber is converted into an HSV image, which is closer to the human visual system. The coarse facial region is then extracted using the hue (H) and saturation (S) components while excluding the intensity (V) component, which is sensitive to environmental illumination. Next, the fine facial region is extracted by matching against edge- and gray-based templates. To obtain an illumination-invariant, high-quality facial image, histogram equalization and intensity compensation using an illumination plane are performed. The extracted and enhanced facial images are then used to train the pattern classification models. The proposed H-ART2 model, with hierarchical ART2 layers, and the F-LVQ model, optimized by fuzzy membership, make it possible to classify facial patterns by optimizing the relations of clusters and searching clustered reference patterns effectively. Experimental results show that the proposed face recognition system matches the SVM model, well known in the face recognition field, in recognition rate and surpasses it in classification speed. Moreover, a high recognition rate could be obtained by combining the proposed neural classification models.
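
A minimal sketch of the coarse facial-region and illumination-normalization steps, assuming OpenCV; the hue and saturation thresholds are generic skin-tone ranges, not the paper's values.

```python
import cv2
import numpy as np

def coarse_face_mask(bgr):
    """Skin-like region from H and S only, ignoring the illumination-sensitive V."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)     # frame-grabber image -> HSV
    h, s, v = cv2.split(hsv)
    mask = ((h < 20) | (h > 170)) & (s > 40) & (s < 200)   # thresholds illustrative
    return mask.astype(np.uint8) * 255

def normalize_face(gray_face):
    """Histogram equalization as a simple illumination normalization."""
    return cv2.equalizeHist(gray_face)
```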


Web-based Real-time 3D Video Communication System for Reality Teleconferencing

  • Ko, Jung-Hwan;Kim, Dong-Kyu;Hwang, Dong-Chun;Kim, Eun-Soo
• Proceedings of the Korean Information Display Society Conference
    • /
    • 2005.07b
    • /
    • pp.1611-1614
    • /
    • 2005
  • In this paper, a new multi-view 3D video communication system for real-time reality teleconferencing is proposed using IEEE 1394 digital cameras, an Intel Xeon server computer, and Microsoft's DirectShow programming library, and its performance is analyzed in terms of image-grabbing frame rate and number of views. The captured two-view image data are compressed by extracting the disparity between the views and transmitted to a client system over the communication network, where multiple views are synthesized from the received two-view data using an intermediate-view reconstruction technique and shown on a multi-view 3D display system. Experimental results show that the proposed system can display 16-view 3D images with 8-bit grayscale at a frame rate of 15 fps.
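
A minimal sketch of intermediate-view reconstruction from one view plus a disparity map, assuming NumPy; forward warping by a scaled disparity is the core idea, with hole filling omitted.

```python
import numpy as np

def intermediate_view(left, disparity, alpha):
    """Warp the left view toward the right by a fraction alpha in [0, 1]."""
    h, w = left.shape                        # 8-bit grayscale image
    out = np.zeros_like(left)
    cols = np.arange(w)
    for y in range(h):
        x_new = np.clip(cols + (alpha * disparity[y]).astype(int), 0, w - 1)
        out[y, x_new] = left[y, cols]        # forward-map each pixel; holes remain
    return out

# Sixteen views interpolated between the two captured cameras:
# views = [intermediate_view(left, disp, a) for a in np.linspace(0, 1, 16)]
```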


Brain MR Multimodal Medical Image Registration Based on Image Segmentation and Symmetric Self-similarity

  • Yang, Zhenzhen;Kuang, Nan;Yang, Yongpeng;Kang, Bin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.3
    • /
    • pp.1167-1187
    • /
    • 2020
  • With the development of medical imaging technology, image registration has been widely used in disease diagnosis. Registration between different modal images of brain magnetic resonance (MR) is particularly important for diagnosing brain diseases. However, previous registration methods do not take advantage of the prior knowledge of bilateral brain symmetry, and differences in the gray-scale information of different modal images increase the difficulty of registration. In this paper, a multimodal medical image registration method based on image segmentation and symmetric self-similarity is proposed. The method registers images using modality-independent self-similarity information and modal consistency information. In particular, we propose two novel symmetric self-similarity constraint operators to constrain the segmented medical images, and we convert each modal medical image into a unified modality for multimodal image registration. Experimental results show that the proposed method effectively reduces the error of brain MR multimodal registration under rotation and translation transformations (0.43 mm and 0.60 mm on average, respectively), with better accuracy than state-of-the-art image registration methods.
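
A rough sketch of a patch self-similarity descriptor in the spirit of the modality-independent features described above (not the paper's exact symmetric operators), assuming NumPy; patch size and offsets are illustrative.

```python
import numpy as np

def self_similarity(img, patch=3, offsets=((0, 1), (1, 0), (0, -1), (-1, 0))):
    """Per-pixel descriptor: similarity of a patch to each shifted neighbor patch."""
    r = patch // 2
    pad = np.pad(img.astype(float), r + 1, mode="edge")
    h, w = img.shape
    feats = []
    for dy, dx in offsets:
        # Sum of squared patch differences between each pixel and its neighbor.
        d2 = np.zeros((h, w))
        for py in range(-r, r + 1):
            for px in range(-r, r + 1):
                a = pad[r + 1 + py:r + 1 + py + h, r + 1 + px:r + 1 + px + w]
                b = pad[r + 1 + dy + py:r + 1 + dy + py + h,
                        r + 1 + dx + px:r + 1 + dx + px + w]
                d2 += (a - b) ** 2
        feats.append(np.exp(-d2 / (d2.mean() + 1e-8)))   # modality-independent
    return np.stack(feats, axis=-1)
```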

Caption Detection and Recognition for Video Image Information Retrieval (비디오 영상 정보 검색을 위한 문자 추출 및 인식)

  • 구건서
    • Journal of the Korea Computer Industry Society
    • /
    • v.3 no.7
    • /
    • pp.901-914
    • /
    • 2002
  • In this paper, we propose an efficient method for automatic caption detection and localization, together with caption recognition using an FE-MCBP (Feature Extraction based Multichained BackPropagation) neural network, for content-based video retrieval. Frames are sampled from the video at a fixed time interval, and key frames are selected by a gray-scale histogram method. For each key frame, segmentation is performed and caption lines are detected with a line-scan method; finally, individual characters are separated. The method improves speed and efficiency by performing color segmentation with local-maximum analysis before line scanning. Caption detection is the first stage of multimedia database organization; the detected captions are used as input to the text recognition system, and the recognized captions can then be searched by content-based retrieval.
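
A minimal sketch of the gray-scale-histogram key-frame step, assuming OpenCV; the sampling interval and difference threshold are illustrative.

```python
import cv2
import numpy as np

def key_frames(path, step=30, thresh=0.25):
    """Sample every `step` frames; keep frames whose gray histogram shifts."""
    cap = cv2.VideoCapture(path)
    keys, prev, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:                            # fixed sampling interval
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            hist = cv2.calcHist([gray], [0], None, [64], [0, 256]).ravel()
            hist /= hist.sum()
            if prev is None or np.abs(hist - prev).sum() > thresh:
                keys.append(frame)                     # histogram changed: key frame
            prev = hist
        idx += 1
    cap.release()
    return keys
```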
