• Title/Summary/Keyword: segmentation error rate


Object-based Stereoscopic Video Coding Using Image Segmentation and Prediction (영역분할 및 예측을 통한 객체기반 스테레오 동영상 부호화)

  • 권순규;배태면;한규필;정의윤;하영호
    • The Journal of Korean Institute of Communications and Information Sciences, v.24 no.12B, pp.2349-2358, 1999
  • This paper presents an object-based stereoscopic video coding scheme. Conventional stereoscopic video coding based on the block matching algorithm (BMA) for low-bit-rate transmission produces image prediction errors such as block artifacts and mosquito phenomena. To reduce these errors, an object-based coding scheme is adopted. The proposed scheme consists of preprocessing, object extraction, and object update procedures. The preprocessing procedure extracts non-object regions that have low reliability for motion and disparity estimation, which prevents the extraction of inaccurate objects. For better prediction of the left-channel image, disparity information is added to the object extraction. The proposed algorithm also reduces accumulated error through the object update procedure, which detects newly emerging objects, merges objects that have the same object disparity and object motion, and splits objects that have a large image prediction error. Experimental results show that the proposed algorithm improves prediction quality without block artifacts or mosquito phenomena.


Efficient VLSI Architecture for Disparity Calculation based on Geodesic Support-weight (Geodesic Support-weight 기반 깊이정보 추출 알고리즘의 효율적인 VLSI 구조)

  • Ryu, Donghoon;Park, Taegeun
    • Journal of the Institute of Electronics and Information Engineers, v.52 no.9, pp.45-53, 2015
  • Adaptive support-weight based algorithms can produce better disparity maps than generic area-based algorithms and can also be implemented as real-time systems. In this paper, we propose a real-time system based on geodesic support-weight, which performs better segmentation of objects within the window. The data scheduling is analyzed for efficient hardware design and better performance, and a parallel architecture is proposed for the weight update, which incurs the longest delay. The exponential function is implemented efficiently as a simple step function, guided by careful error analysis. The proposed architecture is designed in Verilog HDL and synthesized using the Dongbu HiTek 0.18 µm standard cell library. The proposed system shows an error rate of 2.22% and can run at an operating frequency of up to 260 MHz (25 fps) with 182K gates.

A Study on the Voice Dialing using HMM and Post Processing of the Connected Digits (HMM과 연결 숫자음의 후처리를 이용한 음성 다이얼링에 관한 연구)

  • Yang, Jin-Woo;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea, v.14 no.5, pp.74-82, 1995
  • This paper studies voice dialing using HMM and post-processing of connected digits. The HMM algorithm is widely used in speech recognition with good results. However, maximum likelihood estimation in HMM (Hidden Markov Model) training does not lead to values that maximize the recognition rate. To solve this problem, we applied post-processing to the segmental K-means procedure in the recognition experiment. Korean connected digits are influenced by prolongation more than English connected digits. To decrease the segmentation error in the level building algorithm, some word models that can be produced by prolongation are added. Rules for the added models are applied to the recognition result, which is then updated. The recognition system was implemented with a DSP board having a TMS320C30 processor and an IBM PC. The reference patterns were made by 3 male speakers in a noisy laboratory. The recognition experiment was performed on 21 kinds of telephone numbers (252 data items). The recognition rate was $6\%$ in the speaker-dependent test and $80.5\%$ in the speaker-independent test.


Automatic Liver Segmentation of a Contrast Enhanced CT Image Using a Partial Histogram Threshold Algorithm (부분 히스토그램 문턱치 알고리즘을 사용한 조영증강 CT영상의 자동 간 분할)

  • Kyung-Sik Seo;Seung-Jin Park;Jong An Park
    • Journal of Biomedical Engineering Research, v.25 no.3, pp.189-194, 2004
  • Pixel values of contrast-enhanced computed tomography (CE-CT) images vary randomly. In addition, the liver structure is difficult to segregate in the middle part of the liver because the pancreas in the abdomen has similar gray-level values. In this paper, an automatic liver segmentation method using a partial histogram threshold (PHT) algorithm is proposed to overcome the randomness of CE-CT images and to remove the pancreas. After histogram transformation, an adaptive multi-modal threshold is used to find the range of gray-level values of the liver structure. The PHT algorithm is then applied to remove the pancreas, and morphological filtering is performed to remove unnecessary objects and smooth the boundary. Four CE-CT slices of eight patients were selected to evaluate the proposed method. The averages of the normalized average area for the automatic segmented method II (ASM II) using the PHT and for the manual segmented method (MSM) are 0.1671 and 0.1711, respectively, so the two methods differ very little. The average area error rate between ASM II and MSM is 6.8339%. The experimental results indicate that the proposed method performs similarly to MSM carried out by a medical doctor.
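
The abstract does not define its area measures, so the following is only a minimal sketch of one plausible reading: the normalized area is assumed to be the fraction of image pixels labeled as liver, and the area error rate is assumed to be the relative difference between the automatic and manual areas. The function names `normalized_area` and `area_error_rate` are hypothetical and do not come from the paper.

```python
def normalized_area(mask):
    """Fraction of pixels set to 1 in a binary mask (assumed normalization)."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    return sum(sum(row) for row in mask) / total


def area_error_rate(auto_mask, manual_mask):
    """Relative area difference in percent, with the manual mask as reference.

    This definition is an assumption; the paper may compute it differently.
    """
    manual = normalized_area(manual_mask)
    if manual == 0:
        raise ValueError("manual mask is empty; error rate is undefined")
    auto = normalized_area(auto_mask)
    return abs(auto - manual) / manual * 100.0


# Toy 4x4 masks: the manual mask has 8 liver pixels, the automatic one has 7.
manual_mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
auto_mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

error_percent = area_error_rate(auto_mask, manual_mask)
print(f"area error rate: {error_percent:.2f}%")
```

Note that an area-based measure compares sizes only; two masks with equal areas but different shapes would still yield an error rate of zero.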

2D/3D image Conversion Method using Simplification of Level and Reduction of Noise for Optical Flow and Information of Edge (Optical flow의 레벨 간소화 및 노이즈 제거와 에지 정보를 이용한 2D/3D 변환 기법)

  • Han, Hyeon-Ho;Lee, Gang-Seong;Lee, Sang-Hun
    • Journal of the Korea Academia-Industrial cooperation Society, v.13 no.2, pp.827-833, 2012
  • In this paper, we propose an improved optical flow algorithm that reduces both computational complexity and noise level. The algorithm reduces computation time by applying a level simplification technique and removes noise by using the eigenvectors of objects. Optical flow is one of the accurate algorithms for generating depth information from two image frames, using vectors that track the motion of pixels. However, the technique has the disadvantage of very long computation time because of its pixel-based calculation, and it can cause noise problems. The level simplification technique is applied to reduce the computation time, and noise is removed by applying optical flow only to areas that have eigenvectors and then using the edge image to generate the depth information of the background area. Three-dimensional images were created from two-dimensional images using the proposed method, which first generates the depth information and then converts the image into a three-dimensional image using that depth information and the DIBR (Depth Image Based Rendering) technique. The error rate was obtained using SSIM (Structural SIMilarity index).

Distance Measurement of the Multi Moving Objects using Parallel Stereo Camera in the Video Monitoring System (영상감시 시스템에서 평행식 스테레오 카메라를 이용한 다중 이동물체의 거리측정)

  • 김수인;이재수;손영우
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers, v.18 no.1, pp.137-145, 2004
  • This paper proposes a new algorithm for segmenting multiple moving objects in three-dimensional space and a method for measuring the distance from the camera to each moving object using a stereo video monitoring system. The left and right input images are obtained from the stereo video monitoring system, and the areas of the multiple moving objects are segmented using an adaptive threshold and PRA (pixel recursive algorithm). Each object is segmented by a window mask, and the coordinate values and stereo disparity of each moving object are then obtained from the window masks. The distance to each moving object can be calculated from this disparity, the characteristics of the stereo vision system, and trigonometric functions. In the experimental results, the error rate of the distance measurement was within 7.28%; therefore, the proposed algorithm is expected to be practically applicable to stereo security systems, automatic moving robot systems, and stereo remote control systems.
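
The abstract does not give its distance formula, so the sketch below uses the textbook relation for an ideal parallel-axis stereo pair, Z = f · B / d, where f is the focal length in pixels, B is the baseline, and d is the horizontal disparity. The function names and all numeric values are illustrative assumptions, not data from the paper.

```python
def distance_from_disparity(focal_px, baseline_m, x_left_px, x_right_px):
    """Distance (m) to a point seen at x_left_px and x_right_px.

    Assumes an ideal, rectified parallel-axis stereo pair, which may be
    simpler than the geometry used in the paper.
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


def distance_error_rate(measured_m, true_m):
    """Relative distance error in percent (assumed definition of 'error rate')."""
    if true_m <= 0:
        raise ValueError("true distance must be positive")
    return abs(measured_m - true_m) / true_m * 100.0


# Hypothetical setup: 800 px focal length, 10 cm baseline, 20 px disparity.
measured = distance_from_disparity(800.0, 0.1, 420.0, 400.0)
print(f"estimated distance: {measured:.2f} m")
print(f"error vs. a 4.2 m ground truth: {distance_error_rate(measured, 4.2):.2f}%")
```

Because distance is inversely proportional to disparity, a one-pixel disparity error matters far more for distant objects than for near ones.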

Speech Recognition on Korean Monosyllable using Phoneme Discriminant Filters (음소판별필터를 이용한 한국어 단음절 음성인식)

  • Hur, Sung-Phil;Chung, Hyun-Yeol;Kim, Kyung-Tae
    • The Journal of the Acoustical Society of Korea, v.14 no.1, pp.31-39, 1995
  • In this paper, we construct phoneme discriminant filters (PDFs) based on the linear discriminant function. These discriminant filters are derived not from experts' heuristic rules but from mathematical methods with iterative learning. The proposed system is based on the piecewise linear classifier and the error correction learning method. Segmentation of speech and classification of phonemes are carried out simultaneously by the PDFs. Because each of them operates independently, some speech intervals may have multiple outputs. We therefore introduce unified coefficients through an output unification process. Sometimes, however, the output has a region that shows no response (an insensitive region), so we propose time windows and median filters to remove such problems. We trained the system with 549 monosyllables uttered 3 times by 3 male speakers. After the endpoints of the speech signal are detected using a threshold value and the zero crossing rate, vowels and consonants are separated by the PDF, and the selected phoneme then passes through the following PDF. Finally, the system unifies the outputs for competitive regions or insensitive areas using the time window and median filter.
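
Two of the signal-processing steps named in the abstract, the zero crossing rate used for endpoint detection and the median filter used to clean up filter outputs, can be sketched generically as follows. This is not the authors' implementation: the frame values, the window size, and the convention that label 0 means "no response" are assumptions made for illustration.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero-valued samples are treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def median_filter(sequence, window=3):
    """Sliding median with edge replication; window must be odd."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    if not sequence:
        return []
    half = window // 2
    padded = [sequence[0]] * half + list(sequence) + [sequence[-1]] * half
    return [sorted(padded[i:i + window])[half] for i in range(len(sequence))]


# A rapidly alternating frame has a high zero crossing rate.
print(zero_crossing_rate([0.4, -0.3, 0.5, -0.2]))

# Hypothetical per-frame phoneme labels where 0 marks a frame with no response.
labels = [1, 1, 0, 1, 1]
print(median_filter(labels, window=3))
```

A median filter suits this smoothing task because it removes isolated outliers without blurring the boundaries between longer runs of the same label.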


Performance Improvement of Pedestrian Detection using a GM-PHD Filter (GM-PHD 필터를 이용한 보행자 탐지 성능 향상 방법)

  • Lee, Yeon-Jun;Seo, Seung-Woo
    • Journal of the Institute of Electronics and Information Engineers, v.52 no.12, pp.150-157, 2015
  • Pedestrian detection has been widely researched as an important technology for autonomous vehicles and accident prevention. Pedestrian detection methods fall into two categories: camera-based and LIDAR-based. LIDAR-based methods have the advantages of a wide angle of view and insensitivity to illuminance changes, which camera-based methods lack. However, 3D LIDAR has several problems, such as insufficient resolution for detecting distant pedestrians and a lower detection rate in complex situations due to segmentation error and occlusion. In this paper, two methods using the GM-PHD filter are proposed to improve the poor detection rates of pedestrian detection algorithms based on 3D LIDAR. The first improves detection performance and object resolution by automatically accumulating points from previous frames onto current objects. The second further enhances the detection results by applying a GM-PHD filter modified to handle poor conditions for classified multiple targets. A quantitative evaluation with autonomously acquired road environment data shows that the proposed methods greatly increase the performance of existing pedestrian detection algorithms.

Word Recognition using Fuzzy Inference based on LPC (선형예측계수에 기초한 퍼지추론 단어 인식)

  • Choi, Seung-Ho;Kim, Hyeong-Geun
    • The Journal of the Acoustical Society of Korea, v.13 no.1, pp.32-41, 1994
  • To handle the frequency variation of speech patterns consisting of LPC sequences, a new membership function is proposed, derived from the LPC, the spectrum, and the relation between the LPC order and the spectrum. To handle time variation, a multi-section equi-segmentation method, which divides the speech section equally into several sections, is applied. False recognition mainly occurs when the same syllable is placed in the same utterance. To reduce this error, fuzzy inference is executed using the proposed membership function, weights are assigned to the sectional certainty, and a decision method is then applied to the recognized sections up to the third candidate. To validate this method, we conducted a recognition test on 28 DDD area names. The recognition rate of fuzzy inference with the triangular membership function is $92\%$, that of the combined fuzzy inference and decision method is $92.9\%$, and that of fuzzy inference with the proposed membership function is $93.8\%$.


Emergency dispatching based on automatic speech recognition (음성인식 기반 응급상황관제)

  • Lee, Kyuwhan;Chung, Jio;Shin, Daejin;Chung, Minhwa;Kang, Kyunghee;Jang, Yunhee;Jang, Kyungho
    • Phonetics and Speech Sciences, v.8 no.2, pp.31-39, 2016
  • In emergency dispatching at the 119 Command & Dispatch Center, some inconsistencies between the 'standard emergency aid system' and the 'dispatch protocol,' both of which are mandatory to follow, cause inefficiency in the dispatcher's performance. If an emergency dispatch system uses automatic speech recognition (ASR) to process the dispatcher's protocol speech during case registration, it can instantly extract and provide the required information specified in the 'standard emergency aid system,' making the rescue command more efficient. For this purpose, we have developed a Korean large vocabulary continuous speech recognition system for 400,000 words to be used in the emergency dispatch system. The 400,000 words include vocabulary from news, SNS, blogs, and emergency rescue domains. The acoustic model is constructed using 1,300 hours of telephone call (8 kHz) speech, whereas the language model is constructed using a 13 GB text corpus. From the transcribed corpus of 6,600 real telephone calls, call logs with the emergency rescue command class and identified major symptom are extracted in connection with the rescue activity log and the National Emergency Department Information System (NEDIS). ASR is applied to the emergency dispatcher's repetition utterances about the patient information. The emergency patient information is extracted based on the Levenshtein distance between the ASR result and the template information. Experimental results show that the emergency dispatch system achieves a Word Error Rate of 9.15% for speech recognition and an emergency response detection performance of 95.8%.
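
Both the Levenshtein distance and the Word Error Rate mentioned in the abstract can be illustrated with one standard dynamic-programming routine. The sketch below is generic: the example sentences are invented, whitespace tokenization is an assumption (Korean ASR evaluation may tokenize differently), and the paper's template-matching procedure is not reproduced.

```python
def levenshtein(reference, hypothesis):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn `reference` into `hypothesis` (any two sequences)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_error_rate(reference_text, hypothesis_text):
    """Word-level edit distance divided by the number of reference words."""
    reference_words = reference_text.split()
    if not reference_words:
        raise ValueError("reference must contain at least one word")
    return levenshtein(reference_words, hypothesis_text.split()) / len(reference_words)


# Invented example: one substitution and one deletion over four reference words.
reference = "patient has chest pain"
hypothesis = "patient was chest"
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.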