Proceedings of the IEEK Conference (대한전자공학회:학술대회논문집)
The Institute of Electronics and Information Engineers (IEIE)
- Other
2001.06d
-
We propose a real-time numeric caption recognition algorithm that automatically recognizes numeric captions generated by computer graphics (CG) and displays a modified caption based on the recognized value, but only when a relevant numeric caption appears in a specified region of a live sportscast produced by another broadcasting station. After acquiring the sports broadcast frames in real time with a frame grabber, we extract a mesh feature from the enhanced binary image as the feature vector and recover the value from the numeric image by classifying each character with a neural network. Finally, the result is verified by a knowledge-based rule set designed for more stable and reliable output and is displayed on screen as the converted CG caption. We currently operate a real-time automatic mile-to-kilometer caption conversion system based on this algorithm for the regular Major League Baseball (MLB) program broadcast live throughout Korea over our nationwide network. The system automatically converts captions in miles, the unit commonly used in the United States, into kilometers, the unit familiar to most Koreans, in real time, and it has been favorably received by the TV audience.
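The final conversion and verification step can be sketched as follows. This is only an illustration: the abstract does not describe the rule set, so the plausibility range (40 to 110 mph) and the name `convert_caption` are assumptions made here, and the recognizer output is assumed to arrive as a digit string.

```python
# Illustrative sketch only: the range limits and function name are assumptions,
# not the knowledge-based rule set used in the paper.
MILES_TO_KM = 1.609344  # international mile expressed in kilometers


def convert_caption(recognized, low_mph=40, high_mph=110):
    """Return the speed in km/h, or None if the recognized string is rejected."""
    # Reject anything that is not a plain ASCII digit string (recognition error).
    if not (recognized.isascii() and recognized.isdigit()):
        return None
    mph = int(recognized)
    # Hypothetical plausibility rule: ignore values outside the expected range.
    if not low_mph <= mph <= high_mph:
        return None
    return round(mph * MILES_TO_KM)


print(convert_caption("95"))   # 153
print(convert_caption("9S"))   # None
```

Returning `None` on a failed check lets the caller leave the original caption untouched rather than display a wrong value.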
-
Point pattern matching schemes for fingerprint recognition do not guarantee robust matching performance for fingerprint images of poor quality. We present a fingerprint recognition scheme in which the transformation parameters of matched ridge pairs are estimated by the Hough transform and the matching hypothesis is verified by a new measure of matching degree that uses selective directional information. Because it is robust to variation in the minutia points, the proposed method can achieve an extremely low false accept ratio (FAR) while maintaining a low reject ratio, even for images of poor quality.
-
In the Dedicated Short Range Communication (DSRC) system channel, a large number of bit errors occur because of additive white Gaussian noise (AWGN) and fading. When image data are transmitted under these conditions, the quality of the reconstructed image is significantly degraded. In this paper, as an alternative to error correcting codes and/or automatic repeat request schemes, we propose an error recovery scheme for image data transmission. We first analyze how transmission errors in the DSRC system channel degrade image quality. Then, to improve image quality, we propose error-resilient and error-concealment schemes for still image transmission using DCT-based fixed length coding, a Hamming code, a cyclic redundancy check, and an interleaver. Finally, we demonstrate the performance of the scheme experimentally.
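How a Hamming code and an interleaver work together against burst errors can be sketched as below. The abstract does not state which Hamming code or interleaver depth was used, so the (7,4) code, the depth of four codewords, and all function names are assumptions chosen for illustration.

```python
# Minimal sketch, assuming a Hamming(7,4) code and a simple block interleaver;
# the paper's actual code parameters are not given in the abstract.

def hamming74_encode(d):
    """Encode 4 data bits into 7 bits laid out as [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(c):
    """Correct at most one bit error and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # parity over positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # parity over positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # parity over positions 4, 5, 6, 7
    pos = s1 + 2 * s2 + 4 * s3      # 1-based error position, 0 means no error
    if pos:
        c[pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


def interleave(codewords):
    """Transmit bit 0 of every codeword, then bit 1 of every codeword, and so on."""
    return [cw[i] for i in range(len(codewords[0])) for cw in codewords]


def deinterleave(stream, n_words):
    """Undo interleave(): every n_words-th bit belongs to the same codeword."""
    return [stream[k::n_words] for k in range(n_words)]


data = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 1, 1], [0, 0, 0, 1]]
stream = interleave([hamming74_encode(d) for d in data])
for i in range(8, 12):          # a burst of four consecutive bit errors
    stream[i] ^= 1
recovered = [hamming74_decode(cw) for cw in deinterleave(stream, len(data))]
print(recovered == data)        # True
```

Because the interleaver spreads the burst so that each codeword receives only one flipped bit, the single-error-correcting code can repair all four errors.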
-
Design and Implementation of Auto-Focusing, Auto-Exposure and Auto-White Balance Video Camera System
This paper studies a video camera system with auto-focus (AF), auto-exposure (AE), and auto-white balance (AWB), and designs an improved method for AF, AE, and AWB in such a system.
-
In this paper, we present a fast face detection algorithm that estimates the eye region using a neural network. To implement a real-time face detection system, it is necessary to reduce the search space, so we limit it to a few pairs of eye candidates. To select them, we first isolate possible eye regions quickly and robustly by modified histogram equalization. The eye candidates are then paired, and each pair is scored on how close it is to a true eye pair in two respects: how similar the two eye candidates are in shape, and how close each of them is to a true eye image. A multi-layer perceptron neural network is used to measure the closeness of an eye candidate region to the true eye image. The few best candidates are then verified by eigenfaces. The experimental results show that this approach is fast and reliable. We achieved a 94% detection rate with an average processing time of 0.1 s on a Pentium III PC in experiments on 424 gray-scale images from the MIT, Yale, and Yonsei databases.
-
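This paper proposes a method that exploits the multiple directionality of pixels to enhance fingerprint images degraded by noise. A fingerprint image consists of a regular arrangement of ridges and valleys, and the directionality of the pixels that make up these ridges and valleys can be distinguished from that of noise. A directional filter bank (DFB) passes specific directional bands of the input image and generates multiple subimages. We use a DFB to decompose the fingerprint image into multiple subimages, each with a specific directionality, and then use these subimages to enhance the fingerprint image.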
-
Visual speech information improves the performance of speech recognition, especially in noisy environments. We tested various spatio-temporal features for Korean lipreading and evaluated their performance using a hidden Markov model based classifier. The results show that the direction as well as the magnitude of the movement of the lip contour over time are useful features for lipreading.
-
In this paper, we introduce a face recognition method that improves the recognition rate and training time of the eigen system. To increase the recognition rate, we use Gabor filters. To offset the increase in training time caused by Gabor filtering, we extract new feature vectors built from the average and standard deviation. In the experiments, the improved system achieves a higher recognition rate and a shorter training time than the original eigen system.
-
The most important issues in gesture recognition are the simplification of the algorithm and the reduction of processing time. Mathematical morphology, which is based on geometrical set theory, is well suited to real-time processing. The key idea of the algorithm proposed in this paper is to apply morphological shape decomposition. The primitive elements extracted from a hand gesture carry important information, including the directivity of the gesture. On this basis, we propose a morphological hand-gesture recognition algorithm that uses feature vectors extracted from the lines connecting the center points of a main primitive element and the sub-primitive elements. In experiments, we applied the algorithm to a video content browsing system with natural interaction and demonstrated its efficiency.
-
Variable length coding (VLC) is used in many well-known standard video coding algorithms such as MPEG and H.26x. However, VLC cannot be processed in parallel because it is inherently sequential. This is a major barrier to implementing a real-time software video codec, since parallel schemes cannot be applied. In this paper, we propose a new fast variable length decoding (VLD) method based on the probability distribution of symbols in the VLC tables used in MPEG and H.263 standard codecs. Although MPEG suggests a table partitioning method, it does not show theoretically why the number of partitioned tables should be two or three. We propose a method for deciding the number of partitioned tables. Applying our scheme to several well-known MPEG-2 test sequences, we reduce the computation time by up to about 10% without sacrificing video quality.
-
Unequal error protection (UEP) is a reasonable scheme for transmitting compressed video at low bit rates, because it assigns a different error correction capability to each kind of data according to its source significance information. It can therefore adapt flexibly to the given channel environment during video transmission. This paper proposes joint source-channel coding through UEP that takes into account the hierarchical structure of H.263+ based video and the influence of transmission errors. In particular, it proposes an error-resilient video transmission technique that reduces the complexity of the channel coder and decoder by partitioning the video data within a frame. With the proposed algorithm, the quality of the reconstructed video in an error-prone environment can be increased without creating additional bits.
-
This paper presents an FPGA implementation of a wavelet-based codec that can compress 2-dimensional images. For real-time processing, a scheduling method for the input image data and a new multiplier-accumulator (MAC) structure for the wavelet transform are proposed. This study also proposes a global pipelining structure for the wavelet codec and an efficient buffering method at the interfaces between modules with different clock frequencies.
-
This paper presents a computationally efficient post-processing algorithm for HDTV. The proposed algorithm can reduce both blocking artifacts and mosquito noise while preserving the sharpness and naturalness of the reconstructed video signal. Simulation results show performance improvements over other techniques.
-
In this paper, we propose a new 3D object retrieval system that uses the shape information of 2D silhouette images. 2D images from different viewpoints are derived from a 3D model and linked to the model. The shape feature of each 2D image is extracted by a region-based descriptor. In the experiments, we compare the results of the proposed system with those of a system using curvature scale space (CSS) to show the efficiency of our system.
-
This paper proposes an automatic method for generating 3D skirt models and draping them in 3D. The method constructs a basic 3D skirt model from an elliptic cylinder and generates various 3D skirt models by controlling the ellipses of the basic model. A B-spline approximates the wrinkles around each ellipse, so that the 3D draping changes according to the design and the textile. We made several real skirts and examined the appearance of their drapes; the simulations of these appearances gave good results. Furthermore, adapting the 3D skirt model to an individual figure enables realistic coordination of various skirts.
-
In this paper, we propose a method for interpolating between two images obtained from two parallel cameras. The proposed method uses a bidirectional disparity map (BDM) to prevent holes caused by occlusion. Furthermore, we use a block-based disparity map (DM) to reduce the amount of computation, together with an adaptive block size to minimize the error of the block-based DM.
-
Probabilistic Estimation of Area- and Feature-based Stereo Matching Results Considering Uncertainty
This paper proposes a method for estimating the positions of line features extracted by stereo vision. Based on a given reference plane, the planar surfaces corresponding to that plane are first extracted, and features on these planar surfaces are then selected. Using the extended Kalman filter, the feature positions are estimated by combining area-based and feature-based stereo matching results. Experimental results show that the proposed method is feasible.
-
In this paper, we use an HVS (Human Visual System) watermarking method in which watermarks are embedded in the DFT domain. The HVS watermarking method is robust against attacks such as JPEG compression, filtering, and noise. However, when images are subjected to basic geometric attacks such as cropping, scaling, and rotation, the watermark may not be detected. We therefore introduce an HVS watermarking method that inserts references in the LSB (Least Significant Bit) domain of the image. Experimental results show that the proposed method is more robust to basic geometric attacks than the original HVS watermarking method.
-
In this paper, we introduce a new watermarking method using the discrete wavelet transform (DWT). The method has two features. First, the trade-offs between quality and robustness, and between quality and capacity, can be controlled. Second, the method uses a different scheme depending on the watermark. We also carried out numerical experiments for several kinds of attack and found that the proposed watermarking method is robust to these attacks.
-
In this paper, we propose a digital watermarking method for color still images that uses the characteristics of the human visual system and achromatic block information. We use a binary watermark signal and insert it in the chromatic component region of the YCrCb color space. The watermark signal is extracted by estimating the modified pattern of chromatic saturation, without using the original image. Experimental results show that the proposed method performs well in both embedding and extracting the watermark signal.
-
This paper develops an algorithm for segmenting gaseous objects on an image plane. Unlike rigid objects or solid non-rigid objects, gaseous objects vary in density even within a single object region, and their edge intensity differs from location to location. As a result, an edge detector may detect only the strong edges, and the detected edges may form only incomplete parts of the whole object boundary. Because of this property, it is not easy to distinguish the real edges of gaseous objects from noise-like edges such as those of leaves. Our algorithm uses two criteria, edge intensity and edge line connectivity, and then applies fuzzy sets to obtain a proper threshold for the edge detector.
-
In this paper, we propose a new approach for tracking a moving object in image sequences using active contour models and optical flow. In our approach, object segmentation is achieved by active contours, and object tracking is done by motion estimation based on optical flow. To capture more dynamic characteristics, Lagrangian dynamics is combined with the active contour models. For the optical flow computation, a method based on spatiotemporal energy models is employed to achieve robust tracking in poor environments. A prototype real-time tracking system has been developed and applied to a content-based video retrieval system.
-
In this paper, we present a vision algorithm and method for improving and preprocessing input images of dented and raised characters on the sidewall of tires. We define the optical condition relating the reflection coefficient and the reflectance through a physical vector calculation. The aim of this work is to recognize the engraved characters using computer vision techniques. In tire images, the characters and the background have almost the same gray levels, and the reflectance of the tire surface is low, so it is very difficult to segment the characters from the background. Moreover, one side of the character string is raised and the other is dented, so the captured images vary with the angles of the camera and the illumination. For optimum input images, the angle between the camera and the illumination was found to be within 90°. In addition, we used combined filtering with low-pass and high-pass filters to obtain clearer input images. Finally, we define an equation relating the reflection coefficient and the reflectance. In this way, we obtained tire images suitable for pattern recognition.
-
In this paper, we propose a new semiautomatic segmentation method using spatio-temporal similarity. In the proposed scheme, segmentation is performed using gradual region merging and bi-directional spatio-temporal refinement. Simulation results show the efficiency of the proposed method in semantic object extraction.
-
This paper proposes a shot boundary detection algorithm for video indexing and/or browsing. It improves on conventional methods based on frame differences and histogram differences. Instead of absolute frame differences, block-by-block relative frame differences are employed, and frame-adaptive threshold values are used for better detection. In cases where the frame differences are not sufficient to detect the shot boundary, histogram differences are selectively applied. Experimental results show that the proposed algorithm reduces both "false positive" and "false negative" errors, especially for videos with dynamic local and/or global motion.
-
This paper proposes an efficient memory scheduling method (E$^2$M$^2$) that makes real-time image compression using the 2-dimensional discrete wavelet transform (2-D DWT) possible in an FPGA chip. We assume that the 2-D DWT is performed as a Mallat tree. After the memory mapping method was verified in software, the memory controller was designed for a commercial SDRAM IC.
-
This paper proposes a method for classifying different kinds of video content using features of digital video. The video types considered are news, drama, show, sports, and talk programs. Features in the MPEG domain, such as the number of intra-coded macroblocks and the motion vectors in P-pictures, are used. The frame difference in YCbCr is also employed as a classification measure, as is the occurrence of cuts detected in the video. Finally, a three-layer back-propagation neural network is used to classify the video content.
-
This paper presents a dispersed-pulse and random codebook for a CELP coder. The coder operates on speech frames of 20 ms and generates an excitation vector by convolving dispersion vectors with the signed pulses of an algebraic codevector. This improves the pulse-based fixed codebook at low bit rates. In unvoiced and stationary noise regions, a high-performance fixed codebook is formed from a partial algebraic codebook and a random codebook. The proposed CELP coder operates at 4 kb/s and is compared with G.729 (8 kb/s CS-ACELP). Subjective testing shows better quality than the reference coders under some background noise conditions.
-
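In this paper, the MPEG-4 CELP coder, a speech compression algorithm, is implemented in the fixed-point arithmetic structure required for a 16-bit DSP. In place of the Chebyshev series method used in the basic algorithm to compute the LSP coefficients, we use the real root method, which is better suited to fixed-point implementation. Mathematical functions that are not supported by DSP instructions, such as cosine and log, are computed in advance and applied through table lookup, and division, which is unfavorable for fixed-point arithmetic, is avoided as far as possible. After conversion to the fixed-point structure, errors were minimized through comparison with the floating-point structure. When the implemented speech coder was applied to five sentences each from male and female speakers, we confirmed that there was no degradation in speech quality compared with the floating-point structure.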
-
A block filter implementation technique for uniform filter banks is presented in this paper. By applying block filters to the decimation and interpolation filters, it is shown that the down-samplers and up-samplers are cancelled out in the respective filters. Furthermore, applying block filters to uniform filter banks significantly reduces computational complexity, since the prototype filter can be shared across the channel implementations. It is also shown that the proposed implementation is a reconfigurable structure with respect to order variation.
-
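In this paper, we parallelize and accelerate the NTGST (noise-tolerant generalized symmetry transform) using multiple processors with a SIMD (Single Instruction stream, Multiple Data stream) parallel architecture. First, exploiting the computational independence of the NTGST across pixels and image regions, the image is partitioned and assigned to P processors. Each processor is then parallelized with a SIMD structure that processes N data items at once, giving a speedup proportional to NP. In experiments using two Pentium III processors with MMX technology, the proposed algorithm ran nearly 8 times faster than the conventional NTGST.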
-
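In this paper, we present a bit-wise comparison method for efficient Huffman decoding. Based on the binary-tree construction that underlies Huffman coding, the method reorganizes the Huffman table so as to shorten decoding time and simplify the algorithm. We verified its performance by applying it to the Huffman decoding part of an MPEG-2 AAC decoder.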
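The general idea of decoding a Huffman stream one bit at a time by walking a binary tree can be sketched as follows. This is a generic tree walk, not the table layout proposed in the paper, and the four-symbol codebook is a made-up example rather than an MPEG-2 AAC codebook.

```python
# Generic bit-by-bit Huffman decoding sketch; the codebook is hypothetical.

def build_tree(codebook):
    """Build a decoding tree from {symbol: bitstring}.

    Internal nodes are two-element lists [child_for_0, child_for_1];
    leaves are the symbols themselves.
    """
    root = [None, None]
    for symbol, code in codebook.items():
        node = root
        for bit in code[:-1]:
            b = int(bit)
            if node[b] is None:
                node[b] = [None, None]
            node = node[b]
        node[int(code[-1])] = symbol
    return root


def decode(bits, root):
    """Follow one branch per input bit and emit a symbol at every leaf."""
    out, node = [], root
    for bit in bits:
        node = node[int(bit)]
        if not isinstance(node, list):  # reached a leaf
            out.append(node)
            node = root
    return out


codebook = {"a": "0", "b": "10", "c": "110", "d": "111"}
print(decode("0101100111", build_tree(codebook)))  # ['a', 'b', 'c', 'a', 'd']
```

Each input bit costs one comparison and one pointer step, so the decoding time of a symbol is proportional to its code length.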
-
Many researchers have developed techniques for efficient 3-D sound systems based on the psychoacoustics of spatial hearing, for use in multimedia or virtual reality. In this paper, we propose an improved 3-D sound system that uses conventional stereo headphones to obtain better sound diffusion from mono sound recorded in an anechoic chamber. We use the HRTF (Head Related Transfer Function) for sound localization and a wavelet filter bank with time delay for sound diffusion. We investigate how the 3-D sound effect depends on the length of the time delay in the lowest frequency band. The correlation coefficient between the left-channel and right-channel signals is also measured to assess the sound diffusion.
-
In this paper, a new CIC filter structure for reducing the passband droop of a CIC decimation filter is proposed. To improve the passband characteristics, a first-order generalized comb filter (GCF) is added to the conventional CIC decimation filter. With this filter, the passband droop is markedly reduced and the aliasing attenuation is slightly increased. We also show how to find the optimum filter coefficient. Furthermore, it is shown that different choices of GCF are possible depending on the number of half-band filters. The passband droop and aliasing attenuation of the proposed structure are compared with those of CIC structures using conventional sharpening techniques.
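For context, a conventional CIC decimator and its passband droop can be sketched as below. This shows only the baseline structure that the paper sets out to improve, not the proposed GCF; the function names and the parameter choices (rate change R = 4, N = 3 stages, differential delay 1) are assumptions for illustration.

```python
import math

# Baseline CIC decimator sketch (integer arithmetic, differential delay 1).

def cic_decimate(x, R, N):
    """Run N integrators, keep every R-th sample, then run N combs."""
    for _ in range(N):              # integrator stages at the input rate
        acc, y = 0, []
        for v in x:
            acc += v
            y.append(acc)
        x = y
    x = x[R - 1::R]                 # rate reduction by R
    for _ in range(N):              # comb stages at the output rate
        prev, y = 0, []
        for v in x:
            y.append(v - prev)
            prev = v
        x = y
    return x


def cic_magnitude(f, R, N):
    """Magnitude response normalized to unity at DC; f is in cycles per input sample."""
    if f == 0:
        return 1.0
    return abs(math.sin(math.pi * f * R) / (R * math.sin(math.pi * f))) ** N


# A constant input settles to the DC gain R**N = 64 after the start-up transient.
print(cic_decimate([1] * 32, R=4, N=3))   # [20, 60, 64, 64, 64, 64, 64, 64]
# Below unity inside the passband: this is the droop.
print(cic_magnitude(0.05, R=4, N=3))
```

The second value is less than 1 even at a frequency well inside the passband, which is the droop that compensation or sharpening techniques aim to flatten.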
-
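In this paper, we present a design architecture for implementing a multi-channel phasor computation unit in dedicated hardware, in which the computationally expensive multiplier is shared by time division. We also carried out a quantitative study of the approximation error that arises when the sliding-DFT algorithm used for phasor measurement is implemented recursively. Based on this error analysis, we designed a phasor computation unit with the shared-multiplier structure and confirmed the correctness of the design through simulations that show the internal operation of the hardware.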
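The recursive sliding-DFT update can be sketched as follows, using the standard recurrence $X_k(n) = \left[X_k(n-1) + x(n) - x(n-N)\right] e^{j2\pi k/N}$. The function name and test signal are illustrative assumptions, and floating-point arithmetic is used here, whereas a hardware implementation would use finite word lengths, where rounding of the twiddle factor accumulates through the recursion.

```python
import cmath

# Floating-point sketch of the recursive sliding DFT for a single bin k.

def sliding_dft(samples, N, k):
    """Return bin k of the N-point DFT of the latest N samples after each new sample."""
    twiddle = cmath.exp(2j * cmath.pi * k / N)
    window = [0.0] * N          # circular buffer of the last N samples
    X = 0j
    out = []
    for n, x in enumerate(samples):
        oldest = window[n % N]  # sample that leaves the window
        window[n % N] = x
        X = (X + x - oldest) * twiddle
        out.append(X)
    return out


samples = [0.5, -1.0, 2.0, 3.5, -0.25, 1.0, 0.0, 4.0, -2.0, 1.5, 0.75, -3.0]
N, k = 8, 1
recursive = sliding_dft(samples, N, k)[-1]
last = samples[-N:]
direct = sum(last[m] * cmath.exp(-2j * cmath.pi * k * m / N) for m in range(N))
print(abs(recursive - direct) < 1e-9)   # True
```

Each update needs one complex multiplication per bin instead of a full N-point sum, which is what makes sharing a single multiplier across channels attractive.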
-
In this paper, a low-power CIC (Cascaded Integrator Comb) filter bank structure is proposed for wireless communication systems. Since a conventional CIC filter cannot meet the desired center frequency in each channel of a filter bank, we developed a modified CIC filter (MCF) for this purpose.
-
In this paper, a new harmonic imaging technique is proposed and evaluated experimentally. In the proposed method, a chirp signal weighted with a Hanning window is transmitted. The RF samples obtained on each array element are individually compressed by correlating them with a reference signal defined as the 2nd harmonic (2f0) component of the transmitted chirp signal generated in a square-law system. The proposed method uses the compressed 2f0 component to form an image, for which the crosscorrelation term with the f0 component should be suppressed to below -60 dB. In the experiment, the 6 dB pulse width and peak sidelobe level of the compressed 2f0 component were 0.7 μs and -60 dB, respectively. This result shows that the proposed method can successfully eliminate the f0 component with a single transmit-receive event and is therefore more efficient than the conventional pulse inversion (PI) method in terms of frame rate. We also observed that the 2nd harmonic component starts to decrease for source pressures higher than 210 kPa in water, which implies that the SNR of 2nd harmonic imaging using short pulses cannot be increased beyond a certain limit.
-
A new method for simultaneous multiple transmit focusing using orthogonal weighted FM chirps is proposed. Weighted chirp signals focused at different depths are transmitted at the same time. These chirp signals are mutually orthogonal in the approximate sense that the autocorrelation function of each signal has a narrow mainlobe and low sidelobe levels, and the crosscorrelation function of any pair of signals has smaller values than the sidelobe levels of each autocorrelation function. This means that each weighted chirp signal can be separately compressed into a short pulse, focused individually, and combined with the other focused beams to form an image frame. Theoretically, any two chirp signals defined in two non-overlapping frequency bands are mutually orthogonal. In the present work, however, a fractional overlap of adjacent frequency bands of up to 25% was permitted in order to fit more chirp signals within a given transducer bandwidth. The crosscorrelation values due to the frequency overlap could be reduced by alternating the direction of the frequency sweep of adjacent chirp signals. The simulation results show that this method can improve the lateral resolution of the image without sacrificing frame rate compared with the conventional pulse system.
-
In this paper, a new multi-sensor single-target tracking method for cluttered environments is proposed. Unlike established methods such as the probabilistic data association filter (PDAF), the proposed method reflects information from the detection phase in the tracking parameters so as to reduce the uncertainty due to clutter. This is achieved by first modifying the Bayes risk in the Bayesian detection criterion to incorporate the likelihood of measurements from multiple sensors. The final estimate is then computed as a linear combination of the likelihood and the estimate of the measurements. We develop the procedure and discuss results from representative simulations.
-
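In this paper, we designed and implemented the signal processing unit of a vehicle collision avoidance system, one of the ITS (Intelligent Transportation System) technologies. The parameters of the proposed system were designed for a 77 GHz FMCW (Frequency Modulated Continuous Wave) millimeter-wave radar, and the system detects range and velocity in real time. It is implemented with a TI TMS320C31-40 DSP and an AT89C52 8-bit microprocessor, and provides an update rate of at least 10 Hz, a range resolution of 0.2 m, and a velocity resolution of 2 km/h. In the experimental setup, operation was verified by generating the beat frequency with a function generator.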
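The textbook relations between beat frequency, range, and range resolution for a linear FMCW sweep can be sketched as follows. The sweep bandwidth and sweep time used here are hypothetical values chosen for illustration and are not taken from the paper; the target is assumed stationary, so the Doppler contribution to the beat frequency is ignored.

```python
# Textbook FMCW relations for a linear sweep and a stationary target.
C = 299_792_458.0  # speed of light in m/s


def range_from_beat(f_beat_hz, sweep_bandwidth_hz, sweep_time_s):
    """Range R from beat frequency f_b: R = c * f_b * T / (2 * B)."""
    return C * f_beat_hz * sweep_time_s / (2.0 * sweep_bandwidth_hz)


def range_resolution(sweep_bandwidth_hz):
    """Range resolution: dR = c / (2 * B)."""
    return C / (2.0 * sweep_bandwidth_hz)


# Hypothetical sweep: B = 750 MHz over T = 1 ms.
print(round(range_resolution(750e6), 3))              # 0.2 (meters)
print(round(range_from_beat(100e3, 750e6, 1e-3), 2))  # 19.99 (meters)
```

Under these assumptions a sweep bandwidth of about 750 MHz corresponds to a range resolution of about 0.2 m, and a bench test can emulate a target at a given range by feeding in the matching beat frequency.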
-
A low-power pulse Doppler radar must integrate a large amount of data to achieve the required maximum detectable distance. The Doppler filter needs a window with a good out-of-band rejection level to maintain a high dynamic range. Given these requirements, decimation and presumming can be applied to increase the speed of Doppler processing. This paper investigates the efficiency of several decimation methods and the loss due to presumming. We also propose a method that increases processing speed while maintaining the maximum detectable distance.
-
This paper proposes a novel human-computer interaction system for the disabled based on recognition of face direction. Face direction is recognized by comparing the positions of the centers of gravity of the face region and of facial features such as the eyes and eyebrows. The face region is first selected using color information, and the facial features are then extracted by applying a separation filter to the face region. Face direction is recognized at 6.57 frames/s with a success rate of 92.9%, without any special image processing hardware. We implement a human-computer interaction system using a screen menu and show the validity of the proposed method through experimental results.
-
This paper studies wavelet-based still image transmission over a wireless channel. EZW (Embedded Zerotree Wavelet) is an efficient and scalable wavelet-based image coding technique that provides progressive transmission through a multi-resolution representation, and it therefore reduces the cost of storage media. Although EZW has many advantages, it is very sensitive to errors, because coding is performed subband by subband and it uses arithmetic coding, which is a kind of variable length coding. As a result, an error of only 1 to 2 bits may degrade the quality of the entire image, so error localization and recovery need to be studied. This paper investigates the use of reversible variable length codes (RVLC) and data partitioning. RVLCs are known to have superior error recovery properties owing to their two-way decoding capability, and data partitioning is essential to applying RVLC. In this work, we show the appropriate data partitioning length for each SNR (signal-to-noise power ratio) and the resulting error localization in a wireless channel.
-
This paper presents a method that automatically extracts a moving object from video. To extract the moving object, velocity vectors are estimated for each frame of the video, and the position of the object is determined from the estimated velocity vectors. The coordinates of the object are used to initialize the seed, and the moving object is then automatically segmented in the image plane by the region growing method and tracked using its intensity range and position information. Applied to sequential images, the method was able to extract a moving object.
-
In this paper, we present a novel digital watermarking technique based on multiresolution decomposition and the human visual system (HVS). The proposed method embeds the watermark by quantization: it constructs a "perceptually lossless" quantization matrix using a quantization factor for each level and orientation together with the variance within each band. We compare our approach with other wavelet-domain watermarking methods. Simulation results show superior robustness to a variety of image distortions.
-
This paper presents a real-time implementation of a laser pointer mouse system. The system consists of a camera, an FPGA circuit that tracks the laser footprint, and an RF module for communication between the laser pointer and the proposed system. We first simulate the system and then realize it as an FPGA circuit after describing it in VHDL.
-
We propose a new fingerprint minutia matching algorithm that matches fingerprint minutiae using local alignment. In general, a fingerprint is deformed by pressure and orientation when a user presses a finger on the sensor. These nonlinear deformations change the positions and orientations of the minutiae, which decreases their reliability. Matching by global alignment uses a single alignment point. The problem with this approach is that, because of the deformation, the matching reliability of a minutia decreases as its distance from the alignment minutia increases. Matching by local alignment overcomes this problem by considering only minutiae located within a short distance. Experimental results show that the proposed algorithm outperforms matching by global alignment.
-
The current TV-out format is quite different from that of an HDTV or PC monitor in its encoding techniques. A conventional analog TV uses an interlaced display, while an HDTV or PC monitor uses a non-interlaced (progressive-scan) display. To convert image signals from devices that use the interlaced format for a progressive-scan display, hardware logic implementing scanning and interpolation algorithms is necessary. The ELA (Edge-Based Line Average) algorithm has been widely used because it provides good characteristics. In this study, the ADI (Adaptive De-interlacing Interpolation) algorithm is used to improve on ELA, which shows low quality in vertical edge detection and low efficiency on horizontal edge lines. A de-interlacing ASIC chip that converts interlaced digital YUV to de-interlaced digital RGB is designed with this algorithm, using VHDL for the chip design.
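The conventional three-direction ELA interpolation that the abstract takes as its starting point can be sketched as follows. This is the baseline algorithm, not the ADI algorithm proposed in the paper; the function name and the border handling are assumptions made for this illustration.

```python
# Baseline 3-direction ELA sketch: interpolate one missing line between the
# field lines directly above and below it.

def ela_interpolate(upper, lower):
    """Return the missing line, averaging along the best-matching direction."""
    width = len(upper)
    out = []
    for x in range(width):
        best_diff, best_val = None, None
        # Try the vertical direction first, then the two diagonals.
        for d in (0, -1, 1):
            xu, xl = x + d, x - d
            if 0 <= xu < width and 0 <= xl < width:
                diff = abs(upper[xu] - lower[xl])
                if best_diff is None or diff < best_diff:
                    best_diff = diff
                    best_val = (upper[xu] + lower[xl]) / 2
        out.append(best_val)
    return out


# A diagonal edge: plain line averaging would blur it to 50 at x = 1 and x = 2.
upper = [0, 0, 0, 100, 100, 100]
lower = [0, 100, 100, 100, 100, 100]
print(ela_interpolate(upper, lower))  # [0.0, 0.0, 100.0, 100.0, 100.0, 100.0]
```

Averaging along the direction with the smallest difference keeps the diagonal edge sharp, whereas a simple vertical average of the two lines would produce intermediate gray values along it.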
-
This paper suggests a real-time video streaming method using DCT-based scalability, and evaluates and analyzes its performance. The approach is similar to applying a low-pass filter: as shown in the accompanying figure, the encoded data are split in a splitter and transmitted, and then decoded according to the situation. This method can be applied to any video codec that is based on the DCT. We therefore propose a simple video streaming method using DCT-based scalability and present experimental results. With the suggested scalability, computation is reduced and spatial scalability is realized. Moreover, data that meet the user's needs can be extracted by choosing the appropriate scalability according to the network condition and the capability of the terminal. Resources can also be allocated according to the characteristics of the image.
-
This paper studies environment matting and compositing, which captures not just a foreground object and its traditional opacity matte from a real-world scene, but also a description of how that object refracts and reflects light. We then implement and verify an image compositing system that uses the environment matting method.
-
This paper presents an efficient fast motion estimation algorithm and image segmentation method for low bit-rate coding. First, using region split information, the algorithm splits the image into homogeneous, semantic regions such as the face. Then, within these regions, we find the motion vectors using adaptive search window adjustment. In addition, with this segment-based fast motion estimation, we reduce blocking artifacts by coding the regions of interest (face or arm) in the input image more intensively. The simulation results show improvements in coding performance and image quality.
-
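In this study, we propose a fusion architecture based on neural networks, an artificial intelligence technique, for effectively deriving high-resolution multispectral images from low-resolution multispectral images, and we present experiments and results on IKONOS satellite imagery. The images obtained in the experiments showed relatively good spectral characteristics. The proposed fusion method is expected to perform well when applied in many fields, such as land-use classification, environmental monitoring, and resource surveys, as well as in the use of data in geospatial information systems.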
-
Moving objects in video data are the main elements for video analysis and retrieval. In this paper, we propose a new algorithm for tracking and segmenting moving objects in color image sequences that include complex camera motion such as zoom, pan, and rotation. The proposed algorithm is based on Mean-shift color segmentation and a stochastic region matching method. To segment moving objects, each frame of the sequence is divided into a set of regions of similar color using the Mean-shift color segmentation algorithm. Each segmented region is matched to the corresponding region in the subsequent frame. The motion vector of each matched region is then estimated, and these motion vectors are summed to estimate the global motion. Once motion vectors have been estimated for all frames of the video sequence, independently moving regions can be segmented by comparing their trajectories with that of the global motion. Finally, the segmented regions are merged into independently moving objects by comparing the similarities of their trajectories, positions, and emerging periods. The experimental results show that the proposed algorithm is capable of segmenting independently moving objects in video sequences that include complex camera motion.
-
This paper proposes a fast multimedia delivery system that uses accelerated video. The system can deliver multimedia data faster and more effectively than previous delivery systems. By using accelerated video, users can obtain information at the least cost, and an Internet delivery server can send the maximum amount of information over limited bandwidth.
-
In this paper, we present a 3D image display for the ocular retina. The 3D display technique uses voxel (cuboid) projection. A voxel is the 3D counterpart of the pixel, used here for reconstruction. The 3D image display system is built in a PC environment and programmed in a modular fashion using Visual C++. The whole procedure consists of data preparation and 3D display through transformation and scaling.
-
This paper proposes a face recognition technique that effectively combines elastic graph matching (EGM) and the Fisherface algorithm. EGM, a form of dynamic link architecture, uses not only the face shape but also the gray-level information of the image, and the Fisherface algorithm, a class-specific method, is robust to variations such as lighting direction and facial expression. In the proposed method, which adopts both techniques, a linear projection per node of the image graph reduces the dimensionality of the labeled graph vector and provides a feature space that can be used effectively for classification. Compared with a conventional method, the proposed approach obtains satisfactory results in terms of recognition rate and speed. In particular, a maximum recognition rate of 99.3% was obtained with the leave-one-out method in experiments on the Yale Face Database.
-
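Feature extraction methods for effective content-based image retrieval have recently been studied extensively. In particular, methods that derive features from color information are widely used because of their various advantages. In this paper, we propose a new feature extraction method based on the color correlogram. The proposed method uses wavelet transform coefficients to partition an image into complex regions and non-complex regions, and retrieves images using the color correlogram of each region as the image feature. Experiments confirmed that retrieval with the proposed method outperforms the conventional method based on the color correlogram.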
-
As systems for real-time computer vision are confronted with prodigious amounts of visual information, it has become a priority to locate and analyze just the information essential to the task at hand, while ignoring the vast flow of irrelevant detail. One way to achieve this is to use the human visual attention mechanism. This paper gives a short review of human visual attention mechanisms and of some computational models of visual attention. It can serve as basic material for research on the development of visual attention systems that perform various complex tasks more efficiently.
-
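In this paper, we propose a new method for detecting eyes in images that have complex backgrounds, varying illumination, and varying face sizes. The face is first detected using a reflective symmetry condition and ellipse modeling, and eye candidate regions are then extracted within the face region by valley detection and binary opening based on mathematical morphology. A pupil matching mask is proposed to detect the exact position of the pupil. During face detection, the length of the minor axis of the ellipse is estimated and used to normalize the size of the extracted face image. A structuring element suited to eye detection is then determined on the normalized face image, which makes the eye detection results more robust.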
-
The most basic means of communication among humans is the voice. Even setting voice technologies aside, the voice is important and convenient in everyday life. A speech recognition system, however, cannot always expect a normal voice as its input signal. A pathological voice, that is, a voice affected by a problem in the larynx, as opposed to a normal voice, can be a special case of input. The distortion of a speech signal by environmental effects such as noise or the transmission channel has of course been raised as a problem, but here we address pathological voices caused by laryngeal disease, which is an intrinsic distortion factor in the voice. We also investigate, by a statistical method, the difference in the distribution of acoustic parameters between normal and pathological voices.
-
The vibration characteristics of a fruit are related to its material properties. A new method for spectral analysis is developed and used for the non-destructive estimation of fruit firmness. The resonant frequency of the fruit is related to its firmness, but determining the resonant frequency is not easy, so a smoothing method is applied to the frequency spectra to obtain a robust estimate of the resonant frequency.
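The role of smoothing can be shown with a small sketch. The abstract does not say which smoothing method is used, so a centered moving average stands in here as an assumption. The spectrum values and the bin spacing `bin_hz` are made up for illustration. A narrow spurious spike beats the broad resonance peak in the raw spectrum but not in the smoothed one.

```python
def moving_average(values, width):
    """Centered moving average; the window is truncated at the edges."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

def peak_bin(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical magnitude spectrum: a spurious spike at bin 3 and a
# broad resonance peak centered on bin 10.
spectrum = [1, 1, 1, 9, 1, 1, 2, 4, 6, 7, 8, 7, 6, 4, 2, 1, 1, 1, 1, 1, 1]
bin_hz = 50.0  # assumed frequency spacing between bins

raw_peak = peak_bin(spectrum)                        # picks the spike
smooth_peak = peak_bin(moving_average(spectrum, 5))  # picks the resonance
resonant_hz = smooth_peak * bin_hz
```

The window width trades robustness against frequency resolution, so it would need tuning to the width of the real resonance peak.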
-
In this paper, we propose a robust audio watermarking method. The proposed algorithm combines a psychoacoustic model, which provides perceptual transparency, with a spread spectrum technique for embedding the watermark. The watermark is embedded in each audio frame by adding a perceptually shaped pseudo-random sequence. We demonstrate the robustness of the watermarking algorithm.
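A minimal sketch of per-frame spread spectrum embedding follows. It makes a strong simplification: a fixed `gain` replaces the psychoacoustic shaping described in the abstract, so it does not model perceptual transparency. The function names, frame length, seed, and host tone are all illustrative assumptions. One bit is embedded per frame by adding a signed pseudo-random sequence, and it is recovered from the sign of the correlation with the same sequence.

```python
import math
import random

def pn_sequence(length, seed):
    """Pseudo-random +/-1 sequence shared by embedder and detector."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(frame, bit, pn, gain):
    """Add the PN sequence, signed by the bit (+1 or -1), to one frame."""
    return [x + gain * bit * p for x, p in zip(frame, pn)]

def detect(frame, pn):
    """Recover the bit from the sign of the correlation with the PN sequence."""
    corr = sum(x * p for x, p in zip(frame, pn))
    return 1 if corr >= 0 else -1

frame_len = 1024
pn = pn_sequence(frame_len, seed=42)
bits = [1, -1, -1, 1, -1]

# Hypothetical host signal: a 440 Hz tone sampled at 44.1 kHz.
host = [0.5 * math.sin(2 * math.pi * 440 * n / 44100)
        for n in range(frame_len * len(bits))]

marked = []
for i, bit in enumerate(bits):
    frame = host[i * frame_len:(i + 1) * frame_len]
    marked.extend(embed(frame, bit, pn, gain=0.1))

recovered = [detect(marked[i * frame_len:(i + 1) * frame_len], pn)
             for i in range(len(bits))]
```

In a perceptually shaped scheme the gain would vary with the masking threshold of each frame instead of being constant.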
-
In this paper, we introduce a simple design method for a root-squared type raised cosine filter (SRCF) with equiripple characteristics. Through some design examples, we show that the proposed filter has much better ripple performance than the conventional SRCF at the expense of a small increase in ISI. In addition, the proposed filter is compatible with the conventional SRCF. Finally, we design the filter for W-CDMA, which uses an RRC (root raised cosine) filter with α = 0.22, in 12-bit finite precision.
-
This paper proposes injecting noise into the inputs of the Kohonen learning algorithm (KLA) to mitigate the local convergence problem of the KLA. The noise strength is high at the beginning of learning and is gradually lowered as learning proceeds. This strategy is a kind of stochastic relaxation (SR) method, which is widely used in general optimization problems. It is convenient to implement and improves the convergence properties of the KLA at the cost of a moderate increase in computing time. Experimental results for Gauss-Markov sources and real speech demonstrate that the proposed method consistently provides better codebooks than the KLA.
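The noise-injection schedule can be sketched as follows. This is not the paper's implementation: it uses scalar codewords, and the linear decay of both the learning rate and the noise strength, along with every parameter value, is an assumption chosen for illustration. Each input is perturbed before the winning codeword is selected and updated, and the perturbation shrinks to zero by the end of training.

```python
import random

def nearest(codebook, x):
    """Index of the codeword closest to x."""
    return min(range(len(codebook)), key=lambda i: (codebook[i] - x) ** 2)

def distortion(codebook, data):
    """Mean squared quantization error of the data under the codebook."""
    return sum((x - codebook[nearest(codebook, x)]) ** 2 for x in data) / len(data)

def train_kla_with_noise(data, codebook, epochs=20, lr0=0.3, noise0=1.0, seed=0):
    """Winner-take-all Kohonen learning with decaying input noise (assumed schedule)."""
    rng = random.Random(seed)
    codebook = list(codebook)
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            progress = step / total
            lr = lr0 * (1.0 - progress)          # learning rate decays to zero
            sigma = noise0 * (1.0 - progress)    # noise strength decays to zero
            noisy = x + rng.gauss(0.0, sigma)    # inject noise into the input
            winner = nearest(codebook, noisy)
            codebook[winner] += lr * (noisy - codebook[winner])
            step += 1
    return codebook

# Hypothetical training set: two scalar clusters around 0 and 10.
rng = random.Random(1)
data = ([rng.uniform(-0.5, 0.5) for _ in range(50)]
        + [10.0 + rng.uniform(-0.5, 0.5) for _ in range(50)])
rng.shuffle(data)

initial = [4.0, 6.0]
trained = train_kla_with_noise(data, initial)
```

Setting `noise0=0.0` reduces the routine to the plain winner-take-all update, which makes it easy to compare the two variants on the same data.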
-
In speech recognition, each talker's utterances have characteristic resonant frequencies determined by the shape of the talker's lips and the motion of the tongue, so utterances differ from talker to talker. We therefore need a better method of speech feature parameter extraction that reflects the talker's characteristics well. This paper proposes a modified MFCC that combines the existing MFCC with a gammatone filter. In experiments with telephone speech data, the proposed method achieved a higher speech recognition rate than the other methods.
-
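The CELP coder is based on the principle of analysis-by-synthesis coding with linear prediction. It encodes the spectrum of the speech signal through LPC analysis using a fixed window. However, depending on the speaker's speaking rate, the speech waveform sometimes changes rapidly over time and sometimes keeps a similar shape for a certain period. Coding can therefore be made more efficient if the window size is matched to the speaking rate. In this paper, we propose a method that applies different bit rates according to the speaking rate. The speaking rate is measured from the degree of spectral variation. When the speaking rate is fast, the frame size is reduced so that the analysis adapts to the rapidly changing signal, and fewer bits are used to represent the parameters. Conversely, when the speaking rate is slow, the frame size is increased and more bits are allocated to the parameters. The proposed method was tested with the G.723.1 5.3 kbps ACELP coder, and an average bit-rate reduction of 16.34% was obtained without degradation of speech quality.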
-
Preprocessing is a very important step in speech signal processing. It influences the compression rate in speech coding, the recognition rate in speech recognition, and so on. In this paper, we propose a method that minimizes the influence of the window by using the pitch period and start points. The proposed method is applicable to voiced-segment detection and word labeling.
-
Speech is classified into voiced and unvoiced signals. Since the amplitude of voiced speech falls off at about -20 dB/decade, the dynamic range is often compressed prior to spectral analysis so that details at weak, high frequencies become visible [5][6]. There is a distinct difference in spectral slope between voiced and unvoiced signals. In this paper, we obtain the slope of each frame using the autocorrelation method and determine the voiced and unvoiced regions. We also use the energy to identify silence regions. To present the experimental results, we assign the value 1 to voiced regions, -1 to unvoiced regions, and 0 to silence regions.
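The three-way labeling can be sketched under one explicit assumption: the abstract does not give the slope formula, so the normalized first autocorrelation coefficient r(1)/r(0) is used here as a stand-in for spectral slope. It is close to 1 when energy is concentrated at low frequencies and near zero for a flat spectrum. The thresholds and test frames are illustrative only.

```python
import math
import random

def energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

def spectral_tilt(frame):
    """Normalized lag-1 autocorrelation, used as a proxy for spectral slope."""
    r0 = sum(x * x for x in frame)
    r1 = sum(a * b for a, b in zip(frame, frame[1:]))
    return r1 / r0 if r0 > 0 else 0.0

def classify(frame, energy_floor=1e-4, tilt_threshold=0.5):
    """Return 1 for voiced, -1 for unvoiced, 0 for silence (assumed thresholds)."""
    if energy(frame) < energy_floor:
        return 0
    return 1 if spectral_tilt(frame) > tilt_threshold else -1

fs = 8000  # assumed sampling rate in Hz
rng = random.Random(0)
voiced = [math.sin(2 * math.pi * 150 * n / fs) for n in range(240)]  # low-frequency tone
unvoiced = [rng.uniform(-0.3, 0.3) for _ in range(240)]              # noise-like frame
silence = [0.0] * 240

labels = [classify(frame) for frame in (voiced, unvoiced, silence)]
```

The energy test runs first so that near-silent frames are never labeled from an unreliable tilt value.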
-
In speech analysis, it is very important to estimate formant center frequencies accurately. If the formant frequencies are known, we can infer which sound was uttered. In voiced speech, the magnitude of the first formant is generally about 10 dB higher than that of the other formants, so the shape of the voice signal in the time domain is determined mainly by the first formant. We can therefore obtain the first formant frequency roughly from the zero-crossing rate (ZCR). In this paper, we propose an improved method for obtaining the first formant frequency from the ZCR: the autocorrelation is computed before the ZCR is measured. This smooths the voice signal and emphasizes the first formant, yielding a more accurate ZCR and first formant frequency. Conventional formant estimation is performed in the frequency domain, whereas the proposed method works in the time domain and is therefore very simple.
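A sinusoid of frequency f crosses zero 2f times per second, so a dominant component can be estimated as half the zero-crossing rate. The sketch below assumes a synthetic frame, a 500 Hz tone plus uniform noise, in place of real speech. The sampling rate, frame length, and lag range are illustrative choices. It compares the estimate from the raw frame with the estimate from its autocorrelation sequence.

```python
import math
import random

def autocorrelation(signal, max_lag):
    """Autocorrelation r[lag] for lag = 0 .. max_lag - 1."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag)]

def zero_crossings(signal):
    """Number of sign changes between consecutive samples."""
    signs = [x >= 0 for x in signal]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def frequency_from_zcr(signal, fs):
    """Dominant frequency estimate: half the zero crossings per second."""
    return zero_crossings(signal) * fs / (2.0 * len(signal))

fs = 8000  # assumed sampling rate in Hz
rng = random.Random(0)
# Hypothetical frame: a 500 Hz "first formant" tone buried in noise.
frame = [math.sin(2 * math.pi * 500 * n / fs + 0.3) + rng.uniform(-0.6, 0.6)
         for n in range(800)]

raw_estimate = frequency_from_zcr(frame, fs)
f1_estimate = frequency_from_zcr(autocorrelation(frame, 200), fs)
```

Noise adds spurious sign changes to the raw frame and pushes its estimate upward. The autocorrelation keeps the periodicity of the dominant component while concentrating most of the noise energy at lag 0, so its zero crossings follow the 500 Hz tone much more closely.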