• Title/Summary/Keyword: Quantization

Search Results: 1,543

Study of Scene change Detection and Adaptive Rate Control Schemes for MPEG Video Encoder (MPEG 비디오 인코더를 위한 장면전환 검출 및 적응적 율 제어 방식 연구)

  • Nam, Jae-Yeol;Gang, Byeong-Ho;Son, Yu-Ik
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.2
    • /
    • pp.534-542
    • /
    • 1999
  • A well-designed rate control strategy can improve overall picture quality for video transmission over a constant-bit-rate channel. Since the rate control method is not a normative part of the MPEG video standard, the performance of an MPEG video codec can differ considerably depending on how the rate control scheme is implemented. The rate control scheme proposed in MPEG shows good results when no scene change occurs, but it has the weakness that it does not properly handle scene-changed pictures: picture quality deteriorates after a scene change, and the possibility of buffer overflow becomes high. In this paper, we propose a new method for detecting scene changes using local variance, together with a new determination scheme for the adaptive quantization parameter, mquant, which accounts for the local characteristics of an image by reusing the local variances already computed in the scene change detection step. In addition, an adaptive rate control scheme that handles scene-changed pictures efficiently is proposed. Computer simulations were performed to verify the performance of the proposed algorithm: the suggested algorithm precisely detected scene changes, and the proposed rate control scheme showed better rate control performance than the conventional MPEG scheme.

  • PDF
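The core idea of variance-based scene change detection can be sketched as follows. This is an illustrative reconstruction, not the paper's exact formulation: the block size, threshold, and the variance-difference measure are all assumptions.

```python
from statistics import pvariance

def local_variances(frame, block=2):
    """Per-block (block x block) pixel variances of a 2-D frame (list of lists)."""
    h, w = len(frame), len(frame[0])
    vars_ = []
    for y in range(0, h, block):
        for x in range(0, w, block):
            pixels = [frame[y + dy][x + dx]
                      for dy in range(block) for dx in range(block)]
            vars_.append(pvariance(pixels))
    return vars_

def is_scene_change(prev_frame, cur_frame, threshold=50.0, block=2):
    """Flag a scene change when the block variances of consecutive frames
    differ strongly on average (threshold is an assumed tuning value)."""
    pv = local_variances(prev_frame, block)
    cv = local_variances(cur_frame, block)
    diff = sum(abs(a - b) for a, b in zip(pv, cv)) / len(pv)
    return diff > threshold
```

The same per-block variances could then be reused to scale the quantization parameter, which is the reuse the abstract describes.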

A Study on Fuzziness Parameter Selection in Fuzzy Vector Quantization for High Quality Speech Synthesis (고음질의 음성합성을 위한 퍼지벡터양자화의 퍼지니스 파라메타선정에 관한 연구)

  • 이진이
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.8 no.2
    • /
    • pp.60-69
    • /
    • 1998
  • This paper proposes a speech synthesis method using fuzzy vector quantization (FVQ) and studies how to choose the fuzziness value that optimizes the performance of FVQ, in order to obtain synthesized speech closer to the original. When FVQ is used to synthesize speech, the analysis stage generates membership function values representing the degree to which an input speech pattern matches each speech pattern in the codebook, and the synthesis stage reproduces the speech using those membership values, the fuzziness value, and the fuzzy c-means operation. Comparing the performance of the FVQ and VQ synthesizers by simulation, we show that the performance of FVQ is almost equal to that of VQ even when the FVQ codebook is half the size of the VQ codebook. This implies that, when FVQ is used to obtain the same performance as VQ in speech synthesis, codebook storage can be reduced by half. We also found that, to maximize the SQNR of the synthesized speech, the fuzziness value should be small when the variance of the analysis frame is relatively large, and large when it is small. Comparing the spectrograms of speech synthesized by VQ and FVQ, we found that the spectral bands (formant and pitch frequencies) of FVQ-synthesized speech are closer to the original speech than those of VQ.

  • PDF
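The fuzzy c-means machinery behind FVQ decoding can be illustrated with scalar codewords. The membership and reconstruction formulas below follow the standard fuzzy c-means definitions, with the fuzziness value m as the tunable parameter discussed in the abstract; the scalar setting and the tiny codebook are simplifications.

```python
def fcm_memberships(x, codebook, m=2.0):
    """Fuzzy c-means membership of sample x in each (scalar) codeword."""
    d = [abs(x - c) for c in codebook]
    if 0.0 in d:                      # exact match: crisp membership
        return [1.0 if di == 0 else 0.0 for di in d]
    p = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** p for dj in d) for di in d]

def fvq_decode(x, codebook, m=2.0):
    """Reconstruct x as the membership-weighted combination of codewords."""
    u = fcm_memberships(x, codebook, m)
    w = [ui ** m for ui in u]
    return sum(wi * ci for wi, ci in zip(w, codebook)) / sum(w)
```

As m approaches 1 the memberships become crisp and FVQ degenerates to ordinary VQ; larger m spreads weight across more codewords, which is why the abstract ties the best fuzziness value to the frame variance.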

Learning-based Super-resolution for Text Images (글자 영상을 위한 학습기반 초고해상도 기법)

  • Heo, Bo-Young;Song, Byung Cheol
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.52 no.4
    • /
    • pp.175-183
    • /
    • 2015
  • The proposed algorithm consists of two stages: learning and synthesis. At the learning stage, we first collect various high-resolution (HR) and low-resolution (LR) text image pairs, quantize the LR images, and extract HR-LR block pairs. Based on the quantized LR blocks, the LR-HR block pairs are clustered into a pre-determined number of classes. For each class, an optimal 2D FIR filter is computed and stored in a dictionary with the corresponding LR block for indexing. At the synthesis stage, each quantized LR block in an input LR image is compared with every LR block in the dictionary, and the FIR filter of the best-matched LR block is selected. Finally, an HR block is synthesized with the chosen filter, and a final HR image is produced. In addition, to cope with noisy environments, we generate multiple dictionaries according to noise level at the learning stage; the dictionary corresponding to the noise level of the input image is then chosen and used to produce the final HR image. Experimental results show that the proposed algorithm outperforms previous works for noisy as well as noise-free images.
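The dictionary-indexing step can be sketched as follows. For brevity this toy version stores a representative HR block per quantized LR key instead of a learned 2D FIR filter, so the quantization step and the block shapes are assumptions, not the paper's trained components.

```python
def quantize(block, step=64):
    """Coarsely quantize an LR block (tuple of pixel values) for indexing."""
    return tuple((p // step) * step for p in block)

def build_dictionary(pairs, step=64):
    """pairs: list of (lr_block, hr_block). One entry per quantized LR key;
    the stored HR block stands in for the per-class 2D FIR filter."""
    return {quantize(lr, step): hr for lr, hr in pairs}

def synthesize(lr_block, dictionary, step=64):
    """Look up the best-matched quantized LR block in the dictionary."""
    key = quantize(lr_block, step)
    if key in dictionary:
        return dictionary[key]
    # nearest key by L1 distance when no exact match exists
    best = min(dictionary,
               key=lambda k: sum(abs(a - b) for a, b in zip(k, key)))
    return dictionary[best]
```

Per the abstract, one such dictionary would be built per noise level, and the one matching the input image's noise level selected at synthesis time.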

A Study on the Dynamic Range Performance Evaluation Method of Detector with Variation of Tube Voltage and Automatic Exposure Control (AEC) in Digital Radiography (DR) -Focused on the Dynamic Step Wedge and Histogram Evaluation (DR(Digital Radiography)에서 관전압 및 자동노출제어장치의 감도 변화에 따른 검출기의 동적 범위 성능평가 방법연구 -Dynamic Step Wedge와 히스토그램 평가를 중심으로)

  • Hwang, Jun-Ho;Choi, Ji-An;Kim, Hyun-Soo;Lee, Kyung-Bae
    • The Journal of the Korea Contents Association
    • /
    • v.19 no.4
    • /
    • pp.368-380
    • /
    • 2019
  • This study proposes a method to evaluate the performance of a detector by analyzing the dynamic step wedge and histogram according to changes of tube voltage and sensitivity when using Automatic Exposure Control (AEC). Detector performance was evaluated by measuring X-ray beam quality, Entrance Surface Dose (ESD), tube current, and dynamic range at detector sensitivities of S200, S400, S800, and S1000 for tube voltages of 60, 70, 81, and 90 kVp. As a result, all beam qualities satisfied the acceptance criteria, and the Entrance Surface Dose and tube current decreased step by step as the sensitivity was set higher. In the dynamic step wedge, the observable dynamic range also increased as the tube voltage became higher. The histogram showed quantization separation phenomena as the tube voltage was set higher, and the higher the sensitivity, the more underflow and overflow occurred, losing information at both ends of the histogram. In conclusion, the changes of tube voltage and sensitivity under Automatic Exposure Control were insufficient to cause noticeable deterioration in detector performance, and the dynamic step wedge and histogram are useful tools for detector performance evaluation.

Video Watermarking Scheme with Adaptive Embedding in 3D-DCT domain (3D-DCT 계수를 적응적으로 이용한 비디오 워터마킹)

  • Park Hyun;Han Ji-Seok;Moon Young-Shik
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.15 no.3
    • /
    • pp.3-12
    • /
    • 2005
  • This paper introduces a 3D perceptual model based on JND (Just Noticeable Difference) and proposes a video watermarking scheme that adaptively embeds the watermark in the 3D-DCT domain. Videos are composed of consecutive frames, many of which are similar to their neighbors. If a watermark is embedded in a period of similar frames with little motion, it can easily be noticed by human eyes. Therefore, for transparency the watermark should be embedded where motion exists, and for robustness its magnitude needs to be adjusted properly. To achieve both, a watermark based on the 3D perceptual model is utilized: sensitivities for 3D-DCT quantization are derived from the 3D perceptual model, the sensitivities of regions having more local motion than global motion are adjusted, and the watermark is then embedded into visually significant coefficients in proportion to the strength of motion in the 3D-DCT domain. Experimental results show that the proposed scheme improves robustness to MPEG compression and temporal attacks by about $3{\sim}9\%$ compared to the existing 3D-DCT based method. In terms of PSNR the proposed method is similar to the existing method, but the JND model guarantees the transparency of the watermark.
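The idea of scaling the watermark in proportion to the magnitude of significant DCT coefficients can be sketched in one dimension. The paper works in the 3D-DCT domain with a JND-derived strength; the 1D transform and the fixed alpha below are simplifications, not the authors' model.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                for n in range(N)) * math.sqrt((1 if k == 0 else 2) / N)
            for k in range(N)]

def idct(X):
    """Orthonormal DCT-III, the inverse of dct() above."""
    N = len(X)
    return [sum(X[k] * math.sqrt((1 if k == 0 else 2) / N)
                * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                for k in range(N)) for n in range(N)]

def embed_in_dct(signal, bits, alpha=0.1):
    """Add each watermark bit to a large-magnitude AC coefficient,
    scaled in proportion to the coefficient itself (crude JND-like rule)."""
    X = dct(signal)
    order = sorted(range(1, len(X)), key=lambda k: -abs(X[k]))[:len(bits)]
    for b, k in zip(bits, order):
        X[k] += alpha * abs(X[k]) * (1 if b else -1)
    return idct(X)
```

In the actual scheme the transform is applied over an 8x8x8 spatio-temporal cube and alpha varies with the local-versus-global motion analysis.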

Real-time Watermarking Algorithm using Multiresolution Statistics for DWT Image Compressor (DWT기반 영상 압축기의 다해상도의 통계적 특성을 이용한 실시간 워터마킹 알고리즘)

  • 최순영;서영호;유지상;김대경;김동욱
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.13 no.6
    • /
    • pp.33-43
    • /
    • 2003
  • In this paper, we propose a real-time watermarking algorithm designed to be combined with a DWT (Discrete Wavelet Transform)-based image compressor. To reduce the amount of computation in selecting the watermarking positions, the proposed algorithm uses a pre-established look-up table of critical values, built statistically by computing the correlation according to the energy values of the corresponding wavelet coefficients. That is, the watermark is embedded into the coefficients whose values are greater than the critical value found in the look-up table, which is indexed by the energy values of the corresponding level-1 subband coefficients. The proposed algorithm can therefore operate in real time, because the watermarking process runs in parallel with the compression process without affecting the operation of the image compressor. It also mitigates the watermark loss and the degradation of compression efficiency that result from quantization and Huffman coding during image compression. Visually recognizable patterns such as binary images were used as watermarks. The experimental results showed that the proposed algorithm satisfies robustness and imperceptibility, the major requirements of watermarking.
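The look-up-table mechanism can be sketched as follows. The energy bands, critical values, and embedding rule below are illustrative placeholders, not the statistically established table from the paper.

```python
# Critical values keyed by coefficient-energy bands (illustrative numbers):
# higher subband energy -> lower threshold -> more coefficients marked.
LOOKUP = [(0.0, 100.0, 8.0), (100.0, 1000.0, 4.0), (1000.0, float("inf"), 2.0)]

def critical_value(energy):
    """Table search replacing per-coefficient correlation computation."""
    for lo, hi, cv in LOOKUP:
        if lo <= energy < hi:
            return cv

def embed_bits(coeffs, watermark_bits, alpha=0.5):
    """Embed bits only in coefficients above the energy-dependent threshold."""
    energy = sum(c * c for c in coeffs)
    cv = critical_value(energy)
    out, bits = list(coeffs), iter(watermark_bits)
    for i, c in enumerate(coeffs):
        if abs(c) > cv:
            b = next(bits, None)
            if b is None:
                break
            out[i] = c + alpha * (1 if b else -1)
    return out
```

Because position selection reduces to one energy sum and a table search, this step can run alongside the DWT compressor without stalling it, which is the real-time property the abstract claims.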

Latent Shifting and Compensation for Learned Video Compression (신경망 기반 비디오 압축을 위한 레이턴트 정보의 방향 이동 및 보상)

  • Kim, Yeongwoong;Kim, Donghyun;Jeong, Se Yoon;Choi, Jin Soo;Kim, Hui Yong
    • Journal of Broadcast Engineering
    • /
    • v.27 no.1
    • /
    • pp.31-43
    • /
    • 2022
  • Traditional video compression has developed so far as a hybrid scheme based on motion prediction, residual coding, and quantization. With the rapid development of artificial neural networks in recent years, research on neural image and video compression is also progressing rapidly and is becoming competitive with traditional video codecs. In this paper, a new method for improving the performance of such a neural video compression model is presented. We start from the rate-distortion optimization method using the auto-encoder and entropy model adopted by existing learned video compression models, shift those components of the latent representation that are difficult for the entropy model to estimate before transmitting it from the encoder to the decoder, and finally compensate for the distortion of the lost information. In this way, the existing neural video compression framework MFVC (Motion Free Video Compression) is improved: the BDBR (Bjøntegaard Delta-Rate) computed against H.264 improves from -14% for MFVC to -27% for the proposed method, nearly doubling the bit savings. The proposed method has the advantage of being widely applicable to neural image or video compression technologies that use latent representations and entropy models, not only to MFVC.

Implementation of Parallel Processor for Sound Synthesis of Guitar (기타의 음 합성을 위한 병렬 프로세서 구현)

  • Choi, Ji-Won;Kim, Yong-Min;Cho, Sang-Jin;Kim, Jong-Myon;Chong, Ui-Pil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.3
    • /
    • pp.191-199
    • /
    • 2010
  • Physical modeling is a synthesis method that produces high-quality sound close to that of real musical instruments. However, since physical modeling requires many parameters to synthesize the sound of an instrument, real-time processing is difficult for instruments that produce a large number of sounds simultaneously. To solve this problem, this paper proposes a single instruction multiple data (SIMD) parallel processor that supports real-time sound synthesis of the guitar, a representative plucked-string instrument. To control the six strings of the guitar, we used a SIMD parallel processor consisting of six processing elements (PEs), each of which models the corresponding string. The proposed SIMD processor can generate the synthesized sounds of the six strings simultaneously when the parallel synthesis algorithm receives the excitation signals and parameters of each string as input. Experimental results using a sampling rate of 44.1 kHz and 16-bit quantization indicate that the synthesized sounds were very similar to the original. In addition, the proposed parallel processor outperforms TI's commercial TMS320C6416 in terms of execution time (8.9x better) and energy efficiency (39.8x better).
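The abstract does not specify the string model, but the classic Karplus-Strong algorithm, a simplified plucked-string physical model, illustrates the per-string synthesis each PE would run. The tunings, decay factor, and note duration below are assumptions.

```python
import random

def pluck_string(frequency, duration, sample_rate=44100, decay=0.996):
    """Karplus-Strong plucked-string synthesis: a noise-filled delay line
    whose output is fed back through a two-tap averaging (damping) filter."""
    n = int(sample_rate / frequency)                  # delay length sets pitch
    line = [random.uniform(-1.0, 1.0) for _ in range(n)]   # excitation burst
    out = []
    for _ in range(int(sample_rate * duration)):
        s = line.pop(0)
        out.append(s)
        line.append(decay * 0.5 * (s + line[0]))      # lowpass + decay feedback
    return out

# Six "strings" synthesized independently, as the six PEs would in parallel
# (standard-tuning open-string frequencies in Hz, an assumed configuration).
E_STANDARD = [82.41, 110.0, 146.83, 196.0, 246.94, 329.63]
chord = [pluck_string(f, 0.05) for f in E_STANDARD]
```

Each string's loop is independent of the others, which is exactly the data parallelism a six-PE SIMD array exploits.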

Object Detection Performance Analysis between On-GPU and On-Board Analysis for Military Domain Images

  • Du-Hwan Hur;Dae-Hyeon Park;Deok-Woong Kim;Jae-Yong Baek;Jun-Hyeong Bak;Seung-Hwan Bae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.8
    • /
    • pp.157-164
    • /
    • 2024
  • In this paper, we discuss the feasibility of deploying a deep learning-based detector on a resource-limited board. Although many studies evaluate detectors on machines with high-performance GPUs, evaluation on boards with limited computation resources is still insufficient. Therefore, in this work, we implement deep learning detectors and deploy them on a compact board by parsing and optimizing each detector. To analyze the performance of deep learning-based detectors under limited resources, we monitor several detectors with different H/W resources. On the COCO detection dataset, we compare and analyze the evaluation results of the on-board and on-GPU detection models in terms of mAP, power consumption, and execution speed (FPS). To demonstrate the effect of applying our detector to the military domain, we also evaluate the models on our own dataset of thermal images reflecting flight battle scenarios. As a result, we investigate the strengths of the deep learning-based on-board detector and show that deep learning-based vision models can contribute to flight battle scenarios.

A Study on Music Summarization (음악요약 생성에 관한 연구)

  • Kim Sung-Tak;Kim Sang-Ho;Kim Hoi-Rin;Choi Ji-Hoon;Lee Han-Kyu;Hong Jin-Woo
    • Journal of Broadcast Engineering
    • /
    • v.11 no.1 s.30
    • /
    • pp.3-14
    • /
    • 2006
  • Music summarization is a technique that automatically extracts the most important and representative part or parts of music content. Music summarization techniques have been studied in two categories according to the characteristics of the summary: the first provides a repeated part as the music summary, and the second provides as the summary a combination of segments with different characteristics. In this paper, we propose and evaluate two music summarization techniques of these kinds. The algorithm using multi-level vector quantization, which provides a repeated part as a fixed-length music summary, is evaluated by the overlap ratio between hand-labeled repeated parts and the automatically generated summary: the overlap ratios of conventional methods are 42.2% and 47.4%, while that of the proposed method with a fixed-length summary is 67.1%. The optimal-length summary is evaluated by the portion of overlap between the summary and the repeated part, whose length differs according to the music content, and the result shows that the automatically generated optimal-length summary expresses the music more effectively than the fixed-length summary. The cluster-based algorithm, using a 2-D similarity matrix and the k-means algorithm, provides combined segments as the music summary. To evaluate this algorithm, we used a MOS test with two questions (How many similar segments are in the summarized music? How many segments are included in the same structure?), and the results show good performance.
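The repeated-part search over a 2-D similarity matrix can be sketched with scalar frame features. The similarity measure and the diagonal-stripe search below are illustrative simplifications of the approach described, not the paper's exact feature set or clustering.

```python
def similarity_matrix(features):
    """S[i][j]: similarity between feature frames i and j (1 / (1 + distance))."""
    n = len(features)
    return [[1.0 / (1.0 + abs(features[i] - features[j])) for j in range(n)]
            for i in range(n)]

def find_repeat(features, seg_len):
    """Return (i, j): start frames of the two most similar non-overlapping
    seg_len-frame segments, found as the brightest off-diagonal stripe."""
    S = similarity_matrix(features)
    n = len(features)
    best, best_score = None, -1.0
    for i in range(n - seg_len + 1):
        for j in range(i + seg_len, n - seg_len + 1):
            score = sum(S[i + k][j + k] for k in range(seg_len)) / seg_len
            if score > best_score:
                best, best_score = (i, j), score
    return best
```

A repeated chorus appears as a high-similarity diagonal stripe away from the main diagonal; the stripe with the highest mean similarity locates the repeated part used as the summary.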