• Title/Summary/Keyword: encoding

Masked cross self-attentive encoding based speaker embedding for speaker verification (화자 검증을 위한 마스킹된 교차 자기주의 인코딩 기반 화자 임베딩)

  • Seo, Soonshin;Kim, Ji-Hwan
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.497-504 / 2020
  • Constructing speaker embeddings is an important issue in speaker verification. In general, a self-attention mechanism is applied to encode the speaker embedding. Previous studies focused on training the self-attention in a high-level layer, such as the last pooling layer. In that case, the effect of low-level layers is not well represented in the speaker embedding encoding. In this study, we propose Masked Cross Self-Attentive Encoding (MCSAE) using ResNet. It focuses on training the features of both high-level and low-level layers. Based on multi-layer aggregation, the output features of each residual layer are used for the MCSAE. In the MCSAE, the interdependence of the input features is trained by a cross self-attention module. A random masking regularization module is also applied to prevent overfitting. The MCSAE enhances the weight of frames that carry speaker information. The output features are then concatenated and encoded into the speaker embedding. As a result, a more informative speaker embedding is encoded by using the MCSAE. The experimental results showed an equal error rate of 2.63% on the VoxCeleb1 evaluation dataset, an improvement over the previous self-attentive encoding and state-of-the-art methods.
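
For orientation, the plain self-attentive pooling that the abstract treats as the baseline can be sketched as below. This is not the MCSAE itself: the cross-attention and random masking modules are omitted, and the function name, parameter names, and toy values are illustrative assumptions rather than anything taken from the paper.

```python
import math

def self_attentive_pooling(frames, weight_rows, bias, context):
    """Pool frame-level features into one vector using softmax attention weights.

    score_t = context . tanh(W h_t + b); alpha = softmax(score); pooled = sum_t alpha_t * h_t
    All parameters here are hypothetical stand-ins for learned weights.
    """
    scores = []
    for frame in frames:
        hidden = [math.tanh(sum(w * h for w, h in zip(row, frame)) + b)
                  for row, b in zip(weight_rows, bias)]
        scores.append(sum(c * u for c, u in zip(context, hidden)))
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    pooled = [sum(a * frame[d] for a, frame in zip(alphas, frames))
              for d in range(len(frames[0]))]
    return pooled, alphas

frames = [[1.0, 2.0], [3.0, 4.0]]            # two frames, two feature dimensions
zero_w = [[0.0, 0.0], [0.0, 0.0]]
# With all-zero parameters every frame gets the same score, so pooling is a plain mean.
uniform_pooled, uniform_alphas = self_attentive_pooling(frames, zero_w, [0.0, 0.0], [0.0, 0.0])
# With non-zero parameters the frame with the larger features receives more weight.
_, skewed_alphas = self_attentive_pooling(frames, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [1.0, 1.0])
```

The attention weights always sum to one, so frames judged more informative contribute more to the pooled vector without changing its scale.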

Fractal Coding Method using Bit-plane Image (비트-플레인 영상을 이용한 프랙탈 부호화 기법)

  • Kim, Jeong-Il;Lee, Kwang-Bae;Kim, Hyen-Ug
    • The Transactions of the Korea Information Processing Society / v.5 no.4 / pp.1057-1065 / 1998
  • This paper describes a new fractal image coding algorithm that shortens fractal image encoding time by using a limited search area method and a scaling method. First, the original image is contracted by half and by a quarter with the scaling method. Then, the domain block of the quarter-sized image that is most similar to a given range block of the half-sized image is searched for within the limited area, which greatly reduces the encoding time. The block found in this search is used in encoding. Compared with Jacquin's method, the proposed algorithm provides a much shorter encoding time and a better compression ratio, with a slight degradation of the decoded image quality.

DESIGN AND IMPLEMENTATION OF MOVING OBJECTS MANAGEMENT SYSTEM APPLYING OPEN GEOSPATIAL DATA ENCODING

  • Lee, Hye-Jin;Lee, Hyun-Ah;Park, Jong-Heung
    • Proceedings of the KSRS Conference / 2005.10a / pp.663-666 / 2005
  • The Geography Markup Language (GML) is an XML encoding for the transport and storage of geographic information, including both the geometry and properties of geographic features. This paper uses GML to provide extensibility and interoperability of spatial data in a moving objects management system. Since the purpose of the system is to provide the locations of moving objects in web and mobile environments, we used GML to present both the map data and the trajectories of the moving objects. The proposed system is composed of a Location Data Interface, a Moving Objects Engine, and a Web/Mobile Presentation Interface. We used the concept of the Web Map Server, a web mapping technology of the OGC (Open Geospatial Consortium), to integrate the map data with the location information of the moving objects. In the integration process, we used the standard data model and interfaces while defining a new application schema. Because the proposed system uses open spatial data encoding and interfaces, it supports both extensibility and interoperability.

Improved Melody Recognition Performance of a Cochlear Implant Speech Processing Strategy Using Instantaneous Frequency Encoding Based on Teager Energy Operator

  • Choi, Sung-Jin;Ryu, Sang-Baek;Kim, Kyung-Hwan
    • Journal of Biomedical Engineering Research / v.31 no.6 / pp.417-426 / 2010
  • We present a speech processing strategy incorporating instantaneous frequency (IF) encoding to enhance the melody recognition performance of cochlear implants. To extract the IF from incoming sound, we propose the use of a Teager energy operator (TEO), which is advantageous for its lower computational load. Using time-frequency analysis, we verified that the TEO-based method provides proper IF encoding of input sound, which is crucial for melody recognition. A similar benefit could also be obtained with a Hilbert transform (HT), but at a much higher computational cost. The melody recognition performance of the proposed strategy was compared with that of a conventional strategy using envelope extraction and with that of HT-based IF encoding. Hearing tests on normal-hearing subjects were performed using acoustic simulation and a musical contour identification task. No significant difference in melody recognition performance was observed between the TEO-based and HT-based IF encodings, and both were superior to the conventional strategy. However, the TEO-based strategy was advantageous in that it was approximately 35% faster than the HT-based strategy.
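
As a minimal sketch of the operator named in this abstract, the discrete Teager energy operator is commonly defined as psi[n] = x[n]^2 - x[n-1] * x[n+1]. For a pure sinusoid A*cos(Omega*n + phi) this equals A^2 * sin^2(Omega), which is why a frequency can be read off the operator output so cheaply. The function names and the known-amplitude assumption below are illustrative only; the abstract does not say how the authors derive the IF from the TEO output.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def estimate_frequency_hz(x, amplitude, sample_rate):
    """Estimate the frequency of one sinusoid whose amplitude is known (assumption).

    Inverts psi = A^2 * sin^2(Omega); asin limits the result to Omega in [0, pi/2],
    that is, frequencies up to a quarter of the sample rate.
    """
    psi = teager_energy(x)
    mean_psi = sum(psi) / len(psi)
    ratio = min(1.0, max(0.0, mean_psi / amplitude ** 2))  # clamp against rounding error
    omega = math.asin(math.sqrt(ratio))                    # radians per sample
    return omega * sample_rate / (2 * math.pi)

sample_rate = 8000
tone = [0.5 * math.cos(2 * math.pi * 440 * n / sample_rate) for n in range(256)]
estimate = estimate_frequency_hz(tone, amplitude=0.5, sample_rate=sample_rate)
```

Each output sample needs only three input samples and two multiplications, which is consistent with the lower computational load the abstract attributes to the TEO relative to a Hilbert transform.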

Survey on Nucleotide Encoding Techniques and SVM Kernel Design for Human Splice Site Prediction

  • Bari, A.T.M. Golam;Reaz, Mst. Rokeya;Choi, Ho-Jin;Jeong, Byeong-Soo
    • Interdisciplinary Bio Central / v.4 no.4 / pp.14.1-14.6 / 2012
  • Splice site prediction in a DNA sequence is a basic search problem of finding exon/intron and intron/exon boundaries. Removing the introns and then joining the exons together forms the mRNA sequence, which is the input of the translation process. Splicing is thus a necessary step in the central dogma of molecular biology. The main task of splice site prediction is to find the candidate sequences that end in GT and AG, and then to distinguish the true sites from the false ones among those candidates. In this paper, we survey research on splice site prediction based on the support vector machine (SVM). The basic differences between these works lie in the nucleotide encoding technique and the SVM kernel selection. Some methods encode the DNA sequence in a sparse way, whereas others encode it in a probabilistic manner. The encoded sequences serve as the input of the SVM, whose task is to classify them using its learned model. The classification accuracy largely depends on selecting a proper kernel for sequence data and on selecting the kernel parameters. We examine each encoding technique and classify the techniques according to their similarity. We then discuss kernels and their parameter selection. This survey provides a basic understanding of encoding approaches and proper SVM kernel selection for splice site prediction.
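
The "sparse" encoding mentioned in this abstract is commonly realized as a one-hot code with four positions per nucleotide. The sketch below assumes an A, C, G, T ordering and uses hypothetical helper names; the surveyed papers may order the bases differently or treat ambiguous bases in other ways.

```python
NUCLEOTIDES = "ACGT"  # assumed ordering, not taken from the survey

def one_hot_encode(sequence):
    """Map a DNA string to a flat 0/1 vector with four positions per base."""
    vector = []
    for base in sequence.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        vector.extend(1 if base == candidate else 0 for candidate in NUCLEOTIDES)
    return vector

def find_dinucleotide(sequence, dinucleotide):
    """Return the start index of every occurrence of a dinucleotide such as GT or AG."""
    sequence = sequence.upper()
    return [i for i in range(len(sequence) - 1) if sequence[i:i + 2] == dinucleotide]

encoded = one_hot_encode("GTA")
gt_candidates = find_dinucleotide("AAGTCCAGGT", "GT")
ag_candidates = find_dinucleotide("AAGTCCAGGT", "AG")
```

A window of bases around each candidate index, encoded this way, is the kind of fixed-length vector an SVM can take as input.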

Optimal CNF Encoding for Representing Adjacency in Boolean Cardinality Constraints (이진 기수 조건에서 인접성 표현을 위한 최적화된 CNF 변환)

  • Park, Sa-Choun;Kwon, Gi-Hwon
    • Journal of KIISE:Software and Applications / v.35 no.11 / pp.661-670 / 2008
  • SAT solvers are used in some software engineering applications, such as the verification of software models or embedded programs. To use a SAT solver in practice, a problem must be encoded as a CNF formula, but because such a formula is less expressive than software models or source code, an optimal CNF encoding is required. In this paper, we propose optimal encoding techniques for the problem of "selecting adjacent $k \leq n$ among $n$ objects." Through experimental results, we show that the proposed constraint is efficient and correct for solving Japanese puzzles. To our knowledge, this paper is the first study of CNF encoding for adjacency in Boolean cardinality constraints (BCC).
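
To make the adjacency constraint concrete, here is a naive CNF encoding, assuming the problem means "exactly k consecutive objects are selected." It is not the optimal encoding the paper proposes: the auxiliary start variables, the pairwise at-most-one clauses, and the function names are all assumptions chosen for readability. Literals follow the usual signed-integer convention, and 1 <= k <= n is assumed.

```python
from itertools import product

def adjacent_k_of_n_cnf(n, k):
    """Naive CNF forcing exactly k consecutive objects among n to be selected.

    Variables 1..n are the objects; variables n+1..n+(n-k+1) are hypothetical
    start markers, where marker i means "the selected block starts at object i".
    """
    starts = n - k + 1
    start_var = [None] + [n + i for i in range(1, starts + 1)]  # 1-based lookup
    clauses = [start_var[1:]]                                   # at least one start
    for a in range(1, starts + 1):                              # at most one start
        for b in range(a + 1, starts + 1):
            clauses.append([-start_var[a], -start_var[b]])
    for j in range(1, n + 1):
        # Object j lies in the block exactly when the block starts in [j-k+1, j].
        covering = [start_var[i] for i in range(max(1, j - k + 1), min(j, starts) + 1)]
        clauses.append([-j] + covering)                         # selected -> covered
        for lit in covering:
            clauses.append([-lit, j])                           # covered -> selected
    return clauses, n + starts

def models_on_objects(n, k):
    """Brute-force all assignments and project the models onto the object variables."""
    clauses, num_vars = adjacent_k_of_n_cnf(n, k)
    found = set()
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in clauses):
            found.add(bits[:n])
    return found

solutions = models_on_objects(5, 3)
```

The brute-force check is only practical for tiny n; in a real setting the clause list would be handed to a SAT solver instead.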

Summarization Based Multi-news Title Extraction Using Term Relevance Estimation and Byte Pair Encoding (단어 관련성 추정과 바이트 페어 인코딩(Byte Pair Encoding)을 이용한 요약 기반 다중 뉴스 기사 제목 추출)

  • Yu, Hongyeon;Lee, Seungwoo;Ko, Youngjoong
    • Annual Conference on Human and Language Technology / 2018.10a / pp.115-119 / 2018
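  • Multi-document title extraction refers to extracting a title for multiple documents that share a single topic. In general, much of the research on multi-document title extraction treats the set of documents as a single document, extracts keywords as title candidates, and then lists the extracted candidates. However, this approach has two major limitations. First, simply treating multiple documents as one document makes it difficult to extract a title that reflects the overall topic. Second, methods that combine keywords can produce unnatural titles, depending on how the keyword units are determined. To address these limitations, this paper proposes a summarization-based multi-news title extraction method using term relevance estimation and Byte Pair Encoding. For evaluation, we used sets of multiple news articles on a total of 12 automatically clustered topics; in a qualitative evaluation by professionally trained researchers, the method obtained an average score of 3.68 out of 5.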
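
For readers unfamiliar with Byte Pair Encoding, the sketch below shows the general merge-learning idea on a toy English word list: repeatedly merge the most frequent adjacent symbol pair, then reuse the learned merges to segment new words. It is a generic character-level version with no end-of-word marker; the function names and the toy corpus are assumptions, and the abstract does not describe how the authors configure BPE for Korean news text.

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in a symbol tuple with the joined symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)

def learn_bpe(words, num_merges):
    """Learn an ordered list of merges from a list of words (toy corpus)."""
    vocab = Counter(tuple(word) for word in words)   # symbol sequences with counts
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break
        best = max(pair_counts, key=pair_counts.get)  # most frequent adjacent pair
        merges.append(best)
        new_vocab = Counter()
        for symbols, count in vocab.items():
            new_vocab[merge_pair(symbols, best)] += count
        vocab = new_vocab
    return merges

def segment(word, merges):
    """Apply the learned merges, in order, to split a word into subword units."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols

merges = learn_bpe(["low", "low", "lower", "lowest"], num_merges=2)
pieces = segment("lowly", merges)
```

Because unseen words fall back to smaller learned units, subword segmentation of this kind avoids depending on a fixed keyword unit.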

Real-time fractal coding implementation using the PC (PC를 이용한 실시간 프랙탈 부호화 구현)

  • 김재철;박종식
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.11 / pp.2789-2800 / 1996
  • Real-time fractal coding of successive QCIF 144*176 luminance images has been implemented on a 50 MHz IBM 486 personal computer. To satisfy the required frame encoding speed and data compression ratio, the following algorithms are adopted. To minimize encoding time, an extended SAS that does not search domain blocks is used. To reduce the bits per pixel, the conventional 4*4 range block is extended to an 8*8 range block. Because this range block extension decreases the quality of the decoded image, the paper applies a quad-tree partition method to improve it. To decide whether to divide an 8*8 range block, the self-similarity between the 8*8 range block and the spatially contracted 8*8 domain block is evaluated. The block is partitioned according to this self-similarity, and the increase in encoding time caused by block partitioning is minimized. The number of fractal factors is also varied according to the self-similarity between the 8*8 range block and the spatially contracted 8*8 domain block. In this way, the transmission rate and encoding time are reduced while the loss of decoded image quality is minimized. The results make real-time fractal coding possible. For the Claire test image, the average PSNR was 32.4 dB, the bit rate was 0.12, and the coding time was 33 ms per frame.

Performance Analysis of Coding According to the Interpolation filter in Inter layer Intra Prediction of H.264/SVC (H.264/SVC의 계층간 화면내 예측에서 보간법에 따른 부호화 성능 분석)

  • Gil, Dae-Nam;Cheong, Cha-Keon
    • Proceedings of the IEEK Conference / 2009.05a / pp.225-227 / 2009
  • H.264/SVC, an international standard specification extended from H.264/AVC, was established to promote the free use of large multimedia data in various channel environments. H.264/AVC is an international standard specification for video compression by the JVT of ISO/IEC MPEG and ITU-T VCEG, and it has been adopted and commercialized as the standard for DMB broadcasting. To improve encoding efficiency, the SVC standard uses the 'intra/inter prediction' of AVC as well as 'inter-layer intra prediction', 'inter-layer motion prediction', and 'inter-layer residual prediction'. Among these prediction technologies, 'inter-layer intra prediction' uses the co-located block of the upsampled lower layer as the prediction signal. Here, the choice of interpolation is one of the most important factors determining encoding efficiency. SVC currently applies poly-phase FIR filters of 4 taps and 2 taps, respectively, to luma components. The purpose of this paper is to analyze encoding performance according to the interpolation filter. To this end, we applied 2-tap, 4-tap, and 6-tap poly-phase FIR filters to luma components and measured the bit rate, PSNR, and running time of the interpolation filter. We expect the analysis results of this paper to be useful for the effective application of interpolation filters.

New Encoding Method for Low Power Sequential Access ROMs

  • Cho, Seong-Ik;Jung, Ki-Sang;Kim, Sung-Mi;You, Namhee;Lee, Jong-Yeol
    • JSTS:Journal of Semiconductor Technology and Science / v.13 no.5 / pp.443-450 / 2013
  • This paper proposes a new ROM data encoding method that takes the sequential access pattern into account to reduce the power consumption of ROMs used in applications, such as FIR filters, that access the ROM sequentially. In the proposed encoding method, the number of 1's, whose increase raises the power consumption, is reduced by applying an exclusive-or (XOR) operation to each bit pair composed of two consecutive bits in a bit line. The encoded data can be decoded by using XOR gates and D flip-flops, which are commonly used in digital systems for synchronization and glitch suppression. By applying the proposed encoding method to the coefficient ROMs of FIR filters designed with various design methods, we achieve an average power reduction of 43.7% relative to the unencoded original data, which is larger than the reductions achieved by previous methods.
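
One way to read the XOR scheme in this abstract is as differential encoding along a bit line: each stored bit is the XOR of the current and previous data bits, and a decoder built from one XOR gate and one D flip-flop (holding the previously decoded bit) restores the data. The sketch below assumes that reading and a previous bit of 0 before the first access; the paper's actual bit pairing and decoder details may differ, and the function names are illustrative.

```python
def encode_bit_line(bits):
    """Store bits[i] XOR bits[i-1]; the bit before the first one is assumed to be 0."""
    previous = 0
    encoded = []
    for bit in bits:
        encoded.append(bit ^ previous)
        previous = bit
    return encoded

def decode_bit_line(encoded):
    """Recover the data by XOR-ing each stored bit with the previously decoded bit."""
    previous = 0  # models the assumed reset state of the D flip-flop
    decoded = []
    for bit in encoded:
        previous = bit ^ previous
        decoded.append(previous)
    return decoded

bit_line = [1, 1, 1, 1, 0, 0, 1, 1]   # successive words seen on one bit line
stored = encode_bit_line(bit_line)
restored = decode_bit_line(stored)
```

In this toy bit line the number of stored 1's drops from six to three. Under this reading the saving comes from runs of identical bits, so data that toggles on every access would not benefit.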