• Title/Summary/Keyword: invariant

A study on end-to-end speaker diarization system using single-label classification (단일 레이블 분류를 이용한 종단 간 화자 분할 시스템 성능 향상에 관한 연구)

  • Jaehee Jung;Wooil Kim
    • The Journal of the Acoustical Society of Korea / v.42 no.6 / pp.536-543 / 2023
  • Speaker diarization, which labels "who spoke when" in speech with multiple speakers, has been studied with deep neural network-based end-to-end methods that label overlapped speech and optimize the diarization model directly. Most deep neural network-based end-to-end speaker diarization systems treat the task as a multi-label classification problem that predicts the labels of all speakers active in each frame of speech. However, the performance of a multi-label model varies greatly depending on the threshold setting. This paper studies a speaker diarization system using single-label classification so that diarization can be performed without thresholds. The proposed model converts the speaker labels into a single label and estimates that label from the model output. To account for speaker label permutations during training, the proposed model uses a combination of Permutation Invariant Training (PIT) loss and cross-entropy loss. In addition, the paper studies how to add residual connections to the model so that deep speaker diarization models can be trained effectively. The experiments used simulated noisy two-speaker data generated from the LibriSpeech database. Compared with the baseline model in terms of Diarization Error Rate (DER), the proposed method labels speech without a threshold and improves performance by about 20.7 %.
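
The single-label conversion and the permutation-invariant cross-entropy described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bitmask encoding in `to_single_label`, the function names, and the toy probabilities are all assumptions introduced here.

```python
import math
from itertools import permutations


def to_single_label(active):
    """Map per-frame 0/1 speaker activities to one class index.

    Assumed encoding (hypothetical): a bitmask, so for two speakers
    (0,0)->0, (1,0)->1, (0,1)->2, (1,1)->3.
    """
    return sum(bit << i for i, bit in enumerate(active))


def cross_entropy(probs, target_class):
    """Cross-entropy of one frame; the floor avoids log(0)."""
    return -math.log(max(probs[target_class], 1e-12))


def pit_cross_entropy(frame_probs, frame_activities):
    """Return (loss, permutation) minimizing mean cross-entropy over
    all orderings of the reference speakers."""
    n_speakers = len(frame_activities[0])
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n_speakers)):
        total = 0.0
        for probs, active in zip(frame_probs, frame_activities):
            permuted = tuple(active[p] for p in perm)
            total += cross_entropy(probs, to_single_label(permuted))
        loss = total / len(frame_probs)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


def decode(probs):
    """Threshold-free decoding: take the argmax class and unpack it
    back into per-speaker activities."""
    cls = max(range(len(probs)), key=probs.__getitem__)
    n_speakers = (len(probs) - 1).bit_length()
    return tuple((cls >> i) & 1 for i in range(n_speakers))
```

Because decoding is an argmax over classes, no per-speaker activity threshold has to be tuned, which is the property the abstract emphasizes.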

A Study on 3-Dimensional Near-Field Source Localization Using Interference Pattern Matching in Shallow Water Environments (천해에서 간섭패턴 정합을 이용한 근거리 음원의 3차원 위치추정 기법연구)

  • Kim, Se-Young;Chun, Seung-Yong;Son, Yoon-Jun;Kim, Ki-Man
    • The Journal of the Acoustical Society of Korea / v.28 no.4 / pp.318-327 / 2009
  • In this paper, we propose a 3-D geometric localization method for a near-field broadband source in shallow water environments. According to waveguide invariant theory, the slope of the interference pattern seen in a sensor spectrogram is directly proportional to the range of the source. The ratio of the ranges between the source and the sensors is estimated by matching two interference patterns in the spectrograms. This ratio is then applied to a circle of Apollonius, which is the locus of source positions whose range ratio from two sensors is constant. The two Apollonius circles obtained from three sensors intersect at a point that gives the horizontal range and the azimuth angle of the source. This intersection point does not change with source depth. The source depth can therefore be estimated using a 3-D hyperboloid equation, on which the range difference from two sensors is constant. To evaluate the performance of the proposed localization algorithm, a simulation is performed using an acoustic propagation program, and the localization error is analyzed. In the simulation results, the estimation errors for range and depth are within 50 m and 15 m, respectively.
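
The circle of Apollonius used in this abstract has a closed form for its center and radius. The sketch below is an illustration under assumptions: sensor positions are 2-D horizontal coordinates, the range ratio `k` is taken as already estimated, and the function name is hypothetical.

```python
import math


def apollonius_circle(a, b, k):
    """Circle of points P with |P - a| / |P - b| = k.

    a, b: (x, y) sensor positions; k: positive range ratio, k != 1.
    For k == 1 the locus is the perpendicular bisector, not a circle.
    """
    if k <= 0 or math.isclose(k, 1.0):
        raise ValueError("k must be positive and different from 1")
    k2 = k * k
    center = (
        (a[0] - k2 * b[0]) / (1.0 - k2),
        (a[1] - k2 * b[1]) / (1.0 - k2),
    )
    radius = k * math.dist(a, b) / abs(1.0 - k2)
    return center, radius
```

For example, with sensors at (0, 0) and (3, 0) and a ratio of 2, the circle is centered at (4, 0) with radius 2, and every point on it is twice as far from the first sensor as from the second.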

Underwater Target Localization Using the Interference Pattern of Broadband Spectrogram Estimated by Three Sensors (3개 센서의 광대역 신호 스펙트로그램에 나타나는 간섭패턴을 이용한 수중 표적의 위치 추정)

  • Kim, Se-Young;Chun, Seung-Yong;Kim, Ki-Man
    • The Journal of the Acoustical Society of Korea / v.26 no.4 / pp.173-181 / 2007
  • In this paper, we propose a moving target localization algorithm using acoustic spectrograms. A time-versus-frequency spectrogram provides information about the trajectory of a moving underwater target. For a source at sufficiently long range from a receiver, the broadband striation patterns seen in the spectrogram represent the mutual interference between modes reflected by the surface and the bottom. The slope of the maximum-intensity striation is influenced by the waveguide invariant parameter $\beta$ and the distance between the target and the sensor. When more than two sensors measure the noise radiated by a moving ship, the slope and frequency of the maximum-intensity striation depend on the distance between the target and each receiver. With two sensors at fixed points, we form a circle of Apollonius, the set of all points whose distances from the two fixed points are in a constant ratio. When three sensors are used, two such circles intersect, and the coordinates of the intersection point can be taken as the estimated position of the target. To evaluate the performance of the proposed localization algorithm, a simulation is performed using an acoustic propagation program.
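
The final step in this abstract, intersecting two circles to obtain a position estimate, can be sketched with standard plane geometry. The function name is hypothetical and each circle is assumed to be given as a center and a radius. Two circles generally meet at two points, so choosing between them needs additional information that the abstract does not specify.

```python
import math


def circle_intersections(c0, r0, c1, r1):
    """Return the intersection points of two circles (0, 1 or 2 points)."""
    (x0, y0), (x1, y1) = c0, c1
    d = math.hypot(x1 - x0, y1 - y0)
    # No usable intersection: concentric, too far apart, or one inside the other.
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    # Distance from c0 to the chord joining the intersection points.
    a = (r0 ** 2 - r1 ** 2 + d ** 2) / (2 * d)
    # Half-length of that chord.
    h = math.sqrt(max(r0 ** 2 - a ** 2, 0.0))
    xm = x0 + a * (x1 - x0) / d
    ym = y0 + a * (y1 - y0) / d
    p1 = (xm + h * (y1 - y0) / d, ym - h * (x1 - x0) / d)
    p2 = (xm - h * (y1 - y0) / d, ym + h * (x1 - x0) / d)
    return [p1] if h == 0 else [p1, p2]
```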

Characterization Of Rainrate Fields Using A Multi-Dimensional Precipitation Model

  • Yoo, Chul-sang;Kwon, Snag-woo
    • Water Engineering Research / v.1 no.2 / pp.147-158 / 2000
  • In this study, we characterized the seasonal variation of rainrate fields in the Han river basin using the WGR multi-dimensional precipitation model (Waymire, Gupta, and Rodriguez-Iturbe, 1984) by estimating and comparing the parameters derived for each month and for the plain area, the mountain area, and the overall basin. The first- and second-order statistics derived from observed point gauge data were used to estimate the model parameters with the Davidon-Fletcher-Powell optimization algorithm. The results show that the higher rainfall amount during summer is mainly due to the arrival rate of rain bands, the mean number of cells per cluster potential center, and the raincell intensity. However, the other parameters, which control the mean number of rain cells per cluster, the cellular birth rate, and the mean cell age, are found to be invariant to the rainfall amount. In the application to the downstream plain area and the upstream mountain area of the Han river basin, the estimated number of storms was a little higher in the mountain area than in the plain area, while the cell intensity was a little lower. Thus, more frequent but less intense storms can be expected in the mountain area due to the orographic effect, while the total amount of rainfall in a given period seems to remain the same.

A Prediction of Behavior of Granular Soils Based on the Advanced Elasto Plastic Model (개선된 탄.소성 구성모델을 이용한 사질토의 응력-변형률 거동예측)

  • Park, Byeong-Gi;Im, Seong-Cheol;Lee, Gang-Il
    • Geotechnical Engineering / v.11 no.3 / pp.81-90 / 1995
  • Based on a close investigation of Lade's elasto-plastic model, this study proposes a new elasto-plastic constitutive model for foundations composed of granular soils. Unlike Lade's original model, the new model contains the first stress invariant in the plastic potential function as well as in the yield function. Both of these functions, called correction functions, include a correction term. To validate the new analytical model, it was compared with some previous models. Test results were compared with numerical results from the Lade model and the new model for Sacramento River sand (U.S.A.) and Backma River sand. It was concluded that a more refined model will be developed through this research.

Sound Synthesis of Gayageum by Impulse Responses of Body and Anjok (안족과 몸통의 임펄스 응답을 이용한 가야금 사운드 합성)

  • Cho Sang-Jin;Choi Gin-Kyu;Chong Ui-Pil
    • Journal of the Institute of Convergence Signal Processing / v.7 no.3 / pp.102-107 / 2006
  • In this paper, we propose a method for synthesizing the sound of the gayageum, a Korean plucked string instrument, by physical modeling that uses the impulse responses of the body and the Anjok. The gayageum consists of three systems: string, body, and Anjok. These form a serial combination of linear time-invariant systems. The string can be modeled by a digital delay line. The body and the Anjok can be estimated from their impulse responses. We found three resonance frequencies in the body impulse response and implemented the body as a resonator. The Anjok was implemented as a high-pass filter in the fundamental frequency band of the gayageum. The RMSEs of the synthesized sounds range from 0.01 to 0.03. It was difficult to distinguish the synthesized sounds from the original sounds by ear.
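
A serial combination of linear time-invariant systems, as described in this abstract, is equivalent to a single system whose impulse response is the convolution of the individual responses. The sketch below illustrates that property only. The function names are hypothetical and the short impulse responses in the usage note are toy values, not measurements from the paper.

```python
def convolve(x, h):
    """Direct-form discrete convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def cascade(signal, *impulse_responses):
    """Pass a signal through LTI stages connected in series."""
    out = list(signal)
    for h in impulse_responses:
        out = convolve(out, h)
    return out


# Toy impulse responses (assumed values, for illustration only).
string_ir = [1.0, 0.0, 0.0, 0.5]   # delay line with one echo
anjok_ir = [1.0, -0.9]             # crude high-pass characteristic
body_ir = [0.6, 0.3, 0.1]          # short resonant tail
pluck = [1.0, 0.5]                 # excitation signal

stage_by_stage = cascade(pluck, string_ir, anjok_ir, body_ir)
combined_ir = cascade([1.0], string_ir, anjok_ir, body_ir)
single_system = convolve(pluck, combined_ir)
```

Here `stage_by_stage` and `single_system` agree up to floating-point rounding, and reordering the stages does not change the output, which is why the string, Anjok, and body can be measured and modeled separately.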

CONVERGENCE ANALYSIS OF THE EAPG ALGORITHM FOR NON-NEGATIVE MATRIX FACTORIZATION

  • Yang, Chenxue;Ye, Mao
    • Journal of applied mathematics & informatics / v.30 no.3_4 / pp.365-380 / 2012
  • Non-negative matrix factorization (NMF) is a very efficient method for explaining the relationship between functions and for finding basis information in multivariate non-negative data. The multiplicative update (MU) algorithm is a popular approach to solving the NMF problem, but it fails to approach a stationary point and suffers from inner iterations and zero divisors. The elementwisely alternating projected gradient (eAPG) algorithm was proposed to overcome these defects. In this paper, we use the stability of the equilibrium point to prove the convergence of the eAPG algorithm. Using a classic model, the equilibrium point is obtained and invariant sets are constructed to guarantee the integrity of the stability. Finally, convergence conditions for the eAPG algorithm are obtained, which can accelerate the convergence. In addition, the conditions under which the non-zero equilibrium point exists and is stable can cause the algorithm to converge to different values. Both results are confirmed in the experiments. We also give a mathematical proof that the eAPG algorithm reaches a specified precision in fewer iterations than the MU algorithm. Thus, we theoretically illustrate the advantages of the eAPG algorithm.
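
For context, the multiplicative update (MU) baseline mentioned in this abstract can be sketched as below. This is the widely used multiplicative rule for the squared-error objective, assumed here because the abstract does not state the update formula, and it is not the eAPG algorithm. The helper names and the small `eps` guard against the zero-divisor problem are also assumptions.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def mu_step(v, w, h, eps=1e-12):
    """One multiplicative update of H, then of W.

    eps keeps the denominators away from zero (an assumed workaround).
    """
    wt = transpose(w)
    num, den = matmul(wt, v), matmul(matmul(wt, w), h)
    h = [[h[i][j] * num[i][j] / (den[i][j] + eps)
          for j in range(len(h[0]))] for i in range(len(h))]
    ht = transpose(h)
    num, den = matmul(v, ht), matmul(w, matmul(h, ht))
    w = [[w[i][j] * num[i][j] / (den[i][j] + eps)
          for j in range(len(w[0]))] for i in range(len(w))]
    return w, h


def nmf_mu(v, w, h, iterations=50):
    """Run a fixed number of MU steps from the given non-negative start."""
    for _ in range(iterations):
        w, h = mu_step(v, w, h)
    return w, h
```

The factors stay non-negative because each update only multiplies non-negative entries by non-negative ratios, so no projection step is needed.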

Implementation of an Indoor Mobile Robot and Environment Recognition using Line Histogram Method (실내 자율주행 로봇의 구현 및 라인 히스토그램을 이용한 환경인식)

  • Moon, Chan-Woo;Lee, Young-Dae
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.9 no.2 / pp.45-50 / 2009
  • Environment exploration is an essential process for indoor robots such as cleaning robots and security robots. Apartment houses and office buildings share a common frame structure, but the internal arrangement of each room may differ slightly. It is therefore more convenient to use a common frame map than to build a new map every time the arrangement changes. In this case, it is important to recognize invariant features such as walls, doors, and windows. In this paper, an indoor mobile robot is implemented, and an environment exploration method that uses laser scanner data and a line segment histogram with respect to segment orientation and distance is presented and tested. The robot is fitted with a laser scanner, a gyro sensor, an ultrasonic sensor, and an IR sensor, and is programmed in C.

Measurement and Forecast of the Visibility Range according to Illuminance and the Character Sizes (조도와 글자 크기에 따른 가시거리 측정과 예상)

  • Kim, Tae-Hyun
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.1 / pp.425-429 / 2014
  • The visibility range is defined as the distance from which one can see. It can change with illuminance, character size, eyesight, and so on. In this paper, the visibility range of 120 students is measured for 4 character sizes and 3 illuminations in a classroom. To forecast the visibility range for unmeasured data, functions whose independent variable is illuminance and whose dependent variable is the visibility range are proposed using least-squares approximation theory. Because the visibility range is invariant according to illuminance, common logarithmic functions are used for the 4 character sizes. The small difference between the postulated functions and the measured data verifies the accuracy of the functions.
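
A least-squares fit of a common-logarithm function, as described in this abstract, can be sketched as follows. The model form d = a + b·log10(E), the function names, and the sample numbers in the usage note are assumptions for illustration. The paper's actual functions and measurements are not given in the abstract.

```python
import math


def fit_log10(illuminances, distances):
    """Least-squares fit of distance = a + b * log10(illuminance).

    Returns (a, b). Requires at least two distinct illuminance values.
    """
    xs = [math.log10(e) for e in illuminances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, distances))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, illuminance):
    """Forecast the visibility range at an unmeasured illuminance."""
    return a + b * math.log10(illuminance)
```

With three hypothetical measurements for one character size, for instance illuminances of 10, 100, and 1000 paired with ranges of 3, 5, and 7, the fit gives a ≈ 1 and b ≈ 2, and `predict` can then be evaluated at any other illuminance. One such fit would be made per character size.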

The Haar Function Approach for the Unknown Input Observer Design (미지입력 관측기 설계를 위한 하알함수 접근법)

  • 김진태;이한석;임윤식;김종부;이명규
    • Journal of the Institute of Electronics Engineers of Korea SC / v.40 no.3 / pp.117-126 / 2003
  • This paper proposes a real-time application of Walsh functions based on an on-line Walsh transformation and an on-line differential operation for Walsh functions. A major disadvantage of existing orthogonal-function methods is that process signals must be recorded before their expansions can be obtained. This paper proposes a novel Walsh transformation method to overcome this shortcoming. The proposed method is applied to unknown input observer (UIO) design for linear time-invariant dynamical systems.