• Title/Summary/Keyword: Statistical similarity


Optimized Chinese Pronunciation Prediction by Component-Based Statistical Machine Translation

  • Zhu, Shunle
    • Journal of Information Processing Systems
    • /
    • v.17 no.1
    • /
    • pp.203-212
    • /
    • 2021
  • To eliminate ambiguities in the existing methods to simplify Chinese pronunciation learning, we propose a model that can predict the pronunciation of Chinese characters automatically. The proposed model relies on a statistical machine translation (SMT) framework. In particular, we consider the components of Chinese characters as the basic unit and consider the pronunciation prediction as a machine translation procedure (the component sequence as a source sentence, the pronunciation, pinyin, as a target sentence). In addition to traditional features such as the bidirectional word translation and the n-gram language model, we also implement a component similarity feature to overcome some typos during practical use. We incorporate these features into a log-linear model. The experimental results show that our approach significantly outperforms other baseline models.
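
The log-linear combination mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming the standard SMT form in which a candidate's score is the weighted sum of the logs of its feature values; the feature names, feature values, weights, and pinyin candidates are hypothetical and are not taken from the paper.

```python
import math

def log_linear_score(features, weights):
    """Score a candidate as sum_i lambda_i * log(h_i) over its feature values."""
    return sum(weights[name] * math.log(value) for name, value in features.items())

def best_candidate(candidates, weights):
    """Return the candidate with the highest log-linear score."""
    return max(candidates, key=lambda c: log_linear_score(c["features"], weights))

# Hypothetical feature values (each in (0, 1]) for two pinyin candidates.
candidates = [
    {"pinyin": "ma",
     "features": {"translation": 0.6, "language_model": 0.5, "component_similarity": 0.9}},
    {"pinyin": "mu",
     "features": {"translation": 0.3, "language_model": 0.4, "component_similarity": 0.8}},
]
# Hypothetical feature weights; in practice these would be tuned on held-out data.
weights = {"translation": 1.0, "language_model": 0.5, "component_similarity": 0.3}

best = best_candidate(candidates, weights)
print(best["pinyin"])
```

Working in log space keeps the model a plain weighted sum, so adding a feature such as component similarity only requires one more term and one more weight.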

Evaluations of Museum Recommender System Based on Different Visitor Trip Times

  • Sanpechuda, Taweesak;Kovavisaruch, La-or
    • Journal of information and communication convergence engineering
    • /
    • v.20 no.2
    • /
    • pp.131-136
    • /
    • 2022
  • The recommendation system applied in museums has been widely adopted owing to its advanced technology. However, it is unclear which recommendation method is suitable for indoor museum guidance. This study evaluated a recommender system based on social-filtering and statistical methods applied to actual museum databases. We evaluated both methods using two different datasets. Statistical methods use collective data, whereas social methods use individual data. The results showed that both methods could provide significantly better results than random methods. However, we found that the trip time length and the dataset size affect the performance of both methods. The social-filtering method provides better performance for long trip periods and includes more complex calculations, whereas the statistical method provides better performance for short trip periods. The critical points are defined to indicate the trip time for which the performances of both methods are equal.

A new method for automatic areal feature matching based on shape similarity using CRITIC method (CRITIC 방법을 이용한 형상유사도 기반의 면 객체 자동매칭 방법)

  • Kim, Ji-Young;Huh, Yong;Kim, Doe-Sung;Yu, Ki-Yun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.29 no.2
    • /
    • pp.113-121
    • /
    • 2011
  • In this paper, we propose a method for automatically matching areal features based on shape similarity using spatial information. To this end, we extract candidate matching pairs that intersect between two different spatial datasets and then measure their shape similarity, which is calculated as a weighted sum of the matching criteria with weights derived automatically by the CRITIC method. Matching pairs are selected when their similarity exceeds a threshold determined by outlier detection with an adjusted boxplot on training data. After applying this method to two distinct spatial datasets, a digital topographic map and a street-name address base map, we confirmed in visual evaluation that the matched buildings have similar shapes and large overlapping areas, and in statistical evaluation that the F-measure is as high as 0.932.
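
The CRITIC weighting step named in this abstract can be sketched as below. This is a minimal illustration of the general CRITIC procedure (min-max normalization, then weight proportional to standard deviation times summed conflict `1 - r`), not the authors' implementation; the three criterion columns and their scores are hypothetical, and the sketch assumes no criterion is constant across candidates.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def critic_weights(columns):
    """columns[j] holds the scores of criterion j for every candidate pair."""
    # Min-max normalize each criterion to [0, 1] (assumes max > min).
    norm = []
    for col in columns:
        lo, hi = min(col), max(col)
        norm.append([(v - lo) / (hi - lo) for v in col])
    info = []
    for cj in norm:
        mean = sum(cj) / len(cj)
        sigma = math.sqrt(sum((v - mean) ** 2 for v in cj) / (len(cj) - 1))
        # Conflict with the other criteria; the k == j term contributes 0.
        conflict = sum(1.0 - pearson(cj, ck) for ck in norm)
        info.append(sigma * conflict)
    total = sum(info)
    return [c / total for c in info]

# Hypothetical scores of three matching criteria for four candidate pairs.
columns = [
    [0.9, 0.4, 0.7, 0.2],
    [0.8, 0.5, 0.6, 0.1],
    [0.3, 0.9, 0.2, 0.6],
]
weights = critic_weights(columns)

# Shape similarity of the first candidate pair as the weighted sum of its criteria.
similarity_first_pair = sum(w * col[0] for w, col in zip(weights, columns))
print(weights, similarity_first_pair)
```

A criterion that varies strongly and disagrees with the others receives a larger weight, which is why the third column dominates in this toy data.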

Image Recognition by Using Hybrid Coefficient Measure of Correlation and Distance (상관계수과 거리계수의 조합형 척도를 이용한 영상인식)

  • Hong, Seong-Jun;Cho, Yong-Hyun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.20 no.3
    • /
    • pp.343-347
    • /
    • 2010
  • This paper presents an efficient image recognition method using a hybrid coefficient measure of correlation and distance. The correlation coefficient measures statistical similarity using the Pearson coefficient, and the distance coefficient measures spatial similarity using the city-block distance. The total similarity among images is calculated by extending the similarity between feature vectors, where the feature vectors are extracted by PCA and ICA, respectively. The proposed method has been applied to the recognition of 960 (30 persons * 4 expressions * 2 lights * 4 poses) facial images of 40*50 pixels. The experimental results show that the proposed method with ICA achieves better recognition performance than the method using PCA and is less affected by environmental influences such as lighting.
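
The two ingredients of this hybrid measure can be sketched as follows. The abstract does not state how the Pearson coefficient and the city-block distance are combined, so the blend in `hybrid_similarity` (a convex combination of the correlation and `1 / (1 + distance)`, controlled by a hypothetical `alpha`) is an assumption made purely for illustration, as are the feature vectors and labels.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient: the statistical similarity term."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def city_block(u, v):
    """City-block (L1) distance: the spatial term."""
    return sum(abs(a - b) for a, b in zip(u, v))

def hybrid_similarity(u, v, alpha=0.5):
    # Assumed combination, NOT the paper's formula: higher means more similar.
    return alpha * pearson(u, v) + (1.0 - alpha) / (1.0 + city_block(u, v))

def recognize(query, gallery):
    """Return the gallery label whose feature vector is most similar to the query."""
    return max(gallery, key=lambda label: hybrid_similarity(query, gallery[label]))

# Hypothetical feature vectors (e.g. PCA or ICA projections of face images).
gallery = {"person_a": [0.9, 0.1, 0.4], "person_b": [0.2, 0.8, 0.5]}
query = [0.8, 0.2, 0.4]
print(recognize(query, gallery))
```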

Building a Robust 3D Statistical Shape Model of the Mandible (견고한 3차원 하악골 통계 형상 모델 생성)

  • Yoo, Ji-Hyun;Hong, Helen
    • Journal of KIISE:Software and Applications
    • /
    • v.35 no.2
    • /
    • pp.118-127
    • /
    • 2008
  • In this paper, we propose a method for constructing a robust 3D statistical shape model from mandible CT datasets. Our method consists of the following four steps. First, we decompose a 3D input shape into patches. Second, to generate a corresponding shape of a floating shape, all shapes in the training set are parameterized onto a disk similar to the patch topology. Third, we generate the corresponding shape by one-to-one mapping between the reference and the floating shapes, and we solve the problem of failing to generate corresponding points near the patch boundary. Finally, the corresponding shapes are aligned with the reference shape. The statistical shape model is then generated by principal component analysis. To evaluate the accuracy of our 3D statistical shape model of the mandible, we perform visual inspection and a similarity measurement using the average distance difference between the floating and the corresponding shapes. In addition, we measure the compactness of the statistical shape model using the modes of variation. Experimental results show that our 3D statistical shape model, generated from mandible CT datasets with various characteristics, has a high similarity between the floating and corresponding shapes and is represented by a small number of modes.

A Study of Document Ranking Algorithms in a P-norm Retrieval System (P-norm 검색의 문헌 순위화 기법에 관한 실험적 연구)

  • 고미영;정영미
    • Journal of the Korean Society for Information Management
    • /
    • v.16 no.1
    • /
    • pp.7-30
    • /
    • 1999
  • This study develops effective document ranking algorithms for the P-norm retrieval system, which can be implemented on top of a Boolean retrieval system without major difficulty, by using non-statistical term weights based on document structure. It also aims to enhance performance by introducing a rank adjustment process that rearranges the ranks of retrieved documents according to the similarity between the top-ranked documents and the rest. Of the non-statistical term weight algorithms, this study uses the field weight and the term pair distance weight. For the rank adjustment process, five retrieval experiments were performed, ranging from using one record for the similarity measurement to using the first five records. The results show that non-statistical term weights are highly effective and that the rank adjustment process enhances the performance further.
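
For readers unfamiliar with P-norm retrieval, the similarity functions of the general P-norm extended Boolean model can be sketched as below. This shows the textbook OR and AND formulas only; the term weights are hypothetical placeholders and are not the field weights or term pair distance weights developed in the study.

```python
def pnorm_or(doc_weights, query_weights, p):
    """Similarity of a document to an OR query under the P-norm model."""
    num = sum((a ** p) * (d ** p) for a, d in zip(query_weights, doc_weights))
    den = sum(a ** p for a in query_weights)
    return (num / den) ** (1.0 / p)

def pnorm_and(doc_weights, query_weights, p):
    """Similarity of a document to an AND query under the P-norm model."""
    num = sum((a ** p) * ((1.0 - d) ** p) for a, d in zip(query_weights, doc_weights))
    den = sum(a ** p for a in query_weights)
    return 1.0 - (num / den) ** (1.0 / p)

# Hypothetical weights in [0, 1] of two query terms within one document.
doc = [0.2, 0.8]
query = [1.0, 1.0]

for p in (1, 2, 50):
    print(p, pnorm_or(doc, query, p), pnorm_and(doc, query, p))
```

With `p = 1` both operators reduce to the same weighted average, while a large `p` pushes OR toward the maximum term weight and AND toward the minimum, which approximates strict Boolean behavior.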


Reduction of Simulation Number for Ship Handling Safety Assessment (선박운항 시뮬레이터 실험조건 축소화 연구)

  • Kwon, S.H.;Oh, H.S.
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.35 no.1
    • /
    • pp.101-106
    • /
    • 2012
  • A ship handling simulator is a virtual ship navigation system with a three-dimensional screen system and simulation programs. FTS simulation can theoretically run an unlimited number of experiments without time constraints, but it yields only deterministic observations. RTS simulation can collect statistical observations but has the disadvantage of requiring at least 30 minutes for a single experiment. Previous studies suggested that the number of experiment conditions to be tested could be reduced, while still obtaining random data with RTS simulation, by focusing on the conditions that are most difficult for ship handling. That approach has the limitation that it cannot estimate the distribution of ship handling difficulty along the route. In this paper, similarity and clustering analyses are suggested as a methodology for reducing the number of experiment conditions. The similarity of experiment conditions is measured by the Euclidean distance of the ship handling difficulty index and by the correlation matrix of distance differences from the designed route. Clustering analysis and multi-dimensional scaling are applied to classify experiment conditions by the measured similarity, thereby reducing the number of RTS simulation conditions. An empirical result for Dangin harbor is shown and discussed.

Improvement of Relevance Feedback for Image Retrieval (영상 검색을 위한 적합성 피드백의 개선)

  • Yoon, Su-Jung;Park, Dong-Kwon;Won, Chee-Sun
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.39 no.4
    • /
    • pp.28-37
    • /
    • 2002
  • In this paper, we present an image retrieval method that improves retrieval performance by fusing a probabilistic method with query point movement. In the proposed algorithm, the similarity from the probabilistic method and the similarity from query point movement are fused when computing the similarity between a query image and a database image. The probabilistic method used in this paper is suitable for handling negative examples, whereas query point movement exploits the statistical properties of positive examples. By combining these two methods, we aim to overcome their respective shortcomings. Experimental results show that the proposed method performs better than either the probabilistic method or query point movement alone.

Micro-seismic monitoring in mines based on cross wavelet transform

  • Huang, Linqi;Hao, Hong;Li, Xibing;Li, Jun
    • Earthquakes and Structures
    • /
    • v.11 no.6
    • /
    • pp.1143-1164
    • /
    • 2016
  • Time Delay of Arrival (TDOA) estimation methods based on correlation function analysis play an important role in micro-seismic event monitoring. They make full use of the similarity among recorded signals that come from the same source. However, these methods are susceptible to noise, particularly when the global similarity of the signals is low. This paper proposes a new approach for micro-seismic monitoring based on the cross wavelet transform. The cross wavelet transform is used to analyse the signals measured during micro-seismic events, and the cross wavelet power spectrum is used to measure the similarity of two signals in a multi-scale dimension and subsequently identify the TDOA. The offset time instant associated with the maximum cross wavelet transform spectrum power is identified as the TDOA, and then the location of the micro-seismic event can be identified. Individual and statistical identification tests are performed with measurement data from an in-field mine. Experimental studies demonstrate that the proposed approach significantly improves the robustness and accuracy of micro-seismic source locating in mines compared to several existing methods, such as the cross-correlation, multi-correlation, STA/LTA and Kurtosis methods.
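
The correlation-based TDOA estimation that this abstract uses as a baseline can be sketched as below. This illustrates plain time-domain cross-correlation only, not the proposed cross wavelet approach; the two synthetic signals and the 1000 Hz sampling rate are hypothetical.

```python
def cross_correlation_delay(x, y, max_lag):
    """Return the lag (in samples) that maximizes sum_n x[n] * y[n + lag]."""
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        value = 0.0
        for n, xv in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                value += xv * y[m]
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag

# Synthetic example: the same pulse reaches sensor y four samples after sensor x.
pulse = [1.0, 3.0, 1.0]
x = [0.0] * 20
y = [0.0] * 20
x[5:8] = pulse
y[9:12] = pulse

sampling_rate_hz = 1000.0  # hypothetical sampling rate
delay_samples = cross_correlation_delay(x, y, max_lag=10)
tdoa_seconds = delay_samples / sampling_rate_hz
print(delay_samples, tdoa_seconds)
```

Because the estimate depends on a single global correlation peak, added noise can shift or flatten that peak, which is the weakness the abstract attributes to correlation-based methods.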

A Statistical Approach to Scene Change Detection using Motion Compensation in MPEG (움직임 보상을 이용한 MPEG 비디오의 통계적 장면전환검출)

  • Jang, Dong-Sik;Kwon, Do-Kyoung;Lee, Man-Hee
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.7 no.5
    • /
    • pp.440-450
    • /
    • 2001
  • This paper proposes an effective algorithm for abrupt scene change detection in MPEG bitstreams. The proposed algorithm restores DC images by decoding only the DC coefficients, estimates new motion vectors between adjacent DC images, and detects scene changes using a similarity measure between frames. Specifically, it calculates a similarity measure between adjacent frames, i.e., the motion-compensated inter-frame correlation, and detects a scene change by comparing this measure with a threshold value that is independent of the sequence. Experimental results show that the proposed algorithm achieves more than 90% 'recall' and 'precision' on almost all sequences, which can be considered better than other algorithms that use sequence-dependent threshold values.
