• Title/Summary/Keyword: statistical similarity (통계적 유사성)

Stochastic Self-similarity Analysis and Visualization of Earthquakes on the Korean Peninsula (한반도에서 발생한 지진의 통계적 자기 유사성 분석 및 시각화)

  • JaeMin Hwang;Jiyoung Lim;Hae-Duck J. Jeong
    • KIPS Transactions on Software and Data Engineering / v.12 no.11 / pp.493-504 / 2023
  • The Republic of Korea lies far from tectonic plate boundaries, and the intraplate earthquakes that occur in such regions are generally smaller and less frequent than interplate earthquakes. Nevertheless, an investigation and analysis of earthquakes that occurred on the Korean Peninsula between 2 AD and 1904, together with recently observed earthquakes, found earthquakes of a magnitude of 9. In this paper, the Korean Peninsula Historical Earthquake Record (2 AD to 1904) published by the National Meteorological Research Institute is used to analyze the relationship between earthquakes on the Korean Peninsula and statistical self-similarity. This paper is the first to investigate the relationship between earthquake data from the Korean Peninsula and statistical self-similarity. Measuring the degree of self-similarity of these earthquakes with three quantitative estimation methods gave a self-similarity parameter H (0.5 < H < 1) above 0.8 on average, indicating a high degree of self-similarity. Graph visualization makes it easy to see in which regions earthquakes occur most often, and the results are expected to be useful in developing a system that predicts damage from future earthquakes and minimizes harm to people and property, as well as in earthquake data analysis and modeling research. Based on these findings, the self-similar process is expected to help in understanding the patterns and statistical characteristics of seismic activity, in grouping and classifying similar seismic events, and in seismic activity prediction, seismic risk assessment, and seismic engineering.
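
The abstract does not name its three estimators of H. As a hedged illustration only, the sketch below uses the aggregated-variance method, one common estimator, and it may or may not be among the three the authors used. The function name, the block sizes, and the synthetic input are assumptions. For a self-similar process the variance of block means of size m scales like m^(2H - 2), so H is recovered from the slope of a log-log fit.

```python
import math
import random
import statistics


def aggregated_variance_hurst(series, block_sizes):
    """Estimate the Hurst parameter H with the aggregated-variance method.

    The variance of block means of size m scales like m**(2H - 2),
    so H = 1 + slope / 2 on a log-log plot of variance against m.
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue  # too few blocks to estimate a variance
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        var = statistics.pvariance(means)
        if var > 0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))

    # Ordinary least-squares slope of log(variance) against log(block size).
    mean_x = sum(log_m) / len(log_m)
    mean_y = sum(log_var) / len(log_var)
    sxx = sum((x - mean_x) ** 2 for x in log_m)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_m, log_var))
    return 1.0 + (sxy / sxx) / 2.0


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic white noise (not seismic data); its estimate should sit near 0.5.
    noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    print(aggregated_variance_hurst(noise, [1, 2, 4, 8, 16, 32, 64, 128]))
```

Uncorrelated noise gives an estimate near 0.5, while a long-range dependent series gives a value closer to 1, which is the regime the abstract reports.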

A Stability Test of the Regression Coefficients for the Linear Models using Chow Test (차우검정을 활용한 선형회귀모형간 유사성 검증)

  • Lee, Ki-Young;Lee, Seongkwan Mark;Jeong, So-Young;Heo, Tae-Young
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.16 no.2 / pp.73-82 / 2017
  • This study examines the applicability of the Chow test to linear models that arise in transportation planning and traffic flow analysis. The Chow test is a widely used statistical method for testing whether the coefficients of two separate linear regression models are equal. To demonstrate its effectiveness, we fitted linear relationships between speed and density under different conditions, such as daytime and nighttime driving on a rainy day. Using two months of traffic data from the Joong-Bu Expressway, we show that the Chow test is useful for testing the similarity between two linear regression models. This statistical tool appears able to play an important role in traffic flow analysis and in the transportation planning process. Finally, we expect that the Chow test can also be applied to non-linear regression models and multivariate models.
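
As a minimal sketch of the test described above, the code below computes the Chow F statistic for two simple speed-density regressions. The helper names and the synthetic data are assumptions for illustration and are not the expressway data. With k coefficients per model, the statistic compares the pooled residual sum of squares with the sum from the two separate fits, and it is referred to an F distribution with k and n1 + n2 - 2k degrees of freedom.

```python
import random


def residual_sum_of_squares(xs, ys):
    """RSS of the least-squares line y = a + b*x fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def chow_statistic(x1, y1, x2, y2, k=2):
    """Chow F statistic for two simple regressions (k = 2 coefficients each).

    Inputs are plain lists so that the two groups can be concatenated.
    """
    rss_1 = residual_sum_of_squares(x1, y1)
    rss_2 = residual_sum_of_squares(x2, y2)
    rss_pooled = residual_sum_of_squares(x1 + x2, y1 + y2)
    dof = len(x1) + len(x2) - 2 * k
    return ((rss_pooled - (rss_1 + rss_2)) / k) / ((rss_1 + rss_2) / dof)


def synthetic_speeds(rng, densities, intercept, slope):
    """Hypothetical speed-density line with Gaussian noise (not real traffic data)."""
    return [intercept + slope * d + rng.gauss(0.0, 1.0) for d in densities]


if __name__ == "__main__":
    rng = random.Random(0)
    density = [10.0 + i for i in range(50)]
    day = synthetic_speeds(rng, density, 100.0, -0.5)
    night = synthetic_speeds(rng, density, 90.0, -0.8)
    print(chow_statistic(density, day, density, night))
```

A large statistic relative to the F critical value suggests that the two conditions follow different regression lines; a small one is consistent with a single shared model.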

A Similarity Valuating System using The Pattern Matching (패턴매칭을 이용한 유사도 비교 분석)

  • Ko, Bang-Won;Kim, Young-Chul
    • Journal of the Korea Society of Computer and Information / v.15 no.1 / pp.185-192 / 2010
  • This research proposes evaluating similarity by matching patterns that appear in two different documents. Statistical approaches such as the fingerprint method are mainly used to evaluate the similarity of documents. However, such methods have an accuracy problem: they report high similarity when many similar words appear in two unrelated documents. This problem arises because they simply compare statistical parameters of the two documents. The pattern-based method proposed in this research solves the problem because it judges similarity by searching for identical patterns. Its drawback is that searching for patterns takes a long time, but this research introduces algorithms that compensate for this drawback.
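
The contrast drawn above can be illustrated with a toy comparison. This is a sketch under stated assumptions, not the authors' system: a word-set Jaccard score stands in for a purely statistical comparison, and shared word trigrams stand in for pattern matching. All names and the two sample sentences are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (taken as 1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def word_set(text):
    """Vocabulary of a document, ignoring word order and case."""
    return set(text.lower().split())


def word_ngrams(text, n=3):
    """Set of consecutive word n-grams, a simple stand-in for 'patterns'."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


# Two sentences with the same vocabulary but different meaning.
doc_a = "the cat chased the dog"
doc_b = "the dog chased the cat"

vocabulary_similarity = jaccard(word_set(doc_a), word_set(doc_b))
pattern_similarity = jaccard(word_ngrams(doc_a), word_ngrams(doc_b))
print(vocabulary_similarity, pattern_similarity)
```

The two sentences share every word but no trigram, so the vocabulary score is at its maximum while the pattern score is at its minimum.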

Reliable Data Selection using Similarity Measure (유사측도를 이용한 신뢰성 있는 데이터의 추출)

  • Ryu, Soo-Rok;Lee, Sang-Hyuk
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.2 / pp.200-205 / 2008
  • For data analysis, fuzzy entropy is introduced as a measure of fuzziness, and a similarity measure is constructed to represent the similarity between data. The similarity measure between fuzzy membership functions is constructed from a distance measure, and the proposed similarity measure is proved. The proposed similarity measure is then applied to an example of reliable data selection. The application results are compared with previous results obtained through fuzzy entropy and statistical knowledge.
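
The abstract does not give the formula of the proposed measure. As a hedged sketch, the code below uses the simplest distance-based construction, one minus the normalised Hamming distance between membership vectors, which is an assumption and not necessarily the authors' definition. The membership values and the selection helper are hypothetical.

```python
def hamming_distance(a, b):
    """Normalised Hamming distance between two membership vectors in [0, 1]."""
    if len(a) != len(b):
        raise ValueError("membership vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def similarity(a, b):
    """Similarity as the complement of the distance: 1.0 means identical."""
    return 1.0 - hamming_distance(a, b)


def most_similar(reference, candidates):
    """Name of the candidate whose memberships are closest to the reference."""
    return max(candidates, key=lambda name: similarity(reference, candidates[name]))


# Hypothetical membership values for a reference set and two candidates.
reference = [0.2, 0.8, 0.5]
candidates = {"a": [0.2, 0.7, 0.5], "b": [0.9, 0.1, 0.4]}
best = most_similar(reference, candidates)
print(best, similarity(reference, candidates[best]))
```

Selecting the candidate with the highest similarity to a trusted reference is one simple way to read "reliable data selection" in this setting.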

An approach to predict size distribution of suspended sediment - noncohesive sediment (유사의 입경분포 모의를 위한 방안 연구 - 비점착성 유사의 경우)

  • Son, Minwoo;Byun, Jisun;Park, Byeoung Eun
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.289-289 / 2018
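  • River sediment is classified by its mode of transport into bed load and suspended load, and in most natural rivers sediment is transported as suspended load because of turbulence. Suspended sediment in river flow consists of particles of differing size and shape, and its size distribution must account not only for sediment properties but also for hydrodynamic properties such as the sediment transport capacity of the flow. The size distribution of sediment is determined by statistical methods, and the size distribution in sand-bed rivers is generally known to follow a log-normal distribution. This study therefore uses a size distribution model for suspended sediment to examine size distributions under various flow conditions. The size distribution model for noncohesive sediment is derived from the model for cohesive sediment, and coupling it with a one-dimensional sediment transport model makes it possible to simulate the size distribution of suspended sediment under various flow conditions. An analysis of previous studies shows that unimodal distributions, which have a single mode, account for the majority of suspended sediment size distributions, but bimodal distributions, in which two or more peaks appear, are also common. The noncohesive sediment size distribution model developed in this study is validated against laboratory data exhibiting unimodal and bimodal distributions. Using two experimental results with unimodal distributions and two with bimodal distributions, the model is confirmed to simulate the size distribution of noncohesive sediment reasonably under four different hydrodynamic conditions.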
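
To make the unimodal and bimodal cases concrete, the sketch below describes a grain-size distribution as a mixture of log-normal populations and counts its modes. This is purely illustrative and is not the model developed in the study, which is derived from a cohesive sediment model. The mixture form, the parameter values, and the function names are assumptions.

```python
import math


def log_size_density(log_d, components):
    """Density of ln(diameter) for a mixture of log-normal grain-size populations.

    components is a list of (weight, mu, sigma), with mu and sigma in ln units.
    """
    total = 0.0
    for weight, mu, sigma in components:
        z = (log_d - mu) / sigma
        total += weight * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return total


def count_modes(components, d_min=0.01, d_max=10.0, n_points=400):
    """Count local maxima of the density on a regular grid in ln(diameter)."""
    lo, hi = math.log(d_min), math.log(d_max)
    step = (hi - lo) / (n_points - 1)
    density = [log_size_density(lo + i * step, components) for i in range(n_points)]
    return sum(
        1
        for i in range(1, n_points - 1)
        if density[i] > density[i - 1] and density[i] >= density[i + 1]
    )


# Hypothetical parameters (diameters in mm); not taken from the experiments.
unimodal = [(1.0, math.log(0.1), 0.4)]
bimodal = [(0.5, math.log(0.1), 0.4), (0.5, math.log(1.0), 0.4)]
print(count_modes(unimodal), count_modes(bimodal))
```

A single log-normal population gives one peak, while two well-separated populations give two, which mirrors the two kinds of laboratory data used for validation.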

A Similarity Evaluation using Structural Information of Documents (문서구조 정보 기반의 유사도 측정)

  • Shin, Mi-Hae;Ko, Bang-Won;Kim, Young-Chul;Jeong, Jin-Yeong
    • Proceedings of the Korean Society of Computer Information Conference / 2010.07a / pp.499-502 / 2010
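  • The sharing of vast amounts of information made possible by the Internet has advanced the knowledge and information society. Alongside this progress, new knowledge crimes such as plagiarism are increasing rapidly. Plagiarism undermines the integrity and creativity of research and hinders academic progress. Many methods and systems have been proposed to eradicate it. Among them, methods that check for plagiarism in ordinary unstructured natural-language documents have used the fingerprint method. Similarity checks based on statistical methods such as fingerprinting compare one whole document against another, so they cannot detect partial similarity, that is, they cannot compare at the sentence or paragraph level. The system proposed in this paper measures similarity for plain text documents that, among natural-language documents, can carry particular document structure information. That is, it represents the structure of a text document as an AST-style data structure and uses it to provide a way of measuring similarity over the whole document or over the part the user chooses.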

Preliminary Study on the Analysis of Term Associations in Korean Text (한국어 텍스트 내 용어연관성 분석을 위한 기초 연구)

  • 정영미;이재윤
    • Proceedings of the Korean Society for Information Management Conference / 1998.08a / pp.243-246 / 1998
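  • Statistical term associations obtained through automatic text analysis are widely used in many fields related to information retrieval and language processing. There are many association coefficients for computing term association, but regardless of the application area, similarity coefficient formulas and mutual information formulas predominate. Because these formulas have different statistical properties, it is necessary to identify the application areas each one suits. This study examined the properties of the relevant association coefficient formulas theoretically and, to verify them experimentally, built an experimental database of Korean newspaper articles containing 2.4 million eojeol (word units).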
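
The difference in statistical behaviour mentioned above can be seen in a small example. The sketch below compares the Dice coefficient, one common similarity coefficient, with pointwise mutual information. The abstract does not say which specific formulas were studied, so both choices are assumptions, as are the counts.

```python
import math


def dice_coefficient(n_a, n_b, n_ab):
    """Dice similarity coefficient from occurrence and co-occurrence counts."""
    return 2.0 * n_ab / (n_a + n_b)


def pointwise_mutual_information(n_a, n_b, n_ab, n_total):
    """Pointwise mutual information (base 2) of two terms.

    n_a, n_b: units containing each term; n_ab: units containing both;
    n_total: number of units (for example sentences or documents).
    """
    return math.log2(n_total * n_ab / (n_a * n_b))


# Hypothetical counts: both pairs always co-occur, but one pair is rare.
rare_pair = (2, 2, 2)
frequent_pair = (100, 100, 100)
n_units = 1000

for name, (n_a, n_b, n_ab) in [("rare", rare_pair), ("frequent", frequent_pair)]:
    print(
        name,
        dice_coefficient(n_a, n_b, n_ab),
        pointwise_mutual_information(n_a, n_b, n_ab, n_units),
    )
```

The Dice coefficient rates the two pairs equally, while pointwise mutual information scores the rare pair higher. This sensitivity to low-frequency terms is one reason the two families of formulas can suit different applications.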

Understanding of Statistical concepts Examined through Problem Posing by Analogy (유추에 의한 문제제기 활동을 통해 본 통계적 개념 이해)

  • Park, Mi-Mi;Lee, Dong-Hwan;Lee, Kyeong-Hwa;Ko, Eun-Sung
    • Journal of Educational Research in Mathematics / v.22 no.1 / pp.101-115 / 2012
  • Analogy, plausible reasoning based on similarity, is one of the thinking strategies for concept formation, problem solving, and new discovery in many disciplines. Statistics educators argue that analogy can be a useful thinking strategy in statistics as well. This study investigated the characteristics of students' analogical thinking in statistics. Mathematically gifted students were asked to construct problems similar to a base problem, a statistical problem set in a statistical context. From the analysis, the students' new problems were classified into five types according to whether they preserved the statistical context and the basic structure of the base problem. The results suggest some implications. Problems that fail to preserve the statistical context of the base problem have no meaning in statistics. However, problems that preserve the statistical context can open possibilities for reconceptualizing the statistical concept even when the basic structure of the problem is changed.

A New Statistical Index for Detecting Cheaters on Multiple Choice Tests (다중선택 시험에서 부정행위자 발견을 위한 새로운 통계적 측도)

  • Han, Eun Su;Lim, Johan;Lee, Kyeong Eun
    • The Korean Journal of Applied Statistics / v.26 no.1 / pp.81-92 / 2013
  • It is important to construct a firm basis for accusing potential violators of academic integrity in order to avoid spurious accusations and false convictions. Educational researchers have developed many statistical methods that can either uncover or confirm cases of cheating on tests. However, most of them rely on simple correlation-based measures and often fail to account for patterns in responses or answers. In this paper, we propose a new statistical index, the Standardized Signed Entropy Similarity Score, to resolve this difficulty. We also apply the proposed method to a real data set and compare the results with those of existing methods.

A study on the efficiency of multidimensional scaling using bootstrap method (붓스트랩을 이용한 다차원척도법의 효율성 연구)

  • Kim, Woo-Jong;Kang, Kee-Hoon
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.301-309 / 2009
  • Multidimensional scaling (MDS) is a multivariate statistical technique often used in information visualization to explore similarities or dissimilarities in data. To analyze and visualize data, MDS measures the dissimilarities between objects and uses them directly, or uses their mean when they are measured repeatedly. When outliers exist or the variation in the data is too large, MDS can hardly give reliable results. In this paper, we consider MDS based on the bootstrap method for data with large variation. The standardized residual sum of squares is used to measure the goodness of fit of the model. A real data analysis is included to examine our approach.
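
Two ingredients of the approach above can be sketched without fitting an MDS configuration: resampling repeated dissimilarity measurements, and scoring a given configuration with a stress value. The abstract does not specify the bootstrap scheme or the exact normalisation, so the resampling of whole matrices, the stress formula (residuals normalised by squared configuration distances), and all names below are assumptions.

```python
import math
import random


def bootstrap_mean_matrices(replicates, n_boot, rng):
    """Return n_boot mean dissimilarity matrices from bootstrap resamples.

    replicates is a list of square dissimilarity matrices measured repeatedly
    on the same objects; each resample draws len(replicates) matrices with
    replacement and averages them entry by entry.
    """
    size = len(replicates[0])
    results = []
    for _ in range(n_boot):
        sample = [rng.choice(replicates) for _ in replicates]
        mean = [
            [sum(m[i][j] for m in sample) / len(sample) for j in range(size)]
            for i in range(size)
        ]
        results.append(mean)
    return results


def stress(points, dissimilarity):
    """Square root of the residual sum of squares over the squared distances."""
    num = den = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            num += (d - dissimilarity[i][j]) ** 2
            den += d ** 2
    return math.sqrt(num / den)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical repeated measurements on three objects.
    rep_1 = [[0.0, 1.0, 3.0], [1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
    rep_2 = [[0.0, 1.2, 2.9], [1.2, 0.0, 2.1], [2.9, 2.1, 0.0]]
    layout = [(0.0,), (1.0,), (3.0,)]  # a candidate one-dimensional configuration
    for matrix in bootstrap_mean_matrices([rep_1, rep_2], 3, rng):
        print(stress(layout, matrix))
```

In a full analysis each resampled matrix would be passed to an MDS fit, and the spread of the resulting stress values would indicate how stable the configuration is when the data vary widely.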
