• Title/Summary/Keyword: Iterative Error Analysis


Recommending Core and Connecting Keywords of Research Area Using Social Network and Data Mining Techniques (소셜 네트워크와 데이터 마이닝 기법을 활용한 학문 분야 중심 및 융합 키워드 추천 서비스)

  • Cho, In-Dong;Kim, Nam-Gyu
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.127-138 / 2011
  • The core service of most research portal sites is to provide researchers with papers that match their research interests. Such a service is effective and easy to use only when a user can supply correct and concrete information about a paper, such as its title, authors, and keywords. Unfortunately, most users are not familiar with such bibliographic details, so they inevitably go through repeated trial and error in keyword-based search. Retrieving a relevant paper is especially difficult when the user is a novice in the research domain and does not know the appropriate keywords. In this case, the user has to search iteratively: i) perform an initial search with an arbitrary keyword, ii) acquire related keywords from the retrieved papers, and iii) search again with the acquired keywords. This usage pattern implies that the service quality of a portal site and the satisfaction of its users depend strongly on its keyword management and search mechanism. To reduce this inefficiency, some leading research portal sites have adopted keyword recommendation services based on association rule mining, similar to the product recommendations of online shopping malls. However, keyword recommendation based only on association analysis is limited in that it can show only a simple, direct relationship between two keywords. In other words, association analysis by itself cannot present the complex relationships among many keywords in adjacent research areas. To overcome this limitation, we propose a hybrid approach for establishing an association network among the keywords used in research papers.
The keyword association network is established in the following phases: i) the set of keywords specified in a paper is regarded as a set of co-purchased items, ii) association analysis is performed on the keywords to extract frequent keyword patterns that satisfy predefined thresholds of confidence, support, and lift, and iii) the frequent keyword patterns are drawn as a network that shows the core keywords of each research area and the connecting keywords between two or more research areas. To assess the practical applicability of our approach, we performed a simple experiment with 600 keywords extracted from 131 research papers published in five prominent Korean journals in 2009. In the experiment, we used SAS Enterprise Miner for association analysis and the R software for social network analysis. As the final outcome, we present a network diagram and a cluster dendrogram for the keyword association network. The results are summarized in Section 4 of this paper. The main contributions of the proposed approach are as follows: i) the keyword network can provide an initial roadmap of a research area to researchers who are new to the domain, ii) a researcher can grasp the distribution of the keywords neighboring a given keyword, and iii) researchers can get ideas for converging different research areas by observing the connecting keywords in the keyword association network. Further studies should address the following. First, the current version of our approach does not implement a standard meta-dictionary; for practical use, homonyms, synonyms, and multilingual problems should be resolved with one. Second, clearer guidelines for clustering research areas and for defining core and connecting keywords should be provided. Finally, intensive experiments on international papers as well as Korean papers should be performed.
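
The three phases described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used SAS Enterprise Miner and R). The function names `pairwise_rules` and `keyword_degree`, the toy keyword lists, and the restriction to keyword pairs are all assumptions made for illustration. Each paper's keyword set is treated as one transaction, support, confidence, and lift are computed for every ordered keyword pair, and the rules that pass the thresholds become edges of a keyword network. Node degree is used here only as a simple, assumed proxy for how central a keyword is.

```python
from collections import Counter, defaultdict
from itertools import combinations


def pairwise_rules(papers, min_support=0.0, min_confidence=0.0, min_lift=0.0):
    """Return (lhs, rhs, support, confidence, lift) for keyword pairs.

    `papers` is a list of keyword lists; each list plays the role of a
    basket of co-purchased items (hypothetical input format).
    """
    n = len(papers)
    item_counts = Counter()
    pair_counts = Counter()
    for keywords in papers:
        unique = sorted(set(keywords))  # ignore duplicates within one paper
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), both in pair_counts.items():
        support = both / n  # share of papers containing both keywords
        for lhs, rhs in ((a, b), (b, a)):
            confidence = both / item_counts[lhs]        # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # P(rhs | lhs) / P(rhs)
            if (support >= min_support and confidence >= min_confidence
                    and lift >= min_lift):
                rules.append((lhs, rhs, support, confidence, lift))
    return rules


def keyword_degree(rules):
    """Count distinct neighbors of each keyword in the rule network."""
    neighbors = defaultdict(set)
    for lhs, rhs, *_ in rules:
        neighbors[lhs].add(rhs)
        neighbors[rhs].add(lhs)
    return {keyword: len(adjacent) for keyword, adjacent in neighbors.items()}


if __name__ == "__main__":
    # Toy data, invented for this example only.
    papers = [
        ["data mining", "association rule"],
        ["data mining", "social network"],
        ["data mining", "association rule", "recommendation"],
        ["social network", "centrality"],
    ]
    for rule in pairwise_rules(papers, min_support=0.5):
        print(rule)
    print(keyword_degree(pairwise_rules(papers)))
```

In this sketch a keyword with many neighbors would be a candidate core keyword, while a keyword that links otherwise separate groups would be a candidate connecting keyword. Identifying the latter properly would need a clustering or betweenness step that is omitted here.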

Performance Test of Hypocenter Determination Methods under the Assumption of Inaccurate Velocity Models: A case of surface microseismic monitoring (부정확한 속도 모델을 가정한 진원 결정 방법의 성능평가: 지표면 미소지진 모니터링 사례)

  • Woo, Jeong-Ung;Rhie, Junkee;Kang, Tae-Seob
    • Geophysics and Geophysical Exploration / v.19 no.1 / pp.1-10 / 2016
  • The hypocenter distribution of microseismic events generated by hydraulic fracturing for shale gas development provides essential information for understanding the characteristics of the fracture network. In this study, we evaluate how inaccurate velocity models influence the inversion results of two widely used location programs, hypoellipse and hypoDD, both of which are based on iterative linear inversion. We assume that 98 stations are densely distributed inside a circle with a radius of 4 km and that 5 artificial hypocenter sets (S0 to S4) are located from the center of the network toward the south at 1 km intervals. Each hypocenter set contains 25 events placed on a plane. To quantify the accuracy of the inversion results, we define 6 parameters: the difference between the average hypocenters of the assumed and inverted locations, $d_1$; the ratio of the assumed and inverted areas estimated from the hypocenters, $r$; the difference between the dip of the reference plane and that of the best-fitting plane for the determined hypocenters, $\theta$; the difference between the strike of the reference plane and that of the best-fitting plane for the determined hypocenters, $\phi$; the root-mean-square distance between the hypocenters and the best-fitting plane, $d_2$; and the root-mean-square error in the horizontal direction on the best-fitting plane, $d_3$. Synthetic travel times are calculated for a reference model with a 1D layered structure, and the inaccurate velocity models used for the inversion are constructed by drawing from normal distributions about the reference model with standard deviations of 0.1, 0.2, and 0.3 km/s. The parameters $d_1$, $r$, $\theta$, and $d_2$ are positively correlated with the level of velocity perturbation, whereas the others are not sensitive to the perturbation except for S4, which is located at the outer boundary of the network. For S0, S1, S2, and S3, hypoellipse and hypoDD give similar results for $d_1$.
For the other parameters, however, hypoDD gives much better results, and location errors can be reduced by several meters regardless of the level of perturbation. For the purpose of understanding the characteristics of hydraulic fracturing, the $1\sigma$ error of the velocity structure should be under 0.2 km/s for hypoellipse and under 0.3 km/s for hypoDD.
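
Three of the accuracy parameters defined in this abstract ($d_1$, $\theta$, and $d_2$) can be illustrated with a short Python sketch. This is one plausible way to compute them, not the authors' code. It assumes hypocenters are given as (x, y, z) coordinates in km with z vertical, and it assumes the best-fitting plane is obtained by a total-least-squares fit using the singular value decomposition. The function names `fit_plane`, `dip_degrees`, and `location_metrics` are hypothetical.

```python
import numpy as np


def fit_plane(points):
    """Total-least-squares plane fit: return (centroid, unit normal)."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    return centroid, vt[-1]


def dip_degrees(normal):
    """Dip of a plane: angle between its normal and the vertical (z) axis."""
    return np.degrees(np.arccos(np.clip(abs(normal[2]), 0.0, 1.0)))


def location_metrics(assumed, inverted):
    """Return (d1, theta, d2) for assumed vs. inverted hypocenters.

    d1    : distance between the mean assumed and mean inverted hypocenter
    theta : absolute dip difference between the two best-fitting planes (deg)
    d2    : RMS distance of the inverted hypocenters from their own plane
    """
    assumed = np.asarray(assumed, dtype=float)
    inverted = np.asarray(inverted, dtype=float)

    d1 = np.linalg.norm(inverted.mean(axis=0) - assumed.mean(axis=0))

    _, ref_normal = fit_plane(assumed)
    centroid, normal = fit_plane(inverted)
    theta = abs(dip_degrees(normal) - dip_degrees(ref_normal))

    offsets = (inverted - centroid) @ normal  # signed distances to the plane
    d2 = np.sqrt(np.mean(offsets ** 2))
    return d1, theta, d2


if __name__ == "__main__":
    # 25 synthetic events on a horizontal plane (values invented for the demo).
    gx, gy = np.meshgrid(np.linspace(0.0, 0.4, 5), np.linspace(0.0, 0.4, 5))
    assumed = np.column_stack([gx.ravel(), gy.ravel(), np.full(25, 2.0)])
    rng = np.random.default_rng(0)
    inverted = assumed + rng.normal(scale=0.01, size=assumed.shape)
    print(location_metrics(assumed, inverted))
```

The strike difference $\phi$, the area ratio $r$, and the in-plane error $d_3$ are left out because they depend on conventions (strike reference direction, area estimator) that the abstract does not specify.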