• Title/Abstract/Keywords: 2-Step Clustering

86 search results (processing time: 0.023 s)

Efficient Time-Series Similarity Measurement and Ranking Based on Anomaly Detection

  • 최지현;안현
    • 인터넷정보학회논문지 / Vol. 25, No. 2 / pp. 39-47 / 2024
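  • Time-series analysis is a way of discovering information and insights from data ordered in time, and many organizations apply it to solve business problems. Within it, time-series similarity measurement is the step that identifies time series with similar patterns, and it is essential to applications such as time-series search and clustering. This study proposes a computationally efficient method that measures time-series similarity around anomalies rather than over the entire series. To validate the method, we measure and analyze the rank correlation between similarity results computed on the set of subsequences extracted by anomaly detection and similarity results computed on the full time series. In experiments on stock time-series data, similarity measurement with an anomaly ratio of 10% yielded a Spearman rank correlation coefficient of up to 0.9 or higher. In conclusion, the proposed method can be expected to significantly reduce the computation required for time-series similarity measurement while still giving reliable time-series search and clustering results.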
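
The validation idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a z-score ranking as the anomaly detector and Euclidean distance as the similarity measure (the abstract names neither), it uses synthetic data rather than stock series, and every function name is hypothetical.

```python
import math
import random


def zscores(xs):
    """Standardize a series; a constant series gets all zeros."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    std = math.sqrt(var) or 1.0
    return [(x - mean) / std for x in xs]


def anomaly_indices(xs, ratio=0.10):
    """Indices of the top `ratio` fraction of points by |z-score| (assumed detector)."""
    k = max(1, int(len(xs) * ratio))
    z = zscores(xs)
    top = sorted(range(len(xs)), key=lambda i: abs(z[i]), reverse=True)[:k]
    return sorted(top)


def euclidean(a, b, idx=None):
    """Euclidean distance over all positions, or only over the positions in `idx`."""
    idx = range(len(a)) if idx is None else idx
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in idx))


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Synthetic stand-in data: one query series and five increasingly noisy copies.
random.seed(0)
query = [random.gauss(0, 1) for _ in range(200)]
candidates = [[q + random.gauss(0, s) for q in query] for s in (0.1, 0.5, 1.0, 2.0, 4.0)]

idx = anomaly_indices(query, ratio=0.10)              # 10% anomaly ratio, as in the abstract
full = [euclidean(query, c) for c in candidates]      # distances over the whole series
sub = [euclidean(query, c, idx) for c in candidates]  # distances over anomalies only
rho = spearman(full, sub)                             # agreement between the two rankings
print(f"Spearman rho between full and anomaly-only rankings: {rho:.3f}")
```

The anomaly-only distance touches 20 of the 200 points here, which is where the computational saving comes from; `rho` indicates how well the cheaper ranking preserves the full one.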

Determinants of Consumer Preference by type of Accommodation: Two Step Cluster Analysis

  • 박덕병;윤유식;이민수
    • 마케팅과학연구 / Vol. 17, No. 3 / pp. 1-19 / 2007
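  • This study presents a method, and its results, for classifying the accommodation facilities offered to rural tourism visitors into types and for identifying which visitor characteristics go with a preference for which type. Rural tourism accommodation facilities were first classified using two-step cluster analysis. Two-step cluster analysis was chosen because traditional clustering methods cannot be applied when the clustering variables include categorical ones. The study shows that two-step cluster analysis is very useful for classifying rural tourism accommodation facilities measured with categorical variables. A multinomial logit model was then used to identify the sociodemographic and travel characteristics of rural tourism visitors that affect the probability of preferring a given facility type. That is, a distinguishing feature of this study is that the multinomial logit model identifies the consumer characteristics that affect the probability of preferring a given facility type relative to the type set as the reference category (the ordinary farmhouse type).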

  • PDF
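
The multinomial logit step described in the abstract above can be illustrated as follows. This is a sketch of the standard model with a reference category whose utility is fixed at zero, not the authors' estimation code; the covariates and coefficient values are invented for illustration, and the function name is hypothetical.

```python
import math


def multinomial_logit_probs(x, betas):
    """Choice probabilities under a multinomial logit model.

    `betas` holds one coefficient vector per non-reference category.
    The reference category (index 0 in the result) has utility 0.
    """
    utilities = [0.0] + [sum(b_i * x_i for b_i, x_i in zip(b, x)) for b in betas]
    m = max(utilities)                       # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical visitor with two covariates and two non-reference facility types.
x = [1.0, 2.0]
betas = [[0.5, -0.25], [0.1, 0.2]]
probs = multinomial_logit_probs(x, betas)
print(probs)  # probs[0] is the reference category
```

Each coefficient vector is read relative to the reference category: the odds `probs[j] / probs[0]` equal `exp(x · betas[j-1])`, which is why the abstract speaks of preference "relative to" the reference type.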

Construction of web-based Database for Haliotis SNP

  • 정지은;이재봉;강세원;백문기;한연수;최태진;강정하;이용석
    • 한국패류학회지 / Vol. 26, No. 2 / pp. 185-188 / 2010
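  • - Building this web database server made it possible to extract, very quickly through a locally installed BLAST, sequences that match nucleotide sequences of the genus Haliotis. - Because BLAST can be run simultaneously against repeat elements, E. coli, vector, and similar sequences, conditions such as library contamination and insert length can be checked easily when constructing cDNA or genomic DNA libraries. - The Clustering Res. interface made SNP discovery easier, and a locally installed primer3 made it possible to design experimental primers (Evans et al. 2001). - This SNP database is expected to make SNP discovery far more productive and to be of great help to future molecular breeding research on Haliotis.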

Genetic Diversity and Population Structure of Korean Soybean Landrace [Glycine max(L.) Merr.]

  • Cho, Gyu-Taek;Lee, Jeong-Ran;Moon, Jung-Kyung;Yoon, Mun-Sup;Baek, Hyung-Jin;Kang, Jung-Hoon;Kim, Tae-San;Paek, Nam-Chon
    • Journal of Crop Science and Biotechnology / Vol. 11, No. 2 / pp. 83-90 / 2008
  • Two hundred and sixty Korean soybean landrace accessions were analyzed for polymorphism at 92 simple sequence repeat (SSR) loci. The 995 identified alleles served as raw data for estimating genetic diversity and population structure. The number of alleles per locus ranged from three to 27, with a mean of 10.4. $F_{ST}$ values estimated by analysis of molecular variance (AMOVA) on the SSR data set were 0.018, 0.027, and 0.016 for usage, collection site, and maturity groups, respectively, indicating little genetic differentiation. Model-based clustering analysis placed the accessions into three clusters (K = 3) with an $F_{ST}$ of 0.0503, indicating moderate genetic differentiation. Duncan's Multiple Range Test at K = 3 on 18 quantitative traits revealed that one cluster was differentiated from the other two mainly by seed-related traits, and that the other two clusters were differentiated from each other by biochemical traits. The genetic structure of Korean soybean landraces was thus resolved by model-based clustering and partly supported by their phenotypic traits. This preliminary study could be a first step towards more efficient germplasm management and utilization of soybean landraces, and could help association studies between genotypic and phenotypic traits in Korean soybean landraces.

  • PDF

Effect of Annealing of Nafion Recast Membranes Containing Ionic Liquids

  • Park, Jin-Soo;Shin, Mun-Sik;Sekhon, S.S.;Choi, Young-Woo;Yang, Tae-Hyun
    • 전기화학회지 / Vol. 14, No. 1 / pp. 9-15 / 2011
  • Composite membranes comprising sulfonated polymers as the matrix and ionic liquids as the ion-conducting medium in place of water are studied to investigate the effect of annealing the sulfonated polymers. The polymeric membranes are prepared on recast Nafion containing the ionic liquid 1-ethyl-3-methylimidazolium tetrafluoroborate ($EMIBF_4$). The composite membranes are characterized by thermogravimetric analysis, ionic conductivity, and small-angle X-ray scattering. The composite membranes annealed at $190^{\circ}C$ for 2 h after the fixed drying step showed better ionic conductivity but no significant increase in thermal stability. The mean Bragg distance between the ionic clusters, which is reflected in the position of the ionomer peak (small-angle scattering maximum), is larger in the annealed composite membranes containing $EMIBF_4$ than in the non-annealed ones. This may be explained by the different ion-clustering ability of the hydrophilic parts (i.e., sulfonic acid groups) in the non-annealed and annealed polymer matrices. In addition, ionic conductivity is higher for the annealed composite membranes containing $EMIBF_4$. It can be concluded that annealing the composite membranes containing ionic liquids increases their ion-clustering ability and thereby enhances ionic conductivity, making them suitable for potential use in proton exchange membrane fuel cells (PEMFCs) at medium temperatures ($150-200^{\circ}C$) in the absence of external humidification.

Ram Accelerator Optimization Using the Response Surface Method

  • 전용희;전권수;이재우;변영환
    • 한국전산유체공학회:학술대회논문집 / 한국전산유체공학회 2000 Spring Conference Proceedings / pp. 159-165 / 2000
  • In this paper, a numerical study is carried out to improve the performance of the superdetonative ram accelerator and to optimize the design of the system. The objective function used to optimize the premixture composition is the ram tube length required to accelerate the projectile from initial velocity $V_o$ to target velocity $V_e$. The premixture is composed of $H_2$, $O_2$, and $N_2$, and the mole numbers of these species are selected as design variables. Response surface methodology (RSM), which is widely used for complex optimization problems, is selected as the optimization technique. In particular, to handle the non-linearity of the response and to balance the accuracy and efficiency of the solution, a design space stretching technique is applied. A separate sub-optimization routine is introduced to determine the stretching position and clustering parameters that yield the optimum regression model. A two-step optimization technique is applied to obtain the optimal system. With the stretching technique, system optimization can be performed with a small number of experimental points, and a precise regression model can be constructed for a highly non-linear domain. The error relative to the analysis result is only $0.01\%$, and it is demonstrated that the present method can be applied to more practical design optimization problems with many design variables.

  • PDF
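
The core of the response surface approach described in the abstract above is to fit a low-order regression model to a few sampled points and then optimize that model instead of the expensive analysis. The sketch below shows this for one design variable with a quadratic model. It does not reproduce the paper's design space stretching or its two-step procedure, and `tube_length` is a made-up stand-in for the real analysis code.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    powers = [sum(x ** k for x in xs) for k in range(5)]
    a = [[powers[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    return solve_linear(a, b)


def tube_length(x):
    """Hypothetical objective: required tube length versus one design variable."""
    return 3.0 + 2.0 * (x - 1.5) ** 2


# Sample the (normally expensive) analysis at a few experimental points.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [tube_length(x) for x in xs]

# Fit the response surface, then minimize it analytically.
b0, b1, b2 = fit_quadratic(xs, ys)
x_opt = -b1 / (2 * b2)  # stationary point of the fitted quadratic (a minimum when b2 > 0)
print(f"fitted model: {b0:.3f} + {b1:.3f} x + {b2:.3f} x^2, optimum at x = {x_opt:.3f}")
```

With several design variables the same idea applies with cross terms in the regression model; a strongly non-linear response is where a plain quadratic fit degrades, which is the problem the abstract's stretching technique addresses.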

Shallow Junction Device Formation and the Design of Boron Diffusion Simulator

  • 한명석;박성종;김재영
    • 대한공업교육학회지 / Vol. 33, No. 1 / pp. 249-264 / 2008
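  • In this study, shallow $p^+-n$ junctions were formed by low-energy ion implantation and dual annealing, and a new simulator based on a boron diffusion model was designed to reproduce the boron profiles after ion implantation and annealing. $BF_2$ ions were implanted into silicon substrates at low energy, and annealing was then performed using RTA (Rapid Thermal Annealing) and FA (Furnace Annealing). The diffusion model used for simulation accounts for the generation and recombination of point defects, the formation of BI pairs, and boron activation and precipitation. FA+RTA annealing gave better junction characteristics in terms of sheet resistance than RTA+FA, and the simulator showed the same result. Therefore, when thermal efficiency is considered in forming shallow junctions, the proposed diffusion simulator and the FA+RTA process can be expected to be useful.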

System identification of a super high-rise building via a stochastic subspace approach

  • Faravelli, Lucia;Ubertini, Filippo;Fuggini, Clemente
    • Smart Structures and Systems / Vol. 7, No. 2 / pp. 133-152 / 2011
  • System identification is a fundamental step towards the application of structural health monitoring and damage detection techniques. In this respect, the development of evolved identification strategies is a priority for obtaining reliable and repeatable baseline modal parameters of an undamaged structure, to be adopted as references for future structural health assessments. The paper presents the identification of the modal parameters of the Guangzhou New Television Tower, China, using a data-driven stochastic subspace identification (SSI-data) approach complemented with an automatic mode selection strategy that proved successful in previous studies. This well-known approach is based on a clustering technique that discriminates structural modes from spurious noise modes. The method is applied to the acceleration measurements made available within task I of the ANCRiSST benchmark problem, which cover 24 hours of continuous monitoring of the structural response under ambient excitation. These records are subdivided into a convenient number of data sets, and the variability of the modal parameter estimates with ambient temperature and mean wind velocity is pointed out. Both 10-minute and 1-hour records are considered for this purpose. Finally, a comparison with finite element model predictions is carried out, using the structural matrices provided within the benchmark, to check that all the structural modes contained in the considered frequency interval are effectively identified via SSI-data.

Experimental Evaluation of Distance-based and Probability-based Clustering

  • Kwon, Na Yeon;Kim, Jang Il;Dollein, Richard;Seo, Weon Joon;Jung, Yong Gyu
    • International journal of advanced smart convergence / Vol. 2, No. 1 / pp. 36-41 / 2013
  • Decision-making here means extracting information that can be acted on in the future; it refers to the process of discovering a new model induced from the data. In other words, it means digging out, as one would mine a vein, the relationships between patterns hidden in the data. This information is found by applying modeling techniques and sophisticated statistical analysis to uncover relationships between useful patterns. The process is called data mining, and it is a key technology for marketing databases. Cluster analysis, which can extract information from large data sets without a clear criterion, is therefore an active research area. The EM and K-means methods are used especially often; this paper evaluates experimentally how their results vary with the size of the data, comparing distance-based and probability-based analysis.
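
The contrast the abstract above draws between distance-based clustering (K-means) and probability-based clustering (EM) can be made concrete with a small one-dimensional sketch. This is an illustration under simplifying assumptions, not the experimental setup of the paper: the data, the initial centers, and all function names are invented, and EM is shown for a Gaussian mixture.

```python
import math


def kmeans_1d(xs, centers, iters=50):
    """Distance-based: assign each point to its nearest center, then average."""
    centers = list(centers)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in xs:
            nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
            groups[nearest].append(x)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def gauss_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(xs, mus, iters=100):
    """Probability-based: EM for a 1-D Gaussian mixture with soft assignments."""
    mus = list(mus)
    k = len(mus)
    weights = [1.0 / k] * k
    variances = [1.0] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, mus, variances)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density well defined
            weights[j] = nj / len(xs)
    return mus, variances, weights


# Two well-separated synthetic groups, around 0 and around 10.
xs = [-0.4, -0.1, 0.0, 0.2, 0.3, 9.6, 9.9, 10.0, 10.1, 10.4]
km_centers = kmeans_1d(xs, centers=[1.0, 8.0])
em_means, em_vars, em_weights = em_gmm_1d(xs, mus=[1.0, 8.0])
print("K-means centers:", km_centers)
print("EM means:", em_means, "variances:", em_vars, "weights:", em_weights)
```

On cleanly separated data like this the two methods find essentially the same centers. They diverge when clusters overlap or differ in spread, because EM weights each point by probability while K-means makes a hard nearest-center assignment.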

A Biclustering Method for Time Series Analysis

  • Lee, Jeong-Hwa;Lee, Young-Rok;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / Vol. 9, No. 2 / pp. 131-140 / 2010
  • Biclustering is a method of finding meaningful subsets of objects and attributes simultaneously, which may not be detected by traditional clustering methods. It is widely used to analyze microarray data representing the expression levels of genes under different conditions. Biclustering algorithms usually do not consider a sequential relation between attributes. For time series data, however, bicluster solutions should preserve the time sequence. This paper proposes a new biclustering algorithm for time series data by modifying the plaid model. The proposed algorithm introduces a parameter that controls the interval between two selected time points. In addition, the pruning step that prevents over-fitting is modified so that it eliminates only starting or ending points. Results on artificial data sets show that the proposed method is more suitable for extracting biclusters from time series data sets. Moreover, using the proposed method, we find some interesting observations in real-world time-course microarray data sets and in apartment price data sets for metropolitan areas.