• Title/Summary/Keyword: Hierarchical Clustering Analysis

Video Scene Detection using Shot Clustering based on Visual Features (시각적 특징을 기반한 샷 클러스터링을 통한 비디오 씬 탐지 기법)

  • Shin, Dong-Wook;Kim, Tae-Hwan;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.47-60
    • /
    • 2012
  • Video data is unstructured and has a complex structure. As efficient management and retrieval of video data become more important, video parsing based on the visual features of video content has been studied as a way to reconstruct video data into a meaningful structure. Early studies on video parsing focused on splitting video data into shots, but a shot boundary is a physical boundary, so detecting it does not consider the semantic association of video data. More recent studies use clustering methods to group semantically associated shots into video scenes, which are defined by semantic boundaries. Previous scene detection studies cluster shots with a similarity measure that depends mainly on color features. However, it is difficult to identify shots and scenes correctly and to detect gradual transitions such as dissolve, fade and wipe, because color features contain noise and change abruptly when an unexpected object intervenes. To solve these problems, this paper proposes the Scene Detector using Color histogram, corner Edge and Object color histogram (SDCEO), which detects video scenes by clustering similar shots that make up the same event, based on visual features including the color histogram, the corner edge and the object color histogram. SDCEO is notable in that it uses the edge feature together with the color feature and, as a result, effectively detects gradual transitions as well as abrupt ones. SDCEO consists of the Shot Bound Identifier and the Video Scene Detector. The Shot Bound Identifier comprises the Color Histogram Analysis step and the Corner Edge Analysis step.
In the Color Histogram Analysis step, SDCEO uses the color histogram feature to organize shot boundaries. The color histogram, which records the percentage of each quantized color among all pixels in a frame, was chosen for its good performance, as also reported in other work on content-based image and video analysis. To organize shot boundaries, SDCEO joins associated sequential frames into shot boundaries by measuring the similarity of the color histogram between frames. In the Corner Edge Analysis step, SDCEO identifies the final shot boundaries by using the corner edge feature: it detects associated shot boundaries by comparing the corner edge feature between the last frame of the previous shot boundary and the first frame of the next shot boundary. In the Key-frame Extraction step, SDCEO compares each frame with all frames in the same shot boundary, measures similarity using the Euclidean distance between histograms, and selects as the key-frame the frame most similar to all the others. The Video Scene Detector clusters associated shots that make up the same event using hierarchical agglomerative clustering based on visual features including the color histogram and the object color histogram. After detecting video scenes, SDCEO builds the final video scenes by repeating the clustering until the similarity distance between shot boundaries is less than the threshold h. We built a prototype of SDCEO and ran experiments against manually constructed baseline data; the results are satisfactory, with a precision of 93.3% for shot boundary detection and 83.3% for video scene detection.
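The Color Histogram Analysis and Key-frame Extraction steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 4-level quantization per channel, the 0.5 distance threshold, and the toy frames (lists of RGB tuples) are all assumptions made for illustration, and the Corner Edge Analysis step is left out.

```python
import math

def color_histogram(frame, bins=4, max_value=256):
    """Fraction of pixels per quantized (r, g, b) color in one frame."""
    step = max_value // bins
    counts = {}
    for r, g, b in frame:
        key = (r // step, g // step, b // step)
        counts[key] = counts.get(key, 0) + 1
    return {k: v / len(frame) for k, v in counts.items()}

def hist_distance(h1, h2):
    """Euclidean distance between two sparse histograms."""
    keys = set(h1) | set(h2)
    return math.sqrt(sum((h1.get(k, 0.0) - h2.get(k, 0.0)) ** 2 for k in keys))

def split_into_shots(frames, threshold=0.5):
    """Join consecutive frames; start a new shot when the distance exceeds threshold."""
    if not frames:
        return [], []
    hists = [color_histogram(f) for f in frames]
    shots, current = [], [0]
    for i in range(1, len(frames)):
        if hist_distance(hists[i - 1], hists[i]) > threshold:
            shots.append(current)
            current = []
        current.append(i)
    shots.append(current)
    return shots, hists

def key_frame(shot, hists):
    """Frame index with the smallest total distance to all frames of its shot."""
    return min(shot, key=lambda i: sum(hist_distance(hists[i], hists[j]) for j in shot))

# Toy input (assumed): three reddish frames followed by two blue frames.
red = [(250, 10, 10)] * 4
mixed = [(250, 10, 10)] * 3 + [(100, 10, 10)]
blue = [(10, 10, 250)] * 4
shots, hists = split_into_shots([red, mixed, red, blue, blue])
keys = [key_frame(s, hists) for s in shots]
print(shots)  # [[0, 1, 2], [3, 4]]
print(keys)   # [0, 3]
```

In this sketch the threshold decides how much color change counts as a cut, so a real system would need to tune it; gradual transitions are exactly the case where a color-only test like this one tends to fail, which is the motivation the abstract gives for adding the edge feature.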

An Application of FCM(Fuzzy C-Means) for Clustering of Asian Ports Competitiveness Level and Status of Busan Port (FCM법을 이용한 아시아 항만의 경쟁력 수준 분류와 부산항의 위상)

  • 류형근;이홍걸;여기태
    • Journal of Korean Society of Transportation
    • /
    • v.21 no.5
    • /
    • pp.7-18
    • /
    • 2003
  • Owing to changes in the shipping and logistics environment, Asian ports today face severe competition, and many have undertaken large-scale development to become mega-hub ports. Analyzing and evaluating the characteristics of Asian ports is therefore widely recognized as an important study from the standpoint of Korea, where Busan Port is located. Although some previous studies have been reported, most went beyond Asian ports to analyze the world's major ports, and the ports studied were those already well known from earlier research and reports. Most of these studies are therefore unlikely to serve as substantial indicators from the perspective of Busan Port. In addition, most existing studies have used hierarchical evaluation algorithms for port ranking, such as AHP (analytical hierarchy process), and clustering analysis. However, these two methods have fundamental weaknesses from an algorithmic perspective. The aim of this study is to classify major Asian ports by competitiveness level. To overcome the serious problems of the existing studies, major Asian ports were analyzed using objective indicators and the Fuzzy C-Means algorithm, which alleviates the weakness of the clustering method. It was found that 10 of the 16 major Asian ports have their own phases and were classified into 4 port groups. This result implies that some ports have higher potential to lead particular zones in Asia. Based on these results, the present status and future direction of Busan Port are discussed as well.
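For readers unfamiliar with Fuzzy C-Means, the following Python sketch shows the standard alternating update of memberships and cluster centers. It is a generic illustration under stated assumptions, not the study's procedure: the two indicator scores per port, the initial centers, the fuzzifier m = 2 and the fixed iteration count are all hypothetical.

```python
import math

def memberships_for(points, centers, m):
    """Degree (0..1) to which each point belongs to each cluster; rows sum to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point lies exactly on one or more centers: share membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            rows.append([1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists])
    return rows

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    dims = len(points[0])
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        new_centers = []
        for k, old in enumerate(centers):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(list(old))  # no point supports this cluster
                continue
            new_centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dims)]
            )
        centers = new_centers
    return centers, memberships_for(points, centers, m)

# Hypothetical indicator scores for five ports (not data from the study).
ports = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8], [3.0, 3.0]]
centers, u = fuzzy_c_means(ports, centers=[[0.0, 0.0], [6.0, 6.0]])
for row in u:
    print([round(x, 2) for x in row])
```

Unlike a hard clustering, each port receives a membership in every group. In this toy data the fifth port sits between the two groups and ends up with roughly equal membership in both, which is the kind of graded result a crisp partition cannot express.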

Source Characteristics of Particulate Trace Metals in Daegu Area (대구지역 부유분진 중 미량금속성분의 발생원 특성연구)

  • 최성우;송형도
    • Journal of Korean Society for Atmospheric Environment
    • /
    • v.16 no.5
    • /
    • pp.469-476
    • /
    • 2000
  • This study was performed to understand the behavior and source characteristics of particulate trace metals in the Daegu area. A total of 84 samples were collected from January to December 1999. TSP (total suspended particulate matter) and PM-10 (particulate matter with aerodynamic diameters less than 10 $\mu$m) were collected on filters with a portable air sampler, and the trace metals in TSP and PM-10 were analyzed by ICP (Inductively Coupled Plasma Spectrometer) after preliminary treatment. The results were as follows. First, the annual mean concentrations of TSP and PM-10 were 123 and 69 $\mu$g/㎤, respectively, and both were highest in winter compared to the other seasons. Second, the concentrations of Al, Fe and Mn were higher in TSP than in PM-10, indicating that these metals are generally associated with natural contributions. Third, a hierarchical clustering technique was used to group 9 metals. The cluster analyses of TSP and PM-10 show a similar clustering pattern: Fe and Al in one group and the remaining metals (Ni, Cr, As, Mn, Cd, Pb, Zn) in the other. The group containing Fe and Al is associated with natural sources such as soil and dust. The other group is closely related to urban anthropogenic sources such as fuel combustion, incineration, and refuse burning. Finally, using Al as a reference element, enrichment factors were used to identify the major particulate contributors. Al and Fe, with enrichment factors below 10 (the standard value of the enrichment factor), were considered to have a significant dust and soil source and were termed non-enriched. Ni, Cr, As, Mn, Cd, Pb and Zn, with enrichment factors above 10, were enriched and had a significant contribution from anthropogenic sources.
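The enrichment factor with Al as the reference element is conventionally the ratio of an element's concentration to Al in the sample, divided by the same ratio in crustal material. The short Python sketch below applies that definition with the cutoff of 10 mentioned in the abstract. The concentration values and the crustal abundances are placeholders invented for illustration, not measurements from the study.

```python
def enrichment_factor(sample, crust, element, reference="Al"):
    """(element / reference) in the sample divided by (element / reference) in crust."""
    return (sample[element] / sample[reference]) / (crust[element] / crust[reference])

def classify(ef, cutoff=10.0):
    """Label an element by comparing its enrichment factor with the cutoff."""
    return "enriched" if ef > cutoff else "non-enriched"

# Placeholder numbers (assumed); any consistent unit works because only ratios are used.
sample = {"Al": 2.0, "Fe": 1.5, "Pb": 0.1}
crust = {"Al": 80000.0, "Fe": 50000.0, "Pb": 15.0}

for metal in ("Fe", "Pb"):
    ef = enrichment_factor(sample, crust, metal)
    print(metal, round(ef, 1), classify(ef))
```

With these placeholder values Fe stays close to its crustal ratio while Pb is far above it, which mirrors the distinction the abstract draws between soil-derived and anthropogenic metals.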

Nonstandard Machine Learning Algorithms for Microarray Data Mining

  • Zhang, Byoung-Tak
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2001.10a
    • /
    • pp.165-196
    • /
    • 2001
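  • A DNA chip, or microarray, is a technology that fixes a large number of genes or gene fragments (usually thousands to tens of thousands) on a chip and uses DNA hybridization reactions to analyze gene expression patterns. Such high-throughput technology can not only offer answers to many problems in molecular biology that were previously unthinkable, but also has unlimited potential applications, including molecular-level disease diagnosis, new drug development, and solutions to environmental pollution problems. For practical application of this technology, in addition to the hardware/wetware technology for fabricating DNA chips, bioinformatics technology for extracting as much useful and new knowledge as possible from such data is key. The problem of data mining gene expression patterns can be broadly divided into clustering, classification, and dependency analysis, and these techniques are grounded in statistics and in machine learning from artificial intelligence. The methods mainly used include principal component analysis, hierarchical clustering, k-means, self-organizing maps, decision trees, multilayer perceptron neural networks, and association rules. In this seminar, beyond these basic machine learning techniques, we introduce the probabilistic graphical model (PGM) as a new learning technique under recent study and review research that applies it to DNA chip data analysis. PGM is a machine learning model formed by combining artificial neural networks, graph theory, and probability theory; it is based on the memory and learning mechanisms of the human brain, and one of its major differences from other machine learning models is that it is a generative model. That is, once a model is built, it can generate new data, so the model can be verified and new facts can be inferred from it, which makes it suitable for exploratory analysis that discovers new knowledge, as in biological data mining problems. In addition, unlike conventional neural network models, the probabilistic graphical model performs probability-based soft inference rather than deterministic decision making, and it has the advantage of being well suited to analyzing causal relationships or dependencies among related factors from the learned model. As concrete examples of PGM models, we introduce the structure and the learning and inference algorithms of Bayesian networks, nonnegative matrix factorization (NMF), and generative topographic mapping (GTM), and review the results of applying them to the cancer diagnosis and gene-drug dependency analysis problems used in CAMDA-2000 and CAMDA-2001, the DNA chip data analysis evaluation competitions.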

User-interface Considerations for the Main Button Layout of the Tactical Computer for Korea Army (한국군 전술컴퓨터의 인간공학적 메인버튼 설계)

  • Baek, Seung-Chang;Jung, Eui-S.;Park, Sung-Joon
    • Journal of the Ergonomics Society of Korea
    • /
    • v.28 no.4
    • /
    • pp.147-154
    • /
    • 2009
  • The tactical computer is currently being developed and installed in armored vehicles and tanks for reinforcement. With the tactical computer, the Korea Army will be able to grasp the deployment status of its own forces, the enemy, and obstacles under varying situations. Furthermore, it makes the exchange of command and tactical intelligence possible. Recent studies have shown that task performance is greatly affected by the user interface. The U.S. Army is now conducting user-centered evaluation tests based on C2 (Command & Control) to develop tactical intelligence machinery and tools. This study aims to classify and regroup subordinate menu functions according to user-centered task performance for the Korea Army's tactical computer. The research also suggests an ergonomically sound layout and size for the main touch buttons by considering human factors guidelines for button design. To achieve this goal, eight hierarchical groups of subordinate menu functions were first derived through clustering analysis, and each group of menu functions was then renamed. Based on the suggested menu structure, new button locations and sizes were tested in terms of response time, number of errors, and subjective preference by comparing them to the existing ones. The results showed that performance in conducting tactical missions was best when the number of buttons or functions was eight. An improved button size and location were also suggested through the experiment. In addition, the location and size of the buttons were found to interact with respect to user preference.

Characteristics of Source Acupoints: Data Mining of Clinical Trials Database (데이터 마이닝을 이용한 임상연구 데이터베이스 기반 원혈의 주치 특성)

  • Choi, Dha-Hyun;Lee, Seoyoung;Lee, In-Seon;Ryu, Yeonhee;Chae, Younbyoung
    • Korean Journal of Acupuncture
    • /
    • v.38 no.2
    • /
    • pp.100-109
    • /
    • 2021
  • Objectives : A Source acupoint is one of the representative acupoints used to treat various diseases in each meridian. We aimed to identify the patterns of selection of Source acupoints and their associations with diseases using clinical trials data. Methods : We extracted the frequency of Source acupoints across 30 diseases from a clinical trials database. Acupuncture treatment regimens were retrieved from the Cochrane Database of Systematic Reviews. The frequency of Source acupoint use was calculated as the number of studies using a certain acupoint divided by the total number of included studies. Using hierarchical clustering and multidimensional scaling, the characteristics of Source acupoints were analyzed based on the similarity of the relationships between the Source acupoints and the diseases. Results : A total of 421 clinical trials were included in this analysis. LR3, HT7, KI3, and LI4 acupoints were most frequently used for the treatment of the 30 diseases. Cluster analysis showed that LR3 and LI4 acupoints were grouped together and HT7 and KI3 acupoints were grouped together. Multidimensional scaling revealed that LR3, LI4, HT7, and KI3 acupoints have intrinsic properties in the two-dimensional space. Conclusions : The present study identified the selection patterns of the Source acupoints using clinical trials data. Our findings will help in understanding the characteristics of Source acupoints.
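The frequency measure and the clustering step in the Methods can be sketched in a few lines of Python. This is only an illustration: the trial records are invented, the two disease names are placeholders, and average linkage with Euclidean distance is an assumed choice, since the abstract does not state which linkage or distance the authors used.

```python
import math

def usage_frequency(trials, acupoint):
    """Number of trials using the acupoint divided by the number of trials."""
    return sum(acupoint in regimen for regimen in trials) / len(trials)

def average_linkage(a, b, vectors):
    """Mean pairwise Euclidean distance between the members of two clusters."""
    return sum(math.dist(vectors[i], vectors[j]) for i in a for j in b) / (len(a) * len(b))

def agglomerate(vectors, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in vectors]
    while len(clusters) > n_clusters:
        pairs = [
            (average_linkage(clusters[i], clusters[j], vectors), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Invented trial regimens per disease (placeholders, not data from the study).
trials_by_disease = {
    "disease_A": [{"LR3", "LI4"}, {"LI4"}, {"LR3", "LI4", "HT7"}],
    "disease_B": [{"HT7", "KI3"}, {"HT7"}, {"KI3", "HT7", "LR3"}],
}
acupoints = ["LR3", "LI4", "HT7", "KI3"]

# One frequency vector per acupoint: its usage frequency in each disease.
vectors = {
    a: [usage_frequency(trials, a) for trials in trials_by_disease.values()]
    for a in acupoints
}
groups = agglomerate(vectors, n_clusters=2)
print(groups)
```

Each acupoint is represented by its usage profile across diseases, so two acupoints end up in the same cluster when they tend to be chosen for the same conditions.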

Antioxidant Activities and Total Phenolic Contents of Three Legumes

  • Lee, Kyung Jun;Kim, Ga-Hee;Lee, Gi-An;Lee, Jung-Ro;Cho, Gyu-Taek;Ma, Kyung-Ho;Lee, Sookyeong
    • Korean Journal of Plant Resources
    • /
    • v.34 no.6
    • /
    • pp.527-535
    • /
    • 2021
  • Legumes have been important components of the human diet. They contain not only protein, starch, and dietary fiber, but also various phenolic compounds such as flavonoids and phenolic acids. The importance of phenolic compounds to human health is well known due to their antioxidant activities. In this study, three legumes (adzuki beans, common beans, and black soybeans) frequently cultivated in Korea were evaluated for their total phenolic content (TPC) and antioxidant activities using DPPH (2,2-diphenyl-1-picrylhydrazyl), ABTS (2,2'-azinobis (3-ethylbenzothiazoline 6-sulfonate)), and FRAP (ferric reducing antioxidant potential) assays. In addition, correlations between agricultural traits and antioxidant activities of these three legumes were analyzed. Antioxidant activities assessed by DPPH, ABTS, and FRAP assays and TPC showed wide variations among legume types and accessions. Among the three legumes, adzuki beans showed higher TPC and antioxidant activity than the other two legumes. In correlation analysis, seed size showed negative correlations with antioxidant activities and TPC. In principal component analysis and hierarchical clustering analysis, the three legumes were clearly separated from one another. Results of this study can be used as basic information for developing functional materials for each legume. They can also help us understand the overall antioxidant activity of the three legumes.
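A negative correlation of the kind reported for seed size can be checked with a plain Pearson coefficient, sketched below in Python. The abstract does not say which correlation statistic was used, so Pearson is an assumption here, and the seed weights and TPC values are invented numbers for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented accession data: larger seeds paired with lower total phenolic content.
seed_weight = [10.0, 15.0, 22.0, 30.0]
tpc = [5.1, 4.2, 3.8, 2.9]
r = pearson(seed_weight, tpc)
print(round(r, 3))
```

A value of r close to -1 would indicate that accessions with larger seeds tend to have lower TPC, while a value near 0 would indicate no linear relationship.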

A Classification of Luxury Fashion Brands' E-commerce Sites

  • Kim, Sunghee
    • Journal of Fashion Business
    • /
    • v.17 no.6
    • /
    • pp.125-140
    • /
    • 2013
  • The aim of this study was to analyze e-commerce sites of luxury fashion brands in order to provide insights on how to enhance online site quality. For the research, forty-eight components of thirty-one luxury fashion brands' e-commerce sites were investigated during October 2013. To cluster the e-commerce site components and to segment the e-commerce sites of luxury brands, a hierarchical cluster analysis was applied using Ward's method and squared Euclidean distance for binary data. Further, Fisher's exact test was applied in order to distinguish three groups of characteristics in the luxury e-commerce sites. These analyses were carried out with SPSS 21. The results indicated that the components of e-commerce sites were grouped into three categories: basic elements, additional elements and elements of building brand identity. These components were categorized by whether their functions were basic and essential or additional and advanced. The other norm of categorization was related to brand identity. Furthermore, the luxury brands' e-commerce sites were segmented into three groups: a group endeavoring to promote goods, a group of undistinguished performance, and a group endeavoring to intensify brand identity. In this segmentation, brand identity or promotional aspects were decisive. Overall, luxury brands were trying to convey their traditional strength through their e-commerce sites. In order to achieve this purpose, brand identity or promotional aspects played an important role.
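To make the combination of Ward's method and squared Euclidean distance on binary data concrete, here is a small Python sketch. It is a naive illustration and not a reproduction of the SPSS analysis: the site names and their presence/absence vectors are hypothetical. For 0/1 vectors the squared Euclidean distance equals the number of components on which two sites differ, and Ward's criterion merges the pair of clusters whose union increases the within-cluster sum of squares the least.

```python
def squared_euclidean(u, v):
    """Squared Euclidean distance; for 0/1 vectors this counts mismatches."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def centroid(rows):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    scale = len(a) * len(b) / (len(a) + len(b))
    return scale * squared_euclidean(centroid(a), centroid(b))

def ward_clustering(features, n_clusters):
    """Naive agglomeration by Ward's criterion; returns lists of item names."""
    clusters = [[name] for name in features]
    while len(clusters) > n_clusters:
        _, i, j = min(
            (
                ward_cost(
                    [features[n] for n in clusters[i]],
                    [features[n] for n in clusters[j]],
                ),
                i,
                j,
            )
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical sites; each position marks whether a site component is present (1) or absent (0).
sites = {
    "site_a": [1, 1, 0, 0],
    "site_b": [1, 1, 1, 0],
    "site_c": [0, 0, 1, 1],
    "site_d": [0, 1, 1, 1],
}
groups = ward_clustering(sites, n_clusters=2)
print(groups)  # [['site_a', 'site_b'], ['site_c', 'site_d']]
```

The same routine can cluster components instead of sites by transposing the table, which corresponds to the two uses of cluster analysis described in the abstract.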

Effect of Herbicide Combinations on Bt-Maize Rhizobacterial Diversity

  • Valverde, Jose R.;Marin, Silvia;Mellado, Rafael P.
    • Journal of Microbiology and Biotechnology
    • /
    • v.24 no.11
    • /
    • pp.1473-1483
    • /
    • 2014
  • Reports of herbicide resistance events are proliferating worldwide, leading to new cultivation strategies using combinations of pre-emergence and post-emergence herbicides. We analyzed the impact during a one-year cultivation cycle of several herbicide combinations on the rhizobacterial community of glyphosate-tolerant Bt-maize and compared them to those of the untreated or glyphosate-treated soils. Samples were analyzed using pyrosequencing of the V6 hypervariable region of the 16S rRNA gene. The sequences obtained were subjected to taxonomic, taxonomy-independent, and phylogeny-based diversity studies, followed by a statistical analysis using principal components analysis and hierarchical clustering with jackknife statistical validation. The resilience of the microbial communities was analyzed by comparing their relative composition at the end of the cultivation cycle. The bacterial communities from soil subjected to a combined treatment with mesotrione plus s-metolachlor followed by glyphosate were not statistically different from those treated with glyphosate or the untreated ones. The use of acetochlor plus terbuthylazine followed by glyphosate, and the use of aclonifen plus isoxaflutole followed by mesotrione, clearly affected the resilience of their corresponding bacterial communities. The treatment with pethoxamid followed by glyphosate resulted in an intermediate effect. The use of glyphosate alone seems to be the least aggressive one for bacterial communities. Should a combined treatment be needed, the combination of mesotrione and s-metolachlor shows the next best final resilience. Our results show the relevance of comparative rhizobacterial community studies when novel combined herbicide treatments are deemed necessary to control weed growth.

Dynamic Text Categorizing Method using Text Mining and Association Rule

  • Kim, Young-Wook;Kim, Ki-Hyun;Lee, Hong-Chul
    • Journal of the Korea Society of Computer and Information
    • /
    • v.23 no.10
    • /
    • pp.103-109
    • /
    • 2018
  • In this paper, we propose a dynamic document classification method that breaks away from existing document classification methods, whose artificial categorization rules focus on suppliers, and instead changes its categorization rules according to users' needs or social trends. The core of this method is that it creates classification criteria in real time using topic modeling techniques, without standardized category rules, so users are not forced into unnecessary frames. In addition, it can search details through relevance analysis, which calculates relationships between words that are difficult to grasp from word frequency alone. Rather than logical and systematic documents, the proposed method is more effective for situation analysis and information retrieval on unstructured data that do not fit existing classification categories, such as VOC (Voice Of Customer), SNS and customer reviews of Internet shopping malls, and it can react flexibly to users' needs. Furthermore, it involves no process in which suppliers select the classification rules, and a misclassification requires no manual work, which reduces unnecessary workload.
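The idea of relating words beyond raw frequency can be illustrated with the standard association-rule measures of support, confidence and lift, sketched below in Python. This is an assumed, simplified reading of the relevance analysis mentioned in the abstract, not the authors' pipeline; the tokenized reviews are invented, and the topic modeling step is not shown.

```python
def support(docs, *words):
    """Fraction of documents that contain every one of the given words."""
    return sum(all(w in doc for w in words) for doc in docs) / len(docs)

def confidence(docs, antecedent, consequent):
    """Of the documents containing the antecedent, the share that also contain the consequent."""
    return support(docs, antecedent, consequent) / support(docs, antecedent)

def lift(docs, antecedent, consequent):
    """Confidence relative to how often the consequent appears overall."""
    return confidence(docs, antecedent, consequent) / support(docs, consequent)

# Invented customer reviews, already tokenized into sets of words.
reviews = [
    {"delivery", "late", "refund"},
    {"delivery", "late"},
    {"delivery", "fast"},
    {"size", "small", "refund"},
]

print(support(reviews, "delivery", "late"))               # 0.5
print(confidence(reviews, "late", "delivery"))            # 1.0
print(round(confidence(reviews, "delivery", "late"), 3))  # 0.667
print(round(lift(reviews, "late", "delivery"), 3))        # 1.333
```

Here "late" always co-occurs with "delivery" even though "late" is not among the most frequent words, which is the kind of relationship that a frequency count alone would not surface.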