• Title/Summary/Keyword: Data Clustering


Video Retrieval System supporting Adaptive Streaming Service (적응형 스트리밍 서비스를 지원하는 비디오 검색 시스템)

  • 이윤채;전형수;장옥배
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.9 no.1
    • /
    • pp.1-12
    • /
    • 2003
  • Recently, much research has been conducted on distributed processing over the Internet and on multimedia data processing. Fast, convenient, high-quality multimedia services are needed. In this paper, we design and implement a clip-based video retrieval system that works in real time in the Web environment. Our system consists of a content-based indexing system, which supports convenient services for video content providers, and a Web-based retrieval system, which gives users easy and varied ways to retrieve information on the Web. The content-based indexing system uses three methods: key frame extraction by dividing the video data, clip file creation by clustering related information, and video database construction using the clip as the unit. The Web-based retrieval system uses keyword-based retrieval, two-dimensional browsing of key frames, and real-time display of clips. The resulting system supports real-time retrieval of video clips in the Web environment and provides the multimedia service stably. The proposed methods are useful for providing video content and give users an easy way to search for the video content they intend to find.

Development of an Activity-Based Conceptual Cost Estimating Model for P.S.C Box Girder Bridge (대표공종 기반의 P.S.C 박스 거더교 개략공사비 산정모델 개발 -상부공사 중심으로-)

  • Cho, Ji-Hoon;Kim, Sang-Bum
    • Proceedings of the Korean Institute Of Construction Engineering and Management
    • /
    • 2008.11a
    • /
    • pp.197-201
    • /
    • 2008
  • Conceptual cost estimates for domestic highway projects have generally been made using governmental unit-price references. Inaccuracies in the governmental unit-price data have repeatedly been pointed out in the Korean construction industry, and they often lead to poor decision making and cost management practices. The need for a better method of conceptual cost estimating is therefore widely recognized. This research is a first step toward such a model, using real-world cost data based on actual construction activities. The data analyzed in this paper cover 41 P.S.C (Prestressed Concrete) box girder bridges, divided into four categories by construction method: I.L.M (Incremental Launching Method), M.S.S (Movable Scaffolding System), F.S.M (Full Staging Method), and F.C.M (Free Cantilever Method). Actual design documents, including cost estimating documents, drawings, and specifications, were carefully reviewed to break down the cost structures of P.S.C girder bridges. Among more than 40 cost categories for each P.S.C girder bridge type, 7 were identified that together accounted for more than 95% of total construction cost (ILM: 99.47%, MSS: 99.22%, FSM: 98.18%, and FCM: 98.12%). To validate the clustering of cost categories, the variation of each cost category was investigated; it ranged from -1.16% to 0.59%.


Patent data analysis using clique analysis in a keyword network (키워드 네트워크의 클릭 분석을 이용한 특허 데이터 분석)

  • Kim, Hyon Hee;Kim, Donggeon;Jo, Jinnam
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.5
    • /
    • pp.1273-1284
    • /
    • 2016
  • In this paper, we analyzed patents on machine learning using keyword network analysis and clique analysis. To construct a keyword network, important keywords were extracted based on their TF-IDF weights and their associations, and network structure analysis and clique analysis were performed. The density and clustering coefficient of the patent keyword network are low, which shows that patent keywords on machine learning are only weakly connected with each other. This is because the important patents on machine learning are mainly registered for application systems of machine learning rather than for machine learning techniques. Our clique analysis also showed that the keywords found by cliques in 2005 patents cover subjects such as newsmaker verification, product forecasting, virus detection, biomarkers, and workflow management, while those in 2015 patents cover subjects such as digital imaging, payment cards, calling systems, mammogram systems, and price prediction. Clique analysis can be used not only to identify specialized subjects, but also to suggest search keywords in patent search systems.
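
The network measures named in this abstract (density, clustering coefficient, cliques) can be illustrated with a minimal Python sketch. It assumes the keywords have already been extracted per patent and links two keywords when they co-occur in the same patent; the function names and this linking rule are illustrative assumptions, not the authors' procedure, and the brute-force clique search is only practical for small networks.

```python
from itertools import combinations


def build_network(docs_keywords):
    """Undirected keyword network: keywords in the same document are linked.

    The co-occurrence linking rule is an assumption made for illustration.
    """
    adj = {}
    for kws in docs_keywords:
        for kw in kws:
            adj.setdefault(kw, set())
        for a, b in combinations(sorted(set(kws)), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def density(adj):
    """Fraction of possible edges that are present."""
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 0.0 if n < 2 else 2 * edges / (n * (n - 1))


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


def cliques_of_size(adj, k):
    """Brute-force search for fully connected keyword groups of size k."""
    return [set(c) for c in combinations(sorted(adj), k)
            if all(b in adj[a] for a, b in combinations(c, 2))]


# Hypothetical keyword lists, one per patent.
docs = [["virus", "detection", "classifier"],
        ["payment", "card"],
        ["virus", "detection"]]
network = build_network(docs)
```

A low density together with a low average clustering coefficient is what the abstract describes as weakly connected keywords.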

SOM-Based $R^{*}$-Tree for Similarity Retrieval (자기 조직화 맵 기반 유사 검색 시스템)

  • O, Chang-Yun;Im, Dong-Ju;O, Gun-Seok;Bae, Sang-Hyeon
    • The KIPS Transactions:PartD
    • /
    • v.8D no.5
    • /
    • pp.507-512
    • /
    • 2001
  • Feature-based similarity retrieval has become an important research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects. The performance of conventional multidimensional data structures tends to deteriorate as the number of dimensions of the feature vectors increases. The $R^{*}$-Tree is the most successful variant of the R-Tree. In this paper, we propose a SOM-based $R^{*}$-Tree as a new indexing method for high-dimensional feature vectors. The SOM-based $R^{*}$-Tree combines a SOM and an $R^{*}$-Tree to achieve search performance that scales better to high dimensionalities. Self-Organizing Maps (SOMs) provide a mapping from high-dimensional feature vectors onto a two-dimensional space. The map is called a topological feature map; it preserves the mutual relationships (similarity) in the feature space of the input data, clustering mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a codebook vector. We experimentally compare the retrieval time cost of the SOM-based $R^{*}$-Tree with that of a SOM and an $R^{*}$-Tree, using color feature vectors extracted from 40,000 images. The results show that the SOM-based $R^{*}$-Tree outperforms both the SOM and the $R^{*}$-Tree, owing to the reduction in the number of nodes needed to build the $R^{*}$-Tree and in retrieval time cost.
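
How a SOM maps a feature vector to a node and keeps similar vectors in neighboring nodes can be sketched in Python as follows. This is a generic textbook-style SOM step under assumed parameters (`lr`, `radius`, a 2x2 grid); it does not reproduce the authors' training setup and leaves out the $R^{*}$-Tree entirely.

```python
import math


def bmu(codebook, x):
    """Grid position of the node whose codebook vector is closest to x."""
    return min(codebook, key=lambda pos: math.dist(codebook[pos], x))


def train_step(codebook, x, lr=0.5, radius=1.0):
    """Pull the best-matching unit and its grid neighbours toward x.

    A Gaussian neighbourhood on the 2-D grid weights the update, so nodes
    near the winner move more than distant ones. lr and radius are
    illustrative values, not taken from the paper.
    """
    win = bmu(codebook, x)
    for pos, w in codebook.items():
        d2 = (pos[0] - win[0]) ** 2 + (pos[1] - win[1]) ** 2
        h = math.exp(-d2 / (2 * radius ** 2))
        codebook[pos] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return win


# Hypothetical 2x2 map over 3-dimensional (e.g. colour) feature vectors.
codebook = {
    (0, 0): [0.0, 0.0, 0.0],
    (0, 1): [1.0, 0.0, 0.0],
    (1, 0): [0.0, 1.0, 0.0],
    (1, 1): [1.0, 1.0, 1.0],
}
x = [0.9, 0.1, 0.0]
before = math.dist(codebook[bmu(codebook, x)], x)
winner = train_step(codebook, x)
after = math.dist(codebook[winner], x)
```

After training, only the codebook vectors (one per node) would need to be indexed, which is consistent with the abstract's point about reducing the number of nodes used to build the tree.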


Trend Analysis of Corona Virus(COVID-19) based on Social Media (소셜미디어에 나타난 코로나 바이러스(COVID-19) 인식 분석)

  • Yoon, Sanghoo;Jung, Sangyun;Kim, Young A
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.22 no.5
    • /
    • pp.317-324
    • /
    • 2021
  • This study deals with keywords from social media on domestic portal sites related to COVID-19, which is spreading widely. The data were collected between January 20 and August 15, 2020, and were divided into three stages. The precursor period, between January 20 and February 17, is before COVID-19 started spreading widely; the serious period, between February 18 and April 20, denotes the spread in Daegu; and the stable period, up to August 15, is when the number of confirmed infections decreased. The top 50 words were extracted and clustered based on TF-IDF. As a result of the analysis, the precursor period keywords corresponded to congestion of the Situation. The frequent keywords in the serious period were Nation and Infection Route, along with instability surrounding the Treatment of COVID-19. The most common keywords in all periods were infection, mask, person, occurrence, confirmation, and information. People's emotions became more positive as time went by. Cafes and blogs share text containing writers' thoughts and subjectivity via the internet, so they are the main information-sharing spaces in the non-face-to-face era caused by COVID-19. However, since selectivity and randomness exist in information delivery, a critical view of the information produced on social media is necessary.
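
Ranking words by TF-IDF, as mentioned in this abstract, can be sketched in Python as below. The weighting used here (relative term frequency times log of inverse document frequency, summed over documents) is one common variant and an assumption; the abstract does not state which formula or tokenizer the authors used, and the sample posts are invented.

```python
import math
from collections import Counter


def tfidf(docs):
    """Per-document TF-IDF scores; each doc is a list of tokens."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: (c / len(doc)) * math.log(n / df[t])
                       for t, c in tf.items()})
    return scores


def top_terms(docs, k):
    """The k terms with the highest summed TF-IDF across all documents."""
    total = Counter()
    for doc_scores in tfidf(docs):
        total.update(doc_scores)  # adds the float scores term by term
    return [term for term, _ in total.most_common(k)]


# Hypothetical tokenized posts.
posts = [["mask", "infection", "mask"],
         ["mask", "vaccine"],
         ["mask", "infection"]]
ranked = top_terms(posts, 2)
```

A word that appears in every post, such as "mask" here, gets a weight of zero under this variant, which is why TF-IDF ranking differs from a plain frequency count.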

A Cluster Based Energy Efficient Tree Routing Protocol in Wireless Sensor Networks (광역 WSN 을 위한 클러스팅 트리 라우팅 프로토콜)

  • Nurhayati, Nurhayati;Choi, Sung-Hee;Lee, Kyung-Oh
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2011.04a
    • /
    • pp.576-579
    • /
    • 2011
  • Wireless sensor networks are widely used in many different fields. Because of their distinctive characteristics, energy consumption must be taken into account when designing a routing protocol. Wireless sensor networks consist of small battery-powered devices with limited energy resources. Once deployed, the small sensor nodes are usually inaccessible to the user, so replacing the energy source is not feasible. Hence, energy efficiency is a key design issue that must be addressed in order to extend the life span of the network. In BCDCP, all sensors send data to the CH (Cluster Head), which then forwards it to the BS (Base Station). BCDCP works well in a small-scale network but is not preferred in a large-scale network, since it uses much energy for long-distance wireless communication. TBRP can be used for large-scale networks, but its weakness is that nodes run out of energy easily, since it uses multi-hop data transmission to the Base Station. Here, we propose a routing protocol, A Cluster Based Energy Efficient Tree Routing Protocol (CETRP) in Wireless Sensor Networks (WSNs), to prolong network lifetime through balanced energy consumption. CETRP selects Cluster Heads in a cluster tree shape and uses at most two hops of data transmission to the Cluster Head at every level. Several experiments show that CETRP outperforms BCDCP and TBRP.

Affinity-based Dynamic Transaction Routing in a Shared Disk Cluster (공유 디스크 클러스터에서 친화도 기반 동적 트랜잭션 라우팅)

  • 온경오;조행래
    • Journal of KIISE:Databases
    • /
    • v.30 no.6
    • /
    • pp.629-640
    • /
    • 2003
  • A shared disk (SD) cluster couples multiple nodes for high-performance transaction processing, and all the coupled nodes share a common database at the disk level. In an SD cluster, transaction routing means selecting the node on which an incoming transaction is executed. Affinity-based routing can increase the local buffer hit ratio of each node by clustering transactions that reference similar data so that they execute on the same node. However, affinity-based routing adapts poorly to changes in the system load, so a specific node becomes overloaded if transactions of some class are congested. In this paper, we propose a dynamic transaction routing scheme that can achieve an optimal balance between affinity-based routing and dynamic load balancing across all the nodes in the SD cluster. The proposed scheme is novel in that it can improve system performance by increasing the local buffer hit ratio and reducing the buffer invalidation overhead.
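
The trade-off between affinity and load balancing described in this abstract can be illustrated with a small Python sketch. The threshold rule below is a hypothetical stand-in chosen for simplicity, not the scheme proposed in the paper; `affinity`, `load`, and `overload_threshold` are all assumed names.

```python
def route(txn_class, affinity, load, overload_threshold):
    """Pick a node for a transaction of the given class.

    Prefer the node with affinity for the class (better local buffer hit
    ratio); fall back to the least-loaded node once the preferred node
    reaches the threshold. The threshold rule is an illustrative assumption.
    """
    preferred = affinity[txn_class]
    if load[preferred] < overload_threshold:
        target = preferred
    else:
        target = min(load, key=load.get)
    load[target] += 1  # count the transaction as active on the chosen node
    return target


# Hypothetical cluster: two transaction classes, three nodes.
affinity = {"orders": "node1", "billing": "node2"}
load = {"node1": 0, "node2": 0, "node3": 0}
targets = [route("orders", affinity, load, overload_threshold=2)
           for _ in range(3)]
```

With a burst of three "orders" transactions, the first two stay on the affinity node and the third spills over to another node, which is the overload case that pure affinity-based routing cannot handle.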

Annotation and Expression Profile Analysis of cDNAs from the Antarctic Diatom Chaetoceros neogracile

  • Jung, Gyeong-Seo;Lee, Choul-Gyun;Kang, Sung-Ho;Jin, Eon-Seon
    • Journal of Microbiology and Biotechnology
    • /
    • v.17 no.8
    • /
    • pp.1330-1337
    • /
    • 2007
  • To better understand gene expression in the cold-adapted polar diatom, we surveyed the Chaetoceros neogracile transcriptome by cDNA sequencing and analyzed the expression of cDNAs of interest from this Antarctic diatom. A non-normalized cDNA library was constructed from C. neogracile, and a total of 2,500 cDNAs were sequenced to generate 1,881 high-quality expressed sequence tags (ESTs) (accession numbers EL620615-EL622495). Based on their clustering, we identified 154 unique clusters comprising 342 ESTs. The remaining 1,540 ESTs did not cluster. The number of unique genes identified in the data set is thus estimated to be 1,694. Using various tools and databases, putative functions were assigned to 939 (55.4%) of these genes. Of the remaining 540 (31.9%) unknown sequences, 215 (12.7%) appeared to be C. neogracile-specific since they lacked any significant sequence similarity to any sequence available in the public databases. The C. neogracile data set contained a relatively high percentage of genes involved in metabolism, genetic information processing, cellular processes, defense or stress resistance, photosynthesis, structure, and signal transduction. From the ESTs, the expression of these putative C. neogracile genes was investigated: fucoxanthin chlorophyll (chl) a,c-binding protein (FCP), ascorbate peroxidase (ASP), and heat-shock protein 90 (HSP90). The abundance of ASP and HSP90 changed substantially in response to different culture conditions, indicating the possible regulation of these genes in C. neogracile.

Gene Expression Profiling of the Rewarding Effect Caused by Methamphetamine in the Mesolimbic Dopamine System

  • Yang, Moon Hee;Jung, Min-Suk;Lee, Min Joo;Yoo, Kyung Hyun;Yook, Yeon Joo;Park, Eun Young;Choi, Seo Hee;Suh, Young Ju;Kim, Kee-Won;Park, Jong Hoon
    • Molecules and Cells
    • /
    • v.26 no.2
    • /
    • pp.121-130
    • /
    • 2008
  • Methamphetamine (METH), a commonly used addictive drug, is a powerful stimulant that dramatically affects the CNS. Repeated METH administration leads to a rewarding effect in a state of addiction that includes sensitization, dependence, and other phenomena. It is well known that susceptibility to the development of addiction is influenced by sources of reinforcement, variable neuroadaptive mechanisms, and neurochemical changes that together lead to altered homeostasis of the brain reward system. These behavioral abnormalities reflect neuroadaptive changes in signal transduction function and cellular gene expression produced by repeated drug exposure. To provide a better understanding of addiction and the mechanism of the rewarding effect, it is important to identify related genes. In the present study, we performed gene expression profiling using microarray analysis in a reward effect animal model. We also investigated gene expression in four important regions of the brain, the nucleus accumbens, striatum, hippocampus, and cingulate cortex, and analyzed the data by two clustering methods. Genes related to signaling pathways, including G-protein-coupled receptor-related pathways, predominated among the identified genes. The genes identified in our study may contribute to the development of a gene modeling network for methamphetamine addiction.

Multivariate Analysis among Leaf/Smoke Components and Sensory Properties about Tobacco Leaves Blending Ratio

  • Lee Seung-Yong;Lee Whan-Woo;Lee Kyung-Ku;Kim Young-Hoh
    • Journal of the Korean Society of Tobacco Science
    • /
    • v.27 no.1 s.53
    • /
    • pp.141-152
    • /
    • 2005
  • This study focused on the relationships among leaf and smoke components and sensory properties following tobacco leaf blending. A completely randomized experimental design was used to evaluate leaf components, smoke components, and sensory properties for sample cigarettes made with four mixtures of flue-cured and burley tobacco (40:60, 60:40, 80:20, and 100:0). Eleven leaf components, six smoke components, and eight sensory properties of smoking taste were analyzed. A sensory evaluation method known as quantitative descriptive analysis was used to evaluate perceptual strength on a fifteen-point scale. Raw data from ten trained panelists were obtained and statistically analyzed. Following the MANOVA, clustering analysis, a correlation matrix, and the partial least squares (PLS) method were applied to find out which smoke component most affected sensory properties. The PLS method was used to remove the influence between explanatory variables in the leaf and smoke components derived from the results. High correlations (p<0.01) were found among ten specific leaf and smoke components and sensory attributes. Total nitrogen, ammonia, total volatile base, and nitrate in the leaf were significantly correlated (p<0.05) with impact, bitterness, tobacco taste, irritation, smoke volume, and smoke pungency. From the results of the PLS analysis, influential variables were used to explain the correlations. For bitterness, two explanatory variables, Leaf $NO_3$ and Leaf crude fiber, were enough to estimate the correlation. In the distance-weighted least squares fitting analysis, carbon monoxide strongly influenced bitterness, hay-like taste, and smoke volume.