• Title/Summary/Keyword: retrieval system


Determining the number of Clusters in On-Line Document Clustering Algorithm (온라인 문서 군집화에서 군집 수 결정 방법)

  • Jee, Tae-Chang;Lee, Hyun-Jin;Lee, Yill-Byung
    • The KIPS Transactions: Part B / v.14B no.7 / pp.513-522 / 2007
  • Clustering divides given data and automatically uncovers the hidden structure in them. It analyzes data that are too large for people to inspect in detail and forms several clusters of items with similar characteristics. An on-line document clustering system, which groups similar documents from search engine results, aims to make information retrieval more convenient. Document clustering is performed automatically without human intervention, so the number of clusters, which strongly affects the clustering result, should also be decided automatically. In addition, an on-line system must guarantee a fast response time. This paper proposes a method for automatically determining the number of clusters from geometrical information. The proposed method consists of two stages: in the first stage, the cluster centers are projected onto a low-dimensional plane, and in the second stage, clusters are merged according to the distances between their projected centers. Experiments with real data showed that clustering performance improved and that the response time is suitable for an on-line environment.
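
A minimal sketch of the two-stage idea described in this abstract, under stated assumptions: the abstract does not specify the projection or the merge rule, so PCA is used for the low-dimensional plane and a simple distance threshold with union-find for the merging step; both are illustrative choices, not the authors' exact method.

    import numpy as np
    from sklearn.decomposition import PCA

    def merge_clusters_by_projected_centers(centers, threshold):
        """Stage 1: project cluster centers onto a 2-D plane (here: PCA).
        Stage 2: merge clusters whose projected centers lie closer than `threshold`."""
        centers = np.asarray(centers, dtype=float)
        projected = PCA(n_components=2).fit_transform(centers)

        # Union-find over cluster indices, so chains of nearby centers merge together.
        parent = list(range(len(centers)))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for i in range(len(centers)):
            for j in range(i + 1, len(centers)):
                if np.linalg.norm(projected[i] - projected[j]) < threshold:
                    parent[find(i)] = find(j)

        labels = [find(i) for i in range(len(centers))]
        return len(set(labels)), labels  # estimated cluster count and merge mapping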

Retrieval of Vertical Single-scattering albedo of Asian dust using Multi-wavelength Raman Lidar System (다파장 라만 라이다 시스템을 이용한 고도별 황사의 단산란 알베도 산출)

  • Noh, Youngmin;Lee, Chulkyu;Kim, Kwanchul;Shin, Sungkyun;Shin, Dongho;Choi, Sungchul
    • Korean Journal of Remote Sensing / v.29 no.4 / pp.415-421 / 2013
  • A new approach to retrieving the single-scattering albedo (SSA) of an Asian dust plume mixed with pollution particles using a multi-wavelength Raman lidar system is suggested in this study. The Asian dust plume was separated into dust and non-dust (i.e., spherical) particles by the particle depolarization ratio at 532 nm. The vertical profiles of the optical properties of the non-dust particles (the particle extinction coefficients at 355 and 532 nm and the backscatter coefficients at 355, 532, and 1064 nm) were used as input parameters for the inversion algorithm. Because the inversion algorithm provides the vertical distribution of microphysical properties for the non-dust particles only, an estimation method for the SSA of Asian dust in the mixed state is also suggested. To estimate the SSA of the mixed Asian dust, the SSA of the non-dust particles retrieved by the inversion algorithm was combined with an assumed dust SSA of 0.96 at 532 nm. The SSA of the Asian dust plume retrieved from the lidar data was compared with values retrieved by the Aerosol Robotic Network (AERONET) and showed good agreement.
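
The abstract gives the assumed dust SSA (0.96 at 532 nm) but not the exact combination formula; a hedged reading is an extinction-weighted mixture of the component albedos, sketched below. The split of extinction between dust and non-dust particles is assumed to come from the depolarization-based separation; this is an illustration, not necessarily the authors' exact procedure.

    def mixed_ssa(ssa_nondust, ext_nondust, ext_dust, ssa_dust=0.96):
        """Extinction-weighted combination of component single-scattering albedos.
        ssa_dust=0.96 at 532 nm is the assumption stated in the abstract; the
        extinction-weighted form is a standard mixing rule shown for illustration."""
        total_ext = ext_dust + ext_nondust
        return (ssa_dust * ext_dust + ssa_nondust * ext_nondust) / total_ext

    # e.g. equal extinction contributions from dust and pollution particles:
    print(mixed_ssa(ssa_nondust=0.90, ext_nondust=0.05, ext_dust=0.05))  # 0.93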

Revising Passive Satellite-based Soil Moisture Retrievals over East Asia Using SMOS (MIRAS) and GCOM-W1 (AMSR2) Satellite and GLDAS Dataset (자료동화 토양수분 데이터를 활용한 동아시아지역 수동형 위성 토양수분 데이터 보정: SMOS (MIRAS), GCOM-W1 (AMSR2) 위성 및 GLDAS 데이터 활용)

  • Kim, Hyunglok;Kim, Seongkyun;Jeong, Jeahwan;Shin, Incheol;Shin, Jinho;Choi, Minha
    • Journal of Wetlands Research / v.18 no.2 / pp.132-147 / 2016
  • In this study, soil moisture retrievals from the Microwave Imaging Radiometer using Aperture Synthesis (MIRAS) sensor onboard the Soil Moisture and Ocean Salinity (SMOS) satellite and the Advanced Microwave Scanning Radiometer 2 (AMSR2) sensor onboard the Global Change Observation Mission-Water (GCOM-W1) satellite were revised to obtain better soil moisture accuracy and a higher data acquisition rate over East Asia. These satellite-based soil moisture products were revised against a reference land model data set, the Global Land Data Assimilation System (GLDAS), using Cumulative Distribution Function (CDF) matching and a regression approach. Because the MIRAS sensor is perturbed by radio frequency interference (RFI), which is most severe over East Asia, the region has constantly suffered a loss of data acquisition rate. To overcome this limitation, thresholds for RFI, DQX, and composite days were suggested to increase the data acquisition rate while maintaining appropriate data quality, as verified by comparison with the land surface model data set. The revised MIRAS and AMSR2 products were compared with in-situ soil moisture and the land model data set. The results showed that, compared with the in-situ data set, the revision increased the correlation coefficients of SMOS and AMSR2 by 27% and 11% on average, respectively, and decreased the root mean square deviation (RMSD) by 61% and 57%. In addition, when the correlation coefficients of the revised products were calculated against the model data set, about 80% and 90% of the pixels' correlation coefficients for SMOS and AMSR2 increased, and the RMSD decreased for all pixels. Through this CDF-based revision process, we propose a way for MIRAS and AMSR2 soil moisture retrievals to supplement each other.
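
CDF matching itself is a standard bias-correction step; a minimal sketch follows, assuming a simple piecewise-linear quantile mapping from the satellite distribution onto the GLDAS distribution (the paper's exact quantile handling, regression step, and RFI/DQX filtering are not reproduced here).

    import numpy as np

    def cdf_match(satellite_sm, reference_sm, n_quantiles=101):
        """Map each satellite soil-moisture value onto the reference (e.g. GLDAS)
        value at the same empirical quantile, via piecewise-linear interpolation."""
        satellite_sm = np.asarray(satellite_sm, dtype=float)
        probs = np.linspace(0.0, 1.0, n_quantiles)
        sat_q = np.quantile(satellite_sm, probs)   # satellite quantiles
        ref_q = np.quantile(reference_sm, probs)   # reference quantiles
        return np.interp(satellite_sm, sat_q, ref_q)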

Efficient Management of Statistical Information of Keywords on E-Catalogs (전자 카탈로그에 대한 효율적인 색인어 통계 정보 관리 방법)

  • Lee, Dong-Joo;Hwang, In-Beom;Lee, Sang-Goo
    • The Journal of Society for e-Business Studies / v.14 no.4 / pp.1-17 / 2009
  • E-catalogs, which describe products or services, are among the most important data for electronic commerce. E-catalogs are created, updated, and removed to keep the information in an e-catalog database up to date. However, as the number of catalogs grows, information integrity is violated for several reasons, such as catalog duplication and abnormal classification. Catalog search, duplication checking, and automatic classification are important functions for utilizing e-catalogs and keeping the integrity of the e-catalog database. To implement these functions, probabilistic models that use statistics of index words extracted from e-catalogs have been suggested, and their feasibility has been shown in several papers. However, even though these functions are used together in an e-catalog management system, there has not been enough consideration of how to share the data common to each function and how to manage the statistics of index words effectively. In this paper, we suggest a method to implement these three functions using simple SQL supported by a relational database management system. In addition, we use materialized views to reduce the effort of implementing an application that manages index-word statistics; this makes the management efficient by letting the database management system optimize statistics updates. We show with an empirical evaluation that our method is feasible for implementing the three functions and effective for managing index-word statistics.
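
A minimal sketch of the statistics-in-the-DBMS idea, with assumptions: SQLite is used here, which has no materialized views, so the aggregate is emulated by a table refreshed with one INSERT ... SELECT; in an RDBMS that supports materialized views the same aggregate would be maintained by the DBMS itself, which is the point the abstract makes about offloading statistics maintenance. Table and column names are hypothetical.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE catalog_index (catalog_id INTEGER, term TEXT);
        CREATE TABLE term_stats (term TEXT PRIMARY KEY, doc_freq INTEGER);
    """)
    conn.executemany(
        "INSERT INTO catalog_index VALUES (?, ?)",
        [(1, "notebook"), (1, "15inch"), (2, "notebook"), (3, "monitor")],
    )

    def refresh_term_stats(conn):
        """Recompute per-term document frequencies (the emulated materialized view)."""
        conn.execute("DELETE FROM term_stats")
        conn.execute("""
            INSERT INTO term_stats
            SELECT term, COUNT(DISTINCT catalog_id) FROM catalog_index GROUP BY term
        """)

    refresh_term_stats(conn)
    print(conn.execute("SELECT * FROM term_stats ORDER BY doc_freq DESC, term").fetchall())
    # [('notebook', 2), ('15inch', 1), ('monitor', 1)]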


Is it necessary to distinguish semantic memory from episodic memory? (의미기억과 일화기억의 구분은 필요한가)

  • 이정모;박희경
    • Korean Journal of Cognitive Science / v.11 no.3-4 / pp.33-43 / 2000
  • The distinction between the short-term store (STS) and the long-term store (LTS) has been made from the perspective of information processing. Memory system theorists have argued that memory could be conceived as multiple memory systems rather than a single LTS. Popular memory system models are Schacter & Tulving's (1994) multiple memory systems and Squire's (1987) taxonomy of long-term memory. Both models agree that amnesic patients have an intact STS but an impaired LTS, and that they have preserved implicit memory. However, there is a debate about the nature of the long-term memory impairment: one model considers the amnesic deficit a selective episodic memory impairment, whereas the other sees it as an impairment of both episodic and semantic memory. At present, it remains unclear whether episodic memory should be distinguished from semantic memory in terms of retrieval operations. The distinction between declarative and nondeclarative memory would be an alternative way to reflect explicit and implicit memory. Research focused on the function of the frontal lobe might give clues to the debate about the nature of the LTS.


Implementation and Performance Analysis of the Group Communication Using CORBA-ORB, JAVA-RMI and Socket (CORBA-ORB, JAVA-RMI, 소켓을 이용한 그룹 통신의 구현 및 성능 분석)

  • 한윤기;구용완
    • Journal of Internet Computing and Services / v.3 no.1 / pp.81-90 / 2002
  • Large-scale distributed applications based on the Internet and client/server applications have to deal with a series of problems; load balancing, unpredictable communication delays, and networking failures are examples. Therefore, sophisticated applications such as teleconferencing, video-on-demand, and concurrent software engineering require an abstracted group communication, but CORBA does not address these paradigms adequately: it mainly deals with point-to-point communication and does not support the development of reliable applications with predictable behavior in distributed systems. In this paper, we present our design, implementation, and performance analysis of group communication using CORBA-ORB, JAVA-RMI, and sockets based on distributed computing. The performance analysis estimates the latency time as the number of objects increases: for group communication using the CORBA ORB the average is 14.5172 msec, for group communication using Java RMI the average is 21.4085 msec, and for group communication using sockets the average is 18.0714 msec. Group communication using multicast and UDP is estimated at 0.2735 msec and 0.2157 msec, respectively. The results show how the performance of CORBA-ORB group communication is affected as the number of objects increases. This study can be applied to fault-tolerant client/server systems, groupware, text retrieval systems, and financial information systems.
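
A much simplified, hedged stand-in for the kind of latency measurement described above: it times round trips over a local UDP socket in a single process, whereas the paper measured CORBA-ORB, Java RMI, socket, multicast, and UDP group communication as the number of objects grows; absolute numbers will differ by machine.

    import socket
    import time

    # One local UDP socket receives what another sends; we time each round trip.
    recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    recv_sock.bind(("127.0.0.1", 0))
    send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    samples = []
    for _ in range(1000):
        t0 = time.perf_counter()
        send_sock.sendto(b"ping", recv_sock.getsockname())
        recv_sock.recvfrom(64)                                # blocks until the datagram arrives
        samples.append((time.perf_counter() - t0) * 1000.0)   # milliseconds

    print(f"average latency: {sum(samples) / len(samples):.4f} msec")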


A News Video Mining based on Multi-modal Approach and Text Mining (멀티모달 방법론과 텍스트 마이닝 기반의 뉴스 비디오 마이닝)

  • Lee, Han-Sung;Im, Young-Hee;Yu, Jae-Hak;Oh, Seung-Geun;Park, Dai-Hee
    • Journal of KIISE: Databases / v.37 no.3 / pp.127-136 / 2010
  • With the rapid growth of information and computer communication technologies, the number of digital documents, including multimedia data, has recently exploded. In particular, news video databases and news video mining have become the subject of extensive research aimed at developing effective and efficient tools for the manipulation and analysis of news videos, because of their information richness. However, most research focuses on browsing, retrieval, and summarization of news videos; discovering and analyzing the plentiful latent semantic knowledge in news videos is still at a relatively early stage. In this paper, we propose a news video mining system based on a multi-modal approach and text mining, which uses the visual-textual information of news video clips and their scripts. The proposed system systematically and automatically constructs a taxonomy of news video stories with a hierarchical clustering algorithm, one of the text mining methods. It then analyzes the topics of news video stories from several angles by means of a time-cluster trend graph, a weighted cluster growth index, and network analysis. To demonstrate the validity of our approach, we analyzed the news videos on "The Second Summit of South and North Korea in 2007".
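
A minimal sketch of the taxonomy-construction step, under assumptions: TF-IDF features and Ward linkage are illustrative choices, since the abstract only says a hierarchical clustering algorithm is applied to the scripts; the trend-graph, growth-index, and network analyses are not reproduced here.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from scipy.cluster.hierarchy import linkage, fcluster

    def build_story_taxonomy(scripts, n_topics):
        """Hierarchically cluster news-story scripts and cut the tree into topics."""
        X = TfidfVectorizer().fit_transform(scripts).toarray()   # script -> TF-IDF vector
        Z = linkage(X, method="ward")                            # hierarchical clustering
        return fcluster(Z, t=n_topics, criterion="maxclust")     # topic label per script

    labels = build_story_taxonomy(
        ["summit agenda announced", "summit talks continue", "stock market falls"],
        n_topics=2,
    )
    print(labels)   # e.g. [1, 1, 2] -- the two summit stories share a topic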

Trajectory Index Structure based on Signatures for Moving Objects on a Spatial Network (공간 네트워크 상의 이동객체를 위한 시그니처 기반의 궤적 색인구조)

  • Kim, Young-Jin;Kim, Young-Chang;Chang, Jae-Woo;Sim, Chun-Bo
    • Journal of Korea Spatial Information System Society / v.10 no.3 / pp.1-18 / 2008
  • Because much information can usually be obtained by analyzing the trajectories of moving objects on spatial networks, efficient trajectory index structures are required to achieve good retrieval performance on those trajectories. However, there has been little research on trajectory index structures for spatial networks beyond structures such as the FNR-tree and MON-tree. Moreover, because the FNR-tree and MON-tree store moving objects in segment units, they cannot support the whole trajectory of a moving object. In this paper, we propose an efficient signature-based trajectory index structure on a spatial network, named SigMO-Tree. For this, we divide moving object data into spatial and temporal attributes and design an index structure that supports not only range queries but also trajectory queries by preserving the whole trajectories of moving objects. In addition, we divide user queries into trajectory queries based on a spatio-temporal area and similar-trajectory queries, and propose query processing algorithms to support them. The algorithms use a signature file in order to retrieve candidate trajectories efficiently. Finally, our performance analysis shows that the proposed trajectory index structure outperforms existing index structures such as the FNR-tree and MON-tree.
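
A minimal sketch of the signature-file idea used for candidate filtering, with assumptions: superimposed coding over the (hypothetical) road-segment identifiers a trajectory visits, a 64-bit signature, and two hash functions per item; the actual SigMO-Tree layout and its spatio-temporal components are not reproduced.

    def make_signature(segment_ids, num_bits=64, num_hashes=2):
        """Superimpose hashed bits of every visited segment into one bit vector."""
        sig = 0
        for seg in segment_ids:
            for k in range(num_hashes):
                sig |= 1 << (hash((seg, k)) % num_bits)
        return sig

    def may_match(trajectory_sig, query_sig):
        """Signature test: a trajectory is a candidate only if its signature
        contains every bit of the query signature (false positives possible, no misses)."""
        return trajectory_sig & query_sig == query_sig

    traj = make_signature(["e12", "e13", "e47", "e90"])
    print(may_match(traj, make_signature(["e13", "e47"])))   # True  -> candidate
    print(may_match(traj, make_signature(["e99"])))          # likely False -> pruned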


Construction of Component Repository for Supporting the CBD Process (CBD 프로세스 지원을 위한 컴포넌트 저장소의 구축)

  • Cha, Jung-Eun;Kim, Hang-Kon
    • Journal of KIISE: Software and Applications / v.29 no.7 / pp.476-486 / 2002
  • CBD (Component Based Development) has become the best strategic method for building business applications. Because CBD is a development paradigm that makes it possible to assemble software components into applications, it copes with rapidly changing business processes and meets the increasing requirements for productivity. In particular, the repository is the most important part for the development, distribution, and reuse of components: in a component repository, we can store and manage not only the components themselves but also the related work products produced at each step of component development. In this paper, we suggest a practical approach to repository construction that supports and realizes the CBD process, and we developed CRMS (Component Repository Management System) as an implementation of the proposed techniques. CRMS can manage a variety of component products based on a component architecture, and it helps software developers search for candidate components for their projects and understand the various kinds of information about each component. In this paper, a practical approach for a component repository was suggested and a supporting environment was constructed to make CBD work efficiently. We expect this work will be valuable research on component repositories and on supporting the entire Component Based Development process.

An Interconnection Method for Streaming Framework and Multimedia Database (스트리밍 프레임워크와 멀티미디어 데이타베이스와의 연동기법)

  • Lee, Jae-Wook;Lee, Sung-Young;Lee, Jong-Won
    • Journal of KIISE: Software and Applications / v.29 no.7 / pp.436-449 / 2002
  • This paper describes our experience of developing the Database Connector, an interconnection method between a multimedia database and the streaming framework. If an interconnection method is provided between a streaming system and multimedia databases, it becomes possible to support diverse and mature multimedia database services, such as retrieval and join operations, during streaming. The currently available interconnection schemes, however, have mainly used file systems or relational databases, implemented with the metadata, which describes the multimedia contents, separated from the streaming data, which is the multimedia data itself. Consequently, existing interconnection mechanisms cannot deliver many of the virtues of multimedia database services during streaming. To resolve these drawbacks, we propose a novel scheme for interconnecting the streaming framework and a multimedia database, called the Inter-Process Communication (IPC) based Database Connector, under the assumption that the two systems are located on the same host. We define four transaction primitives, Read, Write, Find, and Play, as well as a plug-in based interface for these transactions, which can consequently be extended to other multimedia databases in the future. Our simulation study shows that the performance of the proposed IPC-based interconnection scheme is not far behind that of file systems.
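
A minimal sketch of the four transaction primitives over same-host IPC, under assumptions: the wire format (newline-delimited JSON over a local socket pair) and the function names are hypothetical illustrations, not the paper's actual Database Connector protocol or plug-in interface.

    import json
    import socket

    PRIMITIVES = {"Read", "Write", "Find", "Play"}

    def send_request(sock, op, **args):
        """Encode one transaction request as a single JSON line."""
        assert op in PRIMITIVES
        sock.sendall((json.dumps({"op": op, "args": args}) + "\n").encode())

    def recv_request(sock):
        """Read one newline-terminated JSON request."""
        data = b""
        while not data.endswith(b"\n"):
            data += sock.recv(4096)
        return json.loads(data)

    # Same-host IPC, here emulated with a connected socket pair in one process.
    client_end, db_end = socket.socketpair()
    send_request(client_end, "Find", title="news")
    print(recv_request(db_end))   # {'op': 'Find', 'args': {'title': 'news'}}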