• Title/Summary/Keyword: Similar information retrieval

A Design of KP AGENT for Intelligent Information Retrieval (지능형 정보검색을 위한 KP AGENT의 설계)

  • 박경우;배상현
    • Journal of the Korea Institute of Information and Communication Engineering / v.4 no.2 / pp.443-451 / 2000
  • Various science information databases have been built to hold science and technology information, but they do not satisfy users' expectations. From the users' point of view, this paper therefore proposes a technology information space as a new paradigm that supplements the functions of science information databases. ICPIS takes papers described with keywords as input and offers itemized summaries of their contents together with visual presentation and comparison of similar papers. It also supplies abundant summary and survey information for more than ten volumes of information and communication papers, starting with causal relation extraction for users. The component that plays a significant role in ICPIS is called KP, a package of domain knowledge that unifies the extraction and the structured description of technology information. ICPIS applies natural language processing to the papers, extracts the technology information that matches the itemized KP keywords, and forms the summary structure prescribed in KP.

An Object-Oriented Case-Base Design and Similarity Measures for Bundle Products Recommendation Systems (번들상품추천시스템 개발을 위한 객체지향 사례베이스 설계와 유사도 측정에 관한 연구)

  • 정대율
    • Journal of Intelligence and Information Systems / v.9 no.1 / pp.23-51 / 2003
  • With the recent expansion of Internet shopping malls, intelligent product recommendation agents have become increasingly important. This paper proposes a case-based reasoning (CBR) approach to product recommendation and develops a case-based bundle product recommendation system that can recommend a set of seafood products used in family events. Applying CBR to bundle product recommendation requires the following 4R steps: (1) Retrieval, (2) Reuse, (3) Revise, (4) Retain. To retrieve similar cases from the case base efficiently, the case representation scheme is most important. This paper uses OMT (Object Modeling Technique) to represent bundle product recommendation cases and develops a similarity measure for searching similar cases. Similarity is measured basically with a weighted-sum approach. In particular, this paper proposes the meaning and uses of taxonomies for representing case features.
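
The retrieval step described in this abstract can be illustrated with a minimal sketch. It assumes a weighted-sum similarity in which symbolic features are compared through a taxonomy and numeric features through a normalized difference. The taxonomy, the feature names, and the `1 / (1 + distance)` scoring are hypothetical choices made for illustration and are not taken from the paper.

```python
# Hypothetical taxonomy (child -> parent); the node names are illustrative only.
PARENT = {
    "croaker": "fish", "flatfish": "fish",
    "octopus": "mollusc", "squid": "mollusc",
    "fish": "seafood", "mollusc": "seafood",
}

def path_to_root(node):
    """Return the list of nodes from `node` up to the taxonomy root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def taxonomy_similarity(a, b):
    """1.0 for identical nodes; smaller the farther apart the nodes are in the tree."""
    pa, pb = path_to_root(a), path_to_root(b)
    common = next((n for n in pa if n in pb), None)
    if common is None:
        return 0.0
    distance = pa.index(common) + pb.index(common)
    return 1.0 / (1.0 + distance)  # assumed scoring, not the paper's formula

def numeric_similarity(x, y, value_range):
    """Linear similarity on a numeric feature, clipped to [0, 1]."""
    return 1.0 - min(abs(x - y) / value_range, 1.0)

def case_similarity(query, case, weights, ranges):
    """Weighted sum of per-feature similarities, normalized by the total weight."""
    total = 0.0
    for feature, weight in weights.items():
        q, c = query[feature], case[feature]
        if isinstance(q, str):
            total += weight * taxonomy_similarity(q, c)
        else:
            total += weight * numeric_similarity(q, c, ranges[feature])
    return total / sum(weights.values())

def retrieve(query, case_base, weights, ranges, k=1):
    """Return the k stored cases most similar to the query."""
    ranked = sorted(case_base,
                    key=lambda c: case_similarity(query, c, weights, ranges),
                    reverse=True)
    return ranked[:k]
```

For example, with a query whose `main_item` is `"croaker"`, a stored case with `"flatfish"` (same parent in the taxonomy) ranks above one with `"squid"` when the other features are equal.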

A Study on the Musical Theme Clustering for Searching Note Sequences (음렬 탐색을 위한 주제소절 자동분류에 관한 연구)

  • 심지영;김태수
    • Journal of the Korean Society for Information Management / v.19 no.3 / pp.5-30 / 2002
  • This paper selects classification features focused on musical content, namely note sequence patterns, measures the similarity between note sequences, and then constructs clusters of similar note sequences, so that a CBMR (content-based music retrieval) system can make searching easier for users by showing similar note sequences together with the search result. The experimental data came from "A Dictionary of Musical Themes", an index of theme bars focused on classical music, obtained as kern-format files. Humdrum Toolkit version 1.0 was used as the tool for processing note sequences. Hierarchical clustering was carried out in stages on four types of similarity matrices, distinguished by whether the note sequences are segmented and by where the starting point is. To evaluate the results, the WACS criterion was used against a manual classification, and for note sequences starting from any point, the distribution of common feature patterns within the resulting clusters was used. The results show that clustering with segmented features, regardless of the starting point, performs distinctly better than clustering with non-segmented features.
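
As a rough illustration of clustering note sequences by similarity, the sketch below represents each theme by its pitch intervals and merges clusters with single-linkage agglomerative clustering. The interval representation, the use of `difflib.SequenceMatcher` as the similarity measure, and the stopping threshold are all assumptions made for this example; the paper itself works with kern files and the Humdrum Toolkit, which are not used here.

```python
from difflib import SequenceMatcher

def intervals(pitches):
    """Turn MIDI-style pitch numbers into a transposition-invariant interval tuple."""
    return tuple(b - a for a, b in zip(pitches, pitches[1:]))

def similarity(a, b):
    """Similarity in [0, 1] between two interval sequences (assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()

def cluster(themes, threshold):
    """Single-linkage agglomerative clustering of named themes.

    `themes` maps a theme name to a list of pitches. Clusters are merged
    until the best pairwise similarity falls below `threshold`.
    """
    feats = {name: intervals(p) for name, p in themes.items()}
    clusters = [[name] for name in themes]

    def link(c1, c2):
        return max(similarity(feats[x], feats[y]) for x in c1 for y in c2)

    while len(clusters) > 1:
        best, i, j = max(
            (link(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[i] += clusters.pop(j)
    return clusters
```

A theme and its transposition produce identical interval tuples, so they end up in the same cluster under this representation.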

An Efficient Frequent Melody Indexing Method to Improve Performance of Query-By-Humming System (허밍 질의 처리 시스템의 성능 향상을 위한 효율적인 빈번 멜로디 인덱싱 방법)

  • You, Jin-Hee;Park, Sang-Hyun
    • Journal of KIISE:Databases / v.34 no.4 / pp.283-303 / 2007
  • Recently, efficient ways to store and retrieve enormous amounts of music data have become an important issue in multimedia databases. The most common method of MIR (Music Information Retrieval) is a text-based approach that uses text information to search for the desired music. However, if users do not remember keywords for the music, such a system cannot give them correct answers. Moreover, since these systems implement only exact matching between the query and the music data, they cannot find any information on similar music data and are therefore unsuitable for similarity matching of music data. To solve this problem, we propose an Efficient Query-By-Humming System (EQBHS) with a content-based indexing method that efficiently stores and retrieves music when a user queries with inaccurate humming. To accelerate query processing in EQBHS, we design indices for significant melodies, which are 1) frequent melodies that occur many times in a single piece of music, on the assumption that users hum what they can easily remember, and 2) melodies partitioned by rests. In addition, we propose an error-tolerant method for mapping a note to a character to make searching efficient, and a frequent melody extraction algorithm. We verified the assumption about frequent melodies through a questionnaire and compared the performance of the proposed EQBHS with an N-gram method in various experiments on a large amount of music data.
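
The ideas of error-tolerant note-to-character mapping, partitioning by rests, and frequent melody extraction can be sketched as follows. This is only an assumed, simplified stand-in: notes are mapped to the contour characters `U`, `D`, and `S` with a semitone tolerance, rests are represented by `None`, and "frequent" means a fixed-length substring that occurs at least `min_count` times. The paper's actual mapping and extraction algorithm are not given in the abstract.

```python
from collections import Counter

def to_contour(pitches, tolerance=1):
    """Map consecutive pitches to U (up), D (down), or S (same within tolerance)."""
    chars = []
    for prev, cur in zip(pitches, pitches[1:]):
        diff = cur - prev
        if abs(diff) <= tolerance:
            chars.append("S")  # small humming errors are absorbed here
        elif diff > 0:
            chars.append("U")
        else:
            chars.append("D")
    return "".join(chars)

def split_by_rests(notes):
    """Split a note list into phrases, using None as the rest marker."""
    phrases, current = [], []
    for note in notes:
        if note is None:
            if current:
                phrases.append(current)
                current = []
        else:
            current.append(note)
    if current:
        phrases.append(current)
    return phrases

def frequent_melodies(notes, length=3, min_count=2):
    """Return contour substrings of `length` that occur at least `min_count` times."""
    counts = Counter()
    for phrase in split_by_rests(notes):
        contour = to_contour(phrase)
        for i in range(len(contour) - length + 1):
            counts[contour[i:i + length]] += 1
    return {melody for melody, count in counts.items() if count >= min_count}
```

Only the frequent contour strings would then be indexed, which keeps the index small compared with indexing every substring of every piece.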

Comparison of Land Surface Temperature Algorithm Using Landsat-8 Data for South Korea

  • Choi, Sungwon;Lee, Kyeong-Sang;Seo, Minji;Seong, Noh-Hun;Jin, Donghyun;Jung, Daeseong;Sim, Suyoung;Jung, Im Gook;Han, Kyung-Soo
    • Korean Journal of Remote Sensing / v.37 no.1 / pp.153-160 / 2021
  • Land Surface Temperature (LST) is the radiometric temperature of the surface as observed by satellite. It is a very important factor for estimating conditions of the Earth such as global warming and heat islands. For these reasons, many countries operate their own satellites to observe the Earth. South Korea has many land cover types, such as forest, cropland, and urban areas, so retrieving accurate LST requires high-resolution satellite data. In this study, we produced LSTs with four widely used LST retrieval algorithms from Landsat-8 data, which has a 30 m spatial resolution. We retrieved LST using the equations of Price, Becker et al., Prata, and Coll et al., and the results showed very similar spatial distributions. We validated the four LSTs against Moderate Resolution Imaging Spectroradiometer (MODIS) LST data to find the most suitable algorithm. Every LST showed an RMSE of 2.160 to 3.387 K, and the LST from the Prata algorithm showed the lowest RMSE. Based on this validation, we chose the Prata algorithm as the most suitable for South Korea.
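
The validation step, comparing each algorithm's LST against a reference and picking the lowest RMSE, can be sketched as below. The function names are hypothetical, and the temperature values in the usage note are made-up placeholders, not data from the study.

```python
import math

def rmse(estimated, reference):
    """Root mean square error between two equal-length sequences (in kelvin here)."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((e - r) ** 2 for e, r in zip(estimated, reference))
    return math.sqrt(squared / len(estimated))

def best_algorithm(lst_by_algorithm, reference):
    """Return (name of the algorithm with the lowest RMSE, all RMSE scores)."""
    scores = {name: rmse(values, reference)
              for name, values in lst_by_algorithm.items()}
    return min(scores, key=scores.get), scores
```

With placeholder inputs such as `reference = [300.0, 302.0, 298.0]` and two candidate LST lists offset by 2 K and 3 K, `best_algorithm` returns the name of the candidate with the 2 K offset.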

An Investigation on Image Needs and Contexts in Image Search Failure (이미지 검색 실패에 나타난 이미지 요구와 맥락에 관한 분석)

  • Chung, EunKyung
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.26 no.1 / pp.199-215 / 2015
  • As a way of identifying users' image needs in order to improve the effectiveness of image search, recent research has examined contextual factors in image needs from multiple perspectives. In this line of research, this study examined a total of 70 unsuccessful image searches to investigate users' image needs. In particular, the characteristics of image needs, the contextual factors of image needs, and the image queries were investigated. The findings show that the information needs in the failed image searches fall primarily into the specific and general/nameable categories. More importantly, these information needs are embedded with multiple contextual factors, primarily task purpose and use purpose. An analysis of detailed use purposes found that illustration was the most frequent use in this data set. In the query analysis, unique/refined queries were the predominant type. As the results of this study are similar to the findings of previous studies, it is possible to characterize the image needs behind failed image searches. In addition, the findings are expected to be useful for the design and service of image retrieval.

n-Gram/2L: A Space and Time Efficient Two-Level n-Gram Inverted Index Structure (n-gram/2L: 공간 및 시간 효율적인 2단계 n-gram 역색인 구조)

  • Kim Min-Soo;Whang Kyu-Young;Lee Jae-Gil;Lee Min-Jae
    • Journal of KIISE:Databases / v.33 no.1 / pp.12-31 / 2006
  • The n-gram inverted index has two major advantages: it is language-neutral and error-tolerant. Because of these advantages, it has been widely used in information retrieval and in similar sequence matching for DNA and protein databases. Nevertheless, the n-gram inverted index also has drawbacks: its size tends to be very large, and query performance tends to be poor. In this paper, we propose the two-level n-gram inverted index (simply, the n-gram/2L index), which significantly reduces the size and improves query performance while preserving the advantages of the n-gram inverted index. The proposed index eliminates the redundancy of the position information that exists in the n-gram inverted index. The proposed index is constructed in two steps: 1) extracting subsequences of length m from documents and 2) extracting n-grams from those subsequences. We formally prove that this two-step construction is identical to the relational normalization process that removes the redundancy caused by a non-trivial multivalued dependency. The n-gram/2L index has excellent properties: 1) it significantly reduces the size and improves the performance compared with the n-gram inverted index, with these improvements becoming more marked as the database size gets larger; 2) the query processing time increases only very slightly as the query length gets longer. Experimental results using databases of 1 GByte show that the size of the n-gram/2L index is reduced by up to 1.9 to 2.7 times and, at the same time, the query performance is improved by up to 13.1 times compared with those of the n-gram inverted index.
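
The two-step construction can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the paper's implementation: adjacent m-subsequences are assumed to overlap by n-1 characters so that no n-gram crossing a boundary is lost, and the names `backend` (subsequence to document positions) and `frontend` (n-gram to positions inside subsequences) are labels chosen for this example.

```python
from collections import defaultdict

def m_subsequences(text, m, n):
    """Step 1: cut text into length-m pieces that overlap by n-1 characters."""
    step = m - (n - 1)
    if step <= 0:
        raise ValueError("m must be at least n")
    pieces = []
    for start in range(0, len(text), step):
        piece = text[start:start + m]
        if len(piece) >= n:
            pieces.append((start, piece))
        if start + m >= len(text):
            break
    return pieces

def build_index(docs, m, n):
    """Build both levels: subsequence -> documents, and n-gram -> subsequences."""
    backend = defaultdict(list)   # subsequence -> [(doc_id, offset in document)]
    frontend = defaultdict(set)   # n-gram -> {(subsequence, offset in subsequence)}
    for doc_id, text in docs.items():
        for offset, sub in m_subsequences(text, m, n):
            backend[sub].append((doc_id, offset))
            for i in range(len(sub) - n + 1):   # step 2: n-grams of each piece
                frontend[sub[i:i + n]].add((sub, i))
    return backend, frontend

def ngram_positions(ngram, backend, frontend):
    """Join the two levels to recover every (doc_id, position) of an n-gram."""
    hits = set()
    for sub, i in frontend.get(ngram, ()):
        for doc_id, offset in backend[sub]:
            hits.add((doc_id, offset + i))
    return sorted(hits)
```

A subsequence that appears in many documents stores its n-gram offsets only once in `frontend`, which is where the reduction in position redundancy comes from in this sketch.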

Algorithms for Indexing and Integrating MPEG-7 Visual Descriptors (MPEG-7 시각 정보 기술자의 인덱싱 및 결합 알고리즘)

  • Song, Chi-Ill;Nang, Jong-Ho
    • Journal of KIISE:Software and Applications / v.34 no.1 / pp.1-10 / 2007
  • This paper proposes a new indexing mechanism for MPEG-7 visual descriptors, especially the Dominant Color and Contour Shape descriptors, that guarantees an efficient similarity search for multimedia databases whose visual metadata are represented with MPEG-7. Since the similarity metric used in the Dominant Color descriptor is based on a Gaussian mixture model, the descriptor itself can be transformed into a color histogram in which the distribution of the color values follows a Gaussian distribution. The transformed Dominant Color descriptor (i.e., the color histogram) is then indexed in the proposed indexing mechanism. For indexing the Contour Shape descriptor, we use a two-pass algorithm. In the first pass, since the similarity of two shapes can be roughly measured with global parameters such as the eccentricity and circularity used in the Contour Shape descriptor, dissimilar image objects are excluded using these global parameters. Then, the similarities between the query and the remaining image objects are measured with the peak parameters of the Contour Shape descriptor. This two-pass approach helps to reduce the computational resources needed to measure the similarity of image objects using the Contour Shape descriptor. This paper also proposes two schemes for integrating visual descriptors for efficient retrieval from multimedia databases. One uses the weight of a descriptor as a yardstick to determine the number of similar image objects selected with respect to that descriptor, and the other uses the weight as the degree of importance of the descriptor in the global similarity measurement. Experimental results show that the proposed indexing and integration schemes produce a remarkable speed-up compared with the exact similarity search, although there is some loss in accuracy because of the approximate computation in indexing. The proposed schemes could be used to build a multimedia database represented in MPEG-7 that guarantees efficient retrieval.
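
The two-pass shape search and the two integration schemes can be sketched as below. Everything here is a simplified assumption for illustration: the dictionary keys, the distance functions (in particular the Euclidean distance over peak lists, which stands in for real MPEG-7 Contour Shape matching), and the proportional quota rule are hypothetical and not taken from the paper or the MPEG-7 standard.

```python
import math

def global_distance(a, b):
    """Coarse distance using only the global shape parameters."""
    return (abs(a["eccentricity"] - b["eccentricity"])
            + abs(a["circularity"] - b["circularity"]))

def peak_distance(a, b):
    """Placeholder fine distance: Euclidean distance over equal-length peak lists."""
    return math.dist(a["peaks"], b["peaks"])

def two_pass_search(query, objects, global_threshold, k):
    """Pass 1 drops objects that differ globally; pass 2 ranks the rest by peaks."""
    candidates = [o for o in objects
                  if global_distance(query, o) <= global_threshold]
    candidates.sort(key=lambda o: peak_distance(query, o))
    return candidates[:k]

def quota_per_descriptor(weights, k):
    """Scheme 1 (assumed reading): weight decides how many results each descriptor supplies."""
    total = sum(weights.values())
    return {d: round(k * w / total) for d, w in weights.items()}

def weighted_similarity(similarities, weights):
    """Scheme 2 (assumed reading): weight is the descriptor's importance in a global score."""
    total = sum(weights.values())
    return sum(weights[d] * similarities[d] for d in weights) / total
```

The expensive `peak_distance` is evaluated only for objects that survive the cheap global filter, which is the source of the saving described in the abstract.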

SOM-Based R*-Tree for Similarity Retrieval (자기 조직화 맵 기반 유사 검색 시스템)

  • O, Chang-Yun;Im, Dong-Ju;O, Gun-Seok;Bae, Sang-Hyeon
    • The KIPS Transactions: Part D / v.8D no.5 / pp.507-512 / 2001
  • Feature-based similarity has become an important research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects. The performance of conventional multidimensional data structures tends to deteriorate as the number of dimensions of the feature vectors increases. The R*-Tree is the most successful variant of the R-Tree. In this paper, we propose a SOM-based R*-Tree as a new indexing method for high-dimensional feature vectors. The SOM-based R*-Tree combines the SOM and the R*-Tree to achieve search performance that scales better to high dimensionalities. Self-Organizing Maps (SOMs) provide a mapping from high-dimensional feature vectors onto a two-dimensional space. The map is called a topological feature map, and it preserves the mutual relationships (similarity) in the feature space of the input data, clustering mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a codebook vector. We experimentally compare the retrieval time cost of the SOM-based R*-Tree with that of the SOM and the R*-Tree using color feature vectors extracted from 40,000 images. The results show that the SOM-based R*-Tree outperforms both the SOM and the R*-Tree because it reduces the number of nodes needed to build the R*-Tree and the retrieval time cost.
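
The SOM half of this approach can be sketched as follows. The code trains a small map, assigns each feature vector to its best matching unit, and collects the occupied nodes; in the approach described above, the codebook vectors of such nodes are what would then be inserted into an R*-Tree, which is not implemented here. The training schedule, neighborhood function, and parameter values are assumptions chosen for a compact example.

```python
import math
import random

def best_matching_unit(codebook, vector):
    """Index of the codebook vector closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(codebook[i], vector))

def train_som(data, rows, cols, epochs=20, lr=0.5, radius=1.0, seed=0):
    """Train a rows x cols SOM with a Gaussian neighborhood and linear decay."""
    rng = random.Random(seed)
    dim = len(data[0])
    codebook = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        sigma = radius * decay + 1e-9
        for vector in data:
            bmu = best_matching_unit(codebook, vector)
            for i, weights in enumerate(codebook):
                d = math.dist(grid[i], grid[bmu])      # distance on the 2-D map
                h = math.exp(-(d * d) / (2 * sigma * sigma))
                for k in range(dim):
                    weights[k] += lr * decay * h * (vector[k] - weights[k])
    return codebook

def occupied_nodes(codebook, data):
    """Map each non-empty node index to the indices of the vectors it holds."""
    buckets = {}
    for idx, vector in enumerate(data):
        buckets.setdefault(best_matching_unit(codebook, vector), []).append(idx)
    return buckets
```

Because many feature vectors share one node, the number of entries to index is at most the number of map nodes rather than the number of images.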

Preference Element Changeable Recommender System based on Extended Collaborative Filtering (확장된 협업 필터링을 활용한 선호 요소 가변 추천 시스템)

  • Oh, Jung-Min;Moon, Nam-Mee
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.4 / pp.18-24 / 2010
  • Mobile devices have spread widely among users since the release of Apple's iPhone, especially in Korea. Mobile devices have their own advantages in weight, size, mobility, and so on. On the other hand, a mobile device has to provide more accurate and personalized information because of its small screen and limited information retrieval functions. This paper presents a recommender system in which the user's preference elements are changeable, employing extended collaborative filtering as a technique to provide useful information in a mobile environment. The proposed system reflects the user's similar groups by simultaneously considering users' preference information and demographic characteristics. It then constructs a recommendation list according to the user's choice. Finally, we show the implementation of a prototype on the iPhone.
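
One way to read "extended collaborative filtering" is a user similarity that blends rating-based similarity with demographic similarity, with a blend weight the user can change. The sketch below follows that reading. The blend parameter `alpha`, the dictionary layout with `ratings` and `profile` keys, and the scoring rule are all hypothetical; the abstract does not specify the paper's formulas.

```python
import math

def cosine(a, b):
    """Cosine similarity between two {item: rating} dictionaries."""
    common = set(a) & set(b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if not common or norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(a[i] * b[i] for i in common) / (norm_a * norm_b)

def demographic_similarity(a, b):
    """Fraction of shared profile attributes that have equal values."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)

def user_similarity(u, v, alpha):
    """Blend of preference and demographic similarity; alpha is user-adjustable."""
    return (alpha * cosine(u["ratings"], v["ratings"])
            + (1 - alpha) * demographic_similarity(u["profile"], v["profile"]))

def recommend(target, others, alpha=0.5, k=2):
    """Rank items the target has not rated, using the k most similar users."""
    neighbours = sorted(others,
                        key=lambda o: user_similarity(target, o, alpha),
                        reverse=True)[:k]
    scores = {}
    for other in neighbours:
        s = user_similarity(target, other, alpha)
        for item, rating in other["ratings"].items():
            if item not in target["ratings"]:
                scores[item] = scores.get(item, 0.0) + s * rating
    return sorted(scores, key=scores.get, reverse=True)
```

Setting `alpha=1.0` reduces this to plain rating-based collaborative filtering, while lower values give more influence to demographic similarity.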