• Title/Summary/Keyword: 유사도 질의

Search Result 1,858, Processing Time 0.032 seconds

Image Retrieval using Annotation Expansion based on WordNet (WordNet기반 주석확장을 이용한 이미지 검색)

  • Hwang, Kwang-Su;Kim, Pan-Koo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2007.11a
    • /
    • pp.165-168
    • /
    • 2007
  • 이미지 데이터를 의미적으로 검색하기 위한 가장 중요한 요소는 이미지의 정보를 표현하고 있는 주석이라고 할 수 있다. 이미지의 주석은 관리자가 사용자 입장에서 검색이 가능한 이미지를 표현할 수 있는 키워드를 선별하여 데이터화한 것이다. 그러다보니 이미지내 의미를 모두 표현하기위해 주석에 수는 증가되고, 증가된 주석은 각각에 이미지에서 차지하고 있는 의미량을 고려하지않고 동일한 크기를 가지게 된다. 이러한 경우 실제적으로 검색하였을 때 의미량에 상관없이 질의어와 주석이 일치한 모든 이미지를 검색하므로 사용자가 검색 결과에서 의미량이 큰 이미지를 다시 재검색하거나 주석입력자와 사용자와 어휘 표현에 차이 때문에 검색에 재검색해야한다. 따라서 본 논문에서는 의미량을 이용하여 효율적인 이미지 검색을 하기 위해 각 키워드 간에 의미적인 관계를 어휘 온톨로지인 WordNet을 이용하여 유사도 측정을 하고, 측정한 데이터를 이용하여 전체 이미지 의미량에서 해당 키워드가 갖는 의미량을 측정한다. 의미량은 이미지 검색시 질의어가 이미지에서 차지하고 있는 비율을 비교하여 가장 높은 의미량을 갖는 이미지를 우선 검색하고 의미량이 가장 큰 키워드를 대표키워드로 추출하여 WordNet상에서 동일한 의미를 갖는 계층에 단어들로 주석을 확장한다.

Spherical Pyramid-Technique : An Efficient Indexing Technique for Similarity Search in High-Dimensional Data (구형 피라미드 기법 : 고차원 데이터의 유사성 검색을 위한 효율적인 색인 기법)

  • Lee, Dong-Ho;Jeong, Jin-Wan;Kim, Hyeong-Ju
    • Journal of KIISE:Software and Applications
    • /
    • v.26 no.11
    • /
    • pp.1270-1281
    • /
    • 1999
  • 피라미드 기법 1 은 d-차원의 공간을 2d개의 피라미드들로 분할하는 특별한 공간 분할 방식을 이용하여 고차원 데이타를 효율적으로 색인할 수 있는 새로운 색인 방법으로 제안되었다. 피라미드 기법은 고차원 사각형 형태의 영역 질의에는 효율적이나, 유사성 검색에 많이 사용되는 고차원 구형태의 영역 질의에는 비효율적인 면이 존재한다. 본 논문에서는 고차원 데이타를 많이 사용하는 유사성 검색에 효율적인 새로운 색인 기법으로 구형 피라미드 기법을 제안한다. 구형 피라미드 기법은 먼저 d-차원의 공간을 2d개의 구형 피라미드로 분할하고, 각 단일 구형 피라미드를 다시 구형태의 조각으로 분할하는 특별한 공간 분할 방법에 기반하고 있다. 이러한 공간 분할 방식은 피라미드 기법과 마찬가지로 d-차원 공간을 1-차원 공간으로 변환할 수 있다. 따라서, 변환된 1-차원 데이타를 다루기 위하여 B+-트리를 사용할 수 있다. 본 논문에서는 이렇게 분할된 공간에서 고차원 구형태의 영역 질의를 효율적으로 처리할 수 있는 알고리즘을 제안한다. 마지막으로, 인위적 데이타와 실제 데이타를 사용한 다양한 실험을 통하여 구형 피라미드 기법이 구형태의 영역 질의를 처리하는데 있어서 기존의 피라미드 기법보다 효율적임을 보인다.Abstract The Pyramid-Technique 1 was proposed as a new indexing method for high- dimensional data spaces using a special partitioning strategy that divides d-dimensional space into 2d pyramids. It is efficient for hypercube range query, but is not efficient for hypersphere range query which is frequently used in similarity search. In this paper, we propose the Spherical Pyramid-Technique, an efficient indexing method for similarity search in high-dimensional space. The Spherical Pyramid-Technique is based on a special partitioning strategy, which is to divide the d-dimensional data space first into 2d spherical pyramids, and then cut the single spherical pyramid into several spherical slices. This partition provides a transformation of d-dimensional space into 1-dimensional space as the Pyramid-Technique does. Thus, we are able to use a B+-tree to manage the transformed 1-dimensional data. We also propose the algorithm of processing hypersphere range query on the space partitioned by this partitioning strategy. Finally, we show that the Spherical Pyramid-Technique clearly outperforms the Pyramid-Technique in processing hypersphere range queries through various experiments using synthetic and real data.

A Method to Solve the Entity Linking Ambiguity and NIL Entity Recognition for efficient Entity Linking based on Wikipedia (위키피디아 기반의 효과적인 개체 링킹을 위한 NIL 개체 인식과 개체 연결 중의성 해소 방법)

  • Lee, Hokyung;An, Jaehyun;Yoon, Jeongmin;Bae, Kyoungman;Ko, Youngjoong
    • Journal of KIISE
    • /
    • v.44 no.8
    • /
    • pp.813-821
    • /
    • 2017
  • Entity Linking find the meaning of an entity mention, which indicate the entity using different expressions, in a user's query by linking the entity mention and the entity in the knowledge base. This task has four challenges, including the difficult knowledge base construction problem, multiple presentation of the entity mention, ambiguity of entity linking, and NIL entity recognition. In this paper, we first construct the entity name dictionary based on Wikipedia to build a knowledge base and solve the multiple presentation problem. We then propose various methods for NIL entity recognition and solve the ambiguity of entity linking by training the support vector machine based on several features, including the similarity of the context, semantic relevance, clue word score, named entity type similarity of the mansion, entity name matching score, and object popularity score. We sequentially use the proposed two methods based on the constructed knowledge base, to obtain the good performance in the entity linking. In the result of the experiment, our system achieved 83.66% and 90.81% F1 score, which is the performance of the NIL entity recognition to solve the ambiguity of the entity linking.

Improved SIM Algorithm for Contents-based Image Retrieval (내용 기반 이미지 검색을 위한 개선된 SIM 방법)

  • Kim, Kwang-Baek
    • Journal of Intelligence and Information Systems
    • /
    • v.15 no.2
    • /
    • pp.49-59
    • /
    • 2009
  • Contents-based image retrieval methods are in general more objective and effective than text-based image retrieval algorithms since they use color and texture in search and avoid annotating all images for search. SIM(Self-organizing Image browsing Map) is one of contents-based image retrieval algorithms that uses only browsable mapping results obtained by SOM(Self Organizing Map). However, SOM may have an error in selecting the right BMU in learning phase if there are similar nodes with distorted color information due to the intensity of light or objects' movements in the image. Such images may be mapped into other grouping nodes thus the search rate could be decreased by this effect. In this paper, we propose an improved SIM that uses HSV color model in extracting image features with color quantization. In order to avoid unexpected learning error mentioned above, our SOM consists of two layers. In learning phase, SOM layer 1 has the color feature vectors as input. After learning SOM Layer 1, the connection weights of this layer become the input of SOM Layer 2 and re-learning occurs. With this multi-layered SOM learning, we can avoid mapping errors among similar nodes of different color information. In search, we put the query image vector into SOM layer 2 and select nodes of SOM layer 1 that connects with chosen BMU of SOM layer 2. In experiment, we verified that the proposed SIM was better than the original SIM and avoid mapping error effectively.

  • PDF

A DB Pruning Method in a Large Corpus-Based TTS with Multiple Candidate Speech Segments (대용량 복수후보 TTS 방식에서 합성용 DB의 감량 방법)

  • Lee, Jung-Chul;Kang, Tae-Ho
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.6
    • /
    • pp.572-577
    • /
    • 2009
  • Large corpus-based concatenating Text-to-Speech (TTS) systems can generate natural synthetic speech without additional signal processing. To prune the redundant speech segments in a large speech segment DB, we can utilize a decision-tree based triphone clustering algorithm widely used in speech recognition area. But, the conventional methods have problems in representing the acoustic transitional characteristics of the phones and in applying context questions with hierarchic priority. In this paper, we propose a new clustering algorithm to downsize the speech DB. Firstly, three 13th order MFCC vectors from first, medial, and final frame of a phone are combined into a 39 dimensional vector to represent the transitional characteristics of a phone. And then the hierarchically grouped three question sets are used to construct the triphone trees. For the performance test, we used DTW algorithm to calculate the acoustic similarity between the target triphone and the triphone from the tree search result. Experimental results show that the proposed method can reduce the size of speech DB by 23% and select better phones with higher acoustic similarity. Therefore the proposed method can be applied to make a small sized TTS.

Design and Implementation of a Similarity based Plant Disease Image Retrieval using Combined Descriptors and Inverse Proportion of Image Volumes (Descriptor 조합 및 동일 병명 이미지 수량 역비율 가중치를 적용한 유사도 기반 작물 질병 검색 기술 설계 및 구현)

  • Lim, Hye Jin;Jeong, Da Woon;Yoo, Seong Joon;Gu, Yeong Hyeon;Park, Jong Han
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.14 no.6
    • /
    • pp.30-43
    • /
    • 2018
  • Many studies have been carried out to retrieve images using colors, shapes, and textures which are characteristic of images. In addition, there is also progress in research related to the disease images of the crop. In this paper, to be a help to identify the disease occurred in crops grown in the agricultural field, we propose a similarity-based crop disease search system using the diseases image of horticulture crops. The proposed system improves the similarity retrieval performance compared to existing ones through the combination descriptor without using a single descriptor and applied the weight based calculation method to provide users with highly readable similarity search results. In this paper, a total of 13 Descriptors were used in combination. We used to retrieval of disease of six crops using a combination Descriptor, and a combination Descriptor with the highest average accuracy for each crop was selected as a combination Descriptor for the crop. The retrieved result were expressed as a percentage using the calculation method based on the ratio of disease names, and calculation method based on the weight. The calculation method based on the ratio of disease name has a problem in that number of images used in the query image and similarity search was output in a first order. To solve this problem, we used a calculation method based on weight. We applied the test image of each disease name to each of the two calculation methods to measure the classification performance of the retrieval results. We compared averages of retrieval performance for two calculation method for each crop. In cases of red pepper and apple, the performance of the calculation method based on the ratio of disease names was about 11.89% on average higher than that of the calculation method based on weight, respectively. In cases of chrysanthemum, strawberry, pear, and grape, the performance of the calculation method based on the weight was about 20.34% on average higher than that of the calculation method based on the ratio of disease names, respectively. In addition, the system proposed in this paper, UI/UX was configured conveniently via the feedback of actual users. Each system screen has a title and a description of the screen at the top, and was configured to display a user to conveniently view the information on the disease. The information of the disease searched based on the calculation method proposed above displays images and disease names of similar diseases. The system's environment is implemented for use with a web browser based on a pc environment and a web browser based on a mobile device environment.

Correlation Between Physical and Compaction Characteristics of Various Soils (다양한 지반의 물리적 특성과 다짐특성 상관성)

  • Park, Choonsik;Kim, Jonghwan
    • Journal of the Korean GEO-environmental Society
    • /
    • v.18 no.1
    • /
    • pp.23-29
    • /
    • 2017
  • This study, to provide quantitative data related to compaction characteristics, identifies the compaction characteristics of various types of soil samplers, in relation to their particle-size distribution and plasticity degree, and the compaction characteristics of artificially created granular materials, in relation to their A & D compaction. The results of the experiments show as follows. $r_{dmax}$ of clay is less than those of both sand and gravel approximately by 10%. O.M.C of clay has turned out to be greater than sand and gravel approximately by 20% and 30%, respectively. Changes in the compaction characteristics can be observed clearly around 30~60% of sand and 30~50% of passing No.200 sieve. It has also been shown that the compaction characteristics related to LL and PL are similar to each other in changes, and that the compaction characteristics become less clear with higher percent of fine grained soil. The compaction characteristics of the artificially created granular materials and field materials have appeared almost similar to each other. $r_{dmax}$ is less approximately by 30% and O.M.C greater approximately by 20% in A compaction than in D compaction. As $r_{dmax}$ and O.M.C become greater, its rate increases.

An Index-Based Approach for Subsequence Matching Under Time Warping in Sequence Databases (시퀀스 데이터베이스에서 타임 워핑을 지원하는 효과적인 인덱스 기반 서브시퀀스 매칭)

  • Park, Sang-Hyeon;Kim, Sang-Uk;Jo, Jun-Seo;Lee, Heon-Gil
    • The KIPS Transactions:PartD
    • /
    • v.9D no.2
    • /
    • pp.173-184
    • /
    • 2002
  • This paper discuss an index-based subsequence matching that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. In earlier work, Kim et al. suggested an efficient method for whole matching under time warping. This method constructs a multidimensional index on a set of feature vectors, which are invariant to time warping, from data sequences. For filtering at feature space, it also applies a lower-bound function, which consistently underestimates the time warping distance as well as satisfies the triangular inequality. In this paper, we incorporate the prefix-querying approach based on sliding windows into the earlier approach. For indexing, we extract a feature vector from every subsequence inside a sliding window and construct a multidimensional index using a feature vector as indexing attributes. For query processing, we perform a series of index searches using the feature vectors of qualifying query prefixes. Our approach provides effective and scalable subsequence matching even with a large volume of a database. We also prove that our approach does not incur false dismissal. To verify the superiority of our approach, we perform extensive experiments. The results reveal that our approach achieves significant speedup with real-world S&P 500 stock data and with very large synthetic data.

Experimental Study on the Properties of Solid Material Made by Autoclave Curing according to CaO/SiO2 Ratio and W/B (CaO/SiO2비 및 W/B 변화에 따른 오토클레이브 양생 경화체의 특성에 관한 실험적 연구)

  • Kang, Cheol;Kang, Ki-Woong;Kim, Jin-Man
    • Journal of the Korea Concrete Institute
    • /
    • v.21 no.5
    • /
    • pp.557-563
    • /
    • 2009
  • This study is on the properties of inorganic porous calcium silicate material made from silica powder through the autoclaving curing, the results of this study should be utilized fundamental data for the development of noise reduction porous solid material using siliceous byproduct generated by various manufacture process. For the manufacture of autoclave curing specimen, various calcareous materials used and siliceous materials used silica powder. In this study, properties in density and compressive strength according to the change of W/B and C/S ratio, microscopy for the shape of pore, SEM and XRD for the examination of hydrate after autoclave curing are carried out respectively. The test results shown that the more slurry density decrease, the more W/B increase at the fresh state, this tendency shown similar to in hardened state. Among the specimens of C/S ratio, the compressive strength of C/S ratio of 0.85 gave the highest the compressive strength. In the results of XRD, tobermorite generated by autoclaving curing was created all of specimens regardless of C/S ratio. To ascertain pore structure, we compared with existing porous calcium silicate product(ALC, organic sound absorbing porous material). The results of microscope observation, pore structure of specimen of this study was similar to that of existing inorganic sound absorbing foam concrete. therefore, we could conformed a possibility of sound absorbing porous solid material on the basis of the results.

A Question Answering Agent for Effective Web Information Providing Service: Implementation and Application (효과적인 웹 경보 제공 서비스를 위한 질의응답 에이전트의 구현과 응용)

  • Kim Kyoung-Min;Cho Sung-Bae
    • Korean Journal of Cognitive Science
    • /
    • v.15 no.3
    • /
    • pp.35-44
    • /
    • 2004
  • As the use of internet becomes proliferated, a great amount of information is provided through diverse channels. Users require effective information providing service and we have studied the conversational agent that exchanges information between users and agents using natural language dialogue. In this paper, we develop a question answering agent providing the corresponding answer by analyzing the user's intention using artificial intelligence techniques such as pattern matching and Bayesian network We work out various problems in knowledge representation of users by constructing keyword synonym database. The proposed method is applied to designing an agent for the introduction of a fashion web site, which confirms that it responds more flexibly to the user's queries.

  • PDF