• Title/Summary/Keyword: 색인화 (indexing)

Search Result 272, Processing Time 0.03 seconds

A Study for Parallelizing Sequential Algorithms of Search Engine in Parallel Information Retrieval System (병렬 정보검색 시스템의 순차적인 검색엔진 알고리즘의 병렬화를 위한 연구)

  • Kim, Seok Young;Park, Mi-Young;Park, Hyuk-Ro;Chung, In Sang
    • Proceedings of the Korea Information Processing Society Conference / 2007.11a / pp.693-696 / 2007
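  • Parallel information retrieval systems for efficiently searching large-scale data have increased overall system throughput through parallelization by hardware expansion. However, because the search engine algorithms running on the parallel system are still executed sequentially, the processing time of an individual user query is not reduced. To parallelize the search engine, this study surveys the sequential algorithms used in user query processing and inverted index file processing and evaluates the need for and feasibility of parallelizing them. This evaluation supports effective and systematic parallelization of the sequential algorithms executed in parallel information retrieval systems and enables the construction of more efficient parallel information retrieval systems.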

Study of Improvement of Search Range Compression Method of VP-tree for Video Indexes (영상 색인용 VP-tree의 검색 범위 압축법의 개선에 관한 연구)

  • Park, Gil-Yang;Lee, Samuel Sang-Kon;Hwang, Jea-Jeong
    • Journal of Korea Multimedia Society / v.15 no.2 / pp.215-225 / 2012
  • In multimedia databases, multidimensional space-based indexing has been used to increase search efficiency. However, this approach lacks generality because it uses Euclidean distance as its distance measure. In contrast, metric space-based indexing, which requires only that the metric axioms hold, is widely applicable because distance measures other than Euclidean distance can be used. This paper proposes an improvement to the VP-tree, one of the metric space indexing methods. Starting from the root node, the VP-tree follows the nodes that fit the search range, finally computes the distance to each object linked to a leaf node, and checks whether the object falls within the search range. Because search speed decreases as the number of distance calculations at the leaf nodes increases, this paper focuses on the triangle inequality-based range compression method in a leaf node and proposes using the object nearest to the query object as the base point of the triangle inequality. This improvement can narrow the search range and reduce the number of distance calculations. In a system performance test using 10,000 video data items, the new method reduced the search time for similar videos by 5-12% compared with the conventional method.
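
The leaf-node filtering described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the tuple layout of the leaf, and the use of Euclidean distance are all assumptions made for illustration. It shows only the generic triangle-inequality filter, in which the distance from a base point to each leaf object is stored in advance and used to skip distance calculations that cannot succeed.

```python
import math


def euclidean(a, b):
    # Any function satisfying the metric axioms could be used here;
    # Euclidean distance is only an illustrative choice.
    return math.dist(a, b)


def leaf_range_search(query, radius, base_point, leaf, dist=euclidean):
    """Range search inside one leaf node (hypothetical layout).

    leaf is a list of (obj, d_base_obj) pairs, where d_base_obj is the
    precomputed distance from base_point to obj.
    Returns (hits, number_of_distance_calculations_on_leaf_objects).
    """
    d_query_base = dist(query, base_point)
    hits, computed = [], 0
    for obj, d_base_obj in leaf:
        # Triangle inequality: d(q, o) >= |d(q, p) - d(p, o)|.
        # If this lower bound already exceeds the radius, skip the object.
        if abs(d_query_base - d_base_obj) > radius:
            continue
        computed += 1
        if dist(query, obj) <= radius:
            hits.append(obj)
    return hits, computed


base = (0.0, 0.0)
objects = [(1.0, 0.0), (5.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
leaf = [(o, euclidean(base, o)) for o in objects]
hits, computed = leaf_range_search((1.0, 1.0), 1.5, base, leaf)
```

The closer the base point lies to the query, the tighter the lower bound becomes, which is consistent with the abstract's motivation for choosing a base point near the query object.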

Towards Next Generation Multimedia Information Retrieval by Analyzing User-centered Image Access and Use (이용자 중심의 이미지 접근과 이용 분석을 통한 차세대 멀티미디어 검색 패러다임 요소에 관한 연구)

  • Chung, EunKyung
    • Journal of the Korean Society for Library and Information Science / v.51 no.4 / pp.121-138 / 2017
  • As users seek multimedia to satisfy a wide variety of information needs, information environments for multimedia have developed dramatically. In particular, as searching for multimedia through emotional access points has become popular, the need for indexing in terms of abstract concepts, including emotions, has grown. This study analyzes index terms extracted from Getty Image Bank. Five basic emotion terms (sadness, love, horror, happiness, and anger) were used to collect the index terms, and a total of 22,675 index terms were used in this study. The data comprise three sets: entire emotion, positive emotion, and negative emotion. For these three data sets, co-word occurrence matrices were created and visualized as weighted networks with PNNC clusters. The entire emotion network shows three clusters and 20 sub-clusters, whereas the positive emotion network and the negative emotion network each show 10 clusters. The results point to three elements for the next generation of multimedia retrieval: (1) analysis of index terms for emotions shown by people in images, (2) the relationship between connotative and denotative terms and the possibility of inferring connotative terms from denotative terms using that relationship, and (3) the significance of a thesaurus of connotative terms for expanding related terms or synonyms to provide better access points.
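
A co-word occurrence matrix of the kind mentioned in this abstract can be sketched as follows. The sample records and the function name are hypothetical, and the sketch covers only the counting step; the network visualization and PNNC clustering are not shown.

```python
from collections import Counter
from itertools import combinations


def coword_matrix(records):
    """Count how often each pair of index terms is assigned to the same item.

    records is a list of term lists, one list per indexed image (assumed).
    The result is a sparse symmetric matrix stored as {(term_a, term_b): count}
    with term_a < term_b alphabetically.
    """
    counts = Counter()
    for terms in records:
        # set() ignores repeated terms within one record;
        # sorted() gives every pair a single canonical order.
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


records = [
    ["sadness", "tears", "woman"],
    ["sadness", "tears"],
    ["love", "woman"],
]
matrix = coword_matrix(records)
```

The pair counts can then serve as edge weights in a weighted network of terms.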

Design of the Flexible Buffer Node Technique to Adjust the Insertion/Search Cost in Historical Index (과거 위치 색인에서 입력/검색 비용 조정을 위한 가변 버퍼 노드 기법 설계)

  • Jung, Young-Jin;Ahn, Bu-Young;Lee, Yang-Koo;Lee, Dong-Gyu;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.18D no.4 / pp.225-236 / 2011
  • Various LBS (Location Based Services) applications are being developed to provide services customized to a user's location, driven by progress in wireless communication technology and the miniaturization of personal devices. To process large amounts of vehicle location data effectively, LBS requires techniques such as vehicle observation, data communication, data insertion and search, and user query processing. In this paper, we propose a historical location index, GIP-FB (Group Insertion tree with Flexible Buffer Node), and the flexible buffer node technique for adjusting the cost of data insertion and search. The designed GIP+ based index employs the buffer node and projection storage to cut the cost of insertion and search. In addition, it adjusts the cost of insertion and search by changing the number of line segments in the buffer node according to a user-defined time interval. In the experiments, the buffer node size influences the performance of GIP-FB by changing the number of non-leaf nodes in the index. The proposed flexible buffer node can be used to tune the performance of the historical location index to the LBS application.

A Study on the "Miscellaneous Diseases" (雜病) Chapter of the Lingshu (靈樞) ("영추(靈樞).잡병(雜病)"에 대한 연구)

  • Han, Sang-In;Lee, Nam-Gu
    • Journal of Korean Medical classics / v.18 no.2 s.29 / pp.124-144 / 2005
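  • The "Miscellaneous Diseases" (雜病) chapter of the Lingshu (靈樞) discusses the condition, diagnosis, and needling methods for four patterns of reversal qi (厥氣), six patterns of heart pain (心痛), and two patterns of jaw pain (함통), as well as knee pain (膝痛), throat pain (喉痛), toothache (齒痛), deafness (耳聾), malaria (虐), lumbar pain (腰痛), qi counterflow (氣逆), neck pain (項痛), abdominal pain (腹痛), atrophy-reversal (萎厥), nosebleed (육), hiccup (홰), and other patterns. Because the content of the chapter is not limited to a specific disease but covers many patterns commonly seen in clinical practice, the chapter is titled "Miscellaneous Diseases". Miscellaneous diseases are diseases caused by internal damage or external contraction. In ancient times, externally contracted diseases such as cold damage (傷寒) posed a great threat to human life, but today, with the development of modern civilization and the diversification of lifestyles, miscellaneous diseases due to internal damage have become the greater threat. The chapter is divided into five sections in the 黃帝內經章句索引 (an index to the chapters and sentences of the Huangdi Neijing) and into twenty-nine sections in the 靈樞經校釋 and the 黃帝內經靈樞經語譯. For convenience, the author followed the 黃帝內經章句索引 and studied the chapter in five sections. The chapter records many commonly seen miscellaneous diseases and has great value for clinical research. However, after repeated transcription over the generations it contains many errors, misplaced passages (錯簡), and loan characters (假借), so many parts are difficult to understand. Without consulting multiple editions and the research of commentators through the ages, its original meaning cannot be truly grasped. For this reason, this paper examines the historical editions and the views of the commentators, carries out collation and annotation, and adds Korean reading marks (懸吐) and Korean-language notes, in the hope of contributing to a more accurate understanding of the original text.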

Grouping Method Based Query Range Density for Efficient Operation Sharing of Spatial Range Query (공간영역질의의 효율적인 연산 공유를 위한 질의영역 밀집도 기반의 그룹화 기법)

  • Lim, Jung-Hyeun;Shin, Soong-Sun;Baek, Sung-Ha;Lee, Dong-Wook;Kim, Kyung-Bae;Bae, Hae-Young
    • Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.348-351 / 2009
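  • u-GIS spatial information technology, a core technology for realizing the ubiquitous society, requires u-GIS DSMS, a platform that combines a Data Stream Management System (DSMS) with a Geographic Information System (GIS). A u-GIS DSMS must handle many spatial range queries that combine sensor data collected from GeoSensors with GIS spatial data. These spatial range queries tend to be registered densely in particular areas and are likely to have similar predicates. Consequently, when spatial range queries are concentrated in a particular area, many similar operations are processed repeatedly and system performance degrades. To address this, range query indexing techniques are being actively studied. However, the existing VCR-Index and CQI-Index techniques partition query ranges into cell structures or virtual structures, so resources and operations cannot be shared and query processing speed drops sharply, which makes them unsuitable for processing large numbers of spatial range queries. This paper therefore proposes a grouping method based on query range density for efficient operation sharing among spatial range queries. The method groups spatial range queries by the density of their query ranges and then builds an index. The data of the indexed ranges is organized into a single queue, and the predicates of the queries are analyzed so that resources and operations can be shared, which improved processing speed and reduced memory usage compared with existing techniques.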

Design and Implementation of Real-Time Research Trend Analysis System Using Author Keyword of Articles (논문의 저자 키워드를 이용한 실시간 연구동향 분석시스템 설계 및 구현)

  • Kim, Young-Chan;Jin, Byoung-Sam;Bae, Young-Chul
    • The Journal of the Korea institute of electronic communication sciences / v.13 no.1 / pp.141-146 / 2018
  • Author keywords are the most important elements characterizing the content of a paper; by analyzing them in real time and providing the results to users, research trends can be grasped. The unstructured data of journal papers is built into a database, which is used to create an index data structure that can be searched in real time. Papers containing a specific keyword are retrieved from the index data structure, their author keywords are extracted and clustered, and a word cloud in which each word is sized according to its weight is presented to the user. In this way we designed a method for visualizing research trends. We also present the results of the research trend analysis for the keywords "virus" and "iris recognition" in the implemented system.

Efficient Nearest Neighbor Search on Moving Object Trajectories (이동객체궤적에 대한 효율적인 최근접이웃검색)

  • Kim, Gyu-Jae;Park, Young-Hee;Cho, Woo-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.12 / pp.2919-2925 / 2014
  • With the rapid growth of mobile and wireless communication, location-based services are used in many applications, so the management and analysis of spatio-temporal data is a hot issue in database research. Index structures and query processing for such data are very important for these applications. This paper addresses algorithms that build an index structure using the Douglas-Peucker algorithm and efficiently process nearest neighbor search queries on moving object trajectories. We compare and analyze our algorithms through experiments. Our algorithms produce a small index structure and process queries more efficiently.
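
The Douglas-Peucker algorithm named in this abstract simplifies a polyline by keeping only the points that deviate from a straight line by more than a tolerance. The sketch below is a generic version for 2-D points, not the paper's index construction. The function names, the `epsilon` parameter, and the choice of distance to the segment (rather than to the infinite line) are assumptions, and a real trajectory would also carry timestamps.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (all 2-D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Return a simplified copy of points with deviation at most epsilon."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the first-last segment.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [first, last]
    # Keep the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right


flat = douglas_peucker([(0, 0), (1, 0.1), (2, 0), (3, 0.1), (4, 0)], 0.5)
peak = douglas_peucker([(0, 0), (2, 3), (4, 0)], 0.5)
```

Fewer retained points mean fewer line segments to index, which is one plausible reading of the abstract's claim of a small index structure.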

Latent Semantic Indexing Analysis of K-Means Document Clustering for Changing Index Terms Weighting (색인어 가중치 부여 방법에 따른 K-Means 문서 클러스터링의 LSI 분석)

  • Oh, Hyung-Jin;Go, Ji-Hyun;An, Dong-Un;Park, Soon-Chul
    • The KIPS Transactions:PartB / v.10B no.7 / pp.735-742 / 2003
  • In information retrieval systems, document clustering provides user convenience and visual effects by rearranging retrieved documents according to specific topics. In this paper, we cluster documents using the K-Means algorithm and present the effect of index term weighting schemes on document clustering. To verify the experiment, we applied the Latent Semantic Indexing approach to illustrate the clustering results and analyzed them in 2-dimensional space. Experimental results showed that when local weighting, global weighting, and a normalization factor are applied, the clustering density is higher than that of similar or identical weighting schemes in 2-dimensional space. In particular, the logarithm of local and global weighting is notable.
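
The combination of local weighting, global weighting, and a normalization factor can be sketched as below. The abstract does not give the exact formulas, so the choices here are assumptions: logarithmic term frequency as the local weight, logarithmic inverse document frequency as the global weight, and cosine normalization. The function name and sample documents are hypothetical, and the K-Means and LSI steps are not shown.

```python
import math
from collections import Counter


def weight_documents(docs):
    """Turn tokenized documents into normalized term-weight vectors.

    weight(t, d) = log(1 + tf(t, d)) * log(N / df(t)), then each document
    vector is scaled to unit length (assumed scheme, see lead-in).
    """
    n = len(docs)
    term_freqs = [Counter(doc) for doc in docs]
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tf in term_freqs for term in tf)
    vectors = []
    for tf in term_freqs:
        vec = {t: math.log(1 + f) * math.log(n / doc_freq[t]) for t, f in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: (w / norm if norm else 0.0) for t, w in vec.items()})
    return vectors


docs = [
    ["index", "index", "search"],
    ["search", "cluster"],
    ["cluster", "index", "term"],
]
vectors = weight_documents(docs)
```

Vectors weighted this way could be passed to a clustering routine; a term that appears in every document receives weight zero under this scheme.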

A study on the Filtering of Spam E-mail using n-Gram indexing and Support Vector Machine (n-Gram 색인화와 Support Vector Machine을 사용한 스팸메일 필터링에 대한 연구)

  • 서정우;손태식;서정택;문종섭
    • Journal of the Korea Institute of Information Security & Cryptology / v.14 no.2 / pp.23-33 / 2004
  • With the rapid growth of the Internet, the exchange of messages by e-mail is also increasing quickly. Despite the convenience of e-mail, the time and cost that individuals and enterprises waste on spam mail has become a big issue. Many solutions have been studied to counter the harmful effects of spam mail. Typical methods include pattern matching using keywords and probabilistic methods such as Naive Bayesian. In this paper, we propose a method for classifying spam mail from normal mail using a Support Vector Machine, which performs excellently in pattern classification problems, to compensate for the problems of existing research. In particular, the proposed method carries out the learning procedure efficiently with a word dictionary that includes an index generated by n-grams. In conclusion, we verified the proposed method by comparing the accuracy of spam mail separation between existing research and the proposed scheme.
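
The n-gram indexing step can be sketched as follows. The abstract does not say whether the n-grams are taken over characters or words, so character bigrams are an assumption, as are the function names and sample messages. The sketch stops at the feature vector; training a Support Vector Machine on such vectors is not shown.

```python
def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of text, lowercased."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_dictionary(messages, n=2):
    """Assign each distinct n-gram an integer index in order of first appearance."""
    index = {}
    for message in messages:
        for gram in char_ngrams(message, n):
            index.setdefault(gram, len(index))
    return index


def vectorize(message, dictionary, n=2):
    """Count n-gram occurrences; n-grams missing from the dictionary are ignored."""
    vector = [0] * len(dictionary)
    for gram in char_ngrams(message, n):
        position = dictionary.get(gram)
        if position is not None:
            vector[position] += 1
    return vector


dictionary = build_dictionary(["spam", "ham"])
features = vectorize("spam spam", dictionary)
```

Each message becomes a fixed-length count vector, which is the input shape a classifier such as an SVM expects.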