• Title/Summary/Keyword: Search Keyword Extraction


Keyword Selection for Visual Search based on Wikipedia (비주얼 검색을 위한 위키피디아 기반의 질의어 추출)

  • Kim, Jongwoo;Cho, Soosun
    • Journal of Korea Multimedia Society / v.21 no.8 / pp.960-968 / 2018
  • Mobile visual search services use a query image to retrieve linked information by searching a pre-constructed database. For this purpose, it would be more useful to search a web-based keyword search system instead of a pre-built database. In this paper, we propose a representative query extraction algorithm whose output serves as the keyword for a web-based search system. The algorithm takes as input the image classification labels generated by a deep-learning CNN (Convolutional Neural Network), which performs remarkably well in image recognition. Within the query extraction algorithm, words with dictionary meanings are extracted using Wikipedia, and hierarchical categories are constructed using WordNet. The performance of the proposed algorithm is evaluated by measuring the system response time.

Tag Search System Using the Keyword Extraction and Similarity Evaluation (키워드 추출 및 유사도 평가를 통한 태그 검색 시스템)

  • Jung, Jaein;Yoo, Myungsik
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.12 / pp.2485-2487 / 2015
  • Hashtags are now widely used on SNS such as Facebook and Twitter and on personal blogs. However, the indiscriminate use of hashtags makes tag search systems inefficient. To improve the accuracy of tag search, we propose a tag search system that uses keyword extraction and similarity evaluation. The experimental results show that the proposed system returns more accurate tag search results.

Keyword Weight based Paragraph Extraction Algorithm (키워드 가중치 기반 문단 추출 알고리즘)

  • Lee, Jongwon;Joo, Sangwoong;Lee, Hyunju;Jung, Hoekyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.504-505 / 2017
  • Existing morpheme analyzers classify the words used in a document, and systems that extract sentences and paragraphs based on a morpheme analyzer are being developed. However, very few systems compress documents and extract their important paragraphs. The algorithm proposed in this paper calculates the weights of the keywords in a document and extracts the paragraphs containing those keywords. Users can shorten the time needed to understand a document by reading only the paragraphs containing the keywords instead of the entire document. In addition, since the number of extracted paragraphs varies with the number of keywords used in the search, the user can explore more search patterns than with existing systems.


Design and Implementation of Potential Advertisement Keyword Extraction System Using SNS (SNS를 이용한 잠재적 광고 키워드 추출 시스템 설계 및 구현)

  • Seo, Hyun-Gon;Park, Hee-Wan
    • Journal of the Korea Convergence Society / v.9 no.7 / pp.17-24 / 2018
  • One of the major issues in big data processing is extracting keywords from the Internet and using them to process the necessary information. Most proposed keyword extraction algorithms extract keywords using the search function of a large portal site, and they work from already posted documents or fixed content. In this paper, we propose KAES (Keyword Advertisement Extraction System), which supports potential shopping keyword marketing by extracting issue keywords and related keywords from dynamic instant messages posted on SNS, such as issues, interests, and comments. KAES builds a list of specific accounts and extracts the keywords and related keywords that occur most frequently on the SNS.

A Study of High Speed Retrieval Algorithm of Long Component Keyword (복합키워드의 고속검색 알고리즘에 관한 연구)

  • Lee Jin-Kwan;Jung Kyu-cheol;Lee Tae-hun;Park Ki-hong
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.8 / pp.1769-1776 / 2004
  • Effective keyword extraction is important in information search systems, and there are several ways to select the proper keyword from among many keywords. Among them, the DER structure for the AC algorithm, which by itself searches for single keywords, can search for multiple keywords, but it has a time complexity problem. In this paper, we developed an algorithm, the "EDER structure", by expanding a standalone search table based on the DER structure search method to improve time complexity. We tested the algorithm on 500 text files and found that the EDER structure is more efficient than the DER structure for AC in both keyword posting results and time complexity: 0.2 seconds for EDER versus 0.6 seconds for the DER structure.
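The DER and EDER structures are not described in enough detail here to reproduce, so the following is only a minimal sketch of the baseline they build on, assuming "AC" refers to the Aho-Corasick multi-pattern matching algorithm. The function names `build_automaton` and `find_keywords` are illustrative and do not come from the paper.

```python
from collections import deque


def build_automaton(keywords):
    """Build a trie with failure links (assumed Aho-Corasick baseline)."""
    goto = [{}]   # goto[node][char] -> child node index
    fail = [0]    # failure link of each node
    out = [[]]    # keywords that end at each node
    for word in keywords:
        node = 0
        for ch in word:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(word)
    # Breadth-first pass: depth-1 nodes keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            queue.append(child)
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure target.
            out[child] = out[child] + out[fail[child]]
    return goto, fail, out


def find_keywords(text, keywords):
    """Return (start_index, keyword) for every match, in one pass over text."""
    goto, fail, out = build_automaton(keywords)
    node = 0
    hits = []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for word in out[node]:
            hits.append((i - len(word) + 1, word))
    return hits
```

Calling `find_keywords("information retrieval system", ["information", "information retrieval"])` reports both the single keyword and the compound keyword starting at index 0, since the automaton scans the text once regardless of how many keywords it holds.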

A study about IR Keyword Abstraction using AC Algorithm (AC 알고리즘을 이용한 정보검색 키워드 추출에 관한 연구)

  • 장혜숙;이진관;박기홍
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2002.11a / pp.667-671 / 2002
  • Efficient keyword extraction is of great importance in information retrieval systems, yet it is very difficult to extract words that fit the purpose because there are many compound words. For example, the AC machine cannot search for compound keywords starting from a single keyword. The DER structure solves this problem, but searching for keywords with it still takes too much time. To solve these problems, this dissertation proposes a DERtable structure, in which method tables are added to the existing DER structure and used to search for keywords.


A Methodology for Extracting Shopping-Related Keywords by Analyzing Internet Navigation Patterns (인터넷 검색기록 분석을 통한 쇼핑의도 포함 키워드 자동 추출 기법)

  • Kim, Mingyu;Kim, Namgyu;Jung, Inhwan
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.123-136 / 2014
  • Recently, online shopping has further developed as the use of the Internet and a variety of smart mobile devices becomes more prevalent. The increase in the scale of such shopping has led to the creation of many Internet shopping malls. Consequently, there is a tendency for increasingly fierce competition among online retailers, and as a result, many Internet shopping malls are making significant attempts to attract online users to their sites. One such attempt is keyword marketing, whereby a retail site pays a fee to expose its link to potential customers when they insert a specific keyword on an Internet portal site. The price related to each keyword is generally estimated by the keyword's frequency of appearance. However, it is widely accepted that the price of keywords cannot be based solely on their frequency because many keywords may appear frequently but have little relationship to shopping. This implies that it is unreasonable for an online shopping mall to spend a great deal on some keywords simply because people frequently use them. Therefore, from the perspective of shopping malls, a specialized process is required to extract meaningful keywords. Further, the demand for automating this extraction process is increasing because of the drive to improve online sales performance. In this study, we propose a methodology that can automatically extract only shopping-related keywords from the entire set of search keywords used on portal sites. We define a shopping-related keyword as a keyword that is used directly before shopping behaviors. In other words, only search keywords that direct the search results page to shopping-related pages are extracted from among the entire set of search keywords. A comparison is then made between the extracted keywords' rankings and the rankings of the entire set of search keywords. Two types of data are used in our study's experiment: web browsing history from July 1, 2012 to June 30, 2013, and site information. 
The experimental dataset came from a website ranking site and from the largest portal site in Korea. The original sample dataset contains 150 million transaction logs. First, portal sites are selected, and the search keywords used on those sites are extracted. Search keywords can be extracted easily by simple parsing. The extracted keywords are then ranked by frequency. The experiment uses approximately 3.9 million search results from Korea's largest search portal site, from which a total of 344,822 search keywords were extracted. Next, using web browsing history and site information, the shopping-related keywords were selected from the entire set of search keywords, which yielded 4,709 shopping-related keywords. For performance evaluation, we compared the hit ratios of all the search keywords with those of the shopping-related keywords. To do this, we extracted 80,298 search keywords from several Internet shopping malls and chose the top 1,000 keywords as the set of true shopping keywords. We measured the precision, recall, and F-score of the entire set of keywords and of the shopping-related keywords. The F-score is the harmonic mean of precision and recall. The precision, recall, and F-score of the shopping-related keywords derived by the proposed methodology were higher than those of the entire set of keywords. This study proposes a scheme that obtains shopping-related keywords in a relatively simple manner: we could extract them simply by examining transactions whose next visit is a shopping mall. The resulting shopping-related keyword set is expected to be a useful asset for many shopping malls that participate in keyword marketing. Moreover, the proposed methodology can easily be applied to building keyword sets for other specialized areas as well as for shopping.
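The evaluation described above compares an extracted keyword set against a set of true shopping keywords. A minimal sketch of that computation follows; the function name `precision_recall_f` and the toy keyword lists are illustrative assumptions, not data from the study.

```python
def precision_recall_f(extracted, truth):
    """Score an extracted keyword set against a ground-truth keyword set."""
    extracted, truth = set(extracted), set(truth)
    hits = len(extracted & truth)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(truth) if truth else 0.0
    # F-score: harmonic mean of precision and recall.
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Toy example (hypothetical keywords).
extracted = ["shoes", "bag", "weather", "news"]
truth = ["shoes", "bag", "coat"]
precision, recall, f_score = precision_recall_f(extracted, truth)
```

With these toy sets, two of the four extracted keywords are true shopping keywords, so precision is 0.5 and recall is 2/3.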

A Study on the Optimal Search Keyword Extraction and Retrieval Technique Generation Using Word Embedding (워드 임베딩(Word Embedding)을 활용한 최적의 키워드 추출 및 검색 방법 연구)

  • Jeong-In Lee;Jin-Hee Ahn;Kyung-Taek Koh;YoungSeok Kim
    • Journal of the Korean Geosynthetics Society / v.22 no.2 / pp.47-54 / 2023
  • In this paper, we propose a technique for optimal search keyword extraction and retrieval for news article classification. The technique was verified on the example of identifying trends related to North Korean construction. BigKinds, a representative Korean media platform, was used to select sample articles and extract keywords. The extracted keywords were vectorized using word embedding, and the similarity between them was measured by cosine similarity. Words with a similarity of 0.5 or higher were then clustered based on the top 10 frequencies. Following the BigKinds search form, keywords inside a cluster were joined with 'OR' and the clusters were joined with 'AND'. In-depth analysis confirmed that the search extracted meaningful articles appropriate for the original purpose. This paper is significant in that news articles suited to a user's specific purpose can be classified without modifying the existing classification system and search form.
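The clustering and query-building step can be sketched as follows. The abstract does not spell out the clustering procedure, so this sketch assumes that each high-frequency keyword acts as a seed and gathers every keyword whose cosine similarity to it is at least 0.5. The toy two-dimensional vectors and the function name `build_query` are hypothetical; real word embeddings would have many more dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_query(vectors, seeds, threshold=0.5):
    """Join keywords with OR inside a cluster and clusters with AND."""
    clusters = []
    for seed in seeds:
        # Assumed rule: a cluster holds every keyword similar enough to its seed.
        members = [w for w in vectors
                   if cosine(vectors[seed], vectors[w]) >= threshold]
        clusters.append("(" + " OR ".join(members) + ")")
    return " AND ".join(clusters)


# Toy embeddings (hypothetical values).
vectors = {
    "construction": [1.0, 0.0],
    "building": [0.9, 0.1],
    "Pyongyang": [0.0, 1.0],
    "Kaesong": [0.2, 0.9],
}
query = build_query(vectors, seeds=["construction", "Pyongyang"])
```

Here `query` groups "construction" with "building" and "Pyongyang" with "Kaesong", which mirrors the OR-within, AND-between form the abstract describes for the BigKinds search.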

The Use and Understanding of Keyword Searching in SELIS Online Public Access Catalogs (SELIS OPAC에 있어서 키워드탐색의 이용과 이해)

  • Koo Bon-Young
    • Journal of the Korean Society for Library and Information Science / v.33 no.2 / pp.119-139 / 1999
  • The purpose of this research is to analyse how well users understand the way keyword and boolean searches work in the SELIS (SEoul Women's University Library and Information System) OPAC. The analysis of the subject, SELIS OPAC system processing, gave the following results: 67 (22.48%) of 298 respondents understood keyword extraction and 231 (77.52%) did not; 115 (22.48%) of 297 understood boolean OR in keyword search and 182 (77.52%) did not; 98 (33.11%) of 296 understood boolean AND and 198 (66.89%) did not; and 109 (36.49%) of 285 understood the use of boolean and symbols and 181 (63.51%) did not. Understanding was thus generally low. A further analysis examined whether understanding differed between users with and without keyword search experience: no difference in understanding of keyword search was found at the 5% level of significance, but a difference was found for boolean search at the 5% level of significance.


Keyword Analysis Based Document Compression System

  • Cao, Kerang;Lee, Jongwon;Jung, Hoekyung
    • Journal of information and communication convergence engineering / v.16 no.1 / pp.48-51 / 2018
  • Traditional document analysis was word-centered, with systems implemented using a morpheme analyzer. These systems can classify the words used in a document but cannot help users understand or analyze it. To solve this problem, a system needs to extract the most valuable paragraphs, those that help users understand the document. In this paper, we propose a system that extracts paragraphs from a normalized XML document. The user enters the filename of the XML document to be analyzed. The system then searches the document for keywords and displays the keywords it found. When the user selects and enters the desired keywords, the system extracts the paragraphs that include them. After extraction, the system maintains the paragraph sequence and checks for duplicates, deleting any duplicated paragraph. Finally, the system reports to the user the frequency and weight of each keyword, along with the sorted paragraphs.
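The extraction flow described above can be sketched roughly as follows. Several details are assumptions made for illustration: paragraphs are taken to live in `<p>` elements, a keyword's weight is taken to be its share of all word occurrences, and the function name `extract_paragraphs` is hypothetical. The paper's actual XML schema and weighting formula are not given here.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def extract_paragraphs(xml_text, keywords):
    """Return paragraphs containing any keyword, plus per-keyword statistics."""
    root = ET.fromstring(xml_text)
    # Assumption: each paragraph is a <p> element.
    paragraphs = ["".join(p.itertext()).strip() for p in root.iter("p")]

    # Frequency of every word across the document (case-insensitive).
    counts = Counter(re.findall(r"\w+", " ".join(paragraphs).lower()))
    total = sum(counts.values())
    # Assumption: weight = keyword frequency / total word count.
    stats = {
        k: (counts[k.lower()], counts[k.lower()] / total if total else 0.0)
        for k in keywords
    }

    selected, seen = [], set()
    for text in paragraphs:  # iterating in order keeps the paragraph sequence
        tokens = set(re.findall(r"\w+", text.lower()))
        if text not in seen and any(k.lower() in tokens for k in keywords):
            seen.add(text)   # duplicate paragraphs are kept only once
            selected.append(text)
    return selected, stats


sample = (
    "<doc><p>Keyword search is fast.</p><p>Nothing here.</p>"
    "<p>Keyword search is fast.</p><p>A keyword index helps.</p></doc>"
)
selected, stats = extract_paragraphs(sample, ["keyword"])
```

In this sample the repeated paragraph appears only once in `selected`, while the keyword frequency in `stats` is still counted over the whole document, duplicates included.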