• Title/Summary/Keyword: Keyword-based

Proposal of keyword extraction method based on morphological analysis and PageRank in Tweeter (트위터에서 형태소 분석과 PageRank 기반 화제단어 추출 방법 제안)

  • Lee, Won-Hyung;Cho, Sung-Il;Kim, Dong-Hoi
    • Journal of Digital Contents Society / v.19 no.1 / pp.157-163 / 2018
  • Users of social networking services (SNS) post diverse ideas every day, and the posted data contains many people's thoughts and opinions. In particular, the popular keywords shown on Twitter are produced by counting the words that appear frequently in user posts and ranking them. However, because this method simply tallies repeated words, it is sensitive to unnecessary data. The proposed method instead ranks words by their topical importance using a graph of the relationships between words, so that unnecessary data has less influence and the main words can be extracted stably. We compare the proposed scheme, which is based on morphological analysis and PageRank, with the existing scheme, which is based on the number of appearances, in terms of the descending keyword rank and the ratio of meaningless keywords among the top 20 keywords. As a result, meaningless keywords accounted for 55% of the top 20 keywords under the proposed scheme and 70% under the existing scheme, an improvement of about 15 percentage points.
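
The idea of ranking words by their relationships rather than by raw counts can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the posts have already been reduced to word lists by a morphological analyzer, links words that co-occur in the same post, and runs a plain power-iteration PageRank. The sample posts, the function names, and the damping value of 0.85 are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(posts):
    """Link every pair of distinct words that appear in the same post."""
    graph = defaultdict(set)
    for words in posts:
        for a, b in combinations(sorted(set(words)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected word graph."""
    nodes = list(graph)
    if not nodes:
        return {}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour shares its rank equally among its own neighbours.
            incoming = sum(rank[m] / len(graph[m]) for m in graph[n])
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


# Hypothetical, already-tokenized posts; not data from the paper.
posts = [
    ["election", "debate", "candidate"],
    ["election", "candidate", "poll"],
    ["election", "debate"],
    ["lol", "lol", "lol"],
]
scores = pagerank(build_cooccurrence_graph(posts))
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy input a raw frequency count would place "lol" (three occurrences) level with "election" (three posts), whereas the graph-based ranking never admits "lol" because it co-occurs with no other word.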

Design and Evaluation of an Individual Instance-based Ontology Retrieval System for Archival Records of the "Saemaul Movement" (새마을운동 기록물의 개체기반 온톨로지 검색시스템 설계 및 평가)

  • Lee, Byung Gil;Kim, Heesop
    • Journal of Korean Society of Archives and Records Management / v.13 no.3 / pp.67-97 / 2013
  • The purpose of this study is to design and evaluate an individual instance-based ontology retrieval system for archival records of the "Saemaul Movement". We used the Protege 4.1 editor to design an individual instance-based ontology. To evaluate the proposed ontology retrieval system, five short queries and ten narrative queries were run, and the resulting precision and recall were compared against those of the NARA keyword-based retrieval system. The results showed that the individual instance-based ontology retrieval system outperformed the keyword-based retrieval system in both precision and recall.

A Methodology for Extracting Shopping-Related Keywords by Analyzing Internet Navigation Patterns (인터넷 검색기록 분석을 통한 쇼핑의도 포함 키워드 자동 추출 기법)

  • Kim, Mingyu;Kim, Namgyu;Jung, Inhwan
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.123-136 / 2014
  • Recently, online shopping has further developed as the use of the Internet and a variety of smart mobile devices becomes more prevalent. The increase in the scale of such shopping has led to the creation of many Internet shopping malls. Consequently, there is a tendency for increasingly fierce competition among online retailers, and as a result, many Internet shopping malls are making significant attempts to attract online users to their sites. One such attempt is keyword marketing, whereby a retail site pays a fee to expose its link to potential customers when they insert a specific keyword on an Internet portal site. The price related to each keyword is generally estimated by the keyword's frequency of appearance. However, it is widely accepted that the price of keywords cannot be based solely on their frequency because many keywords may appear frequently but have little relationship to shopping. This implies that it is unreasonable for an online shopping mall to spend a great deal on some keywords simply because people frequently use them. Therefore, from the perspective of shopping malls, a specialized process is required to extract meaningful keywords. Further, the demand for automating this extraction process is increasing because of the drive to improve online sales performance. In this study, we propose a methodology that can automatically extract only shopping-related keywords from the entire set of search keywords used on portal sites. We define a shopping-related keyword as a keyword that is used directly before shopping behaviors. In other words, only search keywords that direct the search results page to shopping-related pages are extracted from among the entire set of search keywords. A comparison is then made between the extracted keywords' rankings and the rankings of the entire set of search keywords. Two types of data are used in our study's experiment: web browsing history from July 1, 2012 to June 30, 2013, and site information. 
The experimental dataset was obtained from a website ranking site and the biggest portal site in Korea. The original sample dataset contains 150 million transaction logs. First, portal sites are selected, and the search keywords used on those sites are extracted. Search keywords can be easily extracted by simple parsing. The extracted keywords are ranked according to their frequency. The experiment uses approximately 3.9 million search results from Korea's largest search portal site. As a result, a total of 344,822 search keywords were extracted. Next, by using web browsing history and site information, the shopping-related keywords were taken from the entire set of search keywords. As a result, we obtained 4,709 shopping-related keywords. For performance evaluation, we compared the hit ratios of all the search keywords with those of the shopping-related keywords. To achieve this, we extracted 80,298 search keywords from several Internet shopping malls and then chose the top 1,000 keywords as a set of true shopping keywords. We measured the precision, recall, and F-score of the entire set of keywords and of the shopping-related keywords. The F-score was calculated as the harmonic mean of precision and recall. The precision, recall, and F-score of the shopping-related keywords derived by the proposed methodology were higher than those of the entire set of keywords. This study proposes a scheme that is able to obtain shopping-related keywords in a relatively simple manner. We could easily extract shopping-related keywords simply by examining transactions whose next visit is a shopping mall. The resultant shopping-related keyword set is expected to be a useful asset for many shopping malls that participate in keyword marketing. Moreover, the proposed methodology can be easily applied to the construction of special area-related keywords as well as shopping-related ones.
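
The evaluation described above, scoring a keyword set against a reference set of true shopping keywords, can be illustrated with a short sketch. The keyword lists below are hypothetical and far smaller than the study's data; the function name is also an assumption. The F-score is computed as the harmonic mean of precision and recall, F = 2PR / (P + R).

```python
def precision_recall_f1(extracted, reference):
    """Score an extracted keyword set against a reference keyword set."""
    extracted, reference = set(extracted), set(reference)
    hits = len(extracted & reference)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical keyword sets, for illustration only.
all_keywords = ["weather", "news", "sneakers", "laptop", "lottery"]
shopping_keywords = ["sneakers", "laptop", "backpack"]
true_shopping = ["sneakers", "laptop", "backpack", "headphones"]

baseline = precision_recall_f1(all_keywords, true_shopping)
proposed = precision_recall_f1(shopping_keywords, true_shopping)
```

With these toy lists the filtered set scores higher on all three measures than the unfiltered set, which is the kind of comparison the abstract reports.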

kUTAF: Keyword based web service UI Test Automation Framework (키워드기반 웹서비스 UI 테스트 자동화 프레임웍 소개)

  • Hwang, Young-Seok;Jung, Sang-Moon;Hwa, Chang-Deuk
    • Proceedings of the Korean Information Science Society Conference / 2011.06a / pp.158-161 / 2011
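  • Building and operating a web UI test automation environment poses practical difficulties. Building an automation set requires a large initial investment, and maintaining the flows once built also incurs considerable cost. In addition, a great deal of effort is needed to manage automation sets centrally so that they can be run immediately whenever required. This paper introduces kUTAF (Keyword based web service UI Test Automation Framework), a keyword-based framework that effectively addresses these difficulties in web UI test automation. In kUTAF, test cases are written with keywords that map web components and user actions to a domain-specific language, automated tests are run from those test cases, and the resulting automated test sets are managed together in an integrated way. Applying kUTAF makes automation sets easier to maintain, reduces communication costs, and brings substantial benefits in the integrated management of automated tests.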
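
Keyword-driven UI testing in general can be sketched as below. This is not kUTAF's actual design or API, which the abstract does not detail: the keyword names, the `FakePage` stand-in, and the `run_test_case` helper are all hypothetical, and a real framework would drive a browser rather than an in-memory object.

```python
class FakePage:
    """In-memory stand-in for a web page; a real setup would drive a browser."""

    def __init__(self):
        self.fields = {}
        self.clicked = []


def type_text(page, target, value):
    page.fields[target] = value


def click(page, target):
    page.clicked.append(target)


def verify_text(page, target, expected):
    actual = page.fields.get(target)
    if actual != expected:
        raise AssertionError(f"{target}: expected {expected!r}, got {actual!r}")


# Each keyword names one user action; test authors only see the keywords.
KEYWORDS = {"Type Text": type_text, "Click": click, "Verify Text": verify_text}


def run_test_case(page, steps):
    """Execute a test case written as (keyword, *arguments) rows."""
    for keyword, *args in steps:
        KEYWORDS[keyword](page, *args)


login_test = [
    ("Type Text", "username", "alice"),
    ("Verify Text", "username", "alice"),
    ("Click", "login_button"),
]
page = FakePage()
run_test_case(page, login_test)
```

Because the test case is plain data, a change to how an action is performed touches only the keyword's implementation, not every test that uses it, which is the maintenance benefit such frameworks aim for.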

Discovery of promising business items by technology-industry concordance and keyword co-occurrence analysis of US patents. (기술-산업 연계구조 및 특허 분석을 통한 미래유망 아이템 발굴)

  • Cho Byoung-Youl;Rho Hyun-Sook
    • Journal of Korea Technology Innovation Society / v.8 no.2 / pp.860-885 / 2005
  • This study develops a quantitative method for discovering and selecting promising technology-based business items. We used patent trend analysis, technology-industry concordance analysis, and keyword co-occurrence analysis of US patents. By analyzing patent trends and technology-industry concordance, we identified the emerging industry trends: the prevalence of the bio industry, the service industry, and B2C business. From the direct and co-occurrence analysis of patent keywords newly discovered in the year 2000, 28 promising business item candidates were extracted. Finally, the candidates were prioritized using four determinants of business attractiveness: market size, product life cycle, degree of technological innovation, and coincidence with the industry trends. The result implies that promising technology-based business items can be reliably discovered and selected through a quantitative, objective, and low-cost process that applies knowledge discovery to a patent database instead of peer review.

RAKTA: Automation of Exploratory Testing Based on Keyword (RAKTA: 키워드 기반 탐색적 테스팅 자동화)

  • Hwang, Jun-Sun;Choi, Eun Man
    • Annual Conference of KIPS / 2019.05a / pp.331-334 / 2019
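  • Typical keyword-based testing automates tests by writing function-oriented keywords, so it costs little, but it is difficult to automate highly useful tests this way. Exploratory testing, on the other hand, writes risk-based charters and has the advantage of detecting many errors in a short time, but it has the drawback of insufficient documentation. To compensate for these drawbacks, we propose the RAKTA (Record And Keyword-based Test Automation) methodology, which adheres to the basic principles of exploratory testing while enabling efficient keyword-based automation. RAKTA uses the technology of Robot Framework, an open-source keyword-based automation framework, and combines the strengths of keyword-based and exploratory testing to automate tests efficiently, reducing cost and detecting many errors. This paper also proposes a way to automate integration, acceptance, and installation tests by combining various keyword reuse cases based on the RAKTA methodology with the test scripts an organization already uses.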

Keyword identifications on dimensions for service quality of Healthcare providers (헬스케어 서비스 리뷰를 활용한 서비스 품질 차원 별 중요 단어 파악 방안)

  • Lee, Hong Joo
    • Knowledge Management Research / v.19 no.4 / pp.171-185 / 2018
  • Studies of online reviews have analyzed ratings and topics as a whole. However, opinions on the various dimensions of service quality also need to be analyzed. This study classifies reviews of healthcare services into service quality dimensions and proposes a method to identify the words mainly mentioned in each dimension. Service quality was based on the dimensions provided by SERVQUAL, and patient reviews were collected from NHS Choices. The 2,000 sampled sentences were classified into the SERVQUAL service quality dimensions, and a method of extracting important keywords from the sentences of each dimension was proposed. The RAKE algorithm extracts keywords from a single document, so an additional index was used to take into account words that are frequently used across documents. Because keywords must be identified across many reviews, we considered frequency and discrimination (IDF) together rather than relying on the RAKE score alone. We identified the words that patients mainly mentioned in each SERVQUAL dimension, and also the words that patients mainly mentioned at each review rating.

ORGANIC RELATIONSHIP BETWEEN LAWS BASED ON JUDICIAL PRECEDENTS USING TOPOLOGICAL DATA ANALYSIS

  • Kim, Seonghun;Jeong, Jaeheon
    • Korean Journal of Mathematics / v.29 no.4 / pp.649-664 / 2021
  • There have been numerous efforts to make legal information easily accessible to the general public. Most existing legal information services are based on a keyword-oriented legal ontology. However, such keyword-oriented ontologies diverge from the relationships between the laws that are used together in actual cases. To solve this problem, it is necessary to study which laws are actually used together in various judicial precedents. However, this is difficult to implement with the existing methods used in computer science or law. In our study, we analyzed this using topological data analysis, which has recently attracted attention as a promising approach in the field of data analysis. In this paper, we applied the Mapper algorithm, one of the topological data analysis techniques, to visualize the relationships that laws form organically in actual precedents.

Preference-based search technology for the user query semantic interpretation (사용자 질의 의미 해석을 위한 선호도 기반 검색 기술)

  • Jeong, Hoon;Lee, Moo-Hun;Do, Hana;Choi, Eui-In
    • Journal of Digital Convergence / v.11 no.2 / pp.271-277 / 2013
  • Semantic search promises to provide more accurate results than present-day keyword-matching search by using a logically represented knowledge base. Existing keyword-based retrieval systems do not interpret the meaning of a user's query or the connections between the user's keywords, and so cannot search on that basis. In this paper, we propose a method that provides accurate results matching the user's search intent by ranking search results through user preference-based evaluation. The proposed scheme carries out the semantic interpretation process on an integrated, formally structured ontology-based knowledge base.

Keyword-Based Contents Recommendation Web Service (키워드 기반 콘텐츠 추천 웹서비스)

  • Park, Dong-Jin;Kim, Min-Geun;Song, Hyeon-Seop;Yoon, Seok-Min;Kim, Youngjong
    • Annual Conference of KIPS / 2022.05a / pp.346-348 / 2022
  • Media Contents Recommendation Web Service (service name 'mobodra') is a web service that analyzes each user's tastes in media types and genres and recommends content accordingly. When signing up for membership, users select some of the works randomly presented on the web, and the service analyzes their tastes from these selections. Based on this analysis, preferred content is recommended to each user. In this paper, we implement the content recommendation algorithm through item-based collaborative filtering. When the user's activity data changes or their preferences are surveyed again, the above process is executed again to update the user's taste profile.
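
Item-based collaborative filtering, the technique the abstract names, can be sketched as follows. This is a generic illustration under stated assumptions, not the service's code: items are compared by cosine similarity over their user-rating vectors, unseen items are scored by their similarity to what the user already rated, and the ratings table and function names are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(ratings, user, top_n=2):
    """Rank items the user has not rated by similarity to items they rated."""
    users = sorted(ratings)
    items = sorted({item for r in ratings.values() for item in r})
    # One vector per item: its rating from every user (0 if unrated).
    vectors = {i: [ratings[u].get(i, 0.0) for u in users] for i in items}
    seen = ratings[user]
    scores = {}
    for candidate in items:
        if candidate in seen:
            continue
        scores[candidate] = sum(
            cosine(vectors[candidate], vectors[liked]) * score
            for liked, score in seen.items()
        )
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical ratings gathered from sign-up selections and later activity.
ratings = {
    "ann": {"drama_a": 5, "drama_b": 4},
    "bob": {"drama_a": 4, "drama_b": 5, "anime_c": 1},
    "cho": {"anime_c": 5, "anime_d": 4},
    "dan": {"drama_a": 5},
}
suggestions = recommend(ratings, "dan")
```

Re-running `recommend` after the ratings table changes corresponds to the update step the abstract describes.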