• Title/Summary/Keyword: Keyword-based

System Design for Supporting Keyword Search in DHT-based P2P systems (DHT 기반 P2P 시스템에서 키워드 검색 지원을 위한 시스템 디자인)

  • 진명희;이승은;손영성;김경석
    • Proceedings of the Korean Information Science Society Conference / 2004.10c / pp.550-552 / 2004
  • P2P systems built on a distributed hash table (DHT) use a hash function to define IDs for files and nodes, and store each file on the node whose ID maps to the file's ID, thereby distributing files fully across the system. Because such systems look up a file by its hashed file ID, only exact matches are possible. Current P2P file-sharing systems, however, require keyword search, which lets users find a file by partial keywords even when they do not know its full name exactly. This paper proposes a scheme that enables keyword search in DHT-based P2P systems.
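
The exact-match limitation described in this abstract, and one common way around it, can be illustrated with a minimal sketch. The abstract does not describe the paper's actual design, so everything here is an assumption for illustration: the `ToyDHT` class, the successor-style `owner` rule, the `kw:` key prefix, and the idea of publishing one inverted-index entry per filename token are all hypothetical.

```python
import hashlib
import re


def key_id(text: str, bits: int = 16) -> int:
    """Hash a string onto a small identifier ring (SHA-1 reduced to `bits` bits)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class ToyDHT:
    """In-memory stand-in for a DHT; real systems route lookups over a network."""

    def __init__(self, node_names):
        # Nodes are placed on the ring by hashing their names.
        self.ring = sorted((key_id(name), name) for name in node_names)
        self.store = {name: {} for name in node_names}

    def owner(self, key: str) -> str:
        """Return the first node whose ID is >= the key's ID, wrapping around."""
        kid = key_id(key)
        for node_id, name in self.ring:
            if node_id >= kid:
                return name
        return self.ring[0][1]

    def put_file(self, filename: str) -> None:
        # Exact-match entry: stored under the hash of the full file name.
        self.store[self.owner(filename)].setdefault(filename, set()).add(filename)
        # Assumed extension: one inverted-index entry per keyword in the name.
        for keyword in re.findall(r"[a-z0-9]+", filename.lower()):
            key = "kw:" + keyword
            self.store[self.owner(key)].setdefault(key, set()).add(filename)

    def get(self, key: str) -> set:
        """Exact-match lookup: succeeds only if `key` was stored verbatim."""
        return set(self.store[self.owner(key)].get(key, set()))

    def search(self, *keywords: str) -> set:
        """Keyword lookup: intersect the index entries of all keywords."""
        hits = [self.get("kw:" + kw.lower()) for kw in keywords]
        return set.intersection(*hits) if hits else set()


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.put_file("Distributed Hash Table Survey.pdf")
dht.put_file("hash functions intro.txt")
partial = dht.get("Distributed Hash")    # partial name, so the exact-match lookup finds nothing
by_keyword = dht.search("hash", "survey")  # the keyword index finds the survey file
```

The trade-off in this sketch is that each file costs one extra insertion per keyword, and multi-keyword queries need one lookup per keyword before the results are intersected.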

Utterance Verification using Phone-Level Log-Likelihood Ratio Patterns in Word Spotting Systems (핵심어 인식기에서 단어의 음소레벨 로그 우도 비율의 패턴을 이용한 발화검증 방법)

  • Kim, Chong-Hyon;Kwon, Suk-Bong;Kim, Hoi-Rin
    • Phonetics and Speech Sciences / v.1 no.1 / pp.55-62 / 2009
  • This paper proposes an improved method for verifying keyword segments produced by a word spotting system. First, a baseline word spotting system is implemented. To improve its performance, we use a two-pass structure consisting of a word spotting system followed by an utterance verification system. Verifying keywords with the basic likelihood ratio test (LRT) based utterance verification system causes certain problems that degrade performance. We therefore propose a method that uses phone-level log-likelihood ratio (PLLR) patterns to compute a confidence measure for each keyword. The proposed method generates weights according to the PLLR patterns and assigns a different weight to each phone when generating the confidence measures for the keywords. The proposed method is shown to be better suited to word spotting systems and improves final word spotting accuracy.

Document Content Similarity Detection Algorithm Using Word Cooccurrence Statistical Information Based Keyword Extraction (단어 공기 통계 정보 기반 색인어 추출을 활용한 문서 유사도 검사 알고리즘)

  • Kim, Jinkyu;Yi, Seungchul;Park, Kibong;Haing, Huhduck
    • Proceedings of the Korean Society of Computer Information Conference / 2016.01a / pp.111-113 / 2016
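  • Plagiarism review of the publications and papers now appearing at a rapid pace is carried out either by plagiarism detection algorithms that check for direct copying, patchwork writing, and paraphrasing, or by reviewers who search for a document's keywords themselves. The ever-growing volume of documents, however, calls for more sophisticated review methods. To support this, we propose a method that goes beyond direct word or copy comparison and compares document content in order to filter and detect documents with similar content. A keyword extraction algorithm runs first so that the core content of documents can be compared, with the aim of improving the accuracy and speed of the plagiarism reviewer's work.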
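
The pipeline outlined in this abstract (extract keywords first, then compare documents by their core content) can be sketched roughly as follows. The abstract does not give the paper's scoring formula, so this is an illustrative assumption rather than the authors' algorithm: words are ranked by how often they co-occur with other words in the same sentence, and two documents are compared by the Jaccard overlap of their top keywords. The function names, the `STOPWORDS` set, and the `top_k` parameter are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}


def extract_keywords(text: str, top_k: int = 5) -> set:
    """Rank words by sentence-level co-occurrence count and keep the top_k."""
    cooccurrence = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = sorted({w for w in re.findall(r"[a-z]+", sentence) if w not in STOPWORDS})
        # Every pair of distinct words in a sentence counts once for both words.
        for first, second in combinations(words, 2):
            cooccurrence[first] += 1
            cooccurrence[second] += 1
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(cooccurrence.items(), key=lambda item: (-item[1], item[0]))
    return {word for word, _ in ranked[:top_k]}


def keyword_similarity(doc_a: str, doc_b: str, top_k: int = 5) -> float:
    """Jaccard similarity between the keyword sets of two documents."""
    keys_a = extract_keywords(doc_a, top_k)
    keys_b = extract_keywords(doc_b, top_k)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0


doc = "Keyword extraction finds important words. Keyword search uses important words."
top_three = extract_keywords(doc, top_k=3)
same = keyword_similarity(doc, doc)
different = keyword_similarity(doc, "Cats chase mice.")
```

A reviewer-facing tool built along these lines would presumably flag document pairs whose similarity exceeds a chosen threshold for manual inspection; the threshold is not specified in the abstract.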

Occupational Health Could be the New Normal Challenge in the Trade and Health Cycle: Keywords Analysis Between 1990 and 2020

  • Kiran, Sibel
    • Safety and Health at Work / v.12 no.2 / pp.272-276 / 2021
  • This brief report aims to establish the keyword content of studies on occupational health and safety, the key framework of the world of work, in the trade and health domain. Data were collected from the SCOPUS database, focusing on articles on occupational health and safety and related keywords, with an emphasis on abstracts and titles. Data were analyzed and summarized based on keywords included from the MeSH database. There were 24,499 manuscripts in the domain and 1,346 (5.40%) occupational health-related keywords, including those that overlapped. The most frequently referenced occupational health-related keyword was "occupational health" (452 articles), followed by "occupational safety" (141 articles). There were fewer keywords on occupational health in the trade and health literature. As the world of work has been prioritized because of the new normal of work life since the COVID-19 pandemic, examining the focus of occupational health priorities from a global perspective is crucial.

Big-data Analytics: Exploring the Well-being Trend in South Korea Through Inductive Reasoning

  • Lee, Younghan;Kim, Mi-Lyang;Hong, Seoyoun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.6 / pp.1996-2011 / 2021
  • To understand a trend is to explore the intricate process of how something or a particular situation is constantly changing or developing in a certain direction. This exploration is about observing and describing an unknown field of knowledge, not testing theories or models with a preconceived hypothesis. The purpose is to gain knowledge we did not expect and to recognize associations among elements, whether suspected or not. This generally requires examining a massive amount of data to find information that could be transformed into meaningful knowledge. That is, looking through the lens of big-data analytics with an inductive reasoning approach will help expand our understanding of the complex nature of a trend. The current study explored the trend of well-being in South Korea using big-data analytic techniques to discover hidden search patterns, associative rules, and keyword signals. Thereafter, a theory was developed based on inductive reasoning, namely the hook, upward push, and downward pull, to elucidate a holistic picture of how big-data implications alongside social phenomena may have influenced the well-being trend.

A study on Metaverse keyword Consumer perception survey after Covid-19 using big Data

  • LEE, JINHO;Byun, Kwang Min;Ryu, Gi Hwan
    • International Journal of Internet, Broadcasting and Communication / v.14 no.4 / pp.52-57 / 2022
  • In this study, keywords were collected from representative online portal sites such as Naver, Google, and YouTube through text mining analysis with Textom to examine how perceptions of the metaverse changed after COVID-19. Before COVID-19, social media platforms such as Kakao Talk, Facebook, and Twitter were mentioned, and among the four metaverse types, consumer awareness was still concentrated on life logging. After COVID-19, however, keywords such as Roblox, Fortnite, and Geppetto appeared, along with keywords such as Universe, Space, Meta, and World, indicating that the metaverse was recognized as a virtual world. This confirms that consumer perception of the metaverse shifted from life logging to the mirror world. In addition, keywords such as cryptocurrency, coin, and exchange appeared before COVID-19, and blockchain, the underlying technology, ranked high in word frequency, but after COVID-19 its word frequency ranking fell significantly.

Comparative Policy Analysis on ICT Small and Medium-sized Venture Using Cognitive Map Analysis (인지지도를 활용한 ICT 중소벤처 지원정책 비교분석)

  • Park, Eunyub;Lee, Jung Mann
    • Journal of Information Technology Applications and Management / v.29 no.3 / pp.75-93 / 2022
  • The purpose of this study is to compare and analyze each government's ICT SME support policies for coping with changes in the ICT ecosystem paradigm. In particular, the core policies and policy trends of the Moon government are presented through keyword network analysis and cognitive map analysis. The results show that core technologies such as ICT (information and communication technology), AI (artificial intelligence), big data, and 5G, which have high betweenness centrality and closeness centrality values, are major keywords with high propagation power. The cognitive map analysis shows that the opportunity factors for the 4th industrial revolution are being activated through the ICT infrastructure circulation process, the domestic market circulation process, and the global market circulation process. This study is meaningful in terms of cognitive map analysis and utilization based on scientific analysis.

Identifying research trends in the emergency medical technician field using topic modeling (토픽모델링을 활용한 응급구조사 관련 연구동향)

  • Lee, Jung Eun;Kim, Moo-Hyun
    • The Korean Journal of Emergency Medical Services / v.26 no.2 / pp.19-35 / 2022
  • Purpose: This study aimed to identify research topics in the emergency medical technician (EMT) field and examine research trends. Methods: In this study, 261 research papers published between January 2000 and May 2022 were collected, and EMT research topics and trends were analyzed using topic modeling techniques. This study used a text mining technique and was conducted using data collection flow, keyword preprocessing, and analysis. Keyword preprocessing and data analysis were done with the RStudio Version 4.0.0 program. Results: Keywords were derived through topic modeling analysis, and eight topics were ultimately identified: patient treatment, various roles, the performance of duties, cardiopulmonary resuscitation, triage systems, job stress, disaster management, and education programs. Conclusion: Based on the research results, it is believed that a study on the development and application of education programs that can successfully increase the emergency care capabilities of EMTs is needed.

A System for Keyword Extraction and Keyword-based Sentiment Analysis for Topic Analysis in Discussion (토론 대화에서의 토픽 분석을 위한 키워드 추출 및 키워드 기반 감성분석 시스템)

  • Yong-Bin Jeong;Yu-Jin Oh;Jae-Wan Park;Sae-Mi Jang;Young-Gyun Hahm
    • Annual Conference on Human and Language Technology / 2022.10a / pp.164-169 / 2022
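  • Topic modeling is widely used in many areas, such as business analysis and tracking technology trends. However, unsupervised methods such as LDA, the representative approach, can model topics only when the number of documents is large, because of how the algorithm is structured. In this paper, we perform topic modeling even when documents are few by clustering keywords and keyphrases, and we also present an analysis of each topic through sentiment analysis. We implemented the required data construction, keyword extraction, keyword-based sentiment analysis, keyword embedding, and clustering, and a qualitative review of the results confirmed that the analysis is meaningful.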

Reliable Image-Text Fusion CAPTCHA to Improve User-Friendliness and Efficiency (사용자 편의성과 효율성을 증진하기 위한 신뢰도 높은 이미지-텍스트 융합 CAPTCHA)

  • Moon, Kwang-Ho;Kim, Yoo-Sung
    • The KIPS Transactions:PartC / v.17C no.1 / pp.27-36 / 2010
  • In Web registration pages and online polling applications, CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is used to distinguish human users from automated programs. Text-based CAPTCHAs, which use distorted text, have been widely adopted by many popular Web sites. However, because advanced optical character recognition techniques can recognize the distorted text, their reliability has become low. Image-based CAPTCHAs have been proposed to improve on the reliability of text-based CAPTCHAs, but these systems are also known to have drawbacks. First, image-based CAPTCHA systems with only a small number of image files in their image dictionary are not very reliable, since an attacker can learn to recognize the images by repeatedly running machine learning programs. Second, users may feel uncomfortable because they have to retry CAPTCHA tests whenever they fail to input the correct keyword. Third, some image-based CAPTCHAs incur high communication costs because they must send several image files for one CAPTCHA. To solve these problems, this paper proposes a new CAPTCHA based on both image and text. In this system, an image and keywords are integrated into one CAPTCHA image to give the user a hint for the answer keyword. The hint in the fused image helps users input the answer keyword easily. The proposed system also reduces communication costs, since it uses only one fused image file per CAPTCHA. To improve the reliability of the image-text fusion CAPTCHA, we also propose a method for dynamically building a large image dictionary by gathering a huge number of images from the Internet, with a filtering phase to preserve the correctness of the CAPTCHA images. Through experiments, we show that the proposed image-text fusion CAPTCHA offers users more convenience and higher reliability than image-based CAPTCHAs.