• Title/Summary/Keyword: 유사 키워드 (similar keyword)

Search Result 312, Processing Time 0.022 seconds

Semantic Search and Recommendation of e-Catalog Documents through Concept Network (개념 망을 통한 전자 카탈로그의 시맨틱 검색 및 추천)

  • Lee, Jae-Won;Park, Sung-Chan;Lee, Sang-Keun;Park, Jae-Hui;Kim, Han-Joon;Lee, Sang-Goo
    • The Journal of Society for e-Business Studies / v.15 no.3 / pp.131-145 / 2010
  • Until now, the popular paradigms for providing e-catalog documents adapted to users' needs have been keyword search and collaborative filtering-based recommendation. Because users' queries are too short to represent what users want, it is hard to provide e-catalog documents that are adapted to their needs (i.e., queries and preferences). Although various techniques have been proposed to overcome this problem, they are based on index term matching. A conventional Bayesian belief network-based approach represents the users' needs and e-catalog documents with their corresponding concepts. However, since the concepts are the index terms extracted from the e-catalog documents, it is hard to represent relationships between concepts. In our work, we extend the conventional Bayesian belief network-based approach to represent users' needs and e-catalog documents with a concept network derived from the Web directory. By exploiting the concept network, it is possible to search for conceptually relevant e-catalog documents even when they do not contain the index terms of the queries. Furthermore, by computing the conceptual similarity between users, we can exploit a semantic collaborative filtering technique for recommending e-catalog documents.
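
The idea of retrieving documents that share concepts with a query but no index terms can be illustrated with a toy sketch. It is not the paper's Bayesian belief network: the `TERM_TO_CONCEPTS` mapping, the concept labels, and the Jaccard scoring are all assumptions made for illustration, standing in for a concept network derived from a Web directory.

```python
# Hypothetical toy mapping from index terms to directory-style concepts.
# A real concept network would be derived from a Web directory.
TERM_TO_CONCEPTS = {
    "laptop": {"Computers/Portable"},
    "notebook": {"Computers/Portable"},
    "ultrabook": {"Computers/Portable"},
    "printer": {"Computers/Peripherals"},
}


def concepts_of(terms):
    """Collect the concepts reachable from a list of terms."""
    found = set()
    for term in terms:
        found |= TERM_TO_CONCEPTS.get(term.lower(), set())
    return found


def jaccard(a, b):
    """Set overlap, used here as a simple stand-in for a probabilistic score."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def concept_search(query, catalog):
    """Rank catalog documents by concept overlap with the query."""
    query_concepts = concepts_of(query.split())
    scored = [
        (jaccard(query_concepts, concepts_of(text.split())), doc_id)
        for doc_id, text in catalog.items()
    ]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


catalog = {"d1": "slim ultrabook 13 inch", "d2": "laser printer"}
print(concept_search("laptop", catalog))
```

Here the query "laptop" matches document `d1` although the word itself never appears in it, because both map to the same concept.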

Recognition Method of Korean Abnormal Language for Spam Mail Filtering (스팸메일 필터링을 위한 한글 변칙어 인식 방법)

  • Ahn, Hee-Kook;Han, Uk-Pyo;Shin, Seung-Ho;Yang, Dong-Il;Roh, Hee-Young
    • Journal of Advanced Navigation Technology / v.15 no.2 / pp.287-297 / 2011
  • As electronic mail has become widely used for its convenience and speed of communication, the volume of malicious and advertising spam mail has increased, causing many social and economic problems. A number of approaches have been proposed to alleviate the impact of spam. These approaches can be categorized into pre-acceptance and post-acceptance methods. Post-acceptance methods include Bayesian filters, collaborative filtering, and e-mail prioritization, which are based on words or sentences. However, spammers alter those characteristics before sending in order to evade filtering systems. In Korean, such abnormal usages can be far more numerous than in other languages because each syllable is composed of chosung, jungsung, and jongsung. Existing formal expressions and learning algorithms are limited in how promptly and efficiently they can cope with those changes. We therefore present a method for recognizing Korean abnormal language (Koral) to improve the accuracy and efficiency of filtering systems. The method is based on syllables rather than words and on the Smith-Waterman algorithm. Through experiments on filter keywords and e-mail extracted from a mail server, we confirmed that Koral is recognized accurately according to the similarity level. The required time and space costs are within the permitted limits.
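
A minimal sketch of how sub-syllable matching with Smith-Waterman might work is given below. It is an assumption-laden illustration, not the authors' implementation: the scoring values (`match=2`, `mismatch=-1`, `gap=-1`), the normalisation, and the function names are all hypothetical. The jamo decomposition uses the standard Unicode arithmetic for precomposed Hangul syllables.

```python
# Hypothetical sketch: jamo-level Smith-Waterman similarity (not the paper's code).

HANGUL_BASE = 0xAC00   # first precomposed Hangul syllable
HANGUL_LAST = 0xD7A3   # last precomposed Hangul syllable
JUNG_COUNT, JONG_COUNT = 21, 28


def to_jamo(text):
    """Split Hangul syllables into (chosung, jungsung, jongsung) index units.

    Non-Hangul characters are kept as raw units so they still take part
    in the alignment.
    """
    units = []
    for ch in text:
        code = ord(ch)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            offset = code - HANGUL_BASE
            units.append(("cho", offset // (JUNG_COUNT * JONG_COUNT)))
            units.append(("jung", (offset % (JUNG_COUNT * JONG_COUNT)) // JONG_COUNT))
            jong = offset % JONG_COUNT
            if jong:  # 0 means the syllable has no final consonant
                units.append(("jong", jong))
        else:
            units.append(("raw", ch))
    return units


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(keyword, candidate, match=2):
    """Local alignment score divided by the keyword's perfect self-score."""
    k, c = to_jamo(keyword), to_jamo(candidate)
    if not k:
        return 0.0
    return smith_waterman(k, c, match=match) / (match * len(k))


print(similarity("광고", "광.고"))  # a filter keyword vs. an obfuscated variant
```

A filter could then flag a candidate when `similarity` exceeds a chosen threshold; where that threshold should sit is a tuning decision this sketch does not address.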

A Study on the Contemporary Definition of 'GARDEN' - Keyword Analysis used Literature Research and Big Data - ('정원'의 시대적 정의에 관한 연구 - 문헌연구와 빅데이터를 활용한 키워드 분석을 중심으로-)

  • Woo, Kyungsook;Suh, Joo Hwan
    • Journal of the Korean Institute of Landscape Architecture / v.44 no.5 / pp.1-11 / 2016
  • Interest in gardens and garden design has been growing rapidly in Korea in recent years. However, the usage of the term 'garden' is extremely varied and complex, and there has been very little academic research on the meaning of garden. Therefore, this research attempts to investigate current ideas of the garden and to elucidate their changing patterns by means of extensive literature research and big data analysis. In the past, the notion of garden was broad, including not only private spaces such as Madang(마당) and Teul(뜰) but also fields and grassland as public outdoor space. Yet the meaning has narrowed to merely private space as dwelling systems changed with the intense industrial development of the 20th century. Furthermore, the introduction of urban parks as an interactive space between nature and humans, a spatial function similar to that of gardens, has blurred the boundary between garden and park, which has created confusion in understanding the concept of a garden. After all, a garden is a subject for humans. The meanings of garden need to be recognized from various points of view, since the garden itself is a creation drawing on diverse fields such as the natural and social sciences as well as culturology. This discussion of the meaning of garden in the present day will provide a conceptual foundation for future research on gardens and garden design. The big data analysis employed here as a research method can also help research on similar topics, particularly semantics in landscape architecture.

Reliable Image-Text Fusion CAPTCHA to Improve User-Friendliness and Efficiency (사용자 편의성과 효율성을 증진하기 위한 신뢰도 높은 이미지-텍스트 융합 CAPTCHA)

  • Moon, Kwang-Ho;Kim, Yoo-Sung
    • The KIPS Transactions:PartC / v.17C no.1 / pp.27-36 / 2010
  • In Web registration pages and online polling applications, CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is used to distinguish human users from automated programs. Text-based CAPTCHAs, which use distorted text, have been widely adopted by many popular Web sites. However, because advanced optical character recognition techniques can recognize the distorted text, their reliability has become low. Image-based CAPTCHAs have been proposed to improve on the reliability of text-based CAPTCHAs. However, these systems are also known to have some drawbacks. First, image-based CAPTCHA systems with a small number of image files in their image dictionary are not very reliable, since an attacker can recognize the images by repeatedly running machine learning programs. Second, users may feel uncomfortable since they have to retry CAPTCHA tests repeatedly when they fail to input a correct keyword. Third, some image-based CAPTCHAs incur high communication costs since they must send several image files for one CAPTCHA. To solve these problems of image-based CAPTCHAs, this paper proposes a new CAPTCHA based on both image and text. In this system, an image and keywords are integrated into one CAPTCHA image to give the user a hint for the answer keyword. The proposed CAPTCHA helps users input the answer keyword easily using the hint in the fused image. The proposed system also reduces communication costs since it uses only one fused image file per CAPTCHA. To improve the reliability of the image-text fusion CAPTCHA, we also propose a method for dynamically building a large image dictionary by gathering a huge number of images from the Internet, with a filtering phase to preserve the correctness of the CAPTCHA images. Through experiments, we showed that the proposed image-text fusion CAPTCHA provides users with more convenience and higher reliability than image-based CAPTCHAs.

Knowledge Structure of Cognitive Behavioral Therapy Studies in Korea: Co-word Analysis (국내 인지행동치료 연구의 지식구조: 동시출현단어 분석)

  • Kim, Do-Hee;Kim, Hyeon-Jin;An, Da-Hye
    • Journal of Digital Convergence / v.17 no.12 / pp.509-521 / 2019
  • The purpose of this study is to examine the patterns of the keywords in journals in the field of Cognitive Behavioral Therapy (CBT) to identify the knowledge structure of CBT studies in Korea. To compare CBT studies from Korea and abroad, 234 articles (2008-2019) published in "Cognitive Behavior Therapy in Korea" and 2,316 articles (1977-2019) published in "Cognitive Therapy and Research" were collected. The data were analyzed using NetMiner 4.3. The co-word analysis was done by calculating the cosine similarity matrix of major keywords, followed by visualizing the network. The results of this study identified the main interests of Korean CBT scholars, and categorized the knowledge structure of CBT in Korea into 9 research areas: "scale validation"; "perfectionism and entrapment"; "cognitive, emotional, and relationship characteristics of schizophrenic patients"; "cognitive characteristics and treatment of borderline personality disorder and depression/bipolar disorder patients"; "adaptation and psychological health"; "cognitive characteristics and treatment of patients with social anxiety disorder"; "causes and co-morbidities of depression"; "acceptance and commitment therapy"; and "understanding and the treatment of binge eating disorder patients." This study is meaningful in that it has reviewed the accumulated knowledge in the CBT field in Korea for the past 11 years, and suggests future tasks for development to improve the standards of CBT practice.
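
The co-word step mentioned in this abstract (a cosine similarity matrix over keywords) can be sketched roughly as follows. The study itself used NetMiner 4.3; this pure-Python version, its function names, and the three toy keyword lists are assumptions for illustration only. Keeping each keyword's own frequency on the diagonal of the co-occurrence matrix is likewise a design choice of this sketch, not something the abstract specifies.

```python
from itertools import combinations
from math import sqrt


def cooccurrence(docs):
    """Count, for every keyword pair, how many articles list both keywords."""
    vocab = sorted({kw for doc in docs for kw in doc})
    index = {kw: i for i, kw in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for doc in docs:
        unique = sorted(set(doc))
        for kw in unique:
            matrix[index[kw]][index[kw]] += 1  # diagonal holds keyword frequency
        for a, b in combinations(unique, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return vocab, matrix


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def cosine_matrix(matrix):
    """Pairwise cosine similarity between the rows of a co-occurrence matrix."""
    return [[cosine(r1, r2) for r2 in matrix] for r1 in matrix]


# Hypothetical author-keyword lists for three articles.
docs = [
    ["depression", "anxiety"],
    ["depression", "rumination"],
    ["anxiety", "rumination"],
]
vocab, counts = cooccurrence(docs)
similarities = cosine_matrix(counts)
for kw, row in zip(vocab, similarities):
    print(kw, [round(value, 3) for value in row])
```

The resulting matrix can serve as weighted edges for a keyword network, with a cutoff on similarity deciding which links are drawn.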

A Personalized Product Recommendation Agent on Mobile Internet (무선인터넷 환경에서의 개인화상품추천에이전트)

  • 이승화;이은석
    • Proceedings of the Korean Information Science Society Conference / 2004.04b / pp.145-147 / 2004
  • This paper proposes a personalized product recommendation agent suited to the mobile Internet environment. Many existing personalized recommendation systems on the wired Internet ask users numerous questions and require answers in order to build the initial user model. However, this approach is hard to apply on the mobile Internet, given the high usage fees that scale with the amount of data transmitted. The proposed system uses the user's social data to divide users into groups of similar age and gender, first presents the products with the highest purchase rate in the user's group, and then monitors user behavior to build a profile through implicit feedback, so that initial user modeling can be performed without a cumbersome question-and-answer process. Once the profile has been created, users are re-clustered into groups with similar tastes based on it, and collaborative recommendation is performed. Because the profile collects the final category name and keywords of each product, recommendations can reflect product brand and specification information. In addition, recommended products are compared with the user's purchase data: when the user has purchased a product, the taste information for that product is retained and related products are recommended, but the purchased product itself is not recommended again. To evaluate the system, a prototype was implemented and a number of users were asked to use it and check the items that interested them; the learning speed and performance of the system were evaluated from the increase in hit rate as recommendations were repeated. Furthermore, by comparing in reverse the existing purchase data of users with shopping-mall purchase experience against the initial products suggested from social data, the validity of initial profile generation was demonstrated without incurring long time and cost.
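
The cold-start step in this abstract (grouping users by age and gender and first suggesting each group's most-purchased products) could be sketched as below. This is an illustration under assumptions, not the authors' agent: the ten-year age bands, the use of raw purchase counts as a proxy for "purchase rate", and all names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def age_band(age, width=10):
    """Map an age to the lower bound of its band, e.g. 24 -> 20 (assumed banding)."""
    return (age // width) * width


def top_products_by_group(purchases, users, k=3):
    """For each (age band, gender) group, list its k most purchased products."""
    counts = defaultdict(Counter)
    for user_id, product in purchases:
        user = users[user_id]
        counts[(age_band(user["age"]), user["gender"])][product] += 1
    return {
        group: [p for p, _ in sorted(c.items(), key=lambda item: (-item[1], item[0]))[:k]]
        for group, c in counts.items()
    }


def initial_recommendations(new_user, group_top):
    """Suggest the group's top products to a user who has no profile yet."""
    return group_top.get((age_band(new_user["age"]), new_user["gender"]), [])


# Hypothetical sample data.
users = {
    "u1": {"age": 24, "gender": "F"},
    "u2": {"age": 27, "gender": "F"},
    "u3": {"age": 41, "gender": "M"},
}
purchases = [("u1", "earbuds"), ("u2", "earbuds"), ("u2", "case"), ("u3", "razor")]

group_top = top_products_by_group(purchases, users, k=2)
print(initial_recommendations({"age": 22, "gender": "F"}, group_top))
```

In the system described, these initial suggestions would then be refined through implicit feedback; that later stage is not modeled here.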

Semantic Topic Selection Method of Document for Classification (문서분류를 위한 의미적 주제선정방법)

  • Ko, Kwang-Sup;Kim, Pan-Koo;Lee, Chang-Hoon;Hwang, Myung-Gwon
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.1 / pp.163-172 / 2007
  • The web, as a global network, includes text documents, video, sound, and so on, and connects distributed information through links. As the web has developed, it has accumulated abundant information, most of which consists of text-based documents. Most users use the web to retrieve the information they want. Numerous studies have therefore sought to retrieve text documents using methods based on probability, statistics, vector similarity, Bayesian approaches, and so on. These studies, however, could not consider both the subject and the semantics of documents. As a result, users have to search again by hand. Korean documents are especially hard to find because research on Korean document classification is insufficient. To overcome these problems, we propose a Korean document classification method for semantic retrieval. The method first extracts the TF value and RV value of the concepts included in a document and maps them onto U-WIN, a Korean vocabulary dictionary, to select the topic of the document. The method can classify documents semantically, and experiments showed its efficiency.

A Comparative Study on the Symbolism of the Combination of Animals One Another in East Asian Comedic Stories and Proverbs (동아시아 소화(笑話)·속담(俗談)속의 동물조합 상징성 비교)

  • Keum, Young-Jin
    • Cross-Cultural Studies / v.42 / pp.205-240 / 2016
  • The combination of animals has developed in each cultural sphere as a method of metaphor and as a symbol of the cultural code. However, its symbolism is not a fixed constant but a variable and relative one. This work focuses on its features by comparing the comedic stories and proverbs of the East Asian cultural spheres. Several features were identified. First, the combinations of animals in similar comedic stories and proverbs in Korea, Japan, and China show a difference in point of view. Korean examples focus on the difference between the two animals, whereas Chinese and Japanese examples focus on differences in value and level. Second, anthropomorphization is relatively more developed in China and Japan than in Korea. The combinations of animals in Chinese comedic stories and proverbs, particularly with respect to anthropomorphization, focus most on the age and sex of the animal. Unlike in Chinese proverbs, the animal's age or sex remains mostly undetermined in Korean animal proverbs. On the other hand, the two animals in Japanese comedic stories and proverbs usually form a male and female pair. Third, the Chinese and Japanese combinations of animals focus on the animals' bodies and the characteristics of their actions. Chinese and Japanese examples combine the characteristics of the two animals' bodies and actions. This feature apparently gave rise to combinations of animals' body parts, for example, the Dragon. Understanding the combinations of two animals is a good portal into the features of the East Asian cultural sphere.

Automatic Classification and Vocabulary Analysis of Political Bias in News Articles by Using Subword Tokenization (부분 단어 토큰화 기법을 이용한 뉴스 기사 정치적 편향성 자동 분류 및 어휘 분석)

  • Cho, Dan Bi;Lee, Hyun Young;Jung, Won Sup;Kang, Seung Shik
    • KIPS Transactions on Software and Data Engineering / v.10 no.1 / pp.1-8 / 2021
  • In the political field of news articles, there are polarized and biased characteristics, such as conservative and liberal leanings, which are called political bias. We constructed a keyword-based dataset to classify the bias of news articles. Most embedding research represents a sentence as a sequence of morphemes. In our work, we expect that the number of unknown tokens will be reduced if sentences are composed of subwords segmented by the language model. We propose a document embedding model with subword tokenization and apply this model to an SVM and a feedforward neural network structure to classify political bias. Compared with the document embedding model based on morphological analysis, the document embedding model with subwords showed the highest accuracy, at 78.22%. It was confirmed that subword tokenization reduced the number of unknown tokens. Using the best-performing embedding model in our bias classification task, we extract keywords based on politicians. The bias of the keywords was verified by their average similarity with the vectors of politicians from each political tendency.
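
Why subword segmentation tends to reduce unknown tokens can be shown with a toy comparison. The abstract says the subwords come from a language model; the greedy longest-match segmenter below is a simpler stand-in chosen for brevity, and the vocabularies, the `<unk>` symbol, and the English example sentence are all assumptions.

```python
UNK = "<unk>"


def word_tokenize(sentence, vocab):
    """Whole-word lookup: any word outside the vocabulary becomes <unk>."""
    return [word if word in vocab else UNK for word in sentence.split()]


def subword_tokenize(sentence, vocab):
    """Greedy longest-match segmentation of each word into known subwords."""
    tokens = []
    for word in sentence.split():
        start = 0
        while start < len(word):
            end = len(word)
            # Shrink the candidate piece until it is in the vocabulary.
            while end > start and word[start:end] not in vocab:
                end -= 1
            if end == start:  # no known piece starts at this character
                tokens.append(UNK)
                start += 1
            else:
                tokens.append(word[start:end])
                start = end
    return tokens


# Hypothetical vocabularies.
word_vocab = {"tax", "reform"}
subword_vocab = {"tax", "reform", "ation", "s"}

sentence = "taxation reforms"
print(word_tokenize(sentence, word_vocab).count(UNK))
print(subword_tokenize(sentence, subword_vocab))
```

With whole words, both tokens are unknown; with subwords, each word decomposes into pieces the vocabulary already contains.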

Video-to-Video Generated by Collage Technique (콜라주 기법으로 해석한 비디오 생성)

  • Cho, Hyeongrae;Park, Gooman
    • Journal of Broadcast Engineering / v.26 no.1 / pp.39-60 / 2021
  • In deep learning, many generation-related algorithms have followed GAN, but generation in this sense has both similarities to and differences from art. Whereas generation in the engineering sense mainly judges results by quantitative indicators or by correct and incorrect answers, creation in the artistic sense produces work that interprets the world and human life by cross-validating and doubting the correct and incorrect answers from various perspectives. In this paper, the video generation ability of deep learning is interpreted from the perspective of collage and compared with results made by an artist. The experiment compares and analyzes how closely GAN reproduces results that the creator made with the collage technique and how it differs in the creative part, and investigates satisfaction by defining performance evaluation items for the reproducibility of GAN. To test how well the creator's statement and purpose of expression were reproduced, a deep learning algorithm corresponding to each statement keyword was found and its similarity was compared. In the experiment, GAN fell short of expectations in expressing the collage technique. Nevertheless, its image association received higher satisfaction ratings than human ability, a positive finding suggesting that GAN can show ability comparable to humans in abstract creation.