• Title/Summary/Keyword: target text

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia pacific journal of information systems / v.21 no.1 / pp.103-122 / 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful. For this reason, some online documents are accompanied by a list of keywords specified by the authors to guide users by facilitating the filtering process. A set of keywords is thus often considered a condensed version of the whole document and therefore plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could also benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, implementation is the obstacle: manually assigning keywords to all documents is a daunting, even impractical, task because it is extremely tedious and time-consuming and requires a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are two main approaches to achieving this aim: the keyword assignment approach and the keyword extraction approach. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its terms to the texts. In other words, the keyword assignment approach seeks to select the words from a controlled vocabulary that best describe a document. 
Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. In the latter approach, on the other hand, the aim is to extract keywords according to their relevance in the text, without a prior vocabulary. In this approach, automatic keyword generation is treated as a classification task, and keywords are commonly extracted using supervised learning techniques. Thus, keyword extraction algorithms classify candidate keywords in a document as positive or negative examples. Several systems, such as Extractor and Kea, were developed using the keyword extraction approach. The most indicative words in a document are selected as keywords for that document; as a result, keyword extraction is limited to terms that appear in the document. Therefore, keyword extraction cannot generate implicit keywords that are not included in a document. According to the experimental results of Turney, about 64% to 90% of keywords assigned by the authors can be found in the full text of an article. Conversely, this means that 10% to 36% of the keywords assigned by the authors do not appear in the article and cannot be generated through keyword extraction algorithms. Our preliminary experiment also shows that 37% of keywords assigned by the authors are not included in the full text. This is why we decided to adopt the keyword assignment approach. In this paper, we propose a new approach for automatic keyword assignment, namely IVSM (Inverse Vector Space Model). The model is based on the vector space model, a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. 
The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on the term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating keywords that have high similarity scores. Two keyword generation systems were implemented applying IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The first IVSM system is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers and has been tested on a number of academic papers, including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. According to our experiment, the precisions of IVSM applied to the Web-based community service and to academic journals were 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability. IVSM also shows performance comparable to that of Extractor, a representative keyword extraction system developed by Turney. As the number of electronic documents increases, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
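
The five steps listed in the abstract above can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not say how keyword sets are stored or weighted, so the sketch assumes each candidate keyword comes with a hypothetical weighted term profile (`keyword_profiles`), and the tokenizer, function names, and `top_k` cutoff are likewise assumptions.

```python
import math
import re
from collections import Counter


def vector_length(weights):
    """Euclidean length of a sparse vector given as {term: weight} (steps 1 and 3)."""
    return math.sqrt(sum(w * w for w in weights.values()))


def tokenize(text):
    """Very simple preprocessing (step 2): lowercase and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(doc_tf, profile):
    """Cosine similarity between a term-frequency vector and a keyword profile (step 4)."""
    dot = sum(tf * profile.get(term, 0.0) for term, tf in doc_tf.items())
    denom = vector_length(doc_tf) * vector_length(profile)
    return dot / denom if denom else 0.0


def assign_keywords(text, keyword_profiles, top_k=2):
    """Return up to top_k keywords with the highest non-zero similarity (step 5)."""
    doc_tf = Counter(tokenize(text))
    scored = [(cosine(doc_tf, profile), kw) for kw, profile in keyword_profiles.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [kw for score, kw in scored[:top_k] if score > 0]


# Hypothetical profiles: each keyword maps to weighted terms (assumed, not from the paper).
keyword_profiles = {
    "information retrieval": {"search": 2.0, "query": 1.5, "documents": 1.0},
    "logistics": {"port": 2.0, "shipping": 1.5, "cargo": 1.0},
    "text mining": {"documents": 1.0, "keywords": 2.0, "text": 1.0},
}

print(assign_keywords("Search engines return documents for each query", keyword_profiles))
```

Because the keywords come from the profiles rather than from the document, a keyword such as "information retrieval" can be assigned even when that exact phrase never appears in the text, which is the property the abstract attributes to the keyword assignment approach.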

A Study on Effectiveness of Hang-Tag Type and Preferred Method of Functional Information for Outdoor Jackets (아웃도어 재킷의 기능성표기 행택 유형에 대한 소비자의 이해도 및 선호도 연구)

  • Bang, Giseong;Yoo, Shinjung
    • Science of Emotion and Sensibility / v.19 no.4 / pp.83-94 / 2016
  • The aim of this study was to investigate effective methods of expressing performance information for functional clothing, and the alternatives preferred by consumer groups categorized according to their perception of functional outdoor jackets. A total of 472 men and women in their 20s to 60s were surveyed, and their answers were analyzed and categorized using the SPSS 21.0 statistical program. For the study, four different expression methods for waterproof and breathable fabric ('illustration only', 'illustration+Korean text', 'illustration+foreign text', and 'chart with Korean text') were presented, and respondents were asked to find the correct answers. The analysis was done for three consumer groups categorized in a previous study: the 'unconversant/brand & design pursuing group', the 'conversant/function pursuing group', and the 'high price/high function preferring group'. The results showed that, regardless of group, 'picture only' was the most preferred method and 'graph' the least. However, the percentage of correct answers was highest for the 'graph', especially in the 'conversant/function pursuing group'. This implies that the effective expression method should be differentiated according to the target consumer group. The 'conversant/function pursuing group' agreed more strongly than the other two groups on the need for additional information, such as 'after-washed performance'.

Mammalian Research Topics and Trends in Korea (국내 포유류 연구의 주제와 동향)

  • Ko, Byung June;Eo, Soo Hyung
    • Korean Journal of Environment and Ecology / v.31 no.1 / pp.30-41 / 2017
  • Mammals in Korea have been studied in various fields such as animal science, veterinary medicine, laboratory animal science, ecology, and genetics. As the importance of biodiversity has been emphasized recently, conservation and management of mammals have attracted much public attention. However, in spite of this increase in scientific research and public interest, it is still difficult to find a report or summary that captures the trend of mammalian research in Korea. The purpose of this study is to provide basic data for planning future research areas and related policies by identifying the research trends on mammals in Korea. Using text-mining and co-word analysis, we analyzed 392 mammalian research papers published in Korean national journals as of 2015. Our results showed that the number of mammalian research papers published in Korea has gradually increased and that the research target species have also become increasingly diverse. The major research areas identified through text-mining and co-word analysis are (1) evolution/phylogenetics/genetics, (2) environmental science/ecology, (3) embryology/reproductive biology/cell biology, (4) veterinary medicine related to parasites, (5) parasitology related to rodents, (6) bacteriology/virology, (7) anatomy/cell biology/laboratory animal science, (8) veterinary science related to morphology and anatomy, (9) animal science, (10) marine mammalogy, and (11) Chiroptera (bat) research. Environmental science/ecology has been the most active of the 11 research areas in recent times, and its proportion of research has increased sharply compared to the past. Environmental science/ecology is the core of biodiversity conservation, and as the importance of biodiversity has been emphasized in recent years, researchers' interest in mammal ecology appears to have increased. We expect that the results of this study will be useful for future research plans and related policies on mammals in Korea.

Reading Korean and Chinese Paintings Expressing the Ideas of Classical Literary Works - Focused on Interpretation of The Text (한국과 중국의 시의화(詩意畵) 읽기 - 텍스트의 해석을 중심으로 -)

  • Kang, KyungHee
    • (The)Study of the Eastern Classic / no.50 / pp.261-294 / 2013
  • The purpose of this paper is to examine how the original texts of Chinese classical literary works have been rendered in the paintings of China and Korea, and to inspect how these original texts were interpreted in the paintings. It is an attempt to analyze literature through painting and to reread painting through literature. The works examined were Qu Yuan's (屈原) Prose Poem of the Fisherman ("漁父辭"); Tao Yuanming's (陶淵明) Prose Poem of Returning Home ("歸去來辭") and the prose with a poem on the Peach Blossom Spring ("桃花源記幷詩"); Du Fu's (杜甫) Song of Eight Drunken Celestials ("飮中八仙歌"); Su Shi's (蘇軾) Odes on the Red Cliff ("赤壁賦"); and Ouyang Xiu's (歐陽脩) Odes of the Sounds of Autumn ("秋聲賦"), together with the paintings based on these texts. These literary texts, shared by China and Korea, were compared in terms of their reception and enjoyment. On the basis of this process, the characteristics of Korean paintings expressing the ideas of classical literary works were derived. The findings are as follows. First, the appearance in Korean painting of typical styles that had formed historically in China shows that Korean painters actively embraced Chinese art styles and did not lose their international sense. Second, through profound study of Chinese painting, they transformed it in accordance with the Korean aesthetic view and ultimately revealed typically Korean characteristics. Third, these results show the difference in the perception and interpretation of literary works between China and Korea.

Analyzing the Phenomena of Hate in Korea by Text Mining Techniques (텍스트마이닝 기법을 이용한 한국 사회의 혐오 양상 분석)

  • Hea-Jin, Kim
    • Journal of the Korean Society for Library and Information Science / v.56 no.4 / pp.431-453 / 2022
  • Hate is a collective expression of exclusivity toward others, and it is fostered and reproduced through false public perception. This study aims to explore the objects and issues of hate discussed in our society using text mining techniques. To this end, we collected 17,867 news articles published from 1990 to 2020, constructed a co-word network, and performed cluster analysis. To derive an explicit co-word network highly related to hate, in the preprocessing phase we split the text into sentences and extracted a total of 52,520 sentences containing the words 'hate', 'prejudice', or 'discrimination'. Analysis of word frequency in the collected news data showed that the subjects appearing most frequently in relation to hate in our society were women, race, and sexual minorities, and that the related issues were laws and crimes. Cluster analysis based on the co-word network yielded a total of six hate-related clusters. The largest cluster was 'genderphobic', accounting for 41.4% of the total, followed by 'sexual minority hatred' at 28.7%, 'racial hatred' at 15.1%, 'selective hatred' at 8.5%, 'political hatred' at 5.7%, and 'environmental hatred' at 0.3%. In the discussion, we comprehensively extracted from the collected news data all specific names of hate targets, which were not specifically revealed by the cluster analysis.
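
The preprocessing and co-word counting described in the abstract above might look roughly like the sketch below. It is a simplified illustration under stated assumptions, not the study's pipeline: it works on English text with a regex tokenizer, whereas the study presumably processed Korean news with its own tools, and the function names, stopword handling, and pair-counting scheme are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Target words named in the abstract; sentences without any of them are discarded.
TARGET_WORDS = {"hate", "prejudice", "discrimination"}


def split_sentences(text):
    """Naive sentence split on ., ! or ? followed by whitespace (an assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def co_word_counts(articles, stopwords=frozenset()):
    """Count how often two words co-occur in a sentence that mentions a target word."""
    edges = Counter()
    for article in articles:
        for sentence in split_sentences(article):
            words = set(re.findall(r"[a-z]+", sentence.lower())) - stopwords
            if not words & TARGET_WORDS:
                continue  # keep only hate-related sentences
            # Each unordered word pair in the sentence is one co-occurrence.
            for a, b in combinations(sorted(words), 2):
                edges[(a, b)] += 1
    return edges


articles = [
    "Hate targets women. The weather was fine.",
    "Discrimination against women persists.",
]
edges = co_word_counts(articles, stopwords=frozenset({"the", "was", "against"}))
print(edges.most_common(3))
```

The resulting pair counts can serve as edge weights of a co-word network, on which a clustering method of choice could then be run; the abstract does not specify which clustering algorithm was used.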

Development and Effects of a Sex Education Program with Blended Learning for University Students (대학생을 위한 블렌디드 러닝 기법의 성 교육 프로그램 개발 및 효과)

  • Kim, Il-Ok;Yeom, Gye Jeong;Kim, Mi Jeong
    • Child Health Nursing Research / v.24 no.4 / pp.443-453 / 2018
  • Purpose: This study describes the development and implementation of a sex education program with a blended learning method for university students. Methods: Sixty-eight university students were recruited to either the experimental group (n=35) or the control group (n=33). This program was developed based on the analysis, design, development, implementation, and evaluation model. The analysis phase consisted of a literature review, focus group interview, expert consultations, and target group survey. In addition, learning objectives and structure were designed, and a printed textbook, presentation slides, crossword puzzle, and debate topics were developed. In the implementation phase, the program was conducted 3 times over the course of 3 weeks. The evaluation phase involved verification of the effects of the program on sex-related knowledge, sexual autonomy, and justification of violence, as well as an assessment of satisfaction with the program. Results: The experimental group had significantly higher scores on sex-related knowledge (t=5.47, p<.001), sexual autonomy (t=2.40, p=.019), and justification of violence (t=2.52, p=.015) than the control group. Conclusion: The results indicate that this sex education program with blended learning was effective in meeting the needs of university students and can be widely used in this context.

The Acoustic Analysis of Korean Read Speech - with respect to the prosodic phrasing - (한국어 낭독체 문장의 음향분석 -바람과 햇님의 운율구 생성을 중심으로-)

  • Sung Chuljae
    • Proceedings of the KSPS conference / 1996.02a / pp.157-172 / 1996
  • This study aims to suggest a theoretical methodology for analyzing prosodic patterns in Korean read speech. Engineering efforts relevant to phonetic study have focused on the importance of prosodic phrasing, which may play a major role in analyzing a phonetic DB. Before establishing the prosodic phrase as the prosodic unit, we should describe the features of the boundary signal in a target sentence. With this in mind, the general characteristics of read speech and ToBI (Tones and Break Indices), a prosodic labelling system currently in vogue, are presented as the first step. The concrete analysis was carried out with the Korean version of the fable 'The North Wind and the Sun', in which about 25 prosodic units were identified using a perceptual approach for 5 subjects. Having established the various kinds of information that can be used to decide a boundary position systematically, we can proceed to the next step, viz. acoustic analysis of the prosodic unit. The most important task in improving the naturalness of synthetic speech may be, first, detecting the boundary signals in the speech file and then re-establishing them in the raw text.

A Study on the Analysis of Interior Coordination Trend by Semiology - oriented Process - Focused on the Analysis of determinant Theme of Exhibition - (기호체계에 의한 인테리어코디네이션 트렌드 분석 - 박람회 테마전시를 중심으로 -)

  • Yoo, Yeon-Sook;Lee, Seon-Min
    • Korean Institute of Interior Design Journal / v.20 no.1 / pp.51-60 / 2011
  • In interior design, trend analysis based on various kinds of information is approached systematically as a differentiation strategy. At present, however, trends in interior coordination are approached intuitively, without a systematic strategy or analysis system. Moreover, no specific systematic framework for interior coordination has yet been established through a logical and academic approach. This study aims to provide an overall understanding of fast-shifting trends and to offer objective data. Therefore, I adopted a semiology-oriented process as the most suitable academic system for analyzing interior coordination trends. The objects of analysis were three domestic and overseas exhibitions announced from 2007 to 2008. The analysis was based on the context and text drawn from the lifestyle and the major determinant theme of the period of each exhibition. Color, material, and texture were also arranged by the related expression system, together with topics and theme keywords. The codes derived from investigating the exhibitions can be considered for specific application in interior coordination. This study is therefore expected to help in conveying meaning and establishing methodology through a more useful objective system when designers carry out interior coordination in practice, by establishing a systematic viewpoint on interior coordination.

Proposal of Research Methodology Using The Measurement of Perception Difference

  • YANG, Hoechang
    • Journal of Wellbeing Management and Applied Psychology / v.2 no.2 / pp.39-45 / 2019
  • The purpose of this study is to address the problem, raised by many existing empirical studies, of revising or abbreviating questionnaires taken from previous studies. In addition, by presenting a methodology that can directly measure the research object, this study aims to provide a theoretical basis for a research method that has been approached in various ways. For this purpose, this study proposed a more elaborate analysis method using the differences in perception of individuals who are interested in cognitive research. Specifically, the perception gap (D) can be used as an independent variable, a dependent variable, or a moderating variable. This study also suggested an effective research approach using the measurement of perception difference. It suggested that the perception difference can serve as a measure to overcome the limitations of existing research, which used independent or mediating variables that measure only one factor of a pair such as expectation and performance or importance and satisfaction. In addition, it is highly likely that various analyses of perception differences, which result from measuring target factors for the same person, will be quite effective in situations where follow-up of respondents is difficult. This study is expected to overcome various limitations reported by empirical studies, such as the scale utilization problem and the difficulty of follow-up surveys. In future research, it is expected that the limitation of the factor derivation process in this research approach can be complemented by web crawling and text mining in big data analysis.

Identifying Similar Overseas Patent Using Word2Vec-Based Semantic Text Analytics (Word2Vec 학습을 통한 의미 기반 해외 유사 특허 검색 방안)

  • Paek, Minji;Kim, Namgyu
    • Journal of Information Technology Services / v.17 no.2 / pp.129-142 / 2018
  • Recently, the number of patent applications has been increasing rapidly every year as protecting intellectual property rights becomes more important. Patents must be inventive and novel. In particular, novelty implies that the corresponding invention is not the same as a previous invention. To confirm novelty, a prior art search must be conducted before and after the application. The target of a prior art search should include not only Korean patents but also foreign patents. Searching foreign patents should be supported by multilingual search techniques. However, a naive dictionary-based approach is limited because some technical concepts are represented by different terms in each nation. For example, a Korean term and a Japanese term may not be synonyms even though they represent the same technical concept. In this paper, we propose a new method to map semantic similarity between technical terms in Korean patents and Japanese patents. To investigate the different representations of the same technical concept in each nation, we identified and analyzed pairs of patents that are mutually connected by a priority claim relationship. By performing an experiment with real-world data, we showed that our approach can successfully reveal semantically similar technical terms in another language.