Search | Korea Science

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
- Asia Pacific Journal of Information Systems / v.21 no.1 / pp.103-122 / 2011
Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful. For this reason, some online documents are accompanied by a list of keywords specified by the authors to guide users and facilitate filtering. A set of keywords is thus often considered a condensed version of the whole document and plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could also benefit from keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, implementation is the obstacle: manually assigning keywords to all documents is a daunting, even impractical, task because it is extremely tedious and time-consuming and requires a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are two main approaches to this aim: keyword assignment and keyword extraction. Both use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its terms to the texts. In other words, the keyword assignment approach seeks to select the words from a controlled vocabulary that best describe a document. 
Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. In the latter approach, by contrast, the aim is to extract keywords according to their relevance in the text, without a prior vocabulary. Here, automatic keyword generation is treated as a classification task, and keywords are commonly extracted using supervised learning techniques: keyword extraction algorithms classify candidate keywords in a document as positive or negative examples. Several systems, such as Extractor and Kea, were developed using the keyword extraction approach. The most indicative words in a document are selected as its keywords, so keyword extraction is limited to terms that appear in the document and cannot generate implicit keywords that are absent from it. According to Turney's experimental results, about 64% to 90% of author-assigned keywords can be found in the full text of an article. Conversely, 10% to 36% of author-assigned keywords do not appear in the article and therefore cannot be generated by keyword extraction algorithms. Our preliminary experiment also shows that 37% of author-assigned keywords are not included in the full text. This is why we adopted the keyword assignment approach. In this paper, we propose a new approach to automatic keyword assignment, namely IVSM (Inverse Vector Space Model). The model is based on the vector space model, a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. 
The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating keywords that have high similarity scores. Two keyword generation systems were implemented using IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The first is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers and has been tested on a number of academic papers, including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. In our experiment, the precision of IVSM applied to the Web-based community service and to academic journals was 0.75 and 0.71, respectively. Both systems perform much better than baseline systems that generate keywords based on simple probability. IVSM also shows performance comparable to Extractor, a representative keyword extraction system developed by Turney. As the number of electronic documents increases, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
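The five steps listed in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how keyword weights are learned, so the per-keyword term weights in `keyword_vectors`, the `tokenize` helper, and the sample data are all assumptions made for the example. The `precision` function follows the abstract's description of counting matches between generated and author-assigned keywords.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (assumed preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))  # vector length of a
    norm_b = math.sqrt(sum(w * w for w in b.values()))  # vector length of b
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_keywords(document, keyword_vectors, top_n=3):
    """Rank candidate keywords by similarity to the document's term-frequency vector."""
    doc_vector = Counter(tokenize(document))
    scores = {
        keyword: cosine_similarity(doc_vector, vector)
        for keyword, vector in keyword_vectors.items()
    }
    ranked = sorted(scores, key=lambda kw: (-scores[kw], kw))
    # Keep only keywords that share at least one term with the document.
    return [kw for kw in ranked[:top_n] if scores[kw] > 0]


def precision(generated, author_assigned):
    """Share of generated keywords that also appear among the author-assigned ones."""
    if not generated:
        return 0.0
    return len(set(generated) & set(author_assigned)) / len(generated)


# Hypothetical keyword vectors: each keyword maps to weighted terms (illustrative values).
keyword_vectors = {
    "logistics": {"shipping": 2.0, "port": 1.0, "cargo": 1.0},
    "fashion": {"dress": 2.0, "style": 1.0},
    "information retrieval": {"query": 2.0, "document": 1.0, "search": 1.0},
}

document = "Port congestion delays cargo shipping across the region"
generated = assign_keywords(document, keyword_vectors)
print(generated)
print(precision(generated, ["logistics", "ports"]))
```

Because each keyword carries its own vector, a keyword can be assigned even when the keyword itself never occurs in the document, which is the property the abstract attributes to the assignment approach.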

The Equality of Keywords of Journal of KAPD with Medical Subject Headings (대한소아치과학회지의 주요어와 의학주제표목의 일치도)

Kim, Eunhee;Kim, Ahhyeon;Shim, Younsoo;Ahn, Eunsuk;Jeon, Eunyoung;An, Soyoun
- Journal of the Korean Academy of Pediatric Dentistry / v.43 no.2 / pp.123-128 / 2016
The purpose of this study is to analyze the equality between keywords used in the Journal of the Korean Academy of Pediatric Dentistry and Medical Subject Headings (MeSH). A total of 4,353 English keywords in 1,165 papers published from 1998 to 2014 were eligible for this study. We classified them according to their equality to MeSH terms, analyzed patterns of errors in using MeSH, and reviewed frequently used non-MeSH terms. Of the total keywords, 24.9% coincided completely with MeSH terms and 75.1% were not MeSH terms. The results show that the rate of accordance between keywords and MeSH terms in the Journal of the Korean Academy of Pediatric Dentistry is low. Authors therefore need to understand MeSH more specifically and accurately. Using proper keywords aligned with international standards such as MeSH is important for a paper to be properly cited. Authors should pay attention to, and be educated on, the correct use of MeSH terms as keywords.
https://doi.org/10.5933/JKAPD.2016.43.2.123
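The accordance rate reported in the abstract above is a share of keywords that match a controlled vocabulary exactly. A rough sketch of that calculation follows. The vocabulary and keyword lists are small illustrative placeholders rather than the study's data, and ignoring case and extra whitespace is an assumption, since the abstract does not state how "completely coincident" was judged.

```python
def normalize(term):
    """Lowercase a term and collapse runs of whitespace (assumed matching rule)."""
    return " ".join(term.lower().split())


def match_rate(keywords, vocabulary):
    """Return the share of keywords found in the vocabulary, plus the unmatched ones."""
    vocab = {normalize(term) for term in vocabulary}
    unmatched = [kw for kw in keywords if normalize(kw) not in vocab]
    rate = (len(keywords) - len(unmatched)) / len(keywords) if keywords else 0.0
    return rate, unmatched


# Illustrative placeholder data, not taken from the study.
vocabulary = ["Dental Caries", "Tooth, Deciduous", "Pediatric Dentistry"]
keywords = ["dental caries", "Primary teeth", "Pediatric  dentistry", "Space maintainer"]

rate, unmatched = match_rate(keywords, vocabulary)
print(f"{rate:.1%} matched; not in vocabulary: {unmatched}")
```

Listing the unmatched keywords alongside the rate mirrors the study's second step of reviewing frequently used non-MeSH terms.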

Open versus closed reduction of mandibular condyle fractures : A systematic review of comparative studies

Kim, Jong-Sik;Seo, Hyun-Soo;Kim, Ki-Young;Song, Yun-Jung;Kim, Seon-Ah;Hong, Soon-Min;Park, Jun-Woo
- Journal of the Korean Association of Oral and Maxillofacial Surgeons / v.34 no.1 / pp.99-107 / 2008
Objective: The objective of this review was to provide reliable comparative results regarding the effectiveness of interventions, either open or closed, that can be used in the management of fractured mandibular condyles. Patients and Methods: Studies published since 1990 were searched in MEDLINE and Cochrane using controlled vocabulary terms. The MeSH terms were "Mandibular condyle" AND "Fractures, bone". Only comparative studies were considered, selected using the "limit" function. According to the criteria, two review authors independently assessed the abstracts of the studies returned by the searches. The studies were divided according to some criteria, and the following were measured: ramus height, sagittal condyle displacement, condyle displacement on Towne's image, maximum open length, protrusion and lateral excursion, TMJ pain, malocclusion, and TMJ disorder. Results: Many studies were analyzed to review the post-operative results of the two treatment methods. Ramus height decreased more with closed reduction than with open reduction. Sagittal condyle displacement was greater with closed reduction, as was condyle displacement on Towne's image. Maximum open length was lower with closed reduction. Protrusive and lateral movement was smaller with closed reduction than with ORIF. Closed reduction showed a greater occurrence of malocclusion than ORIF. However, post-operative pain and discomfort were greater with ORIF. Conclusion: In almost all categories, ORIF showed better results than CRIF. However, the use of open reduction should be weighed carefully because of the potential surgical morbidity and the increased hospitalization time and cost. Endoscopic surgical techniques for ORIF (EORIF) are still in their infancy; their specific aims are to eliminate concern about damage to the facial nerve and to reduce or eliminate facial scars. Before any type of treatment is performed, patients must be informed of both treatment methods, and the best treatment method should be chosen with their consent.

A Preliminary Study on Extending OAK Metadata for Research Data (연구데이터 관리를 위한 OAK 메타데이터 확장 방안 연구)

Lee, Mihwa;Lee, Eun-Ju;Rho, Jee-Hyun
- Journal of Korean Library and Information Science Society / v.51 no.3 / pp.27-51 / 2020
This study proposes extended OAK metadata for describing research data in OAK, an open access repository of the National Library of Korea. The research methods were a literature review, case studies, and interviews with related parties. The method of extending the existing OAK metadata for research data was derived as follows. First, in modeling research data, the collection > item > file structure is maintained: the collection is placed as a higher-level group under which research data can be grouped, and an item combines metadata with files or digital objects of various formats. Second, by mapping metadata standards and the practices of case organizations to the existing OAK metadata, the elements judged necessary for research data were selected and added to the existing OAK metadata. Third, a controlled vocabulary and syntax are also proposed so that the structured data can be used for search and for later statistics. Extending the OAK metadata to describe research data allows research data produced in Korea to be officially stored and used, which provides a basis for preventing duplicated research and for sharing and reusing research results nationally.
https://doi.org/10.16981/kliss.51.3.202009.27

Personalized Recommendation System for IPTV using Ontology and K-medoids (IPTV환경에서 온톨로지와 k-medoids기법을 이용한 개인화 시스템)

Yun, Byeong-Dae;Kim, Jong-Woo;Cho, Yong-Seok;Kang, Sang-Gil
- Journal of Intelligence and Information Systems / v.16 no.3 / pp.147-161 / 2010
With the recent convergence of broadcasting and communication, communication has been joined to TV, and TV viewing has changed in many ways. IPTV (Internet Protocol Television) provides information services, movie content, broadcasts, and so on over the Internet, combining live programs with VOD (video on demand). Because it uses the communication network, it has become a new business issue. New technical issues have also arisen, such as imaging technology for the service, networking technology that avoids video interruptions, and security technologies to protect copyright. Through the IPTV network, users can watch the programs they want when they want. However, finding programs on IPTV is difficult, whether through search or through menus. The menu approach takes a long time to reach the desired program. The search approach fails when the title, genre, names of actors, and so on are not known, and entering letters with a remote control is cumbersome. A bigger problem is that users are often not aware of the services available to them. Thus, to resolve the difficulty of selecting VOD services in IPTV, a personalized recommendation service is proposed, which enhances users' satisfaction and uses their time efficiently. To address these shortcomings of IPTV, this paper provides programs suited to each individual through a filtering and recommendation system, saving the user's time. The proposed recommendation system collects TV program information, the user's preferred program genres and detailed genres, channels, watched programs, and viewing times, based on individual IPTV viewing records. Similarities among programs are compared using an ontology for TV programs, because the distance between programs can be measured by comparing similarity. The TV program ontology we use is extracted from TV-Anytime metadata, which represents semantic nature. The ontology also expresses contents and features numerically. Vocabulary similarity is determined through WordNet: all the words describing the programs are expanded into upper and lower classes to decide word similarity, and the average over the described keywords is measured. Based on the calculated distance, similar programs are grouped by the K-medoids partitioning method, which divides objects into groups with similar characteristics. The K-medoids method sets K representative objects; each object is temporarily assigned to a cluster according to its distance from the representative objects. When the algorithm divides the initial n objects into K groups, the optimal representative objects must be found through repeated trials after representative objects are selected temporarily. Through this process, similar programs are clustered. When programs are selected through cluster analysis, weights should be given to the recommendations, as follows. When each group recommends programs, similar programs near the representative objects are recommended to users. The distance is calculated with the same formula used to measure similarity distance, and it serves as the basic figure that determines the ranking of recommended programs. A weight is calculated from the number of programs in the watching list: the more programs, the higher the weight. This is defined as the cluster weight. Through this, the TV programs that are representative of the groups are selected, and the final ranks of the TV programs are determined. However, the group-representative TV programs include errors, so weights based on TV program viewing preference are added to determine the final ranks. Based on these ranks, the proposed method recommends the content that users prefer. An experiment with the proposed method was carried out in a controlled environment, and it shows the superiority of the proposed method over existing ones.
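The K-medoids step described in the abstract above can be sketched as a simple alternating procedure: assign each object to its nearest representative object, then re-pick each cluster's representative, and repeat until nothing changes. This is a generic illustration under stated assumptions, not the paper's algorithm: it takes a precomputed distance matrix (the paper derives distances from ontology-based similarity, which is not reproduced here), uses the first K objects as hypothetical starting medoids unless others are given, and omits the cluster weights and preference weights.

```python
def assign_to_medoids(distance, medoids):
    """Group every object index under its nearest medoid."""
    clusters = {m: [] for m in medoids}
    for i in range(len(distance)):
        nearest = min(medoids, key=lambda m: distance[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(distance, k, initial_medoids=None, max_iter=100):
    """Alternate assignment and medoid update on a square distance matrix."""
    medoids = list(initial_medoids) if initial_medoids is not None else list(range(k))
    for _ in range(max_iter):
        clusters = assign_to_medoids(distance, medoids)
        new_medoids = []
        for medoid, members in clusters.items():
            if not members:
                # Keep a medoid that attracted no members (possible with duplicate objects).
                new_medoids.append(medoid)
                continue
            # The new medoid is the member with the smallest total distance to the others.
            best = min(members, key=lambda c: sum(distance[c][j] for j in members))
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break  # representatives stopped changing
        medoids = new_medoids
    return sorted(medoids), assign_to_medoids(distance, medoids)


# Toy data: six "programs" placed on a line; distance is the absolute difference.
positions = [0, 1, 2, 10, 11, 12]
distance = [[abs(a - b) for b in positions] for a in positions]

medoids, clusters = k_medoids(distance, k=2)
print(medoids)
print(clusters)
```

Unlike k-means, each representative is an actual object from the data, so the recommendation step can rank real programs by their distance to a cluster's representative program.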
