Integrated Search | Korea Science

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model

조원진;노상규;윤지영;박진수
- Asia pacific journal of information systems, Vol. 21, No. 1, pp. 103-122, 2011
Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful. For this reason, some online documents are accompanied by a list of keywords specified by the authors to guide users and ease the filtering process. A set of keywords is thus often considered a condensed version of the whole document and therefore plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could also benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, implementation is the obstacle: manually assigning keywords to all documents is a daunting, even impractical, task because it is extremely tedious and time-consuming and requires a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are two main approaches to this aim: keyword assignment and keyword extraction. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its terms to the texts. In other words, the keyword assignment approach seeks to select the words from a controlled vocabulary that best describe a document.
Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. In the latter approach, by contrast, the aim is to extract keywords according to their relevance in the text, without a prior vocabulary. In this approach, automatic keyword generation is treated as a classification task, and keywords are commonly extracted using supervised learning techniques. Thus, keyword extraction algorithms classify candidate keywords in a document into positive or negative examples. Several systems, such as Extractor and Kea, were developed using the keyword extraction approach. The most indicative words in a document are selected as keywords for that document, and as a result keyword extraction is limited to terms that appear in the document. Therefore, keyword extraction cannot generate implicit keywords that are not included in a document. According to the experimental results of Turney, about 64% to 90% of keywords assigned by the authors can be found in the full text of an article. Conversely, 10% to 36% of the keywords assigned by the authors do not appear in the article and cannot be generated by keyword extraction algorithms. Our preliminary experiment also shows that 37% of keywords assigned by the authors are not included in the full text. This is why we decided to adopt the keyword assignment approach. In this paper, we propose a new approach to automatic keyword assignment, namely IVSM (Inverse Vector Space Model). The model is based on the vector space model, a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets.
The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on the term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating keywords that have high similarity scores. Two keyword generation systems were implemented applying IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The first is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers, and it has been tested on a number of academic papers, including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. According to our experiment, the precision of IVSM applied to the Web-based community service and to academic journals was 0.75 and 0.71, respectively. Both systems perform much better than baseline systems that generate keywords based on simple probability. IVSM also shows performance comparable to that of Extractor, a representative keyword extraction system developed by Turney. As electronic documents increase, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
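The five steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how keyword weights are learned or how text is preprocessed, so the term-weight vectors in `KEYWORD_VECTORS`, the regular-expression tokenizer, the `top_k` cutoff, and all function names are assumptions made only for illustration. The `precision` helper follows the abstract's description of counting matches between generated and author-assigned keywords.

```python
import math
import re
from collections import Counter

# Hypothetical term-weight vectors, one per candidate keyword (assumed data;
# the abstract does not describe how these weights are obtained).
KEYWORD_VECTORS = {
    "logistics": {"shipping": 2.0, "port": 1.0, "cargo": 1.5},
    "fashion": {"style": 2.0, "clothing": 1.5},
    "information retrieval": {"query": 2.0, "document": 1.0, "search": 1.5},
}


def vector_length(weights):
    """Euclidean length of a sparse vector (steps 1 and 3)."""
    return math.sqrt(sum(w * w for w in weights.values()))


def tokenize(text):
    """Very simple preprocessing (step 2): lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(doc_tf, keyword_weights):
    """Cosine similarity between a document and one keyword vector (step 4)."""
    dot = sum(doc_tf.get(term, 0) * w for term, w in keyword_weights.items())
    denom = vector_length(doc_tf) * vector_length(keyword_weights)
    return dot / denom if denom else 0.0


def assign_keywords(text, keyword_vectors, top_k=2):
    """Return up to top_k keywords with the highest non-zero similarity (step 5)."""
    doc_tf = Counter(tokenize(text))  # term frequencies of the target document
    scores = {
        kw: cosine_similarity(doc_tf, weights)
        for kw, weights in keyword_vectors.items()
    }
    ranked = sorted(scores, key=lambda kw: scores[kw], reverse=True)
    return [kw for kw in ranked[:top_k] if scores[kw] > 0]


def precision(generated, author_assigned):
    """Share of generated keywords that match author-assigned keywords."""
    if not generated:
        return 0.0
    return len(set(generated) & set(author_assigned)) / len(generated)


if __name__ == "__main__":
    sample = "The port handles cargo shipping; shipping volume at the port grew."
    print(assign_keywords(sample, KEYWORD_VECTORS))
    print(precision(["logistics", "fashion"], ["logistics", "port"]))
```

Because the candidate keywords come from a predefined set rather than from the document's own tokens, a keyword such as "logistics" can be assigned even though that word never occurs in the sample text, which is the property the abstract gives as the reason for choosing assignment over extraction.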

Analysis of Surveys to Determine the Real Prices of Ingredients used in School Foodservice

이서현;이민아;유재윤;김상효;김수연;이호진
- Korean Journal of Community Nutrition (대한지역사회영양학회지), Vol. 26, No. 3, pp. 188-199, 2021
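This study was conducted to identify how nutrition teachers and dietitians currently survey food ingredient prices and what they want from such surveys in the future. From a population of 11,818 schools nationwide, about 10% were selected by proportional stratified sampling. Questionnaires were distributed to 1,158 schools, and the 439 returned (response rate 37.9%) were analyzed with SPSS (ver. 25.0). Satisfaction with the current market price survey method was low for processed foods (3.26 points), local ingredients (3.33 points), fishery products (3.34 points), fruits (3.37 points), and vegetables and tubers (3.38 points); apart from processed foods, these categories, including local ingredients, consist mostly of agricultural products. The time spent surveying market prices was, in descending order, 7.94 hours for processed foods, 6.94 hours for vegetables and tubers, 6.25 hours for eco-friendly ingredients, 6.24 hours for local ingredients, 6.16 hours for fruits, and 5.75 hours for fishery products, which overlaps with the ingredient types showing low satisfaction. The ingredients for which schools most often carried out price surveys themselves were processed foods (227 schools, 52.9%), fruits (168 schools, 38.5%), special crops (165 schools, 37.8%), fishery products (164 schools, 38.1%), vegetables and tubers (160 schools, 36.8%), eggs (144 schools, 33.3%), and pickled and processed products (150 schools, 37.9%). When current survey methods were compared between elementary schools and middle and high schools, elementary schools most often reported using data surveyed by a task force team (TFT) organized by the school foodservice support center for food crops, vegetables and tubers, eco-friendly ingredients, and local ingredients, whereas middle and high schools more often surveyed jointly with neighboring schools or surveyed prices themselves. As for the preferred future method, elementary schools showed high demand for data surveyed by a TFT organized by the school foodservice support center for food crops, vegetables and tubers, special crops, fruits, eggs, fishery products, eco-friendly ingredients, and local ingredients, and high demand for survey data commissioned by the office of education from an external specialized agency for livestock products, processed foods, and pickled and processed products. Middle and high schools showed high demand, for every ingredient category, for data surveyed by a TFT organized by the (district) office of education. Schools therefore appear to want to use the time and cost of market price surveys more efficiently, through credible external agencies or cooperation with nutrition teachers and dietitians in the same (district) office of education, rather than surveying prices themselves. Based on these results, the survey method needs to be improved, for example through collective surveys by a team organized by the metropolitan or provincial office of education or the district office of education, surveys and analyses based on certified data, or commissioning an external specialized agency. Suppliers that provide ingredients in bulk, such as regional school foodservice support centers, also need to disclose or share market prices at least for items whose prices are difficult to compare externally. In particular, because prices cannot be evaluated for eco-friendly and local ingredients, for which no comparative data exist on the specifications and unit prices actually delivered to school foodservice, a website should first be set up to manage in an integrated way the results surveyed by metropolitan and provincial offices of education or the price information provided by school foodservice support centers. In the future, the number of school foodservice ingredient items covered by the website should be increased, and, to maintain information quality, it would be ideal for a credible institution to operate the website directly or to commission its operation. The results of this study show the current state of market price data collection for school foodservice ingredients and are expected to serve as basic data for policies on collecting and disclosing those prices.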
https://doi.org/10.5720/kjcn.2021.26.3.188

A Study on the Forest Land System in the Yi Dynasty

이만우
- Journal of Korean Society of Forest Science (한국산림과학회지), Vol. 22, No. 1, pp. 19-48, 1974
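The Jeonsigwa (田柴科) system of the Goryeo dynasty, which professed the principle of state ownership of land and took "mutual benefit of public and private" as its basic principle, collapsed before long as central authority weakened. As for forest land, extensive occupation of forests that exploited the freedom to establish graves and the encouragement of land reclamation, together with private occupation of forests tied to private tax-collection rights (私的收租權) derived from the transfer of tax-collection rights over firewood land (柴地) under the Jeonsigwa system, gradually developed, so that after the period of lax state administration in mid-Goryeo most forests had become the privately occupied land of the powerful class. The Yi dynasty, which inherited all Goryeo institutions as they were, urgently needed after its founding to establish a system of forest expropriation in order to secure forests for state use. It therefore placed the forests of the whole country under public control by state power and legislated the "prohibition of private occupation of firewood grounds" (柴場私占禁止), under which private occupation of all forests other than those needed by the state and the royal household was forbidden. It designated the four mountains (四山) around the capital as forbidden mountains (geumsan, 禁山), selected natural forests of good stand condition as provincial forbidden mountains (外方禁山) to secure timber for warships and transport ships and for royal buildings, and posted forest guards (sanjik, 山直) to protect them. It also expropriated forests as the need arose, for example by establishing grounds for military training and royal hunting (講武場), firewood grounds for official use (官用柴場), and bans on felling and fire in forests attached to royal tombs, but royal authority could not bring under public control all the forests that the powerful class had occupied privately since Goryeo. Most of the ruling class in the early Yi dynasty had been the powerful class under Goryeo and already held many privately occupied forests from the previous dynasty. As long as they held power, bringing those forests under public control was difficult, and they instead used their power to enlarge them. Princes, too, were occupying large tracts of forest, mainly around the capital, under the pretext of grave sites, and from the reign of King Seongjong onward the kings themselves granted forests to princes in violation of the prohibition. By the end of the 16th century, private occupation of forests by princes had therefore spread to distant provinces, and private occupation by powerful officials who took advantage of this spread across the whole country. The allotment of firewood grounds to princes (柴場折給), which began after the Imjin War (the Japanese invasions of 1592), legalized the inheritance and sale of forests and thereby gave rise to private forests under the feudal system. Powerful officials thus also came to occupy forests legally. The prohibition of private occupation of firewood grounds, the basis of the Yi dynasty forest land system, was therefore never actually enforced from the founding of the dynasty and was no more than a legal fiction that legitimized state expropriation of forests.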
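Because the rules for their use and management were inadequate, the forbidden mountains (geumsan, 禁山) were resented by the people living below them, and deliberate destruction by residents was severe from the outset. The state repeatedly designated additional forbidden mountains to secure timber forests, but since the management system was not improved, their devastation only increased. To stabilize the political situation, King Yeongjo consolidated the royal decrees and statutes that had proliferated since the Gyeongguk Daejeon (經國大典) and compiled the Sok Daejeon (續大典). He also put the forest statutes in order, strengthening the protection of the reserved mountains (bongsan, 封山), the name that since 1699 had replaced "geumsan" for the forbidden mountains around the capital and in the provinces, while recognizing rights over privately tended mountains (sayangsan, 私養山), and thus attempted an active afforestation policy. However, the continued expansion of private forest occupation by the powerful class took forests away from the peasants, and once the peasants had lost their forests, the state's measures to encourage afforestation had no effect. With the extensive occupation of land for graves under the lax state administration after the Imjin War, the allotment of forests to princes, and the issuance of certificates (立案文書) authorizing private forest occupation by the powerful, the legal code's provision prohibiting private occupation of forests became a dead letter, and in the late Yi dynasty the forcible seizure of privately tended mountains was also frequent. The prohibition thus applied in practice only to peasants. As a result, peasants lost the will to grow forests, and exploitative harvesting devastated forbidden mountains, reserved mountains, and privately tended mountains alike. Only through the activities of pine associations (songgye, 松契), which resisted the seizure of forests by the powerful class, were some public mountains preserved as peasants' commons. Nevertheless, Imperial Japan treated the forests of the late Yi dynasty as if almost all of them were ownerless public mountains, invoked the already defunct prohibition of private forest occupation to seize them as national forests, and used them for colonial policy. In reality, apart from officially restricted land such as forbidden mountains, reserved mountains, and forests attached to royal tombs, and apart from remote backcountry forests, most forest land in the Yi dynasty was privately owned or privately occupied by the powerful class.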