Search | Korea Science

A Study on Automatic Extraction of Core Sentences from Document using Word Cooccurrence Graph (단어의 공기 관계 그래프를 이용한 문서의 핵심 문장 추출에 관한 연구)

Ryu, Je;Han, Kwang-Rok;Sohn, Seok-Won;Rim, Kee-Wook
- The Transactions of the Korea Information Processing Society / v.7 no.11 / pp.3427-3437 / 2000
In this paper, we propose a method of core sentence extraction using a word cooccurrence graph in order to summarize a document. For automatic extraction of core sentences, we construct a mean cluster from the word cooccurrence graph and find the insistence, which corresponds to the purpose of the author. We then extract keywords by using the relationship between the mean cluster and the insistence. Finally, core sentences are selected based on the keywords and the insistence. The results are evaluated by comparison with manual extraction and show that extraction performance improves by about 10%.
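The abstract above does not specify how the graph is built or how sentences are scored, so the following is only a minimal sketch of the general idea of ranking sentences with a word co-occurrence graph. It is not the authors' method: the mean cluster and insistence steps are omitted, and the function name `rank_sentences`, the sentence-level co-occurrence window, and the scoring rule (sum of weighted node degrees) are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def rank_sentences(sentences, top_k=1):
    """Rank tokenized sentences with a simple word co-occurrence graph.

    sentences: list of token lists (tokenization is assumed to be done already).
    Returns the indices of the top_k highest-scoring sentences.
    """
    # Edge weight = number of sentences in which two distinct words co-occur.
    weight = defaultdict(int)
    for tokens in sentences:
        for a, b in combinations(sorted(set(tokens)), 2):
            weight[(a, b)] += 1

    # Node strength = sum of the weights of all edges touching the word.
    strength = defaultdict(int)
    for (a, b), w in weight.items():
        strength[a] += w
        strength[b] += w

    # Sentence score = total strength of its distinct words (hypothetical rule).
    scores = [sum(strength[t] for t in set(tokens)) for tokens in sentences]

    # Sort by descending score; ties keep document order.
    order = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return order[:top_k]
```

Because the score is a plain sum, this sketch favors longer sentences; a real system would likely normalize by length or restrict scoring to selected keywords.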

Concept-Based Method for Noun Phrase Indexing Using Syntactic Analysis and Co-occurrence Information (구문분석과 공기정보를 이용한 개념 기반 명사구 색인 방법)

Lee, Hyun-A;Lee, Jong-Hyeok;Lee, Geun-Bae
- Annual Conference on Human and Language Technology / 1995.10a / pp.3-7 / 1995
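Existing methods for noun phrase indexing in Korean have mainly relied on simple rules and, as a result, have failed to extract all the noun phrases present in a sentence. To solve this problem, this paper proposes a concept-based noun phrase indexing method. Because a sentence consists of one or more concepts, noun phrase extraction should take those concepts into account. Syntactically, a sentence consists of one or more embedded clauses. In general, the concepts expressed by the terms within a single embedded clause are highly related to one another. The conceptual heterogeneity of a sentence can therefore be reduced to the conceptual heterogeneity of its embedded clauses. To split a sentence into embedded clauses, we use syntactic analysis based on dependency grammar together with co-occurrence information. In particular, co-occurrence information helps determine long distance dependencies and thereby identify which terms belong to the same embedded clause. Noun phrases are then extracted using the dependency relations within each embedded clause.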

Calculation of similarity by weighting title and summary in word co-occurrence of research reports (연구 보고서의 공기관계 정보에 제목 및 요약의 가중치를 적용한 유사도 계산)

Kim, Nam-Hun;Joo, Jong-Min;Park, Hyuk-Ro;Yang, Hyung-Jeong
- Proceedings of The KACE / 2017.08a / pp.37-40 / 2017
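This paper proposes a similarity calculation method for national research reports that applies weights to the title and summary in addition to co-occurrence information. To this end, we extracted text from national R&D reports, split each document into sentences, removed both general stopwords and stopwords characteristic of such reports, performed morphological analysis, and then extracted co-occurrence relations. To improve the accuracy of the document similarity calculation, we also assigned weights to the title and summary sections. The proposed method improved retrieval performance by 2.5% over a method using the document search library Lucene and by 1.1% over the Knn-heuristic method. These results show that the summary, the title, and co-occurrence information all affect the similarity calculation for research reports.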

Word Sense Disambiguation Method Using Co-occurrence Information (공기정보를 이용한 단어 의미 중의성 해결 방안)

Park, Yo-Sep;Kim, Gyeong-Im;Park, Hyuk-Ro
- Annual Conference on Human and Language Technology / 2010.10a / pp.177-178 / 2010
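Word sense disambiguation is a major area of interest in natural language processing. For Korean, research on word sense disambiguation is less developed than for other languages. In previous work, methods based on frequency-based co-occurrence information vectors left some cases unhandled. Difficulties also arose in dictionary-based hypernym extraction when definitions were not in a regular form. In this paper, we add mutual information to minimize the errors that occur during co-occurrence information processing. We also apply a lexical knowledge base to solve the problem of extracting hypernyms of the target noun.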

Laboratory Tests for Trichloroethylene (TCE) and Toluene Remediation in Soil Using Soil Vapor Extraction (토양증기추출(Soil Vapor Extraction)을 이용한 토양 내 Trichloroethylene (TCE)과 Toluene정화 실험)

이민희;강현민
- Economic and Environmental Geology / v.35 no.3 / pp.221-227 / 2002
Column experiments were performed to evaluate the removal efficiency of soil vapor extraction (SVE) for TCE (trichloroethylene) and toluene in soil. Homogeneous Ottawa sands and real soils collected from a contaminated area were used to investigate the effect of soil properties and SVE operating conditions on the removal efficiency. In column tests with two different sizes of Ottawa sand, the maximum effluent TCE concentration was 442 mg/L in the coarse sand column and 337 mg/L in the fine sand column. However, after flushing with 20 liters of gas, the effluent concentrations were very similar and more than 90% of the initial TCE mass had been removed from the columns. For two real contaminated soil columns, the maximum effluent concentration was 50% lower than in the homogeneous Ottawa coarse sand column, but 99% of the initial TCE mass was extracted from the columns within 40 liters of air flushing, suggesting that SVE is highly effective for removing volatile NAPLs from contaminated soil. To investigate the effect of contaminant residence time on the removal efficiency, an Ottawa sand column was left undisturbed for one week after TCE injection before gas extraction was applied. Its effluent concentration trend was very similar to those of the other Ottawa sand columns, except that the residual TCE after air flushing was relatively high. Column tests with different water contents were also performed, and the results showed high removal efficiency even in a sand column with high water content. Toluene, one of the BTEX compounds, was tested in an Ottawa sand column and a real soil column. Removal trends were similar to those in the TCE-contaminated columns, and more than 98% of the initial toluene mass was removed by SVE in both columns.

Text Extraction and Summarization from Web News (웹 뉴스의 기사 추출과 요약)

Han, Kwang-Rok;Sun, Bok-Keun;Yoo, Hyoung-Sun
- Journal of the Korea Society of Computer and Information / v.12 no.5 / pp.1-10 / 2007
Many types of information provided through the web, including news content, contain unnecessary clutter. This clutter makes it difficult to build automated information processing systems for tasks such as document summarization, extraction, and retrieval. We propose a system that extracts and summarizes news content from the web. The extraction system receives news content in HTML as input, builds an element tree similar to a DOM tree, and extracts text from the element tree while removing clutter that carries the hyperlink attribute in its HTML tag. The extracted text is passed to the summarization system, which extracts key sentences from it. We implement the summarization system using a co-occurrence relation graph. The summarized sentences are expected to be deliverable to PDAs or cellular phones through message services such as SMS.

A Study on the Selection of Fire Detectors for the Control Room Floor of Nuclear Power Plants (원전 제어실 바닥 화재감지기 선정에 관한 연구)

구철수;임장현
- Proceedings of the Korean Nuclear Society Conference / 1996.05d / pp.543-548 / 1996
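To select the most suitable fire detector for the floor of the main control room in domestic pressurized water reactor nuclear power plants, we comprehensively reviewed the domestic and foreign laws and regulatory requirements on fire detector design and installation that are applied in nuclear power plant reviews and inspections. Using a test apparatus that simulates the main control room floor environment as closely as possible, we then compared the operating performance of a linear heat detector and an air sampling smoke detector. The tests confirmed that, in a cable fire, the air sampling smoke detector is superior to the fixed-temperature linear detector in sensitivity and response characteristics.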

Desorption of Organic Compounds from the Simulated Soils by Soil Vapor Extraction (인공토양으로부터 토양증기추출법에 의한 유기화합물의 탈착 현상에 관한 연구)

이병환;이현주;이종협
- Proceedings of the Korean Society of Soil and Groundwater Environment Conference / 1998.06a / pp.22-26 / 1998
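Soil vapor extraction (SVE), one of the methods for remediating contaminated soil, is a process that removes contaminants by supplying vacuum or pressurized air to contaminated soil to induce a continuous air flow, which promotes the evaporation of hazardous compounds remaining in the soil pores. Among the factors that affect the efficiency of SVE, this study examined, through experiments and calculations, how soil water content and the type of contaminant influence the removal efficiency. Glass bead, sand, and molecular sieve were used as simulated soils, and toluene, methyl ethyl ketone, and trichloroethylene were used as contaminants. Calculations for each experiment, taking into account the Freundlich isotherm and a pore diffusion model, showed that the contaminant removal rate is governed by desorption when no water is present and by interparticle diffusion when water is present. These results can serve as basic data for designing appropriate SVE-based remediation for target sites.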

The Study on the Model of Extracting Collocations from Corpus in Korean Using the Statistical Tools (통계 기법을 이용한 연어 추출 모형 연구)

Ahn, Sung-Min
- Annual Conference on Human and Language Technology / 2010.10a / pp.162-165 / 2010
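Among the phrasal information that appears through co-occurrence, research on collocations can contribute greatly to the development of applied linguistics. A collocation is a phrasal construction with a high co-occurrence probability in which the combination of lexical items is restricted. Interest in research on such collocational constructions is growing, particularly in fields such as machine translation and lexicography. This study proposes the use of several statistical techniques, including the t-test, mutual information, and conditional probability, to extract collocations. We examined how collocation extraction changes when each technique is applied and sought the most appropriate technique, with the aim of suggesting a direction for future collocation extraction.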
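The three statistics named in the abstract above can be illustrated on adjacent word pairs with the standard textbook formulas. This is a minimal sketch under stated assumptions, not the study's model: the function name `bigram_stats` is hypothetical, the corpus is assumed to be a flat list of tokens, only adjacent pairs are considered, and the token count is used as the sample size for both unigram and bigram probabilities.

```python
import math
from collections import Counter


def bigram_stats(tokens):
    """Compute collocation statistics for every adjacent word pair.

    Returns {(w1, w2): (pmi, t_score, cond_prob)} where
      pmi       = log2( P(w1, w2) / (P(w1) * P(w2)) )   pointwise mutual information
      t_score   = (P(w1, w2) - P(w1) * P(w2)) / sqrt(P(w1, w2) / n)
      cond_prob = count(w1, w2) / count(w1)              estimate of P(w2 | w1)
    """
    n = len(tokens)  # sample size (approximation: also used for bigrams)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    stats = {}
    for (w1, w2), c12 in bigrams.items():
        p1 = unigrams[w1] / n
        p2 = unigrams[w2] / n
        p12 = c12 / n
        pmi = math.log2(p12 / (p1 * p2))
        # Sample variance of a rare Bernoulli event is approximated by p12.
        t_score = (p12 - p1 * p2) / math.sqrt(p12 / n)
        cond_prob = c12 / unigrams[w1]
        stats[(w1, w2)] = (pmi, t_score, cond_prob)
    return stats
```

For example, `bigram_stats("strong tea strong tea weak coffee strong tea".split())` gives the pair `("strong", "tea")` a positive PMI and a conditional probability of 1.0, while the reversed pair `("tea", "strong")` gets a negative PMI. On a toy corpus the three measures agree, but on real data they rank candidates differently: PMI tends to favor rare pairs, while the t-score favors frequent ones.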

An Experimental Study on an Effective Word Sense Disambiguation Model Based on Automatic Sense Tagging Using Dictionary Information (사전 정보를 이용한 단어 중의성 해소 모형에 관한 실험적 연구)

Lee, Yong-Gu;Chung, Young-Mee
- Journal of the Korean Society for Information Management / v.24 no.1 s.63 / pp.321-342 / 2007
This study presents an effective word sense disambiguation model that does not require a manual sense tagging process, because it automatically tags the right sense using machine-readable dictionary information-based and collocation co-occurrence-based methods. The dictionary information-based method, which applied multiple feature selection, showed a tagging accuracy of 70.06%, and the collocation co-occurrence-based method 56.33%. The sense classifier using the dictionary information-based tagging method showed a classification accuracy of 68.11%, and the one using the collocation co-occurrence-based tagging method 62.09%. The combined tagging method, which applies a data fusion technique, achieved a higher performance of 76.09%, resulting in a classification accuracy of 76.16%.
https://doi.org/10.3743/KOSIM.2007.24.1.321

