Search | Korea Science

Adjusting Weights of Single-word and Multi-word Terms for Keyphrase Extraction from Article Text

Kang, In-Su
- Journal of the Korea Society of Computer and Information / v.26 no.8 / pp.47-54 / 2021
Keyphrase extraction automatically extracts the words or phrases that topically represent the content of a given document. Unsupervised approaches first extract candidate words or phrases from the input document, then compute a score for each candidate, and finally select keyphrases based on those scores. For the scoring step, this study proposes adjusting candidate scores according to candidate type: word-type or phrase-type. To do so, the type-token ratios of word-type and phrase-type candidates, as well as the information content of high-frequency word-type and phrase-type candidates, are collected from the input document and used to adjust the candidate scores. In experiments on four keyphrase extraction evaluation datasets constructed from full-text English articles, the proposed method outperformed a baseline method and comparison methods on three of the datasets.
https://doi.org/10.9708/jksci.2021.26.08.047
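The type-token ratio mentioned in the abstract above can be illustrated with a minimal sketch. It only shows how such a ratio might be computed separately for word-type and phrase-type candidates; it is not the paper's scoring method. The function names and the sample candidate list are hypothetical.

```python
def type_token_ratio(candidates):
    """Return distinct candidates (types) divided by total occurrences (tokens)."""
    if not candidates:
        return 0.0
    return len(set(candidates)) / len(candidates)


def split_by_type(candidates):
    """Separate single-word candidates from multi-word (phrase) candidates."""
    words = [c for c in candidates if len(c.split()) == 1]
    phrases = [c for c in candidates if len(c.split()) > 1]
    return words, phrases


# Hypothetical candidate occurrences extracted from one document.
candidates = [
    "keyphrase", "extraction", "keyphrase extraction", "score",
    "keyphrase", "candidate score", "keyphrase extraction", "score",
]
words, phrases = split_by_type(candidates)
word_ttr = type_token_ratio(words)      # 3 distinct words over 5 occurrences
phrase_ttr = type_token_ratio(phrases)  # 2 distinct phrases over 3 occurrences
```

How these two ratios are then combined with candidate scores is specific to the paper and is not reproduced here.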

A Study on Automatic Extraction of Core Sentences from Document using Word Cooccurrence Graph (단어의 공기 관계 그래프를 이용한 문서의 핵심 문장 추출에 관한 연구)

Ryu, Je;Han, Kwang-Rok;Sohn, Seok-Won;Rim, Kee-Wook
- The Transactions of the Korea Information Processing Society / v.7 no.11 / pp.3427-3437 / 2000
In this paper, we propose a method of core sentence extraction using a word co-occurrence graph in order to summarize a document. For automatic extraction of core sentences, we construct a mean cluster from the word co-occurrence graph and find the insistence that corresponds to the author's purpose. We then extract keywords using the relationship between the mean cluster and the insistence. Finally, core sentences are selected based on the keywords and insistences. The results are evaluated by comparison with manual extraction and show that extraction performance improves by about 10%.
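A word co-occurrence graph of the kind named in the abstract above can be sketched as weighted edges between words that appear in the same sentence. This is an illustrative assumption about the graph's construction, not the authors' procedure. The mean cluster and insistence steps are not modeled, and the function name and sample sentences are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Count, for each unordered word pair, the sentences containing both words."""
    graph = defaultdict(int)
    for sentence in sentences:
        # Sort the distinct words so each pair has one canonical (a, b) key.
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            graph[(a, b)] += 1
    return dict(graph)


# Hypothetical, already-tokenizable sentences.
sentences = [
    "graph methods summarize documents",
    "graph methods extract sentences",
    "keywords summarize documents",
]
graph = cooccurrence_graph(sentences)
```

Edge weights here are raw sentence-level counts; a real system would likely filter stop words and normalize the weights.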

Content Analysis of Food and Nutrition unit in Middle School Textbooks of Home Economics - Focus on the National Curriculums from 1st to 2009 revised (중학교 가정(기술·가정)교과 식생활 영역의 핵심 교육내용 분석 - 제1차 교육과정부터 2009개정 교육과정의 교과서 내용을 중심으로 -)

Jang, Yoon-Mi;Kim, Yoo Kyeong
- Journal of Korean Home Economics Education Association / v.30 no.4 / pp.93-112 / 2018
We analysed middle school Home Economics textbooks from the 1st to the 2009 revised curriculum to investigate the content and share of the Food and Nutrition section. Key words were generated with a word cloud technique using text mining, and the share of the Food and Nutrition section was expressed as a ratio of pages. The core key words of the Food and Nutrition section across the curriculums were 'raw food', 'food', and 'diet'. In the 1st and 2nd curriculums, the main key words related to food materials, condiments, and nutrients such as 'vitamin' and 'protein'. The words 'nutrition', 'eating', and 'requirement' first appeared in the 3rd curriculum, 'portion' in the 6th, and 'diet' and 'adolescence' in the 7th. The mean ratio of the Food and Nutrition section in Home Economics was 24.3%. While the share was as high as 31.8% in the 7th curriculum, it fell sharply to 15.2% in the 2009 curriculum. In addition, the Food and Nutrition section comprised 10 units at the middle-level category during the 2nd and 3rd curriculums, but was reduced to 2 small units with no middle-level category in the 2009 curriculum. Although the content of the Food and Nutrition section has been developed and adapted to the needs of society across the curriculums, its share in Home Economics has declined, especially in the 2009 curriculum, which could raise concerns for the health of individuals and communities.
https://doi.org/10.19031/jkheea.2018.12.30.4.93

Current Status of Speech-Based Man-Machine Communication Technology (음성에 의한 Man-Machine Communication 기술의 현황)

은종관
- The Magazine of the IEIE / v.15 no.2 / pp.75-87 / 1988
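This paper surveys the current state of speech recognition and synthesis, the core technologies of speech-based man-machine communication. It first examines the problems that must be solved in speech recognition and describes the state of the art in isolated-word, connected-word, and continuous-speech recognition. For isolated-word recognition, it discusses feature extraction from the input vocabulary as used in pattern matching methods, similarity measurement against references, and the recognition decision based on the similarity results. For connected-word and continuous-speech recognition, it explains the "bottom-up approach" and "top-down approach" currently under study and examines the difficulties of these methods. For speech synthesis, it reviews several existing synthesis methods and describes their strengths and weaknesses. Finally, as an example, it describes a Korean text-to-speech conversion system.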

Development of chatting program using social issue keyword information (사회적 핵심 이슈 키워드 정보를 활용한 채팅 프로그램 개발)

Yoon, Kyung-Suob;Jeong, Won-Hyeok
- Proceedings of the Korean Society of Computer Information Conference / 2020.07a / pp.307-310 / 2020
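This paper uses text mining to extract issue keywords. To extract social issue keywords, the site that serves as the keyword collection model is crawled, and morphological analysis is then performed to collect meaningful words at the morpheme level. The KoNLPy package for Python is used for Korean morphological analysis. Issue keywords are extracted by computing statistics over the words produced by morphological analysis. Word embedding is performed to analyze the related words that support each issue keyword; for this, the Skip-Gram method of the Word2Vec model was applied. The analyzed issue keywords and related words are displayed at the top of a chat program that communicates over Web Socket.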
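The Skip-Gram method named in the abstract above trains on (center word, context word) pairs drawn from a sliding window. The sketch below only generates such pairs, assuming a symmetric window; it does not train an embedding and is not the authors' code. The function name, window size, and token list are hypothetical.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical output of a morphological analyzer for one sentence.
tokens = ["social", "issue", "keyword", "chat"]
pairs = skipgram_pairs(tokens, window=1)
```

A Word2Vec implementation would feed pairs like these to a model that learns to predict the context word from the center word.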

Research and Development Strategic Plan of Honam Sea Grant Program to Secure the Base Technology of Jeollanam-do's Policy Projects in the Area of Maritime and Fisheries (전라남도 해양수산 정책사업의 기반기술 확보를 위한 호남지역 Sea Grant 사업단 연구개발 전략수립)

Yim, Jeong-Bin;Nam, Taek-Kun
- Journal of Navigation and Port Research / v.32 no.8 / pp.685-692 / 2008
The goal of this paper is to set the research and development (R&D) strategic plan of the Honam Sea Grant (HSG) program, which aims to secure the base technologies needed for the success of Jeollanam-do's policy projects in the area of maritime and fisheries. HSG's mission is to support science-based sustainable management, conservation, and enhancement of Honam coastal and aquatic resources through research, extension, and education. First, 80 cases of Jeollanam-do's policy projects and 48 cases of HSG's R&D projects are compiled and classified into five areas of maritime and fisheries. Second, typical key words are extracted from each of the five areas, and the inherent meaning of each key word is assessed using quarterly segmented meaning allocation techniques with four categories: 'intended for practical use', 'intended for theoretical use', 'intended for future', and 'intended for current'. We then propose an R&D strategic plan based on the evaluation results and discuss its practical use.
https://doi.org/10.5394/KINPR.2008.32.8.685

Analysis of Research Trends in the Hydrogen Energy Field Using Co-Occurrence Keyword Analysis (동시출현 핵심단어 분석을 활용한 수소 에너지 관련 연구동향 분석)

Kim, Minju;Kwon, Sangki
- Explosives and Blasting / v.40 no.3 / pp.1-18 / 2022
With the advent of the hydrogen economy era, various studies are being conducted on transporting and storing hydrogen, and the risk of hydrogen explosion is emerging. To understand new technology related to hydrogen energy, it is necessary to understand the overall research trends in hydrogen energy at home and abroad. In this study, a bibliometric analysis using VOSViewer was conducted on papers published in international journals. An analysis over different time periods using keywords including hydrogen explosion, hydrogen pipeline, and hydrogen storage found that papers using numerical analysis simulation were published frequently. It also found that more research on safety and hydrogen explosion in hydrogen storage and hydrogen pipeline transportation was conducted in 2011-2022 than in 2000-2010.
https://doi.org/10.22704/ksee.2022.40.3.001

Korea National College of Agriculture and Fisheries in Naver News by Web Crolling : Based on Keyword Analysis and Semantic Network Analysis (웹 크롤링에 의한 네이버 뉴스에서의 한국농수산대학 - 키워드 분석과 의미연결망분석 -)

Joo, J.S.;Lee, S.Y.;Kim, S.H.;Park, N.B.
- Journal of Practical Agriculture & Fisheries Research / v.23 no.2 / pp.71-86 / 2021
This study was conducted to find information on the university's image from words related to 'Korea National College of Agriculture and Fisheries (KNCAF)' in Naver News. For this purpose, word frequency analysis, TF-IDF evaluation, and semantic network analysis were performed using web crawling technology. In the word frequency analysis, 'agriculture', 'education', 'support', 'farmer', 'youth', 'university', 'business', 'rural', and 'CEO' were important words. In the TF-IDF evaluation, the key words were 'farmer', 'drone', 'agricultural and livestock food department', 'Jeonbuk', 'young farmer', 'agriculture', 'Chonju', 'university', 'device', and 'spreading'. In the semantic network analysis, the bigrams showed high correlations in the order of 'youth' - 'farmer', 'digital' - 'agriculture', 'farming' - 'settlement', 'agriculture' - 'rural', and 'digital' - 'turnover'. When the importance of keywords was evaluated with five centrality indices, 'agriculture' ranked first. The keywords in second place by centrality index were 'farmers' (Cc, Cb), 'education' (Cd, Cp), and 'future' (Ce). Spearman's rank correlation coefficient between centrality indices showed that Degree centrality and Pagerank centrality produced the most similar rankings. In terms of word frequency, the important words in the KNCAF articles of Naver News were 'agriculture', 'education', 'support', 'farmer', and 'youth'. However, in the evaluation that includes document frequency, words such as 'farmer', 'drone', 'Ministry of Agriculture, Food and Rural Affairs', 'Jeonbuk', and 'young farmers' were found to be key words. For the centrality analysis, which considers network connectivity between words, Cd and Cp were suitable for evaluation. The words with strong centrality were 'agriculture', 'education', 'future', 'farmer', 'digital', 'support', and 'utilization'.
https://doi.org/10.23097/JPAF.2021.23(2):71
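The contrast the abstract above draws between raw word frequency and TF-IDF can be illustrated with a minimal sketch. It assumes one common TF-IDF variant (relative term frequency times the natural log of inverse document frequency); the paper does not state which variant it used. The function name and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document, using tf * ln(N / df)."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical tokenized news articles (non-empty lists of terms).
documents = [
    ["agriculture", "education", "farmer"],
    ["agriculture", "drone", "farmer"],
    ["agriculture", "education", "support"],
]
scores = tf_idf(documents)
```

In this toy corpus 'agriculture' appears in every document, so its score is zero despite its high frequency, while the rarer 'drone' scores highest in its document. A word can lead a frequency ranking and still fall in a TF-IDF ranking for the same reason.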

Analyzing Self-Introduction Letter of Freshmen at Korea National College of Agricultural and Fisheries by Using Semantic Network Analysis : Based on TF-IDF Analysis (언어네트워크분석을 활용한 한국농수산대학 신입생 자기소개서 분석 - TF-IDF 분석을 기초로 -)

Joo, J.S.;Lee, S.Y.;Kim, J.S.;Kim, S.H.;Park, N.B.
- Journal of Practical Agriculture & Fisheries Research / v.23 no.1 / pp.89-104 / 2021
Semantic network analysis (SNA) was conducted on the self-introduction letters of freshmen at Korea National College of Agriculture and Fisheries (KNCAF) in 2020, based on TF-IDF weights, which evaluate the importance of words that play a key role. The top three words by TF-IDF weight were agriculture, mathematics, and study (Q. 1); clubs, plants, and friends (Q. 2); friends, clubs, and opinions (Q. 3); and mushrooms, insects, and fathers (Q. 4). In the relationships between words, the words with high betweenness centrality were reason, high school, and attending (Q. 1); garbage, high school, and school (Q. 2); importance, misunderstanding, and completion (Q. 3); and processing, feed, and farmhouse (Q. 4). The words with high degree centrality were high school, inquiry, and grades (Q. 1); garbage, cleanup, and class time (Q. 2); opinion, meetings, and volunteer activities (Q. 3); and processing, space, and practice (Q. 4). The word combinations with a high frequency of simultaneous appearance, that is, high correlation, were 'certification - acquisition', 'problem - solution', 'science - life', and 'misunderstanding - concession'. In the cluster analysis, the number of clusters obtained from the height of the cluster dendrogram was 2 (Q. 1), 4 (Q. 2 and Q. 4), and 5 (Q. 3). Cohesion within clusters was high, and heterogeneity between clusters was clearly shown.
https://doi.org/10.23097/JPAF.2021.23(1):89

Contextual Advertisement System based on Document Clustering (문서 클러스터링을 이용한 문맥 광고 시스템)

Lee, Dong-Kwang;Kang, In-Ho;An, Dong-Un
- The KIPS Transactions:PartB / v.15B no.1 / pp.73-80 / 2008
In this paper, an advertisement-keyword finding method using document clustering is proposed to solve problems caused by ambiguous words and incorrect identification of main keywords. News articles that have similar contents and the same advertisement-keywords are clustered to construct the contextual information of advertisement-keywords. In addition to news articles, the web page and summary of a product are also used to construct the contextual information. The given document is classified into one of the news article clusters, and cluster-relevant advertisement-keywords are then used to identify keywords in the document. The proposed method achieved a 21% improvement in precision.
https://doi.org/10.3745/KIPSTB.2008.15-B.1.73

