Search | Korea Science

A Text Summarization Model Based on Sentence Clustering (문장 클러스터링에 기반한 자동요약 모형)

정영미;최상희
Journal of the Korean Society for Information Management, v.18 no.3, pp.159-178, 2001
This paper presents an automatic text summarization model that selects representative sentences from sentence clusters to create a summary. Summary generation experiments were performed on two sets of test documents after the optimal settings were learned from a training set. The centroid clustering method proved the most effective for clustering sentences, and sentence weight was more effective than the similarity between the sentence and cluster centroid vectors for selecting a representative sentence from each cluster. The experimental results also indicate that inverse sentence weight and title word weight for terms, together with location weight for sentences, improve summarization performance.
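The pipeline this abstract describes (cluster the sentences, then pick one representative per cluster by sentence weight) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the single-pass centroid clustering, the similarity threshold, the form of the title-word bonus and location weight, and all function names. Inverse sentence weight is omitted to keep the example short.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercased alphabetic tokens; a real system would also stem and drop stop words.
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    # Mean term-frequency vector of a cluster.
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})

def cluster_sentences(vectors, threshold=0.3):
    # Single-pass centroid clustering (assumed variant): each sentence joins the
    # cluster with the most similar centroid, or starts a new cluster.
    clusters = []  # each cluster is a list of sentence indices
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine(vec, centroid([vectors[j] for j in members]))
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters

def sentence_weight(i, vec, n, title_terms, title_bonus=1.0):
    # Hypothetical weighting: title words count extra, earlier sentences score higher.
    term_score = sum(c * (1.0 + title_bonus * (t in title_terms)) for t, c in vec.items())
    location = 1.0 + (n - i) / n
    return term_score * location

def summarize(title, sentences, threshold=0.3):
    vectors = [Counter(tokenize(s)) for s in sentences]
    title_terms = set(tokenize(title))
    n = len(sentences)
    picks = [
        max(members, key=lambda i: sentence_weight(i, vectors[i], n, title_terms))
        for members in cluster_sentences(vectors, threshold)
    ]
    return [sentences[i] for i in sorted(picks)]  # keep original sentence order
```

Selecting by `sentence_weight` rather than by similarity to the centroid mirrors the finding reported in the abstract; swapping the `key` function for a centroid-similarity score would give the alternative that the authors found less effective.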

Table based Matching Algorithm for Soft Categorization of News Articles in Reuter 21578

Jo, Tae-Ho
Journal of Korea Multimedia Society, v.11 no.6, pp.875-882, 2008
This research proposes an alternative to machine-learning-based approaches to text categorization. To apply machine-learning-based approaches to any text mining task, documents must be encoded into numerical vectors, which causes two problems: huge dimensionality and sparse distribution. Although text mining covers various tasks, such as text categorization, text clustering, and text summarization, the scope of this research is restricted to text categorization. The idea is to avoid the two problems by encoding a document or documents into a table instead of numerical vectors. The goal of this research is therefore to improve text categorization performance by proposing approaches that are free from the two problems.

An Optimized e-Lecture Video Search and Indexing framework

Medida, Lakshmi Haritha;Ramani, Kasarapu
International Journal of Computer Science & Network Security, v.21 no.8, pp.87-96, 2021
The demand for e-learning through video lectures is rapidly increasing due to its diverse advantages over traditional learning methods. This has led to massive volumes of web-based lecture videos. Indexing and retrieval of a lecture video or a lecture video topic has thus proved to be an exceptionally challenging problem. Many techniques reported in the literature are either visual or audio based, but not both. Since the visual and audio components are equally important for content-based indexing and retrieval, the current work focuses on both. A framework for automatic topic-based indexing and search based on the innate content of the lecture videos is presented. The text from the slides is extracted using the proposed Merged Bounding Box (MBB) text detector. Text is extracted from the audio component using Google Speech Recognition (GSR) technology. This hybrid approach generates the indexing keywords from the merged transcripts of the video and audio component extractors. The search within the indexed documents is optimized using Naïve Bayes (NB) classification and K-Means clustering models. This optimized search retrieves results by searching only the relevant document cluster in the predefined categories rather than the whole lecture video corpus. The work is carried out on a dataset generated by assigning categories to lecture video transcripts gathered from e-learning portals. Search performance is assessed in terms of accuracy and time taken. Further, the improved accuracy of the proposed indexing technique is compared with that of the accepted chain indexing technique.
https://doi.org/10.22937/IJCSNS.2021.21.8.12
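The category-restricted search mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' framework: a small multinomial Naïve Bayes classifier with add-one smoothing picks a category for the query, and plain keyword matching then runs only over documents in that category. The K-Means step, the MBB detector, and speech recognition are not modeled, and all function names and sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    # docs: list of (category, text) pairs, e.g. categorized lecture transcripts.
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for cat, text in docs:
        word_counts[cat].update(text.lower().split())
        doc_counts[cat] += 1
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab

def classify(query, model):
    # Multinomial Naive Bayes with add-one (Laplace) smoothing.
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())

    def log_prob(cat):
        counts = word_counts[cat]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(doc_counts[cat] / total_docs)
        for w in query.lower().split():
            score += math.log((counts[w] + 1) / denom)
        return score

    return max(sorted(doc_counts), key=log_prob)

def search(query, docs, model):
    # Search only the predicted category instead of the whole corpus.
    cat = classify(query, model)
    terms = set(query.lower().split())
    return [text for c, text in docs if c == cat and terms & set(text.lower().split())]

docs = [
    ("math", "linear algebra matrix eigenvalue"),
    ("math", "calculus derivative integral"),
    ("bio", "cell membrane protein"),
    ("bio", "dna protein synthesis"),
]
model = train_nb(docs)
print(search("protein folding", docs, model))
# ['cell membrane protein', 'dna protein synthesis']
```

Restricting the scan to one category trades some recall (a misclassified query misses relevant documents elsewhere) for a smaller search space, which is the accuracy and time trade-off the abstract says was evaluated.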

Study on CEO New Year's Address: Using Text Mining Method (텍스트마이닝을 활용한 주요 대기업 신년사 분석)

YuKyoung Kim;Daegon Cho
Journal of Information Technology Services, v.22 no.2, pp.93-127, 2023
This study analyzed the CEO New Year's addresses of major Korean companies, extracting key topics for employees via text mining techniques. An intended contribution of this study is to help reporters, analysts, and researchers better understand the New Year's addresses by elucidating the implicit and implied features of the messages they contain. To this end, this study collected and analyzed 545 New Year's addresses published between 2012 and 2021 by the top 66 Korean companies in terms of market capitalization. The research methodologies applied include text clustering, word embedding of keywords, frequency analysis, and topic modeling. Our main findings suggest that the messages in the New Year's addresses fall into nine topics: organizational culture, global advancement, substantial management, business reorganization, capacity building, market leadership, management innovation, sustainable management, and technology development. This study further analyzed the managerial significance of each topic and discussed their characteristics from the perspectives of time, industry, and corporate groups. Companies were typically found to emphasize sound management, market leadership, and business reorganization during economic downturns, while stressing capacity building and organizational culture during market transition periods. Companies belonging to corporate groups also tended to emphasize founding philosophy and corporate culture.
https://doi.org/10.9716/KITS.2023.22.2.093

A Semi-Singular Value Decomposition and Its Application to Large Text Documents Clustering (준-특이치 분해와 대규모 문서 군집화에의 응용)

Sin, Yang-Gyu
Proceedings of the Korean Data and Information Science Society Conference, 2003.05a, pp.133-133, 2003

A Novel Video Image Text Detection Method

Zhou, Lin;Ping, Xijian;Gao, Haolin;Xu, Sen
KSII Transactions on Internet and Information Systems (TIIS), v.6 no.3, pp.941-953, 2012
A novel and universal method of video image text detection is proposed, implemented as a coarse-to-fine procedure. First, the spectral clustering (SC) method is adopted to coarsely detect text regions based on the stationary wavelet transform (SWT). To make full use of the information, a multi-parameter kernel function that combines feature similarity information and spatial adjacency information is employed in the SC method. Second, 28-dimensional classification features are proposed, and a support vector machine (SVM) is used to separate text regions from non-text regions. Experimental results on video images show the encouraging performance of the proposed algorithm and classification features.
https://doi.org/10.3837/tiis.2012.03.010

A Novel Video Image Text Detection Method

Zhou, Lin;Ping, Xijian;Gao, Haolin;Xu, Sen
KSII Transactions on Internet and Information Systems (TIIS), v.6 no.4, pp.1140-1152, 2012
A novel and universal method of video image text detection is proposed, implemented as a coarse-to-fine procedure. First, the spectral clustering (SC) method is adopted to coarsely detect text regions based on the stationary wavelet transform (SWT). To make full use of the information, a multi-parameter kernel function that combines feature similarity information and spatial adjacency information is employed in the SC method. Second, 28-dimensional classification features are proposed, and a support vector machine (SVM) is used to separate text regions from non-text regions. Experimental results on video images show the encouraging performance of the proposed algorithm and classification features.
https://doi.org/10.3837/tiis.2012.04.011

Text Extraction in HIS Color Space by Weighting Scheme

Le, Thi Khue Van;Lee, Gueesang
Smart Media Journal, v.2 no.1, pp.31-36, 2013
Robust and efficient text extraction is very important for the accuracy of Optical Character Recognition (OCR) systems. Natural scene images with degradations such as uneven illumination, perspective distortion, complex backgrounds, and multi-color text pose many challenges for computer vision tasks, especially text extraction. In this paper, we propose a method for extracting text from signboard images that combines the mean shift algorithm with a weighting scheme for hue and saturation in the HSI color space for the clustering algorithm. The number of clusters is determined automatically by mean shift-based density estimation, in which local clusters are estimated by repeatedly searching for higher-density points in the feature vector space. The weighting scheme for hue and saturation is used to formulate a new distance measure in cylindrical coordinates for text extraction. Experimental results on various natural scene images are presented to demonstrate the effectiveness of our approach.

Investigation of Trend in Virtual Reality-based Workplace Convergence Research: Using Pathfinder Network and Parallel Neighbor Clustering Methodology (가상현실 기반 업무공간 융복합 분야 연구 동향 분석 : 패스파인더 네트워크와 병렬 최근접 이웃 클러스터링 방법론 활용)

Ha, Jae Been;Kang, Ju Young
The Journal of Information Systems, v.31 no.2, pp.19-43, 2022
Purpose: Due to the COVID-19 pandemic, many companies are building virtual workplaces based on virtual reality technology. This study aims to identify trends in convergence research between virtual reality technology and the workplace, and to suggest promising future fields on that basis.
Design/methodology/approach: For this purpose, 12,250 bibliographic records of research papers related to Virtual Reality (VR) and Workplace, published from 1982 to 2021, were collected from Scopus. The bibliographic data of the collected papers were analyzed using Text Mining and Pathfinder Network, Parallel Neighbor Clustering, Nearest Neighbor Centrality, and Triangle Betweenness Centrality. The relationships between keywords were thereby identified by period, and network analysis and visualization were performed for virtual reality-based workplace research.
Findings: This study is expected to reveal how the knowledge structure of the main keywords in virtual reality-based workplace convergence research has evolved, and to identify the relationships between keywords, which can serve as a key reference for setting the direction of subsequent studies.
https://doi.org/10.5859/KAIS.2022.31.2.19

Automatic Construction of Reduced Dimensional Cluster-based Keyword Association Networks using LSI (LSI를 이용한 차원 축소 클러스터 기반 키워드 연관망 자동 구축 기법)

Yoo, Han-mook;Kim, Han-joon;Chang, Jae-young
Journal of KIISE, v.44 no.11, pp.1236-1243, 2017
In this paper, we propose a novel way of producing keyword networks, named LSI-based ClusterTextRank, which extracts significant key words from a set of clusters with a mutual information metric, and constructs an association network using latent semantic indexing (LSI). The proposed method reduces the dimension of documents through LSI, decomposes documents into multiple clusters through k-means clustering, and expresses the words within each cluster as a maximal spanning tree graph. The significant key words are identified by evaluating their mutual information within clusters. Then, the method calculates the similarities between the extracted key words using the term-concept matrix, and the results are represented as a keyword association network. To evaluate the performance of the proposed method, we used travel-related blog data and showed that the proposed method outperforms the existing TextRank algorithm by about 14% in terms of accuracy.
https://doi.org/10.5626/JOK.2017.44.11.1236

