Search | Korea Science

An Efficient Machine Learning-based Text Summarization in the Malayalam Language

P Haroon, Rosna;Gafur M, Abdul;Nisha U, Barakkath
- KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.6 / pp.1778-1799 / 2022
Automatic text summarization condenses a large body of content into a shorter text that retains the significant information. Malayalam is one of the most difficult languages spoken in parts of India, most commonly in Kerala and Lakshadweep. Natural language processing research for Malayalam is relatively scarce owing to the complexity of the language and the scarcity of available resources. In this paper, an approach is proposed for summarizing Malayalam documents by training a model based on the Support Vector Machine classification algorithm. Different features of the text are taken into account during training so that the system can output the most important information from the input text. The classifier sorts sentences into four separate classes (most important, important, average, and least significant), and based on this the system creates a summary of the input document. The user can select a compression ratio so that the system outputs a summary of that fraction of the input. Model performance is measured using Malayalam documents from different genres as well as documents from the same domain. The model is evaluated using the content evaluation measures precision, recall, F-score, and relative utility. The precision and recall values obtained indicate that the model is reliable and produces more relevant summaries than the other summarizers.
https://doi.org/10.3837/tiis.2022.06.001
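The compression-ratio step described in the abstract above can be illustrated with a minimal Python sketch. It assumes the sentences have already been labelled by some classifier (the SVM itself is not shown); the label names, the `CLASS_RANK` table, and the `summarize` function are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical class labels, ranked from most to least significant.
CLASS_RANK = {
    "most_important": 0,
    "important": 1,
    "average": 2,
    "least_significant": 3,
}


def summarize(sentences, labels, compression_ratio):
    """Keep a fraction of the sentences, preferring the more important classes.

    sentences: list of sentence strings in document order.
    labels: one class label per sentence (keys of CLASS_RANK).
    compression_ratio: fraction of sentences to keep, in (0, 1].
    """
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    if len(sentences) != len(labels):
        raise ValueError("sentences and labels must have the same length")
    if not sentences:
        return []
    keep = max(1, math.ceil(len(sentences) * compression_ratio))
    # Rank by class first; ties are broken by position in the document.
    order = sorted(range(len(sentences)), key=lambda i: (CLASS_RANK[labels[i]], i))
    # Restore document order so the summary reads naturally.
    chosen = sorted(order[:keep])
    return [sentences[i] for i in chosen]
```

With four labelled sentences and a ratio of 0.5, the sketch returns the two sentences from the highest-ranked classes, in their original order.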

LATENT SEMANTIC INDEXING AND LINEAR RELEVANCE FEEDBACK IN TEXT INFORMATION RETRIEVAL THEORY

Yang, Ki-Choon
- Journal of the Korean Mathematical Society / v.36 no.3 / pp.609-619 / 1999
We give a mathematically rigorous description of the recently popular latent semantic indexing (LSI) method in text information retrieval theory. Also, a related problem of finding a document ranking function in linear relevance feedback is discussed.
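For orientation, the standard textbook formulation of LSI is sketched below. The notation is an assumption and may differ from the paper's: $A$ is the $m \times n$ term-document matrix, $k$ is the number of retained latent dimensions, $q$ is a query vector, and $\hat{d}_j$ is the $j$-th row of $V_k$ written as a column vector.

```latex
\begin{align*}
A &= U \Sigma V^{\top}, \\
A_k &= U_k \Sigma_k V_k^{\top}, \qquad k \le \operatorname{rank}(A), \\
\hat{q} &= \Sigma_k^{-1} U_k^{\top} q, \\
\operatorname{score}(q, d_j) &= \frac{\hat{q}^{\top} \hat{d}_j}{\lVert \hat{q} \rVert \, \lVert \hat{d}_j \rVert}.
\end{align*}
```

The first two lines are the singular value decomposition and its rank-$k$ truncation; the third folds the query into the latent space; the last ranks documents by cosine similarity there.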

On The Full-Text Database Retrieval and Indexing Language

Chang, Hye-Rhan
- Journal of the Korean Society for Information Management / v.4 no.1 / pp.24-46 / 1987
The recent growth of full-text database operations has brought new opportunities for subject access. The fundamental problem of subject access in the online environment is the indexing language and technology. The purpose of this paper is to identify the characteristics and capabilities of full-text retrieval as compared to traditional bibliographic retrieval. The retrieval performance of indexing languages, the full-text system features achieved so far, and the new role of a controlled vocabulary are examined. This paper also includes a review of the research on full-text retrieval performance.

Development Status and Prospects of Graphical Password Authentication System in Korea

Yang, Gi-Chul
- KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.11 / pp.5755-5772 / 2019
Security is becoming more important as society changes rapidly. In addition, today's ICT environment demands changes in existing security technologies. As a result, password authentication methods are also changing. The authentication method most often used for security is password authentication. The most commonly used passwords are text-based. Security enhancement requires longer and more complex passwords, but long, complex, text-based passwords are hard to remember and inconvenient to use. Therefore, authentication techniques that can replace text-based passwords are required today. Graphical passwords are more difficult to steal than text-based passwords and are easier for users to remember. In recent years, research into graphical passwords that can replace existing text-based passwords has been actively conducted throughout the world. This article surveys recent research and development directions of graphical password authentication systems in Korea. For this purpose, security authentication methods using graphical passwords are categorized into technical groups, and the research on graphical passwords performed in Korea is explored. In addition, the advantages and disadvantages of all investigated graphical password authentication methods are analyzed along with their characteristics.
https://doi.org/10.3837/tiis.2019.11.026

Rectification of Perspective Text Images on Rectangular Planes

Le, Huy Phat;Madhubalan, Kavitha;Lee, Guee-Sang
- International Journal of Contents / v.6 no.4 / pp.1-7 / 2010
Natural images often contain useful information about the scene, such as text or company logos placed on a rectangular plane. The 2D images of such objects captured by a camera are often distorted because of the perspective projection of the camera model. This distortion makes it difficult to acquire the text information. In this study, we detect the rectangular object on which the text is written and then restore the image by removing the perspective distortion. The Hough transform is used to detect the boundary lines of the rectangular object, and a bilinear transformation is applied to restore the original image.
https://doi.org/10.5392/IJoC.2010.6.4.001
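The restoration step in the abstract above can be illustrated with a minimal Python sketch of a bilinear mapping from a rectangle onto a quadrilateral. It assumes the four corners have already been found (the Hough-transform line detection is not shown) and uses nearest-neighbour sampling on a plain list-of-lists image. The function names `bilinear_point` and `rectify` are hypothetical, and the paper's actual procedure may differ. A bilinear mapping of this kind only approximates a true projective transform.

```python
def bilinear_point(corners, u, v):
    """Map (u, v) in the unit square to a point inside a quadrilateral.

    corners: (top_left, top_right, bottom_right, bottom_left), each (x, y).
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    x = (1 - u) * (1 - v) * x0 + u * (1 - v) * x1 + u * v * x2 + (1 - u) * v * x3
    y = (1 - u) * (1 - v) * y0 + u * (1 - v) * y1 + u * v * y2 + (1 - u) * v * y3
    return x, y


def rectify(image, corners, out_w, out_h):
    """Resample the quadrilateral region of `image` into an out_w x out_h grid.

    image: list of rows, each a list of pixel values.
    """
    h, w = len(image), len(image[0])
    out = []
    for r in range(out_h):
        v = r / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for c in range(out_w):
            u = c / (out_w - 1) if out_w > 1 else 0.0
            x, y = bilinear_point(corners, u, v)
            # Nearest-neighbour sampling, clamped to the image bounds.
            xi = min(max(int(round(x)), 0), w - 1)
            yi = min(max(int(round(y)), 0), h - 1)
            row.append(image[yi][xi])
        out.append(row)
    return out
```

Each output pixel is looked up in the distorted source image, which avoids holes in the restored result. Smoother sampling, such as interpolating between neighbouring pixels, could replace the nearest-neighbour lookup.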

Automatic Name Line Detection for Person Indexing Based on Overlay Text

Lee, Sanghee;Ahn, Jungil;Jo, Kanghyun
- Journal of Multimedia Information System / v.2 no.1 / pp.163-170 / 2015
Many overlay texts are artificially superimposed on broadcast videos by humans. These texts provide additional information about the audiovisual content. In particular, the overlay text in news videos contains a concise and direct description of the content. It is therefore the most reliable clue for constructing a news video indexing system. To enable automatic person indexing of interview videos in TV news programs, this paper proposes a method that detects only the name text line among all the overlay texts in a frame. Experimental results on Korean television news videos show that the proposed framework efficiently detects the overlaid name text line.
https://doi.org/10.9717/jmis.2015.2.1.163

Title Extraction from Book Cover Images Using Histogram of Oriented Gradients and Color Information

Do, Yen;Kim, Soo Hyung;Na, In Seop
- International Journal of Contents / v.8 no.4 / pp.95-102 / 2012
In this paper, we present a technique to extract the title areas from book cover images. A typical book cover image may contain text, pictures, and diagrams as well as a complex and irregular background. In addition, the high variability of character features such as thickness, font, position, background, and tilt of the text makes the text extraction task more complicated. Therefore, we propose an efficient two-step method that uses Histogram of Oriented Gradients and color information to find the title areas. First, text localization is carried out to find the title candidates. Second, a refinement process is performed to find the sufficient components of the title areas. To obtain the best result, we also apply constraints on the size and the length-to-width ratio of the title. We achieve encouraging results for title regions extracted from book cover images, which demonstrate the advantages and efficiency of the proposed method.
https://doi.org/10.5392/IJoC.2012.8.4.095

Text Detection in Scene Images Based on Interest Points

Nguyen, Minh Hieu;Lee, Gueesang
- Journal of Information Processing Systems / v.11 no.4 / pp.528-537 / 2015
Text in images is one of the most important cues for understanding a scene. In this paper, we propose a novel approach based on interest points to localize text in natural scene images. The main ideas of this approach are as follows. First, we used interest point detection techniques, which extract the corner points of characters and the center points of edge-connected components, to select candidate regions. Second, these candidate regions were verified by using tensor voting, which is capable of extracting perceptual structures from noisy data. Finally, area, orientation, and aspect ratio were used to filter out non-text regions. The proposed method was tested on the ICDAR 2003 dataset and on images of wine labels. The experimental results show the validity of this approach.
https://doi.org/10.3745/JIPS.02.0026
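The final filtering step in the abstract above (area, orientation, aspect ratio) can be sketched roughly as follows. The `Region` type, the function names, and every threshold value are hypothetical placeholders chosen for illustration; the abstract does not give the paper's actual criteria.

```python
from dataclasses import dataclass


@dataclass
class Region:
    width: float
    height: float
    angle_deg: float  # orientation of the region's major axis


def is_text_like(region, min_area=50.0, max_abs_angle=30.0,
                 min_aspect=0.1, max_aspect=10.0):
    """Return True if the region passes all three geometric checks."""
    area = region.width * region.height
    if area < min_area:
        return False  # too small to hold legible text
    if abs(region.angle_deg) > max_abs_angle:
        return False  # tilted too far from horizontal
    aspect = region.width / region.height
    return min_aspect <= aspect <= max_aspect


def filter_regions(regions, **thresholds):
    """Keep only the candidate regions that look like text."""
    return [r for r in regions if is_text_like(r, **thresholds)]
```

In practice the thresholds would be tuned on a labelled dataset rather than fixed by hand.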

Citation-based Article Summarization using a Combination of Lexical Text Similarities: Evaluation with Computational Linguistics Literature Summarization Datasets

Kang, In-Su
- Journal of the Korea Society of Computer and Information / v.24 no.7 / pp.31-37 / 2019
Citation-based article summarization aims to create a shortened text for an academic article that reflects the content of citing sentences, which contain others' thoughts about the target article to be summarized. To deal with the problem, this study introduces an extractive summarization method based on calculating a linear combination of various sentence salience scores, which represent the degrees to which a candidate sentence reflects the content of the author's abstract text, readers' citing text, and the target article to be summarized. In the current study, salience scores are obtained by computing surface-level textual similarities. Experiments using CL-SciSumm datasets show that the proposed method parallels or outperforms the previous approaches in ROUGE evaluations against SciSumm-2017 human summaries and SciSumm-2016/2017 community summaries.
https://doi.org/10.9708/jksci.2019.24.07.031
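The scoring idea in the abstract above, a linear combination of surface-level similarities, can be illustrated with a minimal Python sketch. Jaccard overlap on whitespace tokens stands in for the paper's similarity measures, which the abstract does not specify. The equal weights and the function names are also assumptions.

```python
def tokens(text):
    """Lower-cased whitespace tokens as a set (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap between the token sets of two texts."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def salience(sentence, abstract, citing_text, article,
             weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the sentence's similarity to the three reference texts."""
    w_abs, w_cit, w_art = weights
    return (w_abs * jaccard(sentence, abstract)
            + w_cit * jaccard(sentence, citing_text)
            + w_art * jaccard(sentence, article))


def extract_summary(candidates, abstract, citing_text, article, k,
                    weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return the k candidate sentences with the highest salience."""
    ranked = sorted(
        candidates,
        key=lambda s: salience(s, abstract, citing_text, article, weights),
        reverse=True,
    )
    return ranked[:k]
```

Changing `weights` shifts the emphasis among the author's abstract, the citing text, and the article body.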

A new approach technique on Speech-to-Speech Translation (Audio Context Recognition Using the Reconstructed Phase Space of a Signal)

Le, Thanh Hien;Lee, Sung-young;Lee, Young-Koo
- Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.239-240 / 2009
We live in a flat world in which globalization fosters communication, travel, and trade among more than 150 countries and thousands of languages. To surmount the barriers among these languages, translation is required; Speech-to-Speech translation will automate the process. Thanks to recent advances in Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS), one can now use a system to translate speech in a source language into speech in a target language, and vice versa, in an affordable manner. The three-phase process requires that the source speech first be transcribed into (a set of) text in the source language (ASR), after which the source text is translated into the target text (MT). Finally, the target speech is synthesized from the target text (TTS).
https://doi.org/10.3745/PKIPS.y2009m11a.239
