• Title/Summary/Keyword: Text Title

Academic Conference Categorization According to Subjects Using Topical Information Extraction from Conference Websites (학회 웹사이트의 토픽 정보추출을 이용한 주제에 따른 학회 자동분류 기법)

  • Lee, Sue Kyoung;Kim, Kwanho
    • The Journal of Society for e-Business Studies / v.22 no.2 / pp.61-77 / 2017
  • As the amount of academic conference information on the Internet has rapidly increased, automatically classifying conferences by research subject helps researchers find relevant conferences efficiently. Information provided by most conference listing services is limited to title, date, location, and website URL. Among these features, only the title contains topical words, which causes an information insufficiency problem. We therefore propose methods that aim to resolve this problem by utilizing web content. Specifically, the proposed methods extract the main content from an HTML document collected through the conference's website URL. Based on the similarity between the title of a conference and its main content, topical keywords are selected to reinforce the important keywords in the main content. Experiments on a real-world dataset showed that the additional information extracted from conference websites improves conference classification performance. We plan to further improve the accuracy of conference classification by considering the structure of websites.

Similarity Measurement Between Titles and Abstracts Using Bijection Mapping and Phi-Correlation Coefficient

  • John N. Mlyahilu;Jong-Nam Kim
    • Journal of the Institute of Convergence Signal Processing / v.23 no.3 / pp.143-149 / 2022
  • This paper presents a quantitative measure of the relationship between a research title and its abstract, using articles from various journals indexed in the Korean Citation Index (KCI) database. We propose a machine learning-based similarity metric that does not assume normality of the dataset and that addresses the imbalanced dataset problem and the zero-variance problem affecting most rule-based algorithms. The advantage of this algorithm is that it eliminates the limitations of the Pearson correlation coefficient (r) and also solves the imbalanced dataset problem. A total of 107 journal articles collected from the database were used to build a corpus containing the authors, year of publication, title, and abstract of each article. In the experiments, the proposed algorithm achieved higher correlation coefficient values than the cosine similarity, Euclidean, and Pearson correlation measures, reaching a maximum correlation of 1, whereas the other measures returned not-a-number (NaN) values in some experiments. From these results, we conclude that an effective title must have a high correlation coefficient with its abstract.
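
As a rough illustration of the kind of measure this abstract discusses, the phi coefficient can be computed from two binary word-presence vectors. The sketch below is not the authors' algorithm: the bijection mapping step is not described in the abstract and is omitted, and the fixed vocabulary, the helper names, and the choice to return `None` when the coefficient is undefined are all assumptions made for illustration.

```python
import math
import re

def presence_vector(text, vocabulary):
    """Return a 0/1 flag per vocabulary word (hypothetical helper)."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return [int(term in words) for term in vocabulary]

def phi_coefficient(x, y):
    """Phi coefficient of two equal-length binary vectors, or None if undefined."""
    # Cells of the 2x2 contingency table.
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # One vector is constant (zero variance); the ratio is undefined.
        return None
    return (n11 * n00 - n10 * n01) / denom

# Assumed toy vocabulary and texts, not data from the paper.
vocabulary = ["similarity", "title", "abstract", "correlation", "flash", "memory"]
title_vec = presence_vector("Similarity between title and abstract", vocabulary)
abstract_vec = presence_vector(
    "We measure correlation and similarity between a title and its abstract",
    vocabulary,
)
score = phi_coefficient(title_vec, abstract_vec)
```

On binary data the phi coefficient equals the Pearson correlation, so a constant vector leaves it undefined in the same way; how the proposed method avoids that case is not stated in the abstract.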

Design and Implementation of Web Crawler with Real-Time Keyword Extraction based on the RAKE Algorithm

  • Zhang, Fei;Jang, Sunggyun;Joe, Inwhee
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.395-398 / 2017
  • We propose a web crawler system with a keyword extraction function. Existing text-mining research on keyword extraction mostly relies on databases of documents or corpora that have already been collected, whereas this paper aims to build a real-time keyword extraction system that extracts the keywords of a page's text and stores them in the database while the page is being crawled. We design and implement a crawler that incorporates the RAKE keyword extraction algorithm, so that keywords are extracted from the content as the web page is fetched. The performance of the RAKE algorithm is improved by increasing the weight of important features (such as nouns appearing in the title). The experimental results show that this method is superior to the existing method and extracts keywords satisfactorily.
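
The idea of weighting title words inside RAKE can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the stopword list is a tiny placeholder, the `title_boost` multiplier is a hypothetical parameter, and every title word is boosted because no part-of-speech tagging is done to isolate nouns. The scoring itself follows the usual RAKE scheme (word score = degree / frequency, phrase score = sum of its word scores).

```python
import re
from collections import defaultdict

# Placeholder stopword list; a real system would use a much larger one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with",
             "is", "are", "we", "this", "that", "it", "by", "from", "can", "which"}

def candidate_phrases(text):
    """Split text into runs of non-stopwords, breaking at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases

def rake(text, title="", title_boost=2.0):
    """Return (phrase, score) pairs sorted by descending score."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences in the phrase, itself included
    title_words = set(re.findall(r"\w+", title.lower())) - STOPWORDS
    word_score = {}
    for word in freq:
        score = degree[word] / freq[word]
        if word in title_words:
            score *= title_boost  # assumed form of the title weighting
        word_score[word] = score
    scored = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    return sorted(scored.items(), key=lambda item: (-item[1], item[0]))

page_text = "The web crawler extracts keywords. Keyword extraction runs in real time."
plain = rake(page_text)
boosted = rake(page_text, title="Real time", title_boost=5.0)
```

In a crawler, `rake` would be called on each page's extracted text as it is fetched, with the page title passed as `title`, and the top-ranked phrases stored alongside the page.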

A Study on Variable Text Effect applying for Digital Contents (디지털 콘텐츠에 다양한 텍스트 효과 적용에 관한 연구)

  • Joo, Heon-Sik
    • Proceedings of the Korean Society of Computer Information Conference / 2015.07a / pp.228-229 / 2015
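  • This study presents the application of various text effects to digital content. Applying text effects to digital content makes the meaning of a video more concretely understandable, reveals the identity of the content, and allows the character and authenticity of the content to be understood more clearly. Accordingly, depending on which text effect is used in a video, the character of the digital content changes, its meaning is emphasized, its standing and quality are raised, and interest and value can be conveyed. This study therefore argues that applying various types of text effects to digital content produces a variety of visual effects, makes the character of the content more concrete, and raises the value of the content through its clarity and the interest it generates.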

Application of the 2-Poisson Model to Full-Text Information Retrieval System (2-포아송 모형의 전문검색시스템 응용에 관한 연구)

  • 문성빈
    • Journal of the Korean Society for Information Management / v.16 no.3 / pp.49-63 / 1999
  • The purpose of this study is to investigate whether query terms are distributed according to the 2-Poisson model in documents represented by abstract/title or by full text. Retrieval experiments using the binary independence model and the 2-Poisson independence model, both based on probability theory, were conducted to see whether the 2-Poisson distribution of query terms influences retrieval effectiveness, particularly in full-text information retrieval systems.
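
For readers unfamiliar with the 2-Poisson model, it treats a term's within-document frequency as a mixture of two Poisson distributions: one for documents that are "about" the term and one for the rest. The sketch below only evaluates that mixture probability; the parameter names (`pi`, `lam_elite`, `lam_non`) and the example values are assumptions for illustration and do not come from the study.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing k occurrences under a Poisson(lam) distribution."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def two_poisson_pmf(k, pi, lam_elite, lam_non):
    """Mixture probability of term frequency k.

    pi is the assumed share of documents that are 'about' the term;
    lam_elite and lam_non are the mean frequencies in the two groups.
    """
    return pi * poisson_pmf(k, lam_elite) + (1 - pi) * poisson_pmf(k, lam_non)

# Illustrative values only: 30% of documents are about the term.
probabilities = [two_poisson_pmf(k, 0.3, 4.0, 0.5) for k in range(60)]
```

Checking whether observed term frequencies fit this mixture better in full text than in title/abstract representations is the kind of question the abstract describes.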

Re-creation method of literature tale to fairy tale (문헌설화의 동화로의 재창작 방법 -삼국유사를 중심으로-)

  • Jeong, Hee-jeong
    • Journal of Korean Classical Literature and Education / no.16 / pp.181-206 / 2008
  • This paper examines how "Samguk Yusa", a historically valuable collection of literary tales, has been re-created as fairy tales, and identifies the problems and possible improvements in that re-creation process using five published examples. When a fairy tale is based on a tale that has an original text, the author needs to understand and evaluate that text and attend to the focus and causality of the story. The fairy tale also needs a clear title that allows comparison with, and shows its relationship to, the original text, and it requires a fitting balance of history and fiction to stimulate children's imagination. In addition, to raise children's interest, varied literary expression that shows the beauty of language, the selection and understanding of the original tale, formal beauty as literature, and the binding of the books should all be considered. By working to solve these problems and adopting new writing approaches to tale re-creation, we can obtain more interesting and instructive fairy-tale versions of "Samguk Yusa".

Research Trends on Literature Reviews in Scopus Journals by Authors from Indonesia, Japan, South Korea, Vietnam, Singapore, and Malaysia: A Bibliometric Analysis from 2003 to 2022

  • Prakoso Bhairawa Putera;Amelya Gustina
    • Asian Journal of Innovation and Policy / v.12 no.3 / pp.304-322 / 2023
  • Text data mining ('big data methods') has been one of the most widely used approaches during the COVID-19 pandemic, in particular text data mining on the Scopus or Web of Science (WoS) databases. It is widely used to collect literature for later bibliometric analysis, which ultimately results in a literature review article. In this article, we therefore reveal the publication trend of literature reviews in Scopus journals by authors from Indonesia, Japan, South Korea, Vietnam, Singapore, and Malaysia. The article covers two essential parts: 1) a comparison of international publication trends and subject areas of literature review publications, and 2) a comparison of the top five authors, affiliations, source titles, and collaborating countries.

RECENT RESEARCH AND DEVELOPING TREND OF ENGINEERING MANAGEMENT IN CHINA BASED ON TEXT MINING

  • Shaohua Jiang;Wenling Zhang;Zhaohong Qiu;Shaojun Wang
    • International conference on construction engineering and project management / 2009.05a / pp.814-820 / 2009
  • With the rapid development of China's economy, many large-scale, high-investment engineering projects have been constructed in China, some of them the biggest in the world. Alongside this engineering practice, research on engineering management in China has made great progress, and a large number of research findings are embodied in research papers and represented by technical words. To understand the state of the art in engineering management research in China, three major parts of research papers (title, abstract, and keywords) from the last five years of three representative Chinese engineering management journals were chosen as research materials. Unlike Western languages, Chinese has no delimiters between words, so the maximum matching and frequency statistics (MMFS) method, a text segmentation technique for Chinese text mining, was used to extract features consisting of technical words, phrases, and words from the research materials. Recent research and development trends in engineering management in China were identified by comparing and analyzing the differences in technical words across the research materials of the last five years.
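
Dictionary-based maximum matching followed by frequency counting can be sketched as below. This is a generic forward maximum matching routine, offered as an assumption about what the "maximum matching" half of MMFS might look like; the abstract does not give the authors' exact procedure, and the dictionary and sample titles here are invented for illustration.

```python
from collections import Counter

def forward_max_match(text, dictionary, max_len=None):
    """Segment text greedily, taking the longest dictionary word at each position."""
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink until a dictionary word matches;
        # a single character is accepted as a fallback for unknown text.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens

# Hypothetical technical-term dictionary and paper titles.
dictionary = {"工程", "管理", "工程管理", "研究"}
titles = ["工程管理研究", "工程研究"]
term_counts = Counter(
    token for title in titles for token in forward_max_match(title, dictionary)
)
```

Comparing `term_counts` computed separately for each publication year would give the kind of year-over-year term comparison the abstract describes.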

A Study on the Development of E-book Contents for Fashion Online Entrepreneurship Education (패션온라인창업 교육을 위한 전자책 콘텐츠 개발에 대한 연구)

  • Hwa-Yeon Jeong;Eun-Hee Hong
    • Journal of the Korea Fashion and Costume Design Association / v.26 no.1 / pp.33-44 / 2024
  • This study developed e-book content in order to use e-books as a tool to provide more efficient classes to learners who are familiar with smart devices and online spaces. E-book contents were produced using Sigil-0.9.10. The development process is as follows. Before e-book development, it is necessary to prepare manuscript files, image files to be inserted, fonts to be used, and e-book covers. After inserting the book cover images, it is necessary to register the table of contents using the title tag and register the free fonts. Also, a style must be created for text or images used in the main text connected to a file containing the entire text. Then, after separating the entire text file into separate files according to each chapter, the text is completed in turn. E-books were produced focusing on hyperlink functions so that educational content and various example images could be accessed. Currently, there is a lack of research on e-books as textbooks in universities within the fashion design major. In the future, if e-book contents are developed according to the characteristics of courses and the level of learners, they can be used as effective teaching tools.

An Embedded Text Index System for Mass Flash Memory (대용량 플래시 메모리를 위한 임베디드 텍스트 인덱스 시스템)

  • Yun, Sang-Hun;Cho, Haeng-Rae
    • Journal of the Korea Society of Computer and Information / v.14 no.6 / pp.1-10 / 2009
  • Flash memory has the advantages of nonvolatility, low power consumption, light weight, and high endurance. This enables flash memory to be used as storage for mobile computing devices such as the PMP (Portable Multimedia Player). A portable device with mass flash memory can store various multimedia data such as video, audio, or images. Typical index systems for mobile computers are inefficient at searching text such as lyrics or titles. In this paper, we propose a new text index system named EMTEX (Embedded Text Index). EMTEX has the following salient features. First, it uses a compression algorithm for embedded systems. Second, if a new insert or delete operation is executed on the base table, EMTEX updates the text index immediately. Third, EMTEX considers the characteristics of flash memory in the design of the insert, delete, and rebuild operations on the text index. Finally, EMTEX is executed as an upper layer of the DBMS and is therefore independent of the underlying DBMS. We evaluate the performance of EMTEX. The experimental results show that EMTEX can outperform conventional index systems such as Oracle Text and FT3.