• Title/Summary/Keyword: text approach (텍스트 접근법)

Search Result 49, Processing Time 0.028 seconds

The Analysis of Research Trends in Technology to the Fourth Industrial Revolution using SNA (소셜 네트워크 분석을 이용한 4차 산업혁명 기술 분야의 연구 동향 분석)

  • Kim, Hong-Gwang;Ahn, Jong-Wook
    • Journal of Cadastre & Land InformatiX / v.49 no.1 / pp.113-121 / 2019
  • Fourth industrial revolution technology focuses on fusing urban infrastructure with various advanced technologies, so technical cooperation across research fields is essential. To promote these technologies, the state of technology in each field must be surveyed. This paper therefore analyzes domestic and foreign research trends in fourth industrial revolution technology using social network analysis (SNA) and text mining of websites. We collected the text and dates of research papers and reports published on websites over five years, from January 1, 2014 to December 31, 2018. We then derived the major keywords from this public data through morphological analysis and analyzed the lists of core and related keywords through SNA. In Korea, the focus is on R&D and on legal and institutional solutions related to fourth industrial revolution technology. Abroad, by contrast, the focus is on the details of practical technologies for urban services.
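The keyword network step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which network measures or tools were used, so the co-occurrence weighting, the degree centrality measure, the function names, and the sample keyword lists below are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(docs):
    """Count how often each unordered keyword pair appears in the same document."""
    edges = Counter()
    for keywords in docs:
        # sorted(set(...)) removes repeats and fixes the order of each pair
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Normalized degree: number of distinct neighbours divided by (n - 1)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {k: 0.0 for k in neighbours}
    return {k: len(v) / (n - 1) for k, v in neighbours.items()}


# Hypothetical keyword lists, one per document (not data from the paper).
docs = [
    ["smart city", "IoT", "big data"],
    ["IoT", "big data"],
    ["smart city", "regulation"],
]
edges = cooccurrence_edges(docs)
centrality = degree_centrality(edges)
```

In this toy network, "smart city" is linked to every other keyword, so it would be read as the core keyword, with the remaining terms as related keywords.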

Topic Modeling based Interdisciplinarity Measurement in the Informatics Related Journals (토픽 모델링 기반 정보학 분야 학술지의 학제성 측정 연구)

  • Jin, Seol A;Song, Min
    • Journal of the Korean Society for information Management / v.33 no.1 / pp.7-32 / 2016
  • This study measured interdisciplinarity using topic modeling, which automatically extracts sub-topics from the terms appearing in a group of documents, unlike the traditional top-down approach based on references and classification systems. We used the titles and abstracts of articles published over the past five years in the top 20 journals, ranked by 5-year impact factor, in the 'Information & Library Science' category of JCR 2013. We applied 'Discipline Diversity' and 'Network Coherence' as factors in measuring interdisciplinarity: the 'Shannon Entropy Index' and the 'Stirling Diversity Index' gauged the diversity of fields, while the average path length of the topic network represented network cohesion. After classifying the types of interdisciplinarity using the resulting diversity and cohesion indices, we compared the topic networks of journals representing each type. We found that the text-based diversity index ranked journals differently from the reference-based diversity index, which suggests that the two indices can be used in a complementary way. We also confirmed that the characteristics and interconnectedness of the sub-topics covered in each journal can be understood intuitively through topic networks classified by both diversity and cohesion. In conclusion, the proposed topic-modeling-based measurement was confirmed to be applicable to showing the interdisciplinarity of journals from multiple angles.
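The two diversity indices named in this abstract can be sketched as follows. The abstract does not give the formulas, so this uses the commonly cited definitions: Shannon entropy H = -Σ p_i ln p_i and Rao-Stirling diversity Σ_{i≠j} d_ij p_i p_j. The topic proportions and the distance matrix are hypothetical inputs, and the paper's exact variant (for example, summing over i<j instead) may differ.

```python
import math


def shannon_entropy(p):
    """Shannon entropy of a topic distribution p (natural log)."""
    return -sum(x * math.log(x) for x in p if x > 0)


def rao_stirling(p, dist):
    """Rao-Stirling diversity: sum over ordered pairs i != j of d_ij * p_i * p_j."""
    n = len(p)
    return sum(
        p[i] * p[j] * dist[i][j]
        for i in range(n)
        for j in range(n)
        if i != j
    )


# Hypothetical journal: four topics with equal weight, all maximally distant.
p = [0.25, 0.25, 0.25, 0.25]
dist = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

entropy = shannon_entropy(p)        # equals ln(4) for a uniform distribution
diversity = rao_stirling(p, dist)   # equals 1 - sum(p_i^2) when all d_ij = 1
```

Entropy only reflects how evenly the topics are spread, whereas the Rao-Stirling form also weights each pair by how far apart the two topics are.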

Online Document Mining Approach to Predicting Crowdfunding Success (온라인 문서 마이닝 접근법을 활용한 크라우드펀딩의 성공여부 예측 방법)

  • Nam, Suhyeon;Jin, Yoonsun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.45-66 / 2018
  • Crowdfunding has become more popular than angel funding for fundraising by venture companies. Identification of success factors may be useful for fundraisers and investors to make decisions related to crowdfunding projects and predict a priori whether they will be successful or not. Recent studies have suggested several numeric factors, such as project goals and the number of associated SNS, studying how these affect the success of crowdfunding campaigns. However, prediction of the success of crowdfunding campaigns via non-numeric and unstructured data is not yet possible, especially through analysis of structural characteristics of documents introducing projects in need of funding. Analysis of these documents is promising because they are open and inexpensive to obtain. We propose a novel method to predict the success of a crowdfunding project based on the introductory text. To test the performance of the proposed method, in our study, texts related to 1,980 actual crowdfunding projects were collected and empirically analyzed. From the text data set, the following details about the projects were collected: category, number of replies, funding goal, fundraising method, reward, number of SNS followers, number of images and videos, and miscellaneous numeric data. These factors were identified as significant input features to be used in classification algorithms. The results suggest that the proposed method outperforms other recently proposed, non-text-based methods in terms of accuracy, F-score, and elapsed time.

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

  • Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.125-148 / 2018
  • As demand for big data analysis grows, unstructured data is increasingly being analyzed and the results put to use. Among the various types of unstructured data, text is used to communicate information in almost all fields. Text also attracts many analysts because it is available in very large quantities and is relatively easy to collect compared with other unstructured and structured data. Among text analysis applications, the following have been actively studied: document classification, which assigns documents to predetermined categories; topic modeling, which extracts major topics from a large number of documents; sentiment analysis or opinion mining, which identifies emotions or opinions contained in texts; and text summarization, which summarizes the main content of one or several documents. Text summarization in particular is actively applied in business through news summary services, privacy policy summary services, etc. Academia has also produced much research on both the extractive approach, which selects the main elements of a document, and the abstractive approach, which extracts elements of a document and composes new sentences by combining them. However, techniques for evaluating the quality of automatically summarized documents have not progressed as far as the summarization techniques themselves. Most existing studies on summary quality evaluation summarize documents manually, use the results as reference documents, and measure the similarity between the automatic summary and the reference document. Specifically, an automatic summary is produced from the full text by various techniques, and its quality is measured by comparing it with a reference document, which serves as the ideal summary.
Reference documents are provided in two major ways. The most common is manual summarization, in which a person creates an ideal summary by hand. Because this method requires human intervention, preparing the summary takes considerable time and cost, and the evaluation result may differ depending on the subjectivity of the summarizer. To overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. A representative recent method reduces the size of the full text and measures the similarity between the reduced full text and the automatic summary. In this method, the more of the frequent terms in the full text that appear in the summary, the better the summary is judged to be. However, since summarization essentially means condensing a large amount of content while minimizing omissions, it is unreasonable to say that a summary rated "good" on frequency alone is always a "good summary" in the essential sense. To overcome this limitation of previous work, this study proposes an automatic method for evaluating text summarization quality based on the essential meaning of summarization. Specifically, succinctness is defined as an element indicating how little content is duplicated among the sentences of the summary, and completeness as an element indicating how little of the content is missing from the summary. The proposed evaluation method is built on these two concepts.
To evaluate the practical applicability of the proposed methodology, we extracted 29,671 sentences from TripAdvisor hotel reviews, summarized the reviews for each hotel, and present the results of experiments evaluating the quality of the summaries according to the proposed methodology. We also provide a way to integrate completeness and succinctness, which are in a trade-off relationship, into an F-score, and propose a method for finding the optimal summary by varying the sentence similarity threshold.
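The idea of folding completeness and succinctness into a single F-score can be sketched roughly as below. The abstract does not specify how the two quantities or the sentence similarity are computed, so everything here is an assumption for illustration: Jaccard overlap of word sets as the similarity, "covered" and "duplicate" decided by a threshold, and the harmonic mean as the F-score. The review sentences are invented.

```python
def jaccard(a, b):
    """Word-set overlap between two sentences (assumed similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def completeness(source, summary, threshold):
    """Share of source sentences matched by at least one summary sentence."""
    if not source:
        return 0.0
    covered = sum(
        1 for s in source if any(jaccard(s, t) >= threshold for t in summary)
    )
    return covered / len(source)


def succinctness(summary, threshold):
    """Share of summary sentences that do not duplicate an earlier one."""
    if not summary:
        return 0.0
    unique = sum(
        1
        for i, t in enumerate(summary)
        if all(jaccard(t, u) < threshold for u in summary[:i])
    )
    return unique / len(summary)


def f_score(c, s):
    """Harmonic mean of completeness and succinctness."""
    return 0.0 if c + s == 0 else 2 * c * s / (c + s)


# Invented hotel-review sentences (not data from the paper).
source = [
    "the room was clean",
    "the staff was friendly",
    "breakfast was great",
    "the room was very clean",
]
summary = ["the room was clean", "the room was very clean"]

comp = completeness(source, summary, threshold=0.6)
succ = succinctness(summary, threshold=0.6)
f = f_score(comp, succ)
```

Under these assumptions, the threshold changes which sentences count as covered or duplicated, so sweeping it and keeping the value with the highest F-score is one way to read the threshold tuning the abstract mentions.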

An SAO-based Text Mining Approach for Technology Roadmapping Using Patent Information (기술로드맵핑을 위한 특허정보의 SAO기반 텍스트 마이닝 접근 방법)

  • Choi, Sung-Chul;Kim, Hong-Bin;Yoon, Jang-Hyeok
    • Journal of Technology Innovation / v.20 no.1 / pp.199-234 / 2012
  • Technology roadmaps (TRMs) are considered an essential tool for strategic technology planning and management. Rapidly evolving technological trends and severe technological competition are making TRMs more important than ever, because a TRM serves as a "map" that aligns organizational objectives with the relevant technologies. However, constructing and managing TRMs is costly and time-consuming because it relies on the qualitative and intuitive knowledge of human experts. Enhancing the productivity of TRM development is therefore one of the major concerns in technology planning. In this regard, this paper proposes a technology roadmapping approach based on functions, a concept that covers the objectives, structures, and effects of a technology; functions are represented as Subject-Action-Object (SAO) structures that can be extracted by applying natural language processing to patent text. We expect that the proposed method will broaden experts' technological horizons in the technology planning process and will help construct TRMs efficiently, with reduced time and cost.


A Study on Current State of Web Content Accessibility on General Hospital Websites in Korea (국내 종합병원의 웹 접근성 실태에 관한 연구)

  • Kim, Yong-Seob;Oh, Kun-Seok
    • Journal of Internet Computing and Services / v.11 no.3 / pp.87-103 / 2010
  • This study introduces domestic and foreign trends in web accessibility, as well as the legal system that ensures it. Based on the Korean Web Content Accessibility Guidelines (KWCAG) 1.0, we investigated the web content accessibility of 80 tertiary health-care hospitals and general hospitals in Korea. We evaluated accessibility by combining accessibility-based criteria (ABC) with usability-based criteria (UBC). ABC was limited to alternative text for Guideline 1, and to the use of a small number of frames and keyboard accessibility for Guideline 2. UBC checked for a voice service (TTS), text resizing, multilingual websites, and disclosure of a web accessibility policy. KADO-WAH2.0 was used to report the compliance rate. The results showed a considerable improvement over previous results, although the rate of compliance with web accessibility was still generally insufficient. There was a significant difference between medical centers that complied with web accessibility and those that did not. Many hospitals were also found to have attempted to address web accessibility. For medical centers that serve the public interest, we recommend the following: actively promoting the establishment of independent accessibility guidelines to secure web accessibility, improving the implementation of web accessibility, providing continuous education and promotion, and supplementing the institutional framework.

Literary Texts in the English Classroom: An Integrated Approach to English Instruction (영어 교실의 문학 텍스트 -영어교육의 통합적 접근)

  • Kang, Gyu Han
    • Journal of English Language & Literature / v.55 no.1 / pp.107-128 / 2009
  • Literature was at center stage in traditional grammar-translation-focused English classrooms up to the mid-twentieth century. As the Audiolingual Method and Communicative Language Teaching gained popularity, however, literature receded into the background of English education. The main reasons for using literary texts in communication-focused English instruction need to be examined. First of all, students can come in touch with the subtle and varied uses of language through literature-based teaching. They also feel close to certain characters in the literary work and share emotional responses with them, becoming personally involved in the plot of the story. Universal human experience and cultural enrichment are two other merits that literary texts can confer on students. Such linguistic and literary experiences can be meaningfully integrated into literature-based instruction. More significantly, the four language skills (reading, writing, listening, and speaking) can be combined with one another and integrated into a literature-focused curriculum for English education. The value of literary texts in the English classroom can be clearly demonstrated by effective ways of using texts such as Charlotte's Web for integrated instruction. The full array of benefits that literature can bring to English instruction, however, has yet to be realized; this potential needs to be translated into classroom practice.

A Study on automatic assignment of descriptors using machine learning (기계학습을 통한 디스크립터 자동부여에 관한 연구)

  • Kim, Pan-Jun
    • Journal of the Korean Society for information Management / v.23 no.1 s.59 / pp.279-299 / 2006
  • This study applies various machine learning approaches to automatically assigning descriptors to journal articles. After selecting core journals in the field of information science and building a test collection from the articles of the past 11 years, we examined the effectiveness of feature selection and the effect of training set size. Regarding feature selection, Support Vector Machines (SVM) performed best when trained after reducing the feature set using the $\chi^2$ statistic (CHI) and criteria that prefer high-frequency features (COS, GSS, JAC). The size of the training set significantly influenced the performance of SVM and Voted Perceptron (VTP), but had little effect on Naive Bayes (NB).
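The $\chi^2$ feature selection mentioned in this abstract can be illustrated with the standard term-category contingency form, χ² = N(AD − CB)² / ((A+C)(B+D)(A+B)(C+D)). The abstract does not state the exact variant used, and the function name and document counts below are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square score of a term for one descriptor (category).

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or category never varies
    return n * (a * d - c * b) ** 2 / denom


# Hypothetical counts (a, b, c, d) for two candidate terms.
counts = {
    "retrieval": (40, 10, 10, 40),  # concentrated in the category
    "study": (25, 25, 25, 25),      # spread evenly, carries no signal
}
ranked = sorted(counts, key=lambda t: chi_square(*counts[t]), reverse=True)
```

Keeping only the top-ranked terms is one simple way to reduce the feature set before training a classifier such as an SVM.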

An Implementation of Mobile Messenger Application for Kindergartens and Nurseries (영유아교육기관용 모바일 메신저 어플리케이션 구현)

  • Han, Dong-Gyoon
    • Journal of Digital Contents Society / v.13 no.3 / pp.401-412 / 2012
  • Communication through smartphones is creating a new paradigm of mobile communication beyond the restrictions of mobility and space. The mobile instant messengers of the smartphone, which began on the desktop environment, are providing various communication features, including multimedia contents, texts, and voice. Kindergartens and nurseries are using Websites, phones, SMS, printed materials, handwritten notifications, etc. to communicate with the parents. IsPlus was developed to facilitate the communication between the parents and the teachers, including exchanges of pictures and videos, chatting, and attendance management using smartphones. This is meaningful in that it is an early smartphone mobile messenger for educational institutions for infants and preschoolers. This study presented a new approach to the development of a mobile messenger for a specific group of users by analyzing the user environment, communication types, application planning process and interface design development using storytelling.

A Corpus Analysis to the Engineering Academic English (공학학술영어에 대한 코퍼스 분석)

  • Ha, Myung-Jeong;Rhee, Eugene
    • Proceedings of the Korea Contents Association Conference / 2017.05a / pp.139-140 / 2017
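  • This study discusses the usefulness of a corpus-based approach to English for Specific Purposes (ESP), the discipline-specific English that engineering students learn. To this end, we built a corpus from the subject texts used in engineering colleges, present the results of computer-based analysis, examine the characteristics of the engineering English corpus, and ultimately aim to contribute to data-driven learning for engineering students taking English-medium courses. The target corpus was built from engineering courses common to all majors regardless of specialization, and the British National Corpus was used as the reference corpus for comparison. The engineering English corpus consists of 1.8 million words in total and about 16,000 word types, and frequency and keyword analyses were performed with the corpus analysis tool AntConc 3.4.4. Analysis of high-frequency vocabulary showed that the most frequent words in both the target and reference corpora were function words rather than content words, and analysis of content words alone revealed that the engineering English corpus contains more science-domain word variants than the reference corpus. The keyword analysis further showed that the keyword verbs of the engineering English corpus include relatively more non-technical academic vocabulary than technical vocabulary, which points to the need in ESP education to raise awareness of general academic English alongside discipline-specific technical English.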
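Keyword analysis against a reference corpus, as described in this abstract, scores each word by how unusually frequent it is in the target corpus. The abstract does not say which keyness statistic was used, so the sketch below assumes the log-likelihood statistic, one common choice. The word counts are invented, and the reference corpus size is a round placeholder rather than the real size of the British National Corpus.

```python
import math


def log_likelihood(freq_target, freq_ref, size_target, size_ref):
    """Log-likelihood keyness of one word across a target and a reference corpus."""
    total = freq_target + freq_ref
    expected_target = size_target * total / (size_target + size_ref)
    expected_ref = size_ref * total / (size_target + size_ref)
    ll = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if freq_target > 0:
        ll += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll


# Invented counts: (frequency in target corpus, frequency in reference corpus).
size_target, size_ref = 1_800_000, 100_000_000  # reference size is a placeholder
words = {"determine": (900, 8_000), "the": (108_000, 6_000_000)}

keyness = {
    w: log_likelihood(ft, fr, size_target, size_ref)
    for w, (ft, fr) in words.items()
}
```

A word whose relative frequency is the same in both corpora scores zero, which is why very frequent function words can still have no keyness.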
