Multi-Vector Document Embedding Using Semantic Decomposition of Complex Documents (복합 문서의 의미적 분해를 통한 다중 벡터 문서 임베딩 방법론)

  • Park, Jongin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.19-41 / 2019
  • With the rapidly increasing demand for text data analysis, research and investment in text mining are being actively conducted not only in academia but also in various industries. Text mining is generally conducted in two steps. In the first step, the text of the collected documents is tokenized and structured to convert the original documents into a computer-readable form. In the second step, tasks such as document classification, clustering, and topic modeling are conducted according to the purpose of the analysis. Until recently, text mining studies focused on the second step. However, with the discovery that the text structuring process substantially influences the quality of the analysis results, various embedding methods have been actively studied to improve that quality by preserving the meaning of words and documents when text data are represented as vectors. Unlike structured data, which can be directly applied to a variety of operations and traditional analysis techniques, unstructured text must first be structured, that is, transformed into a form that a computer can process. Mapping arbitrary objects into a space of a specific dimension while preserving their algebraic properties, for the purpose of structuring text data, is called "embedding." Recently, attempts have been made to embed not only words but also sentences, paragraphs, and entire documents. In particular, as the demand for document embedding has increased rapidly, many algorithms have been developed to support it. Among them, doc2Vec, which extends word2Vec and embeds each document into a single vector, is the most widely used. However, the traditional document embedding method represented by doc2Vec generates a vector for each document using all the words included in the document.
As a result, the document vector is affected not only by core words but also by miscellaneous words. Additionally, traditional document embedding schemes usually map each document to a single vector, so it is difficult to accurately represent a complex document with multiple subjects. In this paper, we propose a new multi-vector document embedding method to overcome these limitations. This study targets documents that explicitly separate body content and keywords. For a document without keywords, the method can be applied after extracting keywords through various analysis methods. However, since keyword extraction is not the core subject of the proposed method, we describe its application to documents with predefined keywords. The proposed method consists of (1) Parsing, (2) Word Embedding, (3) Keyword Vector Extraction, (4) Keyword Clustering, and (5) Multiple-Vector Generation. The specific process is as follows. First, all text in a document is tokenized, and each token is represented as an N-dimensional real-valued vector through word embedding. Next, to keep miscellaneous words from affecting the result, the vectors corresponding to the keywords of each document are extracted to form a set of keyword vectors for that document. Clustering is then conducted on each document's set of keyword vectors to identify the multiple subjects included in the document. Finally, a multi-vector is generated from the keyword vectors constituting each cluster. Experiments on 3,147 academic papers revealed that the traditional single-vector approach cannot properly map complex documents because of interference among subjects in each vector.
With the proposed multi-vector method, we confirmed that complex documents can be vectorized more accurately by eliminating the interference among subjects.
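
Steps (3) to (5) of the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or say how each cluster is turned into a vector, so this sketch assumes plain k-means and a per-cluster mean. The names `multi_vector_embedding`, `kmeans`, and `toy_word_vectors` are hypothetical, and the word embeddings from steps (1) and (2) are assumed to be available already as a dictionary.

```python
import random
from typing import Dict, List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors: Sequence[Vector], k: int,
           iterations: int = 20, seed: int = 0) -> List[int]:
    """Plain k-means (an assumption; the abstract names no algorithm).

    Returns the cluster index assigned to each input vector.
    """
    rng = random.Random(seed)
    centroids = rng.sample(list(vectors), k)
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = mean_vector(members)
    return assignments


def multi_vector_embedding(keywords: Sequence[str],
                           word_vectors: Dict[str, Vector],
                           k: int) -> List[Vector]:
    """Return up to k vectors for one document, one per keyword cluster."""
    # Step 3: keep only the vectors of the document's keywords.
    keyword_vectors = [word_vectors[w] for w in keywords if w in word_vectors]
    if not keyword_vectors:
        return []
    k = min(k, len(keyword_vectors))
    # Step 4: cluster the keyword vectors to separate the subjects.
    assignments = kmeans(keyword_vectors, k)
    # Step 5: one vector per cluster (the mean is an assumption).
    result = []
    for c in range(k):
        members = [v for v, a in zip(keyword_vectors, assignments) if a == c]
        if members:
            result.append(mean_vector(members))
    return result


# Toy 2-dimensional embeddings, invented purely for illustration.
toy_word_vectors = {
    "embedding": [1.0, 0.0],
    "vector": [0.9, 0.1],
    "library": [0.0, 1.0],
    "journal": [0.1, 0.9],
}
doc_keywords = ["embedding", "vector", "library", "journal", "unknown-term"]
doc_vectors = multi_vector_embedding(doc_keywords, toy_word_vectors, k=2)
print(doc_vectors)
```

In this toy case the document has two subjects, so it is represented by two vectors, one near each keyword group, rather than by a single vector lying between them. Choosing `k` per document is left open here, as the abstract does not say how the number of subjects is determined.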

A study on the improvements of Foreign Research Information Center from the perspective of librarians in charge (외국학술지지원센터 개선방안에 관한 연구 - 운영 담당자의 관점을 중심으로 -)

  • Lee, Jongwook
    • Journal of Korean Library and Information Science Society / v.49 no.3 / pp.283-305 / 2018
  • Although academic library budgets have been decreasing, print and electronic journal subscription prices have consistently increased. In response, as part of efforts to ensure access to foreign academic materials, the Ministry of Education and the Korea Education & Research Information Service (KERIS) have operated the Foreign Research Information Center (FRIC) since 2006, pursuing the cooperative acquisition and sharing of foreign print journals. This study examines the current state of FRIC and, through in-depth interviews with the librarians in charge of FRIC, investigates its roles and values, the issues raised by stakeholders, improvements in services, and new service elements. The findings show that FRIC has contributed to the sharing of academic materials and to promoting research. However, the five types of stakeholders (i.e., the Ministry of Education/KERIS, universities/libraries, users, FRICs, and publishers/agencies) were found to have diverse issues and problems with FRIC. This study therefore makes suggestions to address these issues in terms of policy, system, management, and service.

A Qualitative Study on the Experience of Visually Impaired Researchers in the Acquisition and Use of Scholarly Contents (시각장애 연구자의 학술정보 획득 및 활용 경험에 관한 질적 연구)

  • Bak, Seongeui;Shim, Wonsik
    • Journal of Korean Library and Information Science Society / v.48 no.1 / pp.161-189 / 2017
  • The purpose of this study is to describe visually impaired academic researchers' experience of using scholarly contents and to explore the intrinsic nature of that experience. In-depth interviews were conducted with twelve visually impaired academic researchers, and the data were analyzed using Colaizzi's phenomenological research method. A total of 107 significant statements were extracted and divided into 44 themes and 12 theme clusters. The statements were then classified into four categories. The 'scholarly contents acquisition and use' category concerns the difficulties that these researchers experience when dealing with internet sites that have poor web accessibility and full-text availability. The 'changes in perception and emotions' category deals with the transitions in perception and mood that visually impaired academic researchers experience over time. The 'relationships with support personnel' category includes issues related to the difficulty of securing support persons, support persons' inadequate level of competence, and establishing and sustaining personal relationships. Finally, the 'improvement requirements' category includes issues that these researchers want resolved with regard to contents acquisition and use.

Using Text-mining Method to Identify Research Trends of Freshwater Exotic Species in Korea (텍스트마이닝 (text-mining) 기법을 이용한 국내 담수외래종 연구동향 파악)

  • Do, Yuno;Ko, Eui-Jeong;Kim, Young-Min;Kim, Hyo-Gyeom;Joo, Gea-Jae;Kim, Ji Yoon;Kim, Hyun-Woo
    • Korean Journal of Ecology and Environment / v.48 no.3 / pp.195-202 / 2015
  • We identified research trends for freshwater exotic species in South Korea using text mining methods in conjunction with bibliometric analysis. We used the scientific and common names of freshwater exotic species as search keywords, covering 1 mammal species, 3 amphibian-reptile species, 11 fish species, and 2 aquatic plant species. A total of 245 articles, including research articles and abstracts of conference proceedings published by 56 academic societies and institutes, were collected from scientific article databases. The search keywords used were the common names for the exotic species. The 20th century (1900s) saw the number of articles increase; however, during the early 21st century (2000s) the number of published articles decreased slowly. The number of articles focusing on physiological and embryological research was significantly greater than the number of taxonomic and ecological studies. Rainbow trout and Nile tilapia were the main research topics, specifically physiological and embryological research associated with the aquaculture of these species. Ecological studies were conducted only on the distribution and effect of largemouth bass and nutria. The ecological risk associated with freshwater exotic species has been expressed, yet the scientific information might be insufficient to remove doubt about the ecological issues raised by interested individuals and policy makers, owing to bias in research topics with respect to freshwater exotic species. Research topics on freshwater exotic species need to diversify so that these species can be managed effectively.

Research Trends on Emotional Labor in Korea using text mining (텍스트마이닝을 활용한 감정노동 연구 동향 분석)

  • Cho, Kyoung-Won;Han, Na-Young
    • Journal of Korea Society of Industrial Information Systems / v.26 no.6 / pp.119-133 / 2021
  • Text mining has been used to identify research trends in many fields, but not yet in the field of emotional labor. This study uses text mining to analyze in depth 1,465 papers indexed in the Korea Citation Index (KCI) from 2004 to 2019 that contain the subject term 'emotional labor', in order to understand trends in emotional labor research. Topics were extracted by LDA analysis, and IDM analysis was performed to confirm the proportion and similarity of the topics. Through these methods, an integrated analysis of topics was conducted considering the usefulness of topics with high similarity. The research topics fall into 11 categories, listed in descending order of share: stress of emotional labor (12.2%), emotional labor and social support (12.0%), customer service workers' emotional labor (10.9%), emotional labor and resilience (10.2%), emotional labor strategy (9.2%), call center counselor's emotional labor (9.1%), results of emotional labor (9.0%), emotional labor and job exhaustion (7.9%), emotional intelligence (7.1%), preliminary care service workers' emotional labor (6.6%), and emotional labor and organizational culture (5.9%). Through topic modeling and trend analysis, the research trend and academic progress of emotional labor are analyzed to present a direction for emotional labor research, and it is expected that a practical strategy for emotional labor can be established.

The Effects of Academic Achievement and Learning Satisfaction According to the Presentation Method of the Multimedia Materials for 'Transportation Technology' Unit of Technology.Home Economics Subject (기술.가정 교과 '수송기술' 단원에서 수업 자료의 제시 방법에 따른 학업 성취도와 학습 만족도에 미치는 영향)

  • Kim, Seong-Il
    • 대한공업교육학회지 / v.37 no.2 / pp.147-160 / 2012
  • The purpose of this study was to examine how the presentation method of multimedia materials affects academic achievement and learning satisfaction in the 'transportation technology' unit of the technology and home economics subject. The subjects were assigned to three conditions: a text-type explanation class, a multimedia class, and a multimedia video class with narration. Data from six evaluation questions, obtained from a survey of 93 female high school students, were analyzed using SPSS. The results of the study were as follows. First, in the average learning satisfaction level (M) of the students' overall responses to the questions, the multimedia teaching-learning class (experimental group 1) ranked first (M=4.14), the multimedia video class with narration (experimental group 2) ranked second (M=3.16), and the instructor-led class (control group) ranked third (M=2.63). Therefore, the multimedia teaching-learning class (experimental group 1) was the most effective. Second, the correlations between the students' responses show that in an interesting class students retain and comprehend the material well, whereas lower concentration does not lead to retention. Third, the multimedia teaching-learning class (experimental group 1) achieved the highest level of academic achievement, while the instructor-led class (control group) and the multimedia video class with narration (experimental group 2) shared second place with similar levels. To increase academic achievement, an instructor-led class needs to arouse interest, and a multimedia video class with narration needs ways to improve the level of concentration.

Trends and Issues of Tibetan History in Taiwan (대만의 티베트사(史) 연구 동향과 쟁점)

  • Sim, HyukJoo
    • 동북아역사논총 / no.60 / pp.196-227 / 2018
  • The issues of this study are as follows. First, I will examine the overall situation and transition trends of Tibetan research in Taiwan since the modern period, and examine the development and trends of Tibetan history research in Taiwan. Second, to that end, I will analyze the trends of Taiwan's major Tibetan research institutes and scholars and trace their trajectories. Third, the trend of Tibetan research in Taiwan may be a useful indicator for us to analyze the research methods and trends of Taiwanese scholars. If there is a flow of features and transitions, the text will explore the reason. Fourth, one of the implications of this study is that it can trigger an understanding of locality in the structure of the central region, the Han Chinese minority, and the possession and distribution of academic reasoning. In other words, it should be noted that even though the same Tibetan research is conducted, China is in the position of the vested right to distribute the central or ownership, while Taiwan has historical and territorial characteristics that deviate from such a gaze and attitude. Taiwan may be sensitive to the vertical concept understood as a change in the relationship between the state and the center, or to whether it is applicable to Tibetan research. If there is such an academic climate, I would like to consider what it suggests for us. This may provide a direction for viewing the academic issues of a few scholars, or even the domestic academic world, as an independent object of more specific academic research.

A Study on the Evaluation and Improvement of Accessibility in Korean Online e-Journal (국내 온라인 학술지의 접근성 평가 및 개선에 관한 연구)

  • Boseong, Jang
    • Journal of the Korean Society for Library and Information Science / v.56 no.4 / pp.161-180 / 2022
  • This study aims to improve the accessibility of websites where online e-journals can be searched and their full text viewed, and the accessibility of web content in the form of journal articles. To publish online e-journals in Korea, an article contribution management system is used, and services are provided through public or private academic DB companies. No accessibility-related content was found in the publishing and editing stages of online e-journals. In other countries, the objective is to comply with Level AA of WCAG 2.1 in order to improve the accessibility of websites and web content. In addition, the accessibility level of academic journals is communicated through a VPAT. To improve access to web content in online journals, accessibility provisions should be added to academic societies' editorial and publication regulations, accessibility education should be provided to journal editors, and accessibility checklists should be developed so that researchers can verify their papers themselves. To improve the accessibility of online e-journal websites, various convenience functions should be provided so that the website can be used equally by all. The article contribution management system should provide guidance on accessibility functions. Each academic society and academic DB company should be required to submit a Korean VPAT.

A Study on the design of XHTML DTD for Academic Journal Articles (학술지 논문을 위한 XHTML DTD의 설계)

  • 윤영준;이응봉
    • Proceedings of the Korean Society for Information Management Conference / 2000.08a / pp.83-88 / 2000
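  • In the 1990s, the Internet-based Web brought great advances to the Internet environment of modern society. SGML, established in 1987, had been used as a way of representing information, and after the development of the Web, HTML (HyperText Markup Language), the Web's standard document format, became widespread. HTML in particular evolved through HTML 2.0, HTML 3.2, HTML 4.0, and most recently HTML 4.01, with a focus on user convenience and added functionality, but it remained insufficient for representing information in diverse forms. The W3C therefore proposed a new HTML in XML form, and this new HTML is XHTML. XHTML accommodates the features of HTML 4.0, can be used in existing browsers, and is an application of XML. This paper uses XHTML to develop and implement a DTD that can represent journal articles on the Web.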

Academic Research Inspired Design of an Expository Organic Chemistry Lab Course

  • Kim, Thomas Taehyung;Kim, Hyunwoo;Han, Sunkyu
    • Journal of the Korean Chemical Society / v.62 no.2 / pp.99-105 / 2018
  • In this paper, we present fortified instructional methods that contributed to improving students' interest in the expository organic chemistry laboratory course. A reformed TA (teaching assistant) training and allocation method, a thorough course orientation session, text-light, graphics-heavy PPT results reports, and term papers based on a journal article template have improved students' satisfaction with the organic chemistry laboratory course. These methods could be implemented while maintaining traditional organic chemistry laboratory instruction styles and hence could be broadly applicable.