• Title/Summary/Keyword: text research

Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.175-197 / 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis has been actively conducted and is showing remarkable results in fields such as classification, summarization, and generation. Among text analysis tasks, text classification is the most widely used in academia and industry. Text classification includes binary classification, which assigns one label out of two classes; multi-class classification, which assigns one label out of several classes; and multi-label classification, which assigns multiple labels out of several classes. Multi-label classification in particular requires a different training method from binary and multi-class classification because each instance carries multiple labels. In addition, because the number of labels to be predicted grows as the number of labels and classes grows, prediction becomes harder and performance is difficult to improve. To overcome these limitations, research on label embedding is being actively conducted. Label embedding (i) compresses the initially given high-dimensional label space into a low-dimensional latent label space, (ii) trains a model to predict the compressed label, and (iii) restores the predicted label to the original high-dimensional label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, because these techniques consider only linear relationships between labels or compress the labels by random transformation, they cannot capture non-linear relationships between labels and therefore cannot create a latent label space that sufficiently contains the information of the original labels.
Recently, there have been increasing attempts to improve performance by applying deep learning to label embedding. A representative approach is label embedding with an autoencoder, a deep learning model that is effective for data compression and restoration. However, traditional autoencoder-based label embedding loses a large amount of information when it compresses a high-dimensional label space with a very large number of classes into a low-dimensional latent label space. This loss can be traced to the vanishing gradient problem that occurs during backpropagation in training. Skip connections were devised to solve this problem: by adding the input of a layer to its output, they prevent gradients from vanishing during backpropagation, so training remains efficient even when the network is deep. Skip connections are mainly used for image feature extraction in convolutional neural networks, and studies that use them in autoencoders or in the label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. The proposed methodology was applied to actual paper keywords to derive a high-dimensional keyword label space and a low-dimensional latent label space. Using these, we conducted an experiment that predicts the compressed keyword vector in the latent label space from the paper abstract and evaluates multi-label classification by restoring the predicted keyword vector to the original label space. As a result, in terms of accuracy, precision, recall, and F1 score, multi-label classification based on the proposed methodology far outperformed traditional multi-label classification methods.
This indicates that the low-dimensional latent label space derived through the proposed methodology reflected the information of the high-dimensional label space well, which ultimately improved the performance of multi-label classification itself. In addition, the utility of the proposed methodology was examined by comparing its performance across domain characteristics and across the number of dimensions of the latent label space.
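
The compress, predict, and restore steps and the role of a skip connection can be illustrated with a small forward-pass sketch. This is not the authors' architecture: the layer sizes, the untrained random weights, the ReLU activation, the 0.5 threshold, and every function name below are assumptions made for illustration only.

```python
import random

def dense(x, weights, bias):
    """Affine map: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(values):
    return [max(0.0, v) for v in values]

def skip_block(x, weights, bias):
    """Skip connection: output = x + relu(W x + b). W must be square."""
    transformed = relu(dense(x, weights, bias))
    return [xi + ti for xi, ti in zip(x, transformed)]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def init_params(n_labels, n_latent, seed=0):
    """Untrained toy weights (hypothetical sizes, no training step shown)."""
    rng = random.Random(seed)
    return {
        "enc_skip_w": random_matrix(n_labels, n_labels, rng),
        "enc_skip_b": [0.0] * n_labels,
        "enc_w": random_matrix(n_latent, n_labels, rng),
        "enc_b": [0.0] * n_latent,
        "dec_w": random_matrix(n_labels, n_latent, rng),
        "dec_b": [0.0] * n_labels,
        "dec_skip_w": random_matrix(n_labels, n_labels, rng),
        "dec_skip_b": [0.0] * n_labels,
    }

def encode(labels, params):
    """(i) Compress a high-dimensional label vector into the latent space."""
    hidden = skip_block(labels, params["enc_skip_w"], params["enc_skip_b"])
    return dense(hidden, params["enc_w"], params["enc_b"])

def decode(latent, params):
    """(iii) Map a latent vector back to scores over the original labels."""
    hidden = dense(latent, params["dec_w"], params["dec_b"])
    return skip_block(hidden, params["dec_skip_w"], params["dec_skip_b"])

def restore_labels(scores, threshold=0.5):
    """Turn decoder scores into a binary multi-label vector."""
    return [1 if s >= threshold else 0 for s in scores]

params = init_params(n_labels=6, n_latent=2)
latent = encode([1, 0, 0, 1, 0, 1], params)   # 6 labels -> 2 latent values
scores = decode(latent, params)               # 2 latent values -> 6 scores
```

In step (ii), a separate text model would be trained to predict `latent` from the abstract; that model is omitted here. When the transformation inside `skip_block` outputs zeros, the block returns its input unchanged, which is the property that keeps a direct path open for information and gradients.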

Research Trends in Korean Healing Facilities and Healing Programs Using LDA Topic Modeling (LDA 토픽모델링을 활용한 국내 치유시설과 치유프로그램 연구 동향)

  • Lee, Ju-Hong;Lee, Kyung-Jin;Sung, Jung-Han
    • Journal of the Korean Institute of Landscape Architecture / v.51 no.3 / pp.95-106 / 2023
  • Korean healing research has developed over the past 20 years along with growing social interest in healing. The field of healing research is diverse and includes legislated nature-based healing. In this study, the abstracts of 2,202 academic journal articles, master's theses, and doctoral dissertations indexed in KCI and RISS were collected and analyzed. LDA topic modeling was used to classify research topics, and time-series publication trends were examined. The study identified that the topics of Korean healing research were connected through 5 types and 4 mediators. The five types were "Healing Tourism," "Mind and Art Healing," "Forest Therapy," "Healing Space," and "Youth Restoration and Healing," and the four mediators were "Forest," "Nature," "Culture," and "Education." In addition, only the legislated healing studies were extracted from Korean healing research and their topics were analyzed. As a result, legislated healing research was classified into four types: "Healing Spatial Environment Plan," "Healing Therapy Experiment," "Agricultural Education Experiential Healing," and "Healing Tourism Factor." Forest Therapy, which has the largest amount of research in legislated healing; Agro Healing and Garden Healing, which operate similar programs through plants; and Marine Healing, which uses marine resources, were also analyzed. As a result, topics that show the unique characteristics of individual healing studies and topics that are considered universal across all healing studies were derived. This study is significant in that it identified the overall trend of research on Korean healing facilities and programs by utilizing LDA topic modeling.
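
For readers unfamiliar with LDA, the following is a minimal sketch of collapsed Gibbs sampling, one common way to fit an LDA topic model. The abstract does not state which inference method or software the authors used, so the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]              # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]   # tokens per (topic, word)
    topic_total = [0] * n_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word, vocab

# Toy corpus (hypothetical tokens, not the study's data).
docs = [
    ["forest", "therapy", "forest", "nature"],
    ["tourism", "healing", "culture", "tourism"],
    ["forest", "nature", "education"],
]
theta, topic_word, vocab = lda_gibbs(docs, n_topics=2, n_iter=50)
```

Each row of `theta` gives one document's topic proportions. Averaging those rows by publication year is one simple way to obtain the kind of time-series topic trend the abstract describes.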

Weighted Subject - Method Network Analysis of Library and Information Science Studies (문헌정보학 분야 핵심 학술지들의 가중 주제-방법 네트워크 분석)

  • Lee, Keehoen;Jung, Hyojung;Song, Min
    • Journal of the Korean Society for Library and Information Science / v.49 no.3 / pp.457-488 / 2015
  • In this study, we analyzed the current state of research in Library and Information Science in the top 20 journals from 1990 to 2015, from the perspectives of subject and method. We developed a weighted subject-method network to investigate the centralities of subjects and methods as well as the relations between them. This network is composed of subject nodes and method nodes, and each node is weighted by topic occurrence. As a result, over the 25 years, management information systems, information need analysis, bibliometrics, and information policy were the top topics. Modeling, literature review, scientific research impact analysis, and web data analysis were the top methods. A recent rise of text mining is highlighted. We also analyzed the communities formed over the past 25 years and over the most recent 5 years. Bibliometrics is extending its field by applying various network analysis algorithms. Text mining is specialized in medical information systems and user interfaces. This result identifies the interests of leading studies in Library and Information Science. It can also serve as a fundamental resource for the development of Library and Information Science.
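
A weighted subject-method network of this kind can be sketched as a bipartite structure with occurrence counts on nodes and co-occurrence counts on edges. The input format (one subject and one method per paper), the function names, and the toy pairings below are assumptions; the abstract does not describe the authors' exact construction.

```python
from collections import Counter

def build_subject_method_network(papers):
    """Count node occurrences and subject-method edge weights.

    `papers` is a list of (subject, method) pairs, one per paper.
    """
    node_weight = Counter()
    edge_weight = Counter()
    for subject, method in papers:
        node_weight[("subject", subject)] += 1
        node_weight[("method", method)] += 1
        edge_weight[(subject, method)] += 1
    return node_weight, edge_weight

def methods_for_subject(edge_weight, subject):
    """List the methods linked to a subject, heaviest edge first."""
    linked = [(m, w) for (s, m), w in edge_weight.items() if s == subject]
    return sorted(linked, key=lambda pair: (-pair[1], pair[0]))

# Toy pairings, invented for illustration (not results from the study).
papers = [
    ("bibliometrics", "network analysis"),
    ("bibliometrics", "network analysis"),
    ("bibliometrics", "text mining"),
    ("information policy", "literature review"),
]
node_weight, edge_weight = build_subject_method_network(papers)
top_methods = methods_for_subject(edge_weight, "bibliometrics")
```

Keeping subjects and methods as separate node types makes it straightforward to ask which methods dominate a given subject, which is the relation the network is meant to expose.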

A Study on the International Research Trend in Education Development focused on Text Network Analysis(2002~2017) (교육개발협력에 관한 국제 학술지 연구 동향 고찰 : 텍스트 네트워크 분석을 중심으로(2002~2017))

  • Kim, Sang-Mi;Kim, Young-Hwan;Cho, Won-Gyeum
    • Korean Journal of Comparative Education / v.28 no.1 / pp.1-24 / 2018
  • The objective of this article is to identify the research trends and main traits reflected in the keywords of abstracts of research articles published in the "International Journal of Education Development" from 2002 to 2017. To do this, Text Network Analysis (TNA) was applied to 966 papers from the journal, and the major findings are as follows. First, the frequency analysis showed that keywords such as Administration of education program, Schools and instruction, Regional public administration, Educational support service, Elementary education, and Elementary and secondary school appeared more than 100 times and also had high degree centrality. Second, the analysis of keywords by development goal period showed that several new keywords, such as Elementary education, Elementary and secondary school, Education quality, Secondary education, and Educational planning, emerged frequently after the SDGs, and these keywords also showed high centrality. Third, the analysis by education level showed that keywords such as Elementary education, Administration of education program, and School children were high in frequency and degree centrality at the elementary level. At the secondary level, Schools and instruction, Administration of education program, and Academic achievement were high, and at the higher education level, College and university was high.
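
The two measures reported above, keyword frequency and degree centrality, can be computed from a keyword co-occurrence network. The sketch below assumes that two keywords are linked when they appear in the same paper and uses the standard normalization by n - 1; the abstract does not specify the authors' linking rule or tool, and the toy keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations

def keyword_network(papers):
    """Return keyword frequencies and an undirected co-occurrence graph."""
    frequency = Counter()
    neighbors = {}
    for keywords in papers:
        unique = sorted(set(keywords))
        frequency.update(unique)          # count each keyword once per paper
        for kw in unique:
            neighbors.setdefault(kw, set())
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return frequency, neighbors

def degree_centrality(neighbors):
    """Fraction of the other nodes that each keyword is linked to."""
    n = len(neighbors)
    if n < 2:
        return {kw: 0.0 for kw in neighbors}
    return {kw: len(linked) / (n - 1) for kw, linked in neighbors.items()}

# Toy keyword lists (hypothetical, not the 966 papers in the study).
papers = [
    ["Elementary education", "Education quality"],
    ["Elementary education", "Educational planning"],
    ["Secondary education", "Education quality", "Elementary education"],
]
frequency, neighbors = keyword_network(papers)
centrality = degree_centrality(neighbors)
```

In this toy network, "Elementary education" co-occurs with every other keyword, so it has both the highest frequency and the highest degree centrality, mirroring the pattern the abstract reports for its top keywords.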

A comparative study of domestic and international research trends of mathematics education through topic modeling (토픽모델링을 활용한 국내외 수학교육 연구 동향 비교 연구)

  • Shin, Dongjo
    • The Mathematical Education / v.59 no.1 / pp.63-80 / 2020
  • This study analyzed 3,114 articles published in KCI journals and 1,636 articles published in SSCI journals from 2000 to 2019 in order to compare domestic and international research trends in mathematics education using a topic modeling method. Results indicated that there were 16 similar research topics in domestic and international mathematics education journals: algebra/algebraic thinking, fraction, function/representation, statistics, geometry, problem-solving, model/modeling, proof, achievement effect/difference, affective factor, preservice teacher, teaching practice, textbook/curriculum, task analysis, assessment, and theory. There were also 7 research topics distinct to each of the domestic and international journal sets. Affective/cognitive domain and research trends, mathematics concept, class activity, number/operation, creativity/STEAM, proportional reasoning, and college/technology were identified from the domestic journals, whereas discourse/interaction, professional development, identity/equity, child thinking, semiotics/embodied cognition, intervention effect, and design/technology were identified from the international journals. The topic related to preservice teachers was the most frequently addressed topic in both domestic and international research. The topic related to in-service teachers' professional development was the second most popular topic in international research, whereas it was not identified in domestic research. Domestic research in mathematics education tended to pay attention to topics concerned with mathematical competency, but it focused more on problem-solving and creativity/STEAM than on other mathematical competencies. In contrast, international research highlighted the topic related to equity and social justice.

Analysis of Research Trends in The Journal of Engineering Geology (1991-2024): Latent Dirichlet Allocation and Network Analysis ("지질공학"(1991-2024)의 연구동향 분석: 잠재 디리클레 할당 및 네트워크 분석)

  • Taeyong Kim;Hyerim Lee;Minjune Yang
    • The Journal of Engineering Geology / v.34 no.3 / pp.429-445 / 2024
  • The Journal of Engineering Geology (JEG), a leading academic journal in the field of engineering geology in South Korea, was first published in 1991 and has since been publishing academic papers and research findings. While several literature reviews have been undertaken on specific research areas in recent decades, comprehensive reviews focusing on JEG have been relatively limited. To address this gap, this study applied the latent Dirichlet allocation (LDA) model to analyze the main research topics and trends, and undertook network analysis to identify relationships between topics over different periods. Results for the LDA indicate seven key research topics categorized into three trends: Classic, Emerging and Stable topics. Classic topics include 'Geophysics' and 'Structural geology', which were major subjects in the early years, with their focus shifting to other areas over time. Emerging topics such as 'Hydrogeology' and 'Geohazards' have gained prominence in recent years. Stable topics including 'Geotechnical structures', 'Geomechanics', and 'Environmental geology' have maintained consistent research interest. Network analysis revealed that Structural geology was the central topic prior to 2008, while Geotechnical structures became the focal point of research after 2008, with a shift in research focus. The results of this study contribute to our understanding of research trends and the development of JEG, providing insights for the setting of future research directions.

Trend of Research and Industry-Related Analysis in Data Quality Using Time Series Network Analysis (시계열 네트워크분석을 통한 데이터품질 연구경향 및 산업연관 분석)

  • Jang, Kyoung-Ae;Lee, Kwang-Suk;Kim, Woo-Je
    • KIPS Transactions on Software and Data Engineering / v.5 no.6 / pp.295-306 / 2016
  • The purpose of this paper is both to analyze research trends and to predict industrial flows using the metadata of previous studies on data quality. There have been many attempts to analyze research trends in various fields until recently. However, analyses of previous studies on data quality have produced poor results because of the vast scope and volume of the data. Therefore, in this paper, we used text mining and social network analysis for time series network analysis of data quality papers collected from the Web of Science index database and published in international data quality journals over 10 years. The analysis results are as follows. Research decreased in Mathematical & Computational Biology, Chemistry, Health Care Sciences & Services, Biochemistry & Molecular Biology, and Medical Information Science. In contrast, research increased in Environmental Sciences, Water Resources, Geology, and Instruments & Instrumentation. In addition, the social network analysis results show that the subjects with high centrality are analysis, algorithm, and network, and that image, model, sensor, and optimization are increasing subjects in the data quality field. Furthermore, the industrial connection analysis on data quality shows a high correlation among technique, industry, health, infrastructure, and customer service. The analysis also predicted that Environmental Sciences, Biotechnology, and the Health Industry will continue to develop. This paper will be useful not only for people in the data quality industry but also for researchers who analyze research patterns and investigate industry connections in data quality.

Counseling Outcomes Research Trend Analysis Using Topic Modeling - Focus on 「Korean Journal of Counseling」 (토픽 모델링을 활용한 상담 성과 연구동향 분석 - 「상담학연구」 학술지를 중심으로)

  • Park, Kwi Hwa;Lee, Eun Young;Yune, So Jung
    • Journal of Digital Convergence / v.19 no.11 / pp.517-523 / 2021
  • The outcome of counseling is important to both the counselor and the researcher. Analyzing the trends of research on counseling outcomes carried out so far will help to structure those outcomes comprehensively. The purpose of this research is to analyze research trends in Korea, focusing on studies of counseling outcomes published from 2011 to 2021 in 「Korean Journal of Counseling」, one of the well-known academic journals in the field of counseling in Korea. The aim is to explore the direction of future research by navigating the knowledge structure of this research. A total of 197 studies were used for analysis, and 339 final keywords were extracted during the node extraction process and used for analysis. Extracting latent topics with the LDA algorithm yielded three main topics: "Measurement and evaluation of counseling outcomes", "emotions and mediating factors affecting interpersonal relationships", and "career stress and coping strategies". Identifying major topics through trend analysis of counseling outcome research contributed to structuring counseling outcomes. In-depth research on these topics needs to continue.

Bibliometric Analysis on Studies of Korean Intangible Cultural Property Dance : Focusing on Events in the Seoul Area (한국무형문화재 춤 연구의 계량서지학적 분석 : 서울지역 종목을 중심으로)

  • Yoo, Ji-Young;Kim, Jee-Young;Baek, Hyun-Soon
    • Journal of Korea Entertainment Industry Association / v.13 no.4 / pp.139-147 / 2019
  • This study conducted a bibliometric analysis of studies on Korean intangible cultural heritage dance in the Seoul area and aimed to identify the tendencies of that research. For this, a list of studies on 24 Korean intangible cultural heritage dance events was collected and analyzed with the TEXTOM big data analysis solution. Text mining was used as the method of analysis. The results showed that, first, most of the studies were conducted on the Bongsan Talchum, and studies on teaching and learning methods were especially active. On the other hand, there were not many studies on Gut, and the need to revitalize research in that area was confirmed. Second, in studies on Cheoyongmu events, the term 'contemporary Cheoyongmu' was used frequently. This can be considered a meaningful use of terms with regard to intangible cultural heritage dance that has changed throughout history. Accordingly, research on other events is also called on to reveal the typicality of each dance. Third, there was a notable amount of research that compared and analyzed dance styles with regard to the Munmyoilmu. This was seen as the result of discussions in the Korean dance world regarding archetypal dance styles expanding into academic discussions. Therefore, it was revealed that academic discussions can lead to academic outcomes regardless of whether the matter is right or wrong.

Science and Technology Policy Studies, Society, and the State : An Analysis of a Co-evolution Among Social Issue, Governmental Policy, and Academic Research in Science and Technology (과학기술정책 연구와 사회, 정부 : 과학기술의 사회이슈, 정부정책, 학술연구의 공진화 분석)

  • Kwon, Ki-Seok;Jeong, Seohwa;Yi, Chan-Goo
    • Journal of Korea Technology Innovation Society / v.21 no.1 / pp.64-91 / 2018
  • This study explores the pattern of interaction among social issues, academic research, and governmental policy on science and technology during the last 20 years. In particular, we try to understand whether science and technology policy research and governmental policy meet social needs appropriately. To do this, we collected text data from news articles, papers, and governmental documents. Based on these data, social network analysis and cluster analysis were carried out. According to the results, science and technology policy research tended to focus on fragmented technological innovation meeting urgent practical needs at the initial stage. Recently, however, science and technology policy research has shown co-evolutionary patterns responding to society. Furthermore, a time lag has been observed in the process of interaction among the three bodies. Based on these results, we put forward some suggestions for future research in science and technology policy. First, the level of analysis needs to shift from the micro level to the meso or macro level. Second, more research effort needs to be focused on the policy process in science and technology and its public management. Finally, we have to enhance sensitivity to social issues through studies on agenda setting in science and technology policy.