• Title/Summary/Keyword: 비정형 텍스트 자료 (unstructured text data)

Comparative Analysis of Low Fertility Response Policies (Focusing on Unstructured Data on Parental Leave and Child Allowance) (저출산 대응 정책 비교분석 (육아휴직과 아동수당의 비정형 데이터 중심으로))

  • Eun-Young Keum;Do-Hee Kim
    • The Journal of the Convergence on Culture Technology / v.9 no.5 / pp.769-778 / 2023
  • This study compared parental leave and child allowance, two major policies aimed at the current serious low fertility problem, using unstructured data, and on that basis sought future directions and implications for related response policies. The collection keywords were "low fertility + parental leave" and "low fertility + child allowance", and the data were analyzed in the following order: text frequency analysis, centrality analysis, network visualization, and CONCOR analysis. The analysis showed, first, that parental leave is a realistic and practical policy response to low fertility, since the data revealed more diverse and systematic discussion of it than of child allowance. Second, for child allowance, the data showed a high level of information on and interest in cash benefit systems, including child allowance, but no other distinctive features or active discussion. As a future improvement plan, both policies need to build on the existing system. First, expanding parental leave requires improving the working environment and addressing blind spots; second, child allowance requires a change in the form of payment that departs from the uniform and biased system, and an expansion of the target age was proposed.
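The first two steps of the pipeline described above, text frequency analysis and centrality analysis, can be sketched roughly as follows. The tokenized documents in `DOCS` are hypothetical stand-ins rather than the study's data, and degree centrality on a word co-occurrence network is assumed here as one common choice; the abstract does not say which centrality measure was used.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tokenized documents; these are NOT the study's data.
DOCS = [
    ["parental", "leave", "workplace"],
    ["parental", "leave", "allowance"],
    ["child", "allowance", "cash"],
]


def term_frequencies(docs):
    """Count how often each token appears across all documents."""
    return Counter(token for doc in docs for token in doc)


def degree_centrality(docs):
    """Degree centrality on a co-occurrence network.

    Two words are linked if they appear in the same document.
    Centrality = number of distinct neighbours / (number of nodes - 1).
    """
    neighbors = {}
    for doc in docs:
        unique = sorted(set(doc))
        for token in unique:
            neighbors.setdefault(token, set())
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {token: 0.0 for token in neighbors}
    return {token: len(adj) / (n - 1) for token, adj in neighbors.items()}


FREQ = term_frequencies(DOCS)
CENTRALITY = degree_centrality(DOCS)

print(FREQ.most_common(3))
print(max(CENTRALITY, key=CENTRALITY.get))
```

Network visualization and CONCOR analysis are left out of this sketch; both would start from the same co-occurrence structure.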

Comparison of responses to issues in SNS and Traditional Media using Text Mining -Focusing on the Termination of Korea-Japan General Security of Military Information Agreement(GSOMIA)- (텍스트 마이닝을 이용한 SNS와 언론의 이슈에 대한 반응 비교 -"한일군사정보보호협정(GSOMIA) 종료"를 중심으로-)

  • Lee, Su Ryeon;Choi, Eun Jung
    • Journal of Digital Convergence / v.18 no.2 / pp.277-284 / 2020
  • Text mining is a representative big data analysis method that extracts meaningful information from large amounts of unstructured text data. Social media such as Twitter generate hundreds of thousands of data items per second and act as one-person media through which the public expresses opinions and ideas instantly and directly. Traditional media deliver information, criticize society, and form public opinion. Against this background, we compare SNS responses with media responses to the termination of the Korea-Japan GSOMIA (General Security of Military Information Agreement), one of the domestic issues in the second half of 2019. Data from 201,728 tweets and 20,698 newspaper articles were analyzed using sentiment analysis, association keyword analysis, and cluster analysis. The results show that SNS tends to respond positively to this issue, while the media tends to react negatively. In the association keyword analysis, SNS shows positive views on domestic issues with keywords such as "destruction, decision, we," while the media shows negative views on external issues with keywords such as "disappointment, regret, concern". SNS is faster and more powerful than the media in studying or creating social trends and opinions, as opposed to delivering information. This can complement the role of the media in reflecting public perception.

Domain-Specific Terminology Mapping Methodology Using Supervised Autoencoders (지도학습 오토인코더를 이용한 전문어의 범용어 공간 매핑 방법론)

  • Byung Ho Yoon;Junwoo Kim;Namgyu Kim
    • Information Systems Review / v.25 no.1 / pp.93-110 / 2023
  • Recently, attempts have been made to convert unstructured text into vectors and to analyze vast amounts of natural language for various purposes. In particular, the demand for analyzing texts in specialized domains is rapidly increasing. Therefore, studies are being conducted to analyze specialized and general-purpose documents simultaneously. To analyze specialized terms together with general terms, it is necessary to align the embedding space of the specialized terms with the embedding space of the general terms. So far, attempts have been made to map the embeddings of specialized terms into the embedding space of general terms through a transformation matrix or mapping function. However, linear transformation based on a transformation matrix has a limitation in that it works well only in a local range. To overcome this limitation, various nonlinear vector alignment methods have recently been proposed. We propose a vector alignment model that matches the embedding space of specialized terms to the embedding space of general terms through end-to-end learning that trains the autoencoder and the regression model simultaneously. In experiments with R&D documents in the "Healthcare" field, we confirmed that the proposed methodology showed superior accuracy compared to the traditional model.
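For orientation, the transformation-matrix baseline that the abstract contrasts itself with can be sketched as a small least-squares fit. This is only an illustration of linear alignment, not the authors' supervised autoencoder; the 2-dimensional vectors in `SRC` and `DST`, the function name `fit_linear_map`, and the learning rate and step count are all assumptions made for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def fit_linear_map(src, dst, lr=0.1, steps=500):
    """Fit W minimising the mean squared error of src @ W against dst
    by plain gradient descent. Hyperparameters are illustrative only."""
    n, d_in, d_out = len(src), len(src[0]), len(dst[0])
    w = [[0.0] * d_out for _ in range(d_in)]
    src_t = transpose(src)
    for _ in range(steps):
        pred = matmul(src, w)
        err = [[pred[i][j] - dst[i][j] for j in range(d_out)] for i in range(n)]
        grad = matmul(src_t, err)  # proportional to d(loss)/dW
        for i in range(d_in):
            for j in range(d_out):
                w[i][j] -= lr * 2.0 / n * grad[i][j]
    return w


# Hypothetical embeddings: SRC plays the role of specialized-term vectors,
# DST the corresponding vectors in the general-term space.
# DST was generated from SRC by a 90-degree rotation, so an exact linear map exists.
SRC = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
DST = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0], [1.0, 2.0]]

W = fit_linear_map(SRC, DST)
ALIGNED = matmul(SRC, W)
print(W)
```

A single matrix like `W` applies the same transformation everywhere in the space, which is the locality limitation the abstract points to as motivation for nonlinear alignment.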

Analysis of Domestic Research on Depression and Stress : Focused on the Treatment and Subjects (우울과 스트레스에 관한 국내 연구 분석 : 치료와 대상자를 중심으로)

  • Jo, Nam-Hee;Na, Eun-Young
    • Journal of Convergence for Information Technology / v.7 no.6 / pp.53-59 / 2017
  • This study aimed to survey domestic research related to depression and stress. The subjects of analysis were 1,875 degree theses held in the National Assembly Library, retrieved with the keywords depression and stress as of November 30, 2016. The unstructured data were visualized with a word cloud, one of the text mining techniques. We also used LDA, through the R LDA package, to classify treatments and subjects. The analysis found that 233 papers (12.4% of the total) contained therapy-related keywords. The treatment methods applied were, in order, art therapy, music therapy, horticultural therapy, cognitive behavior therapy, clinical art therapy, cognitive therapy, psychological therapy, depression treatment, group therapy, and laughter therapy. The study subjects were, in order, adolescents, the elderly, patients, mothers, children, women, parents, and college students. The LDA topic analysis for adolescents yielded four topics: self-support, treatment program, relationship effect, and variable study.

A Study on Questionnaire Improvement using Text Mining (텍스트 마이닝 기법을 활용한 설문 문항 개선에 관한 연구)

  • Paek, Yun-Ji;Jung, Chang-Hyun
    • Journal of the Korean Society of Marine Environment & Safety / v.26 no.2 / pp.121-128 / 2020
  • The Marine Safety Culture Index (MSCI) was developed in 2018 to assess the public's safety culture level objectively and to provide data for spreading knowledge of marine safety culture. The method for calculating the safety culture index should cover issues that may affect the safety culture and should consist of attributes appropriate for estimating the current status. In addition, continuous verification and supplementation are required to address social and economic changes. In this study, to determine whether the questionnaire designed by marine experts reflects the people's interests and needs, we analyzed 915 marine safety proposals. Text mining was employed to analyze the unstructured data of the marine safety proposals, and network analysis and topic modeling were subsequently performed. The marine safety proposals centered on attributes such as education, public relations, safety rules, awareness, skilled workers, and systems. Eighteen questions were modified and supplemented to reflect the marine safety proposals, and the reliability of the revised questions was analyzed. Compared to the previous year, the questionnaire's internal consistency improved, reaching a high value of 0.895. The improved questionnaire, which reflects the requirements of both marine experts and the people, together with the derived marine safety culture index, is expected to contribute to the establishment of policies for spreading knowledge of marine safety culture.

A Study on the Research Trends on Domestic Platform Government using Topic Modeling (토픽 모델링을 활용한 한국의 플랫폼정부 연구동향 분석)

  • Suh, Byung-Jo;Shin, Sun-Young
    • Informatization Policy / v.24 no.3 / pp.3-26 / 2017
  • The amount of unstructured data generated online is increasing exponentially, and text data is being analyzed in various fields. To identify research trends on the platform government, the title, year, academic society, and abstract of academic papers on the subject were collected from DBPIA (www.dbpia.co.kr), a database of domestic papers. The results of existing research on the platform government and related fields were analyzed by stage of national informatization promotion. Technology, service, and governance topics were extracted from papers on the platform government, and the trends of core topics were analyzed by year. As the era of the intelligent information society begins, this study is significant in providing a basis for defining a new role of government: the platform government, which sets the stage for the private sector to lead innovation and instead plays the role of an 'enabler' and 'facilitator'. The purpose of this study is to understand platform government research through objective analysis of its trends. By looking at future directions, this study will contribute to future research as reference material.

Improving Performance of Recommendation Systems Using Topic Modeling (사용자 관심 이슈 분석을 통한 추천시스템 성능 향상 방안)

  • Choi, Seongi;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.101-116 / 2015
  • Recently, due to the development of smart devices and social media, vast amounts of information in various forms have accumulated. In particular, considerable research effort is being directed toward analyzing unstructured big data to resolve various social problems. Accordingly, the focus of data-driven decision-making is moving from structured data analysis to unstructured data analysis. In the field of recommendation systems, a typical area of data-driven decision-making, the need to use unstructured data to improve system performance has also steadily increased. Approaches to improving the performance of recommendation systems fall into two categories: improving algorithms and acquiring useful, high-quality data. Traditionally, most efforts to improve recommendation system performance have taken the former approach, while the latter has attracted relatively little attention. In this sense, efforts to utilize unstructured data from various sources are timely and necessary. In particular, as users' interests are directly connected with their needs, identifying those interests through unstructured big data analysis can be a clue for improving the performance of recommendation systems. This study therefore proposes a methodology for improving recommendation systems by measuring users' interests. Specifically, it proposes a method to quantify users' interests by analyzing their internet usage patterns and to predict their repurchases based on the discovered preferences. There are two important modules in this study. The first module predicts the repurchase probability of each category by analyzing users' purchase history. We include the first module in our research scope to compare the accuracy of the traditional purchase-based prediction model with that of the new model presented in the second module. This procedure extracts the purchase history of users.
The core part of our methodology is in the second module. This module extracts users' interests by analyzing the news articles the users have read. The second module constructs a correspondence matrix between topics and news articles by performing topic modeling on real-world news articles. The module then analyzes users' news access patterns and constructs a correspondence matrix between articles and users. After that, by merging the results of the previous steps, we obtain a correspondence matrix between users and topics. This matrix describes users' interests in a structured manner. Finally, using this matrix, the second module builds a model for predicting the repurchase probability of each category. In this paper, we also provide experimental results of our performance evaluation. The data used in our experiments are outlined as follows. We acquired web transaction data for 5,000 panelists from a company that specializes in analyzing the rankings of internet sites. We first extracted from the original data 15,000 URLs of news articles published from July 2012 to June 2013 and crawled the main contents of those articles. We then selected the 2,615 users who had read at least one of the extracted news articles. Among these 2,615 users, 359 were target users who had purchased at least one item from our target shopping mall 'G'. In the experiments, we analyzed the purchase history and news access records of these 359 internet users. The performance evaluation showed that our prediction model, which uses both users' interests and purchase history, outperforms a prediction model using only users' purchase history in terms of misclassification ratio. In detail, our model outperformed the traditional one in the appliance, beauty, computer, culture, digital, fashion, and sports categories when artificial neural network based models were used. Similarly, our model outperformed the traditional one in the beauty, computer, digital, fashion, food, and furniture categories when decision tree based models were used, although the improvement was very small.
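The merging step in the second module, which combines an article-topic matrix with a user-article matrix to obtain a user-topic matrix, can be read as a matrix product. The sketch below assumes that reading; the abstract does not spell out the merge operation, and the small matrices and the function name `merge_matrices` are hypothetical.

```python
# Hypothetical topic weights per article (rows: articles, columns: topics),
# e.g. as produced by topic modeling. Not the paper's data.
ARTICLE_TOPIC = [
    [0.9, 0.1],  # article 0 is mostly about topic 0
    [0.2, 0.8],  # article 1 is mostly about topic 1
    [0.5, 0.5],  # article 2 is split evenly
]

# Hypothetical access records (rows: users, columns: articles); 1 = user read the article.
USER_ARTICLE = [
    [1, 0, 1],  # user 0 read articles 0 and 2
    [0, 1, 0],  # user 1 read article 1
]


def merge_matrices(user_article, article_topic):
    """Return the user-topic matrix as the product user_article x article_topic.

    Each entry sums the topic weights of the articles a user has read.
    """
    n_topics = len(article_topic[0])
    return [
        [
            sum(row[a] * article_topic[a][t] for a in range(len(article_topic)))
            for t in range(n_topics)
        ]
        for row in user_article
    ]


USER_TOPIC = merge_matrices(USER_ARTICLE, ARTICLE_TOPIC)
print(USER_TOPIC)
```

Each row of `USER_TOPIC` is a structured description of one user's interests and could then serve as input features for a repurchase prediction model.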

Text Mining-Based Emerging Trend Analysis for e-Learning Contents Targeting for CEO (텍스트마이닝을 통한 최고경영자 대상 이러닝 콘텐츠 트렌드 분석)

  • Kyung-Hoon Kim;Myungsin Chae;Byungtae Lee
    • Information Systems Review / v.19 no.2 / pp.1-19 / 2017
  • Original scripts of e-learning lectures for the CEOs of corporation S were analyzed using topic analysis, a text mining method. Twenty-two topics were extracted based on keywords chosen from five years of records, from 2011 to 2015. Research analysis was then conducted on various issues. Promising topics were selected through evaluation and element analysis of the members of each topic. In management and economics, members showed high satisfaction and interest in topics on marketing strategy, human resource management, and communication. In the humanities, philosophy, history of war, and history drew high interest and satisfaction, whereas in the field of lifestyle, mind health drew high interest and satisfaction. Studies were also conducted to identify topics on the proportion of content, but these studies failed to increase member satisfaction. In the field of IT, educational content responds sensitively to the changing times, but it may not increase the interest and satisfaction of members. The present study found that content production for CEOs should draw out deep implications for value innovation through technology application instead of stopping at the delivery of technical information. Previous studies classified content superficially, based on the name of the content program, when analyzing the status of content operation. Text mining, however, can derive deep content and subject classifications from the unstructured script data itself. This approach can reveal current shortages and necessary fields if the service contents of the themes are displayed by year. This study was based on data obtained from influential e-learning companies in Korea. Obtaining practical results was difficult because data were not acquired from portal sites or social networking services. The content of e-learning trends of CEOs was analyzed. Data analysis was also conducted on the intellectual interests of CEOs in each field.

A Tombstone Filtered LSM-Tree for Stable Performance of KVS (키밸류 저장소 성능 제어를 위한 삭제 키 분리 LSM-Tree)

  • Lee, Eunji
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.4 / pp.17-22 / 2022
  • With the spread of web services, data types are becoming more diversified. Beyond the variety of stored data such as images, videos, and texts, the number and form of the properties and metadata that describe the data differ from item to item. To process such unstructured data efficiently, key-value stores are widely used in state-of-the-art applications. The LSM-Tree (Log Structured Merge Tree) is the core data structure of various commercial key-value stores. The LSM-Tree is optimized to provide high performance for small writes by recording all write and delete operations in a log-structured manner. However, the latency and throughput of user requests degrade as batches of deletion operations for expired data are inserted into the LSM-Tree as special key-value data. This paper presents a Filtered LSM-Tree (FLSM-Tree) that solves this problem by separating the deleted keys from the main tree structure while maintaining all the advantages of the existing LSM-Tree. The proposed method is implemented in LevelDB, a commercial key-value store, and the performance evaluation shows that read performance improves by up to 47%.
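The behaviour the abstract identifies as the problem, deletes being written into the tree as special key-value entries, can be illustrated with a toy in-memory model. This is a simplified sketch of tombstones in a generic LSM-style store, not the FLSM-Tree and not LevelDB's implementation; the class `ToyLSM` and all of its methods are hypothetical.

```python
TOMBSTONE = object()  # special marker value standing in for a deleted key


class ToyLSM:
    """Toy LSM-style store: one mutable memtable plus immutable flushed tables."""

    def __init__(self):
        self.memtable = {}
        self.levels = []  # flushed tables, newest first

    def put(self, key, value):
        self.memtable[key] = value

    def delete(self, key):
        # A delete is just another write: it records a tombstone entry.
        self.memtable[key] = TOMBSTONE

    def flush(self):
        """Freeze the memtable and push it in front of the older tables."""
        if self.memtable:
            self.levels.insert(0, self.memtable)
            self.memtable = {}

    def get(self, key):
        # Search newest to oldest; the first hit wins, tombstones included.
        for table in [self.memtable] + self.levels:
            if key in table:
                value = table[key]
                return None if value is TOMBSTONE else value
        return None

    def tombstone_count(self):
        """Number of tombstone entries currently stored."""
        return sum(
            1
            for table in [self.memtable] + self.levels
            for value in table.values()
            if value is TOMBSTONE
        )


db = ToyLSM()
db.put("session:1", "alive")
db.flush()
db.delete("session:1")  # expired data removed via a tombstone
print(db.get("session:1"), db.tombstone_count())
```

In this toy model the tombstone shadows the older value, which still sits in a flushed table, and every lookup has to step over such entries. The abstract's proposal is to keep deleted keys outside the main tree structure instead.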

A Development Plan for Co-creation-based Smart City through the Trend Analysis of Internet of Things (사물인터넷 동향분석을 통한 Co-creation기반 스마트시티 구축 방안)

  • Park, Ju Seop;Hong, Soon-Goo;Kim, Na Rang
    • Journal of Korea Society of Industrial Information Systems / v.21 no.4 / pp.67-78 / 2016
  • Recently, many countries around the world have been actively promoting smart city projects to address urban problems such as traffic congestion, housing shortages, and energy scarcity. The development of the Internet of Things (IoT) has made it possible to build sustainable, convenient, and environmentally friendly smart cities through the effective control and reuse of urban resources. The purpose of this study is to analyze the technical trends of the IoT and to present a development plan for the smart city, one of the applications of the IoT. To this end, news articles of the Electronic Times published between 2013 and 2015 were analyzed using text mining, and smart city development cases in other countries were investigated. The analysis revealed close relationships between the smart city and big data, cloud, platforms, and sensors. For the successful development of a smart city, first, all interested parties in the city must work together to create new value throughout the entire value chain. Second, they must utilize big data and disclose public data more actively than they do now. This study makes an academic contribution in that it presents a big data analysis method and stimulates follow-up studies. As a practical contribution, its results provide useful data for policy making on smart city development by local governments and administrative agencies. The study may be limited in capturing overall trends because only news articles of the Electronic Times were selected to analyze the technical trends of the IoT.