• Title/Summary/Keyword: Keywords

Ensemble Learning-Based Prediction of Good Sellers in Overseas Sales of Domestic Books and Keyword Analysis of Reviews of the Good Sellers (앙상블 학습 기반 국내 도서의 해외 판매 굿셀러 예측 및 굿셀러 리뷰 키워드 분석)

  • Do Young Kim;Na Yeon Kim;Hyon Hee Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.4 / pp.173-178 / 2023
  • As Korean literature spreads around the world, its position in the overseas publishing market has become important. As demand in the overseas publishing market continues to grow, it is essential to predict future book sales and analyze the characteristics of books that have been highly favored by overseas readers in the past. In this study, we propose an ensemble learning-based prediction model and analyze the characteristics of books published overseas over the past 5 years that are classified as good sellers, i.e., those with cumulative sales of more than 5,000 copies. We applied five ensemble learning models, i.e., XGBoost, Gradient Boosting, AdaBoost, LightGBM, and Random Forest, and compared them with other machine learning algorithms, i.e., Support Vector Machine, Logistic Regression, and Deep Learning. Our experimental results showed that the ensemble algorithms outperform the other approaches in handling imbalanced data. In particular, the LightGBM model obtained an AUC value of 99.86%, the best prediction performance. Among the features used for prediction, the most important is the author's number of overseas publications, and the second most important is publication in the countries with the largest publishing markets. The number of evaluation participants is also an important feature. In addition, text mining was performed on reviews of the four best-selling books among the good sellers. Many reviews showed interest in stories, characters, and writers, and because the keyword "translation" appears frequently in low-rated reviews, support for translation appears to be needed.
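
The abstract above reports AUC as its performance metric on imbalanced data. As a minimal sketch of what that metric measures (not the authors' code), AUC can be computed as the fraction of positive-negative pairs in which the positive example receives the higher score. The function name `roc_auc` and the toy labels and scores below are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 "good sellers" (label 1) among 8 books.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.6, 0.9, 0.55]
auc = roc_auc(labels, scores)  # 11 of the 12 positive-negative pairs are ordered correctly
```

Because AUC depends only on how positives rank against negatives, it is less sensitive to class imbalance than plain accuracy, which is one reason it is a common choice for data like this.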

Analysis of Municipal Ordinances for Smart Cities of Municipal Governments: Using Topic Modeling (지방자치단체의 스마트시티 조례 분석: 토픽모델링을 활용하여)

  • Hyungjun Seo
    • Informatization Policy / v.30 no.1 / pp.41-66 / 2023
  • This study aims to reveal the direction of municipal ordinances for smart cities by applying topic modeling to 74 municipal ordinances from 72 municipal governments. The most frequent keywords relate to the establishment and operation of the Smart City Committee. Topic modeling with Latent Dirichlet Allocation (LDA) classifies municipal ordinances for smart cities into eight topics: Topic 1 (security for process of smart cities), Topic 2 (promotion of smart city industry), Topic 3 (composition of a smart city consultative body for local residents), Topic 4 (support system for smart cities), Topic 5 (management for personal information), Topic 6 (use of smart city data), Topic 7 (implementation for intelligent public administration), and Topic 8 (smart city promotion). As for topic categorization by region, Topics 5, 6, and 8, which mostly relate to the practical operation of smart cities, account for a significant portion of the smart city ordinances in the Seoul metropolitan area, whereas Topics 2, 3, and 4, which mostly relate to the initial implementation of smart cities, account for a significant portion of those in provincial areas.

Abbreviation Disambiguation using Topic Modeling (토픽모델링을 이용한 약어 중의성 해소)

  • Woon-Kyo Lee;Ja-Hee Kim;Junki Yang
    • Journal of the Korea Society for Simulation / v.32 no.1 / pp.35-44 / 2023
  • Recently, many studies have analyzed trends, including research trends, through text analysis. When documents are collected for data analysis by searching for abbreviated keywords, the abbreviations must be disambiguated. In many studies, documents are classified manually, by reading them one by one to find the data needed for the study. Most studies on abbreviation disambiguation focus on clarifying word meaning and use supervised learning. Existing methods are not well suited to classifying documents retrieved by an abbreviation search in order to find research data, and related studies are scarce. This paper proposes a method of semi-automatically classifying documents collected by abbreviation, by performing topic modeling with Non-Negative Matrix Factorization, an unsupervised learning method, in the data pre-processing step. To verify the proposed method, papers were collected from an academic DB with the abbreviation 'MSA'. The proposed method found 316 papers related to Micro Services Architecture among 1,401 papers. The document classification accuracy of the proposed method was measured at 92.36%. The proposed method is expected to reduce the time and cost that researchers spend on manual work.
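
The abstract above relies on Non-Negative Matrix Factorization (NMF) for topic modeling. The following is a minimal sketch of NMF with the standard multiplicative update rules, written in plain Python; it is not the authors' implementation. The document-term matrix `v`, the number of topics `k=2`, and all function names are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into W (docs x topics) and H (topics x terms), all non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return w, h


# Hypothetical document-term counts: docs 0-1 use terms 0-1, docs 2-3 use terms 2-3.
v = [[2, 1, 0, 0], [4, 2, 0, 0], [0, 0, 1, 3], [0, 0, 2, 6]]
w0, h0 = nmf(v, k=2, iters=0)   # random initialization only
w, h = nmf(v, k=2)              # after the multiplicative updates
err_before, err_after = frob_error(v, w0, h0), frob_error(v, w, h)
# Each document can then be assigned to the topic with its largest weight in W.
doc_topics = [row.index(max(row)) for row in w]
```

In a setting like the one described, documents assigned to the topic whose top terms match the intended meaning of the abbreviation would be kept, and the rest set aside for review.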

Comparing the 2015 with the 2022 Revised Primary Science Curriculum Based on Network Analysis (2015 및 2022 개정 초등학교 과학과 교육과정에 대한 비교 - 네트워크 분석을 중심으로 -)

  • Jho, Hunkoog
    • Journal of Korean Elementary Science Education / v.42 no.1 / pp.178-193 / 2023
  • The aim of this study was to investigate differences in the achievement standards from the 2015 to the 2022 revised national science curriculum and to present the implications for science teaching under the revised curriculum. Achievement standards relevant to primary science education were therefore extracted from the national curriculum documents; conceptual domains in the two curricula were analyzed for differences; various kinds of centrality were computed; and the Louvain algorithm was used to identify clusters. These methods revealed that, in the revised compared with the preceding curriculum, the total number of nodes and links had increased, while the number of achievement standards had decreased by 10 percent. In the revised curriculum, keywords relevant to procedural skills and behavior received more emphasis and were connected to collaborative learning and digital literacy. Observation, survey, and explanation remained important, but varied in application across the fields of science. Clustering revealed that the number of categories in each field of science remained mostly unchanged in the revised compared with the previous curriculum, but that each category highlighted different skills or behaviors. Based on those findings, some implications for science instruction in the classroom are discussed.

Review for Assessment Methodology of Disaster Prevention Performance using Scientometric Analysis (계량정보 분석을 활용한 방재성능평가 방법에 대한 고찰)

  • Dong Hyun Kim;Hyung Ju Yoo;Seung Oh Lee
    • Journal of Korean Society of Disaster and Security / v.15 no.4 / pp.39-46 / 2022
  • Rainfall characteristics such as heavy rains are changing from those of the past, and uncertainties are also greatly increasing due to climate change. In addition, urban development and population concentration are aggravating flood damage. Since the causes of urban inundation are generally complex, it is very important to establish an appropriate flood prevention plan. Thus, the government in Korea is establishing standards for disaster prevention performance for each local government. Since the concept of the disaster prevention performance target was first presented in 2010, the setting standards have changed several times, but the overall technology, methodology, and procedures have been maintained. Therefore, in this study, studies and technologies related to urban disaster prevention performance were reviewed using scientometric analysis. This analysis is a method of identifying trends in a field and deriving new knowledge and information based on data such as papers and literature. In this study, papers related to disaster prevention performance were collected from the Web of Science for the last 30 years, from 1990 to 2021. CiteSpace, a scientometric software package, was used to identify authors, research institutes, countries, and research trends, including citation analysis. As a result of the analysis, factors to consider when making decisions related to urban disaster prevention performance, such as the concept of asset evaluation, were identified. In the future, it is expected that prevention performance standards and procedures can be upgraded if the keywords are specified and each technology is reviewed.

A Systematic Review of the Effects of Visual Perception Interventions for Children With Cerebral Palsy (뇌성마비 아동에게 시지각 중재가 미치는 효과에 대한 체계적 고찰)

  • Ha, Yae-Na;Chae, Song-Eun;Jeong, Mi-Yeon;Yoo, Eun-Young
    • Therapeutic Science for Rehabilitation / v.12 no.2 / pp.55-68 / 2023
  • Objective: This study aims to analyze the effects of visual perception intervention by systematically reviewing the studies that applied visual perception intervention to children with cerebral palsy. Methods: The databases used were PubMed, EMbase, Science Direct, ProQuest, Koreanstudies Information Service System (KISS), Research Information Sharing Service (RISS), and the National Assembly Library. The keywords used were cerebral palsy, CP, and visual perception. According to the PRISMA flowchart, 10 studies were selected from among studies published from January 1, 2012 to March 30, 2022. The quality level of the selected studies, the demographic characteristics of study participants, the effectiveness of interventions, the areas and strategies of intervention, the assessment tools used to measure the effectiveness of interventions, and the risk of bias were analyzed. Results: All selected studies confirmed that visual perception intervention was effective in improving visual perception function. In addition, positive results were shown in upper extremity function, activities of daily living, posture control, goal achievement, and psychosocial areas as well as visual perception function. All studies included intervention in the eye-hand coordination area. Conclusion: In visual perception intervention, it is necessary to evaluate visual perception function by area and to apply systematically graded interventions customized for each individual.

The Exploration of Intersectoral Convergence of Spatial Information Industry and Forecast of its Market Size (공간정보산업 융·복합부문 탐색 및 시장규모 전망 연구)

  • Kwon, Young-Hyun
    • Journal of Cadastre & Land InformatiX / v.52 no.2 / pp.121-135 / 2022
  • The purpose of this study is to explore the convergence sector of the spatial information industry based on the business transaction data of spatial information companies and to predict the market size of the industry using the Seemingly Unrelated Regression (SUR) model. The convergence sector of the spatial information industry, which cannot be identified in the Spatial Data Industry Survey, was analyzed by exploring keywords related to spatial information in the business DB of Korea Enterprise Data (2010-2019). Convergence among spatial information businesses appeared mainly in value-chain business relationships between Seoul and Gyeonggi Province. Convergence business sales were largest in value chain stages 2 (utilization, service) & 3 (convergence), and convergence in value chain stages 1 (production, construction) & 2 and 2 & 3 doubled in 2019 compared to 2010. For 2019, the total sales of the spatial information industry were announced, based on Statistics Korea data, at about 8 trillion won, but this study estimated total sales at 28 trillion won when convergence activities are considered. Finally, when scenario 1 (population growth of 0.38% for 2020-2024 and 0.07% for 2026-2030) was applied using the SUR model to predict the expected market size of the industry, sales decreased by -0.37% to 0.069% in 2025 and 2030, respectively. When scenario 2 (average wage growth of 1.2%) was applied over the same period, sales in the industry increased by 2.326% to 12.185%. In other words, sales in the spatial information industry depend on Labor, Total Factor Productivity, and Capital Productivity, so additional research is needed on policy development and alternatives for enhancing each type of productivity.

Analysis of Resident's Satisfaction and Its Determining Factors on Residential Environment: Using Zigbang's Apartment Review Bigdata and Deeplearning-based BERT Model (주거환경에 대한 거주민의 만족도와 영향요인 분석 - 직방 아파트 리뷰 빅데이터와 딥러닝 기반 BERT 모형을 활용하여 - )

  • Kweon, Junhyeon;Lee, Sugie
    • Journal of the Korean Regional Science Association / v.39 no.2 / pp.47-61 / 2023
  • Satisfaction with the residential environment is a major factor influencing the choice of residence and migration, and is directly related to the quality of life in the city. As online real estate services grow, people's evaluations of the residential environment can be easily checked, making it possible to analyze their satisfaction and its determining factors based on those evaluations. This means that a larger volume of evaluations can be used more efficiently than with previous methods such as surveys. This study analyzed the residential environment reviews of about 30,000 apartment residents collected from 'Zigbang', an online real estate service in Seoul. A Zigbang apartment review consists of an evaluation grade on a 5-point scale and evaluation content written directly by the resident. First, this study labeled apartment reviews as positive or negative based on the scores of recommended reviews that include a comprehensive evaluation of the apartment. Next, to classify them automatically, a model was developed using Bidirectional Encoder Representations from Transformers (BERT), a deep learning-based natural language processing model. Then, SHapley Additive exPlanations (SHAP) was used to extract the word tokens that play an important role in the classification of reviews, in order to derive the determining factors of residential environment evaluations. Furthermore, by analyzing related keywords using Word2Vec, priority considerations for improving satisfaction with the residential environment were suggested. This study is meaningful in that it proposes a model that automatically classifies satisfaction with the residential environment as positive or negative using deep learning and apartment review big data, which are qualitative evaluation data from residents, and thereby derives its determining factors. The results of the analysis can be used as basic data for improving satisfaction with the residential environment, and can be used in future evaluations of the residential environment near apartment complexes and in the design and evaluation of new complexes and infrastructure.

A Study on China's SNS Opinion Leader through Social Data (소셜 데이터를 통한 중국의 여론 주도층에 관한 연구)

  • Zheng, Xuan;Lee, Jooyoup
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.6 no.9 / pp.59-70 / 2016
  • With its rapid development, Sina Weibo, the Chinese version of Twitter, has become an important means of communication for Chinese SNS users to obtain and share information. As a result, a power shift from traditional opinion leaders to SNS opinion leaders has emerged in China. The relationship between the demographic variables of Chinese SNS users and the relationships among keywords in their information was analyzed using centrality analysis in the social network program NetMiner. China's SNS opinion leaders are generally more interested in daily activities with their families or friends than in social issues. SNS opinion leaders with high betweenness centrality were found to play a key mediating role, organically connecting general users to adjacent information. These properties are not independent of demographic variables such as profession; accordingly, the demographic characteristics of SNS opinion leaders showed a significant effect on betweenness centrality. This study analyzed the characteristics of SNS users in China, especially opinion leaders, by examining Chinese social phenomena from an information perspective. Through this study, we expect to provide basic information about the social characteristics of China through collective communication.
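
The abstract above uses betweenness centrality, which counts how often a node lies on shortest paths between other nodes. The study itself used NetMiner; the sketch below is only an illustrative plain-Python version of Brandes' algorithm for an undirected, unweighted graph. The four-user network and the name `betweenness` are hypothetical.

```python
from collections import deque


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm).

    `graph` maps each node to a list of neighbours and must be symmetric (undirected).
    """
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma) to every node.
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating pair dependencies.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every undirected pair was counted from both endpoints.
    return {v: c / 2 for v, c in bc.items()}


# Hypothetical network: user B connects three otherwise unconnected users.
graph = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}
scores = betweenness(graph)
```

Here B sits on the only path between every pair of the other users, which is the kind of mediating position the abstract attributes to opinion leaders with high betweenness centrality.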

The Effect of Taeksa-tang for Dyslipidemia: A Systematic Review and Meta-Analysis (이상지질혈증에 대한 택사탕(澤瀉湯)의 효과 : 체계적 문헌 고찰 및 메타 분석)

  • Yeong-seo Lee;Tae-young Huh;Kyoung-min Kim
    • The Journal of Internal Korean Medicine / v.44 no.3 / pp.485-505 / 2023
  • Objective: The purpose of this study is to assess the effectiveness and safety of using Taeksa-tang for dyslipidemia through a systematic review and meta-analysis of randomized controlled trials (RCTs). Methods: The search was conducted using keywords such as "dyslipidemia", "hyperlipidemia", "taeksa tang", "zexie tang", and "takusha to" in 12 databases (Pubmed, Cochrane, Embase, ScienceDirect, CNKI, Wanfang, CiNii, RISS, KISS, ScienceON, OASIS, and DBpia) on April 13, 2023. There were no limits on the publication period and language. Cochrane's risk of bias (RoB) was used to evaluate the quality of the studies. A meta-analysis was conducted according to the outcome measurements such as total effective rate (TER), total cholesterol (TC), triglyceride (TG), HDL-cholesterol (HDL-C), LDL-cholesterol (LDL-C), and adverse effects, using the Review Manager web. Results: A total of 9 RCTs were selected. In evaluating the RoB, 2 studies mentioning the random sequence generation, 1 study conducting double blindness, and 8 studies without missing values were evaluated as low risk, while 1 study without mentioning the random sequence generation was evaluated as high risk. All other parts were evaluated as unclear risk. The treatment group (Taeksa-tang or Taeksa-tang-gagam) showed more statistically significant effects compared to the control group (Western medicine or Chinese patent medicine) in TER (RR: 1.24, 95% CI 1.15 to 1.34, P<0.00001), TC (MD: -1.12, 95% CI -1.68 to -0.56, P<0.0001), TG (MD: -1.08, 95% CI -1.65 to -0.51, P=0.0002), HDL-C (MD: 0.63, 95% CI 0.34 to 0.93, P<0.0001), and LDL-C (MD: -0.81, 95% CI -1.10 to -0.53, P<0.00001). In addition, the treatment group showed lower adverse effects compared to the control group (RR: 0.30, 95% CI 0.12 to 0.74, P=0.008). Conclusion: This study suggests that Taeksa-tang is effective and safe to use for treating dyslipidemia. However, due to the low quality of the included studies, more clinical studies need to be conducted in the future to increase the possibility of clinical use.
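
The abstract above reports risk ratios (RR) with 95% confidence intervals. As a minimal sketch of how such a figure is obtained for a single trial, the function below computes RR and a confidence interval on the log scale. It does not reproduce the pooling across trials that a meta-analysis performs, and the event counts are hypothetical, not taken from the included studies.

```python
import math


def risk_ratio(events_t, n_t, events_c, n_c, z=1.96):
    """Risk ratio of treatment vs. control with a confidence interval.

    The interval is built on the log scale using the usual large-sample
    standard error; z=1.96 corresponds to a 95% interval.
    """
    rr = (events_t / n_t) / (events_c / n_c)
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical single trial: 45 of 50 patients respond on treatment, 36 of 50 on control.
rr, lower, upper = risk_ratio(45, 50, 36, 50)
```

An interval that lies entirely above 1, as it does for these made-up counts, indicates a higher response rate in the treatment group; for adverse effects, an interval entirely below 1 indicates fewer events in the treatment group.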