• Title/Summary/Keyword: Large Dataset


Machine-learning-based out-of-hospital cardiac arrest (OHCA) detection in emergency calls using speech recognition (119 응급신고에서 수보요원과 신고자의 통화분석을 활용한 머신 러닝 기반의 심정지 탐지 모델)

  • Jong In Kim;Joo Young Lee;Jio Chung;Dae Jin Shin;Dong Hyun Choi;Ki Hong Kim;Ki Jeong Hong;Sunhee Kim;Minhwa Chung
    • Phonetics and Speech Sciences / v.15 no.4 / pp.109-118 / 2023
  • Cardiac arrest is a critical medical emergency in which immediate response is essential for patient survival. This is especially true for out-of-hospital cardiac arrest (OHCA), for which the early actions of emergency medical services significantly affect outcomes. In Korea, however, a shortage of dispatchers, each of whom handles a large volume of emergency calls, poses a challenge. In this study, we address this challenge by developing a machine-learning-based OHCA detection program that can assist responders and improve patient survival rates. The program analyzes transcripts of conversations between responders and callers to identify instances of cardiac arrest. The proposed system includes an automatic transcription module for these conversations, a text-based cardiac arrest detection model, and the server and client components needed for deployment. In experiments, the model achieved an F1 score of 79.49% and detected cardiac arrest 15 seconds faster than dispatchers. Despite the limited dataset, this research highlights the potential of a cardiac arrest detection program as a valuable tool for responders, one that could ultimately improve cardiac arrest survival rates.

Empirical Analysis of the Influence of ICT SMEs' R&D Resources on Corporate Performance (ICT 중소기업의 연구개발 자원이 기업성과에 미치는 영향에 관한 실증연구)

  • Jong Yoon Won;Kun Chang Lee
    • Information Systems Review / v.23 no.3 / pp.1-23 / 2021
  • The national economic policy paradigm changes constantly with the global business environment, and fostering SMEs is a core policy of many developed countries. The growth of SMEs contributes to job creation and the development of local communities in an era of jobless growth, and it is the foundation for growth into mid-sized and large enterprises. The growth of SMEs therefore plays an important role in the national economy. Information and communication technology (ICT) has become far more important with the emergence of the Fourth Industrial Revolution, and the growth of ICT SMEs is an asset for the nation's future. This study therefore examines and verifies the main factors affecting the performance of ICT SMEs from the perspective of their R&D resources. Using a dataset of 1,999 SMEs, an empirical analysis was performed to investigate the influence of R&D resources on corporate performance. The results are as follows. First, consistent with resource-based theory, ICT SMEs' R&D investment, R&D manpower, and government support policies were found to have a positive effect on securing a company's competitive advantage. Second, the level of the product was found to have a positive effect on the company's performance. Finally, M&A and technology acquisition strategies were found to differ according to the growth stage of the company. Therefore, government support policy and investment in internal R&D personnel are the main factors in achieving technological innovation and corporate performance for ICT SMEs.

Research on Characterizing Urban Color Analysis based on Tourists-Shared Photos and Machine Learning - Focused on Dali City, China - (관광객 공유한 사진 및 머신 러닝을 활용한 도시 색채 특성 분석 연구 - 중국 대리시를 대상으로 -)

  • Yin, Xiaoyan;Jung, Taeyeol
    • Journal of the Korean Institute of Landscape Architecture / v.52 no.2 / pp.39-50 / 2024
  • Color is an essential visual element that has a significant impact on the formation of a city's image and on people's perceptions. Quantitative analysis of color in urban environments is a complex process that has been difficult to implement in the past. However, with recent rapid advances in machine learning, it has become possible to analyze city colors using photos shared by tourists. This study selected Dali City, a popular tourist destination in China, as a case study. Photos of Dali City shared by tourists were collected, and a method for measuring city colors at large scale was explored by combining machine learning techniques. Specifically, a DeepLabv3+ model based on the ADE20k dataset was first applied to perform semantic segmentation of the tourist-shared photos, thereby separating the artificial elements in the photos. Next, the K-means clustering algorithm was used to extract colors from the artificial elements, and an adjacency matrix was constructed to analyze the correlations between the dominant colors. The results indicate that orange-gray accounts for the largest share of the colors of artificial elements in Dali City. Furthermore, gray tones are often used in combination with other colors. The results also indicate that local ethnic and Buddhist cultures influence the color characteristics of artificial elements in Dali City. This research provides a new method of color analysis, and the results not only help Dali City shape an urban color image that meets the expectations of tourists but also provide reference material for future urban color planning in Dali City.
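
The color-extraction step described in this abstract can be illustrated with a minimal sketch. It assumes the segmentation step has already reduced each photo to a list of RGB pixels from artificial elements (the DeepLabv3+ stage is not reproduced here). The co-occurrence definition of the adjacency matrix, the function names, the initial centroids, and the toy pixel values are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations

def nearest(pixel, centroids):
    """Index of the centroid closest to pixel (squared Euclidean distance in RGB)."""
    return min(range(len(centroids)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(pixel, centroids[i])))

def kmeans(pixels, centroids, iterations=10):
    """Plain Lloyd's algorithm with caller-supplied initial centroids."""
    centroids = [tuple(map(float, c)) for c in centroids]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for px in pixels:
            groups[nearest(px, centroids)].append(px)
        # Move each centroid to the mean of its group; leave empty groups in place.
        centroids = [
            tuple(sum(channel) / len(g) for channel in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def cooccurrence(photos, centroids):
    """adj[i][j] = number of photos in which dominant colors i and j both appear."""
    k = len(centroids)
    adj = [[0] * k for _ in range(k)]
    for pixels in photos:
        present = sorted({nearest(px, centroids) for px in pixels})
        for i, j in combinations(present, 2):
            adj[i][j] += 1
            adj[j][i] += 1
    return adj

# Hypothetical toy data: each photo is a list of RGB pixels from artificial elements.
photos = [
    [(200, 120, 60), (205, 125, 65), (128, 128, 128)],    # orange and gray
    [(130, 130, 130), (125, 125, 125), (250, 250, 250)],  # gray and white
    [(198, 118, 58), (126, 127, 129)],                    # orange and gray
]
all_pixels = [px for photo in photos for px in photo]
centers = kmeans(all_pixels, centroids=[(255, 128, 0), (128, 128, 128), (255, 255, 255)])
adjacency = cooccurrence(photos, centers)
```

In this toy example gray co-occurs with both other colors, which is the kind of pattern the abstract reports for gray tones; real photos would need far more pixels and a principled choice of the number of clusters.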

Using noise filtering and sufficient dimension reduction method on unstructured economic data (노이즈 필터링과 충분차원축소를 이용한 비정형 경제 데이터 활용에 대한 연구)

  • Jae Keun Yoo;Yujin Park;Beomseok Seo
    • The Korean Journal of Applied Statistics / v.37 no.2 / pp.119-138 / 2024
  • Text indicators are increasingly valuable in economic forecasting, but are often hindered by noise and high dimensionality. This study aims to explore post-processing techniques, specifically noise filtering and dimensionality reduction, to normalize text indicators and enhance their utility through empirical analysis. Predictive target variables for the empirical analysis include monthly leading index cyclical variations, BSI (business survey index) All industry sales performance, BSI All industry sales outlook, as well as quarterly real GDP SA (seasonally adjusted) growth rate and real GDP YoY (year-on-year) growth rate. This study explores the Hodrick and Prescott filter, which is widely used in econometrics for noise filtering, and employs sufficient dimension reduction, a nonparametric dimensionality reduction methodology, in conjunction with unstructured text data. The analysis results reveal that noise filtering of text indicators significantly improves predictive accuracy for both monthly and quarterly variables, particularly when the dataset is large. Moreover, this study demonstrated that applying dimensionality reduction further enhances predictive performance. These findings imply that post-processing techniques, such as noise filtering and dimensionality reduction, are crucial for enhancing the utility of text indicators and can contribute to improving the accuracy of economic forecasts.
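
As a rough illustration of the noise-filtering step, the sketch below implements the Hodrick and Prescott filter directly from its definition: the trend minimizes the squared deviation from the series plus a penalty `lam` on squared second differences of the trend. The function name, the dense solver, and the default `lam=1600.0` (the conventional value for quarterly data) are assumptions for illustration; the abstract does not state which smoothing parameter the authors used.

```python
def hp_filter(y, lam=1600.0):
    """Return the HP trend of y by solving (I + lam * K^T K) trend = y,
    where K is the second-difference operator."""
    n = len(y)
    if n < 3:
        return [float(v) for v in y]
    # Build A = I + lam * K^T K one second-difference row at a time.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for p, cp in enumerate(stencil):
            for q, cq in enumerate(stencil):
                a[r + p][r + q] += lam * cp * cq
    # Solve A * trend = y by Gaussian elimination with partial pivoting.
    b = [float(v) for v in y]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            if factor:
                for c in range(col, n):
                    a[r][c] -= factor * a[col][c]
                b[r] -= factor * b[col]
    # Back substitution.
    trend = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(a[r][c] * trend[c] for c in range(r + 1, n))
        trend[r] = (b[r] - known) / a[r][r]
    return trend

# Hypothetical noisy text indicator: a rising line plus alternating noise.
indicator = [0.5 * t + (0.3 if t % 2 else -0.3) for t in range(12)]
trend = hp_filter(indicator)
cycle = [v - s for v, s in zip(indicator, trend)]
```

The smoothed `trend` would be the filtered indicator passed on to forecasting, and `cycle` is the component treated as noise. A dense solver is used only to keep the sketch short; for long series a banded solver or a library routine would be the more practical choice.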

A Basic Study on User Experience Evaluation Based on User Experience Hierarchy Using ChatGPT 4.0 (챗지피티 4.0을 활용한 사용자 경험 계층 기반 사용자 경험 평가에 관한 기초적 연구)

  • Soomin Han;Jae Wan Park
    • The Journal of the Convergence on Culture Technology / v.10 no.2 / pp.493-498 / 2024
  • With the rapid advancement of generative artificial intelligence technology, there is growing interest in how to utilize it in practical applications. Additionally, the importance of prompt engineering to generate results that meet user demands is being newly highlighted. Exploring the new possibilities of generative AI can hold significant value. This study aims to utilize ChatGPT 4.0, a leading generative AI, to propose an effective method for evaluating user experience through the analysis of online customer review data. The user experience evaluation method was based on the six-layer elements of user experience: 'functionality', 'reliability', 'usability', 'convenience', 'emotion', and 'significance'. For this study, a literature review was conducted to enhance the understanding of prompt engineering and to grasp the clear concept of the user experience hierarchy. Based on this, prompts were crafted, and experiments for the user experience evaluation method were carried out using the analysis of collected online customer review data. In this study, we reveal that when provided with accurate definitions and descriptions of the classification processes for user experience factors, ChatGPT demonstrated excellent performance in evaluating user experience. However, it was also found that due to time constraints, there were limitations in analyzing large volumes of data. By introducing and proposing a method to utilize ChatGPT 4.0 for user experience evaluation, we expect to contribute to the advancement of the UX field.

Textbook Outcome of Delta-Shaped Anastomosis in Minimally Invasive Distal Gastrectomy for Gastric Cancer in 4,505 Consecutive Patients

  • Seul-Gi Oh;Suin Lee;Ba Ool Seong;Chang Seok Ko;Sa-Hong Min;Chung Sik Gong;Beom Su Kim;Moon-Won Yoo;Jeong Hwan Yook;In-Seob Lee
    • Journal of Gastric Cancer / v.24 no.3 / pp.341-352 / 2024
  • Purpose: Textbook outcome is a comprehensive measure used to assess surgical quality and is increasingly recognized as a valuable evaluation tool. Delta-shaped anastomosis (DA), an intracorporeal gastroduodenostomy, is a viable option for minimally invasive distal gastrectomy in patients with gastric cancer. This study aims to evaluate the surgical outcomes and calculate the textbook outcome of DA. Materials and Methods: In this retrospective study, the records of 4,902 patients who underwent minimally invasive distal gastrectomy with DA between 2009 and 2020 were reviewed. The data were categorized into three phases to analyze trends over time. Surgical outcomes, including operation time, length of post-operative hospital stay, and complication rates, were assessed, and the textbook outcome was calculated. Results: Among 4,505 patients, the textbook outcome was achieved in 3,736 (82.9%). Post-operative complications were the factor that most significantly affected the textbook outcome (91.9%). The highest textbook outcome was achieved in phase 2 (85.0%), surpassing the rates in phase 1 (81.7%) and phase 3 (82.3%). The post-operative complication rate within 30 days after surgery was 8.7%, and the rate of major complications exceeding Clavien-Dindo classification grade 3 was 2.4%. Conclusions: Based on the outcomes of a large dataset, DA can be considered safe and feasible for gastric cancer.

Prevalence of Inflammatory Bowel Disease Unclassified, as Estimated Using the Revised Porto Criteria, among Korean Pediatric Patients with Inflammatory Bowel Disease

  • Sung Hee Lee;Minsoo Shin;Seo Hee Kim;Seong Pyo Kim;Hyung-Jin Yoon;Yangsoon Park;Jaemoon Koh;Seak Hee Oh;Jae Sung Ko;Jin Soo Moon;Kyung Mo Kim
    • Pediatric Gastroenterology, Hepatology & Nutrition / v.27 no.4 / pp.206-214 / 2024
  • Purpose: Few studies have reported the prevalence of inflammatory bowel disease unclassified (IBDU) among the Korean pediatric IBD (PIBD) population. To address this gap, we used data from two tertiary centers and nationwide population-based healthcare administrative data to estimate the prevalence of Korean pediatric IBDU at the time of diagnosis. Methods: We identified 136 patients aged 2-17 years with newly diagnosed IBD (94 with Crohn's disease [CD] and 42 with ulcerative colitis [UC]) from two tertiary centers in Korea between 2005 and 2017. We reclassified these 136 patients using the revised Porto criteria. To estimate the population-based prevalence, we analyzed Korean administrative healthcare data between 2005 and 2016, which revealed 3,650 IBD patients, including 2,538 with CD and 1,112 with UC. By extrapolating the reclassification results to the population-based dataset, we estimated the prevalence of PIBD subtypes. Results: Among the 94 CD patients, the original diagnosis remained unchanged in 93 (98.9%), while the diagnosis of one (1.1%) patient was changed to IBDU. Among the 42 UC patients, the original diagnosis remained unchanged in 13 (31.0%), while the diagnoses of 11 (26.2%), 17 (40.5%), and one (2.4%) patient changed to atypical UC, IBDU, and CD, respectively. The estimated prevalences of CD, UC, atypical UC, and IBDU in the Korean population were 69.5%, 9.4%, 8.0%, and 13.1%, respectively. Conclusion: This study is the first in Korea to estimate the prevalence of pediatric IBDU. This prevalence (13.1%) aligns with findings from Western studies. Large-scale prospective multicenter studies on PIBDU are required to examine the clinical features and outcomes of this condition.
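
The extrapolation in this abstract can be followed with a short worked sketch. It assumes the simplest possible method: the reclassification proportions observed in the two-center cohort are applied to the administrative-data counts for the same original diagnosis. The abstract does not spell out the authors' exact procedure, so the function and variable names are illustrative, although this assumption does reproduce the reported percentages.

```python
# Reclassification counts from the two tertiary centers (taken from the abstract).
reclassified = {
    "CD": {"CD": 93, "IBDU": 1},
    "UC": {"UC": 13, "atypical UC": 11, "IBDU": 17, "CD": 1},
}
# Administrative-data patient counts by original diagnosis (taken from the abstract).
population = {"CD": 2538, "UC": 1112}

def extrapolate(reclassified, population):
    """Apply cohort reclassification shares to population counts; return percentages."""
    estimated = {}
    for original, outcomes in reclassified.items():
        cohort_size = sum(outcomes.values())
        for revised, count in outcomes.items():
            share = count / cohort_size
            estimated[revised] = estimated.get(revised, 0.0) + share * population[original]
    total = sum(population.values())
    return {subtype: 100 * n / total for subtype, n in estimated.items()}

prevalence = extrapolate(reclassified, population)
for subtype, pct in prevalence.items():
    print(f"{subtype}: {pct:.1f}%")
```

For example, the IBDU estimate combines 1 of 94 CD patients and 17 of 42 UC patients: 2,538 × 1/94 + 1,112 × 17/42 ≈ 477 of 3,650 patients, or about 13.1%.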

Development of an AI Model to Determine the Relationship between Cerebrovascular Disease and the Work Environment as well as Analysis of Consistency with Expert Judgment (뇌심혈관 질환과 업무 환경의 연관성 판단을 위한 AI 모델의 개발 및 전문가 판단과의 일치도 분석)

  • Juyeon Oh;Ki-bong Yoo;Ick Hoon Jin;Byungyoon Yun;Juho Sim;Heejoo Park;Jongmin Lee;Jian Lee;Jin-Ha Yoon
    • Journal of Korean Society of Occupational and Environmental Hygiene / v.34 no.3 / pp.202-213 / 2024
  • Introduction: Acknowledging the global issue of diseases potentially caused by overwork, this study aims to develop an AI model to help workers understand the connection between cerebrocardiovascular diseases and their work environment. Materials and methods: The model was trained using medical and legal expertise along with data from the 2021 occupational disease adjudication certificate by the Industrial Accident Compensation Insurance and Prevention Service. The Polyglot-ko-5.8B model, which is effective for processing Korean, was utilized. Model performance was evaluated through accuracy, precision, sensitivity, and F1-score metrics. Results: The model trained on a comprehensive dataset, including expert knowledge and actual case data, outperformed the others with respective accuracy, precision, sensitivity, and F1-scores of 0.91, 0.89, 0.84, and 0.87. However, it still had limitations in responding to certain scenarios. Discussion: The comprehensive model proved most effective in diagnosing work-related cerebrocardiovascular diseases, highlighting the significance of integrating actual case data in AI model development. Despite its efficacy, the model showed limitations in handling diverse cases and offering health management solutions. Conclusion: The study succeeded in creating an AI model to discern the link between work factors and cerebrocardiovascular diseases, showcasing the highest efficacy with the comprehensively trained model. Future enhancements towards a template-based approach and the development of a user-friendly chatbot webUI for workers are recommended to address the model's current limitations.

Comparison of Error Rate and Prediction of Compression Index of Clay to Machine Learning Models using Orange Mining (오렌지마이닝을 활용한 기계학습 모델별 점토 압축지수의 오차율 및 예측 비교)

  • Yoo-Jae Woong;Woo-Young Kim;Tae-Hyung Kim
    • Journal of the Korean Geosynthetics Society / v.23 no.3 / pp.15-22 / 2024
  • Predicting ground settlement during the improvement of soft ground and the construction of a structure is crucial. Numerous studies have been conducted, and many prediction equations have been proposed to estimate settlement. Settlement can be calculated using the compression index of clay. In this study, data on water content, void ratio, liquid limit, plastic limit, and compression index from the Busan New Port area were collected to construct a dataset, and correlation analysis was conducted on the collected data. Machine learning algorithms, including Random Forest, Neural Network, Linear Regression, AdaBoost, and Gradient Boosting, were applied using the Orange mining program to propose compression index prediction models. The models were evaluated by comparing RMSE and MAPE values, which indicate error rates, and R² values, which indicate goodness of fit. Water content showed the highest correlation with the compression index, while the plastic limit showed a somewhat lower correlation than the other characteristics. Among the compared models, the AdaBoost model performed best, with the lowest error rate and a large coefficient of determination.
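
The three evaluation metrics named in this abstract can be written out in a few lines. The sketch below uses the standard definitions of RMSE, MAPE, and the coefficient of determination; the compression-index values are made-up numbers for illustration and are not data from the study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error; requires non-zero actual values."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical measured vs. predicted compression index values.
measured = [0.5, 1.0, 0.8, 0.4]
predicted = [0.45, 1.1, 0.8, 0.5]
print(rmse(measured, predicted), mape(measured, predicted), r_squared(measured, predicted))
```

Lower RMSE and MAPE and a higher R² indicate a better model, which is the basis on which the abstract ranks AdaBoost first.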

Recommender Systems using Structural Hole and Collaborative Filtering (구조적 공백과 협업필터링을 이용한 추천시스템)

  • Kim, Mingun;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.107-120 / 2014
  • This study proposes a novel recommender system that uses structural hole analysis to reflect qualitative and emotional information in the recommendation process. Although collaborative filtering (CF) is the most popular recommendation algorithm, it has limitations, including scalability and sparsity problems. The scalability problem arises when the number of users and items becomes very large: CF cannot scale because finding neighbors in the user-item matrix takes more and more computation time as users and items increase on real-world e-commerce sites. Sparsity is a common problem of most recommender systems because users generally rate only a small portion of all items. The cold-start problem is a special case of the sparsity problem in which users or items are newly added to the system with no ratings at all. When a user's preference data is sparse, two users or items are unlikely to have ratings in common, so CF predicts ratings from a very limited number of similar users. Moreover, it may produce biased recommendations because similarity weights are estimated from only a small portion of the rating data. In this study, we point out a further limitation of conventional CF: it does not consider qualitative and emotional information about users in the recommendation process because it uses only the preference scores in the user-item matrix. To address this limitation, this study proposes a cluster-indexing CF model with structural hole analysis. In general, a structural hole is a location that connects two separate actors without any redundant connections in the network. The actor who occupies the structural hole can easily access non-redundant, varied, and fresh information. Therefore, the actor who occupies the structural hole may be an important person in the focal network and may be the representative person of the focal subgroup, so his or her characteristics may represent the general characteristics of the users in that subgroup. In this sense, we can distinguish friends from strangers of the focal user using structural hole analysis. This study uses structural hole analysis to select the structural holes in subgroups as initial seeds for a cluster analysis. First, we gather data on users' preference ratings for items and their social network information, using a data collection system developed for this purpose. Then, we perform structural hole analysis and find the structural holes of the social network. Next, we use these structural holes as cluster centroids for the clustering algorithm. Finally, this study makes recommendations using CF within each user's cluster and compares the recommendation performance with that of comparative models. To evaluate the proposed model, we combine the results of two experiments. The first is the structural hole analysis, for which this study employs UCINET version 6, a software package for the analysis of social network data. The second performs the modified clustering and then CF using the result of the cluster analysis; for this we develop an experimental system using VBA (Visual Basic for Applications) in Microsoft Excel 2007. For the modified clustering experiment, clustering is based on a novel similarity measure: the Pearson correlation between users' preference rating vectors. In addition, this study uses the 'all-but-one' approach for the CF experiment. To validate the effectiveness of the proposed model, we apply three comparative types of CF models to the same dataset. The experimental results show that the proposed model outperforms the other comparative models. In particular, statistical significance tests show that the proposed model performs significantly better than the two comparative models that use cluster analysis. However, the difference between the proposed model and the naive model is not statistically significant.
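
A minimal sketch of the CF stage described in this abstract follows: Pearson correlation between users' rating vectors, prediction restricted to neighbors in the same cluster, and 'all-but-one' evaluation in which each known rating is hidden in turn and predicted from the rest. The cluster assignments are simply given here (the structural hole analysis that seeds them is not reproduced), and the mean-centered weighted prediction formula, the function names, and the ratings are assumptions for illustration rather than the authors' VBA implementation.

```python
import math

def pearson(u, v):
    """Pearson correlation over the items both users rated; 0.0 if undefined."""
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return 0.0
    mu = sum(u[i] for i in common) / len(common)
    mv = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mu) * (v[i] - mv) for i in common)
    den = math.sqrt(sum((u[i] - mu) ** 2 for i in common) *
                    sum((v[i] - mv) ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, user, item, cluster):
    """Mean-centered weighted prediction using only neighbors in the same cluster."""
    profile = {i: r for i, r in ratings[user].items() if i != item}  # hide the target
    base = sum(profile.values()) / len(profile)
    num = den = 0.0
    for other in cluster:
        if other == user or item not in ratings[other]:
            continue
        weight = pearson(profile, ratings[other])
        other_mean = sum(ratings[other].values()) / len(ratings[other])
        num += weight * (ratings[other][item] - other_mean)
        den += abs(weight)
    return base + num / den if den else base

def all_but_one_mae(ratings, clusters):
    """Hide each rating in turn, predict it, and average the absolute errors."""
    errors = [abs(predict(ratings, user, item, clusters[user]) - rating)
              for user, items in ratings.items() if len(items) > 1
              for item, rating in items.items()]
    return sum(errors) / len(errors)

# Hypothetical ratings and cluster membership.
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4, "d": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 3},
    "cat": {"a": 1, "b": 5, "c": 2, "d": 1},
}
clusters = {"ann": ["ann", "bob"], "bob": ["ann", "bob"], "cat": ["cat"]}
print(predict(ratings, "ann", "d", clusters["ann"]), all_but_one_mae(ratings, clusters))
```

Restricting the neighbor search to one cluster is what addresses the scalability concern raised in the abstract: similarities are computed only against cluster members rather than against every user.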