• Title/Summary/Keyword: Artificial Intelligence

Spontaneous Speech Emotion Recognition Based On Spectrogram With Convolutional Neural Network (CNN 기반 스펙트로그램을 이용한 자유발화 음성감정인식)

  • Guiyoung Son; Soonil Kwon
    • The Transactions of the Korea Information Processing Society, v.13 no.6, pp.284-290, 2024
  • Speech emotion recognition (SER) is a technique used to analyze a speaker's voice patterns, including vibration, intensity, and tone, to determine their emotional state. Interest in artificial intelligence (AI) techniques has been increasing, and they are now widely used in medicine, education, industry, and the military. Most existing research, however, has attained impressive results by utilizing acted speech recorded by skilled actors in a controlled environment for various scenarios. In particular, there is a mismatch between acted and spontaneous speech, since acted speech includes more explicit emotional expressions than spontaneous speech. For this reason, spontaneous speech emotion recognition remains a challenging task. This paper aims to conduct emotion recognition and improve performance using spontaneous speech data. To this end, we implement deep learning-based speech emotion recognition using a VGG (Visual Geometry Group) network after converting the 1-dimensional audio signal into a 2-dimensional spectrogram image. The experimental evaluations are performed on the Korean spontaneous emotional speech database from AI-Hub, which covers 7 emotions: joy, love, anger, fear, sadness, surprise, and neutral. As a result, we achieved average accuracies of 83.5% and 73.0% for adults and young people, respectively, using the two-dimensional time-frequency spectrogram. In conclusion, our findings demonstrate that the suggested framework outperforms current state-of-the-art techniques for spontaneous speech and shows promising performance despite the difficulty of quantifying emotional expression in spontaneous speech.
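
As a rough illustration of the pipeline this abstract describes, the sketch below converts an audio clip into a spectrogram image and feeds it to a VGG-16 classifier with a 7-emotion output head. It assumes librosa and torchvision; the file name, sampling rate, and the choice of a mel-scale spectrogram are illustrative assumptions, not details confirmed by the paper.

```python
# Minimal sketch: 1-D audio -> 2-D spectrogram image -> VGG-16 with 7 outputs.
import librosa
import numpy as np
import torch
import torch.nn as nn
from torchvision.models import vgg16

def audio_to_spectrogram(path, sr=16000, n_mels=224):
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    db = librosa.power_to_db(mel, ref=np.max)              # log-scale magnitudes
    db = (db - db.min()) / (db.max() - db.min() + 1e-8)    # normalize to [0, 1]
    img = torch.tensor(db, dtype=torch.float32)
    img = img.unsqueeze(0).repeat(3, 1, 1)                 # 3 channels for VGG
    return torch.nn.functional.interpolate(
        img.unsqueeze(0), size=(224, 224), mode="bilinear"
    ).squeeze(0)

model = vgg16(weights=None)
model.classifier[6] = nn.Linear(4096, 7)   # 7 emotions: joy, love, anger, ...
x = audio_to_spectrogram("utterance.wav").unsqueeze(0)  # hypothetical file
logits = model(x)                                       # (1, 7) emotion scores
```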

Prediction of Patient Management in COVID-19 Using Deep Learning-Based Fully Automated Extraction of Cardiothoracic CT Metrics and Laboratory Findings

  • Thomas Weikert; Saikiran Rapaka; Sasa Grbic; Thomas Re; Shikha Chaganti; David J. Winkel; Constantin Anastasopoulos; Tilo Niemann; Benedikt J. Wiggli; Jens Bremerich; Raphael Twerenbold; Gregor Sommer; Dorin Comaniciu; Alexander W. Sauter
    • Korean Journal of Radiology, v.22 no.6, pp.994-1004, 2021
  • Objective: To extract pulmonary and cardiovascular metrics from chest CTs of patients with coronavirus disease 2019 (COVID-19) using a fully automated deep learning-based approach and assess their potential to predict patient management. Materials and Methods: All initial chest CTs of patients who tested positive for severe acute respiratory syndrome coronavirus 2 at our emergency department between March 25 and April 25, 2020, were identified (n = 120). Three patient management groups were defined: group 1 (outpatient), group 2 (general ward), and group 3 (intensive care unit [ICU]). Multiple pulmonary and cardiovascular metrics were extracted from the chest CT images using deep learning. Additionally, six laboratory findings indicating inflammation and cellular damage were considered. Differences in CT metrics, laboratory findings, and demographics between the patient management groups were assessed. The potential of these parameters to predict patients' needs for intensive care (yes/no) was analyzed using logistic regression and receiver operating characteristic curves. Internal and external validity were assessed using 109 independent chest CT scans. Results: While demographic parameters alone (sex and age) were not sufficient to predict ICU management status, both CT metrics alone (including both pulmonary and cardiovascular metrics; area under the curve [AUC] = 0.88; 95% confidence interval [CI] = 0.79-0.97) and laboratory findings alone (C-reactive protein, lactate dehydrogenase, white blood cell count, and albumin; AUC = 0.86; 95% CI = 0.77-0.94) were good classifiers. Excellent performance was achieved by a combination of demographic parameters, CT metrics, and laboratory findings (AUC = 0.91; 95% CI = 0.85-0.98). Application of a model that combined both pulmonary CT metrics and demographic parameters on a dataset from another hospital indicated its external validity (AUC = 0.77; 95% CI = 0.66-0.88). Conclusion: Chest CT of patients with COVID-19 contains valuable information that can be accessed using automated image analysis. These metrics are useful for the prediction of patient management.
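
For intuition, here is a minimal sketch of the classification step the abstract describes: a logistic regression over combined CT metrics and laboratory findings, scored by the area under the ROC curve. The synthetic features and labels are placeholders, not the study's data.

```python
# Minimal sketch: logistic regression + ROC AUC for ICU-need prediction.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 6))     # e.g., lung opacity %, CRP, LDH, WBC, albumin, age
y = rng.integers(0, 2, size=120)  # 1 = needed ICU care, 0 = did not

clf = LogisticRegression(max_iter=1000).fit(X, y)
scores = clf.predict_proba(X)[:, 1]      # predicted probability of ICU need
print(f"AUC = {roc_auc_score(y, scores):.2f}")
```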

Automated Lung Segmentation on Chest Computed Tomography Images with Extensive Lung Parenchymal Abnormalities Using a Deep Neural Network

  • Seung-Jin Yoo; Soon Ho Yoon; Jong Hyuk Lee; Ki Hwan Kim; Hyoung In Choi; Sang Joon Park; Jin Mo Goo
    • Korean Journal of Radiology, v.22 no.3, pp.476-488, 2021
  • Objective: We aimed to develop a deep neural network for segmenting lung parenchyma with extensive pathological conditions on non-contrast chest computed tomography (CT) images. Materials and Methods: Thin-section non-contrast chest CT images from 203 patients (115 males, 88 females; age range, 31-89 years) between January 2017 and May 2017 were included in the study, of which 150 cases had extensive lung parenchymal disease involving more than 40% of the parenchymal area. Parenchymal diseases included interstitial lung disease (ILD), emphysema, nontuberculous mycobacterial lung disease, tuberculous destroyed lung, pneumonia, lung cancer, and other diseases. Five experienced radiologists manually drew the margin of the lungs, slice by slice, on CT images. The dataset used to develop the network consisted of 157 cases for training, 20 cases for development, and 26 cases for internal validation. Two-dimensional (2D) U-Net and three-dimensional (3D) U-Net models were used for the task. The network was trained to segment the lung parenchyma as a whole and segment the right and left lung separately. The University Hospitals of Geneva ILD dataset, which contained high-resolution CT images of ILD, was used for external validation. Results: The Dice similarity coefficients for internal validation were 99.6 ± 0.3% (2D U-Net whole lung model), 99.5 ± 0.3% (2D U-Net separate lung model), 99.4 ± 0.5% (3D U-Net whole lung model), and 99.4 ± 0.5% (3D U-Net separate lung model). The Dice similarity coefficients for the external validation dataset were 98.4 ± 1.0% (2D U-Net whole lung model) and 98.4 ± 1.0% (2D U-Net separate lung model). In 31 cases, where the extent of ILD was larger than 75% of the lung parenchymal area, the Dice similarity coefficients were 97.9 ± 1.3% (2D U-Net whole lung model) and 98.0 ± 1.2% (2D U-Net separate lung model). Conclusion: The deep neural network achieved excellent performance in automatically delineating the boundaries of lung parenchyma with extensive pathological conditions on non-contrast chest CT images.
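
The Dice similarity coefficient used to report these results is simple to compute: DSC = 2|A∩B| / (|A| + |B|) for a predicted and a ground-truth binary mask. A minimal NumPy sketch with toy masks follows.

```python
# Minimal sketch: Dice similarity coefficient for binary segmentation masks.
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + 1e-8)

# Toy example: two overlapping "lung" masks
a = np.zeros((64, 64), bool); a[10:50, 10:50] = True
b = np.zeros((64, 64), bool); b[12:52, 12:52] = True
print(f"Dice = {dice(a, b):.3f}")
```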

Cycle-Consistent Generative Adversarial Network: Effect on Radiation Dose Reduction and Image Quality Improvement in Ultralow-Dose CT for Evaluation of Pulmonary Tuberculosis

  • Chenggong Yan; Jie Lin; Haixia Li; Jun Xu; Tianjing Zhang; Hao Chen; Henry C. Woodruff; Guangyao Wu; Siqi Zhang; Yikai Xu; Philippe Lambin
    • Korean Journal of Radiology, v.22 no.6, pp.983-993, 2021
  • Objective: To investigate the image quality of ultralow-dose CT (ULDCT) of the chest reconstructed using a cycle-consistent generative adversarial network (CycleGAN)-based deep learning method in the evaluation of pulmonary tuberculosis. Materials and Methods: Between June 2019 and November 2019, 103 patients (mean age, 40.8 ± 13.6 years; 61 men and 42 women) with pulmonary tuberculosis were prospectively enrolled to undergo standard-dose CT (120 kVp with automated exposure control), followed immediately by ULDCT (80 kVp and 10 mAs). The images of the two successive scans were used to train the CycleGAN framework for image-to-image translation. The denoising efficacy of the CycleGAN algorithm was compared with that of hybrid and model-based iterative reconstruction. Repeated-measures analysis of variance and Wilcoxon signed-rank test were performed to compare the objective measurements and the subjective image quality scores, respectively. Results: With the optimized CycleGAN denoising model, using the ULDCT images as input, the peak signal-to-noise ratio and structural similarity index improved by 2.0 dB and 0.21, respectively. The CycleGAN-generated denoised ULDCT images typically provided satisfactory image quality for optimal visibility of anatomic structures and pathological findings, with a lower level of image noise (mean ± standard deviation [SD], 19.5 ± 3.0 Hounsfield unit [HU]) than that of the hybrid (66.3 ± 10.5 HU, p < 0.001) and a similar noise level to model-based iterative reconstruction (19.6 ± 2.6 HU, p > 0.908). The CycleGAN-generated images showed the highest contrast-to-noise ratios for the pulmonary lesions, followed by the model-based and hybrid iterative reconstruction. The mean effective radiation dose of ULDCT was 0.12 mSv with a mean 93.9% reduction compared to standard-dose CT. Conclusion: The optimized CycleGAN technique may allow the synthesis of diagnostically acceptable images from ULDCT of the chest for the evaluation of pulmonary tuberculosis.
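
The two objective image quality metrics reported above, peak signal-to-noise ratio and the structural similarity index, can be computed with scikit-image as in this minimal sketch; the image pair here is synthetic, standing in for the standard-dose and ultralow-dose scans.

```python
# Minimal sketch: PSNR and SSIM between a reference and a degraded image.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((256, 256))                            # stands in for standard-dose CT
noisy = reference + 0.05 * rng.normal(size=reference.shape)   # stands in for ULDCT

psnr = peak_signal_noise_ratio(reference, noisy, data_range=1.0)
ssim = structural_similarity(reference, noisy, data_range=1.0)
print(f"PSNR = {psnr:.1f} dB, SSIM = {ssim:.3f}")
```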

Development and Validation of a Deep Learning System for Segmentation of Abdominal Muscle and Fat on Computed Tomography

  • Hyo Jung Park; Yongbin Shin; Jisuk Park; Hyosang Kim; In Seob Lee; Dong-Woo Seo; Jimi Huh; Tae Young Lee; TaeYong Park; Jeongjin Lee; Kyung Won Kim
    • Korean Journal of Radiology, v.21 no.1, pp.88-100, 2020
  • Objective: We aimed to develop and validate a deep learning system for fully automated segmentation of abdominal muscle and fat areas on computed tomography (CT) images. Materials and Methods: A fully convolutional network-based segmentation system was developed using a training dataset of 883 CT scans from 467 subjects. Axial CT images obtained at the inferior endplate level of the 3rd lumbar vertebra were used for the analysis. Manually drawn segmentation maps of the skeletal muscle, visceral fat, and subcutaneous fat were created to serve as ground truth data. The performance of the fully convolutional network-based segmentation system was evaluated using the Dice similarity coefficient and cross-sectional area error, for both a separate internal validation dataset (426 CT scans from 308 subjects) and an external validation dataset (171 CT scans from 171 subjects from two outside hospitals). Results: The mean Dice similarity coefficients for muscle, subcutaneous fat, and visceral fat were high for both the internal (0.96, 0.97, and 0.97, respectively) and external (0.97, 0.97, and 0.97, respectively) validation datasets, while the mean cross-sectional area errors for muscle, subcutaneous fat, and visceral fat were low for both internal (2.1%, 3.8%, and 1.8%, respectively) and external (2.7%, 4.6%, and 2.3%, respectively) validation datasets. Conclusion: The fully convolutional network-based segmentation system exhibited high performance and accuracy in the automatic segmentation of abdominal muscle and fat on CT images.
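
The cross-sectional area error reported above is the relative difference between predicted and ground-truth tissue areas. A minimal sketch follows; the pixel spacing used to convert pixel counts to cm² is an illustrative assumption, not the study's value.

```python
# Minimal sketch: cross-sectional area error (%) from two binary masks.
import numpy as np

def area_cm2(mask: np.ndarray, spacing_mm=(0.7, 0.7)) -> float:
    # pixel count * pixel area (mm^2), converted to cm^2
    return mask.sum() * spacing_mm[0] * spacing_mm[1] / 100.0

def area_error_pct(pred: np.ndarray, truth: np.ndarray) -> float:
    return 100.0 * abs(area_cm2(pred) - area_cm2(truth)) / area_cm2(truth)

truth = np.zeros((512, 512), bool); truth[100:300, 100:400] = True  # e.g., muscle
pred = np.zeros((512, 512), bool); pred[102:302, 101:399] = True
print(f"area error = {area_error_pct(pred, truth):.1f}%")
```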

An Investigation Into the Effects of AI-Based Chemistry I Class Using Classification Models (분류 모델을 활용한 AI 기반 화학 I 수업의 효과에 대한 연구)

  • Heesun Yang; Seonghyeok Ahn; Seung-Hyun Kim; Seong-Joo Kang
    • Journal of the Korean Chemical Society, v.68 no.3, pp.160-175, 2024
  • The purpose of this study is to examine the effects of a Chemistry I class based on an artificial intelligence (AI) classification model. To achieve this, the research investigated the development and application of a class utilizing an AI classification model in Chemistry I classes conducted at D High School in Gyeongbuk during the first semester of 2023. After selecting the curriculum content and AI tools, and determining the curriculum-AI integration education model as well as AI hardware and software, we developed detailed activities for the program and applied them in actual classes. Following the implementation of the classes, it was confirmed that students' self-efficacy improved in three aspects: chemistry concept formation, AI value perception, and AI-based maker competency. Specifically, the chemistry classes based on text and image classification models had a positive impact on students' self-efficacy for chemistry concept formation, enhanced students' perception of AI value and interest, and contributed to improving students' AI and physical computing abilities. These results demonstrate the positive impact of the Chemistry I class based on an AI classification model on students, providing evidence of its utility in educational settings.

A Study on Strategic Development Approaches for Cyber Seniors in the Information Security Industry

  • Seung Han Yoon; Ah Reum Kang
    • Journal of the Korea Society of Computer and Information, v.29 no.4, pp.73-82, 2024
  • In 2017, the United Nations reported that the population aged 60 and above was increasing more rapidly than all younger age groups worldwide, projecting that by 2050 it would constitute at least 25% of the global population outside Africa. The growth rate of the working-age population is declining due to global aging, and the younger generation tends to avoid difficult and demanding occupations. Although, in theory, artificial intelligence could replace humans in all fields, in practical information security work human judgment and expertise remain absolutely essential, especially for ethical considerations. Therefore, this paper proposes a method to retrain IT professionals aged 50 and above who are retiring or seeking career transitions and reintegrate them into the industry. For this research, surveys were conducted with 21 government/public agencies representing demand and 9 security monitoring companies representing supply. The results indicated that both the demand side (90%) and the supply side (78%) agreed on the absolute necessity of such measures. If applied in the field, the results of this research could lead to the strategic development of senior information security professionals, laying the foundation for a new market in the Korean information security industry in an era of low birth rates and longevity.

Simultaneous Optimization of a KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems, v.22 no.1, pp.139-157, 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy; bankruptcy prediction is therefore an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models, and ensemble learning techniques are known to be very useful for improving the generalization ability of a classifier. Base classifiers in an ensemble must be as accurate and diverse as possible in order to enhance its generalization ability. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and the random subspace method. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble: each ensemble member is trained on a randomly chosen feature subspace, and the members' predictions are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but very sensitive to changes in the feature space, which makes it a good base classifier for the random subspace method, and the KNN random subspace ensemble has been shown to be very effective at improving on an individual KNN model. The k parameter of the KNN base classifiers and the feature subsets selected for them play an important role in determining the performance of the KNN ensemble model, yet few studies have focused on optimizing them. This study proposes a new ensemble method that improves the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of the base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem using a real dataset from Korean companies. The research data included 1800 externally non-audited firms that either filed for bankruptcy (900 cases) or did not (900 cases). Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent-sample t-test of each financial ratio as an input variable against bankruptcy/non-bankruptcy as the output variable; of these, 24 financial ratios were selected using a logistic regression backward feature selection method. The complete dataset was separated into training and validation parts. The training dataset was further divided into two portions: one for training the models and the other for guarding against overfitting, with the prediction accuracy on the latter portion used as the fitness value. The validation dataset was used to evaluate the effectiveness of the final model.
10-fold cross-validation was implemented to compare the performance of the proposed model with that of other models, and the Q-statistic values and average classification accuracies of the base classifiers were also investigated. The experimental results showed that the proposed model outperformed the other models, including the single KNN model and the plain random subspace ensemble model.
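
A minimal sketch of the baseline this paper builds on, the KNN random subspace ensemble, is given below using scikit-learn's BaggingClassifier with feature subsampling. The genetic-algorithm search over k values and feature subsets that the paper proposes is only indicated in a comment, and the synthetic data stands in for the Korean bankruptcy dataset.

```python
# Minimal sketch: KNN random subspace ensemble with majority voting.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# 1800 firms, 24 financial ratios (synthetic stand-in for the study data)
X, y = make_classification(n_samples=1800, n_features=24, random_state=0)

# Random subspace: sample features (not rows) for each base classifier.
ensemble = BaggingClassifier(
    estimator=KNeighborsClassifier(n_neighbors=5),
    n_estimators=30,
    max_features=0.5,   # each KNN sees a random half of the 24 ratios
    bootstrap=False,    # keep all samples; vary only the feature subspace
    random_state=0,
)
print(cross_val_score(ensemble, X, y, cv=10).mean())  # 10-fold CV accuracy

# The paper's contribution: a genetic algorithm (e.g., via the DEAP library)
# would encode each base classifier's k and feature mask as a chromosome and
# use held-out prediction accuracy as the fitness value.
```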

Comparison of Deep Learning Frameworks: Theano, TensorFlow, and Cognitive Toolkit (딥러닝 프레임워크의 비교: 티아노, 텐서플로, CNTK를 중심으로)

  • Chung, Yeojin; Ahn, SungMahn; Yang, Jiheon; Lee, Jaejoon
    • Journal of Intelligence and Information Systems, v.23 no.2, pp.1-17, 2017
  • A deep learning framework is software designed to help develop deep learning models; two of its most important functions are automatic differentiation and GPU utilization. The list of popular deep learning frameworks includes Caffe (BVLC) and Theano (University of Montreal), and recently Microsoft's deep learning framework, Microsoft Cognitive Toolkit (CNTK), was released under an open-source license, following Google's TensorFlow a year earlier. The early deep learning frameworks were developed mainly for research at universities; since the release of TensorFlow, however, companies such as Microsoft and Facebook have joined the competition in framework development. Given this trend, Google and other companies are expected to keep investing in deep learning frameworks to take the initiative in the artificial intelligence business. From this point of view, we think it is a good time to compare deep learning frameworks, so we compare three that can be used as Python libraries: Google's TensorFlow, Microsoft's CNTK, and Theano, which is in some sense a predecessor of the other two. The most common and important function of deep learning frameworks is the ability to perform automatic differentiation. Essentially all the mathematical expressions of deep learning models can be represented as computational graphs, which consist of nodes and edges. Partial derivatives on each edge of a computational graph can then be obtained, and with these partial derivatives, the software can compute the derivative of any node with respect to any variable by applying the chain rule of calculus. First, coding convenience ranks in the order CNTK, TensorFlow, Theano. This criterion is based simply on code length; the learning curve and ease of coding are not the main concern. By this criterion, Theano was the most difficult to implement with, while CNTK and TensorFlow were somewhat easier. With TensorFlow, we need to define weight variables and biases explicitly. The reason CNTK and TensorFlow are easier to implement with is that they provide more abstraction than Theano. We should mention, however, that low-level coding is not always bad: it gives us flexibility, and with low-level coding as in Theano, we can implement and test any new deep learning model or search method we can think of. As for execution speed, we found no meaningful difference among the frameworks. In our experiment, the execution speeds of Theano and TensorFlow were very similar, although the experiment was limited to a CNN model. For CNTK, the experimental environment was not identical: the CNTK code had to be run on a PC without a GPU, where code executes up to 50 times slower than with a GPU. We nevertheless concluded that the difference in execution speed was within the range of variation caused by the different hardware setup. In this study, we compared three deep learning frameworks: Theano, TensorFlow, and CNTK. According to Wikipedia, there are 12 available deep learning frameworks, differentiated by 15 attributes. Important attributes include the interface language (Python, C++, Java, etc.) and the availability of libraries for various deep learning models such as CNN, RNN, and DBN.
For users implementing large-scale deep learning models, support for multiple GPUs or multiple servers is also important, and for those learning deep learning, the availability of sufficient examples and references matters as well.
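
To make the shared "automatic differentiation" function concrete, the sketch below uses modern TensorFlow's GradientTape, which postdates the 2017-era graph APIs compared in the article but applies the same chain rule over a computational graph; the weight and bias are defined explicitly, as the article notes TensorFlow requires.

```python
# Minimal sketch: automatic differentiation over a computational graph.
import tensorflow as tf

w = tf.Variable(2.0)   # weight variable, defined explicitly
b = tf.Variable(0.5)   # bias
x = tf.constant(3.0)

with tf.GradientTape() as tape:
    y = w * x + b               # nodes and edges of the computational graph
    loss = tf.square(y - 1.0)

# Chain rule applied along the graph edges: d(loss)/dw and d(loss)/db
grads = tape.gradient(loss, [w, b])
print([g.numpy() for g in grads])  # [33.0, 11.0]
```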

Development of an Early Warning System for Technology Leakage from Small and Medium Enterprises (중소기업 기술 유출에 대한 조기경보시스템 개발에 대한 연구)

  • Seo, Bong-Goon; Park, Do-Hyung
    • Journal of Intelligence and Information Systems, v.23 no.1, pp.143-159, 2017
  • Due to the rapid development of IT in recent years, the leakage not only of personal information but also of companies' key technologies and information has become an important issue. A company's core technology is vital to its survival and to sustaining its competitive advantage, and there have recently been many cases of technology infringement. Technology leaks not only cause tremendous financial losses, such as falling stock prices, but also damage corporate reputation and delay corporate development. For SMEs, where core technology is an even larger part of the enterprise than in large corporations, preparing against technology leakage is indispensable to survival. As the necessity and importance of Information Security Management (ISM) emerge, enterprises need to check for and prepare against the threat of technology infringement early. Nevertheless, roughly 90% of previous studies are limited to proposing policy alternatives; in terms of research method, literature analysis accounts for 76% of prior work, while empirical and statistical analysis accounts for only 16%. For this reason, management models and prediction models suited to the characteristics of SMEs are needed to prevent technology leakage. In this study, before the empirical analysis, we drew on prior research on the factors affecting technology leakage to define technology characteristics from a technology-value perspective and organizational characteristics from a technology-control perspective. A total of 12 related variables were selected for the two factors, and the analysis was performed with these variables. We use three years of data from the "Small and Medium Enterprise Technical Statistics Survey" conducted by the Small and Medium Business Administration. The analysis data cover 30 industries based on the 2-digit KSIC classification, and the number of companies affected by technology leakage is 415 over the three years. From these data, we drew a random sample of unaffected firms matched by KSIC industry and year, preparing 1:1 matched samples of affected firms (n = 415) and unaffected firms (n = 415) for analysis. We conduct an empirical analysis to identify the factors influencing technology leakage and propose an early warning system built through data mining. Specifically, based on the Small and Medium Business Administration's questionnaire survey of SMEs, we classified the factors that affect the technology leakage of SMEs into two groups (technology characteristics and organization characteristics), and we propose a model that signals the possibility of technology infringement using a Support Vector Machine (SVM), one of several data mining techniques, built on the factors validated through statistical analysis. Unlike previous studies, this study covers cases from many industries over several years, and an artificial intelligence model was developed in the process. In addition, because the factors are derived empirically from actual SME technology leakage, the results can suggest to policy makers which companies should be managed from the standpoint of technology protection.
Finally, the early warning model proposed in this study is expected to give both enterprises and the government an opportunity to prevent technology leakage in advance.
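
A minimal sketch of the kind of SVM-based early warning model described above follows, with 12 illustrative technology/organization variables and a 1:1 matched sample of 830 firms mirroring the study design; the data are synthetic, and the pipeline choices (feature scaling, RBF kernel) are assumptions rather than the paper's exact configuration.

```python
# Minimal sketch: SVM classifier as a technology-leakage early warning model.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(830, 12))     # 12 technology/organization variables
y = rng.integers(0, 2, size=830)   # 1 = leakage experienced, 0 = matched control

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
model.fit(X_tr, y_tr)

risk = model.predict_proba(X_te)[:, 1]   # probability used as the warning score
print(f"test accuracy = {model.score(X_te, y_te):.2f}")
```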