Search | Korea Science

Enhancing Multimodal Emotion Recognition in Speech and Text with Integrated CNN, LSTM, and BERT Models (통합 CNN, LSTM, 및 BERT 모델 기반의 음성 및 텍스트 다중 모달 감정 인식 연구)

Edward Dwijayanto Cahyadi;Hans Nathaniel Hadi Soesilo;Mi-Hwa Song
- The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.617-623 / 2024
Identifying emotions through speech poses a significant challenge due to the complex relationship between language and emotions. Our paper takes on this challenge by employing feature engineering to identify emotions in speech through a multimodal classification task involving both speech and text data. We evaluated two classifiers, Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), both integrated with a BERT-based pre-trained model. Our assessment covers several performance metrics (accuracy, F-score, precision, and recall) across different experimental setups. The findings indicate that both models can accurately discern emotions from text and speech data.
https://doi.org/10.17703/JCCT.2024.10.1.617

Localization of ripe tomato bunch using deep neural networks and class activation mapping

Seung-Woo Kang;Soo-Hyun Cho;Dae-Hyun Lee;Kyung-Chul Kim
- Korean Journal of Agricultural Science / v.50 no.3 / pp.357-364 / 2023
In this study, we propose a ripe tomato bunch localization method based on convolutional neural networks, to be applied in robotic harvesting systems. Tomato images were obtained from a smart greenhouse at the Rural Development Administration (RDA). The sample images for training were extracted based on tomato maturity and resized to 128 × 128 pixels for use in the classification model. The model was constructed based on four-layer convolutional neural networks, and the classes were determined based on stage of maturity, using a Softmax classifier. The localization of the ripe tomato bunch region was indicated on a class activation map. The class activation map could show the approximate location of the tomato bunch but tends to present a local part or a large part of the ripe tomato bunch region, which could lead to poor performance. Therefore, we suggest a recursive method to improve the performance of the model. The classification results indicated that the accuracy, precision, recall, and F1-score were 0.98, 0.87, 0.98, and 0.92, respectively. The localization performance was 0.52, estimated by the Intersection over Union (IoU), and through input recursion, the IoU was improved by 13%. Based on the results, the proposed localization of the ripe tomato bunch area can be incorporated in robotic harvesting systems to establish the optimal harvesting paths.
https://doi.org/10.7744/kjoas.500305
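
The metrics in the abstract above can be illustrated with a minimal sketch. The F1-score is the harmonic mean of precision and recall, so the reported 0.87 and 0.98 can be checked against the reported 0.92. The IoU function assumes axis-aligned rectangles; the abstract does not say whether regions were compared as boxes or as masks, and the two boxes below are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Precision and recall as reported in the abstract.
print(round(f1_score(0.87, 0.98), 2))  # 0.92

# Hypothetical boxes (not from the paper): half of each box overlaps.
predicted = (0, 0, 64, 64)
ground_truth = (32, 0, 96, 64)
print(box_iou(predicted, ground_truth))  # one third
```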

Application of Text-Classification Based Machine Learning in Predicting Psychiatric Diagnosis (텍스트 분류 기반 기계학습의 정신과 진단 예측 적용)

Pak, Doohyun;Hwang, Mingyu;Lee, Minji;Woo, Sung-Il;Hahn, Sang-Woo;Lee, Yeon Jung;Hwang, Jaeuk
- Korean Journal of Biological Psychiatry / v.27 no.1 / pp.18-26 / 2020
Objectives: The aim was to find effective vectorization and classification models to predict a psychiatric diagnosis from text-based medical records. Methods: Electronic medical records (n = 494) describing the present illness were collected retrospectively from inpatient admission notes covering three diagnoses: major depressive disorder, type 1 bipolar disorder, and schizophrenia. The data were split into 400 training records and 94 independent validation records. Records were vectorized with two models, term frequency-inverse document frequency (TF-IDF) and Doc2vec. Classification models, including stochastic gradient descent, logistic regression, support vector classification, and deep learning (DL), were applied to predict the three psychiatric diagnoses. Five-fold cross-validation was used to find an effective model. Accuracy, precision, recall, and F1-score were measured to compare the models. Results: Five-fold cross-validation on the training data showed that the DL model with Doc2vec was the most effective at predicting the diagnosis (accuracy = 0.87, F1-score = 0.87). However, these metrics decreased on the independent test data set with the final working DL models (accuracy = 0.79, F1-score = 0.79), while logistic regression and the support vector machine with Doc2vec performed slightly better (accuracy = 0.80, F1-score = 0.80) than the DL models with Doc2vec and the other models with TF-IDF. Conclusions: The results suggest that the vectorization may have more impact on classification performance than the choice of machine learning model. However, the data set had a number of limitations, including small sample size, imbalance among the categories, and limited generalizability. Multi-site research with larger samples is therefore needed to improve the machine learning models.
https://doi.org/10.22857/kjbp.2020.27.1.003
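
The five-fold cross-validation described in the abstract above can be sketched as follows for the 400 training records. This is only an illustration of how the folds partition the data; the function name `k_fold_indices`, the shuffling, and the seed are assumptions, and the abstract does not say whether the authors stratified folds by diagnosis.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, validation_indices) pairs over shuffled indices."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary, assumed choice
    fold_size, remainder = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `remainder` folds take one extra sample when k does not divide n.
        stop = start + fold_size + (1 if i < remainder else 0)
        validation = indices[start:stop]
        train = indices[:start] + indices[stop:]
        folds.append((train, validation))
        start = stop
    return folds


folds = k_fold_indices(400, k=5)
for train, validation in folds:
    print(len(train), len(validation))  # 320 80 for every fold
```

Each record serves as validation data exactly once, so the cross-validated metrics use all 400 training records while the 94 held-out records remain untouched for the final test.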

Construction of Faster R-CNN Deep Learning Model for Surface Damage Detection of Blade Systems (블레이드의 표면 결함 검출을 위한 Faster R-CNN 딥러닝 모델 구축)

Jang, Jiwon;An, Hyojoon;Lee, Jong-Han;Shin, Soobong
- Journal of the Korea institute for structural maintenance and inspection / v.23 no.7 / pp.80-86 / 2019
As computer performance improves, research using deep learning is being actively carried out in various fields. Recently, deep learning technology has been applied to the safety evaluation of structures. In particular, detecting surface damage on the internal blades of a turbine structure requires experienced experts and considerable time, because the blades are difficult to separate from the structure and the environment is dark. This study proposes a Faster R-CNN deep learning model that can detect surface damage on the internal blades, which are among the primary elements of the turbine structure. The deep learning model was trained using image data with dent and punch damage. The image data were also expanded using image filtering and image data generator techniques. As a result, the deep learning model showed 96.1% accuracy, 95.3% recall, and 96% precision. The recall value means that the proposed model failed to detect 4.7% of the blade damage. The performance of the proposed damage detection system can be further improved by collecting and extending damage images from various environments, and it can ultimately be applied to turbine engine maintenance.
https://doi.org/10.11112/jksmi.2019.23.7.80

Estimate Saliency map based on Multi Feature Assistance of Learning Algorithm (다중 특징을 지원하는 학습 기반의 saliency map에 관한 연구)

Han, Hyun-Ho;Lee, Gang-Seong;Park, Young-Soo;Lee, Sang-Hun
- Journal of the Korea Convergence Society / v.8 no.6 / pp.29-36 / 2017
In this paper, we propose a method for generating an improved saliency map by learning multiple features, in order to increase the accuracy and reliability of saliency maps, which aim to produce results similar to human visual perception. To overcome the inaccurate results of existing saliency map generation, such as reverse selection or partial loss in color-based salient area estimation, the proposed method generates multi-feature data based on learning. The features to be considered in the image are analyzed by distinguishing the color patterns and the regions with specificity in the original image, and the learning data are composed by combining the similar protrusion area definition with the specificity area using LAB color space based color analysis. After combining the training data with the extrinsic information obtained from low-level features such as frequency, color, and focus information, we reconstructed the final saliency map to minimize the inaccurate saliency area. For the experiment, we compared the experimental results with the ground truth images and obtained precision-recall values.
https://doi.org/10.15207/JKCS.2017.8.6.029

Automated Areal Feature Matching in Different Spatial Data-sets (이종의 공간 데이터 셋의 면 객체 자동 매칭 방법)

Kim, Ji Young;Lee, Jae Bin
- Journal of Korean Society for Geospatial Information Science / v.24 no.1 / pp.89-98 / 2016
In this paper, we propose an automated areal feature matching method that is based on geometric similarity, requires no user intervention, and can be applied to areal features in many-to-many relations, for the fusion of spatial data sets with different scales and updating cycles. First, areal features (nodes) whose inclusion function value exceeded 0.4 were connected as edges in an adjacency matrix, and candidate corresponding areal features, including those in many-to-many relations, were identified by multiplication of the adjacency matrix. For geometric matching, these multiple candidate corresponding areal features were transformed into an aggregated polygon, a convex hull generated by a curve-fitting algorithm. Second, we defined matching criteria to measure geometric quality, and these criteria were converted into normalized values (similarities) by a similarity function. Next, shape similarity was defined as a weighted linear combination of these similarities, with weights calculated by the Criteria Importance Through Intercriteria Correlation (CRITIC) method. Finally, on training data, we identified the Equal Error Rate (EER), the trade-off value in a plot of precision versus recall over all threshold values (PR curve), as the threshold and decided whether each candidate pair was a corresponding pair. When the proposed method was applied to a digital topographic map and a base map of the address system (KAIS), visual evaluation confirmed that some many-to-many areal features were mis-detected, while statistical evaluation gave high precision, recall, and F-measure of 0.951, 0.906, and 0.928, respectively. This means that the accuracy of automated matching between different spatial data sets by the proposed method is high. However, further research is needed on the inclusion function and on detailed matching criteria to quantify many-to-many areal features exactly.
https://doi.org/10.7319/kogsis.2016.24.1.089
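
The first step in the abstract above (thresholding an inclusion function at 0.4, then multiplying the adjacency matrix to find many-to-many candidates) can be sketched as below. This is one plausible reading, not the authors' implementation: the inclusion values are hypothetical, the function names are illustrative, and repeated boolean squaring of the matrix is an assumed way of applying "multiplication of adjacency matrix" to collect linked features.

```python
def build_adjacency(inclusion, n_a, n_b, threshold=0.4):
    """Nodes 0..n_a-1 are features of data set A, n_a..n_a+n_b-1 of data set B."""
    n = n_a + n_b
    adj = [[i == j for j in range(n)] for i in range(n)]  # self-loops keep each node
    for (i, j), value in inclusion.items():
        if value > threshold:
            adj[i][n_a + j] = adj[n_a + j][i] = True
    return adj


def bool_matmul(x, y):
    """Boolean matrix product: entry (i, j) is True if some k links i to j."""
    n = len(x)
    return [[any(x[i][k] and y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def candidate_groups(adj):
    """Square the matrix until it stops changing, then read off linked groups."""
    reach = adj
    while True:
        squared = bool_matmul(reach, reach)
        if squared == reach:
            break
        reach = squared
    groups = {frozenset(j for j, linked in enumerate(row) if linked) for row in reach}
    return sorted((sorted(g) for g in groups), key=lambda g: g[0])


# Hypothetical inclusion values between 3 features of A and 2 features of B.
inclusion = {(0, 0): 0.9, (1, 0): 0.6, (1, 1): 0.5, (2, 1): 0.1}
print(candidate_groups(build_adjacency(inclusion, n_a=3, n_b=2)))
# [[0, 1, 3, 4], [2]]: A0 and A1 match B0 and B1 as a many-to-many group
```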

Testing the Reliability of a Smartphone-Based Travel Survey: An Experiment in Seoul (스마트폰 기반 통행 행태 조사 자료 신뢰성 검증: 서울에서 수집된 자료를 바탕으로)

Lee, Jae Seung;Zegras, P. Christopher;Zhao, Fang;Kim, Daehee;Kang, Junhee
- The Journal of The Korea Institute of Intelligent Transport Systems / v.15 no.2 / pp.50-62 / 2016
With programmable applications that utilize sensors, such as global positioning systems and accelerometers, smartphones provide an unprecedented opportunity to collect behavioral data in an unobtrusive and cost-effective manner. This paper assesses the relative accuracy and reliability of Future Mobility Sensing (FMS), a smartphone-based prompted-recall travel survey. We compared the data extracted from FMS with the data collected from the Korea Passenger Trip Survey (PTS), a traditional self-reported, paper-based travel survey. In total, 46 undergraduate students completed the PTS for seven consecutive days, while also carrying their smartphones with the FMS application activated for the same time span. After completing the PTS, the participants validated their FMS data in the web-based prompted-recall survey. We then matched the validated FMS data with the PTS-based records. The FMS turned out to be superior in detecting short trips, which are usually under-reported in self-reported travel surveys. The travel times reported in the PTS are longer than those recorded by the FMS, suggesting that participants tend to overestimate their travel time in the PTS. This study contributes to the ongoing development of smartphone-based methods for collecting travel behavior data.
https://doi.org/10.12815/kits.2016.15.2.050

Development of an Intelligent Illegal Gambling Site Detection Model Based on Tag2Vec (Tag2vec 기반의 지능형 불법 도박 사이트 탐지 모형 개발)

Song, ChanWoo;Ahn, Hyunchul
- Journal of Intelligence and Information Systems / v.28 no.4 / pp.211-227 / 2022
Illegal gambling through online gambling sites has become a significant social problem. The development of Internet technology and the spread of smartphones have led to the proliferation of illegal gambling sites, so illegal online gambling is now accessible to anyone. To mitigate its negative effects, the Korean government is trying to detect illegal gambling sites by using self-monitoring agents or reporting systems such as 'Nuricops.' However, it is difficult to detect all illegal sites due to limitations such as a lack of staffing. Accordingly, several scholars have proposed intelligent illegal gambling site detection techniques. Xu et al. (2019) found that fake or illegal websites generally have unique features in the HTML tag structure. This implies that the HTML tag structure can be important for detecting illegal sites. However, few prior studies have tried to improve the performance of illegal site detection models by utilizing the HTML tag structure. Against this background, our study aims to improve model performance by utilizing the HTML tag structure and proposes Tag2Vec, a modified version of Doc2Vec, as a methodology to vectorize the HTML tag structure properly. To validate the proposed model, we performed an empirical analysis using a data set consisting of harmful sites listed on 'The Cheat' and normal sites collected through Google search. The results confirmed that the proposed Tag2Vec-based detection model showed better classification accuracy, recall, and F1-score than the URL-based detection model used as the comparison model. The proposed model is expected to be effectively utilized to improve the health of our society through intelligent technology.
https://doi.org/10.13088/jiis.2022.28.4.211
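
To make the idea of vectorizing the HTML tag structure concrete, the sketch below extracts the sequence of opening tags from a page using Python's standard `html.parser`. Such a sequence could play the role of the "document" that a Doc2Vec-style model consumes. The abstract does not describe how Tag2Vec actually tokenizes tags, so the class name, the choice to keep only opening tags, and the sample HTML are all assumptions.

```python
from html.parser import HTMLParser


class TagSequenceParser(HTMLParser):
    """Collects opening tag names in document order (attributes are ignored)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html):
    """Return the list of opening tags found in an HTML string."""
    parser = TagSequenceParser()
    parser.feed(html)
    parser.close()
    return parser.tags


# Hypothetical page fragment, not taken from the paper's data set.
sample = "<html><body><div><a href='#'>Bet now</a><img src='x.png'></div></body></html>"
print(tag_sequence(sample))  # ['html', 'body', 'div', 'a', 'img']
```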

The Prediction of Survival of Breast Cancer Patients Based on Machine Learning Using Health Insurance Claim Data (건강보험 청구 데이터를 활용한 머신러닝 기반유방암 환자의 생존 여부 예측)

Doeggyu Lee;Kyungkeun Byun;Hyungdong Lee;Sunhee Shin
- Journal of Korea Society of Industrial Information Systems / v.28 no.2 / pp.1-9 / 2023
Research using AI and big data is being actively conducted in health and medical fields such as disease diagnosis and treatment. Most existing studies used cohort data from research institutes or data from a subset of patients. In this paper, differences in survival prediction performance and in the factors affecting survival between breast cancer patients in their 40s to 50s and other age groups were identified using health insurance review claim data held by the HIRA. The accuracy of predicting patients' survival averaged 0.93 for those in their 40s to 50s, higher than the 0.86 for those in their 60s to 80s. Regarding the influential factors, the number of treatments ranked high for those in their 40s to 50s, and age ranked high for those in their 60s to 80s. In a comparison with previous studies, the average precision was 0.90, higher than the 0.81 of the existing paper. In a comparison by algorithm, the overall average precision of Decision Tree, Random Forest, and Gradient Boosting was 0.90 with a recall of 1.0, and the precision of the multi-layer perceptron was 0.89 with a recall of 1.0. We hope that more research will be conducted using automated machine learning (AutoML) tools for non-professionals to enhance the value of the health insurance review claim data held by the HIRA.
https://doi.org/10.9723/jksiis.2023.28.2.001

Prediction of Safety Grade of Bridges Using the Classification Models of Decision Tree and Random Forest (의사결정나무 및 랜덤포레스트 분류 모델을 이용한 교량 안전등급 예측)

Hong, Jisu;Jeon, Se-Jin
- KSCE Journal of Civil and Environmental Engineering Research / v.43 no.3 / pp.397-411 / 2023
The number of deteriorated bridges with a service period of more than 30 years has been rapidly increasing in Korea. Accordingly, advanced maintenance technologies that predict the age-induced deterioration, condition, and performance of bridges are increasingly recognized as important. This study proposes a method for predicting the safety grade of bridges using the machine learning classification models of the Decision Tree and the Random Forest. These models were analyzed for 8,850 bridges on national roads with various evaluation indexes such as the confusion matrix, balanced accuracy, recall, ROC curve, and AUC, and the Random Forest generally showed better predictive performance than the Decision Tree. In particular, for the C and D grade bridges, which need more maintenance attention because of their significant deterioration, random under-sampling in the Random Forest showed higher predictive performance than the other sampling techniques, with a recall of 83.4%. The proposed model can be applied to rapidly identify the safety grade and to establish an efficient and economical maintenance plan for bridges that have not recently been inspected.
https://doi.org/10.12652/Ksce.2023.43.3.0397
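
Random under-sampling, mentioned in the abstract above, balances classes by discarding samples from the larger classes before training. The sketch below shows the basic idea under stated assumptions: every class is reduced to the size of the smallest one, and the grade counts are hypothetical. The abstract gives neither the actual grade distribution nor the sampling ratio the authors used.

```python
import random
from collections import Counter, defaultdict


def random_under_sample(labels, seed=0):
    """Return sorted indices that keep an equal number of samples per class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    target = min(len(indices) for indices in by_class.values())
    rng = random.Random(seed)  # seed is an arbitrary, assumed choice
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    return sorted(kept)


# Hypothetical grade counts: well-maintained grades dominate the data.
labels = ["A"] * 50 + ["B"] * 30 + ["C"] * 15 + ["D"] * 5
kept = random_under_sample(labels)
print(Counter(labels[i] for i in kept))  # every grade now has 5 samples
```

Balancing in this way trades training data for sensitivity to the rare grades, which matches the abstract's focus on recall for the C and D grade bridges.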

