• Title/Summary/Keyword: Research dataset

Search Result 1,350, Processing Time 0.034 seconds

A Study on Strategic Factors for the Application of Digitalized Korean Human Dataset (한국인의 인체정보 활용을 위한 전략적 요인에 관한 연구)

  • Park, Dong-Jin;Lee, Sang-Tae;Lee, Sang-Ho;Lee, Seung-Bok;Shin, Dong-Sun
    • Journal of Digital Convergence
    • /
    • v.8 no.2
    • /
    • pp.203-216
    • /
    • 2010
  • This study corresponds to an exploratory survey that identifies and organizes important decision factors for establishing R&D strategic portfolio in the application of digitalized Korean human-dataset. In the case of countries that have performed the above, the digitalized human-dataset and its visualization application development research are regarded as strategic R&D projects selected and supervised in national level. To achieve the goal of this study, we organize a professional group that reviews articles, suggests research topics, considers alternatives and answers questionnaires. With this study, we draw and refine the detailed factors; these are reflected during a strategic planning phase that includes R&D vision setting, SWOT analysis and strategy development, research area and project selection. In addition to this contribution for supporting the strategic planning, the study also shows the detailed research area's definition/scope and their priorities in terms of importance and urgency. This addition will act as a guideline for investigating further research and as a framework for assessing the current status of research investment.

  • PDF

A Hybrid Mod K-Means Clustering with Mod SVM Algorithm to Enhance the Cancer Prediction

  • Kumar, Rethina;Ganapathy, Gopinath;Kang, Jeong-Jin
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.13 no.2
    • /
    • pp.231-243
    • /
    • 2021
  • In Recent years the way we analyze the breast cancer has changed dramatically. Breast cancer is the most common and complex disease diagnosed among women. There are several subtypes of breast cancer and many options are there for the treatment. The most important is to educate the patients. As the research continues to expand, the understanding of the disease and its current treatments types, the researchers are constantly being updated with new researching techniques. Breast cancer survival rates have been increased with the use of new advanced treatments, largely due to the factors such as earlier detection, a new personalized approach to treatment and a better understanding of the disease. Many machine learning classification models have been adopted and modified to diagnose the breast cancer disease. In order to enhance the performance of classification model, our research proposes a model using A Hybrid Modified K-Means Clustering with Modified SVM (Support Vector Machine) Machine learning algorithm to create a new method which can highly improve the performance and prediction. The proposed Machine Learning model is to improve the performance of machine learning classifier. The Proposed Model rectifies the irregularity in the dataset and they can create a new high quality dataset with high accuracy performance and prediction. The recognized datasets Wisconsin Diagnostic Breast Cancer (WDBC) Dataset have been used to perform our research. Using the Wisconsin Diagnostic Breast Cancer (WDBC) Dataset, We have created our Model that can help to diagnose the patients and predict the probability of the breast cancer. A few machine learning classifiers will be explored in this research and compared with our Proposed Model "A Hybrid Modified K-Means with Modified SVM Machine Learning Algorithm to Enhance the Cancer Prediction" to implement and evaluated. Our research results show that our Proposed Model has a significant performance compared to other previous research and with high accuracy level of 99% which will enhance the Cancer Prediction.

Emotion Recognition of Low Resource (Sindhi) Language Using Machine Learning

  • Ahmed, Tanveer;Memon, Sajjad Ali;Hussain, Saqib;Tanwani, Amer;Sadat, Ahmed
    • International Journal of Computer Science & Network Security
    • /
    • v.21 no.8
    • /
    • pp.369-376
    • /
    • 2021
  • One of the most active areas of research in the field of affective computing and signal processing is emotion recognition. This paper proposes emotion recognition of low-resource (Sindhi) language. This work's uniqueness is that it examines the emotions of languages for which there is currently no publicly accessible dataset. The proposed effort has provided a dataset named MAVDESS (Mehran Audio-Visual Dataset Mehran Audio-Visual Database of Emotional Speech in Sindhi) for the academic community of a significant Sindhi language that is mainly spoken in Pakistan; however, no generic data for such languages is accessible in machine learning except few. Furthermore, the analysis of various emotions of Sindhi language in MAVDESS has been carried out to annotate the emotions using line features such as pitch, volume, and base, as well as toolkits such as OpenSmile, Scikit-Learn, and some important classification schemes such as LR, SVC, DT, and KNN, which will be further classified and computed to the machine via Python language for training a machine. Meanwhile, the dataset can be accessed in future via https://doi.org/10.5281/zenodo.5213073.

Development of a Korean chatbot system that enables emotional communication with users in real time (사용자와 실시간으로 감성적 소통이 가능한 한국어 챗봇 시스템 개발)

  • Baek, Sungdae;Lee, Minho
    • Journal of Sensor Science and Technology
    • /
    • v.30 no.6
    • /
    • pp.429-435
    • /
    • 2021
  • In this study, the creation of emotional dialogue was investigated within the process of developing a robot's natural language understanding and emotional dialogue processing. Unlike an English-based dataset, which is the mainstay of natural language processing, the Korean-based dataset has several shortcomings. Therefore, in a situation where the Korean language base is insufficient, the Korean dataset should be dealt with in detail, and in particular, the unique characteristics of the language should be considered. Hence, the first step is to base this study on a specific Korean dataset consisting of conversations on emotional topics. Subsequently, a model was built that learns to extract the continuous dialogue features from a pre-trained language model to generate sentences while maintaining the context of the dialogue. To validate the model, a chatbot system was implemented and meaningful results were obtained by collecting the external subjects and conducting experiments. As a result, the proposed model was influenced by the dataset in which the conversation topic was consultation, to facilitate free and emotional communication with users as if they were consulting with a chatbot. The results were analyzed to identify and explain the advantages and disadvantages of the current model. Finally, as a necessary element to reach the aforementioned ultimate research goal, a discussion is presented on the areas for future studies.

The Study for Type of Mask Wearing Dataset for Deep learning and Detection Model (딥러닝을 위한 마스크 착용 유형별 데이터셋 구축 및 검출 모델에 관한 연구)

  • Hwang, Ho Seong;Kim, Dong heon;Kim, Ho Chul
    • Journal of Biomedical Engineering Research
    • /
    • v.43 no.3
    • /
    • pp.131-135
    • /
    • 2022
  • Due to COVID-19, Correct method of wearing mask is important to prevent COVID-19 and the other respiratory tract infections. And the deep learning technology in the image processing has been developed. The purpose of this study is to create the type of mask wearing dataset for deep learning models and select the deep learning model to detect the wearing mask correctly. The Image dataset is the 2,296 images acquired using a web crawler. Deep learning classification models provided by tensorflow are used to validate the dataset. And Object detection deep learning model YOLOs are used to select the detection deep learning model to detect the wearing mask correctly. In this process, this paper proposes to validate the type of mask wearing datasets and YOLOv5 is the effective model to detect the type of mask wearing. The experimental results show that reliable dataset is acquired and the YOLOv5 model effectively recognize type of mask wearing.

Encoding Dictionary Feature for Deep Learning-based Named Entity Recognition

  • Ronran, Chirawan;Unankard, Sayan;Lee, Seungwoo
    • International Journal of Contents
    • /
    • v.17 no.4
    • /
    • pp.1-15
    • /
    • 2021
  • Named entity recognition (NER) is a crucial task for NLP, which aims to extract information from texts. To build NER systems, deep learning (DL) models are learned with dictionary features by mapping each word in the dataset to dictionary features and generating a unique index. However, this technique might generate noisy labels, which pose significant challenges for the NER task. In this paper, we proposed DL-dictionary features, and evaluated them on two datasets, including the OntoNotes 5.0 dataset and our new infectious disease outbreak dataset named GFID. We used (1) a Bidirectional Long Short-Term Memory (BiLSTM) character and (2) pre-trained embedding to concatenate with (3) our proposed features, named the Convolutional Neural Network (CNN), BiLSTM, and self-attention dictionaries, respectively. The combined features (1-3) were fed through BiLSTM - Conditional Random Field (CRF) to predict named entity classes as outputs. We compared these outputs with other predictions of the BiLSTM character, pre-trained embedding, and dictionary features from previous research, which used the exact matching and partial matching dictionary technique. The findings showed that the model employing our dictionary features outperformed other models that used existing dictionary features. We also computed the F1 score with the GFID dataset to apply this technique to extract medical or healthcare information.

Two-Branch Classifier for Retinal Imaging Analysis (망막 영상 분석을 위한 두 갈래 분류기)

  • Oh, Young-tack;Park, Hyunjin
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.614-616
    • /
    • 2021
  • The world faces difficulties in terms of eye care, including treatment, quality of prevention, vision rehabilitation services, and scarcity of trained eye care experts. However, it is difficult to develop a method for classifying various ocular diseases because the existing dataset for retinal image disclosure does not consist of various diseases found in clinical practice. We propose a method for classifying ocular diseases using the Retinal Fundus Multi-disease Image Dataset (RFMiD), a dataset published in the ISBI-2021 challenge. Our goal is to develop a robust and generalizable model for screening retinal images into normal and abnormal categories. The performance of the proposed model shows a value of 0.9782 for the test dataset as an area under the curve (AUC) score.

  • PDF

Activity recognition of stroke-affected people using wearable sensor

  • Anusha David;Rajavel Ramadoss;Amutha Ramachandran;Shoba Sivapatham
    • ETRI Journal
    • /
    • v.45 no.6
    • /
    • pp.1079-1089
    • /
    • 2023
  • Stroke is one of the leading causes of long-term disability worldwide, placing huge burdens on individuals and society. Further, automatic human activity recognition is a challenging task that is vital to the future of healthcare and physical therapy. Using a baseline long short-term memory recurrent neural network, this study provides a novel dataset of stretching, upward stretching, flinging motions, hand-to-mouth movements, swiping gestures, and pouring motions for improved model training and testing of stroke-affected patients. A MATLAB application is used to output textual and audible prediction results. A wearable sensor with a triaxial accelerometer is used to collect preprocessed real-time data. The model is trained with features extracted from the actual patient to recognize new actions, and the recognition accuracy provided by multiple datasets is compared based on the same baseline model. When training and testing using the new dataset, the baseline model shows recognition accuracy that is 11% higher than the Activity Daily Living dataset, 22% higher than the Activity Recognition Single Chest-Mounted Accelerometer dataset, and 10% higher than another real-world dataset.

A Case Study of Dataset Records in Information Management System (행정정보 데이터세트 사례 조사 연구)

  • Oh, Seh-La;Park, Seunghoon;Yim, Jin-Hee
    • Journal of Korean Society of Archives and Records Management
    • /
    • v.18 no.2
    • /
    • pp.109-133
    • /
    • 2018
  • The need for the records management of administrative information dataset has led to a broad consensus among archivists and has been continuously studied. In the meantime, information technology has greatly advanced, and the development and redevelopment of information management systems have been increasing. Nevertheless, dataset management in information management system has not been practiced in public organizations. This is because it is supposed that no practical management plan exists. From the point of view that practical dataset management methods should be based on the reality of dataset creation and management environment, this study investigates various active datasets in working administrative information systems. The examples and the information drawn from the examination are expected to contribute to dataset management planning. Moreover, the research methods can be utilized in further studies.

A Study on Visual Emotion Classification using Balanced Data Augmentation (균형 잡힌 데이터 증강 기반 영상 감정 분류에 관한 연구)

  • Jeong, Chi Yoon;Kim, Mooseop
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.7
    • /
    • pp.880-889
    • /
    • 2021
  • In everyday life, recognizing people's emotions from their frames is essential and is a popular research domain in the area of computer vision. Visual emotion has a severe class imbalance in which most of the data are distributed in specific categories. The existing methods do not consider class imbalance and used accuracy as the performance metric, which is not suitable for evaluating the performance of the imbalanced dataset. Therefore, we proposed a method for recognizing visual emotion using balanced data augmentation to address the class imbalance. The proposed method generates a balanced dataset by adopting the random over-sampling and image transformation methods. Also, the proposed method uses the Focal loss as a loss function, which can mitigate the class imbalance by down weighting the well-classified samples. EfficientNet, which is the state-of-the-art method for image classification is used to recognize visual emotion. We compare the performance of the proposed method with that of conventional methods by using a public dataset. The experimental results show that the proposed method increases the F1 score by 40% compared with the method without data augmentation, mitigating class imbalance without loss of classification accuracy.