• Title/Summary/Keyword: Research dataset

COVID-19 International Collaborative Research by the Health Insurance Review and Assessment Service Using Its Nationwide Real-world Data: Database, Outcomes, and Implications

  • Rho, Yeunsook;Cho, Do Yeon;Son, Yejin;Lee, Yu Jin;Kim, Ji Woo;Lee, Hye Jin;You, Seng Chan;Park, Rae Woong;Lee, Jin Yong
    • Journal of Preventive Medicine and Public Health / v.54 no.1 / pp.8-16 / 2021
  • This article aims to introduce the inception and operation of the COVID-19 International Collaborative Research Project, the world's first coronavirus disease 2019 (COVID-19) open data project for research, along with its dataset and research method, and to discuss relevant considerations for collaborative research using nationwide real-world data (RWD). COVID-19 has spread across the world since early 2020, becoming a serious global health threat to life, safety, and social and economic activities. However, insufficient RWD from patients was available to help clinicians efficiently diagnose and treat patients with COVID-19, or to provide necessary information to the government for policy-making. Countries that saw a rapid surge of infections had to focus on leveraging medical professionals to treat patients, and the circumstances made it even more difficult to promptly use COVID-19 RWD. Against this backdrop, the Health Insurance Review and Assessment Service (HIRA) of Korea decided to open its COVID-19 RWD collected through Korea's universal health insurance program, under the title of the COVID-19 International Collaborative Research Project. The dataset, consisting of 476 508 claim statements from 234 427 patients (7590 confirmed cases) and 18 691 318 claim statements of the same patients for the previous 3 years, was established and hosted on HIRA's in-house server. Researchers who applied to participate in the project uploaded analysis code on the platform prepared by HIRA, and HIRA conducted the analysis and provided outcome values. As of November 2020, analyses have been completed for 129 research projects, which have been published or are in the process of being published in prestigious journals.

Construction of a Standard Dataset for Liver Tumors for Testing the Performance and Safety of Artificial Intelligence-Based Clinical Decision Support Systems (인공지능 기반 임상의학 결정 지원 시스템 의료기기의 성능 및 안전성 검증을 위한 간 종양 표준 데이터셋 구축)

  • Seung-seob Kim;Dong Ho Lee;Min Woo Lee;So Yeon Kim;Jaeseung Shin;Jin‑Young Choi;Byoung Wook Choi
    • Journal of the Korean Society of Radiology / v.82 no.5 / pp.1196-1206 / 2021
  • Purpose: To construct a standard dataset of contrast-enhanced CT images of liver tumors to test the performance and safety of artificial intelligence (AI)-based algorithms for clinical decision support systems (CDSSs). Materials and Methods: A consensus group of medical experts in gastrointestinal radiology from four national tertiary institutions discussed the conditions to be included in a standard dataset. Seventy-five cases of hepatocellular carcinoma, 75 cases of metastasis, and 30-50 cases of benign lesions were retrieved from each institution, and the final dataset consisted of 300 cases of hepatocellular carcinoma, 300 cases of metastasis, and 183 cases of benign lesions. Only pathologically confirmed cases of hepatocellular carcinomas and metastases were enrolled. The medical experts retrieved the medical records of the patients and manually labeled the CT images. The CT images were saved as Digital Imaging and Communications in Medicine (DICOM) files. Results: The medical experts in gastrointestinal radiology constructed the standard dataset of contrast-enhanced CT images for 783 cases of liver tumors. The performance and safety of the AI algorithm can be evaluated by calculating the sensitivity and specificity for detecting and characterizing the lesions. Conclusion: The constructed standard dataset can be utilized for evaluating the machine-learning-based AI algorithm for CDSS.
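The abstract above names sensitivity and specificity as the evaluation measures. As a minimal sketch of how those two figures are computed from per-case binary labels, the following uses hypothetical toy labels (not data from the study) and a hypothetical helper name, `sensitivity_specificity`:

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = lesion present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels for illustration only.
truth = [1, 1, 1, 0, 0, 1, 0, 0]
preds = [1, 0, 1, 0, 1, 1, 0, 0]
sens, spec = sensitivity_specificity(truth, preds)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity is the fraction of true lesions the algorithm flags; specificity is the fraction of lesion-free cases it correctly leaves unflagged.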

HSFE Network and Fusion Model based Dynamic Hand Gesture Recognition

  • Tai, Do Nhu;Na, In Seop;Kim, Soo Hyung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.9 / pp.3924-3940 / 2020
  • Dynamic hand gesture recognition (d-HGR) plays an important role in human-computer interaction (HCI) systems. With advances in hand-pose estimation and 3D depth sensors, depth and hand-skeleton datasets have been proposed, prompting much research on depth-based and 3D hand-skeleton approaches. However, the problem remains challenging because of low resolution, high complexity, and self-occlusion. In this paper, we propose a hand-shape feature extraction (HSFE) network to produce robust hand-shape features. We build a hand-shape model and a hand-skeleton model based on LSTM to exploit the temporal information in hand-shape and motion changes. Fusing the two models yields the best accuracy on the dynamic hand gesture (DHG) dataset.

Feature Extraction on a Periocular Region and Person Authentication Using a ResNet Model (ResNet 모델을 이용한 눈 주변 영역의 특징 추출 및 개인 인증)

  • Kim, Min-Ki
    • Journal of Korea Multimedia Society / v.22 no.12 / pp.1347-1355 / 2019
  • Deep learning approaches based on convolutional neural networks (CNNs) have been extensively studied in the field of computer vision. However, periocular feature extraction using CNNs has not been well studied because it is practically impossible to collect a large volume of biometric data. This study uses a ResNet model pretrained on the ImageNet dataset. To overcome the problem of insufficient training data, we focus on training a multi-layer perceptron (MLP) with a simple structure rather than training a CNN with a complex structure. The method first extracts features using the pretrained ResNet model, reduces the feature dimension by principal component analysis (PCA), and then trains an MLP classifier. Experimental results on the public periocular dataset UBIPr show that the proposed method is effective for person authentication using the periocular region. In particular, it has the advantage that it can be applied directly to other biometric traits.

A Simple Tandem Method for Clustering of Multimodal Dataset

  • Cho C.;Lee J.W.;Lee J.W.
    • Proceedings of the Korean Operations and Management Science Society Conference / 2003.05a / pp.729-733 / 2003
  • The presence of local features within clusters, caused by the multi-modal nature of the data, prevents many conventional clustering techniques from working properly. In particular, clustering datasets with non-Gaussian distributions within a cluster can be problematic when the technique implicitly assumes a Gaussian distribution. The current study proposes a simple tandem clustering method, composed of a k-means-type algorithm and a hierarchical method, to solve such problems. The multi-modal dataset is first divided into many small pre-clusters by the k-means or fuzzy k-means algorithm. The pre-clusters found in the first step are then clustered again using agglomerative hierarchical clustering with Kullback-Leibler divergence as the measure of dissimilarity. This method is not only effective at extracting multi-modal clusters but also fast and simple in terms of computational complexity, and relatively robust in the presence of outliers. The performance of the proposed method was evaluated on three generated datasets and six publicly known real-world datasets.
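The two-stage idea above can be sketched on one-dimensional data. This is an illustration under stated assumptions, not the authors' implementation: each pre-cluster is summarised as a univariate Gaussian, dissimilarity is the symmetrised Kullback-Leibler divergence between those Gaussians, and merged clusters pool their points. The function names and the toy data are hypothetical.

```python
import math


def kmeans_1d(points, k, iters=50):
    """Plain 1D k-means with deterministic, quantile-based initial centers."""
    pts = sorted(points)
    n = len(pts)
    centers = [pts[int((i + 0.5) * n / k)] for i in range(k)]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in points:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        new_centers = [sum(g) / len(g) if g else centers[j]
                       for j, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return [g for g in groups if g]  # drop empty pre-clusters


def gaussian_stats(group, var_floor=1e-6):
    """Mean and (floored) population variance of a pre-cluster."""
    mean = sum(group) / len(group)
    var = sum((x - mean) ** 2 for x in group) / len(group)
    return mean, max(var, var_floor)


def symmetric_kl(p, q):
    """KL(p||q) + KL(q||p) for univariate Gaussians given as (mean, variance)."""
    (m1, v1), (m2, v2) = p, q
    d2 = (m1 - m2) ** 2
    kl_pq = 0.5 * (math.log(v2 / v1) + (v1 + d2) / v2 - 1.0)
    kl_qp = 0.5 * (math.log(v1 / v2) + (v2 + d2) / v1 - 1.0)
    return kl_pq + kl_qp


def tandem_cluster(points, n_pre, n_final):
    """Stage 1: k-means pre-clusters. Stage 2: agglomerative merging by KL."""
    clusters = kmeans_1d(points, n_pre)
    while len(clusters) > n_final:
        stats = [gaussian_stats(c) for c in clusters]
        pairs = ((a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters)))
        i, j = min(pairs, key=lambda ab: symmetric_kl(stats[ab[0]], stats[ab[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical toy data: one bimodal cluster (modes near 0 and 3) and one near 20.
data = [-0.2, -0.1, 0.0, 0.1, 0.2, 2.8, 2.9, 3.0, 3.1, 3.2,
        19.6, 19.8, 19.9, 20.3, 20.4, 20.5]
final = tandem_cluster(data, n_pre=4, n_final=2)
print(sorted(sorted(c) for c in final))
```

On this toy input the two modes near 0 and 3 end up in one final cluster, which is the multi-modal behaviour the abstract describes. The Gaussian summary is a convenience of this sketch; the source does not state how the divergence is estimated.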

Fuzzy Classification Method for Processing Incomplete Dataset

  • Woo, Young-Woon;Lee, Kwang-Eui;Han, Soo-Whan
    • Journal of information and communication convergence engineering / v.8 no.4 / pp.383-386 / 2010
  • Pattern classification is one of the most important topics in machine learning research. However, incomplete data appear frequently in real-world problems and lead to low learning rates in classification models. Many studies have addressed such incomplete data, but most focus on the training stage. In this paper, we propose two classification methods for incomplete data using triangular fuzzy membership functions. In the proposed methods, missing data in incomplete feature vectors are inferred, learned, and applied to the proposed classifier using triangular fuzzy membership functions. In the experiments, we verified that the proposed methods achieve a higher classification rate than a conventional method.
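For readers unfamiliar with the building block named above, a triangular fuzzy membership function can be sketched as follows. The parameter names (`left`, `peak`, `right`) are hypothetical and the sketch assumes `left < peak < right`; it does not reproduce the paper's handling of missing values, which the abstract does not detail.

```python
def triangular_membership(x, left, peak, right):
    """Membership degree in [0, 1]: 0 at or beyond the feet, 1 at the apex.

    Assumes left < peak < right.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)   # rising edge
    return (right - x) / (right - peak)     # falling edge


# Hypothetical feature range 0..10 with the apex at 5.
for value in (0.0, 2.5, 5.0, 7.5, 10.0):
    print(value, triangular_membership(value, 0.0, 5.0, 10.0))
```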

Extraction of Non-Point Pollution Using Satellite Imagery Data

  • Lee, Sang-Ik;Lee, Chong-Soo;Choi, Yun-Soo;Koh, June-Hwan
    • Proceedings of the KSRS Conference / 2003.11a / pp.96-99 / 2003
  • A land cover map is a typical GIS database that shows the Earth's physical surface differentiated by standardized homogeneous land cover types. Satellite images acquired by Landsat TM were primarily used to produce land cover maps of 7 land cover classes; however, higher-resolution satellite images, such as SPOT-5 and IKONOS, now make it possible to produce a more accurate land cover classification dataset of 23 classes. Using the newly produced high-resolution land cover map of 23 classes to estimate non-point sources of pollution, for example in water pollution modeling and atmospheric dispersion modeling, is expected to yield a higher level of accuracy and validity in various environmental monitoring results. The estimation of pollution from non-point sources using GIS-based modeling with the land cover dataset shows fairly accurate and consistent results.

Decomposed "Spatial and Temporal" Convolution for Human Action Recognition in Videos

  • Sediqi, Khwaja Monib;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.455-457 / 2019
  • In this paper, we study the effect of decomposed spatiotemporal convolutions for action recognition in videos. Our motivation stems from the empirical observation that spatial convolution applied to individual frames of a video provides good performance in action recognition. We empirically show the accuracy of factorized convolution on individual frames of video for action classification. We take 3D ResNet-18 as the baseline model for our experiment and factorize its 3D convolution into a 2D (spatial) and a 1D (temporal) convolution. We train the model from scratch on the Kinetics video dataset, then fine-tune it on the UCF-101 dataset and evaluate the performance. Our results show accuracy similar to that of state-of-the-art algorithms on the Kinetics and UCF-101 datasets.
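To make the factorization concrete, the weight counts of a full 3D convolution and its 2D-plus-1D replacement can be compared with simple arithmetic. This is a sketch under assumptions: biases are ignored, the kernel is t x d x d, and the number of intermediate channels between the spatial and temporal convolutions (`mid_channels`) is a hypothetical free parameter that the abstract does not specify.

```python
def conv3d_params(c_in, c_out, t=3, d=3):
    """Weights in a full t x d x d 3D convolution (no bias)."""
    return t * d * d * c_in * c_out


def factorized_params(c_in, c_out, mid_channels, t=3, d=3):
    """Weights in a 1 x d x d spatial conv followed by a t x 1 x 1 temporal conv."""
    spatial = d * d * c_in * mid_channels
    temporal = t * mid_channels * c_out
    return spatial + temporal


# Hypothetical layer with 64 input and 64 output channels.
full = conv3d_params(64, 64)
split = factorized_params(64, 64, mid_channels=64)
print(full, split)
```

With these illustrative numbers the factorized pair uses fewer weights than the full 3D kernel; a larger `mid_channels` would narrow or close that gap.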

Forecasting COVID-19 confirmed cases in South Korea using Spatio-Temporal Graph Neural Networks

  • Ngoc, Kien Mai;Lee, Minho
    • International Journal of Contents / v.17 no.3 / pp.1-14 / 2021
  • Since the outbreak of the coronavirus disease 2019 (COVID-19) pandemic, considerable effort has been made in the field of data science to help combat the disease. Forecasting the number of infection cases is crucial for predicting the development of the pandemic. Many deep learning-based models can be applied to this type of time series problem. In this research, we take a step forward by incorporating spatial (geographic) data with time series data to forecast region-level infection cases simultaneously. Specifically, we model a single spatio-temporal graph, in which nodes represent the geographic regions, spatial edges represent the distance between each pair of regions, and temporal edges indicate the node features through time. We evaluate this approach on a Korean COVID-19 dataset and show a decrease of approximately 10% in both RMSE and MAE, as well as a significant boost in training speed compared to the baseline models. Moreover, the training efficiency allows this approach to be extended to large-scale spatio-temporal datasets.
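One way to picture the graph described above is the sketch below. It is an interpretation, not the authors' construction: each node is a (region, time step) pair, spatial edges carry the distance between two regions at the same time step, and temporal edges link a region to itself at the next time step. The region names, coordinates, and the use of straight-line distance are all hypothetical.

```python
import math
from itertools import combinations


def build_spatiotemporal_graph(coords, n_steps):
    """Return (nodes, spatial_edges, temporal_edges) for regions over n_steps."""
    nodes = [(region, t) for t in range(n_steps) for region in coords]
    spatial_edges = {}
    for t in range(n_steps):
        for a, b in combinations(sorted(coords), 2):
            # Edge weight: straight-line distance between the two regions.
            spatial_edges[((a, t), (b, t))] = math.dist(coords[a], coords[b])
    temporal_edges = [((region, t), (region, t + 1))
                      for region in coords for t in range(n_steps - 1)]
    return nodes, spatial_edges, temporal_edges


# Hypothetical regions with planar coordinates.
coords = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 8.0)}
nodes, spatial, temporal = build_spatiotemporal_graph(coords, n_steps=3)
print(len(nodes), len(spatial), len(temporal))
```

A graph neural network would then pass messages along both edge types; that modelling step is outside the scope of this sketch.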

Recommendation system using Deep Autoencoder for Tensor data

  • Park, Jina;Yong, Hwan-Seung
    • Journal of the Korea Society of Computer and Information / v.24 no.8 / pp.87-93 / 2019
  • As interest in deep-learning-based recommendation systems grows, studies that improve collaborative filtering performance through the autoencoder, a state-of-the-art deep learning neural network architecture, have advanced considerably. This study proposes an autoencoder that a recommendation system uses to predict ratings. We add hidden layers to the original autoencoder architecture to implement deep autoencoders with 3 to 5 hidden layers, and we compare their performance. We use 2-dimensional arrays and 3-dimensional tensors as the input datasets. As a result, we found a correlation between matrix entries of the 3-dimensional dataset, such as item-time and user-time, and found that the deep autoencoder with extra hidden layers performed even better than the basic autoencoder.