• Title/Summary/Keyword: large dataset


Handwriting Thai Digit Recognition Using Convolution Neural Networks (다양한 컨볼루션 신경망을 이용한 태국어 숫자 인식)

  • Onuean, Athita;Jung, Hanmin;Kim, Taehong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.15-17
    • /
    • 2021
  • Handwriting recognition research focuses mainly on deep learning techniques and has achieved strong performance in the last few years. Handwritten Thai digit recognition in particular is an important research area that covers generic numerical information, such as that found in official Thai government documents and receipts. However, it has long remained a challenging task. To address the lack of a large Thai digit dataset, this paper constructs its own dataset and trains several models on it: decision tree, K-nearest neighbors, and the CNN variants AlexNet, LeNet-5, and VGG (11, 13, 16, 19). Measured by accuracy, the best result is 98.29%, obtained with VGG 13 with batch normalization.


Korean Text to Gloss: Self-Supervised Learning approach

  • Thanh-Vu Dang;Gwang-hyun Yu;Ji-yong Kim;Young-hwan Park;Chil-woo Lee;Jin-Young Kim
    • Smart Media Journal
    • /
    • v.12 no.1
    • /
    • pp.32-46
    • /
    • 2023
  • Natural Language Processing (NLP) has grown tremendously in recent years. Bilingual and multilingual translation models in particular have been widely deployed in machine translation and have gained vast attention from the research community. In contrast, few studies have focused on translating between spoken and sign languages, especially for non-English languages. Prior work on Sign Language Translation (SLT) has shown that a mid-level sign gloss representation enhances translation performance. Therefore, this study presents a new large-scale Korean sign language dataset, the Museum-Commentary Korean Sign Gloss (MCKSG) dataset, which includes 3828 pairs of Korean sentences and their corresponding sign glosses used in museum-commentary contexts. In addition, we propose a translation framework based on self-supervised learning, where the pretext task is text-to-text translation from a Korean sentence to its back-translated versions; the pre-trained network is then fine-tuned on the MCKSG dataset. Self-supervised learning helps to overcome the shortage of sign language data. In our experiments, the proposed model outperforms a baseline BERT model by 6.22%.

Learning Deep Representation by Increasing ConvNets Depth for Few Shot Learning

  • Fabian, H.S. Tan;Kang, Dae-Ki
    • International journal of advanced smart convergence
    • /
    • v.8 no.4
    • /
    • pp.75-81
    • /
    • 2019
  • Although recent advances in deep learning have provided satisfactory results in large-data domains, these methods yield poor performance on few-shot classification tasks. Training a model with strong performance, such as a deep convolutional neural network, depends heavily on a huge dataset, and the number of labeled classes in that dataset can be extremely large. The cost of human annotation and the scarcity of data in some classes have drastically limited the capability of current image classification models. In contrast, humans are excellent at learning or recognizing new, unseen classes from merely a small set of labeled examples. Few-shot learning aims to train a classification model with limited labeled samples to recognize new classes that have never been seen during training. In this paper, we increase the backbone depth of the embedding network in order to learn the intra-class variation. By increasing the network depth of the embedding module, we achieve competitive performance due to the minimized intra-class variation.

Big Data Analysis and Prediction of Traffic in Los Angeles

  • Dauletbak, Dalyapraz;Woo, Jongwook
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.2
    • /
    • pp.841-854
    • /
    • 2020
  • The paper explains a method to process, analyze, and predict traffic patterns in Los Angeles County using Big Data and Machine Learning. The dataset comes from a popular navigation platform in the USA, which tracks road information using connected users' devices and also collects reports shared by users through the app. The dataset consists mainly of information about traffic jams and traffic incidents reported by users, such as road closures, hazards, and accidents. The major contribution of this paper is a clear view of how large-scale road traffic data can be stored and processed using the Big Data system Hadoop and its ecosystem (Hive). In addition, the analysis is explained with visuals produced using Business Intelligence tools, and prediction with a classification machine learning model on the sampled traffic data is presented using Azure ML. The modeling process and its results are interpreted using three metrics: accuracy, precision, and recall.
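
The three evaluation metrics named in this abstract can be illustrated with a minimal sketch. The function name `classification_metrics` and the label lists below are made up for illustration; they are not the paper's data or its Azure ML pipeline.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # Fall back to 0.0 when a denominator is empty (no predicted/actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical labels (assumption): 1 = incident reported, 0 = no incident.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, prec, rec = classification_metrics(y_true, y_pred)
print(acc, prec, rec)
```

Precision penalizes false alarms while recall penalizes missed incidents, so reporting both alongside accuracy gives a fuller picture when the classes are imbalanced.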

Feature Extraction on a Periocular Region and Person Authentication Using a ResNet Model (ResNet 모델을 이용한 눈 주변 영역의 특징 추출 및 개인 인증)

  • Kim, Min-Ki
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.12
    • /
    • pp.1347-1355
    • /
    • 2019
  • Deep learning approaches based on convolutional neural networks (CNNs) have been extensively studied in the field of computer vision. However, periocular feature extraction using CNNs has not been well studied because it is practically impossible to collect a large volume of biometric data. This study uses a ResNet model trained on the ImageNet dataset. To overcome the problem of insufficient training data, we focus on training a multi-layer perceptron (MLP) with a simple structure rather than training a CNN with a complex structure. The method first extracts features using the pretrained ResNet model, reduces the feature dimension by principal component analysis (PCA), and then trains an MLP classifier. Experimental results on the public periocular dataset UBIPr show that the proposed method is effective for person authentication using the periocular region. A particular advantage is that it can be applied directly to other biometric traits.

A Deep Learning Approach for Classification of Cloud Image Patches on Small Datasets

  • Phung, Van Hiep;Rhee, Eun Joo
    • Journal of information and communication convergence engineering
    • /
    • v.16 no.3
    • /
    • pp.173-178
    • /
    • 2018
  • Accurate classification of cloud images is a challenging task. Almost all existing methods rely on hand-crafted feature extraction, which limits their discriminative power. In recent years, deep learning with convolutional neural networks (CNNs), which can extract features automatically, has achieved promising results in many computer vision and image understanding fields. However, deep learning approaches usually need large datasets. This paper proposes a deep learning approach for classification of cloud image patches on small datasets. First, we design a deep learning model suitable for small datasets using a CNN, and then we apply data augmentation and dropout regularization to improve the generalization of the model. The experiments for the proposed approach were performed on the small SWIMCAT dataset with k-fold cross-validation. The experimental results demonstrated perfect classification accuracy for most classes on every fold, and confirmed both the high accuracy and the robustness of the proposed model.
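
As a rough illustration of the k-fold cross-validation protocol mentioned in this abstract, the sketch below generates train/test index splits using only the standard library. The helper name `k_fold_indices`, the sample count, and the seed are assumptions for illustration; the abstract does not state the value of k or how the folds were drawn.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical usage: 10 samples split into 5 folds.
splits = list(k_fold_indices(10, 5))
for train, test in splits:
    print(sorted(test), len(train))
```

Each sample lands in exactly one test fold, so every image contributes to both training and evaluation, which matters most when the dataset is small.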

Forecasting COVID-19 confirmed cases in South Korea using Spatio-Temporal Graph Neural Networks

  • Ngoc, Kien Mai;Lee, Minho
    • International Journal of Contents
    • /
    • v.17 no.3
    • /
    • pp.1-14
    • /
    • 2021
  • Since the outbreak of the coronavirus disease 2019 (COVID-19) pandemic, much effort has been made in the field of data science to help combat this disease. Within that effort, forecasting the number of infection cases is a crucial problem for predicting the development of the pandemic. Many deep learning-based models can be applied to this type of time series problem. In this research, we take a step forward by combining spatial (geographic) data with time series data to forecast region-level infection cases simultaneously. Specifically, we model a single spatio-temporal graph, in which nodes represent the geographic regions, spatial edges represent the distance between each pair of regions, and temporal edges indicate the node features through time. We evaluate this approach on a Korean COVID-19 dataset and show a decrease of approximately 10% in both RMSE and MAE, as well as a significant boost in training speed compared to the baseline models. Moreover, the training efficiency allows this approach to be extended to a large-scale spatio-temporal dataset.
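
One possible reading of the graph described in this abstract is sketched below: each node is a (region, time step) pair, spatial edges connect every pair of regions at the same time step and carry their distance, and temporal edges link a region to itself at the next time step. The function `build_st_graph`, the region names, the coordinates, and the case counts are all hypothetical, and straight-line distance is an assumption; the paper's actual construction may differ.

```python
import math
from itertools import combinations


def build_st_graph(coords, cases):
    """Build nodes, spatial edges, and temporal edges.

    coords: {region: (x, y)} positions (assumed planar for simplicity).
    cases:  {region: [count_t0, count_t1, ...]}, equal length per region.
    """
    regions = sorted(coords)
    n_steps = len(cases[regions[0]])
    # Node feature = case count of that region at that time step.
    nodes = {(r, t): cases[r][t] for r in regions for t in range(n_steps)}
    # Spatial edges: every region pair at the same step, weighted by distance.
    spatial = [
        ((a, t), (b, t), math.dist(coords[a], coords[b]))
        for t in range(n_steps)
        for a, b in combinations(regions, 2)
    ]
    # Temporal edges: the same region at consecutive steps.
    temporal = [((r, t), (r, t + 1)) for r in regions for t in range(n_steps - 1)]
    return nodes, spatial, temporal


# Made-up toy input: three regions observed over four time steps.
coords = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 8.0)}
cases = {"A": [1, 2, 4, 7], "B": [0, 1, 1, 3], "C": [5, 5, 6, 6]}
nodes, spatial, temporal = build_st_graph(coords, cases)
print(len(nodes), len(spatial), len(temporal))
```

Keeping everything in one graph lets a graph neural network pass messages across regions and across time in the same forward pass, rather than running a separate time series model per region.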

Supervised learning framework using Web-Videos (Web-Videos를 사용한 Supervised Learning Framework)

  • Na, Seong-Won;Lee, Ye-Gi;Yoon, Kyoung-ro
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2019.06a
    • /
    • pp.95-97
    • /
    • 2019
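  • This paper proposes a supervised learning framework that uses video data. Following their recent success, deep convolutional neural networks (DCNNs) are now used in many fields. One important factor in DCNN model performance is building a large-scale dataset; if a model is trained on a small-scale dataset, overfitting and generalization error are difficult to resolve. Methods such as enlarging the dataset through image distortion or applying dropout have been used to address this problem, but when the original data are scarce it is still difficult for the model to generalize. To address this, this paper proposes a method that extracts frames related to a given class from videos obtained from the Web, which makes it easier to expand the dataset and improves model performance.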


Analysis and 3D Reconstruction of a Cerebral Vascular Network Using Image Threshold Techniques in High-resolution Images of the Mouse Brain (쥐 뇌의 고해상도 이미지에서 임계화 기법을 활용한 뇌혈관 네트워크 분석 및 3D 재현)

  • Lee, Junseok
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.9
    • /
    • pp.992-999
    • /
    • 2019
  • In this paper, I lay the foundation for creating a multiscale atlas that characterizes cerebrovasculature structural changes across the entire brain of a mouse in the Knife-Edge Scanning Microscopy dataset. The geometric reconstruction of the vascular filaments embedded in the volume imaging dataset provides the ability to distinguish cerebral vessels by diameter and other morphological properties across the whole mouse brain. This paper presents a means for studying local variations in the small vascular morphology that have a significant impact on the peripheral nervous system in other cerebral areas, as well as the robust and vulnerable side of the cerebrovasculature system across the large blood vessels. I expect that this foundation will prove invaluable towards data-driven, quantitative investigations into the system-level architectural layout of the cerebrovasculature and surrounding cerebral microstructures.

An Efficient Indexing Structure for Multidimensional Categorical Range Aggregation Query

  • Yang, Jian;Zhao, Chongchong;Li, Chao;Xing, Chunxiao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.2
    • /
    • pp.597-618
    • /
    • 2019
  • Categorical range aggregation, which is conceptually equivalent to running a range aggregation query separately on multiple datasets, returns the query result on each dataset. The challenge is that when the number of datasets reaches hundreds or thousands, the query takes a lot of computation time and I/O. Previous work has only solved the case of a range restriction on a single dimension, whereas in practice more and more applications need statistics computed under range restrictions on multiple dimensions. We propose the MCRI-Tree, an index structure designed to solve multi-dimensional categorical range aggregation (CRA) queries, which can utilize main memory to maximize the efficiency of CRA queries. Specifically, the MCRI-Tree answers any query in $O(nk^{n-1})$ I/Os (where n is the number of dimensions, and k denotes the maximum number of pages covered in one dimension among all the n dimensions during a query). The practical efficiency of our technique is demonstrated with extensive experiments.
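
To make the query semantics concrete, the sketch below implements a naive linear-scan baseline for a multi-dimensional categorical range count. It is not the MCRI-Tree and makes no attempt at the stated I/O bound; the function name `categorical_range_count`, the record layout, and the sample records are assumptions for illustration only.

```python
from collections import defaultdict


def categorical_range_count(records, lows, highs):
    """Count, per category, the points inside the box [lows, highs].

    records: iterable of (category, point), where point is a tuple with one
             coordinate per dimension.
    lows, highs: inclusive lower and upper bounds, one per dimension.
    """
    result = defaultdict(int)
    for category, point in records:
        # A point qualifies only if it satisfies the range on every dimension.
        if all(lo <= x <= hi for x, lo, hi in zip(point, lows, highs)):
            result[category] += 1
    return dict(result)


# Made-up records: (category, 2-D point).
records = [
    ("a", (1, 5)),
    ("a", (3, 7)),
    ("b", (2, 6)),
    ("b", (9, 9)),
    ("c", (0, 0)),
]
counts = categorical_range_count(records, lows=(1, 5), highs=(3, 7))
print(counts)
```

This baseline touches every record on every query, which is the cost an index structure for CRA queries is meant to avoid when the number of categories is large.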