• Title/Abstract/Keywords: small data set

661 search results, processing time 0.028 seconds

애완동물 분류를 위한 딥러닝 (Deep Learning for Pet Image Classification)

  • 신광성;신성윤
    • 한국정보통신학회: Conference Proceedings / 한국정보통신학회 2019 Spring Conference / pp.151-152 / 2019
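  • This paper proposes an improved deep learning method for animal image classification based on a small data set. First, a CNN is used to build a training model for the small data set, and the data set is used to expand the training set. Second, a network pre-trained on a large data set, such as VGG16, is used to extract bottleneck features from the small data set; these are saved to two NumPy files as the new training and test data sets. Finally, a fully connected network is trained on the new data sets.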


데이터 증강을 통한 기계학습 능력 개선 방법 연구 (Study on the Improvement of Machine Learning Ability through Data Augmentation)

  • 김태우;신광성
    • 한국정보통신학회: Conference Proceedings / 한국정보통신학회 2021 Spring Conference / pp.346-347 / 2021
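  • For pattern recognition in machine learning, performance improves as the amount of training data increases. However, it is not always possible to secure large amounts of training data for the kinds of patterns and information that must be detected in everyday settings. It is therefore necessary to meaningfully inflate small data sets for general machine learning. This study investigates techniques for augmenting data so that machine learning can be performed. A representative way of performing machine learning with a small data set is transfer learning. In transfer learning, base training is performed on a general-purpose data set, and the target data set is then applied at the final stage to obtain results. In this study, a model trained on a general-purpose data set such as ImageNet is used as a feature extractor, together with the augmented data, to detect the desired patterns.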

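
The idea of inflating a small data set, as described in the abstract above, can be illustrated with a minimal sketch. The abstract does not say which transformations the authors used, so the flips and the rotation here are assumptions chosen for illustration, and the function names are hypothetical. Images are plain nested lists so that the example needs no external library.

```python
from typing import List

# A grayscale image as rows of pixel values (illustrative representation).
Image = List[List[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(dataset: List[Image]) -> List[Image]:
    """Return each original image followed by three transformed copies.

    The choice of transformations is an assumption, not the paper's method.
    """
    out: List[Image] = []
    for img in dataset:
        out.extend([img, flip_horizontal(img), flip_vertical(img), rotate90(img)])
    return out
```

With these three transformations, a data set of N images grows to 4N images. Whether such copies are meaningful depends on the task, since some patterns are not invariant to flips or rotations.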

Training for Huge Data set with On Line Pruning Regression by LS-SVM

  • Kim, Dae-Hak;Shim, Joo-Yong;Oh, Kwang-Sik
    • 한국통계학회: Conference Proceedings / 한국통계학회 2003 Autumn Conference Proceedings / pp.137-141 / 2003
  • LS-SVM (least squares support vector machine) is a widely applicable and useful machine learning technique for classification and regression analysis. LS-SVM can be a good substitute for statistical methods, but inverting the matrix for a huge data set remains computationally difficult. In the modern information society, huge data sets are easy to obtain in on-line or batch mode. For these kinds of huge data sets, we suggest an on-line pruning regression method using LS-SVM. With a relatively small number of pruned support vectors, we can achieve almost the same performance as regression with the full data set.


환경영향평가시 도로소음 평가범위 설정에 대한 연구 (A Study for Assessment Scope Set-up of Road Noise in EIA)

  • 최준규;선효성;정태량
    • 환경영향평가 / Vol. 21, No. 4 / pp.567-572 / 2012
  • This paper suggests a plan for setting the assessment scope of road noise that considers road characteristics, using a road noise prediction model. The RLS90 prediction model, with some assumptions, is used to establish the assessment scope of road noise. The main assumptions are smooth traffic flow, flat terrain, all noise sources located in one lane, travel at the design speed, and an assessment scope set according to traffic volume and vehicle speed. The traffic volume information used to predict road noise is obtained from the distribution of small cars and full-sized cars on the road. In this study, the total traffic volume on the road is computed by adding the number of small cars to the conversion number of small cars, meaning the number of small cars that make the same noise as one full-sized car. Road noise predictions are presented as a function of traffic volume, vehicle speed, and the distance between road and receiver. The resulting assessment scope is obtained by combining the road noise prediction data with the standard for setting the road noise assessment scope.
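
The traffic-volume conversion described in the abstract above can be sketched as a one-line formula. This is one reading of the abstract, in which each full-sized car counts as a fixed number of small cars. The function name is hypothetical, and the conversion factor of 10 in the usage note is an illustrative value, not one taken from the paper or from RLS90.

```python
def equivalent_small_car_volume(
    small_cars: float, full_sized_cars: float, conversion_factor: float
) -> float:
    """Total traffic volume expressed in small-car equivalents.

    conversion_factor is the number of small cars assumed to make the same
    noise as one full-sized car (a value the caller must supply).
    """
    if conversion_factor < 0 or small_cars < 0 or full_sized_cars < 0:
        raise ValueError("counts and conversion factor must be non-negative")
    return small_cars + conversion_factor * full_sized_cars
```

For example, `equivalent_small_car_volume(800, 50, 10.0)` treats 50 full-sized cars as 500 small cars and returns a total of 1300.0 small-car equivalents.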

Deep Learning Approach Based on Transcriptome Profile for Data Driven Drug Discovery

  • Eun-Ji Kwon;Hyuk-Jin Cha
    • Molecules and Cells / Vol. 46, No. 1 / pp.65-67 / 2023
  • SMILES (simplified molecular-input line-entry system) information for small molecules, encoded as a one-hot array, is passed to a convolutional neural network, shown as a black box. The output data, representing a gene signature, are then matched to the genetic signature of a disease to predict an appropriate small molecule. The efficacy of the predicted small molecules is examined in in vivo animal models. GSEA, gene set enrichment analysis.
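
The one-hot encoding step mentioned above can be sketched as follows. The source does not describe its tokenization, so this sketch assumes character-level tokens (two-letter atoms such as Cl are therefore split into two characters) and zero padding to a fixed length. The function names and the `max_len` parameter are hypothetical.

```python
from typing import Dict, Iterable, List


def build_vocab(smiles_list: Iterable[str]) -> Dict[str, int]:
    """Map every character seen in the SMILES strings to a column index."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i for i, ch in enumerate(chars)}


def one_hot(smiles: str, vocab: Dict[str, int], max_len: int) -> List[List[int]]:
    """Encode a SMILES string as a max_len x len(vocab) one-hot matrix.

    Rows beyond the end of the string are left as all zeros (padding).
    """
    if len(smiles) > max_len:
        raise ValueError("SMILES string is longer than max_len")
    matrix = [[0] * len(vocab) for _ in range(max_len)]
    for row, ch in enumerate(smiles):
        matrix[row][vocab[ch]] = 1  # raises KeyError for unseen characters
    return matrix
```

A matrix of this shape is the kind of fixed-size input a convolutional network expects. A production tokenizer would more likely treat multi-character atoms and ring-closure tokens as single symbols.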

Cascade Network Based Bolt Inspection In High-Speed Train

  • Gu, Xiaodong;Ding, Ji
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 15, No. 10
    • /
    • pp.3608-3626
    • /
    • 2021
  • The detection of bolts is an important task in high-speed train inspection systems, and it is performed frequently to ensure the safety of trains. The difficulty of vision-based bolt inspection lies in small-sample defect detection, which makes an end-to-end network ineffective. In this paper, the problem is resolved in two stages, comprising a detection network and cascaded classification networks. For small bolt detection, all bolts, both defective and normal, are annotated together for training, and a new loss function and a new bounding box selection based on the smallest axis-aligned convex set are proposed. These allow the YOLOv3 network to obtain accurate positions and bounding boxes for the various bolts. The average precision is greatly improved on PASCAL VOC, MS COCO, and an actual data set. After that, a Siamese network is employed to estimate the status of the bolts. Using the convolutional Siamese network, we obtain strong results on few-shot classification. Extensive experiments and comparisons on the actual data set show that the system outperforms state-of-the-art algorithms in bolt inspection.

고려상표군 크기에 따른 구텐베르그의 가격독점영역에 관한 연구 (Evaluating the effect of the size of brand consideration set upon the Gutenberg′s monopolistic price interval)

  • 백지원;황선진;이수진
    • 한국의류학회지 / Vol. 27, No. 8 / pp.1004-1013 / 2003
  • This study addressed a poorly understood issue: the price response model and monopolistic price interval of fashion goods. The concept of the monopolistic price interval introduced by Gutenberg has rarely been applied to fashion goods, which are known to be price sensitive. Thus, this study examined the price-insensitive zone of blue jeans. Data from 268 respondents were analyzed using Choice-based Conjoint (CBC) analysis and t-tests. Considering the brand consideration set as a price determinant, we found that a monopolistic price interval exists for jeans. The results of the CBC analysis showed that the larger the brand consideration set, the shorter the monopolistic interval. This implies that a consumer with a small brand consideration set is more likely to have a longer monopolistic price interval than one with a large brand consideration set, since a consumer with a small consideration set tends to value the brand itself more than price. Although significant monopolistic price intervals were found for only three of the seven jeans brands, reducing the size of the brand consideration set and increasing brand loyalty were found to be important in maximizing firms' financial profits.

접촉 압력 분포를 이용한 로봇 의료 촉진 (A Robotic Medical Palpation using Contact Pressure Distribution)

  • 김형균;최승문;정완균
    • 로봇학회논문지 / Vol. 12, No. 3 / pp.322-331 / 2017
  • In this paper we present a novel robotic palpation method for lump shape estimation using contact pressure distribution. Many previous studies of robotic palpation have used a stiffness map, which is not suitable for obtaining geometrical information about a lump. As a result, they require a large data set and a long palpation time to estimate the lump shape. Instead of using the stiffness map, the proposed palpation method uses the difference between the normal force direction and the surface normal to detect the lump boundary and estimate its normal. The palpation trajectory is generated from the normal of the lump boundary to track the lump boundary in real time. The proposed approach requires a small data set and a short palpation time for lump shape estimation, since the shape can be estimated directly from the optimally generated palpation trajectory. Experimental results show that our method can find the lump shape accurately in real time with a small amount of data and in a short time.
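
The boundary cue described above, the difference between the normal force direction and the surface normal, can be sketched as an angle test between two 3D vectors. The abstract does not give the detection rule, so the thresholding and the default threshold of 0.2 rad below are assumptions for illustration only, and the function names are hypothetical.

```python
import math
from typing import Sequence


def angle_between(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)


def is_lump_boundary(
    force_direction: Sequence[float],
    surface_normal: Sequence[float],
    threshold_rad: float = 0.2,  # illustrative value, not from the paper
) -> bool:
    """Flag a contact point where the force direction deviates from the normal."""
    return angle_between(force_direction, surface_normal) > threshold_rad
```

On a flat, homogeneous surface the two directions would coincide and the angle would be near zero, so a deviation above the threshold is treated here as a candidate boundary point.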

Predictive Analysis of Financial Fraud Detection using Azure and Spark ML

  • Priyanka Purushu;Niklas Melcher;Bhagyashree Bhagwat;Jongwook Woo
    • Asia Pacific Journal of Information Systems / Vol. 28, No. 4 / pp.308-319 / 2018
  • This paper aims to provide valuable insights into financial fraud detection for mobile money transaction activity. We predicted and classified transactions as normal or fraudulent with a small sample and a massive data set using Azure and Spark ML, which represent traditional systems and Big Data, respectively. Experimenting with the sample data set in Azure, we found that the Decision Forest model is the most accurate in terms of recall. For the massive data set using Spark ML, the Random Forest classifier proved to be the best algorithm. We show that the Spark cluster builds and evaluates models much faster as more servers are added to the cluster, with the same accuracy, which indicates that prediction on large-scale data sets is feasible on a Big Data platform. Finally, we reached a recall score of 0.73, which implies satisfactory prediction quality for fraudulent transactions.
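
The recall score reported above is the fraction of actual fraud cases that the classifier flags, TP / (TP + FN). The sketch below computes it from label lists. The labels in the usage note are invented for illustration and are not data from the paper, and the function name is hypothetical.

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Recall = true positives / (true positives + false negatives)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive samples in y_true")
    return tp / (tp + fn)
```

For example, with true labels `[1, 1, 1, 0, 0]` and predictions `[1, 0, 1, 0, 1]`, two of the three fraud cases are caught, giving a recall of 2/3. Recall is a natural metric for fraud detection because a missed fraud case is usually costlier than a false alarm.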

Philips LINAC 6 MV와 8 MV X선 소조사연에 대한 선량분포 측정 (Measurement of Dose Distribution in Small Beams of Philips 6 and 8 MVX Linear Accelerator)

  • 서태석;윤세철;신경섭;박용휘
    • Radiation Oncology Journal / Vol. 9, No. 1 / pp.143-152 / 1991
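  • This paper describes a method for comprehensively processing measured beam data so that the X-ray dose distribution for small fields can be calculated with a general empirical formula. Beam data were measured for Philips LINAC 6 MV and 8 MV X-rays, and the measured quantities include the tissue maximum ratio (TMR), off-axis ratio (OAR), and relative output factor (ROF). For small-field radiotherapy, special cylindrical collimators were fabricated in 2 mm steps, giving field diameters of 1 to 3 cm at the isocenter. A diode detector was used for the measurements, and the results were compared with values measured with film and TLD. From the TMR and OAR data measured for a limited set of fields, empirical formulas representing the beam data were derived. These were extended to general empirical formulas that can predict measured values under arbitrary set-up conditions, and they agreed well with the measured TMR and OAR values.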
