• Title/Summary/Keyword: Data Labeling


Evaluating Unsupervised Deep Learning Models for Network Intrusion Detection Using Real Security Event Data

  • Jang, Jiho;Lim, Dongjun;Seong, Changmin;Lee, JongHun;Park, Jong-Geun;Cheong, Yun-Gyung
    • International journal of advanced smart convergence
    • /
    • v.11 no.4
    • /
    • pp.10-19
    • /
    • 2022
  • AI-based Network Intrusion Detection Systems (AI-NIDS) detect network attacks using machine learning and deep learning models. Unsupervised AI-NIDS methods have recently been attracting more attention because they require no labeling, which is crucial for building practical NIDS. This paper aims to test the impact of the design of autoencoder models that can be applied to an unsupervised AI-NIDS in real network systems. We collected security events from a legacy network security system and carried out an experiment. We report the results and discuss the findings.

Generalized wheat head Detection Model Based on CutMix Algorithm (CutMix 알고리즘 기반의 일반화된 밀 머리 검출 모델)

  • Juwon Yeo;Wonjun Park
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.73-75
    • /
    • 2024
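  • In this paper, we propose a generalized detection model aimed at increasing wheat yield. To improve generalization performance, the data were augmented with the CutMix algorithm, and Fast R-CNN-based pseudo labeling was used to make the most of unlabeled data. To improve the accuracy and efficiency of training, the model was trained from a pre-trained EfficientDet model and validated using OOF. In a performance evaluation against recent object detection models using IoU (Intersection over Union), the proposed model showed the highest performance.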
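
The IoU (Intersection over Union) metric named in this abstract can be illustrated with a minimal sketch. The box format `(x_min, y_min, x_max, y_max)` and the function name `iou` are assumptions made for illustration; the paper's actual evaluation code is not shown in the abstract.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = area A + area B - overlap.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0, 0, 2, 2)
    ground_truth = (1, 1, 3, 3)
    print(iou(predicted, ground_truth))
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; the threshold used in the paper is not stated here.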


Manchu Script Letters Dataset Creation and Labeling

  • Aaron Daniel Snowberger;Choong Ho Lee
    • Journal of information and communication convergence engineering
    • /
    • v.22 no.1
    • /
    • pp.80-87
    • /
    • 2024
  • The Manchu language holds historical significance, but a complete dataset of Manchu script letters for training optical character recognition machine-learning models is currently unavailable. Therefore, this paper describes the process of creating a robust dataset of extracted Manchu script letters. Rather than performing automatic letter segmentation based on whitespace or the thickness of the central word stem, an image of the Manchu script was manually inspected, and one copy of the desired letter was selected as a region of interest. This selected region of interest was used as a template to match all other occurrences of the same letter within the Manchu script image. Although the dataset in this study contained only 4,000 images of five Manchu script letters, these letters were collected from twenty-eight writing styles. A full dataset of Manchu letters is expected to be obtained through this process. The collected dataset was normalized and used to train a simple convolutional neural network to verify its effectiveness.
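
The template-matching step described in this abstract can be sketched as follows. This is a minimal pure-Python illustration that scores each window by the sum of squared differences (SSD); the scoring function, the `max_ssd` threshold, and the name `match_template` are assumptions, and the abstract does not say which matching measure or library the authors used.

```python
def match_template(image, template, max_ssd=0):
    """Return (row, col) of every window whose SSD to the template is <= max_ssd.

    `image` and `template` are 2D lists of grayscale values.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    hits = []
    # Slide the template over every position where it fits entirely.
    for y in range(img_h - tpl_h + 1):
        for x in range(img_w - tpl_w + 1):
            ssd = sum(
                (image[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(tpl_h)
                for dx in range(tpl_w)
            )
            if ssd <= max_ssd:
                hits.append((y, x))
    return hits


if __name__ == "__main__":
    # Toy "script image" containing the same 2x2 glyph twice.
    page = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0],
    ]
    # Region of interest selected by hand from rows 1-2, columns 1-2.
    roi = [row[1:3] for row in page[1:3]]
    print(match_template(page, roi))
```

With real handwriting, a nonzero `max_ssd` (or a normalized correlation score) would be needed so that slightly different strokes of the same letter still match.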

Response Modeling with Semi-Supervised Support Vector Regression (준지도 지지 벡터 회귀 모델을 이용한 반응 모델링)

  • Kim, Dong-Il
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.9
    • /
    • pp.125-139
    • /
    • 2014
  • In this paper, I propose a response modeling method based on a Semi-Supervised Support Vector Regression (SS-SVR) algorithm. To increase the accuracy and profit of response modeling, unlabeled data in the customer dataset are used together with the labeled data during training. The proposed SS-SVR algorithm is designed as a batch learning method to reduce the training complexity. The label distributions of the unlabeled data are estimated in order to account for the uncertainty of labeling. Then, multiple training data are generated from the unlabeled data and their estimated label distributions with oversampling, and these are combined with the labeled data to construct the training dataset. Finally, a data selection algorithm, Expected Margin based Pattern Selection (EMPS), is employed to reduce the training complexity. Experimental results on a real-world marketing dataset showed that the proposed response modeling method trained efficiently and improved both the accuracy and the expected profit.

Land cover classification based on the phonology of Korea using NOAA-AVHRR

  • Kim, Won-Joo;Nam, Ki-Deock;Park, Chong-Hwa
    • Proceedings of the KSRS Conference
    • /
    • 1999.11a
    • /
    • pp.439-442
    • /
    • 1999
  • It is important to analyze the seasonal change profiles of land cover types at large scale for establishing preservation strategies and for environmental monitoring. Because the NOAA-AVHRR data sets provide global data with high temporal resolution, they are suitable for land cover classification of large areas. The objectives of this study were to classify the land cover of Korea and to investigate the phenological profiles of the land cover. NOAA-AVHRR data from Jan. 1998 to Dec. 1998, received by the Korea Ocean Research & Development Institute (KORDI), were used for this study. NDVI data were produced from these data, and monthly maximum value composite data were made to reduce cloud effects and to support temporal classification. The data were classified using supervised classification. To label the land cover classes, they were classified again using a generalized vegetation map and a Landsat-TM classified image. The profile of each class was then analyzed month by month. The results of this study can be summarized as follows. First, it was verified that the vegetation map and the TM classified map can be used to obtain temporal class labeling with NOAA-AVHRR. Second, the phenological characteristics of plant communities of Korea were identified using NOAA-AVHRR. Third, the NDVI of North Korea is lower in summer than that of South Korea. Finally, forest cover is higher than the other cover types, and broadleaf forest is highest in May. The outline of the cover type profiles was investigated.
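
The NDVI and monthly maximum value composite steps mentioned in this abstract can be sketched as below. The NDVI formula (NIR minus red, divided by NIR plus red) is the standard definition; the grid layout, the sample values, and the function names are assumptions for illustration, not the study's processing chain.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def max_value_composite(ndvi_grids):
    """Per-pixel maximum over a list of same-sized NDVI grids.

    Clouds lower NDVI, so keeping the maximum over a month tends to
    select the clearest observation of each pixel.
    """
    rows, cols = len(ndvi_grids[0]), len(ndvi_grids[0][0])
    return [
        [max(grid[r][c] for grid in ndvi_grids) for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    # Two hypothetical observations of a 1x2 scene: (red, nir) reflectances.
    day1 = [[(0.10, 0.50), (0.30, 0.32)]]   # second pixel cloudy
    day2 = [[(0.25, 0.27), (0.08, 0.40)]]   # first pixel cloudy
    grids = [[[ndvi(r, n) for r, n in row] for row in day] for day in (day1, day2)]
    print(max_value_composite(grids))
```

For AVHRR, the red and near-infrared inputs correspond to channels 1 and 2 respectively.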


Enhancement of Tongue Segmentation by Using Data Augmentation (데이터 증강을 이용한 혀 영역 분할 성능 개선)

  • Chen, Hong;Jung, Sung-Tae
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.13 no.5
    • /
    • pp.313-322
    • /
    • 2020
  • A large volume of data improves the robustness of deep learning models and helps avoid overfitting. In automatic tongue segmentation, the availability of annotated tongue images is often limited because of the difficulty of collecting and labeling tongue image datasets in practice. Data augmentation can expand the training dataset and increase the diversity of training data by using label-preserving transformations without collecting new data. In this paper, augmented tongue image datasets were developed using seven augmentation techniques, such as image cropping, rotation, flipping, and color transformations. The performance of the data augmentation techniques was studied using state-of-the-art transfer learning models, for instance, InceptionV3, EfficientNet, ResNet, and DenseNet. Our results show that geometric transformations lead to larger performance gains than color transformations, and that the segmentation accuracy can be increased by 5% to 20% compared with no augmentation. Furthermore, a dataset augmented with a random linear combination of geometric and color transformations gives better segmentation performance than all other datasets, reaching an accuracy of 94.98% with InceptionV3 models.
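
For segmentation, a geometric augmentation stays label-preserving only if the same transformation is applied to the image and to its mask. The sketch below illustrates that idea with flipping and rotation on small 2D lists; the function names and the toy data are assumptions, and the paper's seven techniques and their parameters are not reproduced here.

```python
def hflip(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def rot90_clockwise(grid):
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


def augment_pair(image, mask, transform):
    """Apply the same geometric transform to an image and its mask."""
    return transform(image), transform(mask)


if __name__ == "__main__":
    image = [[1, 2],
             [3, 4]]
    mask = [[0, 1],     # the pixel with value 2 is labeled as tongue
            [0, 0]]
    for transform in (hflip, rot90_clockwise):
        new_image, new_mask = augment_pair(image, mask, transform)
        print(transform.__name__, new_image, new_mask)
```

Color transformations, by contrast, change only the image and leave the mask untouched.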

Efficient hardware implementation and analysis of true random-number generator based on beta source

  • Park, Seongmo;Choi, Byoung Gun;Kang, Taewook;Park, Kyunghwan;Kwon, Youngsu;Kim, Jongbum
    • ETRI Journal
    • /
    • v.42 no.4
    • /
    • pp.518-526
    • /
    • 2020
  • This paper presents an efficient hardware random-number generator based on a beta source. The proposed generator counts the values of "0" and "1" and provides a method to distinguish between pseudo-random and true random numbers by comparing them using simple cumulative operations. The random-number generator produces labeled data indicating whether the count value is a pseudo- or true random number according to its bit value based on the generated labeling data. The proposed method is verified using a system based on Verilog RTL coding and LabVIEW for hardware implementation. The generated random numbers were tested according to the NIST SP 800-22 and SP 800-90B standards, and they satisfied the test items specified in the standard. Furthermore, the hardware is efficient and can be used for security, artificial intelligence, and Internet of Things applications in real time.
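
As an illustration of testing a bit stream by counting "0" and "1" values, the sketch below implements the frequency (monobit) test from NIST SP 800-22, one of the standards named in the abstract. It is not the paper's cumulative-operation method, which the abstract does not specify; the function name and return layout are assumptions.

```python
import math


def monobit_test(bits):
    """NIST SP 800-22 frequency (monobit) test.

    `bits` is a sequence of 0/1 integers. Returns (zeros, ones, p_value).
    """
    n = len(bits)
    ones = sum(bits)
    zeros = n - ones
    # Test statistic: imbalance between ones and zeros, scaled by sqrt(n).
    s_obs = abs(ones - zeros) / math.sqrt(n)
    p_value = math.erfc(s_obs / math.sqrt(2))
    return zeros, ones, p_value


if __name__ == "__main__":
    sample = [int(ch) for ch in "1011010101"]
    zeros, ones, p_value = monobit_test(sample)
    print(zeros, ones, round(p_value, 6))
```

Under SP 800-22, a sequence fails this test when the p-value falls below 0.01; the ten-bit sample here only demonstrates the arithmetic, since the standard recommends at least 100 bits.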

Concept Analysis of Stigma (낙인(stigma) 개념분석)

  • Lee, In-Ok;Lee, Eun-Ok
    • Journal of muscle and joint health
    • /
    • v.13 no.1
    • /
    • pp.53-66
    • /
    • 2006
  • Aims. To analyze the concept of stigma in order to develop a valid instrument to measure stigma. Methods. First, a concept analysis was conducted with the aim of clarifying the state of the science of discipline-specific conceptualizations of stigma. The criterion-based method of concept analysis as described by Morse and colleagues was used (Morse et al., 1996; Morse, 2000). This analytic process enabled the assessment of the scientific maturity of the concept of stigma. The interdisciplinary concept of stigma was found to be immature. Based on this level of maturity, techniques of concept development using the literature as data were applied in order to advance the concept of stigma toward greater maturity. In this process, questions were 'asked of the data' (in this case, the selected disciplinary literatures) to identify the conceptual components of stigma. Results. The inquiry into the concept of stigma led to the development of an expanded interdisciplinary conceptual definition by merging the most coherent commonalities from each discipline. The conceptual components of stigma were also identified. The antecedent factors of stigma were "apart from social identity". The attributes of stigma were "devaluing, labeling, negative stereotypes, discrimination". The consequences of stigma were "social rejection, social isolation, deficiency of social support, low social status".


Noise-Robust Capturing and Animating Facial Expression by Using an Optical Motion Capture System (광학식 동작 포착 장비를 이용한 노이즈에 강건한 얼굴 애니메이션 제작)

  • Park, Sang-Il
    • Journal of Korea Game Society
    • /
    • v.10 no.5
    • /
    • pp.103-113
    • /
    • 2010
  • In this paper, we present a practical method for generating facial animation by using an optical motion capture system. Our setup assumes that body motion and facial expression are captured simultaneously, which degrades the quality of the captured marker data. To overcome this problem, we provide an integrated framework, based on the local coordinate system of each marker, for labeling the marker data, filling holes, and removing noise. We validate the method by applying it to generate a short animated film.

Comparative Analysis of Nutrients between HMR Products and TV Recipes: Focusing on Soup, Stew, and Broth (HMR 제품과 방송 속 레시피의 영양성분 분석: 국, 찌개, 탕류를 중심으로)

  • Kang, Hyeyun;Chung, Lana
    • Journal of the Korean Society of Food Culture
    • /
    • v.35 no.3
    • /
    • pp.233-240
    • /
    • 2020
  • This study examined the nutrient content of HMR products and recipes by television chefs. Twelve menu items from the soup, stew, and broth category were chosen from HMR products and TV chefs' recipes. The nutrition labeling data of the HMR products and the TV chefs' recipes were calculated using Can-Pro 5.0. The analysis compared the HMR products and the TV recipes per serving size. The energy content of the TV recipes (236.1 kcal) was significantly higher than that of the HMR products. On the other hand, the HMR products contained significantly higher sodium levels (926.9 mg) than the TV recipes (565.8 mg). In general, HMR products contained more sodium and less energy and protein than TV recipes. Among the 12 menu items, the product with the highest sodium content was the Spicy soft tofu stew (1,421.4 mg) from the HMR products. The results revealed significant differences in macronutrient and sodium content between the HMR products and the TV chefs' recipes. This study provides supportive data for the need to reduce the sodium content of HMR products. TV cooking programs should focus on the importance of balanced nutrition and on how to reduce sodium intake without disrupting well-balanced nutrition.