• Title/Summary/Keyword: Data Labeling

Search Result 465, Processing Time 0.029 seconds

Research on supplementing unlabeled data through pseudo-labeling. (의사 레이블링을 통한 레이블이 없는 데이터 보완 연구)

  • Min-Hee Yoo;Heon-Chang Yu
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.410-413 / 2023
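  • Labeling is one of the preliminary tasks required for data analysis. Because labeling every data point requires time and human resources, a method that supplements this work could reduce the required resources and greatly improve efficiency. In this paper, for a dataset accumulated at a telecommunications company, we apply pseudo-labeling and SMOTE-based data augmentation to unlabeled data, so that previously unused data can be added to train the model. Experiments confirm that training the model with pseudo-labels is more efficient and performs better than the existing labeling method based on domain knowledge.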
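
The abstract does not describe the classifier, the confidence rule, or the SMOTE settings, so the following is only a minimal sketch of the two ideas it names. Everything here is an assumption made for illustration: the nearest-centroid classifier, the softmax-over-distance confidence, the `threshold` value, and the helper names `centroids`, `predict_with_confidence`, `pseudo_label`, and `smote_like`. The `smote_like` helper is a simplified SMOTE-style interpolation that uses only the single nearest same-class neighbour, not the full algorithm.

```python
import math
import random


def centroids(points, labels):
    """Return the mean vector of each class (a toy nearest-centroid model)."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_with_confidence(model, point):
    """Predict the nearest class and a confidence in (0, 1].

    Confidence is a softmax over negative distances (an assumed rule).
    """
    dists = {y: math.dist(c, point) for y, c in model.items()}
    label = min(dists, key=dists.get)
    weights = {y: math.exp(-d) for y, d in dists.items()}
    return label, weights[label] / sum(weights.values())


def pseudo_label(labeled_x, labeled_y, unlabeled_x, threshold=0.8):
    """Add confidently predicted unlabeled points to the training set.

    Returns the enlarged (x, y) lists and a model refit on them.
    """
    model = centroids(labeled_x, labeled_y)
    new_x, new_y = list(labeled_x), list(labeled_y)
    for p in unlabeled_x:
        label, conf = predict_with_confidence(model, p)
        if conf >= threshold:  # keep only confident pseudo-labels
            new_x.append(p)
            new_y.append(label)
    return new_x, new_y, centroids(new_x, new_y)


def smote_like(samples, n_new, rng):
    """Create n_new synthetic points for one class (needs >= 2 samples).

    Each point is interpolated between a random sample and its nearest
    same-class neighbour; real SMOTE picks among k nearest neighbours.
    """
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        neighbour = min(
            (s for s in samples if s is not base),
            key=lambda s: math.dist(s, base),
        )
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

In this sketch, points whose confidence falls below `threshold` are simply left out, which is one common way to limit the noise that wrong pseudo-labels introduce.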

Moderating Effect of Education-Hours on the Relationship between Knowledge of Country-of-Origin Labeling and Performance in Hotel Culinary Staff (호텔조리직원들의 음식점 원산지표시에 대한 지식과 수행도 관계와 교육시간 조절효과)

  • Kwon, Ki-Wan;Chong, Yu-Kyeong
    • Culinary science and hospitality research / v.22 no.4 / pp.37-50 / 2016
  • This study aims to examine the effect of the degree of knowledge about country-of-origin labeling on the performance of country-of-origin labeling work, which is a culinary staff task. It also analyzes differences in knowledge depending on hours of origin labeling education, and the moderating effect of education hours on the relationship between knowledge and performance. The study targeted culinary staff members working in ten five-star hotels in Seoul. A total of 205 self-administered questionnaires were distributed from November 14th to 27th, 2014, and 240 questionnaires (98.4%) were used for analysis after the exclusion of 4 with unreliable responses. Based on the data collected, frequency analysis, a reliability test, exploratory factor analysis, simple regression analysis, a t-test, and moderated regression analysis were conducted using SPSS 18.0. The findings are as follows. Culinary staff knowledge of origin labeling had a significantly positive effect on job performance, and the degree of knowledge was higher in the group that attended one- to two-hour periods of education. This suggests a difference in knowledge depending on the hours of education, which in turn had a moderating effect on the relationship between knowledge and performance. In conclusion, to improve knowledge of country-of-origin labeling and the level of performance, education hours need to be increased so that culinary staff members can gain more knowledge and apply it to actual tasks. Based on these results, the limitations of the study and directions for future research are also discussed.

A Study on Classification System using Generative Adversarial Networks (GAN을 활용한 분류 시스템에 관한 연구)

  • Bae, Sangjung;Lim, Byeongyeon;Jung, Jihak;Na, Chulhun;Jung, Hoekyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.338-340 / 2019
  • Recently, the speed and volume of data accumulation have been increasing with the development of networks, and classifying these data poses many difficulties. One of them is labeling. Labeling is usually done by people, but it is very difficult for everyone to understand the data in the same way and to label them on the same basis. To address this problem, we implemented a GAN that generates new images from input images and used the generated images for training, so that the input data are learned indirectly. This suggests that classification accuracy can be improved by increasing the amount of training data.

The Importance of Manpower in Major Education as an Example of Artificial Intelligence Development in Construction (건설 인공지능 개발사례로 보는 전공교육 인력의 중요성)

  • Heo, Seokjae;Lee, Sanghyun;Lee, Seungwon;Kim, Myunghun;Chung, Lan
    • Proceedings of the Korean Institute of Building Construction Conference / 2021.11a / pp.223-224 / 2021
  • The process before the model learning stage in AI R&D can be subdivided into data collection/cleansing, data purification, and data labeling. After that, depending on the purpose of development, the model is trained with the chosen AI algorithm and then verified. Several studies describe the learning stage as the important part of AI research and try to increase accuracy by changing the structure and layers of the AI model. However, if the refinement and labeling of the learning data are tailored only to the model format and not to the purpose of development, the desired AI model cannot be obtained. Recent research indicates that most AI research failures stem from the learning data rather than from the structure of the AI model.

Towards Improved Performance on Plant Disease Recognition with Symptoms Specific Annotation

  • Dong, Jiuqing;Fuentes, Alvaro;Yoon, Sook;Kim, Taehyun;Park, Dong Sun
    • Smart Media Journal / v.11 no.4 / pp.38-45 / 2022
  • Object detection models have become the current tool of choice for plant disease detection in precision agriculture. Most existing research improves performance by refining networks and optimizing the loss function. However, the data-centric part of a project also needs more investigation. In this paper, we propose a systematic strategy with three different annotation methods for plant disease detection: local, semi-global, and global labels. Experimental results on our paprika disease dataset show that single-class annotation with semi-global boxes may improve accuracy. In addition, we studied the noise factor during the labeling process. An ablation study shows that annotation noise within 10% is acceptable for maintaining good performance. Overall, this data-centric numerical analysis helps us understand the significance of annotation methods and gives practitioners a way to obtain higher performance and reduce annotation costs on plant disease detection tasks. Our work encourages researchers to pay more attention to label quality and the essential issues of labeling methods.

A Study on the Current Nutrition Labeling Practices for the Processed Foods Retailed in the Supermarket in Korea (시판 포장가공 식품의 영양표시 현황에 관한 조사연구)

  • 장순옥
    • Journal of Nutrition and Health / v.30 no.1 / pp.100-108 / 1997
  • Our current food hygiene law mandates a nutrition label (NL) only for special nutrition foods, health support foods, instant foods, and foods bearing a note emphasizing a certain nutrient. More processed foods now bear nutrition labels, though the format is quite inconsistent. This study examined the status of current nutrition labeling practices for processed foods retailed in supermarkets. The information obtained was assessed in terms of the numerical presentation of nutrient content, descriptive terms, health claims, and format. The results are summarized as follows. 1) Foods with NL are limited to the food categories specified by the current hygiene law, and voluntary nutrition labeling is rare. 2) Descriptive terms such as free, low, and sufficient are not substantiated with quantitative data. The efficacy of microelements, which has not yet been clarified, is overemphasized, while major nutrients are ignored. 3) Under the current nutrition labeling act, the regulations for descriptive terms are based on nutrient content per 100 g or 100 ml. This could mislead consumers, so the definitions of these descriptors would be better based on the amount of food customarily eaten at one time. For this, a standard serving size should be set officially. 4) Quantitative nutrition information given on food products is difficult to compare because of the lack of a standard format. The title of the NL, the amount and kinds of nutrients, the order of nutrients listed, the unit of expression, the RDA comparison, and the reference RDA are inconsistent among foods with similar dietary properties. A uniform format is needed to give NL credibility and usefulness. Providing nutrition information to consumers through NL is a worldwide practice, though its efficacy has been controversial. Under the newly legislated health promotion law in Korea, nutrition education is expected to play a part in improving the national nutrition condition, and NL would be a potent tool for public nutrition education. It appears to be time to mandate NL for all processed foods on the market. The results of the present study could initiate further consumer experiments related to NL. Various interest groups such as food and nutrition professionals, public health organizations, government regulatory agencies, food producers and marketers, and consumer groups need to participate and communicate for the legislation of NL and the development of an NL format.

A Study on the Voxel Coloring using Multi-variable Thresholding (다중 가변 문턱값을 이용한 복셀 칼라링 기법에 관한 연구)

  • Kim Hyo-Sung;Lee Sang-Wook;Nam Ki-Gon
    • Journal of the Korea Institute of Information and Communication Engineering / v.9 no.5 / pp.1102-1110 / 2005
  • In this paper, we propose an advanced approach to resolve the trade-off in choosing the threshold value that determines photo-consistency in previous algorithms. The threshold value for a surface voxel is replaced with the photo-consistency value of the inside voxel. As the voxel coloring process iterates, the threshold approaches the optimal value for each individual surface voxel. We also present an energy minimization formulation of the binary labeling problem in which surface voxels are classified as opaque or transparent. The energy function consists of a data term and a smoothness term. Because neighboring voxels are considered in the labeling problem, the unevenness of the reconstructed surface is reduced. The labeling with the globally minimal energy can be computed using a graph cut.
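
The abstract does not state the energy function itself. A generic form of a binary labeling energy with a data term and a smoothness term can be written as follows. All symbols are assumptions introduced here for illustration: $f_v \in \{0, 1\}$ is the label of voxel $v$ (transparent or opaque), $D_v$ is the data term, $V_{vw}$ is the smoothness term over the set $\mathcal{N}$ of neighboring voxel pairs, and $\lambda$ is a weight.

```latex
E(f) \;=\; \sum_{v} D_v(f_v) \;+\; \lambda \sum_{(v,w) \in \mathcal{N}} V_{vw}(f_v, f_w),
\qquad f_v \in \{0, 1\}
```

For an energy of this form, a graph cut yields the global minimum when each pairwise term is submodular, that is, when $V_{vw}(0,0) + V_{vw}(1,1) \le V_{vw}(0,1) + V_{vw}(1,0)$.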

Sequence Labeling-based Multiple Causal Relations Extraction using Pre-trained Language Model for Maritime Accident Prevention (해양사고 예방을 위한 사전학습 언어모델의 순차적 레이블링 기반 복수 인과관계 추출)

  • Ki-Yeong Moon;Do-Hyun Kim;Tae-Hoon Yang;Sang-Duck Lee
    • Journal of the Korean Society of Safety / v.38 no.5 / pp.51-57 / 2023
  • Numerous studies have been conducted to analyze the causal relationships of maritime accidents using natural language processing techniques. However, when multiple causes and effects are associated with a single accident, the effectiveness of extracting these causal relations diminishes. To address this challenge, we compiled a dataset using verdicts from maritime accident cases in this study, analyzed their causal relations, and applied labeling considering the association information of various causes and effects. In addition, to validate the efficacy of our proposed methodology, we fine-tuned the KoELECTRA Korean language model. The results of our validation process demonstrated the ability of our approach to successfully extract multiple causal relationships from maritime accident cases.

Method for improving video/image data quality for AI learning of unstructured data (비정형데이터의 AI학습을 위한 영상/이미지 데이터 품질 향상 방법)

  • Kim Seung Hee;Dongju Ryu
    • Convergence Security Journal / v.23 no.2 / pp.55-66 / 2023
  • Recently, there has been a growing movement in all areas of society to increase the value of AI learning data and to secure high-quality data, building on previous research on AI learning data. Quality management is therefore very important in data construction projects that aim to secure high-quality data. This paper presents quality management for securing high-quality data when building AI learning data, along with improvement plans for each construction process. In particular, more than 80% of the data quality of unstructured data built for AI learning is determined during the construction process. We performed quality inspection of image/video data. In addition, we identified inspection procedures and problem elements that occurred in the construction phases of acquisition, data cleaning, labeling, and models, and suggested ways to secure high-quality data by solving them. This is expected to help research groups and operators participating in the construction of AI learning data overcome deviations in data quality.

Efficient Authorization Conflict Detection Using Prime Number Graph Labeling in RDF Access Control (RDF 접근 제어에서 소수 그래프 레이블링을 사용한 효율적 권한 충돌 발견)

  • Kim, Jae-Hoon;Park, Seog
    • Journal of KIISE:Databases / v.35 no.2 / pp.112-124 / 2008
  • RDF and OWL are the primary base technologies for implementing the Semantic Web. Recently, much research related to them, or applying them to other application domains, has been introduced. However, relatively little work has been done on securing RDF and OWL data. In this article, we briefly introduce an RDF triple-based model for specifying RDF access authorization. Next, we describe in detail a method that uses prime number graph labeling to efficiently find authorization conflicts caused by RDF inference. The problem is that even though access to a lower concept is permitted, it can become inaccessible because access to an upper concept is denied, since RDF inference allows the lower concept to be interpreted as the upper concept. Experimental results show that the proposed method using prime number graph labeling performs better than the existing simple method for detecting authorization conflicts.
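
The abstract does not spell out the labeling scheme, so the following is only a sketch of one generic way prime number labeling can answer "is this concept below that one?" with a single divisibility test. It is not necessarily the authors' method. The assumptions are: the concept hierarchy is a DAG given as a hypothetical `parents` mapping, each concept receives a fresh prime, and its label is that prime multiplied by the least common multiple of its parents' labels. The helper names `label_hierarchy` and `find_conflicts` are illustrative.

```python
from functools import reduce
from math import gcd


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for small graphs)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def lcm_all(values):
    """Least common multiple of an iterable of ints (1 if it is empty)."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)


def label_hierarchy(parents):
    """Label every concept in a DAG.

    parents maps each concept to the list of its direct upper concepts.
    A label is the product of the concept's own prime and the primes of
    all its ancestors, so an ancestor's label always divides it.
    """
    gen = primes()
    labels = {}

    def visit(node):
        if node not in labels:
            inherited = lcm_all(visit(p) for p in parents[node])
            labels[node] = next(gen) * inherited
        return labels[node]

    for node in parents:
        visit(node)
    return labels


def find_conflicts(labels, permits, denies):
    """Return (permitted, denied) pairs where the permitted concept is the
    denied concept itself or lies below it in the hierarchy."""
    return [
        (p, d)
        for p in permits
        for d in denies
        if labels[p] % labels[d] == 0  # divisibility = "same or below"
    ]
```

For example, with `parents = {"Person": [], "Employee": ["Person"], "Manager": ["Employee"]}`, a permission on `Manager` and a denial on `Employee` are reported as a conflict because the label of `Manager` is divisible by the label of `Employee`.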