• Title/Summary/Keyword: Data Preprocessing

THE NONDESTRUCTIVE MEASUREMENT OF THE SOLUBLE SOLID AND ACID CONTENTS OF INTACT PEACH USING VIS/NIR TRANSMITTANCE SPECTRA

  • Hwang, I.G.;Noh, S.H.;Lee, H.Y.;Yang, S.B.
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2000.11b / pp.210-218 / 2000
  • Since the SSC (soluble solid content) and titratable acidity of fruit are closely related to taste, the need to measure them with non-destructive technologies such as VIS/NIR (visible and near-infrared) spectroscopy is increasing. In particular, to grade the quality of each fruit with a sorter at sorting and packing facilities, online measurement technologies that meet the required tolerances for accuracy and speed should be developed. Much research has been done to develop devices that measure internal qualities of fruit such as SSC, titratable acidity, and firmness with VIS/NIR reflectance spectra. However, the distributions of these quality factors vary with position and depth in the fruit, and because VIS/NIR light generally interacts with only a few millimeters of pathlength in the fruit, it is very difficult to measure the quality of the inner flesh. Therefore, to measure the average concentration of each quality factor, such as SSC and titratable acidity, with reflectance-type NIR devices, spectra must be measured at several positions on the fruit. Recently, interest in transmittance-type VIS/NIR devices has been increasing. Only about 1/10 to 1/1,000,000 % of NIR light penetrates through the fruit. Therefore, a very intense light source and a very sensitive sensor must be used to measure the transmitted light spectra of intact fruit. The ultimate purpose of this study was to develop a device to measure the transmitted light spectra of intact fruit such as apple, pear, and peach. With the transmittance-type VIS/NIR device, the feasibility of measuring the SSC and titratable acidity of intact fruit cultivated in Korea was tested.
The results are summarized as follows. A simple device that can measure the transmitted light spectra of intact fruit was constructed from a sample holder, two 500 W tungsten halogen lamps, a real-time spectrometer with a very sensitive CCD array sensor, and an optical fiber probe. With the device, it was possible to measure the transmitted light spectra of intact fruit such as apple, pear, and peach. The main factors affecting the intensity of the transmitted light spectra were the size of the sample, the radiation intensity of the light source, and the integration time of the detector. The sample holder should be designed so that direct light cannot leak to the probe. The preprocessing method applied to the raw spectrum data significantly influenced the performance of the nondestructive measurement of SSC and titratable acidity of intact fruit. Representative results of PLS models were an SEP of 0.558 Brix% and $R^2$ of 0.819 for predicting the SSC of peach, and an SEP of 0.056% and $R^2$ of 0.655 for predicting titratable acidity.
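
The SEP and R2 figures quoted above can be illustrated with a short sketch. The abstract does not state which definitions were used, so this assumes the common chemometrics convention: SEP as the bias-corrected standard deviation of the prediction residuals, and R2 as the coefficient of determination. The function name `sep_and_r2` and the sample values are hypothetical.

```python
import math

def sep_and_r2(reference, predicted):
    """Return (bias, SEP, R^2) for paired reference and predicted values."""
    n = len(reference)
    residuals = [p - r for p, r in zip(predicted, reference)]
    bias = sum(residuals) / n
    # SEP (assumed definition): standard deviation of the residuals
    # after removing the mean residual (bias).
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    # R^2 (assumed definition): coefficient of determination
    # relative to the mean of the reference values.
    mean_ref = sum(reference) / n
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return bias, sep, 1.0 - ss_res / ss_tot

# Made-up SSC values for illustration only, not data from the paper.
reference = [10.0, 11.0, 12.0, 13.0]
predicted = [10.5, 10.5, 12.5, 12.5]
bias, sep, r2 = sep_and_r2(reference, predicted)
print(f"bias={bias:.3f}  SEP={sep:.3f}  R2={r2:.3f}")
```

Other conventions exist, for example R2 as the squared correlation coefficient, so values computed this way may not match a given paper exactly.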

Highly Reliable Fault Detection and Classification Algorithm for Induction Motors (유도전동기를 위한 고 신뢰성 고장 검출 및 분류 알고리즘 연구)

  • Hwang, Chul-Hee;Kang, Myeong-Su;Jung, Yong-Bum;Kim, Jong-Myon
    • The KIPS Transactions:PartB / v.18B no.3 / pp.147-156 / 2011
  • This paper proposes a 3-stage (preprocessing, feature extraction, and classification) fault detection and classification algorithm for induction motors. In the first stage, a low-pass filter removes noise components from the fault signal. In the second stage, a discrete cosine transform (DCT) and a statistical method extract features of the fault signal. Finally, a back propagation neural network (BPNN) classifies the fault signal. To evaluate the performance of the proposed algorithm, we used one-second-long normal and abnormal vibration signals of an induction motor sampled at 8 kHz. Experimental results showed that the proposed algorithm achieves about 100% accuracy in fault classification, a 50% improvement in accuracy over an existing fault detection algorithm that uses a cross-covariance method. In a real-world data acquisition environment, unwanted noise components are usually mixed into the real signal. We therefore conducted an additional simulation to evaluate how well the proposed algorithm classifies fault signals when white Gaussian noise is added to them. The simulation results showed that the proposed algorithm achieves over 98% accuracy in fault classification. Moreover, we developed a testbed system including a TI DSP (digital signal processor) to implement and verify the functionality of the proposed algorithm.
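
The first two stages described above (low-pass filtering, then DCT plus statistics) can be sketched as follows. This is only an illustration under assumptions: the abstract does not specify the filter, the number of DCT coefficients, or the statistics, so a moving-average filter, the first 8 unnormalized DCT-II coefficients, and their mean and variance are used as hypothetical stand-ins. The BPNN classifier is omitted.

```python
import math

def moving_average(signal, width):
    """Simple FIR low-pass filter: mean over a sliding window."""
    return [sum(signal[i:i + width]) / width
            for i in range(len(signal) - width + 1)]

def dct2(x):
    """Unnormalized DCT-II, computed directly from its definition."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def features(signal, width=4, n_coeffs=8):
    """Stage 1 (smoothing) followed by stage 2 (DCT + simple statistics)."""
    smoothed = moving_average(signal, width)
    coeffs = dct2(smoothed)[:n_coeffs]
    mean = sum(coeffs) / len(coeffs)
    variance = sum((c - mean) ** 2 for c in coeffs) / len(coeffs)
    return coeffs + [mean, variance]

# Synthetic 64-sample signal: a slow component plus a faster one.
# A real one-second recording at 8 kHz would have 8000 samples; the direct
# O(n^2) DCT here is for clarity, not speed.
signal = [math.sin(2 * math.pi * 3 * t / 64) + 0.3 * math.sin(2 * math.pi * 25 * t / 64)
          for t in range(64)]
feature_vector = features(signal)
print(len(feature_vector))
```

The resulting feature vector would then be the input to a classifier such as the BPNN named in the abstract.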

Page Logging System for Web Mining Systems (웹마이닝 시스템을 위한 페이지 로깅 시스템)

  • Yun, Seon-Hui;O, Hae-Seok
    • The KIPS Transactions:PartC / v.8C no.6 / pp.847-854 / 2001
  • The Web continues to grow rapidly in both the volume of traffic and the size and complexity of Web sites. Along with this growth, the complexity of tasks such as Web site design, Web server design, and even simple navigation through a Web site has increased. An important input to these design tasks is an analysis of how a Web site is being used. This paper proposes a Page Logging System (PLS) that reliably identifies the user sessions required by a Web mining system. PLS consists of a Page Logger that acquires all of a user's page accesses, a Log Processor that produces user sessions from these data, and statements that incorporate a call to the Page Logger applet. The proposed PLS shortens several preprocessing tasks that take a lot of time and effort and must be performed in Web mining systems. In particular, it simplifies the transaction identification phase by directly acquiring the amount of time a user stays on a page. PLS also solves the problems of local cache hits and proxy IPs, which make it difficult to identify user sessions from a Web server log.
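
How a log processor might turn page-level records into user sessions can be sketched as below. The record layout `(user_id, url, enter_ts, leave_ts)`, the function name `build_sessions`, and the 30-minute inactivity threshold are assumptions for illustration; the abstract does not describe the actual data format or session rule.

```python
from collections import defaultdict

def build_sessions(records, max_gap=1800):
    """Group page visits into sessions per user.

    records: iterable of (user_id, url, enter_ts, leave_ts), times in seconds.
    A new session starts when the gap between leaving one page and entering
    the next exceeds max_gap (hypothetical threshold).
    """
    by_user = defaultdict(list)
    for user, url, enter, leave in records:
        by_user[user].append((enter, leave, url))

    sessions = []
    for user, visits in by_user.items():
        visits.sort()  # chronological order by enter time
        current, last_leave = [], None
        for enter, leave, url in visits:
            if current and enter - last_leave > max_gap:
                sessions.append((user, current))
                current = []
            # Time on page is known directly because the logger records both ends.
            current.append((url, leave - enter))
            last_leave = leave
        if current:
            sessions.append((user, current))
    return sessions

records = [
    ("u1", "/a", 0, 30),
    ("u1", "/b", 30, 90),
    ("u1", "/c", 5000, 5010),  # long gap, so this starts a new session
    ("u2", "/a", 10, 20),
]
sessions = build_sessions(records)
for user, pages in sessions:
    print(user, pages)
```

Because each record carries both an enter and a leave time, time on page does not have to be inferred from the next request, which is the simplification the abstract attributes to PLS.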

A Evaluation Parameter Development of Anesthesia Depth in Each Anesthesia Steps by the Wavelet Transform of the Heart Rate Variability Signal (HRV 신호의 웨이브렛 변환에 의한 마취단계별 마취심도 평가 파라미터 개발)

  • Jeon, Gye-Rok;Kim, Myung-Chul;Han, Bong-Hyo;Ye, Soo-Yung;Ro, Jung-Hoon;Baik, Seong-Wan
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.9 / pp.2460-2470 / 2009
  • In this study, parameters were extracted to evaluate the depth of anesthesia at each anesthesia stage. The subjects were 5 adult patients (mean $\pm$ SD age: $42 \pm 9.13$), ASA classification I and II, undergoing obstetric and gynecological surgery. Anesthesia was maintained with enflurane. The HRV signal was derived from the ECG signal by an R-peak detection algorithm, and the HRV data were then preprocessed. The study sought an anesthesia parameter that responds to anesthesia events and objectively indicates anesthesia depth across the anesthesia stages: pre-anesthesia, induction, maintenance, awake, and post-anesthesia. An algorithm is proposed that analyzes the HRV (heart rate variability) signal at each anesthesia stage using the wavelet transform. Three kinds of wavelet functions were applied to the PSD. All three gave similar results, but the results with Daubechies 10 were better. Therefore, this parameter is considered the best for evaluating the anesthesia stage.
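
The chain from R-peaks to an HRV series to wavelet sub-band energies can be sketched as follows. The paper reports Daubechies 10 as its best wavelet; to stay self-contained, this sketch swaps in the much simpler Haar wavelet, so it illustrates the decomposition idea only and not the authors' method. The R-peak times are made-up values.

```python
import math

def rr_intervals(r_peak_times):
    """HRV series: time differences between successive R-peaks (seconds)."""
    return [b - a for a, b in zip(r_peak_times, r_peak_times[1:])]

def haar_step(x):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def band_energies(x, levels):
    """Energy of the detail coefficients at each level, plus the final approximation."""
    energies = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    return energies, approx

# Made-up R-peak times giving 8 RR intervals that alternate near 0.8 s and 0.9 s.
r_peaks = [0.0, 0.8, 1.7, 2.5, 3.4, 4.2, 5.1, 5.9, 6.8]
rr = rr_intervals(r_peaks)
energies, approx = band_energies(rr, levels=3)
print(energies)
```

For an input whose length is a power of two, the orthonormal Haar transform preserves total energy, so the detail energies and the final approximation together account for the energy of the RR series.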

Design of Pattern Classifier for Electrical and Electronic Waste Plastic Devices Using LIBS Spectrometer (LIBS 분광기를 이용한 폐소형가전 플라스틱 패턴 분류기의 설계)

  • Park, Sang-Beom;Bae, Jong-Soo;Oh, Sung-Kwun;Kim, Hyun-Ki
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.6 / pp.477-484 / 2016
  • Small industrial appliances such as fans, audio equipment, and electric rice cookers mostly consist of ABS, PP, and PS materials. Colored plastics can be classified by near-infrared (NIR) spectroscopy, but black plastics are very difficult to classify because black material absorbs the light. An RBFNN (radial basis function neural network) pattern classifier is therefore introduced for sorting electrical and electronic waste plastics with a LIBS (Laser Induced Breakdown Spectroscopy) spectrometer. In the preprocessing part, PCA (Principal Component Analysis), a dimension reduction algorithm, is used to improve processing speed and to extract the effective characteristics of the data. In the condition part, FCM (Fuzzy C-Means) clustering is used. In the conclusion part, the coefficients of a linear polynomial function are used as connection weights. PSO (particle swarm optimization) and 5-fold cross validation are used to improve the reliability of the performance and to raise the classification rate. The performance of the proposed classifier is reported both with and without optimization.
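
The FCM step in the condition part assigns each sample a degree of membership in every cluster. A minimal sketch of the standard FCM membership formula follows, assuming Euclidean distance and a fuzzification coefficient m = 2. The function name, the centers, and the sample point are hypothetical, and the center update, PCA, and RBFNN training are not shown.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with d_i the distance
    from the point to center i. Memberships sum to 1.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

# Two hypothetical cluster centers in a 2-D (for example PCA-reduced) space.
centers = [(0.0, 0.0), (3.0, 0.0)]
memberships = fcm_memberships((1.0, 0.0), centers)
print(memberships)
```

In a fuzzy-clustering-based RBFNN, memberships of this kind typically play the role of the hidden-layer activations, which is presumably what the abstract means by using FCM in the condition part.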

A Study on the Product Planning Model based on Word2Vec using On-offline Comment Analysis: Focused on the Noiseless Vertical Mouse User (온·오프라인 댓글 분석이 활용된 Word2Vec 기반 상품기획 모델연구: 버티컬 무소음마우스 사용자를 중심으로)

  • Ahn, Yeong-Hwi
    • Journal of Digital Convergence / v.19 no.10 / pp.221-227 / 2021
  • In this paper, we used Word2Vec to conduct a word-to-word similarity analysis of standardized datasets collected through web crawling for 10,000 noiseless vertical mice, had 92 computer engineering students use the presented products for 5 days, and conducted a self-report questionnaire analysis. The questionnaire analysis was conducted by collecting words in narrative form and by presenting the top 50 words extracted from the word frequency analysis and the word similarity analysis for selection. In the similarity analysis of e-commerce users' product reviews, pain (.985) and design (.963) were identified as advantages among the click keywords, and vertical (.985) and adaptation (.948) as disadvantages. In the descriptive frequency analysis, the most frequently selected items were Vertical (123) and Pain (118). In the selection of similar words for merits and demerits, Vertical (83) and Pain (75) were selected as advantages, and adaptation (89) and buttons (72) as disadvantages. Therefore, decision makers and product planners at small and medium enterprises are expected to be able to use these results as important data for decision making when the method applied in this study is reflected in new product development processes and in review strategies for existing products.
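
Word2Vec similarity scores such as those quoted above are commonly computed as the cosine similarity between word vectors. The sketch below shows that computation on tiny made-up vectors; the vectors, the vocabulary, and the helper `most_similar` are hypothetical and do not reproduce the paper's model or scores.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(word, vectors, top_n=3):
    """Rank all other words by cosine similarity to the given word."""
    scores = [(other, cosine_similarity(vectors[word], vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

# Made-up 3-dimensional embeddings for illustration only.
vectors = {
    "click": [1.0, 0.2, 0.0],
    "pain": [0.9, 0.3, 0.1],
    "design": [0.7, 0.1, 0.6],
    "button": [0.0, 1.0, 0.0],
}
ranking = most_similar("click", vectors)
print(ranking)
```

With trained embeddings, the same ranking step yields lists of the words most similar to a target keyword, such as the advantage and disadvantage words reported in the abstract.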

Design and Implementation of OpenCV-based Inventory Management System to build Small and Medium Enterprise Smart Factory (중소기업 스마트공장 구축을 위한 OpenCV 기반 재고관리 시스템의 설계 및 구현)

  • Jang, Su-Hwan;Jeong, Jopil
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.1 / pp.161-170 / 2019
  • Small and medium enterprise factories engaged in multi-product mass production handle a wide variety and a large number of products, and they waste manpower and money on inventory management. In addition, they have no way to check inventory status in real time, so they suffer economic losses from excess inventory and stock shortages. There are many ways to build a real-time data collection environment, but most are hard for small and medium-sized companies to afford. As a result, smart factories in small and medium enterprises face a difficult reality, and appropriate countermeasures are hard to find. In this paper, we implemented an extension of the existing inventory management method through character extraction from labels with barcodes and QR codes, which are widely adopted as current product management technology, and evaluated its effect. Technically, stock labels and barcodes, the existing means of managing product input and output, are automatically recognized and classified through computer image processing: preprocessing with OpenCV followed by the OCR (Optical Character Recognition) function of the Google Vision API. Barcodes are recognized through ZBar. We propose a method to manage inventory by real-time image recognition on a Raspberry Pi without using expensive equipment.

Change Attention-based Vehicle Scratch Detection System (변화 주목 기반 차량 흠집 탐지 시스템)

  • Lee, EunSeong;Lee, DongJun;Park, GunHee;Lee, Woo-Ju;Sim, Donggyu;Oh, Seoung-Jun
    • Journal of Broadcast Engineering / v.27 no.2 / pp.228-239 / 2022
  • In this paper, we propose an unmanned vehicle scratch detection deep learning model for car sharing services. Conventional scratch detection models consist of two steps: 1) a deep learning module that detects scratches in images taken before and after rental, and 2) a manual matching process for finding newly generated scratches. To build a fully automatic scratch detection model, we propose a one-step unmanned scratch detection deep learning model. The proposed model is implemented by applying transfer learning and fine-tuning to a deep learning model that detects changes in satellite images. In the proposed car sharing service, specular reflection greatly affects scratch detection performance, because the brightness of the gloss-treated automobile surface is anisotropic and a non-expert user takes the pictures with a general camera. To reduce detection errors caused by specularly reflected light, we propose a preprocessing step that removes specular reflection components. For data taken by mobile phone cameras, the proposed system provides high matching performance both subjectively and objectively. The scores for the change detection metrics precision, recall, F1, and kappa are 67.90%, 74.56%, 71.08%, and 70.18%, respectively.
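
The four change detection metrics reported above can all be derived from a binary confusion matrix. The sketch below uses the standard definitions of precision, recall, F1, and Cohen's kappa; the pixel counts are made up, and the function name `change_metrics` is hypothetical.

```python
def change_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and Cohen's kappa from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Kappa compares observed agreement with the agreement expected by chance.
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return precision, recall, f1, kappa

# Made-up counts for illustration only.
precision, recall, f1, kappa = change_metrics(tp=40, fp=10, fn=10, tn=40)
print(precision, recall, f1, kappa)

# F1 recomputed from the precision and recall quoted in the abstract.
reported_f1 = 2 * 0.6790 * 0.7456 / (0.6790 + 0.7456)
print(round(reported_f1, 4))
```

Recomputing F1 from the quoted precision and recall gives about 71.07%, which agrees with the reported 71.08% up to rounding of the inputs.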

Analysis of Research Trends in Tax Compliance using Topic Modeling (토픽모델링을 활용한 조세순응 연구 동향 분석)

  • Kang, Min-Jo;Baek, Pyoung-Gu
    • The Journal of the Korea Contents Association / v.22 no.1 / pp.99-115 / 2022
  • In this study, domestic academic journal papers on tax compliance, tax consciousness, and faithful tax payment (hereinafter referred to as "tax compliance") were comprehensively analyzed from an interdisciplinary perspective as a representative research topic in the field of tax science. To achieve the research purpose, a topic modeling technique was applied as part of text mining. Following the flow of data collection, keyword preprocessing, and topic model analysis, potential research topics were derived from the tax compliance related keywords registered by the researchers in a total of 347 papers. The results of this study can be summarized as follows. First, in the keyword analysis, keywords such as tax investigation, tax avoidance, and honest tax reporting system were among the top 5 keywords by simple term frequency, and they were also among the top 5 by TF-IDF value, which considers the relative importance of keywords. In contrast, the keyword tax evasion ranked among the top keywords by TF-IDF value but did not stand out by simple term frequency. Second, eight potential research topics were derived through topic modeling. The topics are (1) tax fairness and suppression of tax offenses, (2) the ideology of the tax law and the validity of tax policies, (3) the principle of substance over form and guarantee of tax receivables, (4) tax compliance costs and tax administration services, (5) the tax return self-assessment system and tax experts, (6) tax climate and strategic tax behavior, (7) multifaceted tax behavior and differential compliance intentions, and (8) tax information systems and tax resource management. By examining the various perspectives on tax compliance from an interdisciplinary standpoint, this research provides a comprehensive view of past research trends on tax compliance and suggests directions for future research.
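
The difference between ranking keywords by simple term frequency and by TF-IDF can be sketched as follows. The abstract does not say which TF-IDF variant was used, so this assumes the basic form tf × ln(N / df) computed over the whole corpus. The keyword lists are made up and are not meant to reproduce the paper's rankings.

```python
import math
from collections import Counter

def term_frequency(docs):
    """Total number of occurrences of each keyword across all documents."""
    return Counter(term for doc in docs for term in doc)

def tf_idf(docs):
    """Corpus-level TF-IDF: total frequency weighted by ln(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    tf = term_frequency(docs)
    return {term: tf[term] * math.log(n_docs / doc_freq[term]) for term in tf}

# Made-up author keyword lists, one list per paper.
docs = [
    ["tax investigation", "tax avoidance"],
    ["tax investigation"],
    ["tax investigation", "tax evasion"],
    ["tax avoidance"],
]
tf = term_frequency(docs)
scores = tf_idf(docs)
for term in tf:
    print(f"{term}: tf={tf[term]}, tf-idf={scores[term]:.3f}")
```

In this toy corpus the rarest keyword has the lowest raw frequency but ties for the highest TF-IDF score, which shows how a term can rank high on one measure and not the other.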

A Code Clustering Technique for Unifying Method Full Path of Reusable Cloned Code Sets of a Product Family (제품군의 재사용 가능한 클론 코드의 메소드 경로 통일을 위한 코드 클러스터링 방법)

  • Kim, Taeyoung;Lee, Jihyun;Kim, Eunmi
    • KIPS Transactions on Software and Data Engineering / v.12 no.1 / pp.1-18 / 2023
  • Similar software is often developed with the Clone-And-Own (CAO) approach, which copies and modifies existing artifacts. The CAO approach is considered bad practice because it makes maintenance difficult as the number of cloned products increases. Software product line engineering is a methodology that can solve this problem by developing a product family through systematic reuse. Migrating product families developed with the CAO approach to product line engineering begins with finding, integrating, and building them as reusable assets. However, cloning occurs at various levels, from directories to lines of code, and the structures of clones can change. This makes it difficult to build a product line code base simply by finding clones. Successful migration thus requires unifying the file paths, class names, and method signatures of the source code. This paper proposes a clustering method that identifies sets of similar code scattered across product variants whose full method paths partly differ and therefore need path unification. To show the effectiveness of the proposed method, we conducted an experiment using the Apo Games product line, which has evolved with the CAO approach. As a result, clustering performed without preprocessing showed an average precision of 0.91 and identified 0 common clusters, whereas our method showed 0.98 and 15, respectively.