• Title/Summary/Keyword: Pre-Processing

Principal component analysis in C[11]-PIB imaging (주성분분석을 이용한 C[11]-PIB imaging 영상분석)

  • Kim, Nambeom;Shin, Gwi Soon;Ahn, Sung Min
    • The Korean Journal of Nuclear Medicine Technology / v.19 no.1 / pp.12-16 / 2015
  • Purpose: Principal component analysis (PCA) is a multivariate analysis technique often used in neuroimage analysis to describe the correlation structure of high-dimensional data in a lower-dimensional space. PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of correlated variables into a set of values of linearly uncorrelated variables called principal components. To investigate the usefulness of PCA in brain PET image analysis, we analyzed C[11]-PIB PET images as a representative case. Materials and Methods: Nineteen subjects were included in this study (normal = 9, AD/MCI = 10). C[11]-PIB PET scans were acquired for 20 min, starting 40 min after intravenous injection of 9.6 MBq/kg C[11]-PIB. All emission recordings were acquired with the Biograph 6 Hi-Rez (Siemens-CTI, Knoxville, TN) in three-dimensional acquisition mode. The transmission map for attenuation correction was acquired using CT scans (130 kVp, 240 mA). Standardized uptake values (SUVs) of C[11]-PIB were calculated from the PET/CT data. In normal subjects, 3T MRI T1-weighted images were obtained to create a C[11]-PIB template. Spatial normalization and smoothing were performed with SPM8 as pre-processing for PCA, and PCA was conducted using Matlab2012b. Results: Through PCA, we obtained linearly uncorrelated principal component images. These images simplify the variation across whole C[11]-PIB images into several principal components, including the variation of the neocortex and white matter and the variation of deep brain structures such as the pons. Conclusion: PCA is useful for analyzing and extracting the main patterns of C[11]-PIB images. As a multivariate analysis method, PCA might also be useful for pattern recognition in other neuroimages such as FDG-PET or fMRI.

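To make the PCA step in the abstract above concrete, here is a minimal sketch of extracting the first principal component from a subjects-by-voxels matrix. It is not the authors' SPM8/Matlab pipeline: the data are hypothetical (6 subjects, 4 voxels), the function names `covariance_matrix` and `first_principal_component` are made up for illustration, and power iteration is used only because it keeps the example free of external libraries.

```python
import math

def covariance_matrix(rows):
    """Sample covariance of the columns (voxels) across the rows (subjects)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(cov, iters=200):
    """Power iteration: unit eigenvector and eigenvalue of the largest eigenvalue."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue

# Hypothetical data: voxels 0 and 1 vary together (a "cortex" pattern),
# voxel 2 is constant, voxel 3 varies slightly on its own (a "pons" pattern).
cortex = [2.0, 1.0, 0.0, -1.0, -2.0, 0.0]
pons = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
images = [[1.0 + c, 1.0 + c, 1.0, 1.0 + p] for c, p in zip(cortex, pons)]

cov = covariance_matrix(images)
pc1, var1 = first_principal_component(cov)
explained = var1 / sum(cov[i][i] for i in range(len(cov)))  # share of total variance
```

In this toy case the first component loads equally on the two co-varying voxels and explains almost all of the variance, which is the sense in which PCA summarizes image variation with a few patterns.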

Conditional Generative Adversarial Network based Collaborative Filtering Recommendation System (Conditional Generative Adversarial Network(CGAN) 기반 협업 필터링 추천 시스템)

  • Kang, Soyi;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.157-173 / 2021
  • With the development of information technology, the amount of available information increases daily. However, having access to so much information makes it difficult for users to find the information they seek. Users want a visualized system that reduces information retrieval and learning time, saving them from personally reading and judging all available information. As a result, recommendation systems are an increasingly important technology for business. Collaborative filtering is used in various fields with excellent performance because recommendations are made based on similar user interests and preferences. However, it has limitations. Sparsity, which occurs when user-item preference information is insufficient, is the main limitation of collaborative filtering. The rating values in the user-item matrix may be distorted by the popularity of a product, or there may be new users who have not yet rated any items. The lack of historical data to identify consumer preferences is referred to as data sparsity, and various methods have been studied to address this problem. However, most attempts to solve the sparsity problem are not optimal because they can only be applied when additional data such as users' personal information, social networks, or characteristics of items are included. Another problem is that real-world rating data are mostly biased toward high scores, resulting in severe imbalance. One cause of this imbalanced distribution is purchasing bias: users who expect to rate a product highly are the ones who purchase it, while those who would rate it low are less likely to purchase it and thus do not leave negative reviews. As a result, reviews by users who purchase products are more likely to be positive than the actual preferences of users overall.
Therefore, models trained on actual rating data overlearn the high-frequency classes because of this bias, which distorts the results. Applying collaborative filtering to such imbalanced data leads to poor recommendation performance due to excessive learning of the biased classes. Traditional oversampling techniques that address this problem are likely to cause overfitting because they repeat the same data, which acts as noise in learning and reduces recommendation performance. In addition, most existing pre-processing methods for data imbalance are designed for binary classes. Binary class imbalance techniques are difficult to apply to multi-class problems because they cannot model multi-class situations such as objects at cross-class boundaries or objects overlapping multiple classes. To solve this problem, research has been conducted on converting multi-class problems into binary class problems. However, this simplification can cause classification errors when the results of classifiers learned from different sub-problems are combined, resulting in the loss of important information about relationships beyond the selected items. Therefore, more effective methods are needed to address multi-class imbalance problems. We propose a collaborative filtering model that uses CGAN to generate realistic virtual data to populate the empty user-item matrix. The conditional vector y identifies the distributions of minority classes so that the generated data reflect their characteristics. Collaborative filtering is then applied, with hyperparameter tuning to maximize the performance of the recommendation system. This process should improve the accuracy of the model by addressing the sparsity problem of collaborative filtering while mitigating the data imbalance found in real data. Our model shows better recommendation performance than existing oversampling techniques and than the original sparse real-world data.
SMOTE, Borderline SMOTE, SVM-SMOTE, ADASYN, and GAN were used as comparison models, and our model achieved the highest prediction accuracy in terms of RMSE and MAE. This study suggests that deep learning-based oversampling can further improve the performance of recommendation systems on real data and can be used to build business recommendation systems.
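The abstract above relies on three quantities: the sparsity of the user-item matrix and the RMSE and MAE used for evaluation. A small sketch of how they could be computed follows. The rating matrix is hypothetical, `None` is an assumed marker for a missing rating, and the CGAN itself is not reproduced here.

```python
import math

def sparsity(matrix):
    """Fraction of user-item cells that have no rating (None)."""
    cells = [v for row in matrix for v in row]
    return sum(v is None for v in cells) / len(cells)

def rmse(actual, predicted):
    """Root mean squared error between observed and predicted ratings."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error between observed and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical 3 users x 4 items matrix; most cells are empty and the
# observed ratings lean toward high scores, as the abstract describes.
ratings = [
    [5, None, 4, None],
    [None, 5, None, None],
    [4, None, None, 1],
]

empty_share = sparsity(ratings)          # 7 of the 12 cells are unrated
held_out_actual = [4, 5, 3]              # hypothetical held-out ratings
held_out_predicted = [3, 5, 5]           # hypothetical model predictions
error_rmse = rmse(held_out_actual, held_out_predicted)
error_mae = mae(held_out_actual, held_out_predicted)
```

RMSE penalizes large misses more heavily than MAE, which is one reason studies of this kind tend to report both.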

An Outlier Detection Using Autoencoder for Ocean Observation Data (해양 이상 자료 탐지를 위한 오토인코더 활용 기법 최적화 연구)

  • Kim, Hyeon-Jae;Kim, Dong-Hoon;Lim, Chaewook;Shin, Yongtak;Lee, Sang-Chul;Choi, Youngjin;Woo, Seung-Buhm
    • Journal of Korean Society of Coastal and Ocean Engineers / v.33 no.6 / pp.265-274 / 2021
  • Outlier detection in ocean data has traditionally been performed using statistical and distance-based machine learning algorithms. Recently, AI-based methods have received a lot of attention, and supervised learning methods, which require classification information for the data, are mainly used. Supervised learning requires considerable time and cost because classification information (labels) must be manually assigned to all data used for training. In this study, an autoencoder based on unsupervised learning was applied as an outlier detection method to overcome this problem. Two experiments were designed: univariate learning, in which only sea surface temperature (SST) data from the Deokjeok Island observations were used, and multivariate learning, in which SST, air temperature, wind direction, wind speed, air pressure, and humidity were used. The data cover 25 years, from 1996 to 2020, and pre-processing that considers the characteristics of ocean data was applied. We tried to detect outliers in real SST data using the trained univariate and multivariate autoencoders. To compare model performance, various outlier detection methods were applied to synthetic data with artificially inserted errors. A quantitative evaluation showed accuracies of about 96% for the multivariate model and 91% for the univariate model, indicating that the multivariate autoencoder had better outlier detection performance. Outlier detection using an unsupervised autoencoder is expected to be useful in various applications because it can reduce subjective classification errors and the cost and time required for data labeling.
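The idea of flagging outliers by reconstruction error can be sketched in a few lines. The example below is an illustration under strong assumptions, not the authors' model: it swaps the neural autoencoder for a tied-weight linear autoencoder with a one-unit bottleneck, uses a synthetic two-variable series (SST and air temperature) with one inserted error, and applies a mean plus three standard deviations threshold that the abstract does not specify.

```python
import math
import statistics

def standardize(values):
    """Scale a series to zero mean and unit (population) standard deviation."""
    mean, std = statistics.mean(values), statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def train_linear_autoencoder(samples, lr=0.05, epochs=300):
    """Tied weights, 1-unit bottleneck: code = w . x, reconstruction = w * code.
    Trained by gradient descent on the mean squared reconstruction error."""
    dim = len(samples[0])
    w = [0.5] + [0.1] * (dim - 1)  # arbitrary non-zero starting weights
    for _ in range(epochs):
        grad = [0.0] * dim
        for x in samples:
            code = sum(wi * xi for wi, xi in zip(w, x))
            resid = [xi - wi * code for wi, xi in zip(w, x)]
            w_dot_r = sum(wi * ri for wi, ri in zip(w, resid))
            for k in range(dim):
                grad[k] += -2.0 * (resid[k] * code + w_dot_r * x[k])
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad)]
    return w

def reconstruction_errors(samples, w):
    """Squared reconstruction error of each sample."""
    errors = []
    for x in samples:
        code = sum(wi * xi for wi, xi in zip(w, x))
        errors.append(sum((xi - wi * code) ** 2 for wi, xi in zip(w, x)))
    return errors

# Synthetic seasonal series (hypothetical values, not Deokjeok Island data).
n = 120
sst = [15.0 + 10.0 * math.sin(2 * math.pi * i / n) for i in range(n)]
air = [14.0 + 12.0 * math.sin(2 * math.pi * i / n) + 0.5 * math.cos(7 * i)
       for i in range(n)]
sst[60] += 9.0  # artificially inserted error

samples = list(zip(standardize(sst), standardize(air)))
weights = train_linear_autoencoder(samples)
errors = reconstruction_errors(samples, weights)
threshold = statistics.mean(errors) + 3 * statistics.pstdev(errors)
flagged = [i for i, e in enumerate(errors) if e > threshold]
```

Because the model learns the usual relationship between the two variables, the sample where SST departs from air temperature reconstructs poorly and stands out. This also hints at why a multivariate model can outperform a univariate one: the other variables give context for what a plausible SST value is.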

Current status and future of insect smart factory farm using ICT technology (ICT기술을 활용한 곤충스마트팩토리팜의 현황과 미래)

  • Seok, Young-Seek
    • Food Science and Industry / v.55 no.2 / pp.188-202 / 2022
  • In the insect industry, as the scope of application of insects expands from pet insects and natural enemies to feed, edible, and medicinal insects, the demand for quality control of insect raw materials and the interest in securing the safety of insect products are increasing. As the industry scales up, controlling the temperature, humidity, and air quality in the insect breeding room and preventing the spread of pathogens and other pollutants are important success factors, and they require an environment controlled by an operating system. European commercial insect breeding facilities have attracted considerable investor interest, and insect companies are building large-scale production facilities, which became possible after the EU approved the use of insect protein as feed for fish farming in July 2017. Other fields, such as food and medicine, have also accelerated the application of cutting-edge technology. In the future, the global insect industry is expected to be increasingly subdivided into systems that purchase eggs or small larvae from suppliers and focus on larval fattening (that is, raw material production) until the insects mature, systems that handle the entire production process from egg laying through harvesting and initial pre-treatment of larvae, and large-scale production systems that cover all stages of larvae production as well as further processing steps such as milling, fat removal, and protein or fat fractionation. In Korea, research and development of insect smart factory farms using artificial intelligence and ICT is accelerating, so insects can be used as carbon-free materials in secondary industries such as natural plastics or natural molding materials as well as in existing feed and food. A Korean-style customized breeding system for shortening the breeding period or enhancing functionality is expected to be developed soon.

Comparative Analysis of the Keywords in Taekwondo News Articles by Year: Applying Topic Modeling Method (태권도 뉴스기사의 연도별 주제어 비교분석: 토픽모델링 적용)

  • Jeon, Minsoo;Lim, Hyosung
    • Journal of Digital Convergence / v.19 no.11 / pp.575-583 / 2021
  • This study analyzes Taekwondo trends in news articles by period by applying topic modeling. To examine Taekwondo trends through media reports, news articles and articles from Taekwondo-specialized media were collected through Big Kinds of the Korea Press Foundation. The search period was divided into three sections: before 2000, 2001~2010, and 2011~2020. A total of 12,124 items were selected as research data. For topic analysis, pre-processing was performed, and topics were extracted using the LDA algorithm. Python 3 was used for all analyses. First, in the analysis of keywords in media articles by period, 'World' was the most common keyword before 2000, followed by 'South and North Korea' and 'Olympic'. From 2001 to 2010, 'World' was the most common keyword, followed by 'Association' and 'World Taekwondo'. From 2011 to 2020, 'World', 'Demonstration', and 'Kukkiwon' were the most common keywords, in that order. Second, topic modeling of news articles before 2000 yielded two topics: Topic 1 was labeled 'South-North Korea sports exchange' and Topic 2 'Adoption of Olympic demonstration events'. Third, topic modeling of news articles from 2001 to 2010 yielded three topics: Topic 1 was labeled 'Taekwondo Demonstration Performance and Corruption', Topic 2 'Muju Taekwondo Park Creation', and Topic 3 'World Taekwondo Festival'. Fourth, topic modeling of news articles from 2011 to 2020 also yielded three topics: Topic 1 was labeled 'Successful Hosting of the 2018 Pyeongchang Winter Olympics', Topic 2 'North-South Korea Taekwondo Joint Demonstration Performance', and Topic 3 '2017 Muju World Taekwondo Championships'.
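The pre-processing and per-period keyword counting described above can be sketched with the Python standard library alone. Everything here is illustrative: the headlines are invented English stand-ins for the Korean articles, the stop-word list is a toy one, and placing the year 2000 in the first bucket is an assumption, since the abstract leaves that boundary open. The LDA step itself is not shown.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "of", "in", "and", "to", "at", "for", "on", "from"}

def period_of(year):
    """Map a publication year to one of the three study periods."""
    if year <= 2000:
        return "to 2000"
    if year <= 2010:
        return "2001-2010"
    return "2011-2020"

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def top_keywords(articles, n=3):
    """Return the n most frequent keywords for each period."""
    counts = {}
    for year, text in articles:
        counts.setdefault(period_of(year), Counter()).update(tokenize(text))
    return {period: [word for word, _ in counter.most_common(n)]
            for period, counter in counts.items()}

# Hypothetical headlines, not taken from the Big Kinds corpus.
articles = [
    (1996, "World championships bring the world to Seoul"),
    (1999, "South and North Korea plan world taekwondo exchange"),
    (2004, "World taekwondo association elects new president"),
    (2009, "Association reforms follow world taekwondo festival, world body says"),
    (2015, "World tour: demonstration team from Kukkiwon shows the world"),
    (2018, "Joint demonstration on world stage at Kukkiwon"),
]

top = top_keywords(articles)
```

A real pipeline for Korean text would also need morphological analysis before counting, which this sketch leaves out.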

An Implementation of OTB Extension to Produce TOA and TOC Reflectance of LANDSAT-8 OLI Images and Its Product Verification Using RadCalNet RVUS Data (Landsat-8 OLI 영상정보의 대기 및 지표반사도 산출을 위한 OTB Extension 구현과 RadCalNet RVUS 자료를 이용한 성과검증)

  • Kim, Kwangseob;Lee, Kiwon
    • Korean Journal of Remote Sensing / v.37 no.3 / pp.449-461 / 2021
  • Analysis Ready Data (ARD) for optical satellite images are pre-processed products to which the spectral characteristics and viewing parameters of each sensor have been applied. Atmospheric correction is a fundamental and complicated step that produces Top-of-Atmosphere (TOA) and Top-of-Canopy (TOC) reflectance from multi-spectral image sets. Most remote sensing software provides algorithms or processing schemes dedicated to these corrections for the Landsat-8 OLI sensor. Furthermore, Google Earth Engine (GEE) provides direct access to Landsat reflectance products, the USGS-based ARD (USGS-ARD), in a cloud environment. We implemented an atmospheric correction extension for the Orfeo ToolBox (OTB), an open-source remote sensing software package for manipulating and analyzing high-resolution satellite images. This is the first such tool, because OTB had not previously provided calibration modules for any Landsat sensor. Using this extension, we performed absolute atmospheric correction on Landsat-8 OLI images of Railroad Valley, United States (RVUS) and validated the resulting reflectance products against the RVUS reflectance data sets in the RadCalNet portal. The reflectance products from the OTB extension differed from the RadCalNet RVUS data by less than 5%. In addition, we performed a comparative analysis with USGS-ARD products and with reflectance products obtained from other open-source tools, namely the QGIS Semi-Automatic Classification Plugin and SAGA. Within the measurement range of the RadCalNet RVUS data, the products of the OTB extension were more consistent with USGS-ARD than those of the other two open-source tools, and the agreement was within an acceptable level. This study verified the atmospheric calibration processor in the OTB extension and showed that it could be applied to other satellite sensors, such as those of the Compact Advanced Satellite (CAS)-500 or new optical satellites.
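For readers unfamiliar with the TOA step, the sketch below shows the standard USGS conversion from Landsat-8 OLI digital numbers to TOA reflectance with a sun-angle correction, plus the relative difference used when comparing against a reference such as RadCalNet. It is not the OTB extension's code. The default rescaling coefficients are the values commonly found in Level-1 metadata (`REFLECTANCE_MULT_BAND_x`, `REFLECTANCE_ADD_BAND_x`) and should be read from each scene's MTL file in practice; the function names and sample numbers are hypothetical. TOC reflectance needs a full atmospheric model and is not covered.

```python
import math

def toa_reflectance(dn, sun_elevation_deg, mult=2.0e-5, add=-0.1):
    """TOA reflectance = (M * Qcal + A) / sin(sun elevation).

    dn: quantized pixel value (Qcal); mult/add: band rescaling factors
    (assumed defaults; take the real ones from the scene metadata).
    """
    return (mult * dn + add) / math.sin(math.radians(sun_elevation_deg))

def relative_difference_pct(estimate, reference):
    """Absolute difference from a reference value, as a percentage of it."""
    return abs(estimate - reference) / reference * 100.0

# Hypothetical pixel and reference values for illustration only.
rho = toa_reflectance(dn=20000, sun_elevation_deg=30.0)
diff = relative_difference_pct(0.31, 0.30)
within_5_percent = diff < 5.0
```

The division by the sine of the sun elevation normalizes for illumination geometry, so the same surface yields comparable reflectance at different sun angles.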

Preliminary Inspection Prediction Model to select the on-Site Inspected Foreign Food Facility using Multiple Correspondence Analysis (차원축소를 활용한 해외제조업체 대상 사전점검 예측 모형에 관한 연구)

  • Hae Jin Park;Jae Suk Choi;Sang Goo Cho
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.121-142 / 2023
  • As the number and weight of imported foods steadily increase, safety management of imported food to prevent food safety accidents is becoming more important. The Ministry of Food and Drug Safety conducts on-site inspections of foreign food facilities before customs clearance as well as import inspections at the customs clearance stage. However, because time, cost, and resources are limited, a data-based safety management plan for imported food is needed. In this study, we aimed to increase the efficiency of on-site inspection by building a machine learning prediction model that pre-selects the facilities expected to fail before the on-site inspection takes place. We used basic information on 303,272 foreign food facilities and processing businesses from the Integrated Food Safety Information Network and 1,689 on-site inspection records collected from 2019 to April 2022. After preprocessing the foreign food facility data, only the records subject to on-site inspection were extracted using the foreign food facility_code. The resulting dataset consisted of 1,689 records and 103 variables. Of the 103 variables, those with a Theil-U index of '0' were removed, and the remainder were reduced by Multiple Correspondence Analysis, which finally yielded 49 characteristic variables. We built eight different models, tuned their hyperparameters through 5-fold cross-validation, and evaluated the performance of the resulting models. Because the purpose of selecting facilities for on-site inspection is to catch nonconforming ones, the goal is to maximize recall, which is the probability of judging nonconforming companies as nonconforming. Among the machine learning algorithms applied, the Random Forest model, which had the highest Recall_macro, AUROC, Average PR, F1-score, and Balanced Accuracy, was evaluated as the best model.
Finally, we apply Kernel SHAP (SHapley Additive exPlanations) to explain why individual facilities are selected as nonconforming, and discuss its applicability to the on-site inspection facility selection system. The results of this study are expected to contribute to the efficient use of limited resources such as manpower and budget by establishing an imported food management system built on a data-based scientific risk management model.
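Since the study selects its model largely on recall, a short sketch of per-class and macro-averaged recall may help. The labels and predictions are hypothetical, the function names are made up for this example, and the macro average shown (the unweighted mean of per-class recall) is the usual definition rather than something the abstract spells out.

```python
def recall_per_class(y_true, y_pred):
    """Recall for each class: correctly predicted members / all true members."""
    result = {}
    for label in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        result[label] = hits / total
    return result

def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so rare classes count equally."""
    per_class = recall_per_class(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)

# Hypothetical inspection outcomes: "fail" is the rare nonconforming class.
y_true = ["fail", "fail", "pass", "pass", "pass", "pass"]
y_pred = ["fail", "pass", "pass", "pass", "pass", "fail"]

per_class = recall_per_class(y_true, y_pred)
overall = macro_recall(y_true, y_pred)
```

Here plain accuracy would be 4 out of 6, yet only half of the nonconforming facilities are caught, which is the gap that recall-focused evaluation is meant to expose.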

Development of deep learning structure for complex microbial incubator applying deep learning prediction result information (딥러닝 예측 결과 정보를 적용하는 복합 미생물 배양기를 위한 딥러닝 구조 개발)

  • Hong-Jik Kim;Won-Bog Lee;Seung-Ho Lee
    • Journal of IKEEE / v.27 no.1 / pp.116-121 / 2023
  • In this paper, we develop a deep learning structure for a complex microbial incubator that applies deep learning prediction result information. The proposed structure consists of pre-processing of complex microbial data, conversion of the complex microbial data structure, design of a deep learning network, training of the designed network, and development of a GUI applied to the prototype. In the pre-processing step, one-hot encoding is applied to the amounts of molasses, nutrients, plant extract, salt, etc. required for microbial culture, and min-max normalization is applied to the pH concentration and the number of microbial cells measured as results of the culture. In the data structure conversion step, the preprocessed data are converted into a graph structure by connecting the water temperature and the number of microbial cells, and then expressed as an adjacency matrix and attribute information to be used as input data for the deep learning network. In the network design step, a graph convolutional network specialized for graph structures is designed to learn the complex microbial data. The designed network uses a cosine loss function so that training proceeds in the direction that minimizes the error. The GUI applied to the prototype shows the target pH concentration (3.8 or less) and the number of cells (10^8 or more) of complex microorganisms in an order suitable for culturing according to the water temperature selected by the user. To evaluate the performance of the proposed microbial incubator, experiments were conducted by authorized testing institutes; the results showed an average pH of 3.7 and a complex microorganism cell count of 1.7 × 10^8.
Therefore, the effectiveness of the deep learning structure for the complex microbial incubator applying the deep learning prediction result information proposed in this paper was proven.
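The three basic operations named in this abstract (one-hot encoding, min-max normalization, and a cosine loss) can be sketched as follows. The category levels and pH values are hypothetical, the function names are made up for this example, and the cosine loss is written as one minus cosine similarity, which is a common form but an assumption about the paper's exact definition. The graph conversion and the graph convolutional network are not reproduced.

```python
import math

def one_hot(value, categories):
    """Encode a categorical level as a 0/1 vector over the known categories."""
    return [1.0 if value == c else 0.0 for c in categories]

def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]

def cosine_loss(pred, target):
    """1 - cosine similarity: 0 when the vectors point the same way."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / norm

# Hypothetical inputs: a molasses dosage level and measured pH values.
molasses_levels = ["low", "medium", "high"]
encoded = one_hot("medium", molasses_levels)
scaled_ph = min_max([3.5, 3.7, 4.5])
loss_same = cosine_loss([1.0, 0.0], [1.0, 0.0])
loss_orthogonal = cosine_loss([1.0, 0.0], [0.0, 1.0])
```

Treating ingredient amounts as discrete levels is what makes one-hot encoding applicable, while continuous measurements such as pH are better served by rescaling.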

Effect of Accelerated Storage on the Microstructure and Water Absorption Characteristics of Korean Adzuki Bean (Vigna angularis L.) Cultivar (팥의 가속화 저장에 따른 미세구조 및 수분흡수 특성)

  • Jieun Kwak;Seon-Min Oh;You-Geun Oh;Yu-Chan Choi;Hyun-Jin Park;Suk-Bo Song;Jeong-Heui Lee;Jeom-Sig Lee
    • KOREAN JOURNAL OF CROP SCIENCE / v.68 no.3 / pp.167-174 / 2023
  • This study investigated the microstructure and water absorption characteristics of the Korean adzuki bean (Vigna angularis L.) cultivar under accelerated storage. The germination rate, acid value, redness (a*), and yellowness (b*) showed no significant differences from pre-storage values after three months of storage at low temperature (4℃). However, statistically significant differences were observed under accelerated high-temperature storage (45℃). In particular, after three months of storage at the accelerated high temperature, the germination rate and acid value were 0% and 33.63 mg KOH/100 g, respectively. After three months of storage, holes, hilum damage, and spaces between the seed coat and cotyledon shortened the water absorption time and increased the absorption speed under the accelerated high temperature compared with the low temperature. However, further research is required to investigate the reason for the low rate of parallel water absorption.

The Impacts of Need for Cognitive Closure, Psychological Wellbeing, and Social Factors on Impulse Purchasing (인지폐합수요(认知闭合需要), 심리건강화사회인소대충동구매적영향(心理健康和社会因素对冲动购买的影响))

  • Lee, Myong-Han;Schellhase, Ralf;Koo, Dong-Mo;Lee, Mi-Jeong
    • Journal of Global Scholars of Marketing Science / v.19 no.4 / pp.44-56 / 2009
  • Impulse purchasing is defined as an immediate purchase with no pre-shopping intentions. Previous studies of impulse buying have focused primarily on factors linked to marketing mix variables, situational factors, and consumer demographics and traits. In previous studies, marketing mix variables such as product category, product type, and atmospheric factors including advertising, coupons, sales events, promotional stimuli at the point of sale, and media format have been used to evaluate product information. Some authors have also focused on situational factors surrounding the consumer. Factors such as the availability of credit card usage, time available, transportability of the products, and the presence and number of shopping companions were found to have a positive impact on impulse buying and/or impulse tendency. Research has also been conducted to evaluate the effects of individual characteristics such as the age, gender, and educational level of the consumer, as well as perceived crowding, stimulation, and the need for touch, on impulse purchasing. In summary, previous studies have found that all products can be purchased impulsively (Vohs and Faber, 2007), that situational factors affect and/or at least facilitate impulse purchasing behavior, and that various individual traits are closely linked to impulse buying. The recent introduction of new distribution channels such as home shopping channels, discount stores, and Internet stores that are open 24 hours a day increases the probability of impulse purchasing. However, previous literature has focused predominantly on situational and marketing variables and thus studies that consider critical consumer characteristics are still lacking. To fill this gap in the literature, the present study builds on this third tradition of research and focuses on individual trait variables, which have rarely been studied. 
More specifically, the current study investigates whether impulse buying tendency has a positive impact on impulse buying behavior, and evaluates how consumer characteristics such as the need for cognitive closure (NFCC), psychological wellbeing, and susceptibility to interpersonal influences affect the tendency of consumers towards impulse buying. The survey results reveal that while consumer affective impulsivity has a strong positive impact on impulse buying behavior, cognitive impulsivity has no impact on impulse buying behavior. Furthermore, affective impulse buying tendency is driven by sub-components of NFCC such as decisiveness and discomfort with ambiguity, psychological wellbeing constructs such as environmental control and purpose in life, and by normative and informational influences. In addition, cognitive impulse tendency is driven by sub-components of NFCC such as decisiveness, discomfort with ambiguity, and close-mindedness, and the psychological wellbeing constructs of environmental control, as well as normative and informational influences. The present study has significant theoretical implications. First, affective impulsivity has a strong impact on impulse purchase behavior. Previous studies based on affectivity and flow theories proposed that low to moderate levels of impulsivity are driven by reduced self-control or a failure of self-regulatory mechanisms. The present study confirms the above proposition. Second, the present study also contributes to the literature by confirming that impulse buying tendency can be viewed as a two-dimensional concept with both affective and cognitive dimensions, and illustrates that impulse purchase behavior is explained mainly by affective impulsivity, not by cognitive impulsivity. Third, the current study accommodates new constructs such as psychological wellbeing and NFCC as potential influencing factors in the research model, thereby contributing to the existing literature. 
Fourth, by incorporating multi-dimensional concepts such as psychological wellbeing and NFCC, more diverse aspects of consumer information processing can be evaluated. Fifth, the current study also extends the existing literature by confirming the two competing routes of normative and informational influences. Normative influence occurs when individuals conform to the expectations of others or seek to enhance their self-image, whereas informational influence occurs when individuals search for information from knowledgeable others or make inferences based on observations of the behavior of others. The present study shows that these two competing routes of social influence can be attributed to different sources of influence power. The current study also has many practical implications. First, it suggests that people with affective impulsivity may be primary targets to whom companies should pay closer attention. Cultivating a more amenable and mood-elevating shopping environment will appeal to this segment. Second, the present results demonstrate that NFCC is closely related to the cognitive dimension of impulsivity. These people are driven by careless thoughts, not by feelings or excitement. Rational advertising at the point of purchase will attract these customers. Third, people susceptible to normative influences are another potential target market. Retailers and manufacturers could appeal to this segment by advertising their products and/or services as products that can be used to identify with or conform to the expectations of others in the aspiration group. However, retailers should avoid targeting people susceptible to informational influences as a segment market. These people are engaged in an extensive information search relevant to their purchase, and therefore more elaborate, long-term rational advertising messages, which can be internalized into these consumers' thought processes, will appeal to this segment.
The current findings should be interpreted with caution for several reasons. The study used a small convenience sample, and only investigated behavior in two dimensions. Accordingly, future studies should incorporate a sample with more diverse characteristics and measure different aspects of behavior. Future studies should also investigate personality traits closely related to affectivity theories. Trait variables such as sensory curiosity, interpersonal curiosity, and atmospheric responsiveness are interesting areas for future investigation.