• Title/Summary/Keyword: data science department

Search Results: 26,952

A study on the behavior of cosmetic customers (화장품구매 자료를 통한 고객 구매행태 분석)

  • Cho, Dae-Hyeon;Kim, Byung-Soo;Seok, Kyung-Ha;Lee, Jong-Un;Kim, Jong-Sung;Kim, Sun-Hwa
    • Journal of the Korean Data and Information Science Society, v.20 no.4, pp.615-627, 2009
  • In micro-marketing promotion, it is important to understand customer behavior. In this study we are interested in forecasting customer repurchase from purchase behavior. By analyzing cosmetic transaction data, we derive variables that play an important role in understanding customer behavior and in modeling repurchase. As modeling tools we use a decision tree, logistic regression, and a neural network model. We select the decision tree as the final model because it yields the smallest RASE (root average squared error) and the highest correct classification rate.

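The abstract above compares a decision tree, logistic regression, and a neural network on variables derived from cosmetic transaction data, then selects the model with the smallest RASE and the highest correct classification rate. The paper's code and variable names are not given, so the following is only a minimal sketch of that kind of comparison using scikit-learn, with a hypothetical data file and a hypothetical `repurchase` label column.

```python
# Illustrative sketch only: compares three classifiers on hypothetical
# transaction-derived features, as in the repurchase-modeling abstract above.
# The CSV file and column names are assumptions, not from the paper.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("cosmetic_transactions.csv")         # hypothetical file
X = df.drop(columns=["repurchase"])                   # derived behavior variables
y = df["repurchase"]                                  # 1 = repurchased, 0 = not
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "neural_network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]
    rase = np.sqrt(np.mean((y_te - proba) ** 2))      # root average squared error
    acc = accuracy_score(y_te, model.predict(X_te))   # correct classification rate
    print(f"{name}: RASE={rase:.3f}, accuracy={acc:.3f}")
```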

Reversible Data Hiding in Block Truncation Coding Compressed Images Using Quantization Level Swapping and Shifting

  • Hong, Wien;Zheng, Shuozhen;Chen, Tung-Shou;Huang, Chien-Che
    • KSII Transactions on Internet and Information Systems (TIIS), v.10 no.6, pp.2817-2834, 2016
  • Existing reversible data hiding methods for block truncation coding (BTC) compressed images often utilize difference expansion or histogram shifting techniques for data embedding. Although these methods effectively embed data into the compressed codes, the embedding operations may swap the numerical order of the higher and lower quantization levels. Since the numerical order of these two quantization levels can be exploited to carry additional data without destroying the quality of the decoded image, the existing methods cannot take advantage of this property to embed data more efficiently. In this paper, we embed data by shifting the higher and lower quantization levels in opposite directions. Because the embedding does not change the numerical order of the quantization levels, we exploit this property to carry additional data without further reducing the image quality. The proposed method performs distortion-free embedding when the payload is small and reversible data embedding for large payloads. The experimental results show that the proposed method offers better embedding performance than prior works in terms of payload and image quality.
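
The core observation in the abstract above, that the numerical order of a BTC block's two quantization levels can carry extra bits without changing the decoded pixels, can be illustrated with a toy sketch. This is not the authors' algorithm; it only shows order-based bit embedding and extraction for a single BTC block, and it omits the opposite-direction level shifting, overflow handling, and payload management.

```python
# Toy illustration of hiding one bit per BTC block in the *order* of its two
# quantization levels. Swapping the levels while inverting the bitmap leaves
# the decoded block unchanged, so the hidden bit costs no distortion.

def embed_bit(low, high, bitmap, bit):
    """Store (low, high) as-is for bit 0, or swapped (with the bitmap inverted) for bit 1."""
    if bit == 0:
        return low, high, bitmap
    inverted = [1 - b for b in bitmap]          # flip bitmap so decoding is unchanged
    return high, low, inverted

def extract_bit(a, b):
    """The hidden bit is simply the order of the stored level pair."""
    return 0 if a <= b else 1

def decode_block(a, b, bitmap):
    """Standard BTC reconstruction: bitmap bit 1 -> second level, 0 -> first level."""
    return [b if x else a for x in bitmap]

# Example: one 2x2 block with quantization levels 40 (low) and 200 (high).
low, high, bitmap = 40, 200, [0, 1, 1, 0]
for bit in (0, 1):
    a, b, bm = embed_bit(low, high, bitmap, bit)
    assert decode_block(a, b, bm) == decode_block(low, high, bitmap)  # no distortion
    assert extract_bit(a, b) == bit
```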

Reconstruction of Terrestrial Water Storage of GRACE/GFO Using Convolutional Neural Network and Climate Data

  • Jeon, Woohyu;Kim, Jae-Seung;Seo, Ki-Weon
    • Journal of the Korean earth science society, v.42 no.4, pp.445-458, 2021
  • The Gravity Recovery and Climate Experiment (GRACE) gravimeter satellites have observed the Earth's gravity field with unprecedented accuracy since 2002. After the termination of the GRACE mission, the GRACE Follow-On (GFO) satellites continued observing the global gravity field, but there is a gap of about one year between GRACE and GFO. Many previous studies estimated terrestrial water storage (TWS) changes for continuity between GRACE and GFO data using hydrological models, vertical displacements from global navigation satellite system observations, altimetry, and satellite laser ranging. Recently, various machine learning methods, such as artificial neural networks and multi-linear regression, have been developed to predict TWS changes. Previous studies used hydrological and climate data simultaneously as input to the learning process. Furthermore, they excluded linear trends from the input data and the GRACE/GFO data because the trend components obtained from GRACE/GFO data were assumed to be the same for other periods. However, hydrological models include high uncertainties, and the observational period of GRACE/GFO is not long enough to estimate reliable TWS trends. In this study, we used a convolutional neural network (CNN) method incorporating only climate data (temperature, evaporation, and precipitation) to predict TWS variations during the missing period between GRACE and GFO. We also let the CNN model learn the linear trend of the GRACE/GFO data. In most river basins considered in this study, our CNN model successfully predicts seasonal and long-term variations of TWS change.
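
As a rough illustration of the approach described above (a CNN mapping climate fields to TWS), here is a minimal PyTorch sketch. The architecture, grid size, and training step are assumptions for illustration only; the paper's actual network, preprocessing, and training setup are not reproduced here.

```python
# Minimal sketch of a CNN that maps three climate fields (temperature,
# evaporation, precipitation) on a lat-lon grid to a basin-mean TWS anomaly.
# Architecture and shapes are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn

class ClimateToTWS(nn.Module):
    def __init__(self, height=32, width=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * (height // 4) * (width // 4), 1)

    def forward(self, x):                  # x: (batch, 3, H, W) climate fields
        h = self.features(x)
        return self.head(h.flatten(1))     # (batch, 1) predicted TWS anomaly

model = ClimateToTWS()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# One illustrative training step on random stand-in data.
x = torch.randn(8, 3, 32, 32)              # monthly climate grids (fake data)
y = torch.randn(8, 1)                      # GRACE/GFO TWS targets (fake data)
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```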

Collection and Analysis of Electricity Consumption Data in POSTECH Campus (포스텍 캠퍼스의 전력 사용 데이터 수집 및 분석)

  • Ryu, Do-Hyeon;Kim, Kwang-Jae;Ko, YoungMyoung;Kim, Young-Jin;Song, Minseok
    • Journal of Korean Society for Quality Management, v.50 no.3, pp.617-634, 2022
  • Purpose: This paper introduces the Pohang University of Science and Technology (POSTECH) advanced metering infrastructure (AMI) and Open Innovation Big Data Center (OIBC) platform, along with analysis results of electricity consumption data collected via the AMI on the POSTECH campus. Methods: We installed 248 sensors in seven buildings at POSTECH for the AMI and collected electricity consumption data from the buildings. To identify the amounts and trends of electricity consumption of the seven buildings, electricity consumption data collected from March to June 2019 were analyzed. In addition, this study compared the differences between the amounts and trends of electricity consumption of the seven buildings before and after the COVID-19 outbreak, using electricity consumption data collected from March to June of 2019 and 2020. Results: Users can monitor, visualize, and download electricity consumption data collected via the AMI on the OIBC platform. The analysis results show that the seven buildings consume different amounts of electricity and have different consumption trends. In addition, consumption in most buildings was significantly reduced after the COVID-19 outbreak. Conclusion: The POSTECH AMI and OIBC platform can be a good reference for other universities preparing their own microgrids. The analysis results provide evidence that POSTECH needs to establish customized electricity-reduction strategies for each building. Such results would be useful for energy-efficient operation and for preparing for unusual energy consumption caused by unexpected situations like the COVID-19 pandemic.
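
A building-by-building comparison like the one described above (March-June electricity use in 2019 versus 2020) could be sketched with pandas as below. The CSV file and column names are hypothetical stand-ins; the OIBC platform's actual export format is not described in the abstract.

```python
# Illustrative sketch: compare March-June electricity consumption per building
# in 2019 vs. 2020. File name and columns (building, timestamp, kwh) are
# hypothetical stand-ins for data exported from the OIBC platform.
import pandas as pd

df = pd.read_csv("ami_readings.csv", parse_dates=["timestamp"])
df = df[df["timestamp"].dt.month.isin([3, 4, 5, 6])]           # March-June only
df["year"] = df["timestamp"].dt.year

totals = (
    df[df["year"].isin([2019, 2020])]
    .groupby(["building", "year"])["kwh"]
    .sum()
    .unstack("year")
)
totals["change_pct"] = 100 * (totals[2020] - totals[2019]) / totals[2019]
print(totals.sort_values("change_pct"))                         # reduction after the COVID-19 outbreak
```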

Data-driven Value-enhancing Strategies: How to Increase Firm Value Using Data Science

  • Hyoung-Goo Kang;Ga-Young Jang;Moonkyung Choi
    • Asia pacific journal of information systems, v.32 no.3, pp.477-495, 2022
  • This paper proposes how to design and implement data-driven strategies by investigating how a firm can increase its value using data science. Drawing on prior studies of architectural innovation, the behavioral theory of the firm, and the knowledge-based view of the firm, as well as an analysis of field observations, the paper shows how data science is abused in dealing with meso-level data while being underused on macro-level and alternative data for machine-human teaming and risk management. The implications help us understand why some firms are better at drawing value from intangibles such as data, data-science capabilities, and routines, and how to evaluate such capabilities.

Hierarchical Flow-Based Anomaly Detection Model for Motor Gearbox Defect Detection

  • Younghwa Lee;Il-Sik Chang;Suseong Oh;Youngjin Nam;Youngteuk Chae;Geonyoung Choi;Gooman Park
    • KSII Transactions on Internet and Information Systems (TIIS), v.17 no.6, pp.1516-1529, 2023
  • In this paper, a motor gearbox fault-detection system based on a hierarchical flow-based model is proposed. The proposed system is used for anomaly detection in a motion sound-based actuator module. The proposed flow-based model, which is a generative model, learns by directly modeling the data distribution. Because the objective is to maximize the likelihood of the input data, training is stable and the model is simple to use for anomaly detection. The operation sound of a car's side-view mirror motor is converted into a Mel-spectrogram image, consisting of a folding signal and an unfolding signal, and used as training data in this experiment. The proposed system is composed of an encoder and a decoder. Data extracted from the layers of a pretrained feature extractor in the encoder are used as input to the decoder, where this information is incorporated through an interlayer cross-scale convolution operation. The experimental results indicate that the context information of various dimensions extracted from the interlayer hierarchical data improves defect-detection accuracy. This paper is notable because it uses acoustic data and a normalizing flow model to detect outliers based on the features of the experimental data.
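
The detection pipeline sketched in the abstract (operation sound converted to a Mel-spectrogram, scored by the likelihood a flow-based model assigns to it, and flagged when the likelihood is low) could look roughly like the following. The flow model here is a placeholder with an assumed `log_prob` interface, and the librosa parameters and threshold are illustrative assumptions; the paper's hierarchical cross-scale architecture is not reproduced.

```python
# Sketch of flow-based anomaly scoring on motor operation sound.
# `flow_model` is a hypothetical pretrained normalizing flow exposing log_prob();
# librosa parameters and the decision threshold are illustrative assumptions.
import numpy as np
import librosa
import torch

def mel_spectrogram(wav_path, sr=16000, n_mels=64):
    """Load a recording and convert it to a log-Mel spectrogram 'image'."""
    y, sr = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

def anomaly_score(flow_model, spec):
    """Higher score = less likely under the model trained on normal sounds."""
    x = torch.tensor(spec, dtype=torch.float32).unsqueeze(0).unsqueeze(0)
    with torch.no_grad():
        log_likelihood = flow_model.log_prob(x)      # assumed interface
    return -log_likelihood.item()

# Usage sketch: flag a side-view mirror motor recording as defective if its
# negative log-likelihood exceeds a threshold chosen on normal recordings.
# score = anomaly_score(flow_model, mel_spectrogram("mirror_fold.wav"))
# is_defective = score > THRESHOLD
```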

Forward Error Control Coding in Multicarrier DS/CDMA Systems

  • Lee, Ju-Mi;Iickho Song;Lee, Jooshik;Park, So-Ryoung
    • Proceedings of the IEEK Conference, 2000.07a, pp.140-143, 2000
  • In this paper, forward error control coding in multicarrier direct sequence code division multiple access (DS/CDMA) systems is considered. In order to accommodate a number of coding rates easily and keep the encoder and decoder structure simple, we use a rate compatible punctured convolutional (RCPC) code. We obtain data throughputs at several coding rates and choose the coding rate with the highest data throughput in the SINR sense. To achieve maximum data throughput, a rate-adaptive system using channel state information (the SINR estimate) is proposed. The SINR estimate is obtained from the soft-decision Viterbi decoding metric. We show that the proposed rate-adaptive convolutionally coded multicarrier DS/CDMA system can enhance spectral efficiency and provide frequency diversity.

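RCPC coding, mentioned in the abstract above, derives several code rates from one mother code by deleting (puncturing) coded bits according to nested patterns, so a rate-adaptive transmitter can switch rates without changing the encoder or Viterbi decoder. Below is a toy sketch of the puncturing and depuncturing step only, with an assumed rate-1/2 mother code and example patterns; it is not the authors' system or their SINR-based rate selection.

```python
# Toy sketch of RCPC-style puncturing: one rate-1/2 mother-code output stream,
# several nested puncturing patterns giving higher rates. Patterns are examples
# only; the paper's actual RCPC family and rate-selection logic are omitted.

# Puncturing patterns over a period of 4 mother-code output pairs (1 = send bit).
PATTERNS = {
    "1/2": [1, 1, 1, 1, 1, 1, 1, 1],   # send everything
    "2/3": [1, 1, 1, 0, 1, 1, 1, 0],   # drop every 4th coded bit
    "4/5": [1, 1, 1, 0, 1, 0, 1, 0],   # drop more bits -> higher rate
}

def puncture(coded_bits, pattern):
    """Keep only the coded bits whose pattern position is 1."""
    repeated = pattern * (len(coded_bits) // len(pattern) + 1)
    return [b for b, keep in zip(coded_bits, repeated) if keep]

def depuncture(received_bits, pattern, n_coded):
    """Reinsert erasures (None) where bits were punctured, for the Viterbi decoder."""
    out, it = [], iter(received_bits)
    for i in range(n_coded):
        out.append(next(it) if pattern[i % len(pattern)] else None)
    return out

coded = [1, 0, 1, 1, 0, 0, 1, 0]                  # 8 mother-code bits (example)
sent = puncture(coded, PATTERNS["2/3"])           # 6 bits actually transmitted
restored = depuncture(sent, PATTERNS["2/3"], len(coded))
print(sent, restored)
```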

T1 Map-Based Radiomics for Prediction of Left Ventricular Reverse Remodeling in Patients With Nonischemic Dilated Cardiomyopathy

  • Suyon Chang;Kyunghwa Han;Yonghan Kwon;Lina Kim;Seunghyun Hwang;Hwiyoung Kim;Byoung Wook Choi
    • Korean Journal of Radiology, v.24 no.5, pp.395-405, 2023
  • Objective: This study aimed to develop and validate models using radiomics features from native T1 maps obtained with cardiac magnetic resonance (CMR) to predict left ventricular reverse remodeling (LVRR) in patients with nonischemic dilated cardiomyopathy (NIDCM). Materials and Methods: Data from 274 patients with NIDCM who underwent CMR imaging with T1 mapping at Severance Hospital between April 2012 and December 2018 were retrospectively reviewed. Radiomic features were extracted from the native T1 maps. LVRR was determined using echocardiography performed ≥ 180 days after the CMR. The radiomics score was generated using a least absolute shrinkage and selection operator (LASSO) logistic regression model. Clinical, clinical + late gadolinium enhancement (LGE), clinical + radiomics, and clinical + LGE + radiomics models were built using logistic regression to predict LVRR. For internal validation, bootstrap validation with 1000 resampling iterations was performed, and the optimism-corrected area under the receiver operating characteristic curve (AUC) with 95% confidence interval (CI) was computed. Model performance was compared using AUC with the DeLong test and bootstrapping. Results: Among the 274 patients, 123 (44.9%) were classified as LVRR-positive and 151 (55.1%) as LVRR-negative. The optimism-corrected AUC of the radiomics model in internal validation with bootstrapping was 0.753 (95% CI, 0.698-0.813). The clinical + radiomics model showed a higher optimism-corrected AUC than the clinical + LGE model (0.794 vs. 0.716; difference, 0.078 [99% CI, 0.003-0.151]). The clinical + LGE + radiomics model significantly improved the prediction of LVRR compared with the clinical + LGE model (optimism-corrected AUC of 0.811 vs. 0.716; difference, 0.095 [99% CI, 0.022-0.139]). Conclusion: Radiomic characteristics extracted from a non-enhanced T1 map may improve the prediction of LVRR and offer added value over traditional LGE in patients with NIDCM. Additional external validation research is required.
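
The modeling pipeline described above (a LASSO-penalized logistic regression to build a radiomics score, then logistic models combining that score with clinical variables, evaluated with bootstrap AUC) could be sketched roughly as follows with scikit-learn. The data frame, feature names, and hyperparameters are hypothetical, and the simple bootstrap below is not the study's optimism-corrected procedure.

```python
# Illustrative sketch of a LASSO-based radiomics score and a combined
# clinical + radiomics logistic model with a bootstrap AUC estimate.
# File, column names, and C value are hypothetical stand-ins.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

df = pd.read_csv("t1_radiomics.csv")                  # hypothetical file
radiomic_cols = [c for c in df.columns if c.startswith("t1_")]
clinical_cols = ["age", "lvef", "nyha_class"]         # hypothetical clinical variables
y = df["lvrr"].values                                 # 1 = reverse remodeling

# LASSO (L1-penalized) logistic regression to build a radiomics score.
lasso = make_pipeline(StandardScaler(),
                      LogisticRegression(penalty="l1", solver="liblinear", C=0.1))
lasso.fit(df[radiomic_cols], y)
df["radiomics_score"] = lasso.predict_proba(df[radiomic_cols])[:, 1]

# Clinical + radiomics model, with a plain bootstrap AUC (optimism not corrected here).
X = df[clinical_cols + ["radiomics_score"]].values
clf = LogisticRegression(max_iter=1000).fit(X, y)
aucs, rng = [], np.random.default_rng(0)
for _ in range(1000):
    idx = rng.integers(0, len(y), len(y))             # resample with replacement
    if len(np.unique(y[idx])) < 2:
        continue
    aucs.append(roc_auc_score(y[idx], clf.predict_proba(X[idx])[:, 1]))
print(f"bootstrap AUC: {np.mean(aucs):.3f} "
      f"(2.5-97.5%: {np.percentile(aucs, 2.5):.3f}-{np.percentile(aucs, 97.5):.3f})")
```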

Privacy Level Indicating Data Leakage Prevention System

  • Kim, Jinhyung;Park, Choonsik;Hwang, Jun;Kim, Hyung-Jong
    • KSII Transactions on Internet and Information Systems (TIIS), v.7 no.3, pp.558-575, 2013
  • The purpose of a data leakage prevention system is to protect corporate information assets. The system monitors the packet exchanges between internal systems and the Internet, filters packets according to the data security policy defined by each company, or selectively deletes important data included in packets in order to prevent leakage of corporate information. However, a problem arises in that the system may monitor employees' personal information, thus allowing their privacy to be violated. Therefore, it is necessary to find not only a solution for detecting leakage of significant information, but also a way to minimize the leakage of internal users' personal information. In this paper, we propose two models for representing the level of personal information disclosure during data leakage detection. The first model measures only the disclosure frequencies of keywords that are defined as personal data; these frequencies are used to indicate the privacy violation level. The second model represents the context of a privacy violation using a private data matrix. Each row of the matrix holds the disclosure counts for the personal data keywords in a given time period, and each column holds the disclosure counts of a certain keyword over the entire observation interval. Using the suggested matrix model, we can represent an abstracted context of the privacy violation situation. Experiments on privacy violation situations are also presented to demonstrate the usability of the suggested models.
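
The private data matrix described above (rows as time periods, columns as personal data keywords, cells as disclosure counts) can be sketched with pandas. The keyword list, time window, and traffic-log format are hypothetical illustrations of the idea, not the system's implementation.

```python
# Sketch of the disclosure-count matrix idea: rows = time periods, columns =
# personal data keywords, cells = how often each keyword was disclosed in that
# period. Log format, keyword list, and window size are hypothetical.
import pandas as pd

PERSONAL_KEYWORDS = ["name", "phone", "address", "resident_id"]   # assumed list

# Hypothetical outbound-traffic log: one row per detected keyword disclosure.
log = pd.DataFrame({
    "timestamp": pd.to_datetime(["2013-03-01 09:05", "2013-03-01 09:40",
                                 "2013-03-01 10:10", "2013-03-01 10:15"]),
    "keyword":   ["phone", "phone", "name", "resident_id"],
})

# Count disclosures of each keyword per one-hour period.
matrix = (
    log.groupby([pd.Grouper(key="timestamp", freq="1h"), "keyword"])
       .size()
       .unstack("keyword", fill_value=0)
       .reindex(columns=PERSONAL_KEYWORDS, fill_value=0)
)
print(matrix)            # abstracted context of the privacy-violation situation
print(matrix.sum())      # per-keyword totals over the whole observation interval
```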