• Title/Abstract/Keywords: data science department

Micro marketing using a cosmetic transaction data

  • 석경하;조대현;김병수;이종언;백승훈;전유중;이영배;김재길
    • Journal of the Korean Data and Information Science Society / Vol. 21, No. 3 / pp.535-546 / 2010
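  • Ways of using customer information include mileage programs based on purchase amounts and tiering customers by purchase frequency. This study focuses on whether a customer repurchases, which bears directly on company sales, and builds a repurchase prediction model by logistic regression on customer and purchase information. The Heidke score was used to evaluate the prediction model, and the cutoff was chosen as the value that maximizes the Heidke score. A repurchase index derived from the prediction model was used to grade customers, enabling more efficient customer management.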
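
The cutoff selection described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 0.01-step candidate grid, and the `p >= cutoff` decision rule are assumptions, and the predicted probabilities are taken as given rather than fitted by logistic regression. The score uses the standard 2x2 Heidke skill score formula.

```python
def heidke_score(tp, fp, fn, tn):
    """Heidke skill score from a 2x2 table of predicted vs. actual repurchase."""
    denom = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    if denom == 0:
        return 0.0  # degenerate table (for example, every case is a true negative)
    return 2.0 * (tp * tn - fp * fn) / denom


def best_cutoff(probs, labels, candidates=None):
    """Return (cutoff, score) for the candidate cutoff with the highest Heidke score.

    probs  : predicted repurchase probabilities (assumed to come from a fitted model)
    labels : observed outcomes, 1 = repurchased, 0 = did not
    """
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]  # hypothetical grid
    best_t, best_s = None, float("-inf")
    for t in candidates:
        tp = fp = fn = tn = 0
        for p, y in zip(probs, labels):
            predicted = 1 if p >= t else 0
            if predicted == 1 and y == 1:
                tp += 1
            elif predicted == 1 and y == 0:
                fp += 1
            elif predicted == 0 and y == 1:
                fn += 1
            else:
                tn += 1
        s = heidke_score(tp, fp, fn, tn)
        if s > best_s:  # ties keep the smallest cutoff
            best_t, best_s = t, s
    return best_t, best_s
```

Calling `best_cutoff(probs, labels)` on held-out data returns the cutoff that would then be used to turn predicted probabilities into a repurchase decision.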

A Scalable Data Integrity Mechanism Based on Provable Data Possession and JARs

  • Zafar, Faheem;Khan, Abid;Ahmed, Mansoor;Khan, Majid Iqbal;Jabeen, Farhana;Hamid, Zara;Ahmed, Naveed;Bashir, Faisal
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 10, No. 6 / pp.2851-2873 / 2016
  • Cloud storage as a service provides high scalability and availability according to user needs, without large investment in infrastructure. However, data security risks concerning the confidentiality, privacy, and integrity of outsourced data are associated with the cloud-computing model. Over the years, techniques such as remote data checking (RDC), data integrity protection (DIP), provable data possession (PDP), proof of storage (POS), and proof of retrievability (POR) have been devised to frequently and securely check the integrity of outsourced data. In this paper, we improve the efficiency of the PDP scheme in terms of computation, storage, and communication cost for large data archives. By utilizing the capabilities of JAR and ZIP technology, the cost of searching the metadata in the proof generation process is reduced from O(n) to O(1). Moreover, direct access to metadata reduces disk I/O cost, resulting in 50 to 60 times faster proof generation for large datasets. Furthermore, our proposed scheme achieves a 50% reduction in the storage size of data and the respective metadata, which provides storage and communication efficiency.
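
The idea of reaching per-block metadata directly by name, instead of scanning for it, can be illustrated with Python's standard `zipfile` module (a JAR file is a ZIP archive). This is only a sketch under stated assumptions: the entry names `data/<i>` and `meta/<i>` are hypothetical, and a SHA-256 digest stands in for a real PDP tag, which the paper's scheme would compute differently.

```python
import hashlib
import io
import zipfile


def build_archive(blocks):
    """Pack data blocks and one metadata entry per block into an in-memory ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for i, block in enumerate(blocks):
            zf.writestr(f"data/{i}", block)
            # Stand-in tag (assumption): a plain digest, not a real PDP tag.
            zf.writestr(f"meta/{i}", hashlib.sha256(block).hexdigest())
    return buf.getvalue()


def read_tag(archive_bytes, index):
    """Fetch the metadata entry for one block by name, without reading other entries."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.read(f"meta/{index}").decode("ascii")
```

The archive's central directory maps each entry name to its location, so `read_tag` decompresses only the requested entry. Compression of the entries is also where a storage saving of the kind the abstract mentions could come from, although the figure reported there is the paper's own result.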

Spatial Cluster Analysis for Earthquake on the Korean Peninsula

  • Kang, Chang-Wan;Moon, Sung-Ho;Cho, Jang-Sik;Lee, Jeong-Hyeong;Choi, Seung-Bae;Beum, Soo-Gyun
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 4 / pp.1141-1150 / 2006
  • In this study, we performed a spatial cluster analysis that takes spatial information into account, using data on earthquakes that occurred on the Korean peninsula from 1978 to 2005. We also examined how regions are clustered by earthquake magnitude and frequency, based on the spatial scan statistic. On the basis of the results, we constructed an earthquake map by earthquake outbreak risk and gave a possible explanation for the results of the spatial cluster analysis.
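
For readers unfamiliar with the spatial scan statistic, the sketch below shows its core step for count data: each candidate zone is scored by a Poisson log-likelihood ratio, and the highest-scoring zone is the most likely cluster. It assumes event counts per region with expected counts scaled to sum to the observed total, and a hand-supplied list of candidate zones. It does not model magnitude and omits the Monte Carlo significance test, so it should not be read as the paper's procedure.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for an excess of events inside a zone (Poisson model).

    expected_in is the zone's expected count under the null hypothesis, scaled so
    that the expected counts over the whole study area sum to total_cases.
    """
    if expected_in <= 0:
        raise ValueError("expected_in must be positive")
    c, e, n = cases_in, expected_in, total_cases
    if c <= e:
        return 0.0  # no excess inside the zone
    llr = c * math.log(c / e)
    if n > c:
        llr += (n - c) * math.log((n - c) / (n - e))
    return llr


def most_likely_cluster(cases, expected, zones):
    """Return (zone, llr) for the candidate zone with the largest LLR.

    zones is an iterable of tuples of region indices (hypothetical candidates,
    e.g. circles of growing radius around each region centroid).
    """
    total = sum(cases)
    best_zone, best_llr = None, 0.0
    for zone in zones:
        c = sum(cases[i] for i in zone)
        e = sum(expected[i] for i in zone)
        llr = poisson_llr(c, e, total)
        if llr > best_llr:
            best_zone, best_llr = zone, llr
    return best_zone, best_llr
```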

Building a computing infrastructure in the era of data science

  • 최숙희;한경수;왕철
    • The Korean Journal of Applied Statistics (응용통계연구) / Vol. 37, No. 1 / pp.49-59 / 2024
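  • The popularity of data science, which began in the United States around 2010, is having a large influence on education in many statistics departments at Korean universities. However, few studies in Korean journals address how to build and use a computing environment for teaching data science efficiently. This paper discusses the problems of building and using a computing infrastructure suited to education and research in Korean statistics and data science departments, and proposes solutions.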

On the Aggregation of Multi-dimensional Data using Data Cube and MDX

  • Ahn, Jeong-Yong;Kim, Seok-Ki
    • Journal of the Korean Data and Information Science Society / Vol. 14, No. 1 / pp.37-44 / 2003
  • One of the characteristics of both on-line analytical processing (OLAP) applications and decision support systems is that they provide aggregated source data. The purpose of this study is to discuss the aggregation of multi-dimensional data. In this paper, we (1) examine the SQL aggregate functions and the GROUP BY operator, (2) introduce the Data Cube and MDX, and (3) present an example of the practical usage of the Data Cube and MDX using sample data.
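
The relationship between GROUP BY and the Data Cube can be sketched in a few lines: a cube is the union of the GROUP BY results over every subset of the grouping dimensions, including the empty subset (the grand total). The example below is an illustration only, with a hypothetical function name, SUM as the aggregate, and rows given as dictionaries; it is not the paper's SQL or MDX example.

```python
from collections import defaultdict
from itertools import combinations


def data_cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims`, as a CUBE-style aggregation would.

    Returns {group_of_dims: {key_tuple: total}}; the empty group holds the grand total.
    """
    cube = {}
    for size in range(len(dims) + 1):
        for group in combinations(dims, size):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in group)  # one GROUP BY key
                totals[key] += row[measure]
            cube[group] = dict(totals)
    return cube
```

For two dimensions such as `("region", "year")`, the result holds four groupings: the grand total, totals by region, totals by year, and totals by region and year.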

Obesity Level Prediction Based on Data Mining Techniques

  • Alqahtani, Asma;Albuainin, Fatima;Alrayes, Rana;Al muhanna, Noura;Alyahyan, Eyman;Aldahasi, Ezaz
    • International Journal of Computer Science & Network Security / Vol. 21, No. 3 / pp.103-111 / 2021
  • Obesity affects individuals of all genders and ages worldwide; consequently, several studies have worked to identify the factors causing it. This study develops an effective method to predict obesity levels based on supervised data mining techniques such as Random Forest and Multi-Layer Perceptron (MLP), so as to tackle this universal epidemic. The dataset came from countries such as Mexico, Peru, and Colombia, covering the 14-61 year age group with varying eating habits and physical conditions. The data include 2111 instances and 17 attributes labelled using NObesity, which facilitates categorization of the data into Insufficient Weight, Normal Weight, Overweight Levels I and II, and Obesity Types I to III. The study found that the Random Forest algorithm achieved the highest accuracy in comparison to the MLP algorithm, with an overall classification rate of 96.7%.

Reliability Estimation in Bivariate Pareto Model with Bivariate Type I Censored Data

  • Cho, Jang-Sik;Cho, Kil-Ho;Kang, Sang-Gil
    • Journal of the Korean Data and Information Science Society / Vol. 14, No. 4 / pp.837-844 / 2003
  • In this paper, we obtain estimators of system reliability for the bivariate Pareto model with bivariate type I censored data. We obtain estimators and approximate confidence intervals of the reliability of the parallel system, based on the likelihood function and on the relative frequency, respectively. We also present a numerical example using a computer-generated data set.
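
A relative-frequency estimate of parallel-system reliability can be sketched as below. A parallel system of two components works as long as at least one component works, so its reliability at time t is P(max(X, Y) > t). The sketch assumes that t lies below both fixed censoring times, so that censoring never hides whether a component survived past t, and it uses a normal-approximation (Wald) interval as one possible reading of "approximate confidence interval". The paper's own estimators may differ.

```python
import math


def parallel_reliability(pairs, t, z=1.96):
    """Relative-frequency estimate of P(max(X, Y) > t) with a Wald interval.

    pairs : observed (x, y) lifetimes; values censored at fixed times are fine
            as long as both censoring times exceed t (assumption).
    z     : normal quantile for the interval (1.96 gives roughly 95%).
    """
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one observation")
    survived = sum(1 for x, y in pairs if max(x, y) > t)
    r = survived / n
    half_width = z * math.sqrt(r * (1.0 - r) / n)
    return r, (max(0.0, r - half_width), min(1.0, r + half_width))
```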

Large Sample Test for Independence in the Bivariate Pareto Model with Censored Data

  • Cho, Jang-Sik;Lee, Jea-Man;Lee, Woo-Dong
    • Journal of the Korean Data and Information Science Society / Vol. 14, No. 2 / pp.377-383 / 2003
  • In this paper, we consider a two-component system in which the lifetimes follow the bivariate Pareto model with randomly censored data. We assume that the censoring time is independent of the lifetimes of the two components. We develop large sample tests for independence between the two components. We also present a simulation study of the test, which is based on the asymptotic normal distribution.

Test for Independence in Bivariate Pareto Model with Bivariate Random Censored Data

  • Cho, Jang-Sik;Kwon, Yong-Man;Choi, Seung-Bae
    • Journal of the Korean Data and Information Science Society / Vol. 15, No. 1 / pp.31-39 / 2004
  • In this paper, we consider a two-component system in which the lifetimes follow the bivariate Pareto model with bivariate randomly censored data. We assume that the censoring times are independent of the lifetimes of the two components. We develop a large sample test for independence between the two components. We also present a simulation study of the test, which is based on the asymptotic normal distribution.
