• Title/Summary/Keyword: Data Profiling


Customer Behavior Based Customer Profiling Technique for Personalized Products Recommendation (개인화된 제품 추천을 위한 고객 행동 기반 고객 프로파일링 기법)

  • Park, You-Jin;Jung, Eau-Jin;Chang, Kun-Nyeong
    • Korean Management Science Review / v.23 no.3 / pp.183-194 / 2006
  • In this paper, we propose a customer profiling technique based on customer behavior for personalized product recommendation in Internet shopping malls. The proposed technique defines a customer profile model based on customer behavior information such as click data, buying data, market basket data, and interest categories. We also implement CBCPT (customer behavior based customer profiling technique) and perform extensive experiments. The experimental results show that CBCPT achieves better MAE, precision, recall, and F1 than existing customer profiling techniques.
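The four evaluation measures named in the abstract above can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' evaluation code; the function names and the sample inputs are hypothetical. Note that MAE is an error measure, so lower values are better, while higher precision, recall, and F1 are better.

```python
def mae(predicted, actual):
    """Mean absolute error between predicted and actual scores (lower is better)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def precision_recall_f1(recommended, relevant):
    """Set-based precision, recall, and F1 for one customer's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # recommended items the customer actually wanted
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical data: predicted vs. actual preference scores, and item ids.
error = mae([4, 3, 5], [5, 3, 3])
precision, recall, f1 = precision_recall_f1(["a", "b", "c", "d"], ["b", "d", "e"])
```

Here two of the four recommended items are relevant, so precision is 2/4, and two of the three relevant items were recommended, so recall is 2/3.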

Analysis of Structured and Unstructured Data and Construction of Criminal Profiling System using LSA (LSA를 이용한 정형·비정형데이터 분석과 범죄 프로파일링 시스템 구현)

  • Kim, Yonghoon;Chung, Mokdong
    • Journal of Korea Multimedia Society / v.20 no.1 / pp.66-73 / 2017
  • With recent rapid social change and the widespread adoption of information devices, diverse digital information is used in a variety of economic and social analyses. Crime statistics broken down by type of crime have been used as a major factor in crime analysis. However, statistical analysis that relies only on structured data hampers investigations because it gives investigators and users limited information. In this paper, structured and unstructured data are analyzed by applying Korean Natural Language Processing (Ko-NLP) and the Latent Semantic Analysis (LSA) technique. The approach is intended to provide an optimized crime profile that can be applied to a crime profiling system or to statistical analysis.

Development of Data Profiling Software Supporting a Microservice Architecture (마이크로 서비스 아키텍처를 지원하는 데이터 프로파일링 소프트웨어의 개발)

  • Chang, Jae-Young;Kim, Jihoon;Jee, Seowoo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.5 / pp.127-134 / 2021
  • Recently, acquiring high-quality data has become an important issue with the expansion of the big data industry. To acquire high-quality data, data quality must first be evaluated accurately. Data quality can be evaluated through meta-information such as statistics on the data, and the task of extracting such meta-information is called data profiling. Until now, data profiling software has typically been provided as a component or an additional service of traditional data quality or visualization tools, so it has not been suitable for direct use in various environments. To address this problem, this paper presents the development of data profiling software based on a microservice architecture that can be deployed in various environments. The presented data profiler provides an easy-to-use interface through which requests for meta-information are served via a RESTful API. In addition, the proposed data profiler is independent of any specific environment and can therefore be integrated efficiently with various big data platforms or data analysis tools.
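As a rough illustration of what "extracting meta-information such as statistics on data" can mean, the sketch below profiles each column of a small in-memory table. It is not the software described in the abstract above and leaves out the RESTful service layer entirely; the function names, the choice of statistics, and the sample rows are all assumptions made for this example.

```python
from statistics import mean


def profile_column(values):
    """Return simple meta-information for one column; None marks a missing value."""
    present = [v for v in values if v is not None]
    profile = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    # Add numeric statistics only when every present value is a number.
    numeric = [v for v in present
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if present and len(numeric) == len(present):
        profile.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
    return profile


def profile_table(rows):
    """Profile every column of a table given as a list of dicts."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample table with two missing cells.
rows = [
    {"id": 1, "city": "Seoul", "age": 31},
    {"id": 2, "city": None, "age": 45},
    {"id": 3, "city": "Seoul", "age": None},
]
report = profile_table(rows)
```

In a service-oriented design like the one the abstract describes, a dictionary such as `report` is the kind of payload that could be returned for a meta-information request, although the actual response format is not given in the source.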

A Design and Implementation A Software Profiling Tool based on XML for Embedded System (내장형 시스템 소프트웨어를 위한 XML 기반의 프로파일링 도구의 설계 및 구현)

  • Kwak, Dong-Gyu;Yoo, Chae-Woo
    • Journal of Internet Computing and Services / v.11 no.1 / pp.143-151 / 2010
  • As requirements for embedded systems increase, embedded software has become more complicated than before, and it is difficult for developers to produce optimized software for embedded systems. This paper proposes a software profiling tool with which a developer can easily profile embedded system software in cross-development environments. The proposed tool is designed around a host/target architecture. The tool inserts source code that generates a profiling log into the target program. The target program runs on the target system, and the target system sends the profiling log to the host system. On the host system, the tool analyzes the profiling log data and produces an XML representation of the profiling log along with a profiling report. The profiling report is presented in a GUI-based graphical viewer. The target system requires little computing power under this tool, and XSLT can convert the profile log XML into other data formats. Because the proposed tool is implemented as an Eclipse plug-in, developers can use it within Eclipse.

Application of metabolic profiling for biomarker discovery

  • Hwang, Geum-Sook
    • Proceedings of the Korean Society of Applied Pharmacology / 2007.11a / pp.19-27 / 2007
  • An important potential of the metabolomics-based approach is the possibility of developing fingerprints of diseases or of cellular responses to classes of compounds with a known common biological effect. Such fingerprints have the potential to allow classification of disease states or compounds, to provide mechanistic information on cellular perturbations and pathways, and to identify biomarkers specific for disease severity and drug efficacy. Metabolic profiles of biological fluids contain a vast array of endogenous metabolites. Changes in those profiles resulting from perturbations of the system can be observed using analytical techniques such as NMR and MS. $^1$H NMR was used to generate a molecular fingerprint of serum or urine samples, and a pattern recognition technique was then applied to identify molecular signatures associated with specific diseases or drug efficiency. Several metabolites that differentiate disease samples from controls were thoroughly characterized by NMR spectroscopy. We investigated the metabolic changes in normal and clinical human samples using $^1$H NMR. Targeted profiling and spectral binning were applied to the spectral data, and multivariate statistical data analysis (MVDA) was then used to examine in detail the modulation of small-molecule candidate biomarkers. We show that targeted profiling produces robust models, generates accurate metabolite concentration data, and provides data that can help explain metabolic differences between healthy and diseased populations. Such metabolic signatures could provide diagnostic markers for a disease state or biomarkers for drug response phenotypes.


Virtual Machine Code Optimization using Profiling Data (프로파일링 데이터를 이용한 가상기계 코드 최적화)

  • Shin, Yang-Hoon;Yi, Chang-Hwan;Oh, Se-Man
    • The KIPS Transactions:PartA / v.14A no.3 s.107 / pp.167-172 / 2007
  • A VM (Virtual Machine) can be regarded as a software processor that interprets machine code. It can also be regarded as a conceptual computer with a logical system configuration. However, a VM system executes much more slowly than a real processor, so it is very important to optimize virtual machine code to reduce execution time. In particular, an optimizer for virtual machine code on embedded devices needs a better ratio of optimization gain to cost than an ordinary optimizer. Fundamentally, high efficiency may be obtained by finding the functions and basic blocks that dominate the execution time of the virtual machine and then optimizing them. In this paper, we design and implement an optimizer for virtual (or abstract) machine code (VMC) using profiling. First, we define the profiling information needed to optimize VMC; this information can be obtained by dynamically executing the machine code. We then implement a VMC optimizer that uses the profiling information. In our implementation, the VMC is SIL (Standard Intermediate Language), the intermediate code of the EVM (Embedded Virtual Machine). We also ran a benchmark test of the VMC optimizer and obtained reasonable results.
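The idea of using execution profiles to find the basic blocks worth optimizing can be sketched as below. This is only an illustration of the general profile-guided selection step, not the optimizer from the abstract above: the trace format, the `hot_blocks` function, the block names, and the 90% coverage threshold are all assumptions.

```python
from collections import Counter


def hot_blocks(executed_block_ids, coverage=0.9):
    """Return the most frequently executed blocks that together account for
    at least `coverage` of all block executions, hottest first."""
    counts = Counter(executed_block_ids)
    total = sum(counts.values())
    selected, covered = [], 0
    for block, n in counts.most_common():
        if covered >= coverage * total:
            break  # the blocks chosen so far already reach the target share
        selected.append(block)
        covered += n
    return selected


# Hypothetical dynamic trace: one basic-block id per executed block.
trace = ["B1"] * 3 + ["B2"] * 80 + ["B3"] * 15 + ["B4"] * 2
candidates = hot_blocks(trace)
```

With this trace, `B2` and `B3` cover 95 of the 100 executions, so an optimizer with a limited budget would concentrate on those two blocks and leave `B1` and `B4` untouched.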

Enhancing GPU Performance by Efficient Hardware-Based and Hybrid L1 Data Cache Bypassing

  • Huangfu, Yijie;Zhang, Wei
    • Journal of Computing Science and Engineering / v.11 no.2 / pp.69-77 / 2017
  • Recent GPUs have adopted cache memory to benefit general-purpose GPU (GPGPU) programs. However, unlike CPU programs, GPGPU programs typically have considerably less temporal/spatial locality. Moreover, the L1 data cache is used by many threads that access a data size typically considerably larger than the L1 cache, making it critical to bypass L1 data cache intelligently to enhance GPU cache performance. In this paper, we examine GPU cache access behavior and propose a simple hardware-based GPU cache bypassing method that can be applied to GPU applications without recompiling programs. Moreover, we introduce a hybrid method that integrates static profiling information and hardware-based bypassing to further enhance performance. Our experimental results reveal that hardware-based cache bypassing can boost performance for most benchmarks, and the hybrid method can achieve performance comparable to state-of-the-art compiler-based bypassing with considerably less profiling cost.

A Wide-Window Superscalar Microprocessor Profiling Performance Model Using Multiple Branch Prediction (대형 윈도우에서 다중 분기 예측법을 이용하는 수퍼스칼라 프로세서의 프로화일링 성능 모델)

  • Lee, Jong-Bok
    • The Transactions of The Korean Institute of Electrical Engineers / v.58 no.7 / pp.1443-1449 / 2009
  • This paper presents a profiling model of a wide-window superscalar microprocessor using multiple branch prediction. The key idea is to apply a statistical profiling technique to a superscalar microprocessor with a wide instruction window and a multiple branch predictor. The statistical profiling data are used to obtain a synthetic instruction trace, and the consecutive multiple branch prediction rates are used to run trace-driven simulation on the synthesized instruction trace. We describe our design and evaluate it with the SPEC 2000 integer benchmarks. Our performance model achieves an accuracy of 8.5% on average.

Deep Learning Based User Safety Profiling Using User Feature Information Modeling (딥러닝 기반 사용자 특징 정보 모델링을 통한 사용자 안전 프로파일링)

  • Kim, Kye-Kyung
    • Journal of Software Assessment and Valuation / v.17 no.2 / pp.143-150 / 2021
  • There is a need for artificial intelligence technology that can reduce various types of safety accidents by analyzing the risk factors that cause them at industrial sites. In this paper, we propose user safety profiling methods that can prevent safety accidents in advance by specifying and modeling user information data related to safety accidents. User information data are classified into normal and abnormal conditions through deep learning based artificial intelligence analysis. When the user safety profiling technology was verified using more than 10 types of industrial field data, a user safety profiling accuracy of 93.6% was obtained.

A Study on Empirical Model for the Prevention and Protection of Technology Leakage through SME Profiling Analysis (중소기업 프로파일링 분석을 통한 기술유출 방지 및 보호 모형 연구)

  • Yoo, In-Jin;Park, Do-Hyung
    • The Journal of Information Systems / v.27 no.1 / pp.171-191 / 2018
  • Purpose: Corporate technology leakage not only causes monetary loss but also damages the corporate image and undermines sustainable growth. In particular, because SMEs depend more heavily on core technologies than large corporations do, losses from technology leakage threaten their survival. It is therefore important for SMEs to prevent and protect against technology leakage. With the recent development of data analysis technology and the opening of public data, it has become possible to discover and proactively detect companies with a high probability of technology leakage based on actual company data. In this study, we construct profiles of enterprises with and without technology leakage experience through profiling analysis using data mining techniques. Based on these profiles, we propose a classification model that identifies companies that are likely to leak technology.
    Design/methodology/approach: This study develops an empirical model for the prevention and protection of technology leakage through a profiling method that analyzes each SME individually. Based on previous research, we classified the many characteristics of SMEs into six categories and identified the factors that influence technology leakage from the enterprise point of view. Specifically, we divided 29 SME characteristics into the following six categories: 'firm characteristics', 'organizational characteristics', 'technical characteristics', 'relational characteristics', 'financial characteristics', and 'enterprise core competencies'. Each characteristic was extracted from the questionnaire data of the 'Survey of Small and Medium Enterprises Technology' carried out annually by the Government of the Republic of Korea. Because the number of SMEs with technology leakage experience in the questionnaire data was significantly smaller than the number without, we matched the samples 1:1 through mixed sampling. We profiled companies with and without technology leakage experience by applying a decision-tree technique to the research data and derived meaningful variables that distinguish the two groups. An empirical model for the prevention and protection of technology leakage was then developed through discriminant analysis and logistic regression analysis.
    Findings: Profiling analysis shows that technology novelty, enterprise technology group, number of intellectual property registrations, product life cycle, technology development infrastructure level (absence of a dedicated organization), enterprise core competency (design), and enterprise core competency (process design) help identify technology leakage in SMEs. We developed two empirical models for the prevention and protection of technology leakage in SMEs using discriminant analysis and logistic regression analysis, with hit ratios of 65% (discriminant analysis) and 67% (logistic regression analysis).