• Title/Summary/Keyword: Research Data Utility

A Novel Approach for Mining High-Utility Sequential Patterns in Sequence Databases

  • Ahmed, Chowdhury Farhan;Tanbeer, Syed Khairuzzaman;Jeong, Byeong-Soo
    • ETRI Journal / v.32 no.5 / pp.676-686 / 2010
  • Mining sequential patterns is an important research issue in data mining and knowledge discovery, with broad applications. However, existing sequential pattern mining approaches consider only binary frequency values of items in sequences and assign equal importance/significance values to distinct items. Therefore, they cannot adequately represent many real-world scenarios. In this paper, we propose a novel framework for mining high-utility sequential patterns that extracts information more applicable to real life from sequence databases with non-binary frequency values of items in sequences and different importance/significance values for distinct items. Moreover, for mining high-utility sequential patterns, we propose two new algorithms: UtilityLevel, which uses a level-wise candidate generation approach, and UtilitySpan, which uses a pattern growth approach. Extensive performance analyses show that our algorithms are very efficient and scalable for mining high-utility sequential patterns.

A Distributed Privacy-Utility Tradeoff Method Using Distributed Lossy Source Coding with Side Information

  • Gu, Yonghao;Wang, Yongfei;Yang, Zhen;Gao, Yimu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.5 / pp.2778-2791 / 2017
  • In the age of big data, distributed data providers need to ensure privacy, while data analysts need to mine the value of the data. Finding the privacy-utility tradeoff has therefore become a research hotspot. In addition, the adversary may have background knowledge of the data source, so it is important to solve the privacy-utility tradeoff problem in a distributed environment with side information. This paper proposes a distributed privacy-utility tradeoff method using distributed lossy source coding with side information, and quantitatively gives the privacy-utility tradeoff region and the Rate-Distortion-Leakage region. The simulation analysis shows four results. First, both the source rate and the privacy leakage decrease as the source distortion increases. Second, the finer the relevance between the public data and the private data of the source, the finer the perturbation of the source needed to obtain the same privacy protection. Third, the greater the variance of the data source, the smaller the distortion chosen to ensure more data utility. Fourth, under the same privacy restriction, the smaller the variance of the side information, the less distortion of the data source is chosen to ensure more data utility. Finally, the proposed method is compared with current ones in five aspects to show its advantages.

A Differential Privacy Approach to Preserve GWAS Data Sharing based on A Game Theoretic Perspective

  • Yan, Jun;Han, Ziwei;Zhou, Yihui;Lu, Laifeng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.3 / pp.1028-1046 / 2022
  • Genome-wide association studies (GWAS) aim to find the significant genetic variants for common complex diseases. However, genotype data contain private information such as disease status and identity, which makes data sharing and research difficult. Differential privacy is widely used to protect privacy in data sharing. The current differential privacy approach in GWAS focuses on statistical data rather than raw data and does not achieve an equilibrium between utility and privacy, which hinders data sharing and hampers the development of genomics. To share data more securely, we propose a differential-privacy-preserving approach to data sharing for GWAS that achieves an equilibrium between privacy and data utility. First, a reasonable disturbance interval for the genotype is calculated based on the expected utility. Second, based on the interval, we obtain the Nash equilibrium point between utility and privacy. Finally, based on the equilibrium point, the original genotype matrix is perturbed with differential privacy, and the corresponding random genotype matrix is obtained. We show theoretically and experimentally that the method satisfies the expected privacy protection and utility. This method provides engineering guidance for protecting GWAS data privacy.
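
The abstract does not say which mechanism perturbs the genotype matrix, so the following is only a generic illustration of perturbing raw genotypes under differential privacy, not the authors' method. It is a minimal sketch of k-ary randomized response, which satisfies epsilon-local differential privacy for a single value. The genotype coding (0, 1, 2 minor-allele counts), the function names, and the epsilon value are all assumptions made for the example; the disturbance interval and Nash equilibrium steps described above are not modeled.

```python
import math
import random

# Assumed coding: number of minor alleles at one SNP.
GENOTYPES = (0, 1, 2)


def keep_probability(epsilon, k=len(GENOTYPES)):
    """Probability of reporting the true value under k-ary randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def perturb(genotypes, epsilon, rng):
    """Keep each genotype with probability p; otherwise report one of the
    other values, chosen uniformly at random."""
    p_keep = keep_probability(epsilon)
    noisy = []
    for g in genotypes:
        if rng.random() < p_keep:
            noisy.append(g)
        else:
            noisy.append(rng.choice([v for v in GENOTYPES if v != g]))
    return noisy


# Hypothetical row of a genotype matrix and a hypothetical privacy budget.
row = [0, 1, 2, 1, 0]
noisy_row = perturb(row, epsilon=1.0, rng=random.Random(0))
```

A smaller epsilon lowers the keep probability, giving more privacy and less utility; this is the tradeoff for which the paper seeks an equilibrium point.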

Mining High Utility Sequential Patterns Using Sequence Utility Lists (시퀀스 유틸리티 리스트를 사용하여 높은 유틸리티 순차 패턴 탐사 기법)

  • Park, Jong Soo
    • KIPS Transactions on Software and Data Engineering / v.7 no.2 / pp.51-62 / 2018
  • High utility sequential pattern (HUSP) mining is considered an important research topic in data mining. Although some algorithms have been proposed for this topic, they suffer from producing a large search space for HUSPs. A tighter utility upper bound of a sequence can prune more unpromising patterns early in the search space. In this paper, we propose a sequence expected utility (SEU) as a new utility upper bound of each sequence, which is the maximum expected utility of a sequence and all its descendant sequences. A sequence utility list for each pattern is used as a new data structure to maintain essential information for mining HUSPs. We devise an algorithm, high sequence utility list-span (HSUL-Span), to identify HUSPs by employing SEU. Experimental results on both synthetic and real datasets from different domains show that HSUL-Span generates considerably fewer candidate patterns and outperforms other algorithms in terms of execution time.

Investigating Utility, Attitude, Intention, and Satisfaction of Skill-Sharing Economy

  • La, Soo-Jung;Cho, Yooncheong
    • The Journal of Industrial Distribution & Business / v.10 no.1 / pp.39-49 / 2019
  • Purpose - Previous studies examined the effects of the sharing economy in fields such as the accommodation and automobile sectors, while research on the skill-sharing economy is lacking. By classifying skill-sharing into general and special skill-sharing, this study explored the effects of variables such as transaction utility, social utility, sustainability utility, emotional utility, economic utility, and trust utility on the attitudes, intention, satisfaction, and loyalty of the demand (i.e., customers) and supply (i.e., providers) sides, and of potential and actual customers. Research design, data, and methodology - Data were collected via both online and offline surveys. This study applied factor analysis and multiple regression analysis. Results - Results show that utilities for general suppliers' skill-sharing are more significant than in other cases. Among the utilities, this study found that trust utility is significant in the cases of special customers', general suppliers', and special suppliers' potential skill-sharing. The results imply that trust is crucial in sharing-economy transactions. Conclusions - Enhanced managerial systems help resolve issues in the sharing economy. This study provides implications regarding the positive effects of the skill-sharing economy and recommends the proper establishment of the sharing economy.

Data reconciliation and optimization of utility plants for energy saving

  • Lee, Moo-Ho;Kim, Jeong-Hwan;Chonghun Han;Chang, Kun-Soo;Kim, Seong-Hwan;You, Sang-Hyun
    • Proceedings of the Korea Society for Energy Engineering kosee Conference / 1997.10a / pp.17-23 / 1997
  • A methodology for on-line data reconciliation and optimization is proposed to minimize the energy cost of a utility system. As industrial data tend to be corrupted by noise or gross errors, a fast and robust data reconciliation technique is essential for the on-line optimization of a utility system. Thus, we propose a hierarchical decomposition approach that is applicable to on-line data reconciliation and optimization. As this approach divides the whole system into several subsystems and systematically removes the nonlinearity of the constraints, it handles system complexity easily and shows good performance in accuracy and computation speed. Through case studies, we show that this methodology is a good candidate for on-line data reconciliation and optimization.

A Cost-Utility Analysis of Home Care Services by using the QALY (QALY를 이용한 가정간호서비스의 비용효용분석)

  • 임지영
    • Journal of Korean Academy of Nursing / v.34 no.3 / pp.449-457 / 2004
  • Purpose: The aim of this study was to analyze the economic efficiency of home care service by comparing the cost-utility ratio (CUR) of home care with that of hospitalization. Method: The analytic framework of this study was constructed in 5 stages: identifying the analytic perspectives, measurement of costs, measurement of utility, analysis of CUR, and sensitivity test. Data were collected by reviewing medical records, home care service records, medical fee claims, and other related research. Result: The mean annual total cost was 23,317,636 Won in home care and 73,739,352 Won in hospital care. QALY was 0.389 in home care and 0.474 in hospital care, so CUR was 299,712,545 Won per QALY in home care and 777,841,266 Won per QALY in hospital care. Conclusion: The findings affirmed that home care was economically efficient in terms of utility compared with hospitalization. Therefore, the findings of this study can be used to develop governmental health policy or to expand the home care system. In addition, the cost-utility analysis framework and process of this study can serve as an example model and a guideline for future cost-utility analysis in nursing research.
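
A cost-utility ratio is conventionally the total cost divided by the quality-adjusted life years (QALYs) gained, so a lower ratio means a lower cost per QALY. The sketch below illustrates only that arithmetic. The function name and all input figures are hypothetical; they are not the study's data, and the abstract does not state the study's costing period or any discounting.

```python
def cost_utility_ratio(total_cost, qalys):
    """Cost per quality-adjusted life year (same currency unit as total_cost)."""
    if qalys <= 0:
        raise ValueError("QALYs must be positive")
    return total_cost / qalys


# Hypothetical figures, chosen only to show the comparison.
home_care_cur = cost_utility_ratio(total_cost=1_000_000, qalys=0.5)
hospital_cur = cost_utility_ratio(total_cost=3_000_000, qalys=0.75)

# The option with the lower ratio buys one QALY for less money.
cheaper_per_qaly = "home care" if home_care_cur < hospital_cur else "hospital care"
```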

The Perceived Utility of Education and Training in SMEs on Employee Satisfaction: The Moderating Role of HRM Department Activities (중소기업 재직자들의 교육훈련에 대한 인지된 유용성이 교육 훈련 만족도에 미치는 영향: 인사부서 활동의 조절효과)

  • Park, Ji-Sung;Chae, Hee-Sun
    • Asia-Pacific Journal of Business / v.12 no.4 / pp.241-251 / 2021
  • Purpose - Drawing on the content-process approach, this study examines the effect of employees' perceived utility of education and training in small and medium enterprises (SMEs) on their satisfaction. In addition, this study investigates how the human resource management department's activities moderate the relationship between employees' perceived utility of education and training and their satisfaction. Design/methodology/approach - This study predicts a positive relationship between employees' perceived utility of education and training and their satisfaction, and predicts that HR activities strengthen this positive relationship. To test these hypotheses, this study used Human Capital Corporate Panel (HCCP) datasets, specifically the 2017 data at the individual level. The final sample for the test numbers 425. Moreover, this study used a hierarchical regression model with SPSS. Finding - As predicted, the analytical results of the hierarchical regression model showed that employees' perceived utility of education and training and their satisfaction were positively related. In addition, HR activities strengthened the relationship between employees' perceived utility of education and training and their satisfaction. Research implications or Originality - This study provides academic and practical implications for future research on human resource development, especially in SMEs, by deepening the understanding of the factors that are important for increasing employees' satisfaction with education and training.

Comparison of Performance Measures for Credit-Card Delinquents Classification Models : Measured by Hit Ratio vs. by Utility (신용카드 연체자 분류모형의 성능평가 척도 비교 : 예측률과 유틸리티 중심으로)

  • Chung, Suk-Hoon;Suh, Yong-Moo
    • Journal of Information Technology Applications and Management / v.15 no.4 / pp.21-36 / 2008
  • As the great disturbance caused by credit card abuse in Korea stabilizes, credit card companies need to interpret credit-card delinquent classification models from the viewpoint of profit. However, the hit ratio, which has been used as a measure of the goodness of classification models, only tells us how accurately they classify, not how much profit can be obtained by using them. In this research, we developed a new utility-based measure from the viewpoint of profit and then used this measure to analyze two classification models (neural network and decision tree models). We found that the hit ratio of the neural network model is higher than that of the decision tree model, but the utility value of the decision tree model is higher than that of the neural network model. This experiment shows the importance of a utility-based measure for credit-card delinquent classification models. We expect this new measure to contribute to increasing the profits of credit card companies.
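
The abstract does not define its utility measure, so the following is only an illustrative sketch of how a model with the higher hit ratio can still yield the lower utility. The confusion-matrix counts, the payoff values, and the function names are all hypothetical; the payoffs assume that approving a delinquent customer costs far more than the profit earned from a good customer.

```python
def hit_ratio(c):
    """Share of customers classified correctly."""
    total = c["tp"] + c["tn"] + c["fp"] + c["fn"]
    return (c["tp"] + c["tn"]) / total


def utility(c, payoff):
    """Total payoff: each outcome count weighted by its per-customer payoff."""
    return sum(c[outcome] * payoff[outcome] for outcome in payoff)


# Hypothetical per-customer payoffs ("positive" means predicted delinquent).
payoff = {
    "tp": 0,     # delinquent correctly rejected: loss avoided
    "fn": -100,  # delinquent approved: default loss
    "tn": 10,    # good customer approved: profit
    "fp": 0,     # good customer rejected: no revenue
}

# Hypothetical results on 100 customers, 20 of them delinquent.
model_a = {"tp": 10, "fn": 10, "tn": 78, "fp": 2}   # misses more delinquents
model_b = {"tp": 18, "fn": 2, "tn": 62, "fp": 18}   # rejects more good customers

hit_a, hit_b = hit_ratio(model_a), hit_ratio(model_b)
util_a, util_b = utility(model_a, payoff), utility(model_b, payoff)
```

With these made-up numbers, model_a has the higher hit ratio (0.88 against 0.80) but the lower utility (-220 against 420), because its errors fall on the costly outcome.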

A single-phase algorithm for mining high utility itemsets using compressed tree structures

  • Bhat B, Anup;SV, Harish;M, Geetha
    • ETRI Journal / v.43 no.6 / pp.1024-1037 / 2021
  • Mining high utility itemsets (HUIs) from transaction databases considers such factors as the unit profit and quantity of purchased items. Two-phase tree-based algorithms transform a database into compressed tree structures and generate candidate patterns through a recursive pattern-growth procedure. This procedure requires a lot of memory and time to construct conditional pattern trees. To address this issue, this study employs two compressed tree structures, namely, Utility Count Tree and String Utility Tree, to enumerate valid patterns and thus promote fast utility computation. Furthermore, the study presents an algorithm called single-phase utility computation (SPUC) that leverages these two tree structures to mine HUIs in a single phase by incorporating novel pruning strategies. Experiments conducted on both real and synthetic datasets demonstrate the superior performance of SPUC compared with the IHUP, UP-Growth, and UP-Growth+ algorithms.
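
To make the notion of itemset utility concrete, the sketch below computes it by brute force under the standard definition: the sum of quantity times unit profit over every transaction that contains all items of the itemset. It is not SPUC and uses no tree structure or pruning. The transactions, unit profits, threshold, and function names are hypothetical.

```python
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 2, "c": 1}

# Hypothetical transactions: item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1},
]


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def is_high_utility(itemset, transactions, unit_profit, min_util):
    """An itemset is high-utility when its utility reaches the threshold."""
    return itemset_utility(itemset, transactions, unit_profit) >= min_util


u_a = itemset_utility({"a"}, transactions, unit_profit)
u_ab = itemset_utility({"a", "b"}, transactions, unit_profit)
```

Here {"a"} has utility 15 while its superset {"a", "b"} has utility 9. Utility is not monotone under set inclusion in general, which is why HUI mining relies on upper bounds and pruning strategies rather than simple support-style pruning.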