• Title/Summary/Keyword: transformation data


Augmented Rotation-Based Transformation for Privacy-Preserving Data Clustering

  • Hong, Do-Won;Mohaisen, Abedelaziz
    • ETRI Journal / v.32 no.3 / pp.351-361 / 2010
  • Multiple rotation-based transformation (MRBT) was introduced recently for mitigating the apriori-knowledge independent component analysis (AK-ICA) attack on rotation-based transformation (RBT), which is used for privacy-preserving data clustering. MRBT is shown to mitigate the AK-ICA attack but at the expense of data utility by not enabling conventional clustering. In this paper, we extend the MRBT scheme and introduce an augmented rotation-based transformation (ARBT) scheme that utilizes linearity of transformation and that both mitigates the AK-ICA attack and enables conventional clustering on data subsets transformed using the MRBT. In order to demonstrate the computational feasibility aspect of ARBT along with RBT and MRBT, we develop a toolkit and use it to empirically compare the different schemes of privacy-preserving data clustering based on data transformation in terms of their overhead and privacy.
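The reason rotation-based transformation leaves clustering intact is that a rotation preserves pairwise Euclidean distances. The following is a minimal sketch of that property only, not the ARBT or MRBT scheme from the paper; the helper names `random_rotation` and `pairwise_distances` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def random_rotation(dim, rng):
    """Draw a random rotation matrix via QR decomposition (illustrative choice)."""
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    q = q * np.sign(np.diag(r))      # normalize column signs; q stays orthogonal
    if np.linalg.det(q) < 0:         # turn a reflection into a proper rotation
        q[:, 0] = -q[:, 0]
    return q

def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(0)
data = rng.standard_normal((6, 3))   # 6 synthetic records, 3 attributes
rotation = random_rotation(3, rng)
transformed = data @ rotation.T      # rotate every record

# Distance-based clustering sees the same geometry before and after rotation.
distances_preserved = np.allclose(
    pairwise_distances(data), pairwise_distances(transformed)
)
```

Because only distances matter to algorithms such as k-means, the transformed records can be clustered in place of the originals; the privacy question the abstract raises is how hard it is to recover `rotation`.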

Compositional data analysis by the square-root transformation: Application to NBA USG% data

  • Jeseok Lee;Byungwon Kim
    • Communications for Statistical Applications and Methods / v.31 no.3 / pp.349-363 / 2024
  • Compositional data are data in which the component values sum to a constant. The sample space is therefore a simplex, which makes it impossible to directly apply statistical methods developed for the usual Euclidean vector space. A natural way to overcome this restriction is to apply a transformation that maps the sample space onto Euclidean space, and log-ratio type transformations, such as the additive log-ratio (ALR), centered log-ratio (CLR), and isometric log-ratio (ILR) transformations, have been the most commonly used. However, under sparsity, where certain components are exactly zero, these log-ratio type transformations may not be effective. In this work, we suggest an alternative, the square-root transformation, which maps the original sample space onto the directional space. We compare the square-root transformation with the log-ratio type transformations through a simulation study and a real data example. In the real data example, we applied both types of transformations to USG% data obtained from the NBA and used a density-based clustering method, DBSCAN (density-based spatial clustering of applications with noise), to show the result.
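A small sketch can show why exact zeros separate the two approaches: the square root of a composition that sums to one has unit Euclidean norm, so it lies on a sphere even when a component is zero, whereas the CLR needs the logarithm of every component. The function names and the two made-up compositions below are assumptions for illustration, not the paper's code or data.

```python
import numpy as np

def sqrt_transform(composition):
    """Map compositions (rows summing to 1) onto the unit sphere."""
    return np.sqrt(np.asarray(composition, dtype=float))

def clr_transform(composition):
    """Centered log-ratio: log of each part minus the mean log of the row."""
    composition = np.asarray(composition, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        log_parts = np.log(composition)
        return log_parts - log_parts.mean(axis=-1, keepdims=True)

# Two made-up compositions; the second has an exact zero component.
usage = np.array([[0.5, 0.3, 0.2],
                  [0.7, 0.3, 0.0]])

on_sphere = sqrt_transform(usage)
norms = np.linalg.norm(on_sphere, axis=1)   # each row has norm 1

clr = clr_transform(usage)                  # second row is not finite
```

The second row of `clr` contains infinite or undefined values because of `log(0)`, while `on_sphere` stays well defined for both rows.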

A note on Box-Cox transformation and application in microarray data

  • Rahman, Mezbahur;Lee, Nam-Yong
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.967-976 / 2011
  • The Box-Cox transformation is a well-known family of power transformations that brings a set of data into agreement with the normality assumption of the residuals, and hence of the response variable, of a postulated model in regression analysis. Normalization (studentization) of the regressors is a common practice in analyzing microarray data. Here, we apply the Box-Cox transformation to normalize regressors in microarray data. The predictability of the model can be improved by using data transformation instead of studentization.
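For reference, the two operations contrasted in this abstract can be written in their standard textbook forms. The symbols below ($x$, $\lambda$, $\bar{x}$, $s$) are notation assumed here, not taken from the paper, and the Box-Cox form requires $x > 0$.

```latex
% Box-Cox power transformation of a positive value x with parameter \lambda
x^{(\lambda)} =
\begin{cases}
  \dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[1ex]
  \log x, & \lambda = 0,
\end{cases}
\qquad x > 0.

% Studentization of a regressor with sample mean \bar{x} and sample standard deviation s
z = \frac{x - \bar{x}}{s}.
```

Studentization only shifts and rescales a regressor, so it leaves the shape of its distribution unchanged, whereas the Box-Cox family can also reduce skewness through the choice of $\lambda$.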

Relational Data Extraction and Transformation: A Study to Enhance Information Systems Performance

  • Forat Falih, Hasan;Muhamad Shahbani Abu, Bakar
    • Journal of information and communication convergence engineering / v.20 no.4 / pp.265-272 / 2022
  • The most effective method to improve information system capabilities is to enable instant access to several relational database sources and transform data with a logical structure into multiple target relational databases. There are numerous data transformation tools available; however, they typically contain fixed procedures that cannot be changed by the user, making it impossible to fulfill the near-real-time data transformation requirements. Furthermore, some tools cannot build object references or alter attribute constraints. There are various situations in which tool changes in data type cause conflicts and difficulties with data quality while transforming between the two systems. The R-programming language was extensively used throughout this study, and several different relational database structures were utilized to complete the proposed study. Experiments showed that the developed study can improve the performance of information systems by interacting with and exchanging data with various relational databases. The study addresses data quality issues, particularly the completeness and integrity dimensions of the data transformation processes.

Analysis of binary data by empirical logit transformation and the type of Freeman-Tukey inverse sine transformation (경험로지트변환과 Freeman-Tukey형 역정현 변환에 의한 계수치 자료의 해석)

  • 김홍준;채규용;이상용
    • Journal of Korean Society of Industrial and Systems Engineering / v.20 no.42 / pp.1-8 / 1997
  • Taking an orthogonal array experiment with 0/1 data as an example, this paper illustrates the analysis of discrete data. It introduces the empirical logit transformation and a Freeman-Tukey type inverse sine transformation. In the analysis of variance, the empirical logit transformation proved inappropriate in application, although graphical analysis on normal probability paper remains possible.

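The two transformations named in this abstract are sketched below in their commonly cited textbook forms for a count of `successes` out of `trials`. The abstract does not give the exact variants the authors used, so both formulas and the function names are assumptions.

```python
import math

def empirical_logit(successes, trials):
    """Empirical logit: adding 0.5 to both counts keeps 0 and `trials` finite."""
    return math.log((successes + 0.5) / (trials - successes + 0.5))

def freeman_tukey_arcsine(successes, trials):
    """Freeman-Tukey double arcsine (inverse sine) transformation of a count."""
    return (math.asin(math.sqrt(successes / (trials + 1)))
            + math.asin(math.sqrt((successes + 1) / (trials + 1))))

# Hypothetical cell of an orthogonal array experiment: 3 defectives in 10 units.
logit_value = empirical_logit(3, 10)
arcsine_value = freeman_tukey_arcsine(3, 10)
```

Both functions stay finite at the boundary counts 0 and `trials`, which is why they are used for 0/1 data in place of the raw proportion.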

Speaker Adaptation Using ICA-Based Feature Transformation

  • Jung, Ho-Young;Park, Man-Soo;Kim, Hoi-Rin;Hahn, Min-Soo
    • ETRI Journal / v.24 no.6 / pp.469-472 / 2002
  • Speaker adaptation techniques are generally used to reduce speaker differences in speech recognition. In this work, we focus on the features fitted to a linear regression-based speaker adaptation. These are obtained by feature transformation based on independent component analysis (ICA), and the feature transformation matrices are estimated from the training data and adaptation data. Since the adaptation data is not sufficient to reliably estimate the ICA-based feature transformation matrix, it is necessary to adjust the ICA-based feature transformation matrix estimated from a new speaker utterance. To cope with this problem, we propose a smoothing method through a linear interpolation between the speaker-independent (SI) feature transformation matrix and the speaker-dependent (SD) feature transformation matrix. From our experiments, we observed that the proposed method is more effective in the mismatched case. In the mismatched case, the adaptation performance is improved because the smoothed feature transformation matrix makes speaker adaptation using noisy speech more robust.

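The smoothing step described in this abstract, a linear interpolation between the speaker-independent and speaker-dependent matrices, can be sketched as follows. The name `smooth_transform`, the parameter `weight`, and the tiny 2x2 matrices are hypothetical; the abstract does not state how the interpolation weight is chosen.

```python
import numpy as np

def smooth_transform(w_si, w_sd, weight):
    """Blend speaker-independent (SI) and speaker-dependent (SD) matrices.

    weight = 0 returns the SI matrix, weight = 1 returns the SD matrix.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * np.asarray(w_sd) + (1.0 - weight) * np.asarray(w_si)

# Made-up matrices standing in for ICA-based feature transformations.
w_si = np.eye(2)
w_sd = np.array([[1.2, 0.1],
                 [-0.1, 0.8]])

w_smoothed = smooth_transform(w_si, w_sd, weight=0.5)
features = np.array([0.3, -0.7])          # one hypothetical feature vector
adapted = w_smoothed @ features
```

A smaller `weight` leans on the SI matrix, which is the safer choice when little adaptation data is available to estimate the SD matrix reliably.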

A Study on Fast Datum Transformation model for GIS (지리정보시스템을 위한 고속 측지계 변환 모델 연구)

  • Suh, Yong-Cheol
    • Journal of the Korean Association of Geographic Information Studies / v.7 no.3 / pp.48-56 / 2004
  • This research focuses on the development of a fast datum transformation model for GIS that performs the transformation in real time. For instance, when GIS data constructed according to one datum must conform to another datum, the data is transformed right before the results are displayed on the monitor, instead of transforming the axes of the original data. To decrease real-time datum transformation time while maintaining high accuracy, this research investigated calculating transformation parameters for every grid cell in the area based on a two-dimensional conformal transformation model. The results showed that, for a fixed area, the accuracy of the two-dimensional conformal transformation was almost equal to that of the three-dimensional datum transformation, which requires more computing time. Fast, high-accuracy real-time datum transformation is therefore feasible with the grid-divided two-dimensional conformal transformation model.

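The grid-divided idea can be sketched as a lookup of precomputed four-parameter (two-dimensional conformal) coefficients per grid cell, followed by one cheap affine evaluation per point. The class and function names, the cell size, and the parameter values below are all made up for illustration and are not the paper's parameters.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class ConformalParams:
    a: float   # scale * cos(rotation)
    b: float   # scale * sin(rotation)
    tx: float  # translation in x
    ty: float  # translation in y

def apply_conformal(p, x, y):
    """Two-dimensional conformal (four-parameter) transformation of one point."""
    return (p.a * x - p.b * y + p.tx, p.b * x + p.a * y + p.ty)

def cell_index(x, y, origin, cell_size):
    """Integer (column, row) index of the grid cell containing (x, y)."""
    return (math.floor((x - origin[0]) / cell_size),
            math.floor((y - origin[1]) / cell_size))

def transform_point(x, y, grid, origin, cell_size):
    """Look up the cell's precomputed parameters, then transform the point."""
    return apply_conformal(grid[cell_index(x, y, origin, cell_size)], x, y)

# Hypothetical parameters for two neighbouring 100 m cells.
grid = {
    (0, 0): ConformalParams(1.0, 0.0, 10.0, -5.0),
    (1, 0): ConformalParams(1.0, 0.0, 10.5, -5.0),
}
first = transform_point(50.0, 50.0, grid, origin=(0.0, 0.0), cell_size=100.0)
second = transform_point(150.0, 50.0, grid, origin=(0.0, 0.0), cell_size=100.0)
```

In such a design the expensive three-dimensional datum transformation would only be needed offline to fit each cell's parameters, while display-time work reduces to a dictionary lookup and four multiplications per point.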

Research on the introduction and use of Big Data for trade digital transformation (무역 디지털 트랜스포메이션을 위한 빅데이터 도입 및 활용에 관한 연구)

  • Joon-Mo Jung;Yoon-Say Jeong
    • Korea Trade Review / v.47 no.3 / pp.57-73 / 2022
  • The process and change of convergence in the economy and industry with the development of digital technology and combining with new technologies is called Digital Transformation. Specifically, it refers to innovating existing businesses and services by utilizing information and communication technologies such as big data analysis, Internet of Things, cloud computing, and artificial intelligence. Digital transformation is changing the shape of business and has a wide impact on businesses and consumers in all industries. Among them, the big data and analytics market is emerging as one of the most important growth drivers of digital transformation. Integrating intelligent data into an existing business is one of the key tasks of digital transformation, and it is important to collect and monitor data and learn from the collected data in order to efficiently operate a data-based business. In developed countries overseas, research on new business models using various data accumulated at the level of government and private companies is being actively conducted. However, although the trade and import/export data collected in the domestic public sector is being accumulated in various types and ranges, the establishment of an analysis and utilization model is still in its infancy. Currently, we are living in an era of massive amounts of big data. We intend to discuss the value of trade big data possessed from the past to the present, and suggest a strategy to activate trade big data for trade digital transformation and a new direction for future trade big data research.

An Implementation of Total Data Quality Management Using an Information Structure Graph (정보 구조 그래프를 이용한 통합 데이터 품질 관리 방안 연구)

  • 이춘열
    • Journal of Information Technology Applications and Management / v.10 no.4 / pp.103-118 / 2003
  • This study presents a database quality evaluation framework. To build the framework, the study expands data quality management to cover data transformation processes as well as the data itself. An information structure graph is applied to represent the data transformation processes. Because the information structure graph is based on a relational database schema, data transformation processes can be stored in a relational database. Integrating data transformation metadata with technical metadata in this way eases the evaluation of database quality and of its causes.


An Effective Algorithm of Power Transformation: Box-Cox Transformation

  • Lee, Seung-Woo;Cha, Kyung-Joon
    • Journal for History of Mathematics / v.11 no.2 / pp.63-76 / 1998
  • When teaching linear regression analysis in class, the power transformation must be introduced to fit a linear regression model to nonlinear data. Box and Cox (1964) proposed an attractive power transformation technique, the so-called Box-Cox transformation. In this paper, we develop an effective algorithm for selecting an appropriate value for the Box-Cox transformation by finding the value that minimizes the error sum of squares. When the proposed algorithm is used to find a transformation value, the number of iterations needs to be considered, so it is examined through a simulation study. Since SAS is one of the most widely used packages and does not provide a procedure that performs the iterative Box-Cox transformation, a SAS program that automatically chooses the transformation value is developed. Students can thus learn how the Box-Cox transformation works, and researchers can use the program for data analysis.

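Selecting the Box-Cox parameter by minimizing the error sum of squares can be illustrated with a plain grid search. This is a swapped-in simple technique, not the iterative algorithm or the SAS program from the paper. It uses the standard geometric-mean-normalized form of the transformation so that error sums of squares are comparable across parameter values; all names and the synthetic data are assumptions.

```python
import numpy as np

def normalized_box_cox(y, lam):
    """Box-Cox transform scaled by the geometric mean of y (requires y > 0)."""
    y = np.asarray(y, dtype=float)
    gm = np.exp(np.mean(np.log(y)))
    if abs(lam) < 1e-12:
        return gm * np.log(y)
    return (y ** lam - 1.0) / (lam * gm ** (lam - 1.0))

def sse_of_linear_fit(x, z):
    """Error sum of squares of the least-squares line z = b0 + b1 * x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    resid = z - design @ coef
    return float(resid @ resid)

def best_lambda(x, y, candidates):
    """Return the candidate value with the smallest error sum of squares."""
    errors = [sse_of_linear_fit(x, normalized_box_cox(y, lam)) for lam in candidates]
    return float(candidates[int(np.argmin(errors))])

# Synthetic data where sqrt(y) is exactly linear in x, so 0.5 should win.
x = np.linspace(1.0, 10.0, 30)
y = (2.0 + 0.5 * x) ** 2
candidates = np.linspace(-2.0, 2.0, 41)
chosen = best_lambda(x, y, candidates)
```

An iterative scheme would refine the grid around the current minimizer instead of scanning a fixed set of candidates, and the number of such refinement rounds is the quantity the abstract says is examined by simulation.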