• Title/Abstract/Keywords: Data Transformation

Search results: 2,063 items (processing time: 0.03 seconds)

Augmented Rotation-Based Transformation for Privacy-Preserving Data Clustering

  • Hong, Do-Won;Mohaisen, Abedelaziz
    • ETRI Journal
    • /
    • Vol. 32, No. 3
    • /
    • pp.351-361
    • /
    • 2010
  • Multiple rotation-based transformation (MRBT) was introduced recently for mitigating the apriori-knowledge independent component analysis (AK-ICA) attack on rotation-based transformation (RBT), which is used for privacy-preserving data clustering. MRBT is shown to mitigate the AK-ICA attack but at the expense of data utility by not enabling conventional clustering. In this paper, we extend the MRBT scheme and introduce an augmented rotation-based transformation (ARBT) scheme that utilizes linearity of transformation and that both mitigates the AK-ICA attack and enables conventional clustering on data subsets transformed using the MRBT. In order to demonstrate the computational feasibility aspect of ARBT along with RBT and MRBT, we develop a toolkit and use it to empirically compare the different schemes of privacy-preserving data clustering based on data transformation in terms of their overhead and privacy.
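
Rotation-based transformation suits distance-based clustering because a rotation is an isometry: it changes the attribute values but leaves every pairwise Euclidean distance unchanged. The sketch below illustrates only that property for plain RBT on two attributes. It is not the MRBT or ARBT scheme from the paper, and the toy records, the angle, and the helper names are assumptions made for illustration.

```python
import math


def rotate_2d(points, theta):
    """Rotate each (x, y) point about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def pairwise_distances(points):
    """Return Euclidean distances for all unordered pairs of points."""
    return [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]


# Hypothetical toy records with two numeric attributes each.
original = [(1.0, 2.0), (4.0, 6.0), (-3.0, 0.5)]
perturbed = rotate_2d(original, theta=math.radians(37.0))

d_before = pairwise_distances(original)
d_after = pairwise_distances(perturbed)

# The released values differ, but the distance structure is the same.
distances_preserved = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(d_before, d_after)
)
```

Because the distances are unchanged, any clustering method that depends only on Euclidean distances produces the same partition on `perturbed` as on `original`.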

Compositional data analysis by the square-root transformation: Application to NBA USG% data

  • Jeseok Lee;Byungwon Kim
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 31, No. 3
    • /
    • pp.349-363
    • /
    • 2024
  • Compositional data are data in which the component values sum to a constant, so the sample space is a simplex and statistical methods developed for the usual Euclidean vector space cannot be applied directly. A natural way to overcome this restriction is to apply a transformation that maps the sample space onto Euclidean space; log-ratio type transformations, such as the additive log-ratio (ALR), centered log-ratio (CLR), and isometric log-ratio (ILR) transformations, have been the most widely used. However, under sparsity, where some components are exactly zero, these log-ratio type transformations may not be effective. In this work, we propose an alternative, the square-root transformation, which maps the original sample space onto the directional space. We compare the square-root transformation with the log-ratio type transformations through a simulation study and a real data example. In the real data example, we apply both types of transformation to USG% data from the NBA and use a density-based clustering method, DBSCAN (density-based spatial clustering of applications with noise), to show the results.
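
A minimal sketch of the contrast this abstract describes, assuming compositions are normalized to sum to 1. Taking the square root of each part places the composition on the unit sphere, since the squared parts sum to 1, and this works even when a part is exactly zero. The centered log-ratio needs the logarithm of every part, so it fails on the same input. The example composition and the function names are illustrative, not taken from the paper.

```python
import math


def sqrt_transform(composition):
    """Map a composition (non-negative parts summing to 1) onto the unit sphere."""
    return [math.sqrt(p) for p in composition]


def clr_transform(composition):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in composition]  # raises ValueError if a part is 0
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical sparse composition with one exact zero.
sparse = [0.5, 0.5, 0.0]

on_sphere = sqrt_transform(sparse)
norm = math.sqrt(sum(v * v for v in on_sphere))  # Euclidean length of the image

try:
    clr_transform(sparse)
    clr_failed = False
except ValueError:
    clr_failed = True
```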

A note on Box-Cox transformation and application in microarray data

  • Rahman, Mezbahur;Lee, Nam-Yong
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 22, No. 5
    • /
    • pp.967-976
    • /
    • 2011
  • The Box-Cox transformation is a well-known family of power transformations that brings a set of data into agreement with the normality assumption of the residuals, and hence of the response variable, of a postulated model in regression analysis. Normalization (studentization) of the regressors is common practice in analyzing microarray data. Here, we apply the Box-Cox transformation to normalize regressors in microarray data. The predictability of the model can be improved by using this data transformation rather than studentization.
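
For reference, the two normalizations compared in this abstract can be sketched as follows. The Box-Cox family is (x^λ − 1)/λ for λ ≠ 0 and log x for λ = 0, defined for strictly positive values. Studentization subtracts the sample mean and divides by the sample standard deviation. The intensity values and the choice λ = 0 are assumptions for illustration; the paper's procedure for choosing λ is not reproduced here.

```python
import math


def box_cox(values, lam):
    """Box-Cox power transformation for strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def studentize(values):
    """Center by the sample mean and scale by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


# Hypothetical right-skewed intensities for one regressor.
intensities = [60.0, 85.0, 120.0, 430.0]

log_scaled = box_cox(intensities, 0)   # lam = 0 is the log transformation
z_scores = studentize(intensities)     # linear rescaling, skewness unchanged
```

Studentization is a linear rescaling and therefore leaves the shape of the distribution unchanged, whereas a power transformation can reduce skewness.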

Relational Data Extraction and Transformation: A Study to Enhance Information Systems Performance

  • Forat Falih, Hasan;Muhamad Shahbani Abu, Bakar
    • Journal of information and communication convergence engineering
    • /
    • Vol. 20, No. 4
    • /
    • pp.265-272
    • /
    • 2022
  • The most effective method to improve information system capabilities is to enable instant access to several relational database sources and transform data with a logical structure into multiple target relational databases. There are numerous data transformation tools available; however, they typically contain fixed procedures that cannot be changed by the user, making it impossible to fulfill the near-real-time data transformation requirements. Furthermore, some tools cannot build object references or alter attribute constraints. There are various situations in which tool changes in data type cause conflicts and difficulties with data quality while transforming between the two systems. The R-programming language was extensively used throughout this study, and several different relational database structures were utilized to complete the proposed study. Experiments showed that the developed study can improve the performance of information systems by interacting with and exchanging data with various relational databases. The study addresses data quality issues, particularly the completeness and integrity dimensions of the data transformation processes.

Analysis of Binary Data by the Empirical Logit Transformation and the Freeman-Tukey-Type Inverse Sine Transformation

  • 김홍준;채규용;이상용
    • 산업경영시스템학회지
    • /
    • Vol. 20, No. 42
    • /
    • pp.1-8
    • /
    • 1997
  • For the analysis of discrete data, this paper uses an orthogonal array experiment with 0/1 data as an example. It introduces the empirical logit transformation and the Freeman-Tukey-type inverse sine transformation. The analysis of variance showed the empirical logit transformation to be inappropriate in this application, although graphical analysis using normal probability paper remains possible.
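
The two transformations named in this abstract are sketched below in their common textbook forms, which are assumed here and may differ from the exact variants used in the paper. For y successes out of n trials, the empirical logit is ln((y + 0.5)/(n − y + 0.5)), and the Freeman-Tukey inverse sine transformation is arcsin √(y/(n + 1)) + arcsin √((y + 1)/(n + 1)). Both stay finite at y = 0 and y = n, where the plain logit is undefined.

```python
import math


def empirical_logit(successes, trials):
    """Empirical logit with the usual 0.5 continuity correction."""
    return math.log((successes + 0.5) / (trials - successes + 0.5))


def freeman_tukey_arcsine(successes, trials):
    """Freeman-Tukey (double) inverse sine transformation of a binomial count."""
    return (
        math.asin(math.sqrt(successes / (trials + 1)))
        + math.asin(math.sqrt((successes + 1) / (trials + 1)))
    )


# Hypothetical counts of defectives out of 10 units in each experimental run.
n = 10
counts = [0, 3, 5, 10]
logits = [empirical_logit(y, n) for y in counts]
arcsines = [freeman_tukey_arcsine(y, n) for y in counts]
```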


Speaker Adaptation Using ICA-Based Feature Transformation

  • Jung, Ho-Young;Park, Man-Soo;Kim, Hoi-Rin;Hahn, Min-Soo
    • ETRI Journal
    • /
    • Vol. 24, No. 6
    • /
    • pp.469-472
    • /
    • 2002
  • Speaker adaptation techniques are generally used to reduce speaker differences in speech recognition. In this work, we focus on the features fitted to a linear regression-based speaker adaptation. These are obtained by feature transformation based on independent component analysis (ICA), and the feature transformation matrices are estimated from the training data and adaptation data. Since the adaptation data is not sufficient to reliably estimate the ICA-based feature transformation matrix, it is necessary to adjust the ICA-based feature transformation matrix estimated from a new speaker utterance. To cope with this problem, we propose a smoothing method through a linear interpolation between the speaker-independent (SI) feature transformation matrix and the speaker-dependent (SD) feature transformation matrix. From our experiments, we observed that the proposed method is more effective in the mismatched case. In the mismatched case, the adaptation performance is improved because the smoothed feature transformation matrix makes speaker adaptation using noisy speech more robust.
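
The smoothing step described in this abstract is a linear interpolation between two matrices. A minimal sketch follows, assuming a single weight `alpha` in [0, 1] applied to the speaker-independent matrix. The weight name, which matrix it multiplies, and the 2×2 example values are assumptions, since the abstract does not specify them.

```python
def interpolate_matrices(w_si, w_sd, alpha):
    """Return alpha * w_si + (1 - alpha) * w_sd, element by element."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        [alpha * a + (1.0 - alpha) * b for a, b in zip(row_si, row_sd)]
        for row_si, row_sd in zip(w_si, w_sd)
    ]


# Hypothetical 2x2 feature transformation matrices.
w_si = [[1.0, 0.0], [0.0, 1.0]]    # speaker-independent, estimated from training data
w_sd = [[0.5, 0.5], [-0.5, 1.5]]   # speaker-dependent, estimated from little adaptation data

w_smoothed = interpolate_matrices(w_si, w_sd, alpha=0.5)
```

A larger `alpha` pulls the result toward the speaker-independent matrix, which is the safer choice when the adaptation data are scarce or noisy.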


A Study on a Fast Datum Transformation Model for GIS

  • 서용철
    • 한국지리정보학회지
    • /
    • Vol. 7, No. 3
    • /
    • pp.48-56
    • /
    • 2004
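  • This study develops a fast transformation model for geographic information systems that use real-time datum transformation. When geographic data built with reference to one geodetic datum are displayed with reference to another, a common approach is to leave the stored coordinates unchanged and transform them immediately before screen display or output. To speed up this real-time datum transformation while maintaining high transformation accuracy, the study examines the application of a two-dimensional conformal similarity transformation model whose transformation parameters are computed separately for each region. The results show that, within a limited range, the two-dimensional conformal similarity transformation achieves nearly the same accuracy as the three-dimensional datum transformation, which requires comparatively more computation time, and that a region-partitioned two-dimensional similarity transformation model enables real-time datum transformation with high accuracy and improved speed.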
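
A two-dimensional conformal similarity transformation has four parameters and can be written as X = a·x − b·y + tx, Y = b·x + a·y + ty, with a = s·cos θ and b = s·sin θ for scale s and rotation θ. The sketch below pairs that formula with a per-region parameter lookup to illustrate the region-partitioning idea. The grid size, the parameter values, and all names are hypothetical. The paper's actual partitioning scheme and parameter estimation are not reproduced.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Similarity2D:
    """Four-parameter conformal similarity transformation."""
    a: float   # scale * cos(rotation)
    b: float   # scale * sin(rotation)
    tx: float  # translation along x
    ty: float  # translation along y

    def apply(self, x, y):
        return (self.a * x - self.b * y + self.tx,
                self.b * x + self.a * y + self.ty)


def from_scale_rotation(scale, theta, tx, ty):
    """Build the transformation from a scale factor and a rotation in radians."""
    return Similarity2D(scale * math.cos(theta), scale * math.sin(theta), tx, ty)


REGION_SIZE = 1000.0  # hypothetical grid cell size in map units

# Hypothetical parameters, one set per grid cell.
region_params = {
    (0, 0): from_scale_rotation(1.0, 0.0, 10.0, -5.0),
    (1, 0): from_scale_rotation(1.00001, math.radians(0.001), -20.0, 35.0),
}


def transform(x, y):
    """Look up the cell containing (x, y) and apply that cell's parameters."""
    cell = (math.floor(x / REGION_SIZE), math.floor(y / REGION_SIZE))
    return region_params[cell].apply(x, y)


shifted = transform(100.0, 200.0)   # falls in cell (0, 0)
p1 = transform(1200.0, 300.0)       # falls in cell (1, 0)
p2 = transform(1500.0, 700.0)       # falls in cell (1, 0)
```

Each point costs one dictionary lookup, four multiplications, and four additions, which is why this kind of model is attractive for transforming coordinates at display time.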


Research on the Introduction and Use of Big Data for Trade Digital Transformation

  • 정준모;정윤세
    • 무역학회지
    • /
    • Vol. 47, No. 3
    • /
    • pp.57-73
    • /
    • 2022
  • The process and change of convergence in the economy and industry with the development of digital technology and combining with new technologies is called Digital Transformation. Specifically, it refers to innovating existing businesses and services by utilizing information and communication technologies such as big data analysis, Internet of Things, cloud computing, and artificial intelligence. Digital transformation is changing the shape of business and has a wide impact on businesses and consumers in all industries. Among them, the big data and analytics market is emerging as one of the most important growth drivers of digital transformation. Integrating intelligent data into an existing business is one of the key tasks of digital transformation, and it is important to collect and monitor data and learn from the collected data in order to efficiently operate a data-based business. In developed countries overseas, research on new business models using various data accumulated at the level of government and private companies is being actively conducted. However, although the trade and import/export data collected in the domestic public sector is being accumulated in various types and ranges, the establishment of an analysis and utilization model is still in its infancy. Currently, we are living in an era of massive amounts of big data. We intend to discuss the value of trade big data possessed from the past to the present, and suggest a strategy to activate trade big data for trade digital transformation and a new direction for future trade big data research.

An Implementation of Total Data Quality Management Using an Information Structure Graph

  • 이춘열
    • Journal of Information Technology Applications and Management
    • /
    • Vol. 10, No. 4
    • /
    • pp.103-118
    • /
    • 2003
  • This study presents a database quality evaluation framework. To build the framework, the study expands data quality management to cover data transformation processes as well as the data themselves. An information structure graph is applied to represent the data transformation processes. Because the information structure graph is based on a relational database schema, data transformation processes can be stored in a relational database. Integrating data transformation metadata with technical metadata in this way eases the evaluation of database quality and the identification of the causes of quality problems.


An Effective Algorithm of Power Transformation: Box-Cox Transformation

  • Lee, Seung-Woo;Cha, Kyung-Joon
    • 한국수학사학회지
    • /
    • Vol. 11, No. 2
    • /
    • pp.63-76
    • /
    • 1998
  • When linear regression analysis is taught in class, the power transformation must be introduced in order to fit a linear regression model to nonlinear data. Box and Cox (1964) proposed an attractive power transformation technique, known as the Box-Cox transformation. In this paper, we develop an effective algorithm for selecting an appropriate value of the Box-Cox transformation parameter, namely the value that minimizes the error sum of squares. When the proposed algorithm is used to find this value, the number of iterations needs to be considered, so it is examined through a simulation study. Since SAS is one of the most widely used packages and does not provide a procedure that performs an iterative Box-Cox transformation, we develop a SAS program that automatically chooses the transformation value. Students can thus learn how the Box-Cox transformation works, and researchers can use the program for data analysis.
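
The idea of choosing the transformation parameter by minimizing the error sum of squares can be sketched with a plain grid search. This is a stand-in, not the paper's iterative algorithm or its SAS program. The sketch uses the normalized form from Box and Cox (1964), in which the transformed response is divided by the geometric mean raised to the power λ − 1, so that error sums of squares are comparable across values of λ. The data, the grid, and the function names are assumptions for illustration.

```python
import math


def normalized_box_cox(y, lam):
    """Box-Cox transformation scaled by the geometric mean of y."""
    gm = math.exp(sum(math.log(v) for v in y) / len(y))
    if abs(lam) < 1e-12:
        return [gm * math.log(v) for v in y]
    return [(v ** lam - 1.0) / (lam * gm ** (lam - 1.0)) for v in y]


def sse_simple_regression(x, z):
    """Error sum of squares from the least-squares line of z on x."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, z))


def best_lambda(x, y, grid):
    """Return the grid value whose transformed response gives the smallest SSE."""
    return min(grid, key=lambda lam: sse_simple_regression(x, normalized_box_cox(y, lam)))


# Hypothetical noise-free data in which sqrt(y) is exactly linear in x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [(2.0 + 1.5 * v) ** 2 for v in x]

grid = [i / 10 for i in range(-20, 21)]  # lambda from -2.0 to 2.0 in steps of 0.1
lam_hat = best_lambda(x, y, grid)
```

With these data the search selects λ = 0.5, the square-root transformation, because that is the value at which the transformed response is exactly linear in `x`.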
