Search | Korea Science

On the Estimation of Parameters in ALT under Generalized Exponential Distribution

Yoon, Sang-Chul
- Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.923-931 / 2005
The two-parameter generalized exponential distribution was introduced by Gupta and Kundu (1999). It has been observed that the generalized exponential distribution can be used quite effectively to analyze skewed data sets. This paper develops an accelerated life test (ALT) model using the generalized exponential distribution and considers maximum likelihood estimation of its parameters under the tampered random variable model. A simulation study shows the performance of the proposed maximum likelihood estimates, and an example using a real data set is given.
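
For readers unfamiliar with the distribution named in this abstract, the following is a minimal sketch of the Gupta and Kundu generalized exponential distribution, with CDF F(x) = (1 - exp(-lambda*x))^alpha for x > 0. The function names and the inverse-CDF sampler are illustrative assumptions; the sketch does not reproduce the paper's ALT model or its estimation procedure.

```python
import math
import random


def ge_cdf(x, alpha, lam):
    """CDF of the generalized exponential distribution (shape alpha, rate lam)."""
    if x <= 0:
        return 0.0
    return (1.0 - math.exp(-lam * x)) ** alpha


def ge_pdf(x, alpha, lam):
    """Density: alpha * lam * exp(-lam*x) * (1 - exp(-lam*x))**(alpha - 1)."""
    if x <= 0:
        return 0.0
    e = math.exp(-lam * x)
    return alpha * lam * e * (1.0 - e) ** (alpha - 1.0)


def ge_quantile(u, alpha, lam):
    """Inverse CDF for 0 <= u < 1, obtained by solving F(x) = u for x."""
    return -math.log(1.0 - u ** (1.0 / alpha)) / lam


def ge_sample(n, alpha, lam, seed=0):
    """Draw n values by inverse-CDF sampling (hypothetical helper)."""
    rng = random.Random(seed)
    return [ge_quantile(rng.random(), alpha, lam) for _ in range(n)]
```

With alpha = 1 the distribution reduces to the ordinary exponential distribution, which gives a quick sanity check on the functions above.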

Effect of zero imputation methods for log-transformation of independent variables in logistic regression

Seo Young Park
- Communications for Statistical Applications and Methods / v.31 no.4 / pp.409-425 / 2024
Logistic regression models are commonly used to explain a binary health outcome variable using independent variables such as patient characteristics in medical science and public health research. Although logistic regression requires no distributional assumption for independent variables, variables with severely right-skewed distributions, such as lab values, are often log-transformed to achieve symmetry or approximate normality. However, lab values often contain zeros due to limits of detection, which makes it impossible to apply a log-transformation directly. Preprocessing to handle zeros in the observations before log-transformation is therefore necessary. In this study, five methods that remove zeros (shift by 1, shift by half of the smallest nonzero value, shift by the square root of the smallest nonzero value, replace zeros with half of the smallest nonzero value, replace zeros with the square root of the smallest nonzero value) are investigated in a logistic regression setting. To evaluate the performance of these methods, we performed a simulation study based on randomly generated data from a log-normal distribution and a logistic regression model. The shift-by-1 method performed worst; overall, shifting by half of the smallest nonzero value, replacing zeros with half of the smallest nonzero value, and replacing zeros with the square root of the smallest nonzero value showed comparable and stable performance.
https://doi.org/10.29220/CSAM.2024.31.4.409
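
The five zero-handling methods listed in this abstract can be sketched as follows. This is an illustrative reading of the method names only, assuming nonnegative inputs with at least one positive value; the function name and the method labels are hypothetical and not taken from the paper.

```python
import math


def log_with_zero_handling(values, method):
    """Log-transform nonnegative values after handling zeros.

    `method` labels are hypothetical names for the five approaches
    described in the abstract.
    """
    m = min(v for v in values if v > 0)  # smallest nonzero observation
    if method == "shift_1":
        adjusted = [v + 1.0 for v in values]
    elif method == "shift_half":
        adjusted = [v + m / 2.0 for v in values]
    elif method == "shift_sqrt":
        adjusted = [v + math.sqrt(m) for v in values]
    elif method == "replace_half":
        adjusted = [v if v > 0 else m / 2.0 for v in values]
    elif method == "replace_sqrt":
        adjusted = [v if v > 0 else math.sqrt(m) for v in values]
    else:
        raise ValueError(f"unknown method: {method}")
    return [math.log(v) for v in adjusted]
```

The shift methods change every observation, while the replace methods leave nonzero observations untouched and alter only the zeros.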

J-Tree: An Efficient Index using User Searching Patterns for Large Scale Data (J-tree : 사용자의 검색패턴을 이용한 대용량 데이타를 위한 효율적인 색인)

Jang, Su-Min;Seo, Kwang-Seok;Yoo, Jae-Soo
- Journal of KIISE:Databases / v.36 no.1 / pp.44-49 / 2009
In recent years, with the development of portable terminals, various search services over large data sets have been provided on portable terminals. To search large data sets, most information retrieval applications use indexes such as B-trees or R-trees. However, users access only a small portion of the data set, and the access frequencies of individual data items are not uniform. Existing indexes such as B-trees and R-trees do not consider these skewed access patterns. A cache can store frequently accessed data in memory for fast access, but the amount of memory available to the cache is restricted. In this paper, we propose a new disk-based index, called J-tree, which considers users' search patterns. The proposed index is a balanced tree that guarantees uniform search time on all data and also supports fast search time on frequently accessed data. Our experiments show the effectiveness of the proposed index under various settings.

A Physical Design Method of Storage Structures for MOLAP Systems of Data Warehouse (데이터 웨어하우스의 다차원 온라인 분석처리 시스템을 위한 저장구조의 물리적 설계기법)

Lee Jong-Hak
- Journal of Korea Multimedia Society / v.8 no.3 / pp.297-312 / 2005
Aggregation is an operation that plays a key role in the multidimensional OLAP (MOLAP) systems of data warehouses. Existing aggregation operations in MOLAP have been proposed for file structures such as multidimensional arrays. These file structures do not work well with skewed distributions. This paper presents a physical design methodology for storage structures in MOLAP that use multidimensional file organizations adapting to a skewed distribution. For uniform data distributions, we first show that the performance of multidimensional analytical processing is highly affected by the similarity of the shapes between query regions and page regions in the domain space of the multidimensional file organizations. Then, for skewed distributions, we reflect the effect of the data distribution on the design by using the shapes of the normalized query regions, weighted by the data density of those query regions. Finally, we demonstrate that the theoretically derived physical design methodology is indeed correct in real environments. For two-dimensional file organizations, the experimental results indicate that the proposed method improves performance by more than seven times over the conventional method. We expect the improvement to be greater when the dimensionality is more than two. The results confirm that the proposed physical design methodology is useful in practice.

Effective Parallel Hash Join Algorithm Based on Histogram Equalization in the Presence of Data Skew (데이터 편재 하에서 히스토그램 변환기법에 기초한 효율적인 병렬 해쉬 결합 알고리즘)

Park, Ung-Gyu;Choe, Hwang-Gyu;Kim, Tak-Gon
- The Transactions of the Korea Information Processing Society / v.4 no.2 / pp.338-348 / 1997
In this paper, we first propose a data distribution framework to resolve load imbalance and bucket overflow in parallel hash join. Using the histogram equalization technique, the framework transforms a histogram of skewed data to the desired uniform distribution that corresponds to the relative computing power of the node processors in the system. Next, we propose an efficient parallel hash join algorithm for handling skewed data based on the proposed data distribution methodology. To compare the performance of our algorithm with other hash join algorithms, we perform simulation experiments and actual execution on the COREDB database computer with an 8-node hypercube architecture. In these experiments, the skewed data distribution of the join attribute is modeled using a Zipf-like distribution. The performance studies indicate that our algorithm outperforms other algorithms in the skewed cases.
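
The idea of mapping a skewed histogram onto node capacities can be sketched roughly as below. This is one possible interpretation, not the paper's algorithm: it assumes the join-attribute domain is split into ordered histogram bins and that contiguous bins are assigned to nodes so that each node's cumulative load share tracks its cumulative capacity share. The names `assign_bins` and `node_loads` are hypothetical.

```python
def assign_bins(hist, capacities):
    """Assign each histogram bin to a node index.

    hist: tuple counts per ordered bin of the join attribute.
    capacities: relative computing power of each node (positive integers).
    A bin goes to the next node once the load assigned so far has reached
    the cumulative capacity share of the nodes used so far.
    """
    total = sum(hist)
    cap_total = sum(capacities)
    assignment = []
    node = 0
    cum_load = 0                # tuples assigned before the current bin
    cum_cap = capacities[0]     # capacity of nodes 0..node
    for count in hist:
        # Integer comparison of cum_load / total >= cum_cap / cap_total.
        while node < len(capacities) - 1 and cum_load * cap_total >= total * cum_cap:
            node += 1
            cum_cap += capacities[node]
        assignment.append(node)
        cum_load += count
    return assignment


def node_loads(hist, assignment, n_nodes):
    """Total tuple count received by each node under an assignment."""
    loads = [0] * n_nodes
    for count, node in zip(hist, assignment):
        loads[node] += count
    return loads
```

For a skewed histogram such as [8, 1, 1, 1, 1, 1, 1, 2] and two equal nodes, the heavy first bin is isolated on one node and the remaining bins go to the other, so both receive 8 tuples. A single bin is never split in this sketch, so a bin larger than a node's fair share would still overload it.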

The Approximate MLE in a Skew-Symmetric Laplace Distribution

Son, Hee-Ju;Woo, Jung-Soo
- Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.573-584 / 2007
We define a skew-symmetric Laplace distribution from a symmetric Laplace distribution and evaluate its coefficient of skewness. We then derive an approximate maximum likelihood estimator (AME) and a moment estimator (MME) of the skewed parameter in a skew-symmetric Laplace distribution, and compare the simulated mean squared errors of those estimators. We also compare the asymptotic mean squared errors of the two estimators of reliability defined for two independent skew-symmetric distributions.

Development of a Multiobjective Optimization Algorithm Using Data Distribution Characteristics (데이터 분포특성을 이용한 다목적함수 최적화 알고리즘 개발)

Hwang, In-Jin;Park, Gyung-Jin
- Transactions of the Korean Society of Mechanical Engineers A / v.34 no.12 / pp.1793-1803 / 2010
The weighting method and goal programming require weighting factors or target values to obtain a Pareto optimal solution. However, it is difficult to define these parameters, and a Pareto solution is not guaranteed when the choice of the parameters is incorrect. Recently, the Mahalanobis Taguchi System (MTS) has been introduced to minimize the Mahalanobis distance (MD). However, the MTS method cannot obtain a Pareto optimal solution. We propose a function called the skewed Mahalanobis distance (SMD) to obtain a Pareto optimal solution while retaining the advantages of the MD. The SMD is a new distance scale that multiplies the skewed value of a design point by the MD. The weighting factors are automatically reflected when the SMD is calculated. The SMD always gives a unique Pareto optimal solution. To verify the efficiency of the SMD, we present two numerical examples and show that the SMD can obtain a unique Pareto optimal solution without any additional information.
https://doi.org/10.3795/KSME-A.2010.34.12.1793

Performance Enhancement of a DVA-tree by the Independent Vector Approximation (독립적인 벡터 근사에 의한 분산 벡터 근사 트리의 성능 강화)

Choi, Hyun-Hwa;Lee, Kyu-Chul
- The KIPS Transactions:PartD / v.19D no.2 / pp.151-160 / 2012
Most distributed high-dimensional indexing structures provide reasonable search performance, especially when the dataset is uniformly distributed. However, when the dataset is clustered or skewed, search performance gradually degrades compared with the uniformly distributed case. We propose a method of improving the k-nearest neighbor search performance of the distributed vector approximation-tree on strongly clustered or skewed datasets. The basic idea is to compute the volumes of the leaf nodes on the top-tree of a distributed vector approximation-tree and to assign different numbers of bits to them in order to ensure the identification performance of the vector approximation. In other words, more bits are assigned to the high-density clusters. We conducted experiments comparing the search performance with the distributed hybrid spill-tree and the distributed vector approximation-tree, using synthetic and real data sets. The experimental results show that our proposed scheme provides consistent results with significant performance improvements over the distributed vector approximation-tree for strongly clustered or skewed datasets.
https://doi.org/10.3745/KIPSTD.2012.19D.2.151

Semiparametric Bayesian Hierarchical Selection Models with Skewed Elliptical Distribution (왜도 타원형 분포를 이용한 준모수적 계층적 선택 모형)

정윤식;장정훈
- The Korean Journal of Applied Statistics / v.16 no.1 / pp.101-115 / 2003
Lately there has been much theoretical and applied interest in linear models with non-normal, heavy-tailed error distributions. Starting with Zellner's (1976) study, many authors have explored the consequences of non-normality and heavy-tailed error distributions. We consider hierarchical models, including selection models, under a skewed heavy-tailed error distribution originally proposed by Chen, Dey and Shao (1999) and Branco and Dey (2001), with a Dirichlet process prior (Ferguson, 1973), for use in meta-analysis. A general class of skewed elliptical distributions is reviewed and developed. We also consider the detailed computational scheme under the skew normal and skew t distributions using the MCMC method. Finally, we introduce an example from Johnson's (1993) real data and apply our proposed methodology.
https://doi.org/10.5351/KJAS.2003.16.1.101

Optimal Welding Condition for the Inclined and Skewed Fillet Joints in the Curved Block of a Ship (I) (선박 골블록의 경사 필렛 이음부의 적정 용접조건 (I))

PARK JU-YONG
- Journal of Ocean Engineering and Technology / v.18 no.6 s.61 / pp.79-83 / 2004
The curved blocks that compose the bow and stern of a ship contain many skewed joints that are inclined horizontally and vertically. Most of these joints have a large fitness error, change their form continuously, and are not easily accessible. The welding position and parameter values should be set appropriately for the shape and inclination of the joints. The welding parameters, such as current, voltage, travel speed, and melting rate, are related to each other, and their values must lie in a specific limited range for sound welding. These correlations and ranges depend upon the kind and size of wire, shielding gas, joint shape, and fitness. To determine these relationships, extensive welding experiments were performed. The experimental data were processed using several information processing technologies. The regression method was used to determine the relationship between current, voltage, and deposition rate. When a joint is inclined, the weld bead should be confined to a limited size in order to avoid undercut as well as overlap caused by molten metal flowing down under gravity. The dependency of this limited weld size, defined as the critical deposited area, on various factors such as the horizontal and vertical inclination angles of the joint, the skew angle of the joint, the up or down welding direction, and weaving was investigated through a number of welding experiments. On the basis of this result, an ANN system was developed to estimate the critical deposited area. The ANN system has a 4-layer structure and uses an error back-propagation learning algorithm. The estimated values of the ANN were validated using experimental values.

Search Result 206, Processing Time 0.025 seconds
