• Title/Summary/Keyword: categorical effect


Error cause analysis of Pearson test statistics for k-population homogeneity test (k-모집단 동질성검정에서 피어슨검정의 오차성분 분석에 관한 연구)

  • Heo, Sunyeong
    • Journal of the Korean Data and Information Science Society / v.24 no.4 / pp.815-824 / 2013
  • The traditional Pearson chi-squared test is not appropriate for data collected under a complex sample design. When it is applied to complex-sample categorical data, it may give wrong test results, and the error may arise not only from biased variance estimators but also from biased point estimators of the cell proportions. In this study, a design-based consistent Wald test statistic was derived for the k-population homogeneity test, and the traditional Pearson chi-squared test statistic was partitioned into three parts according to the cause of error: the error due to the bias of the variance estimator, the error due to the bias of the cell proportion estimator, and the unseparated error due to both biases. The size of each error component relative to the Pearson chi-squared test statistic was then analyzed empirically, using the second-year data from the fourth Korea National Health and Nutrition Examination Survey (KNHANES IV-2). The empirical results show that the error from the bias of the variance estimator was larger than the error from the bias of the cell proportion estimator, although the degree differed from variable to variable.
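
For reference, the traditional Pearson statistic that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration that assumes independent multinomial sampling within each population, which is exactly the assumption a complex sample design violates. The function name and the counts are hypothetical, and the sketch does not reproduce the paper's design-based Wald statistic or its error decomposition.

```python
def pearson_homogeneity(table):
    """Pearson chi-squared statistic for a k x c table of counts.

    Each row is one population and each column one category.
    Assumes independent multinomial sampling within each row.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under homogeneity (common cell proportions).
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts for k = 2 populations and 3 categories.
counts = [[30, 50, 20],
          [45, 35, 20]]
stat, dof = pearson_homogeneity(counts)
print(f"X^2 = {stat:.3f} on {dof} degrees of freedom")
```

With survey weights and clustering, both the cell proportions and their variances would need design-based estimates, which is the gap the abstract describes.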

A Study on Utilization of Vision Transformer for CTR Prediction (CTR 예측을 위한 비전 트랜스포머 활용에 관한 연구)

  • Kim, Tae-Suk;Kim, Seokhun;Im, Kwang Hyuk
    • Knowledge Management Research / v.22 no.4 / pp.27-40 / 2021
  • Click-through rate (CTR) prediction is a key function of a recommendation system: it determines the ranking of candidate items and recommends high-ranking items, reducing customer information overload and maximizing profit through sales promotion. Natural language processing and image classification have grown remarkably through the use of deep neural networks. Recently, a transformer model based on an attention mechanism, distinct from the previously mainstream models in these fields, has been proposed and has achieved state-of-the-art results. In this study, we present a method for improving the performance of a transformer model for CTR prediction. CTR data are discrete and categorical, unlike natural language and image data, so to analyze the effect of these characteristics on performance we conducted experiments on embedding regularization and transformer normalization. The experimental results confirm that the prediction performance of the transformer improved significantly when L2 regularization was applied in the embedding process for CTR input data and when batch normalization was applied to the transformer model instead of layer normalization, its default normalization method.
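
The two normalization schemes compared in this abstract differ in the axis over which the mean and variance are computed. The sketch below illustrates only that difference in plain Python. It omits learnable scale and shift parameters and running statistics, the function names and the `EPS` value are assumptions, and it is not the authors' model.

```python
from statistics import fmean

EPS = 1e-5  # small constant for numerical stability (assumed value)


def _standardize(values):
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = fmean(values)
    var = fmean([(v - mean) ** 2 for v in values])
    return [(v - mean) / (var + EPS) ** 0.5 for v in values]


def layer_norm(batch):
    """Normalize each sample (row) across its own features."""
    return [_standardize(row) for row in batch]


def batch_norm(batch):
    """Normalize each feature (column) across the samples in the batch."""
    columns = [_standardize(list(col)) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]


# Hypothetical batch: 2 samples with 3 features each.
batch = [[1.0, 2.0, 3.0],
         [2.0, 4.0, 6.0]]
ln = layer_norm(batch)
bn = batch_norm(batch)
```

After `layer_norm` every row has zero mean, while after `batch_norm` every column does, so batch normalization makes each feature's scale depend on the other samples in the batch.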

Applicability Evaluation of a Mixed Model for the Analysis of Repeated Inventory Data : A Case Study on Quercus variabilis Stands in Gangwon Region (반복측정자료 분석을 위한 혼합모형의 적용성 검토: 강원지역 굴참나무 임분을 대상으로)

  • Pyo, Jungkee;Lee, Sangtae;Seo, Kyungwon;Lee, Kyungjae
    • Journal of Korean Society of Forest Science / v.104 no.1 / pp.111-116 / 2015
  • The purpose of this study was to evaluate a mixed model of the dbh-height relation that contains a random effect. Data were obtained from a survey site for Quercus variabilis in the Gangwon region, and the same site was remeasured after three years. The mixed model was fitted with fixed effects for the dbh-height relation of Quercus variabilis and a random effect representing the correlation between survey periods. To evaluate the model with the random effect, the Akaike information criterion (AIC) was used, together with the estimated variance-covariance matrix and the residual of the repeated data. The estimates for the variance-covariance matrix and the residual were -0.0291 and 0.1007, respectively. The model with the random effect (AIC = -215.5) had a lower AIC value than the model with fixed effects only (AIC = -154.4). Because a random effect associated with categorical data is used in the fitting process, the model can be calibrated to a repeatedly measured site by obtaining measurements. The results of this study could therefore provide a useful method for developing models from repeated measurements.
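
The model comparison reported in this abstract rests on the usual reading of AIC, where a lower value indicates a better trade-off between fit and complexity. A minimal sketch of that comparison follows, using the two AIC values quoted in the abstract. The `aic` helper uses the standard definition AIC = 2k - 2 ln L, and its name and the other variable names are illustrative only.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# AIC values as reported in the abstract.
aic_random = -215.5  # model with random effect
aic_fixed = -154.4   # model with fixed effects only

delta = aic_random - aic_fixed
preferred = "random-effect model" if delta < 0 else "fixed-effect model"
print(f"delta AIC = {delta:.1f}, preferred: {preferred}")
```

The negative difference is what the abstract summarizes when it says the random-effect model has the lower AIC.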

Implicit Representations of Social Categories: Asymmetrical Priming Effects on Gender Stereotype (사회적 범주의 암묵적 표상 구조: 성별 고정관념의 비대칭적 점화효과)

  • 이재호;조긍호;오경기;김미라
    • Korean Journal of Cognitive Science / v.12 no.1_2 / pp.43-54 / 2001
  • This study was conducted to explore the implicit structure of gender stereotypes, one type of social category. Social categories are considered to have more evaluative properties and less clear hierarchical representations than object or action categories. In this series of experiments, we examined whether the congruency effect generalizes to gender stereotypes, using a priming paradigm and varying the stimulus onset asynchrony (SOA) from 250 ms to 1000 ms. The results of Experiments 1 and 3 (SOA 250-500 ms) showed that the priming effect in the female-female condition was larger than in the other conditions. However, in Experiment 2 (SOA 1000 ms) the priming effects disappeared in all conditions. We found the female-congruent effect only at short SOAs. These results suggest the possibility that, at the level of automatic and implicit processing, gender stereotypes can be represented by a cross-categorical structure in some cultural areas.

An Exploration of Factor's of Service Quality influencing at User's Satisfaction and Distribution Channel of the Digital Contents (디지털 콘텐츠 사용자의 만족에 영향을 주는 서비스 품질 요인 및 유통 채널 탐색에 관한 연구)

  • Suh, Jung Han;Bae, Soonh Han;Kim, Young Gook;Choi, Jae Young
    • Journal of Korea Society of Digital Industry and Information Management / v.7 no.4 / pp.183-198 / 2011
  • With the recent development of IT, existing contents have been digitalized through various distribution channels. Accordingly, many studies have been carried out to understand the distribution and features of digital contents. In these studies, however, the categorical characteristics of digital contents were not considered; most previous researchers treated digital contents as a single item or focused on contents within a particular area such as movies or music. This study therefore divides digital contents into movies, music and texts, and examines which factors affect customer satisfaction depending on the kind of content. Drawing on SERVQUAL, I used five factors as independent variables affecting customer satisfaction: Design Quality, Information Quality, Security Quality, Communication Quality and Transaction Quality. The detailed items were revised through open-ended questions and a pre-survey so that they better fit the features of digital contents. This research conducted principal component analysis, a reliability test, correlation analysis and regression analysis. I verified that each service quality factor has a positive effect on customer satisfaction, and that the influential factors differ according to the kind of digital content. An exploratory study was also added to find the best distribution channel; for it, I examined the possible distribution channels for each type of digital content and their characteristics.

Electromyographic evidence for a gestural-overlap analysis of vowel devoicing in Korean

  • Jun, Sun-A;Beckman, M.;Niimi, Seiji;Tiede, Mark
    • Speech Sciences / v.1 / pp.153-200 / 1997
  • In languages such as Japanese, it is very common to observe that short peripheral vowels are completely voiceless when surrounded by voiceless consonants. This phenomenon has also been reported in Montreal French, Shanghai Chinese, Greek, and Korean. Traditionally it has been described as a phonological rule that either categorically deletes the vowel or changes the [+voice] feature of the vowel to [-voice]. This analysis was supported by the observation of Sawashima (1971) and Hirose (1971) that there are two distinct EMG patterns for voiced and devoiced vowels in Japanese. Close examination of the phonetic evidence based on acoustic data, however, shows that these phonological characterizations are not tenable (Jun & Beckman 1993, 1994). In this paper, we examined the vowel devoicing phenomenon in Korean using EMG, fiberscopic and acoustic recordings of 100 sentences produced by one Korean speaker. The results show that there is variability in the 'degree of devoicing' in both acoustic and EMG signals, and in the patterns of glottal closing and opening across different devoiced tokens. There seems to be no categorical difference between devoiced and voiced tokens, for either EMG activity events or glottal patterns. All of these observations support the notion that vowel devoicing in Korean cannot be described as the result of the application of a phonological rule. Rather, devoicing seems to be a highly variable 'phonetic' process, a more or less subtle variation in the specification of such phonetic metrics as degree and timing of glottal opening, or of associated subglottal pressure or intra-oral airflow associated with concurrent tone and stricture specifications. Some of the token-pair comparisons are amenable to an explanation in terms of gestural overlap and undershoot. However, the effect of gestural timing on vocal fold state seems to be a highly nonlinear function of the interaction among specifications for the relative timing of glottal adduction and abduction gestures, of the amplitudes of the overlapped gestures, of aerodynamic conditions created by concurrent oral tonal gestures, and so on. In summary, to understand devoicing, it will be necessary to examine its effect on the phonetic representation of events in many parts of the vocal tract, and at many stages of the speech chain between the motor intent and the acoustic signal that reaches the hearer's ear.

Effect of complex sample design on Pearson test statistic for homogeneity (복합표본자료에서 동질성검정을 위한 피어슨 검정통계량의 효과)

  • Heo, Sun-Yeong;Chung, Young-Ae
    • Journal of the Korean Data and Information Science Society / v.23 no.4 / pp.757-764 / 2012
  • This research compares test statistics for homogeneity when the data are collected under a complex sample design. Survey data based on a complex sample design do not satisfy the independence condition required for the standard Pearson multinomial-based chi-squared test. Today, many data sets are collected under complex sample designs, yet tests for categorical data are still conducted using the standard Pearson chi-squared test. In this study, we compared the performance of three test statistics for homogeneity between two populations, using data from the 2009 customer satisfaction evaluation survey of the services of the Gyeongsangnam-do regional offices of education: the standard Pearson test, the unbiased Wald test, and the Pearson-type test with survey-based point estimates. Through empirical analyses, we first showed that the standard Pearson test greatly inflates the values of the test statistic and that its results are not reliable. Second, in the comparison of the Wald test and the Pearson-type test, we found that the test results are affected by the number of categories and by the mean and standard deviation of the eigenvalues of the design matrix.

Data Mining-Based Performance Prediction Technology of Geothermal Heat Pump System (지열 히트펌프 시스템의 데이터 마이닝 기반 성능 예측 기술)

  • Hwang, Min Hye;Park, Myung Kyu;Jun, In Ki;Sohn, Byonghu
    • Transactions of the KSME C: Technology and Education / v.4 no.1 / pp.27-34 / 2016
  • This preliminary study investigated data mining-based methods to assess and predict the performance of a geothermal heat pump (GHP) system. Data mining is a key process of knowledge discovery in databases (KDD), which includes five steps: 1) selection; 2) pre-processing; 3) transformation; 4) analysis (data mining); and 5) interpretation/evaluation. We used two analysis models, categorical and numerical decision tree models, to ascertain the patterns of performance (COP) and electrical consumption of the GHP system. Prior to applying the decision tree models, we statistically analyzed the measurement database to determine the effect of sampling intervals on the system performance. The analysis showed that the 10-min sampling data had the highest accuracy for the performance analysis, 97.7% relative to the actual dataset of the GHP system.

Isokinetic Muscle Strength and Muscle Endurance by the Types and Size of Rotator Cuff Tear in Men

  • Kim, In Bo;Kim, Do Keun
    • Clinics in Shoulder and Elbow / v.17 no.4 / pp.166-174 / 2014
  • Background: This study aimed to determine the effect on shoulder isokinetic muscle strength and muscle endurance of an isolated full-thickness supraspinatus tendon tear and of a tear combined with other rotator cuff tears. Methods: A total of 81 male patients (mean age 57.8 ± 7.4 years) diagnosed with a full-thickness supraspinatus tendon tear were included. They were classified as having an isolated or a combined tear. Isokinetic muscle strength and muscle endurance were measured using the Biodex Multi-Joint System PRO® (Biodex Medical Systems, Shirley, NY, USA) in the following movements: shoulder abduction, adduction, flexion, extension, external rotation, and internal rotation. The differences in muscle function according to the type of tear were then assessed. Fifty-seven patients had an isolated supraspinatus tendon tear (mean age 56.9 ± 7.3 years). They were classified into either anteroposterior tear or modified mediolateral tear. Tear size was measured on T2-weighted magnetic resonance imaging scans in the sagittal plane. Results: Between subjects categorized by type of tear, we found significant inter-categorical differences in isokinetic muscle strength during abduction, adduction, flexion, extension, and internal rotation, and in muscle endurance during flexion, extension, and internal rotation. With anteroposterior tear diameter, we found no significant differences in either isokinetic muscle strength or muscle endurance during any movement. However, with modified mediolateral diameter, we found significant differences in isokinetic muscle strength during adduction and in muscle endurance during external rotation and internal rotation. Conclusions: A supraspinatus tendon tear accompanied by a greater number of other rotator cuff tears is associated with lower isokinetic muscle strength and muscle endurance than an isolated tear.

Causality between climatic and soil factors on Italian ryegrass yield in paddy field via climate and soil big data

  • Kim, Moonju;Peng, Jing-Lun;Sung, Kyungil
    • Journal of Animal Science and Technology / v.61 no.6 / pp.324-332 / 2019
  • This study aimed to identify the causality between climatic and soil variables affecting the yield of Italian ryegrass (Lolium multiflorum Lam., IRG) in paddy fields by constructing pathways with a structural equation model. The IRG data (n = 133) were collected from the National Agricultural Cooperative Federation (1992-2013). The climatic variables were accumulated temperature, growing days and precipitation amount from the weather information system of the Korea Meteorological Administration, and the soil variables were the soil physical properties effective soil depth, slope, gravel content and drainage class from the soil information system of the Rural Development Administration. In general, IRG cultivation in a rice-rotation system in paddy fields is important and unique to East Asia because it increases income by using the agricultural off-season. The results showed that the seasonal effects of accumulated temperature and growing days in autumn and the following spring were evident; furthermore, autumnal temperature and spring precipitation indirectly influenced yield through spring temperature. The effects of autumnal temperature, spring temperature, spring precipitation and soil physical factors were 0.62, 0.36, 0.23, and 0.16, in that order (p < 0.05). Although the relationship between the soil physical variables and precipitation was not significant, this does not mean that there was no association: because the soil physical variables were categorical, their effects were only weakly reflected even after scale adjustment by jitter transformation. We expect that this study can contribute to increasing IRG yield by presenting the causality of climatic and soil factors, and that it can be extended to various other factors.