• Title/Summary/Keyword: Best fitting


Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.143-163 / 2016
  • The demographics of Internet users are the most basic and important sources for target marketing and personalized advertising on digital marketing channels such as email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, slow, and likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere on a webpage, the activity is logged in semi-structured website log files. Such data shows which pages users visited, how long they stayed, how often and when they usually visited, which sites they prefer, what keywords they used to find a site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with the demographics, including search keywords, frequency and intensity by time, day, and month, the variety of websites visited, and text information from the web pages visited. The demographic attributes to be predicted also vary from paper to paper and cover gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision trees, neural networks, logistic regression, and k-nearest neighbors, have been used to build prediction models. However, previous research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated in order to build the best prediction model.
The objective of this study is to choose the clickstream attributes most likely to be correlated with the demographics, based on the results of previous research, and then to identify which data mining method is best suited to predicting each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job. Drawing on previous research, 64 clickstream attributes are used to predict these demographic attributes. The overall process of predictive model building is composed of four steps. In the first step, we create user profiles that include 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction of the clickstream variables to address the curse of dimensionality and the overfitting problem. We use three approaches, based on decision trees, PCA, and cluster analysis. In the third step, we build alternative predictive models for each demographic variable using SVM, neural networks, and logistic regression. The last step evaluates the alternative models in terms of accuracy and selects the best model. For the experiments, we used clickstream data covering 5 demographic attributes and 16,962,705 online activities of 5,000 Internet users. IBM SPSS Modeler 17.0 was used for the prediction process, and 5-fold cross-validation was conducted to enhance the reliability of the experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best with decision-tree-based dimension reduction and a neural network, whereas the prediction of gender and marital status is most accurate when SVM is applied without dimension reduction.
We conclude that the online behavior of Internet users, captured through clickstream data analysis, can be used to predict their demographics and thereby be utilized for digital marketing.
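
The evaluation step described in this abstract (compare alternative models by 5-fold cross-validated accuracy and keep the best) can be sketched as follows. This is only an illustration in plain Python: the study itself used IBM SPSS Modeler, and the two toy classifiers, the synthetic data, and every function name below are hypothetical stand-ins for the SVM, neural network, and logistic regression models.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Split range(n) into k shuffled folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds; fit(X, y) must return a predict function."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k

def select_best(candidates, X, y, k=5):
    """Return (name, accuracy) of the candidate with the highest CV accuracy."""
    scored = {name: cross_val_accuracy(fit, X, y, k) for name, fit in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]

def fit_majority(X, y):
    """Toy baseline (hypothetical): always predict the most common training label."""
    label = max(set(y), key=y.count)
    return lambda x: label

def fit_threshold(X, y):
    """Toy one-feature model (hypothetical): cut at the midpoint of the class means."""
    m0 = [x[0] for x, t in zip(X, y) if t == 0]
    m1 = [x[0] for x, t in zip(X, y) if t == 1]
    cut = (sum(m0) / len(m0) + sum(m1) / len(m1)) / 2
    return lambda x: int(x[0] > cut)

# Synthetic, well-separated data: 40 users per class, one clickstream-like feature.
X = [[i / 100] for i in range(40)] + [[0.6 + i / 100] for i in range(40)]
y = [0] * 40 + [1] * 40

best_name, best_acc = select_best({"majority": fit_majority, "threshold": fit_threshold}, X, y)
print(best_name, best_acc)
```

In practice each demographic attribute would get its own `select_best` call, with one candidate per combination of dimension-reduction approach and model type.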

Development of an Automatic 3D Coregistration Technique of Brain PET and MR Images (뇌 PET과 MR 영상의 자동화된 3차원적 합성기법 개발)

  • Lee, Jae-Sung;Kwark, Cheol-Eun;Lee, Dong-Soo;Chung, June-Key;Lee, Myung-Chul;Park, Kwang-Suk
    • The Korean Journal of Nuclear Medicine / v.32 no.5 / pp.414-424 / 1998
  • Purpose: Cross-modality coregistration of positron emission tomography (PET) and magnetic resonance (MR) images could enhance clinical information. In this study, we propose a refined technique to improve the robustness of registration and to implement more realistic visualization of the coregistered images. Materials and Methods: Using the sinogram of the PET emission scan, we extracted a robust head boundary and used the boundary-enhanced PET to coregister PET with MR. Pixels having 10% of the maximum pixel value were considered the boundary of the sinogram, and the boundary pixel values were replaced with the maximum value of the sinogram. One hundred eighty boundary points were extracted at intervals of about 2 degrees from each slice of the MR images using a simple threshold method. The best affine transformation between the two point sets was found by least-squares fitting, which minimizes the sum of Euclidean distances between the point sets. We reduced calculation time by using a pre-defined distance map. Finally, we developed an automatic coregistration program using this boundary detection and surface matching technique. We also designed a new weighted normalization technique to display the coregistered PET and MR images simultaneously. Results: Using our newly developed method, robust extraction of the head boundary was possible and spatial registration was performed successfully. The mean displacement error was less than 2.0 mm. In visualization of the coregistered images using the weighted normalization method, structures shown in the MR image could be realistically represented. Conclusion: Our refined technique could practically enhance the performance of automated three-dimensional coregistration.
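
As a rough illustration of fitting a transformation between two boundary point sets by least squares, the sketch below solves a much simpler problem than the one in the abstract: a 2D rigid fit (rotation plus translation) with known point correspondences, minimizing squared distances in closed form. The paper's method is affine, three-dimensional, and uses a distance map rather than paired points. The function names and the synthetic elliptical boundary are assumptions made for the example.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired points src onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy          # centred source point
        bx, by = qx - dx, qy - dy          # centred target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)         # angle that minimizes the squared error
    c, s = math.cos(theta), math.sin(theta)
    shift = (dx - (c * sx - s * sy), dy - (s * sx + c * sy))
    return theta, shift

def apply_rigid_2d(points, theta, shift):
    """Rotate points by theta about the origin, then translate by shift."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]

def mean_displacement(a, b):
    """Mean Euclidean distance between paired points."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

# Synthetic boundary: 180 points at 2-degree steps on an ellipse (hypothetical sizes).
boundary = [(90 * math.cos(math.radians(2 * k)), 70 * math.sin(math.radians(2 * k)))
            for k in range(180)]
moved = apply_rigid_2d(boundary, math.radians(10), (3.0, -2.0))

theta, shift = fit_rigid_2d(boundary, moved)
error = mean_displacement(apply_rigid_2d(boundary, theta, shift), moved)
print(math.degrees(theta), shift, error)
```

Without known correspondences, a surface-matching scheme would instead look up each transformed point's distance to the other boundary, which is where a precomputed distance map saves time.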


Comparison of Tear Distributions by the Corneal Eccentricity when Fitted with Spherical and Aspherical RGP Lenses (구면 및 비구면 RGP렌즈 피팅 시 각막 이심률별 눈물분포 비교)

  • Kim, Jihye;Kim, So Ra;Park, Mijung
    • Journal of Korean Ophthalmic Optics Society / v.21 no.2 / pp.99-108 / 2016
  • Purpose: This study aimed to compare tear volume and distribution by corneal eccentricity when spherical and aspherical RGP lenses were fitted. Methods: Spherical and aspherical RGP lenses were fitted in best alignment on a total of 77 subjects (136 eyes) in their twenties and thirties with no ocular disease or history of ocular surgery. Tear volume was analyzed by estimating the concentration of fluorescein-stained tears at the center of the RGP lens as well as in the mid-peripheral and peripheral areas, and the difference in tear distribution was analyzed according to corneal eccentricity. Results: When spherical RGP lenses were fitted on corneas with eccentricities of $e < 0.38$ and $0.68 \leq e$, tear distribution from the center to the peripheral area was not significantly different, indicating a relatively even tear distribution compared with other corneal eccentricities. With aspherical RGP lenses, the difference in tear distribution between the central and peripheral areas was smaller than with spherical RGP lenses. A significant difference in tear distribution according to RGP lens design was observed for corneal eccentricities of $0.48 < e < 0.68$. In other words, tear distribution was more even when aspherical RGP lenses were fitted on corneas with eccentricity $0.48 \leq e < 0.68$ and when spherical RGP lenses were fitted on corneas with eccentricity $0.68 \leq e$. Furthermore, tear volume in the mid-peripheral area increased with higher corneal eccentricity. Conclusions: The results suggest that RGP lens design should be selected according to corneal eccentricity, since the tear volume and regional tear distribution under spherical and aspherical lenses are affected by corneal eccentricity.

Mathematical Modelling of Phenol Desorption from Spent Activated Carbon by Acetone (활성탄에 흡착된 페놀의 아세톤 탈착 모델에 대한 연구)

  • Kim, Seungdo;Oh, Young-Jin
    • Journal of Korean Society of Environmental Engineers / v.22 no.12 / pp.2115-2123 / 2000
  • This research was designed to investigate the mathematical model and kinetics of phenol desorption from spent activated carbon, elucidating the desorption characteristics of phenol when acetone is used. The Freundlich isotherm constant ($k_e$) is expressed as a function of temperature: $k_e(T) = 0.1\exp(797.297/T)$. The Freundlich isotherm constant ($n$) is a weak function of temperature and is hardly affected by temperature below $50^{\circ}C$, whereas above $100^{\circ}C$ the $n$ value must be corrected for temperature owing to significant deviation (~5%). The desorption model was developed on the assumption that the surface desorption reaction of phenol is rate limiting. The desorption reaction constant ($k_d$) was determined by fitting the theoretical results as closely as possible to the experimental ones. The Arrhenius relationship for $k_d$ was expressed as $k_d\,(\mathrm{sec}^{-1}) = 0.0479 \cdot \exp(-3037/T)$. The model was verified by comparing experimental results obtained under different reaction conditions with theoretical results calculated using the previously estimated $k_d$. Since the difference between them is within 5%, the desorption model of this research appears appropriate for explaining the desorption of phenol from activated carbon by acetone. Based on the model, the regeneration time and ratio were estimated as functions of temperature under the present conditions as follows: (1) regeneration time: $\tau_{reg}\,(\mathrm{hr}) = -0.08130\,T_c + 8.4775$; (2) regeneration ratio: $\eta\,(\%) = 0.2210\,T_c + 83.745$. The regeneration times at 15, 55, and $100^{\circ}C$ were 7, 4.2, and 0.35 hours, respectively, whereas the regeneration ratios were 87, 96, and 99%, respectively. The model also makes it possible to determine the regeneration time and ratio under other specific conditions (temperature, applied acetone volume, amount of activated carbon, and initially adsorbed phenol amount).
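
The temperature expressions quoted in this abstract are easy to evaluate directly. The sketch below does so for the Freundlich constant, the desorption rate constant, and the linear regeneration-time fit. It assumes that $T$ in the exponential expressions is absolute temperature in kelvin and that $T_c$ is in degrees Celsius; the abstract does not state the units explicitly, and the function names are illustrative.

```python
import math

def freundlich_k_e(temp_kelvin):
    """k_e(T) = 0.1 * exp(797.297 / T), with T assumed to be in kelvin."""
    return 0.1 * math.exp(797.297 / temp_kelvin)

def desorption_k_d(temp_kelvin):
    """Arrhenius form k_d = 0.0479 * exp(-3037 / T) in 1/s, with T assumed in kelvin."""
    return 0.0479 * math.exp(-3037.0 / temp_kelvin)

def regeneration_time_hr(temp_celsius):
    """Linear fit tau_reg = -0.08130 * T_c + 8.4775 in hours, with T_c in degrees Celsius."""
    return -0.08130 * temp_celsius + 8.4775

for t_c in (15.0, 55.0, 100.0):
    t_k = t_c + 273.15
    print(t_c, freundlich_k_e(t_k), desorption_k_d(t_k), regeneration_time_hr(t_c))
```

Under these assumptions the linear fit gives roughly 7.3, 4.0, and 0.35 hours at 15, 55, and 100 °C, close to the regeneration times reported in the abstract, while $k_d$ rises and $k_e$ falls as temperature increases.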


Determinants of Dual-earner Wives' Needs for Family-supportive Services: A Comparison of Professional and Blue-collar Models (맞벌이 부인의 가족지원서비스 필요도 결정요인 : 전문직과 생산직 모델 비교)

  • Lee, Myung-Shin
    • Korean Journal of Social Welfare / v.36 / pp.199-228 / 1998
  • This study is designed to identify the determinants of dual-earner wives' needs for family-supportive services. For this purpose, a hypothetical model was developed that explains the relationships among 6 stressors, role overload, stress, and needs for 4 family-supportive services. The hypothetical model was tested using data collected by purposive sampling from 234 professional women and 208 blue-collar women living in Chinju and Sacheon. To examine occupational class differences, a model for professional women and a model for blue-collar women were developed separately and compared. Covariance structure analysis was used for data analysis. The best-fitting model for professional women (df=141, GFI=0.928, CFI=0.965) and the best-fitting model for blue-collar women (df=141, GFI=0.902, CFI=0.912) were found. Comparing the two models revealed 9 common relationships: 1) greater dissatisfaction with child care services increases role overload; 2) longer work hours increase role overload; 3) a higher level of role overload increases stress; 4) a higher level of stress increases needs for leave; 5) an older child increases needs for flexible work patterns; 6) a younger child increases needs for financial assistance with child care fees; 7) needs for financial assistance with child care increase needs for on-site child care services; 8) needs for on-site child care services increase needs for leave; 9) needs for leave increase needs for flexible work patterns. Apart from these 9 common relationships, the analyses revealed substantial differences between professional and blue-collar dual-earner wives. Based on the common and differing needs of the 2 groups of wives, effective ways of providing family-supportive services according to the needs of individual dual-earner wives in different familial, financial, and work conditions were suggested.


Frequency of Micronuclei in Lymphocytes Following Gamma and Fast-neutron Irradiations (방사선 조사량에 따른 인체 정상 림파구의 미세핵 발생빈도)

  • Kim Sung-Ho;Cho Chul-Koo;Kim Tae-Hwan;Chung In-Yong;Yoo Seong-Yul;Koh Kyoung-Hwan;Yun Hyong-Geun
    • Radiation Oncology Journal / v.11 no.1 / pp.35-42 / 1993
  • The dose response of the number of micronuclei in cytokinesis-blocked (CB) lymphocytes after in vitro irradiation with $\gamma$-rays and neutrons in 5 dose ranges was studied for a heterogeneous population of 4 donors. One thousand binucleated cells were systematically scored for micronuclei. Measurements performed after irradiation showed a dose-dependent increase in micronuclei (MN) frequency in each of the donors studied. The dose-response curves were analyzed with a linear-quadratic model. The frequencies per 1000 CB cells were $(0.31 \pm 0.049)D + (0.0022 \pm 0.0002)D^2 + (13.19 \pm 1.854)$ ($r^2 = 1.000$, $\chi^2 = 0.7074$, $p = 0.95$) following $\gamma$ irradiation and $(0.99 \pm 0.528)D + (0.0093 \pm 0.0047)D^2 + (13.31 \pm 7.309)$ ($r^2 = 0.996$, $\chi^2 = 7.6834$, $p = 0.11$) following neutron irradiation, where $D$ is the irradiation dose in cGy. The relative biological effectiveness (RBE) of neutrons compared with $\gamma$-rays was estimated from the best-fitting linear-quadratic models. For micronuclei frequencies between 0.05 and 0.8 per cell, the RBE of neutrons was $2.37 \pm 0.17$. Since the MN assay is simple and rapid, it may be a good tool for evaluating the $\gamma$-ray and neutron response.
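
One common way to read an RBE off two fitted dose-response curves is to take the ratio of the doses that produce the same effect. The sketch below applies that definition to the central coefficient values quoted in the abstract, ignoring the quoted uncertainties. The abstract does not say exactly how the reported RBE was averaged over the frequency range, so the calculation is an assumption, as are the function names.

```python
import math

# (linear, quadratic, background) coefficients per 1000 CB cells, dose D in cGy.
GAMMA = (0.31, 0.0022, 13.19)
NEUTRON = (0.99, 0.0093, 13.31)

def mn_per_1000(dose_cgy, coeffs):
    """Linear-quadratic yield: a*D + b*D^2 + c micronuclei per 1000 CB cells."""
    a, b, c = coeffs
    return a * dose_cgy + b * dose_cgy ** 2 + c

def dose_for_yield(yield_per_1000, coeffs):
    """Invert the linear-quadratic model for the non-negative dose giving this yield."""
    a, b, c = coeffs
    disc = a * a + 4.0 * b * (yield_per_1000 - c)
    return (-a + math.sqrt(disc)) / (2.0 * b)

def rbe_at_yield(yield_per_1000):
    """Gamma dose divided by neutron dose at equal micronucleus yield."""
    return dose_for_yield(yield_per_1000, GAMMA) / dose_for_yield(yield_per_1000, NEUTRON)

# 0.05 to 0.8 micronuclei per cell corresponds to 50 to 800 per 1000 cells.
for level in (50, 200, 400, 800):
    print(level, rbe_at_yield(level))
```

With these coefficients the dose ratio stays between 2 and 3 across the whole 50 to 800 range, which is in line with the RBE reported above.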


Infrared Characteristics of Some Flash Light Sources (섬광의 적외선 특성 연구)

  • Lim, Sang-Yeon;Park, Seung-Man
    • Korean Journal of Optics and Photonics / v.27 no.1 / pp.18-24 / 2016
  • To effectively utilize a flash and predict its effects on an infrared device, it is essential to know the infrared characteristics of the flash source. In this paper, the IR characteristics of flash light sources are studied. The IR characteristics of three flash sources, two combustive and one explosive, are measured with an IR characteristic measurement system over the middle- and long-wavelength infrared ranges. From the measurements, the radiances over the two IR ranges and the radiative temperatures of the flashes are extracted. The IR radiance of flash A is found to be the strongest among the three, followed by those of sources C and B. It is also shown that the IR radiance of flash A is about 10 times stronger than that of flash B, even though these two sources are the same type of flash with the same powder. This means that the IR radiance intensity of a combustive flash source depends only on the amount of powder, not on the characteristics of the powder. From the radiance measured over the MWIR and LWIR ranges for each flash, the radiative temperatures are extracted by fitting the measured data to blackbody radiance. The best-fit radiative temperatures (equivalent to blackbody temperatures) of the three flash sources A, B, and C are 3300, 1120, and 1640 K, respectively. From the radiance measurements and radiative temperatures of the three flash sources, it is shown that a combustive source radiates more IR energy than an explosive one; this means, in turn, that the effects of a combustive flash on an IR device are more profound than those of an explosive flash source. The measured IR radiances and radiative temperatures of the flash sources in this study can be used to estimate the effects of flashes on various IR devices and can play a critical role in the modeling and simulation of those effects.
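
One simple way to extract a blackbody-equivalent temperature from two band measurements is to match the ratio of MWIR to LWIR radiance, which rises monotonically with temperature for a blackbody. The sketch below does this with Planck's law, trapezoidal integration, and bisection. It is an assumption about how such a fit could be done, not the paper's procedure: the band limits (3 to 5 µm and 8 to 12 µm) are assumed, and the source is taken to have equal emissivity in both bands.

```python
import math

H = 6.62607015e-34   # Planck constant, J*s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K

MWIR_UM = (3.0, 5.0)    # assumed band limits, micrometres
LWIR_UM = (8.0, 12.0)   # assumed band limits, micrometres

def planck_radiance(wavelength_m, temp_k):
    """Blackbody spectral radiance per unit wavelength (W sr^-1 m^-3)."""
    x = H * C / (wavelength_m * KB * temp_k)
    return 2.0 * H * C ** 2 / wavelength_m ** 5 / math.expm1(x)

def band_radiance(band_um, temp_k, steps=400):
    """In-band blackbody radiance (W sr^-1 m^-2) by the trapezoidal rule."""
    lo, hi = band_um[0] * 1e-6, band_um[1] * 1e-6
    step = (hi - lo) / steps
    total = 0.5 * (planck_radiance(lo, temp_k) + planck_radiance(hi, temp_k))
    for i in range(1, steps):
        total += planck_radiance(lo + i * step, temp_k)
    return total * step

def band_ratio(temp_k):
    """MWIR-to-LWIR radiance ratio of a blackbody; increases with temperature."""
    return band_radiance(MWIR_UM, temp_k) / band_radiance(LWIR_UM, temp_k)

def fit_temperature(measured_ratio, t_lo=300.0, t_hi=6000.0, iterations=60):
    """Bisection for the temperature whose band ratio matches the measured ratio."""
    for _ in range(iterations):
        mid = 0.5 * (t_lo + t_hi)
        if band_ratio(mid) < measured_ratio:
            t_lo = mid
        else:
            t_hi = mid
    return 0.5 * (t_lo + t_hi)

# Round trip on synthetic data: generate a ratio at a known temperature, then recover it.
synthetic_ratio = band_ratio(1640.0)
recovered = fit_temperature(synthetic_ratio)
print(recovered)
```

Using the ratio rather than the absolute radiances makes the estimate independent of an unknown but wavelength-independent emissivity or source area, at the cost of ignoring any spectral structure inside each band.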

Environmental Factors, Types of Bullying Behavior, and Psychological and Behavioral Outcomes for the Bullies (괴롭힘 가해자의 환경적 요인, 괴롭힘 행동유형, 가해자의 심리.행동적 결과에 대한 연구)

  • Lee, Myung-Shin
    • Korean Journal of Social Welfare / v.51 / pp.29-61 / 2002
  • This study was designed to identify the determinants of types of bullying behavior and the effects of those types of behavior on the bullies. For this purpose, a hypothetical model was developed that explains the relationships among 6 environmental factors, 5 types of bullying behavior, and 5 outcome variables for the bullies. The hypothetical model was tested using data collected from 177 junior high and high school students who had bullied other students. Path analysis was used for data analysis, and the best-fitting model was found (df=78, GFI=0.953, CFI=1.00). Analysis of the model showed that the types of bullying behavior were determined by different environmental factors: isolation was determined by 2 factors (feeling of isolation from friends, exposure to bullying), social bullying by 2 factors (lack of support from parents, exposure to bullying), verbal bullying by conflicts with parents, physical bullying by 3 factors (lack of support from parents, exposure to isolation, and exposure to bullying), and instrumental bullying by lack of support from parents. The pleasure that bullies feel after bullying was increased by isolation, verbal bullying, and physical bullying, but decreased by instrumental bullying. Feelings of guilt were decreased by isolation and instrumental bullying, but increased by physical bullying. Isolation increased the tendency to blame the victim. Isolation and instrumental bullying increased bullies' self-esteem, while social bullying decreased it. Verbal bullying increased the extent of bullying, while instrumental bullying decreased it. Based on the findings, intervention strategies to change bullies' attitudes toward victims and to increase social support from significant others, as well as effective ways to reorganize the school environment in order to reduce and prevent bullying behavior, were suggested.


A study on solar radiation prediction using medium-range weather forecasts (중기예보를 이용한 태양광 일사량 예측 연구)

  • Sujin Park;Hyojeoung Kim;Sahm Kim
    • The Korean Journal of Applied Statistics / v.36 no.1 / pp.49-62 / 2023
  • Solar energy, whose share is rapidly increasing, is continuously being developed and invested in. With the Green New Deal renewable energy policy and the growing installation of home solar panels, the supply of solar energy in Korea is gradually expanding, and research on accurate prediction of power generation demand is actively underway. Solar radiation prediction is important because solar radiation is the factor that most influences the prediction of power generation demand. This study differs most from previous work in that it attempts to predict solar radiation using medium-range weather forecast data, which earlier studies did not use. In this paper, we combined a multiple linear regression model, KNN, random forest, and SVR with the K-means clustering technique to predict hourly solar radiation by calculating the probability density function for each cluster. Before the medium-range forecast data were used, mean absolute error (MAE) and root mean squared error (RMSE) were used as indicators to compare model prediction results. The data, covering March 1, 2017 to February 28, 2022, were converted into daily data to match the medium-range forecast data format. Comparing the predictive performance of the models showed that the best-performing method was to predict daily solar radiation with random forest, group dates with similar climate factors, and calculate the probability density function of solar radiation for each cluster. When the prediction results were checked after fitting the model to the medium-range forecast data using this methodology, the prediction error was confirmed to increase by date. This seems to be due to prediction error in the medium-range forecast weather data itself.
In future studies, exogenous variables such as precipitation should be added from among the weather factors available in the medium-range forecast data, or time series clustering techniques should be applied.
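
The two comparison indicators named in this abstract, MAE and RMSE, can be computed as in the short sketch below. The sample values are made up purely to show the calculation and have no connection to the study's data.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between paired observations and predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalizes large errors more heavily than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical daily solar radiation values and predictions (arbitrary units).
observed = [10.0, 12.0, 9.0, 15.0]
forecast = [11.0, 12.0, 7.0, 15.0]

print(mae(observed, forecast), rmse(observed, forecast))
```

Reporting both is useful because RMSE is never smaller than MAE, and a large gap between them signals that a few days with large errors dominate.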

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.185-202 / 2012
  • Since the value of information came to be recognized in the information society, the use and collection of information have become important. A facial expression, like an artistic painting, carries a wealth of information that could be described in thousands of words. Following this idea, there have recently been a number of attempts to provide customers and companies with intelligent services that perceive human emotions through facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed a human emotion prediction model and has applied its studies to commercial business. In academia, conventional methods such as Multiple Regression Analysis (MRA) and Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized for its low prediction accuracy. This is inevitable, since MRA can only explain a linear relationship between the dependent variable and the independent variables. To mitigate the limitations of MRA, some studies, such as Jung and Kim (2012), have used ANN as an alternative and reported that ANN generated more accurate predictions than statistical methods like MRA. However, ANN has also been criticized for overfitting and the difficulty of network design (e.g., setting the number of layers and the number of nodes in the hidden layers). Against this background, we propose a novel model using Support Vector Regression (SVR) to increase prediction accuracy. SVR is an extension of the Support Vector Machine (SVM) designed to solve regression problems. The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold $\varepsilon$) to the model prediction.
Using SVR, we tried to build a model that can measure the levels of arousal and valence from facial features. To validate the usefulness of the proposed model, we collected data on facial reactions to appropriate visual stimuli and extracted features from the data. Next, preprocessing steps were taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted the $\varepsilon$-insensitive loss function and a grid search technique to find the optimal values of parameters such as $C$, $d$, $\sigma^2$, and $\varepsilon$. For ANN, we adopted a standard three-layer backpropagation network, which has a single hidden layer. The learning rate and momentum rate of ANN were set to 10%, and we used the sigmoid function as the transfer function of the hidden and output nodes. We repeated the experiments while varying the number of nodes in the hidden layer to n/2, n, 3n/2, and 2n, where n is the number of input variables. The stopping condition for ANN was set to 50,000 learning events. We used MAE (Mean Absolute Error) as the measure for performance comparison. From the experiment, we found that SVR achieved the highest prediction accuracy on the hold-out data set compared with MRA and ANN, regardless of the target variable (the level of arousal or the level of positive/negative valence). ANN also outperformed MRA; however, it showed considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to researchers and practitioners who want to build models for recognizing human emotions.
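
The $\varepsilon$-insensitive loss mentioned in this abstract is what makes an SVR model depend on only part of the training data: residuals inside the $\varepsilon$ tube cost nothing. A minimal sketch of the loss, with made-up targets and predictions and illustrative function names, is:

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear in the excess residual outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def points_with_loss(targets, predictions, epsilon):
    """Indices of points whose residual exceeds epsilon and so contributes to the cost."""
    return [i for i, (t, p) in enumerate(zip(targets, predictions)) if abs(t - p) > epsilon]

# Hypothetical arousal scores and model outputs.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.05, 2.5, 2.95, 3.0]

losses = [epsilon_insensitive_loss(t, p, 0.1) for t, p in zip(targets, predictions)]
print(losses, points_with_loss(targets, predictions, 0.1))
```

Here only the second and fourth points incur a loss. Widening $\varepsilon$ shrinks that set further, which is one reason $\varepsilon$ is tuned together with the other parameters in a grid search.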