• Title/Summary/Keyword: higher-order clustering

Personalized Product Recommendation Method for Analyzing User Behavior Using DeepFM

  • Xu, Jianqiang;Hu, Zhujiao;Zou, Junzhong
    • Journal of Information Processing Systems / v.17 no.2 / pp.369-384 / 2021
  • In a personalized product recommendation system, recommendation accuracy degrades sharply when the log data are very large or sparse. To solve this problem, a personalized product recommendation method that uses a deep factorization machine (DeepFM) to analyze user behavior is proposed. First, the K-means clustering algorithm clusters the original log data by similarity to reduce the data dimension. Then, through the DeepFM parameter-sharing strategy, the relationship between low- and high-order feature combinations is learned from the log data, and a click-through rate prediction model is constructed. Finally, based on the predicted click-through rate, products are recommended to users in sequence and the results are fed back. The area under the curve (AUC) and Logloss of the proposed method are 0.8834 and 0.0253 on the Criteo dataset, and 0.7836 and 0.0348 on the KDD2012 Cup dataset, respectively. Compared with other recent recommendation methods, the proposed method achieves better recommendation performance.
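
The low-order feature combinations mentioned in this abstract are commonly modeled by the second-order term of a factorization machine. The sketch below is illustrative only and is not taken from the paper: the names `fm_second_order`, `x`, and `V` are assumptions, and the data are random. It shows the usual linear-time rewriting of the pairwise sum and checks it against the direct double loop.

```python
import numpy as np

def fm_second_order(x, V):
    """Second-order FM term: sum over i<j of <V[i], V[j]> * x[i] * x[j].

    Uses the identity 0.5 * sum_f ((sum_i V[i,f] x[i])^2 - sum_i V[i,f]^2 x[i]^2),
    which costs O(n * k) instead of O(n^2 * k).
    """
    xv = x @ V                    # shape (k,): sum_i V[i, f] * x[i]
    x2v2 = (x ** 2) @ (V ** 2)    # shape (k,): sum_i V[i, f]^2 * x[i]^2
    return 0.5 * float(np.sum(xv ** 2 - x2v2))

def fm_second_order_naive(x, V):
    """Direct double loop over feature pairs, used only as a reference."""
    total = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            total += float(V[i] @ V[j]) * x[i] * x[j]
    return total

# Hypothetical input: 6 features, 4-dimensional latent vectors.
rng = np.random.default_rng(0)
x = rng.random(6)
V = rng.normal(size=(6, 4))
fast = fm_second_order(x, V)
slow = fm_second_order_naive(x, V)
```

In DeepFM the latent vectors `V` are shared between this term and the deep network's embedding layer, which is the parameter sharing the abstract refers to.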

Study on mapping of dark matter clustering from real space to redshift space

  • Zheng, Yi;Song, Yong-Seon
    • The Bulletin of The Korean Astronomical Society / v.41 no.1 / pp.38.2-38.2 / 2016
  • The mapping of dark matter clustering from real to redshift space introduces anisotropy into the measured density power spectrum in redshift space, known as the Redshift Space Distortion (hereafter RSD) effect. The mapping formula is intrinsically non-linear, and is complicated by the higher-order polynomials arising from the indefinite cross-correlations between the density and velocity fields, and by the Finger-of-God (hereafter FoG) effect arising from the randomness of the peculiar velocity field. Furthermore, a rigorous test of this mapping formula is contaminated by the unknown non-linearity of the density and velocity fields, including their auto- and cross-correlations, for which our theoretical calculation breaks down beyond some scales. Whilst the full set of higher-order polynomials remains unknown, the other systematics can be controlled consistently within the same order of truncation in the expansion of the mapping formula, as shown in this paper. The systematic due to the unknown non-linear density and velocity fields is removed by separately measuring all terms in the expansion using simulations. The uncertainty caused by the velocity randomness is controlled by splitting the FoG term into two pieces: 1) the non-local FoG term, which is independent of the separation vector between two different points, and 2) the local FoG term, which appears as an indefinite polynomial and is expanded to the same order as all other perturbative polynomials. Using 100 realizations of simulations, we find that the best-fitting non-local FoG function is Gaussian, with only one scale-independent free parameter, and that our new mapping formulation accurately reproduces the observed power spectrum in redshift space at the smallest scales by far, up to k ~ 0.3 h/Mpc, considering the resolution of future experiments.
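
To make the "Gaussian with one scale-independent free parameter" statement concrete, here is a minimal sketch of a Gaussian FoG damping factor. The function name, the argument names, and the normalization exp(-(k mu sigma)^2) are assumptions (some authors put a factor 1/2 in the exponent); the abstract does not give the authors' exact form.

```python
import numpy as np

def gaussian_fog(k, mu, sigma):
    """Gaussian FoG damping factor exp(-(k * mu * sigma)^2).

    k     : wavenumber in h/Mpc
    mu    : cosine of the angle between the wavevector and the line of sight
    sigma : the single free parameter, a velocity dispersion in Mpc/h
            (assumed here not to depend on k)
    """
    k = np.asarray(k, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return np.exp(-(k * mu * sigma) ** 2)

# Hypothetical value of sigma, chosen only to show the trend.
sigma = 5.0
transverse = gaussian_fog(0.3, 0.0, sigma)       # no damping across the line of sight
along_small_k = gaussian_fog(0.1, 1.0, sigma)
along_large_k = gaussian_fog(0.3, 1.0, sigma)    # stronger damping at smaller scales
```

The damping acts only on modes with a line-of-sight component and grows with k, which is why the FoG term matters most near the quoted limit of k ~ 0.3 h/Mpc.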

Threshold based User-centric Clustering for Cell-free MIMO Network (셀프리 다중안테나 네트워크를 위한 임계값 기반 사용자 중심 클러스터링)

  • Ryu, Jong Yeol;Lee, Woongsup;Ban, Tae-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.1 / pp.114-121 / 2022
  • In this paper, we consider user-centric clustering to guarantee the performance of users in a cell-free multiple-input multiple-output (MIMO) network. In the user-centric clustering scheme, each user uses the large-scale fading coefficients of the connected access points (APs) to form its own cluster from the APs whose large-scale fading coefficients, relative to the highest large-scale fading coefficient, exceed a threshold value. Within the resulting user-centric clusters, the APs design the beamformers and power allocations in a distributed manner and cooperatively transmit data to the users with them. In the simulation results, we verify the performance of user-centric clustering in terms of spectral efficiency and find the optimal threshold value for the given configuration.
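
The threshold rule described in this abstract can be sketched as follows. This is one reading of the rule, not the paper's code: it assumes the threshold is a ratio in (0, 1] applied to each user's strongest coefficient, and the names `user_centric_clusters` and `beta` and the numbers are hypothetical.

```python
import numpy as np

def user_centric_clusters(beta, threshold):
    """Return a boolean mask; mask[u, a] is True if AP a is in user u's cluster.

    beta[u, a] is the large-scale fading coefficient between user u and AP a.
    An AP joins a user's cluster when its coefficient is at least
    `threshold` times that user's largest coefficient (assumed rule).
    """
    best = beta.max(axis=1, keepdims=True)   # strongest AP per user
    return beta >= threshold * best

# Hypothetical coefficients for 2 users and 3 APs.
beta = np.array([[1.0, 0.6, 0.1],
                 [0.2, 0.9, 0.5]])
mask = user_centric_clusters(beta, threshold=0.5)
cluster_sizes = mask.sum(axis=1)
```

A threshold close to 1 leaves each user with only its strongest AP, while a threshold close to 0 approaches full cooperation among all APs, so the threshold trades cluster size against coordination overhead.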

K-Means-Based Polynomial-Radial Basis Function Neural Network Using Space Search Algorithm: Design and Comparative Studies (공간 탐색 최적화 알고리즘을 이용한 K-Means 클러스터링 기반 다항식 방사형 기저 함수 신경회로망: 설계 및 비교 해석)

  • Kim, Wook-Dong;Oh, Sung-Kwun
    • Journal of Institute of Control, Robotics and Systems / v.17 no.8 / pp.731-738 / 2011
  • In this paper, we introduce an advanced architecture of K-Means clustering-based polynomial Radial Basis Function Neural Networks (p-RBFNNs) designed with the aid of the Space Search Optimization Algorithm (SSOA) and develop a comprehensive design methodology supporting their construction. To design the optimized p-RBFNNs, the center value of each receptive field is first determined by running the K-Means clustering algorithm, and then the center value and the width of the corresponding receptive field are optimized through SSOA. The connections (weights) of the proposed p-RBFNNs are of functional character and are realized by considering three types of polynomials. In addition, Weighted Least Square Estimation (WLSE) is used to estimate the polynomial coefficients (serving as functional connections of the network) of each node from the output node. As a result, the local learning capability and the interpretability of the proposed model are improved. The proposed model is illustrated using a nonlinear function and the NOx dataset (called a Machine Learning dataset). A comparative analysis reveals that the proposed model exhibits higher accuracy and superb predictive capability in comparison to some previous models available in the literature.
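
The WLSE step can be illustrated with a small sketch. It assumes a Gaussian receptive field and a linear (first-order) polynomial for a single node; the paper's three polynomial types and the SSOA tuning are not reproduced, and all names and data here are hypothetical.

```python
import numpy as np

def gaussian_activation(X, center, width):
    """Activation of one receptive field for every row of X."""
    d2 = np.sum((X - center) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * width ** 2))

def wlse_linear(X, y, weights):
    """Weighted least squares fit of y ~ a0 + a1*x1 + ... + ad*xd.

    Minimizes sum_i weights[i] * residual_i^2 by scaling each row of the
    design matrix and target by sqrt(weights[i]).
    """
    A = np.hstack([np.ones((X.shape[0], 1)), X])   # design matrix with bias
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef

# Hypothetical data: an exactly linear target, so the fit should recover it.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
w = gaussian_activation(X, center=np.array([0.0, 0.0]), width=0.5)
coef = wlse_linear(X, y, w)
```

Because each node's coefficients are fitted with that node's own activations as weights, the polynomial describes the data near its center, which is the local learning property the abstract mentions.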

A Study on Degradation Pattern of GIS Using Clustering Method (군집화 기법을 이용한 GIS 열화 패턴 연구)

  • Lee, Deok Jin
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.31 no.4 / pp.255-260 / 2018
  • In recent years, increasing electricity use has led to considerable interest in green energy. To supply, cut off, and operate an electric power system effectively, many electric power facilities with higher densities, such as gas-insulated switchgear (GIS), cables, and large substation facilities, are being developed to meet demand. However, because of the increased use of aging electric power facilities, safety problems are emerging. Electromagnetic wave detection and leakage current detection are the main sensing methods for detecting live-line partial discharges. Although electromagnetic sensors are very reliable and excellent at providing an initial diagnosis, they make it difficult to precisely locate the fault point, while leakage current sensors require a connection to the ground line and are very vulnerable to line noise. Partial discharge characteristics in particular are accompanied by statistical irregularity, and it has been reported that proper statistical processing of the data is very important. Therefore, in this paper, we present the results of analyzing $\Phi$-$q$-$n$ cluster distributions of partial discharge characteristics by using K-means clustering, with the aim of developing an expert diagnosis system for partial discharges generated in a GIS facility.
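
A minimal K-means sketch on three-column data is given below as an illustration of the clustering step. The columns stand for phase, charge magnitude, and pulse count; the synthetic values, the standardization step, and the choice of three clusters are assumptions and do not come from the paper.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers; keep the old one if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical (phase [deg], charge [pC], count) samples in three groups.
rng = np.random.default_rng(1)
groups = [(60.0, 20.0, 5.0), (180.0, 80.0, 15.0), (300.0, 40.0, 30.0)]
data = np.vstack([rng.normal(g, (10.0, 5.0, 2.0), size=(50, 3)) for g in groups])

# Standardize so that phase does not dominate the distance.
Z = (data - data.mean(axis=0)) / data.std(axis=0)
labels, centers = kmeans(Z, k=3)
```

Standardizing matters here because the three quantities have different units and ranges; without it the Euclidean distance would be driven almost entirely by phase.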

Multivariate Analysis for Classification of Smog Type during the Summer Season in Seoul, Korea (다변량해석을 이용한 서울시 하계 스모그의 형태 분류)

  • 홍낙기;이종범;김용국
    • Journal of Korean Society for Atmospheric Environment / v.9 no.4 / pp.278-287 / 1993
  • In order to classify smog types during the summer season in Seoul, air quality and meteorological data were analyzed by multivariate analysis. Among 15 variables relating to visibility, 10 variables were selected by multiple regression analysis for clustering of smog types: total suspended particles, sulfur dioxide, ozone, nitrogen dioxide, total hydrocarbons, south-north wind component, relative humidity, precipitable water, mixing height, and air temperature. Smog types were grouped into three clusters using the cubic clustering criterion, and the clusters contained 74, 28, and 16 days, respectively. The clusters were clearly separated by sulfur dioxide, precipitable water, and air temperature. The first cluster was representative of high ozone concentration and the prevailing meteorological conditions for ozone formation. Therefore, visibility in the first cluster was considered to be affected by photochemical smog. The third cluster showed characteristics of the sulphurous smog type, owing to a higher concentration of primary pollutants under drier conditions than in the other clusters. On the other hand, the characteristics of the second cluster were relatively unclear, but it was considered intermediate between the photochemical and sulphurous smog types.

Fuzzy Identification by means of Fuzzy Inference Method and Its Application to Waste Water Treatment System (퍼지추론 방법에 의한 퍼지동정과 하수처리공정시스템 응용)

  • 오성권;주영훈;남위석;우광방
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.6 / pp.43-52 / 1994
  • A design method for rule-based fuzzy modeling is presented for the model identification of complex and nonlinear systems. The proposed rule-based fuzzy modeling implements system structure and parameter identification in the efficient form of "IF..., THEN...", using optimization theory, linguistic fuzzy implication rules, and fuzzy c-means clustering. The three fuzzy modeling methods presented in this paper are simplified inference (type 1), linear inference (type 2), and modified linear inference (type 3). To identify the premise structure and parameters of the fuzzy implication rules, fuzzy c-means clustering and the modified complex method are used, respectively, and the least squares method is used to identify the optimum consequence parameters. Time series data for a gas furnace and for a sewage treatment process are used to evaluate the performance of the proposed rule-based fuzzy modeling. Comparison shows that the proposed method produces a fuzzy model with higher accuracy than previous studies.
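
Fuzzy c-means, used in this abstract to identify the premise part of the rules, alternates between a center update and a membership update. The sketch below shows the standard updates under assumed settings (fuzzifier m = 2, random initial memberships, synthetic two-group data); it is not the paper's implementation.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (U, centers).

    U[i, j] is the membership of sample i in cluster j; each row sums to 1.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    centers = np.zeros((c, X.shape[1]))
    for _ in range(n_iter):
        # Center update: mean of the data weighted by U^m.
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Membership update: proportional to distance^(-2/(m-1)).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)            # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.max(np.abs(U_new - U)) < tol
        U = U_new
        if converged:
            break
    return U, centers

# Hypothetical two-group data in two dimensions.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.3, size=(40, 2)),
               rng.normal(5.0, 0.3, size=(40, 2))])
U, centers = fuzzy_c_means(X, c=2)
```

Unlike K-means, every sample keeps a graded membership in each cluster, and these memberships can serve directly as the membership grades of the premise fuzzy sets.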

Poor Correlation Between the New Statistical and the Old Empirical Algorithms for DNA Microarray Analysis

  • Kim, Ju Han;Kuo, Winston P.;Kong, Sek-Won;Ohno-Machado, Lucila;Kohane, Isaac S.
    • Genomics & Informatics / v.1 no.2 / pp.87-93 / 2003
  • The DNA microarray is currently the most prominent tool for investigating large-scale gene expression data. Different algorithms for measuring gene expression levels from scanned images of microarray experiments may significantly affect the subsequent steps of functional genomic analyses. Affymetrix® recently introduced high-density microarrays and new statistical algorithms in Microarray Suite (MAS) version 5.0®. Very high correlations (0.92 - 0.97) between the new algorithms and the old algorithms (MAS 4.0) across several species and conditions were reported. We found that the column-wise array correlations tended to be much higher than the row-wise gene correlations, which may be much more meaningful in subsequent higher-order data analyses, including clustering and pattern analyses. In this paper, we not only present a detailed comparison of the two sets of algorithms, but also describe the impact of introducing the new algorithms on further clustering analysis of microarray data and the possible pitfalls in mixing the old and the new algorithms.
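
The difference between column-wise (per-array) and row-wise (per-gene) correlation can be shown on synthetic data. The sketch below assumes a genes-by-arrays matrix layout and invented numbers; it does not reproduce the study's data or values, and the function name is hypothetical.

```python
import numpy as np

def paired_correlations(A, B, axis):
    """Pearson correlation between matching slices of A and B.

    axis=0 correlates matching columns (one value per array, across genes);
    axis=1 correlates matching rows (one value per gene, across arrays).
    """
    A0 = A - A.mean(axis=axis, keepdims=True)
    B0 = B - B.mean(axis=axis, keepdims=True)
    num = (A0 * B0).sum(axis=axis)
    den = np.sqrt((A0 ** 2).sum(axis=axis) * (B0 ** 2).sum(axis=axis))
    return num / den

# Synthetic expression values: 500 genes x 8 arrays. Genes differ widely in
# baseline level, and the two "algorithms" add independent noise.
rng = np.random.default_rng(3)
base = rng.uniform(0.0, 1000.0, size=(500, 1))
old = base + rng.normal(0.0, 50.0, size=(500, 8))
new = base + rng.normal(0.0, 50.0, size=(500, 8))

col_corr = paired_correlations(old, new, axis=0)   # one value per array
row_corr = paired_correlations(old, new, axis=1)   # one value per gene
```

In this toy setting the per-array correlations are high because they are dominated by differences in baseline level between genes, while the per-gene correlations are low because each gene's variation across arrays is mostly noise. This is one way a high array-level agreement can coexist with poor gene-level agreement.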

Improved LTE Fingerprint Positioning Through Clustering-based Repeater Detection and Outlier Removal

  • Kwon, Jae Uk;Chae, Myeong Seok;Cho, Seong Yun
    • Journal of Positioning, Navigation, and Timing / v.11 no.4 / pp.369-379 / 2022
  • In the weighted k-nearest neighbor (WkNN)-based fingerprint positioning step, the requested positioning signal is compared with the signal information for each reference point stored in the fingerprint DB. The higher the number of matched base station identifiers, the higher the possibility that the terminal is at the corresponding location, so an additional weight is added to the location in proportion to the number of matching base stations. On the other hand, if the number of matching base stations is small, the selected candidate reference point depends heavily on the similarity value of the signal. This creates a problem: the positioning signal can be matched against a repeater signal in the signal information stored in the DB, and the corresponding reference point can be selected as a candidate location. The selected reference point is likely to be an outlier, and if a weight is applied to that location, the error of the estimated location increases. To solve this problem, this paper proposes a WkNN technique that includes an outlier removal function. To this end, it is first determined whether a repeater signal is included in the DB information of the matched base station. If the reference point for the repeater signal is selected as a candidate position, the reference position corresponding to the outlier is removed based on a clustering technique. The performance of the proposed technique is verified with data acquired in Seocho 1-dong and Seocho 2-dong in Seoul.
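
A minimal WkNN sketch with a simple outlier filter is shown below. The abstract does not specify the clustering technique, so a distance-from-median filter is used here as a stand-in; it also omits the extra weight for matched base station identifiers. All names and numbers are hypothetical.

```python
import numpy as np

def wknn_estimate(query, fingerprints, positions, k=3, max_dist=None):
    """Weighted kNN position estimate from signal fingerprints.

    fingerprints[i] is the stored signal vector of reference point i and
    positions[i] its coordinates. If max_dist is given, candidates farther
    than max_dist from the coordinate-wise median of the k candidates are
    dropped first (stand-in for the paper's clustering-based removal).
    """
    d = np.linalg.norm(fingerprints - query, axis=1)   # signal-space distance
    idx = np.argsort(d)[:k]                            # k most similar points
    if max_dist is not None:
        med = np.median(positions[idx], axis=0)
        keep = np.linalg.norm(positions[idx] - med, axis=1) <= max_dist
        if keep.any():                                 # never drop everything
            idx = idx[keep]
    w = 1.0 / (d[idx] + 1e-9)                          # closer signal, larger weight
    return (w[:, None] * positions[idx]).sum(axis=0) / w.sum()

# Hypothetical DB: the last reference point mimics a repeater, with a
# similar signal but a location far from the others.
fingerprints = np.array([[-70.0, -80.0, -90.0],
                         [-71.0, -79.0, -91.0],
                         [-69.0, -81.0, -89.0],
                         [-70.0, -80.0, -90.5]])
positions = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [500.0, 500.0]])
query = np.array([-70.0, -80.0, -90.2])

raw = wknn_estimate(query, fingerprints, positions, k=4)
filtered = wknn_estimate(query, fingerprints, positions, k=4, max_dist=50.0)
```

Without the filter, the repeater-like reference point pulls the estimate far from the other three candidates; with it, the estimate stays inside the area they span.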

Evolutionary Design of Radial Basis Function-based Polynomial Neural Network with the aid of Information Granulation (정보 입자화를 통한 방사형 기저 함수 기반 다항식 신경 회로망의 진화론적 설계)

  • Park, Ho-Sung;Jin, Yong-Ha;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers / v.60 no.4 / pp.862-870 / 2011
  • In this paper, we introduce a new topology of Radial Basis Function-based Polynomial Neural Networks (RPNN) that is based on a genetically optimized multi-layer perceptron with Radial Polynomial Neurons (RPNs). This study offers a comprehensive design methodology involving optimization mechanisms, especially the Fuzzy C-Means (FCM) clustering method and Particle Swarm Optimization (PSO) algorithms. In contrast to the typical architectures encountered in Polynomial Neural Networks (PNNs), our main objective is to develop a design strategy for RPNNs as follows: (a) The architecture of the proposed network consists of RPNs. Here, the RPN fully reflects the structure found in numeric data, which are granulated with the aid of the FCM clustering method. The RPN builds on the concepts of a collection of radial basis functions and function-based nonlinear (polynomial) processing. (b) The PSO-based design procedure applied at each layer of the RPNN leads to the selection of preferred nodes of the network (RPNs) whose local characteristics (such as the number of input variables, the specific subset of input variables, the order of the polynomial, and the number of clusters as well as the fuzzification coefficient in the FCM clustering) can be easily adjusted. The performance of the RPNN is quantified through experiments on a number of modeling benchmarks already used in fuzzy or neurofuzzy modeling: NOx emission process data of a gas turbine power plant and Machine Learning data (Automobile Miles Per Gallon data). A comparative analysis reveals that the proposed RPNN exhibits higher accuracy and superb predictive capability in comparison to some previous models available in the literature.
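
The PSO search mentioned in this abstract can be sketched in its generic continuous form. The version below minimizes a toy two-variable function; the inertia and acceleration constants, the bounds, and the objective are assumptions. In the paper the particles would instead encode node parameters such as the polynomial order or the number of clusters, which is not reproduced here.

```python
import numpy as np

def pso_minimize(f, bounds, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    pos = rng.uniform(lo, hi, size=(n_particles, len(lo)))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                               # personal best positions
    pbest_val = np.array([f(p) for p in pos])
    g = pbest[pbest_val.argmin()].copy()             # global best position
    g_val = float(pbest_val.min())
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Inertia + pull toward personal best + pull toward global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)
        val = np.array([f(p) for p in pos])
        improved = val < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = val[improved]
        if pbest_val.min() < g_val:
            g = pbest[pbest_val.argmin()].copy()
            g_val = float(pbest_val.min())
    return g, g_val

def toy_objective(p):
    """Hypothetical objective with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

best_pos, best_val = pso_minimize(toy_objective, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

For discrete design choices such as the number of inputs or clusters, a common approach is to round the continuous particle coordinates before evaluating the objective; whether the paper does this is not stated in the abstract.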