• Title/Summary/Keyword: Clustering coefficient

Search Result 197, Processing Time 0.031 seconds

A Design on Face Recognition System Based on pRBFNNs by Obtaining Real Time Image (실시간 이미지 획득을 통한 pRBFNNs 기반 얼굴인식 시스템 설계)

  • Oh, Sung-Kwun;Seok, Jin-Wook;Kim, Ki-Sang;Kim, Hyun-Ki
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.16 no.12
    • /
    • pp.1150-1158
    • /
    • 2010
  • In this study, the Polynomial-based Radial Basis Function Neural Networks is proposed as one of the recognition part of overall face recognition system that consists of two parts such as the preprocessing part and recognition part. The design methodology and procedure of the proposed pRBFNNs are presented to obtain the solution to high-dimensional pattern recognition problem. First, in preprocessing part, we use a CCD camera to obtain a picture frame in real-time. By using histogram equalization method, we can partially enhance the distorted image influenced by natural as well as artificial illumination. We use an AdaBoost algorithm proposed by Viola and Jones, which is exploited for the detection of facial image area between face and non-facial image area. As the feature extraction algorithm, PCA method is used. In this study, the PCA method, which is a feature extraction algorithm, is used to carry out the dimension reduction of facial image area formed by high-dimensional information. Secondly, we use pRBFNNs to identify the ID by recognizing unique pattern of each person. The proposed pRBFNNs architecture consists of three functional modules such as the condition part, the conclusion part, and the inference part as fuzzy rules formed in 'If-then' format. In the condition part of fuzzy rules, input space is partitioned with Fuzzy C-Means clustering. In the conclusion part of rules, the connection weight of pRBFNNs is represented as three kinds of polynomials such as constant, linear, and quadratic. Coefficients of connection weight identified with back-propagation using gradient descent method. The output of pRBFNNs model is obtained by fuzzy inference method in the inference part of fuzzy rules. The essential design parameters (including learning rate, momentum coefficient and fuzzification coefficient) of the networks are optimized by means of the Particle Swarm Optimization. The proposed pRBFNNs are applied to real-time face recognition system and then demonstrated from the viewpoint of output performance and recognition rate.

Level 3 Type Land Use Land Cover (LULC) Characteristics Based on Phenological Phases of North Korea (생물계절 상 분석을 통한 Level 3 type 북한 토지피복 특성)

  • Yu, Jae-Shim;Park, Chong-Hwa;Lee, Seung-Ho
    • Korean Journal of Remote Sensing
    • /
    • v.27 no.4
    • /
    • pp.457-466
    • /
    • 2011
  • The objectives of this study are to produce level 3 type LULC map and analysis of phenological features of North Korea, ISODATA clustering of the 88scenes of MVC of MODIS NDVI in 2008 and 8scenes in 2009 was carried out. Analysis of phenological phases based mapping method was conducted, In level 2 type map, the confusion matrix was summarized and Kappa coefficient was calculated. Total of 27 typical habitat types that represent the dominant species or vegetation density that cover land surface of North Korea in 2008 were made. The total of 27 classes includes the 17 forest biotopes, 7 different croplands, 2 built up types and one water body. Dormancy phase of winter (${\sigma}^2$ = 0.348) and green up phase in spring (${\sigma}^2$ = 0.347) displays phenological dynamics when much vegetation growth changes take place. Overall accuracy is (851/955) 85.85% and Kappa coefficient is 0.84. Phenological phase based mapping method was possible to minimize classification error when analyzing the inaccessible land of North Korea.

Syntaxonomy and Synecology of the Robinia pseudoacacia Forests (아까시나무림의 군락분류와 군락생태)

  • Cho, Kwang-Jin;Kim, Jong-Won
    • The Korean Journal of Ecology
    • /
    • v.28 no.1
    • /
    • pp.15-23
    • /
    • 2005
  • The black locust (Robinia pseudoacacia L.) forests were studied by a phytosociological approach. Particular attention was given to characterize the vegetation classification, distribution pattern, and ecological flora of the syntaxa classified. A total of 38 releves were analyzed by using Correlation coefficient, UPGMA as the clustering method, and Principal Coordinates Analysis for ordination. Ecological flora analyzed by plant character sets such as scrambler, annual and biennial plants, forest elements, and actual urbanization index. The analyzed data are based on site-releve matrix with relative net contribution degree (r-NCD) of species. A total of 77 families, 193 genera and 323 species of vascular plants are recorded. Camellino-Robinietum pseudoacaciae ass. nov. and Phragmites-Robinia pseudoacacia community were described. Main cluster and ordination could be separated: 1) urban type, 2) rural type, 3) riparian type, and 4) combined type. It is defined that the Robinietum is a representative unit on the black locust afforestation, Phragmites-Robinia community on the lentic zone in the river ecosystem, and Cameliino-Robinietum ailanthetosum altissimae as an urban forest type. The Robinietum was considered as a perpetual community.

Design of Optimized pRBFNNs-based Face Recognition Algorithm Using Two-dimensional Image and ASM Algorithm (최적 pRBFNNs 패턴분류기 기반 2차원 영상과 ASM 알고리즘을 이용한 얼굴인식 알고리즘 설계)

  • Oh, Sung-Kwun;Ma, Chang-Min;Yoo, Sung-Hoon
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.21 no.6
    • /
    • pp.749-754
    • /
    • 2011
  • In this study, we propose the design of optimized pRBFNNs-based face recognition system using two-dimensional Image and ASM algorithm. usually the existing 2 dimensional face recognition methods have the effects of the scale change of the image, position variation or the backgrounds of an image. In this paper, the face region information obtained from the detected face region is used for the compensation of these defects. In this paper, we use a CCD camera to obtain a picture frame directly. By using histogram equalization method, we can partially enhance the distorted image influenced by natural as well as artificial illumination. AdaBoost algorithm is used for the detection of face image between face and non-face image area. We can butt up personal profile by extracting the both face contour and shape using ASM(Active Shape Model) and then reduce dimension of image data using PCA. The proposed pRBFNNs consists of three functional modules such as the condition part, the conclusion part, and the inference part. In the condition part of fuzzy rules, input space is partitioned with Fuzzy C-Means clustering. In the conclusion part of rules, the connection weight of RBFNNs is represented as three kinds of polynomials such as constant, linear, and quadratic. The essential design parameters (including learning rate, momentum coefficient and fuzzification coefficient) of the networks are optimized by means of Differential Evolution. The proposed pRBFNNs are applied to real-time face image database and then demonstrated from viewpoint of the output performance and recognition rate.

A Study on Scale Effects of the MAUP According to the Degree of Spatial Autocorrelation - Focused on LBSNS Data - (공간적 자기상관성의 정도에 따른 MAUP에서의 스케일 효과 연구 - LBSNS 데이터를 중심으로 -)

  • Lee, Young Min;Kwon, Pil;Yu, Ki Yun;Huh, Yong
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.24 no.1
    • /
    • pp.25-33
    • /
    • 2016
  • In order to visualize point based Location-Based Social Network Services(LBSNS) data on multi-scaled tile map effectively, it is necessary to apply tile-based clustering method. Then determinating reasonable numbers and size of tiles is required. However, there is no such criteria and the numbers and size of tiles are modified based on data type and the purpose of analysis. In other words, researchers' subjectivity is always involved in this type of study. This is when Modifiable Areal Unit Problem(MAUP) occurs, that affects the results of analysis. Among LBSNS, geotagged Twitter data were chosen to find the influence of MAUP in scale effects perspective. For this purpose, the degree of spatial autocorrelation using spatial error model was altered, and change of distributions was analyzed using Morna's I. As a result, positive spatial autocorrelation showed in the original data and the spatial autocorrelation was decreased as the value of spatial autoregressive coefficient was increasing. Therefore, the intensity of the spatial autocorrelation of Twitter data was adjusted to five levels, and for each level, nine different size of grid was created. For each level and different grid sizes, Moran's I was calculated. It was found that the spatial autocorrelation was increased when the aggregation level was being increased and decreased in a certainpoint. Another tendency was found that the scale effect of MAUP was decreased when the spatial autocorrelation was high.

Predicting the Performance of Recommender Systems through Social Network Analysis and Artificial Neural Network (사회연결망분석과 인공신경망을 이용한 추천시스템 성능 예측)

  • Cho, Yoon-Ho;Kim, In-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.159-172
    • /
    • 2010
  • The recommender system is one of the possible solutions to assist customers in finding the items they would like to purchase. To date, a variety of recommendation techniques have been developed. One of the most successful recommendation techniques is Collaborative Filtering (CF) that has been used in a number of different applications such as recommending Web pages, movies, music, articles and products. CF identifies customers whose tastes are similar to those of a given customer, and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. Broadly, there are memory-based CF algorithms, model-based CF algorithms, and hybrid CF algorithms which combine CF with content-based techniques or other recommender systems. While many researchers have focused their efforts in improving CF performance, the theoretical justification of CF algorithms is lacking. That is, we do not know many things about how CF is done. Furthermore, the relative performances of CF algorithms are known to be domain and data dependent. It is very time-consuming and expensive to implement and launce a CF recommender system, and also the system unsuited for the given domain provides customers with poor quality recommendations that make them easily annoyed. Therefore, predicting the performances of CF algorithms in advance is practically important and needed. In this study, we propose an efficient approach to predict the performance of CF. Social Network Analysis (SNA) and Artificial Neural Network (ANN) are applied to develop our prediction model. CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. SNA facilitates an exploration of the topological properties of the network structure that are implicit in data for CF recommendations. An ANN model is developed through an analysis of network topology, such as network density, inclusiveness, clustering coefficient, network centralization, and Krackhardt's efficiency. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Inclusiveness refers to the number of nodes which are included within the various connected parts of the social network. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. Krackhardt's efficiency characterizes how dense the social network is beyond that barely needed to keep the social group even indirectly connected to one another. We use these social network measures as input variables of the ANN model. As an output variable, we use the recommendation accuracy measured by F1-measure. In order to evaluate the effectiveness of the ANN model, sales transaction data from H department store, one of the well-known department stores in Korea, was used. Total 396 experimental samples were gathered, and we used 40%, 40%, and 20% of them, for training, test, and validation, respectively. The 5-fold cross validation was also conducted to enhance the reliability of our experiments. The input variable measuring process consists of following three steps; analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used Net Miner 3 and UCINET 6.0 for SNA, and Clementine 11.1 for ANN modeling. The experiments reported that the ANN model has 92.61% estimated accuracy and 0.0049 RMSE. Thus, we can know that our prediction model helps decide whether CF is useful for a given application with certain data characteristics.

A Study on the Impact Factors of Contents Diffusion in Youtube using Integrated Content Network Analysis (일반영향요인과 댓글기반 콘텐츠 네트워크 분석을 통합한 유튜브(Youtube)상의 콘텐츠 확산 영향요인 연구)

  • Park, Byung Eun;Lim, Gyoo Gun
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.3
    • /
    • pp.19-36
    • /
    • 2015
  • Social media is an emerging issue in content services and in current business environment. YouTube is the most representative social media service in the world. YouTube is different from other conventional content services in its open user participation and contents creation methods. To promote a content in YouTube, it is important to understand the diffusion phenomena of contents and the network structural characteristics. Most previous studies analyzed impact factors of contents diffusion from the view point of general behavioral factors. Currently some researchers use network structure factors. However, these two approaches have been used separately. However this study tries to analyze the general impact factors on the view count and content based network structures all together. In addition, when building a content based network, this study forms the network structure by analyzing user comments on 22,370 contents of YouTube not based on the individual user based network. From this study, we re-proved statistically the causal relations between view count and not only general factors but also network factors. Moreover by analyzing this integrated research model, we found that these factors affect the view count of YouTube according to the following order; Uploader Followers, Video Age, Betweenness Centrality, Comments, Closeness Centrality, Clustering Coefficient and Rating. However Degree Centrality and Eigenvector Centrality affect the view count negatively. From this research some strategic points for the utilizing of contents diffusion are as followings. First, it is needed to manage general factors such as the number of uploader followers or subscribers, the video age, the number of comments, average rating points, and etc. The impact of average rating points is not so much important as we thought before. However, it is needed to increase the number of uploader followers strategically and sustain the contents in the service as long as possible. Second, we need to pay attention to the impacts of betweenness centrality and closeness centrality among other network factors. Users seems to search the related subject or similar contents after watching a content. It is needed to shorten the distance between other popular contents in the service. Namely, this study showed that it is beneficial for increasing view counts by decreasing the number of search attempts and increasing similarity with many other contents. This is consistent with the result of the clustering coefficient impact analysis. Third, it is important to notice the negative impact of degree centrality and eigenvector centrality on the view count. If the number of connections with other contents is too much increased it means there are many similar contents and eventually it might distribute the view counts. Moreover, too high eigenvector centrality means that there are connections with popular contents around the content, and it might lose the view count because of the impact of the popular contents. It would be better to avoid connections with too powerful popular contents. From this study we analyzed the phenomenon and verified diffusion factors of Youtube contents by using an integrated model consisting of general factors and network structure factors. From the viewpoints of social contribution, this study might provide useful information to music or movie industry or other contents vendors for their effective contents services. This research provides basic schemes that can be applied strategically in online contents marketing. One of the limitations of this study is that this study formed a contents based network for the network structure analysis. It might be an indirect method to see the content network structure. We can use more various methods to establish direct content network. Further researches include more detailed researches like an analysis according to the types of contents or domains or characteristics of the contents or users, and etc.

Water resources monitoring technique using multi-source satellite image data fusion (다종 위성영상 자료 융합 기반 수자원 모니터링 기술 개발)

  • Lee, Seulchan;Kim, Wanyub;Cho, Seongkeun;Jeon, Hyunho;Choi, Minhae
    • Journal of Korea Water Resources Association
    • /
    • v.56 no.8
    • /
    • pp.497-508
    • /
    • 2023
  • Agricultural reservoirs are crucial structures for water resources monitoring especially in Korea where the resources are seasonally unevenly distributed. Optical and Synthetic Aperture Radar (SAR) satellites, being utilized as tools for monitoring the reservoirs, have unique limitations in that optical sensors are sensitive to weather conditions and SAR sensors are sensitive to noises and multiple scattering over dense vegetations. In this study, we tried to improve water body detection accuracy through optical-SAR data fusion, and quantitatively analyze the complementary effects. We first detected water bodies at Edong, Cheontae reservoir using the Compact Advanced Satellite 500(CAS500), Kompsat-3/3A, and Sentinel-2 derived Normalized Difference Water Index (NDWI), and SAR backscattering coefficient from Sentinel-1 by K-means clustering technique. After that, the improvements in accuracies were analyzed by applying K-means clustering to the 2-D grid space consists of NDWI and SAR. Kompsat-3/3A was found to have the best accuracy (0.98 at both reservoirs), followed by Sentinel-2(0.83 at Edong, 0.97 at Cheontae), Sentinel-1(both 0.93), and CAS500(0.69, 0.78). By applying K-means clustering to the 2-D space at Cheontae reservoir, accuracy of CAS500 was improved around 22%(resulting accuracy: 0.95) with improve in precision (85%) and degradation in recall (14%). Precision of Kompsat-3A (Sentinel-2) was improved 3%(5%), and recall was degraded 4%(7%). More precise water resources monitoring is expected to be possible with developments of high-resolution SAR satellites including CAS500-5, developments of image fusion and water body detection techniques.

A Study on Differences of Contents and Tones of Arguments among Newspapers Using Text Mining Analysis (텍스트 마이닝을 활용한 신문사에 따른 내용 및 논조 차이점 분석)

  • Kam, Miah;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.53-77
    • /
    • 2012
  • This study analyses the difference of contents and tones of arguments among three Korean major newspapers, the Kyunghyang Shinmoon, the HanKyoreh, and the Dong-A Ilbo. It is commonly accepted that newspapers in Korea explicitly deliver their own tone of arguments when they talk about some sensitive issues and topics. It could be controversial if readers of newspapers read the news without being aware of the type of tones of arguments because the contents and the tones of arguments can affect readers easily. Thus it is very desirable to have a new tool that can inform the readers of what tone of argument a newspaper has. This study presents the results of clustering and classification techniques as part of text mining analysis. We focus on six main subjects such as Culture, Politics, International, Editorial-opinion, Eco-business and National issues in newspapers, and attempt to identify differences and similarities among the newspapers. The basic unit of text mining analysis is a paragraph of news articles. This study uses a keyword-network analysis tool and visualizes relationships among keywords to make it easier to see the differences. Newspaper articles were gathered from KINDS, the Korean integrated news database system. KINDS preserves news articles of the Kyunghyang Shinmun, the HanKyoreh and the Dong-A Ilbo and these are open to the public. This study used these three Korean major newspapers from KINDS. About 3,030 articles from 2008 to 2012 were used. International, national issues and politics sections were gathered with some specific issues. The International section was collected with the keyword of 'Nuclear weapon of North Korea.' The National issues section was collected with the keyword of '4-major-river.' The Politics section was collected with the keyword of 'Tonghap-Jinbo Dang.' All of the articles from April 2012 to May 2012 of Eco-business, Culture and Editorial-opinion sections were also collected. All of the collected data were handled and edited into paragraphs. We got rid of stop-words using the Lucene Korean Module. We calculated keyword co-occurrence counts from the paired co-occurrence list of keywords in a paragraph. We made a co-occurrence matrix from the list. Once the co-occurrence matrix was built, we used the Cosine coefficient matrix as input for PFNet(Pathfinder Network). In order to analyze these three newspapers and find out the significant keywords in each paper, we analyzed the list of 10 highest frequency keywords and keyword-networks of 20 highest ranking frequency keywords to closely examine the relationships and show the detailed network map among keywords. We used NodeXL software to visualize the PFNet. After drawing all the networks, we compared the results with the classification results. Classification was firstly handled to identify how the tone of argument of a newspaper is different from others. Then, to analyze tones of arguments, all the paragraphs were divided into two types of tones, Positive tone and Negative tone. To identify and classify all of the tones of paragraphs and articles we had collected, supervised learning technique was used. The Na$\ddot{i}$ve Bayesian classifier algorithm provided in the MALLET package was used to classify all the paragraphs in articles. After classification, Precision, Recall and F-value were used to evaluate the results of classification. Based on the results of this study, three subjects such as Culture, Eco-business and Politics showed some differences in contents and tones of arguments among these three newspapers. In addition, for the National issues, tones of arguments on 4-major-rivers project were different from each other. It seems three newspapers have their own specific tone of argument in those sections. And keyword-networks showed different shapes with each other in the same period in the same section. It means that frequently appeared keywords in articles are different and their contents are comprised with different keywords. And the Positive-Negative classification showed the possibility of classifying newspapers' tones of arguments compared to others. These results indicate that the approach in this study is promising to be extended as a new tool to identify the different tones of arguments of newspapers.

Evaluation of Genetic Diversity among Persimmon Cultivars (Diospyros kaki Thunb.) Using Microsatellite Markers (초위성 마커를 이용한 감(Diospyros kaki Thunb.)의 유연관계 분석)

  • Hwang, Ji-Hyeon;Park, Yu-Ok;Kim, Sung-Churl;Lee, Yong-Jae;Kang, Jum-Soon;Choi, Young-Whan;Son, Beung-Gu;Park, Young-Hoon
    • Journal of Life Science
    • /
    • v.20 no.4
    • /
    • pp.632-638
    • /
    • 2010
  • The genetic diversity among 48 persimmon (Diospyros kaki Thunb.) accessions, indigenous in Korea and introduced from Japan and China, was evaluated by using simple sequence repeat (SSR) markers. From 20 SSR primer sets, a total of 114 polymorphic markers were detected among 12 pollination-constant non-astringent (PCNA), 13 pollination-variant non-astringent (PVNA), 15 pollination-variant astringent (PVA), and 8 pollination-constant astringent (PCA) cultivars. Analysis of pair-wise genetic similarity coefficient (Nei-Li) and unweighted pair-group method with arithmetic averaging (UPGMA) clustering revealed two main clusters and four subclusters for cluster I. The subclustering pattern was in accordance with the classification of persimmon cultivars based on the nature of astringency loss. Phenetic relationships among the subclusters showed a closer relatedness of the PCNA group with the PVNA group, and the PVA with the PCA group. Genetic similarity co-efficiency was 0.499 on average and the highest (0.954) similarity was observed between 'Cheongdo-Bansi' and 'Haman-Bansi'. The similarity was lowest (0.192) between 'Damopan'and 'Atago'. Identification of each cultivar with the execption of 'Cheongdo-Bansi' and 'Gyeongsan-Bansi' was possible based on the SSR fingerprints, suggesting that these SSR markers are a useful tool for protecting intellectual property on newly developed cultivars.