• Title/Summary/Keyword: Group Model Clustering

A Study of Library Grouping using Cluster Analysis Methods (군집분석 기법을 이용한 공공도서관 그룹화에 대한 연구)

  • Kwak, Chul Wan
    • Journal of the Korean BIBLIA Society for library and Information Science / v.31 no.3 / pp.79-99 / 2020
  • The purpose of this study is to investigate cluster analysis models for grouping public libraries and to analyze the characteristics of the resulting groups. Statistical data on public libraries from the National Library Statistics System were used, and three cluster analysis models were applied. Clustering based on the size of public libraries divided them broadly into two clusters, and the cluster sizes were heavily skewed toward one cluster. For grouping based on size, Ward's method of hierarchical cluster analysis and the k-means cluster analysis model were suitable. Three suggestions are presented as implications for grouping public libraries. First, it is necessary to collect library service-related data in addition to statistical data. Second, an analysis model suited to the data set being analyzed must be applied. Third, it is necessary to study the possibility of using cluster analysis techniques in fields other than library grouping.
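
As a rough illustration of the size-based grouping described in this abstract, the following is a minimal k-means sketch in plain Python. The library figures, the two features (collection size in thousands, staff count), and the deterministic choice of starting centroids are all assumptions made for the example; they are not taken from the study, which also applied Ward's hierarchical method.

```python
from math import dist

def kmeans(points, k, iterations=100):
    """Plain Lloyd's k-means with a deterministic start (the first k points)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical (collection size in thousands, staff count) for seven libraries.
libraries = [(30, 4), (35, 5), (28, 3), (40, 6), (32, 4), (400, 60), (33, 5)]
labels, centroids = kmeans(libraries, k=2)
print(labels)     # one very large library ends up alone in its cluster
print(centroids)
```

With data like this, one cluster holds almost every library, which mirrors the skewed cluster sizes the abstract reports. In practice the features would normally be standardized before clustering.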

Genetic Relationship of Mono-cotyledonous Model Plant by Ionizing Irradiation (단자엽 모델 식물의 방사선원 별 처리에 따른 유전적 다형성 분석)

  • Song, Mira;Kim, Sun-Hee;Jang, Duk-Soo;Kang, Si-Yong;Kim, Jin-Baek;Kim, Sang Hoon;Ha, Bo-Keun;Kim, Dong Sub
    • Journal of Radiation Industry / v.6 no.1 / pp.23-29 / 2012
  • In this study, we investigated genetic variation in a monocot model plant (rice) in response to various ionizing irradiations, including gamma-ray, ion beam, and cosmic-ray. The non-irradiated and three irradiated (200 Gy of gamma-ray and 40 Gy of ion beam and cosmic-ray) plants were analyzed by the AFLP technique using capillary electrophoresis with an ABI3130xl genetic analyzer. The 29 primer combinations tested produced a total of 2,238 bands with fragment sizes ranging from 30 bp to 600 bp. The number of bands generated by each primer combination varied significantly, ranging from 2 (M-CAC/E-ACG) to 158 (M-CAT/E-AGG), with an average of 77 bands. A total of 1,269 polymorphic peaks were detected, an average of 44 per primer combination. In the UPGMA (Unweighted Pair Group Method with Arithmetic Mean) analysis, the clusters were divided into the non-irradiated sample and the three irradiated samples at a similarity coefficient of 0.41, and the three irradiated samples were subdivided into the cosmic-ray sample and the two other irradiated samples (200 Gy of gamma-ray and 40 Gy of ion beam) at a similarity coefficient of 0.48. Similarity coefficient values ranged from 0.41 to 0.55.
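
The UPGMA step can be sketched as follows. This is an illustrative example only: the band-presence profiles are invented, and the Jaccard coefficient is an assumption, since the abstract does not say which similarity coefficient was used.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two 0/1 band-presence profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0

def upgma(profiles):
    """Agglomerate samples by average pairwise similarity; return the merge history."""
    sim = {frozenset(pair): jaccard(profiles[pair[0]], profiles[pair[1]])
           for pair in combinations(profiles, 2)}

    def avg_sim(c1, c2):
        # UPGMA: unweighted mean over all cross-cluster sample pairs.
        return sum(sim[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=lambda p: avg_sim(*p))
        merges.append((c1, c2, round(avg_sim(c1, c2), 3)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical AFLP band presence (1) / absence (0) for four samples.
profiles = {
    "control": [1, 1, 0, 0, 1, 0, 0, 1],
    "gamma":   [1, 0, 1, 1, 0, 1, 0, 1],
    "ion":     [1, 0, 1, 1, 0, 1, 1, 1],
    "cosmic":  [1, 0, 1, 0, 0, 1, 1, 0],
}
merges = upgma(profiles)
for left, right, similarity in merges:
    print(left, "+", right, "at similarity", similarity)
```

The toy profiles were chosen so that the merge order resembles the topology described in the abstract (gamma-ray and ion beam join first, then cosmic-ray, and the non-irradiated sample last); the similarity values themselves are not the study's.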

A study on vision system based on Generalized Hough Transform 2-D object recognition (Generalized Hough Transform을 이용한 이차원 물체인식 비젼 시스템 구현에 대한 연구)

  • Koo, Bon-Cheol;Park, Jin-Soo;Chien, Sung-Il
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.1 / pp.67-78 / 1996
  • The purpose of this paper is to recognize objects, even in the presence of occlusion, by using the generalized Hough transform (GHT). The GHT can be considered a kind of model-based object recognition algorithm and is executed in two stages. The first stage stores the information of the model in the form of an R-table (reference table). The second stage identifies the existence of objects in the image by using the R-table. An improved GHT method is proposed for a practical vision system. First, in constructing the R-table, we extract a partial arc from a portion of the whole object boundary and use this partial arc to construct the R-table. A clustering algorithm is also employed to compensate for errors arising from digitizing an object image. Second, an efficient method is introduced to avoid Ballard's use of a 4-D array, which is otherwise necessary for estimating the position, orientation, and scale change of an object; a 2-D array alone is enough for recognizing an object. In particular, a scale token method is introduced for calculating the scale change, which is easily affected by camera zoom. The results of our test show that the improved hierarchical GHT method operates stably in a realistic vision situation, even in the case of object occlusion.
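
The two GHT stages named in the abstract (build the R-table, then vote with it) can be sketched as below. This is a simplified, translation-only version with a 2-D accumulator and edge directions quantized to four values; the edge lists are hypothetical, and the paper's partial-arc R-table, clustering step, and scale token method are not reproduced here.

```python
from collections import Counter, defaultdict

def build_r_table(model_edges, reference):
    """Stage 1: index offsets (edge point -> reference point) by edge direction."""
    table = defaultdict(list)
    for x, y, direction in model_edges:
        table[direction].append((reference[0] - x, reference[1] - y))
    return table

def vote(image_edges, r_table):
    """Stage 2: every image edge votes for the reference points it is consistent with."""
    accumulator = Counter()
    for x, y, direction in image_edges:
        for dx, dy in r_table.get(direction, ()):
            accumulator[(x + dx, y + dy)] += 1
    return accumulator

# Hypothetical model: (x, y, quantized gradient direction in degrees).
model = [(0, 0, 180), (0, 1, 180), (0, 2, 180),
         (2, 0, 0), (2, 1, 0), (2, 2, 0),
         (1, 0, 270), (1, 2, 90)]
reference = (1, 1)

# The same shape shifted by (5, 3), with three edge points occluded.
image = [(5, 3, 180), (5, 4, 180), (5, 5, 180), (7, 3, 0), (7, 4, 0)]

accumulator = vote(image, build_r_table(model, reference))
peak, votes = accumulator.most_common(1)[0]
print(peak, votes)  # the peak marks the estimated reference point in the image
```

Because each visible edge point votes independently, the accumulator peak still lands on the shifted reference point when part of the boundary is missing, which is why the GHT tolerates occlusion.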

Depth Map Pre-processing using Gaussian Mixture Model and Mean Shift Filter (혼합 가우시안 모델과 민쉬프트 필터를 이용한 깊이 맵 부호화 전처리 기법)

  • Park, Sung-Hee;Yoo, Ji-Sang
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.5 / pp.1155-1163 / 2011
  • In this paper, we propose a new pre-processing algorithm that is applied to a depth map to improve coding efficiency. The 3DV/FTV group in MPEG is currently working on a standard for 3DVC (3D video coding), but the compression method for depth map images has not yet been settled. In the proposed algorithm, we first partition the histogram distribution of a given depth map using EM clustering based on a GMM, and then classify the depth map into several layered images. We then apply a different mean shift filter to each classified image according to whether it contains background or foreground. In other words, we try to maximize coding efficiency by preserving the boundary of each object while averaging the region inside the boundary. Experiments performed on many test images show that the proposed algorithm achieves a bit reduction of 19% to 20% and also reduces computation time.
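
The layering step (EM clustering of depth values under a Gaussian mixture) might look roughly like this. It is a sketch under several assumptions: the depth values, the two-component mixture, the starting means, and the initial variance are all made up, and the mean shift filtering that the paper applies afterwards is not shown.

```python
from math import exp, pi, sqrt

def em_gmm_1d(values, initial_means, iterations=20):
    """Fit a 1-D Gaussian mixture to depth values with plain EM.

    Assumes every component keeps some responsibility mass throughout.
    """
    k = len(initial_means)
    means = list(initial_means)
    weights = [1.0 / k] * k
    variances = [100.0] * k  # assumed broad starting variance
    resp = []
    for _ in range(iterations):
        # E-step: responsibility of each component for each depth value.
        resp = []
        for v in values:
            dens = [w * exp(-((v - m) ** 2) / (2 * s)) / sqrt(2 * pi * s)
                    for w, m, s in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density finite
    return weights, means, variances, resp

# Hypothetical 8-bit depth samples: a far background and a near foreground.
depth = [40, 42, 44, 41, 43, 200, 202, 204, 201, 203]
weights, means, variances, resp = em_gmm_1d(depth, initial_means=(0.0, 255.0))

# Assign each pixel to the layer with the highest responsibility.
layers = [max(range(len(means)), key=lambda j: r[j]) for r in resp]
print(means, layers)
```

Each resulting layer could then be filtered separately, which is the role the abstract gives to the per-layer mean shift filter.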

Development for Wetland Network Model in Nakdong Basin using a Graph Theory (그래프이론을 이용한 낙동강 유역의 습지네트워크 구축모델 개발)

  • Rho, Paikho
    • Journal of Wetlands Research / v.15 no.3 / pp.397-406 / 2013
  • Wetland conservation plans have been established to protect ecologically important wetlands based on vegetation integrity and the spatial distribution of endangered species, but recently there is growing demand for landscape ecological approaches that consider topological relationships, neighboring areas, and the spatial arrangement of wetlands at a broad scale. Landscape ecological analysis and graph theory are used to identify spatial characteristics related to the core nodes and weak links of wetland networks in the Nakdong basin. A regular planar model, selected for the wetland networks, is applied to the Nakdong basin. The analysis indicates that 5 regional groups and 4 core wetlands are extracted with a 15 km threshold distance. The IIC and PC values based on the binary and probability models suggest that wetland group C, composed of the main stream of the Nakdong River and the Geumho River, is the most important area for the wetland network. Wetland conservation plans and restoration projects for damaged and weak links between wetlands should be proposed by evaluating the nodes, links, and networks of wetlands from the local to the regional scale in the Nakdong basin.
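
A threshold-distance network of the kind described here can be sketched in a few lines. The wetland names and coordinates below are hypothetical, straight-line distance stands in for whatever distance measure the study used, and only the grouping step is shown (not the IIC or PC indices).

```python
from collections import deque
from math import dist

def build_links(sites, threshold_km):
    """Link every pair of wetlands whose distance is within the threshold."""
    names = list(sites)
    links = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if dist(sites[a], sites[b]) <= threshold_km:
                links[a].add(b)
                links[b].add(a)
    return links

def regional_groups(links):
    """Connected components of the network, found by breadth-first search."""
    seen, groups = set(), []
    for start in links:
        if start in seen:
            continue
        seen.add(start)
        group, queue = [], deque([start])
        while queue:
            node = queue.popleft()
            group.append(node)
            for neighbor in sorted(links[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(sorted(group))
    return groups

# Hypothetical wetland centroids in km on a projected grid.
wetlands = {"W1": (0, 0), "W2": (10, 0), "W3": (22, 0),
            "W4": (60, 5), "W5": (70, 5), "W6": (120, 40)}
links = build_links(wetlands, threshold_km=15)
groups = regional_groups(links)
print(groups)

# A simple proxy for a core node: the wetland with the most links.
core = max(links, key=lambda name: len(links[name]))
print(core)
```

Raising or lowering the threshold changes how many regional groups appear, which is why the abstract reports its results for a specific threshold distance.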

Predicting the Performance of Recommender Systems through Social Network Analysis and Artificial Neural Network (사회연결망분석과 인공신경망을 이용한 추천시스템 성능 예측)

  • Cho, Yoon-Ho;Kim, In-Hwan
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.159-172 / 2010
  • The recommender system is one of the possible solutions for assisting customers in finding the items they would like to purchase. To date, a variety of recommendation techniques have been developed. One of the most successful is Collaborative Filtering (CF), which has been used in a number of different applications such as recommending Web pages, movies, music, articles, and products. CF identifies customers whose tastes are similar to those of a given customer and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. Broadly, there are memory-based CF algorithms, model-based CF algorithms, and hybrid CF algorithms, which combine CF with content-based techniques or other recommender systems. While many researchers have focused their efforts on improving CF performance, the theoretical justification of CF algorithms is lacking. That is, much about how CF works remains unknown. Furthermore, the relative performance of CF algorithms is known to be domain and data dependent. It is very time-consuming and expensive to implement and launch a CF recommender system, and a system unsuited to the given domain provides customers with poor-quality recommendations that easily annoy them. Therefore, predicting the performance of CF algorithms in advance is of practical importance. In this study, we propose an efficient approach to predicting the performance of CF. Social Network Analysis (SNA) and Artificial Neural Network (ANN) are applied to develop our prediction model. CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. SNA facilitates an exploration of the topological properties of the network structure that are implicit in data for CF recommendations.
An ANN model is developed through an analysis of network topology measures such as network density, inclusiveness, clustering coefficient, network centralization, and Krackhardt's efficiency. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Inclusiveness refers to the number of nodes that are included within the various connected parts of the social network. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. Krackhardt's efficiency characterizes how dense the social network is beyond what is barely needed to keep the social group even indirectly connected. We use these social network measures as input variables of the ANN model. As the output variable, we use the recommendation accuracy measured by the F1-measure. To evaluate the effectiveness of the ANN model, sales transaction data from H department store, one of the well-known department stores in Korea, were used. A total of 396 experimental samples were gathered, and we used 40%, 40%, and 20% of them for training, test, and validation, respectively. 5-fold cross validation was also conducted to enhance the reliability of our experiments. The process of measuring the input variables consists of three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used Net Miner 3 and UCINET 6.0 for SNA, and Clementine 11.1 for ANN modeling. The experiments showed that the ANN model has 92.61% estimated accuracy and an RMSE of 0.0049. Thus, our prediction model helps decide whether CF is useful for a given application with certain data characteristics.
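
Two of the network measures used as ANN inputs, density and the clustering coefficient, can be computed as in the sketch below. The four-customer network is invented, and the average local clustering coefficient is an assumption: the abstract does not specify which variant of the coefficient was used.

```python
from itertools import combinations

def density(adj):
    """Links present as a proportion of the maximum possible (undirected graph)."""
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) / 2
    return 2 * m / (n * (n - 1))

def local_clustering(adj, node):
    """Share of a node's neighbor pairs that are themselves linked."""
    neighbors = adj[node]
    if len(neighbors) < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2 * linked / (len(neighbors) * (len(neighbors) - 1))

def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, node) for node in adj) / len(adj)

# Hypothetical customer network: a link means two customers have similar purchases.
customers = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
net_density = density(customers)
net_clustering = average_clustering(customers)
print(round(net_density, 3), round(net_clustering, 3))
```

In the study's setup, values like these (together with inclusiveness, centralization, and Krackhardt's efficiency) would form one input vector per experimental sample.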

State of Mind in the Flow 4-Channel Model and Play (플로우 4경로모형의 마음상태와 플레이(play))

  • Sohn, Jun-Sang
    • Journal of Global Scholars of Marketing Science / v.17 no.2 / pp.1-29 / 2007
  • Flow theory has become one of the most important frameworks in internet research. Hoffman and Novak proposed a hierarchical flow model showing the antecedents and outcomes of flow and the relationships among these variables in hypermedia computer environments (Hoffman and Novak 1996). This model was further tested after their initial research (Novak, Hoffman, and Yung 2000). In their paper, Hoffman and Novak explained that a balance of challenge and skill leads to flow, which is the positive optimal state of mind (Hoffman and Novak 1996). An imbalance between challenge and skill leads to negative states of mind such as anxiety, boredom, and apathy (Csikszentmihalyi and Csikszentmihalyi 1988). Almost all research on the flow 4-channel model has focused on flow, the positive state of mind (Ellis, Voelkl, and Morris 1994; Mathwick and Rigdon 2004). However, the formation of the negative states of mind and their outcomes also needs to be examined. Flow researchers explain play or playfulness as an antecedent or early state of flow. However, play has been regarded as a concept distinct from flow in the flow literature (Hoffman and Novak 1996; Novak, Hoffman, and Yung 2000). Mathwick and Rigdon discovered the influences of challenge and skill on play; they also observed the influence of play on web-loyalty and brand loyalty (Mathwick and Rigdon 2004). Unfortunately, they did not go so far as to test the influences of play on state of mind. This study focuses on the relationships between state of mind in the flow 4-channel model and play. Early research attempted to explain states of mind in flow theory hypothetically, but states other than flow have not been tested until now. The importance of play has also been emphasized in flow theory, but it has not been tested in the context of the flow 4-channel model. This researcher attempts to analyze the relationships among challenge, skill, play, state of mind, and web loyalty.
For this objective, I developed a measure for state of mind and defined the concept of play as a trait. Then, the influences of challenge and skill on state of mind and play under on-line shopping conditions were tested. The influences of play on state of mind were also tested, and those of flow and play on web loyalty were highlighted. 294 undergraduate students participated in the survey. They were asked about their perceptions of challenge, skill, state of mind, play, and web-loyalty with respect to an on-line shopping mall. Respondents were restricted to students who had bought products on-line within a month. Those who had bought products at two or more on-line shopping malls were asked to respond about the shopping mall where they bought the most important product. Construct validity, discriminant validity, and convergent validity were used to validate the measurements, and Cronbach's alpha was used to check scale reliability. A series of exploratory factor analyses was conducted. This researcher then conducted confirmatory factor analyses to assess the validity of the measurements. All items loaded significantly on their respective constructs, and all reliabilities were greater than .70. Chi-square difference tests and goodness of fit tests supported discriminant and convergent validity. The results of clustering and ANOVA showed that high challenge and high skill led to flow, low challenge and high skill led to boredom, and low challenge and low skill led to apathy. Contrary to my expectation, however, high challenge and low skill did not lead to anxiety but led to apathy. The results also showed that high challenge with high skill and high challenge with low skill led to the highest play, while low challenge led to low play. Four structural equation models, one each for flow, anxiety, boredom, and apathy, were built to analyze not only the impact of play on state of mind and web-loyalty, but also that of state of mind on web-loyalty.
According to the analysis results of these models, play impacted flow and web-loyalty positively, but impacted anxiety, boredom, and apathy negatively. Results also showed that flow impacted web-loyalty positively, but anxiety, boredom, and apathy impacted web-loyalty negatively. The interpretations and implications of the hypothesis test results are as follows. First, respondents belonging to different clusters based on challenge and skill level experienced different states of mind such as flow, anxiety, boredom, and apathy. The low challenge and low skill group felt the highest anxiety and apathy. One interpretation is that this group, feeling high anxiety or fear, avoided attempts to shop on-line. Second, it was found that higher challenge leads to higher levels of play. Test results show that the play level of the high challenge and low skill group (anxiety group) was higher than that of the high challenge and high skill group (flow group). However, this difference was not significant. Third, play positively impacted flow and negatively impacted boredom. The negative impacts on anxiety and apathy were not significant. This means that the combination of challenge and skill creates different results. Fourth, play and flow positively impacted web-loyalty, but anxiety, boredom, and apathy had negative impacts. The effect of play on web-loyalty was stronger in the anxiety, boredom, and apathy groups than in the flow group. These results show that challenge and skill influence state of mind and play. Results also demonstrate how play and flow influence web-loyalty. This implies that state of mind and play should be core marketing variables in internet marketing. Flow theory has focused on flow and on the positive outcomes of flow experiences. However, this research shows that many consumers experience a negative state of mind rather than a flow state in the internet shopping environment. Results show that a negative state of mind leads to low or negative web-loyalty.
Play can play an important role in web-loyalty when consumers are in a negative state of mind. Results of the structural equation model analyses show that play influences web-loyalty positively even when consumers are in a negative state of mind. This research identified the impacts of challenge and skill on state of mind in the flow 4-channel model, covering not only flow but also anxiety, boredom, and apathy. It also highlighted the role of play in the flow 4-channel model context and its impacts on web-loyalty. However, the tests show a few results that differ from the hypothesized expectations, such as the highest anxiety level occurring in the apathy group and the insignificant impacts of play on anxiety and apathy. Further research needs to replicate this study and/or compare the 3-channel model with the 4-channel model.
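
For readers unfamiliar with the flow 4-channel model, the theoretical mapping from challenge and skill to state of mind can be written as a small function. The midpoint of 4.0 (as on a 7-point scale) is a hypothetical cut-off chosen for illustration; the study itself formed its groups by clustering rather than by a fixed threshold.

```python
def flow_channel(challenge, skill, midpoint=4.0):
    """Theoretical 4-channel mapping from (challenge, skill) to a state of mind."""
    high_challenge = challenge > midpoint
    high_skill = skill > midpoint
    if high_challenge and high_skill:
        return "flow"
    if high_challenge:
        return "anxiety"   # challenge exceeds skill
    if high_skill:
        return "boredom"   # skill exceeds challenge
    return "apathy"        # both low

# Hypothetical survey scores (challenge, skill).
for scores in [(6, 6), (6, 2), (2, 6), (2, 2)]:
    print(scores, flow_channel(*scores))
```

This encodes the expected pattern only. As the abstract notes, the empirical result differed in one cell: the high challenge and low skill group experienced apathy rather than anxiety.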

Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.1-19 / 2018
  • Because forecasting stock movements is an important issue both academically and practically, studies on stock price prediction have been actively conducted. Stock price forecasting research is classified by whether it uses structured or unstructured data, and is further divided into technical analysis, fundamental analysis, and media effect analysis. In the big data era, research on stock price prediction that incorporates big data is actively underway, and such research mainly relies on machine learning techniques. In particular, methods that incorporate media effects have recently attracted attention, and among them studies that analyze online news to forecast stock prices are becoming mainstream. Previous studies predicting stock prices from online news mostly perform sentiment analysis of the news, building a separate corpus for each company and a dictionary that predicts stock prices by recording responses according to past stock prices. Existing studies have therefore examined the impact of online news on individual companies. For example, stock movements of Samsung Electronics are predicted using only online news about Samsung Electronics. In addition, methods that consider influences among highly related companies have also been studied recently. For example, stock movements of Samsung Electronics are predicted using news about Samsung Electronics and a highly related company such as LG Electronics. These previous studies examine the effects of news from a homogeneous industrial sector on an individual company. In these studies, homogeneous industries are classified according to the Global Industry Classification Standard. In other words, the existing studies assume that industries grouped under the Global Industry Classification Standard are homogeneous.
However, existing studies are limited in that they do not take into account influential companies with high relevance or reflect the heterogeneity that exists within the same Global Industry Classification Standard sector. Our examination of various sectors shows that some industrial sectors are not homogeneous groups. To overcome these limitations of existing studies, our study proposes a methodology that applies k-means clustering to reflect the heterogeneous effects within an industrial sector that affect the stock price. Multiple Kernel Learning is mainly used to integrate data with various characteristics. Multiple Kernel Learning has several kernels, each of which receives different data and makes predictions from it. To incorporate the effects of the target firm and its relevant firms simultaneously, we used Multiple Kernel Learning. Each kernel was assigned to predict stock prices from financial news variables of the target firm or of an industrial group obtained by k-means cluster analysis. To show that the proposed methodology is appropriate, experiments were conducted on three years of online news and stock prices. The results of this study are as follows. (1) We confirmed that information about the industrial sectors related to the target company also contains meaningful information for predicting the target company's stock movements, and that the machine learning algorithm has better predictive power when the news of the relevant companies and the target company's news are considered together. (2) It is important to predict stock movements with a varying number of clusters according to the level of homogeneity in the industrial sector. In other words, when stock prices are homogeneous within an industrial sector, it is important to use the relational effect at the level of the industry group without clustering, or with a small number of clusters.
When stock prices are heterogeneous within an industry group, it is important to cluster the firms into groups. This study contributes by showing that firms classified under the same Global Industry Classification Standard sector are heterogeneous and by suggesting that relevance should be defined through machine learning and statistical analysis rather than simply by the Global Industry Classification Standard. It also contributes by demonstrating the effectiveness of a prediction model that reflects heterogeneity.
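
The idea of giving each data source its own kernel can be illustrated with the basic building block of Multiple Kernel Learning: a weighted sum of per-source kernel matrices. Everything specific below is an assumption for the example, including the RBF kernel, the feature values, and the fixed weights; in actual MKL the weights are learned, and the abstract does not describe the kernels used.

```python
from math import exp

def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two feature vectors."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def combined_gram(sources, weights, gamma=0.5):
    """Weighted sum of one RBF Gram matrix per data source."""
    n = len(sources[0])
    return [[sum(w * rbf(src[i], src[j], gamma) for w, src in zip(weights, sources))
             for j in range(n)]
            for i in range(n)]

# Hypothetical daily news features for three trading days.
target_firm_news = [(0.1, 0.3), (0.2, 0.1), (0.9, 0.8)]   # target company's own news
cluster_news = [(0.5,), (0.4,), (0.1,)]                   # news of its k-means cluster

# Fixed illustrative weights; an MKL solver would estimate these from data.
gram = combined_gram([target_firm_news, cluster_news], weights=(0.6, 0.4))
for row in gram:
    print([round(value, 3) for value in row])
```

A combined matrix like this can be passed to any kernel method that accepts a precomputed kernel, so the target firm's news and its cluster's news influence the prediction together.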

Reconstruction of 3D Building Model from Satellite Imagery Based on the Grouping of 3D Line Segments Using Centroid Neural Network (중심신경망을 이용한 3차원 선소의 군집화에 의한 위성영상의 3차원 건물모델 재구성)

  • Woo, Dong-Min;Park, Dong-Chul;Ho, Hai-Nguyen;Kim, Tae-Hyun
    • Korean Journal of Remote Sensing / v.27 no.2 / pp.121-130 / 2011
  • This paper presents the reconstruction of rectilinear 3D rooftop models from satellite image data using a centroid neural network. The main idea of the proposed 3D reconstruction method is the grouping of 3D line segments. 3D lines are extracted from 2D lines and DEM (Digital Elevation Map) data derived from a pair of stereo images. Our grouping process consists of two steps. In the first step, we group fragmented or duplicated 3D lines into the principal 3D lines that can be used to construct the rooftop model; in the second step, we construct groups of lines that are parallel to each other. From the grouping result, 3D rooftop models are reconstructed by the final clustering process. High-resolution IKONOS images are used for the experiments. The experimental results indicate that the reconstructed building models closely reflect the actual position and shape of the buildings, and that the proposed approach can be efficiently applied to the building reconstruction problem for high-resolution satellite images of urban areas.
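
The second grouping step, collecting line segments that are parallel to each other, can be approximated with a simple angle-tolerance rule. Note that this is a swapped-in simplification and not the centroid neural network the paper uses; the segments and the 5-degree tolerance are hypothetical.

```python
from math import atan2, degrees

def orientation(segment):
    """Orientation of a 3D segment's ground projection, in degrees within [0, 180)."""
    (x1, y1, _), (x2, y2, _) = segment
    return degrees(atan2(y2 - y1, x2 - x1)) % 180.0

def group_parallel(segments, tolerance_deg=5.0):
    """Greedy grouping: a segment joins the first group with a similar orientation."""
    groups = []  # each entry: [representative angle, member segments]
    for segment in segments:
        angle = orientation(segment)
        for group in groups:
            diff = abs(angle - group[0])
            if min(diff, 180.0 - diff) <= tolerance_deg:  # orientations wrap at 180
                group[1].append(segment)
                break
        else:
            groups.append([angle, [segment]])
    return [members for _, members in groups]

# Hypothetical rooftop edges as ((x, y, z), (x, y, z)) endpoints in metres.
segments = [
    ((0, 0, 10), (10, 0, 10)),
    ((0, 5, 10), (10, 5.2, 10)),   # nearly parallel to the first edge
    ((0, 0, 10), (0, 5, 10)),
    ((10, 5, 10), (10, 0, 10)),    # same orientation, opposite direction
]
groups = group_parallel(segments)
print([len(group) for group in groups])
```

For a rectilinear rooftop, two such groups at roughly right angles are what one would expect to feed into the final model construction.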

Real Estate Price Forecasting by Exploiting the Regional Analysis Based on SOM and LSTM (SOM과 LSTM을 활용한 지역기반의 부동산 가격 예측)

  • Shin, Eun Kyung;Kim, Eun Mi;Hong, Tae Ho
    • The Journal of Information Systems / v.30 no.2 / pp.147-163 / 2021
  • Purpose: The study aims to predict real estate prices by utilizing regional characteristics. Because real estate is immobile, the characteristics of a region have a great influence on its price. In addition, real estate prices are closely related to economic development and are a major concern for policy makers and investors. Accurate house price forecasting is necessary to prepare for the impact of house price fluctuations. To improve the performance of our predictive models, we applied LSTM, a widely used deep learning technique for predicting time series data. Design/methodology/approach: This study used time series data on real estate prices provided by the Ministry of Land, Infrastructure and Transport. For time series preprocessing, HP filters were applied to decompose trends, and SOM was used to cluster regions with similar price directions. To build a real estate price prediction model, SVR and LSTM were applied, and the prices of regions classified into similar clusters by SOM were used as input variables. Findings: The clustering results showed that regions in the same cluster were geographically close, and it was possible to confirm the characteristics of being classified into the same cluster even if there was a price level and a similar industry group. In predicting real estate prices 1, 2, and 3 months ahead, LSTM showed better predictive performance than SVR, and LSTM showed better predictive performance in longer-term forecasting 3 months ahead than in short-term forecasting 1 month ahead.
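
The regional clustering step can be illustrated with a very small one-dimensional self-organizing map. This is a sketch under assumptions: the monthly price-change vectors are invented, the map has only two units, and the deterministic initialization and linearly decaying learning rate and radius are choices made for the example. The HP filter, SVR, and LSTM stages are not shown.

```python
from math import dist, exp

def train_som(samples, n_units, epochs=50, lr0=0.5, radius0=1.0):
    """Train a 1-D SOM; units are initialized from evenly spaced samples."""
    step = max(len(samples) // n_units, 1)
    weights = [list(samples[(i * step) % len(samples)]) for i in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs            # decays linearly toward zero
        lr = lr0 * frac
        radius = max(radius0 * frac, 1e-3)
        for sample in samples:
            # Best matching unit: the unit whose weights are closest to the sample.
            bmu = min(range(n_units), key=lambda u: dist(sample, weights[u]))
            for u in range(n_units):
                # Units near the BMU on the map are pulled along with it.
                influence = exp(-((u - bmu) ** 2) / (2 * radius ** 2))
                weights[u] = [w + lr * influence * (x - w)
                              for w, x in zip(weights[u], sample)]
    return weights

def assign(samples, weights):
    """Map each sample to the index of its best matching unit."""
    return [min(range(len(weights)), key=lambda u: dist(s, weights[u]))
            for s in samples]

# Hypothetical monthly price changes (%) for four regions over three months.
regions = [(1.0, 1.2, 0.9), (1.1, 1.0, 1.0), (-0.8, -1.0, -0.9), (-1.0, -0.9, -1.1)]
som_weights = train_som(regions, n_units=2)
clusters = assign(regions, som_weights)
print(clusters)
```

Regions that land on the same unit share a price direction, and in the study's design the prices of such same-cluster regions serve as additional inputs to the forecasting model.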