• Title/Summary/Keyword: Cluster based network


A study on the estimation of AADT by short-term traffic volume survey (단기조사 교통량을 이용한 AADT 추정연구)

  • 이승재;백남철;권희정
    • Journal of Korean Society of Transportation / v.20 no.6 / pp.59-68 / 2002
  • AADT (Annual Average Daily Traffic) can be estimated from short-term traffic counts rather than from traffic data collected over all 365 days. The estimation process is therefore very important, and there have been many studies on estimating AADT. In this paper, we tried to improve the AADT estimation process, building on earlier AADT estimation research. First, we looked for the factor that best shows differences among groups. To do so, we examined hourly variables (divided into total hours, weekday hours, Saturday hours, Sunday hours, weekday and Sunday hours, and weekday and Saturday hours), changing the number of groups each time. In the end, we selected the hourly variables of Sunday and weekday as the factor showing differences among groups. Second, we classified 200 locations into 10 groups through cluster analysis using only monthly variables. The rule for deciding the number of groups is to maximize the deviation among the hourly variables of each group. Third, we assigned the 200 locations used in the second step to the 10 groups by applying statistical techniques such as discriminant analysis and a neural network. This step tests the rate at which each location is assigned to its correct group rather than a wrong one. In conclusion, the result of this study's method was closer to the real AADT value than that of the former method, and this study contributes significantly to improving the method of AADT estimation.
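
The abstract does not state the estimation formula. As a hedged illustration of how a short-term count is commonly expanded to an AADT estimate once a location has been assigned to a group, the sketch below multiplies a 24-hour count by that group's monthly and day-of-week adjustment factors. The factor values, the tables, and the function name `estimate_aadt` are hypothetical and are not taken from the paper.

```python
# Hypothetical adjustment factors for one location group (not from the paper).
# Each factor is assumed to be AADT divided by the average daily traffic
# observed in that month or on that type of day.
MONTHLY_FACTOR = {"Jan": 1.10, "Jul": 0.92}
DAY_OF_WEEK_FACTOR = {"weekday": 0.98, "Saturday": 1.05, "Sunday": 1.12}


def estimate_aadt(count_24h: float, month: str, day_type: str) -> float:
    """Expand a single 24-hour count to an AADT estimate using group factors."""
    return count_24h * MONTHLY_FACTOR[month] * DAY_OF_WEEK_FACTOR[day_type]


# A Sunday count of 10,000 vehicles taken in January.
aadt = estimate_aadt(10_000, "Jan", "Sunday")
print(round(aadt))
```

Under this sketch, assigning a location to the wrong group means applying the wrong factors, which is why the classification step in the abstract matters for the final estimate.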

A Performance Improvement Scheme for a Wireless Internet Proxy Server Cluster (무선 인터넷 프록시 서버 클러스터 성능 개선)

  • Kwak, Hu-Keun;Chung, Kyu-Sik
    • Journal of KIISE:Information Networking / v.32 no.3 / pp.415-426 / 2005
  • Wireless internet, which has become a hot social issue, has limitations that distinguish it from wired internet: low bandwidth, frequent disconnection, and low computing power and small screens in user terminals. It also has technical issues to improve in terms of user mobility, network protocols, security, and so on. A wireless internet server should be scalable enough to handle large-scale traffic from a rapidly growing user base. In this paper, wireless internet proxy server clusters are used for the wireless internet because their caching, distillation, and clustering functions help overcome the above limitations and meet these needs. TranSend was proposed as a clustering-based wireless internet proxy server, but it has disadvantages: 1) scalability is difficult to achieve because there is no systematic way to scale it, and 2) its structure is complex because of the inefficient communication structure among modules. In our former research, we proposed the All-in-one structure, which can be scaled in a systematic way, but it also has disadvantages: 1) data sharing among cache servers is not allowed, and 2) its communication structure among modules is complex. In this paper, we propose an improved scheme that has an efficient communication structure among modules and allows data to be shared among cache servers. We performed experiments using 16 PCs, and the experimental results show 54.86% and 4.70% performance improvements of the proposed system over TranSend and the All-in-one system, respectively. Because data is shared among cache servers, the proposed scheme has the advantage of keeping the total cache memory at a fixed size regardless of the number of cache servers. In contrast, in All-in-one the total cache memory size increases in proportion to the number of cache servers, since each cache server must keep all of the cache data.
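
The cache-memory claim at the end of the abstract can be illustrated with a small sketch. It compares a cluster in which each item is stored on exactly one server against one in which every server keeps every item. The hash-modulo placement in `owner` is an assumption made for illustration; the abstract does not describe how the proposed scheme decides where shared data lives.

```python
import zlib


def owner(url: str, n_servers: int) -> int:
    """Pick the single cache server that stores this URL (assumed hash partitioning)."""
    return zlib.crc32(url.encode("utf-8")) % n_servers


def total_cached_items(urls: list[str], n_servers: int, shared: bool) -> int:
    """Count cached copies across the whole cluster.

    shared=True : each item lives on exactly one server (data sharing).
    shared=False: every server keeps its own copy of every item.
    """
    unique = set(urls)
    if shared:
        per_server = [0] * n_servers
        for url in unique:
            per_server[owner(url, n_servers)] += 1
        return sum(per_server)
    return len(unique) * n_servers


urls = [f"http://example.com/page{i}" for i in range(100)]
for n in (2, 4, 8):
    print(n, total_cached_items(urls, n, shared=True), total_cached_items(urls, n, shared=False))
```

With sharing, the total stays at the number of unique items however many servers are added; without it, the total grows linearly with the server count.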

Geometric Analysis of Fracture System and Suggestion of a Modified RMR on Volcanic Rocks in the Vicinity of Ilgwang Fault (일광단층 인근 화산암 암반사면의 단열계 기하 분석 및 암반 분류 수정안 제시)

  • Chang, Tae-Woo;Lee, Hyeon-Woo;Chae, Byung-Gon;Seo, Yong-Seok;Cho, Yong-Chan
    • The Journal of Engineering Geology / v.17 no.3 / pp.483-494 / 2007
  • The properties of the fracture system on road-cut slopes along the Busan-Ulsan expressway under construction are investigated and analyzed. Fracture spacing distributions show a log-normal form for extension fractures and a negative exponential form for shear fractures. Straight-line segments in log-log plots of cumulative fracture length indicate power-law scaling with exponents of -1.13 at site 1, -1.01 at site 2, and -1.52 at site 3. Judging from the analyses of spacing, density, and intersection of fractures at the three sites, the stability and strength of the rock mass are likely lowest at site 1. In contrast, the highest efficiency of the fracture network for conducting fluid flow is seen at site 3, where the largest cluster occupies 73% of the window map. Based on the field survey data, this study modified the weighting values of the RMR system using multiple regression analysis. The analysis suggests the following modified weighting values for the RMR parameters: 18 for the intact strength of rock; 61 for RQD; 2 for spacing of discontinuities; 2 for the condition of discontinuities; and 17 for groundwater.
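
A straight line on a log-log plot corresponds to a power law, and its slope is the exponent. The sketch below shows one way such an exponent could be recovered by ordinary least squares on the logged values. The data are synthetic, generated from an exponent of -1.13 to mirror the site 1 figure; the function name and the fitting procedure are illustrative assumptions, not the authors' method.

```python
import math


def power_law_exponent(lengths: list[float], counts: list[float]) -> float:
    """Least-squares slope of log10(count) against log10(length)."""
    xs = [math.log10(v) for v in lengths]
    ys = [math.log10(v) for v in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic cumulative counts following N(L) = 50 * L**-1.13 (not field data).
lengths = [0.5, 1.0, 2.0, 4.0, 8.0]
counts = [50 * length ** -1.13 for length in lengths]
print(round(power_law_exponent(lengths, counts), 2))
```

On real survey data only the straight-line segment of the plot would be fitted, since the curve typically bends away from the power law at very short and very long fracture lengths.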

Genetic Diversity and Relationship of the Genus Barbatula (Cypriniformes; Nemacheilidae) by Mitochondrial DNA Cytochrome b Partial Gene in Korea (한국산 종개속(Barbatula) 어류의 유전적 다양성 특성 연구)

  • An, Jung-Hyun;Yu, Jeong-Nam;Kim, Byung-Jik;Bae, Yang-Seop
    • Korean Journal of Ichthyology / v.33 no.2 / pp.107-116 / 2021
  • Two stone loaches (Nemacheilidae, Cypriniformes), Barbatula toni (Dybowski, 1869) and B. nuda (Bleeker, 1864), have been recognized in Korean waters to date. Recently, because of indiscriminate artificial introduction and habitat changes caused by natural disasters, there is concern that species-specific geographic boundaries are being disrupted. We examined the genetic difference between the two Korean Barbatula species using a haplotype network based on cytochrome b sequences of mitochondrial DNA, and examined their phylogenetic relationships together with Barbatula fishes occurring around the Korean peninsula. As a result, three and 29 haplotypes were obtained from B. toni and B. nuda, respectively, and a total of three clades were identified: the "toni group", the "nuda hangang group", and the "nuda donghae group". Variable sites in the sequences among them amounted to 10-24%, a difference at the interspecific level. In particular, the latter group forms an independent cluster distinct from the other two groups as well as from the Chinese, Japanese, Russian, and European Barbatula species, suggesting the possibility of species-level divergence.
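
The abstract reports the share of variable sites between groups but not how it was computed. One simple measure of that kind is the uncorrected p-distance, the proportion of aligned positions at which two sequences differ. The sketch below assumes that measure and uses made-up 20-bp fragments, not real Barbatula cytochrome b data.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


# Invented haplotype fragments that differ at 3 of 20 sites.
hap_1 = "ATGGCAAGCCTACGAAAAAC"
hap_2 = "ATGGCTAGCCTTCGAAAGAC"
print(f"{p_distance(hap_1, hap_2):.0%}")
```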

Differences in Environmental Behavior Practice Experience according to the Level of Environmental Literacy Factors (환경소양 요인별 수준에 따른 환경행동 실천 경험의 차이)

  • Yoonkyung Kim;Jihoon Kang;Dongyoung Lee
    • Journal of the Korean Society of Earth Science Education / v.16 no.1 / pp.153-165 / 2023
  • This study investigates learners' environmental literacy, classifies the results by factors of environmental literacy, and then examines the differences in the students' environmental behavior practice experiences according to that classification. The final analysis covered 47 6th-grade students from D elementary school in P metropolitan city, and environmental literacy questionnaires and environmental behavior practice experience questionnaires were used as the main data. The learners were classified into three groups according to the factors of environmental literacy, named the "High environmental literacy group", the "low environmental literacy group", and the "Low Function and Affectif group". A word network was formed from the descriptions of environmental behavior practice experiences for each cluster, and a degree centrality analysis was performed to visualize and analyze it. The analysis confirmed that the "High environmental literacy group" 1) recognized the subjects of environmental action practice as individuals and families, 2) described their experience of environmental action practice in relation to all elements of environmental literacy, and had a relatively pessimistic view. The "low environmental literacy group" and the "Low Function and Affectif group" 1) perceived the subject of environmental behavior practice as a relatively social problem, 2) gave descriptions of their experience of environmental behavior practice that were relatively biased toward specific factors, with the "Low Function and Affectif group" particularly focused on the knowledge element, and 3) were aware of climate change from a relatively optimistic perspective. Based on these conclusions, suggestions were made from the perspective of environmental education.
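
Degree centrality on a word network can be sketched in a few lines. The example below links two words whenever they appear in the same response and reports each word's degree divided by the maximum possible degree. The co-occurrence rule, the function name, and the tokenized responses are illustrative assumptions; the abstract does not say how the study built its network.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(documents: list[list[str]]) -> dict[str, float]:
    """Normalized degree of each word in a co-occurrence network.

    Two words are linked if they appear together in at least one response.
    """
    neighbors: dict[str, set[str]] = defaultdict(set)
    for words in documents:
        for a, b in combinations(sorted(set(words)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    return {word: len(adj) / (n - 1) for word, adj in neighbors.items()}


# Invented tokenized responses, not the study's questionnaire data.
responses = [
    ["family", "recycling", "home"],
    ["recycling", "school"],
    ["family", "energy"],
]
centrality = degree_centrality(responses)
for word, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(word, round(score, 2))
```

Words with high scores are the ones that co-occur with the widest range of other words, which is the kind of signal the study uses to compare how each group describes its experiences.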

Analysis of Space Use Patterns of Public Library Users through AI Cameras (AI 카메라를 활용한 공공도서관 이용자의 공간이용행태 분석 연구)

  • Gyuhwan Kim;Do-Heon Jeong
    • Journal of the Korean Society for Library and Information Science / v.57 no.4 / pp.333-351 / 2023
  • This study investigates user behavior in library spaces through AI camera analytics. Using the face recognition and tracking capabilities of AI cameras, we identified the gender and age of visitors and collected video data to track their movements. Our findings revealed that female users slightly outnumbered male users and that the dominant age group was individuals in their 30s. User visits peaked from Tuesday to Friday, with the highest footfall recorded between 14:00 and 15:00, while visits decreased over the weekend. Most visitors used one or two specific spaces, frequently consulting the information desk for inquiries, checking out or returning items, or using the rest area for relaxation. The library stacks were used approximately twice as much as they were avoided. The most frequented subject areas were Philosophy(100), Religion(200), Social Sciences(300), Science(400), Technology(500), and Literature(800), with Literature(800) and Religion(200) displaying the most intersections with other areas. By categorizing users into five clusters based on space utilization patterns, we discerned varying objectives and subject interests, providing insights for future library service enhancements. Moreover, the study underscores the need to address the associated costs and privacy concerns when considering the broader application of AI camera analytics in library settings.

Evaluation of Germplasm and Development of SSR Markers for Marker-assisted Backcross in Tomato (분자마커 이용 여교잡 육종을 위한 토마토 유전자원 평가 및 SSR 마커 개발)

  • Hwang, Ji-Hyun;Kim, Hyuk-Jun;Chae, Young;Choi, Hak-Soon;Kim, Myung-Kwon;Park, Young-Hoon
    • Horticultural Science & Technology / v.30 no.5 / pp.557-567 / 2012
  • This study was conducted to obtain basic information for developing disease-resistant tomato cultivars through marker-assisted backcross (MAB). Ten inbred lines with TYLCV, late blight, bacterial wilt, or powdery mildew resistance and four adapted inbred lines with superior horticultural traits were collected; these can serve as the donor parents and recurrent parents in MAB, respectively. The collected inbred lines were evaluated with molecular markers and bioassays to confirm their disease resistance. To develop DNA markers for selecting the recurrent parent genome (background selection) in MAB, a total of 108 simple sequence repeat (SSR) primer sets (nine per chromosome on average) were selected from the tomato reference genetic maps posted on the SOL Genomics Network. Genetic similarity and relationships among the inbred lines were assessed using a total of 303 polymorphic SSR markers. The similarity coefficient ranged from 0.33 to 0.80; the highest (0.80) was found between the bacterial wilt-resistant donor lines '10BA333' and '10BA424', and the lowest (0.33) between the late blight-resistant wild species line L3708 (S. pimpinellifolium L.) and '10BA424'. UPGMA analysis grouped the inbred lines into three clusters at a similarity coefficient of 0.58. Most of the donor lines with the same resistance were closely related, indicating the possibility that these lines were developed from a common resistance source. Parent combinations (donor parent × recurrent parent) showing appropriate levels of genetic distance and SSR marker polymorphism for MAB were selected based on the dendrogram. These combinations included 'TYR1' × 'RPL1' for TYLCV, '10BA333' or '10BA424' × 'RPL2' for bacterial wilt, and 'KNU12' × 'AV107-4' or 'RPL2' for powdery mildew.
For late blight, the wild species resistant line 'L3708' was distantly related to all recurrent parental lines, and a suitable parent combination for MAB was 'L3708' × 'AV107-4', which showed a similarity coefficient of 0.41 and 45 polymorphic SSR markers.
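
The abstract does not name the similarity coefficient it uses. As one common choice for presence/absence marker data, the sketch below computes a Jaccard coefficient between two lines scored as 0/1 band profiles. The coefficient choice, the function name, and the band scores are assumptions for illustration only.

```python
def jaccard_similarity(profile_a: list[int], profile_b: list[int]) -> float:
    """Bands present in both lines divided by bands present in either line."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    return both / either if either else 0.0


# Invented 0/1 band scores for two inbred lines across ten marker alleles.
line_x = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
line_y = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]
print(round(jaccard_similarity(line_x, line_y), 2))
```

A matrix of such pairwise values is the usual input to UPGMA clustering, which repeatedly merges the two most similar groups to build the dendrogram.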

Development of Customer Sentiment Pattern Map for Webtoon Content Recommendation (웹툰 콘텐츠 추천을 위한 소비자 감성 패턴 맵 개발)

  • Lee, Junsik;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.67-88 / 2019
  • Webtoon is a Korean-style digital comics platform that distributes comics content produced using the characteristic elements of the Internet in a form that can be consumed online. With the recent rapid growth of the webtoon industry and the exponential increase in the supply of webtoon content, the need for effective webtoon content recommendation measures is growing. Webtoons are digital content products that combine pictorial, literary, and digital elements. They therefore stimulate consumer sentiment by entertaining readers and leading them to engage and empathize with the situations the webtoons present. In this context, the sentiment that webtoons evoke in consumers can be expected to serve as an important criterion for consumers' choice of webtoons. However, there is a lack of research on improving webtoon recommendation performance by using consumer sentiment. This study aims to develop consumer sentiment pattern maps that can support effective recommendation of webtoon content, focusing on consumer sentiments that have not been fully discussed previously. To conduct this study, metadata and consumer sentiment data were collected for 200 works offered on the Korean webtoon platform 'Naver Webtoon'. Excluding works that did not meet the purpose of the analysis, 488 sentiment terms were collected for 127 works. Next, similar or duplicate terms were combined or abstracted following a bottom-up approach. As a result, we built a webtoon-specific sentiment index, reduced to a total of 63 emotive adjectives. By performing exploratory factor analysis on the constructed sentiment index, we derived three important dimensions for classifying webtoon types. The exploratory factor analysis was performed through Principal Component Analysis (PCA) using varimax factor rotation. The three dimensions were named 'Immersion', 'Touch', and 'Irritant', respectively.
Based on this, K-Means clustering was performed and all of the webtoons were classified into four types. The types were named 'Snack', 'Drama', 'Irritant', and 'Romance'. For each type of webtoon, we drew webtoon-sentiment 2-mode network graphs and examined the characteristics of the sentiment pattern appearing for each type. In addition, through profiling analysis, we were able to derive meaningful strategic implications for each type of webtoon. First, the 'Snack' cluster is a collection of webtoons that are fast-paced and highly entertaining. Many consumers are interested in these webtoons, but they do not rate them well. Consumers also mostly use simple expressions of sentiment when talking about these webtoons. Webtoons belonging to 'Snack' are expected to appeal to modern people who want to consume content easily and quickly during short travel times, such as a commute. Second, webtoons belonging to 'Drama' are expected to evoke realistic and everyday sentiments rather than exaggerated and light comic ones. When consumers talk online about webtoons belonging to the 'Drama' cluster, they are found to express a variety of sentiments. It is appropriate to establish an OSMU (One Source Multi-Use) strategy to extend these webtoons to other content such as movies and TV series. Third, the sentiment pattern map of 'Irritant' shows the sentiments that discourage customer interest by stimulating discomfort. Webtoons that evoke these sentiments struggle to gain public attention. Artists should pay attention to the sentiments that cause consumers discomfort when creating webtoons. Finally, webtoons belonging to 'Romance' do not evoke a variety of consumer sentiments, but they are interpreted as touching consumers. They are expected to be consumed as 'healing content' targeted at consumers with high levels of stress or mental fatigue in their lives.
The results of this study are meaningful in that they identify the applicability of consumer sentiment to the recommendation and classification of webtoons, and provide guidelines to help members of the webtoon ecosystem better understand consumers and formulate strategies.
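
The clustering step described above, K-Means applied to factor scores, can be sketched as follows. This is a plain implementation of Lloyd's algorithm on invented three-dimensional scores standing in for the 'Immersion', 'Touch', and 'Irritant' dimensions. It uses three clusters and fixed starting centroids for brevity, whereas the study reports four types; none of the numbers come from the paper.

```python
import math


def k_means(points, initial_centroids, iterations=10):
    """Lloyd's algorithm: assign each point to its nearest centroid, then recompute means."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster goes empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Invented factor scores per webtoon: (Immersion, Touch, Irritant).
points = [
    (2.0, 0.1, 0.0), (1.8, 0.2, 0.1),
    (0.1, 2.0, 0.0), (0.2, 1.9, 0.1),
    (0.0, 0.1, 2.1), (0.1, 0.0, 1.9),
]
labels, centroids = k_means(points, [points[0], points[2], points[4]])
print(labels)
```

In practice the starting centroids are usually chosen at random or by a seeding heuristic and the run is repeated, because K-Means can settle on different partitions from different starts.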

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.143-163 / 2016
  • The demographics of Internet users are the most basic and important sources for target marketing or personalized advertisements on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, slow, and likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere on a webpage, the activity is logged in semi-structured website log files. Such data lets us see which pages users visited, how long they stayed, how often they visited, when they usually visited, which sites they prefer, what keywords they used to find a site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with the demographics. The variables include search keywords; frequency and intensity by time, day, and month; the variety of websites visited; text information from the web pages visited; and so on. The demographic attributes to predict also vary from paper to paper, and cover gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision tree, neural network, logistic regression, and k-nearest neighbors, were used to build prediction models. However, this research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated in order to build the best prediction model.
The objective of this study is to choose the clickstream attributes most likely to be correlated with the demographics from the results of previous research, and then to identify which data mining method is suited to predicting each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job. From the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of predictive model building is composed of four steps. In the first step, we create user profiles that include 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction on the clickstream variables to address the curse of dimensionality and the overfitting problem. We use three approaches, based on decision tree, PCA, and cluster analysis. In the third step, we build alternative predictive models for each demographic variable. SVM, neural network, and logistic regression are used for modeling. The last step evaluates the alternative models in terms of model accuracy and selects the best model. For the experiments, we used clickstream data representing 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for our prediction process, and 5-fold cross validation was conducted to enhance the reliability of our experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best when using decision tree based dimension reduction with a neural network, whereas the prediction of gender and marital status is most accurate when applying SVM without dimension reduction.
We conclude that the online behaviors of Internet users, captured through clickstream data analysis, can be used effectively to predict their demographics and thus applied to digital marketing.
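
The 5-fold cross validation mentioned above splits the users into five parts and holds each part out once for testing. The sketch below shows only the index bookkeeping, using contiguous folds for clarity. It is not the IBM SPSS Modeler procedure the study used, and the function name is hypothetical.

```python
def k_fold_indices(n_samples: int, k: int = 5):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


# Ten samples split into five folds of two test samples each.
for train, test in k_fold_indices(10, k=5):
    print(test)
```

Each sample lands in exactly one test fold, so every user contributes to the accuracy estimate once. Real pipelines normally shuffle the indices first so that folds are not affected by the order of the records.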

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.57-73 / 2021
  • Maintenance and failure prevention through anomaly detection in ICT infrastructure are becoming increasingly important. System monitoring data is multidimensional time series data, which makes it difficult to account for the characteristics of multidimensional data and of time series data at the same time. With multidimensional data, the correlation between variables should be considered. Existing probability-based, linear, and distance-based methods degrade because of the curse of dimensionality. In addition, time series data is preprocessed with the sliding window technique and time series decomposition for autocorrelation analysis. These techniques increase the dimension of the data, so they need to be supplemented. Anomaly detection is a long-established research field; statistical methods and regression analysis were used in its early days, and there are now active studies applying machine learning and artificial neural networks to it. Statistically based methods are difficult to apply when data is non-homogeneous, and they do not detect local outliers well. The regression analysis method learns a regression formula based on parametric statistics and then detects abnormality by comparing the predicted value with the actual value. Anomaly detection using regression analysis has the disadvantage that performance drops when the model is not solid or when the data contains noise or outliers. It therefore carries the restriction that training data free of noise and outliers should be used. An autoencoder built on artificial neural networks is trained to produce output as similar as possible to its input data. It has many advantages over existing probability and linear models, cluster analysis, and supervised learning. It can be applied to data that does not satisfy probability distribution or linear assumptions.
In addition, unsupervised learning is possible without labeled training data. However, anomaly detection still has limits in identifying local outliers in multidimensional data, and the dimension of the data increases greatly because of the characteristics of time series data. In this study, we propose a CMAE (Conditional Multimodal Autoencoder) that enhances anomaly detection performance by considering local outliers and time series characteristics. First, we applied a Multimodal Autoencoder (MAE) to address the limits of local outlier identification in multidimensional data. Multimodal models are commonly used to learn from different types of inputs, such as voice and image. The different modalities share the autoencoder's bottleneck, through which the model learns their correlation. In addition, a CAE (Conditional Autoencoder) was used to learn the characteristics of time series data effectively without increasing the dimension of the data. Conditional inputs are generally categorical variables, but in this study time was used as the condition in order to learn periodicity. The proposed CMAE model was verified by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The reconstruction performance of the autoencoder for 41 variables was checked in the proposed model and the comparison models. Reconstruction performance differs by variable; reconstruction works well, with small loss values, for the Memory, Disk, and Network modalities in all three autoencoder models. The Process modality did not show a significant difference across the three models, and the CPU modality showed excellent performance in CMAE. ROC curves were prepared to evaluate anomaly detection performance in the proposed model and the comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, UAE.
In particular, recall was 0.9828 for CMAE, which confirms that it detects nearly all of the abnormalities. The accuracy of the model also improved, to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. In practical terms, the proposed model has an advantage beyond the performance improvement. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the dimensional increase they cause can slow down computation at inference. The proposed model is easy to apply to practical tasks in terms of inference speed and model management.
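
The evaluation metrics named in the abstract all derive from the four cells of a confusion matrix. The sketch below computes accuracy, precision, recall, and F1-score from such counts. The counts are invented for illustration and do not reproduce the paper's results; the function name is hypothetical.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Invented counts: 100 true anomalies and 100 normal samples.
metrics = classification_metrics(tp=90, fp=20, fn=10, tn=80)
for name, value in metrics.items():
    print(name, round(value, 4))
```

High recall with lower precision, the pattern the abstract's recall and F1 figures imply, means the detector misses few anomalies at the cost of some false alarms.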