• Title/Summary/Keyword: clustering method

A Study on Social Issues and Consumption Behavior Using Big Data (빅데이터를 활용한 사회적 이슈와 소비행동 연구)

  • Baek, Seung-Heon;Kim, Gi-Tak
    • Journal of Korea Entertainment Industry Association
    • /
    • v.13 no.8
    • /
    • pp.377-389
    • /
    • 2019
  • This study conducted a social network big data analysis to investigate consumers' perceptions of Japanese sporting goods in the context of the boycott of Japanese products, and to extract problems and variables from those perceptions. The analysis covered two areas, "Japanese boycott" and "Japanese sporting goods". Months of data were collected and investigated. The research method consisted of the following steps: identifying the issues of the times; setting keywords using social network analysis; clustering by CONCOR analysis using the TEXTOM and Ucinet 6 programs; selecting variables through expert meetings; preparing and administering the questionnaire; verifying the validity and reliability of the questionnaire; and testing hypotheses using a structural equation model. Based on the social network big data results, four variables were extracted: boycott-related characteristics, nationality, attitude, and consumption behavior. A total of 30 questions and 292 questionnaires were used for the final hypothesis verification. The analysis showed, first, that the boycott-related characteristics had a positive relationship with nationality. Specifically, all of the boycott-related characteristics (necessary boycott, sense of boycott, and perceived boycott benefits) were positively related to nationality. In addition, nationality was found to have a positive relationship with consumption behavior.

A Study on the Visualization of Data in Virtual Space utilizing Realistic Exhibition Contents - Focusing on the application of the Tamed Cloud clustering algorithm in 70mK project (전시콘텐츠에 구현된 가상공간 내 데이터 시각화 연구 - 70mK의 Tamed Cloud 군집형 알고리즘 적용을 중심으로)

  • Sungmin Kang;Daniel H. Byun
    • Trans-
    • /
    • v.15
    • /
    • pp.1-24
    • /
    • 2023
  • This study examines the application of data visualization technology using a clustered data algorithm called 'Tamed Cloud' to virtual spaces and explores the possibility of implementing it in various types of realistic exhibition contents. To this end, we first classify the virtual reality (VR) exhibition contents that have appeared since the onset of COVID-19 and summarize the technologies applied. Realistic exhibition contents give visitors an opportunity to appreciate artworks through online and virtual exhibitions. Within this trend, virtual reality and augmented reality (AR) technologies have been introduced, allowing visitors to enjoy artworks more immersively, and the possibility of realistic exhibition content with interaction between the artwork and the user is also being demonstrated. Against this background, this study reviews the history of exhibition contents before and after the advent of virtual reality technology, and examines how the clustered algorithm technology called Tamed Cloud was applied to virtual space and implemented as realistic exhibition content in the <70mK> project. Synthesizing these findings, we propose a convergence method of data visualization, virtual reality, and realistic content as a new alternative for realistic exhibition content in virtual space.

A study on the electrolytic properties of $CaF_2$ crystals with $YF_3$ addition ($YF_3 $ 첨가에 따른 $CaF_2 $ 결정의 고체전해질 특성에 관한 연구)

  • Cha, Y.W.;Park, D.C.;Orr, K.K.
    • Journal of the Korean Crystal Growth and Crystal Technology
    • /
    • v.4 no.1
    • /
    • pp.21-32
    • /
    • 1994
  • $CaF_2$ crystals were grown at various growth rates by the Bridgman method, and their electrical properties were measured with an AC impedance analyzer to examine how ionic conductivity changes with growth rate. As the growth rate increased, the $CaF_2$ crystals changed from single crystals to polycrystals, and as the grain boundaries and various defects changed, the ionic conductivity changed dramatically. $YF_3$ was added to $CaF_2$ to disorder the $CaF_2$ structure and to increase the number of $F^-$ carriers and vacancies in the crystals, yielding $Ca_{1-x}Y_xF_{2+x}$ crystals. The ionic conductivity of the $Ca_{1-x}Y_xF_{2+x}$ crystals was then investigated as a function of $YF_3$ addition, and the temperature dependence of the ionic conductivity of $CaF_2$ and $Ca_{1-x}Y_xF_{2+x}$ crystals was compared. In addition, the effects of clustering and defects on the electrical properties of the solid electrolytes were studied.

Improving the Performance of Deep-Learning-Based Ground-Penetrating Radar Cavity Detection Model using Data Augmentation and Ensemble Techniques (데이터 증강 및 앙상블 기법을 이용한 딥러닝 기반 GPR 공동 탐지 모델 성능 향상 연구)

  • Yonguk Choi;Sangjin Seo;Hangilro Jang;Daeung Yoon
    • Geophysics and Geophysical Exploration
    • /
    • v.26 no.4
    • /
    • pp.211-228
    • /
    • 2023
  • Ground-penetrating radar (GPR) surveying, a nondestructive geophysical method, is commonly used to monitor embankments. The results of GPR surveys can be complex, depending on the situation, and data processing and interpretation depend on expert experience, potentially resulting in false detections. This process is also time-intensive. Consequently, various studies have been undertaken to detect cavities in GPR survey data using deep learning methods. Deep-learning-based approaches require abundant training data, but GPR field survey data are often scarce due to cost and other factors constraining field studies. Therefore, in this study, a deep-learning-based model was developed for cavity detection in embankment GPR surveys using data augmentation strategies. A dataset was constructed by collecting survey data over several years from the same embankment. A You Only Look Once (YOLO) model, commonly used in computer vision for object detection, was employed for this purpose. By comparing and analyzing various strategies, the optimal data augmentation approach was determined. After initial model development, a stepwise process including box clustering, transfer learning, self-ensemble, and model ensemble techniques was employed to enhance the final model performance. The model performance was evaluated, and the results demonstrate its effectiveness in detecting cavities in embankment GPR survey data.

Identification of Employee Experience Factors and Their Influence on Job Satisfaction (직원경험 요인 파악 및 직무 만족도에 끼치는 영향력 분석)

  • Juhyeon Lee;So-Hyun Lee;Hee-Woong Kim
    • Information Systems Review
    • /
    • v.25 no.2
    • /
    • pp.181-203
    • /
    • 2023
  • As companies compete fiercely to attract outstanding talent, employee job satisfaction has become increasingly important. Many companies therefore invest in improving job satisfaction by identifying employees' everyday experiences and difficulties. However, because the employee experience is poorly understood, these investments are not paying off. This study examined the relationship between employee experience and job satisfaction using employee reviews and company ratings from Glassdoor, one of the largest employee communities worldwide. We used text mining techniques such as K-means clustering and LDA topic-based sentiment analysis to extract key experience factors by job level, and DistilBERT sentiment analysis to measure the sentiment score of each employee experience factor. The extracted employee experience factors and their sentiment scores were then analyzed quantitatively to determine the relationship between each factor and job satisfaction. The results show a significant difference between the workplace experiences of managers and general employees. In addition, the employee experiences that affect job satisfaction differed between positions; for example, customer relationship and autonomy did not affect the satisfaction of managers. By using text mining and quantitative modeling based on the theory of work adjustment to find and verify the main factors of employee experience, this study extends the research literature. The results are also applicable to personnel management strategies for improving employees' job satisfaction and are ultimately expected to improve corporate productivity.

Cluster exploration of water pipe leak and complaints surveillance using a spatio-temporal statistical analysis (스캔통계량 분석을 통한 상수도 누수 및 수질 민원 발생 클러스터 탐색)

  • Juwon Lee;Eunju Kim;Sookhyun Nam;Tae-Mun Hwang
    • Journal of Korean Society of Water and Wastewater
    • /
    • v.37 no.5
    • /
    • pp.261-269
    • /
    • 2023
  • In light of recent social concerns related to issues such as water supply pipe deterioration leading to problems like leaks and degraded water quality, the significance of maintenance efforts to enhance water source quality and ensure a stable water supply has grown substantially. In this study, scan statistic was applied to analyze water quality complaints and water leakage accidents from 2015 to 2021 to present a reasonable method to identify areas requiring improvement in water management. SaTScan, a spatio-temporal statistical analysis program, and ArcGIS were used for spatial information analysis, and clusters with high relative risk (RR) were determined using the maximum log-likelihood ratio, relative risk, and Monte Carlo hypothesis test for I city, the target area. Specifically, in the case of water quality complaints, the analysis results were compared by distinguishing cases occurring before and after the onset of "red water." The period between 2015 and 2019 revealed that preceding the occurrence of red water, the leak cluster at location L2 posed a significantly higher risk (RR: 2.45) than other regions. As for water quality complaints, cluster C2 exhibited a notably elevated RR (RR: 2.21) and appeared concentrated in areas D and S, respectively. On the other hand, post-red water incidents of water quality complaints were predominantly concentrated in area S. The analysis found that the locations of complaint clusters were similar to those of red water incidents. Of these, cluster C7 exhibited a substantial RR of 4.58, signifying more than a twofold increase compared to pre-incident levels. A kernel density map analysis was performed using GIS to identify priority areas for waterworks management based on the central location of clusters and complaint cluster RR data.
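The quantities named in the abstract above (relative risk, log-likelihood ratio, Monte Carlo test) can be sketched in a few lines. This is a minimal illustration assuming the discrete Poisson scan model; it is not the SaTScan implementation, and the function names and the example counts are hypothetical.

```python
import math


def relative_risk(cases_in, expected_in, total_cases):
    """Risk inside the candidate cluster divided by risk outside it.

    cases_in     -- observed cases inside the cluster (c)
    expected_in  -- expected cases inside under the null hypothesis (E)
    total_cases  -- observed cases in the whole study area (C)
    """
    inside = cases_in / expected_in
    outside = (total_cases - cases_in) / (total_cases - expected_in)
    return inside / outside


def poisson_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for a high-rate cluster (Poisson model).

    Returns 0.0 when the cluster has no excess of cases.
    """
    if cases_in <= expected_in:
        return 0.0
    llr = cases_in * math.log(cases_in / expected_in)
    remaining = total_cases - cases_in
    if remaining > 0:
        llr += remaining * math.log(remaining / (total_cases - expected_in))
    return llr


def monte_carlo_p_value(observed_llr, simulated_llrs):
    """Rank-based p-value: share of replicates at least as extreme as observed."""
    at_least = sum(1 for s in simulated_llrs if s >= observed_llr)
    return (1 + at_least) / (1 + len(simulated_llrs))


# Hypothetical counts: 30 cases observed where 15 were expected, 100 in total.
print(relative_risk(30, 15, 100))
print(poisson_llr(30, 15, 100))
```

In practice the simulated values passed to `monte_carlo_p_value` would be the maximum log-likelihood ratios of data sets generated under the null hypothesis. As a reading aid, if the post-incident RR of 4.58 for cluster C7 is compared with the pre-incident complaint RR of 2.21 (an assumed pairing), the ratio is about 2.07, which matches the "more than twofold" statement.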

Analysis of Research Trends Related to Drug Repositioning Based on Machine Learning (머신러닝 기반의 신약 재창출 관련 연구 동향 분석)

  • So Yeon Yoo;Gyoo Gun Lim
    • Information Systems Review
    • /
    • v.24 no.1
    • /
    • pp.21-37
    • /
    • 2022
  • Drug repositioning, one of the methods of developing new drugs, is a useful way to discover new indications by allowing drugs that have already been approved for use in people to be used for other purposes. Recently, with the development of machine learning technology, vast amounts of biological information are increasingly being analyzed and used to develop new drugs. Applying machine learning technology to drug repositioning will help find effective treatments quickly. Currently, the world is having a difficult time due to a new disease caused by a coronavirus (COVID-19), a severe acute respiratory syndrome. Drug repositioning, which repurposes drugs that have already been clinically approved, could be an alternative route to therapeutics for COVID-19 patients. This study examines research trends in the field of drug repositioning using machine learning techniques. A total of 4,821 papers were collected from PubMed with the keyword 'Drug Repositioning' using a web scraping technique. After data preprocessing, frequency analysis, LDA-based topic modeling, random forest classification analysis, and prediction performance evaluation were performed on 4,419 papers. Associated words were analyzed based on the Word2vec model; after dimensionality reduction with PCA, K-means clustering was used to generate labels, and the structured organization of the literature was then visualized using the t-SNE algorithm. Hierarchical clustering was applied to the LDA results and visualized as a heat map. This study identified the research topics related to drug repositioning and presented a method to derive and visualize meaningful topics from a large amount of literature using machine learning algorithms. The results are expected to serve as basic data for establishing research or development strategies in the field of drug repositioning in the future.
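The label-generation step in the abstract above (K-means on dimension-reduced word vectors) can be sketched as follows. This is a minimal standard-library illustration of Lloyd's algorithm, not the authors' pipeline: the 2-D points stand in for PCA-reduced Word2vec vectors, and the deterministic farthest-first initialization is an assumption made so the example is reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Pick the first point, then repeatedly the point farthest from all picks."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points,
                  key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in farthest_first_init(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids


# Hypothetical 2-D vectors standing in for PCA-reduced word embeddings.
vectors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(vectors, k=2)
print(labels)
```

The resulting labels are what a later visualization step (t-SNE in the abstract) would use to color the points.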

Analysis of Interactions in Multiple Genes using IFSA(Independent Feature Subspace Analysis) (IFSA 알고리즘을 이용한 유전자 상호 관계 분석)

  • Kim, Hye-Jin;Choi, Seung-Jin;Bang, Sung-Yang
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.33 no.3
    • /
    • pp.157-165
    • /
    • 2006
  • Changes in factors external or internal to the cell require specific biological functions to maintain life. Such functions cause particular genes to interact with and regulate each other in multiple ways. Accordingly, we applied IFSA, a linear decomposition model that derives hidden variables, called 'expression modes', corresponding to these functions. To interpret gene interaction/regulation, we used a cross-correlation method given an expression mode. Linear decomposition models such as principal component analysis (PCA) and independent component analysis (ICA) have been shown to be more useful than clustering methods for analyzing high-dimensional DNA microarray data. These methods assume that gene expression is controlled by a linear combination of uncorrelated/independent latent variables. However, they have some difficulty in grouping similar patterns that are slightly time-delayed or asymmetric, since only exactly matched patterns are considered. To overcome this, we employ the IFSA method of [1] to locate phase- and shift-invariant features. Membership scoring functions play an important role in classifying genes, since linear decomposition models basically aim at data reduction, not at grouping data. We introduce a new scoring function essential to the IFSA method. In this paper we stress that IFSA is useful in grouping functionally related genes in the presence of time shift and expression phase variance. Ultimately, we propose a new approach to investigating the multiple interaction information of genes.
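The difficulty with time-delayed patterns mentioned above can be illustrated with a lagged cross-correlation. The sketch below is not the IFSA method and not the authors' scoring function; it is a plain Pearson correlation evaluated over a range of shifts, with hypothetical expression profiles, showing how two profiles that match only after a shift are missed at zero lag.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def best_lag(x, y, max_lag):
    """Return (lag, r) maximizing the correlation of x[t] with y[t + lag]."""
    best = (0, pearson(x, y))
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        r = pearson(a, b)
        if r > best[1]:
            best = (lag, r)
    return best


# Hypothetical profiles: gene_b repeats gene_a two time points later.
gene_a = [0, 1, 4, 1, 0, 0, 0, 0]
gene_b = [0, 0, 0, 1, 4, 1, 0, 0]
print(pearson(gene_a, gene_b))      # correlation without any shift
print(best_lag(gene_a, gene_b, 3))  # correlation after searching over shifts
```

A shift-invariant feature, as sought by IFSA, aims to capture this kind of match without an explicit search over lags.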

The Need for Paradigm Shift in Semantic Similarity and Semantic Relatedness : From Cognitive Semantics Perspective (의미간의 유사도 연구의 패러다임 변화의 필요성-인지 의미론적 관점에서의 고찰)

  • Choi, Youngseok;Park, Jinsoo
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.111-123
    • /
    • 2013
  • Semantic similarity/relatedness measures between two concepts play an important role in research on system integration and database integration. Moreover, current research on keyword recommendation and tag clustering depends strongly on this kind of semantic measure. For this reason, many researchers in various fields, including computer science and computational linguistics, have tried to improve methods for calculating semantic similarity/relatedness. The study of similarity between concepts aims to discover how a computational process can model the way a human determines the relationship between two concepts. Most research on calculating semantic similarity uses ready-made reference knowledge, such as a semantic network or dictionary, to measure concept similarity. The topological method calculates relatedness or similarity between concepts based on various forms of semantic network, including a hierarchical taxonomy. This approach assumes that the semantic network reflects human knowledge well. The nodes in the network represent concepts, and ways of measuring the conceptual similarity between two nodes are also regarded as ways of determining the conceptual similarity of two words (i.e., two nodes in a network). Topological methods can be categorized as node-based or edge-based, which are also called the information content approach and the conceptual distance approach, respectively. The node-based approach calculates similarity between concepts based on how much information the two concepts share in terms of a semantic network or taxonomy, while the edge-based approach estimates the distance between the nodes that correspond to the concepts being compared. Both approaches assume that the semantic network is static. That is, the topological approach does not consider changes in the semantic relations between concepts in the semantic network.
However, as information and communication technologies make it easier for people to share knowledge, the semantic relations between concepts in a semantic network may change. To explain this change, we adopt cognitive semantics. The basic assumption of cognitive semantics is that humans judge semantic relations based on their cognition and understanding of concepts. This cognition and understanding is called 'world knowledge.' World knowledge can be categorized as personal knowledge and cultural knowledge. Personal knowledge is knowledge gained from personal experience; everyone can have different personal knowledge of the same concept. Cultural knowledge is the knowledge shared by people who live in the same culture or use the same language. People in the same culture have a common understanding of specific concepts. Cultural knowledge can be the starting point for discussing changes in semantic relations. If the culture shared by people changes for some reason, their cultural knowledge may also change. Today's society and culture are changing at a fast pace, and the change in cultural knowledge is not a negligible issue in research on the semantic relationships between concepts. In this paper, we propose future directions for research on semantic similarity. In other words, we discuss how research on semantic similarity can reflect changes in semantic relations caused by changes in cultural knowledge. We suggest three directions for future research on semantic similarity. First, the research should include a versioning and update methodology for semantic networks. Second, a dynamically generated semantic network can be used to calculate semantic similarity between concepts. If researchers can develop a methodology to extract the semantic network from a given knowledge base in real time, this approach can solve many problems related to changes in semantic relations.
Third, a statistical approach based on corpus analysis can be an alternative to methods that use a semantic network. We believe that these proposed research directions can serve as a milestone for research on semantic relations.
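The two topological approaches described in this abstract can be contrasted on a toy taxonomy. The sketch below is illustrative only: the taxonomy, the concept frequencies, and the function names are hypothetical, the edge-based score is taken as 1 / (1 + path length), and the node-based score is the information content of the lowest common ancestor (a Resnik-style measure).

```python
import math

# Hypothetical static taxonomy: child -> parent (the root has parent None).
PARENT = {
    "animal": None,
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
    "sparrow": "bird",
}

# Hypothetical corpus frequencies of the leaf concepts.
COUNTS = {"dog": 40, "cat": 40, "sparrow": 20}


def ancestors(concept):
    """Path from the concept itself up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def lowest_common_ancestor(a, b):
    """Deepest concept that subsumes both a and b."""
    above_b = set(ancestors(b))
    for node in ancestors(a):
        if node in above_b:
            return node
    raise ValueError("concepts do not share a root")


def path_length(a, b):
    """Edge-based view: number of edges between a and b via their LCA."""
    lca = lowest_common_ancestor(a, b)
    return ancestors(a).index(lca) + ancestors(b).index(lca)


def edge_similarity(a, b):
    """Shorter conceptual distance gives higher similarity."""
    return 1.0 / (1.0 + path_length(a, b))


def concept_probability(concept):
    """Share of corpus occurrences that fall under the concept."""
    total = sum(COUNTS.values())
    mass = sum(n for leaf, n in COUNTS.items() if concept in ancestors(leaf))
    return mass / total


def information_content_similarity(a, b):
    """Node-based view: information content of the shared ancestor."""
    return -math.log(concept_probability(lowest_common_ancestor(a, b)))


print(edge_similarity("dog", "cat"), edge_similarity("dog", "sparrow"))
print(information_content_similarity("dog", "cat"))
```

Both scores are computed from the fixed `PARENT` table, which is exactly the static-network assumption the abstract questions: if usage shifts, the table and the counts would need versioning or regeneration.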

A Study of Intangible Cultural Heritage Communities through a Social Network Analysis - Focused on the Item of Jeongseon Arirang - (소셜 네트워크 분석을 통한 무형문화유산 공동체 지식연결망 연구 - 정선아리랑을 중심으로 -)

  • Oh, Jung-shim
    • Korean Journal of Heritage: History & Science
    • /
    • v.52 no.3
    • /
    • pp.172-187
    • /
    • 2019
  • Knowledge of intangible cultural heritage is usually disseminated through word-of-mouth and actions rather than written records. Thus, people assemble to teach others about it and form communities. Accordingly, to understand and spread information about intangible cultural heritage properly, it is necessary to understand not only their attributes but also a community's relational characteristics. Community members include specialized transmitters who work under the auspices of institutions, and general transmitters who enjoy intangible cultural heritage in their daily lives. They converse about intangible cultural heritage in close relationships. However, to date, research has focused only on professionals. Thus, this study focused on the roles of general transmitters of intangible cultural heritage information by investigating intangible cultural heritage communities centering around Jeongseon Arirang; a social network analysis was performed. Regarding the research objectives presented in the introduction, the main findings of the study are summarized as follows. First, there were 197 links between 74 members of the Jeongseon Arirang Transmission Community. One individual had connections with 2.7 persons on average, and all were connected through two steps in the community. However, the density and the clustering coefficient were low, 0.036 and 0.32, respectively; therefore, the cohesiveness of this community was low, and the relationships between the members were not strong. Second, 'Young-ran Yu', 'Nam-gi Kim' and 'Gil-ja Kim' were found to be the prominent figures of the Jeongseon Arirang Transmission Community, and the central structure of the network was concentrated around these three individuals. Being located in the central structure of the network indicates that a person is popular and ranked high. 
Also, it means that a person has an advantage in terms of the speed and quantity of the acquisition of information and resources, and is in a relatively superior position in terms of bargaining power. Third, to understand the replaceability of the roles of Young-ran Yu, Nam-gi Kim, and Gil-ja Kim, who were found to be the major figures through an analysis of the central structure, structural equivalence was profiled. The results of the analysis showed that the positions and roles of Young-ran Yu, Nam-gi Kim, and Gil-ja Kim were unrivaled and irreplaceable in the Jeongseon Arirang Transmission Community. However, considering that these three members were in their 60s and 70s, it seemed that it would be necessary to prepare measures for the smooth maintenance and operation of the community. Fourth, to examine the subgroup hidden in the network of the Jeongseon Arirang Transmission Community, an analysis of communities was conducted. A community refers to a subgroup clearly differentiated based on modularity. The results of the analysis identified the existence of four communities. Furthermore, the results of an analysis of the central structure showed that the communities were formed and centered around Young-ran Yu, Hyung-jo Kim, Nam-gi Kim, and Gil-ja Kim. Most of the transmission TAs recommended by those members, students who completed a course, transmission scholarship holders, and the general members taught in the transmission classes of the Jeongseon Arirang Preservation Society were included as members of the communities. Through these findings, it was discovered that it is possible to maintain the transmission genealogy, making an exchange with the general members by employing the present method for the transmission of Jeongseon Arirang, the joint transmission method. 
It is worth paying attention to the joint transmission method as it overcomes the demerits of the existing closed one-on-one apprentice method and provides members with an opportunity to learn their masters' various singing styles. This study is significant for the following reasons: First, by collecting and examining data using a social network analysis method, this study analyzed phenomena that had been difficult to investigate using existing statistical analyses. Second, by adopting a different approach to the previous method in which the genealogy was understood, looking at oral data, this study analyzed the structures of the transmitters' relationships with objective and quantitative data. Third, this study visualized and presented the abstract structures of the relationships among the transmitters of intangible cultural heritage information on a 2D spring map. The results of this study can be utilized as a baseline for the development of community-centered policies for the protection of intangible cultural heritage specified in the UNESCO Convention for the Safeguarding of Intangible Cultural Heritage. To achieve this, it would be necessary to supplement this study through case studies and follow-up studies on more aspects in the future.
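The network figures reported in this abstract (74 members, 197 links, density 0.036, about 2.7 connections per person) can be checked with simple arithmetic. The sketch below assumes the links are directed, which the abstract does not state; that assumption is used here only because it reproduces the reported density. The clustering coefficient is shown on a small hypothetical undirected graph, since the actual network data are not given.

```python
def directed_density(n_nodes, n_links):
    """Share of all possible ordered pairs that are actually linked."""
    return n_links / (n_nodes * (n_nodes - 1))


def mean_links_per_node(n_nodes, n_links):
    """Average number of links per member."""
    return n_links / n_nodes


def local_clustering(adjacency, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    linked_pairs = sum(
        1 for u in neighbors for v in neighbors
        if u < v and v in adjacency[u]
    )
    return 2.0 * linked_pairs / (k * (k - 1))


def average_clustering(adjacency):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adjacency, n) for n in adjacency) / len(adjacency)


# Figures reported in the abstract.
print(round(directed_density(74, 197), 3))
print(round(mean_links_per_node(74, 197), 1))

# Hypothetical toy graph: a triangle a-b-c with one extra member d tied to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(average_clustering(toy))
```

A low density together with a low clustering coefficient is what the abstract reads as weak cohesion: few of the possible ties exist, and a member's contacts are seldom tied to one another.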