• Title/Summary/Keyword: Hierarchical K-means clustering

Analysis of Research Trends Related to Drug Repositioning Based on Machine Learning (머신러닝 기반의 신약 재창출 관련 연구 동향 분석)

  • So Yeon Yoo;Gyoo Gun Lim
    • Information Systems Review, v.24 no.1, pp.21-37, 2022
  • Drug repositioning, one approach to developing new drugs, discovers new indications for drugs that have already been approved for human use. With recent advances in machine learning, vast amounts of biological information are increasingly being analyzed and used for drug development, and applying machine learning to drug repositioning can help find effective treatments quickly. The world is currently struggling with a new disease caused by a coronavirus (COVID-19), a severe acute respiratory syndrome. Repositioning drugs that have already been clinically approved could offer an alternative route to therapeutics for COVID-19 patients. This study examines research trends in drug repositioning using machine learning techniques. A total of 4,821 papers were collected from PubMed by web scraping with the keyword 'Drug Repositioning'. After data preprocessing, frequency analysis, LDA-based topic modeling, random forest classification, and prediction performance evaluation were performed on 4,419 papers. Associated words were analyzed with a Word2vec model; after dimensionality reduction with PCA, K-Means clustering was used to generate labels, and the structure of the literature was then visualized with the t-SNE algorithm. Hierarchical clustering was applied to the LDA results and visualized as a heat map. This study identified the research topics related to drug repositioning and presented a method for deriving and visualizing meaningful topics from a large body of literature using machine learning algorithms. The results are expected to serve as basic data for establishing research and development strategies in the field of drug repositioning.
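
The labeling step described in the abstract above (K-Means applied to dimension-reduced word vectors) can be sketched as follows. This is a minimal illustration and not the authors' code: the 2-D points are made-up stand-ins for PCA-reduced Word2vec vectors, the function name `kmeans` is hypothetical, and seeding the centroids with the first k points is an assumption chosen only to keep the example deterministic.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; the first k points seed the centroids (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids

# Made-up 2-D vectors standing in for PCA-reduced embeddings.
reduced_vectors = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
labels, centroids = kmeans(reduced_vectors, k=2)
print(labels)
```

The resulting labels could then be used to color points in a 2-D projection such as t-SNE. In practice a library implementation with a more robust initialization would normally replace this hand-written loop.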

Transcriptome Analyses for the Anti-Adipogenic Mechanism of an Herbal Composition (생약복합물의 지방세포형성억제 기전규명을 위한 전사체 분석)

  • Lee, Hae-Yong;Kang, Ryun-Hwa;Bae, Sung-Min;Chae, Soo-Ahn;Lee, Jung-Ju;Oh, Dong-Jin;Park, Suk-Won;Cho, Soo-Hyun;Shim, Yae-Jie;Yoon, Yoo-Sik
    • Journal of Life Science, v.20 no.7, pp.1054-1065, 2010
  • SH21B is a natural composition of seven herbs: Scutellaria baicalensis Georgi, Prunus armeniaca Maxim, Ephedra sinica Stapf, Acorus gramineus Soland, Typha orientalis Presl, Polygala tenuifolia Willd and Nelumbo nucifera Gaertner (ratio 3:3:3:3:3:2:2). In our previous study, we reported that SH21B inhibited adipogenesis and fat accumulation in 3T3-L1 cells by modulating various regulators in the adipogenesis pathway. The aim of this study was to analyze the transcriptome profiles underlying the anti-adipogenic effects of SH21B in 3T3-L1 cells. Total RNAs from SH21B-treated 3T3-L1 cells were reverse-transcribed into cDNAs and hybridized to the Affymetrix Mouse Gene 1.0 ST array. From the microarray analyses, we identified 2,568 genes whose expression changed more than two-fold in response to SH21B, and clustering analysis of these genes yielded 9 clusters. Of these, three clusters (clusters 4, 6 and 9) were down-regulated by SH21B and two clusters (clusters 7 and 8) were up-regulated by SH21B during the adipogenesis of 3T3-L1 cells. Many genes related to cell proliferation and adipogenesis were included in these clusters. Clusters 4, 6 and 9 included genes related to adipogenesis induction and cell cycle arrest. Clusters 7 and 8 included genes related to cell proliferation as well as adipogenesis inhibition. These results suggest that the anti-adipogenic effects of SH21B may arise from the modulation of genes involved in cell proliferation and adipogenesis.

The Need for Paradigm Shift in Semantic Similarity and Semantic Relatedness: From Cognitive Semantics Perspective (의미간의 유사도 연구의 패러다임 변화의 필요성-인지 의미론적 관점에서의 고찰)

  • Choi, Youngseok;Park, Jinsoo
    • Journal of Intelligence and Information Systems, v.19 no.1, pp.111-123, 2013
  • Measures of semantic similarity/relatedness between two concepts play an important role in research on system integration and database integration. Moreover, current research on keyword recommendation and tag clustering depends strongly on this kind of semantic measure. For this reason, many researchers in fields including computer science and computational linguistics have tried to improve methods for calculating semantic similarity/relatedness. The study of similarity between concepts aims to discover how a computational process can model the way a human determines the relationship between two concepts. Most research on calculating semantic similarity uses ready-made reference knowledge, such as a semantic network or a dictionary, to measure concept similarity. The topological method calculates relatedness or similarity between concepts based on various forms of semantic network, including hierarchical taxonomies. This approach assumes that the semantic network reflects human knowledge well. The nodes in a network represent concepts, and ways of measuring the similarity between two nodes are regarded as ways of determining the conceptual similarity of two words (i.e., two nodes in a network). Topological methods can be categorized as node-based or edge-based, also called the information content approach and the conceptual distance approach, respectively. The node-based approach calculates similarity between concepts based on how much information the two concepts share in terms of a semantic network or taxonomy, while the edge-based approach estimates the distance between the nodes that correspond to the concepts being compared. Both approaches assume that the semantic network is static. That is, the topological approach does not consider changes in the semantic relations between concepts in a semantic network. 
However, as information and communication technologies make it easier for people to share knowledge, the semantic relations between concepts in a semantic network may change. To explain this change, we adopt cognitive semantics. The basic assumption of cognitive semantics is that humans judge semantic relations based on their cognition and understanding of concepts. This cognition and understanding is called 'World Knowledge.' World knowledge can be categorized as personal knowledge and cultural knowledge. Personal knowledge is knowledge gained from personal experience; everyone can have different personal knowledge of the same concept. Cultural knowledge is knowledge shared by people who live in the same culture or use the same language. People in the same culture have a common understanding of specific concepts. Cultural knowledge can be the starting point for discussing changes in semantic relations. If the culture shared by people changes for some reason, their cultural knowledge may also change. Today's society and culture are changing at a fast pace, and the change in cultural knowledge is not a negligible issue in research on semantic relationships between concepts. In this paper, we propose future directions for research on semantic similarity. In other words, we discuss how research on semantic similarity can reflect changes in semantic relations caused by changes in cultural knowledge. We suggest three directions for future research on semantic similarity. First, research should include a versioning and update methodology for semantic networks. Second, a dynamically generated semantic network can be used to calculate semantic similarity between concepts. If researchers can develop a methodology for extracting a semantic network from a given knowledge base in real time, this approach can solve many problems related to changes in semantic relations. 
Third, a statistical approach based on corpus analysis can be an alternative to methods that use a semantic network. We believe that these proposed research directions can serve as a milestone for research on semantic relations.
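
The edge-based (conceptual distance) approach mentioned in the abstract above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy is-a taxonomy, the function names, and the choice of 1 / (1 + path length) as the similarity score, which is one common way to turn a path length into a similarity and is not taken from the paper.

```python
from collections import deque

# Hypothetical toy is-a taxonomy (child -> parent); not from the paper.
PARENT = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}

def build_graph(parent):
    """Turn child -> parent links into an undirected adjacency map."""
    graph = {}
    for child, par in parent.items():
        graph.setdefault(child, set()).add(par)
        graph.setdefault(par, set()).add(child)
    return graph

def path_length(graph, a, b):
    """Number of edges on the shortest path between a and b (breadth-first search)."""
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in graph[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None  # the two concepts are not connected

def path_similarity(graph, a, b):
    """Map a path length to (0, 1]; identical concepts score 1.0."""
    d = path_length(graph, a, b)
    return None if d is None else 1.0 / (1.0 + d)

GRAPH = build_graph(PARENT)
print(path_similarity(GRAPH, "dog", "cat"), path_similarity(GRAPH, "dog", "sparrow"))
```

Because the score depends only on the fixed graph, it stays the same until the taxonomy itself is edited, which is the static-network assumption the abstract questions.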

Term Mapping Methodology between Everyday Words and Legal Terms for Law Information Search System (법령정보 검색을 위한 생활용어와 법률용어 간의 대응관계 탐색 방법론)

  • Kim, Ji Hyun;Lee, Jong-Seo;Lee, Myungjin;Kim, Wooju;Hong, June Seok
    • Journal of Intelligence and Information Systems, v.18 no.3, pp.137-152, 2012
  • In the Web 2.0 era, as many users create their own web content, known as user-created content, the World Wide Web is overflowing with information. Finding meaningful information among so many resources has therefore become a key issue. Information retrieval now matters in every field, and several types of search services have been developed and are widely used to retrieve the information that users really want. Legal information search in particular is an indispensable service, since it gives people a convenient channel for finding the laws relevant to their situation. The Office of Legislation in Korea has provided the Korean Law Information portal service since 2009 for searching law information such as legislation, administrative rules, and judicial precedents, so people can conveniently find law-related information. However, this service is limited, because current search engine technology basically returns documents according to whether or not they contain the query. Despite the efforts of the Office of Legislation in Korea, it is therefore very difficult for general users who are unfamiliar with legal terms to retrieve law-related information from a search engine that relies on simple keyword matching, because there is a large gap between everyday words and legal terms, many of which derive from Chinese words. People generally try to access law information using everyday words, so they have difficulty getting exactly the results they want. In this paper, we propose a term mapping methodology between everyday words and legal terms for general users who lack sufficient background in legal terminology, and we develop a search service that returns law information from everyday words. 
With this service, users can search law information accurately without knowledge of legal terminology. In other words, our research goal is to build a law information search system with which general users can retrieve law information using everyday words. First, this paper uses the tags of internet blogs, drawing on the concept of collective intelligence, to find the term mapping relationship between everyday words and legal terms. To this end, we collect tags related to an everyday word from web blog posts. When people write a blog post, they generally add non-hierarchical keywords or terms, such as synonyms, called tags, to describe, classify, and manage the post. Second, the collected tags are clustered using the K-means cluster analysis method. Then, we find a mapping relationship between an everyday word and a legal term, using our estimation measure to select the legal term that best matches the everyday word. The selected legal terms are given a definite relationship, and the relations between everyday words and legal terms are described using SKOS, an ontology for describing knowledge related to thesauri, classification schemes, taxonomies, and subject headings. Based on the proposed mapping and searching methodologies, when a user tries to retrieve law information with an everyday word, our legal information search system finds the legal term mapped to the user query and retrieves law information using that term. Users can therefore get accurate results even without knowledge of legal terms. We expect that general users without a professional legal background will be able to retrieve legal information conveniently and efficiently using everyday words.

Implementation Strategy for the Elderly Care Solution Based on Usage Log Analysis: Focusing on the Case of Hyodol Product (사용자 로그 분석에 기반한 노인 돌봄 솔루션 구축 전략: 효돌 제품의 사례를 중심으로)

  • Lee, Junsik;Yoo, In-Jin;Park, Do-Hyung
    • Journal of Intelligence and Information Systems, v.25 no.3, pp.117-140, 2019
  • As population aging accelerates and various social problems affecting the vulnerable elderly emerge, the need for effective elderly care solutions that protect the health and safety of the elderly is growing. Recently, more and more people are using smart toys equipped with ICT for elderly care. In particular, log data collected through smart toys is highly valuable as a quantitative and objective indicator in areas such as policy-making and service planning. However, research on smart toys has been limited to topics such as smart toy development and validation of smart toy effectiveness. In other words, there is a dearth of research that derives insights from log data collected through smart toys and uses them for decision making. This study analyzes log data collected from a smart toy and derives insights for improving the quality of life of elderly users. Specifically, it performs a user-profiling analysis and elicits the behavior-based mechanism of change in quality of life. First, in the user profiling analysis, two dimensions that are important for classifying types of elderly users, 'Routine Activities' and 'Work-out Activities', were derived from five factors of elderly users' living management. Based on these dimensions, hierarchical cluster analysis and K-Means clustering were performed to classify all elderly users into three groups. The profiling analysis identified the demographic characteristics of each group and their smart toy usage behavior. Second, stepwise regression was performed to elicit the mechanism of change in quality of life. The effects of interaction, content usage, and indoor activity on the improvement of depression and lifestyle in the elderly were identified. 
In addition, the study identified the role of users' performance evaluation of, and satisfaction with, the smart toy as mediating variables in the relationship between usage behavior and change in quality of life. The specific mechanisms are as follows. First, interaction between the smart toy and the elderly user was found to improve depression, with attitudes toward the smart toy acting as mediators. 'Satisfaction toward Smart Toy,' a variable that affects the improvement of the elderly's depression, changes how users evaluate smart toy performance. Here, it is interaction with the smart toy that has a positive effect on how the smart toy is evaluated. These results can be interpreted as follows: elderly users who want emotional stability interact actively with the smart toy, evaluate it positively, and greatly appreciate its effectiveness. Second, content usage was confirmed to have a direct effect on improving lifestyle, without passing through other variables. Elderly users who make heavy use of the content provided by the smart toy improved their lifestyle, and this effect occurred regardless of the user's attitude toward the smart toy. Third, the log data show that a high degree of indoor activity improves both the lifestyle and the depression of the elderly. The more indoor activity, the better the lifestyle of the elderly, and this effect occurs regardless of the user's attitude toward the smart toy. In addition, elderly users with a high degree of indoor activity are satisfied with smart toys, which in turn improves their depression. However, elderly users who prefer outdoor activities to indoor activities, or who are less active due to health problems, are hard to satisfy with smart toys and do not obtain the depression-improving effect. In summary, three groups of elderly users were identified based on their activities, and the important characteristics of each type were identified. 
In addition, this study sought to identify the mechanism by which elderly users' behavior with the smart toy affects their actual lives, and to derive user needs and insights.
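
The abstract above mentions hierarchical cluster analysis and K-Means used together to form three groups but does not say how the two were combined. One common two-stage pattern, assumed here purely for illustration, is to let an agglomerative pass supply the initial centroids and then refine them with K-Means. The scores, function names, and the use of centroid linkage (rather than, say, Ward's method) are all hypothetical simplifications.

```python
import math

def centroid(cluster):
    """Mean point of a list of equal-length tuples."""
    return tuple(sum(x) / len(cluster) for x in zip(*cluster))

def agglomerate(points, k):
    """Repeatedly merge the two clusters with the closest centroids until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        pairs = ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters)))
        i, j = min(
            pairs,
            key=lambda ab: math.dist(centroid(clusters[ab[0]]), centroid(clusters[ab[1]])),
        )
        merged = clusters.pop(j)  # j > i, so index i is still valid after the pop
        clusters[i].extend(merged)
    return clusters

def kmeans_refine(points, centroids, iters=20):
    """Lloyd iterations that start from the supplied centroids."""
    labels = []
    for _ in range(iters):
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        new = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new.append(centroid(members) if members else centroids[c])
        if new == centroids:
            break  # centroids stopped moving
        centroids = new
    return labels, centroids

# Made-up (routine activity, work-out activity) scores for six users.
scores = [(1.0, 1.0), (1.2, 0.8), (5.0, 1.0), (5.2, 1.2), (3.0, 5.0), (3.1, 5.2)]
seeds = [centroid(c) for c in agglomerate(scores, 3)]
groups, centers = kmeans_refine(scores, seeds)
print(groups)
```

Seeding K-Means this way removes its dependence on random initialization, at the cost of the pairwise comparisons in the agglomerative pass, which grow quickly with the number of users.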

State of Mind in the Flow 4-Channel Model and Play (플로우 4경로모형의 마음상태와 플레이(play))

  • Sohn, Jun-Sang
    • Journal of Global Scholars of Marketing Science, v.17 no.2, pp.1-29, 2007
  • Flow theory has become one of the most important frameworks in internet research. Hoffman and Novak proposed a hierarchical flow model showing the antecedents and outcomes of flow and the relationships among these variables in hypermedia computer environments (Hoffman and Novak 1996). This model was further tested after their initial research (Novak, Hoffman, and Yung 2000). In their paper, Hoffman and Novak explained that a balance of challenge and skill leads to flow, the positive optimal state of mind (Hoffman and Novak 1996). An imbalance between challenge and skill leads to negative states of mind such as anxiety, boredom, and apathy (Csikszentmihalyi and Csikszentmihalyi 1988). Almost all research on the flow 4-channel model has focused on flow, the positive state of mind (Ellis, Voelkl, and Morris 1994; Mathwick and Rigdon 2004). However, the formation of the negative states of mind and their outcomes also needs to be examined. Flow researchers explain play, or playfulness, as an antecedent or early state of flow. However, play has been regarded as a concept distinct from flow in the flow literature (Hoffman and Novak 1996; Novak, Hoffman, and Yung 2000). Mathwick and Rigdon found that challenge and skill influence play; they also observed the influence of play on web-loyalty and brand loyalty (Mathwick and Rigdon 2004). Unfortunately, they did not go so far as to test the influence of play on state of mind. This study focuses on the relationships between state of mind in the flow 4-channel model and play. Earlier research has attempted to explain states of mind in flow theory hypothetically, but none except flow has been tested until now. The importance of play has also been emphasized in flow theory, but it has not been tested in the context of the flow 4-channel model. This study analyzes the relationships among challenge, skill, play, state of mind, and web-loyalty. 
For this purpose, I developed a measure of state of mind and defined the concept of play as a trait. Then, the influences of challenge and skill on state of mind and play under on-line shopping conditions were tested. The influence of play on state of mind was also tested, and the influences of flow and play on web-loyalty were highlighted. A total of 294 undergraduate students participated in the survey. They were asked about their perceptions of challenge, skill, state of mind, play, and web-loyalty with respect to an on-line shopping mall. Respondents were restricted to students who had bought products on-line within the past month. Those who had bought products at two or more on-line shopping malls were asked to respond about the mall where they bought the most important product. Construct validity, discriminant validity, and convergent validity were used to validate the measurements, and Cronbach's alpha was used to check scale reliability. A series of exploratory factor analyses was conducted, followed by confirmatory factor analyses to assess the validity of the measurements. All items loaded significantly on their respective constructs, and all reliabilities were greater than .70. Chi-square difference tests and goodness-of-fit tests supported discriminant and convergent validity. The results of clustering and ANOVA showed that high challenge and high skill led to flow, low challenge and high skill led to boredom, and low challenge and low skill led to apathy. Contrary to my expectation, however, high challenge and low skill did not lead to anxiety but to apathy. The results also showed that high challenge with high skill and high challenge with low skill led to the highest play, while low challenge led to low play. Four structural equation models, one each for flow, anxiety, boredom, and apathy, were built to analyze not only the impact of play on state of mind and web-loyalty but also that of state of mind on web-loyalty. 
According to the results of these models, play impacted flow and web-loyalty positively but impacted anxiety, boredom, and apathy negatively. The results also showed that flow impacted web-loyalty positively, whereas anxiety, boredom, and apathy impacted web-loyalty negatively. The interpretations and implications of the hypothesis tests are as follows. First, respondents belonging to different clusters based on challenge and skill level experienced different states of mind, such as flow, anxiety, boredom, and apathy. The low challenge and low skill group felt the highest anxiety and apathy. One interpretation is that this group, feeling high anxiety or fear, avoided attempts to shop on-line. Second, higher challenge was found to lead to higher levels of play. The test results show that the play level of the high challenge and low skill group (anxiety group) was higher than that of the high challenge and high skill group (flow group), although this difference was not significant. Third, play positively impacted flow and negatively impacted boredom. The negative impacts on anxiety and apathy were not significant. This means that different combinations of challenge and skill produce different results. Fourth, play and flow positively impacted web-loyalty, but anxiety, boredom, and apathy had negative impacts. The effect of play on web-loyalty was stronger in the anxiety, boredom, and apathy groups than in the flow group. These results show that challenge and skill influence state of mind and play. The results also demonstrate how play and flow influence web-loyalty. This implies that state of mind and play should be core marketing variables in internet marketing. Flow theory has focused on flow and on the positive outcomes of flow experiences, but this research shows that many consumers experience a negative state of mind rather than flow in the internet shopping environment. The results show that a negative state of mind leads to low or negative web-loyalty. 
Play can play an important role in web-loyalty when consumers are in a negative state of mind. The results of the structural equation model analyses show that play influences web-loyalty positively even when consumers are in a negative state of mind. This research found the impacts of challenge and skill on state of mind in the flow 4-channel model, covering not only flow but also anxiety, boredom, and apathy. It also highlighted the role of play in the flow 4-channel model context and its impact on web-loyalty. However, the tests produced a few results that differed from the hypothesized expectations, such as the highest anxiety level occurring in the apathy group and the insignificant impacts of play on anxiety and apathy. Further research is needed to replicate this study and/or to compare the 3-channel model with the 4-channel model.
