• Title/Summary/Keyword: Knowledge Discovery

Enhancing Recommender Systems by Fusing Diverse Information Sources through Data Transformation and Feature Selection

  • Thi-Linh Ho;Anh-Cuong Le;Dinh-Hong Vu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.5
    • /
    • pp.1413-1432
    • /
    • 2023
  • Recommender systems aim to recommend items to users by taking into account their probable interests. This study builds a model that uses multiple sources of information about users and items through a multimodal approach. It addresses how to gather information from different sources (modalities) and transform it into a uniform format, yielding a multi-modal feature description for users and items. The features extracted from each modality are also transformed and represented so that they are compatible for integration and retain the information most useful to the prediction model. To this end, we propose a novel multi-modal recommendation model that extracts latent features of users and items from a utility matrix using matrix factorization. Various transformation techniques are used to extract features from other sources of information, such as user reviews, item descriptions, and item categories. We also propose using Principal Component Analysis (PCA) and feature selection to reduce the data dimension, extract important features, and remove noisy features, thereby increasing the accuracy of the model. We ran experiments with several model variants, each based on a different subset of modalities, on the MovieLens and Amazon sub-category datasets. According to the experimental results, the proposed model significantly improves recommendation accuracy compared to SVD, which is acknowledged as one of the most effective models for recommender systems. Specifically, on the Amazon datasets the proposed model reduces the RMSE by 4.8% to 21.43% and increases the Precision by 2.07% to 26.49%. Similarly, on the MovieLens dataset it reduces the RMSE by 45.61% and increases the Precision by 14.06%. Additionally, the results on both datasets show that combining information from multiple modalities leads to better outcomes than relying on a single type of information.
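
The RMSE figures in this abstract can be made concrete with a small sketch. It assumes the standard RMSE definition and reads each reported percentage as a relative reduction against the SVD baseline, which the abstract does not state explicitly. The ratings, predictions, and function names below are invented for illustration and are not taken from the paper.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline` (assumed reading)."""
    return 100.0 * (baseline - proposed) / baseline

# Toy data, hypothetical: not results from the paper.
actual = [4.0, 3.0, 5.0, 2.0]
baseline_pred = [3.0, 4.0, 4.0, 3.0]   # stand-in for a baseline model's predictions
proposed_pred = [3.5, 3.5, 4.5, 2.5]   # stand-in for a multi-modal model's predictions

baseline_rmse = rmse(baseline_pred, actual)
proposed_rmse = rmse(proposed_pred, actual)
reduction = relative_reduction(baseline_rmse, proposed_rmse)
print(f"baseline={baseline_rmse:.3f} proposed={proposed_rmse:.3f} reduction={reduction:.1f}%")
```

Under this reading, a 45.61% reduction would mean the proposed model's RMSE is a little over half of the baseline's.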

A Philosophical Study on the Generating Process of Declarative Scientific Knowledge - Focused on Inductive, Abductive, and Deductive process (선언적 과학 지식의 생성 과정에 대한 과학철학적 연구 - 귀납적, 귀추적, 연역적 과정을 중심으로 -)

  • Kwon, Yong-Ju;Jeong, Jin-Su;Park, Yun-Bok;Kang, Min-Jeong
    • Journal of The Korean Association For Science Education
    • /
    • v.23 no.3
    • /
    • pp.215-228
    • /
    • 2003
  • This study analyzes arguments in the philosophy of science about the generation of declarative scientific knowledge and devises a structured model of the knowledge-generation process, together with the types of scientific knowledge generated. The model shows that scientific knowledge generation is a distinctive process comprising inductive, abductive, and deductive thinking. The inductive process includes observation, which consists of simple observation and operative observation, and rule discovery, which involves commonness discovery, classification, pattern discovery, and hierarchical relationship. The abductive process has two components: one generates a question and the other generates a hypothesis, the latter consisting of representing the question situation, identifying an experienced situation, identifying causal explicans, and generating hypothetical explicans. Finally, the deductive process involves logically inventing a test method and evaluation criteria, concretely inventing a test method and evaluation criteria, evaluating the hypothesis, and drawing a conclusion.

LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19

  • Ouyang, Sizhuo;Wang, Yuxing;Zhou, Kaiyin;Xia, Jingbo
    • Genomics & Informatics
    • /
    • v.19 no.3
    • /
    • pp.23.1-23.7
    • /
    • 2021
  • The coronavirus disease 2019 (COVID-19) literature has been growing dramatically, and the increased volume of text makes large-scale text mining and knowledge discovery possible. Curating these texts has therefore become a crucial issue for the Biomedical Natural Language Processing (BioNLP) community, so that important information about the mechanism of COVID-19 can be retrieved. PubAnnotation is an aligned annotation system that provides an efficient platform for biological curators to upload their own annotations or merge external ones. Inspired by the integration of multiple useful COVID-19 annotations, we merged three annotation resources into the LitCovid data set and constructed a cross-annotated corpus, LitCovid-AGAC. The corpus covers 50,018 COVID-19 abstracts in LitCovid and uses 12 labels: Mutation, Species, Gene, and Disease from PubTator; GO and CHEBI from OGER; and Var, MPA, CPA, NegReg, PosReg, and Reg from AGAC. It contains abundant information that may help unveil hidden knowledge about the pathological mechanism of COVID-19.

Tutorial on Drug Development for Central Nervous System

  • Yoon, Hye-Jin;Kim, Jung-Su
    • Interdisciplinary Bio Central
    • /
    • v.2 no.4
    • /
    • pp.9.1-9.5
    • /
    • 2010
  • Many neurodegenerative diseases, such as Alzheimer's and Parkinson's disease, are devastating disorders that affect millions of people worldwide. However, therapeutic options remain severely limited, with only symptomatic management therapies available. With a better understanding of the pathogenesis of neurodegenerative diseases, discovery efforts for disease-modifying drugs have increased dramatically in recent years. Even so, the translation of basic science discoveries into novel therapies still lags behind for various reasons. Finding new effective drugs that target the central nervous system (CNS) poses unique challenges because of the blood-brain barrier (BBB). Furthermore, the relatively slow progression of neurodegenerative disorders creates another level of difficulty, as clinical trials must be carried out over an extended period of time. This review is intended to provide molecular and cell biologists with working knowledge and resources on CNS drug discovery and development.

Discovery of promising business items by technology-industry concordance and keyword co-occurrence analysis of US patents. (기술-산업 연계구조 및 특허 분석을 통한 미래유망 아이템 발굴)

  • Cho Byoung-Youl;Rho Hyun-Sook
    • Journal of Korea Technology Innovation Society
    • /
    • v.8 no.2
    • /
    • pp.860-885
    • /
    • 2005
  • This study develops a quantitative method for discovering and selecting promising technology-based business items. We used patent trend analysis, technology-industry concordance analysis, and keyword co-occurrence analysis of US patents. By analyzing patent trends and technology-industry concordance, we identified three emerging industry trends: the prevalence of the bio industry, the service industry, and B2C business. From direct and co-occurrence analysis of patent keywords that newly appeared in the year 2000, 28 promising business item candidates were extracted. Finally, the candidates were prioritized using four determinants of business attractiveness: market size, product life cycle, degree of technological innovation, and agreement with the industry trends. The result implies that promising technology-based business items can be reliably discovered and selected through a quantitative, objective, and low-cost process that applies knowledge discovery methods to a patent database instead of peer review.
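
Keyword co-occurrence analysis, as mentioned in this abstract, can be sketched as counting how often two keywords appear in the same patent. This is a minimal illustration of the general technique, not the authors' procedure; the function name and the toy keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(documents):
    """Count how often each unordered keyword pair appears in the same document."""
    counts = Counter()
    for keywords in documents:
        # Deduplicate and sort so each pair has one canonical (a, b) ordering.
        unique = sorted(set(keywords))
        for pair in combinations(unique, 2):
            counts[pair] += 1
    return counts

# Hypothetical keyword lists, one per patent (invented for illustration).
patents = [
    ["biosensor", "wireless", "diagnosis"],
    ["biosensor", "diagnosis"],
    ["wireless", "payment"],
]

pairs = cooccurrence_counts(patents)
for pair, count in pairs.most_common():
    print(pair, count)
```

Pairs with high counts are candidates for related technology themes; how such pairs would be turned into business item candidates is not specified in the abstract.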

A Study on Visualization of Digital Preservation Knowledge Domain Using CiteSpace (CiteSpace 적용을 통한 디지털 보존 지식영역 비주얼화 연구)

  • Kim Hee-Jung
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.39 no.4
    • /
    • pp.89-104
    • /
    • 2005
  • This article identifies an emerging research paradigm and monitors changes in the digital preservation area using CiteSpace, a Java application that supports visual exploration and knowledge discovery in bibliographic databases. Seventy-four articles in the digital preservation field, covering the period from 1990 to 2005, were extracted from Web of Science. According to the analysis, the core knowledge domains in digital preservation are technical preservation strategies, information networks and preservation systems, knowledge management, and electronic government.

Creating Knowledge from Construction Documents Using Text Mining

  • Shin, Yoonjung;Chi, Seokho
    • International conference on construction engineering and project management
    • /
    • 2015.10a
    • /
    • pp.37-38
    • /
    • 2015
  • Over time, the construction industry has generated many documents containing important and useful knowledge. Such text-based knowledge supports decision-making and business strategy development in the industry: it serves as best practice for upcoming projects and delivers lessons learned for better risk management and project control. Creating practical, usable knowledge from construction documents is therefore necessary to improve business efficiency. This study proposes a system that creates knowledge from construction documents using text mining. The design comprises three main steps: text mining preprocessing, weight calculation for each term, and visualization. A system prototype was developed as a pilot study of the design. The study is significant because it validates, through the prototype, a knowledge-creating system design based on text mining and visualization. Automated visualization was found to significantly reduce the time and effort spent processing existing data and reading a range of documents to get to their core, and it helped the system provide insight into the construction industry.
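
The term-weighting step in this abstract can be illustrated with TF-IDF, a common weighting scheme. The abstract does not name the scheme the authors used, so TF-IDF is an assumption here, and the function name and sample sentences are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Weight = (term count / document length) * ln(N / document frequency).
    """
    n_docs = len(documents)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        term_counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights

# Hypothetical, already-preprocessed construction notes (invented for illustration).
docs = [
    "crane delay caused schedule delay".split(),
    "concrete curing schedule".split(),
]

weights = tf_idf(docs)
for term, weight in sorted(weights[0].items(), key=lambda item: -item[1]):
    print(f"{term}: {weight:.3f}")
```

With this scheme, a term that appears in every document (here "schedule") gets weight zero, while a term repeated within a single document (here "delay") ranks highest, which is the kind of signal a visualization step could then display.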

Digital Collaborative Network Architecture Model Supported by Knowledge Engineering in Heritage Sites

  • Marcio Crescencio;Alexandre Augusto Biz;Jose Leomar Todesco
    • Journal of Smart Tourism
    • /
    • v.4 no.1
    • /
    • pp.19-29
    • /
    • 2024
  • The objective of this article is to create an integrated management model, based on the framework modeling of a digital collaborative network supported by knowledge engineering, to make heritage sites in Brazil more effective. This is exploratory, qualitative research that uses thematic analysis as its data analysis technique, covering the themes of collaborative networks, digital platforms, world heritage, and tourism. The snowballing approach was chosen, and relevant studies were mapped and classified using a spreadsheet tool and the Mendeley® software. The results show that a collaborative network model oriented towards strategic objectives should be supported by a digital platform that provides a technological environment, adds functionalities and digital platform services, and integrates knowledge engineering techniques and tools, enabling the discovery and sharing of knowledge in the collaborative network.

Linear Programming Model Discovery from Databases Using GPS and Artificial Neural Networks (GPS와 인공신경망을 활용한 데이터베이스로부터의 선형계획모형 발견법)

  • 권오병;양진설
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.25 no.3
    • /
    • pp.91-107
    • /
    • 2000
  • The linear programming model is a special form of useful knowledge that is embedded in a database. Since formulating models from scratch requires knowledge-intensive effort, knowledge-based formulation support systems have been proposed in the Decision Support Systems area. However, they rely on the assumption that sufficient domain knowledge has already been captured in a specific knowledge representation form. Hence, the purpose of this paper is to propose a methodology that finds useful knowledge for building linear programming models from a database. The methodology consists of two parts. The first part finds a first-cut model based on a data dictionary, for which we applied the General Problem Solver (GPS) algorithm. The second part discovers a second-cut model by applying a neural network technique. An illustrative example is described to show the feasibility of the proposed methodology.

Shedding; towards a new paradigm of syndecan function in cancer

  • Choi, So-Joong;Lee, Ha-Won;Choi, Jung-Ran;Oh, Eok-Soo
    • BMB Reports
    • /
    • v.43 no.5
    • /
    • pp.305-310
    • /
    • 2010
  • Syndecans, cell surface heparan sulfate proteoglycans, have been proposed to act as cell surface receptors and/or coreceptors that play critical roles in multiple cellular functions. However, recent reports suggest that the function of syndecans can be further extended through shedding, the cleavage of the extracellular domain. Shedding constitutes an additional level of control over syndecan function, providing a means to attenuate and/or regulate the amplitude and duration of syndecan signals by modulating the activity of syndecans as cell surface receptors. Whether the remaining cleavage products can still function as cell surface receptors and efficiently transduce signals inside cells is not clear. However, shedding transforms cell surface receptor syndecans into soluble forms, which, like growth factors, may act as novel ligands to induce cellular responses by associating with other cell surface receptors. It is becoming increasingly evident that shed syndecans also contribute significantly to syndecan functions in cancer biology. This review presents current knowledge about syndecan shedding and its functional significance, particularly in the context of cancer.