• Title/Summary/Keyword: dataset relationships

Knowledge Model for Disaster Dataset Navigation

  • Hwang, Yun-Young;Yuk, Jin-Hee;Shin, Sumi
    • Journal of Information Science Theory and Practice
    • /
    • v.9 no.4
    • /
    • pp.35-49
    • /
    • 2021
  • When many diverse datasets are available, an efficient method is needed to provide users with the datasets they require. To address this need, the necessary datasets should be selected on the basis of the relationships between the datasets. In particular, to discover the datasets needed for disaster resolution, we need to consider the disaster resolution stage. In this paper, in order to provide the necessary datasets for each stage of disaster resolution, we constructed a disaster type and disaster management process ontology and designed a method to determine the necessary datasets for each disaster type and disaster management process step. In addition, we introduce a method to determine relationships between datasets necessary for disaster response. We propose a method for discovering datasets based on minimal relationships such as "isA," "sameAs," and "subclassOf." To discover suitable datasets, we designed a knowledge exploration model and collected 651 disaster-related datasets for improving our method. These datasets were categorized by disaster type from the perspective of disaster management. When actual datasets are categorized by disaster type and disaster management type, a single dataset can be classified under multiple types in both categories. We built a knowledge exploration model on the basis of disaster examples to validate the configuration of our model.
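The idea of discovering datasets through minimal relationships can be illustrated with a small sketch. It is not the authors' knowledge exploration model: the triples, the dataset names, and the `discover_datasets` function are all hypothetical, and the traversal rule (follow "subclassOf" downward and "sameAs" in both directions, then collect "isA" members) is an assumption made only for illustration.

```python
from collections import deque

# Hypothetical (subject, relation, object) triples; every name is illustrative.
TRIPLES = [
    ("Typhoon", "subclassOf", "NaturalDisaster"),
    ("Flood", "subclassOf", "NaturalDisaster"),
    ("Hurricane", "sameAs", "Typhoon"),
    ("rainfall_2020", "isA", "Flood"),
    ("wind_tracks", "isA", "Typhoon"),
    ("storm_surge", "isA", "Hurricane"),
]


def discover_datasets(disaster_type, triples):
    """Return datasets linked to a type, its subclasses, or its sameAs equivalents."""
    types = {disaster_type}
    queue = deque([disaster_type])
    while queue:
        current = queue.popleft()
        for subj, rel, obj in triples:
            neighbours = []
            if rel == "subclassOf" and obj == current:
                # A narrower type inherits relevance from the broader one.
                neighbours.append(subj)
            elif rel == "sameAs" and current in (subj, obj):
                # sameAs is symmetric, so take whichever side is not `current`.
                neighbours.append(obj if subj == current else subj)
            for node in neighbours:
                if node not in types:
                    types.add(node)
                    queue.append(node)
    # Datasets are the subjects of isA triples pointing at any collected type.
    return sorted(s for s, rel, o in triples if rel == "isA" and o in types)


print(discover_datasets("NaturalDisaster", TRIPLES))
print(discover_datasets("Hurricane", TRIPLES))
```

In this toy graph, a query for "Hurricane" also reaches datasets attached to "Typhoon" because of the "sameAs" link, which is the kind of indirect discovery the abstract describes.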

Stock News Dataset Quality Assessment by Evaluating the Data Distribution and the Sentiment Prediction

  • Alasmari, Eman;Hamdy, Mohamed;Alyoubi, Khaled H.;Alotaibi, Fahd Saleh
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.2
    • /
    • pp.1-8
    • /
    • 2022
  • This work provides a reliable, classified stock dataset merged with Saudi stock news. This dataset allows researchers to analyze and better understand the realities, impacts, and relationships between stock news and stock fluctuations. The data were collected from the Saudi stock market via the Corporate News (CN) and Historical Data Stocks (HDS) datasets. As their names suggest, CN contains news, and HDS provides information concerning how stock values change over time. Both datasets cover the period from 2011 to 2019, have 30,098 rows, and have 16 variables, four of which they share and 12 of which differ. Therefore, the combined dataset presented here includes 30,098 published news pieces and information about stock fluctuations across nine years. Stock news polarity has been interpreted in various ways by native Arabic speakers associated with the stock domain. Therefore, this polarity was categorized manually based on Arabic semantics. As the Saudi stock market contributes massively to the international economy, this dataset is essential for stock investors and analysts. The dataset has been prepared for educational and scientific purposes, motivated by the scarcity of data describing the impact of Saudi stock news on stock activities. It will, therefore, be useful across many sectors, including stock market analytics, data mining, statistics, machine learning, and deep learning. The data are evaluated on two criteria: the distribution of the data over the classes and the accuracy of sentiment prediction. The results show that the distribution of polarity over sectors is balanced. An NB model was developed to evaluate data quality based on sentiment classification; it achieved 68% accuracy, supporting the data's reliability. Thus, the evaluation results indicate that the dataset is reliable, ready, and of high quality for use.

Multimodal Context Embedding for Scene Graph Generation

  • Jung, Gayoung;Kim, Incheol
    • Journal of Information Processing Systems
    • /
    • v.16 no.6
    • /
    • pp.1250-1260
    • /
    • 2020
  • This study proposes a novel deep neural network model that can accurately detect objects and their relationships in an image and represent them as a scene graph. The proposed model utilizes several multimodal features, including linguistic features and visual context features, to accurately detect objects and relationships. In addition, in the proposed model, context features are embedded using graph neural networks to depict the dependencies between two related objects in the context feature vector. This study demonstrates the effectiveness of the proposed model through comparative experiments using the Visual Genome benchmark dataset.

Geometric and Semantic Improvement for Unbiased Scene Graph Generation

  • Ruhui Zhang;Pengcheng Xu;Kang Kang;You Yang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.10
    • /
    • pp.2643-2657
    • /
    • 2023
  • Scene graphs are structured representations that can clearly convey objects and the relationships between them, but are often heavily biased due to the highly skewed, long-tailed relational labeling in the dataset. Indeed, the visual world itself and its descriptions are biased. Therefore, Unbiased Scene Graph Generation (USGG) prefers to train models to eliminate long-tail effects as much as possible, rather than altering the dataset directly. To this end, we propose Geometric and Semantic Improvement (GSI) for USGG to mitigate this issue. First, to fully exploit the feature information in the images, geometric dimension and semantic dimension enhancement modules are designed. The geometric module is designed from the perspective that the position information between neighboring object pairs will affect each other, which can improve the recall rate of the overall relationship in the dataset. The semantic module further processes the embedded word vector, which can enhance the acquisition of semantic information. Then, to improve the recall rate of the tail data, the Class Balanced Seesaw Loss (CBSLoss) is designed for the tail data. The recall rate of the prediction is improved by penalizing the body or tail relations that are judged incorrectly in the dataset. The experimental findings demonstrate that the GSI method performs better than mainstream models in terms of the mean Recall@K (mR@K) metric in three tasks. The long-tailed imbalance in the Visual Genome 150 (VG150) dataset is addressed better using the GSI method than by most of the existing methods.
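The difference between plain Recall@K and mean Recall@K (mR@K) on long-tailed predicates can be seen in a small sketch. This is a simplified, single-image version with made-up triples; benchmark implementations typically aggregate over many images, and the function names here are hypothetical.

```python
from collections import defaultdict


def recall_at_k(ground_truth, ranked_predictions, k):
    """Fraction of all ground-truth triples found in the top-k predictions."""
    top_k = set(ranked_predictions[:k])
    if not ground_truth:
        return 0.0
    return sum(t in top_k for t in ground_truth) / len(ground_truth)


def mean_recall_at_k(ground_truth, ranked_predictions, k):
    """Average of per-predicate recalls, so rare predicates count as much as common ones."""
    top_k = set(ranked_predictions[:k])
    hits = defaultdict(int)
    totals = defaultdict(int)
    for triple in ground_truth:
        predicate = triple[1]  # triples are (subject, predicate, object)
        totals[predicate] += 1
        if triple in top_k:
            hits[predicate] += 1
    if not totals:
        return 0.0
    recalls = [hits[p] / totals[p] for p in totals]
    return sum(recalls) / len(recalls)


# Illustrative data: "on" is a head predicate, "riding" a tail predicate.
GROUND_TRUTH = [
    ("man", "on", "horse"),
    ("cup", "on", "table"),
    ("dog", "on", "grass"),
    ("man", "riding", "horse"),
]
PREDICTIONS = [
    ("man", "on", "horse"),
    ("cup", "on", "table"),
    ("dog", "on", "grass"),
    ("man", "on", "bike"),
]

print(recall_at_k(GROUND_TRUTH, PREDICTIONS, 3))       # 0.75
print(mean_recall_at_k(GROUND_TRUTH, PREDICTIONS, 3))  # 0.5
```

A model that only ever predicts the head predicate still scores 0.75 on Recall@3 here, but mR@3 drops to 0.5 because the tail predicate "riding" is never recovered.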

The Relationship between the Use of Korean and Western Medicine in treating Musculoskeletal Disease (근골격계 질환에 대한 한방의료기관 이용이 양방의료기관 이용에 미치는 영향 - 한국의료패널 자료를 이용하여 -)

  • Choi, Byunghee;Son, Chihyoung;Lim, Byungmook
    • The Journal of Korean Medicine
    • /
    • v.35 no.3
    • /
    • pp.22-31
    • /
    • 2014
  • Objectives: The aim of this study was to identify the complementary and substitute relationships between the use of Korean medicine (KM) and that of Western medicine (WM) in the treatment of musculoskeletal disease. Methods: We analyzed the 2009 Korea Health Panel dataset. General characteristics and the medical utilization of respondents were analyzed descriptively. Logistic regression, negative binomial regression, and Tobit regression analysis were used to identify the relationships between the use of KM and the use, visit frequency, and expenses of WM, respectively. Results: In the treatment of musculoskeletal disease, KM use and non-herbal treatments with Korean medicine significantly reduced WM use. Herb medication use significantly increased WM visit frequency. There were no significant relationships between KM use and WM expenses. Conclusions: There are substitute relationships between WM use and KM use, especially for non-herbal treatments in KM. Therefore, we need to develop clinical protocols for KM and WM treatments of musculoskeletal disease to ensure the proper distribution of medical resources.

A Structural Equation Model on Korean Adolescents' Multi-cultural Acceptance (청소년의 다문화 수용성 구조 모형 구축)

  • Lee, Ha-na
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.19 no.2
    • /
    • pp.302-310
    • /
    • 2018
  • This study was conducted to develop a unified structural model that defines relationships among factors that affect Multi Cultural Acceptance (MCA) for adolescents. This study was performed using the dataset from the 2016 Korean Children and Youth Panel Survey (KCPS). We analyzed the survey results from the dataset at the 0.05 significance level using the SPSS and AMOS version 22 programs. Specifically, we investigated several demographic characteristics of the survey participants by descriptive analysis and adopted maximum likelihood estimation to verify the fit of the hypothetical model and the hypotheses therein. In addition, we applied the ${\chi}^2$-test, GFI, AGFI, CFI, IFI, and RMSEA to show the level of fit of our structural model. The results showed that our proposed structural model fit the data well. We found that the key factors that affect MCA for adolescents were ego-resilience, peer relationships, and sense of community. Overall, the results of our study indicate that combined intervention is needed to help adolescents strengthen their ego-resilience, as well as to develop peer relationships and a sense of community.

ValueRank: Keyword Search of Object Summaries Considering Values

  • Zhi, Cai;Xu, Lan;Xing, Su;Kun, Lang;Yang, Cao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.12
    • /
    • pp.5888-5903
    • /
    • 2019
  • The relational ranking method applies authority-based ranking to relational datasets that can be modeled as graphs, also considering their tuples' values. Authority flows from the tuples that contain the given keywords and transfers to their neighboring nodes in accordance with their values and semantic connections. In our previous work, ObjectRank was extended to ValueRank, which also takes into account the value of tuples in authority transfer flows. In marked contrast to ObjectRank, which only considers authority flows through relationships and is therefore valid only for bibliographic databases, e.g., the DBLP dataset, ValueRank facilitates the estimation of importance for any database, e.g., trading databases. A relational keyword search paradigm, Object Summary (denoted as OS), was proposed recently: given a set of keywords, it returns a group of Object Summaries as the query result. An OS is a multilevel-tree data structure in which the root node is the tuple containing the keywords, and the surrounding nodes are the summary of all data on the graph. However, some of these trees contain a very large total number of tuples, so size-l OSs, which are OS snippets, have also been investigated using ValueRank. We evaluated our proposed approach on a real bibliographic dataset and Microsoft business databases.
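As a rough illustration of value-aware authority transfer, the sketch below runs a generic random walk with restart in which each node splits its authority among its out-neighbours in proportion to their tuple values. It is not the published ValueRank algorithm: the update rule, the damping factor, the graph, and the name `value_weighted_rank` are all assumptions chosen for illustration.

```python
def value_weighted_rank(edges, values, base_set, damping=0.85, iterations=50):
    """Generic value-weighted authority flow (an illustrative stand-in, not ValueRank).

    edges:    list of (source, target) pairs in the tuple graph
    values:   dict mapping each node to a non-negative tuple value
    base_set: nodes that contain the query keywords and inject authority
    """
    nodes = sorted(values)
    restart = 1.0 / len(base_set)
    scores = {n: (restart if n in base_set else 0.0) for n in nodes}

    # Total value of each node's out-neighbours, used to normalise transfers.
    out_total = {n: 0.0 for n in nodes}
    for src, dst in edges:
        out_total[src] += values[dst]

    for _ in range(iterations):
        # Keyword tuples receive the restart mass; other nodes start from zero.
        new_scores = {
            n: ((1.0 - damping) * restart if n in base_set else 0.0) for n in nodes
        }
        for src, dst in edges:
            if out_total[src] > 0:
                share = values[dst] / out_total[src]
                new_scores[dst] += damping * scores[src] * share
        scores = new_scores
    return scores


# Hypothetical trading-style graph: one customer tuple linked to two orders.
EDGES = [("customer", "order_big"), ("customer", "order_small")]
VALUES = {"customer": 1.0, "order_big": 900.0, "order_small": 100.0}

ranks = value_weighted_rank(EDGES, VALUES, base_set={"customer"})
print(ranks["order_big"] > ranks["order_small"])  # True
```

With purely structural transfer, both orders would receive the same authority from the customer tuple; weighting by value ranks the larger order higher, which is the intuition behind considering tuple values in trading databases.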

MicroRNA-Gene Association Prediction Method using Deep Learning Models

  • Seung-Won Yoon;In-Woo Hwang;Kyu-Chul Lee
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.4
    • /
    • pp.294-299
    • /
    • 2023
  • Micro ribonucleic acids (miRNAs) can regulate the protein expression levels of genes in the human body and have recently been reported to be closely related to the cause of disease. Determining the genes related to miRNAs will aid in understanding the mechanisms underlying complex miRNAs. However, identifying miRNA-related genes through wet experiments (in vivo, traditional methods) is time-consuming and costly. To overcome these problems, recent studies have investigated the prediction of miRNA relevance using deep learning models. This study presents a method for predicting the relationships between miRNAs and genes. First, we reconstructed a negative dataset using the proposed method. We then extracted features using an autoencoder, after which the feature vector was concatenated with the original data. Thereafter, the concatenated data were used to train a long short-term memory model. Our model exhibited an area under the curve of 0.9609, outperforming previously reported models trained using the same dataset.

Relationships between Carcass Characteristics of Commercial Pork Breeds

  • Hwang, I.H.;Park, B.Y.;Kim, J.H.;Cho, S.H.;Kim, D.H.;Lee, J.M.;Lee, C.S.
    • Proceedings of the Korean Society for Food Science of Animal Resources Conference
    • /
    • 2006.05a
    • /
    • pp.196-199
    • /
    • 2006
  • The current study was conducted to identify the relationship between myosin heavy chain I and objective color dimensions. The myosin heavy chain I isoform showed coefficients of determination ($r^2$) of 0.54 and 0.40 for Hunter a* and b* values. For the current dataset, the Hunter a* value at day 1 showed stronger relationships with the values at both day 7 and day 14, emphasizing the importance of initial meat color, which is largely affected by animal management prior to slaughter.

Task Planning Algorithm with Graph-based State Representation (그래프 기반 상태 표현을 활용한 작업 계획 알고리즘 개발)

  • Seongwan Byeon;Yoonseon Oh
    • The Journal of Korea Robotics Society
    • /
    • v.19 no.2
    • /
    • pp.196-202
    • /
    • 2024
  • The ability to understand a given environment and plan a sequence of actions leading to a goal state is crucial for personal service robots. With recent advancements in deep learning, numerous studies have proposed methods for state representation in planning. However, previous works lack explicit information about relationships between objects because the state observation is converted to a single visual embedding containing all state information. In this paper, we introduce a graph-based state representation that incorporates both object and relationship features. To leverage these advantages in addressing the task planning problem, we propose a Graph Neural Network (GNN)-based subgoal prediction model. This model can extract rich information about objects and their interconnected relationships from a given state graph. Moreover, a search-based algorithm is integrated with the pre-trained subgoal prediction model and a state transition module to explore diverse states and find a proper sequence of subgoals. The proposed method is trained with a synthetic task dataset collected in a simulation environment and demonstrates a higher success rate with fewer additional searches than baseline methods.