• Title/Summary/Keyword: Data-science

Artificial Intelligence and Pattern Recognition Using Data Mining Algorithms

  • Al-Shamiri, Abdulkawi Yahya Radman
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.221-232 / 2021
  • In recent years, as huge amounts of data have accumulated in large databases, the need for accurate tools that analyze data and extract information and knowledge from large, multi-source databases has increased. New techniques have therefore emerged that contribute to the development of other sciences. Knowledge discovery techniques are among them; one popular example is data mining, which aims to discover knowledge from huge amounts of data. Data mining is an important and interesting technique with many varied algorithms. This paper therefore presents an overview of data mining and clarifies the most important of those algorithms and their uses.

A Quantitative Assessment Model for Data Governance (Data Governance 정량평가 모델 개발방법의 제안)

  • Jang, Kyoung-Ae;Kim, Woo-Je
    • Journal of the Korean Operations Research and Management Science Society / v.42 no.1 / pp.53-63 / 2017
  • Managing quantitative measurements of data control activities across the enterprise is important for securing the management of data governance. However, research on data governance has been limited to concept definitions and components, and research on evaluation models is lacking. In this study, we developed a quantitative assessment model for data governance that includes the assessment area, evaluation index, and evaluation matrix. We also proposed a method for developing this model. For this purpose, we drew on previous studies and expert opinion analysis such as the Delphi technique and the KJ method. This study contributes to the literature by developing a quantitative evaluation model for data governance at an early stage of research on the topic. The results can serve as baseline data that provide objective evidence of performance for companies and agencies operating data governance.

SRC-Stat Package for Fitting Double Hierarchical Generalized Linear Models (이중 다단계 일반화 선형모형 적합을 위한 SRC-stat의 사용)

  • Noh, Maengseok;Ha, Il Do;Lee, Youngjo;Lim, Johan;Lee, Jaeyong;Oh, Heeseok;Shin, Dongwan;Lee, Sanggoo;Seo, Jinuk;Park, Yonhtae;Cho, Sungzoon;Park, Jonghun;Kim, Youkyung;You, Kyungsang
    • The Korean Journal of Applied Statistics / v.28 no.2 / pp.343-351 / 2015
  • We describe how to fit random-effects models with the SRC-Stat statistical package. The package was developed to fit double hierarchical generalized linear models, in which the mean and the dispersion parameters for the variance of random effects and the residual variance (overdispersion) can be modeled as random-effect models. Estimates of fixed effects, random effects, and variances are calculated by a hierarchical likelihood method. We illustrate the use of the package with practical data sets.

INTRODUCTION OF J-OFURO LATENT HEAT FLUX VERSION 2

  • Kubota, Masahisa;Hiroyuki, Tomita;Iwasaki, Shinsuke;Hihara, Tsutomu;Kawatsura, Ayako
    • Proceedings of the KSRS Conference / 2007.10a / pp.306-309 / 2007
  • Japanese Ocean Flux Data Sets with Use of Remote Sensing Observations (J-OFURO) includes global ocean surface heat flux data derived from satellite data and is used in many studies of air-sea interaction. Recently, version 2 of the latent heat flux data was constructed in J-OFURO, with many improvements over version 1. The bulk algorithm used to estimate latent heat flux was changed from Kondo (1975) to COARE 3.0 (Fairall et al., 2005). In version 1 we used NCEP reanalysis data (Reynolds and Smith, 1994) as SST data. However, the temporal resolution of those data is weekly, which is considerably low. Many kinds of global SST data are now available because SST can be obtained from microwave radiometer sensors such as TRMM/MI and Aqua/AMSR-E. We therefore compared many SST products and decided to use the Merged satellite and in situ data Global Daily (MGD) SST provided by the Japan Meteorological Agency. Because wind speed and specific humidity data in J-OFURO were derived from a single DMSP/SSMI sensor, at most two observations were available per day, so the daily-mean value could contain large sampling errors. To avoid this problem, multi-satellite data are used in version 2. As a result, we improved the temporal resolution from a 3-day mean in version 1 to a daily mean in version 2. We also used an Optimum Interpolation method, instead of a simple mean, to estimate wind speed and specific humidity. Finally, the data period was extended to 1989-2004. In this presentation we introduce version 2 of the J-OFURO latent heat flux data and compare it with other surface latent heat flux data sets such as GSSTF2 and HOAPS. We also present validation results based on buoy data.
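
For context, bulk algorithms of the kind named in this abstract generally estimate latent heat flux from an expression of the following form. This is the generic textbook bulk formula, not the specific J-OFURO implementation, and all symbols are notation assumed here rather than taken from the paper.

```latex
Q_E = \rho_a \, L_v \, C_E \, U \, (q_s - q_a)
```

Here $\rho_a$ is the air density, $L_v$ the latent heat of vaporization, $C_E$ the bulk transfer coefficient for moisture, $U$ the near-surface wind speed, $q_s$ the saturation specific humidity at the sea surface temperature, and $q_a$ the specific humidity of the near-surface air. Bulk algorithms differ mainly in how they parameterize $C_E$, and the formula shows why SST, wind speed, and specific humidity are the inputs the abstract discusses.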

Obesity Level Prediction Based on Data Mining Techniques

  • Alqahtani, Asma;Albuainin, Fatima;Alrayes, Rana;Al muhanna, Noura;Alyahyan, Eyman;Aldahasi, Ezaz
    • International Journal of Computer Science & Network Security / v.21 no.3 / pp.103-111 / 2021
  • Obesity affects individuals of all genders and ages worldwide; consequently, many studies have sought to identify the factors that cause it. This study develops an effective method to predict obesity levels based on supervised data mining techniques such as Random Forest and Multi-Layer Perceptron (MLP), so as to tackle this universal epidemic. The dataset was collected from countries such as Mexico, Peru, and Colombia, covering the 14-61 year age group with varying eating habits and physical conditions. The data include 2111 instances and 17 attributes labelled using NObesity, which categorizes the data into Insufficient Weight, Normal Weight, Overweight Levels I and II, and Obesity Types I to III. The study found that the Random Forest algorithm achieved higher accuracy than the MLP algorithm, with an overall classification rate of 96.7%.

Data Source Management using weight table in u-GIS DSMS

  • Kim, Sang-Ki;Baek, Sung-Ha;Lee, Dong-Wook;Chung, Warn-Il;Kim, Gyoung-Bae;Bae, Hae-Young
    • Journal of Korea Spatial Information System Society / v.11 no.2 / pp.27-33 / 2009
  • The emergence of GeoSensors and research on GIS have spurred much research on u-GIS. A disaster application coupled with u-GIS can monitor an accident area and help prevent the accident from spreading. Such an application needs u-GIS DSMS techniques to acquire and process GeoSensor data and to integrate them with GIS data. The u-GIS DSMS must process large-volume data streams such as spatial and multimedia data. Because of the characteristics of these data streams, query processing in the u-GIS DSMS can be delayed. Moreover, as the input rate of data increases in the area generating events, network traffic increases. To solve this problem, in this paper we describe a TRIGGER ACTION clause in CQ for the u-GIS DSMS environment and propose data source management. A data source weight table controls GES information and the incoming data rate: it adjusts the incoming data rate by increasing the weight of the GES in the disaster area. Consequently, it can contribute to the query processing rate and accuracy.
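
The idea of a weight table that steers the incoming data rate can be illustrated with a minimal sketch. The abstract does not give the actual table layout or allocation rule, so everything below is an assumption made for illustration: the function name `allocate_rates`, the source names, and the proportional-share rule are all hypothetical.

```python
# Hypothetical sketch: share a fixed ingest budget among data sources (GES)
# in proportion to their weights. The proportional rule is an assumption,
# not the method described in the paper.

def allocate_rates(weights, total_rate):
    """Split total_rate across sources in proportion to their weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one source needs a positive weight")
    return {name: total_rate * w / total_weight for name, w in weights.items()}


# Hypothetical weight table: three sources, one slightly favoured.
weight_table = {"ges_a": 1, "ges_b": 1, "ges_c": 2}
normal_rates = allocate_rates(weight_table, total_rate=400)

# An event occurs near ges_c, so its weight is raised in the table.
weight_table["ges_c"] = 6
event_rates = allocate_rates(weight_table, total_rate=400)

print(normal_rates)
print(event_rates)
```

Raising one weight increases that source's share of the budget and reduces the others', which is one simple way a table could prioritise data from a disaster area without increasing total traffic.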

Agriculture Big Data Analysis System Based on Korean Market Information

  • Chuluunsaikhan, Tserenpurev;Song, Jin-Hyun;Yoo, Kwan-Hee;Rah, Hyung-Chul;Nasridinov, Aziz
    • Journal of Multimedia Information System / v.6 no.4 / pp.217-224 / 2019
  • As the world's population grows, maintaining the food supply is becoming a bigger problem. Now and in the future, big data will play a major role in decision making in the agriculture industry. The challenge is how to obtain valuable information that helps us make future decisions. Big data helps us see history more clearly, uncover hidden value, and make the right decisions for the government and farmers. To contribute to solving this challenge, we developed the Agriculture Big Data Analysis System. The system consists of agricultural big data collection, big data analysis, and big data visualization. First, we collected structured data such as price, climate, and yield, and unstructured data such as news, blogs, and TV programs. Using the collected data, we implemented prediction algorithms such as ARIMA, Decision Tree, LDA, and LSTM and presented the results as data visualizations.

The Functional Requirements of Core Elements for Research Data Management and Service (연구 데이터 관리 및 서비스를 위한 핵심요소의 기능적 요건)

  • Kim, Juseop;Kim, Suntae;Choi, Sangki
    • Journal of the Korean Society for Library and Information Science / v.53 no.3 / pp.317-344 / 2019
  • The increasing value of data, paradigm shifts in research methods, and concrete manifestations of open science indicate that research is no longer text-centric but data-driven. In this study, we analyzed the services of DCC, ICPSR, ANDS, and DataONE to derive key elements and functional requirements for research data management and services, which are still insufficient in domestic research. The key elements derived include DMP writing support, data description, data storage, data sharing and access, data citation, and data management training. In addition, by presenting functional requirements for the derived key elements, this study can be applied to constructing and operating RDM services in the future.

Big IoT Healthcare Data Analytics Framework Based on Fog and Cloud Computing

  • Alshammari, Hamoud;El-Ghany, Sameh Abd;Shehab, Abdulaziz
    • Journal of Information Processing Systems / v.16 no.6 / pp.1238-1249 / 2020
  • Throughout the world, aging populations and doctor shortages have helped drive the increasing demand for smart healthcare systems. Recently, these systems have benefited from the evolution of the Internet of Things (IoT), big data, and machine learning. However, these advances result in the generation of large amounts of data, making healthcare data analysis a major issue. These data have a number of complex properties, such as high dimensionality, irregularity, and sparsity, which make efficient processing difficult to implement. These challenges are met by big data analytics. In this paper, we propose an innovative analytic framework for big healthcare data that are collected either from IoT wearable devices or from archived patient medical images. The proposed method would efficiently address the data heterogeneity problem using middleware between heterogeneous data sources and MapReduce Hadoop clusters. Furthermore, the proposed framework enables the use of both fog computing and cloud platforms to handle the problems faced through online and offline data processing, data storage, and data classification. Additionally, it guarantees robust and secure knowledge of patient medical data.

Research Data Management of Science and Technology Research Institutes in Korea (국내 과학기술분야 연구기관의 과학데이터 관리 현황)

  • Choi, Myung-Seok;Lee, Seung-Bock;Lee, Sanghwan
    • The Journal of the Korea Contents Association / v.17 no.12 / pp.117-126 / 2017
  • As the research environment and research paradigm have become data-driven, Open Science, based on openness and sharing of public research results, has emerged as a global agenda for scientific research. National policies for sharing and re-use of research data from publicly funded research are in effect globally. Therefore, in Korea, it is urgent to build policies and infrastructure for sharing and re-use of research data. In this paper, we investigate the current status of research data management at science and technology research institutes in Korea. We conducted in-depth interviews with researchers from 22 research institutes belonging to the National Research Council of Science & Technology and from 20 universities in Korea, asking about the creation, management, and utilization of research data, willingness to share data, and needs for sharing and re-use of research data. From these interviews, we drew implications for open research data and future directions.