• Title/Summary/Keyword: Data Analyzing

Analyzing RDF Data in Linked Open Data Cloud using Formal Concept Analysis

  • Hwang, Suk-Hyung; Cho, Dong-Heon
    • Journal of the Korea Society of Computer and Information / v.22 no.6 / pp.57-68 / 2017
  • The Linked Open Data (LOD) cloud is quickly becoming one of the largest collections of interlinked datasets and the de facto standard for publishing, sharing and connecting pieces of data on the Web. Data publishers from diverse domains publish their data using the Resource Description Framework (RDF) data model and provide SPARQL endpoints to enable querying their data, which creates a global, distributed and interconnected dataspace on the LOD cloud. Although it is possible to extract structured data as query results by using SPARQL, users have very little support for analyzing and visualizing RDF data from SPARQL query results. To tackle this issue, we propose a novel approach, based on Formal Concept Analysis, for analyzing and visualizing useful information from the LOD cloud. The RDF data analysis and visualization technique proposed in this paper can be utilized in semantic web data mining to extract and analyze the information and knowledge inherent in LOD and to support classification and visualization.
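
A minimal sketch of the general idea (not the authors' implementation): treat SPARQL query results as a formal context whose objects are resources and whose attributes are, for example, their rdf:type values, then enumerate the formal concepts by intersecting object intents. The sample bindings and attribute names below are hypothetical placeholders for real SPARQL endpoint output.

```python
# Hypothetical (object, attribute) pairs as they might come back from a
# SPARQL SELECT over the LOD cloud, e.g. ?resource rdf:type ?class.
bindings = [
    ("dbr:Seoul", "dbo:City"), ("dbr:Seoul", "dbo:Place"),
    ("dbr:Busan", "dbo:City"), ("dbr:Busan", "dbo:Place"),
    ("dbr:Korea", "dbo:Country"), ("dbr:Korea", "dbo:Place"),
]

# Build the formal context: object -> set of attributes (the object's intent).
context = {}
for obj, attr in bindings:
    context.setdefault(obj, set()).add(attr)

all_attrs = set().union(*context.values())

# Every concept intent is an intersection of object intents (plus the full
# attribute set for the empty extent); collect them incrementally.
intents = {frozenset(all_attrs)}
for obj_intent in context.values():
    intents |= {frozenset(obj_intent) & intent for intent in intents}
    intents.add(frozenset(obj_intent))

# A formal concept pairs each intent with its extent: all objects whose
# intent contains it.  Printing them gives the raw material for a lattice.
for intent in sorted(intents, key=len):
    extent = sorted(o for o, attrs in context.items() if intent <= attrs)
    print(sorted(intent), "->", extent)
```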

Automation technology for analyzing 3D point cloud data of construction sites

  • Park, Suyeul; Kim, Younggun; Choi, Yungjun; Kim, Seok
    • International Conference on Construction Engineering and Project Management / 2022.06a / pp.1100-1105 / 2022
  • Denoising, registering, and detecting changes in 3D digital maps are generally conducted by skilled technicians, which leads to inefficiency and the intervention of individual judgment. The manual post-processing for analyzing 3D point cloud data of construction sites requires a long time and substantial resources. This study develops automation technology for analyzing 3D point cloud data of construction sites. Scanned data are automatically denoised, and the denoised data are stored in a specific storage. The stored data sets are automatically registered once the data set to be registered is prepared. In addition, regions with non-homogeneous densities are converted into data with homogeneous density. A change detection function is developed to automatically analyze the degree of terrain change that occurred between time-series data sets.
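
The abstract describes a pipeline of denoising, registration and change detection; below is a minimal sketch of two of those steps (statistical outlier removal and nearest-neighbour change detection) using only NumPy and SciPy. The thresholds and the synthetic clouds are illustrative assumptions, not the authors' parameters, and the registration step is omitted.

```python
import numpy as np
from scipy.spatial import cKDTree

def denoise_statistical(points, k=8, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbours is
    more than std_ratio standard deviations above the cloud average."""
    tree = cKDTree(points)
    dists, _ = tree.query(points, k=k + 1)        # first column is the point itself
    mean_knn = dists[:, 1:].mean(axis=1)
    keep = mean_knn <= mean_knn.mean() + std_ratio * mean_knn.std()
    return points[keep]

def detect_changes(reference, current, threshold=0.10):
    """Flag points in the current scan farther than `threshold` (assumed
    metres) from every point of the reference scan."""
    tree = cKDTree(reference)
    nn_dist, _ = tree.query(current)
    return current[nn_dist > threshold]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.uniform(0, 10, size=(5000, 3))               # earlier survey (synthetic)
    current = np.vstack([reference + rng.normal(0, 0.01, reference.shape),
                         rng.uniform(0, 10, size=(50, 3)) + [0, 0, 5]])  # new terrain
    current = denoise_statistical(current)
    changed = detect_changes(reference, current)
    print(f"{len(changed)} points flagged as terrain change")
```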

A System for Analyzing Data Transmission Time in Ubiquitous Sensor Network

  • Chong, Ki-Won; Kim, Jae-Cheol; Kim, Ju-Il; Lee, Woo-Jin
    • The Journal of Society for e-Business Studies / v.13 no.2 / pp.149-163 / 2008
  • In a ubiquitous sensor network (USN) with several nodes, real-time data processing is one of the most important factors. In order to process data appropriately, all the nodes should transmit sensor data in time, and the transmission between nodes and their server should be managed very systematically. For the purpose of systematically managing transmission in a USN, this paper proposes a system for analyzing the transmission time of sensor data. To implement the proposed system, a process for analyzing data transmission time, a method for analyzing clock drift, a method for collecting data send/receive times, and formulas for calculating data transmission duration are proposed. Following the proposed process and methods, this paper presents a system for monitoring and analyzing data transmission duration, and it also shows the results of a sample case.
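
The abstract does not give the exact formulas, but the general idea of correcting for clock drift before computing transmission durations can be sketched as follows: estimate the node-clock-to-server-clock relationship with a linear fit over synchronization beacons, then map node-side send timestamps onto the server time base. The clock model, beacon schedule, and delays below are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic clock model: the sensor-node clock runs 0.01% fast and is
# 2.5 s ahead of the server clock.
def node_clock(server_time):
    return server_time * 1.0001 + 2.5

# Step 1: estimate clock drift from time-sync beacons, whose one-way delay
# is assumed negligible compared with the drift being measured.
beacon_server = np.linspace(0, 600, 50)
beacon_node = node_clock(beacon_server)
slope, offset = np.polyfit(beacon_node, beacon_server, deg=1)   # node -> server

# Step 2: compute transmission durations of data packets after mapping the
# node-side send timestamps onto the server time base.
send_server = np.sort(rng.uniform(0, 600, 200))          # true send instants
true_delay = rng.uniform(0.03, 0.07, send_server.size)   # 30-70 ms, unknown to the system
recv_server = send_server + true_delay                    # receive times measured at the server
send_node = node_clock(send_server)                       # send times measured at the node

corrected_send = slope * send_node + offset
duration = recv_server - corrected_send
print(f"estimated mean transmission time: {duration.mean()*1000:.1f} ms "
      f"(true {true_delay.mean()*1000:.1f} ms)")
```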

Design of a Platform for Collecting and Analyzing Agricultural Big Data

  • Nguyen, Van-Quyet; Nguyen, Sinh Ngoc; Kim, Kyungbaek
    • Journal of Digital Contents Society / v.18 no.1 / pp.149-158 / 2017
  • Big data have been presenting us with exciting opportunities and challenges in economic development. For instance, in the agriculture sector, combining various agricultural data (e.g., weather data, soil data) and subsequently analyzing these data delivers valuable and helpful information to farmers and agribusinesses. However, massive amounts of agricultural data are generated every minute through multiple kinds of devices and services, such as sensors and agricultural web markets. This leads to the challenges of the big data problem, including data collection, data storage, and data analysis. Although some systems have been proposed to address this problem, they are still restricted in the type of data, the type of storage, or the size of data they can handle. In this paper, we propose a novel design of a platform for collecting and analyzing agricultural big data. The proposed platform supports (1) multiple methods of collecting data from various data sources using Flume and MapReduce; (2) multiple choices of data storage including HDFS, HBase, and Hive; and (3) big data analysis modules with Spark and Hadoop.
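
A minimal sketch of the kind of Spark analysis module such a platform might run: joining weather and soil records stored in HDFS and computing per-region aggregates. The HDFS paths, column names, and schema are hypothetical assumptions; in the proposed platform the data could equally be read from HBase or Hive rather than raw CSV files.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("agri-bigdata-sketch")
         .getOrCreate())

# Hypothetical collection outputs written to HDFS by Flume/MapReduce jobs.
weather = spark.read.csv("hdfs:///agri/weather/*.csv", header=True, inferSchema=True)
soil = spark.read.csv("hdfs:///agri/soil/*.csv", header=True, inferSchema=True)

# Join the two sources on region and date, then aggregate per region:
# average temperature, total rainfall and average soil moisture.
joined = weather.join(soil, on=["region", "date"], how="inner")
summary = (joined.groupBy("region")
           .agg(F.avg("temperature").alias("avg_temp"),
                F.sum("rainfall").alias("total_rain"),
                F.avg("soil_moisture").alias("avg_moisture")))

# Persist the result for downstream consumers (dashboards, farmers' apps).
summary.write.mode("overwrite").parquet("hdfs:///agri/summary/region_stats")
spark.stop()
```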

Mixed-effects LS-SVR for longitudinal data

  • Cho, Dae-Hyeon
    • Journal of the Korean Data and Information Science Society / v.21 no.2 / pp.363-369 / 2010
  • In this paper we propose a mixed-effects least squares support vector regression (LS-SVR) for longitudinal data. We add a random-effect term to the optimization function of LS-SVR in order to incorporate random effects into LS-SVR for analyzing longitudinal data. We also present a model selection method that employs a generalized cross-validation function for choosing the hyper-parameters which affect the performance of the mixed-effects LS-SVR. A simulated example is provided to indicate the usefulness of the mixed-effects method for analyzing longitudinal data.
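
For reference, here is a minimal NumPy sketch of plain LS-SVR (without the paper's random-effect term): the dual solution comes from a single linear system in the bias b and the dual coefficients alpha, and the RBF kernel width and regularization gamma play the role of the hyper-parameters that the paper tunes by generalized cross-validation. The parameter values and the simulated curve are illustrative only.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvr_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVR dual system  [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]                      # bias b, dual coefficients alpha

def lssvr_predict(X_train, b, alpha, X_new, sigma=1.0):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Tiny simulated example: a noisy sine curve.
rng = np.random.default_rng(0)
X = np.linspace(0, 6, 80)[:, None]
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 80)
b, alpha = lssvr_fit(X, y)
print("training RMSE:", np.sqrt(np.mean((lssvr_predict(X, b, alpha, X) - y) ** 2)))
```

The mixed-effects version in the paper adds a random-effect term per subject to the objective; the sketch above covers only the fixed-effect LS-SVR core.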

Applying Decision Tree Algorithms for Analyzing HS-VOSTS Questionnaire Results

  • Kang, Dae-Ki
    • Journal of Engineering Education Research / v.15 no.4 / pp.41-47 / 2012
  • Data mining and knowledge discovery techniques have been shown to be effective in finding hidden underlying rules inside large databases in an automated fashion. On the other hand, analyzing, assessing, and applying students' survey data are very important in science and engineering education for various reasons, such as quality improvement, the engineering design process, and innovative education. Among such surveys, analyzing students' views on science-technology-society can be helpful to engineering education because, although most research on the philosophy of science has shown that science is one of the most difficult concepts to define precisely, it is still important to keep an eye on science, pseudo-science, and scientific misconduct. In this paper, we report the experimental results of applying decision tree induction algorithms to analyzing the questionnaire results of high school students' views on science-technology-society (HS-VOSTS). Empirical results on various settings of decision tree induction applied to HS-VOSTS results from students at one South Korean university indicate that decision tree induction algorithms can be successfully and effectively applied to automated knowledge discovery from students' survey data.
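
A minimal sketch of decision tree induction on categorical questionnaire responses using scikit-learn. The item names, response codings, and the target label below are made-up placeholders, not actual HS-VOSTS items or results.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical survey matrix: each row is one student, each column one
# multiple-choice item coded as an integer category (placeholder data).
rng = np.random.default_rng(42)
X = rng.integers(0, 4, size=(200, 5))             # 5 items, responses coded 0..3
feature_names = [f"item_{i+1}" for i in range(5)]

# Hypothetical target: whether the student holds a "naive" (0) or
# "informed" (1) overall view, synthesized here from item_1 for illustration.
y = (X[:, 0] >= 2).astype(int)

tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=10, random_state=0)
tree.fit(X, y)

# Print the induced rules, i.e. the "hidden underlying rules" the abstract refers to.
print(export_text(tree, feature_names=feature_names))
```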

Area Usage Factor Analyzing Method for Semi-conductor Manufacturing Process

  • Konishi, Katunobu; Ukida, Hiroyuki; Sawada, Koutarou
    • Institute of Control, Robotics and Systems: Conference Proceedings / 1998.10a / pp.480-483 / 1998
  • For memory products, it is very important to develop a new production line as soon as possible. All products are inspected at the last testing stage to remove defective products, and those inspection data are called FCM. In this paper, an Area Usage Factor (AUF) analyzing method based on the FCM data is proposed. With it, process engineers can decide in which direction they should concentrate their analysis effort.

Analyzing Survival Data as Binary Outcomes with Logistic Regression

  • Lim, Jo-Han; Lee, Kyeong-Eun; Hahn, Kyu-S.; Park, Kun-Woo
    • Communications for Statistical Applications and Methods / v.17 no.1 / pp.117-126 / 2010
  • Clinical researchers often analyze survival data as binary outcomes using the logistic regression method. This paper examines the information loss resulting from analyzing survival time as a binary outcome. We first demonstrate that, under the proportional hazards assumption, this binary discretization does result in a significant information loss. Second, when fitting a logistic model to survival time data, researchers inadvertently use the maximal statistic. We implement a numerical study to examine the properties of the reference distribution for this statistic. Finally, we show that the logistic regression method can still be a useful tool for analyzing survival data, in particular when the proportional hazards assumption is questionable.
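
A small simulation in the spirit of the comparison the paper makes: generate survival times under a proportional-hazards (exponential) model, dichotomize them at a fixed cutoff, and fit a logistic regression to the resulting binary outcome. The sample size, effect size, and cutoff are arbitrary illustrative choices, and censoring is ignored for simplicity.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n = 1000
x = rng.normal(size=n)                      # a single covariate
beta = 0.7                                  # log hazard ratio
hazard = 0.1 * np.exp(beta * x)             # proportional hazards
t = rng.exponential(1.0 / hazard)           # exponential survival times

# Dichotomize: did the event occur before the cutoff time?
cutoff = np.median(t)
event_by_cutoff = (t <= cutoff).astype(int)

# Logistic regression on the binary outcome, as clinical studies often do.
model = LogisticRegression().fit(x.reshape(-1, 1), event_by_cutoff)
print("logistic coefficient for x:", model.coef_[0][0])

# The binary analysis recovers the direction of the covariate effect, but
# discards the ordering of survival times on either side of the cutoff,
# which is the information loss the paper quantifies.
```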

A marginal logit mixed-effects model for repeated binary response data

  • Choi, Jae-Sung
    • Journal of the Korean Data and Information Science Society / v.19 no.2 / pp.413-420 / 2008
  • This paper suggests a marginal logit mixed-effects model for analyzing repeated binary response data. Since binary repeated measures are obtained over time from each subject, the observations have a certain covariance structure among them. As a plausible covariance structure, a first-order autoregressive (AR(1)) correlation structure is assumed for analyzing the data. The generalized estimating equations (GEE) method is used for estimating the fixed effects in the model.
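
A minimal statsmodels sketch of the estimation strategy the abstract names: a marginal logit model for repeated binary responses fitted by GEE with a first-order autoregressive working correlation. The variable names and simulated data are assumptions for illustration only, not the paper's data or model.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Simulate repeated binary responses: 100 subjects, 5 time points each.
rng = np.random.default_rng(3)
n_subj, n_time = 100, 5
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subj), n_time),
    "time": np.tile(np.arange(n_time), n_subj),
})
df["treat"] = np.repeat(rng.integers(0, 2, n_subj), n_time)
logit = -0.5 + 0.8 * df["treat"] + 0.2 * df["time"]
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Marginal logit model, GEE estimation, AR(1) working correlation over time.
model = smf.gee("y ~ treat + time", groups="subject", data=df, time="time",
                family=sm.families.Binomial(),
                cov_struct=sm.cov_struct.Autoregressive())
result = model.fit()
print(result.summary())
```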

A Method for Analyzing Web Logs of the Hadoop System for Analyzing Effective Patterns of Web Users

  • Lee, Byungju; Kwon, Jungsook; Go, Gicheol; Choi, Yonglak
    • Journal of Information Technology Services / v.13 no.4 / pp.231-243 / 2014
  • Among the various data that corporations can access, web log data are important for the data analysis needed to implement customer relationship management strategies. As the volume of accessible data has increased exponentially due to the Internet and the popularization of smartphones, web log data have also increased greatly. As a result, it has become difficult to expand storage to process large amounts of web log data flexibly, and extremely hard to implement a system capable of categorizing, analyzing, and processing web log data accumulated over a long period of time. This study thus set out to apply Hadoop, a distributed processing system that has recently come into the spotlight for its capacity to process large volumes of data, and to propose an efficient analysis plan for large amounts of web logs. The study examined the forms of web logs obtained by effective web log collection methods and the web log levels handled by Hadoop, and proposed analysis techniques and Hadoop configuration designs accordingly. The present study resolved the difficulty of processing large amounts of web log data and derived the activity patterns of users through web log analysis, thus demonstrating its advantages as a new means of marketing.
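
One common way to implement the kind of Hadoop web log analysis described above is Hadoop Streaming with a Python mapper and reducer; the sketch below counts requests per URL from Apache combined-log-format lines. The log format, field positions, and job invocation are standard assumptions, not details taken from the paper.

```python
#!/usr/bin/env python3
"""Hadoop Streaming sketch: count requests per URL in Apache access logs.

Roughly:  hadoop jar hadoop-streaming.jar \
            -input /logs/access -output /logs/url_counts \
            -mapper "weblog_count.py map" -reducer "weblog_count.py reduce" \
            -file weblog_count.py
"""
import re
import sys

# Apache common/combined log format: extract the request path inside the quotes.
LOG_RE = re.compile(r'"(?:GET|POST|PUT|DELETE|HEAD) (\S+) HTTP/[\d.]+"')

def mapper():
    for line in sys.stdin:
        match = LOG_RE.search(line)
        if match:
            print(f"{match.group(1)}\t1")    # emit key<TAB>value for the shuffle phase

def reducer():
    current_url, count = None, 0
    for line in sys.stdin:                    # input arrives sorted by key
        url, value = line.rstrip("\n").split("\t")
        if url != current_url:
            if current_url is not None:
                print(f"{current_url}\t{count}")
            current_url, count = url, 0
        count += int(value)
    if current_url is not None:
        print(f"{current_url}\t{count}")

if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()
```

The per-URL counts (or analogous per-session keys) can then feed the user activity pattern analysis the study describes.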