• Title/Abstract/Keywords: Data Analyzing

Analyzing RDF Data in Linked Open Data Cloud using Formal Concept Analysis

  • Hwang, Suk-Hyung;Cho, Dong-Heon
    • Journal of the Korea Society of Computer and Information
    • /
    • Vol. 22, No. 6
    • /
    • pp.57-68
    • /
    • 2017
  • The Linked Open Data (LOD) cloud is quickly becoming one of the largest collections of interlinked datasets and the de facto standard for publishing, sharing, and connecting pieces of data on the Web. Data publishers from diverse domains publish their data using the Resource Description Framework (RDF) data model and provide SPARQL endpoints for querying it, which creates a global, distributed, and interconnected dataspace on the LOD cloud. Although structured data can be extracted as SPARQL query results, users have very limited means of analyzing and visualizing the RDF data in those results. To tackle this issue, we propose a novel approach, based on Formal Concept Analysis, for analyzing and visualizing useful information from the LOD cloud. The proposed RDF data analysis and visualization technique can be used in semantic web data mining to extract and analyze the information and knowledge inherent in LOD and to support classification and visualization.
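
As a rough illustration of the Formal Concept Analysis step named in this abstract (not the authors' implementation), the sketch below builds a formal context from a few made-up RDF-like triples and enumerates its formal concepts by brute force. The triples and all function names are hypothetical, and the enumeration is exponential in the number of subjects, so it only suits toy inputs.

```python
from itertools import combinations

# Hypothetical (subject, predicate, object) triples standing in for SPARQL results.
TRIPLES = [
    ("alice", "type", "Person"),
    ("alice", "worksAt", "UniA"),
    ("bob", "type", "Person"),
    ("bob", "worksAt", "UniA"),
    ("carol", "type", "Person"),
]


def build_context(triples):
    """Map each subject to its set of 'predicate=object' attributes."""
    context = {}
    for s, p, o in triples:
        context.setdefault(s, set()).add(f"{p}={o}")
    return context


def common_attributes(context, objects):
    """Attributes shared by all given subjects (every attribute if none given)."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def objects_having(context, attrs):
    """Subjects that carry every attribute in attrs."""
    return {obj for obj, a in context.items() if attrs <= a}


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every subset of subjects."""
    concepts = set()
    objs = sorted(context)
    for r in range(len(objs) + 1):
        for subset in combinations(objs, r):
            intent = common_attributes(context, subset)
            extent = objects_having(context, intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts
```

For the toy triples, `formal_concepts(build_context(TRIPLES))` yields two concepts: all three subjects sharing `type=Person`, and `alice` and `bob` sharing both attributes.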

Automation technology for analyzing 3D point cloud data of construction sites

  • Park, Suyeul;Kim, Younggun;Choi, Yungjun;Kim, Seok
    • International Conference Proceedings
    • /
    • The 9th International Conference on Construction Engineering and Project Management
    • /
    • pp.1100-1105
    • /
    • 2022
  • Denoising, registering, and detecting changes in 3D digital maps are generally performed by skilled technicians, which leads to inefficiency and the intervention of individual judgment. Manual post-processing of 3D point cloud data from construction sites takes a long time and considerable resources. This study develops automation technology for analyzing 3D point cloud data of construction sites. Scanned data are automatically denoised, and the denoised data are stored in a designated storage location. The stored data set is automatically registered once the data set to be registered is prepared. In addition, regions with non-homogeneous densities are converted into homogeneous data. A change detection function is developed to automatically analyze the degree of terrain change that occurred between time-series data.

A System for Analyzing Data Transmission Time in Ubiquitous Sensor Network

  • 정기원;김재철;김주일;이우진
    • The Journal of Society for e-Business Studies (한국전자거래학회지)
    • /
    • Vol. 13, No. 2
    • /
    • pp.149-163
    • /
    • 2008
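  • In sensor networks, real-time data processing is one of the key requirements. The data sensed by each node must be delivered within a set time so that appropriate processing can take place when it is needed. It is therefore very important to check whether nodes are actually delivering data within the set time. Accordingly, this paper presents an implementation case of a data transmission time analysis system that monitors data transmission times to confirm whether the nodes in a sensor network are transmitting data to the server within the allowed time range. To this end, we present a procedure for analyzing data transmission time and, following that procedure, the methods needed for the analysis: a time-difference analysis method, a method for collecting data send and receive times, and a method for calculating data transmission time. Based on these methods, we implement a system for monitoring and analyzing data transmission time and show the results of a case study.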
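
A minimal sketch of the kind of check this abstract describes, under stated assumptions: the abstract does not give the paper's formulas, so the calculation here (shift the node's send time onto the server clock using a per-node clock offset, then subtract it from the server's receive time) is an assumed stand-in. The `Record` type, the function names, and the millisecond units are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    node_id: str
    sent_ms: int      # send timestamp on the node's clock
    received_ms: int  # receive timestamp on the server's clock


def transmission_time_ms(record, clock_offset_ms):
    """Assumed formula: receive time minus send time shifted to the server clock.

    clock_offset_ms is (server clock - node clock) for the sending node.
    """
    return record.received_ms - (record.sent_ms + clock_offset_ms)


def late_nodes(records, offsets_ms, limit_ms):
    """Return sorted IDs of nodes with at least one transmission over limit_ms."""
    late = set()
    for r in records:
        if transmission_time_ms(r, offsets_ms[r.node_id]) > limit_ms:
            late.add(r.node_id)
    return sorted(late)
```

With a 50 ms limit, a node whose corrected transmission time is 30 ms passes, while one at 80 ms is reported as late.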

Design of a Platform for Collecting and Analyzing Agricultural Big Data

  • 뉘엔 반 퀴엣;뉘엔 신 녹;김경백
    • Journal of Digital Contents Society
    • /
    • Vol. 18, No. 1
    • /
    • pp.149-158
    • /
    • 2017
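  • Big data has presented interesting opportunities and challenges for economic development. In agriculture, for example, combining complex data such as weather and soil data and analyzing the results provides valuable and helpful information to farmers and agricultural businesses. However, agricultural data are generated at large scale every minute by various types of devices and services, such as sensors and agricultural web markets. This raises big data issues such as data collection, storage, and analysis. Although several systems have been proposed to address these issues, they still have limitations on the types of data handled, the storage methods, and the data size. In this paper, we propose a new design of a platform for collecting and analyzing agricultural data. The proposed platform provides (1) a method for collecting data from various data sources using Flume and MapReduce, (2) various data storage methods using HDFS, HBase, and Hive, and (3) big data analysis modules using Spark and Hadoop.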

Mixed-effects LS-SVR for longitudinal data

  • Cho, Dae-Hyeon
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 21, No. 2
    • /
    • pp.363-369
    • /
    • 2010
  • In this paper, we propose a mixed-effects least squares support vector regression (LS-SVR) for longitudinal data. We add a random-effect term to the optimization function of LS-SVR so that random effects are incorporated when analyzing longitudinal data. We also present a model selection method that uses the generalized cross-validation function to choose the hyper-parameters that affect the performance of the mixed-effects LS-SVR. A simulated example is provided to indicate the usefulness of the mixed-effects method for analyzing longitudinal data.

Applying Decision Tree Algorithms for Analyzing HS-VOSTS Questionnaire Results

  • Kang, Dae-Ki
    • Journal of Engineering Education Research
    • /
    • Vol. 15, No. 4
    • /
    • pp.41-47
    • /
    • 2012
  • Data mining and knowledge discovery techniques have been shown to be effective in automatically finding hidden rules in large databases. Meanwhile, analyzing, assessing, and applying students' survey data is very important in science and engineering education for reasons such as quality improvement, the engineering design process, and innovative education. Among such surveys, analyzing students' views on science-technology-society can be helpful to engineering education: although most research on the philosophy of science has shown that science is one of the most difficult concepts to define precisely, it is still important to keep an eye on science, pseudo-science, and scientific misconduct. In this paper, we report the experimental results of applying decision tree induction algorithms to analyze the questionnaire results of high school students' views on science-technology-society (HS-VOSTS). Empirical results under various settings of decision tree induction on HS-VOSTS results from students at one South Korean university indicate that decision tree induction algorithms can be successfully and effectively applied to automated knowledge discovery from students' survey data.

Area Usage Factor Analyzing Method for Semi-conductor Manufacturing Process

  • Konishi, Katunobu;Ukida, Hiroyuki;Sawada, Koutarou
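    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • Institute of Control, Robotics and Systems, Proceedings of the 13th Conference, 1998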
    • /
    • pp.480-483
    • /
    • 1998
  • For memory products, it is very important to develop a new production line as quickly as possible. All products are inspected at the final testing stage to remove defective ones. These inspection data are called FCM. In this paper, an Area Usage Factor (AUF) analysis method based on the FCM data is proposed. It helps process engineers decide where to concentrate their analysis effort.

Analyzing Survival Data as Binary Outcomes with Logistic Regression

  • Lim, Jo-Han;Lee, Kyeong-Eun;Hahn, Kyu-S.;Park, Kun-Woo
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 17, No. 1
    • /
    • pp.117-126
    • /
    • 2010
  • Clinical researchers often analyze survival data as binary outcomes using the logistic regression method. This paper examines the information loss that results from analyzing survival time as binary outcomes. First, we demonstrate that, under the proportional hazard assumption, this binary discretization does result in a significant information loss. Second, when fitting a logistic model to survival time data, researchers inadvertently use the maximal statistic, and we carry out a numerical study to examine the properties of the reference distribution for this statistic. Finally, we show that the logistic regression method can still be a useful tool for analyzing survival data, particularly when the proportional hazard assumption is questionable.

A marginal logit mixed-effects model for repeated binary response data

  • Choi, Jae-Sung
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 19, No. 2
    • /
    • pp.413-420
    • /
    • 2008
  • This paper suggests a marginal logit mixed-effects model for analyzing repeated binary response data. Since binary repeated measures are obtained over time from each subject, the observations will have a certain covariance structure among them. As a plausible covariance structure, a first-order autoregressive correlation structure is assumed for analyzing the data. The generalized estimating equations (GEE) method is used to estimate the fixed effects in the model.

A Method for Analyzing Web Log of the Hadoop System for Analyzing a Effective Pattern of Web Users

  • 이병주;권정숙;고기철;최용락
    • Journal of the Korea Society of IT Services
    • /
    • Vol. 13, No. 4
    • /
    • pp.231-243
    • /
    • 2014
  • Of the various data that corporations can access, web log data are important for the data analysis needed to implement customer relationship management strategies. As the volume of accessible data has increased exponentially due to the Internet and the popularization of smartphones, web log data have also grown considerably. As a result, it has become difficult to expand storage flexibly to process large amounts of web log data, and extremely hard to implement a system capable of categorizing, analyzing, and processing web log data accumulated over a long period of time. This study therefore applies Hadoop, a distributed processing system that has recently come into the spotlight for its capacity to process large volumes of data, and proposes an efficient analysis plan for large amounts of web logs. The study examined the forms of web logs by effective collection method and the web log levels using Hadoop, and proposed analysis techniques and Hadoop organization designs accordingly. It resolved the difficulty of processing large amounts of web log data and presented the activity patterns of users through web log analysis, demonstrating its advantages as a new means of marketing.
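
To make the map/shuffle/reduce style of web log analysis concrete, here is a small local sketch in plain Python. It does not use Hadoop and is not the study's design; it only imitates the three phases in memory. The log layout (a Common Log Format style line), the choice to count only status-200 requests, and all function names are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed line layout: host ident user [timestamp] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) \S+'
)


def map_phase(line):
    """Emit ((client, path), 1) for each successful request; skip other lines."""
    m = LOG_PATTERN.match(line)
    if m and m.group(4) == "200":
        yield (m.group(1), m.group(3)), 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one (client, path) key."""
    return key, sum(values)


def count_page_views(lines):
    """Run map, shuffle, and reduce over an iterable of log lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

The result maps each (client, path) pair to its number of successful requests, which is one simple input for the kind of per-user activity pattern the abstract mentions.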