• Title/Summary/Keyword: Apache


Design an Indexing Structure System Based on Apache Hadoop in Wireless Sensor Network

  • Keo, Kongkea;Chung, Yeongjee
    • Proceedings of the Korea Information Processing Society Conference / 2013.05a / pp.45-48 / 2013
  • In this paper, we propose an Indexing Structure System (ISS) based on Apache Hadoop for Wireless Sensor Networks (WSNs). Sensor data keep growing and need to be managed, and they are updated constantly to give users the newest information. As the data grow, retrieving and storing them becomes challenging. Using the ISS, we can maximize processing quality and minimize data retrieval time. To design the ISS, indexing types have to be defined for each sensor type. Once identified, each sensor's data go through Indexing Structure Processing (ISP) to be indexed. After ISP, the indexed data are streamed to and stored in the Hadoop Distributed File System (HDFS) across a number of separate machines. The indexed data are split and processed by MapReduce tasks, then sorted and grouped by sensor data object category. Thus, when users send requests, the queries are filtered by sensor data object and the tasks are managed by the MapReduce processing framework.
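The abstract does not give the ISS implementation, so the following is only a minimal in-memory sketch of the general idea: map each reading to an index key, group by key, sort within each group, and answer a query by reading one group. It assumes the index key is simply the sensor type; the record layout and all names (`READINGS`, `build_index`, `query`) are hypothetical, and no Hadoop API is used.

```python
from collections import defaultdict

# Hypothetical sensor readings: (sensor_type, sensor_id, timestamp, value).
READINGS = [
    ("temperature", "t1", 3, 21.5),
    ("humidity", "h1", 1, 40.0),
    ("temperature", "t2", 1, 19.0),
    ("humidity", "h1", 2, 42.0),
]


def map_phase(reading):
    """Emit (index_key, record); here the index key is assumed to be the sensor type."""
    sensor_type, sensor_id, timestamp, value = reading
    return sensor_type, (timestamp, sensor_id, value)


def shuffle(pairs):
    """Group mapped records by key, as the shuffle step of a MapReduce job would."""
    groups = defaultdict(list)
    for key, record in pairs:
        groups[key].append(record)
    return groups


def reduce_phase(groups):
    """Sort each category's records (by timestamp first, since it leads the tuple)."""
    return {key: sorted(records) for key, records in groups.items()}


def build_index(readings):
    """Run map, shuffle, and reduce over all readings."""
    return reduce_phase(shuffle(map_phase(r) for r in readings))


def query(index, sensor_type):
    """Answer a request by reading only the matching category."""
    return index.get(sensor_type, [])


index = build_index(READINGS)
```

Calling `query(index, "temperature")` then touches only the temperature group rather than scanning every reading, which is the retrieval-time benefit the abstract describes.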

A Survey on the Performance Comparison of Map Reduce Technologies and the Architectural Improvement of Spark

  • Raghavendra, GS;Manasa, Bezwada;Vasavi, M.
    • International Journal of Computer Science & Network Security / v.22 no.5 / pp.121-126 / 2022
  • Hadoop and Apache Spark are open source projects of the Apache Software Foundation, and both are leading big data analytics tools. Hadoop has led the big data industry for five years. Spark's processing speed can differ significantly, being up to 100 times faster. However, the amount of data handled varies: Hadoop MapReduce can process data sets that are far larger than Spark can. This article compares the performance of Spark and MapReduce and discusses the advantages and disadvantages of both technologies.

The Accuracy of Prediction Models in Burn Patients (화상환자에서 사망예측모델의 성능 평가에 관한 연구)

  • Woo, Jaeyeon;Kym, Dohern
    • Journal of the Korean Burn Society / v.24 no.1 / pp.1-6 / 2021
  • Purpose: The purpose of this study was to evaluate the accuracy of four prediction models in adult burn patients. Methods: This retrospective study was conducted on 696 adult burn patients who were treated at the burn intensive care unit (BICU) of Hallym University Hangang Sacred Heart Hospital from January 2017 to December 2019. The models were ABSI, APACHE IV, rBaux, and the Hangang score. Results: The discrimination of each prediction model was assessed by the area under the ROC curve (AUC). The AUC was highest for the Hangang score at 0.931 (0.908~0.954), followed by rBaux 0.896 (0.867~0.924), ABSI 0.883 (0.853~0.913), and APACHE IV 0.851 (0.818~0.884). Conclusion: Of the four models evaluated, the Hangang score showed the highest predictive accuracy. However, an appropriate prediction model should be chosen according to the characteristics of the burn center.
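For readers unfamiliar with the metric, the AUC used to compare these models can be read as the probability that a randomly chosen non-survivor receives a higher score than a randomly chosen survivor. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the example scores are illustrative assumptions, not data from the study, and the confidence intervals reported in the abstract are not reproduced.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the event (e.g. death), 0 otherwise. Ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up scores for four patients; two of them had the event.
example_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of patients, which is fine for a cohort of a few hundred; a rank-based formula gives the same value faster on larger data.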

Design on the IoT Sensor Data Collection Environment using Lambda Architecture (Lambda 구조를 적용한 IoT 센서 데이터 수집 환경 설계)

  • Hwang, Yun-Young;Kim, Soo-Hyun;Shin, Yong-Tae
    • Proceedings of the Korean Society of Computer Information Conference / 2020.07a / pp.547-548 / 2020
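  • The volume of data has grown greatly with advances in technology. Hadoop is a representative big data processing platform and is also used in the IoT field. HDFS (Hadoop Distributed File System) is a core Hadoop project and a block-based store for large volumes of data. Existing Hadoop-based IoT sensor data collection environments use HDFS. However, their performance and usability are limited by two issues: NameNode overload caused by small files in HDFS, and Hadoop's lack of support for updating or deleting data once it has been imported. In this paper, we design an IoT sensor data collection environment that applies the Lambda architecture to overcome these shortcomings of existing Hadoop-based IoT sensor data collection environments.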
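The abstract does not describe the authors' design in detail, so the following is only a generic, in-memory sketch of the Lambda pattern it names: an append-only master dataset, a batch view recomputed from it, a speed view covering readings that arrived since the last batch run, and a query that merges the two. The class `LambdaStore` and its methods are hypothetical names, and per-sensor mean is an assumed example query.

```python
from collections import defaultdict


class LambdaStore:
    """Toy Lambda architecture: batch view + speed view, merged at query time."""

    def __init__(self):
        self.master = []  # append-only master dataset (never updated in place)
        self.batch_view = {}  # sensor_id -> [sum, count], rebuilt by run_batch()
        self.speed_view = defaultdict(lambda: [0.0, 0])  # readings since last batch

    def ingest(self, sensor_id, value):
        """Append to the master dataset and update the real-time view."""
        self.master.append((sensor_id, value))
        acc = self.speed_view[sensor_id]
        acc[0] += value
        acc[1] += 1

    def run_batch(self):
        """Recompute the batch view from scratch, then reset the speed view."""
        view = defaultdict(lambda: [0.0, 0])
        for sensor_id, value in self.master:
            view[sensor_id][0] += value
            view[sensor_id][1] += 1
        self.batch_view = dict(view)
        self.speed_view.clear()

    def mean(self, sensor_id):
        """Serving layer: merge batch and speed views to answer a query."""
        b_sum, b_n = self.batch_view.get(sensor_id, (0.0, 0))
        s_sum, s_n = self.speed_view.get(sensor_id, (0.0, 0))
        n = b_n + s_n
        return (b_sum + s_sum) / n if n else None


store = LambdaStore()
store.ingest("s1", 10.0)
store.ingest("s1", 20.0)
store.run_batch()
store.ingest("s1", 30.0)  # arrives after the batch run; served from the speed view
```

Because the master dataset is only ever appended to and views are rebuilt from it, the pattern sidesteps the need for in-place update or delete, which is the HDFS limitation the abstract points to.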


The Usefulness of B-type Natriuretic Peptide test in Critically Ill, Noncardiac Patients (심질환 병력이 없었던 중환자에서 B-type Natriuretic Peptide 검사의 유용성)

  • Kim, Kang Ho;Park, Hong-Hoon;Kim, Esther;Cheon, Seok-Cheol;Lee, Ji Hyun;Lee, Stephen YongGu;Lee, Ji-Hyun;Kim, In Jai;Cha, Dong-Hoon;Kim, Sehyun;Choi, Jeongeun;Hong, Sang-Bum
    • Tuberculosis and Respiratory Diseases / v.54 no.3 / pp.311-319 / 2003
  • Background: Previous studies have suggested that a B-type natriuretic peptide (BNP) test can provide important information for diagnosing heart failure, as well as for predicting its severity and prognosis. Myocardial dysfunction is often observed in critically ill noncardiac patients admitted to the intensive care unit (ICU), and its prognosis needs to be determined. This study evaluated how well BNP predicts the prognosis of critically ill noncardiac patients. Methods: 32 ICU patients who were hospitalized from June to October 2002 and in whom the BNP test was performed were enrolled in this study. The exclusion criteria were conditions that could increase BNP levels irrespective of severity, such as congestive heart failure, atrial fibrillation, ischemic heart disease, and renal insufficiency. A Triage B-Type Natriuretic Peptide test with an RIA kit was used for the fluorescence immunoassay of BNP. In addition, the Acute Physiology and Chronic Health Evaluation (APACHE) II score and mortality were recorded. Results: 16 males and 16 females were enrolled in this study. The mean age was 59 years. The mean BNP levels of the ICU patients and the controls were significantly different ($186.7{\pm}274.1$ pg/mL vs. $19.9{\pm}21.3$ pg/mL, p=0.033). Among the ICU patients, 14 (44%) had BNP levels above 100 pg/mL. The APACHE II score was $16.5{\pm}7.6$. In addition, 11 deaths were reported. The correlations between BNP and the APACHE II score and between BNP and mortality were significant (r=0.443, p=0.011 and r=0.530, p=0.002). The mean BNP levels of the patients who died and those who survived were significantly different ($384.1{\pm}401.7$ pg/mL vs. $83.2{\pm}55.8$ pg/mL, p=0.033). However, the $PaO_2/FiO_2$ ratio did not significantly correlate with the BNP level. Conclusion: The BNP level was elevated in critically ill, noncardiac patients. The BNP level could be a useful, noninvasive tool for predicting the prognosis of critically ill, noncardiac patients.

Is it Meaningful to Use the Serum Cholinesterase Level as a Predictive Value in Acute Organophosphate Poisoning? (혈청 콜린에스테라제 활성도를 이용하여 유기인계 농약 음독 환자의 증증도를 예측할 수 있는가?)

  • Lee, Sang-Jin;Jung, Jin-Hee;Jung, Koo-Young
    • Journal of The Korean Society of Clinical Toxicology / v.2 no.2 / pp.72-76 / 2004
  • Purpose: In managing patients with organophosphate poisoning, the cholinesterase level has been used as a diagnostic and prognostic value. However, it is controversial whether the cholinesterase level is significantly related to the severity or prognosis of acute organophosphate poisoning. We evaluated the correlation between the initial serum cholinesterase level and the APACHE II score as an index of severity, and we assessed the value of cholinesterase levels for predicting weaning from mechanical ventilation. Method: From August 1996 to March 2003, 23 patients with organophosphate poisoning who needed ventilatory care were enrolled. The serum cholinesterase level, APACHE II score, and duration of ventilatory care were reviewed retrospectively. The measured serum cholinesterase as a percentage of the median normal value was used to standardize cholinesterase levels from different laboratories. Result: There were tendencies for a lower initial serum cholinesterase level to be associated with a higher APACHE II score (r=0.297) and a longer duration of mechanical ventilation (r=-0.204), but they were not significant (p=0.264 and p=0.351, respectively). In the 9 patients whose serum cholinesterase level was checked at the time of weaning, the mean measured cholinesterase level was $10.3\pm7.60\%$ of the normal value. Conclusion: There was no significant relationship between the initial serum cholinesterase level and severity or duration of mechanical ventilation. The general health status of the patient, the amount ingested, and the toxicity of the agent should be considered important factors in the severity of poisoning. The decision to wean should be based not solely on the cholinesterase level but also on the general and respiratory state of the individual patient.
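The two calculations described in the Method and Result sections, expressing a measured level as a percentage of the laboratory's median normal value and correlating two variables, can be sketched as follows. The function names and the numbers in the usage lines are illustrative assumptions, not data from the study, and the abstract does not say which correlation coefficient was used; Pearson's r is assumed here.

```python
from math import sqrt


def percent_of_normal(measured, median_normal):
    """Standardize a cholinesterase level against a lab's median normal value."""
    return 100.0 * measured / median_normal


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up values: a level of 500 against a median normal of 5000 is 10% of normal.
example_pct = percent_of_normal(500, 5000)
# Made-up, perfectly inverse relationship, so r is -1.
example_r = pearson_r([1, 2, 3], [6, 4, 2])
```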


Item Recommendation Technique Using Spark (Spark를 이용한 항목 추천 기법에 관한 연구)

  • Yun, So-Young;Youn, Sung-Dae
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.5 / pp.715-721 / 2018
  • With the spread of mobile devices, the number of users of social network services and e-commerce sites has increased dramatically, and the amount of data they produce has grown exponentially. E-commerce companies face the task of extracting useful information from this vast amount of user-generated data. To solve this problem, various studies have applied big data processing techniques. In this paper, we propose a collaborative filtering method that applies tag weights on the Apache Spark platform. To improve recommendation accuracy, the proposed method refines the tag data during preprocessing, categorizes the items, and then applies period information and tag weights to the estimated ratings of the items. After generating RDDs, we calculate item similarities and prediction values and recommend items to users. The experimental results indicate that the proposed method processes large amounts of data quickly and improves the appropriateness of its recommendations.
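The "item similarity and prediction values" step can be illustrated with a plain item-based collaborative filtering sketch. This is not the authors' method: the abstract does not give the tag-weight or period formulas, so they are left out, cosine similarity over co-rating users is an assumed choice, and the code runs in memory rather than on Spark RDDs. `RATINGS` and all function names are hypothetical.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data.
RATINGS = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob": {"A": 4.0, "B": 2.0},
    "carol": {"A": 1.0, "B": 5.0, "C": 2.0},
}


def item_vector(ratings, item):
    """Ratings for one item, keyed by the users who rated it."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine(ratings, item_i, item_j):
    """Cosine similarity between two items over the users who rated both."""
    vi, vj = item_vector(ratings, item_i), item_vector(ratings, item_j)
    common = vi.keys() & vj.keys()
    if not common:
        return 0.0
    dot = sum(vi[u] * vj[u] for u in common)
    norm_i = sqrt(sum(vi[u] ** 2 for u in common))
    norm_j = sqrt(sum(vj[u] ** 2 for u in common))
    return dot / (norm_i * norm_j) if norm_i and norm_j else 0.0


def predict(ratings, user, item):
    """Similarity-weighted average of the user's ratings on other items."""
    num = den = 0.0
    for other, rating in ratings[user].items():
        if other == item:
            continue
        sim = cosine(ratings, item, other)
        num += sim * rating
        den += abs(sim)
    return num / den if den else None


bob_c = predict(RATINGS, "bob", "C")  # bob has not rated item C
```

A tag or period weight, if one were defined, would most naturally enter as an extra multiplier on each rating or similarity inside `predict`; that placement is a guess, not something the abstract states.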

Processing large-scale data with Apache Spark (Apache Spark를 활용한 대용량 데이터의 처리)

  • Ko, Seyoon;Won, Joong-Ho
    • The Korean Journal of Applied Statistics / v.29 no.6 / pp.1077-1094 / 2016
  • Apache Spark is a fast, general-purpose cluster computing package. It provides a new abstraction named the resilient distributed dataset (RDD), which supports fault tolerance while keeping data in memory. This abstraction yields a significant speedup over MapReduce, the legacy large-scale data framework. In particular, the Spark framework is well suited to iterative machine learning applications, such as logistic regression and K-means clustering, and to interactive data querying. Thanks to its versatility, Spark also supports high-level libraries for applications such as machine learning, streaming data processing, database querying, and graph data mining. In this work, we introduce the concept and programming model of Spark and show implementations of some simple statistical computing applications. We also review the machine learning package MLlib and the R language interface SparkR.

A Study of Real-Time Video Streaming Data Service on the Linux Server (리눅스 서버를 이용한 동영상 데이터 실시간 스트리밍 서비스 연구)

  • Jang, Seung-Ju;Heo, Won-Yeong;Yoo, Hyun-Min;Lee, Chang-Hoon;Shin, Woo-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.4 / pp.893-901 / 2013
  • This paper suggests a method for developing a live media streaming service that uses a Linux server system in an Android environment. The Android application built for the experiment can record media while sending it to the Linux server. The generated real-time media data are sent to the Linux server through the Multipart Request class of an Apache Tomcat server running on the Linux system. In addition, using Android's video player and media player classes, we developed an Android application structure with methods for playing live media stream data from the video server, or for playing it while saving the stream data in a cache. The structure and function of the suggested system and application are confirmed by a series of experiments.

Analyzing Machine Learning Techniques for Fault Prediction Using Web Applications

  • Malhotra, Ruchika;Sharma, Anjali
    • Journal of Information Processing Systems / v.14 no.3 / pp.751-770 / 2018
  • Web applications are indispensable in the software industry and continuously evolve, either to meet newer criteria or to include new functionality. However, despite quality assurance through testing, the presence of defects hinders straightforward development. Several factors contribute to defects, and minimizing them often comes at a high cost in man-hours. Thus, detecting fault proneness in the early phases of software development is important, and a fault prediction model for identifying fault-prone classes in a web application is highly desirable. In this work, we compare 14 machine learning techniques to analyse the relationship between object-oriented metrics and fault prediction in web applications. The study is carried out using various releases of the Apache Click and Apache Rave datasets. En route to the predictive analysis, the input basis set for each release is first optimized using the filter-based correlation feature selection (CFS) method. The LCOM3, WMC, NPM, and DAM metrics are found to be the most significant predictors. The statistical analysis of these metrics also conforms well with the CFS evaluation and affirms their role in the defect prediction of web applications. The overall predictive ability of the different fault prediction models is first ranked using the Friedman technique and then statistically compared using Nemenyi post-hoc analysis. The results not only uphold the predictive capability of machine learning models for faulty classes in web applications, but also show that ensemble algorithms are the most appropriate for defect prediction in Apache datasets. Further, we derive a consensus between the metrics selected by the CFS technique and the statistical analysis of the datasets.
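The Friedman ranking step mentioned above can be sketched as follows: rank the models within each dataset, average the ranks per model, and compute the Friedman chi-square statistic from the mean ranks. The score table in the usage lines is made up (rows are datasets, columns are models), the function names are hypothetical, and the tie correction to the statistic and the Nemenyi post-hoc comparison are not included.

```python
def average_ranks(row, higher_is_better=True):
    """Rank one dataset's scores (1 = best); tied scores share the mean rank."""
    order = sorted(range(len(row)), key=lambda j: row[j], reverse=higher_is_better)
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one.
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def friedman(scores):
    """Return (mean rank per model, Friedman chi-square without tie correction)."""
    n = len(scores)  # number of datasets
    k = len(scores[0])  # number of models
    rank_rows = [average_ranks(row) for row in scores]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return mean_ranks, chi2


# Made-up accuracy-like scores for 3 models on 3 datasets.
mean_ranks, chi2 = friedman([
    [0.90, 0.80, 0.70],
    [0.85, 0.80, 0.60],
    [0.95, 0.70, 0.65],
])
```

A lower mean rank indicates a model that performs better across datasets; the statistic is then compared against a chi-square distribution with k - 1 degrees of freedom to decide whether the models differ at all before any pairwise post-hoc test.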