• Title/Summary/Keyword: Data-intensive processing

Explainable Machine Learning Based a Packed Red Blood Cell Transfusion Prediction and Evaluation for Major Internal Medical Condition

  • Lee, Seongbin;Lee, Seunghee;Chang, Duhyeuk;Song, Mi-Hwa;Kim, Jong-Yeup;Lee, Suehyun
    • Journal of Information Processing Systems / v.18 no.3 / pp.302-310 / 2022
  • Efficient use of limited blood products is becoming increasingly important from both a socioeconomic and a patient-recovery standpoint. To predict the appropriateness of patient-specific transfusions for intensive care unit (ICU) patients who require real-time monitoring, we evaluated a model that dynamically predicts the possibility of transfusion using the Medical Information Mart for Intensive Care III (MIMIC-III), an ICU admission record at Harvard Medical School. In this study, we developed an explainable machine learning model to predict the possibility of red blood cell transfusion for major medical diseases in the ICU. Target disease groups that received packed red blood cell transfusions at high frequency were selected, and 16,222 patients were finally extracted. The prediction model (LightGBM) achieved an area under the ROC curve of 0.9070 and an F1-score of 0.8166. To explain the performance of the machine learning model, feature importance analysis and a partial dependence plot were used. The results of our study can be used as basic data for recommendations on the adequacy of blood transfusions and are expected to ultimately contribute to patient recovery and to preventing excessive consumption of blood products.
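For readers unfamiliar with the two metrics reported in this abstract, the following is a minimal sketch of how an F1-score and an area under the ROC curve can be computed from labels and predicted scores. It is not the authors' evaluation code; the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions chosen for illustration.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = transfusion given, scores = predicted probability.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

The pairwise form of the AUC is quadratic in the number of samples, which is acceptable for a toy example but would be replaced by a rank-based computation on a cohort of thousands of patients.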

Real-time Processing of Manufacturing Facility Data based on Big Data for Smart-Factory (스마트팩토리를 위한 빅데이터 기반 실시간 제조설비 데이터 처리)

  • Hwang, Seung-Yeon;Shin, Dong-Jin;Kwak, Kwang-Jin;Kim, Jeong-Joon;Park, Jeong-Min
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.5 / pp.219-227 / 2019
  • Manufacturing methods have changed from labor-intensive methods to technology-intensive methods centered on manufacturing facilities. As manufacturing facilities replace human labor, the importance of monitoring and managing those facilities is emphasized. In addition, Big Data technology has recently emerged as an important technology for discovering new value from limited data. These changes in the manufacturing industries have increased the need for smart factories that combine IoT, information and communication technologies, sensor data, and big data. In this paper, we present strategies for existing domestic manufacturing factories to become big-data-based smart factories, using technologies for real-time distributed storage and processing of manufacturing facility data in MongoDB and visualization using R programming.

An Adaptive Workflow Scheduling Scheme Based on an Estimated Data Processing Rate for Next Generation Sequencing in Cloud Computing

  • Kim, Byungsang;Youn, Chan-Hyun;Park, Yong-Sung;Lee, Yonggyu;Choi, Wan
    • Journal of Information Processing Systems / v.8 no.4 / pp.555-566 / 2012
  • The cloud environment makes it possible to analyze large data sets in a scalable computing infrastructure. In the bioinformatics field, applications are composed of complex workflow tasks, which require huge data storage as well as a computing-intensive parallel workload. Many approaches have been introduced in distributed solutions. However, they focus on static resource provisioning with a batch-processing scheme in a local computing farm and data storage. For a large-scale workflow system, it is inevitable and valuable to outsource all or part of its tasks to public clouds to reduce resource costs. However, problems arise from the transfer time for huge datasets and from unbalanced completion times across different problem sizes. In this paper, we propose an adaptive resource-provisioning scheme that includes run-time data distribution and collection services for hiding the data transfer time. The proposed scheme optimizes the allocation ratio of computing elements to the different datasets in order to minimize the total makespan under resource constraints. We conducted experiments with a well-known sequence alignment algorithm, and the results showed that the proposed scheme is efficient for the cloud environment.
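As a rough illustration of the allocation idea described in this abstract (and not the authors' algorithm), the sketch below splits a fixed pool of computing elements across datasets in proportion to each dataset's estimated work, so that completion times are roughly balanced. The function names, the work estimates, the ideal-speedup assumption, and the largest-remainder rounding are all assumptions made for the example.

```python
def allocate_elements(work, total_elements):
    """Split total_elements across datasets in proportion to estimated work.

    work: dict mapping dataset name -> estimated work (hypothetical units).
    Every dataset gets at least one element; the remaining elements are
    shared proportionally, with largest-remainder rounding.
    """
    if total_elements < len(work):
        raise ValueError("need at least one computing element per dataset")
    total_work = sum(work.values())
    spare = total_elements - len(work)  # elements left after one per dataset
    shares = {name: spare * w / total_work for name, w in work.items()}
    alloc = {name: 1 + int(share) for name, share in shares.items()}
    leftover = total_elements - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - int(shares[n]), reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


def makespan(work, alloc):
    """Completion time of the slowest dataset, assuming ideal parallel speedup."""
    return max(work[name] / alloc[name] for name in work)


work = {"small": 10.0, "medium": 30.0, "large": 60.0}  # hypothetical estimates
alloc = allocate_elements(work, 10)
total_time = makespan(work, alloc)
```

A scheme like the one in the abstract would re-estimate the processing rates at run time and adjust the ratio accordingly; this sketch performs the split only once.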

Self-healing Method for Data Aggregation Tree in Wireless Sensor Networks (무선센서네트워크에서 데이터 병합 트리를 위한 자기치유 방법)

  • Le, Duc Tai;Duc, Thang Le;Yeom, Sanggil;Zalyubovskiy, Vyacheslav V.;Choo, Hyunseung
    • Proceedings of the Korea Information Processing Society Conference / 2015.04a / pp.212-213 / 2015
  • Data aggregation is a fundamental problem in wireless sensor networks that has attracted great attention in recent years. In constructing a robust algorithm for minimizing data aggregation delay in wireless sensor networks, we consider sensors with limited transmission range and approximate the minimum-delay data aggregation tree, which can only be built in networks of sensors with unlimited transmission range. The paper proposes an adaptive method that can be applied to maintain the network structure when a sensor node fails. The data aggregation tree built by the proposed scheme is therefore self-healing and robust. Intensive simulations are carried out, and the results show that the scheme adapts well to network topology changes compared with other approaches.

An FPGA Implementation of Parallel Hardware Architecture for the Real-time Window-based Image Processing (실시간 윈도우 기반 영상 처리를 위한 병렬 하드웨어 구조의 FPGA 구현)

  • Jin S.H.;Cho J.U.;Kwon K.H.;Jeon J.W.
    • The KIPS Transactions:PartB / v.13B no.3 s.106 / pp.223-230 / 2006
  • Window-based image processing is an elementary part of the image processing area. Because window-based image processing is both computationally intensive and data intensive, it is hard to perform all of its operations in real time using a software program on general-purpose computers. This paper proposes a parallel hardware architecture that can perform window-based image processing in real time using an FPGA (Field Programmable Gate Array). A dynamic threshold circuit and a local histogram equalization circuit of the proposed architecture are designed using VHDL (VHSIC Hardware Description Language) and implemented with an FPGA. The performance of both implementations is measured.
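To show why window-based operations are data intensive, here is a minimal software sketch of one such operation: each output pixel is decided by reading its whole neighborhood. It uses a local-mean threshold as a stand-in, which is an assumption; the abstract does not specify how its dynamic threshold circuit computes the threshold, and the function name and test image are hypothetical.

```python
def local_mean_threshold(image, radius=1):
    """Binarize each pixel against the mean of its (2*radius+1)^2 window.

    image: list of equal-length rows of integer gray levels.
    Windows are clipped at the image border.
    """
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # pixel > mean(window), written without division to stay in integers
            output[y][x] = 1 if image[y][x] * len(window) > sum(window) else 0
    return output


image = [
    [10, 10, 10],
    [10, 90, 10],
    [10, 10, 10],
]
binary = local_mean_threshold(image)
```

With a 3x3 window, every output pixel needs up to nine reads, and neighboring windows overlap heavily. A hardware design can exploit that overlap by buffering rows and evaluating windows in parallel, which this sequential loop does not attempt.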

Analysing Productivity in Vietnamese Seafood Processing Firms: A Control Function Approach

  • NGUYEN, Van;TRAN, Thuan Duc;MAI, Thanh Khac
    • The Journal of Asian Finance, Economics and Business / v.8 no.2 / pp.411-417 / 2021
  • This study aims to estimate the production function and total factor productivity (TFP) of Vietnamese seafood processing firms. At the same time, the study analyses the impact of firms' internal factors and the quality of economic institutions on the TFP of the Vietnamese seafood processing industry. The study uses the Control Function (CF) approach for TFP estimation and the Feasible Generalized Least Squares (FGLS) regression model to analyse the factors affecting TFP. The study was carried out on enterprise census data for the Vietnamese seafood processing industry collected by the Vietnamese General Statistics Office, and on Provincial Competitiveness Index data from the Vietnam Chamber of Commerce and Industry, for the period from 2013 to 2018. Estimated results from the models show that: i) Vietnamese seafood processing firms are currently mainly labor-intensive; the TFP contribution and output is only about 2.258. ii) Factors such as the firm's age, size, and ownership affect TFP. Specifically, firms with few years of operation, small and medium firms, and private firms have low TFP. iii) Institutional quality and the provincial business environment have a positive impact on the TFP of Vietnamese seafood processing firms in this period.

Performance of Distributed Database System built on Multicore Systems

  • Kim, Kangseok
    • Journal of Internet Computing and Services / v.18 no.6 / pp.47-53 / 2017
  • Recently, huge datasets have been generated rapidly in a variety of fields. Consequently, there is an urgent need for technologies that allow efficient and effective processing of huge datasets. Partitioning a huge dataset effectively and reducing the processing overhead of the partitioned data efficiently have therefore become critical factors for scalability and performance in distributed database systems. In our work, we utilized multicore servers to provide scalable service to our distributed system. The partitioning of a database over multicore servers has emerged from the need for a new architectural design of distributed database systems, driven by scalability and performance concerns in today's data deluge. The system allows uniform, concurrent access through a web service interface to databases distributed over multicore servers, using the SQMD (Single Query Multiple Database) mechanism based on the publish/subscribe paradigm. We present performance results for the distributed database system built on multicore servers, for processing that is time-intensive with traditional architectures. We also discuss future work.
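The single-query, multiple-database idea in this abstract can be sketched as a fan-out of one query to several partitions followed by a merge of the partial results. The sketch below is only an illustration under stated assumptions: it uses in-memory SQLite databases and a thread pool in place of the paper's web service interface and publish/subscribe messaging, and the table layout, function names, and data are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_partition(rows):
    """Create an in-memory SQLite database holding one partition of the data."""
    # check_same_thread=False lets a worker thread use this connection later;
    # each connection is still used by only one thread at a time.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE items (id INTEGER, value INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    conn.commit()
    return conn


def query_all(partitions, sql, params=()):
    """Send the same query to every partition in parallel and merge the rows."""
    def run(conn):
        return conn.execute(sql, params).fetchall()

    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        results = list(pool.map(run, partitions))
    return sorted(row for rows in results for row in rows)


partitions = [
    make_partition([(1, 5), (2, 50)]),
    make_partition([(3, 70), (4, 7)]),
]
matches = query_all(partitions, "SELECT id, value FROM items WHERE value > ?", (10,))
```

The caller sees one result set regardless of how many partitions exist, which is the uniform-access property the abstract describes. Aggregates such as counts or averages would need a combining step rather than a plain merge.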

Comparison of Physical Injury, Emotional Response and Unplanned Self-Removal of Medical Devices According to Use of Physical Restraint in Intensive Care Unit Patients (중환자실 환자의 물리적 억제대 적용 여부에 따른 신체손상, 정서반응, 우발적 치료기구 자가 제거 발생 비교)

  • Lee, Mi Mi;Kim, Keum Soon
    • Journal of Korean Clinical Nursing Research / v.18 no.2 / pp.296-306 / 2012
  • Purpose: This study was done to compare the physical injury, emotional response and unplanned self-removal of medical devices in patients with physical restraints and patients not restrained. Methods: Eighty patients admitted to the intensive care unit (ICU) of a university hospital in Seoul participated in this study. Forty patients made up each group and the group not restrained was matched with the restraint group for age and history of smoking and alcohol consumption. Data on occurrence of physical injury, intensity of anxiety, stage of agitation and unplanned self-removal of medical devices were collected by observation and medical chart review using a structured instrument. Statistical processing of collected data was done with the SPSS WIN 17.0 program. Results: The physically restrained group experienced more physical injuries and recorded significantly higher levels of anxiety and agitation than the unrestrained group. However, there were no significant differences between the groups in occurrence of unplanned self-removal of medical devices. Conclusion: Results indicate a need for critical care nurses to carefully monitor physical injuries and emotional responses of physically restrained patients and to develop nursing interventions to prevent adverse effects associated with restraint use. There is also a need to develop patient safety guidelines when using physical restraints.

TP-Sim: A Trace-driven Processing-in-Memory Simulator (TP-Sim: 트레이스 기반의 프로세싱 인 메모리 시뮬레이터)

  • Jeonggeun Kim
    • Journal of the Semiconductor & Display Technology / v.22 no.3 / pp.78-83 / 2023
  • This paper proposes a lightweight trace-driven Processing-In-Memory (PIM) simulator, TP-Sim. TP-Sim is a General Purpose PIM (GP-PIM) simulator that evaluates various performance-related metrics of PIM systems. Based on instruction and memory traces extracted with the Intel Pin tool, TP-Sim can replay trace files on multiple models of PIM architectures to compare their performance. To validate TP-Sim, we evaluated three different system configurations on the STREAM benchmark. Compared to traditional Host CPU-only systems with a conventional memory hierarchy, a simple GP-PIM architecture achieved better performance, even when the Host CPU has the same number of in-order cores. For further study, we also extend TP-Sim as part of a heterogeneous system simulator that contains a CPU, GPGPU, and PIM as its primary processor and co-processors.

LDBAS: Location-aware Data Block Allocation Strategy for HDFS-based Applications in the Cloud

  • Xu, Hua;Liu, Weiqing;Shu, Guansheng;Li, Jing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.204-226 / 2018
  • Big data processing applications have gradually migrated to the cloud because of the advantages of cloud computing. Hadoop Distributed File System (HDFS) is one of the fundamental support systems for big data processing on MapReduce-like frameworks, such as Hadoop and Spark. Since HDFS is not aware of the co-location of virtual machines in the cloud, its default block allocation scheme fits cloud environments poorly in two respects: data reliability loss and performance degradation. In this paper, we present a novel location-aware data block allocation strategy (LDBAS). LDBAS jointly optimizes data reliability and performance for upper-layer applications by allocating data blocks according to the locations and different processing capacities of virtual nodes in the cloud. We apply LDBAS to two stages of data allocation of HDFS in the cloud (the initial data allocation and data recovery) and design the corresponding algorithms. Finally, we implement LDBAS in an actual Hadoop cluster and evaluate the performance with the benchmark suite BigDataBench. The experimental results show that LDBAS can guarantee the designed data reliability while reducing the job execution time of I/O-intensive applications in Hadoop by 8.9% on average and up to 11.2% compared with the original Hadoop in the cloud.
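The co-location problem in this abstract can be illustrated with a short sketch: if two replicas of a block land on virtual machines that share a physical host, one host failure loses both. The greedy placement below is an assumption made for illustration and is not the LDBAS algorithm; the function name, the tuple layout, and the capacity values are hypothetical.

```python
def place_replicas(vms, replicas=3):
    """Pick VMs for one block so that no two replicas share a physical host.

    vms: list of (vm_name, physical_host, capacity) tuples (hypothetical data).
    Higher-capacity VMs are preferred.
    """
    chosen, used_hosts = [], set()
    for name, host, _capacity in sorted(vms, key=lambda vm: vm[2], reverse=True):
        if host not in used_hosts:
            chosen.append(name)
            used_hosts.add(host)
        if len(chosen) == replicas:
            return chosen
    raise ValueError("not enough distinct physical hosts for the requested replicas")


vms = [
    ("vm1", "hostA", 8),
    ("vm2", "hostA", 6),
    ("vm3", "hostB", 4),
    ("vm4", "hostC", 2),
    ("vm5", "hostB", 5),
]
placement = place_replicas(vms)
```

Here vm2 is skipped despite its capacity because it shares hostA with vm1. A placement that ignores physical hosts could pick vm1, vm2, and vm5, leaving two of three replicas on one machine.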