• Title/Summary/Keyword: Big data processing

Dynamic Caching Routing Strategy for LEO Satellite Nodes Based on Gradient Boosting Regression Tree

  • Yang Yang;Shengbo Hu;Guiju Lu
    • Journal of Information Processing Systems
    • /
    • v.20 no.1
    • /
    • pp.131-147
    • /
    • 2024
  • A routing strategy based on traffic prediction and dynamic cache allocation for satellite nodes is proposed to address the issues of high propagation delay and overall delay of inter-satellite and satellite-to-ground links in low Earth orbit (LEO) satellite systems. The spatial and temporal correlations of satellite network traffic were analyzed, and the relevant traffic through the target satellite was extracted as raw input for traffic prediction. An improved gradient boosting regression tree algorithm was used for traffic prediction. Based on the traffic prediction results, a dynamic cache allocation routing strategy is proposed. The satellite nodes periodically monitor the traffic load on inter-satellite links (ISLs) and dynamically allocate cache resources for each ISL with neighboring nodes. Simulation results demonstrate that the proposed routing strategy effectively reduces packet loss rate and average end-to-end delay and improves the distribution of services across the entire network.
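
As a rough illustration of the cache allocation step this abstract describes, the sketch below splits a node's cache among its ISLs in proportion to predicted traffic load. The abstract does not give the paper's allocation rule, so the proportional rule, the reserved minimum share, and the names `allocate_cache`, `predicted_load`, and `min_share` are assumptions made only for illustration.

```python
def allocate_cache(predicted_load, total_cache, min_share=0.05):
    """Split total_cache among links in proportion to predicted load.

    predicted_load: dict mapping link name -> predicted traffic (non-negative).
    min_share: fraction of total_cache reserved for every link (an assumption,
    not taken from the paper).
    """
    if not predicted_load:
        return {}
    n = len(predicted_load)
    reserved = min_share * total_cache * n
    if reserved > total_cache:
        raise ValueError("min_share too large for the number of links")
    remaining = total_cache - reserved
    total_load = sum(predicted_load.values())
    allocation = {}
    for link, load in predicted_load.items():
        # Fall back to an equal split when no traffic is predicted at all.
        weight = load / total_load if total_load > 0 else 1.0 / n
        allocation[link] = min_share * total_cache + remaining * weight
    return allocation
```

In a periodic monitoring loop, `predicted_load` would be refreshed from the traffic predictor at each interval and the allocation recomputed.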

Solitary Death and Old age Management Revolution: Proposed Men's Menopausal Program for Healthy Aging (고독사와 노후관리 혁신: 건강한 노화를 위한 남성 갱년기 프로그램 제안)

  • Ju-Yeon Lim;Jin Kim
    • Annual Conference of KIPS
    • /
    • 2024.05a
    • /
    • pp.550-551
    • /
    • 2024
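  • This study addresses the recent problem of solitary death among middle-aged and older men, and uses statistical analysis to confirm the association among male menopause, solitary death, and mental health problems. The results show that social isolation, depression, and anxiety disorders are closely related to the level of solitary death risk. On this basis, we propose introducing a male menopause program, which has not existed until now. If the program is widely adopted, it is expected to reduce solitary death and social isolation among middle-aged and older people and to improve quality of life in old age.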

Domain-Adaptive Pre-training for Korean Document Summarization (도메인 적응 사전 훈련 (Domain-Adaptive Pre-training, DAPT) 한국어 문서 요약)

  • Hyungkuk Jang;Hyuncheol, Jang
    • Annual Conference of KIPS
    • /
    • 2024.05a
    • /
    • pp.843-845
    • /
    • 2024
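  • In this study on Korean document summarization using domain-adaptive pre-training (DAPT), the DAPT technique was applied to improve understanding of, and summarization performance on, documents from specific domains. The study carries out additional pre-training on domain-specific datasets so that a pre-trained language model can go beyond general language understanding and deliver performance optimized for a specific domain. Specifically, the model is fine-tuned using Korean text data collected from various domains such as medicine, law, and technology, and the resulting model is shown to handle domain-specific terminology and context effectively. In the performance evaluation, the effect of DAPT was verified by comparing existing pre-trained models with models to which DAPT was applied. The results show that the model with DAPT improved performance on domain-specific document summarization, which is expected to be useful in real domain-specific applications as well.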

The Efficient Method of Parallel Genetic Algorithm using MapReduce of Big Data (빅 데이터의 MapReduce를 이용한 효율적인 병렬 유전자 알고리즘 기법)

  • Hong, Sung-Sam;Han, Myung-Mook
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.23 no.5
    • /
    • pp.385-391
    • /
    • 2013
  • Big data is data too large to be collected, stored, searched, processed, and analyzed by existing database management systems. A parallel genetic algorithm for big data can be realized readily by implementing a GA (Genetic Algorithm) with MapReduce on the Hadoop distributed system. Previous studies proposed transforming the GA to suit MapReduce, but they did not perform well because of frequent data input and output. In this paper, we propose MRPGA (MapReduce Parallel Genetic Algorithm), which uses improved Map and Reduce processes and the parallel processing characteristics of MapReduce. The optimal solution can be found by using the topology and migration of the parallel genetic algorithm together with a local search algorithm. The proposed method converges 1.5 times faster than the existing MapReduce SGA, and how quickly the optimal solution is found depends on the number of sub-generation iterations. In addition, MRPGA can improve the processing and analysis performance of big data technology.
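
The topology, migration, and sub-generation ideas mentioned in this abstract can be sketched with a small island-model GA. This is plain Python rather than Hadoop, and it is not the authors' MRPGA: the OneMax objective, the ring topology, the operator choices, and every function name here are assumptions for illustration. Each island evolves locally for several sub-generations (the part that would run inside a map task), and a migration step then exchanges the best individuals (the part that would follow a reduce).

```python
import random

def fitness(bits):
    # OneMax: number of 1 bits (a toy objective chosen for illustration).
    return sum(bits)

def evolve_island(population, sub_generations, rng, mutation_rate=0.02):
    """'Map'-like step: run several GA generations locally on one island."""
    pop = [ind[:] for ind in population]
    for _ in range(sub_generations):
        next_pop = [max(pop, key=fitness)[:]]  # elitism: keep the best
        while len(next_pop) < len(pop):
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, len(p1))  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit with probability mutation_rate.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    return pop

def migrate_ring(islands):
    """'Reduce'-like step: each island's best replaces the next island's worst."""
    bests = [max(pop, key=fitness)[:] for pop in islands]
    for i, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
        pop[worst] = bests[(i - 1) % len(islands)]
    return islands

def run_pga(n_islands=4, pop_size=20, n_bits=32, rounds=10,
            sub_generations=5, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    for _ in range(rounds):
        islands = [evolve_island(pop, sub_generations, rng) for pop in islands]
        islands = migrate_ring(islands)
    return max((ind for pop in islands for ind in pop), key=fitness)
```

Raising `sub_generations` means fewer migration rounds for the same amount of evolution, which is one way to reduce the frequent data input and output that the abstract identifies as the bottleneck.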

Words Recommendation Algorithm for Similarity Connection based on Data Transmutability (데이터 변형성 기반 유사성 연결을 위한 단어 추천 알고리즘)

  • Kim, Boon-Hee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.8 no.11
    • /
    • pp.1719-1724
    • /
    • 2013
  • Big data, which requires a different approach from existing data processing methods, is unstructured data with a variety of features. These features are the volume of the data, its rate of change, and its variety. Tweets on Twitter in Korea alone exceed 5 million per day. As data storage and analysis systems become much cheaper and demand for information increases, the value of such research is growing. In this paper, we propose a priority-based word recommendation algorithm as a technology required by the deformation characteristics of data elements.

Study on Proactive Data Process Orchestration in Distributed Cloud

  • Jong-Sub Lee;Seok-Jae Moon
    • International journal of advanced smart convergence
    • /
    • v.13 no.3
    • /
    • pp.135-142
    • /
    • 2024
  • Recently, along with digital transformation, technologies such as cloud computing, big data, and artificial intelligence have been actively introduced. In a situation where these technological changes are progressing rapidly, it is often difficult to manage processes efficiently using existing simple workflow management methods. Companies providing current cloud services are adopting virtualization technologies, including virtual machines (VMs) and containers, in their distributed system infrastructure for automated application deployment. Accordingly, this paper proposes a process-based orchestration system for integrated execution of corporate process-oriented workloads by integrating the potential of big data and machine learning technologies. This system consists of four layers as components for performing workload processes. Additionally, a common information model is applied to the data to efficiently integrate and manage the various formats and uses of data generated during the process creation stage. Moreover, a standard metadata protocol is introduced to ensure smooth exchange between data. This proposed system utilizes various types of data storage to store process data, metadata, and analysis models. This enables flexible management and efficient processing of data.

A Study on the Anomaly Prediction System of Drone Using Big Data (빅데이터를 활용한 드론의 이상 예측시스템 연구)

  • Lee, Yang-Kyoo;Hong, Jun-Ki;Hong, Sung-Chan
    • Journal of Internet Computing and Services
    • /
    • v.21 no.2
    • /
    • pp.27-37
    • /
    • 2020
  • Big data has recently been emerging rapidly as a core technology of the 4th industrial revolution, and the use of and demand for drones continue to increase along with it. However, as drone usage increases, so does the risk of drones falling. Because of their simple structure, drones can fall easily even when small problems occur. In this paper, to predict and prevent drone falls, an ESC (Electronic Speed Control) is attached integrally to the drone's drive motor, and acceleration sensor readings are stored to collect vibration data in real time. By processing and monitoring these data in real time and analyzing the accumulated big data with a Fast Fourier Transform (FFT) algorithm, we propose a prediction system that minimizes the risk of drone falls.
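
To make the FFT step in this abstract concrete, the sketch below implements a textbook radix-2 Cooley-Tukey FFT in pure Python and uses it to find the dominant frequency in a window of vibration samples. The function names, the power-of-two window length, and the idea of tracking the dominant frequency are assumptions for illustration; the abstract does not say how the authors turn the spectrum into a prediction.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency in Hz of the strongest non-DC bin (needs >= 4 samples)."""
    spectrum = fft(samples)
    half = len(samples) // 2
    k = max(range(1, half), key=lambda i: abs(spectrum[i]))
    return k * sample_rate_hz / len(samples)
```

A monitoring loop could compare the dominant frequency, or the energy in selected bins, against values recorded during normal flight and flag windows that drift away from them.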

Big Data-based Medical Clinical Results Analysis (빅데이터 기반 의료 임상 결과 분석)

  • Hwang, Seung-Yeon;Park, Ji-Hun;Youn, Ha-Young;Kwak, Kwang-Jin;Park, Jeong-Min;Kim, Jeong-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.19 no.1
    • /
    • pp.187-195
    • /
    • 2019
  • Recently, the development of big data technology has made it possible to collect, store, process, and analyze data generated in various fields. Using these big data technologies for clinical results analysis and for optimizing clinical trial design will reduce the costs associated with health care. Therefore, in this paper, we analyze clinical results and present guidelines that can reduce the duration and cost of clinical trials. First, we use Sqoop to collect clinical results data from relational databases and store them in HDFS, and we use Hive, a Hadoop-based processing tool, to process the data. Finally, we use R, a big data analysis tool widely used in fields such as the public sector and business, to analyze associations.
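
The final association analysis step can be illustrated with the standard support, confidence, and lift measures. The paper performs this step in R on data prepared with Sqoop and Hive; the Python sketch below is not that pipeline, and the function name, the restriction to single-item rules, and the `min_support` default are assumptions for illustration.

```python
from itertools import combinations

def pair_rules(transactions, min_support=0.5):
    """Return single-item association rules (x -> y) with their metrics."""
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets)) if sets else []

    def count(group):
        # Number of transactions containing every item in group.
        return sum(1 for s in sets if group <= s)

    rules = []
    for a, b in combinations(items, 2):
        both = count({a, b}) / n
        if both < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            support_x = count({x}) / n
            support_y = count({y}) / n
            rules.append({
                "rule": (x, y),
                "support": both,                          # P(x and y)
                "confidence": both / support_x,           # P(y | x)
                "lift": both / (support_x * support_y),   # > 1: positive association
            })
    return rules
```

Each transaction would correspond to one clinical record expressed as a set of categorical attributes, and rules with high confidence and lift point to attributes that tend to occur together.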

Suggestions on how to convert official documents to Machine Readable (공문서의 기계가독형(Machine Readable) 전환 방법 제언)

  • Yim, Jin Hee
    • The Korean Journal of Archival Studies
    • /
    • no.67
    • /
    • pp.99-138
    • /
    • 2021
  • In the era of big data, analyzing not only structured data but also unstructured data is emerging as an important task. Official documents produced by government agencies are also subject to big data analysis as large text-based unstructured data. From the perspective of internal work efficiency, knowledge management, records management, and so on, it is necessary to analyze big data of public documents to derive useful implications. However, since many of the public documents currently held by public institutions are not in an open format, a pre-processing step that extracts text from a bitstream is required for big data analysis. In addition, since contextual metadata is not sufficiently stored in the document file, separate efforts to secure metadata are required for high-quality analysis. In conclusion, current official documents have a low level of machine readability, so big data analysis becomes expensive.

Performance Evaluation of Medical Big Data Analysis based on RHadoop (RHadoop 기반 보건의료 빅데이터 분석의 성능 평가)

  • Ryu, Woo-Seok
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.13 no.1
    • /
    • pp.207-212
    • /
    • 2018
  • R, a data analysis tool that is becoming popular in the big data era, is rapidly expanding its user base by providing powerful statistical analysis and data visualization functions. A major advantage of R is its functional extensibility based on open source, but its scalability to large data is limited, resulting in performance degradation in large-scale data processing. RHadoop, one of the extension packages that compensates for this, can improve data analysis performance because it supports Hadoop-based distributed processing of programs written in R. In this paper, we assess the validity of RHadoop by evaluating its performance improvement in real medical big data analysis. A performance evaluation using R and RHadoop to analyze medical history information provided by the National Health Insurance Service shows that an RHadoop cluster composed of 8 data nodes can improve performance by up to 8 times compared with R.
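
The distributed processing model that RHadoop exposes to R programs follows the map, shuffle, and reduce pattern, which the single-process Python sketch below imitates. It does not use RHadoop or Hadoop, and the record field `age_group` and all function names are hypothetical, chosen only to show how an aggregation over medical records decomposes into independent map calls and per-key reductions.

```python
from collections import defaultdict

def map_phase(records, mapper):
    # Apply the mapper to each record independently; on a cluster these
    # calls would be spread across the data nodes.
    for record in records:
        yield from mapper(record)

def shuffle(pairs):
    # Group emitted values by key, as the framework does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    return {key: reducer(key, values) for key, values in groups.items()}

def visits_mapper(record):
    # Emit (age_group, 1) per visit; the field name is hypothetical.
    yield record["age_group"], 1

def count_reducer(key, values):
    return sum(values)

def run_job(records):
    pairs = map_phase(records, visits_mapper)
    return reduce_phase(shuffle(pairs), count_reducer)
```

Because each map call touches only one record and each reduce call only one key, the work can be divided across nodes, which is why adding data nodes can raise throughput in the way the evaluation reports.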