• Title/Summary/Keyword: Big data processing

Search Result 1,063, Processing Time 0.033 seconds

Data-Compression-Based Resource Management in Cloud Computing for Biology and Medicine

  • Zhu, Changming
    • Journal of Computing Science and Engineering
    • /
    • v.10 no.1
    • /
    • pp.21-31
    • /
    • 2016
  • With the application and development of biomedical techniques such as next-generation sequencing, mass spectrometry, and medical imaging, the amount of biomedical data have been growing explosively. In terms of processing such data, we face the problems surrounding big data, highly intensive computation, and high dimensionality data. Fortunately, cloud computing represents significant advantages of resource allocation, data storage, computation, and sharing and offers a solution to solve big data problems of biomedical research. In order to improve the efficiency of resource management in cloud computing, this paper proposes a clustering method and adopts Radial Basis Function in order to compress comprehensive data sets found in biology and medicine in high quality, and stores these data with resource management in cloud computing. Experiments have validated that with such a data-compression-based resource management in cloud computing, one can store large data sets from biology and medicine in fewer capacities. Furthermore, with reverse operation of the Radial Basis Function, these compressed data can be reconstructed with high accuracy.

Design and Evaluation Security Control Iconology for Big Data Processing (빅데이터 처리를 위한 보안관제 시각화 구현과 평가)

  • Jeon, Sang June;Yun, Seong Yul;Kim, Jeong Ho
    • Journal of Platform Technology
    • /
    • v.8 no.4
    • /
    • pp.38-46
    • /
    • 2020
  • This study describes how to build a security control system using an open source big data solution so that private companies can build an overall security control infrastructure. In particular, the infrastructure was built using the Elastic Stack, one of the free open source big data analysis solutions, as a way to shorten the cost and development time when building a security control system. A comparative experiment was conducted. In addition, as a result of comparing and analyzing the functions, convenience, service and technical support of the two solution, it was found that the Elastic Stack has advantages in the security control of Big Data in terms of community and open solution. Using the Elastic Stack, security logs were collected, analyzed, and visualized step by step to create a dashboard, input large logs, and measure the search speed. Through this, we discovered the possibility of the Elastic Stack as a big data analysis solution that could replace Splunk.

  • PDF

Attention-based word correlation analysis system for big data analysis (빅데이터 분석을 위한 어텐션 기반의 단어 연관관계 분석 시스템)

  • Chi-Gon, Hwang;Chang-Pyo, Yoon;Soo-Wook, Lee
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.27 no.1
    • /
    • pp.41-46
    • /
    • 2023
  • Recently, big data analysis can use various techniques according to the development of machine learning. Big data collected in reality lacks an automated refining technique for the same or similar terms based on semantic analysis of the relationship between words. Since most of the big data is described in general sentences, it is difficult to understand the meaning and terms of the sentences. To solve these problems, it is necessary to understand the morphological analysis and meaning of sentences. Accordingly, NLP, a technique for analyzing natural language, can understand the word's relationship and sentences. Among the NLP techniques, the transformer has been proposed as a way to solve the disadvantages of RNN by using self-attention composed of an encoder-decoder structure of seq2seq. In this paper, transformers are used as a way to form associations between words in order to understand the words and phrases of sentences extracted from big data.

Accounting Information Processing Model Using Big Data Mining (빅데이터마이닝을 이용한 회계정보처리 모형)

  • Kim, Kyung-Ihl
    • Journal of Convergence for Information Technology
    • /
    • v.10 no.7
    • /
    • pp.14-19
    • /
    • 2020
  • This study suggests an accounting information processing model based on internet standard XBRL which applies an extensible business reporting language, the XML technology. Due to the differences in document characteristics among various companies, this is very important with regard to the purpose of accounting that the system should provide useful information to the decision maker. This study develops a data mining model based on XML hierarchy which is stored as XBRL in the X-Hive data base. The data ming analysis is experimented by the data mining association rule. And based on XBRL, the DC-Apriori data mining method is suggested combining Apriori algorithm and X-query together. Finally, the validity and effectiveness of the suggested model is investigated through experiments.

Frequency and Social Network Analysis of the Bible Data using Big Data Analytics Tools R (빅데이터 분석도구 R을 이용한 성경 데이터의 빈도와 소셜 네트워크 분석)

  • Ban, ChaeHoon;Ha, JongSoo;Kim, Dong Hyun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.2
    • /
    • pp.166-171
    • /
    • 2020
  • Big data processing technology that can store and analyze data and obtain new knowledge has been adjusted for importance in many fields of the society. Big data is emerging as an important problem in the field of information and communication technology, but the mind of continuous technology is rising. the R, a tool that can analyze big data, is a language and environment that enables information analysis of statistical bases. In this paper, we use this to analyze the Bible data. We analyze the four Gospels of the New Testament in the Bible. We collect the Bible data and perform filtering for analysis. The R is used to investigate the frequency of what text is distributed and analyze the Bible through social network analysis, in which words from a sentence are paired and analyzed between words for accurate data analysis.

Load Balancing for Distributed Processing of Real-time Spatial Big Data Stream (실시간 공간 빅데이터 스트림 분산 처리를 위한 부하 균형화 방법)

  • Yoon, Susik;Lee, Jae-Gil
    • Journal of KIISE
    • /
    • v.44 no.11
    • /
    • pp.1209-1218
    • /
    • 2017
  • A variety of sensors is widely used these days, and it has become much easier to acquire spatial big data streams from various sources. Since spatial data streams have inherently skewed and dynamically changing distributions, the system must effectively distribute the load among workers. Previous studies to solve this load imbalance problem are not directly applicable to processing spatial data. In this research, we propose Adaptive Spatial Key Grouping (ASKG). The main idea of ASKG is, by utilizing the previous distribution of the data streams, to adaptively suggest a new grouping scheme that evenly distributes the future load among workers. We evaluate the validity of the proposed algorithm in various environments, by conducting an experiment with real datasets while varying the number of workers, input rate, and processing overhead. Compared to two other alternative algorithms, ASKG improves the system performance in terms of load imbalance, throughput, and latency.

Developing an Intrusion Detection Framework for High-Speed Big Data Networks: A Comprehensive Approach

  • Siddique, Kamran;Akhtar, Zahid;Khan, Muhammad Ashfaq;Jung, Yong-Hwan;Kim, Yangwoo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.8
    • /
    • pp.4021-4037
    • /
    • 2018
  • In network intrusion detection research, two characteristics are generally considered vital to building efficient intrusion detection systems (IDSs): an optimal feature selection technique and robust classification schemes. However, the emergence of sophisticated network attacks and the advent of big data concepts in intrusion detection domains require two more significant aspects to be addressed: employing an appropriate big data computing framework and utilizing a contemporary dataset to deal with ongoing advancements. As such, we present a comprehensive approach to building an efficient IDS with the aim of strengthening academic anomaly detection research in real-world operational environments. The proposed system has the following four characteristics: (i) it performs optimal feature selection using information gain and branch-and-bound algorithms; (ii) it employs machine learning techniques for classification, namely, Logistic Regression, Naïve Bayes, and Random Forest; (iii) it introduces bulk synchronous parallel processing to handle the computational requirements of large-scale networks; and (iv) it utilizes a real-time contemporary dataset generated by the Information Security Centre of Excellence at the University of Brunswick (ISCX-UNB) to validate its efficacy. Experimental analysis shows the effectiveness of the proposed framework, which is able to achieve high accuracy, low computational cost, and reduced false alarms.

Big Data Analysis of Public Acceptance of Nuclear Power in Korea

  • Roh, Seungkook
    • Nuclear Engineering and Technology
    • /
    • v.49 no.4
    • /
    • pp.850-854
    • /
    • 2017
  • Public acceptance of nuclear power is important for the government, the major stakeholder of the industry, because consensus is required to drive actions. It is therefore no coincidence that the governments of nations operating nuclear reactors are endeavoring to enhance public acceptance of nuclear power, as better acceptance allows stable power generation and peaceful processing of nuclear wastes produced from nuclear reactors. Past research, however, has been limited to epistemological measurements using methods such as the Likert scale. In this research, we propose big data analysis as an attractive alternative and attempt to identify the attitudes of the public on nuclear power. Specifically, we used common big data analyses to analyze consumer opinions via SNS (Social Networking Services), using keyword analysis and opinion analysis. The keyword analysis identified the attitudes of the public toward nuclear power. The public felt positive toward nuclear power when Korea successfully exported nuclear reactors to the United Arab Emirates. With the Fukushima accident in 2011 and certain supplier scandals in 2012, however, the image of nuclear power was degraded and the negative image continues. It is recommended that the government focus on developing useful businesses and use cases of nuclear power in order to improve public acceptance.

Suggested social media big data consulting chatbot service for restaurant start-ups

  • Jong-Hyun Park;Jun-Ho Park;Ki-Hwan Ryu
    • International journal of advanced smart convergence
    • /
    • v.12 no.3
    • /
    • pp.68-74
    • /
    • 2023
  • The food industry has been hit hard since the first outbreak of COVID-19 in 2019. However, as of April 2022, social distancing has been resolved and the restaurant industry has gradually recovered, interest in restaurant start-ups is increasing. Therefore, in this paper, 'restaurant start-up' was cited as a key keyword through social media big data analysis using TexTom, and word frequency and cone analysis were conducted for big data analysis. The keyword collection period was selected from May 1, 2022, when social distancing due to COVID-19 was lifted, to May 23, 2023, and based on this, a plan to develop chatbot services for restaurant start-ups was proposed. This paper was prepared in consideration of what to consider when starting a restaurant and a chatbot service that allows prospective restaurant founders to receive information more conveniently. Based on these analysis results, we expected to contribute to the process of developing chatbots for prospective restaurant founders in the future

App Recommendation System Based on Collaborative Filtering

  • Nasridinov, Aziz;Park, Young-Ho
    • Annual Conference of KIPS
    • /
    • 2013.11a
    • /
    • pp.1158-1159
    • /
    • 2013
  • It gives to users a difficulty for searching between this huge numbers of programs. Searching the best applications for our needs is a big challenge today. In this paper, we propose a study on collaborative filtering based app recommendation system. The proposed method is composed of three steps. In the first step, we extract the data set from the target website. In the second step, we parse the extracted raw data according to the types, and store in a database. In the third, we perform recommendations based on the stored data in database.