• Title/Summary/Keyword: distributed memory system

A Design on Informal Big Data Topic Extraction System Based on Spark Framework (Spark 프레임워크 기반 비정형 빅데이터 토픽 추출 시스템 설계)

  • Park, Kiejin
    • KIPS Transactions on Software and Data Engineering / v.5 no.11 / pp.521-526 / 2016
  • Because online informal text data are massive in volume and unstructured in nature, traditional relational data model technologies face limitations in both data storage and data analysis. Moreover, with massive social data generated dynamically, real-time analysis of users' reactions is hard to accomplish. In this paper, to capture the semantics of massive, informal online documents easily with an unsupervised learning mechanism, we design and implement an automatic topic extraction system based on the words that constitute each document. The input data set for the proposed system is first generated with an N-gram algorithm, which builds multi-word terms to capture the meaning of sentences more precisely; Hadoop and Spark (an in-memory distributed computing framework) are then adopted to run the topic model. In the experiments, TB-scale input data were preprocessed and the proposed topic extraction steps were applied. We conclude that the proposed system extracts meaningful topics in good time because intermediate results are served directly from main memory rather than read from an HDD.
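
As a rough illustration of the pipeline this abstract describes (N-gram term construction followed by an in-memory topic model on Spark), the PySpark sketch below chains tokenization, bigram generation, and LDA. It is a minimal sketch, not the authors' implementation; the input path, n-gram order, vocabulary size, and topic count are assumptions.

```python
# Minimal sketch of an N-gram + topic-model pipeline on Spark (illustrative).
from pyspark.sql import SparkSession
from pyspark.ml.feature import Tokenizer, NGram, CountVectorizer
from pyspark.ml.clustering import LDA

spark = SparkSession.builder.appName("topic-extraction").getOrCreate()
docs = spark.read.text("hdfs:///corpus/*.txt").withColumnRenamed("value", "text")

tokens = Tokenizer(inputCol="text", outputCol="words").transform(docs)
bigrams = NGram(n=2, inputCol="words", outputCol="ngrams").transform(tokens)

# Term counts feed the topic model; Spark keeps these intermediate
# DataFrames in memory instead of spilling every stage to disk.
cv_model = CountVectorizer(inputCol="ngrams", outputCol="features",
                           vocabSize=50000).fit(bigrams)
features = cv_model.transform(bigrams)

lda = LDA(k=10, maxIter=20, featuresCol="features")
model = lda.fit(features)
model.describeTopics(5).show(truncate=False)
```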

Spatial Computation on Spark Using GPGPU (GPGPU를 활용한 스파크 기반 공간 연산)

  • Son, Chanseung;Kim, Daehee;Park, Neungsoo
    • KIPS Transactions on Computer and Communication Systems / v.5 no.8 / pp.181-188 / 2016
  • Recently, as the amount of spatial information increases, interest in spatial information processing has grown. Spatial database systems extended from traditional relational database systems have difficulty handling large data sets because of limited scalability. SpatialHadoop, extended from the Hadoop system, shows low performance because spatial computations in SpatialHadoop require many write operations of intermediate results to disk, resulting in performance degradation. In this paper, Spatial Computation Spark (SC-Spark), an in-memory distributed processing framework, is proposed. SC-Spark extends Spark in order to perform spatial operations on large-scale data efficiently. In addition, a GPGPU-based SC-Spark is developed to improve SC-Spark's performance further. SC-Spark exploits Spark's ability to hold intermediate results in memory, and the GPGPU-based SC-Spark can perform spatial operations in parallel using the many processing elements of a GPU. To verify the proposed work, experiments on a single AMD system were performed using SC-Spark and GPGPU-based SC-Spark for Point-in-Polygon and spatial join operations. The experimental results showed that SC-Spark and GPGPU-based SC-Spark were up to 8 times faster than SpatialHadoop.
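
SC-Spark itself is not described at the code level, but the Point-in-Polygon operation it parallelizes is commonly implemented with the standard ray-casting test, sketched below in plain Python; a GPGPU version would presumably evaluate many such point tests in parallel across the GPU's processing elements.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count crossings of a horizontal ray starting at (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the ray's y level?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside   # odd number of crossings => inside
    return inside

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(point_in_polygon(0.5, 0.5, square))  # True
print(point_in_polygon(1.5, 0.5, square))  # False
```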

Meta-server Model for Middleware Supporting for Context Awareness (상황인식을 지원하는 미들웨어를 위한 메타서버 모델)

  • Lee, Seo-Jeong;Hwang, Byung-Yeon;Yoon, Yong-Ik
    • Journal of Korea Spatial Information System Society / v.6 no.2 s.12 / pp.39-49 / 2004
  • An increasing number of distributed applications are being built with mobile technology. These applications face temporary loss of network connectivity when they move, need to discover other hosts in an ad-hoc manner, and are likely to have scarce resources, including CPU speed, memory, and battery power. Software engineers building mobile applications need middleware that resolves these problems and offers appropriate support for developing mobile applications. In this paper, we describe the construction of a meta-server for middleware that supports reflective context awareness and demonstrate its usability. The metadata consist of user configuration, device configuration, user context, device context, and dynamic image metadata. When the middleware sends a save or retrieval request to the meta-server, the meta-server verifies the request and returns a message to the middleware. The meta-server has been applied to multimedia stream services with context awareness.
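
A minimal sketch of the metadata structure and the save/retrieve-with-verification exchange named in the abstract; the five metadata parts come from the abstract, while all field and method names here are assumptions.

```python
# Illustrative only: the paper's exact schema isn't given beyond its five parts.
from dataclasses import dataclass, field

@dataclass
class Metadata:
    user_configuration: dict = field(default_factory=dict)
    device_configuration: dict = field(default_factory=dict)
    user_context: dict = field(default_factory=dict)     # e.g. location, activity
    device_context: dict = field(default_factory=dict)   # e.g. battery, bandwidth
    dynamic_image_metadata: dict = field(default_factory=dict)

class MetaServer:
    """Accepts save/retrieve requests from the middleware, verifies them,
    and replies with a status message, as described in the abstract."""
    def __init__(self):
        self.store = {}

    def save(self, key, metadata):
        if not isinstance(metadata, Metadata):   # request verification step
            return {"status": "rejected", "reason": "invalid metadata"}
        self.store[key] = metadata
        return {"status": "ok"}

    def retrieve(self, key):
        if key not in self.store:
            return {"status": "not found"}
        return {"status": "ok", "metadata": self.store[key]}
```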

Smart grid and nuclear power plant security by integrating cryptographic hardware chip

  • Kumar, Niraj;Mishra, Vishnu Mohan;Kumar, Adesh
    • Nuclear Engineering and Technology / v.53 no.10 / pp.3327-3334 / 2021
  • Present electric grids are advancing to integrate smart grids, distributed resources, high-speed sensing and control, and other advanced metering technologies. Cybersecurity is one of the challenges of the smart grid and of nuclear plant digital systems. It affects the advanced metering infrastructure (AMI), which communicates grid data and controls information in real time. This article focuses on solving nuclear and smart grid hardware security issues by integrating a field programmable gate array (FPGA) and implementing the Time Authenticated Cryptographic Identity Transmission (TACIT) cryptographic algorithm in the chip. The cryptography-based encryption and decryption approach can be used for a smart grid distribution system embedded with FPGA hardware. The chip design is carried out in Xilinx ISE 14.7 and synthesized on Virtex-5 FPGA hardware. The novelty of the work is that the algorithm is implemented on FPGA hardware, which provides a scalable design with different key sizes, and its integration enhances grid hardware security and switching. Similar state-of-the-art approaches report that the algorithm was previously limited to software and had not been implemented in a hardware chip. The main finding of the research is that the design predicts the utilization of hardware parameters such as slices, LUTs, flip-flops, memory, input/output blocks, and timing information for Virtex-5 FPGA synthesis before chip fabrication. This information is extracted for 8-bit to 128-bit keys and grid data with initial parameters. The TACIT security chip supports a 400 MHz frequency for the 128-bit key. The work is an effort to provide a solution for industries working towards embedded hardware security for smart grid, power plant, and nuclear applications.
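
The abstract does not give TACIT's internal construction, so the sketch below is explicitly not the TACIT algorithm; it only illustrates the general idea behind time-authenticated identity transmission, using standard HMAC primitives and a freshness window, with the key size as a parameter in the spirit of the paper's 8-bit to 128-bit range.

```python
# NOT the TACIT algorithm (its internals aren't given in the abstract);
# a generic illustration of time-authenticated identity transmission.
import hmac, hashlib, os, time

def make_token(identity: bytes, key: bytes) -> dict:
    """Tag the identity with a timestamp and a MAC over both."""
    ts = str(int(time.time())).encode()
    tag = hmac.new(key, identity + ts, hashlib.sha256).digest()
    return {"identity": identity, "timestamp": ts, "tag": tag}

def verify_token(token: dict, key: bytes, window_s: int = 30) -> bool:
    """Reject stale or forged tokens: the MAC binds identity to send time."""
    expected = hmac.new(key, token["identity"] + token["timestamp"],
                        hashlib.sha256).digest()
    fresh = abs(time.time() - int(token["timestamp"])) <= window_s
    return fresh and hmac.compare_digest(expected, token["tag"])

key = os.urandom(16)  # 128-bit key, the largest size reported for the chip
tok = make_token(b"grid-node-07", key)
print(verify_token(tok, key))  # True within the freshness window
```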

Performance Analysis of TCAM-based Jumping Window Algorithm for Snort 2.9.0 (Snort 2.9.0 환경을 위한 TCAM 기반 점핑 윈도우 알고리즘의 성능 분석)

  • Lee, Sung-Yun;Ryu, Ki-Yeol
    • Journal of Internet Computing and Services / v.13 no.2 / pp.41-49 / 2012
  • Wireless network support and the extended mobile network environment, together with the exponential growth of smartphone users, allow us to use the network anytime and anywhere. Malicious attacks delivered through high-speed networks, such as distributed DoS, Internet worms, and e-mail viruses, are increasing, and the number of detection patterns is growing dramatically as network traffic increases with the development of Internet technology. To detect these patterns in intrusion detection systems, previous research proposed an efficient algorithm called the jumping window algorithm and used it to analyze approximately 2,000 patterns of Snort 2.1.0, the best-known intrusion detection system. However, in terms of the number of TCAM lookups and TCAM memory efficiency, that result is inappropriate for the current environment (Snort 2.9.0), which has longer and far more numerous patterns, because the jumping window algorithm is affected by both the number of patterns and the pattern length. In this paper, we simulate the number of TCAM lookups and the required TCAM size for the jumping window with approximately 8,100 patterns from the Snort 2.9.0 rules, and then analyze the simulation results. While Snort 2.1.0 achieves its most effective performance with a 16-byte window and a 9 Mb TCAM, as proposed in the previous research, for the Snort 2.9.0 environment we suggest a 16-byte window with four cascaded 18 Mb TCAMs.
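
A simplified cost model of the jumping window idea: the inspection window jumps w bytes per TCAM lookup (rather than sliding byte by byte), and in exchange each pattern must be stored once per possible alignment inside the window. The entry-count formula below is an illustrative assumption, not the paper's exact accounting, which is simulated against the full Snort 2.9.0 rule set.

```python
import math

def jumping_window_cost(patterns, payload_len, w):
    """Estimate TCAM lookups and entries for a w-byte jumping window.

    Simplified model: one lookup per w-byte jump, and each pattern stored
    at every alignment 0..w-1, split across ceil((offset + len) / w)
    window-sized TCAM entries.
    """
    lookups = math.ceil(payload_len / w)
    entries = 0
    for p in patterns:
        for offset in range(w):                      # every alignment in the window
            entries += math.ceil((offset + len(p)) / w)
    return lookups, entries

patterns = ["attack", "cmd.exe", "/etc/passwd"]
lookups, entries = jumping_window_cost(patterns, payload_len=1500, w=16)
print(lookups, entries)   # lookups per 1500-byte payload, TCAM entries needed
```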

The Parallelization Effectiveness Analysis of K-DRUM Model (분포형 강우유출모형(K-DRUM)의 병렬화 효과 분석)

  • Chung, Sung-Young;Park, Jin-Hyeog;Hur, Young-Teck;Jung, Kwan-Sue
    • Journal of Korean Society for Geospatial Information Science / v.18 no.4 / pp.21-30 / 2010
  • In this paper, a parallel distributed rainfall-runoff model (K-DRUM) using the MPI (Message Passing Interface) technique was developed to address computation time, one of the drawbacks of distributed models that perform physically based, complicated numerical calculations for large-scale watersheds. The GIS-based K-DRUM model can simulate the temporal and spatial distribution of surface flow and sub-surface flow during flood periods, and its ASCII-format input parameters can be extracted as a pre-process using ArcView. Comparison studies were performed with various domain divisions in the Namgang Dam watershed for typhoon 'Ewiniar' in 2006. Numerical simulations on a cluster system were performed to check the parallelization effectiveness while increasing the number of domain divisions from 1 to 25. As a result, the computer memory size was reduced and the calculation time decreased as the number of divided domains increased. A method was also suggested for decreasing the discharge error at the domain connections: the calculation and communication in each domain have to be repeated three times at each time step to minimize the discharge error.
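
A minimal mpi4py sketch of the kind of domain decomposition with boundary exchange the abstract describes, using a 1-D strip decomposition and a stand-in update rule; K-DRUM's actual grid, state variables, and three-fold exchange schedule differ.

```python
# Illustrative 1-D strip decomposition with ghost-row exchange via MPI.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Each rank owns 100 interior rows plus one ghost row on each side.
local = np.zeros((102, 100))
up   = rank - 1 if rank > 0 else MPI.PROC_NULL
down = rank + 1 if rank < size - 1 else MPI.PROC_NULL

for step in range(10):                 # time-stepping loop
    # Exchange boundary rows so each subdomain sees its neighbours' state;
    # the paper repeats such exchanges within a time step to keep the
    # discharge consistent across domain connections.
    comm.Sendrecv(local[1],  dest=up,   recvbuf=local[-1], source=down)
    comm.Sendrecv(local[-2], dest=down, recvbuf=local[0],  source=up)
    # Stand-in diffusion-like update over interior rows only.
    local[1:-1] += 0.25 * (local[:-2] + local[2:] - 2 * local[1:-1])
```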

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from computer system inspection and process optimization to customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amounts of log data generated by banks. Most of the log data generated during banking operations come from handling clients' business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing a client's business, a separate log data processing system needs to be established. However, in existing computing environments it is difficult to realize the flexible storage expansion needed for a massive amount of unstructured log data and to execute the many functions needed to categorize and analyze it. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for unstructured log data that are difficult to process with the existing infrastructure's analysis tools and management systems. The proposed system uses an IaaS (Infrastructure as a Service) cloud environment to provide flexible expansion of computing resources, such as storage space and memory, under conditions such as extended storage or a rapid increase in log data. Moreover, to overcome the processing limits of existing analysis tools when real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. Furthermore, because HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions that let the system continue operating after recovering from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods for effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data, and their strict schemas make it difficult to distribute stored data across additional nodes when the amount of data increases rapidly. NoSQL does not provide the complex computations that relational databases offer, but it can easily expand through node dispersion when the amount of data grows rapidly; it is a non-relational database with a structure appropriate for processing unstructured data. NoSQL data models are usually classified as Key-Value, column-oriented, or document-oriented. Of these, the proposed system uses MongoDB, a representative document-oriented database with a free schema structure. MongoDB is adopted because its flexible schema makes it easy to process unstructured log data, it facilitates node expansion when the amount of data increases rapidly, and it provides an Auto-Sharding function that automatically expands storage.
The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over each bank's entire client business process are sent to the cloud server, the log collector module collects and classifies the data according to log type and distributes them to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis from the MongoDB module, the Hadoop-based analysis module, and the MySQL module per analysis time and type of aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and served in real time by the log graph generator module. The log data aggregated per unit time are stored in the MongoDB module and plotted as graphs according to the user's analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel-distributed manner by the Hadoop-based analysis module. A comparative evaluation of log insertion and query performance against a system that uses only MySQL demonstrates the proposed system's superiority. Moreover, an optimal chunk size is confirmed through MongoDB log-insert performance evaluations over various chunk sizes.
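
The MongoDB side of such a system might look like the pymongo sketch below, which enables Auto-Sharding, inserts a schema-free log document, and aggregates counts per unit time; the database, collection, and shard-key names are illustrative, and a sharded cluster reached through mongos is assumed.

```python
# Illustrative pymongo sketch; assumes a sharded cluster behind mongos.
from datetime import datetime, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://mongos-host:27017")

# Enable sharding so Auto-Sharding can spread log chunks across nodes.
client.admin.command("enableSharding", "banklogs")
client.admin.command("shardCollection", "banklogs.transactions",
                     key={"branch": 1, "timestamp": 1})

logs = client["banklogs"]["transactions"]

# The free schema lets heterogeneous log records share one collection.
logs.insert_one({
    "timestamp": datetime.now(timezone.utc),
    "branch": "seoul-01",
    "type": "transfer",
    "payload": {"amount": 150000, "channel": "mobile"},
})

# Aggregated counts per unit time, as plotted by the log graph generator.
per_minute = logs.aggregate([
    {"$group": {
        "_id": {"$dateToString": {"format": "%Y-%m-%dT%H:%M",
                                  "date": "$timestamp"}},
        "count": {"$sum": 1}}}
])
```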

Video Retrieval System supporting Adaptive Streaming Service (적응형 스트리밍 서비스를 지원하는 비디오 검색 시스템)

  • 이윤채;전형수;장옥배
    • Journal of KIISE:Computing Practices and Letters / v.9 no.1 / pp.1-12 / 2003
  • Recently, much research on distributed processing over the Internet and on multimedia data processing has been performed, and rapid, convenient multimedia services with high quality and high speed are needed. In this paper, we design and implement a clip-based video retrieval system that works on the Web in real time. Our system consists of a content-based indexing system that offers convenient services for video content providers, and a Web-based retrieval system that makes varied information retrieval easy for users on the Web. Three important methods are used in the content-based indexing system: key frame extraction by dividing the video data, clip file creation by clustering related information, and video database construction in clip units. In the Web-based retrieval system, keyword-based retrieval, two-dimensional browsing of key frames, and real-time display of clips are used. The system supports real-time retrieval of video clips in the Web environment and provides a stable multimedia service. The proposed methods show their usefulness for providing video content and give users an easy way to search for the intended video content.
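
The abstract does not specify how key frames are chosen when the video is divided, so the sketch below uses a simple frame-differencing heuristic with OpenCV as one plausible illustration; the threshold and file name are assumptions.

```python
# Illustrative key-frame extraction by frame differencing (not the paper's method).
import cv2

def extract_key_frames(path, threshold=30.0):
    cap = cv2.VideoCapture(path)
    key_frames, prev = [], None
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Large mean difference from the previous frame suggests a shot change.
        if prev is None or cv2.absdiff(gray, prev).mean() > threshold:
            key_frames.append((idx, frame))
        prev = gray
        idx += 1
    cap.release()
    return key_frames

frames = extract_key_frames("lecture.mp4")
print(len(frames), "key frames")
```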

Comparison of Wave Stresses in the Eulerian Nearshore Current Models (오일러형 해빈류 모형의 파랑응력 비교)

  • Ahn, Kyungmo;Suh, Kyung-Duck;Chun, Hwusub
    • Journal of Korean Society of Coastal and Ocean Engineers / v.29 no.6 / pp.350-362 / 2017
  • The Eulerian nearshore current model is more advantageous than the Lagrangian model in that results from the Eulerian model can be compared directly with measurements from stationary equipment, because the wave mass flux is not included in the computed mass flux of the Eulerian nearshore current model. In addition, the Eulerian model can simulate longshore currents with a depth-varying parabolic profile. However, numerical models proposed by different researchers have different forms of the wave stress terms. For example, the wave stresses in Newberger and Allen's (2007) model are constant over depth, while those of Chun (2012) are vertically distributed. In the present study, these wave stress terms were compared against Hamilton et al.'s (2001) laboratory experiments to see how the different wave stress terms affect the computation of nearshore currents.
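
To make the contrast concrete, the sketch below compares a depth-uniform wave stress (the Newberger and Allen type) with an assumed vertically distributed one (the Chun type) carrying the same depth integral; the linear shape function is purely illustrative and is not the actual expression from either paper.

```python
# Illustrative only: same depth-integrated forcing, two vertical distributions.
import numpy as np

h = 2.0                          # water depth [m]
F = 1.0                          # depth-integrated wave stress [N/m^2]
z = np.linspace(-h, 0.0, 101)    # vertical coordinate, z = 0 at the surface

uniform = np.full_like(z, F / h)      # constant over depth
shape = (z + h) / h                   # assumed surface-weighted linear shape
distributed = F * 2.0 * shape / h     # scaled so its depth integral equals F

print(np.trapz(uniform, z), np.trapz(distributed, z))  # both ~= F
```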

A Design of Certificate Password Recovery Using Decentralized Identifier (DID를 사용한 인증서 암호 복구)

  • Kim, Hyeong-uk;Kim, Sang-jin;Kim, Tae-jin;Yu, Hyeong-geun
    • Journal of Venture Innovation / v.2 no.2 / pp.21-29 / 2019
  • With the public certificate technology commonly used in Korea, users face the cumbersome problem of having to reset their certificate whenever they forget the password. In this paper, as a solution to this problem, we propose a secure certificate password recovery protocol that uses a blockchain, PKI, and DIDs for distributed storage. A DID is a scheme for protecting the block ID in a blockchain system. The private key used in the PKI is derived from a user's biometric, for example a fingerprint, so that it can completely replace memorizing a complex private key. To this end, building on the FIDO authentication technology that most users already use on their smartphones, the process of authenticating a user before accessing data inside a block minimizes the risk of an attacker taking over the data.
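
A highly simplified sketch of the recovery flow this abstract implies: a FIDO assertion gates access to the DID record, and a key derived from the biometric unseals the certificate password. Every helper here is hypothetical; real systems would use a fuzzy extractor for noisy biometric templates and an actual DID/ledger/FIDO stack in place of the stubs.

```python
# Hypothetical sketch only; stand-ins for the blockchain, DID, and FIDO layers.
import hashlib

BLOCKCHAIN = {}   # stand-in for the distributed ledger (DID -> record)

def register(did: str, fingerprint_template: bytes, cert_password: bytes):
    """Bind the recovery secret to biometric-derived key material.
    (Real biometrics are noisy; a fuzzy extractor would be used, not a hash.)"""
    key = hashlib.sha256(fingerprint_template).digest()
    sealed = bytes(a ^ b for a, b in zip(cert_password.ljust(32, b"\0"), key))
    BLOCKCHAIN[did] = {"sealed": sealed}

def recover(did: str, fingerprint_template: bytes, fido_assertion_ok: bool):
    """FIDO authentication gates access to the data inside the block."""
    if not fido_assertion_ok:        # attacker without the device fails here
        raise PermissionError("FIDO assertion failed")
    key = hashlib.sha256(fingerprint_template).digest()
    sealed = BLOCKCHAIN[did]["sealed"]
    return bytes(a ^ b for a, b in zip(sealed, key)).rstrip(b"\0")

register("did:example:123", b"minutiae-template", b"certpw!")
print(recover("did:example:123", b"minutiae-template", fido_assertion_ok=True))
```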