• Title/Summary/Keyword: bigdata analysis

Similarity Model Analysis and Implementation for Enzyme Reaction Prediction (효소 반응 예측을 위한 유사도 모델 분석 및 구현)

  • Oh, Joo-Seong;Na, Do-Kyun;Park, Chun-Goo;Ceong, Hyi-Thaek
    • The Journal of the Korea institute of electronic communication sciences / v.13 no.3 / pp.579-586 / 2018
  • With the arrival of the bigdata era, information extraction and prediction have become important research areas. Here, we present the acquisition of a semi-automatically curated, large-scale biological database and the prediction of enzyme reaction annotations for analyzing the pharmacological activities of drugs, because the xenobiotic metabolism of pharmaceutical drugs by cellular enzymes is an important aspect of pharmacology and medicine. In this study, we apply and analyze similarity models to predict bimolecular reactions between human enzymes and their corresponding substrates. Thirteen models were selected to reflect the characteristics of each cluster in the similarity model, and they were compared on sensitivity and AUC. Among the evaluated models, the Simpson coefficient model showed the best performance in predicting the reactivity between the enzymes. The whole similarity model was implemented as a web service. The proposed model can respond dynamically to the addition of reaction information, which will help shorten new drug development time and reduce its cost.
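
The similarity scoring named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the Simpson coefficient means the overlap coefficient |A ∩ B| / min(|A|, |B|) over sets of features, and the function names and the toy feature sets are hypothetical.

```python
def simpson_coefficient(a, b):
    """Overlap (Simpson) coefficient between two feature collections.

    Assumption: each compound is represented as a set of features
    (for example, substructure keys). Returns 0.0 if either set is empty.
    """
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def rank_by_similarity(query, known):
    """Rank known substrates by similarity to a query, highest first.

    `known` maps a (hypothetical) substrate name to its feature set.
    """
    scores = {name: simpson_coefficient(query, feats) for name, feats in known.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data, for illustration only.
known_substrates = {
    "substrate_1": {"f1", "f2", "f3", "f4"},
    "substrate_2": {"f7", "f8"},
}
ranking = rank_by_similarity({"f1", "f2", "f9"}, known_substrates)
```

Because the denominator is the smaller set, a small compound fully contained in a larger one scores 1.0, which is one way this coefficient differs from the Jaccard (Tanimoto) index.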

A Study on the Current Status and Application Strategies of the Smart Devices in the Library (도서관에서의 스마트 디바이스 활용 현황분석 및 서비스 적용방안)

  • Kim, Tae-Young;Park, Tae-Yeon;Yang, Dongmin;Oh, Hyo-Jung
    • Journal of the Korean Society for Library and Information Science / v.51 no.4 / pp.203-226 / 2017
  • The advent of the fourth industrial revolution has brought various technologies such as bigdata, the internet of things, and artificial intelligence. Based on these innovations, the types of information services offered in the library can change, with smart devices as the focus. This study aims to identify the utilization status and service implications of smart devices in the library. To achieve this goal, we analyzed the current status of smart devices in the library through literature research and online search, and gathered the views of practicing librarians. We then proposed improvements to library services using smart devices. The results of this study are expected to help next-generation libraries establish service strategies.

Enhancing the performance of taxi application based on in-memory data grid technology (In-memory data grid 기술을 활용한 택시 애플리케이션 성능 향상 기법 연구)

  • Choi, Chi-Hwan;Kim, Jin-Hyuk;Park, Min-Kyu;Kwon, Kaaen;Jung, Seung-Hyun;Nazareno, Franco;Cho, Wan-Sup
    • Journal of the Korean Data and Information Science Society / v.26 no.5 / pp.1035-1045 / 2015
  • Recent studies in Big Data analysis show promising results from using main memory for rapid data processing. In-memory computing technology can be highly advantageous when used with high-performance servers that have tens of gigabytes of RAM and multi-core processors. The network constraints in such infrastructure can be lessened by combining in-memory technology with distributed parallel processing. This paper applies this concept to a test taxi-hailing application while keeping its underlying RDBMS structure. Applying IMDG technology in the application's backend API, without restructuring the database schema, yields a 6- to 9-fold increase in data processing performance and throughput. In particular, throughput changes very little even as the data processing load increases.

Self-Disclosure and Boundary Impermeability among Languages of Twitter Users (트위터 이용자의 언어권별 자기노출 및 경계 불투과성)

  • Jang, Phil-Sik
    • The Journal of the Korea Contents Association / v.16 no.4 / pp.434-441 / 2016
  • Using bigdata analysis procedures, the present study sought to review and explore various aspects of self-disclosure and boundary impermeability among Twitter users worldwide. A total of 415 million tweets issued by 54 million users were collected over 6 months, and the users of the top 10 languages were investigated. The study analyzed the effect of users' language on boundary impermeability and on the disclosure rates of user profile, profile image, geographical information, URL in profile, and user description. The results showed that boundary impermeability and all the self-disclosure rates of Twitter users (profile, profile image, geographical information, URL in profile, user description) differed significantly (p<0.001) among language groups. The self-disclosure rates and the average points of Portuguese, Indonesian, and Spanish users were higher than those of Arabic, Japanese, Turkish, and Korean users. The results also showed a positive relationship between boundary impermeability and the number of tweets (including retweets) issued by each user.

Parameter Estimation and Fitting Error Analysis of the Representative Spectrums using the Wave Spectrum off the Namhangjin, East Sea (남항진 파랑 스펙트럼 정보를 이용한 대표 스펙트럼 매개변수 추정 및 분석)

  • Cho, Hong Yeon;Jeong, Weon Mu;Oh, Sang-Ho;Baek, Won Dae
    • Journal of Korean Society of Coastal and Ocean Engineers / v.32 no.5 / pp.363-371 / 2020
  • The parameters of the modified BM and JONSWAP spectra are estimated using a spectral data set collected during high wave events off Namhangjin, on the east coast of Korea. The parameters of the modified BM spectrum were estimated to be 1.04 and 0.27, which are similar to the conventional values of 1.098 and 0.30 but differ significantly in statistical terms. On the other hand, the peak enhancement factor of the JONSWAP spectrum was estimated to be 1.4, substantially smaller than the conventional value of 3.3. The RMSE differences between the fitted results of the two spectra were small, approximately 0.2. In the frequency range above the peak frequency, however, the spectral energy density decreased relatively mildly with increasing frequency compared to the standard forms of the modified BM and JONSWAP spectra.
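
To illustrate what the peak enhancement factor controls, the sketch below evaluates the standard textbook form of the JONSWAP spectrum for the two values quoted above (1.4 and 3.3). It is not the fitting procedure used in the paper. The Phillips-type constant `alpha=0.0081`, the spectral widths 0.07 and 0.09, and the peak frequency in the example are assumptions chosen for illustration.

```python
import math


def jonswap(f, fp, gamma=3.3, alpha=0.0081, g=9.81):
    """Standard-form JONSWAP spectral density at frequency f (Hz).

    fp    : peak frequency (Hz)
    gamma : peak enhancement factor (3.3 is the conventional value)
    alpha : energy scale constant (assumed value, normally fitted)
    """
    sigma = 0.07 if f <= fp else 0.09  # conventional spectral widths
    # Exponent of gamma: equals 1 at the peak and decays to 0 away from it.
    peak = math.exp(-((f - fp) ** 2) / (2.0 * sigma**2 * fp**2))
    # Pierson-Moskowitz-shaped base spectrum.
    base = alpha * g**2 * (2.0 * math.pi) ** -4 * f**-5 * math.exp(-1.25 * (fp / f) ** 4)
    return base * gamma**peak


fp = 0.1  # hypothetical peak frequency in Hz
conventional_peak = jonswap(fp, fp, gamma=3.3)
estimated_peak = jonswap(fp, fp, gamma=1.4)
```

At the peak frequency the two spectra differ by exactly the ratio of the gamma values, while far from the peak the enhancement term tends to 1 and the curves coincide, so a smaller factor mainly flattens the peak.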

A Study on Implementation of Fraud Detection System (FDS) Applying BigData Platform (빅데이터 기술을 활용한 이상금융거래 탐지시스템 구축 연구)

  • Kang, Jae-Goo;Lee, Ji-Yean;You, Yen-Yoo
    • Journal of the Korea Convergence Society / v.8 no.4 / pp.19-24 / 2017
  • The growing number of electronic financial transactions (e-banking) has brought a rapid increase in security threats such as extortion and falsification of financial transaction data. Against this background, strict security and countermeasures against such problems have become urgent tasks. This study therefore aims to implement an improved case model by applying a Fraud Detection System (hereinafter, FDS) at financial corporation 'A' using big data techniques (e.g. the function to collect and store, in real time, various types of typical and atypical financial transaction event data regarding external intrusion, outflow of internal data, and fraudulent financial transactions). As a result, advanced scenario analysis minimized false alarms and reduced the number of detection targets compared with the previous scenarios. The study also suggests future directions for the enhanced FDS.

Spatial Correlation Analysis of the Mean Sea Level Data Sets in the Coastal Seas, Korea (한국 연안 평균 해수면 자료의 공간 상관관계 분석)

  • Cho, Hong-Yeon;Jeong, Shin Taek;Lee, Uk Jae
    • Journal of Korean Society of Coastal and Ocean Engineers / v.32 no.1 / pp.85-93 / 2020
  • The basic information on the mean sea level data of all tidal monitoring stations in Korea was reviewed, and the correlation coefficients between the stations were analyzed. Mean sea level changes expected from global climate change were found to show a high correlation of more than 0.75 regardless of the distance between the stations. The data for certain station pairs showed negative correlations or low correlations of 0.25 or less, but this was attributed to small sample sizes and outliers. However, since these correlations assume a linear increase and a linear relationship, the estimates may be distorted for data with fluctuating trends that deviate from this assumption. The patterns of change in the MSL data show that a number of the MSL data sets do not follow a linear trend.
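
A pairwise station correlation of the kind described above can be sketched as follows. The abstract does not name the coefficient, so Pearson's correlation is assumed here; the station names and values are hypothetical and exist only to make the example runnable.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(series_by_station):
    """Return {(station_a, station_b): r} for every pair of stations."""
    return {
        (a, b): pearson(series_by_station[a], series_by_station[b])
        for a, b in combinations(sorted(series_by_station), 2)
    }


# Hypothetical annual mean sea level values (cm), for illustration only.
stations = {
    "station_A": [10.0, 10.4, 10.9, 11.1, 11.8],
    "station_B": [20.1, 20.3, 21.0, 21.2, 21.9],
}
correlations = pairwise_correlations(stations)
```

Pearson's coefficient measures only linear association, which is why the abstract cautions that series with fluctuating, non-linear trends can give distorted values.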

Independence and Homogeneity Tests of the Annual Maxima Data used to Estimate the Design Wave Height (설계파고 추정에 사용한 연 최대 자료의 독립 및 분포 동질 검정)

  • Cho, Hong Yeon;Jeong, Weon Mu;Back, Jong Dai
    • Journal of Korean Society of Coastal and Ocean Engineers / v.32 no.1 / pp.26-38 / 2020
  • A statistical test was carried out on the IID (Independent and Identically Distributed) assumption of the AM (Annual Maxima) data used to estimate the design wave height. The test was divided into an independence (randomness) test and a homogeneity test, and each test was conducted on AM data from 210 and 310 stations in the coastal and inner coastal grids under typhoon and non-typhoon (monsoon) conditions. In the independence test, the rejection ratios were in the range of 1.8~5.3% and 1.4~6.0% for the non-typhoon and typhoon data sets, respectively. On the other hand, in the distribution difference test between typhoon and non-typhoon data, the same-distribution hypothesis was rejected in 47~79% of cases, depending on the test method, for both the coastal grid and the inner coastal grid. Therefore, when estimating the design wave height by extreme value analysis, it is appropriate to treat the typhoon and non-typhoon data separately.
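
The abstract does not say which homogeneity tests were used. As one common choice, the sketch below applies a two-sample Kolmogorov-Smirnov test, using the large-sample critical value at the 5% level (coefficient 1.358). The function names and the sample values are hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(x, y):
    """Two-sample KS statistic: max gap between the empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in sorted(set(xs) | set(ys)):
        fx = bisect_right(xs, v) / len(xs)  # empirical CDF of x at v
        fy = bisect_right(ys, v) / len(ys)  # empirical CDF of y at v
        d = max(d, abs(fx - fy))
    return d


def rejects_same_distribution(x, y, coefficient=1.358):
    """Large-sample decision rule; 1.358 corresponds to the 5% level."""
    n, m = len(x), len(y)
    critical = coefficient * math.sqrt((n + m) / (n * m))
    return ks_statistic(x, y) > critical


# Hypothetical annual maxima (m), for illustration only.
typhoon = [4.0 + 0.1 * i for i in range(30)]
non_typhoon = [2.0 + 0.05 * i for i in range(30)]
separate_fits_suggested = rejects_same_distribution(typhoon, non_typhoon)
```

A rejection at a station suggests fitting the typhoon and non-typhoon maxima separately, which matches the conclusion of the abstract.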

Graph Database Benchmarking Systems Supporting Diversity (다양성을 지원하는 그래프 데이터베이스 벤치마킹 시스템)

  • Choi, Do-Jin;Baek, Yeon-Hee;Lee, So-Min;Kim, Yun-A;Kim, Nam-Young;Choi, Jae-Young;Lee, Hyeon-Byeong;Lim, Jong-Tae;Bok, Kyoung-Soo;Song, Seok-Il;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.21 no.12 / pp.84-94 / 2021
  • Graph databases have been developed to efficiently store and query graph data, in which vertices and edges express relationships between objects. Since the query types of graph databases differ greatly from those of traditional NoSQL databases, benchmarking tools suited to verifying graph database performance are needed. In this paper, we propose an efficient graph database benchmarking system that supports diversity in graph inputs and queries. The proposed system uses OrientDB to conduct benchmarking for graph databases. To support the diversity of input graphs and query graphs, we use LDBC, an existing graph data generation tool. We demonstrate the feasibility and effectiveness of the proposed scheme through analysis of benchmarking results. The performance evaluation shows that the proposed system can generate customizable synthetic graph data and that benchmarking can be performed on the generated graph data.

A Study on the Data Collection and Convergence of Career Advisor System Using AI (AI를 활용한 대학생 진로 조언 시스템 모델 및 데이터 수집과 융합에 대한 연구)

  • Kim, Jong-yul;Ro, Kwang-hyun
    • Journal of Digital Convergence / v.17 no.2 / pp.177-185 / 2019
  • The purpose of this study is to investigate the causes of career problems, which are the biggest problems facing Korean university students, and, drawing on case studies of domestic and global universities, to suggest a career advisor system model for college students. Collecting advice and learning data is the most important step in solving the career problems of college students with information technology such as data analysis and AI. Research has not been actively pursued because universities have very limited internal data with which to advise on career problems. In this paper, we study the data types and methods of career advice for college students and propose a career advisor counseling system for them.