Search | Korea Science

Performance Comparison of Logistic Regression Algorithms on RHadoop

Jung, Byung Ho;Lim, Dong Hoon
- Journal of the Korea Society of Computer and Information / v.22 no.4 / pp.9-16 / 2017
Machine learning has found widespread application in many domains of daily life. Logistic regression is a classification method in machine learning and is widely used in many fields, including medicine, economics, marketing, and the social sciences. In this paper, we present MapReduce implementations of three existing algorithms for logistic regression, namely the Gradient Descent, Cost Minimization, and Newton-Raphson algorithms, on RHadoop, which integrates R and Hadoop and is applicable to large-scale data. We compare the performance of these algorithms in estimating logistic regression coefficients on real and simulated data sets. We also compare the performance of the RHadoop and RHIPE platforms. The experiments showed that our Newton-Raphson algorithm outperformed the Gradient Descent and Cost Minimization algorithms on all data tested, and that RHadoop outperformed RHIPE on real data, while the opposite held on simulated data.
https://doi.org/10.9708/jksci.2017.22.04.009
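As a rough illustration of two of the algorithms named in the abstract above, the following single-machine Python sketch estimates logistic regression coefficients (intercept plus one slope) by gradient descent and by Newton-Raphson. It is not the authors' RHadoop/MapReduce code; the function names, the learning rate, the step counts, and the toy data are all assumptions made for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient(beta, xs, ys):
    """Gradient of the negative log-likelihood for an intercept and one slope."""
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        r = sigmoid(beta[0] + beta[1] * x) - y  # residual p - y
        g0 += r
        g1 += r * x
    return g0, g1

def fit_gradient_descent(xs, ys, lr=0.1, steps=20000):
    """First-order method: many cheap steps along the averaged gradient."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0, g1 = gradient((b0, b1), xs, ys)
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def fit_newton_raphson(xs, ys, steps=25):
    """Second-order method: few steps, each solving a 2x2 system H d = g."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0, g1 = gradient((b0, b1), xs, ys)
        h00 = h01 = h11 = 0.0
        for x in xs:
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)  # per-observation weight in the Hessian
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 -= (h11 * g0 - h01 * g1) / det
        b1 -= (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy, non-separable data invented for this sketch (not from the paper).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]
gd_beta = fit_gradient_descent(xs, ys)
nr_beta = fit_newton_raphson(xs, ys)
print("gradient descent:", gd_beta)
print("Newton-Raphson:  ", nr_beta)
```

In this sketch both methods reach the same estimate, but Newton-Raphson does so in a handful of iterations at the cost of forming and solving a Hessian system in each one. The per-observation sums for the gradient and Hessian are the kind of quantity that a map step could compute per data block and a reduce step could add up, although how the paper actually partitions the work is not stated in the abstract.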

The Structural Relationship of Customer Data Integration and CRM Performances (고객 데이터 통합과 CRM성과간의 구조적 관련성)

Kang Jae-Jung;Moon Tae-Soo
- The Journal of Information Systems / v.15 no.3 / pp.87-106 / 2006
The customer-focused enterprise is interested in integrating every record of an interaction with a customer. This study investigates the structural relationship among data integration, customer analysis capability, marketing & sales capability, customer service capability, and CRM performance. A total of 205 survey responses were collected from companies that had implemented a CRM package. SEM analysis shows that data integration influences CRM performance through the improvement of customer analysis capability, marketing & sales capability, and customer service capability. A revised model with improved goodness of fit shows that data integration influences the improvement of customer analysis capability, marketing & sales capability, and customer service capability, but that customer analysis capability has only an indirect influence on CRM performance, through the improvement of marketing & sales capability and customer service capability.

Performance Analysis of RDBMS and MongoDB through YCSB in Medical Data Processing System Based HL7 FHIR (HL7 FHIR 기반 의료 데이터 처리 시스템에서 YCSB를 통한 RDBMS와 MongoDB의 성능 분석 연구)

Jeon, Dong-cheol;Lee, Byung Mun;Hwang, Heejoung
- Journal of Korea Multimedia Society / v.21 no.8 / pp.934-941 / 2018
RDBMSs face limits in cost and efficiency when handling large amounts of data, and NoSQL is gaining popularity. In medical institutions, data formats differ between organizations, which makes interoperability between them difficult. In this paper, we focus on the performance differences between RDBMS and NoSQL for medical documents. We built two different environments and conducted a comparative experimental analysis of NoSQL and RDBMS on medical data. We used HL7 FHIR as the medical data standard and the YCSB benchmark tool for the performance comparison. The experiments show that NoSQL performs better in medical data processing systems that handle large amounts of data, on the order of 10,000 to 100,000 records or more.
https://doi.org/10.9717/kmms.2018.21.8.934

Performance Evaluation of Energy Management Algorithms for MapReduce System (MapReduce 시스템을 위한 에너지 관리 알고리즘의 성능평가)

Kim, Min-Ki;Cho, Haengrae
- IEMEK Journal of Embedded Systems and Applications / v.9 no.2 / pp.109-115 / 2014
Analyzing large-scale data has become an important activity for many organizations. Since MapReduce is a promising tool for processing massive data sets, a growing number of studies evaluate the performance of various algorithms related to MapReduce. In this paper, we first develop a simulation framework that includes a MapReduce workload model, a data center model, and a data access pattern model. We then propose two algorithms that can reduce the energy consumption of MapReduce systems. Using the simulation framework, we evaluate the performance of the proposed algorithms under different application characteristics and data center configurations.
https://doi.org/10.14372/IEMEK.2014.9.2.109

TMY2 Weather data for Korea (TMY2 방식에 의한 국내 기상자료 작성 연구)

Shin, Kee-Shik;Yoon, Chang-Ryuel;Park, Sang-Dong
- Proceedings of the Korean Society for New and Renewable Energy Conference / 2009.06a / pp.243-246 / 2009
Many building simulation programs are used to evaluate building energy performance, and their capabilities continue to develop. Despite these increased capabilities, building energy performance evaluations still rely on the same limited set of weather data. This often forces users to find or calculate weather data such as illuminance, solar radiation, and ground temperature from other sources. Proper selection of a weather data set is also considered one of the important factors for a successful building energy simulation. In this paper, we describe TMY2 data, a generalized weather data format, apply it to the Seoul region, and examine how it differs from existing weather data. A database of 23 years of raw weather data has been developed to provide the weather data file for building energy analysis in Seoul.

Impact of Data Continuity in EEG Signal-based BCI Research (뇌파 신호 기반 BCI 연구에서 데이터 연속성의 영향)

Youn-Sang Kim;Ju-Hyuck Han;Woong-Sik Kim
- Journal of the Institute of Convergence Signal Processing / v.25 no.1 / pp.7-14 / 2024
This study conducted a comparative experiment on the continuity of time series data and the classification performance of artificial intelligence models. In BCI research using EEG signals, the performance of behavior and thought classification improved as the continuity of the data decreased. In particular, LSTM achieved a high performance of 0.8728 on data with low continuity, and DNN showed a performance of 0.9178 when continuity was not considered. This suggests that data without continuity may perform better. Additionally, data without continuity showed better performance in task classification. These results suggest that BCI research based on EEG signals can perform better by showing various data characteristics through shuffling rather than considering data continuity.
https://doi.org/10.23087/jkicsp.2024.25.1.002

Performance Analysis for Group Delay and Non-linear Characteristics in High Speed Data Satellite Communication System (초고속 위성통신 시스템의 군 지연 및 비 선형 특성에 대한 영향 분석)

김영완;송윤정;김내수
- Proceedings of the IEEK Conference / 2000.11a / pp.113-116 / 2000
This paper presents the effects of group delay and non-linear characteristics in a high-speed data satellite channel. Based on models of group delay and non-linear characteristics, performance was analyzed for a Ka-band satellite channel. Group delay and non-linear characteristics severely affect system performance in high-speed data transmission, so a higher Eb/No is needed to satisfy the required system performance. The optimum operating points of the HDR satellite transmission system are obtained by considering the analysis results for the channel characteristics.

Marketing Performance and Big Data Use During the COVID-19 Pandemic: A Case Study of SMEs in Indonesia

WIBOWO, Sampurno;SURYANA, Yuyus;SARI, Diana;KALTUM, Umi
- The Journal of Asian Finance, Economics and Business / v.8 no.7 / pp.571-578 / 2021
The outbreak of the COVID-19 pandemic, which began in 2020, had a significant impact on the economy and business activities worldwide. Large companies and small businesses alike were affected; many had to scale down or redirect their businesses, and some even had to stop. This extraordinary situation requires business people to innovate and adjust in order to survive during a pandemic. In the digital era, business players are helped by the ease of internet access, which makes it easier for SME players to obtain data from their consumers. Business actors can use this data to innovate and create new offerings that improve business performance during the pandemic. This research aims to identify how small and medium enterprises can take advantage of Big Data to improve marketing performance through innovation and value creation. The research uses a quantitative method. The respondents are SME producers of food and beverages, with a total of 150 respondents. The results indicate that all the proposed hypotheses are accepted. The most significant influence is found in the relationship of Big Data to value creation. The lowest effect was obtained from the relationship between Big Data and marketing performance through the mediation variable and innovation capability.
https://doi.org/10.13106/jafeb.2021.vol8.no7.0571

Study on the Surface Defect Classification of Al 6061 Extruded Material By Using CNN-Based Algorithms (CNN을 이용한 Al 6061 압출재의 표면 결함 분류 연구)

Kim, S.B.;Lee, K.A.
- Transactions of Materials Processing / v.31 no.4 / pp.229-239 / 2022
A Convolutional Neural Network (CNN) is a class of deep learning algorithm that can be used for image analysis. In particular, it performs very well at finding patterns in images, so CNNs are commonly applied to recognizing, learning, and classifying images. In this study, the surface defect classification performance of CNN-based algorithms on Al 6061 extruded material was compared and evaluated. First, data collection criteria were suggested and a total of 2,024 data samples were prepared. These were randomly split into 1,417 training samples and 607 evaluation samples. The size and quality of the training data set were then improved using data augmentation techniques to increase deep learning performance. The CNN-based algorithms used in this study were VGGNet-16, VGGNet-19, ResNet-50, and DenseNet-121. Defect classification performance was evaluated by comparing accuracy, loss, and learning speed on verification data. The DenseNet-121 algorithm performed better than the other algorithms, with an accuracy of 99.13% and a loss value of 0.037. This is attributed to the structural characteristics of the DenseNet model, in which information loss is reduced by acquiring information from all previous layers for image identification. Based on these results, the possibility of applying CNN-based models in machine vision for the surface defect classification of Al extruded materials is also discussed.
https://doi.org/10.5228/KSTP.2022.31.4.229

A Case Study of a Navigator Optimization Process

Cho, Doosan
- International journal of advanced smart convergence / v.6 no.1 / pp.26-31 / 2017
When a mobile navigator device accesses data randomly, cache memory performance deteriorates rapidly due to low memory access locality. For instance, the GPS (Global Positioning System) component of navigator programs for automobiles or drones, which are currently in common use, uses data from 32 satellites to compute the current position of a receiver. This positioning computation is the major part of GPS processing and accounts for more than 50% of the computation in the program. In this task, the satellite signals are received in real time and stored in buffer memories. Because the necessary data cannot be stored sequentially, it is read and used in random order. Since these data access patterns are random, memory system performance suffers from low data locality, and as a result it is difficult to process data in real time. Improving the low memory access locality inherent in the algorithms of conventional communication applications requires a dedicated optimization technique. In this study, we apply data and memory optimizations to improve locality. In our experiments, we show that this approach can improve the processing speed of the core computation and improve overall system performance by 14%.
https://doi.org/10.7236/IJASC.2017.6.1.26

Search Result 31,644, Processing Time 0.047 seconds
