• Title/Summary/Keyword: machine learning

Prediction of Short and Long-term PV Power Generation in Specific Regions using Actual Converter Output Data (실제 컨버터 출력 데이터를 이용한 특정 지역 태양광 장단기 발전 예측)

  • Ha, Eun-gyu;Kim, Tae-oh;Kim, Chang-bok
    • Journal of Advanced Navigation Technology / v.23 no.6 / pp.561-569 / 2019
  • Solar photovoltaic (PV) systems generate electricity from solar radiation alone, and their use is expanding rapidly as a new energy source. This study predicts short- and long-term PV power generation using actual converter output data from a photovoltaic system. The prediction algorithms are multiple linear regression, support vector machine (SVM), and deep learning models, namely a deep neural network (DNN) and long short-term memory (LSTM). In addition, three models are used according to the input and output structure of the weather elements. Long-term forecasts are made monthly, seasonally, and annually, and short-term forecasts are made for 7 days. The deep learning networks achieve better prediction accuracy than multiple linear regression and SVM. LSTM, which is better suited to time series prediction than DNN, is somewhat superior in prediction accuracy. With respect to the input and output structure, Model 2 has less error than Model 1, and Model 3 has less error than Model 2.

A Study on Predictive Models based on the Machine Learning for Evaluating the Extent of Hazardous Zone of Explosive Gases (기계학습 기반의 가스폭발위험범위 예측모델에 관한 연구)

  • Jung, Yong Jae;Lee, Chang Jun
    • Korean Chemical Engineering Research / v.58 no.2 / pp.248-256 / 2020
  • In this study, machine learning based predictive models are developed for evaluating the extent of the hazardous zone of explosive gases. They can provide important guidelines for installing explosion-proof apparatus. To train the predictive models, 1,200 data sets covering 12 combustible gases and their extents of hazardous zone are generated. The extent of the hazardous zone is the output variable, and 12 variables affecting the output are the input variables. Multiple linear regression, principal component regression, and an artificial neural network are employed to train the predictive models. The mean absolute percentage errors of multiple linear regression, principal component regression, and the artificial neural network are 44.2%, 49.3%, and 5.7%, and the root mean square errors are 1.389 m, 1.602 m, and 0.203 m, respectively. The artificial neural network therefore shows the best performance. This model can be easily used to evaluate the extent of the hazardous zone for explosive gases.
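
The two error measures reported in this abstract can be sketched as follows. This is a minimal illustration of the standard definitions of MAPE and RMSE, not the authors' code; the function names and the sample values are hypothetical and are not data from the paper.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error in percent (assumes no actual value is zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the same unit as the inputs."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical hazardous-zone extents in metres (illustrative only).
actual = [2.0, 4.0, 5.0]
predicted = [2.2, 3.6, 5.0]

print(f"MAPE = {mape(actual, predicted):.2f} %")
print(f"RMSE = {rmse(actual, predicted):.3f} m")
```

MAPE is relative to the size of each true value, while RMSE is in metres and penalizes large individual errors more heavily, which is why abstracts often report both.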

Recognition of Answer Type for WiseQA (WiseQA를 위한 정답유형 인식)

  • Heo, Jeong;Ryu, Pum Mo;Kim, Hyun Ki;Ock, Cheol Young
    • KIPS Transactions on Software and Data Engineering / v.4 no.7 / pp.283-290 / 2015
  • In this paper, we propose a hybrid method for the recognition of answer types in the WiseQA system. The answer types are classified into two categories: the lexical answer type (LAT) and the semantic answer type (SAT). This paper proposes two models for LAT detection. One is a rule-based model using question focuses. The other is a machine learning model based on sequence labeling. We also propose two models for SAT classification: a machine learning model based on multiclass classification and a filtering-rule model based on the lexical answer type. LAT detection and SAT classification achieve an F1-score of 82.47% and a precision of 77.13%, respectively. Compared with IBM Watson on LAT performance, the precision is 1.0% lower and the recall is 7.4% higher.
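
For readers comparing the precision, recall, and F1 figures quoted above, the following sketch shows how the three are related under their standard definitions. The counts are hypothetical and the helper name is an assumption; nothing here comes from the WiseQA evaluation itself.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts (illustrative only).
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean, a system with slightly lower precision but clearly higher recall can still come out ahead on F1.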

Forecasting of Short-term Wind Power Generation Based on SVR Using Characteristics of Wind Direction and Wind Speed (풍향과 풍속의 특징을 이용한 SVR기반 단기풍력발전량 예측)

  • Kim, Yeong-ju;Jeong, Min-a;Son, Nam-rye
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.5 / pp.1085-1092 / 2017
  • In this paper, we propose a wind forecasting method that reflects wind characteristics to improve the accuracy of wind power prediction. The proposed method consists of two parts: extracting wind characteristics and predicting power generation. The extraction part uses correlation analysis of power generation amount, wind direction, and wind speed. Based on the correlation between wind direction and wind speed, the feature vector is extracted by clustering with the K-means method. In the prediction part, machine learning is performed using SVR, which generalizes the SVM so that an arbitrary real value can be predicted. The proposed method, which reflects the characteristics of wind, was compared with the conventional method, which does not. To verify the accuracy and feasibility of the proposed method, we used data collected from three different locations of a Jeju Island wind farm. Experimental results show that the proposed method has a lower error than the conventional wind power prediction method.

Emotion Classification of User's Utterance for a Dialogue System (대화 시스템을 위한 사용자 발화 문장의 감정 분류)

  • Kang, Sang-Woo;Park, Hong-Min;Seo, Jung-Yun
    • Korean Journal of Cognitive Science / v.21 no.4 / pp.459-480 / 2010
  • A dialogue system includes various morphological analyses for recognizing a user's intention from the user's utterances. However, a user can express various intentions through emotional states in addition to morphological expressions. Recognizing a user's emotion therefore allows the user's intention to be analyzed in more ways. This paper presents a new method to automatically recognize a user's emotion for a dialogue system. For general emotions, we define nine categories using a psychological approach. For an optimal feature set, we organize a combination of sentential, a priori, and context features. We then employ a support vector machine (SVM), which has been widely used in various learning tasks, to automatically classify a user's emotions. Experimental results show that our method achieves a 62.8% F-measure, 15% higher than the reference system.

Experiment and Implementation of a Machine-Learning Based k-Value Prediction Scheme in a k-Anonymity Algorithm (k-익명화 알고리즘에서 기계학습 기반의 k값 예측 기법 실험 및 구현)

  • Muh, Kumbayoni Lalu;Jang, Sung-Bong
    • KIPS Transactions on Computer and Communication Systems / v.9 no.1 / pp.9-16 / 2020
  • The k-anonymity scheme has been widely used to protect private information when Big Data are distributed to a third party for research purposes. When the scheme is applied, determining an optimal k value is one of the difficult problems to be resolved because many factors must be considered. Currently, the value is determined almost manually by human experts relying on their intuition. This degrades the performance of the anonymization and costs the experts considerable time and money. To overcome this problem, a simple idea based on machine learning has been proposed. This paper describes implementations and experiments to realize the proposed idea. In this work, a deep neural network (DNN) is implemented using TensorFlow libraries, and it is trained and tested using an input dataset. The experimental results show that the trend of training errors follows a typical DNN pattern, but the validation errors of our model show a pattern different from the one seen in a typical training process. The advantage of the proposed approach is that it can reduce the time and cost for experts to determine the k value because this can be done semi-automatically.
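
To make the quantity being predicted concrete, the sketch below computes the k that a given table actually satisfies, using the standard definition of k-anonymity (every combination of quasi-identifier values is shared by at least k records). It does not reproduce the paper's DNN; the function name, column names, and records are hypothetical.

```python
from collections import Counter

def achieved_k(records, quasi_identifiers):
    """Return the size of the smallest group of records sharing the same
    quasi-identifier values, i.e. the largest k for which the table is k-anonymous.
    Assumes `records` is a non-empty list of dicts."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

# Hypothetical, already-generalized records (illustrative only).
records = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "cold"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "asthma"},
    {"age": "30-39", "zip": "148**", "disease": "cold"},
]

print(achieved_k(records, ["age", "zip"]))  # prints 2
```

A larger k gives stronger protection but usually requires coarser generalization, which is the trade-off that makes choosing k by hand difficult.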

Recognition of Korean Implicit Citation Sentences Using Machine Learning with Lexical Features (어휘 자질 기반 기계 학습을 사용한 한국어 암묵 인용문 인식)

  • Kang, In-Su
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.8 / pp.5565-5570 / 2015
  • Implicit citation sentence recognition locates citation sentences that lack explicit citation markers in the full text of articles. State-of-the-art approaches exploit word n-grams, clue words, researchers' surnames, mentions of previous methods, and distance relative to the nearest explicit citation sentences, reaching over 50% performance. However, most previous work has been conducted on English. For Korean, a rule-based method using positive/negative clue patterns was reported to attain a performance of 42%, leaving room for improvement. This study attempted to learn to recognize implicit citation sentences in the full text of Korean literature using Korean lexical features. Different lexical feature units such as Eojeol, morpheme, and Eumjeol were evaluated to determine proper lexical features for Korean implicit citation sentence recognition. In addition, lexical features were combined with position features representing backward/forward proximity to explicit citation sentences, improving the performance to over 50%.

Outlier prediction in sensor network data using periodic pattern (주기 패턴을 이용한 센서 네트워크 데이터의 이상치 예측)

  • Kim, Hyung-Il
    • Journal of Sensor Science and Technology / v.15 no.6 / pp.433-441 / 2006
  • Because sensor networks operate at low power and low data rates, outliers occur frequently in their time series data. In this paper, we propose a periodic pattern analysis for sensor network time series data and use it to predict the outliers in that data. A periodic pattern is the minimum time period over which the trend of the data values appears continuously and repeatedly. To analyze the periodic pattern, quantization and smoothing are applied to the time series data, and the fluctuation between adjacent values in the smoothed data is measured to convert the series into simplified data. The periodic pattern is then extracted from the simplified data, and the time series is restructured by period to produce periodic pattern data. In the experiment, machine learning is applied to the periodic pattern data to predict outliers. The distinguishing feature of this analysis is that it identifies periods from the fluctuation of the data values rather than from their magnitude, which makes it robust to outliers. In addition, restructuring the time series into periodic patterns allows the values of the time attribute to be expressed as values within a time period. The time attribute can therefore be used even by general machine learning algorithms that cannot learn from time series data directly.
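
The abstract does not give the exact smoothing, quantization, or pattern-extraction procedure, so the following is only a rough sketch of the general idea under stated assumptions: a moving average stands in for the smoothing step, the sign of each adjacent difference stands in for the "fluctuation", and the period is taken as the shortest shift under which the sign sequence repeats. All function names and the sample readings are hypothetical.

```python
def smooth(values, window=3):
    """Moving average over `window` points; returns len(values) - window + 1 values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def fluctuation(values, tol=1e-9):
    """Reduce the series to +1 (rise), -1 (fall), or 0 (flat) per adjacent pair."""
    signs = []
    for a, b in zip(values, values[1:]):
        d = b - a
        signs.append(0 if abs(d) <= tol else (1 if d > 0 else -1))
    return signs

def min_period(symbols):
    """Smallest shift p (at most half the length) under which the sequence
    repeats itself; returns None if no such shift exists."""
    n = len(symbols)
    for p in range(1, n // 2 + 1):
        if all(symbols[i] == symbols[i + p] for i in range(n - p)):
            return p
    return None

# Hypothetical sensor readings with a repeating rise/fall shape (illustrative only).
readings = [1, 2, 3, 2, 1, 2, 3, 2, 1, 2, 3, 2, 1]
signs = fluctuation(smooth(readings))
print(signs)
print(min_period(signs))  # prints 4
```

Because only the direction of change is kept, a single reading with an extreme magnitude alters at most a few signs, which matches the robustness argument made in the abstract.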

Relationship among Degree of Time-delay, Input Variables, and Model Predictability in the Development Process of Non-linear Ecological Model in a River Ecosystem (비선형 시계열 하천생태모형 개발과정 중 시간지연단계와 입력변수, 모형 예측성 간 관계평가)

  • Jeong, Kwang-Seuk;Kim, Dong-Kyun;Yoon, Ju-Duk;La, Geung-Hwan;Kim, Hyun-Woo;Joo, Gea-Jae
    • Korean Journal of Ecology and Environment / v.43 no.1 / pp.161-167 / 2010
  • In this study, we took an experimental approach to ecological model development in order to emphasize the importance of input variable selection with respect to the time-delayed arrangement between input and output variables. Time-series modeling requires the selection of relevant input variables for the prediction of a specific output variable (e.g. density of a species). Inadequate input variables often increase model construction time and lower the efficiency of the developed model when it is applied to real-world cases. Therefore, for future prediction, researchers have to decide the number of time-delay steps (e.g. months, weeks or days; t-n) used to predict a certain phenomenon at current time t. We prepared a total of 3,900 equation models produced by the Time-Series Optimized Genetic Programming (TSOGP) algorithm for the prediction of the monthly averaged density of a potamic phytoplankton species, Stephanodiscus hantzschii, considering future prediction from 0 (no future prediction) to 12 months ahead (at 1-month intervals; 300 equations per month-delay). The investigation of model structure showed that input variable selectivity was clearly affected by the time-delay arrangement, and that model predictability was related to the type of input variables. We conclude that, although the Machine Learning (ML) algorithms popularly used in Ecological Informatics (EI) provide high performance in future prediction of ecological entities, model efficiency is lowered unless relevant input variables are selectively used.
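
The time-delayed arrangement described here (inputs at time t-n paired with the output at time t) can be illustrated with a short sketch. It shows only how training pairs change with the delay n; it does not implement TSOGP. The function name, the choice of water temperature as the input, and the monthly values are all hypothetical.

```python
def lagged_pairs(inputs, target, delay):
    """Pair the input observed at time t - delay with the target at time t.
    Assumes both series are equally long and sampled at the same interval."""
    if delay < 0:
        raise ValueError("delay must be non-negative")
    return [(inputs[t - delay], target[t]) for t in range(delay, len(target))]

# Hypothetical monthly series (illustrative only).
water_temperature = [4.0, 6.5, 11.0, 16.0, 21.0, 24.5]
species_density = [900, 750, 300, 120, 40, 10]

# delay=0 uses same-month inputs; delay=2 predicts two months ahead.
print(lagged_pairs(water_temperature, species_density, 0))
print(lagged_pairs(water_temperature, species_density, 2))
```

Each additional step of delay removes one usable pair from the start of the series, so longer forecast horizons are trained on less data and may favor different input variables.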

An Intrusion Detection System based on the Artificial Neural Network for Real Time Detection (실시간 탐지를 위한 인공신경망 기반의 네트워크 침입탐지 시스템)

  • Kim, Tae Hee;Kang, Seung Ho
    • Convergence Security Journal / v.17 no.1 / pp.31-38 / 2017
  • As cyber-attacks over networks advance, it is difficult for intrusion detection systems based on simple rules to detect novel types of attack such as Advanced Persistent Threat (APT) attacks. At present, much research focuses on applying machine learning techniques to intrusion detection systems in order to detect previously unknown attacks. When machine learning techniques are used, the performance of the intrusion detection system largely depends on the feature set used as its input. Generally, more features increase the accuracy of the intrusion detection system, but they cause a problem when fast responses are required because of the longer processing time. In this paper, we present a network intrusion detection system based on an artificial neural network, which adopts a multi-objective genetic algorithm to satisfy both requirements: accuracy and fast response. The proposed approach is compared with previously proposed approaches on the NSL_KDD data set to evaluate its performance.
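
The abstract does not describe the multi-objective genetic algorithm in detail. As a hedged illustration of the trade-off it targets, the sketch below applies Pareto dominance, a notion commonly used in multi-objective optimization, to candidate feature sets scored by accuracy and response time. The function names and scores are hypothetical and do not come from the paper.

```python
def dominates(a, b):
    """True if candidate a is at least as good as b on both objectives and
    strictly better on one. Candidates are (accuracy, seconds) tuples:
    higher accuracy is better, lower time is better."""
    acc_a, time_a = a
    acc_b, time_b = b
    no_worse = acc_a >= acc_b and time_a <= time_b
    strictly_better = acc_a > acc_b or time_a < time_b
    return no_worse and strictly_better

def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical (accuracy, detection time in seconds) per feature subset.
candidates = [(0.95, 3.0), (0.93, 1.2), (0.90, 1.5), (0.85, 0.4), (0.95, 3.5)]
print(pareto_front(candidates))
```

The surviving candidates represent different balances between accuracy and speed; which one to deploy depends on how fast a response the detection system must give.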