• Title/Summary/Keyword: threshold learning

Search Result 209, Processing Time 0.023 seconds

Sleep Deprivation Attack Detection Based on Clustering in Wireless Sensor Network (무선 센서 네트워크에서 클러스터링 기반 Sleep Deprivation Attack 탐지 모델)

  • Kim, Suk-young;Moon, Jong-sub
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.31 no.1
    • /
    • pp.83-97
    • /
    • 2021
  • Wireless sensors that make up the Wireless Sensor Network generally have extremely limited power and resources. The wireless sensor enters the sleep state at a certain interval to conserve power. The Sleep deflation attack is a deadly attack that consumes power by preventing wireless sensors from entering the sleep state, but there is no clear countermeasure. Thus, in this paper, using clustering-based binary search tree structure, the Sleep deprivation attack detection model is proposed. The model proposed in this paper utilizes one of the characteristics of both attack sensor nodes and normal sensor nodes which were classified using machine learning. The characteristics used for detection were determined using Long Short-Term Memory, Decision Tree, Support Vector Machine, and K-Nearest Neighbor. Thresholds for judging attack sensor nodes were then learned by applying the SVM. The determined features were used in the proposed algorithm to calculate the values for attack detection, and the threshold for determining the calculated values was derived by applying SVM.Through experiments, the detection model proposed showed a detection rate of 94% when 35% of the total sensor nodes were attack sensor nodes and improvement of up to 26% in power retention.

Flood Disaster Prediction and Prevention through Hybrid BigData Analysis (하이브리드 빅데이터 분석을 통한 홍수 재해 예측 및 예방)

  • Ki-Yeol Eom;Jai-Hyun Lee
    • The Journal of Bigdata
    • /
    • v.8 no.1
    • /
    • pp.99-109
    • /
    • 2023
  • Recently, not only in Korea but also around the world, we have been experiencing constant disasters such as typhoons, wildfires, and heavy rains. The property damage caused by typhoons and heavy rain in South Korea alone has exceeded 1 trillion won. These disasters have resulted in significant loss of life and property damage, and the recovery process will also take a considerable amount of time. In addition, the government's contingency funds are insufficient for the current situation. To prevent and effectively respond to these issues, it is necessary to collect and analyze accurate data in real-time. However, delays and data loss can occur depending on the environment where the sensors are located, the status of the communication network, and the receiving servers. In this paper, we propose a two-stage hybrid situation analysis and prediction algorithm that can accurately analyze even in such communication network conditions. In the first step, data on river and stream levels are collected, filtered, and refined from diverse sensors of different types and stored in a bigdata. An AI rule-based inference algorithm is applied to analyze the crisis alert levels. If the rainfall exceeds a certain threshold, but it remains below the desired level of interest, the second step of deep learning image analysis is performed to determine the final crisis alert level.

Comparative Study of Automatic Trading and Buy-and-Hold in the S&P 500 Index Using a Volatility Breakout Strategy (변동성 돌파 전략을 사용한 S&P 500 지수의 자동 거래와 매수 및 보유 비교 연구)

  • Sunghyuck Hong
    • Journal of Internet of Things and Convergence
    • /
    • v.9 no.6
    • /
    • pp.57-62
    • /
    • 2023
  • This research is a comparative analysis of the U.S. S&P 500 index using the volatility breakout strategy against the Buy and Hold approach. The volatility breakout strategy is a trading method that exploits price movements after periods of relative market stability or concentration. Specifically, it is observed that large price movements tend to occur more frequently after periods of low volatility. When a stock moves within a narrow price range for a while and then suddenly rises or falls, it is expected to continue moving in that direction. To capitalize on these movements, traders adopt the volatility breakout strategy. The 'k' value is used as a multiplier applied to a measure of recent market volatility. One method of measuring volatility is the Average True Range (ATR), which represents the difference between the highest and lowest prices of recent trading days. The 'k' value plays a crucial role for traders in setting their trade threshold. This study calculated the 'k' value at a general level and compared its returns with the Buy and Hold strategy, finding that algorithmic trading using the volatility breakout strategy achieved slightly higher returns. In the future, we plan to present simulation results for maximizing returns by determining the optimal 'k' value for automated trading of the S&P 500 index using artificial intelligence deep learning techniques.

Development of a complex failure prediction system using Hierarchical Attention Network (Hierarchical Attention Network를 이용한 복합 장애 발생 예측 시스템 개발)

  • Park, Youngchan;An, Sangjun;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.4
    • /
    • pp.127-148
    • /
    • 2020
  • The data center is a physical environment facility for accommodating computer systems and related components, and is an essential foundation technology for next-generation core industries such as big data, smart factories, wearables, and smart homes. In particular, with the growth of cloud computing, the proportional expansion of the data center infrastructure is inevitable. Monitoring the health of these data center facilities is a way to maintain and manage the system and prevent failure. If a failure occurs in some elements of the facility, it may affect not only the relevant equipment but also other connected equipment, and may cause enormous damage. In particular, IT facilities are irregular due to interdependence and it is difficult to know the cause. In the previous study predicting failure in data center, failure was predicted by looking at a single server as a single state without assuming that the devices were mixed. Therefore, in this study, data center failures were classified into failures occurring inside the server (Outage A) and failures occurring outside the server (Outage B), and focused on analyzing complex failures occurring within the server. Server external failures include power, cooling, user errors, etc. Since such failures can be prevented in the early stages of data center facility construction, various solutions are being developed. On the other hand, the cause of the failure occurring in the server is difficult to determine, and adequate prevention has not yet been achieved. In particular, this is the reason why server failures do not occur singularly, cause other server failures, or receive something that causes failures from other servers. In other words, while the existing studies assumed that it was a single server that did not affect the servers and analyzed the failure, in this study, the failure occurred on the assumption that it had an effect between servers. In order to define the complex failure situation in the data center, failure history data for each equipment existing in the data center was used. There are four major failures considered in this study: Network Node Down, Server Down, Windows Activation Services Down, and Database Management System Service Down. The failures that occur for each device are sorted in chronological order, and when a failure occurs in a specific equipment, if a failure occurs in a specific equipment within 5 minutes from the time of occurrence, it is defined that the failure occurs simultaneously. After configuring the sequence for the devices that have failed at the same time, 5 devices that frequently occur simultaneously within the configured sequence were selected, and the case where the selected devices failed at the same time was confirmed through visualization. Since the server resource information collected for failure analysis is in units of time series and has flow, we used Long Short-term Memory (LSTM), a deep learning algorithm that can predict the next state through the previous state. In addition, unlike a single server, the Hierarchical Attention Network deep learning model structure was used in consideration of the fact that the level of multiple failures for each server is different. This algorithm is a method of increasing the prediction accuracy by giving weight to the server as the impact on the failure increases. The study began with defining the type of failure and selecting the analysis target. In the first experiment, the same collected data was assumed as a single server state and a multiple server state, and compared and analyzed. The second experiment improved the prediction accuracy in the case of a complex server by optimizing each server threshold. In the first experiment, which assumed each of a single server and multiple servers, in the case of a single server, it was predicted that three of the five servers did not have a failure even though the actual failure occurred. However, assuming multiple servers, all five servers were predicted to have failed. As a result of the experiment, the hypothesis that there is an effect between servers is proven. As a result of this study, it was confirmed that the prediction performance was superior when the multiple servers were assumed than when the single server was assumed. In particular, applying the Hierarchical Attention Network algorithm, assuming that the effects of each server will be different, played a role in improving the analysis effect. In addition, by applying a different threshold for each server, the prediction accuracy could be improved. This study showed that failures that are difficult to determine the cause can be predicted through historical data, and a model that can predict failures occurring in servers in data centers is presented. It is expected that the occurrence of disability can be prevented in advance using the results of this study.

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.1-21
    • /
    • 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, it has been employed to discover new market and/or technology opportunities and support rational decision making of business participants. The market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been a continuous demand in various fields for specific product level-market information. However, the information has been generally provided at industry level or broad categories based on classification standards, making it difficult to obtain specific and proper information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than that of previously offered. We applied Word2Vec algorithm, a neural network based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows: First, the data related to product information is collected, refined, and restructured into suitable form for applying Word2Vec model. Next, the preprocessed data is embedded into vector space by Word2Vec and then the product groups are derived by extracting similar products names based on cosine similarity calculation. Finally, the sales data on the extracted products is summated to estimate the market size of the product groups. As an experimental data, text data of product names from Statistics Korea's microdata (345,103 cases) were mapped in multidimensional vector space by Word2Vec training. We performed parameters optimization for training and then applied vector dimension of 300 and window size of 15 as optimized parameters for further experiments. We employed index words of Korean Standard Industry Classification (KSIC) as a product name dataset to more efficiently cluster product groups. The product names which are similar to KSIC indexes were extracted based on cosine similarity. The market size of extracted products as one product category was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For the performance verification, the results were compared with actual market size of some items. The Pearson's correlation coefficient was 0.513. Our approach has several advantages differing from the previous studies. First, text mining and machine learning techniques were applied for the first time on market size estimation, overcoming the limitations of traditional sampling based- or multiple assumption required-methods. In addition, the level of market category can be easily and efficiently adjusted according to the purpose of information use by changing cosine similarity threshold. Furthermore, it has a high potential of practical applications since it can resolve unmet needs for detailed market size information in public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support program conducted by governmental institutions, as well as business strategies consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic-based word embedding module can be advanced by giving a proper order in the preprocessed dataset or by combining another algorithm such as Jaccard similarity with Word2Vec. Also, the methods of product group clustering can be changed to other types of unsupervised machine learning algorithm. Our group is currently working on subsequent studies and we expect that it can further improve the performance of the conceptually proposed basic model in this study.

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.185-202
    • /
    • 2012
  • .Since the value of information has been realized in the information society, the usage and collection of information has become important. A facial expression that contains thousands of information as an artistic painting can be described in thousands of words. Followed by the idea, there has recently been a number of attempts to provide customers and companies with an intelligent service, which enables the perception of human emotions through one's facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed the human emotion prediction model, and has applied their studies to the commercial business. In the academic area, a number of the conventional methods such as Multiple Regression Analysis (MRA) or Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized because of its low prediction accuracy. This is inevitable since MRA can only explain the linear relationship between the dependent variables and the independent variable. To mitigate the limitations of MRA, some studies like Jung and Kim (2012) have used ANN as the alternative, and they reported that ANN generated more accurate prediction than the statistical methods like MRA. However, it has also been criticized due to over fitting and the difficulty of the network design (e.g. setting the number of the layers and the number of the nodes in the hidden layers). Under this background, we propose a novel model using Support Vector Regression (SVR) in order to increase the prediction accuracy. SVR is an extensive version of Support Vector Machine (SVM) designated to solve the regression problems. The model produced by SVR only depends on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold ${\varepsilon}$) to the model prediction. Using SVR, we tried to build a model that can measure the level of arousal and valence from the facial features. To validate the usefulness of the proposed model, we collected the data of facial reactions when providing appropriate visual stimulating contents, and extracted the features from the data. Next, the steps of the preprocessing were taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As the comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted '${\varepsilon}$-insensitive loss function', and 'grid search' technique to find the optimal values of the parameters like C, d, ${\sigma}^2$, and ${\varepsilon}$. In the case of ANN, we adopted a standard three-layer backpropagation network, which has a single hidden layer. The learning rate and momentum rate of ANN were set to 10%, and we used sigmoid function as the transfer function of hidden and output nodes. We performed the experiments repeatedly by varying the number of nodes in the hidden layer to n/2, n, 3n/2, and 2n, where n is the number of the input variables. The stopping condition for ANN was set to 50,000 learning events. And, we used MAE (Mean Absolute Error) as the measure for performance comparison. From the experiment, we found that SVR achieved the highest prediction accuracy for the hold-out data set compared to MRA and ANN. Regardless of the target variables (the level of arousal, or the level of positive / negative valence), SVR showed the best performance for the hold-out data set. ANN also outperformed MRA, however, it showed the considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to the researchers or practitioners who are willing to build the models for recognizing human emotions.

Syllabus Design and Pronunciation Teaching

  • Amakawa, Yukiko
    • Proceedings of the KSPS conference
    • /
    • 2000.07a
    • /
    • pp.235-240
    • /
    • 2000
  • In the age of global communication, more human exchange is extended at the grass-roots level. In the old days, language policy and language planning was based on one nation-state with one language. But high waves of globalizaiton have allowed extended human flow of exchange beyond one's national border on a daily basis. Under such circumstances, homogeneity in Japan may not allow Japanese to speak and communicate only in Japanese and only with Japanese people. In Japan, an advisory report was made to the Ministry of Education in June 1996 about what education should be like in the 21st century. In this report, an introduction of English at public elementary schools was for the first time made. A basic policy of English instruction at the elementary school level was revealed. With this concept, English instruction is not required at the elementary school level but each school has their own choice of introducing English as their curriculum starting April 2002. As Baker, Colin (1996) indicates the age of three as being the threshold diving a child becoming bilingual naturally or by formal instruction. Threre is a movement towards making second language acquisition more naturalistic in an educational setting, developing communicative competence in a more or less formal way. From the lesson of the Canadian immersion success, Genesee (1987) stresses the importance of early language instruction. It is clear that from a psycho-linguistic perspective, most children acquire basic communication skills in their first language apparently effortlessly and without systematic and formal instruction during the first six or seven years of life. This innate capacity diminishes with age, thereby making language learning increasingly difficult. The author, being a returnee, experienced considerable difficulty acquiring L2, and especially achieving native-like competence. There will be many hurdles to conquer until Japanese students are able to reach at least a communicative level in English. It has been mentioned that English is not taught to clear the college entrance examination, but to communicate. However, Japanese college entrance examination still makes students focus more on the grammar-translation method. This is expected to shift to a more communication stressed approach. Japan does not have to aim at becoming an official bilingual country, but at least communicative English should be taught at every level in school Mito College is a small two-year co-ed college in Japan. Students at Mito College are basically notgood at English. It has only one department for business and economics, and English is required for all freshmen. It is necessary for me to make my classes enjoyable and attractive so that students can at least get motivated to learn English. My major target is communicative English so that students may be prepared to use English in various business settings. As an experiment to introduce more communicative English, the author has made the following syllabus design. This program aims at training students speak and enjoy English. 90-minute class (only 190-minute session per week is most common in Japanese colleges) is divided into two: The first half is to train students orally using Graded Direct Method. The latter half uses different materials each time so that students can learn and enjoy English culture and language simultaneously. There are no quizes or examinations in my one-academic year program. However, all students are required to make an original English poem by the end of the spring semester. 2-6 students work together in a group on one poem. Students coming to Mito College, Japan have one of the lowest English levels in all of Japan. However, an attached example of one poem made by a group shows that students can improve their creativity as long as they are kept encouraged. At the end of the fall semester, all students are then required individually to make a 3-minute original English speech. An example of that speech contest will be presented at the Convention in Seoul.

  • PDF

Rainfall image DB construction for rainfall intensity estimation from CCTV videos: focusing on experimental data in a climatic environment chamber (CCTV 영상 기반 강우강도 산정을 위한 실환경 실험 자료 중심 적정 강우 이미지 DB 구축 방법론 개발)

  • Byun, Jongyun;Jun, Changhyun;Kim, Hyeon-Joon;Lee, Jae Joon;Park, Hunil;Lee, Jinwook
    • Journal of Korea Water Resources Association
    • /
    • v.56 no.6
    • /
    • pp.403-417
    • /
    • 2023
  • In this research, a methodology was developed for constructing an appropriate rainfall image database for estimating rainfall intensity based on CCTV video. The database was constructed in the Large-Scale Climate Environment Chamber of the Korea Conformity Laboratories, which can control variables with high irregularity and variability in real environments. 1,728 scenarios were designed under five different experimental conditions. 36 scenarios and a total of 97,200 frames were selected. Rain streaks were extracted using the k-nearest neighbor algorithm by calculating the difference between each image and the background. To prevent overfitting, data with pixel values greater than set threshold, compared to the average pixel value for each image, were selected. The area with maximum pixel variability was determined by shifting with every 10 pixels and set as a representative area (180×180) for the original image. After re-transforming to 120×120 size as an input data for convolutional neural networks model, image augmentation was progressed under unified shooting conditions. 92% of the data showed within the 10% absolute range of PBIAS. It is clear that the final results in this study have the potential to enhance the accuracy and efficacy of existing real-world CCTV systems with transfer learning.

Comparative study of flood detection methodologies using Sentinel-1 satellite imagery (Sentinel-1 위성 영상을 활용한 침수 탐지 기법 방법론 비교 연구)

  • Lee, Sungwoo;Kim, Wanyub;Lee, Seulchan;Jeong, Hagyu;Park, Jongsoo;Choi, Minha
    • Journal of Korea Water Resources Association
    • /
    • v.57 no.3
    • /
    • pp.181-193
    • /
    • 2024
  • The increasing atmospheric imbalance caused by climate change leads to an elevation in precipitation, resulting in a heightened frequency of flooding. Consequently, there is a growing need for technology to detect and monitor these occurrences, especially as the frequency of flooding events rises. To minimize flood damage, continuous monitoring is essential, and flood areas can be detected by the Synthetic Aperture Radar (SAR) imagery, which is not affected by climate conditions. The observed data undergoes a preprocessing step, utilizing a median filter to reduce noise. Classification techniques were employed to classify water bodies and non-water bodies, with the aim of evaluating the effectiveness of each method in flood detection. In this study, the Otsu method and Support Vector Machine (SVM) technique were utilized for the classification of water bodies and non-water bodies. The overall performance of the models was assessed using a Confusion Matrix. The suitability of flood detection was evaluated by comparing the Otsu method, an optimal threshold-based classifier, with SVM, a machine learning technique that minimizes misclassifications through training. The Otsu method demonstrated suitability in delineating boundaries between water and non-water bodies but exhibited a higher rate of misclassifications due to the influence of mixed substances. Conversely, the use of SVM resulted in a lower false positive rate and proved less sensitive to mixed substances. Consequently, SVM exhibited higher accuracy under conditions excluding flooding. While the Otsu method showed slightly higher accuracy in flood conditions compared to SVM, the difference in accuracy was less than 5% (Otsu: 0.93, SVM: 0.90). However, in pre-flooding and post-flooding conditions, the accuracy difference was more than 15%, indicating that SVM is more suitable for water body and flood detection (Otsu: 0.77, SVM: 0.92). Based on the findings of this study, it is anticipated that more accurate detection of water bodies and floods could contribute to minimizing flood-related damages and losses.