• Title/Summary/Keyword: analysis of algorithms

Trajectory Index Structure based on Signatures for Moving Objects on a Spatial Network (공간 네트워크 상의 이동객체를 위한 시그니처 기반의 궤적 색인구조)

  • Kim, Young-Jin;Kim, Young-Chang;Chang, Jae-Woo;Sim, Chun-Bo
    • Journal of Korea Spatial Information System Society / v.10 no.3 / pp.1-18 / 2008
  • Analyzing the trajectories of moving objects on spatial networks yields much useful information, so efficient trajectory index structures are required to achieve good retrieval performance. However, there has been little research on trajectory index structures for spatial networks, such as the FNR-tree and MON-tree. Moreover, because the FNR-tree and MON-tree store moving-object data in units of segments, they cannot support queries on the whole trajectory of a moving object. In this paper, we propose an efficient signature-based trajectory index structure for spatial networks, named SigMO-Tree. To this end, we divide moving-object data into spatial and temporal attributes and design an index structure that supports not only range queries but also trajectory queries by preserving the whole trajectory of each moving object. In addition, we divide user queries into trajectory queries based on a spatio-temporal area and similar-trajectory queries, and propose query processing algorithms to support them. The algorithms use a signature file to retrieve candidate trajectories efficiently. Finally, our performance analysis shows that the proposed trajectory index structure outperforms existing index structures such as the FNR-Tree and MON-Tree.
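
The signature-based filtering step mentioned in this abstract can be sketched roughly as follows. This is a generic superimposed-coding signature file, not the SigMO-Tree algorithm itself: the hash scheme, the 64-bit signature width, and the names `segment_mask`, `make_signature`, and `search` are all assumptions made for illustration.

```python
import hashlib

SIG_BITS = 64  # assumed signature width in bits


def segment_mask(segment_id, bits_per_segment=2):
    """Hash one road-segment ID to a bitmask (superimposed coding)."""
    digest = hashlib.sha256(str(segment_id).encode()).digest()
    mask = 0
    for byte in digest[:bits_per_segment]:
        mask |= 1 << (byte % SIG_BITS)
    return mask


def make_signature(segments):
    """OR together the masks of every segment a trajectory visits."""
    signature = 0
    for segment_id in segments:
        signature |= segment_mask(segment_id)
    return signature


def search(trajectories, index, query_segments):
    """Filter by signature, then verify candidates against the real data."""
    query_sig = make_signature(query_segments)
    # A trajectory can match only if it has every bit of the query set.
    candidates = [tid for tid, sig in index.items()
                  if (sig & query_sig) == query_sig]
    wanted = set(query_segments)
    matches = [tid for tid in candidates
               if wanted <= set(trajectories[tid])]
    return candidates, matches


# Hypothetical trajectories given as lists of road-segment IDs.
trajectories = {"t1": [1, 2, 3], "t2": [4, 5], "t3": [2, 3, 7]}
index = {tid: make_signature(segs) for tid, segs in trajectories.items()}
candidates, matches = search(trajectories, index, [2, 3])
print(matches)
```

The signature test never misses a true match, but it may admit false positives, which is why the sketch verifies each candidate against the stored trajectory before reporting it.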

Development of the Geostationary Ocean Color Imager (GOCI) Data Processing System (GDPS) (정지궤도 해색탑재체(GOCI) 해양자료처리시스템(GDPS)의 개발)

  • Han, Hee-Jeong;Ryu, Joo-Hyung;Ahn, Yu-Hwan
    • Korean Journal of Remote Sensing / v.26 no.2 / pp.239-249 / 2010
  • The Geostationary Ocean Color Imager (GOCI) data-processing system (GDPS), a software system for processing and analyzing data from the first geostationary ocean color observation satellite, has been developed concurrently with the satellite. The GDPS generates level 2 and level 3 oceanographic analytical data from level 1B data, which comprise the total radiance information, by implementing a specialized atmospheric algorithm and oceanic analytical algorithms as software modules. The GDPS will be a multiversion system, serving not only as the standard operational system of the Korea Ocean Satellite Center (KOSC) but also as a basic GOCI data-processing system for researchers and other users. Additionally, the GDPS will be used to make GOCI images available for distribution over a satellite network, to calculate the lookup table for radiometric calibration coefficients, to divide and mosaic images of several regions, and to analyze time-series satellite data. The developed GDPS satisfies the user requirement to complete data production within 30 minutes. This system is expected to be an excellent tool for monitoring both long-term and short-term changes in ocean environmental characteristics.

5G Network Resource Allocation and Traffic Prediction based on DDPG and Federated Learning (DDPG 및 연합학습 기반 5G 네트워크 자원 할당과 트래픽 예측)

  • Seok-Woo Park;Oh-Sung Lee;In-Ho Ra
    • Smart Media Journal / v.13 no.4 / pp.33-48 / 2024
  • With the advent of 5G, characterized by Enhanced Mobile Broadband (eMBB), Ultra-Reliable Low Latency Communications (URLLC), and Massive Machine Type Communications (mMTC), efficient network management and service provision are becoming increasingly critical. This paper proposes a novel approach to address key challenges of 5G networks, namely ultra-high speed, ultra-low latency, and ultra-reliability, while dynamically optimizing network slicing and resource allocation using machine learning (ML) and deep learning (DL) techniques. The proposed methodology utilizes prediction models for network traffic and resource allocation, and employs Federated Learning (FL) techniques to simultaneously optimize network bandwidth, latency, and enhance privacy and security. Specifically, this paper extensively covers the implementation methods of various algorithms and models such as Random Forest and LSTM, thereby presenting methodologies for the automation and intelligence of 5G network operations. Finally, the performance enhancement effects achievable by applying ML and DL to 5G networks are validated through performance evaluation and analysis, and solutions for network slicing and resource management optimization are proposed for various industrial applications.

A study on image region analysis and image enhancement using detail descriptor (디테일 디스크립터를 이용한 이미지 영역 분석과 개선에 관한 연구)

  • Lim, Jae Sung;Jeong, Young-Tak;Lee, Ji-Hyeok
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.6 / pp.728-735 / 2017
  • With the proliferation of digital devices, considerable additive white Gaussian noise is generated during digital image acquisition. Most well-known denoising methods focus on eliminating the noise, so detail components that carry image information are removed along with it. The proposed algorithm preserves the details while effectively removing the noise. Its goal is to separate meaningful detail information from a noisy image using edge strength and edge connectivity. Consequently, even as the noise level increases, it produces better denoising results than the benchmark methods because it extracts the connected detail component information. In addition, the proposed method effectively eliminated the noise at various noise levels; compared to the benchmark algorithms, it shows high structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR) values. As the high SSIM values show, the denoising results reflect the characteristics of the human visual system (HVS).

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems / v.20 no.3 / pp.77-92 / 2014
  • Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the use of social media and the Internet. Each document is usually assigned a specific category for the convenience of users. In the past, categorization was performed manually. However, manual categorization cannot guarantee accuracy and requires a large amount of time and cost. Many studies have been conducted on the automatic creation of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics because they assume that one document can be assigned to one category only. To overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process requires training on a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To remove the requirement of a multi-categorized training set imposed by traditional multi-categorization algorithms, we propose a new methodology that can extend the category of a single-categorized document to multiple categories by analyzing relationships among categories, topics, and documents. First, we find the relationship between documents and topics by using the result of topic analysis for single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores of each document for multiple categories.
The results imply that a document can be classified into a certain category if and only if the matching score is higher than the predefined threshold. For example, we can classify a certain document into three categories that have larger matching scores than the predefined threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized by theme and contain less vulgar language and slang than other common text documents. We collected news articles from July 2012 to June 2013. The number of articles varies widely across categories. This is because readers have different levels of interest in each category, and because events occur with different frequencies in each category. To minimize the distortion caused by the different numbers of articles in each category, we extracted 3,000 articles from each of the eight categories. Therefore, the total number of articles used in our experiments was 24,000. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." Using the collected news articles, we calculated the document/category correspondence scores from the topic/category and document/topic correspondence scores. The document/category correspondence score indicates the degree to which each document corresponds to a certain category. As a result, we could present two additional categories for each of 23,089 documents.
Precision, recall, and F-score were 0.605, 0.629, and 0.617, respectively, when only the top 1 predicted category was evaluated, whereas they were 0.838, 0.290, and 0.431 when the top 1 to 3 predicted categories were considered. Interestingly, precision, recall, and F-score varied widely among the eight categories.
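
The scoring and thresholding steps described in this abstract can be sketched as below. The abstract does not say how the document/topic and topic/category scores are combined, so the weighted sum used here is an assumption, as are the function names and the toy weights. The `f_score` helper uses the standard harmonic-mean definition and can be used to check the reported figures.

```python
def matching_scores(doc_topic, topic_category):
    """Combine one document's topic weights with a topic/category table.

    doc_topic: list of topic weights for a single document.
    topic_category: one row per topic, one column per category.
    The weighted sum is an assumed combination rule.
    """
    n_categories = len(topic_category[0])
    return [
        sum(doc_topic[t] * topic_category[t][c] for t in range(len(doc_topic)))
        for c in range(n_categories)
    ]


def assign_categories(scores, categories, threshold):
    """Keep every category whose matching score exceeds the threshold."""
    return [name for name, score in zip(categories, scores) if score > threshold]


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Illustrative weights only; they are not taken from the paper.
categories = ["Economy", "Politics"]
doc_topic = [0.7, 0.2, 0.1]
topic_category = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]

scores = matching_scores(doc_topic, topic_category)
print(assign_categories(scores, categories, threshold=0.5))

# Consistency check on the values reported in the abstract.
print(round(f_score(0.605, 0.629), 3))  # 0.617
print(round(f_score(0.838, 0.290), 3))  # 0.431
```

Both reported F-scores are consistent with the reported precision and recall values under the usual F1 definition.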

Improving Performance of Recommendation Systems Using Topic Modeling (사용자 관심 이슈 분석을 통한 추천시스템 성능 향상 방안)

  • Choi, Seongi;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.101-116 / 2015
  • Recently, due to the development of smart devices and social media, vast amounts of information in various forms have accumulated. In particular, considerable research effort is being directed towards analyzing unstructured big data to resolve various social problems. Accordingly, the focus of data-driven decision-making is moving from structured data analysis to unstructured data analysis. In the field of recommendation systems, a typical area of data-driven decision-making, the need to use unstructured data to improve system performance has also steadily increased. Approaches to improving the performance of recommendation systems fall into two groups: improving algorithms and acquiring useful, high-quality data. Traditionally, most efforts have followed the former approach, while the latter has attracted relatively little attention. In this sense, efforts to utilize unstructured data from various sources are timely and necessary. In particular, because the interests of users are directly connected with their needs, identifying those interests through unstructured big data analysis can be a clue for improving the performance of recommendation systems. This study therefore proposes a methodology for improving recommendation systems by measuring the interests of users. Specifically, it proposes a method to quantify users' interests by analyzing their internet usage patterns, and to predict their repurchases based on the discovered preferences. There are two important modules in this study. The first module predicts the repurchase probability of each category by analyzing users' purchase history. We include the first module in our research scope in order to compare the accuracy of the traditional purchase-based prediction model with that of our new model presented in the second module. This procedure extracts the purchase history of users.
The core part of our methodology is the second module. This module extracts users' interests by analyzing the news articles they have read. It first constructs a correspondence matrix between topics and news articles by performing topic modeling on real-world news articles. It then analyzes users' news access patterns and constructs a correspondence matrix between articles and users. By merging the results of these two steps, we obtain a correspondence matrix between users and topics. This matrix describes users' interests in a structured manner. Finally, using this matrix, the second module builds a model for predicting the repurchase probability of each category. In this paper, we also provide experimental results of our performance evaluation. The data used in our experiments are as follows. We acquired web transaction data of 5,000 panelists from a company that specializes in analyzing the rankings of internet sites. We first extracted 15,000 URLs of news articles published from July 2012 to June 2013 from the original data and crawled the main contents of the news articles. We then selected the 2,615 users who had read at least one of the extracted news articles. Among these 2,615 users, 359 target users had purchased at least one item from our target shopping mall 'G'. In the experiments, we analyzed the purchase history and news access records of these 359 internet users. The performance evaluation showed that our prediction model, which uses both users' interests and purchase history, outperforms a prediction model that uses only purchase history in terms of misclassification ratio. In detail, our model outperformed the traditional one in the appliance, beauty, computer, culture, digital, fashion, and sports categories when artificial neural network based models were used.
Similarly, our model outperformed the traditional one in beauty, computer, digital, fashion, food, and furniture categories when decision tree based models were used although the improvement is very small.
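
The merging step in the second module, which derives a user/topic matrix from the article/user and topic/article matrices, can be sketched as a matrix product. The abstract does not specify the merging operation, so the product used here is an assumption, and the function names and toy values are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def user_topic_matrix(user_article, article_topic):
    """Accumulate topic weights over the articles each user has read.

    user_article: rows are users, columns are articles (1 = read, 0 = not read).
    article_topic: rows are articles, columns are topic weights.
    """
    return matmul(user_article, article_topic)


# Illustrative data: two users, three articles, two topics.
user_article = [
    [1, 1, 0],  # user 0 read articles 0 and 1
    [0, 0, 1],  # user 1 read article 2
]
article_topic = [
    [0.8, 0.2],
    [0.6, 0.4],
    [0.1, 0.9],
]

interests = user_topic_matrix(user_article, article_topic)
for row in interests:
    print([round(value, 2) for value in row])
```

Each row of the result summarizes one user's interests per topic and could serve as the input features of a repurchase prediction model.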

A Comparative Errors Assessment Between Surface Albedo Products of COMS/MI and GK-2A/AMI (천리안위성 1·2A호 지표면 알베도 상호 오차 분석 및 비교검증)

  • Woo, Jongho;Choi, Sungwon;Jin, Donghyun;Seong, Noh-hun;Jung, Daeseong;Sim, Suyoung;Byeon, Yugyeong;Jeon, Uujin;Sohn, Eunha;Han, Kyung-Soo
    • Korean Journal of Remote Sensing / v.37 no.6_1 / pp.1767-1772 / 2021
  • Long-term global satellite observations of surface albedo are actively used to monitor changes in the global climate and environment, and they are of great importance. The generational shift of geostationary satellites from COMS (Communication, Ocean and Meteorological Satellite)/MI (Meteorological Imager sensor) to GK-2A (GEO-KOMPSAT-2A)/AMI (Advanced Meteorological Imager sensor) makes it possible to secure surface albedo outputs continuously. However, the surface albedo outputs of COMS/MI and GK-2A/AMI differ due to differences in retrieval algorithms. Therefore, in order to extend the retrieval period of the COMS/MI and GK-2A/AMI surface albedo and to secure continuity in climate change monitoring, the two satellite outputs and their errors must first be analyzed. In this study, error characteristics were analyzed by comparing the COMS/MI and GK-2A/AMI surface albedo data for their overlapping period with ground observation data from AERONET (Aerosol Robotic Network) and with other satellite data from GLASS (Global Land Surface Satellite). The error analysis confirmed that the RMSE of COMS/MI was 0.043, higher than that of GK-2A/AMI (0.015). In addition, compared to the other satellite (GLASS) data, the RMSE of COMS/MI was 0.029, slightly lower than that of GK-2A/AMI (0.038). Understanding these error characteristics when using the COMS/MI and GK-2A/AMI surface albedo data will make it possible to use them actively for long-term climate change monitoring.
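
The RMSE values quoted in this abstract follow the usual root-mean-square error definition, which can be sketched as below. The sample albedo values are made up for illustration and are not taken from the study.

```python
import math


def rmse(estimates, references):
    """Root-mean-square error between paired estimates and reference values."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical satellite retrievals and matching reference observations.
satellite_albedo = [0.10, 0.30]
reference_albedo = [0.20, 0.20]
print(round(rmse(satellite_albedo, reference_albedo), 3))
```

A lower RMSE means that the satellite product is, on average, closer to the reference data set it is compared with.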

A Proposed Algorithm and Sampling Conditions for Nonlinear Analysis of EEG (뇌파의 비선형 분석을 위한 신호추출조건 및 계산 알고리즘)

  • Shin, Chul-Jin;Lee, Kwang-Ho;Choi, Sung-Ku;Yoon, In-Young
    • Sleep Medicine and Psychophysiology / v.6 no.1 / pp.52-60 / 1999
  • Objectives: To find appropriate conditions and algorithms for dimensional analysis of the human EEG, we calculated correlation dimensions under various sampling rates and data acquisition times, and improved the computation algorithm by using bit operations instead of log operations. Methods: EEG signals from 13 scalp leads of one man were digitized with an A-D converter at 12-bit resolution and a sampling rate of 1000 Hz for 32 seconds. From the original data, we derived 15 time series with sampling rates of 62.5, 125, 250, 500, and 1000 Hz and data acquisition times of 10, 20, and 30 seconds. A new algorithm that shortens the calculation time using bit operations, and the Least Trimmed Squares (LTS) estimator for obtaining the optimal slope, were applied to these data. Results: The correlation dimension increased as the data acquisition time became longer. The data sampled at 62.5 Hz showed the highest correlation dimension regardless of acquisition time, whereas the correlation dimensions at the other sampling rates were similar. Computation with bit operations instead of log operations shortened the calculation time to a statistically significant degree, and the LTS method estimated the slope of the correlation dimension more stably than the Least Squares estimator. Conclusion: Bit operations and the LTS method were successfully used for time-saving and efficient calculation of the correlation dimension. In addition, a 20-second time series sampled at 125 Hz was adequate for estimating the dimensional complexity of the human EEG.
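
One plausible reading of "bit operation instead of log operation" is sketched below: with integer A-D samples, the squared distance between two embedded points is an integer, and its bit length places it in a power-of-two bin without calling a logarithm. This is an assumption about the technique, not the authors' published algorithm; the delay-embedding parameters and function names are hypothetical, and the LTS slope fit is not included.

```python
def delay_embed(series, dim, lag):
    """Build delay vectors (x[i], x[i+lag], ..., x[i+(dim-1)*lag])."""
    n_points = len(series) - (dim - 1) * lag
    return [tuple(series[i + j * lag] for j in range(dim)) for i in range(n_points)]


def correlation_sums(series, dim=3, lag=1):
    """Return {b: C}, where C is the fraction of point pairs whose squared
    distance is below 2**b, for each occupied bin b.

    Requires an integer-valued series so that distances stay integers.
    """
    points = delay_embed(series, dim, lag)
    counts = {}
    n_pairs = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d2 = sum((x - y) ** 2 for x, y in zip(points[i], points[j]))
            # bit_length() == b means 2**(b-1) <= d2 < 2**b (b == 0 for d2 == 0),
            # so it bins the distance on a log2 scale without computing a log.
            bin_index = d2.bit_length()
            counts[bin_index] = counts.get(bin_index, 0) + 1
            n_pairs += 1
    sums, running = {}, 0
    for bin_index in sorted(counts):
        running += counts[bin_index]
        sums[bin_index] = running / n_pairs
    return sums


# Tiny integer series for illustration; real EEG data would be far longer.
print(correlation_sums([0, 1, 2, 3], dim=1, lag=1))
```

Because bin b corresponds to a squared radius of 2**b, the correlation dimension would be estimated from the slope of log2 C against b/2 over the scaling region, for example with a robust fit such as the LTS estimator named in the abstract.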

Dynamic forecasts of bankruptcy with Recurrent Neural Network model (RNN(Recurrent Neural Network)을 이용한 기업부도예측모형에서 회계정보의 동적 변화 연구)

  • Kwon, Hyukkun;Lee, Dongkyu;Shin, Minsoo
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.139-153 / 2017
  • Corporate bankruptcy can cause great losses not only to stakeholders but also to many related sectors of society. Bankruptcies have increased through successive economic crises, and bankruptcy prediction models have become more and more important. Corporate bankruptcy has therefore been regarded as one of the major research topics in business management, and many studies are also in progress in industry. Previous studies used various statistical methodologies, such as Multivariate Discriminant Analysis (MDA) and the Generalized Linear Model (GLM), to improve bankruptcy prediction accuracy and to resolve the overfitting problem. More recently, researchers have used machine learning methodologies such as the Support Vector Machine (SVM) and the Artificial Neural Network (ANN), as well as fuzzy theory and genetic algorithms. As a result, many bankruptcy models have been developed and their performance has improved. In general, a company's financial and accounting information changes over time, as does the market situation, so it is difficult to predict bankruptcy using only information from a single point in time. However, even though traditional research has the problem of ignoring this time effect, dynamic models have not been studied much. Ignoring the time effect produces biased results, so a static model may not be suitable for predicting bankruptcy. Using a dynamic model may therefore improve bankruptcy prediction. In this paper, we propose the use of an RNN (Recurrent Neural Network), one of the deep learning methodologies. The RNN learns from time series data and is known to perform well. For the experiment, we selected non-financial firms listed on the KOSPI, KOSDAQ, and KONEX markets from 2010 to 2016 to estimate the bankruptcy prediction model and compare forecasting performance.
To avoid the mistake of predicting bankruptcy from financial information that already reflects the deterioration of a company's financial condition, the financial information was collected with a lag of two years, and the default period was defined as January to December of the year. We then defined bankruptcy as delisting due to sluggish earnings. We confirmed delistings on KIND, a corporate stock information website. We then selected variables from previous papers. The first set of variables consists of the Z-score variables, which have become traditional variables in predicting bankruptcy. The second set is a dynamic variable set. In the end, we selected 240 normal companies and 226 bankrupt companies for the first variable set, and 229 normal companies and 226 bankrupt companies for the second variable set. We created a model that reflects dynamic changes in time-series financial data, and by comparing the suggested model with existing bankruptcy prediction models, we found that it could help to improve the accuracy of bankruptcy predictions. We used financial data from KIS Value (a financial database) and selected Multivariate Discriminant Analysis (MDA), the Generalized Linear Model known as logistic regression (GLM), the Support Vector Machine (SVM), and the Artificial Neural Network (ANN) as benchmark models. The experimental results showed that the RNN performed better than the comparative models. The accuracy of the RNN was high for both sets of variables, and the Area Under the Curve (AUC) value was also high. In the hit-ratio table, the proportion of poorly performing companies that the RNN predicted to go bankrupt was also higher than that of the comparative models. However, a limitation of this paper is that an overfitting problem occurs during RNN learning.
Nevertheless, we expect that the overfitting problem can be solved by selecting more training data and appropriate variables. Based on these results, this research is expected to contribute to the development of bankruptcy prediction by proposing a new dynamic model.

Analysis and Performance Evaluation of Pattern Condensing Techniques used in Representative Pattern Mining (대표 패턴 마이닝에 활용되는 패턴 압축 기법들에 대한 분석 및 성능 평가)

  • Lee, Gang-In;Yun, Un-Il
    • Journal of Internet Computing and Services / v.16 no.2 / pp.77-83 / 2015
  • Frequent pattern mining, one of the major areas actively studied in data mining, is a method for extracting useful pattern information hidden in large data sets or databases. Frequent pattern mining approaches have been actively employed in a variety of application fields because their results allow us to analyze various important characteristics of databases more easily and automatically. However, traditional frequent pattern mining methods, which simply extract all possible frequent patterns whose support values are not smaller than a user-given minimum support threshold, have the following problems. First, depending on the features of a given database and the threshold setting, traditional approaches have to generate a very large number of patterns, and this number can grow geometrically. In addition, such approaches waste runtime and memory resources. Furthermore, the excessive number of patterns they generate makes the mining results difficult to analyze. To solve these problems of traditional frequent pattern mining approaches, the concept of representative pattern mining and various related works have been proposed. In contrast to traditional approaches that find all possible frequent patterns in databases, representative pattern mining approaches selectively extract a smaller number of patterns that represent the general frequent patterns. In this paper, we describe the details and characteristics of pattern condensing techniques that consider the maximality or closure property of generated frequent patterns, and we compare and analyze these techniques.
Given a frequent pattern, satisfying maximality means that all possible supersets of the pattern have support values smaller than the user-specified minimum support threshold; satisfying the closure property means that no superset of the pattern has support equal to that of the pattern. By mining maximal frequent patterns or closed frequent patterns, we can achieve effective pattern compression and perform mining operations with much less time and space. In addition, compressed patterns can be converted back into the original frequent patterns if necessary; in particular, the closed frequent pattern notation can convert representative patterns back into the original ones without any information loss. That is, we can obtain the complete set of original frequent patterns from the closed frequent ones. Although the maximal frequent pattern notation does not guarantee complete recovery in the process of pattern conversion, it has the advantage of extracting a smaller number of representative patterns more quickly than the closed frequent pattern notation. In this paper, we present the performance results and characteristics of these techniques in terms of pattern generation, runtime, and memory usage by conducting a performance evaluation on various real-world data sets. For a more exact comparison, we also run the algorithms implementing these techniques on the same platform and at the same implementation level.
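
The two definitions above can be illustrated with a short sketch that selects the closed and maximal patterns from an already mined set of frequent patterns. It is a brute-force illustration of the definitions, not one of the mining algorithms evaluated in the paper; the function names and the toy supports are assumptions.

```python
def closed_and_maximal(frequent):
    """Split frequent patterns into closed and maximal ones.

    frequent: dict mapping frozenset patterns to support counts; it must
    contain every frequent pattern for the given minimum support.
    """
    closed, maximal = set(), set()
    for pattern, support in frequent.items():
        supersets = [other for other in frequent if pattern < other]
        if not supersets:
            # No frequent proper superset exists: the pattern is maximal.
            maximal.add(pattern)
        if not any(frequent[other] == support for other in supersets):
            # No proper superset shares its support: the pattern is closed.
            closed.add(pattern)
    return closed, maximal


def support_from_closed(pattern, closed_supports):
    """Recover a frequent pattern's support from the closed patterns alone."""
    return max(s for c, s in closed_supports.items() if pattern <= c)


# Supports from transactions {a,b}, {a,b}, {a,b,c}, {a,c}; minimum support 2.
frequent = {
    frozenset("a"): 4,
    frozenset("b"): 3,
    frozenset("c"): 2,
    frozenset("ab"): 3,
    frozenset("ac"): 2,
}
closed, maximal = closed_and_maximal(frequent)
closed_supports = {p: frequent[p] for p in closed}
print(sorted("".join(sorted(p)) for p in closed))   # ['a', 'ab', 'ac']
print(sorted("".join(sorted(p)) for p in maximal))  # ['ab', 'ac']
print(support_from_closed(frozenset("b"), closed_supports))  # 3
```

In this example the closed set keeps three of the five patterns and still recovers every support exactly, whereas the maximal set keeps only two patterns but loses the support of {a}.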