Search | Korea Science

Analysis and Evaluation of Frequent Pattern Mining Technique based on Landmark Window (랜드마크 윈도우 기반의 빈발 패턴 마이닝 기법의 분석 및 성능평가)

Pyun, Gwangbum;Yun, Unil
- Journal of Internet Computing and Services, v.15 no.3, pp.101-107, 2014
With the development of online services, databases have shifted from static structures to dynamic stream structures. Earlier data mining techniques have served as decision-making tools in tasks such as establishing marketing strategies and analyzing DNA. However, areas of recent interest such as sensor networks, robotics, and artificial intelligence require the capability to analyze real-time data more quickly. Landmark window-based frequent pattern mining, one of the stream mining approaches, performs mining operations on parts of a database or on each transaction, instead of on all the data. In this paper, we analyze and evaluate two well-known landmark window-based frequent pattern mining algorithms, Lossy counting and hMiner. When Lossy counting mines frequent patterns from a set of new transactions, it performs union operations between the previous and current mining results. hMiner, a state-of-the-art algorithm based on the landmark window model, conducts mining operations whenever a new transaction arrives. Since hMiner extracts frequent patterns as soon as a new transaction is entered, we can obtain the latest mining results reflecting real-time information. For this reason, such algorithms are also called online mining approaches. We evaluate and compare the performance of the primitive algorithm, Lossy counting, and the latest one, hMiner. As criteria for the performance analysis, we first consider each algorithm's total runtime and average processing time per transaction. In addition, to compare the efficiency of their storage structures, we evaluate their maximum memory usage. Lastly, we show how stably the two algorithms perform mining on databases with a gradually increasing number of items.
In the evaluation of mining time and transaction processing, hMiner is faster than Lossy counting. Since hMiner stores candidate frequent patterns in a hash structure, it can access them directly. Lossy counting, meanwhile, stores them in a lattice; thus, it has to traverse multiple nodes to access a candidate frequent pattern. On the other hand, hMiner performs worse than Lossy counting in terms of maximum memory usage. hMiner must keep all of the information for each candidate frequent pattern in order to store it in a hash bucket, while Lossy counting reduces the stored information by using the lattice. Since the storage of Lossy counting can share items that occur in multiple patterns, its memory usage is more efficient than that of hMiner. However, hMiner is more efficient than Lossy counting in the scalability evaluation, for the following reasons. As the number of items increases, fewer items are shared, so Lossy counting's memory efficiency weakens. Furthermore, as the number of transactions grows, its pruning effect becomes worse. From the experimental results, we conclude that landmark window-based frequent pattern mining algorithms are suitable for real-time systems, although they require a significant amount of memory. Hence, their data structures need to be made more efficient before they can also be used in resource-constrained environments such as wireless sensor networks (WSNs).
https://doi.org/10.7472/jksii.2014.15.3.101
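
To make the counting-and-pruning idea behind Lossy counting concrete, here is a minimal sketch of the classic single-item variant. It is an illustration only: the paper evaluates the itemset version with a lattice store, which is not reproduced here, and the names `lossy_count`, `frequent_items`, `epsilon`, and `support` are chosen for this example rather than taken from the paper.

```python
import math


def lossy_count(stream, epsilon):
    """Approximate per-item counts over a stream (single items, not itemsets).

    Each entry maps item -> (count, max_error). Entries whose count plus
    maximum possible error cannot exceed the current bucket id are pruned
    at every bucket boundary.
    """
    width = math.ceil(1 / epsilon)  # number of transactions per bucket
    counts = {}
    for n, item in enumerate(stream, start=1):
        bucket = math.ceil(n / width)  # id of the bucket that n falls into
        if item in counts:
            freq, delta = counts[item]
            counts[item] = (freq + 1, delta)
        else:
            # The item may have been seen and pruned up to bucket - 1 times.
            counts[item] = (1, bucket - 1)
        if n % width == 0:
            # Bucket boundary: drop entries that can no longer be frequent.
            counts = {k: (f, d) for k, (f, d) in counts.items() if f + d > bucket}
    return counts


def frequent_items(counts, n, support, epsilon):
    """Report items whose stored count reaches (support - epsilon) * n."""
    return {k for k, (f, _) in counts.items() if f >= (support - epsilon) * n}
```

Pruning only at bucket boundaries is what keeps memory bounded, at the cost of undercounting any item by at most `epsilon * n`.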

A Comparative Study on the Effective Deep Learning for Fingerprint Recognition with Scar and Wrinkle (상처와 주름이 있는 지문 판별에 효율적인 심층 학습 비교연구)

Kim, JunSeob;Rim, BeanBonyka;Sung, Nak-Jun;Hong, Min
- Journal of Internet Computing and Services, v.21 no.4, pp.17-23, 2020
Biometric information, which refers to measurements of human characteristics, has attracted great attention as a highly reliable security technology because it cannot be stolen or lost. Among the kinds of biometric information, fingerprints are mainly used in fields such as identity verification and identification. If a fingerprint image has a problem that makes authentication difficult, such as a wound, wrinkle, or moisture, a fingerprint expert can identify the problem directly in the preprocessing step and apply an image processing algorithm appropriate to that problem. In this case, artificial intelligence software that distinguishes fingerprint images with cuts and wrinkles makes it easy to check whether cuts or wrinkles are present, and by selecting an appropriate algorithm, the fingerprint image can be easily improved. In this study, we built a database of 17,080 fingerprints in total by acquiring all fingerprints of 1,010 students from the Royal University of Cambodia, 600 subjects from the Sokoto open data set, and 98 Korean students. Criteria were established to determine whether the fingerprints in the database have injuries or wrinkles, and the data were validated by experts. The training and test data sets consisted of the Cambodian and Sokoto data, split at a ratio of 8:2. The data of the 98 Korean students were used as a validation data set. Using the constructed data set, five CNN-based architectures, namely Classic CNN, AlexNet, VGG-16, ResNet50, and YOLO v3, were implemented, and a study was conducted to find the model that performed best at reading the fingerprints. Among the five architectures, ResNet50 showed the best performance with 81.51%.
https://doi.org/10.7472/jksii.2020.21.4.17

A preliminary study for development of an automatic incident detection system on CCTV in tunnels based on a machine learning algorithm (기계학습(machine learning) 기반 터널 영상유고 자동 감지 시스템 개발을 위한 사전검토 연구)

Shin, Hyu-Soung;Kim, Dong-Gyou;Yim, Min-Jin;Lee, Kyu-Beom;Oh, Young-Sup
- Journal of Korean Tunnelling and Underground Space Association, v.19 no.1, pp.95-107, 2017
This study is a preliminary investigation for the development of an automatic tunnel incident detection system based on a machine learning algorithm, which is intended to detect incidents taking place in a tunnel in real time and to identify the type of each incident. Two road sites where CCTVs are operating were selected, and part of the CCTV images was processed to produce sets of training data. The data sets are composed of position and time information of moving objects on the CCTV screen, extracted by detecting and tracking objects entering the screen with a conventional image processing technique. The data sets are matched with 6 categories of events such as lane change, stopping, etc., which are also included in the training data sets. The training data are learned by a resilience neural network with two hidden layers, and 9 architectural models are set up for parametric studies, from which the architectural model 300 (first hidden layer)-150 (second hidden layer) is found to be optimal, giving the highest accuracy on the training data as well as on testing data not used for training. The study showed that highly variable and complex traffic and incident features can be identified well by machine learning without any explicit definition of feature rules. In addition, the detection capability and accuracy of the machine learning-based system will improve automatically as more CCTV image data from tunnels accumulate.
https://doi.org/10.9711/KTAJ.2017.19.1.095

Modeling of Sensorineural Hearing Loss for the Evaluation of Digital Hearing Aid Algorithms (디지털 보청기 알고리즘 평가를 위한 감음신경성 난청의 모델링)

김동욱;박영철
- Journal of Biomedical Engineering Research, v.19 no.1, pp.59-68, 1998
Digital hearing aids offer many advantages over conventional analog hearing aids. With the advent of high-speed digital signal processing chips, new digital techniques have been introduced to digital hearing aids. However, the evaluation of new ideas in hearing aids is necessarily accompanied by intensive subject-based clinical tests, which require much time and cost. In this paper, we present an objective method to evaluate and predict the performance of hearing aid systems without the help of such subject-based tests. In the hearing impairment simulation (HIS) algorithm, a sensorineural hearing impairment model is established from auditory test data of the impaired subject being simulated. The nonlinear behavior of loudness recruitment is defined using hearing loss functions generated from the measurements. To transform the natural input sound into the impaired one, a frequency sampling filter is designed. The filter is continuously refreshed with the level-dependent frequency response function provided by the impairment model. To assess its performance, the HIS algorithm was implemented in real time on a floating-point DSP. Signals processed with the real-time system were presented to normal-hearing subjects, and their auditory data as modified by the system were measured. The sensorineural hearing impairment was simulated and tested. The hearing threshold and speech discrimination tests demonstrated the effectiveness of the system for hearing impairment simulation. Using the HIS system, we evaluated three typical hearing aid algorithms.

An Implementation of Lighting Control System using Interpretation of Context Conflict based on Priority (우선순위 기반의 상황충돌 해석 조명제어시스템 구현)

Seo, Won-Il;Kwon, Sook-Youn;Lim, Jae-Hyun
- Journal of Internet Computing and Services, v.17 no.1, pp.23-33, 2016
Current smart lighting provides a lighting environment suited to the current context after identifying the user's action and location through sensors. Sensor-based context awareness technology considers only a single user, and studies that interpret the various context occurrences and conflicts of multiple users are lacking. Existing studies have used fuzzy theory and algorithms including ReBa as methodologies for resolving context conflicts. These approaches merely avoid the possibility of context conflict by dividing the space where users are located into many areas and providing services per area. Therefore, they cannot really be regarded as a customized service that resolves context conflicts based on personal preference. This paper proposes a priority-based LED lighting control system that interprets multiple context conflicts: when services conflict because various contexts occur simultaneously for many users, it decides the service based on the priority granted to each context type. This study classifies the residential environment into five areas (living room, bedroom, study room, kitchen, and bathroom) and defines 20 contexts that may occur within each area for several users, such as exercising, doing makeup, reading, dining, and entering. The proposed system defines the various contexts of users with an ontology-based model and provides a user-oriented lighting environment through standard-based rules and a context reasoning engine. To resolve context conflicts among users in the same space at the same time, the context in which user concentration is required is given the highest priority. When priorities are equal, visual comfort is used as the next-best criterion. In this manner, these serve as the criteria for service selection when a conflict occurs.
https://doi.org/10.7472/jksii.2016.17.1.23
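
The selection rule described in the abstract above (highest priority wins, with visual comfort as the tie-breaker) can be sketched as follows. This is only an illustration under assumed data: the `Context` class, the numeric `priority` scale, and the `comfort` scores are hypothetical and do not come from the paper, which uses an ontology-based model and a reasoning engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    user: str
    name: str
    priority: int   # assumed scale: higher means more concentration required
    comfort: float  # assumed visual comfort score of the matching lighting service


def resolve(contexts):
    """Pick the context whose lighting service should be applied.

    Contexts are compared by priority first; comfort breaks ties.
    """
    return max(contexts, key=lambda c: (c.priority, c.comfort))


# Three users in the same room at the same time (hypothetical values).
active = [
    Context("user1", "dining", priority=1, comfort=0.6),
    Context("user2", "reading", priority=3, comfort=0.5),
    Context("user3", "doing makeup", priority=3, comfort=0.8),
]
winner = resolve(active)
```

Comparing `(priority, comfort)` tuples keeps the tie-break rule in one place, so changing the secondary criterion only touches the key function.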

Optimal supervised LSA method using selective feature dimension reduction (선택적 자질 차원 축소를 이용한 최적의 지도적 LSA 방법)

Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
- Science of Emotion and Sensibility, v.13 no.1, pp.47-60, 2010
Most research on classification has used kNN (k-Nearest Neighbor) and SVM (Support Vector Machine), which are known as learning-based models, and the Bayesian classifier and NNA (Neural Network Algorithm), which are known as statistics-based methods. However, these face limitations of space and time when classifying the very large number of web pages on today's internet. Moreover, most classification studies use a uni-gram feature representation, which does not represent the real meaning of words well. Korean web page classification poses additional problems because Korean words often have multiple meanings (polysemy). For these reasons, LSA (Latent Semantic Analysis) has been proposed for classification in such environments (large data sets and polysemous words). LSA uses SVD (Singular Value Decomposition), which decomposes the original term-document matrix into three matrices and reduces their dimensions. Through SVD, it is possible to create a new low-dimensional semantic space for representing vectors, which makes classification efficient and allows the latent meaning of words or documents (or web pages) to be analyzed. Although LSA is good at classification, it has some drawbacks. When SVD reduces the dimensions of the matrix and creates the new semantic space, it considers which dimensions represent vectors well, not which dimensions discriminate between vectors well. This is why LSA does not improve classification performance as much as expected. In this paper, we propose a new LSA that selects the optimal dimensions for both discriminating and representing vectors, minimizing these drawbacks and improving performance. The proposed method shows better and more stable performance than other LSA methods in low-dimensional space. In addition, we obtain further improvement in classification by creating and selecting features through removing stopwords and statistically weighting specific values.
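
For reference, the dimension reduction that the abstract above attributes to SVD is usually written as a rank-k truncation. The notation below (X for the m-by-n term-document matrix, k for the number of retained dimensions) is chosen here for illustration and is not taken from the paper; the authors' selective choice of discriminative dimensions is not shown.

```latex
X \approx X_k = U_k \Sigma_k V_k^{\top}, \qquad
U_k \in \mathbb{R}^{m \times k},\quad
\Sigma_k = \operatorname{diag}(\sigma_1, \dots, \sigma_k),\quad
V_k \in \mathbb{R}^{n \times k},\quad
\sigma_1 \ge \dots \ge \sigma_k > 0
```

Standard LSA keeps the k largest singular values, which favors dimensions that represent the data well; this is the behavior the abstract identifies as a limitation for classification.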

Comparison of Forest Carbon Stocks Estimation Methods Using Forest Type Map and Landsat TM Satellite Imagery (임상도와 Landsat TM 위성영상을 이용한 산림탄소저장량 추정 방법 비교 연구)

Kim, Kyoung-Min;Lee, Jung-Bin;Jung, Jaehoon
- Korean Journal of Remote Sensing, v.31 no.5, pp.449-459, 2015
The conventional National Forest Inventory (NFI)-based forest carbon stock estimation method is suitable for national-scale estimation but not for regional-scale estimation, due to the lack of NFI plots. In this study, for the purpose of regional-scale carbon stock estimation, we created grid-based forest carbon stock maps using spatial ancillary data and two types of up-scaling methods. Chungnam province was chosen as the study area, for which the 5th NFI (2006-2009) data were collected. The first method (method 1) uses a forest type map as ancillary data and a regression model for forest carbon stock estimation, whereas the second method (method 2) uses satellite imagery and the k-Nearest Neighbor (k-NN) algorithm. Additionally, to account for uncertainty, the final AGB carbon stock maps were generated by running 200 iterations of Monte Carlo simulation. As a result, compared to the NFI-based estimate (21,136,911 tonC), the total carbon stock was over-estimated by method 1 (22,948,151 tonC) but under-estimated by method 2 (19,750,315 tonC). In a paired t-test with 186 independent data, the average carbon stock estimated by the NFI-based method was statistically different from that of method 2 (p<0.01) but not from that of method 1 (p>0.01). In particular, the Monte Carlo simulation showed that the smoothing effect of the k-NN algorithm and the mis-registration error between NFI plots and the satellite image can lead to large uncertainty in carbon stock estimation. Although method 1 was found suitable for carbon stock estimation of forest stands that feature heterogeneous trees in Korea, a satellite-based method is still in demand to provide periodic estimates of un-investigated, large forest areas. In these respects, future work will focus on extending the spatial and temporal extent of the study area and on robust carbon stock estimation with various satellite images and estimation methods.
https://doi.org/10.7780/kjrs.2015.31.5.9
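
The k-NN up-scaling idea in method 2 above can be illustrated with a small sketch: each grid cell receives the mean carbon value of the k field plots whose features are closest to its own. The function name `knn_estimate`, the two-value feature vectors, and the sample numbers are assumptions made for this example; the paper's actual features, distance metric, and k are not given in the abstract.

```python
import math


def knn_estimate(query, plots, k=3):
    """Estimate a value for `query` as the mean of its k nearest plots.

    `plots` is a list of (feature_vector, carbon_value) pairs; distance is
    plain Euclidean distance in feature space (an assumption of this sketch).
    """
    nearest = sorted(plots, key=lambda p: math.dist(query, p[0]))[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical plots: (spectral features, carbon stock in tonC/ha).
sample_plots = [
    ((0.0, 0.0), 10.0),
    ((1.0, 0.0), 20.0),
    ((0.0, 1.0), 30.0),
    ((5.0, 5.0), 100.0),
]
estimate = knn_estimate((0.1, 0.1), sample_plots, k=3)
```

Because each estimate is an average over neighbors, extreme plot values are pulled toward the mean, which is one way to see the smoothing effect mentioned in the abstract.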

Semantic Access Path Generation in Web Information Management (웹 정보의 관리에 있어서 의미적 접근경로의 형성에 관한 연구)

Lee, Wookey
- Journal of the Korea Society of Computer and Information, v.8 no.2, pp.51-56, 2003
The structuring of Web information supports a strongly user-side viewpoint: a user wants to satisfy his or her own needs when browsing a specific Web site. Beyond applying the depth-first or breadth-first algorithm, the Web information is abstracted into a hierarchical structure. A prototype system is suggested in order to visualize and represent semantic significance. As a motivating example, a Web test site is presented and analyzed with respect to several keywords. As future research, the Web site model should be extended to the whole WWW, and an accurate assessment function needs to be devised with which the suggested models can be evaluated.

Comparison between REML and Bayesian via Gibbs Sampling Algorithm with a Mixed Animal Model to Estimate Genetic Parameters for Carcass Traits in Hanwoo(Korean Native Cattle) (한우의 도체형질 유전모수 추정을 위한 REML과 Bayesian via Gibbs Sampling 방법의 비교 연구)

Roh, S.H.;Kim, B.W.;Kim, H.S.;Min, H.S.;Yoon, H.B.;Lee, D.H.;Jeon, J.T.;Lee, J.G.
- Journal of Animal Science and Technology, v.46 no.5, pp.719-728, 2004
The aims of this study were to estimate genetic parameters for carcass traits of Hanwoo (Korean Native Cattle) and to compare two statistical algorithms for estimating genetic parameters. Data obtained from 1,526 steers at the Hanwoo Improvement Center and Hanwoo Improvement Complex Area from 1996 to 2001 were used for the analyses. The carcass traits considered were carcass weight, dressing percentage, eye muscle area, backfat thickness, and marbling score. Genetic parameters estimated with the EM-REML algorithm were compared with those from Bayesian inference via Gibbs sampling to examine their statistical properties. The estimated heritabilities of the carcass traits by the REML method were 0.28, 0.25, 0.35, 0.39, and 0.51, respectively, and those by the Gibbs sampling method were 0.29, 0.25, 0.40, 0.42, and 0.54, respectively. These estimates were not significantly different, even though the heritabilities estimated by the Gibbs sampling method were equal to or higher than those by the REML method. Since the statistics estimated by the two methods were not significantly different in this study, it is inferred that both methods can be applied efficiently to the analysis of carcass traits of cattle. However, further studies are needed to define an optimal statistical method for handling large-scale performance data.
https://doi.org/10.5187/JAST.2004.46.5.719
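
As background for the heritability values quoted above, narrow-sense heritability is commonly defined as the share of phenotypic variance explained by additive genetic variance. The form below assumes the simplest animal model with only additive genetic and residual variance components; the mixed animal model used in the paper may include further terms, and the symbols are chosen here for illustration.

```latex
h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}
```

Under this assumed form, REML and Gibbs sampling differ in how they estimate the two variance components, not in how the ratio is formed.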

Recommender system using BERT sentiment analysis (BERT 기반 감성분석을 이용한 추천시스템)

Park, Ho-yeon;Kim, Kyoung-jae
- Journal of Intelligence and Information Systems, v.27 no.2, pp.1-15, 2021
When it is difficult for us to make a decision, we ask friends or people around us for advice. When we decide to buy products online, we read anonymous reviews before buying. With the advent of the data-driven era, the development of IT is generating large amounts of data, from individuals to objects. Companies and individuals have accumulated, processed, and analyzed so much data that they can now make or execute decisions directly from data, where they used to depend on experts. Nowadays, recommender systems play a vital role in determining users' purchase preferences, and web services (Facebook, Amazon, Netflix, YouTube) use them to induce clicks. For example, YouTube's recommender system, which is used by 1 billion people worldwide every month, draws on the videos that users "like" and the videos they have watched. Recommender system research is deeply linked to practical business; therefore, many researchers are interested in building better solutions. Recommender systems use information obtained from their users to generate recommendations, because developing them requires information on items that the user is likely to prefer. Through recommender systems, we have come to trust patterns and rules derived from data rather than empirical intuition. The growth in the volume of data has led machine learning to develop into deep learning. However, such recommender systems are not a solution to everything. For a recommender system to work, the data must not be sparse and must be sufficient in amount. It also requires detailed information about the individual. Recommender systems work correctly when these conditions hold. When the interaction log is insufficient, recommendation becomes a complex problem for both consumers and sellers.
From the seller's perspective, recommendations must be made to the consumer at a personal level, and from the consumer's perspective, appropriate recommendations must be received on the basis of reliable data. In this paper, to improve the accuracy of "appropriate recommendation" to consumers, we propose a recommender system combined with context-based deep learning. This research combines user-based data to create a hybrid recommender system. The hybrid approach developed is not a purely collaborative recommender system but a collaborative extension that integrates user data with deep learning. Customer review data were used as the data set. Consumers buy products in online shopping malls and then write product reviews. Review ratings come from buyers who have already purchased the product, giving users confidence before they purchase it. However, recommender systems mainly use scores or ratings rather than reviews to suggest items purchased by many users. In fact, consumer reviews contain product opinions and user sentiment that can be used in evaluation. By incorporating these into the study, this paper aims to improve the recommender system. The proposed algorithm is intended for use when individuals have difficulty selecting an item. Consumer reviews and record patterns make it possible to rely on recommendations appropriately. The algorithm implements a recommender system through collaborative filtering. Predictive accuracy in this study is measured by Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). Netflix has strategically used the recommender system in its programs through competitions that reduce RMSE every year, making good use of predictive accuracy. Research on hybrid recommender systems that combine NLP approaches for personalized recommendation, deep learning bases, etc. has been increasing. Among NLP studies, sentiment analysis began to take shape in the mid-2000s as user review data increased.
Sentiment analysis is a text classification task based on machine learning. Machine learning-based sentiment analysis has the disadvantage that it struggles to take the characteristics of the text into account, which makes it difficult to identify the information expressed in a review. In this study, we propose a deep learning recommender system that uses BERT's sentiment analysis to minimize these disadvantages of machine learning. The comparison models were recommender systems based on Naive-CF (collaborative filtering), SVD (singular value decomposition)-CF, MF (matrix factorization)-CF, BPR-MF (Bayesian personalized ranking matrix factorization)-CF, LSTM, CNN-LSTM, and GRU (Gated Recurrent Units). In the experiment, the recommender system based on BERT performed best.
https://doi.org/10.13088/jiis.2021.27.2.001
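
The two accuracy measures named in the abstract above, RMSE and MAE, can be computed as in the following minimal sketch. The function names and the sample ratings are made up for illustration and are not data from the paper.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large errors more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the errors."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical true ratings and model predictions on a 1-5 scale.
true_ratings = [4, 3, 5, 1]
predicted_ratings = [3, 3, 4, 3]
print(rmse(true_ratings, predicted_ratings), mae(true_ratings, predicted_ratings))
```

In this sample the single two-point miss dominates RMSE while MAE treats all errors linearly, which is why the two are often reported together.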
