Search | Korea Science

Classification and Analysis of Data Mining Algorithms (데이터마이닝 알고리즘의 분류 및 분석)

Lee, Jung-Won;Kim, Ho-Sook;Choi, Ji-Young;Kim, Hyon-Hee;Yong, Hwan-Seung;Lee, Sang-Ho;Park, Seung-Soo
- Journal of KIISE:Databases / v.28 no.3 / pp.279-300 / 2001
Data mining plays an important role in the knowledge discovery process, and existing algorithms are usually selected according to the specific purpose of the mining task. Data mining techniques are now actively applied to statistics, business, electronic commerce, biology, and medicine, and numerous algorithms are being researched and developed for these applications. In the long run, however, only a few algorithms that are well suited to specific applications and perform well on large databases will survive, so it is reasonable to focus future effort on those algorithms. This paper classifies about 30 existing algorithms into 7 categories: association rule, clustering, neural network, decision tree, genetic algorithm, memory-based reasoning, and Bayesian network. We first analyze the systematic hierarchy and characteristics of the algorithms, then present 14 criteria for classifying them along with the classification results based on these criteria. Finally, we propose the best algorithms among comparable algorithms with different features and performance. The results of this paper can be used as a guideline for data mining research as well as for field applications of data mining.

Efficient Dynamic Weighted Frequent Pattern Mining by using a Prefix-Tree (Prefix-트리를 이용한 동적 가중치 빈발 패턴 탐색 기법)

Jeong, Byeong-Soo;Farhan, Ahmed
- The KIPS Transactions:PartD / v.17D no.4 / pp.253-258 / 2010
Traditional frequent pattern mining assumes an equal profit/weight value for every item. Weighted frequent pattern (WFP) mining, which considers different weights for different items, has become an important research issue in data mining and knowledge discovery. Existing algorithms in this area are based on fixed weights. In real-world scenarios, however, the price/weight/importance of a pattern may vary frequently due to unavoidable circumstances. Tracking these dynamic changes is necessary in application areas such as retail market basket analysis and web click stream management. In this paper, we propose a novel concept of dynamic weight and an algorithm, DWFPM (dynamic weighted frequent pattern mining). Our algorithm can handle situations where the price/weight of a pattern varies dynamically. It scans the database exactly once and is also suitable for real-time data processing. To our knowledge, this is the first research work to mine weighted frequent patterns using dynamic weights. Extensive performance analyses show that our algorithm is very efficient and scalable for WFP mining using dynamic weights.
https://doi.org/10.3745/KIPSTD.2010.17D.4.253
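
The idea of weights that change over time can be illustrated with a small brute-force sketch. It is not the DWFPM prefix-tree algorithm from the paper. It assumes, purely for illustration, that transactions arrive in batches, that each batch carries its own item weights, that a pattern's weight in a batch is the mean of its item weights, and that a pattern's weighted support is the sum over batches of weight times frequency. The function name, the batch layout, and the threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def dynamic_weighted_support(batches, max_len=2):
    """Brute-force weighted support where item weights change per batch.

    batches: list of (transactions, weights) pairs. transactions is a list of
    item sets; weights maps item -> weight valid for that batch only.
    Assumption (not from the paper): pattern weight = mean of item weights.
    """
    support = defaultdict(float)
    for transactions, weights in batches:
        # Count how often each pattern occurs inside this batch.
        freq = defaultdict(int)
        for transaction in transactions:
            items = sorted(transaction)
            for k in range(1, max_len + 1):
                for pattern in combinations(items, k):
                    freq[pattern] += 1
        # Weight the batch-local frequency with the batch-local weights.
        for pattern, count in freq.items():
            pattern_weight = sum(weights[i] for i in pattern) / len(pattern)
            support[pattern] += pattern_weight * count
    return dict(support)


# Two batches with different weights for the same items (made-up values).
batches = [
    ([{"a", "b"}, {"a", "c"}, {"a", "b", "c"}], {"a": 0.5, "b": 0.25, "c": 1.0}),
    ([{"a", "b"}, {"b", "c"}], {"a": 1.0, "b": 0.5, "c": 0.25}),
]
result = dynamic_weighted_support(batches)

# Keep patterns whose weighted support reaches a hypothetical threshold.
frequent = {p: s for p, s in result.items() if s >= 1.5}
```

Because the weights are stored per batch, a price change only affects transactions that arrive after it. This sketch enumerates patterns exhaustively, so it is only practical for tiny inputs.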

Development of a Web Service System of Large Capacity Image Data: Focusing on the System Established for Ministry of Environment (대용량 영상자료 웹 서비스 시스템의 개발: 환경부 구축 사례 중심으로)

Lee, Sang-Ik;Shin, Sang-Hee;Choi, Yun-Soo;Lee, Im-Pyeong
- Journal of Korean Society for Geospatial Information Science / v.12 no.3 s.30 / pp.61-67 / 2004
Satellite and aerial images are used effectively to monitor ecological and environmental conditions, and a growing number of officials in the Ministry of Environment need these image data for various administrative affairs. However, most of these images are large, which makes them difficult to deliver over a network, and officials without specialized knowledge of remote sensing and image processing find it difficult to make active use of the delivered data. We therefore established a large-capacity image data service system that employs compressed image transmission and web-based image processing techniques. The system allows officials to rapidly access all associated image data and conveniently use them through various functions implemented for remote sensing, image processing, and GIS operations. As a result, the system has been actively used in the officials' decision-making processes and has greatly reduced the resources required for data analysis in various administrative affairs.

Study on Soil Moisture Predictability using Machine Learning Technique (머신러닝 기법을 활용한 토양수분 예측 가능성 연구)

Jo, Bongjun;Choi, Wanmin;Kim, Youngdae;Kim, Kisung;Kim, Jonggun
- Proceedings of the Korea Water Resources Association Conference / 2020.06a / pp.248-248 / 2020
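Soil moisture is one of the key variables closely related to water balance components such as evapotranspiration, runoff, and infiltration. Soil moisture levels vary spatially with soil characteristics, land use type, and weather conditions, and in particular show temporal variability depending on weather conditions. Conventional soil moisture measurement relies on laboratory measurement of collected soil samples or on field surveys using measuring equipment, both of which have time and cost limitations, while remote sensing techniques cover a wide spatial range but have low temporal resolution. In addition, model-based soil moisture prediction requires specialized knowledge and the construction of complex input data. Recently, machine learning techniques have been widely used to derive user-desired outputs by learning from large amounts of data. This study therefore analyzes the predictability of soil moisture through iterative training of machine learning techniques using various meteorological factors related to soil moisture (precipitation, wind speed, humidity, etc.). To this end, machine learning techniques were applied to the Cheongmicheon and Seolmacheon watersheds, where spatiotemporal soil moisture measurements are well established. Hydrological data for 2008 to 2012 were obtained for the two sites, and meteorological data were obtained from the Meteorological Data Open Portal (기상자료개방포털) and WAMIS. The soil moisture and meteorological data were used to train the machine learning algorithms, and soil moisture was predicted on the basis of the 2012 meteorological data. The machine learning techniques used are decision tree, neural network (multi-layer perceptron, MLP), k-nearest neighbors (KNN), support vector machine (SVM), random forest, and gradient boosting. A heat map was used to analyze the correlation between soil moisture and meteorological factors. The heat map analysis showed that, among the various meteorological data, precipitation and relative humidity had the greatest influence on the temporal variation of soil moisture. In addition, when the machine learning techniques were applied using various meteorological factors, all techniques except the neural network (MLP) produced results broadly similar to the measured values in both areas, and the comparison graphs also showed similar trends between measured and predicted values. It is therefore judged that the temporal variation of soil moisture can be predicted with machine learning techniques using correlated past meteorological data.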
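
The correlation step behind a heat map of this kind can be sketched as follows. The abstract does not state which correlation measure was used, so this assumes the Pearson coefficient. The numbers are synthetic values made up for illustration, not data from the study, and the variable names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Synthetic, made-up daily values for illustration only (not study data).
soil_moisture = [0.20, 0.25, 0.35, 0.30, 0.22]
factors = {
    "precipitation": [0.0, 5.0, 20.0, 12.0, 1.0],
    "wind_speed": [3.0, 2.5, 2.0, 2.8, 3.1],
}

# One row of the correlation matrix: each factor against soil moisture.
row = {name: pearson(values, soil_moisture) for name, values in factors.items()}
```

A heat map is then a color-coded rendering of such coefficients for every pair of variables. Factors with coefficients near zero are weaker candidates as model inputs.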

A Fundamental Study for a Dispersion Characteristics of Surface Waves on an Influence of Adjacent Structures (인접구조물의 영향에 의한 표면파 분산특성의 기초연구)

Cho, Mi-Ra;Cho, Sung-Ho;Kim, Bong-Chan;Kim, Suhk-Chol
- KSCE Journal of Civil and Environmental Engineering Research / v.28 no.4C / pp.239-245 / 2008
This fundamental study was performed to establish a knowledge base for developing an optimal surface-wave method for urban areas with adjacent structures. First, theoretical modelling was performed to investigate the influence of adjacent structures on the dispersion characteristics of surface waves. Then, geotechnical sites with a concrete model of an adjacent structure and with a real subway box structure were tested by the surface-wave method to investigate the influence of adjacent structures. The major factors through which adjacent structures influenced surface-wave propagation were the direct distance between the measurement array and the adjacent structure, the stiffness contrast between layers, and the type of seismic source.
https://doi.org/10.12652/Ksce.2008.28.4C.239

Visible and SWIR Satellite Image Fusion Using Multi-Resolution Transform Method Based on Haze-Guided Weight Map (Haze-Guided Weight Map 기반 다중해상도 변환 기법을 활용한 가시광 및 SWIR 위성영상 융합)

Taehong Kwak;Yongil Kim
- Korean Journal of Remote Sensing / v.39 no.3 / pp.283-295 / 2023
With the development of sensor and satellite technology, numerous high-resolution and multi-spectral satellite images have become available. Due to their wavelength-dependent reflection, transmission, and scattering characteristics, multi-spectral satellite images can provide complementary information for earth observation. In particular, the short-wave infrared (SWIR) band can penetrate certain types of atmospheric aerosols owing to reduced Rayleigh scattering, which allows a clearer view and more detailed information to be captured from hazed surfaces than with the visible band. In this study, we propose a multi-resolution transform-based image fusion method to combine visible and SWIR satellite images. The purpose of the fusion method is to generate a single integrated image that incorporates complementary information, such as detailed background information from the visible band and land cover information in the haze region from the SWIR band. For this purpose, this study applied the Laplacian pyramid-based multi-resolution transform method, which is a representative image decomposition approach for image fusion. Additionally, we modified the multi-resolution fusion method by incorporating a haze-guided weight map based on the prior knowledge that SWIR bands contain more information in pixels from the haze region. The proposed method was validated using very high-resolution satellite images from Worldview-3, containing multi-spectral visible and SWIR bands. Experimental data including hazed areas with limited visibility caused by smoke from wildfires were used to validate the penetration properties of the proposed fusion method. Both quantitative and visual evaluations were conducted using image quality assessment indices. The results showed that the bright features from the SWIR bands in the hazed areas were successfully fused into the integrated feature maps without any loss of detailed information from the visible bands.
https://doi.org/10.7780/kjrs.2023.39.3.3
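
As a rough guide to the fusion scheme described above, the standard Laplacian pyramid equations can be written as follows. The pyramid definition and its reconstruction are the usual textbook forms. The per-level blending rule is an assumption about how a weight map would typically enter such a scheme, not the formula given in the paper. The symbols $G_l$, $L_l$, $W$, and $N$ are introduced here for illustration only.

```latex
% Laplacian pyramid of an image I with Gaussian pyramid levels G_0, ..., G_N
\begin{align}
  G_0 &= I, \qquad G_{l+1} = \operatorname{reduce}(G_l), \\
  L_l &= G_l - \operatorname{expand}(G_{l+1}) \quad (0 \le l < N), \qquad L_N = G_N .
\end{align}

% Assumed blending rule: W is the haze-guided weight map with values in [0, 1],
% and G_l(W) is level l of its Gaussian pyramid
\begin{equation}
  L_l^{F} = G_l(W) \odot L_l^{\mathrm{SWIR}} + \bigl(1 - G_l(W)\bigr) \odot L_l^{\mathrm{VIS}} .
\end{equation}

% Reconstruction of the fused image, from the coarsest level to the finest
\begin{equation}
  G_N^{F} = L_N^{F}, \qquad G_l^{F} = L_l^{F} + \operatorname{expand}\bigl(G_{l+1}^{F}\bigr), \qquad F = G_0^{F} .
\end{equation}
```

Under this assumed rule, pixels where the weight map is close to 1 (haze) take their detail from the SWIR band, while the remaining pixels keep the detail of the visible band.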

Development of Intelligent Internet Shopping Mall Supporting Tool Based on Software Agents and Knowledge Discovery Technology (소프트웨어 에이전트 및 지식탐사기술 기반 지능형 인터넷 쇼핑몰 지원도구의 개발)

김재경;김우주;조윤호;김제란
- Journal of Intelligence and Information Systems / v.7 no.2 / pp.153-177 / 2001
Product recommendation is now one of the important issues for both CRM and Internet shopping malls. Generally, a recommendation system tracks the past actions of a group of users to make recommendations to individual members of the group. Computer-mediated marketing and commerce have grown rapidly, and automatic recommendation methodologies have therefore received great attention. However, the research and commercial tools for product recommendation developed so far still have many aspects that merit further consideration. To address those aspects, we devise a recommendation methodology that achieves greater recommendation effectiveness when applied to an Internet shopping mall. The suggested methodology is based on web log information, product taxonomy, association rule mining, and decision tree learning. To implement it, we also design an intelligent Internet shopping mall support system based on agent technology and develop it as a prototype system. We applied the methodology and the prototype system to a leading Korean Internet shopping mall and provide some experimental results. Through the experiment, we found that the suggested methodology can perform recommendation tasks both effectively and efficiently on real-world problems. Issues concerning its systematic validity are also discussed.
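
Association rule mining, one of the building blocks named above, can be sketched with the textbook definitions of support and confidence. This is a minimal illustration restricted to rules between two single items. It is not the methodology of the paper, and the function name, baskets, and thresholds are made up.

```python
from itertools import combinations


def association_rules(baskets, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item of the itemset.
        return sum(1 for basket in baskets if itemset <= basket) / n

    items = sorted(set().union(*baskets))
    rules = []
    for first, second in combinations(items, 2):
        pair_support = support({first, second})
        if pair_support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for lhs, rhs in ((first, second), (second, first)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Made-up purchase baskets for illustration only.
baskets = [
    {"shirt", "tie"},
    {"shirt", "tie", "belt"},
    {"shirt", "belt"},
    {"tie"},
]
rules = association_rules(baskets, min_support=0.5, min_confidence=0.6)
```

A rule such as belt -> shirt with high confidence suggests recommending shirts to customers who have a belt in their basket. A production system would use an algorithm such as Apriori or FP-growth rather than this exhaustive pair scan.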

Research Trend Analysis for Fault Detection Methods Using Machine Learning (머신러닝을 사용한 단층 탐지 기술 연구 동향 분석)

Bae, Wooram;Ha, Wansoo
- Economic and Environmental Geology / v.53 no.4 / pp.479-489 / 2020
A fault is a geological structure that can act as a migration path or a cap rock for hydrocarbons, such as oil and gas, formed in source rock. Faults are one of the main targets of seismic exploration for reservoirs in which hydrocarbons have accumulated. However, conventional fault detection methods that use lateral discontinuity in seismic data, such as semblance, coherence, variance, gradient magnitude, and fault likelihood, require professional interpreters to invest considerable time and computational cost. Many researchers are therefore conducting studies to reduce the computational cost and time of fault interpretation, and machine learning technologies have recently attracted attention. Among the various machine learning technologies, researchers are conducting fault interpretation studies using support vector machines, multi-layer perceptrons, deep neural networks, and convolutional neural networks. In particular, researchers use not only their own convolutional networks but also networks proven in image processing to predict fault locations and fault information such as strike and dip. In this paper, by investigating and analyzing these studies, we found that convolutional neural networks based on the U-Net from image processing are the most effective for fault detection and interpretation. Further studies can expect better fault detection and interpretation results from convolutional neural networks combined with transfer learning and data augmentation.
https://doi.org/10.9719/EEG.2020.53.4.479

The Generic Terms and the Standards of a Delimitation for Oceans and Seas based on S-23(Names and Limits of Oceans and Seas) (S-23(Names and Limits of Oceans and Seas)을 기초로 한 바다의 속성지명과 바다경계의 획정 근거 분석)

Sung, Hyo Hyun;Kang, Jihyun
- Journal of the Korean Geographical Society / v.48 no.6 / pp.914-928 / 2013
Establishing limits and names for oceans and seas is necessary for the safety of navigation. Although there is no national or international standard for the delimitation of sea boundaries, guidelines can be derived from an analysis of the IHO official publication S-23, Names and Limits of Oceans and Seas. This paper shows how the spatial limits of seas have changed since the publication of the first edition, and analyzes the standards for the delimitation of oceans and seas in terms of physical geographic features using the S-23 4th edition draft (2002). The generic terms of S-23 include Ocean, Sea, Channel, Passage, Strait, Sound, Gulf, Bay, and Bight, and the generic terms form a hierarchical structure. Several seas show characteristics that differ from the definitions in the IHO dictionary. Sea boundaries are delimited by longitude and latitude, capes, river mouths, sandbars, and so on. Undersea features such as shelves, trenches, troughs, rises, banks, and reefs are also important for the delimitation of sea boundaries. In particular, the seas delimited by undersea features in the S-23 4th edition are mainly located in the Arctic and Southern Ocean areas. Advances in marine science and technology may affect the delimitation of sea boundaries.

A News Video Mining based on Multi-modal Approach and Text Mining (멀티모달 방법론과 텍스트 마이닝 기반의 뉴스 비디오 마이닝)

Lee, Han-Sung;Im, Young-Hee;Yu, Jae-Hak;Oh, Seung-Geun;Park, Dai-Hee
- Journal of KIISE:Databases / v.37 no.3 / pp.127-136 / 2010
With the rapid growth of information and computer communication technologies, the number of digital documents including multimedia data has recently exploded. In particular, news video databases and news video mining have become the subject of extensive research aimed at developing effective and efficient tools for the manipulation and analysis of news videos, because of their information richness. However, most research focuses on browsing, retrieval, and summarization of news videos. To date, work on discovering and analyzing the plentiful latent semantic knowledge in news videos is still at a relatively early stage. In this paper, we propose a news video mining system based on a multi-modal approach and text mining, which uses the visual-textual information of news video clips and their scripts. The proposed system automatically and systematically constructs a taxonomy of news video stories with a hierarchical clustering algorithm, which is one of the text mining methods. It then analyzes the topics of news video stories from multiple angles by means of a time-cluster trend graph, a weighted cluster growth index, and network analysis. To demonstrate the validity of our approach, we analyzed the news videos on "The Second Summit of South and North Korea in 2007".

