• Title/Summary/Keyword: Mine algorithm

Mining Frequent Pattern from Large Spatial Data (대용량 공간 데이터로 부터 빈발 패턴 마이닝)

  • Lee, Dong-Gyu;Yi, Gyeong-Min;Jung, Suk-Ho;Lee, Seong-Ho;Ryu, Keun-Ho
    • Journal of Korea Spatial Information System Society / v.12 no.1 / pp.49-56 / 2010
  • Frequent pattern mining techniques for detecting unknown patterns in spatial data have been studied actively. Existing data structures are classified into tree structures and array structures, and these structures show weak performance on dense or sparse data. Since spatial data exhibit both dense and sparse patterns, it is important to mine both kinds of pattern quickly with a single algorithm. In this paper, we propose a novel data structure, the compressed patricia frequent pattern tree, and a frequent pattern mining algorithm based on it that detects frequent patterns quickly on both dense and sparse data. In our experiments, the proposed algorithm was about 10 times faster than the existing FP-Growth algorithm on both dense and sparse data.

PPFP(Push and Pop Frequent Pattern Mining): A Novel Frequent Pattern Mining Method for Bigdata Frequent Pattern Mining (PPFP(Push and Pop Frequent Pattern Mining): 빅데이터 패턴 분석을 위한 새로운 빈발 패턴 마이닝 방법)

  • Lee, Jung-Hun;Min, Youn-A
    • KIPS Transactions on Software and Data Engineering / v.5 no.12 / pp.623-634 / 2016
  • Most existing frequent pattern mining methods focus on time efficiency and rely heavily on primary memory. However, in the era of big data, the size of real-world databases to be mined is increasing exponentially, and primary memory is no longer sufficient to mine frequent patterns from large real-world data sets. Some studies have addressed this problem with disk-based frequent pattern mining methods that improve scalability, but their processing is very time-consuming compared to memory-based methods. In this paper, we present PPFP, a novel disk-based approach for mining frequent itemsets from big data that reduces the main memory size bottleneck. The PPFP algorithm is based on the FP-growth method, one of the most popular and efficient frequent pattern mining approaches. Mining with PPFP consists of two steps. (1) Constructing an IFP-tree: after constructing the FP-tree, we assign an index number to each node in the FP-tree using a novel index numbering method, and then store the indexed FP-tree (IFP-tree) on disk as an IFP-table. (2) Mining frequent patterns with PPFP: frequent patterns are mined by expanding patterns using a stack-based PUSH-POP method (the PPFP method). With this approach, the recursive and time-consuming operations in the mining process use only a very small amount of memory, which improves the scalability and time efficiency of frequent pattern mining. The reported test results demonstrate these improvements.
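The stack-based expansion described in step (2) can be illustrated with a minimal in-memory sketch. It is not the PPFP algorithm itself: it has no IFP-tree, no index numbering, and no disk-resident IFP-table. It only shows how an explicit push/pop stack can replace recursion when growing frequent patterns. The function name `mine_frequent_itemsets` and the transaction-id-set representation are assumptions made for this illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def mine_frequent_itemsets(
    transactions: List[Set[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Enumerate frequent itemsets depth-first with an explicit stack.

    Illustrative only: a generic push/pop enumeration, not the PPFP method.
    """
    # Map each item to the ids of the transactions that contain it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    # Keep only frequent single items, in a fixed order to avoid duplicates.
    items = sorted(i for i, tids in tidsets.items() if len(tids) >= min_support)

    frequent: Dict[FrozenSet[str], int] = {}
    # Stack entry: (pattern, its transaction ids, index of next item to try).
    stack: List[Tuple[Tuple[str, ...], Set[int], int]] = [
        ((item,), tidsets[item], idx + 1) for idx, item in enumerate(items)
    ]
    while stack:
        pattern, tids, start = stack.pop()  # POP a pattern to expand
        frequent[frozenset(pattern)] = len(tids)
        for idx in range(start, len(items)):
            extended = tids & tidsets[items[idx]]
            if len(extended) >= min_support:
                # PUSH the extended pattern for later expansion.
                stack.append((pattern + (items[idx],), extended, idx + 1))
    return frequent
```

Because the stack holds only the patterns still waiting to be expanded, the call depth stays constant regardless of pattern length, which is the property the abstract attributes to the push/pop formulation.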

Adaptive Enhancement of Low-light Video Images Algorithm Based on Visual Perception (시각 감지 기반의 저조도 영상 이미지 적응 보상 증진 알고리즘)

  • Li Yuan;Byung-Won Min
    • Journal of Internet of Things and Convergence / v.10 no.2 / pp.51-60 / 2024
  • To address the low contrast and poor recognizability of video images in low-light environments, we propose an adaptive contrast compensation enhancement algorithm based on human visual perception. First, the characteristic factors of video images in low-light environments are extracted: AL (average luminance) and ABWF (average bandwidth factor). A mathematical model of human visual CRC (contrast resolution compensation) is established according to the difference in the original image's grayscale/chromaticity levels, and the proportions of the three primary colors of the true-color image are each compensated by integration. Then, when the degree of compensation is lower than the difference that bright vision can precisely distinguish, a compensation threshold is set to linearly compensate bright vision to the full bandwidth. Finally, an automatic optimization model of the compensation ratio coefficient is established by combining subjective image quality evaluation with the image characteristic factors. The experimental results show that the adaptive video image enhancement algorithm has a good enhancement effect and good real-time performance, can effectively mine dark-vision information, and can be widely used in different scenes.

Mining Maximal Frequent Contiguous Sequences in Biological Data Sequences (생물학적 데이터 서열들에서 빈번한 최대길이 연속 서열 마이닝)

  • Kang, Tae-Ho;Yoo, Jae-Soo
    • The KIPS Transactions:PartD / v.15D no.2 / pp.155-162 / 2008
  • Biological sequences such as DNA sequences and amino acid sequences typically contain a large number of items. They have contiguous sequences that ordinarily consist of hundreds of frequent items. In biological sequence analysis (BSA), frequent contiguous sequence search is one of the most important operations. Many studies have been done on mining sequential patterns efficiently. Most of the existing methods for mining sequential patterns are based on the Apriori algorithm. In particular, the prefixSpan algorithm is one of the most efficient sequential pattern mining schemes based on the Apriori algorithm. However, since the algorithm expands sequential patterns starting from frequent patterns of length 1, it is not suitable for biological datasets with long frequent contiguous sequences. In recent years, the MacosVSpan algorithm was proposed, based on the idea of the prefixSpan algorithm, to significantly reduce its recursive process. However, the algorithm is still inefficient for mining frequent contiguous sequences from long biological data sequences. In this paper, we propose an efficient method to mine maximal frequent contiguous sequences in large biological data sequences by constructing a spanning tree with a fixed length. To verify the superiority of the proposed method, we perform experiments in various environments. The experiments show that the proposed method is much more efficient than MacosVSpan in terms of retrieval performance.
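To make the target of the search concrete, the following brute-force sketch shows what a maximal frequent contiguous sequence is. It is not the fixed-length spanning tree method proposed in the paper, nor MacosVSpan: it simply enumerates every contiguous subsequence, which is practical only for toy inputs. The function name `maximal_frequent_contiguous` and the definition of support as the number of input sequences containing a subsequence are assumptions for this example.

```python
from typing import Dict, List, Set


def maximal_frequent_contiguous(
    sequences: List[str], min_support: int
) -> Dict[str, int]:
    """Return maximal contiguous subsequences found in >= min_support sequences.

    Brute force for illustration; not the paper's spanning tree method.
    """
    # Record which input sequences contain each contiguous subsequence.
    support: Dict[str, Set[int]] = {}
    for sid, seq in enumerate(sequences):
        for start in range(len(seq)):
            for end in range(start + 1, len(seq) + 1):
                support.setdefault(seq[start:end], set()).add(sid)

    frequent = {
        s: len(ids) for s, ids in support.items() if len(ids) >= min_support
    }
    # A frequent subsequence is maximal if no other frequent one contains it.
    return {
        s: count
        for s, count in frequent.items()
        if not any(s != other and s in other for other in frequent)
    }


dna = ["ACGTAC", "TACGTA", "GGACGT"]
print(maximal_frequent_contiguous(dna, min_support=3))
```

With these three toy sequences and a minimum support of 3, the only maximal frequent contiguous sequence is `ACGT`; shorter frequent pieces such as `CG` are dropped because `ACGT` contains them.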

Data-Mining in Business Performance Database Using Explanation-Based Genetic Algorithms (설명기반 유전자알고리즘을 활용한 경영성과 데이터베이스이 데이터마이닝)

  • 조성훈;정민용
    • Korean Management Science Review / v.18 no.1 / pp.135-145 / 2001
  • In today's dynamic management environment, there is growing recognition that information and knowledge management systems are essential for efficient and effective decision making by CEOs. To cope with this situation, we suggest a data-mining scheme as a key component of an integrated information and knowledge management system. The proposed system measures business performance by considering both VA (Value-Added), which represents the stakeholders' point of view, and EVA (Economic Value-Added), which represents the shareholders' point of view. To mine new information and knowledge, we applied improved genetic algorithms (GAs) that consider predictability, understandability (lucidity), and reasonability simultaneously. We use a linear combination model as the learning structure of the GAs. Although this model's predictability is lower than that of a non-linear model, it increases the knowledge's understandability, that is, the meaning of the induced values. Moreover, we introduce a random variable scheme based on the normal distribution for the initial chromosomes in the GAs, which we expect to increase the knowledge's reasonability, that is, the degree of acceptability to experts. The random variable scheme uses the statistical correlation/determination coefficient calculated from the training data. To demonstrate the performance of the system, we conducted a case study using financial data of the Korean automobile industry over 16 years, from 1981 to 1996, taken from the KISFAS (Korea Investors Services Financial Analysis System) database.

Analysis of Ground Subsidence using ALOS PALSAR (2006~2010) in Taebaek, Kangwon (ALOS PALSAR(2006년~2010년) 위성영상을 이용한 강원도 태백시 지반침하 관측 및 분석)

  • Cho, Min-Ji;Kim, Sang-Wan
    • Economic and Environmental Geology / v.45 no.5 / pp.503-512 / 2012
  • To detect surface subsidence in the Taebaek area, Kangwon, we performed DInSAR (Differential Interferometric SAR) and SBAS (Small BAseline Subset) analysis using spaceborne SAR (Synthetic Aperture Radar), techniques that are suitable for monitoring broad and inaccessible areas. During the period from October 2006 to June 2010, we acquired twenty-three ALOS PALSAR data sets (path/frame=425/730) for this study. Ninety-six differential interferograms with a perpendicular baseline of less than 1100 m were constructed with ROI_PAC, and the mean velocity map of surface displacement was then derived from SBAS analysis. As a result, ground displacement of about 4 cm/yr was confirmed at the Seokgong-Jangseong and Kyungdong mines and 2 cm/yr at the Saehan-Eoryong-Jungdong and Hwangji mines in the Taebaek area, Kangwon. The subsidence in the study area appears to be closely related to mining activities, because most of the subsiding areas coincide with mining areas. The subsidence at the Kyungdong mine shows a continuous and fast velocity over an area of about $2{\times}2$ km. Therefore, further analysis and efforts to prevent disaster are required in this area.

Temperature Prediction of Underground Working Place Using Artificial Neural Networks (인공신경망을 이용한 심부 갱내온도 예측)

  • Kim, Yun-Kwang;Kim, Jin
    • Tunnel and Underground Space / v.17 no.4 / pp.301-310 / 2007
  • Predicting the temperature in the workings is important for examining the suitability of developing a deep coal bed and for ventilation design. It is difficult to obtain a precise thermal conductivity of rock because of the variety and complexity of the rock types adjacent to the coal bed. Therefore, to estimate the thermal conductivity corresponding to this geological situation and the complex gallery conditions, a computing program, TemPredict, is developed in this study. It employs an artificial neural network and calculates the climatic conditions in galleries. This neural network is based on the back-propagation algorithm and is composed of input layers that accept the physical and geological factors of the coal bed and hidden layers that have 5 and 3 neurons, respectively. To verify TemPredict, the calculated result is compared with the measured one at the entrance of -300 ML 9X of the Jang-sung production department, Jang-sung Coal Mine. The difference between the result calculated by TemPredict ($25.65^{\circ}C$) and the measured one ($25.7^{\circ}C$) is only $0.05^{\circ}C$, which is within the allowable error of 5%. The result has a very high reliability of more than 95%. The temperature was also predicted for the main carriage gallery 9X at -425 ML, which is under construction, for the time of its completion; the result is $28.2^{\circ}C$. In the future, this would contribute to the ventilation design of the mine and of underground structures.
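The abstract compares an absolute difference in degrees with an allowable error given as a percentage, so a short calculation may help relate the two. The sketch below assumes the percentage is taken relative to the measured temperature on the Celsius scale; the abstract does not state which reference value the authors used. The variable names are chosen for this example.

```python
# Values quoted in the abstract.
predicted_c = 25.65  # calculated by TemPredict
measured_c = 25.7    # measured at the gallery entrance

# Assumption: relative error is taken against the measured value in Celsius.
absolute_error_c = abs(predicted_c - measured_c)
relative_error_pct = absolute_error_c / measured_c * 100.0

print(f"absolute error: {absolute_error_c:.2f} degC")
print(f"relative error: {relative_error_pct:.2f} %")
```

Under that assumption the relative error is roughly 0.2%, well inside the stated 5% allowance.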

Efficient Dynamic Weighted Frequent Pattern Mining by using a Prefix-Tree (Prefix-트리를 이용한 동적 가중치 빈발 패턴 탐색 기법)

  • Jeong, Byeong-Soo;Farhan, Ahmed
    • The KIPS Transactions:PartD / v.17D no.4 / pp.253-258 / 2010
  • Traditional frequent pattern mining assumes an equal profit/weight value for every item. Weighted Frequent Pattern (WFP) mining, which considers different weights for different items, has become an important research issue in data mining and knowledge discovery. Existing algorithms in this area are based on fixed weights. In real-world scenarios, however, the price/weight/importance of a pattern may vary frequently due to unavoidable situations. Tracking these dynamic changes is necessary in application areas such as retail market basket data analysis and web click stream management. In this paper, we propose a novel concept of dynamic weight and an algorithm, DWFPM (dynamic weighted frequent pattern mining). Our algorithm can handle situations where the price/weight of a pattern varies dynamically. It scans the database exactly once and is also suitable for real-time data processing. To our knowledge, this is the first research work to mine weighted frequent patterns using dynamic weights. Extensive performance analyses show that our algorithm is very efficient and scalable for WFP mining using dynamic weights.
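The idea of a weight that changes over time can be sketched as follows. The abstract does not give the DWFPM definitions, so this example assumes a simple model: the database arrives in batches, each batch carries its own item weights, a pattern's weight in a batch is the mean weight of its items, and the dynamic weighted support is the sum over batches of that weight times the pattern's frequency in the batch. The names `Batch` and `dynamic_weighted_support` are hypothetical, and the prefix-tree structure used by the paper is not reproduced.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Assumed batch layout: (item weights valid for this batch, transactions).
Batch = Tuple[Dict[str, float], List[Set[str]]]


def dynamic_weighted_support(
    pattern: FrozenSet[str], batches: List[Batch]
) -> float:
    """Sum, over batches, of the pattern's batch weight times its frequency.

    Assumed definition for illustration; every item in the pattern must
    have a weight in every batch.
    """
    total = 0.0
    for weights, transactions in batches:
        # Count the transactions in this batch that contain the pattern.
        frequency = sum(1 for t in transactions if pattern <= t)
        # Pattern weight in this batch: mean of its item weights.
        pattern_weight = sum(weights[i] for i in pattern) / len(pattern)
        total += pattern_weight * frequency
    return total


batches: List[Batch] = [
    ({"a": 0.5, "b": 0.3}, [{"a", "b"}, {"a"}]),
    ({"a": 0.2, "b": 0.8}, [{"a", "b"}, {"a", "b"}, {"b"}]),
]
print(dynamic_weighted_support(frozenset({"a", "b"}), batches))
```

Each batch is read once and contributes a single term, which is consistent with the single-scan property mentioned in the abstract, although the real algorithm maintains this information in a prefix tree.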

A study on the application of blockchain to the edge computing-based Internet of Things (에지 컴퓨팅 기반의 사물인터넷에 대한 블록체인 적용 방안 연구)

  • Choi, Jung-Yul
    • Journal of Digital Convergence / v.17 no.12 / pp.219-228 / 2019
  • Thanks to the development of information technology and the spread of smart services, Internet of Things (IoT) technology, in which various smart devices are connected to the network, has developed continuously. In the legacy IoT architecture, data processing has been centralized on cloud computing, which raises concerns about a single point of failure, end-to-end transmission delay, and security. To solve these problems, it is necessary to apply decentralized blockchain technology to the IoT. However, it is hard for IoT devices with limited computing power to mine blocks, a task that consumes a great amount of computing resources. To overcome this difficulty, this paper proposes an IoT architecture based on edge computing that makes blockchain technology applicable to IoT devices lacking computing resources. This paper also presents an operational procedure for blockchain in the edge computing-based IoT architecture.

Analytic Verification of Optimal Degaussing Technique using a Scaled Model Ship (축소 모델 함정을 이용한 소자 최적화 기법의 해석적 검증)

  • Cho, Dong-Jin
    • Journal of the Korean Magnetics Society / v.27 no.2 / pp.63-69 / 2017
  • Naval ships are particularly required to maintain acoustic and magnetic silence because of their operational characteristics. In particular, the underwater magnetic field signals produced by ships are likely to be detected at close distance by threats such as surveillance systems and mine systems. To increase the survivability of vessels, various techniques for reducing the magnetic field signal are being studied, and it is necessary to consider not only the magnitude of the magnetic field signal but also its gradient. In this paper, we use a commercial electromagnetic finite element analysis tool to predict the induced magnetic field signal of a scaled ship model and to arrange the degaussing coils. The optimum degaussing current of the coils is then derived by applying the particle swarm optimization algorithm with a gradient constraint. The validity of the optimal degaussing technique is verified analytically by comparing the magnetic field signals after degaussing with and without the gradient constraint.