• Title/Summary/Keyword: Component mining


Mineral Resources in the 21st Century and Our Environment (21세기 광물자원과 우리의 환경)

  • 오민수
    • Proceedings of the KSEEG Conference
    • /
    • 2002.10a
    • /
    • pp.53-67
    • /
    • 2002
  • As in the past, we are concerned today with the magnitude of mineral resources and whether these resources are adequate to meet future needs. In looking at global resource issues, we should consider the need for a resource, its supply, and the environmental consequences of using it. The need for a resource can become a resource dependency, especially as the global population expands and each of us becomes increasingly dependent upon hundreds of natural materials. Our great mineral consumption therefore makes the human population a true "geologic force", which will be even more significant in the future, when the global population is projected to reach alarming proportions. Although our supplies of mineral resources will probably be sufficient for the 21st century, the uneven distribution of minerals in the Earth's crust will almost certainly continue to be a major problem. The most likely result will be major shifts in both the prices and the sources of supply of many mineral resources. As for energy resources, we must avoid an obsessive dependency on one fuel and expand instead to other energy resources. Finally, because the use of resources affects the environment, we need to focus on resource exploitation and global pollution, particularly with regard to groundwater and arable land. We must manage our resources so as to be in balance with our environment. The accelerated industrialization of the South Korean economy over the last three decades has resulted in the mass consumption of mineral commodities. South Korea has around 50 mineral commodities that are useful to the mineral industry, among the 330 kinds of minerals described. The component ratio of the mining sector in South Korea's gross national product (GNP) dropped from 1.2% in 1971 to 0.34% in 1997 due to the rapid growth of other industries in the country. During the period from 1971 to 1997, the average growth rate of mineral consumption in South Korea was 9.13% per year and that of GNP per capita was 14.97%. Mineral consumption per capita increased continually over the last 30 years as follows (GNP per capita in parentheses): 0.99 metric tons in 1971 ($289), 3.83 metric tons in 1989 ($5,210), 6.11 metric tons in 1995 ($10,037), and 6.66 metric tons in 1997 ($9,511). The total amount of mineral consumption in South Korea was 33 million metric tons of 32 mineral commodities in 1971, and 306 million metric tons of 47 mineral commodities in 1997.
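
The growth figures quoted in this abstract can be sanity-checked with a compound annual growth rate. The sketch below is only an illustration: it assumes both totals are in the same unit, and the function name is hypothetical. The endpoint-based rate for total consumption comes out near 8.9% per year, close to but not identical with the reported 9.13%, which may be an average of year-on-year rates (an assumption, since the abstract does not say).

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Return the constant yearly rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Totals quoted in the abstract: 33 million tons (1971), 306 million tons (1997).
cagr_total = compound_annual_growth_rate(33.0, 306.0, 1997 - 1971)

# Per-capita consumption quoted for 1989 and 1997 (metric tons per person).
cagr_per_capita = compound_annual_growth_rate(3.83, 6.66, 1997 - 1989)

print(f"total consumption: {cagr_total:.2%} per year")
print(f"per-capita consumption, 1989-1997: {cagr_per_capita:.2%} per year")
```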


Prediction of golf scores on the PGA tour using statistical models (PGA 투어의 골프 스코어 예측 및 분석)

  • Lim, Jungeun;Lim, Youngin;Song, Jongwoo
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.1
    • /
    • pp.41-55
    • /
    • 2017
  • This study predicts the average scores of the top 150 PGA golf players in 132 PGA Tour tournaments (2013-2015) using data mining techniques and statistical analysis. It also aims to predict the Top 10 and Top 25 players in four different playoffs. Linear and nonlinear regression methods were used to predict average scores. The linear methods were stepwise regression, best subset selection, LASSO, ridge regression, and principal component regression. The nonlinear methods were regression trees, bagging, gradient boosting, neural networks, random forests, and KNN. We found that the average score increases as fairway firmness, green height, or average maximum wind speed increases, and that it decreases as the number of one-putts, the scrambling variable, or the longest driving distance increases. All 11 models had low prediction error when predicting the average scores of the 2015 PGA tournaments, which were not included in the training set. However, the bagging and random forest models performed best among all models, and these two models also had the highest accuracy when predicting the Top 10 and Top 25 players in the four playoffs.
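
As a rough illustration of the evaluation idea in this abstract (fit on earlier data, score on a held-out season), here is a minimal KNN regression sketch. The feature values and scores are made up, the function names are hypothetical, and the features are left unscaled for brevity. It is not the authors' model.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Predict by averaging the targets of the k nearest training rows."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical rows: (average maximum wind speed, one-putts per round).
train_x = [(5, 8), (6, 8), (7, 7), (10, 6), (12, 6), (15, 5), (16, 5), (18, 4)]
train_y = [69.8, 70.0, 70.3, 70.9, 71.2, 71.8, 72.0, 72.5]  # made-up scores

# Held-out rows that play the role of the unseen season.
test_x = [(6, 8), (14, 5)]
test_y = [70.1, 71.6]

predictions = [knn_predict(train_x, train_y, row, k=3) for row in test_x]
test_mse = mean_squared_error(test_y, predictions)
print(predictions, test_mse)
```

In this toy data the score rises with wind speed and falls with one-putts, matching the direction of the effects the abstract reports.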

A Novel of Data Clustering Architecture for Outlier Detection to Electric Power Data Analysis (전력데이터 분석에서 이상점 추출을 위한 데이터 클러스터링 아키텍처에 관한 연구)

  • Jung, Se Hoon;Shin, Chang Sun;Cho, Young Yun;Park, Jang Woo;Park, Myung Hye;Kim, Young Hyun;Lee, Seung Bae;Sim, Chun Bo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.10
    • /
    • pp.465-472
    • /
    • 2017
  • In the past, researchers mainly used supervised machine learning techniques to analyze power data and investigated pattern identification through data mining. Such data analysis research, however, now faces the limits of older classification and analysis techniques, because the volume of electric power data has grown as real-time data provision has become possible. This study therefore proposes a clustering architecture for analyzing large-scale electric power data. The proposed clustering process addresses shortcomings of the K-means algorithm, an unsupervised learning technique, and can automate the entire process from the collection of electric power data to their analysis. Power data were categorized and analyzed at three levels: the raw data level, the clustering level, and the user interface level. In addition, the authors determined K, the ideal number of clusters, based on principal component analysis and the normal distribution, and proposed a modified K-means algorithm that reduces the data that would be categorized as outliers in order to increase clustering efficiency.
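
The general idea of screening outliers before clustering can be sketched as follows. This is an assumption-laden stand-in, not the paper's modified K-means: outliers are dropped with a plain z-score rule (normality assumed), K is fixed by hand rather than derived from principal component analysis, and the readings are hypothetical.

```python
import statistics

def drop_outliers(values, z_limit=2.5):
    """Keep values whose z-score magnitude is within z_limit (normality assumed)."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return list(values)
    return [v for v in values if abs(v - mean) / stdev <= z_limit]

def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's algorithm on scalar readings with deterministic seeding."""
    ordered = sorted(values)
    # Seed centroids at evenly spaced positions of the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        updated = [statistics.fmean(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters

# Hypothetical hourly power readings (kWh) with one obvious spike.
readings = [1.0, 1.2, 0.9, 1.1, 1.3, 5.0, 5.2, 4.9, 5.1, 4.8, 60.0]
cleaned = drop_outliers(readings)
centroids, clusters = kmeans_1d(cleaned, k=2)
print(cleaned, centroids)
```

Without the filtering step, the single spike would pull one centroid far away from the two real load levels.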

Characterizing CO2 Supersaturation and Net Atmospheric Flux in the Middle and Lower Nakdong River (낙동강 중하류에서 이산화탄소 과포화 및 순배출 특성 분석)

  • Lee, Eun Ju;Chung, Se Woong;Park, Hyung Seok
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2019.05a
    • /
    • pp.416-416
    • /
    • 2019
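  • Inland freshwaters are attracting attention as an important source of carbon dioxide ($CO_2$) emissions to the atmosphere. $CO_2$ emitted to the atmosphere from streams and rivers is a key component of the global carbon cycle, and most streams and rivers are supersaturated with $CO_2$. Globally, $CO_2$ emissions from streams and rivers are reported to be about five times larger than those from lakes and reservoirs, but few such studies have been conducted in Korea. The objectives of this study were therefore to analyze the dynamic variability of the net atmospheric flux (NAF) of $CO_2$ at the Gangjeong-Goryeong (GGW), Dalseong (DSW), Hapcheon-Changnyeong (HCW), and Changnyeong-Haman (CHW) weirs in the middle and lower Nakdong River, and to apply data mining techniques to develop simple predictive models that estimate $CO_2$ NAF from easily collected physical and water quality variables. $CO_2$ NAF was calculated by multiplying the difference in the partial pressure of $CO_2$ ($pCO_2$) across the air-water interface by the gas transfer velocity, which was obtained from the equation proposed by Cole and Caraco (1998). The $pCO_2$ in water was estimated with the CO2SYS program, which accounts for the thermodynamic chemical equilibria of the carbonate system in both freshwater and seawater, and $CO_2$ NAF was computed using Henry's law and Fick's first law of diffusion. To evaluate the environmental factors that affect the temporal variability of $CO_2$ NAF, correlation analysis, principal component analysis (PCA), step-wise multiple linear regression (SMLR), and random forest (RF) methods were used. The SMLR models were built with the R package olsrr, and the RF models with the R packages caret and randomForest. The results showed that the river reaches upstream of the four weirs behaved as heterotrophic systems that emitted $CO_2$ to the atmosphere for most of the study period, except for some periods of active algal growth. The median $CO_2$ NAF ranged from a minimum of $391.5 mg-CO_2/m^2 day$ at HCW to a maximum of $1472.7 mg-CO_2/m^2 day$ at DSW. At all weirs, NAF showed a strong negative correlation with pH, and $pCO_2$ and Chl-a were also negatively correlated, because algae consume $CO_2$ in the water and raise the pH. In the PCA, NAF and $pCO_2$ showed high covariance, while pH and Chl-a clustered in the opposite direction, which is consistent with the correlation analysis. The adjusted $R^2$ values of the SMLR and RF models developed in this study were 0.77 or higher at all weirs, and the models are considered usable for estimating the $CO_2$ NAF of rivers even when measured $pCO_2$ data are not available.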
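
The flux calculation described in this abstract (partial pressure difference times gas transfer velocity, with Henry's law converting pressure to concentration) can be sketched as below. The wind-speed relation is the commonly cited Cole and Caraco (1998) form for k600. Everything else is a simplifying assumption made for illustration: the Henry constant is a fixed round value, no temperature or Schmidt-number correction is applied, and the function and parameter names are hypothetical.

```python
CO2_MOLAR_MASS_G_PER_MOL = 44.01

def k600_cole_caraco(wind_speed_10m):
    """Gas transfer velocity k600 in cm/h from wind speed at 10 m height (m/s)."""
    return 2.07 + 0.215 * wind_speed_10m ** 1.7

def co2_net_atmospheric_flux(pco2_water_uatm, pco2_air_uatm, wind_speed_10m,
                             henry_mol_per_l_atm=0.034):
    """Return the water-to-air CO2 flux in mg CO2 per m^2 per day.

    Positive values mean the water emits CO2 to the atmosphere.
    The default Henry constant is an illustrative assumption.
    """
    k_m_per_day = k600_cole_caraco(wind_speed_10m) * 24.0 / 100.0
    delta_atm = (pco2_water_uatm - pco2_air_uatm) * 1e-6
    # Henry's law converts the pressure difference to mol/L; 1000 L per m^3.
    delta_mol_per_m3 = henry_mol_per_l_atm * delta_atm * 1000.0
    flux_mol_per_m2_day = k_m_per_day * delta_mol_per_m3
    return flux_mol_per_m2_day * CO2_MOLAR_MASS_G_PER_MOL * 1000.0

# Hypothetical supersaturated and undersaturated cases at 2 m/s wind speed.
emitting = co2_net_atmospheric_flux(1000.0, 400.0, 2.0)
absorbing = co2_net_atmospheric_flux(300.0, 400.0, 2.0)
print(emitting, absorbing)
```

A negative result corresponds to the periods of active algal growth mentioned in the abstract, when the water takes up $CO_2$ instead of emitting it.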


Factors analysis of the cyanobacterial dominance in the four weirs installed in of Nakdong River (낙동강의 중·하류 4개보에서 남조류 우점 환경 요인 분석)

  • Kim, Sung jin;Chung, Se woong;Park, Hyung seok;Cho, Young cheol;Lee, Hee suk
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2019.05a
    • /
    • pp.413-413
    • /
    • 2019
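  • Abnormal overgrowth of cyanobacteria in rivers and lakes (hereafter, algal blooms) reduces the biodiversity of freshwater ecosystems and hinders water use by producing taste-and-odor compounds in drinking water. When toxin-producing harmful cyanobacteria proliferate massively, they can also cause fatal harm to livestock and human health. In Korea, algal blooms used to cause problems intermittently in stagnant waters such as dam reservoirs and estuarine lakes, but since 16 weirs were installed under the Four Major Rivers Project (2010-2011), they have occurred widely in large rivers such as the Nakdong, Geum, and Yeongsan Rivers and have emerged as an important social and environmental issue. Various explanations have been offered for the blooms that frequently occur in the weir reaches of large rivers, including climate change associated with global temperature rise, excessive nutrient inflow from the watershed, reduced flow due to drought, and increased residence time due to weir installation. However, because the causes differ with the characteristics of the watershed and the water body, or because multiple factors act in combination, a universal interpretation is difficult. An accurate analysis of the causes and effective countermeasures for each river system and each weir therefore require a more scientific and objective approach based on intensive experimental data and data mining techniques. The purpose of this study is to carry out intensive field surveys and laboratory analyses at four weirs on the Nakdong River (Gangjeong-Goryeong, Dalseong, Hapcheon-Changnyeong, and Changnyeong-Haman), where cyanobacterial blooms have occurred frequently since the weirs were installed in 2012, and to apply statistical analysis and various data modeling techniques to the collected meteorological, hydrological, water quality, and algal data in order to identify the environmental conditions under which cyanobacteria dominate at each weir and the key control variables for managing them. Qualitative and quantitative analyses of water quality and phytoplankton at each weir were conducted over two years, from May 2017 to November 2018. The correlation between cyanobacterial cell density and environmental factors was analyzed, and the environmental factors of cyanobacterial dominance at each weir were interpreted comprehensively based on the results of various parametric and nonparametric data mining methods, including step-wise multiple linear regression (SMLR), variable importance evaluation using random forest (RF) models and recursive feature elimination with random forest (RFE-RF), decision tree (DT), and principal component analysis (PCA).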
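
Of the methods listed in this abstract, the decision tree is the easiest to illustrate in a few lines. The sketch below finds the single threshold on one environmental variable that best separates dominant from non-dominant samples by Gini impurity, which is the basic step a decision tree repeats. The variable (water temperature), the data, and the function names are hypothetical and are not taken from the study.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical samples: water temperature (deg C) and cyanobacterial dominance flag.
water_temp = [12, 15, 18, 21, 24, 26, 28, 30]
dominant = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(water_temp, dominant)
print(threshold, impurity)
```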


Health Risk Management using Feature Extraction and Cluster Analysis considering Time Flow (시간흐름을 고려한 특징 추출과 군집 분석을 이용한 헬스 리스크 관리)

  • Kang, Ji-Soo;Chung, Kyungyong;Jung, Hoill
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.1
    • /
    • pp.99-104
    • /
    • 2021
  • In this paper, we propose a health risk management method that uses feature extraction and cluster analysis and takes the flow of time into account. The proposed method proceeds in three steps. The first is the pre-processing and feature extraction step. It collects the user's lifelog using a wearable device, removes incomplete data, errors, noise, and contradictory data, and handles missing values. For feature extraction, important variables are then selected through principal component analysis, and data with similar relationships are grouped using correlation coefficients and covariance. Second, to analyze the features extracted from the lifelog, dynamic clustering is performed with the K-means algorithm in consideration of the passage of time. New data are clustered using a similarity distance measure based on the increment of the sum of squared errors. The third step extracts information about each cluster by considering the passage of time. Using a health decision-making system built on these feature clusters, risks can be managed through factors such as physical characteristics, lifestyle habits, disease status, the risk of health care events, and predictability. The performance evaluation compares the proposed method with fuzzy and kernel-based clustering using precision, recall, and F-measure. In this evaluation, the proposed method was rated favorably. The proposed method therefore makes it possible to predict and appropriately manage a user's potential health risk by using similarity to patients.
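
One common reading of "clustering new data by the increment of the sum of squared errors" is to place each arriving point in the cluster whose SSE would grow the least. For a cluster of n points with centroid c, adding a point x raises its SSE by n/(n+1) times the squared distance between x and c. The sketch below implements that rule. It is an interpretation, not the authors' code, and the lifelog features and values are hypothetical.

```python
def sse_increase(point, centroid, size):
    """Increase in a cluster's SSE if `point` joins a cluster of `size` members."""
    squared_distance = sum((p - c) ** 2 for p, c in zip(point, centroid))
    return size / (size + 1) * squared_distance

def assign_incrementally(point, centroids, sizes):
    """Add `point` to the cluster with the smallest SSE increase; update in place."""
    best = min(range(len(centroids)),
               key=lambda i: sse_increase(point, centroids[i], sizes[i]))
    n = sizes[best]
    centroids[best] = [(c * n + p) / (n + 1)
                       for c, p in zip(centroids[best], point)]
    sizes[best] = n + 1
    return best

# Hypothetical clusters over (resting heart rate in bpm, sleep hours).
centroids = [[60.0, 7.5], [95.0, 4.0]]
sizes = [10, 5]

chosen = assign_incrementally([90.0, 5.0], centroids, sizes)
print(chosen, centroids, sizes)
```

Updating the centroid as a running mean means earlier points never need to be revisited, which suits lifelog data that arrives over time.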

Effects of genotype and environmental factors on content variations of the bioactive constituents in rice seeds (벼의 유전형질과 재배환경 요인이 기능성물질 함량 변이에 미치는 영향 비교)

  • Soo-Yun Park;Hyoun-Min Park;Jung-Won Jung;So Ra Jin;Sang-Gu Lee;Eun-Ha Kim;Seonwoo Oh
    • Journal of Applied Biological Chemistry
    • /
    • v.65 no.4
    • /
    • pp.429-438
    • /
    • 2022
  • The composition of crops shows natural variation according to genetic characteristics and environmental factors such as the cultivation region. To compare the impact of genetic differences and environmental influences on the levels of bioactive components in rice seeds, 23 cultivars including indica, japonica, and tongil rice were grown in two locations in Korea (Jeonju and Cheonan) for two years (2015 and 2016). Sixteen compounds consisting of tocopherols, tocotrienols, phytosterols, and policosanols were identified from 368 rice samples, and the compositional data were subjected to data mining processes including principal component analysis and Pearson's correlation analysis. Under four different environmental conditions (Jeonju in 2015, Cheonan in 2015, Jeonju in 2016, Cheonan in 2016), the natural variability of rice seeds showed that the genetic background (indica vs. japonica vs. tongil) had more impact on the compositional changes of bioactive components than the environment did. In particular, the correlation analysis revealed a negative correlation between α-, β-tocopherols and γ-, δ-tocopherols as a representative genetic effect that was not changed by environmental influence.
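
To show what principal component analysis extracts from negatively correlated measurements like those described here, the following sketch finds the leading component of a small data set by power iteration on the covariance matrix. The numbers are made up, the data are only centered (not standardized), and the function name is hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (unit direction, explained variance ratio) of the leading component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    vector = [1.0] + [0.0] * (d - 1)
    for _ in range(iterations):
        # Multiply by the covariance matrix and renormalize (power iteration).
        product = [sum(cov[a][b] * vector[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    eigenvalue = sum(vector[a] * sum(cov[a][b] * vector[b] for b in range(d))
                     for a in range(d))
    total_variance = sum(cov[a][a] for a in range(d))
    return vector, eigenvalue / total_variance

# Hypothetical contents of two compounds that move in opposite directions.
samples = [(10.0, 2.0), (8.0, 4.1), (6.0, 5.9), (4.0, 8.0), (2.0, 10.1)]
direction, explained = first_principal_component(samples)
print(direction, explained)
```

The two loadings have opposite signs, which is how a negative correlation between two variables appears on the first component.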

Index-based Searching on Timestamped Event Sequences (타임스탬프를 갖는 이벤트 시퀀스의 인덱스 기반 검색)

  • 박상현;원정임;윤지희;김상욱
    • Journal of KIISE:Databases
    • /
    • v.31 no.5
    • /
    • pp.468-478
    • /
    • 2004
  • It is essential in various application areas of data mining and bioinformatics to effectively retrieve the occurrences of interesting patterns from sequence databases. For example, consider a network event management system that records the types and timestamp values of events that occurred in a specific network component (e.g., a router). A typical query to find the temporal causal relationships among the network events is as follows: 'Find all occurrences of CiscoDCDLinkUp that are followed by MLMStatusUP that are subsequently followed by TCPConnectionClose, under the constraint that the interval between the first two events is not larger than 20 seconds, and the interval between the first and third events is not larger than 40 seconds.' This paper proposes an indexing method that answers such queries efficiently. Unlike previous methods that rely on inefficient sequential scans or on data structures not easily supported by DBMSs, the proposed method uses a multi-dimensional spatial index, which is proven to be efficient in both storage and search, to find the answers quickly without false dismissals. Given a sliding window W, the input to the multi-dimensional spatial index is an n-dimensional vector whose i-th element is the interval between the first event of W and the first occurrence of the event type Ei in W. Here, n is the number of event types that can occur in the system of interest. The 'curse of dimensionality' problem may arise when n is large. Therefore, we use dimension selection or event type grouping to avoid this problem. The experimental results reveal that the proposed technique can be a few orders of magnitude faster than the sequential scan and ISO-Depth index methods.
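
The window-to-vector mapping described in this abstract can be sketched as follows. The spatial index itself is left out. Marking an absent event type with infinity, the simplified constraint check (which ignores the ordering between the second and third events), the function names, and the `PowerFail` event type are all assumptions made for illustration.

```python
import math

def window_vector(window, event_types):
    """Map a window of (timestamp, event_type) pairs to its interval vector.

    Element i is the gap between the window's first event and the first
    occurrence of event_types[i]; math.inf marks a type absent from the window.
    """
    start = window[0][0]
    first_seen = {}
    for timestamp, event_type in window:
        first_seen.setdefault(event_type, timestamp - start)
    return [first_seen.get(event_type, math.inf) for event_type in event_types]

def matches_query(vector, event_types, constraints):
    """Check 'type first occurs within max_gap seconds of the window start'."""
    index = {event_type: i for i, event_type in enumerate(event_types)}
    return all(vector[index[t]] <= max_gap for t, max_gap in constraints.items())

event_types = ["CiscoDCDLinkUp", "MLMStatusUP", "TCPConnectionClose", "PowerFail"]
window = [
    (100, "CiscoDCDLinkUp"),
    (112, "MLMStatusUP"),
    (118, "CiscoDCDLinkUp"),
    (135, "TCPConnectionClose"),
]

vector = window_vector(window, event_types)
# The example query from the abstract, expressed as per-dimension upper bounds.
query = {"CiscoDCDLinkUp": 0, "MLMStatusUP": 20, "TCPConnectionClose": 40}
hit = matches_query(vector, event_types, query)
print(vector, hit)
```

Because each constraint is an upper bound on one coordinate, the query becomes a rectangular range over the vectors, which is the kind of search a multi-dimensional spatial index is built to answer.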

Evaluation of Web Service Similarity Assessment Methods (웹서비스 유사성 평가 방법들의 실험적 평가)

  • Hwang, You-Sub
    • Journal of Intelligence and Information Systems
    • /
    • v.15 no.4
    • /
    • pp.1-22
    • /
    • 2009
  • The World Wide Web is transitioning from being a mere collection of documents that contain useful information toward providing a collection of services that perform useful tasks. The emerging Web service technology has been envisioned as the next technological wave and is expected to play an important role in this recent transformation of the Web. By providing interoperable interface standards for application-to-application communication, Web services can be combined with component based software development to promote application interaction and integration both within and across enterprises. To make Web services for service-oriented computing operational, it is important that Web service repositories not only be well-structured but also provide efficient tools for developers to find reusable Web service components that meet their needs. As the potential of Web services for service-oriented computing is being widely recognized, the demand for effective Web service discovery mechanisms is concomitantly growing. A number of techniques for Web service discovery have been proposed, but the discovery challenge has not been satisfactorily addressed. Unfortunately, most existing solutions are either too rudimentary to be useful or too domain dependent to be generalizable. In this paper, we propose a Web service organizing framework that combines clustering techniques with string matching and leverages the semantics of the XML-based service specification in WSDL documents. We believe that this is one of the first attempts at applying data mining techniques in the Web service discovery domain. Our proposed approach has several appealing features: (1) It minimizes the requirement of prior knowledge from both service consumers and publishers; (2) It avoids exploiting domain dependent ontologies; and (3) It is able to visualize the semantic relationships among Web services. We have developed a prototype system based on the proposed framework using an unsupervised artificial neural network and empirically evaluated the proposed approach and tool using real Web service descriptions drawn from operational Web service registries. We report on some preliminary results demonstrating the efficacy of the proposed approach.
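
The string-matching side of such a framework can be illustrated with a token-based similarity between the operation names that two services expose. This is a deliberately simple stand-in: the paper's prototype uses an unsupervised neural network over WSDL documents, whereas the sketch below only computes a Jaccard similarity over word tokens, and the service names are hypothetical.

```python
import re

def tokens(name):
    """Split an identifier such as 'getStockQuote' into lowercase word tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return {part.lower() for part in parts}

def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def service_similarity(ops_a, ops_b):
    """Jaccard similarity over the word tokens of two services' operation names."""
    tok_a = set().union(*(tokens(op) for op in ops_a))
    tok_b = set().union(*(tokens(op) for op in ops_b))
    return jaccard(tok_a, tok_b)

# Hypothetical operation names as they might appear in WSDL port types.
stock_a = ["getStockQuote", "getStockHistory"]
stock_b = ["GetQuote", "getStockPrice"]
weather = ["getWeatherForecast"]

sim_stock = service_similarity(stock_a, stock_b)
sim_mixed = service_similarity(stock_a, weather)
print(sim_stock, sim_mixed)
```

A pairwise similarity of this kind could feed any clustering step, so that services with overlapping vocabulary end up near each other.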


Analysis Corrosion Products Formed on the Great Buddha Image of Kotokuin Temple in Kamakura (고덕원 국보 동조아미타여래좌상의 표면에 생성한 부식생성물의 해석)

  • Matsuda Shiro;Aoki Shigeo;Kang, Dai-il
    • 보존과학연구
    • /
    • s.17
    • /
    • pp.161-182
    • /
    • 1996
  • Copper and copper alloys have long been used to make Buddha statues and ornaments for historic buildings, because in a natural atmosphere these metals are corrosion resistant to some extent and the patina formed on their surface is valued for its beauty. In an atmosphere polluted by $SO_x$ and $NO_x$, however, the patina layer does not work as a protective film and allows the metal to be damaged. Since 1992, the Tokyo National Research Institute of Cultural Properties (TNRICP) has conducted studies on the influence of atmospheric pollution on metal cultural properties kept in the open air. The Great Buddha Image, located in Kamakura about 50 km west of Tokyo, was selected as one of the objects of study because it is made of copper alloy and has stood exposed to the air for several hundred years. A further reason for studying it is that there are many cultural properties in its surroundings. We have analysed the components and the structure of the corrosion products formed on the surface of the Buddha, carried out exposure tests using alloy samples that simulate the composition of the Great Image, and monitored climate and air pollution in order to discuss the relation between the corrosion of metals in the open air and the conditions of the atmosphere. In this paper, the authors describe the components and the structure of the corrosion products formed on the surface of the Great Image, as determined by X-ray fluorescence spectroscopy and X-ray diffraction. The conclusions are as follows. (1) Sulfate patina composed mainly of brochantite was detected on all sides of the Image, and more of it was found on the back of the Image, which faces north. (2) Antlerite was detected on the back and on a part of the left side, which faces west, and its formation was considered to be closely related to the polluted atmosphere. (3) A large amount of chloride patina composed mainly of atacamite was observed on the front, which faces south. (4) Carbonate patina composed mainly of malachite was detected in areas where brochantite was also often detected. This suggests that malachite had been transformed into brochantite by the deteriorated atmosphere. (5) On all sides of the Image, patina was observed together with copper oxides composed mainly of cuprous oxide. This shows that the surface of the Image consists of two layers: an inner layer of oxide and an outer layer of patina. (6) Corrosion products of lead, which is a component of the copper alloy, were detected on all sides: the main lead product found on the front was chlorophosphate, whereas the one on the back was sulfate.
