• Title/Summary/Keyword: Choice set information

The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms (다중 클래스 데이터셋의 메타특징이 판별 알고리즘의 성능에 미치는 영향 연구)

  • Kim, Jeonghun;Kim, Min Yong;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.23-45 / 2020
  • Big data is being generated in a wide variety of fields such as medical care, manufacturing, logistics, sales, and SNS, and dataset characteristics are correspondingly diverse. To secure their competitiveness, companies need to improve their decision-making capacity using classification algorithms. However, most of them lack sufficient knowledge of which classification algorithm is appropriate for a specific problem area. In other words, determining which classification algorithm suits the characteristics of a dataset has been a task requiring expertise and effort. This is because the relationship between the characteristics of datasets (called meta-features) and the performance of classification algorithms has not been fully understood. Moreover, there has been little research on meta-features reflecting the characteristics of multi-class datasets. Therefore, the purpose of this study is to empirically analyze whether meta-features of multi-class datasets have a significant effect on the performance of classification algorithms. In this study, meta-features of multi-class datasets were grouped into two factors (data structure and data complexity), and seven representative meta-features were selected. Among those, we included the Herfindahl-Hirschman Index (HHI), originally a market concentration index, in the meta-features to replace the IR (Imbalanced Ratio). We also added a newly developed index, the Reverse ReLU Silhouette Score, to the meta-feature set. From the UCI Machine Learning Repository, six representative datasets (Balance Scale, PageBlocks, Car Evaluation, User Knowledge-Modeling, Wine Quality (red), Contraceptive Method Choice) were selected. The classes of each dataset were predicted using the classification algorithms selected in the study (KNN, Logistic Regression, Naïve Bayes, Random Forest, and SVM). For each dataset, we applied 10-fold cross-validation.
Oversampling of 10% to 100% was applied to each fold, and the meta-features of the dataset were measured. The meta-features selected are HHI, Number of Classes, Number of Features, Entropy, Reverse ReLU Silhouette Score, Nonlinearity of Linear Classifier, and Hub Score. F1-score was selected as the dependent variable. The results showed that six meta-features, including the Reverse ReLU Silhouette Score and HHI proposed in this study, have a significant effect on classification performance. (1) The HHI meta-feature proposed in this study had a significant effect on classification performance. (2) The number of variables has a significant effect on classification performance and, unlike the number of classes, the effect is positive. (3) The number of classes has a negative effect on classification performance. (4) Entropy has a significant effect on classification performance. (5) The Reverse ReLU Silhouette Score also significantly affects classification performance at the 0.01 significance level. (6) The nonlinearity of linear classifiers has a significant negative effect on classification performance. In addition, the results of the analysis by classification algorithm were consistent. In the regression analysis by classification algorithm, the number of variables does not have a significant effect for the Naïve Bayes algorithm, unlike for the other classification algorithms. This study makes two theoretical contributions: (1) two new meta-features (HHI and Reverse ReLU Silhouette Score) were shown to be significant, and (2) the effects of data characteristics on classification performance were investigated using meta-features. The practical contributions are as follows. (1) The results can be utilized in developing a classification algorithm recommendation system according to the characteristics of datasets.
(2) Because the characteristics of data differ, many data scientists repeatedly test algorithms while adjusting their parameters to find the optimal algorithm for the situation. This process consumes excessive hardware, cost, time, and manpower. This study is expected to be useful for machine learning and data mining researchers, practitioners, and developers of machine learning-based systems. The study consists of an introduction, related research, research model, experiment, and conclusion and discussion.
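
As a rough illustration of two of the meta-features named in this abstract, the sketch below computes HHI and entropy from a vector of class labels. It assumes the standard market-concentration definition of HHI (the sum of squared shares, applied here to class shares) and Shannon entropy over the class distribution; the abstract does not state the exact formulas or normalization the authors used, so these may differ from the paper. The function names and label vectors are hypothetical.

```python
import math
from collections import Counter


def class_shares(labels):
    """Return the fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [n / total for n in counts.values()]


def hhi(labels):
    """Herfindahl-Hirschman Index: sum of squared class shares.

    Assumption: standard HHI definition applied to class proportions.
    """
    return sum(p * p for p in class_shares(labels))


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution.

    Assumption: the paper's Entropy meta-feature may be defined differently.
    """
    return -sum(p * math.log2(p) for p in class_shares(labels))


# Hypothetical label vectors, for illustration only.
balanced = ["a", "b", "c", "d"] * 25          # four equally sized classes
skewed = ["a"] * 97 + ["b", "c", "d"]         # one dominant class

print("balanced:", hhi(balanced), entropy(balanced))
print("skewed:  ", hhi(skewed), entropy(skewed))
```

Under these definitions, HHI rises toward 1 as one class dominates and falls toward 1/k for k equally sized classes, while entropy moves in the opposite direction, which is why a single concentration index can stand in for a pairwise imbalance ratio in the multi-class case.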

Development of 'Carbon Footprint' Concept and Its Utilization Prospects in the Agricultural and Forestry Sector ('탄소발자국' 개념의 발전 과정과 농림 부문에서의 활용 전망)

  • Choi, Sung-Won;Kim, Hakyoung;Kim, Joon
    • Korean Journal of Agricultural and Forest Meteorology / v.17 no.4 / pp.358-383 / 2015
  • The concept of 'carbon footprint' has been developed as a means of quantifying the specific emissions of the greenhouse gases (GHGs) that cause global warming. Although there are still neither clear definitions of the term nor rules for its units or the scope of its estimation, it is broadly accepted that the carbon footprint is the total amount of GHGs, expressed as $CO_2$ equivalents, emitted into the atmosphere directly or indirectly across all stages of production by an individual or organization. According to ISO/TS 14067, the carbon footprint of a product is calculated by multiplying the units of activity of each GHG-emitting process by the emission factor of that process, and summing the results. Based on this, 'carbon labelling' systems have been implemented in various ways around the world to give consumers opportunities for comparison and choice, and to encourage voluntary efforts by producers to reduce GHG emissions. In the agricultural sector, the 'low-carbon agricultural and livestock products certification' system is expected to become more useful as a basis for judgment that helps purchasers consume ethically. In this process, the 'cradle to gate' approach (which excludes the usage and disposal stages) is mainly used to set the boundaries of the life cycle assessment for agricultural products. The estimation of the carbon footprint for the entire agricultural and forestry sector should take both removals and emissions into account in the "National Greenhouse Gas Inventory Report". The carbon accumulated in the biomass of perennial trees in cropland should also be counted as a reduction in total GHG emissions. To accomplish this, tower-based flux measurements can be used, which provide a direct quantification of $CO_2$ exchange during the entire life cycle. Carbon footprint information can be combined with other indicators to develop more holistic assessment indicators for sustainable agricultural and forestry ecosystems.
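
The calculation rule described in this abstract (activity multiplied by emission factor, summed over processes) can be sketched as follows. This is a minimal illustration, not the ISO/TS 14067 procedure itself: the `Process` type, the process names, and every number are hypothetical placeholders rather than real emission factors.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    activity: float         # amount of activity, in the process's own unit
    emission_factor: float  # kg CO2-eq emitted per unit of activity


def carbon_footprint(processes):
    """Sum activity x emission factor over all processes (kg CO2-eq)."""
    return sum(p.activity * p.emission_factor for p in processes)


# Hypothetical cradle-to-gate inventory; all values are placeholders.
inventory = [
    Process("fertilizer (kg)", 100.0, 2.0),
    Process("diesel (L)", 50.0, 3.0),
    Process("electricity (kWh)", 200.0, 0.5),
]

print(carbon_footprint(inventory), "kg CO2-eq")
```

A 'cradle to gate' boundary corresponds to simply leaving usage and disposal processes out of the inventory list; removals such as biomass carbon accumulation could be represented as processes with negative emission factors, although that is an assumption about bookkeeping, not something the abstract specifies.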

Development of Smart Packaging for Cream Type Cosmetic (크림 제형 화장품용 스마트 패키징 기술 개발)

  • Jeon, Sooyeon;Moon, Byounggeoun;Oh, Jaeyoung;Kang, Hosang;Jang, Geun;Lee, Kisung
    • KOREAN JOURNAL OF PACKAGING SCIENCE & TECHNOLOGY / v.25 no.3 / pp.79-87 / 2019
  • The degree of oxidation of a cosmetic depends on its storage conditions and on external conditions during use. Microbial contamination and oxygen exposure often result in quality deterioration of cosmetics. In addition, consumers often keep using cream-type cosmetics, which have a short expiration period (6-12 months), even after the product has expired. Using deteriorated cosmetics can seriously harm consumers' safety, with symptoms such as folliculitis, rashes, edema, and dermatitis. Therefore, it is necessary to develop sealed smart packaging for cosmetics to prevent deterioration and improve consumer safety. In this study, we developed a smart packaging design for cosmetics that can measure the surrounding environment and track the expiration date of the cosmetics in real time. The smart packaging includes sensors that are linked to a mobile application, through which users can view the measurement results. The packaging design and functions were set up based on user survey results, and a feasible model can be produced based on user choice. After the sensor, PCB, and mobile application were manufactured, measurements were carried out in three environments. The system worked normally within a certain range under all three environmental conditions. It is believed that information on expiration dates and storage environment can be efficiently delivered to consumers through the developed cosmetics smart packaging and application. The UI/UX design for consumers remains to be studied further. The UX/UI design of the application plays an essential role in achieving this goal through the commercialization of a wide range of cosmetic products.

Limitations of National Responsibility and its Application on Marine Environmental Pollution beyond Borders -Focused on the Effects of China's Three Gorges Dam on the Marine Environment in the East China Sea- (국경을 넘는 해양환경오염에 대한 국가책임과 적용의 한계 -중국의 산샤댐 건설로 인한 동중국해 해양환경 영향을 중심으로-)

  • Yang, Hee Cheol
    • Ocean and Polar Research / v.37 no.4 / pp.341-356 / 2015
  • A nation has a sovereign right to develop and use its natural resources according to its policies with regard to development and the relevant environment. A nation also has an obligation not to harm other countries or damage the environments of neighboring countries as a consequence of such development or use of natural resources. However, when acts causing environmental damage beyond national borders have economic and social importance, international precedents lead a nation to take additional actions to avoid further damage from those acts. That is to say, there is a tendency to resolve such issues in a way that balances mutual interests by allowing such actions to continue. A solution to China's Three Gorges Dam dilemma based on a soft law approach is more credible than relying on a good faith approach to national responsibilities and international legal proceedings, since the construction and operation of the dam fall within the category of exercising national sovereign rights. If a large-scale construction project such as the Three Gorges Dam or the operation of a nuclear power plant causes or may cause environmental damage beyond the border of the nation engaged in such an undertaking, the countries affected should jointly monitor the environmental effects in a spirit of cooperation rather than trying to stop the construction, and should seek cooperative solutions of mutual understanding to establish measures to prevent further damage.
If the construction and operation of China's Three Gorges Dam cause, or may cause, serious damage to the marine environment, China cannot set aside its national responsibility to meet international obligations if it knows about damage that has occurred or may occur but fails to prevent, minimize, reverse, or eliminate the chance of further damage, or fails to put in place measures to prevent its recurrence. However, out of respect for China's right to develop and use its resources, Korea must be able to prove a causal relationship between the relevant actions and the resulting damage if it is to raise objections to the construction or request damage-prevention actions against crucial adverse effects on the marine environment. Therefore, it is essential to accumulate continuous monitoring and evaluation information on marine environmental changes and on the impacts or responses of affected waters, and to acquire scientific baseline data together with observed changes in that baseline. As China has adopted a somewhat nonchalant attitude toward taking adequate actions to protect against marine pollution risks or adverse effects caused by the construction and operation of the Three Gorges Dam, there is a need to persuade China to adopt a more active stance and become involved in the monitoring and co-investigation of the Yellow Sea in order to protect the marine environment. Moreover, there is a need to build a regular environmental monitoring system that includes the evaluation of environmental effects beyond borders. The Espoo Convention can serve as a mechanism to ease potential conflicts of national interest in the Northeast Asian waters where political and historical sensitivities are acute.
In particular, the recent diplomatic policy advanced by Korea and China can serve as an important example of gentle cooperation, as its policy tool of choice is regional cooperation or cooperation between different regions.