• Title/Summary/Keyword: Cross between two classes

The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms (다중 클래스 데이터셋의 메타특징이 판별 알고리즘의 성능에 미치는 영향 연구)

  • Kim, Jeonghun;Kim, Min Yong;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.23-45 / 2020
  • Big data is being created in a wide variety of fields such as medical care, manufacturing, logistics, sales, and SNS, and dataset characteristics are correspondingly diverse. To secure competitiveness, companies need to improve their decision-making capacity using classification algorithms. However, most of them lack sufficient knowledge of which classification algorithm is appropriate for a specific problem area. In other words, determining which classification algorithm suits the characteristics of a given dataset has been a task that requires expertise and effort. This is because the relationship between the characteristics of datasets (called meta-features) and the performance of classification algorithms has not been fully understood. Moreover, there has been little research on meta-features that reflect the characteristics of multi-class datasets. Therefore, the purpose of this study is to empirically analyze whether meta-features of multi-class datasets have a significant effect on the performance of classification algorithms. In this study, meta-features of multi-class datasets were grouped into two factors (data structure and data complexity), and seven representative meta-features were selected. Among those, we included the Herfindahl-Hirschman Index (HHI), originally a measure of market concentration, to replace the Imbalance Ratio (IR). We also added a newly developed index, the Reverse ReLU Silhouette Score, to the meta-feature set. From the UCI Machine Learning Repository, six representative datasets (Balance Scale, PageBlocks, Car Evaluation, User Knowledge-Modeling, Wine Quality (red), Contraceptive Method Choice) were selected. The classes of each dataset were predicted using the classification algorithms selected in the study (KNN, Logistic Regression, Naïve Bayes, Random Forest, and SVM). For each dataset, we applied 10-fold cross-validation. 
Oversampling of 10% to 100% was applied to each fold, and the meta-features of the resulting dataset were measured. The meta-features selected are HHI, Number of Classes, Number of Features, Entropy, Reverse ReLU Silhouette Score, Nonlinearity of Linear Classifier, and Hub Score. F1-score was selected as the dependent variable. The results showed that six meta-features, including the Reverse ReLU Silhouette Score and HHI proposed in this study, have a significant effect on classification performance. (1) The HHI meta-feature proposed in this study was significant for classification performance. (2) The number of variables has a significant effect on classification performance and, unlike the number of classes, the effect is positive. (3) The number of classes has a negative effect on classification performance. (4) Entropy has a significant effect on classification performance. (5) The Reverse ReLU Silhouette Score also significantly affects classification performance at a significance level of 0.01. (6) The nonlinearity of linear classifiers has a significant negative effect on classification performance. In addition, the results of the analysis by classification algorithm were consistent. In the regression analysis by classification algorithm, the number of variables did not have a significant effect for the Naïve Bayes algorithm, unlike for the other classification algorithms. This study makes two theoretical contributions: (1) two new meta-features (HHI and the Reverse ReLU Silhouette Score) were shown to be significant; (2) the effects of data characteristics on classification performance were investigated using meta-features. The practical contributions are as follows. (1) The results can be used in developing a system that recommends classification algorithms according to the characteristics of datasets. 
(2) Because data characteristics differ, many data scientists search for the optimal algorithm for a given situation by repeatedly adjusting algorithm parameters and testing. This process wastes considerable resources in hardware, cost, time, and manpower. This study is expected to be useful for machine learning and data mining researchers, practitioners, and developers of machine learning-based systems. The paper consists of an introduction, related research, research model, experiment, and conclusion and discussion.
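
Two of the meta-features named above can be illustrated with a short sketch. It assumes that HHI is computed as the sum of squared class shares (the standard market-concentration definition applied to class proportions) and that Entropy means the Shannon entropy of the class distribution; the abstract does not state the exact formulas used, and the label vectors below are hypothetical rather than taken from the UCI datasets.

```python
import math
from collections import Counter


def class_shares(labels):
    """Return the proportion of samples belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def hhi(labels):
    """Herfindahl-Hirschman Index: sum of squared class shares."""
    return sum(p * p for p in class_shares(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    return -sum(p * math.log2(p) for p in class_shares(labels).values())


# Hypothetical label vectors (assumption, for illustration only).
balanced = ["a", "b", "c", "d"] * 25
skewed = ["a"] * 85 + ["b"] * 5 + ["c"] * 5 + ["d"] * 5

print("balanced:", hhi(balanced), entropy(balanced))
print("skewed:  ", hhi(skewed), entropy(skewed))
```

Under these assumptions, a more imbalanced class distribution gives a higher HHI and a lower entropy, which is why HHI can stand in for an imbalance ratio when there are more than two classes.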

Impact of pore fluid heterogeneities on angle-dependent reflectivity in poroelastic layers: A study driven by seismic petrophysics

  • Ahmad, Mubasher;Ahmed, Nisar;Khalid, Perveiz;Badar, Muhammad A.;Akram, Sohail;Hussain, Mureed;Anwar, Muhammad A.;Mahmood, Azhar;Ali, Shahid;Rehman, Anees U.
    • Geomechanics and Engineering / v.17 no.4 / pp.343-354 / 2019
  • The present study demonstrates the application of seismic petrophysics and amplitude versus angle (AVA) forward modeling to identify reservoir fluids and to discriminate their saturation levels and natural gas composition. Two case studies of the Lumshiwal Formation (mainly sandstone) of Lower Cretaceous age are presented, from the Kohat Sub-basin and the Middle Indus Basin of Pakistan. The conventional angle-dependent reflection amplitudes, P-to-P ($R_{PP}$), P-to-S ($R_{PS}$), S-to-S ($R_{SS}$) and S-to-P ($R_{SP}$), and the newly developed AVA attributes (${\Delta}R_{PP}$, ${\Delta}R_{PS}$, ${\Delta}R_{SS}$ and ${\Delta}R_{SP}$) are analyzed at different gas saturation levels in the reservoir rock. These attributes are generated by taking the differences between the water-wet reflection coefficient and the reflection coefficient at unknown gas saturation. Intercept (A) and gradient (B) attributes are also computed and cross-plotted for different gas compositions and gas/water scenarios to define the AVO class of the reservoir sands. The numerical simulation reveals that ${\Delta}R_{PP}$, ${\Delta}R_{PS}$, ${\Delta}R_{SS}$ and ${\Delta}R_{SP}$ are good indicators, able to distinguish low from high gas saturation with a high level of confidence compared with the conventional reflection amplitudes (P-P, P-S, S-S and S-P). In A-B cross-plots, the gas lines move towards the fluid (wet) lines as the proportion of heavier gases increases in the Lumshiwal Sands. Because of the upper contacts with different sedimentary rocks (shale/limestone) in the two wells, the same reservoir sand exhibits different responses, resembling AVO class I and class IV. This study will help to analyze gas sands by using amplitude-based attributes as direct gas indicators in future gas wells drilled in clastic successions.
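
The difference attributes and the intercept/gradient pair can be sketched numerically. The example assumes the common two-term linearization $R(\theta) \approx A + B\sin^2\theta$ for extracting A and B; the abstract does not say which reflectivity equations the authors used, and the curves below are generated from made-up (A, B) values rather than from the Lumshiwal wells.

```python
import math


def fit_intercept_gradient(angles_deg, reflectivity):
    """Least-squares fit of R(theta) ~ A + B * sin^2(theta); returns (A, B)."""
    x = [math.sin(math.radians(a)) ** 2 for a in angles_deg]
    n = len(x)
    mean_x = sum(x) / n
    mean_r = sum(reflectivity) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxr = sum((xi - mean_x) * (ri - mean_r) for xi, ri in zip(x, reflectivity))
    gradient = sxr / sxx
    intercept = mean_r - gradient * mean_x
    return intercept, gradient


def delta_r(r_wet, r_gas):
    """Difference attribute: water-wet reflectivity minus reflectivity at a given gas saturation."""
    return [w - g for w, g in zip(r_wet, r_gas)]


# Hypothetical reflectivity curves built from assumed (A, B) pairs.
angles = [0, 5, 10, 15, 20, 25, 30]
s2 = [math.sin(math.radians(a)) ** 2 for a in angles]
r_wet = [0.08 - 0.10 * s for s in s2]  # assumed water-wet response
r_gas = [0.03 - 0.25 * s for s in s2]  # assumed gas-bearing response

a_wet, b_wet = fit_intercept_gradient(angles, r_wet)
a_gas, b_gas = fit_intercept_gradient(angles, r_gas)
print("wet A, B:", a_wet, b_wet)
print("gas A, B:", a_gas, b_gas)
print("delta R per angle:", delta_r(r_wet, r_gas))
```

Plotting each (A, B) pair as a point gives the kind of A-B cross-plot the abstract refers to, with the gas point displaced from the wet point.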

Ontology-Based Process-Oriented Knowledge Map Enabling Referential Navigation between Knowledge (지식 간 상호참조적 네비게이션이 가능한 온톨로지 기반 프로세스 중심 지식지도)

  • Yoo, Kee-Dong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.61-83 / 2012
  • A knowledge map describes a network of related knowledge in the form of a diagram, and therefore underpins the structure for categorizing and archiving knowledge by defining the referential navigation relationships between pieces of knowledge. Referential navigation between knowledge refers to the cross-referencing that occurs when a user utilizes a piece of knowledge. To understand the contents of a piece of knowledge, a user usually requires additional information or knowledge related to it by cause and effect. This relation expands as the effective connections between knowledge increase, and finally forms a network of knowledge. A network display of knowledge, which uses nodes and links to arrange and represent the relationships between concepts, can convey a more complex knowledge structure than a hierarchical display. Moreover, it helps a user make inferences through the links shown on the network. For this reason, building knowledge maps on ontology technology has been emphasized as a way to describe knowledge and its relationships formally and objectively, and quite a few studies have been proposed to meet this need. However, most studies that apply ontologies to knowledge maps have focused only on formally expressing knowledge and its relationships with other knowledge in order to promote knowledge reuse. Although many types of ontology-based knowledge maps have been proposed, no study has tried to design and implement a knowledge map that enables referential navigation. This paper presents a methodology for building an ontology-based knowledge map that enables referential navigation between knowledge. 
The ontology-based knowledge map resulting from the proposed methodology can not only express the referential navigation between knowledge but also infer additional relationships among knowledge based on the referential relationships. The main benefits of applying ontology technology to the knowledge map include: formal expression of knowledge and its relationships with other knowledge; automatic identification of the knowledge network through self-inference on the referential relationships; and automatic expansion of the knowledge base, which is designed to categorize and store knowledge according to the network between knowledge. To enable referential navigation between the knowledge included in the knowledge map, and thus to give the knowledge map the form of a network, the ontology must describe knowledge in relation to processes and tasks. A process is composed of component tasks, and a task is activated once the required knowledge has been input. Since the cause-and-effect relation between knowledge is inherently determined by the sequence of tasks, the referential relationship between knowledge can be implemented indirectly if each piece of knowledge is modeled as an input or output of a task. To describe knowledge with respect to its related process and task, Protege-OWL, an editor that enables users to build ontologies for the Semantic Web, is used. An OWL ontology-based knowledge map includes descriptions of classes (process, task, and knowledge), properties (relationships between process and task, and between task and knowledge), and their instances. Given such an ontology, the OWL formal semantics specifies how to derive its logical consequences, i.e. facts not literally present in the ontology but entailed by the semantics. Therefore, a knowledge network can be formulated automatically from the defined relationships, and referential navigation between knowledge is enabled. 
To verify the validity of the proposed concepts, two real business process-oriented knowledge maps are presented as examples: the knowledge maps of the 'Business Trip Application' and 'Purchase Management' processes. The performance of the implemented ontology-based knowledge map was examined using 'DL-Query', a plug-in module provided with Protege-OWL. Two kinds of queries were tested: one to check whether the knowledge is networked with respect to the referential relations, and one to check whether the ontology-based knowledge network can infer further facts that are not literally described. The test results show that referential navigation between knowledge was correctly realized and that the additional inference was performed accurately.
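
The idea that referential links follow from modeling knowledge as task inputs and outputs can be sketched in plain Python. This is not the paper's OWL ontology or its DL-Query setup: the task and knowledge names are hypothetical, and a simple graph traversal stands in for the reasoner's inference of facts that are not stated directly.

```python
from collections import defaultdict

# Hypothetical process model: each task lists the knowledge it consumes and produces.
tasks = [
    {"name": "FillApplication", "inputs": ["TravelPolicy"], "outputs": ["ApplicationForm"]},
    {"name": "ApproveApplication", "inputs": ["ApplicationForm", "BudgetStatus"], "outputs": ["Approval"]},
    {"name": "BookTransport", "inputs": ["Approval"], "outputs": ["Itinerary"]},
]


def direct_references(task_list):
    """Knowledge produced by a task refers to every knowledge item that task consumed."""
    refs = defaultdict(set)
    for task in task_list:
        for out in task["outputs"]:
            refs[out].update(task["inputs"])
    return refs


def all_references(refs, item):
    """Follow references transitively to find links that were never stated directly."""
    seen = set()
    stack = list(refs.get(item, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(refs.get(current, ()))
    return seen


refs = direct_references(tasks)
print(sorted(refs["Approval"]))
print(sorted(all_references(refs, "Itinerary")))
```

Here the link from `Itinerary` back to `TravelPolicy` is never declared; it follows only from the order in which tasks consume and produce knowledge, which mirrors the abstract's point about inferred relationships.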

Structural Properties of Social Network and Diffusion of Product WOM: A Sociocultural Approach (사회적 네트워크 구조특성과 제품구전의 확산: 사회문화적 접근)

  • Yoon, Sung-Joon;Han, Hee-Eun
    • Journal of Distribution Research / v.16 no.1 / pp.141-177 / 2011
  • I. Research Objectives: Most previous studies on diffusion have concentrated on the efficacy of word-of-mouth (WOM) communication using individual-level variables (Iacobucci 1996; Midgley et al. 1992). However, few studies have investigated a network's structural properties as antecedents of WOM from the perspective of consumers' sociocultural propensities. Against this backdrop, this study attempted to link a network's structural properties and consumers' WOM behavior on a cross-national basis. The major research objective was to examine the relationship between network properties and WOM by comparing Korean and Chinese consumers. The specific objectives are threefold. First, the study sought to examine whether network properties (i.e., tie strength, centrality, range) affect WOM (WOM intention and quality of WOM). Second, it aimed to explore the moderating effects of cultural orientation (uncertainty avoidance and individualism) on the relationship between network properties and WOM. Third, it sought to substantiate the role of innovativeness as an antecedent to both network properties and WOM. II. Research Hypotheses: Based on the above research objectives, the study put forth the following research hypotheses. 
H1-1: The strength of the tie between two counterparts within a network will positively influence WOM effectiveness. H1-2: Network centrality will positively influence WOM effectiveness. H1-3: Network range will positively influence WOM effectiveness. H2-1: The consumer's uncertainty avoidance tendency will moderate the relationship between network properties and WOM effectiveness. H2-2: The consumer's individualism tendency will moderate the relationship between network properties and WOM effectiveness. H3-1: The consumer's innovativeness will positively influence the social network properties. H3-2: The consumer's innovativeness will positively influence WOM effectiveness. III. Methodology: Through a pilot study and back-translation, two versions of the questionnaire were prepared, one in Korean and the other in Chinese. The Chinese data were collected from Chinese students enrolled in language schools in Suwon, Korea, while the Korean data were collected from students taking classes at a major university in Seoul. A total of 277 questionnaires were used for analysis of the Korean data and 212 for the Chinese data. Chinese students living in Korea rather than in China were selected for two reasons: first, to neutralize differences (i.e., retail channel availability) that may arise from living in separate countries, and second, to minimize differences in communication venues such as internet accessibility and cell phone usability. SPSS 12.0 and AMOS 7.0 were used for analysis. IV. Results: Prior to hypothesis testing, mean differences between the two countries on the major constructs were tested, with the following results. As for network properties (tie strength, centrality and range), Koreans showed higher scores on all three constructs. 
For cultural orientation traits, Koreans scored higher than Chinese only on uncertainty avoidance. Regarding the first research objective, the relationship between network properties and WOM effectiveness, tie strength (Beta=.116; t=1.785) and centrality (Beta=.499; t=6.776) significantly influenced WOM intention in the Korean sample, and a similar finding was obtained for the Chinese sample, with tie strength (Beta=.246; t=3.544) and centrality (Beta=.247; t=3.538) being significant. However, with regard to WOM argument quality, only centrality (Beta=.82; t=7.600) had a significant impact in the Korean data, whereas both tie strength (Beta=.142; t=2.052) and centrality (Beta=.348; t=5.031) were influential in the Chinese data. To address the second research objective, the moderating role of cultural orientation, moderated regression analysis was performed. The results showed that uncertainty avoidance moderated the relationship between network range and WOM intention for both Korea and China. However, for Korea uncertainty avoidance moderated the relationship between tie strength and WOM quality, while for China it moderated the relationship between network range and WOM intention. Innovativeness moderated the relationship between tie strength and WOM intention for Korea, but between network range and WOM intention for China. Regarding the third research objective, innovativeness positively influenced only centrality for Korea (Beta=.546; t=10.808), while for China it influenced both tie strength (Beta=.203; t=2.998) and centrality (Beta=.518; t=8.782). For both countries alike, innovativeness positively influenced WOM (WOM intention and WOM quality). V. Implications: The study yields two practical implications. 
First, the results suggest that companies targeting multinational customers need to identify segments that are susceptible to positive WOM and WOM information on the basis of individual traits such as uncertainty avoidance and individualism, and to develop their marketing communication strategy accordingly. Second, companies need to divide the market according to Rogers' five innovation stages and, based on this information, implement marketing strategies that use social networking tools such as public media and WOM. For instance, innovators and early adopters, if provided with new product information, will be able to capitalize on network advantages and thus add informational value to network operations through SNS or corporate blogs.

Nitrogen Uptake, Yield and Gross Income of Sweet Corn as Affected by Nitrogen (질소시비량이 단옥수수의 질소흡수, 수량 및 조수입에 미치는 영향)

  • 이석순;최상집
    • KOREAN JOURNAL OF CROP SCIENCE / v.35 no.1 / pp.83-89 / 1990
  • A sweet corn hybrid, Golden Cross Bantam 70, was grown at 0, 5, 10, 15 and 20 kg/10a of nitrogen (N) under transparent P.E. film mulch to find the best yield evaluation method. Culm length, ear height, and number of tillers increased, and silking date was 1-2 days earlier, with increased N level. Leaf area index of the main culm at harvest increased with increased N level. Marketable ears were divided into two classes according to the wholesale market price: the first grade, with husked ear weight over 150g (unhusked ear weight 230g), and the second grade, with husked ear weight between 100 and 150g (unhusked ear weight between 180 and 230g). Average length, thickness, and weight of both grades of marketable ears did not differ among the N levels. The proportion of the first grade increased with increased N level. However, the total number and weight of marketable ears, and the gross income per 10a calculated from the weight and number of ears, increased with increased N level. There were highly positive correlations between gross income and ear number or ear weight per 10a. The number and weight of marketable ears were underestimated at high N levels compared with gross income. Dry matter yield of stover ranged from 740 to 963 kg/10a and increased with increased N level, with a dry matter content of 20.8-24.5%. The rice black-streaked dwarf virus infection rate was 11.8-15.0%, but it was not related to N level. N concentration in the ear was similar across N levels, but that in the stover increased with increased N level. Total N uptake increased but N recovery decreased with increased N level.

A Study on the Successful Case of Brand Renewal through American National Brand 'C' Company's Marketing Strategy (미국(美國) 내셔널브랜드 C사(社)의 마케팅전략(戰略)을 통한 브랜드리뉴얼 성공사례(成功事例) 연구(硏究))

  • Koh, Hee-Sook
    • Journal of Fashion Business / v.6 no.1 / pp.137-154 / 2002
  • It is not easy to renew an old brand with over 50 years of history to suit the tastes of today's new consumers. Most national brands with some 20 years of history in Korea have striven in vain for brand continuity and growth, which illustrates the current situation. C company, a US national brand with a 51-year history, has nevertheless secured its position as a fashion group and built a sound foundation by establishing a new marketing strategy and completing a successful brand renewal in the process of a strategic M&A with an Italian company. Its successful marketing strategies are as follows. 1) The company regarded market- and consumer-oriented marketing activity as its highest-priority strategy and put great emphasis on concentrating on its target market and re-establishing its brand image as business casual wear. 2) Setting up and operating a planning team composed solely of merchandisers in Milan, the company set the direction of each plan on the basis of concentrated research on potential items in the market, following thorough market research done by the buying office in Korea, the branch office in Hong Kong and buyers in the US before the seasonal blueprint planning. 3) Great emphasis was placed on the intensive presentation of basic key items for career women, the main consumer group in the medium-to-low price market in the US, and on supplementing sizes and colors. The company named this line 'collectibles'. By offering the same fabrics and colors throughout the year, it helped customers develop their own wardrobe plans without worrying about changes of color and fabric, and it made variation easy by supplementing the line with new trend items. 4) The company set black as the main color, which many career women find easy to care for and suitable for expressing their own image, presented it together with pebble, a color in the navy and beige range, and added fashion colors such as wine and brown as the season progressed. 
The company constructed the basic line so that customers could coordinate purchased items with new ones or add them to their existing collection, and pursued efficient sales through a strategy that allows this cross-coordination while changing patterns occasionally. 5) Although the basic jacket at $99 and the short slim skirt at $49 fall within the medium-to-low price range, and material planning aimed at items that are both reasonably functional and fashionable, synthetic material had to be used as the main material for reasons of price competitiveness. Even so, with attention to a comfortable fit and a refined drape that shows no sign of cheap material, the whole collectibles line was divided into two items, which contributed to cost reduction. One material, composed of triacetate and polyester in a 70:30 ratio, was used in quantities of up to 4 million yards, and this concentration allowed a drastic reduction in cost. For the 'collectibles' line, which mainly used Korean material, C company chose to have its products sewn in Southeast Asian countries where transportation is well developed and both productivity and quality have been verified, operating a global production system that aims to cut costs by outsourcing production to countries with low labor costs and receiving the finished products. Today's consumers are polarized: the consumers with a middle-class mindset of the past no longer exist, and the market is split between consumers who seek only fine articles of the highest quality and wise consumers who are sensible enough to judge price bubbles from the relation between price and quality. To cope with this change in consumer mindset, apparel makers are changing their policies so as to produce items of reasonable quality within an affordable price range anywhere in the world, and they are striving to overcome difficult conditions by operating global marketing strategies that stress the separation of planning, production and sales and a fashion sensibility shared worldwide. The marketing strategy of C company can be cited as a successful example.

Development of Predictive Models for Rights Issues Using Financial Analysis Indices and Decision Tree Technique (경영분석지표와 의사결정나무기법을 이용한 유상증자 예측모형 개발)

  • Kim, Myeong-Kyun;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.18 no.4 / pp.59-77 / 2012
  • This study focuses on predicting which firms will increase capital by issuing new stocks in the near future. Many stakeholders, including banks, credit rating agencies and investors, perform a variety of analyses of firms' growth, profitability, stability, activity, productivity, etc., and regularly report the firms' financial analysis indices. In this paper, we develop predictive models for rights issues using these financial analysis indices and data mining techniques. The study builds the predictive models from two analytical perspectives. The first is the analysis period. We divide the analysis period into before and after the IMF financial crisis and examine whether there is a difference between the two periods. The second is the prediction time. To predict when firms will increase capital by issuing new stocks, the prediction time is categorized as one year, two years and three years later. Therefore, a total of six prediction models are developed and analyzed. In this paper, we employ the decision tree technique to build the prediction models for rights issues. The decision tree is a widely used prediction method that builds trees to label or categorize cases into a set of known classes. In contrast to neural networks, logistic regression and SVM, decision tree techniques are well suited for high-dimensional applications and have strong explanation capabilities. Well-known decision tree induction algorithms include CHAID, CART, QUEST, and C5.0. Among them, we use the C5.0 algorithm, which is the most recently developed and yields better performance than the other algorithms. We obtained data on rights issues and financial analysis from TS2000 of the Korea Listed Companies Association. A record of financial analysis data consists of 89 variables, which include 9 growth indices, 30 profitability indices, 23 stability indices, 6 activity indices and 8 productivity indices. 
For model building and testing, we used 10,925 financial analysis records from a total of 658 listed firms. PASW Modeler 13 was used to build C5.0 decision trees for the six prediction models. A total of 84 variables from the financial analysis data were selected as the input variables of each model, and the rights issue status (issued or not issued) was defined as the output variable. To develop the prediction models using the C5.0 node (Node Options: Output type = Rule set, Use boosting = false, Cross-validate = false, Mode = Simple, Favor = Generality), we used 60% of the data for model building and 40% for model testing. The experimental results show that the prediction accuracies for data after the IMF financial crisis (68.78% to 71.41%) are about 10 percentage points higher than those before the IMF financial crisis (59.04% to 60.43%). These results indicate that since the IMF financial crisis, the reliability of financial analysis indices has increased and firms' intentions regarding rights issues have become more evident. The results also show that stability-related indices have a major impact on conducting rights issues in short-term prediction. On the other hand, long-term prediction of rights issues is affected by financial analysis indices of profitability, stability, activity and productivity. All the prediction models include the industry code as one of the significant variables. This means that companies in different industries show different patterns of rights issues. We conclude that stakeholders should take into account stability-related indices for short-term prediction and a wider variety of financial analysis indices for long-term prediction. The current study has several limitations. First, we need to compare the differences in accuracy obtained with different data mining techniques such as neural networks, logistic regression and SVM. 
Second, we need to develop and evaluate new prediction models that include variables which research on capital structure theory has identified as relevant to rights issues.
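
How a decision tree chooses a split on a financial index can be illustrated with a small sketch. It shows the entropy-based information gain and gain ratio associated with the C4.5 family of algorithms; the abstract does not describe the internals of the C5.0 node in PASW Modeler, so this is an assumption about the general technique rather than a reproduction of the authors' models. The debt-ratio values and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_and_ratio(values, labels, threshold):
    """Information gain and gain ratio for the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0, 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    gain = entropy(labels) - weighted
    # Split information penalizes very uneven partitions.
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain, gain / split_info


# Hypothetical stability index (debt ratio) and rights-issue status for eight firms.
debt_ratio = [0.2, 0.3, 0.4, 0.5, 0.7, 0.8, 0.9, 1.1]
issued = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]

print(gain_and_ratio(debt_ratio, issued, 0.6))
print(gain_and_ratio(debt_ratio, issued, 0.3))
```

A tree builder would evaluate candidate thresholds for every input variable in this way and keep the split with the best score, then repeat the process on each resulting subset.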