  • Title/Summary/Keyword: intelligence information society


A Study on the Construal Level and Intention of Autonomous Driving Taxi According to Message Framing (해석수준과 메시지 프레이밍에 따른 자율주행택시의 사용의도에 관한 연구)

  • Yoon, Seong Jeong;Kim, Min Yong
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.135-155 / 2018
  • This study analyzes how construal level and message framing affect the intention to use autonomous vehicles, an emerging product of the fourth industrial revolution, when they are deployed as taxis. Construal level refers to how a product or service is interpreted depending on whether it is assumed to occur in the near future or in the distant future. Message framing refers to expressing a message positively or negatively, at the extremes of benefits and losses. According to previous studies, people interpret the value of a product or service differently depending on these two concepts. This study therefore investigates whether intention to use differs when the two concepts are applied to the launch of an autonomous vehicle as a taxi. The results are summarized as follows. First, for message framing, two messages were constructed and compared: one explaining the gain from using an autonomous taxi and why it should be used, and one explaining the loss from not using an autonomous taxi and how. The two message framings differed (t = 3.063), and the message describing the benefits and reasons produced a higher intention to use. Second, with respect to construal level, intention to use differed between the near-future and distant-future assumptions for the gain and loss messages, respectively. In summary, to increase the intention to use autonomous taxis, messages should be positive (gain) and should describe what can happen in the distant future. The research method can also be applied to studies of the intention to use other new technologies. However, this study has the following limitations. First, it assumes message framing and time frames without any user experience of an autonomous taxi.
This will differ from the actual experience of using an autonomous taxi in the future. Second, although the technology of self-driving cars continues to progress, laws and institutions must be established, and infrastructure built, before they can be commercialized and operated. Given this, the results of this study cannot fully reflect realistic conditions. There is, however, a practical limit to finding users with sufficient experience of new technologies such as autonomous vehicles. Moreover, even if the road infrastructure, technology, and legal framework for using autonomous cars as taxis in public transportation were ready, the public might not choose an autonomous cab without enough knowledge of how to use one. Therefore, the main purpose of this study is to find how to frame and deliver messages most effectively, on the assumption that autonomous cars will be commercialized as taxis. In addition, the research methodology should be improved and future research should proceed as follows. First, most respondents in this study were students, so it is difficult to generalize the results of the hypothesis tests. Future studies should therefore survey a more varied population, considering age, region, occupation, and education level, and should cover people who could use an autonomous taxi rather than only those who can drive. Second, although it is desirable to include various message framings in the questionnaire, respondents may learn from an earlier framing, which can bias their responses to the next one. It is therefore desirable to design the questionnaire so that each message framing is measured after a certain interval of time.
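
The reported comparison of the two message framings (t = 3.063) can be illustrated with a two-sample t statistic. The sketch below is not the authors' analysis: it assumes two independent respondent groups and uses Welch's form of the statistic, and the function name `welch_t` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    if len(group_a) < 2 or len(group_b) < 2:
        raise ValueError("each group needs at least two observations")
    # statistics.variance is the sample variance (n - 1 in the denominator).
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical intention-to-use scores on a 7-point scale (not data from the study).
gain_why_scores = [5, 6, 7, 6, 5, 7]
loss_how_scores = [4, 5, 4, 5, 3, 5]

t_value = welch_t(gain_why_scores, loss_how_scores)
print(f"t = {t_value:.3f}")
```

A positive value means the first group (here the gain/why message) has the higher mean; whether it is significant depends on the degrees of freedom, which this sketch does not compute.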

Development of a Detection Model for the Companies Designated as Administrative Issue in KOSDAQ Market (KOSDAQ 시장의 관리종목 지정 탐지 모형 개발)

  • Shin, Dong-In;Kwahk, Kee-Young
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.157-176 / 2018
  • The purpose of this research is to develop a detection model for companies designated as administrative issues in the KOSDAQ market using financial data. Administrative issue designation identifies companies with a high potential for delisting and, under certain restrictions of the Korean stock market, gives them time to resolve the grounds for delisting. It acts as an alarm that informs investors and market participants which companies are likely to be delisted and warns them to invest safely. Despite this importance, there are relatively few studies on predicting administrative issue designation compared with the many studies on bankruptcy prediction. Therefore, this study develops and verifies a detection model for companies designated as administrative issues using the financial data of KOSDAQ companies. Logistic regression and a decision tree are proposed as the data mining models for detection. According to the analysis, the logistic regression model predicted designated companies using three variables, ROE (earnings before tax), cash flows/shareholder's equity, and asset turnover ratio, and its overall accuracy was 86% on the validation dataset. The decision tree (Classification and Regression Trees, CART) model applied classification rules using cash flows/total assets and ROA (net income), and its overall accuracy reached 87%. The implications of the financial indicators selected in the logistic regression and decision tree models are as follows. First, ROE (earnings before tax) in the logistic detection model shows the profit and loss of the continuing business segments, excluding the revenue and expenses of discontinued businesses. A weakening of this variable therefore means that the competitiveness of the core business is weakening.
If a large part of the profits comes from one-off gains, the deterioration of business management is very likely to intensify further. As the ROE of a KOSDAQ company decreases significantly, the company becomes highly likely to be delisted. Second, cash flows to shareholder's equity represents the firm's ability to generate cash flow when the financial condition of subsidiaries is excluded. In other words, a weakening of the parent company's management capacity, excluding the subsidiaries' competence, can be a main reason for an increased possibility of administrative issue designation. Third, a low asset turnover ratio means that the corporation uses its current and non-current assets ineffectively, or that its asset investment is excessive. If the asset turnover ratio of a KOSDAQ-listed company decreases, its corporate activities need to be examined in detail from various perspectives, such as weakening sales or changes in inventories. Cash flows/total assets, a variable selected by the decision tree detection model, is a key indicator of the company's cash condition and its ability to generate cash from operating activities. Cash flow indicates whether a firm can perform its main activities (maintaining its operating ability, repaying debts, paying dividends, and making new investments) without relying on external financial resources. Therefore, a negative value of this variable indicates that the company may have serious problems in its business activities. If a company's cash flow from operating activities is smaller than its net profit, the net profit has not been converted to cash, which indicates a serious problem in managing the company's trade receivables and inventory assets.
Therefore, as cash flows/total assets decreases, the probability of administrative issue designation and the probability of delisting increase. In summary, the logistic regression-based detection model in this study was found to be driven by the company's financial activities, including ROE (earnings before tax), whereas the decision tree-based detection model predicts designation based on the company's cash flows.

The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms (다중 클래스 데이터셋의 메타특징이 판별 알고리즘의 성능에 미치는 영향 연구)

  • Kim, Jeonghun;Kim, Min Yong;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.23-45 / 2020
  • Big data is being created in a wide variety of fields such as medical care, manufacturing, logistics, sales sites, and SNS, and the characteristics of the datasets are equally diverse. To secure their competitiveness, companies need to improve their decision-making capacity using classification algorithms. However, most of them do not have sufficient knowledge of which classification algorithm is appropriate for a specific problem area. In other words, determining which classification algorithm suits the characteristics of a dataset has been a task that requires expertise and effort. This is because the relationship between the characteristics of datasets (called meta-features) and the performance of classification algorithms has not been fully understood. Moreover, there has been little research on meta-features that reflect the characteristics of multi-class data. Therefore, the purpose of this study is to empirically analyze whether meta-features of multi-class datasets have a significant effect on the performance of classification algorithms. In this study, meta-features of multi-class datasets were grouped into two factors (data structure and data complexity), and seven representative meta-features were selected. Among them, we included the Herfindahl-Hirschman Index (HHI), originally an index of market concentration, to replace the IR (Imbalanced Ratio). We also developed a new index, the Reverse ReLU Silhouette Score, and added it to the meta-feature set. From the UCI Machine Learning Repository, six representative datasets (Balance Scale, PageBlocks, Car Evaluation, User Knowledge-Modeling, Wine Quality (red), Contraceptive Method Choice) were selected. Each dataset was classified using the classification algorithms selected in the study (KNN, Logistic Regression, Naïve Bayes, Random Forest, and SVM). For each dataset, we applied 10-fold cross validation.
Oversampling of 10% to 100% is applied to each fold, and the meta-features of the dataset are measured. The meta-features selected are HHI, Number of Classes, Number of Features, Entropy, Reverse ReLU Silhouette Score, Nonlinearity of Linear Classifier, and Hub Score. F1-score was selected as the dependent variable. The results showed that six meta-features, including the Reverse ReLU Silhouette Score and HHI proposed in this study, have a significant effect on classification performance. (1) The meta-feature HHI proposed in this study has a significant effect on classification performance. (2) The number of variables has a significant effect on classification performance and, unlike the number of classes, the effect is positive. (3) The number of classes has a negative effect on classification performance. (4) Entropy has a significant effect on classification performance. (5) The Reverse ReLU Silhouette Score also significantly affects classification performance at a significance level of 0.01. (6) The nonlinearity of linear classifiers has a significant negative effect on classification performance. The results of the analysis by classification algorithm were also consistent. In the regression analysis by classification algorithm, the number of variables has no significant effect for the Naïve Bayes algorithm, unlike the other classification algorithms. This study makes two theoretical contributions: (1) two new meta-features (HHI and Reverse ReLU Silhouette Score) were shown to be significant, and (2) the effects of data characteristics on classification performance were investigated using meta-features. As for practical contributions, (1) the results can be used to develop a system that recommends classification algorithms according to the characteristics of datasets.
(2) Because the characteristics of data differ, many data scientists repeatedly test and adjust algorithm parameters to find the optimal algorithm for a given situation. This process wastes resources in hardware, cost, time, and manpower. This study is expected to be useful for machine learning and data mining researchers, practitioners, and developers of machine learning-based systems. The paper consists of an introduction, related research, research model, experiment, and conclusion and discussion.
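
As an illustration of HHI used as a class-concentration meta-feature, the sketch below computes the sum of squared class shares from a list of labels. It assumes the standard HHI definition on proportions (ranging from 1/k to 1 for k classes) and, for comparison, an imbalance ratio defined as the largest class count divided by the smallest. The paper's exact scaling is not given here, and the function names and label counts are hypothetical.

```python
from collections import Counter


def hhi(labels):
    """Herfindahl-Hirschman Index of a label list: sum of squared class shares."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels must not be empty")
    return sum((count / total) ** 2 for count in counts.values())


def imbalance_ratio(labels):
    """Largest class count divided by smallest class count (assumed IR definition)."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    return max(counts.values()) / min(counts.values())


# Hypothetical four-class label sets with the same largest and smallest class sizes.
one_dominant = ["a"] * 85 + ["b"] * 5 + ["c"] * 5 + ["d"] * 5
one_rare = ["a"] * 85 + ["b"] * 85 + ["c"] * 85 + ["d"] * 5

for name, labels in (("one_dominant", one_dominant), ("one_rare", one_rare)):
    print(name, round(imbalance_ratio(labels), 2), round(hhi(labels), 4))
```

Both label sets share the same imbalance ratio but differ in HHI, which is one way to see why a concentration index over all classes can describe a multi-class distribution more fully than a ratio of two class sizes.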

Building battery deterioration prediction model using real field data (머신러닝 기법을 이용한 납축전지 열화 예측 모델 개발)

  • Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.243-264 / 2018
  • Although the worldwide battery market has recently been spurring the development of lithium secondary batteries, lead-acid batteries (rechargeable batteries), which perform well and can be reused, are consumed in a wide range of industries. However, lead-acid batteries have a serious problem: once even one of the several cells packed in a battery begins to degrade, the deterioration of the whole battery progresses quickly. To overcome this problem, previous studies have attempted to identify the mechanism of battery deterioration in many ways. However, most of them analyzed data obtained in a laboratory rather than data obtained in the real world. Using real data can increase the feasibility and applicability of research findings. Therefore, this study aims to develop a model that predicts battery deterioration using data obtained in the real world. To this end, we attached sensors that monitor battery condition in real time to dozens of golf carts operated on a real golf course and collected data on changes in battery state. As a result, a total of 16,883 samples were obtained. We then developed a model that predicts a precursor phenomenon of battery deterioration by analyzing the sensor data with machine learning techniques.
As initial independent variables, we used 1) inbound time of a cart, 2) outbound time of a cart, 3) duration (from outbound time to charge time), 4) charge amount, 5) used amount, 6) charge efficiency, 7) lowest temperature of battery cells 1 to 6, 8) lowest voltage of battery cells 1 to 6, 9) highest voltage of battery cells 1 to 6, 10) voltage of battery cells 1 to 6 at the beginning of operation, 11) voltage of battery cells 1 to 6 at the end of charge, 12) used amount of battery cells 1 to 6 during operation, 13) used amount of battery during operation (Max-Min), 14) duration of battery use, and 15) highest current during operation. Because the per-cell variables (lowest temperature, lowest voltage, highest voltage, voltage at the beginning of operation, voltage at the end of charge, and used amount during operation, each for battery cells 1 to 6) take similar values across cells, we conducted principal component analysis with varimax orthogonal rotation to mitigate the multicollinearity problem. Based on the results, we created new variables by averaging the values of the independent variables that clustered together and used them as the final independent variables in place of the original ones, thereby reducing the dimension. We used decision tree, logistic regression, and Bayesian network as algorithms for building prediction models. We also built prediction models using bagging of each of them, boosting of each of them, and Random Forest. Experimental results show that the prediction model using bagging of decision trees yields the best accuracy, 89.3923%. A limitation of this study is that additional variables that affect battery deterioration, such as weather (temperature, humidity) and driving habits, were not considered; we would like to consider them in future research.
However, the battery deterioration prediction model proposed in this study is expected to enable effective and efficient management of batteries used in the real field and to dramatically reduce the cost caused by failing to detect battery deterioration.

The Behavior Analysis of Exhibition Visitors using Data Mining Technique at the KIDS & EDU EXPO for Children (유아교육 박람회에서 데이터마이닝 기법을 이용한 전시 관람 행동 패턴 분석)

  • Jung, Min-Kyu;Kim, Hyea-Kyeong;Choi, Il-Young;Lee, Kyoung-Jun;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems / v.17 no.2 / pp.77-96 / 2011
  • An exhibition is defined as a market event held for a specific duration to present exhibitors' main products to business or private visitors, and it plays a key role as an effective marketing channel. As exhibitions have grown in importance, the domestic exhibition industry has achieved great quantitative growth. In contrast, its qualitative growth has not kept pace. To improve the quality of exhibitions, we need to understand the preferences and behavioral characteristics of visitors and, through that understanding, raise their level of attention and satisfaction. In this paper, we therefore used the observation survey method, a kind of field research, to understand visitors and collect real data for analyzing behavior patterns. This research proposes a methodology framework consisting of three steps. The first step is to select a suitable exhibition to which our method can be applied. The second step is to carry out the observation survey and collect real data for further analysis. In this paper, we conducted the observation survey at the KIDS & EDU EXPO for Children in SETEC, covering 160 visitors and 78 booths from November 4th to 6th, 2010. The last step is to analyze the data recorded through observation. In this step, we first analyze the features of the exhibition using the demographic characteristics collected by the observation survey. We then analyze the features of individual booths using the records of visited booths. Through this analysis, we can figure out what kinds of events attract visitors' attention and what kinds of marketing activities affect their behavior patterns.
However, because previous research considered only the individual features of an exhibition, the correlation among features has received little study. In this research, we therefore carry out additional analysis with data mining techniques to supplement the existing research, analyzing the relations among booths to learn the behavior patterns of visitors. We use two data mining techniques: clustering analysis and ARM (Association Rule Mining) analysis. In the clustering analysis, we use the K-means algorithm to figure out the correlation among booths. Through these techniques, we find that two important features affect visitors' behavior patterns in an exhibition. One is the geographical features of booths. The other is the exhibit contents of booths. Exhibition organizers should consider these features when planning the next exhibition. The results of our analysis are therefore expected to provide guidelines for understanding visitors and valuable insights from the earlier phases of exhibition planning. This research should also help to increase visitor satisfaction. Visitors' movement paths, booth locations, and the distances between booths should be considered in advance when planning the next exhibition. This research was conducted at the KIDS & EDU EXPO for Children in SETEC (Seoul Trade Exhibition & Convention), so there are constraints on applying it directly to other exhibitions. The results were also derived from a limited number of data samples. To obtain more accurate and reliable results, more experiments are needed, based on larger data samples and on exhibitions of various genres.
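
The association rule step can be sketched on booth-visit records. This is a minimal sketch, not the authors' procedure: it assumes each visitor is recorded as a set of visited booth IDs, mines only one-to-one rules, and uses hypothetical thresholds, hypothetical booth IDs, and a hypothetical helper named `pair_rules`.

```python
from itertools import combinations


def pair_rules(visits, min_support=0.3, min_confidence=0.6):
    """Return one-to-one rules as (lhs, rhs, support, confidence) tuples."""
    total_visitors = len(visits)
    booth_count = {}
    pair_count = {}
    for visit in visits:
        booths = set(visit)
        for booth in booths:
            booth_count[booth] = booth_count.get(booth, 0) + 1
        # Sorting makes (a, b) and (b, a) count as the same unordered pair.
        for pair in combinations(sorted(booths), 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (first, second), count in pair_count.items():
        support = count / total_visitors
        if support < min_support:
            continue
        for lhs, rhs in ((first, second), (second, first)):
            confidence = count / booth_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical observation records: one set of visited booths per visitor.
visits = [
    {"B01", "B02", "B07"},
    {"B01", "B02"},
    {"B02", "B07"},
    {"B01", "B02", "B03"},
    {"B03", "B07"},
]

for lhs, rhs, support, confidence in pair_rules(visits):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as B01 -> B02 with high confidence would suggest that visitors to one booth tend to visit the other as well, which is the kind of booth-to-booth relation the abstract describes.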

Design and Analysis of Online Advertising Expenditure Model based on Coupon Download (쿠폰 다운로드를 기준으로 하는 온라인 광고비 모델의 설계 및 분석)

  • Jun, Jung-Ho;Lee, Kyoung-Jun
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.1-19 / 2010
  • Unlike traditional offline advertising through TV, newspapers, and radio, online advertising draws instantaneous responses from potential consumers, and its effect is convenient to assess. These characteristics have driven the growth of the advertising model among the various Internet business models. Internet advertising expenditure models are conventionally classified into CPM (Cost Per Mille), CPC (Cost Per Click), and CPS (Cost Per Sale) models. These can be examined in terms of the risks that stakeholders bear and their degree of responsibility. Under the CPM model, which is based on the number of advertisement exposures, an advertisement may be mechanically exposed to users without actually being noticed by them, so advertisers risk spending without any advertising effect. From the media's point of view, the CPS model, which is based on conversion actions, is the riskiest, because a conversion action such as a product purchase is determined by the capability of the advertiser, not that of the media. Whereas the CPM and CPS models each disadvantage one side of the value network of the Internet advertising business model, the CPC model has been regarded as reasonable for both advertisers and media and has occupied the largest segment of the Internet advertising market. However, the CPC model can also induce fraudulent behavior such as click fraud, arising from competition or from attempts to dishonestly inflate advertising expenditure. On the user side, unintentionally accessed advertisements can lead to further inappropriate expenditure by advertisers. In this paper, we suggest the "CPCD" (Cost Per Coupon Download) model. It goes beyond the simple clicking of advertisements: advertising expenditure is incurred when users download a coupon from an advertiser, which places the concept between the CPC and CPS models.
To this end, we describe the CPCD model's scenario from the advertiser's perspective, its processes, and its participants and their benefits. In particular, we suggest new values of online coupons: "possibility of storage" and "complement for delivery to the target group". We also analyze the conditions under which the model works for advertisers by comparing the CPC and CPCD models through an advertising expenditure simulation. The simulation results imply that the CPCD model suits advertisers of medium- and low-priced products better than advertisers of high-priced goods. This result is notable because most advertisers under the CPC model deal in medium- and low-priced products. Finally, we consider the applicability of the CPCD model in a ubiquitous environment.
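
One simple way to compare the two billing schemes from the advertiser's side is the spend needed to obtain one sale. The sketch below is not the paper's simulation: it assumes the advertiser pays a fixed fee per billable event (a click under CPC, a coupon download under CPCD) and that a fixed share of those events ends in a sale. The function name and every number are hypothetical.

```python
def cost_per_sale(fee_per_event, event_to_sale_rate):
    """Advertiser spend per completed sale when each billable event costs fee_per_event."""
    if not 0 < event_to_sale_rate <= 1:
        raise ValueError("event_to_sale_rate must be in (0, 1]")
    return fee_per_event / event_to_sale_rate


# Hypothetical inputs, chosen only to show the comparison.
cpc_cost = cost_per_sale(fee_per_event=0.5, event_to_sale_rate=0.02)   # billed per click
cpcd_cost = cost_per_sale(fee_per_event=2.0, event_to_sale_rate=0.10)  # billed per coupon download

cheaper_model = "CPCD" if cpcd_cost < cpc_cost else "CPC"
print(f"CPC: {cpc_cost:.2f} per sale, CPCD: {cpcd_cost:.2f} per sale, cheaper: {cheaper_model}")
```

Under these assumptions the comparison turns on whether the higher fee per download is offset by a higher share of downloads that end in a sale; a fuller model would also include product price and margin.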

Analysis of Knowledge Community for Knowledge Creation and Use (지식 생성 및 활용을 위한 지식 커뮤니티 효과 분석)

  • Huh, Jun-Hyuk;Lee, Jung-Seung
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.85-97 / 2010
  • Internet communities are a typical space for knowledge creation and use on the Internet, as people discuss their common interests within them. When we define 'Knowledge Communities' as Internet communities related to knowledge creation and use, they fall into four types: 'Search Engine,' 'Open Communities,' 'Specialty Communities,' and 'Activity Communities.' A knowledge community of any type does not remain the same. Rather, it changes with time and is also affected by the external business environment. It is therefore critical to develop processes for the practical use of such changeable knowledge communities. Yet there is little research on a strategic framework for knowledge communities as a source of knowledge creation and use. The purposes of this study are (1) to find the factors that affect knowledge creation and use for each type of knowledge community and (2) to develop a strategic framework for the practical use of knowledge communities. Based on previous research, we identified seven factors that have considerable impact on knowledge creation and use: 'Fitness,' 'Reliability,' 'Systemicity,' 'Richness,' 'Similarity,' 'Feedback,' and 'Understanding.' We created 30 different questions from each type of knowledge community. The questions covered common sense, IT, business, and hobbies, and were selected uniformly from various knowledge communities. Instead of conducting a survey, we put these questions to users of four representative web sites: Google for Search Engine, NAVER Knowledge iN for Open Communities, SLRClub for Specialty Communities, and Wikipedia for Activity Communities. These four sites were selected based on popularity (i.e., the four most popular sites in Korea). They were also among the four sites most frequently mentioned in previous research.
The answers to the 30 knowledge questions were collected and evaluated by 11 IT experts who had worked for IT companies for more than three years. The experts used the seven knowledge factors above as evaluation criteria. Using stepwise linear regression on the evaluations of the seven knowledge factors, we found that each factor affects knowledge creation and use differently for each type of knowledge community. The regression results showed the relationship between 'Understanding' and the other knowledge factors, and this relationship differed by type of knowledge community. 'Understanding' was significantly related to 'Reliability' for the Search Engine type, to 'Fitness' for the Open Community type, to 'Reliability' and 'Similarity' for the Specialty Community type, and to 'Richness' and 'Similarity' for the Activity Community type. A strategic framework was created from these results, and it can be useful for knowledge communities that change over time. For a knowledge community to succeed, the results suggest that it is essential to secure the factors that influence that community and to reinforce each factor, since each has its own influence on the related type of knowledge community. Thus, these changeable knowledge communities should be transformed into an adequate type with proper business strategies and objectives. They should also develop into a type that covers various types of knowledge communities. For example, DCInside started as a small specialty community focusing on digital camera hardware and camerawork and was then transformed into an open community focusing on social issues through its well-known photo galleries. NAVER started as a typical search engine and now covers an open community and a specialty community through additional web services such as NAVER Knowledge iN, NAVER Cafe, and NAVER Blog.
NAVER is currently competing with an activity community such as Wikipedia through the NAVER encyclopedia, which provides its users with services similar to Wikipedia's. Finally, the results of this study provide practical guidance for practitioners on which type of knowledge community is most appropriate to a fluctuating business environment, as knowledge communities themselves evolve with time.

Selection Model of System Trading Strategies using SVM (SVM을 이용한 시스템트레이딩전략의 선택모형)

  • Park, Sungcheol;Kim, Sun Woong;Choi, Heung Sik
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.59-71 / 2014
  • System trading has recently become more popular among Korean traders. System traders use automatic order systems driven by system-generated buy and sell signals. These signals are generated from predetermined entry and exit rules coded by the system traders. Most research on system trading has focused on designing profitable entry and exit rules using technical indicators. However, market conditions, strategy characteristics, and money management also influence the profitability of system trading. Unexpected price movements that deviate from the predetermined trading rules can cause system traders large losses. Therefore, most professional traders use strategy portfolios rather than a single strategy. Building a good strategy portfolio is important because trading performance depends on it. Despite the importance of designing strategy portfolios, rule-of-thumb methods have been used to select trading strategies. In this study, we propose an SVM-based strategy portfolio management system. SVM was introduced by Vapnik and is known to be effective in data mining. It can build good portfolios within a very short period of time. Since SVM minimizes structural risk, it is well suited to the futures trading market, in which prices do not move exactly as they did in the past. Our system trading strategies include the moving-average cross system, MACD cross system, trend-following system, buy dips and sell rallies system, DMI system, Keltner channel system, Bollinger Bands system, and Fibonacci system. These strategies are well known and frequently used by many professional traders. We program these strategies to generate automated system signals for entry and exit. We propose an SVM-based strategy selection system and a portfolio construction and order routing system. The strategy selection system is a portfolio training system. It generates training data and builds an SVM model using the optimal portfolio.
We build an m×n data matrix by dividing the KOSPI 200 index futures data into periods of equal length. The optimal strategy portfolio is derived by analyzing the performance of each strategy. The SVM model is generated from this data and the optimal strategy portfolio. We use 80% of the data for training and the remaining 20% for testing the strategy. For training, we select the two strategies that show the highest profit on the next day. Selection method 1 selects two strategies, and method 2 selects at most two strategies that show a profit of more than 0.1 point. We use the one-against-all method, which has a fast processing time. We analyze the daily data of KOSPI 200 index futures contracts from January 1990 to November 2011. Price change rates for 50 days are used as SVM input data. The training period is from January 1990 to March 2007, and the test period is from March 2007 to November 2011. We suggest three benchmark strategy portfolios. BM1 holds two contracts of KOSPI 200 index futures for the testing period. BM2 consists of the two strategies that show the largest cumulative profit during the 30 days before testing starts. BM3 consists of the two strategies that show the best profits during the testing period. Trading costs include brokerage commission and slippage. The proposed strategy portfolio management system earns more profit than all three benchmark portfolios. BM1 shows a profit of 103.44 points, BM2 488.61 points, and BM3 502.41 points after deducting trading costs. The best benchmark is the portfolio of the two most profitable strategies during the test period. Proposed system 1 shows a profit of 706.22 points and proposed system 2 a profit of 768.95 points after deducting trading costs. The equity curves for the entire period show a stable pattern. Together with the higher profit, this suggests a good trading direction for system traders. We can build more stable and more profitable portfolios if we add a money management module to the system.
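
The two selection methods and the 80/20 split described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' system: it assumes next-day profits per strategy are available as a dictionary, that the split is chronological, and that ties are broken by input order; the function names, strategy keys, and profit values are hypothetical.

```python
def select_method1(next_day_profit, k=2):
    """Method 1: the k strategies with the highest next-day profit."""
    ranked = sorted(next_day_profit, key=next_day_profit.get, reverse=True)
    return ranked[:k]


def select_method2(next_day_profit, k=2, threshold=0.1):
    """Method 2: at most k strategies whose next-day profit exceeds the threshold."""
    top = select_method1(next_day_profit, k)
    return [name for name in top if next_day_profit[name] > threshold]


def chronological_split(rows, train_fraction=0.8):
    """Keep the earliest rows for training and the remainder for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical next-day profits in index points for one trading day.
profits = {"ma_cross": 0.08, "macd_cross": -0.20, "bollinger": 0.05, "keltner": 0.60}

labels_method1 = select_method1(profits)
labels_method2 = select_method2(profits)
train_rows, test_rows = chronological_split(list(range(10)))
print(labels_method1, labels_method2, len(train_rows), len(test_rows))
```

On a day when only one strategy clears the 0.1-point threshold, method 2 labels a smaller portfolio than method 1, which is the difference between the two methods as the abstract states it.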

Participation Level in Online Knowledge Sharing: Behavioral Approach on Wikipedia (온라인 지식공유의 참여정도: 위키피디아에 대한 행태적 접근)

  • Park, Hyun Jung;Lee, Hong Joo;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.4
    • /
    • pp.97-121
    • /
    • 2013
  • With the growing importance of knowledge for sustainable competitive advantage and innovation in a volatile environment, much research on knowledge sharing has been conducted. However, previous research has mostly relied on questionnaire surveys, which carry the inherent perceptual errors of respondents. The current research derives the relationship between primary participant behaviors and the participation level in knowledge sharing from online user behaviors on Wikipedia, a representative community for online knowledge collaboration. Without users' participation in knowledge sharing, knowledge collaboration for creating knowledge cannot succeed. Meanwhile, the editing patterns of Wikipedia users are diverse, resulting in different revisiting periods for the same number of edits, and thus in varying results of shared knowledge. Therefore, we examine the participation level of knowledge sharing from two different angles: the number of edits and the revisiting period. The behavioral dimensions affecting the level of participation in knowledge sharing include article talk for public discussion, user talk for private messaging, and community registration, all of which are observable on the Wiki platform. Public discussion takes place on article talk pages set up for exchanging ideas about each article topic. An article talk page is often divided into several sections, each of which mainly addresses a specific type of issue raised during article development. Everything from diverse opinions about relatively trivial matters, such as what text, links, or images should be added or removed and how they should be restructured, to profound professional insights is shared, negotiated, and improved over the course of discussion. Wikipedia also provides personal user talk pages as a private messaging tool. 
On these pages, diverse personal messages such as casual greetings, stories about activities on Wikipedia, and ordinary affairs of life are exchanged. Anyone who wants to communicate with another person visits that person's user talk page and leaves a message. Wikipedia articles are assessed according to seven quality grades, of which the featured article level is the highest. The dataset includes participants' behavioral data related to 2,978 articles that have reached the featured article level, with the editing histories of the articles, their article talk histories, and user talk histories extracted from user talk pages for each article. The time period for analysis runs from the initiation of each article until its promotion to the featured article level. The number of edits represents the total number of times a participant took part in editing an article, and the revisiting period is the time difference between the first and last edits. First, the participation levels of each user category, classified according to the behavioral dimensions, were analyzed and compared. Then, robust regressions were conducted on the relationships between the independent variables reflecting the degree of the behavioral characteristics and the dependent variable representing the participation level. In particular, by adopting a motivational theory suited to the online environment in setting up the research hypotheses, this work suggests a theoretical framework for the participation level of online knowledge sharing. This work reached the following practical behavioral results, in addition to some theoretical implications. First, both public discussion and private messaging positively affect the participation level in knowledge sharing. Second, public discussion exerts greater influence than private messaging on the participation level. 
Third, a synergy effect of public discussion and private messaging on the number of edits was found, whereas a rather weak negative interaction effect of the two on the revisiting period was observed. Fourth, community registration has a significant impact on the revisiting period, whereas its effect on the number of edits is insignificant. Fifth, for the relations generated through private messaging, the frequency or depth of a relation is shown to be more critical than its scope for the participation level.
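
The two participation measures defined in this abstract can be illustrated with a short sketch. It assumes that each participant's edits to one article are available as a list of timestamps; the function name and the sample dates are hypothetical and are not taken from the study's dataset.

```python
from datetime import datetime, timedelta


def participation_level(edit_times):
    """Return (number_of_edits, revisiting_period) for one participant's
    edits to one article.

    number_of_edits   -- how many times the participant edited the article
    revisiting_period -- time between the participant's first and last edit
    """
    if not edit_times:
        raise ValueError("at least one edit timestamp is required")
    return len(edit_times), max(edit_times) - min(edit_times)


if __name__ == "__main__":
    # Hypothetical edit timestamps for a single participant and article.
    edits = [
        datetime(2012, 1, 1),
        datetime(2012, 1, 11),
        datetime(2012, 3, 1),
    ]
    count, period = participation_level(edits)
    print(count, period.days)
```

Two participants with the same number of edits can have very different revisiting periods, which is why the abstract treats the two measures as separate angles on participation.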

Prediction of commitment and persistence in heterosexual involvements according to the styles of loving using a datamining technique (데이터마이닝을 활용한 사랑의 형태에 따른 연인관계 몰입수준 및 관계 지속여부 예측)

  • Park, Yoon-Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.69-85
    • /
    • 2016
  • A successful relationship with a loving partner is one of the most important factors in life. In psychology, previous studies have examined the factors influencing romantic relationships. However, most of these studies were based on statistical analysis and are therefore limited in analyzing complex non-linear relationships or rule-based reasoning. This research analyzes commitment and persistence in heterosexual involvement according to styles of loving, using a datamining technique as well as statistical methods. In this research, we consider six different styles of loving, 'eros', 'ludus', 'storge', 'pragma', 'mania' and 'agape', which influence romantic relationships between lovers, in addition to the factors suggested by previous research. These six types of love are defined by Lee (1977) as follows: 'eros' is romantic, passionate love; 'ludus' is a game-playing or uncommitted love; 'storge' is a slow-developing, friendship-based love; 'pragma' is a pragmatic, practical, mutually beneficial relationship; 'mania' is an obsessive or possessive love; and, lastly, 'agape' is a gentle, caring, giving type of love, a brotherly love not concerned with the self. For this research, data from 105 heterosexual couples were collected. Using the data, a linear regression was first performed to find the important factors associated with commitment to a partner. The result shows that 'satisfaction', 'eros' and 'agape' are significant factors associated with the commitment level for both males and females. Interestingly, for males, 'agape' has a greater effect on commitment than 'eros'. For females, on the other hand, 'eros' is a more significant factor for commitment than 'agape'. In addition, the 'investment' of the male is also a crucial factor for male commitment. Next, decision tree analysis was performed to find the characteristics of high-commitment and low-commitment couples. 
To build the decision tree models in this experiment, the 'decision tree' operator in the datamining tool RapidMiner was used. The experimental result shows that males with a high satisfaction level in the relationship show a high commitment level. However, even if a male does not have a high satisfaction level, he also shows a high commitment level to the female if he has made a large financial or mental investment in the relationship and his partner shows him a certain amount of 'agape'. In the case of females, a woman with high 'eros' and 'satisfaction' levels shows a high commitment level. Otherwise, even if a female does not have a high satisfaction level, she also shows a high commitment level if her partner shows a certain amount of 'mania'. Finally, this research built a decision tree prediction model to establish whether a relationship will persist or break up. The result shows that the most important factor influencing a breakup is the 'narcissistic tendency' of the male. In addition, the 'satisfaction', 'investment' and 'mania' of both male and female also affect a breakup. Interestingly, while the 'mania' level of a male works positively to maintain the relationship, that of a female has a negative influence. The contribution of this research is the adoption of a new analysis technique, a datamining method, for psychology. In addition, the results of this research can provide useful advice to couples for building a harmonious relationship with each other. This research has several limitations. First, the experimental data was sampled using an oversampling technique to balance the size of each class, which limits how objectively the performance of the predictive models can be evaluated. Second, the outcome data, whether the relationship persisted or not, was collected over a relatively short period, 6 months after the initial data collection. Lastly, most of the survey respondents are in their 20s. 
In order to get more general results, we would like to extend this research to general populations.
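
The first limitation mentioned above concerns oversampling. The abstract does not say which oversampling technique was used; the sketch below assumes simple random duplication of minority-class records, with hypothetical function and label names, to show what class balancing does to a dataset.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class.

    Assumption: plain random oversampling with replacement; the study's
    actual technique is not specified in the abstract.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [row for row, lab in zip(rows, labels) if lab == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


if __name__ == "__main__":
    # Hypothetical couples: three that persisted, one that broke up.
    rows = ["couple_a", "couple_b", "couple_c", "couple_d"]
    labels = ["persist", "persist", "persist", "breakup"]
    balanced_rows, balanced_labels = random_oversample(rows, labels)
    print(Counter(balanced_labels))
```

If records are duplicated in this way before the data is divided into training and test sets, copies of the same couple can end up on both sides of the split, which is one plausible reason why balanced data makes objective evaluation harder.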