• Title/Summary/Keyword: Using Voice

Blind Rhythmic Source Separation (블라인드 방식의 리듬 음원 분리)

  • Kim, Min-Je;Yoo, Ji-Ho;Kang, Kyeong-Ok;Choi, Seung-Jin
    • The Journal of the Acoustical Society of Korea / v.28 no.8 / pp.697-705 / 2009
  • An unsupervised (blind) method is proposed for extracting rhythmic sources from commercial polyphonic music with a single channel. Commercial music signals are rarely provided with more than two channels, yet they often contain multiple instruments, including singing voice. Therefore, instead of conventional modeling of mixing environments or statistical characteristics, other source-specific characteristics are needed to separate or extract sources in such underdetermined environments. In this paper, we concentrate on extracting rhythmic sources from a mixture that also contains harmonic sources. An extension of nonnegative matrix factorization (NMF), called nonnegative matrix partial co-factorization (NMPCF), is used to analyze multiple relationships between spectral and temporal properties in the given input matrices. Moreover, the temporal repeatability of rhythmic sound sources is treated as a common rhythmic property among segments of an input mixture signal. The proposed method shows acceptable, though not superior, separation quality compared with the referenced drum source separation systems based on prior knowledge, but it is more widely applicable because it is blind, for example when no prior information is available or the target rhythmic source is irregular.
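
As a rough illustration of the NMF building block named in the abstract, the sketch below factorizes a small nonnegative matrix V into W H. It uses the standard multiplicative updates for the squared Euclidean error, not the paper's NMPCF extension. The toy matrix, the rank, the iteration count, and all function names are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Plain NMF with multiplicative updates (hypothetical helper, not NMPCF)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initial factors.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy "spectrogram": 4 frequency bins x 6 frames, built from two nonnegative parts.
V = [
    [1, 0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 1],
]
W0, H0 = nmf(V, rank=2, iterations=0)  # random initial factors only
W, H = nmf(V, rank=2, iterations=200)  # factors after the updates
print(frobenius_error(V, W0, H0), frobenius_error(V, W, H))
```

In this reading, the columns of W act as spectral templates and the rows of H as their activations over time. A co-factorization would additionally share some factors across several input matrices, which this sketch does not attempt.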

A Study on Effects of the vocal psychotherapy upon Self-Consciousness (성악심리치료활동을 통한 자기의식 변화에 관한 연구)

  • Lee, Hyun Joo
    • Journal of Music and Human Behavior / v.4 no.2 / pp.66-83 / 2007
  • This study examines both the effects of vocal psychotherapy on self-consciousness and, conversely, how differences in self-consciousness shape the experience of vocal psychotherapy. The study was conducted with three subjects, all students at E University in Seoul, over ten sessions of sixty minutes each. All subjects volunteered in response to a recruitment notice for a music therapy program posted on the E University website. The vocal psychotherapy program consisted of four stages, each divided into two to four short terms. A self-consciousness examination was administered before and after the experiment to identify changes in the subjects' self-consciousness attributable to the vocal psychotherapy activity. After every short term, the subjects wrote reports, which were used to analyze in detail how self-consciousness changed across terms and across subjects. The first finding concerns the effect of the vocal psychotherapy activity on scores in the self-consciousness examination. Individual differences appeared in the total scores and in the scores of some sub-categories. In particular, scores for private self-consciousness, public self-consciousness, and social anxiety differed before and after the program. Subject A, who had the highest private self-consciousness score of the three, showed the steepest decrease on that sub-scale. In contrast, her public self-consciousness and social anxiety scores decreased only slightly. Subject B, who had the highest public self-consciousness score of the three, showed her steepest decrease on that sub-scale and no change in social anxiety. 
For subject C, whose private and public self-consciousness scores were lower than those of the others, the private self-consciousness score increased while the public self-consciousness and social anxiety scores decreased. Item-level score changes were also examined to detect changes not visible in the total and sub-category scores. Across the twenty-eight questions, scores changed by about one to two points. Subject A changed on thirteen questions, subject B on sixteen, and subject C on nineteen. Subject C's rate of change was relatively small, but more of her questions changed and the range of score changes was wider than for the others. Taken together, these results suggest that vocal psychotherapy affects sub-category scores in the self-consciousness examination. The second finding concerns the changes in awareness revealed in the subjects' reports after each short term of the program. Close analysis showed that, depending on the term and on the subject's self-consciousness profile, subjects differed in how they perceived their own voices and their ways of expressing themselves by voice. They also differed in how much they were concerned about others and why. According to the self-reports, subject A paid close attention to her inner thoughts and emotions and tended to focus on herself as a social object. Her reports showed positive emotional experiences, such as emotional richness and artistic curiosity, alongside negative emotions such as state-trait anxiety and neuroticism. Subject B, who like subject A scored high on private and public self-consciousness, showed a similar tendency to focus on herself as a social object but showed more social anxiety than subject A. 
Subject C scored relatively low on the self-consciousness examination, tended to attend to herself, and had fewer negative emotions such as state-trait anxiety than the other subjects. As the terms progressed, she also showed changes in how she attended to her own voice and to others. This study is significant in that it helps people whose problems stem from self-evaluation triggered by self-consciousness, using the voice, which is closely tied to one's sense of self, together with vocal skills training to address technical problems. It also suggests that vocal psychotherapy can be used effectively to support mental health and personality development.

Mature Market Sub-segmentation and Its Evaluation by the Degree of Homogeneity (동질도 평가를 통한 실버세대 세분군 분류 및 평가)

  • Bae, Jae-ho
    • Journal of Distribution Science / v.8 no.3 / pp.27-35 / 2010
  • As the population, buying power, and intensity of self-expression of the elderly generation increase, its importance as a market segment is also growing. Therefore, the mass marketing strategy for the elderly generation must be changed to a micro-marketing strategy based on the results of sub-segmentation that suitably captures the characteristics of this generation. Furthermore, as a customer access strategy is decided by sub-segmentation, proper segmentation is one of the key success factors for micro-marketing. Segments or sub-segments are different from sectors, because segmentation or sub-segmentation for micro-marketing is based on the homogeneity of customer needs. Theoretically, complete segmentation would reveal a single voice. However, it is impossible to achieve complete segmentation because of economic factors, factors that affect effectiveness, etc. To obtain a single voice from a segment, we sometimes need to divide it into many individual cases. In such a case, there would be many segments to deal with. On the other hand, to maximize market access performance, fewer segments are preferred. In this paper, we use the term "sub-segmentation" instead of "segmentation," because we divide a specific segment into more detailed segments. To sub-segment the elderly generation, this paper takes their lifestyles and life stages into consideration. In order to reflect these aspects, various surveys and several rounds of expert interviews and focus group interviews (FGIs) were performed. Using the results of these qualitative surveys, we define six sub-segments of the elderly generation. This paper uses five rules to divide the elderly generation: (1) mutually exclusive and collectively exhaustive (MECE) sub-segmentation, (2) important life stages, (3) notable lifestyles, (4) a minimum number of easily classifiable sub-segments, and (5) significant differences in voices among the sub-segments. 
The most critical criterion for dividing the elderly market is whether children are married. The other criteria are source of income, gender, and occupation. In this paper, the elderly market is divided into six sub-segments. As mentioned, the number of sub-segments is a key factor in a successful marketing approach. Too many sub-segments would lead to narrow substantiality or a lack of actionability. On the other hand, too few sub-segments would have no effect. Therefore, creating the optimum number of sub-segments is a critical problem faced by marketers. This paper presents a method of evaluating the fitness of sub-segments that was deduced from the preceding surveys. The presented method uses the degree of homogeneity (DoH) to measure the adequacy of sub-segments. This measure uses quantitative survey questions to calculate adequacy. The DoH is the ratio of significantly homogeneous questions to the total number of survey questions. A significantly homogeneous question is defined as a question in which one case is selected significantly more often than the others. To show whether a case is selected significantly more often than the others, we use a hypothesis test. The null hypothesis (H0) is that there is no significant difference between the selection of one case and that of the others. Thus, the total number of significantly homogeneous questions is the number of questions for which the null hypothesis is rejected. To calculate the DoH, we conducted a quantitative survey (total sample size of 400; 60 questions; 4 to 5 cases per question). The first sub-segment (no unmarried offspring, earns a living independently) has a sample size of 113. The second sub-segment (no unmarried offspring, economically supported by offspring) has a sample size of 57. The third sub-segment (unmarried offspring, employed, male) has a sample size of 70. 
The fourth sub-segment (unmarried offspring, not employed, male) has a sample size of 45. The fifth sub-segment (unmarried offspring, female, employed, meaning either the woman herself or her husband) has a sample size of 63. The last sub-segment (unmarried offspring, female, not employed, including the husband) has a sample size of 52. Statistically, the sample size of each sub-segment is sufficiently large. Therefore, we use the z-test for testing hypotheses. At a significance level of 0.05, the DoHs of the six sub-segments are 1.00, 0.95, 0.95, 0.87, 0.93, and 1.00, respectively. At a significance level of 0.01, the DoHs of the six sub-segments are 0.95, 0.87, 0.85, 0.80, 0.88, and 0.87, respectively. These results show that the first sub-segment is the most homogeneous category, while the fourth has more variety in terms of its needs. If the sample size is sufficiently large, further segmentation within a given sub-segment would be preferable. However, as the fourth sub-segment is smaller than the others, more detailed segmentation was not pursued. A critical point for a successful micro-marketing strategy is measuring the fit of a sub-segment. However, until now, there have been no robust rules for measuring fit. This paper presents a method of evaluating the fit of sub-segments. This method will be very helpful for deciding the adequacy of sub-segmentation. However, it has some limitations that prevent it from being robust: (1) the method is restricted to quantitative questions; (2) it is difficult to decide which types of questions should be included in the calculation; (3) DoH values depend on how the content is formulated. Despite these limitations, this paper has presented a useful method for conducting adequate sub-segmentation. We believe that the present method can be applied widely in many areas. 
Furthermore, the results of the sub-segmentation of the elderly generation can serve as a reference for mature marketing.
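
The DoH calculation described above can be sketched as follows. The abstract states only that a z-test is used, so the specific test here is an assumption: a one-sided one-proportion z-test of the most-selected case against the uniform share 1/k. The function names and the answer counts are hypothetical, and testing the maximum in this way is a simplification that ignores multiple comparisons.

```python
from math import sqrt
from statistics import NormalDist


def is_homogeneous(counts, alpha=0.05):
    """True if the most-selected case exceeds the uniform share 1/k significantly.

    Assumed test: one-sided one-proportion z-test (not confirmed by the source).
    """
    n = sum(counts)
    k = len(counts)
    p0 = 1.0 / k                      # share expected if no case stands out
    p_hat = max(counts) / n           # observed share of the most-selected case
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 1.0 - NormalDist().cdf(z)
    return p_value < alpha


def degree_of_homogeneity(questions, alpha=0.05):
    """Ratio of significantly homogeneous questions to all questions."""
    flags = [is_homogeneous(counts, alpha) for counts in questions]
    return sum(flags) / len(flags)


# Hypothetical answer counts for three 4-case questions in one sub-segment (n = 57).
questions = [
    [40, 5, 5, 7],     # one case clearly dominates
    [15, 14, 14, 14],  # nearly uniform answers
    [30, 9, 9, 9],     # one case dominates
]
doh = degree_of_homogeneity(questions, alpha=0.05)
print(round(doh, 2))
```

With these made-up counts, two of the three questions reject the null hypothesis, so the DoH is 2/3. A stricter significance level can only keep or lower that ratio, which matches the direction of the reported values at 0.05 and 0.01.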

The Audience Behavior-based Emotion Prediction Model for Personalized Service (고객 맞춤형 서비스를 위한 관객 행동 기반 감정예측모형)

  • Ryoo, Eun Chung;Ahn, Hyunchul;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.73-85 / 2013
  • In today's information society, knowledge services that use information to create value are becoming increasingly important. In addition, with the development of IT, it has become easy to collect and use information, and many companies in a variety of industries actively use customer information for marketing. Since the start of the 21st century, companies have been actively using culture and the arts to manage their corporate image and to support marketing closely linked to their commercial interests. However, it is difficult for companies to attract or maintain consumers' interest through technology alone. For that reason, many firms now carry out cultural activities as a tool for differentiation. Many firms have also used customer experience as a new marketing strategy in order to respond effectively to competitive markets. Accordingly, the need is rapidly emerging for personalized services that provide people with new experiences based on personal profile information containing the characteristics of the individual. Personalized services that use a customer's individual profile information, such as language, symbols, behavior, and emotions, are therefore very important today. Through such services, we can assess the interaction between people and content and maximize customer experience and satisfaction. Various related studies aim to provide customer-centered services, and emotion recognition research in particular has emerged recently. Existing studies have mostly performed emotion recognition using bio-signals, and most have focused on voice and facial expressions, which change markedly with emotion. However, limitations of equipment and service environments make it difficult to predict people's emotions. In this paper, therefore, we develop an emotion prediction model based on a vision-based interface to overcome these limitations. Emotion recognition based on people's gestures and postures has been studied by several researchers. 
This paper develops a model that recognizes people's emotional states through body gesture and posture using the difference image method, and identifies the best-performing model for predicting four kinds of emotion. The proposed model aims to automatically determine and predict four human emotions (sadness, surprise, joy, and disgust). To build the model, an event booth was installed in the KOCCA lobby, and visitors were shown suitably stimulating movie clips so that their body gestures and postures could be collected as their emotions changed. We then extracted body movements using the difference image method and refined the collected data to build the proposed model with a neural network. Three time-frame sets (20 frames, 30 frames, and 40 frames) were used, and the model with the best performance among them was adopted. Before building the three kinds of models, the full set of 97 data items was divided into learning, test, and validation sets. The proposed emotion prediction model was constructed using an artificial neural network. We used the back-propagation algorithm as the learning method, with the learning rate set to 10% and the momentum rate set to 10%. The sigmoid function was used as the transfer function. We designed a three-layer perceptron neural network with one hidden layer and four output nodes. Based on the test data set, learning was stopped when it reached 50,000 after reaching the minimum error, in order to explore the stopping point of learning. Finally, we computed each model's accuracy and identified the best model for predicting each emotion. The 20-frame model achieved a prediction accuracy of 100% for sadness and 96% for joy, and the 30-frame model achieved 88% for surprise and 98% for disgust. The findings of our research are expected to be useful in providing an effective algorithm for personalized services in various industries such as advertising, exhibitions, and performances.
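
A minimal sketch of the difference image idea mentioned above: consecutive grayscale frames are subtracted pixel by pixel, and the share of pixels that changed serves as a simple movement feature. The threshold, the frame format (lists of pixel rows), and the function names are assumptions; the abstract does not describe how the authors turned difference images into network inputs.

```python
def difference_image(prev_frame, next_frame):
    """Absolute per-pixel difference between two equally sized grayscale frames."""
    return [
        [abs(b - a) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(prev_frame, next_frame)
    ]


def motion_ratio(prev_frame, next_frame, threshold=30):
    """Share of pixels whose intensity changed by more than the threshold."""
    diff = difference_image(prev_frame, next_frame)
    pixels = [p for row in diff for p in row]
    return sum(p > threshold for p in pixels) / len(pixels)


def motion_features(frames, threshold=30):
    """One movement value per pair of consecutive frames."""
    return [motion_ratio(a, b, threshold) for a, b in zip(frames, frames[1:])]


# Three tiny 2x2 frames: one pixel brightens between the first two, then nothing moves.
frames = [
    [[0, 0], [0, 0]],
    [[0, 100], [0, 0]],
    [[0, 100], [0, 0]],
]
features = motion_features(frames)
print(features)  # [0.25, 0.0]
```

A window of 20, 30, or 40 frames would yield 19, 29, or 39 such values under this scheme, which is one plausible way to form a fixed-length input vector.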

Increasing Accuracy of Stock Price Pattern Prediction through Data Augmentation for Deep Learning (데이터 증강을 통한 딥러닝 기반 주가 패턴 예측 정확도 향상 방안)

  • Kim, Youngjun;Kim, Yeojeong;Lee, Insun;Lee, Hong Joo
    • The Journal of Bigdata / v.4 no.2 / pp.1-12 / 2019
  • As artificial intelligence (AI) technology develops, it is being applied to various fields such as image, voice, and text processing. AI has shown good results in certain areas. Researchers have also tried to predict the stock market using AI. Predicting the stock market is known to be a difficult problem because the market is affected by various factors such as the economy and politics. In the field of AI, there are attempts to predict the ups and downs of stock prices by studying stock price patterns using various machine learning techniques. This study suggests a way of predicting stock price patterns based on the convolutional neural network (CNN). A CNN classifies images by extracting features from them through convolutional layers. Therefore, this study classifies candlestick images made from stock data in order to predict patterns. This study has two objectives. The first, referred to as Case 1, is to predict patterns from images made with same-day stock price data. The second, referred to as Case 2, is to predict the next day's stock price patterns from images produced with daily stock price data. In Case 1, two data augmentation methods, random modification and Gaussian noise, are applied to generate more training data, and the generated images are fed into the model for training. Given that deep learning requires a large amount of data, this study suggests a data augmentation method for candlestick images. This study also compares the accuracies obtained for images with Gaussian noise and for different classification problems. All data in this study were collected through the OpenAPI provided by DaiShin Securities. Case 1 has five labels depending on the pattern: up with up closing, up with down closing, down with up closing, down with down closing, and staying. 
The images in Case 1 are created by removing the last candle (-1candle), the last two candles (-2candles), or the last three candles (-3candles) from 60-minute, 30-minute, 10-minute, and 5-minute candle charts. In a 60-minute candle chart, each candle in the image carries 60 minutes of information: an open price, high price, low price, and close price. Case 2 has two labels, up and down. For Case 2, images were generated from 60-minute, 30-minute, 10-minute, and 5-minute candle charts without removing any candle. Considering the nature of stock data, moving the candles in the images is suggested instead of existing data augmentation techniques. The amount by which the candles are moved is defined as the modified value. The average difference in closing prices between candles was 0.0029. Therefore, modified values of 0.003, 0.002, 0.001, and 0.00025 are used in this study. The number of images doubled after data augmentation. For the Gaussian noise, the mean was 0 and the variance was 0.01. For both Case 1 and Case 2, the model is based on VGG-Net16, which has 16 layers. Among the 60-minute, 30-minute, 10-minute, and 5-minute candle charts, the 10-minute -1candle images showed the best accuracy. Thus, 10-minute images were used for the rest of the experiment in Case 1. The images with three candles removed were selected for data augmentation and the application of Gaussian noise. The 10-minute -3candle images resulted in 79.72% accuracy. The accuracy of the images with a modified value of 0.00025 and 100% of candles changed was 79.92%. Applying Gaussian noise raised the accuracy to 80.98%. In Case 2, 60-minute candle charts predicted the next day's patterns with 82.60% accuracy. To sum up, this study is expected to contribute to further studies on the prediction of stock price patterns using images, and it provides a possible method for data augmentation of stock data.
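
The two augmentation steps can be sketched roughly as below. Several details are assumptions, since the abstract does not spell them out: the modified value is treated as a relative shift applied to a whole candle in a random direction, pixel values are taken to lie in [0, 1], and the doubling of the data is read as keeping each original alongside one shifted copy. All function names are hypothetical.

```python
import random


def shift_candle(candle, modified_value, rng):
    """Move one (open, high, low, close) candle up or down by a relative amount."""
    direction = rng.choice((-1, 1))
    factor = 1 + direction * modified_value
    return tuple(price * factor for price in candle)


def augment_candles(candles, modified_value, seed=0):
    """Return one shifted copy per candle; originals plus copies double the data."""
    rng = random.Random(seed)
    return [shift_candle(c, modified_value, rng) for c in candles]


def add_gaussian_noise(pixels, variance=0.01, seed=0):
    """Add zero-mean Gaussian noise to pixel values in [0, 1], then clip."""
    rng = random.Random(seed)
    sigma = variance ** 0.5  # variance 0.01 corresponds to a standard deviation of 0.1
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


candles = [(100.0, 101.0, 99.0, 100.5), (100.5, 102.0, 100.0, 101.5)]
shifted = augment_candles(candles, modified_value=0.003)
noisy = add_gaussian_noise([0.5] * 1000)
print(shifted[0], sum(noisy) / len(noisy))
```

Shifting all four prices by the same factor keeps the candle's shape intact, which is one reason such a shift may suit candlestick images better than flips or rotations that would change the meaning of the chart.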

Applying Social Strategies for Breakdown Situations of Conversational Agents: A Case Study using Forewarning and Apology (대화형 에이전트의 오류 상황에서 사회적 전략 적용: 사전 양해와 사과를 이용한 사례 연구)

  • Lee, Yoomi;Park, Sunjeong;Suk, Hyeon-Jeong
    • Science of Emotion and Sensibility / v.21 no.1 / pp.59-70 / 2018
  • With the breakthrough of speech recognition technology, conversational agents have become pervasive through smartphones and smart speakers. The accuracy of speech recognition has reached human level, but the technology is still limited in understanding the underlying meaning or intention of words and in following long conversations. Accordingly, users experience various errors when interacting with conversational agents, which may negatively affect the user experience. In addition, for smart speakers, whose main interface is voice, the lack of system feedback and transparency has been reported as the main issue users face. Therefore, there is a strong need for research on how users can better understand the capabilities of conversational agents and how negative emotions in error situations can be mitigated. In this study, we applied two social strategies, "forewarning" and "apology," to a conversational agent and investigated how these strategies affect users' perceptions of the agent in breakdown situations. For the study, we created a series of demo videos of a user interacting with a conversational agent. After watching the demo videos, participants evaluated how much they liked and trusted the agent in an online survey. Responses from a total of 104 respondents were analyzed, and the results were contrary to our expectations based on the literature. Forewarning gave users a negative impression, especially of the agent's reliability. Apology in a breakdown situation did not affect users' perceptions. In follow-up in-depth interviews, participants explained that they perceived the smart speaker as a machine rather than a human-like object and that, for this reason, the social strategies did not work. These results show that social strategies should be applied according to the perceptions that users have of agents.

The Clinical Efficacy of Uvulopalatopharyngoplasty in the Treatment of Obstructive Sleep Apnea Syndrome (폐쇄성 수면 무호흡 증후군 치료에서 구개수구개인두성형술의 임상적 유용성)

  • Moon, Hwa-Sik;Choi, Young-Mee;Park, Young-Hak;Kim, Young-Kyoon;Kim, Kwan-Hyoung;Song, Jeong-Sup;Park, Sung-Hak
    • Tuberculosis and Respiratory Diseases / v.44 no.6 / pp.1366-1381 / 1997
  • Background: Uvulopalatopharyngoplasty (UPPP) has become the most common surgical treatment for obstructive sleep apnea syndrome (OSAS). However, the results of this therapeutic modality have been quite variable, with some authors reporting successful results and others poor results. Until recently, there have been only a few reports in Korea on the clinical efficacy of UPPP. A prospective study was undertaken to evaluate the effectiveness and complications of UPPP. Method: Twenty-six OSAS patients who had undergone UPPP with preoperative and postoperative polysomnographic studies were included in this study. Two definitions of surgical success were used. A responder was defined, using conventional criteria, as a patient with a 50% or greater reduction in apnea index (AI) or apnea-hypopnea index (AHI) after UPPP, or a postoperative AI of <10 or AHI of <20. Initial cure was defined, using our own criteria, as a postoperative AI of <5 or AHI of <10. Complications were categorized into two groups: early (disorders during the first 10 postoperative days) and late. Results: Eighteen patients (69.2%) were responders, and ten patients (38.5%) were classified as initial cures. On the other hand, in five patients (19.2%), postoperative polysomnographic data showed deterioration compared with preoperative data. The reduction rate of AI or AHI following UPPP was not significantly related to preoperative body mass index, AI, or AHI. There was no significant change in sleep architecture before and after UPPP in the responder and initial cure groups. Early complications such as pain, dyspnea, bleeding, nasal reflux, dysphagia, or wound disruption were observed in all patients. Late complications such as nasal reflux, voice change, dysphagia, loss of taste, pharyngeal dryness, or foreign body sensation were found in 22 patients (84.6%). However, all early and late complications were of minor importance. Conclusion: The response to UPPP was favorable in approximately 70% of OSAS patients. 
However, the initial cure rate of UPPP was relatively low. We suggest that the selection of more appropriate surgical candidates and an adequate surgical protocol are necessary to obtain more successful results with UPPP.
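
The two success definitions can be written out as simple predicates, which makes the thresholds easier to compare. This sketch reads every "or" in the abstract literally, so any one condition is enough; that reading, the function names, and the example patient values are assumptions. The final lines only recompute the reported percentages from the counts given for the 26 patients.

```python
def is_responder(pre_ai, post_ai, pre_ahi, post_ahi):
    """Conventional criteria as worded in the abstract, read as a plain disjunction."""
    ai_halved = post_ai <= 0.5 * pre_ai      # 50% or greater reduction in AI
    ahi_halved = post_ahi <= 0.5 * pre_ahi   # 50% or greater reduction in AHI
    return ai_halved or ahi_halved or post_ai < 10 or post_ahi < 20


def is_initial_cure(post_ai, post_ahi):
    """The authors' stricter criteria: postoperative AI < 5 or AHI < 10."""
    return post_ai < 5 or post_ahi < 10


def percent(count, total=26):
    """Share of the 26 study patients, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical patient: AI falls from 40 to 18, AHI from 60 to 45.
print(is_responder(40, 18, 60, 45), is_initial_cure(18, 45))

# Counts reported in the abstract, recomputed as percentages.
print(percent(18), percent(10), percent(5), percent(22))
```

The hypothetical patient counts as a responder because AI more than halved, yet is not an initial cure, which illustrates how the stricter definition can produce the lower rate reported above.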

A study of Artificial Intelligence (AI) Speaker's Development Process in Terms of Social Constructivism: Focused on the Products and Periodic Co-revolution Process (인공지능(AI) 스피커에 대한 사회구성 차원의 발달과정 연구: 제품과 시기별 공진화 과정을 중심으로)

  • Cha, Hyeon-ju;Kweon, Sang-hee
    • Journal of Internet Computing and Services / v.22 no.1 / pp.109-135 / 2021
  • This study classified the development process of artificial intelligence (AI) speakers by analyzing news texts about AI speakers in traditional news reports, and identified the characteristics of each product by period. The theoretical background of the analysis consists of news frames and topic frames. Topic modeling using the LDA method and semantic network analysis were used as analysis methods, within a content analysis research design. First, 2,710 news items related to AI speakers from 2014 to 2019 were collected; second, topic frames were analyzed using the NodeXL algorithm. The results are as follows. First, the trend of topic frames by AI speaker provider type differed according to the characteristics of the four operator types (communication service provider, online platform, OS provider, and IT device manufacturer). Specifically, for online platform operators (Google, Naver, Amazon, Kakao), the frame that emerged treats AI speakers as "search or input devices." On the other hand, telecommunications operators (SKT, KT) showed prominent frames for IPTV, which is the parent company's flagship business, and for the "auxiliary device" of the telecommunication business. Furthermore, the frame of "personalization of products and voice service" was prominent for OS operators (MS, Apple), and the frame for IT device manufacturers (Samsung) was "Internet of Things (IoT) Integrated Intelligence System." Second, the trend of topic frames by AI speaker development period (by year) centered on AI technology in the first phase (2014-2016), was related to social relationships and interaction between AI technology and users in the second phase (2017-2018), and shifted from AI technology-centered to user-centered in the third phase (2019). 
QAP analysis showed that news frames by business operator and development period in AI speaker development are socially constructed by determinants of media discourse. One implication of this study is that the evolution of AI speakers was shaped by the characteristics of the parent company and by a process of co-evolution arising from interactions with users, by business operator and development period. Another is that the results provide important indicators for predicting the future prospects of AI speakers and suggesting directions accordingly.

Anura Call Monitoring Data Collection and Quality Management through Citizen Participation (시민참여형 무미목 양서류 음성신호 수집 및 품질관리 방안)

  • Kyeong-Tae Kim;Hyun-Jung Lee;Won-Kyong Song
    • Korean Journal of Environment and Ecology / v.38 no.3 / pp.230-245 / 2024
  • Amphibians, sensitive to external environmental changes, serve as bioindicator species for assessing alterations or disturbances in local ecosystems. It is known that one-third of amphibian species within the order Anura are at risk of extinction due to anthropogenic threats such as habitat destruction and fragmentation caused by urbanization. To develop effective protection and conservation strategies for anuran amphibians, species surveys that account for population characteristics are essential. This study aimed to investigate the potential for citizen participation in ecological monitoring using the mating calls of anuran species. We also proposed suitable quality control measures to mitigate errors and biases, ensuring the extraction of reliable species occurrence data. The citizen science project was carried out nationwide from April 1 to August 31, 2022, targeting 12 species of anuran amphibians in Korea. Citizens voluntarily participated in voice signal monitoring, in which they listened to the mating calls of anuran species and recorded them using a mobile application. Additionally, we established a quality control process to extract reliable species occurrence data, categorizing errors and biases in citizen-collected data into three categories: omission, commission, and incorrect identification. A total of 6,808 observations were collected through citizen participation in the vocalization monitoring. Through the quality control process, errors and biases were identified in 1,944 (28.55%) of the 6,808 observations. The most common type of error was omission, accounting for 922 cases (47.43%), followed by incorrect identification with 540 cases (27.78%) and commission with 482 cases (24.79%). During the citizen science project, we successfully recorded the mating calls of 10 of the 12 anuran amphibian species in Korea, the exceptions being the Asian toad (Bufo gargarizans Cantor) and the Korean brown frog (Rana coreana). 
Difficulties in collecting mating calls were primarily attributed to challenges in observing due to population decline or discrepancies between the breeding season of non-emergent individuals and the timing of the citizen science project. This study represents the first investigation of distribution status and species emergence data collection through mating calls of anura species in Korea based on citizen participation. It can serve as a foundation for designing future bioacoustic monitoring that incorporates citizen science and quality control measures for citizen science data.
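
As a quick consistency check on the quality-control figures, the short sketch below recomputes the reported shares from the raw counts. The counts come from the abstract; the variable names are only illustrative. Note that the three per-type percentages are shares of the 1,944 flagged observations, not of all 6,808.

```python
total_observations = 6808
errors = {
    "omission": 922,
    "incorrect identification": 540,
    "commission": 482,
}

# Observations flagged by quality control, and their share of all observations.
flagged = sum(errors.values())
flagged_share = round(100 * flagged / total_observations, 2)

# Share of each error type among the flagged observations.
error_shares = {name: round(100 * n / flagged, 2) for name, n in errors.items()}

print(flagged, flagged_share)
print(error_shares)
```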

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae;Lee, Bomi;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.95-108 / 2017
  • Recently, AlphaGo, the Go (Baduk) artificial intelligence program developed by Google DeepMind, won a decisive victory against Lee Sedol. Many people thought that machines would not be able to beat a human at Go because, unlike in chess, the number of possible paths for each move exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence was highlighted as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning has attracted attention as the core artificial intelligence technology used in the AlphaGo algorithm. Deep learning is already being applied to many problems and performs especially well in image recognition. It also performs well on high-dimensional data such as voice, images, and natural language, for which it was difficult to achieve good performance with existing machine learning techniques. In contrast, however, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we examine whether the deep learning techniques studied so far can be used not only for the recognition of high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction. We also compare the performance of the deep learning techniques with that of traditional artificial neural network models. The experimental data are the telemarketing response data of a bank in Portugal. The data set has input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable that records whether the customer intends to open an account. 
In this study, to evaluate whether deep learning algorithms and techniques can be used for binary classification, we compared the performance of various models using CNN, LSTM, and dropout, which are widely used in deep learning, with that of MLP models, a traditional artificial neural network model. However, since not all network design alternatives can be tested, owing to the nature of artificial neural networks, the experiment was conducted under restricted settings for the number of hidden layers, the number of neurons in each hidden layer, the number of output data (filters), and the conditions for applying dropout. The F1 score was used instead of overall accuracy to evaluate the models, to show how well they classify the class of interest. The detailed methods for applying each deep learning technique in the experiment are as follows. The CNN algorithm reads the values adjacent to a specific value and recognizes features, but the distance between business data fields does not matter because each field is usually independent. In this experiment, we set the filter size of the CNN to the number of fields so that it learns the characteristics of the whole record at once, and added a hidden layer to make decisions based on the additional features. For the model with two LSTM layers, the input direction of the second layer is reversed relative to the first layer to reduce the influence of each field's position. For dropout, we set neurons to be dropped with a probability of 0.5 in each hidden layer. The experimental results show that the prediction model with the highest F1 score was the CNN model using dropout, and the next best was the MLP model with two hidden layers using dropout.
We obtained several findings from the experiment. First, models using dropout make slightly more conservative predictions than those without it and generally show better classification performance. Second, CNN models show better classification performance than MLP models. This is interesting because the CNN performed well on a binary classification problem, to which it has rarely been applied, as well as in the fields where its effectiveness has been proven. Third, the LSTM algorithm seems unsuitable for binary classification problems because the training time is too long relative to the performance improvement. From these results, we can confirm that some deep learning algorithms can be applied to solve business binary classification problems.
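
The abstract above evaluates models by F1 score rather than overall accuracy. As a minimal sketch of why that choice matters when the class of interest is rare, the Python below computes both metrics on a small set of labels. The function names `f1_score` and `accuracy` and the ten-customer sample are hypothetical illustrations, not data or code from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the class of interest: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly, regardless of class."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical sample: 10 customers, of whom 2 respond (label 1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_no = [0] * 10                        # never predicts a response
model = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]      # finds one of the two responders

for name, y_pred in [("always_no", always_no), ("model", model)]:
    print(name, accuracy(y_true, y_pred), f1_score(y_true, y_pred))
```

In this sample both predictors reach the same accuracy, yet only the second one identifies any responder, which the F1 score reflects and accuracy does not.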