• Title/Summary/Keyword: Social media analysis

The Identification of Females Fans Identify with the Male Beauty Influencers in SNS - Focusing on Jacques Lacan's Gaze (SNS에 남성 뷰티 인플루언서를 향한 여성 팬의 동일시 - 라캉의 응시 이론을 중심으로)

  • LI LINGJIE
    • Trans-
    • /
    • v.15
    • /
    • pp.57-79
    • /
    • 2023
  • This study aims to explore the strategies and effects of SNS images used by four popular male beauty influencers to gain identification with their female fans. The research selected four male beauty influencers, namely Li Jiaqi, Jeffree Star, James Charles, and Bretman Rock, with a high number of subscribers on Instagram, YouTube, and TikTok as of July 21, 2023. By observing the content they posted on SNS, the study analyzed the types, characteristics, and relevance of male beauty influencer images with their female fans using Lacan's gaze theory. Additionally, concepts related to gaze, such as the mirror stage, the screen, and objet petit a, were supplemented to conduct an in-depth analysis of the characteristics of male beauty influencer images and the motivations of female viewers. The study results suggest that male beauty influencers can maintain an intimate relationship, referred to as 'girl-friendship,' with their female fans through the identification formed by the homogeneity within the feminized mirror images. Furthermore, male beauty influencers can transform female viewers from being seen as objects to seeing them as subjects by presenting images that embrace diversity in gender identity, challenging the traditional notions of societal gender norms. Therefore, the images of male beauty influencers not only challenge gender stereotypes but also cater to the demands for independence and equality of modern young women, promote understanding of feminine gaze, and explore the potential for democratization and inclusivity on social media platforms from a new perspective.

Generating Sponsored Blog Texts through Fine-Tuning of Korean LLMs (한국어 언어모델 파인튜닝을 통한 협찬 블로그 텍스트 생성)

  • Bo Kyeong Kim;Jae Yeon Byun;Kyung-Ae Cha
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.29 no.3
    • /
    • pp.1-12
    • /
    • 2024
  • In this paper, we fine-tuned KoAlpaca, a large-scale Korean language model, and implemented a blog text generation system utilizing it. Blogs on social media platforms are widely used as a marketing tool for businesses. We constructed training data of positive reviews through emotion analysis and refinement of collected sponsored blog texts and applied QLoRA for the lightweight training of KoAlpaca. QLoRA is a fine-tuning approach that significantly reduces the memory usage required for training, with experiments in an environment with a parameter size of 12.8B showing up to a 58.8% decrease in memory usage compared to LoRA. To evaluate the generative performance of the fine-tuned model, texts generated from 100 inputs not included in the training data produced on average more than twice the number of words compared to the pre-trained model, with texts of positive sentiment also appearing more than twice as often. In a survey conducted for qualitative evaluation of generative performance, responses indicated that the fine-tuned model's generated outputs were more relevant to the given topics on average 77.5% of the time. This demonstrates that the positive review generation language model for sponsored content in this paper can enhance the efficiency of time management for content creation and ensure consistent marketing effects. However, to reduce the generation of content that deviates from the category of positive reviews due to elements of the pre-trained model, we plan to proceed with fine-tuning using the augmentation of training data.
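As a rough illustration of why quantized fine-tuning reduces memory, the sketch below computes the storage needed for the base model weights alone at two precisions. It assumes 16-bit weights for the LoRA baseline and 4-bit quantized weights for QLoRA; the abstract does not state either setting. Total training memory also includes activations, adapter weights, and optimizer state, so this weights-only figure is not expected to match the 58.8% reduction reported above.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory needed to store the model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

# 12.8B parameters, as stated in the abstract.
NUM_PARAMS = 12.8e9

# Assumption: 16-bit base weights for the LoRA baseline.
fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
# Assumption: 4-bit quantized base weights for QLoRA.
int4_gb = weight_memory_gb(NUM_PARAMS, 4)

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
```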

Stock-Index Invest Model Using News Big Data Opinion Mining (뉴스와 주가 : 빅데이터 감성분석을 통한 지능형 투자의사결정모형)

  • Kim, Yoo-Sin;Kim, Nam-Gyu;Jeong, Seung-Ryul
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.143-156
    • /
    • 2012
  • People readily believe that news and stock indices are closely related. They think that obtaining news before anyone else can help them forecast stock prices and earn large profits, or at least capture an investment opportunity. However, it is not easy to determine how closely the two are related, to make investment decisions based on news, or to verify that such investment information is valid. If the significance of news and its impact on the stock market are analyzed, it becomes possible to extract information that can support investment decisions. In reality, however, the world is inundated with a massive volume of news in real time, and news is unstructured text. This study proposes a stock-index investment model based on "News Big Data" opinion mining that systematically collects, categorizes, and analyzes news and produces investment information. To verify the validity of the model, the relationship between the results of news opinion mining and the stock index was analyzed empirically using statistical methods. The mining process that converts news into information for investment decision making consists of the following steps. First, news is received from a news provider that collects it in real time, and its information is indexed. Not only the content of the news but also information such as the media outlet, time, and news type is collected and classified, and then reworked into variables from which investment decisions can be inferred. Second, the text of the news content is split into morphemes to extract words that can indicate polarity, and each word is tagged as positive or negative by comparing it with a sentiment dictionary. Third, the positive/negative polarity of each news item is judged using the indexed classification information and a scoring rule, and the final investment decision information is derived according to daily scoring criteria. 
For this study, the KOSPI index and its fluctuation range were collected for the 63 days on which the Korea Exchange was open during the three months from July to September 2011, and news data were collected by parsing 766 articles from economic news outlet "M" among the articles carried under stock information > news > main news on the portal site Naver.com. Over these three months, the stock price index rose on 33 days and fell on 30 days, and the news content comprised 197 articles published before the market opened, 385 during the session, and 184 after the market closed. Mining the collected news and comparing the results with stock prices showed that the positive/negative opinion of news content had a significant relationship with the stock price, and that changes in the stock price index were explained better when news opinion was expressed as a positive/negative ratio rather than as a simple binary judgment of positive or negative. To check whether news affected stock price fluctuations, or at least preceded them, the change in stock price was also compared only with news published before the market opened, and this relationship was likewise statistically significant. In addition, because news covers various types and topics, such as social, economic, and overseas news, corporate earnings, industry conditions, market outlook, and current market conditions, the influence on the stock market and the significance of the relationship were expected to differ by news type. Each type of news was therefore compared with stock price fluctuations, and the results showed that news on market conditions, outlook, and overseas news was the most useful for explaining stock price fluctuations. 
In contrast, news about individual companies was not statistically significant, and its opinion mining value tended to move opposite to the stock price; a possible reason is the appearance of promotional and planned news intended to keep the stock price from falling. Finally, multiple regression analysis and logistic regression analysis were carried out to derive an investment decision function based on the relationship between the positive/negative opinion of news and the stock price. The results showed that the regression equation using variables for market conditions, outlook, and overseas news published before the market opened was statistically significant, and the classification accuracy of the logistic regression was 70.0% for rises in the stock price, 78.8% for falls, and 74.6% on average. This study first analyzed the relationship between news and stock prices by analyzing and quantifying the sentiment of unstructured news content using opinion mining, one of the big data analysis techniques, and then proposed and verified an intelligent investment decision-making model that can systematically carry out opinion mining and derive and support investment information. This shows that news can be used as a variable to predict the stock price index for investment, and the model is expected to be usable as a real investment support system if it is implemented as a system and verified in the future.
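The dictionary-based tagging and positive/negative ratio described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the sentiment dictionary, the whitespace tokenizer (a stand-in for Korean morpheme analysis), and the function names are all hypothetical, and the paper's actual scoring rule is not given in the abstract.

```python
from collections import defaultdict

# Hypothetical sentiment dictionary: word -> +1 (positive) or -1 (negative).
SENTIMENT_DICT = {
    "surge": 1, "rally": 1, "growth": 1,
    "fall": -1, "crisis": -1, "loss": -1,
}

def article_polarity(text):
    """Count positive and negative dictionary words in one article."""
    tokens = text.lower().split()  # stand-in for morpheme analysis
    pos = sum(1 for t in tokens if SENTIMENT_DICT.get(t, 0) > 0)
    neg = sum(1 for t in tokens if SENTIMENT_DICT.get(t, 0) < 0)
    return pos, neg

def daily_positive_ratio(articles):
    """articles: iterable of (date, text) pairs.

    Returns {date: positive / (positive + negative)}, or None for a day
    on which no dictionary word appears.
    """
    counts = defaultdict(lambda: [0, 0])
    for date, text in articles:
        pos, neg = article_polarity(text)
        counts[date][0] += pos
        counts[date][1] += neg
    ratios = {}
    for date, (pos, neg) in counts.items():
        total = pos + neg
        ratios[date] = pos / total if total else None
    return ratios

sample = [
    ("2011-07-01", "Exports surge as growth continues"),
    ("2011-07-01", "Debt crisis fears"),
    ("2011-07-04", "Markets fall on loss"),
]
ratios = daily_positive_ratio(sample)
print(ratios)
```

Expressing each day as a ratio rather than a single positive/negative label keeps the strength of the signal, which is the variant the abstract reports as explaining index changes better.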

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

  • Wong, William Xiu Shun;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.21-44
    • /
    • 2018
  • In recent years, the rapid development of internet technology and the popularization of smart devices have produced massive amounts of text data. These text data are produced and distributed through various media platforms such as the World Wide Web, Internet news feeds, microblogs, and social media. However, this enormous amount of easily obtained information lacks organization. This problem has attracted the interest of many researchers seeking to manage this huge amount of information. It also calls for professionals capable of classifying relevant information, and hence text classification was introduced. Text classification is a challenging task in modern data analysis in which a text document must be assigned to one or more predefined categories or classes. Various techniques are available in the text classification field, such as K-Nearest Neighbor, Naïve Bayes, Support Vector Machine, Decision Tree, and Artificial Neural Network. However, when dealing with huge amounts of text data, model performance and accuracy become a challenge. The performance of a text classification model can vary with the type of words used in the corpus and the type of features created for classification. Most previous attempts have proposed a new algorithm or modified an existing one, and this line of research can be said to have reached its limits for further improvement. In this study, rather than proposing a new algorithm or modifying an existing one, we focus on finding a way to modify the use of data. It is widely known that classifier performance is influenced by the quality of the training data on which the classifier is built. Real-world datasets usually contain noise, and this noisy data can affect the decisions made by classifiers built from them. 
In this study, we consider that data from different domains, that is, heterogeneous data, might have noise-like characteristics that can be utilized in the classification process. Machine learning algorithms build a classifier on the assumption that the characteristics of the training data and the target data are the same or very similar. However, in the case of unstructured data such as text, the features are determined by the vocabulary included in the documents. If the viewpoints of the training data and the target data differ, the features of the two may also differ. In this study, we attempt to improve classification accuracy by strengthening the robustness of the document classifier through artificially injecting noise into the process of constructing it. Data coming from various kinds of sources are likely to be formatted differently. This causes difficulties for traditional machine learning algorithms, because they were not developed to recognize different types of data representation at one time and to combine them into the same generalization. Therefore, to utilize heterogeneous data in the learning process of the document classifier, we apply semi-supervised learning. However, unlabeled data may degrade the performance of the document classifier. We therefore propose a method called the Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA) to select only the documents that contribute to improving the accuracy of the classifier. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data. The most confident classification rules are selected and applied for final decision making. 
In this paper, three different types of real-world data sources were used: news, Twitter, and blogs.
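The general idea of keeping only confidently classified unlabeled documents in a semi-supervised ensemble can be sketched as below. This is not RSESLA itself, whose rule-selection details are not given in the abstract; it is a generic confidence-threshold filter, and the function name, the data layout, and the 0.8 threshold are assumptions made for illustration.

```python
def select_pseudo_labels(member_predictions, threshold=0.8):
    """Keep unlabeled documents whose most confident ensemble prediction
    reaches the threshold.

    member_predictions: {doc_id: [(label, confidence), ...]}, with one
    (label, confidence) pair per ensemble member (hypothetical layout).
    Returns {doc_id: label} for the documents that are kept.
    """
    selected = {}
    for doc_id, preds in member_predictions.items():
        # Take the single most confident prediction among the members.
        label, confidence = max(preds, key=lambda p: p[1])
        if confidence >= threshold:
            selected[doc_id] = label
    return selected

predictions = {
    "tweet-1": [("sports", 0.95), ("politics", 0.40)],
    "blog-7": [("sports", 0.55), ("politics", 0.60)],
}
kept = select_pseudo_labels(predictions)
print(kept)
```

Documents that no ensemble member classifies confidently are left out of the next training round, which limits the risk that unlabeled data degrades the classifier.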

Adaptation Process to Menopause (폐경에 대한 적응 과정)

  • 이미라
    • Journal of Korean Academy of Nursing
    • /
    • v.24 no.4
    • /
    • pp.623-634
    • /
    • 1994
  • Although the average menopausal age has not changed, women's life span has increased. Today's women live longer after menopause than those in the past, and this calls for attention in both the nursing and medical fields. Many studies have revealed how women reacted to menopause and suffered from it, but they did not distinguish the meaning and effects of menopause from climacteric phenomena. The author therefore tried to clarify what menopause itself meant to climacteric women by means of grounded theory methodology. The interviewees were 21 women aged between 46 and 60 years. They were selected by theoretical sampling, and the author tried to include all levels of important variables such as age, educational background, religion, and job. Data were collected by the author through in-depth interviews and observations in July 1994. The interviews were mostly conducted in the subjects' homes, or in some cases at the author's office or in a hospital. Interviews took from 30 minutes to 2 hours. They were tape-recorded and transcribed later by a research assistant. Data were analyzed as they were gathered, using the constant comparative method proposed by Strauss and Corbin. Eleven concepts were discovered from the data, and they were grouped under six higher-order categories. These six categories were "to give menopause a meaning", "to experience value change", "to have self-help strategies", "to have no strategies", "to live a life worth living", and "to have a sense of powerlessness". Among these, "to experience value change" was selected as the core category. The five other major categories were systematically integrated around the core category. Women's adaptation to menopause was defined as proceeding as follows: Most women felt relief and sorrow at the same time when they faced menopause, and some felt only sorrow or agony. Then, they consulted with others about menopausal symptoms, or tried to think about them by themselves. 
Finally, they gave menopause a meaning, which was that menopause and its symptoms were natural phenomena. But menopause made women reflect on themselves and their past lives. As they reflected on themselves, their values about life began to change. As their values changed, some women sought self-help strategies. Those self-help strategies were what they had learned from colleagues, professionals, or mass media. The quality of their lives depended on whether they practiced self-help strategies or not. Three types of lives were found. Twelve women enjoyed a life worth living and practiced the self-help strategies, because they accepted menopause as a chance to change. They were characterized by a high educational level, a professional job, and a sincere faith in God. Seven women were living as usual, because they did not feel the need to change. They were high school graduates and housewives. Two women recognized menopause as a chance to change, but they did not try self-help strategies. They were characterized by a low educational level. Those who did not try self-help strategies complained of powerlessness to varying degrees. Educational background, full-time jobs, and faith helped women adapt to menopause positively, but social support was not helpful to women's adaptation to menopause. Three hypotheses were derived from the analysis. (1) The higher the educational level, the greater the need to change. (2) Women with a higher educational background will practice self-help strategies more than those with a lower educational background. (3) The more women practice self-help strategies, the worthier lives they will live. Suggestions for further studies are as follows. (1) Studies to test the hypotheses are needed. (2) A study is needed to find the relationship between the degree of practicing self-help strategies and locus of control. (3) Spiritual approaches should be applied to help menopausal women. (4) Education through mass media should be given more frequently.


An Analysis of the Internal Marketing Impact on the Market Capitalization Fluctuation Rate based on the Online Company Reviews from Jobplanet (직원을 위한 내부마케팅이 기업의 시가 총액 변동률에 미치는 영향 분석: 잡플래닛 기업 리뷰를 중심으로)

  • Kichul Choi;Sang-Yong Tom Lee
    • Information Systems Review
    • /
    • v.20 no.2
    • /
    • pp.39-62
    • /
    • 2018
  • Thanks to the growth of computing power and the recent development of data analytics, researchers have started to work on the data produced by users through the Internet or social media. This study is in line with these recent research trends and attempts to adopt data analytical techniques. We focus on the impact of "internal marketing" factors on firm performance, which is typically studied through survey methodologies. We looked into the job review platform Jobplanet (www.jobplanet.co.kr), which is a website where employees and former employees anonymously review companies and their management. With web crawling processes, we collected over 40K data points and performed morphological analysis to classify employees' reviews for internal marketing data. We then implemented econometric analysis to see the relationship between internal marketing and market capitalization. Contrary to the findings of extant survey studies, internal marketing is positively related to a firm's market capitalization only within a limited area. In most of the areas, the relationships are negative. Particularly, female-friendly environment and human resource development (HRD) are the areas exhibiting positive relations with market capitalization in the manufacturing industry. In the service industry, most of the areas, such as employee welfare and work-life balance, are negatively related to market capitalization. When firm size is small (or the history is short), a female-friendly environment positively affects firm performance. On the contrary, when firm size is big (or the history is long), most of the internal marketing factors are either negative or insignificant. We explain the theoretical contributions and managerial implications of these results.

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.141-154
    • /
    • 2019
  • Internet technology and social media are growing rapidly. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor from high-quality content through text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. It has been studied in various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. When real online reviews aren't available for others, it's not only easy to openly collect information, but it also affects your business. In marketing, real-world information from customers is gathered on websites, not through surveys. Depending on whether a website's posts are positive or negative, the customer response is reflected in sales, and businesses try to identify this information. However, many reviews on a website are not always good and are difficult to identify. Earlier studies in this research area used review data from the Amazon.com shopping mall, but recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, etc. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity of sentiment into positive and negative categories and to increase the prediction accuracy of polarity analysis using the pretrained IMDB review data set. 
First, as comparative models for text classification in sentiment analysis, we adopt popular machine learning algorithms such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider the sequential attributes of data. RNN handles ordered data well because it takes the time information of the data into account, but it suffers from the long-term dependency problem. LSTM is used to solve the problem of long-term dependency. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to the classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination. We also tried to figure out how well the models work for sentiment analysis and how they work. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features of text. The reasons for combining these two algorithms are as follows. CNN can extract features for classification automatically by applying convolution layers and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM has input, output, and forget gates that can be opened and controlled at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. 
Furthermore, when LSTM is used in CNN's pooling layer, it has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. In combination with CNN-LSTM, 90.33% accuracy was measured. This is slower than CNN, but faster than LSTM. The presented model was more accurate than other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can improve the weakness of each model, and there is an advantage of improving the learning by layer using the end-to-end structure of LSTM. Based on these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
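The input, forget, and output gates mentioned in this abstract can be illustrated with a single step of a scalar LSTM cell. This sketch uses the standard LSTM update equations with one-dimensional state for readability; it is not the paper's CNN-LSTM model, and the weight layout and the numeric values in the usage example are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, weights):
    """One time step of a scalar LSTM cell.

    weights maps each of "input", "forget", "output", "candidate" to a
    tuple (w_x, w_h, b): input weight, recurrent weight, and bias.
    Returns the new hidden state h and cell state c.
    """
    def pre_activation(name):
        w_x, w_h, b = weights[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre_activation("input"))        # how much new content to write
    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new content

    c = f * c_prev + i * g  # updated memory cell
    h = o * math.tanh(c)    # updated hidden state
    return h, c

# Hypothetical weights: forget gate almost fully open, input gate almost
# closed, so the cell carries its previous memory forward nearly unchanged.
weights = {
    "input": (0.0, 0.0, -20.0),
    "forget": (0.0, 0.0, 20.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = lstm_step(x=0.3, h_prev=0.0, c_prev=0.7, weights=weights)
print(h, c)
```

With the forget gate open and the input gate closed, the cell state passes through the step almost untouched, which is how an LSTM retains information over long sequences.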

Questions and Answers about the Humidifier Disinfectant Disaster as of February 2017 (가습기살균제 참사의 진행과 교훈(Q&A))

  • Choi, Yeyong
    • Journal of Environmental Health Sciences
    • /
    • v.43 no.1
    • /
    • pp.1-22
    • /
    • 2017
  • 'The worst environmental disaster', 'World's first biocide massacre', and 'Home-based Sewol ferry disaster' are all phrases attached to the recent humidifier disinfectant disaster. In the spring of 2011, four of 8 pregnant women including 1 adult man passed away at a university hospital in Seoul due to breathing failure. An epidemiologic investigation conducted by the Korean CDC soon revealed that inhalation of humidifier disinfectant, which had been widely used in Korea during the winter, was responsible for the disease. In addition to lung fibrosis (hardening of the lungs), other diseases including asthma, rhinitis, skin disease, liver disease, fetal disease, and cancers have been researched for their relation to exposure to the products. By February 9, 2017, 5,342 cases had registered for health problems and 1,131 of them were already dead (20.8% mortality rate). Based on studies by government agencies and a telephone survey of the general population by Seoul National University and civic groups, around 20% of the general public of Korea has used these products. Since the market release of the first product by SK Chemical in 1994, over 7.1 million items from around 20 brands were sold up to 2011. Most of the products were manufactured by well-known large conglomerates such as SK, Lotte, Samsung, Shinsegye, LG, and GS, as well as some European companies including UK-based Reckitt Benckiser and TESCO, the German firm Henkel, the Danish firm KeTox, and an Irish company. Even though this disaster was unveiled in 2011 by the Korean government, the issue of the victims was neglected for over five years. In 2016, an unexpected but intensive investigation by prosecutors found that Reckitt Benckiser manipulated and concealed animal tests for its own brand and brought several university experts and company employees to court. The matter was an intense social issue in Korea from May to June with a surge in media coverage. 
The prosecutor's investigation and a nationwide boycott campaign organized by victims and environmental groups against Reckitt Benckiser, whose product had been used by more than 70% of victims, led to the producer's official apology and a compensation scheme. A legislative investigation organized after the April 2016 national election revealed the producers' faults and the government's responsibility, but failed to meet expectations. A special law for the victims passed the National Assembly in January 2017, and a punitive system together with a massive environmental epidemiology investigation are expected to be the only solutions for this tragedy. The sciences of medicine, toxicology, and environmental health have provided decisive evidence so far, but for the remaining problems the perspectives of social sciences such as sociology and jurisprudence are highly necessary, as with the Minamata disease and Wonjin Rayon events. It may not be easy to follow this issue, given the unfamiliar terminology from medical and chemical science and the long, complicated history of the event. For these reasons the author has attempted to write this article in a question and answer format to render it easier to follow. The 17 questions are: Q1 What is humidifier disinfectant? Q2 What kind of health problems are caused by humidifier disinfectant? Q3 How many victims are there? Q4 What is the analysis of the 1,112 cases of death? Q5 What is the problem with the government's diagnostic criteria and the solution? Q6 Who made what brands? Q7 Has there been a recall? What is still on sale? Q8 Was safety not checked by any producers? Q9 What are the government's responsibilities? Q10 Is it true that these products were sold only in Korea? Q11 Why and how was it unveiled only in 2011 after 17 years of sales? Q12 What delayed the resolution of the victim issue? Q13 What is the background of the prosecutor's investigation in early 2016? 
Q14 Is it possible to report new victim cases without evidence of product purchase? Q15 What is happening with the victim issue? Q16 How does it compare with the cases of Minamata disease and Wonjin Rayon? Q17 Are there prevention measures and lessons?

A Study on the Care Needs of Family-Caregivers to the Patients with Stroke (뇌졸중환자 가족의 간호요구)

  • Kim Mi-Hee
    • Journal of Korean Academy of Fundamentals of Nursing
    • /
    • v.4 no.2
    • /
    • pp.175-192
    • /
    • 1997
  • The purpose of this study was to identify the care needs of family-caregivers of patients with stroke. Subjects were 115 family-caregivers caring for stroke patients who were in-patients or out-patients in two general hospitals and one oriental medicine hospital located in Seoul and Kwang-Ju. The instrument used for this study was developed by the researcher on the basis of a literature review and interviews with family-caregivers, and was composed of 35 items. Internal consistency, calculated as Cronbach's alpha from the respondents' data, was 0.91, which was regarded as high. The data were analyzed with the SAS program, using percentage, mean, t-test, and ANOVA. Factor structures of the care needs of family-caregivers were elicited by factor analysis (PCA, Varimax rotation). Data were collected from July 1 to August 14, 1997. The results of this study were as follows: 1. The mean score of the sum of the care needs of family-caregivers was 3.96; the highest-mean item was 'need for immediate care (M=4.77)', and the lowest-mean item was 'need for chaplain's visit (M=2.82)'. 2. Care needs of the family-caregivers were: need to be informed of the disease, treatment and care; need of education and assistance related to physical functional level; need of social support and consultation; need of management of nursing problem related to immobility; need of appreciation; need of the way to communicate with patients; need of immediate care and help. The highest mean factor was the 'need for immediate care and help (M=4.74)', and the lowest mean factor was the 'need of appreciation (M=3.58)'. 3. 
The variables influencing the degree of care needs perceived by family-caregivers of patients with stroke were as follows: There were significant differences in the need to be informed of the disease, treatment and care according to family caregiver's sex (p=.0178), caring period (p=.0223), and patient's suffering period (p=.0244). There were significant differences in the need of education and assistance related to physical functional level according to patient's paralysis (p=.0177) and patient's ADL dependency (p=.0032). There were significant differences in the need of social support and consultation according to family caregiver's sex (p=.0055), occupation (p=.0159), religion (p=.0093), and patient's sex (p=.0134). There was a significant difference in the degree of need of management of nursing problem related to immobility according to the patient's ADL dependency (p=.0493). There were significant differences in the need of appreciation according to family caregiver's age (p=.0107), sex (p=.0133), and patient's age (p=.0338). There were significant differences in the need of the way to communicate with the patient according to patient's paralysis (p=.0002) and aphasia (p=.0001). There were significant differences in the need of immediate care and help according to family caregiver's caring period (p=.0162) and patient's suffering period (p=.0116). 4. The mean score of patient's ADL dependency was 3.38; the highest-mean item was 'ascending and descending stairs (M=4.12)', and the lowest-mean item was 'drinking (M=2.60)'. There was no significant difference in the degrees of care needs related to the patient's ADL dependency. 5. The highest information source of family-caregivers was from the doctors about the disease, treatment and care (26.1%). 
The second highest one was from mass media(20.8%), and the third one was from the nurses. The above findings may be used as the basic data to seek more efficient way of elevating nursing practice and quality for family-caregivers to the patients with stroke.
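The Cronbach's alpha reported for the 35-item instrument follows the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch of that calculation is shown below; the function name and the toy response data are hypothetical and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a set of questionnaire responses.

    responses: list of respondents, each a list of k item scores.
    Requires at least two respondents and two items.
    """
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([r[j] for r in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Toy data (hypothetical): 4 respondents answering 3 items on a 5-point scale.
toy = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
]
print(round(cronbach_alpha(toy), 3))
```

Values close to 1 indicate that the items vary together across respondents, which is the sense in which 0.91 was regarded as high.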


A Checklist to Improve the Fairness in AI Financial Service: Focused on the AI-based Credit Scoring Service (인공지능 기반 금융서비스의 공정성 확보를 위한 체크리스트 제안: 인공지능 기반 개인신용평가를 중심으로)

  • Kim, HaYeong;Heo, JeongYun;Kwon, Hochang
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.3
    • /
    • pp.259-278
    • /
    • 2022
  • With the spread of Artificial Intelligence (AI), various AI-based services are expanding in the financial sector, such as service recommendation, automated customer response, fraud detection systems (FDS), and credit scoring services. At the same time, problems related to reliability and unexpected social controversies are also occurring due to the nature of data-based machine learning. Against this background, this study aimed to contribute to improving trust in AI-based financial services by proposing a checklist to secure fairness in AI-based credit scoring services, which directly affect consumers' financial lives. Among the key elements of trustworthy AI, such as transparency, safety, accountability, and fairness, fairness was selected as the subject of the study so that everyone can enjoy the benefits of automated algorithms from the perspective of inclusive finance without social discrimination. Through literature research, we divided the entire fairness-related operation process into three areas: data, algorithms, and users. For each area, we constructed four detailed considerations for evaluation, resulting in 12 checklist items. The relative importance and priority of the categories were evaluated through the analytic hierarchy process (AHP). We surveyed three different groups that represent the full range of financial stakeholders: financial field workers, artificial intelligence field workers, and general users. The three groups were classified and analyzed according to the importance each stakeholder assigned, and from a practical perspective, specific checks were identified, such as feasibility verification for using learning data and non-financial information, and monitoring of newly incoming data. Moreover, financial consumers in general were found to place high importance on the accuracy of result analysis and on bias checks. We expect these results to contribute to the design and operation of fair AI-based financial services.
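The AHP step, which turns pairwise importance judgments into priority weights, can be sketched as follows. The sketch uses the geometric-mean approximation rather than the principal-eigenvector method; the abstract does not say which variant the authors used, and the pairwise comparison values below are hypothetical, not survey results from the study.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights with the geometric-mean method.

    matrix[i][j] is how much more important criterion i is than
    criterion j, with matrix[j][i] == 1 / matrix[i][j].
    Returns weights that sum to 1.
    """
    n = len(matrix)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]

# Hypothetical judgments for the three areas named in the abstract:
# data is 2x as important as algorithms and 4x as important as users.
areas = ["data", "algorithms", "users"]
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
for area, weight in zip(areas, ahp_weights(pairwise)):
    print(f"{area}: {weight:.3f}")
```

For a perfectly consistent matrix like this one, the geometric-mean weights coincide with the eigenvector weights; with real survey judgments the two can differ slightly, and a consistency check is normally reported alongside the weights.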