• Title/Summary/Keyword: Evaluation statistics

Development of Physical Fitness Standard Indicators According to the Bone Age in Youth (유소년의 골연령에 따른 체력 표준지표 개발)

  • Kim, Dae-Hoon;Yoon, Hyoung-ki;Oh, Sei-Yi;Lee, Young-Jun;Cho, Seok-Yeon;Song, Dae-Sik;Seo, Dong-Nyeuck;Kim, Ju-Won;Na, Gyu-Min;Kim, Min-Jun;Oh, Kyung-A
    • Journal of the Korean Applied Science and Technology / v.38 no.6 / pp.1627-1642 / 2021
  • This study aims to evaluate physical fitness according to the bone age of youth and, ultimately, to provide basic data for the balanced development of youth through physical fitness standard indicators based on bone age. A total of 730 youths aged 11 to 13 years in bone age and 11 to 13 years in chronological age were selected as subjects; X-ray films were taken, and bone age was assessed using the TW3 method. Two physique components, stature and weight, were measured using a stadiometer (Hanebio, Korea, 2021) and an Inbody 270 (Biospace, Korea, 2019). Seven physical fitness components were also measured: muscular strength (Hand Grip Strength), balance (Bass Stick Test), agility (Plate Tapping), power (Standing Long Jump), flexibility (Sit & Reach), muscular endurance (Sit-Up), and cardiovascular endurance (Shuttle Run). Descriptive statistics and independent t-tests were computed using the SPSS PC/Program (Version 26.0), with significance set at p < .05. The results may be summarized as follows. First, when physical fitness was compared between bone age and chronological age for 11- to 13-year-olds, males showed significant differences in muscular strength, power, muscular endurance, and cardiovascular endurance. Females showed significant differences in muscular strength, balance, agility, power, flexibility, muscular endurance, and cardiovascular endurance. Second, physical fitness standard indicators according to bone age were presented for each gender and age (11-13 years old); these serve as basic data for evaluating the physical fitness of youth according to bone age.

Survey of Conflict of Interest in the Clinical Research for IRB Members and Researchers (임상시험심사위원회 위원과 연구자를 대상으로 임상연구에서 이해상충에 대한 설문조사연구)

  • Maeng, Chi Hoon;Kang, Su Jin;Lee, Sun Ju;Yim, Hyeon Woo;Choe, Byung-in;Shin, Im Hee;Huh, Jung-Sik;Kwon, Ivo;Yoo, Soyoung;Lee, Mi-Kyung;Shin, Hee-Young;Kim, Duck-An
    • The Journal of KAIRB / v.2 no.1 / pp.23-31 / 2020
  • Purpose: To survey Korean Institutional Review Board (IRB) members on their self-evaluated ability to review clinical trial protocols fairly in the presence of a conflict of interest, and to survey investigators and IRB members on financial conflicts of interest. Methods: IRB members and researchers in 9 different hospitals were asked to answer survey questions via email. Results: There were 115 respondents (IRB chair/vice chair 18, medical member 30, non-medical member 28, and researcher 39) from 9 centers. Compared with IRB medical members, IRB chair/vice chair respondents scored significantly higher on a 10-point scale (8.44±1.381 vs. 7.30±1.685, p=0.005) when asked to self-evaluate their fairness in reviewing a protocol proposed by an investigator from the same department and a protocol from a company that supports the respondents' scientific committee. When reviewing a protocol proposed by a hospital director, non-medical members scored statistically significantly higher than medical members (7.47±1.76 vs. 8.07±2.70, p=0.034). When asked about limiting the principal investigator's labor fee in phase 3 human clinical trials of an investigational new drug, responses varied widely, but 60% answered that the principal investigator's labor cost should be less than 30% of the total budget for clinical trials with a budget of 100 million won. 51.3% answered that there is no need to disclose the labor cost of the principal investigator in the consent form. The most frequently chosen answer was that conflicts of interest need to be managed because every investigator can be unconsciously influenced by them (IRB member 61.8%, investigator 64.1%, multiple answers allowed). 
Conclusion: Given that IRB members' scores on the fairness questions ranged from 7.23 to 8.56 on a scale of 0 to 10 when they were asked about reviewing a clinical trial protocol, it cannot be said with absolute certainty that there is no issue regarding fairness in the review process. Therefore, more safeguards for fairness are needed. The threshold for disclosing honoraria from sponsors should be lower than 100 million Korean won. The survey responses also suggest that more education on the concept of conflict of interest is needed.

A Study on the Performance Verification Method of Small-Sized LTE-Maritime Transceiver (소형 초고속해상무선통신망 송수신기 성능 검증 방안에 관한 연구)

  • Seok Woo;Bu-young Kim;Woo-Seong Shim
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.7 / pp.902-909 / 2023
  • This study evaluated the performance of a small-sized LTE-Maritime (LTE-M) transceiver that was developed and promoted to expand the use of intelligent maritime traffic information services led by the Ministry of Oceans and Fisheries, with the aim of supporting the prevention of maritime accidents. According to statistics, approximately 30% of all marine accidents in Korean waters involve ships weighing less than 3 tons. Therefore, the blind spots of maritime safety must be covered through the development of small-sized transceivers. The small transceiver may be used in fishing boats that operate in coastal waters and in water leisure equipment near the coastline. It is therefore necessary to verify that it provides sufficient performance and stable communication quality in its real usage environment. In this study, we reviewed the communication quality goals of the LTE-M network and the performance requirements for small-sized transceivers suggested by the Ministry of Oceans and Fisheries, and proposed a test plan to appropriately evaluate the performance of small-sized transceivers. The validity of the proposed test method was verified in six real-sea areas with a high frequency of marine accidents. The downlink and uplink transmission speeds of the small-sized LTE-M transceiver were 9 Mbps or more and 3 Mbps or more, respectively. In addition, using the coverage analysis system, coverage of more than 95% and 100% was confirmed in the intensive management zone (0-30 km) and the interest zone (30-50 km), respectively. The performance evaluation method and test results proposed in this paper are expected to be used as reference materials for verifying the performance of transceivers, contributing to the spread of government-promoted e-navigation services and small-sized transceivers.

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.1-21 / 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, they have been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for product-level market information. However, the information has generally been provided at the industry level or in broad categories based on classification standards, making it difficult to obtain specific and proper information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than those previously offered. We applied the Word2Vec algorithm, a neural network based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows: First, the data related to product information is collected, refined, and restructured into a form suitable for applying the Word2Vec model. Next, the preprocessed data is embedded into a vector space by Word2Vec, and the product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales data on the extracted products is summed to estimate the market size of the product groups. As experimental data, text data of product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training. 
We performed parameter optimization for training and then applied a vector dimension of 300 and a window size of 15 as the optimized parameters for further experiments. We employed index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. The product names similar to KSIC indexes were extracted based on cosine similarity. The market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items. The Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied for the first time to market size estimation, overcoming the limitations of traditional methods that rely on sampling or require multiple assumptions. In addition, the level of market category can be easily and efficiently adjusted according to the purpose of information use by changing the cosine similarity threshold. Furthermore, it has high potential for practical application since it can meet unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic-based word embedding module can be advanced by giving a proper order to the preprocessed dataset or by combining another algorithm such as Jaccard similarity with Word2Vec. 
Also, the method of product group clustering can be changed to other types of unsupervised machine learning algorithms. Our group is currently working on subsequent studies, and we expect them to further improve the performance of the basic model conceptually proposed in this study.
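
The last two steps described above (grouping product names by cosine similarity to an index word, then summing sales) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the three-dimensional vectors stand in for trained Word2Vec embeddings, and the product names, sales figures, threshold values, and the function name `estimate_market_size` are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def estimate_market_size(index_vector, product_vectors, sales, threshold):
    """Group products whose embedding is close to the index word, then sum sales.

    product_vectors: dict of product name -> embedding (hypothetical toy vectors)
    sales: dict of product name -> sales amount (hypothetical figures)
    threshold: minimum cosine similarity for a product to join the group
    """
    group = [
        name
        for name, vector in product_vectors.items()
        if cosine_similarity(index_vector, vector) >= threshold
    ]
    return group, sum(sales[name] for name in group)


# Toy embeddings standing in for trained Word2Vec vectors (assumption).
product_vectors = {
    "lithium battery": [0.9, 0.1, 0.0],
    "rechargeable battery": [0.8, 0.2, 0.1],
    "office chair": [0.0, 0.1, 0.9],
}
sales = {"lithium battery": 120.0, "rechargeable battery": 80.0, "office chair": 50.0}
index_vector = [1.0, 0.0, 0.0]  # embedding of a hypothetical index word

broad_group, broad_size = estimate_market_size(index_vector, product_vectors, sales, 0.80)
narrow_group, narrow_size = estimate_market_size(index_vector, product_vectors, sales, 0.99)
print(broad_group, broad_size)
print(narrow_group, narrow_size)
```

Raising the threshold narrows the product group, which mirrors the abstract's point that the level of the market category can be adjusted through the cosine similarity threshold.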

A New Exploratory Research on Franchisor's Provision of Exclusive Territories (가맹본부의 배타적 영업지역보호에 대한 탐색적 연구)

  • Lim, Young-Kyun;Lee, Su-Dong;Kim, Ju-Young
    • Journal of Distribution Research / v.17 no.1 / pp.37-63 / 2012
  • In the franchise business, exclusive sales territory (sometimes EST in tables) protection is a very important issue from an economic, social, and political point of view. It affects the growth and survival of both franchisor and franchisee and often raises social and political conflicts. When the franchisee is not familiar with the related laws and regulations, the franchisor is likely to exploit this. Exclusive sales territory protection by the manufacturer and distributors (wholesalers or retailers) means a sales area restriction under which only certain distributors have the right to sell products or services. A distributor that has been granted an exclusive sales territory can protect its own territory, whereas it may be prohibited from entering other regions. Even though exclusive sales territory is a critical problem in the franchise business, there is little rigorous research on its reasons, results, evaluation, and future direction based on empirical data. This paper tries to address this problem not only in terms of logical and nomological validity but also through empirical validation. While we pursue an empirical analysis, we take into account the difficulties of real data collection and of statistical analysis techniques. We use a set of disclosure document data collected by the Korea Fair Trade Commission instead of the conventional survey method, which is usually criticized for its measurement error. Existing theories about exclusive sales territory can be summarized into two groups, as shown in the table below. The first concerns the effectiveness of exclusive sales territory from both the franchisor's and the franchisee's point of view. In fact, the outcome of exclusive sales territory can be positive for franchisors but negative for franchisees. It can also be positive in terms of sales but negative in terms of profit. Therefore, variables and viewpoints should be set properly. The other concerns the motive or reason why exclusive sales territory is protected. 
The reasons can be classified into four groups: industry characteristics, franchise system characteristics, capability to maintain exclusive sales territory, and strategic decision. Within the four groups of reasons, there are more specific variables and theories, as below. Based on these theories, we develop nine hypotheses, which are briefly shown in the last table below with the results. To validate the hypotheses, data is collected from the government (FTC) homepage, which is an open source. The sample consists of 1,896 franchisors and contains about three years of operation data, from 2006 to 2008. Within the sample, 627 have an exclusive sales territory protection policy, and those with such a policy are not evenly distributed over 19 representative industries. Additional data are also collected from another government agency homepage, such as Statistics Korea. We also combine data from various secondary sources to create meaningful variables, as shown in the table below. All variables are dichotomized by mean or median split if they are not inherently dichotomous by definition, since each hypothesis is composed of multiple variables and there is no solid statistical technique for incorporating all these conditions to test the hypotheses. This paper uses a simple chi-square test because the hypotheses and theories are built upon quite specific conditions such as industry type, economic condition, company history, and various strategic purposes. It is almost impossible to find samples that satisfy all of them, and they cannot be manipulated in experimental settings. Moreover, more advanced statistical techniques work well on clean data without exogenous variables but not on complex real-world data. The chi-square test is applied by grouping the samples into four according to two criteria: whether they use exclusive sales territory protection, and whether they satisfy the conditions of each hypothesis. 
The test thus examines whether the proportion of sample franchisors that satisfy the conditions and protect exclusive sales territory significantly exceeds the proportion that satisfy the conditions and do not protect it. In fact, the chi-square test is equivalent to Poisson regression, which allows more flexible application. As a result, only three hypotheses are accepted. When the attitude toward risk is high, so that the royalty fee is determined according to sales performance, EST protection produces poor results, as expected. When the franchisor protects EST in order to recruit franchisees easily, EST protection produces better results. Also, when EST protection is meant to improve the efficiency of the franchise system as a whole, it shows better performance. High efficiency is achieved because EST prohibits free riding by franchisees who exploit others' marketing efforts, encourages proper investment, and distributes franchisees evenly across multiple regions. The other hypotheses are not supported by the significance tests. Exclusive sales territory should be protected for proper motives and administered for mutual benefit. Legal restrictions driven by a government agency like the FTC could be misused and cause misunderstandings. Thus, more careful monitoring of real practices and more rigorous studies by both academics and practitioners are needed.
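
The four-group chi-square test described above can be sketched as a 2x2 contingency-table computation. This is only an illustration under stated assumptions: the cell counts are invented toy numbers rather than the study's data, the function name `chi_square_2x2` is hypothetical, and the statistic is Pearson's chi-square without continuity correction (the abstract does not say which variant was used).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Rows: satisfies the hypothesis conditions (yes / no).
    Columns: protects exclusive sales territory (yes / no).
    Returns the statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected counts under independence of rows and columns.
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30 franchisors satisfy the conditions and protect EST,
# 20 satisfy and do not protect, 20 do not satisfy and protect, 30 neither.
chi2, p_value = chi_square_2x2(30, 20, 20, 30)
print(round(chi2, 3), round(p_value, 4))
```

With these toy counts the statistic is 4.0, which falls just under the conventional 0.05 significance level for one degree of freedom.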

Electronic Word-of-Mouth in B2C Virtual Communities: An Empirical Study from CTrip.com (B2C허의사구중적전자구비(B2C虚拟社区中的电子口碑): 관우휴정려유망적실증연구(关于携程旅游网的实证研究))

  • Li, Guoxin;Elliot, Statia;Choi, Chris
    • Journal of Global Scholars of Marketing Science / v.20 no.3 / pp.262-268 / 2010
  • Virtual communities (VCs) have developed rapidly, with more and more people participating in them to exchange information and opinions. A virtual community is a group of people who may or may not meet one another face to face, and who exchange words and ideas through the mediation of computer bulletin boards and networks. A business-to-consumer virtual community (B2CVC) is a commercial group that creates a trustworthy environment intended to motivate consumers to be more willing to buy from an online store. B2CVCs create a social atmosphere through information contribution such as recommendations, reviews, and ratings of buyers and sellers. Although the importance of B2CVCs has been recognized, few studies have been conducted to examine members' word-of-mouth behavior within these communities. This study proposes a model of involvement, satisfaction, trust, "stickiness," and word-of-mouth in a B2CVC and explores the relationships among these elements based on empirical data. The objectives are threefold: (i) to empirically test a B2CVC model that integrates measures of beliefs, attitudes, and behaviors; (ii) to better understand the nature of these relationships, specifically through word-of-mouth as a measure of revenue generation; and (iii) to better understand the role of stickiness of B2CVC in CRM marketing. The model incorporates three key elements concerning community members: (i) their beliefs, measured in terms of their involvement assessment; (ii) their attitudes, measured in terms of their satisfaction and trust; and (iii) their behavior, measured in terms of site stickiness and their word-of-mouth. Involvement is considered the motivation for consumers to participate in a virtual community. For B2CVC members, information searching and posting have been proposed as the main purposes of their involvement. 
Satisfaction has been reviewed as an important indicator of a member's overall community evaluation, and conceptualized by different levels of member interactions with their VC. The formation and expansion of a VC depends on the willingness of members to share information and services. Researchers have found that trust is a core component facilitating anonymous interaction in VCs and e-commerce, and therefore trust-building in VCs has been a common research topic. It is clear that the success of a B2CVC depends on the stickiness of its members to enhance purchasing potential. Opinions communicated and information exchanged between members may represent a type of written word-of-mouth. Therefore, word-of-mouth is one of the primary factors driving the diffusion of B2CVCs across the Internet. Figure 1 presents the research model and hypotheses. The model was tested through an online survey of CTrip Travel VC members. A total of 243 collected questionnaires was reduced to 204 usable questionnaires through a process of data cleaning. The study's hypotheses examined the extent to which involvement, satisfaction, and trust influence B2CVC stickiness and members' word-of-mouth. Structural Equation Modeling was used to test the hypotheses, and the structural model fit indices were within accepted thresholds: $\chi^2$/df was 2.76, NFI was .904, IFI was .931, CFI was .930, and RMSEA was .017. Results indicated that involvement has a significant influence on satisfaction (p<0.001, $\beta$=0.809). The proportion of variance in satisfaction explained by members' involvement was over half (adjusted $R^2$=0.654), reflecting a strong association. The effect of involvement on trust was also statistically significant (p<0.001, $\beta$=0.751), with about 56 percent of the variance in trust explained by involvement (adjusted $R^2$=0.563). 
When the construct "stickiness" was treated as a dependent variable, the proportion of variance explained by the variables of trust and satisfaction was relatively low (adjusted $R^2$=0.331). Satisfaction did have a significant influence on stickiness, with $\beta$=0.514. However, unexpectedly, the influence of trust was not significant (p=0.231, t=1.197), so the proposed hypothesis was rejected. The importance of stickiness in the model was greater because of its effect on e-WOM, with $\beta$=0.920 (p<0.001). Here, the measures of stickiness explain over eighty percent of the variance in e-WOM (adjusted $R^2$=0.846). Overall, the results of the study supported the hypothesized relationships between members' involvement in a B2CVC and their satisfaction with and trust of it. However, trust, as a traditional measure in behavioral models, has no significant influence on stickiness in the B2CVC environment. This study contributes to the growing body of literature on B2CVCs, specifically addressing gaps in the academic research by integrating measures of beliefs, attitudes, and behaviors in one model. The results provide additional insights into behavioral factors in a B2CVC environment, helping to sort out relationships between traditional measures and relatively new measures. For practitioners, the identification of factors, such as member involvement, that strongly influence B2CVC member satisfaction can help focus technological resources in key areas. Global e-marketers can develop marketing strategies directly targeting B2CVC members. In the global tourism business, they can target Chinese members of a B2CVC by providing special discounts for active community members or developing early adopter programs to encourage stickiness in the community. Future studies with more sophisticated modeling are called for to expand the measurement of B2CVC member behavior and to conduct experiments across industries, communities, and cultures.

Assessment Study on Educational Programs for the Gifted Students in Mathematics (영재학급에서의 수학영재프로그램 평가에 관한 연구)

  • Kim, Jung-Hyun;Whang, Woo-Hyung
    • Communications of Mathematical Education / v.24 no.1 / pp.235-257 / 2010
  • The contemporary belief is that creative, talented people can create new knowledge and lead national development, so many countries have an interest in Gifted Education. The U.S.A., England, Russia, Germany, Australia, Israel, and Singapore enforce laws on Gifted Education to offer Gifted Classes, and our government also created an Improvement Act in January 2000 and announced the Enforcement Ordinance for the Gifted Improvement Act in April 2002. This initiative made Gifted Education possible. The Enforcement Ordinance was revised in October 2008. The main purpose of this revision was to expand the opportunity for Gifted Education to students with special education needs. One such program offers Gifted Education to many gifted students by establishing Special Classes at each school. It is also important that the expansion of opportunity for the Gifted be matched by the quality of Gifted Education. Social opinion holds that it would be reckless merely to expand the opportunity for Gifted Education; therefore, assessment of the Teaching and Learning Programs for the Gifted is indispensable. In this study, 3 middle schools were selected for their Teaching and Learning Programs in mathematics. Each 1st Grade program was reviewed and analyzed through comparative tables between Regular and Gifted Education Programs. The content of what should be taught was also reviewed, and programs were evaluated against assessment standards that were revised and modified from the present teaching and learning programs in mathematics. The following research questions were set up to assess the formation of content areas and the appropriateness of Teaching and Learning Programs for the Gifted in mathematics. A. Does the formation of special class content areas comply with the 7th national curriculum? 1. Which content areas of the regular curriculum are applied in this program? 2. 
Of Enrichment and Selection in the Curriculum for the Gifted, which is applied in these programs? 3. Are the content areas organized and performed properly? B. Are the Programs for the Gifted appropriate? 1. Are the educational goals of the Programs aligned with those of Gifted Education in mathematics? 2. Does the content of each program reflect the characteristics of mathematically Gifted students and express their mathematical talents? 3. Are the Teaching and Learning models and methods diverse enough to express their talents? 4. Can the assessment in each program reflect the Learning goals and content, and enhance Gifted students' thinking ability? The conclusions are as follows: First, the best content to be taught to the mathematically Gifted was found to be Numeration, Arithmetic, Geometry, Measurement, Probability, Statistics, and Letter and Expression. Also, the Enrichment area and the Selection area within the curriculum for the Gifted were offered in many ways so that students' Giftedness could be fully enhanced. Second, the educational goals of the Teaching and Learning Programs for mathematically Gifted students were in accordance with the directions of mathematical education and philosophy. The programs were also successful in reaching the educational goals of improving research ability, creativity, thinking ability, and problem-solving ability, all of which are required in the set curriculum. To accomplish the goals, visualization, symbolization, phasing, and exploring strategies were used effectively. Many different lecture types, cooperative learning, and discovery learning were applied to accomplish the goals of the Teaching and Learning models. For Teaching and Learning activities, various strategies and models were used to express the students' talents. These activities included experiments, exploration, application, estimation, guessing, discussion (conjecture and refutation), reconsideration, and so on. 
No mention was made to the students of evaluation or paper exams. While the program activities were being performed, educational goals and assessment methods were reflected; that is, products, performance assessment, and portfolios were mainly used rather than paper assessment alone.

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.57-73 / 2021
  • Maintenance and prevention of failure through anomaly detection of ICT infrastructure is becoming important. System monitoring data is multidimensional time series data. When we deal with multidimensional time series data, we have difficulty in considering both characteristics of multidimensional data and characteristics of time series data. When dealing with multidimensional data, correlation between variables should be considered. Existing methods such as probability and linear base, distance base, etc. are degraded due to limitations called the curse of dimensions. In addition, time series data is preprocessed by applying sliding window technique and time series decomposition for self-correlation analysis. These techniques are the cause of increasing the dimension of data, so it is necessary to supplement them. The anomaly detection field is an old research field, and statistical methods and regression analysis were used in the early days. Currently, there are active studies to apply machine learning and artificial neural network technology to this field. Statistically based methods are difficult to apply when data is non-homogeneous, and do not detect local outliers well. The regression analysis method compares the predictive value and the actual value after learning the regression formula based on the parametric statistics and it detects abnormality. Anomaly detection using regression analysis has the disadvantage that the performance is lowered when the model is not solid and the noise or outliers of the data are included. There is a restriction that learning data with noise or outliers should be used. The autoencoder using artificial neural networks is learned to output as similar as possible to input data. It has many advantages compared to existing probability and linear model, cluster analysis, and map learning. It can be applied to data that does not satisfy probability distribution or linear assumption. 
In addition, it is possible to learn non-mapping without label data for teaching. However, there is a limitation of local outlier identification of multidimensional data in anomaly detection, and there is a problem that the dimension of data is greatly increased due to the characteristics of time series data. In this study, we propose a CMAE (Conditional Multimodal Autoencoder) that enhances the performance of anomaly detection by considering local outliers and time series characteristics. First, we applied Multimodal Autoencoder (MAE) to improve the limitations of local outlier identification of multidimensional data. Multimodals are commonly used to learn different types of inputs, such as voice and image. The different modal shares the bottleneck effect of Autoencoder and it learns correlation. In addition, CAE (Conditional Autoencoder) was used to learn the characteristics of time series data effectively without increasing the dimension of data. In general, conditional input mainly uses category variables, but in this study, time was used as a condition to learn periodicity. The CMAE model proposed in this paper was verified by comparing with the Unimodal Autoencoder (UAE) and Multi-modal Autoencoder (MAE). The restoration performance of Autoencoder for 41 variables was confirmed in the proposed model and the comparison model. The restoration performance is different by variables, and the restoration is normally well operated because the loss value is small for Memory, Disk, and Network modals in all three Autoencoder models. The process modal did not show a significant difference in all three models, and the CPU modal showed excellent performance in CMAE. ROC curve was prepared for the evaluation of anomaly detection performance in the proposed model and the comparison model, and AUC, accuracy, precision, recall, and F1-score were compared. In all indicators, the performance was shown in the order of CMAE, MAE, and AE. 
In particular, the recall of CMAE was 0.9828, which confirms that it detects nearly all anomalies. The accuracy of the model also improved, to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. From a practical standpoint, the proposed model has an advantage beyond the performance improvement. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the increase in dimension they cause can slow down computation during inference. The proposed model is therefore easy to apply to practical tasks in terms of inference speed and model management.
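
The abstract describes two ideas that a small example may make more concrete: using time as a conditional input to capture periodicity, and scoring a detector with accuracy, precision, recall, and F1-score. The following is a minimal sketch under stated assumptions, not the authors' implementation. The sine/cosine encoding in `time_condition`, the fixed error threshold in `flag_anomalies`, and all toy values are hypothetical; the abstract does not say how time was encoded or how the decision threshold was chosen.

```python
import math


def time_condition(hour, period=24):
    """Encode a time value on the unit circle so the end of a period
    sits next to its start (hypothetical encoding, not from the paper)."""
    angle = 2.0 * math.pi * (hour % period) / period
    return (math.sin(angle), math.cos(angle))


def flag_anomalies(reconstruction_errors, threshold):
    """Label a sample 1 (anomalous) when its reconstruction error exceeds the threshold."""
    return [1 if err > threshold else 0 for err in reconstruction_errors]


def detection_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 from binary labels (1 = anomaly)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy values, made up for illustration only.
errors = [0.1, 0.6, 0.9, 0.8, 0.2, 0.1]
labels = [0, 0, 1, 1, 1, 0]
predictions = flag_anomalies(errors, threshold=0.5)
metrics = detection_metrics(labels, predictions)
```

In a conditional autoencoder the pair returned by `time_condition` would typically be concatenated to the encoder and decoder inputs, which adds two values per sample rather than a full window of lagged observations.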

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.139-156
    • /
    • 2021
  • The main reason for using a deep learning model in image classification is that it can take the relationship between regions into account by extracting each region's features from the overall information in the image. However, a CNN model may not be suitable for emotional image data that lacks distinctive regional features. To address the difficulty of classifying emotion images, researchers propose CNN-based architectures suited to emotion images every year. Studies on the relationship between color and human emotion have also been conducted, and they found that different colors induce different emotions. Among studies using deep learning, some have applied color information to image sentiment classification. Using the image's color information in addition to the image itself improves the accuracy of classifying image emotions compared with training the classification model on the image alone. This study proposes two ways to increase accuracy by adjusting the result value after the model classifies an image's emotion. Both methods improve accuracy by modifying the result value based on statistics of the colors in the picture. The most widely distributed two-color combinations are found for all training data; at test time, the most widely distributed two-color combination is found for each test image, and the result values are corrected according to the color combination distribution. This method weights the result value obtained after the model classifies an image's emotion, using an expression based on the log function and the exponential function. Emotion6, which is classified into six emotions, and Artphoto, which is classified into eight categories, were used as the image data. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and performance was compared before and after applying the two-stage learning to the CNN model. 
Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve accuracy by modifying the result values based on color when building a model that classifies an image's sentiment. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black. Each color has a meaning. Using Scikit-learn's clustering, the seven colors that are most widely distributed in the image are identified. Then the RGB coordinates of these colors are compared with the RGB coordinates of the 16 colors listed above, and each is converted to the closest of them. If combinations of three or more colors are selected, too many combinations occur and the distribution becomes scattered, so each combination has less influence on the result value. To solve this problem, two-color combinations were found and used to weight the model's output. Before training, the most widely distributed color combinations were found for all training data images. The distribution of color combinations for each class was stored in a Python dictionary to be used during testing. During the test, the most widely distributed two-color combination is found for each test data image. We then checked how that color combination was distributed in the training data and corrected the result accordingly. We devised several equations to weight the result value from the model based on the extracted colors, as described above. The data set was randomly divided 80:20, and the model was verified using 20% of the data as a test set. The remaining 80% of the data was split into five folds for 5-fold cross-validation, and the model was trained five times using a different validation fold each time. Finally, the performance was checked using the test dataset that had been set aside. 
Adam was used as the optimizer, and the learning rate was set to 0.01. Training ran for up to 20 epochs, and it was stopped if the validation loss did not decrease for five epochs. Early stopping was configured to load the model with the best validation loss. Classification accuracy was better when the information extracted from color properties was used together with the CNN architecture than when the CNN architecture was used alone.
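
The color-based correction described in this abstract can be sketched roughly as follows. This is an illustration under several assumptions, not the authors' code. The reference palette is a six-color subset with conventional RGB values chosen here (the abstract does not give the RGB coordinates of its 16 colors). The cluster centers and sizes are assumed to come from a clustering step such as K-means. The weighting in `reweight` is a hypothetical log-based formula, since the abstract only says that expressions based on the log and exponential functions were devised. All function names and toy values are made up.

```python
import math
from collections import Counter

# Assumed palette: a subset with conventional RGB values, for illustration only.
REFERENCE_COLORS = {
    "red": (255, 0, 0),
    "yellow": (255, 255, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}


def nearest_color(rgb):
    """Return the reference color closest to rgb by Euclidean distance in RGB space."""
    return min(REFERENCE_COLORS,
               key=lambda name: math.dist(rgb, REFERENCE_COLORS[name]))


def top_two_combination(cluster_centers, cluster_sizes):
    """Map each cluster center to a reference color and return the two
    colors covering the most pixels, as a sorted tuple."""
    totals = Counter()
    for center, size in zip(cluster_centers, cluster_sizes):
        totals[nearest_color(center)] += size
    return tuple(sorted(name for name, _ in totals.most_common(2)))


def reweight(probs, combo, combo_counts_by_class):
    """Scale each class probability by 1 + log(1 + count), where count is how
    often the combination occurred for that class in training, then
    renormalize. Hypothetical formula, not the one used in the paper."""
    scaled = {
        label: p * (1.0 + math.log1p(combo_counts_by_class.get(label, {}).get(combo, 0)))
        for label, p in probs.items()
    }
    total = sum(scaled.values())
    return {label: value / total for label, value in scaled.items()}


# Toy inputs, made up for illustration only.
centers = [(250, 10, 10), (10, 10, 240), (245, 245, 245)]
sizes = [500, 300, 100]
combo = top_two_combination(centers, sizes)
train_counts = {"anger": {("blue", "red"): 9}, "joy": {}}
adjusted = reweight({"joy": 0.5, "anger": 0.5}, combo, train_counts)
```

Storing the per-class counts in a plain dictionary keyed by the sorted color pair mirrors the abstract's note that the distributions were kept in a Python dictionary for use at test time.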