• 제목/요약/키워드: Industrial Clustering

Search Result 399, Processing Time 0.025 seconds

Analysis of Global Entrepreneurship Trends Due to COVID-19: Focusing on Crunchbase (Covid-19에 따른 글로벌 창업 트렌드 분석: Crunchbase를 중심으로)

  • Shinho Kim;Youngjung Geum
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship
    • /
    • v.18 no.3
    • /
    • pp.141-156
    • /
    • 2023
  • Due to the unprecedented worldwide pandemic of the new Covid-19 infection, business trends of companies have changed significantly. Therefore, it is strongly required to monitor the rapid changes of innovation trends to design and plan future businesses. Since the pandemic, many studies have attempted to analyze business changes, but they are limited to specific industries and are insufficient in terms of data objectivity. In response, this study aims to analyze business trends after Covid-19 using Crunchbase, a global startup data. The data is collected and preprocessed every two years from 2018 to 2021 to compare the business trends. To capture the major trends, a network analysis is conducted for the industry groups and industry information based on the co-occurrence. To analyze the minor trends, LDA-based topic modelling and word2vec-based clustering is used. As a result, e-commerce, education, delivery, game and entertainment industries are promising based on their technological advances, showing extension and diversification of industry boundaries as well as digitalization and servitization of business contents. This study is expected to help venture capitalists and entrepreneurs to understand the rapid changes under the impact of Covid-19 and to make right decisions for the future.

  • PDF

Identification of Employee Experience Factors and Their Influence on Job Satisfaction (직원경험 요인 파악 및 직무 만족도에 끼치는 영향력 분석)

  • Juhyeon Lee;So-Hyun Lee;Hee-Woong Kim
    • Information Systems Review
    • /
    • v.25 no.2
    • /
    • pp.181-203
    • /
    • 2023
  • With the fierce competition of companies for the attraction of outstanding individuals, job satisfaction of employees has been of importance. In this circumstance, many companies try to invest in job satisfaction improvement by finding employees' everyday experiences and difficulties. However, due to a lack of understanding of the employee experience, their investments are not paying off. This study examined the relationship between employee experience and job satisfaction using employee reviews and company ratings from Glassdoor, one of the largest employee communities worldwide. We use text mining techniques such as K-means clustering and LDA topic-based sentiment analysis to extract key experience factors by job level, and DistilBERT sentiment analysis to measure the sentiment score of each employee experience factor. The drawn employee experience factors and each sentiment score were analyzed quantitatively, and thereby relations between each employee experience factor and job satisfaction were analyzed. As a result, this study found that there is a significant difference between the workplace experiences of managers and general employees. In addition, employee experiences that affect job satisfaction also differed between positions, such as customer relationship and autonomy, which did not affect the satisfaction of managers. This study used text mining and quantitative modeling method based on theory of work adjustment so as to find and verify main factors of employee experience, and thus expanded research literature. In addition, the results of this study are applicable to the personnel management strategy for improving employees' job satisfaction, and are expected to improve corporate productivity ultimately.

Analysis of Utilization Characteristics, Health Behaviors and Health Management Level of Participants in Private Health Examination in a General Hospital (일개 종합병원의 민간 건강검진 수검자의 검진이용 특성, 건강행태 및 건강관리 수준 분석)

  • Kim, Yoo-Mi;Park, Jong-Ho;Kim, Won-Joong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.14 no.1
    • /
    • pp.301-311
    • /
    • 2013
  • This study aims to analyze characteristics, health behaviors and health management level related to private health examination recipients in one general hospital. To achieve this, we analyzed 150,501 cases of private health examination data for 11 years from 2001 to 2011 for 20,696 participants in 2011 in a Dae-Jeon general hospital health examination center. The cluster analysis for classify private health examination group is used z-score standardization of K-means clustering method. The logistic regression analysis, decision tree and neural network analysis are used to periodic/non-periodic private health examination classification model. 1,000 people were selected as a customer management business group that has high probability to be non-periodic private health examination patients in new private health examination. According to results of this study, private health examination group was categorized by new, periodic and non-periodic group. New participants in private health examination were more 30~39 years old person than other age groups and more patients suspected of having renal disease. Periodic participants in private health examination were more male participants and more patients suspected of having hyperlipidemia. Non-periodic participants in private health examination were more smoking and sitting person and more patients suspected of having anemia and diabetes mellitus. As a result of decision tree, variables related to non-periodic participants in private health examination were sex, age, residence, exercise, anemia, hyperlipidemia, diabetes mellitus, obesity and liver disease. In particular, 71.4% of non-periodic participants were female, non-anemic, non-exercise, and suspicious obesity person. To operation of customized customer management business for private health examination will contribute to efficiency in health examination center.

An Expert System for the Estimation of the Growth Curve Parameters of New Markets (신규시장 성장모형의 모수 추정을 위한 전문가 시스템)

  • Lee, Dongwon;Jung, Yeojin;Jung, Jaekwon;Park, Dohyung
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.4
    • /
    • pp.17-35
    • /
    • 2015
  • Demand forecasting is the activity of estimating the quantity of a product or service that consumers will purchase for a certain period of time. Developing precise forecasting models are considered important since corporates can make strategic decisions on new markets based on future demand estimated by the models. Many studies have developed market growth curve models, such as Bass, Logistic, Gompertz models, which estimate future demand when a market is in its early stage. Among the models, Bass model, which explains the demand from two types of adopters, innovators and imitators, has been widely used in forecasting. Such models require sufficient demand observations to ensure qualified results. In the beginning of a new market, however, observations are not sufficient for the models to precisely estimate the market's future demand. For this reason, as an alternative, demands guessed from those of most adjacent markets are often used as references in such cases. Reference markets can be those whose products are developed with the same categorical technologies. A market's demand may be expected to have the similar pattern with that of a reference market in case the adoption pattern of a product in the market is determined mainly by the technology related to the product. However, such processes may not always ensure pleasing results because the similarity between markets depends on intuition and/or experience. There are two major drawbacks that human experts cannot effectively handle in this approach. One is the abundance of candidate reference markets to consider, and the other is the difficulty in calculating the similarity between markets. First, there can be too many markets to consider in selecting reference markets. Mostly, markets in the same category in an industrial hierarchy can be reference markets because they are usually based on the similar technologies. However, markets can be classified into different categories even if they are based on the same generic technologies. Therefore, markets in other categories also need to be considered as potential candidates. Next, even domain experts cannot consistently calculate the similarity between markets with their own qualitative standards. The inconsistency implies missing adjacent reference markets, which may lead to the imprecise estimation of future demand. Even though there are no missing reference markets, the new market's parameters can be hardly estimated from the reference markets without quantitative standards. For this reason, this study proposes a case-based expert system that helps experts overcome the drawbacks in discovering referential markets. First, this study proposes the use of Euclidean distance measure to calculate the similarity between markets. Based on their similarities, markets are grouped into clusters. Then, missing markets with the characteristics of the cluster are searched for. Potential candidate reference markets are extracted and recommended to users. After the iteration of these steps, definite reference markets are determined according to the user's selection among those candidates. Then, finally, the new market's parameters are estimated from the reference markets. For this procedure, two techniques are used in the model. One is clustering data mining technique, and the other content-based filtering of recommender systems. The proposed system implemented with those techniques can determine the most adjacent markets based on whether a user accepts candidate markets. Experiments were conducted to validate the usefulness of the system with five ICT experts involved. In the experiments, the experts were given the list of 16 ICT markets whose parameters to be estimated. For each of the markets, the experts estimated its parameters of growth curve models with intuition at first, and then with the system. The comparison of the experiments results show that the estimated parameters are closer when they use the system in comparison with the results when they guessed them without the system.

Dynamic Virtual Ontology using Tags with Semantic Relationship on Social-web to Support Effective Search (효율적 자원 탐색을 위한 소셜 웹 태그들을 이용한 동적 가상 온톨로지 생성 연구)

  • Lee, Hyun Jung;Sohn, Mye
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.19-33
    • /
    • 2013
  • In this research, a proposed Dynamic Virtual Ontology using Tags (DyVOT) supports dynamic search of resources depending on user's requirements using tags from social web driven resources. It is general that the tags are defined by annotations of a series of described words by social users who usually tags social information resources such as web-page, images, u-tube, videos, etc. Therefore, tags are characterized and mirrored by information resources. Therefore, it is possible for tags as meta-data to match into some resources. Consequently, we can extract semantic relationships between tags owing to the dependency of relationships between tags as representatives of resources. However, to do this, there is limitation because there are allophonic synonym and homonym among tags that are usually marked by a series of words. Thus, research related to folksonomies using tags have been applied to classification of words by semantic-based allophonic synonym. In addition, some research are focusing on clustering and/or classification of resources by semantic-based relationships among tags. In spite of, there also is limitation of these research because these are focusing on semantic-based hyper/hypo relationships or clustering among tags without consideration of conceptual associative relationships between classified or clustered groups. It makes difficulty to effective searching resources depending on user requirements. In this research, the proposed DyVOT uses tags and constructs ontologyfor effective search. We assumed that tags are extracted from user requirements, which are used to construct multi sub-ontology as combinations of tags that are composed of a part of the tags or all. In addition, the proposed DyVOT constructs ontology which is based on hierarchical and associative relationships among tags for effective search of a solution. The ontology is composed of static- and dynamic-ontology. The static-ontology defines semantic-based hierarchical hyper/hypo relationships among tags as in (http://semanticcloud.sandra-siegel.de/) with a tree structure. From the static-ontology, the DyVOT extracts multi sub-ontology using multi sub-tag which are constructed by parts of tags. Finally, sub-ontology are constructed by hierarchy paths which contain the sub-tag. To create dynamic-ontology by the proposed DyVOT, it is necessary to define associative relationships among multi sub-ontology that are extracted from hierarchical relationships of static-ontology. The associative relationship is defined by shared resources between tags which are linked by multi sub-ontology. The association is measured by the degree of shared resources that are allocated into the tags of sub-ontology. If the value of association is larger than threshold value, then associative relationship among tags is newly created. The associative relationships are used to merge and construct new hierarchy the multi sub-ontology. To construct dynamic-ontology, it is essential to defined new class which is linked by two more sub-ontology, which is generated by merged tags which are highly associative by proving using shared resources. Thereby, the class is applied to generate new hierarchy with extracted multi sub-ontology to create a dynamic-ontology. The new class is settle down on the ontology. So, the newly created class needs to be belong to the dynamic-ontology. So, the class used to new hyper/hypo hierarchy relationship between the class and tags which are linked to multi sub-ontology. At last, DyVOT is developed by newly defined associative relationships which are extracted from hierarchical relationships among tags. Resources are matched into the DyVOT which narrows down search boundary and shrinks the search paths. Finally, we can create the DyVOT using the newly defined associative relationships. While static data catalog (Dean and Ghemawat, 2004; 2008) statically searches resources depending on user requirements, the proposed DyVOT dynamically searches resources using multi sub-ontology by parallel processing. In this light, the DyVOT supports improvement of correctness and agility of search and decreasing of search effort by reduction of search path.

Term Mapping Methodology between Everyday Words and Legal Terms for Law Information Search System (법령정보 검색을 위한 생활용어와 법률용어 간의 대응관계 탐색 방법론)

  • Kim, Ji Hyun;Lee, Jong-Seo;Lee, Myungjin;Kim, Wooju;Hong, June Seok
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.137-152
    • /
    • 2012
  • In the generation of Web 2.0, as many users start to make lots of web contents called user created contents by themselves, the World Wide Web is overflowing by countless information. Therefore, it becomes the key to find out meaningful information among lots of resources. Nowadays, the information retrieval is the most important thing throughout the whole field and several types of search services are developed and widely used in various fields to retrieve information that user really wants. Especially, the legal information search is one of the indispensable services in order to provide people with their convenience through searching the law necessary to their present situation as a channel getting knowledge about it. The Office of Legislation in Korea provides the Korean Law Information portal service to search the law information such as legislation, administrative rule, and judicial precedent from 2009, so people can conveniently find information related to the law. However, this service has limitation because the recent technology for search engine basically returns documents depending on whether the query is included in it or not as a search result. Therefore, it is really difficult to retrieve information related the law for general users who are not familiar with legal terms in the search engine using simple matching of keywords in spite of those kinds of efforts of the Office of Legislation in Korea, because there is a huge divergence between everyday words and legal terms which are especially from Chinese words. Generally, people try to access the law information using everyday words, so they have a difficulty to get the result that they exactly want. In this paper, we propose a term mapping methodology between everyday words and legal terms for general users who don't have sufficient background about legal terms, and we develop a search service that can provide the search results of law information from everyday words. This will be able to search the law information accurately without the knowledge of legal terminology. In other words, our research goal is to make a law information search system that general users are able to retrieval the law information with everyday words. First, this paper takes advantage of tags of internet blogs using the concept for collective intelligence to find out the term mapping relationship between everyday words and legal terms. In order to achieve our goal, we collect tags related to an everyday word from web blog posts. Generally, people add a non-hierarchical keyword or term like a synonym, especially called tag, in order to describe, classify, and manage their posts when they make any post in the internet blog. Second, the collected tags are clustered through the cluster analysis method, K-means. Then, we find a mapping relationship between an everyday word and a legal term using our estimation measure to select the fittest one that can match with an everyday word. Selected legal terms are given the definite relationship, and the relations between everyday words and legal terms are described using SKOS that is an ontology to describe the knowledge related to thesauri, classification schemes, taxonomies, and subject-heading. Thus, based on proposed mapping and searching methodologies, our legal information search system finds out a legal term mapped with user query and retrieves law information using a matched legal term, if users try to retrieve law information using an everyday word. Therefore, from our research, users can get exact results even if they do not have the knowledge related to legal terms. As a result of our research, we expect that general users who don't have professional legal background can conveniently and efficiently retrieve the legal information using everyday words.

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.139-156
    • /
    • 2021
  • The biggest reason for using a deep learning model in image classification is that it is possible to consider the relationship between each region by extracting each region's features from the overall information of the image. However, the CNN model may not be suitable for emotional image data without the image's regional features. To solve the difficulty of classifying emotion images, many researchers each year propose a CNN-based architecture suitable for emotion images. Studies on the relationship between color and human emotion were also conducted, and results were derived that different emotions are induced according to color. In studies using deep learning, there have been studies that apply color information to image subtraction classification. The case where the image's color information is additionally used than the case where the classification model is trained with only the image improves the accuracy of classifying image emotions. This study proposes two ways to increase the accuracy by incorporating the result value after the model classifies an image's emotion. Both methods improve accuracy by modifying the result value based on statistics using the color of the picture. When performing the test by finding the two-color combinations most distributed for all training data, the two-color combinations most distributed for each test data image were found. The result values were corrected according to the color combination distribution. This method weights the result value obtained after the model classifies an image's emotion by creating an expression based on the log function and the exponential function. Emotion6, classified into six emotions, and Artphoto classified into eight categories were used for the image data. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and the performance evaluation was compared before and after applying the two-stage learning to the CNN model. Inspired by color psychology, which deals with the relationship between colors and emotions, when creating a model that classifies an image's sentiment, we studied how to improve accuracy by modifying the result values based on color. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black. It has meaning. Using Scikit-learn's Clustering, the seven colors that are primarily distributed in the image are checked. Then, the RGB coordinate values of the colors from the image are compared with the RGB coordinate values of the 16 colors presented in the above data. That is, it was converted to the closest color. Suppose three or more color combinations are selected. In that case, too many color combinations occur, resulting in a problem in which the distribution is scattered, so a situation fewer influences the result value. Therefore, to solve this problem, two-color combinations were found and weighted to the model. Before training, the most distributed color combinations were found for all training data images. The distribution of color combinations for each class was stored in a Python dictionary format to be used during testing. During the test, the two-color combinations that are most distributed for each test data image are found. After that, we checked how the color combinations were distributed in the training data and corrected the result. We devised several equations to weight the result value from the model based on the extracted color as described above. The data set was randomly divided by 80:20, and the model was verified using 20% of the data as a test set. After splitting the remaining 80% of the data into five divisions to perform 5-fold cross-validation, the model was trained five times using different verification datasets. Finally, the performance was checked using the test dataset that was previously separated. Adam was used as the activation function, and the learning rate was set to 0.01. The training was performed as much as 20 epochs, and if the validation loss value did not decrease during five epochs of learning, the experiment was stopped. Early tapping was set to load the model with the best validation loss value. The classification accuracy was better when the extracted information using color properties was used together than the case using only the CNN architecture.

An Investigation on the Periodical Transition of News related to North Korea using Text Mining (텍스트마이닝을 활용한 북한 관련 뉴스의 기간별 변화과정 고찰)

  • Park, Chul-Soo
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.63-88
    • /
    • 2019
  • The goal of this paper is to investigate changes in North Korea's domestic and foreign policies through automated text analysis over North Korea represented in South Korean mass media. Based on that data, we then analyze the status of text mining research, using a text mining technique to find the topics, methods, and trends of text mining research. We also investigate the characteristics and method of analysis of the text mining techniques, confirmed by analysis of the data. In this study, R program was used to apply the text mining technique. R program is free software for statistical computing and graphics. Also, Text mining methods allow to highlight the most frequently used keywords in a paragraph of texts. One can create a word cloud, also referred as text cloud or tag cloud. This study proposes a procedure to find meaningful tendencies based on a combination of word cloud, and co-occurrence networks. This study aims to more objectively explore the images of North Korea represented in South Korean newspapers by quantitatively reviewing the patterns of language use related to North Korea from 2016. 11. 1 to 2019. 5. 23 newspaper big data. In this study, we divided into three periods considering recent inter - Korean relations. Before January 1, 2018, it was set as a Before Phase of Peace Building. From January 1, 2018 to February 24, 2019, we have set up a Peace Building Phase. The New Year's message of Kim Jong-un and the Olympics of Pyeong Chang formed an atmosphere of peace on the Korean peninsula. After the Hanoi Pease summit, the third period was the silence of the relationship between North Korea and the United States. Therefore, it was called Depression Phase of Peace Building. This study analyzes news articles related to North Korea of the Korea Press Foundation database(www.bigkinds.or.kr) through text mining, to investigate characteristics of the Kim Jong-un regime's South Korea policy and unification discourse. The main results of this study show that trends in the North Korean national policy agenda can be discovered based on clustering and visualization algorithms. In particular, it examines the changes in the international circumstances, domestic conflicts, the living conditions of North Korea, the South's Aid project for the North, the conflicts of the two Koreas, North Korean nuclear issue, and the North Korean refugee problem through the co-occurrence word analysis. It also offers an analysis of South Korean mentality toward North Korea in terms of the semantic prosody. In the Before Phase of Peace Building, the results of the analysis showed the order of 'Missiles', 'North Korea Nuclear', 'Diplomacy', 'Unification', and ' South-North Korean'. The results of Peace Building Phase are extracted the order of 'Panmunjom', 'Unification', 'North Korea Nuclear', 'Diplomacy', and 'Military'. The results of Depression Phase of Peace Building derived the order of 'North Korea Nuclear', 'North and South Korea', 'Missile', 'State Department', and 'International'. There are 16 words adopted in all three periods. The order is as follows: 'missile', 'North Korea Nuclear', 'Diplomacy', 'Unification', 'North and South Korea', 'Military', 'Kaesong Industrial Complex', 'Defense', 'Sanctions', 'Denuclearization', 'Peace', 'Exchange and Cooperation', and 'South Korea'. We expect that the results of this study will contribute to analyze the trends of news content of North Korea associated with North Korea's provocations. And future research on North Korean trends will be conducted based on the results of this study. We will continue to study the model development for North Korea risk measurement that can anticipate and respond to North Korea's behavior in advance. We expect that the text mining analysis method and the scientific data analysis technique will be applied to North Korea and unification research field. Through these academic studies, I hope to see a lot of studies that make important contributions to the nation.

SKU recommender system for retail stores that carry identical brands using collaborative filtering and hybrid filtering (협업 필터링 및 하이브리드 필터링을 이용한 동종 브랜드 판매 매장간(間) 취급 SKU 추천 시스템)

  • Joe, Denis Yongmin;Nam, Kihwan
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.77-110
    • /
    • 2017
  • Recently, the diversification and individualization of consumption patterns through the web and mobile devices based on the Internet have been rapid. As this happens, the efficient operation of the offline store, which is a traditional distribution channel, has become more important. In order to raise both the sales and profits of stores, stores need to supply and sell the most attractive products to consumers in a timely manner. However, there is a lack of research on which SKUs, out of many products, can increase sales probability and reduce inventory costs. In particular, if a company sells products through multiple in-store stores across multiple locations, it would be helpful to increase sales and profitability of stores if SKUs appealing to customers are recommended. In this study, the recommender system (recommender system such as collaborative filtering and hybrid filtering), which has been used for personalization recommendation, is suggested by SKU recommendation method of a store unit of a distribution company that handles a homogeneous brand through a plurality of sales stores by country and region. We calculated the similarity of each store by using the purchase data of each store's handling items, filtering the collaboration according to the sales history of each store by each SKU, and finally recommending the individual SKU to the store. In addition, the store is classified into four clusters through PCA (Principal Component Analysis) and cluster analysis (Clustering) using the store profile data. The recommendation system is implemented by the hybrid filtering method that applies the collaborative filtering in each cluster and measured the performance of both methods based on actual sales data. Most of the existing recommendation systems have been studied by recommending items such as movies and music to the users. In practice, industrial applications have also become popular. In the meantime, there has been little research on recommending SKUs for each store by applying these recommendation systems, which have been mainly dealt with in the field of personalization services, to the store units of distributors handling similar brands. If the recommendation method of the existing recommendation methodology was 'the individual field', this study expanded the scope of the store beyond the individual domain through a plurality of sales stores by country and region and dealt with the store unit of the distribution company handling the same brand SKU while suggesting a recommendation method. In addition, if the existing recommendation system is limited to online, it is recommended to apply the data mining technique to develop an algorithm suitable for expanding to the store area rather than expanding the utilization range offline and analyzing based on the existing individual. The significance of the results of this study is that the personalization recommendation algorithm is applied to a plurality of sales outlets handling the same brand. A meaningful result is derived and a concrete methodology that can be constructed and used as a system for actual companies is proposed. It is also meaningful that this is the first attempt to expand the research area of the academic field related to the existing recommendation system, which was focused on the personalization domain, to a sales store of a company handling the same brand. From 05 to 03 in 2014, the number of stores' sales volume of the top 100 SKUs are limited to 52 SKUs by collaborative filtering and the hybrid filtering method SKU recommended. We compared the performance of the two recommendation methods by totaling the sales results. The reason for comparing the two recommendation methods is that the recommendation method of this study is defined as the reference model in which offline collaborative filtering is applied to demonstrate higher performance than the existing recommendation method. The results of this model are compared with the Hybrid filtering method, which is a model that reflects the characteristics of the offline store view. The proposed method showed a higher performance than the existing recommendation method. The proposed method was proved by using actual sales data of large Korean apparel companies. In this study, we propose a method to extend the recommendation system of the individual level to the group level and to efficiently approach it. In addition to the theoretical framework, which is of great value.