• Title/Summary/Keyword: limitation

Quantitative Analysis of Small Intestinal Mucosa Using Morphometry in Cow's Milk-Sensitive Enteropathy (우유 과민성 장병증(cow's milk-sensitive enteropathy)에서 소장 생검조직의 형태학적 계측을 이용한 정량적 분석)

  • Hwang, Jin-Bok;Kim, Yong-Jin
    • Pediatric Gastroenterology, Hepatology & Nutrition / v.1 no.1 / pp.45-55 / 1998
  • Purpose: To establish objective standards for small intestinal mucosal changes in cow's milk-sensitive enteropathy (CMSE), we analyzed histological changes in endoscopic duodenal mucosal biopsy specimens from normal children and patients with CMSE. Methods: We reviewed the medical records of patients who had been admitted and diagnosed with CMSE by gastrofiberscopic duodenal mucosal biopsy following cow's milk challenge and withdrawal. Thirteen infants with CMSE, aged 14 to 56 days, were studied. Five non-CMSE patients, aged 22 to 72 days, served as controls. The morphometric parameters were villous height, crypt zone depth, the ratio of villous height to crypt zone depth, total mucosal thickness, and length of surface epithelium, measured on H&E-stained specimens under a microscope fitted with a drawing apparatus. In addition, the numbers of lymphocytes in the epithelium and of eosinophils in the lamina propria and epithelium were counted. Results: Duodenal mucosal biopsy specimens in CMSE showed partial and subtotal villous atrophy with an increased number of intraepithelial lymphocytes. The mean villous height ($135 \pm 59\;\mu m$), ratio of villous height to crypt zone depth ($0.46 \pm 0.28$), total mucosal thickness ($499 \pm 56\;\mu m$), and length of surface epithelium of the small intestinal mucosa ($889 \pm 231\;\mu m$) in CMSE were significantly lower than in the controls (p<0.05). The mean crypt zone depth ($311 \pm 65\;\mu m$) was significantly greater than in the controls ($188 \pm 24\;\mu m$) (p<0.05). Intraepithelial lymphocyte infiltration ($34.1 \pm 10.5$) was significantly greater than in the controls ($13.6 \pm 3.6$) (p<0.05). The numbers of eosinophils in the lamina propria and epithelium did not differ significantly between groups (p>0.05). 
In treated CMSE, villous height, crypt zone depth, and intraepithelial lymphocyte counts were much improved when compared with both the controls and untreated CMSE. Conclusion: Quantitation of mucosal dimensions confirmed the presence of CMSE. There appears to be a limit to the capacity of crypt cells to compensate for the loss of villous epithelium in CMSE. Specimens obtained by gastrofiberscopic duodenal mucosal biopsy were suitable for the morphometric diagnosis of CMSE. Improvement of CMSE after protein hydrolysate therapy can also be confirmed histologically.

A Study of Influence of Filgrastim on PET/CT In Diffuse Large B cell Lymphoma (미만성 거대 B 세포 림프종 환자에서 Filgrastim 사용이 PET/CT 영상에 미치는 영향에 대한 고찰)

  • NamKoong, Hyuk;Park, Hoon-Hee;Ban, Yung-Gak;Kang, Sin-Chang;Kim, Sang-Kyoo;Lim, Han-Sang;Lee, Chang-Ho
    • The Korean Journal of Nuclear Medicine Technology / v.13 no.3 / pp.17-23 / 2009
  • Purpose: PET/CT is known to be very valuable in the follow-up of diffuse large B cell lymphoma (DLBCL). Because DLBCL lesions are generally not limited to one site, patients are treated with radiotherapy and chemotherapy, and the myelosuppression caused by chemotherapy leads to leukocyte deficiencies such as neutropenia. In such cases, Filgrastim (granulocyte colony-stimulating factor; G-CSF) is commonly administered. Shortly after administration, however, PET/CT is limited in its ability to provide accurate images, because $^{18}F$-FDG uptake increases in bone marrow activated by hematopoietic growth. The aim of this study was therefore to determine the period after Filgrastim administration at which PET/CT shows a normal degree of $^{18}F$-FDG uptake. Materials and Methods: Ten patients undergoing follow-up for DLBCL were examined from January 2007 to January 2009 (4 male, 6 female; mean age 53.8 years; mean weight 57.3 kg). Whole-body images were acquired 1 hour after $^{18}F$-FDG injection using PET/CT (Discovery STe; GE Healthcare, Milwaukee, WI, USA). For image analysis, an ROI ($120\;mm^2$) was drawn on each of $C_6$ (the sixth cervical vertebra), $L_4$ (the fourth lumbar vertebra), liver, spleen, and lung, and the standardized uptake value (SUV) was measured. For each patient, we compared uptake at 1 day and at 5~7 days after Filgrastim administration and tested the significance of the differences using SPSS version 12. Results: In $C_6$, $L_4$, and spleen, every SUV at 1 day was remarkably higher than at 5~7 days, whereas liver and lung SUVs were similar. In addition, the images acquired after 5~7 days were remarkably distinct and showed a normal degree of $^{18}F$-FDG uptake, because bone uptake had almost disappeared. Conclusions: In this study, SUVs differed prominently with the time elapsed since Filgrastim administration. 
Filgrastim concentrates $^{18}F$-FDG uptake in bone, but bone uptake decreased greatly after 5~7 days. From the measured values, we can therefore infer the period after which uptake returns to a normal degree. We also expect this study to contribute to further studies of other agents besides Filgrastim, such as Pegfilgrastim and Lenograstim.

Feed Intake Evaluation of Korean Cattle (Hanwoo) Fed Diets Containing Different Levels of Compound Fattening Periods (한우의 육성 및 비육기간중 배합사료 급여 수준에 따른 사료섭취량 조사)

  • Shin, K.J.;Oh, Y.G.;Lee, S.S.;Kim, K.H.;Kim, C.H.;Paik, B.H.
    • Journal of Animal Science and Technology / v.44 no.1 / pp.95-104 / 2002
  • A study was conducted to evaluate the feed intake of Hanwoo bulls and steers fed diets of compound feed and rice straw. Twenty bull calves and sixty steers, 5 to 7 months old, were used. The experimental period was divided into three feeding stages: the growing period (<300 kg body weight (BW)), the early fattening period (300-450 kg BW), and the late fattening period (>450 kg BW). The animals were given diets containing 14.1% crude protein (CP) and 70.0% total digestible nutrients (TDN) in the growing period, 12.1% CP and 70.6% TDN in the early fattening period, and 11.2% CP and 71.9% TDN in the late fattening period. Experiment 1 was designed to compare feed intake (as-fed basis) between Hanwoo bulls and steers fed the experimental diets ad libitum. In Experiment 2, Hanwoo steers were allocated to one of three compound feed treatments to investigate feed intake (as-fed basis). The treatment groups were ① feeding level 1, fed compound feed ad libitum throughout all periods; ② feeding level 2, fed 1.0% compound feed per kg BW in the growing period, 1.5% compound feed per kg BW in the early fattening period, and compound feed ad libitum in the late fattening period; and ③ feeding level 3, fed 1.5% compound feed per kg BW in the growing period, 2.0% compound feed per kg BW in the early fattening period, and compound feed ad libitum in the late fattening period. In Experiment 1, the average daily feed intake of bulls increased linearly throughout the experimental period, whereas that of steers increased until body weight reached 521 kg and decreased thereafter. Average daily feed intake was about 3.5% per kg BW for both bulls and steers at the beginning (150 kg BW) of Experiment 1, while at 600 kg BW bulls and steers consumed 2.0 and 1.5% per kg BW, respectively. 
In Experiment 2, the average daily feed intake of steers in the feeding level 1 group gradually increased through the growing and early fattening periods and then steadily decreased over the late fattening period. The average daily feed intake in the feeding level 2 group increased linearly throughout the whole period, while that in the feeding level 3 group increased relatively rapidly, peaked at 455 kg BW, and then dropped sharply. The average daily feed intake of steers in feeding level 1 was about 3.5% per kg BW at the beginning (150 kg BW) of Experiment 2 but fell to 1.5% per kg BW at 600 kg BW. In addition, the feed intake of steers in feeding levels 2 and 3, in which compound feed was restricted, increased to 2.0-3.0% per kg BW in the growing period and then decreased to 1.5-2.0% per kg BW. Restricted compound feed feeding in Experiment 2 resulted in rice straw intakes up to two- to three-fold higher in the growing period and two-fold higher in the early fattening period than under ad libitum feeding.
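
The intake figures above can be turned into absolute amounts with a one-line calculation. The sketch below assumes that "X% per kg BW" means a daily as-fed intake equal to X% of body weight; that reading is an interpretation, not something the abstract states, and the function name is hypothetical.

```python
def daily_intake_kg(body_weight_kg, percent_of_bw):
    """Daily as-fed intake in kg, ASSUMING intake is given as a percentage of body weight."""
    return body_weight_kg * percent_of_bw / 100.0

# Figures quoted in the abstract for Experiment 1
start = daily_intake_kg(150, 3.5)         # bulls and steers at 150 kg BW
bulls_at_600 = daily_intake_kg(600, 2.0)  # bulls at 600 kg BW
steers_at_600 = daily_intake_kg(600, 1.5) # steers at 600 kg BW

print(start, bulls_at_600, steers_at_600)  # 5.25 12.0 9.0
```

Under this reading, relative intake falls as the animals grow while the absolute amount eaten per day still rises.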

Vitamin $B_{12}$ Contents in Some Korean Fermented Foods and Edible Seaweeds (한국의 장류, 김치 및 식용 해조류를 중심으로 하는 일부 상용 식품의 비타민 $B_{12}$ 함량 분석 연구)

  • Kwak, Chung-Shil;Hwang, Jin-Yong;Watanabe, Fumio;Park, Sang-Chul
    • Journal of Nutrition and Health / v.41 no.5 / pp.439-447 / 2008
  • Estimation of vitamin $B_{12}$ intake is limited by the lack of data on the vitamin $B_{12}$ content of many Korean foods. In this study, vitamin $B_{12}$ content was determined in some fermented soybean or vegetable foods, edible seaweeds, and other frequently consumed foods in Korea by microbioassay using Lactobacillus delbrueckii ATCC 7830. The traditional types of Doenjang and Chungkookjang contained 1.85 $\mu$g/100 g and 0.69 $\mu$g/100 g of vitamin $B_{12}$, respectively, while the factory types of Doenjang and Chungkookjang contained 0.04-0.86 $\mu$g/100 g and 0.06-0.15 $\mu$g/100 g. Vitamin $B_{12}$ was not detected in steamed soybeans or in tofu, a non-fermented soybean product, indicating that the vitamin $B_{12}$ in Doenjang and Chungkookjang might be produced during the fermentation process. Korean-style soy sauce contained 0.04 $\mu$g vitamin $B_{12}$/100 mL, but vitamin $B_{12}$ was not detected in Japanese-style soy sauce or white miso. Commercial Kimchi, a representative Korean fermented vegetable food, made of Korean cabbage, Yeolmu, or mustard leaves contained 0.013-0.03 $\mu$g vitamin $B_{12}$/100 g, while Kimchi without red pepper and fermented fish sauce (White Kimchi) did not. Vitamin $B_{12}$ content was very high in some edible seaweeds such as laver (66.76 $\mu$g/100 g dry weight) and sea lettuce (84.74 $\mu$g/100 g dry weight), and it was 17.12 $\mu$g/100 g in dried small anchovy, 1.07 $\mu$g/100 g in whole egg, and 0.02 $\mu$g/100 g in coffee mix. From these results, it is assumed that Koreans obtain a substantial amount of vitamin $B_{12}$ from plant-origin foods. With these data, dietary vitamin $B_{12}$ content can be calculated more accurately than before. In conclusion, fermented soybean foods, Kimchi, laver, and sea lettuce can be recommended as good sources of vitamin $B_{12}$ for vegetarians or Korean elderly on grain- and vegetable-based diets.

An Analysis of the Differences in Management Performance by Business Categories from the Perspective of Small Business Systematization (영세 소상공인 조직화에 대한 직능업종별 차이분석과 경영성과)

  • Suh, Geun-Ha;Seo, Mi-Ok;Yoon, Sung-Wook
    • Journal of Distribution Science / v.9 no.2 / pp.111-122 / 2011
  • The purpose of this study is to survey the successful cases of small and medium Business Systematization Cognition by examining their entrepreneurial characteristics and analysing the factors affecting their success. To that end, previous studies on the association types of small businesses were studied. A research model was developed, and research hypotheses for an empirical analysis were established upon it. Suh et al. (2010) insist on the importance of Small Business Systematization in Korea but also show that small business performance is suffering: they are too small to stand alone. That is why association is so crucial for them: they must stand together. Unfortunately, association is difficult, as they have few specific links and little motivation. Even in franchising networks, association tends to be initiated by big franchisers, not small ones. In that sense, association among small businesses is crucial for their long-term survival. With this in mind, this study examines how they think and feel about the issue of 'Industrial Classification', how important Industrial Classification is to their business success, and what kinds of problems it raises in the markets. This study seeks the different cognitions among the association types of small businesses from the perspectives of participation motivation, systematization expectation, policy demand level, and management performance. We assume that different industrial classification types of small businesses will have different cognitions concerning these factors. There are four basic industrial classification types of small businesses: retail sales, restaurant, service, and manufacturing. To date, most of the studies in this area have focused on collecting data on the external environments of small businesses or performing statistical analyses on their status. 
In this study, we surveyed 4 market areas in Busan, Masan, and Changwon in Korea, where business associations consist of merchants, shop owners, and traders. We surveyed 330 shops and merchants by sending a questionnaire or visiting. Finally, 268 questionnaires were collected and used for the analysis. An ANOVA, T-test, and regression analyses were conducted to test the research hypotheses. The results demonstrate that there are differences in cognition depending upon the industrial classification type. Restaurants generally have a higher cognition concerning job offer problems and a lower cognition concerning their competitiveness. Restaurants also depend more on systematization expectation than do the other industrial classification types. On the policy demand level, restaurants have a higher cognition. This study identifies several factors that are contributing to management performance through differences in cognition that depend upon association type: systematization expectation and policy demand level have positive effects on management performance; participation motivation has a negative effect on management performance. We confirm also that the image factors of different cognitions are linked to an awareness of the value of systematization and that these factors show sequential and continual patterns in the course of generating performances. In conclusion, this study carries significant implications in its classifying of small businesses into the four different associational types (retail sales, restaurant, services, and manufacturing). We believe our study to be the first one to conduct an empirical survey in this subject area. More studies in this area will likely use our research frameworks. The data show that regionally based industrial classification associations such as those in rural cities or less developed areas tend to suffer more problems than those in urban areas. Moreover, restaurants suffer more problems than the norm. 
Most of the problems raised in this study concern the act of 'associating' itself. Most associations have serious difficulties in associating. On the other hand, the area where they have the least policy demand is that of service types. This study contributes to the argument that associating, rather than financial assistance or management consulting, promotes the start-up and managerial performance of small businesses. This study also has some limitations. The main limitation is the number of questionnaires. We could not survey all the industrial classification types across the country because of budget and time limitations. If we had, we could have produced many more useful results and enhanced the precision of our analysis. The history of systematization is very short and the number of industrial classification associations is relatively low in Korea. We should keep in mind, though, that this is very crucial to systematization entrepreneurs starting their businesses, as it can heavily affect their chances of success. Being strongly associated with each other might be critical to the business success of industrial classification members. Thus, the government needs to put more effort and resources into supporting the drive of industrial classification members to become more strongly associated.

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led researchers to apply text-mining techniques to analyze big social media data more rigorously. Although social media text analysis algorithms have improved, previous approaches still have some limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies have added grammatical factors to the feature sets used to train classification models. The other applies semantic analysis methods to sentiment analysis, but this approach has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to deal with more extensive semantic features that were underestimated in existing sentiment analysis. The result of the Word2Vec algorithm is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that the related words extracted by Word2Vec that express some emotion about the keyword are three times as numerous as those extracted by co-occurrence analysis. The difference comes from Word2Vec's vectorization of semantic features. It is therefore possible to say that the Word2Vec algorithm can catch hidden related words that are not found in traditional analysis. In addition, part-of-speech (POS) tagging for Korean is used to detect adjectives as "emotional words". The emotional words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence. 
The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets were collected by searching with three keywords (professor, prosecutor, and doctor), chosen because they carry rich public emotion and opinion. A preliminary data collection was conducted to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate), Prosecutor (lewd behavior, sponsor). The size of the text data is about 100,000 (Professor: 25720, Doctor: 35110, Prosecutor: 43225), and the data were gathered from news, blogs, and Twitter to reflect various levels of public emotion in the analysis. Gephi (http://gephi.github.io) was used for visualization, and all programs used in text processing and analysis were written in Java. The contributions of this study are as follows: First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding Emotion Triggers can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say that the words extracted by Emotion Trigger processing have a significant causal relationship with the emotional word in a sentence. A future study will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger include Twitter data, which have a number of distinct features that we did not deal with in this study. 
These features will be considered in further study.
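
The extraction step described in this abstract (find words related to an emotional adjective in vector space, then keep only the nouns) can be sketched as follows. This is a minimal illustration, not the authors' Java implementation: the three-dimensional vectors, the POS tags, and the function names are all hypothetical toy values standing in for a trained Word2Vec model and a Korean POS tagger.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy embeddings and POS tags (a real run would use trained vectors)
VECTORS = {
    "angry":   [0.90, 0.10, 0.00],
    "furious": [0.95, 0.05, 0.00],
    "bribe":   [0.80, 0.20, 0.10],
    "scandal": [0.70, 0.30, 0.00],
    "weather": [0.00, 0.10, 0.90],
}
POS = {"angry": "ADJ", "furious": "ADJ",
       "bribe": "NOUN", "scandal": "NOUN", "weather": "NOUN"}

def related_words(word, top_k):
    """Rank every other word by cosine similarity to `word`, keep the top_k."""
    query = VECTORS[word]
    others = [w for w in VECTORS if w != word]
    others.sort(key=lambda w: cosine(query, VECTORS[w]), reverse=True)
    return others[:top_k]

def emotion_triggers(emotion_word, top_k=3):
    """Keep only the nouns among the related words of an emotional adjective."""
    return [w for w in related_words(emotion_word, top_k) if POS[w] == "NOUN"]

print(emotion_triggers("angry"))  # ['bribe', 'scandal']
```

In this toy data the closest word to "angry" is another adjective ("furious"), which the noun filter drops, leaving the two candidate trigger nouns.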

Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.183-203 / 2018
  • News articles are the most suitable medium for examining events occurring at home and abroad. In particular, as the development of information and communication technology has brought various kinds of online news media, news about events occurring in society has increased greatly. Automatically summarizing key events from massive amounts of news data will therefore help users survey many events at a glance. In addition, building and providing an event network based on the relevance of events can greatly help readers understand current events. In this study, we propose a method for extracting event networks from large news text data. To this end, we first collected Korean political and social articles from March 2016 to March 2017, and through preprocessing using NPMI and Word2Vec we kept only meaningful words and merged synonyms. Latent Dirichlet allocation (LDA) topic modeling was used to calculate the topic distribution by date, find the peaks of the topic distribution, and detect events. A total of 32 topics were extracted by topic modeling, and the time of occurrence of each event was inferred from the point at which each topic distribution surged. As a result, a total of 85 events were detected, of which 16 final events were retained after filtering with the Gaussian smoothing technique. We also calculated the relevance score between the detected events to construct the event network. Using the cosine coefficient between co-occurring events, we calculated the relevance between events and connected them. Finally, we set up the event network with each event as a vertex and the relevance score between events as the weight of the edge connecting the vertices. 
The event network constructed by our method helped us sort the major political and social events that occurred in Korea over the past year in chronological order and, at the same time, identify which events are related to one another. Our approach differs from existing event detection methods in that LDA topic modeling makes it possible to analyze large amounts of data easily and to identify relevance between events that was difficult to detect with existing methods. We applied various text mining techniques and the Word2Vec technique in text preprocessing to improve the accuracy of extracting proper nouns and compound nouns, which has been difficult in the analysis of Korean texts. The event detection and network construction techniques in this study have the following advantages in practical application. First, LDA topic modeling, which is unsupervised learning, can easily extract topics, topic words, and their distributions from huge amounts of data. Also, by using the date information of the collected news articles, the distribution of each topic can be expressed as a time series. Second, by calculating relevance scores and constructing the event network from the co-occurrence of topics, which is difficult to grasp with existing event detection, we can present the connections between events in a summarized form. This can be seen from the fact that the inter-event relevance-based event network proposed in this study was actually constructed in order of occurrence time. It is also possible to identify which event served as the starting point for a series of events through the event network. The limitation of this study is that LDA topic modeling produces different results depending on the initial parameters and the number of topics, and that the topic and event names in the analysis results must be assigned by the subjective judgment of the researcher. 
Also, since each topic is assumed to be exclusive and independent, the method does not take into account the relevance between topics. Subsequent studies need to calculate the relevance between events that are not covered in this study or that belong to the same topic.
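
The pipeline in this abstract (smooth a topic's daily share, take surges as events, then link events by cosine relevance) can be sketched in a few lines. This is a hedged illustration only: the smoothing parameters, the peak rule, the thresholds, and the event vectors are assumptions, since the abstract does not specify how they were set.

```python
import math

def gaussian_smooth(series, sigma=1.0, radius=2):
    """Smooth a daily topic-share series with a truncated Gaussian kernel."""
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    smoothed = []
    for i in range(len(series)):
        num = den = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < len(series):  # renormalise at the boundaries
                num += w * series[j]
                den += w
        smoothed.append(num / den)
    return smoothed

def find_peaks(series, threshold):
    """Indices that exceed the threshold and are local maxima (assumed event rule)."""
    return [i for i in range(1, len(series) - 1)
            if series[i] > threshold
            and series[i] > series[i - 1]
            and series[i] >= series[i + 1]]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_event_network(event_vectors, min_score):
    """Events are vertices; an edge carries the cosine relevance of its two events."""
    names = sorted(event_vectors)
    edges = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = cosine(event_vectors[a], event_vectors[b])
            if score >= min_score:
                edges[(a, b)] = score
    return edges

# Hypothetical data: one topic's share over 7 days, and co-occurrence vectors for 3 events
share = [0.1, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1]
print(find_peaks(gaussian_smooth(share), threshold=0.3))  # [3]
events = {"A": [1, 0, 1], "B": [1, 0, 0.8], "C": [0, 1, 0]}
print(list(build_event_network(events, min_score=0.5)))   # [('A', 'B')]
```

Raising `threshold` or widening the kernel discards smaller surges, which is one plausible way a candidate list could shrink after smoothing.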

Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach (온라인 리뷰 분석을 통한 상품 평가 기준 추출: LDA 및 k-최근접 이웃 접근법을 활용하여)

  • Lee, Ji Hyeon;Jung, Sang Hyung;Kim, Jun Ho;Min, Eun Joo;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.97-117 / 2020
  • Product evaluation criteria are indicators describing the attributes or values of products, which enable users or manufacturers to measure and understand the products. When companies analyze their products or compare them with competitors', appropriate criteria must be selected for objective evaluation. The criteria should reflect the features of products that consumers considered when they purchased, used, and evaluated the products. However, current evaluation criteria do not reflect how consumers' opinions differ from product to product. Previous studies tried to use online reviews from e-commerce sites, which reflect consumer opinions, to extract the features and topics of products and use them as evaluation criteria. However, these studies still produce criteria irrelevant to the products because the extracted words, including improper ones, are not refined. To overcome this limitation, this research proposes an LDA-k-NN model, which extracts candidate criteria words from online reviews using LDA and refines them with k-nearest neighbor. The proposed approach starts with a preparation phase consisting of six steps. First, review data are collected from e-commerce websites. Most e-commerce websites classify their items into high-level, middle-level, and low-level categories. Review data for the preparation phase are gathered from each middle-level category and later merged to represent a single high-level category. Next, nouns, adjectives, adverbs, and verbs are extracted from the reviews using part-of-speech information from a morpheme analysis module. After preprocessing, the words of each topic in the reviews are obtained with LDA, and only the nouns among the topic words are chosen as candidate criteria words. Then, the words are tagged according to their possibility of being criteria for each middle-level category. Next, every tagged word is vectorized by a pre-trained word embedding model. Finally, a k-nearest neighbor case-based approach is used to classify each word with tags. 
After the preparation phase is set up, the criteria extraction phase is conducted on low-level categories. This phase starts with crawling reviews in the corresponding low-level category. The same preprocessing as in the preparation phase is conducted using the morpheme analysis module and LDA. Candidate criteria words are extracted by taking nouns from the data and are vectorized by the pre-trained word embedding model. Finally, evaluation criteria are extracted by refining the candidate criteria words using the k-nearest neighbor approach and the reference proportion of each word in the word set. To evaluate the performance of the proposed model, an experiment was conducted with reviews on '11st', one of the biggest e-commerce companies in Korea. Review data were taken from the 'Electronics/Digital' section, one of the high-level categories in 11st. For performance evaluation, the suggested model was compared with three other models: the actual criteria of 11st, a model that extracts nouns with the morpheme analysis module and refines them by word frequency, and a model that extracts nouns from LDA topics and refines them by word frequency. The evaluation task was to predict the evaluation criteria of 10 low-level categories with the suggested model and the three models above. The criteria words extracted by each model were combined into a single word set, which was used for survey questionnaires. In the survey, respondents chose every item they considered an appropriate criterion for each category. Each model scored when the chosen words had been extracted by that model. The suggested model had higher scores than the other models in 8 out of 10 low-level categories. Paired t-tests on the scores of each model confirmed that the suggested model performed better in 26 tests out of 30. In addition, the suggested model was the best model in terms of accuracy. 
This research proposes an evaluation criteria extraction method that combines topic extraction using LDA with refinement by the k-nearest neighbor approach. This method overcomes the limits of previous dictionary-based models and frequency-based refinement models. This study can contribute to improving review analysis for deriving business insights in the e-commerce market.
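
The refinement step (classifying a candidate word's embedding by its nearest tagged neighbours) can be illustrated with a small k-nearest neighbor vote. This is a sketch under stated assumptions: the two-dimensional vectors, the tag names, the use of cosine similarity as the distance, and `k=3` are hypothetical choices, since the abstract does not give them.

```python
import math
from collections import Counter

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_label(query_vec, tagged_words, k=3):
    """Majority tag among the k tagged words most similar to the query vector."""
    ranked = sorted(tagged_words,
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    votes = Counter(tag for _, _, tag in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical tagged cases from a preparation phase: (word, embedding, tag)
TAGGED = [
    ("battery", [0.90, 0.10], "criterion"),
    ("screen",  [0.80, 0.20], "criterion"),
    ("price",   [0.85, 0.30], "criterion"),
    ("thanks",  [0.10, 0.90], "other"),
    ("courier", [0.20, 0.80], "other"),
]

# Hypothetical candidate nouns taken from LDA topic words of a low-level category
print(knn_label([0.88, 0.15], TAGGED))  # criterion
print(knn_label([0.15, 0.85], TAGGED))  # other
```

An odd `k` with two tags avoids tied votes; with more tags, a tie-breaking rule would be needed.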

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.1-21 / 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, they have been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for market information at the specific product level. However, such information has generally been provided at the industry level or in broad categories based on classification standards, making it difficult to obtain specific and proper information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than previously offered. We applied the Word2Vec algorithm, a neural network-based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows: First, the data related to product information are collected, refined, and restructured into a form suitable for applying the Word2Vec model. Next, the preprocessed data are embedded into vector space by Word2Vec, and the product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales data of the extracted products are summed to estimate the market size of the product groups. As experimental data, product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training. 
We performed parameter optimization for training and then applied a vector dimension of 300 and a window size of 15 as the optimized parameters for further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to the KSIC index words were extracted based on cosine similarity. The market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items. The Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied for the first time to market size estimation, overcoming the limitations of traditional methods that rely on sampling or require multiple assumptions. In addition, the level of market category can be easily and efficiently adjusted according to the purpose of information use by changing the cosine similarity threshold. Furthermore, it has high potential for practical application since it can meet unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic word embedding module could be advanced by giving a proper order to the preprocessed dataset or by combining Word2Vec with another algorithm such as Jaccard similarity. 
Also, the product group clustering method could be replaced with other types of unsupervised machine learning algorithms. Our group is currently working on follow-up studies, and we expect them to further improve the performance of the basic model conceptually proposed in this study.
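
The bottom-up estimation step described in this abstract (group product names by cosine similarity to an index word, then sum their sales) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors, product names, sales figures, and the helper `estimate_market_size` are all made up for illustration, and in the study the vectors would come from a trained Word2Vec model rather than being written by hand.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def estimate_market_size(index_vec, product_vecs, sales, threshold):
    """Hypothetical helper: collect products whose embedding is at least
    `threshold`-similar to the index word, then sum their sales."""
    group = sorted(
        name for name, vec in product_vecs.items()
        if cosine(index_vec, vec) >= threshold
    )
    return group, sum(sales[name] for name in group)


# Toy, hand-written embeddings and sales figures (assumptions, not real data).
product_vecs = {
    "led lamp": [0.9, 0.1, 0.0],
    "led bulb": [0.8, 0.2, 0.1],
    "steel pipe": [0.0, 0.1, 0.9],
}
sales = {"led lamp": 120.0, "led bulb": 80.0, "steel pipe": 300.0}
index_vec = [1.0, 0.0, 0.0]  # stands in for the embedding of an index word

group, size = estimate_market_size(index_vec, product_vecs, sales, threshold=0.8)
print(group, size)
```

Raising the threshold narrows the product group and lowering it widens the group, which corresponds to the abstract's point that the level of market category can be adjusted through the cosine similarity threshold.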

Media Habits of Sensation Seekers (감지추구자적매체습관(感知追求者的媒体习惯))

  • Blakeney, Alisha;Findley, Casey;Self, Donald R.;Ingram, Rhea;Garrett, Tony
    • Journal of Global Scholars of Marketing Science
    • /
    • v.20 no.2
    • /
    • pp.179-187
    • /
    • 2010
  • Understanding consumers' preferences and use of media types is imperative for marketing and advertising managers, especially in today's fragmented market. A clear understanding assists managers in making more effective selections of appropriate media outlets, yet individuals' choices of type and use of media are based on a variety of characteristics. This paper examines one personality trait, sensation seeking, which has not appeared in the literature examining "new" media preferences and use. Sensation seeking is a personality trait defined as "the need for varied, novel, and complex sensations and experiences and the willingness to take physical and social risks for the sake of such experiences" (Zuckerman 1979). Six hypotheses were developed from a review of the literature. Particular attention was given to the Uses and Gratification theory (Katz 1959), which explains why people choose media types and their motivations for using the different types of media. Current theory suggests that High Sensation Seekers (HSS), due to their needs for novelty, arousal, and unconventional content and imagery, would exhibit a higher frequency of use of new media. Specifically, we hypothesize that HSS will use the internet more than broadcast (H1a) or print media (H1b) and more than low (LSS) (H2a) or medium sensation seekers (MSS) (H2b). In addition, HSS have been found to be more social and to have higher numbers of friends, and are therefore expected to use social networking websites such as Facebook/MySpace (H3) and chat rooms (H4) more than LSS (a) and MSS (b). Sensation seeking can manifest in a range of behaviors, including disinhibition. It is expected that alternative social networks such as Facebook/MySpace (H5) and chat rooms (H6) will be used more often by those who have higher levels of disinhibition than by those with low (a) or medium (b) levels. Data were collected using an online survey of participants in extreme sports. 
To reach this group, a chain-referral method, an improved version of the snowball sampling technique, was used to select respondents for this study. This method was chosen because it is regarded as effective for reaching otherwise hidden population groups (Heckathorn, 1997). The final usable sample of 1108 respondents, which was mainly young (56.36% under 34), male (86.1%), and middle class (58.7% with household incomes over USD 50,000), was consistent with previous studies on sensation seeking. Sensation seeking was captured using an existing measure, the Brief Sensation Seeking Scale (Hoyle et al., 2002). Media usage was captured by measuring the self-reported usage of various media types. Results did not support H1a and H1b: HSS did not show higher levels of usage of alternative media such as the internet, and in fact showed lower mean levels of internet usage than of all the other types of media. The media type most used by HSS was print media, suggesting that there is a revolt against the mainstream. Results support H2a and H2b, that HSS are more frequent users of the internet than LSS or MSS. Further analysis revealed significant differences in the use of print media between HSS and LSS, suggesting that HSS may seek out more specialized print publications in their respective extreme sport activity. Results supported H3a and H3b: HSS use Facebook/MySpace more frequently than either LSS or MSS. There were no significant differences in the use of chat rooms between LSS and HSS, so H4a was not supported, although the difference between MSS and HSS was significant (H4b). Respondents with varying levels of disinhibition were expected to have different levels of use of Facebook/MySpace and chat rooms. Those with high levels of disinhibition used Facebook/MySpace more than those with low or medium levels, supporting H5a and H5b. 
Similarly, there was support for H6b: those with high levels of disinhibition use chat rooms significantly more than those with medium levels, but not significantly more than those with low levels (H6a). The findings are counterintuitive and give some interesting insights for managers. First, although HSS use online media more frequently than LSS or MSS, this group's use of online media is less than its use of either print or broadcast media. Advertising executives should not place too much emphasis on online media for this important market segment. Second, social media such as Facebook/MySpace and chat rooms should be examined by managers as potential ways to reach this group. Finally, the higher levels of social media use by those who are disinhibited have some implications for public policy. These individuals are more inclined to engage in socially risky behavior, which may have dire consequences, e.g. involving internet predators or future employers. A limitation of the study is that only those who engage in extreme sports are included, and this is by nature an HSS activity. A broader population is therefore needed to test whether these results hold.