• Title/Summary/Keyword: New analysis model (새로운 분석모델)


A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.1-21 / 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, text mining has been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for market information at the level of specific products. However, such information has generally been provided at the industry level or in broad categories based on classification standards, making it difficult to obtain specific and appropriate information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than those previously offered. We applied the Word2Vec algorithm, a neural network based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows: First, the data related to product information is collected, refined, and restructured into a form suitable for applying the Word2Vec model. Next, the preprocessed data is embedded into a vector space by Word2Vec, and the product groups are then derived by extracting similar product names based on cosine similarity. Finally, the sales data for the extracted products is summed to estimate the market size of each product group. As experimental data, product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training. 
We optimized the training parameters and then applied a vector dimension of 300 and a window size of 15 as the optimized parameters in further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. The product names similar to the KSIC index words were extracted based on cosine similarity. The market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items. The Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods that rely on sampling or require multiple assumptions. In addition, the level of the market category can be easily and efficiently adjusted according to the purpose of the information by changing the cosine similarity threshold. Furthermore, the approach has high potential for practical application, since it can meet unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic-based word embedding module can be advanced by giving a proper order to the preprocessed dataset or by combining another measure, such as Jaccard similarity, with Word2Vec. 
Also, the product group clustering method can be replaced with other types of unsupervised machine learning algorithms. Our group is currently working on follow-up studies, and we expect them to further improve the performance of the basic model conceptually proposed in this study.
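
The bottom-up procedure in this abstract (embed product names, pick those close to an index word by cosine similarity, sum their sales) can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: the three-dimensional vectors stand in for trained Word2Vec embeddings (the abstract reports 300 dimensions), and the product names, sales figures, thresholds, and the helper `estimate_market_size` are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def estimate_market_size(index_vec, product_vecs, sales, threshold):
    """Group products whose vectors are close to the index-word vector,
    then sum the sales of the grouped products."""
    group = [name for name, vec in product_vecs.items()
             if cosine(index_vec, vec) >= threshold]
    return group, sum(sales[name] for name in group)

# Toy stand-ins for trained embeddings (hypothetical values).
product_vecs = {
    "lithium battery": [0.9, 0.1, 0.0],
    "li-ion cell":     [0.8, 0.2, 0.1],
    "steel pipe":      [0.0, 0.1, 0.9],
}
# Hypothetical company sales per product name.
sales = {"lithium battery": 120.0, "li-ion cell": 80.0, "steel pipe": 300.0}
# Hypothetical vector for one KSIC index word.
index_vec = [1.0, 0.0, 0.0]

group, size = estimate_market_size(index_vec, product_vecs, sales, threshold=0.8)
narrow_group, narrow_size = estimate_market_size(
    index_vec, product_vecs, sales, threshold=0.99)
```

Raising the threshold from 0.8 to 0.99 narrows the group in this toy data, which mirrors the abstract's point that the market category level can be adjusted through the cosine similarity threshold.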

A Study on People Counting in Public Metro Service using Hybrid CNN-LSTM Algorithm (Hybrid CNN-LSTM 알고리즘을 활용한 도시철도 내 피플 카운팅 연구)

  • Choi, Ji-Hye;Kim, Min-Seung;Lee, Chan-Ho;Choi, Jung-Hwan;Lee, Jeong-Hee;Sung, Tae-Eung
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.131-145 / 2020
  • In line with the trend of industrial innovation, IoT technology, which is used in a variety of fields, is emerging as a key element in creating new business models and providing user-friendly services in combination with big data. The data accumulated from Internet-of-Things (IoT) devices is being used in many ways to build convenience-oriented smart systems, as it enables customized intelligent systems through analysis of user environments and patterns. Recently, it has been applied to innovation in the public domain, in smart cities and smart transportation, for example to address traffic and crime problems using CCTV. In particular, when planning underground services or establishing a passenger flow control information system to improve the convenience of citizens and commuters on congested public transportation such as subways and urban railways, it is necessary to consider both the ease of securing real-time service data and the stability of security. However, previous studies that rely on image data are limited by privacy issues and by degraded object detection performance under abnormal conditions. The IoT device-based sensor data used in this study is free from privacy issues because it does not require the identification of individuals, and it can therefore be used effectively to build intelligent public services for large numbers of unspecified people. We use IoT-based infrared sensor devices for an intelligent pedestrian tracking system in the metro service, which many people use on a daily basis, and the temperature data measured by the sensors are transmitted in real time. 
The experimental environment for collecting real-time sensor data was set up at the equally spaced midpoints of a 4×4 grid on the ceiling of subway entrances where passenger traffic is high, and it measured the temperature changes caused by objects entering and leaving the detection spots. The measured data were preprocessed by setting reference values for the 16 areas and calculating, per unit of time, the difference between the temperature in each of the 16 areas and its reference value. This preprocessing is intended to emphasize movement within the detection area. In addition, the values were multiplied by 10 in order to reflect the temperature differences between areas more sensitively. For example, if the temperature collected from a sensor at a given time was 28.5℃, the analysis was conducted with the value changed to 285. As described above, the data collected from the sensors have the characteristics of both time series data and image data with 4×4 resolution. Reflecting these characteristics of the measured, preprocessed data, we propose a hybrid algorithm, referred to as CNN-LSTM (Convolutional Neural Network-Long Short Term Memory), that combines a CNN, which performs well in image classification, with an LSTM, which is especially suitable for analyzing time series data. In this study, the CNN-LSTM algorithm is used to predict the number of persons passing through one of the 4×4 detection areas. We validated the proposed model by comparing its performance with that of other artificial intelligence algorithms: Multi-Layer Perceptron (MLP), Long Short Term Memory (LSTM), and RNN-LSTM (Recurrent Neural Network-Long Short Term Memory). In the experiment, the proposed CNN-LSTM hybrid model showed the best predictive performance among MLP, LSTM, and RNN-LSTM. 
By utilizing the proposed devices and models, it is expected that various metro services, such as real-time monitoring of public transport facilities and congestion-based emergency response services, can be provided without legal issues concerning personal information. However, the data were collected from only one side of the entrances, and data collected over a short period of time were used for the prediction. A limitation is therefore that applicability to other environments remains to be verified. In the future, the proposed model is expected to become more reliable if sufficient experimental data is collected in various environments or if the training data is extended with measurements from other sensors.
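
The preprocessing described in this abstract (scaling temperatures by 10 and taking per-area differences from reference values on a 4×4 grid) can be sketched as follows. The abstract does not state how the reference values are chosen, so this sketch assumes an empty-scene baseline frame; the function names and the sample readings are hypothetical. Scaling before subtracting gives the same result as scaling the difference and keeps the arithmetic in integers.

```python
def scale(frame, factor=10):
    """Scale a grid of temperatures (deg C) to integers, e.g. 28.5 -> 285."""
    return [[round(t * factor) for t in row] for row in frame]

def preprocess(frames, reference):
    """For each time step, return per-cell differences from the reference grid."""
    ref = scale(reference)
    out = []
    for frame in frames:
        scaled = scale(frame)
        out.append([[s - r for s, r in zip(srow, rrow)]
                    for srow, rrow in zip(scaled, ref)])
    return out

# Hypothetical data: an empty-scene baseline and one frame with a warm object.
reference = [[26.0] * 4 for _ in range(4)]
frame = [[26.0] * 4 for _ in range(4)]
frame[1][2] = 28.5  # a passenger passes under this cell

diffs = preprocess([frame], reference)
```

The result is a sequence of 4×4 integer grids, one per time step, which matches the abstract's description of the data as both time series and low-resolution image data.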

A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)

  • Jeong, Dami;Kim, Jaeseok;Kim, Gi-Nam;Heo, Jong-Uk;On, Byung-Won;Kang, Mijung
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.1-23 / 2013
  • To discover significant social issues such as unemployment, economic crisis, and social welfare, which are urgent problems to be solved in modern society, researchers in the existing approach usually collect opinions from professional experts and scholars through online or offline surveys. However, such a method is not always effective. Because of the expense, a large number of survey replies is seldom gathered. In some cases, it is also hard to find professionals who deal with a specific social issue. Thus, the sample set is often small and may be biased. Furthermore, several experts may reach totally different conclusions about the same social issue, because each expert has a subjective point of view and a different background. In this case, it is considerably hard to figure out what the current social issues are and which of them are really important. To overcome the shortcomings of the current approach, in this paper we develop a prototype system that semi-automatically detects keywords representing social issues and problems from about 1.3 million news articles issued by about 10 major domestic news outlets in Korea from June 2009 until July 2012. Our proposed system consists of (1) collecting news articles and extracting their texts, (2) identifying only the news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models. The goal of our proposed matching algorithm is to best match paragraphs to each topic. 
Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which has relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA produces a set of topic clusters, and each topic cluster is then labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 as "Unemployment Problem". In this example, it is non-trivial to understand what happened to the unemployment problem in our society. In other words, by looking only at social keywords, we have no idea of the detailed events occurring in our society. To tackle this matter, we develop a matching algorithm that computes the probability value of a paragraph given a topic, relying on (i) the topic terms and (ii) their probability values. For instance, given a set of text documents, we segment each text document into paragraphs. In the meantime, using LDA, we extract a set of topics from the text documents. Based on our matching process, each paragraph is assigned to the topic that it best matches. Finally, each topic has several best matched paragraphs. For example, suppose there is a topic (e.g., Unemployment Problem) and its best matched paragraph (e.g., Up to 300 workers lost their jobs in XXX company at Seoul). In this case, we can grasp the detailed information behind the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time. Therefore, through our matching process and keyword visualization, most researchers will be able to detect social issues easily and quickly. 
Through this prototype system, we have detected various social issues appearing in our society, and our experimental results show the effectiveness of our proposed methods. Our proof-of-concept system is available at http://dslab.snu.ac.kr/demo.html.
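
The abstract says the matching algorithm computes the probability of a paragraph given a topic from the topic terms and their probability values, but it does not give the formula. The sketch below is one simple reading, offered only as an assumption: a length-normalized unigram log-likelihood with a small floor probability for words outside the topic. The helper names, the floor value, the second topic, and the sample paragraphs are all hypothetical; only Topic1 and its label come from the abstract.

```python
import math

def log_likelihood(words, topic, floor=1e-6):
    """Average log-probability of the words under a topic's term distribution.
    Words not listed in the topic receive a small floor probability."""
    if not words:
        return float("-inf")
    return sum(math.log(topic.get(w, floor)) for w in words) / len(words)

def best_topic(paragraph, topics):
    """Assign a paragraph to the topic under which it is most probable."""
    words = paragraph.lower().split()
    return max(topics, key=lambda name: log_likelihood(words, topics[name]))

topics = {
    # Topic1 from the abstract, with its human-assigned label.
    "Unemployment Problem": {"unemployment": 0.4, "layoff": 0.3, "business": 0.3},
    # A second, made-up topic for contrast.
    "Social Welfare": {"welfare": 0.5, "pension": 0.3, "childcare": 0.2},
}

label = best_topic("layoff wave raises unemployment as business slows", topics)
other = best_topic("pension and childcare welfare", topics)
```

Working in log space avoids numerical underflow on long paragraphs, and dividing by the word count keeps long and short paragraphs comparable.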

A Study on the Historical Development of Research Community in Korea: Focused on the Government Supported Institutes (연구자 집단의 성장과 변천: 정부 출연 연구 기관을 중심으로)

  • Park Jin-Hee
    • Journal of Science and Technology Studies / v.6 no.1 s.11 / pp.119-152 / 2006
  • This paper deals with the historical development of the research community in Korea. As earlier studies of the Korean scientific community show, government supported institutes played an important role in the formation of the research community. This study therefore examines the historical development of the government supported institutes and the characteristics of their researchers. The paper addresses the following questions: how the social status of these researchers changed, how they responded to social problems and national politics, and what characteristics they showed with regard to the problem of identity. After the liberation of Korea, government institutes such as the Chungang Kongop Yonguso (industrial research center) and the Korean Atomic Energy Research Institute contributed to the development of the first generation of researchers. However, these researchers could hardly identify themselves as researchers, because they spent much of their time on testing, evaluation, or education. The identity problem also resulted from the institutes' lack of authority as research institutes. The status of a researcher did not differ from that of a civil servant. With the establishment of KIST, the Korean research community came into blossom. The government supported institutes founded on the model of KIST enabled the quantitative and qualitative growth of the research community. Thanks to guaranteed institutional authority and a new reward system, researchers gained respect and improved their social status. During this period, researchers volunteered to support government policies, and nationalistic statements are often found in the research community. During the 1990s, researchers demonstrated different behaviors and attitudes toward the government. The nationalistic ideology disappeared. Instead, researchers criticized government policies and took action against the government. 
Those changes are related to the lowered position of the government supported institutes.


On the Biological Functions of Equine Chorionic Gonadotropin (말의 융모성 성선자극 호르몬의 생화학적 기능)

  • 민관식;윤종택
    • Korean Journal of Animal Reproduction / v.26 no.3 / pp.299-308 / 2002
  • In the horse, a single gene encodes both the eCG and eLH $\beta$ subunits. The difference between eCG and eLH lies in the structure of their glycoresidues, which are both sialylated and sulfated in LH and sialylated in CG. eCG consists of highly glycosylated $\alpha$- and $\beta$-subunits and is a unique member of the gonadotropin family because it elicits responses characteristic of both FSH and LH in species other than the horse. This dual activity of eCG in heterologous species is of fundamental interest for the study of gonadotropin structure-function relationships and for understanding the molecular bases of the specific interactions of these hormones with their receptors. Thus, eCG is a distinct molecule from the viewpoints of its biological function and glycoresidue structures. The oligosaccharide at Asn 56 of the $\alpha$-subunit plays an indispensable role in the in vitro LH-like activity of eCG, whereas the carboxyl-terminal extension of the eCG $\beta$-subunit with its associated O-linked oligosaccharides is not important for it. In contrast, both N- and O-linked oligosaccharides play important roles in FSH-like activity, which increases upon removal of the N- and O-linked oligosaccharides. Therefore, the dual LH- and FSH-like activities of eCG can be clearly separated by removal of either the N-linked oligosaccharide on the $\alpha$-subunit or the CTP-associated O-linked oligosaccharides from its $\beta$-subunit. The glycoresidues seem to play crucial roles in biological activity. The tethered-eCG was efficiently secreted and showed LH-like activity similar to that of the dimeric eCG $\alpha$/$\beta$ and native eCG. The FSH-like activity of the tethered-eCG was also similar to that of the native and wild type eCG $\alpha$/$\beta$. Our data suggest for the first time that the tethered-eCG can be expressed efficiently and that the product secreted by CHO-K1 cells has full LH- and FSH-like activities in the rat in vitro bioassay system. 
Our results also suggest that this molecule can serve as a particular model of the FSH-like activity, but not the LH-like activity, of eCG. Taken together, these data indicate that tethered molecule constructs will be useful in the study of mutants that affect subunit association and/or secretion.

A Novel Method to Study the Effects of Cyclosporine on Gingival Overgrowth in Children (소아에서 치은 과증식에 대한 cyclosporine의 효과를 연구하는 새로운 방법)

  • Han, Keumah;Kim, Jongsoo
    • Journal of the Korean Academy of Pediatric Dentistry / v.45 no.3 / pp.271-279 / 2018
  • Previous studies to elucidate the etiology of cyclosporine (Cs)-induced gingival overgrowth in children have not completely excluded all factors that may cause differences among individuals. This study examined the effect of cyclosporine on the metabolism of type 1 collagen (CoL-I) in an experimental model that controlled for the effects of biological variation among individuals. Five 5-week-old male Sprague-Dawley rats were administered Cs by gastric feeding for 6 weeks. Gingival specimens were harvested from the mandibular posterior area before the start of Cs administration and at 2, 4, and 6 weeks thereafter. Gingival fibroblasts were cultured from all 20 biopsies collected from the gingiva. Half of the fibroblasts collected prior to Cs administration were designated as the control. The other half were treated with Cs in vitro and designated the in vitro test group (Tt). The fibroblasts collected 2, 4, and 6 weeks after Cs administration were designated the in vivo test groups T2, T4, and T6, respectively. Immunofluorescence microscopy was used to detect CoL-I in all the fibroblasts. CoL-I was analyzed at both the gene and protein expression levels by real-time polymerase chain reaction and western blotting. Changes in CoL-I before and after Cs treatment were evaluated in the gingiva of each rat. There was no significant difference in CoL-I gene expression between the control and test groups. CoL-I protein expression in fibroblasts increased with in vitro Cs treatment in each individual and also increased with in vivo Cs treatment. In this study, the experimental method that controls for biological variation arising from differences among individuals proved useful. Subsequent studies on factors other than CoL-I and in-depth studies in humans are needed.

Cytotoxic Effects of Tenebrio molitor Larval Extracts against Hepatocellular Carcinoma (갈색거저리 유충 추출물의 간암세포에 대한 세포독성 효능)

  • Lee, Ji-Eun;Lee, An-Jung;Jo, Da-Eun;Cho, Ju Hyeong;Youn, Kumju;Yun, Eun-Young;Hwang, Jae-Sam;Jun, Mira;Kang, Byoung Heon
    • Journal of the Korean Society of Food Science and Nutrition / v.44 no.2 / pp.200-207 / 2015
  • Various natural products or their derivatives, mostly originating from plants, fungi, and bacteria, have been exploited as therapeutic drugs to treat various human diseases. In addition to previously explored organisms, research on natural compounds has now expanded into unexamined living organisms in order to identify novel bioactive substances. Here, we determined whether or not the larval form of the mealworm beetle Tenebrio molitor, a species of darkling beetle, contains cytotoxic substances that exclusively affect cancer cell viability. Ethanol extract and its solvent partitioned fractions, hexane and ethyl acetate fractions, showed anticancer effects against various human cancer cells derived from the prostate (PC3 and 22Rv1), cervix (HeLa), liver (PLC/PRF5, HepG2, Hep3B, and SK-HEP-1), colon (HCT116), lung (NCI-H460), breast (MDA-MB231), and ovary (SKOV3). Cell death induced by the fractions was a mix of apoptosis, necrosis, and autophagy. The hexane fraction was administered intraperitoneally to nude mice bearing a hepatocellular carcinoma SK-HEP-1 and showed inhibition of tumor growth in vivo. Therefore, we concluded that worm extracts contain cytotoxic substances, which can be enriched by proper fractionation protocols, and further separation and purification could lead to the identification of novel molecules to treat human cancers.

A Study on Strategy of Forest Rehabilitation Support Corresponding to the Spread of Marketization in North Korea (북한의 시장화 확산에 대응한 대북 산림복구 지원전략 연구)

  • Song, Minkyung;Yi, Jong-Min;Park, Kyung-Seok
    • Journal of Korean Society of Forest Science / v.106 no.4 / pp.487-496 / 2017
  • Marketization in North Korea is spreading rapidly. This study proposes a forest rehabilitation strategy for North Korea in light of its major shift toward a market economy. The current trend of marketization is now affecting the forest sector, especially the way residents utilize small forest lands. To analyze the influence of marketization on forest management in North Korea, we reviewed official documents issued by North Korea and related materials on North Korean marketization. The government of Kim Jong Eun has set up policies and systems in response to the spread of marketization, such as guaranteeing individuals the right to dispose of certain products on their own and establishing special economic zones to attract foreign investment. In the forestry sector, the North Korean government has been trying to fully implement its forest restoration plan through measures such as reclaiming sloping lands that had previously been used by residents. However, as marketization progresses, government-led mass mobilization for forest restoration is expected to face considerable difficulty, owing to increases in illegal logging to meet the high demand for timber, illegal firewood harvesting, the collection of non-timber products for livelihoods, and illegal cultivation of crops to sell in the market. Therefore, South Korea's support for forest restoration should also take the recent marketization in North Korea into account. It is necessary to formulate strategic measures such as conducting joint commercialization projects on agroforestry management through cooperative farming units, helping to improve income sources from small forest lands, and activating a comprehensive mountain village special economic zone built on forest businesses. 
We hope that the forest rehabilitation strategy proposed in this paper, which reflects the changes in North Korea's marketization and forest policy, can offer meaningful suggestions for supporting forest restoration in North Korea effectively.

Predicting the Nutritional Value of Seafood Proteins as Measured by Newer In Vitro Model -1. C-PER and DC-PER of Shellfish Proteins- (수산식품단백질 품질평가를 위한 새로운 모델 설정 -1. 패류의 C-PER 및 DC-PER-)

  • Ryu, Hong-Soo;Lee, Kang-Ho;Kim, Jang-Yang;Choi, Byeong-Dae
    • Journal of the Korean Society of Food Science and Nutrition / v.14 no.3 / pp.265-273 / 1985
  • To predict the nutritional quality of seafood proteins using a newer in vitro model, protein samples from 10 species of shellfish were used to determine in vitro digestibility, trypsin indigestible substrate (TIS), computed protein efficiency ratio (C-PER), discriminant computed protein efficiency ratio (DC-PER), and predicted digestibility, which is calculated solely from the amino acid profile. The TIS content of eviscerated samples ranged from 1.10 to 5.09 mg/g solid, whereas that of whole samples ranged from 1.26 to 7.30 mg/g solid, expressed quantitatively as mg of soybean trypsin inhibitor. The in vitro digestibility was 82-86% for eviscerated samples, in contrast with 78-84% for whole ones. The results therefore suggest that the in vitro digestibility of shellfish is influenced by the presence of viscera. The lysine contents of Mya arenaria, Saxidomus purpuratus, Anadara subcrenata, and Anadara broughtonii were lower than that of ANRC casein, but Corbicula fluminea, Cyclina sinensis, and eviscerated Mytilus edulis showed values of about 10.0 g/16 g N. In all samples, the tryptophan and cysteine contents were higher than those of ANRC casein. The C-PER of whole samples was below 2.0, while values above 2.5 were noted in the eviscerated samples. The DC-PER of most samples was greater than the C-PER, and greater discrepancies were found in whole shellfish, which have lower in vitro digestibility. Shellfish samples showing high in vitro digestibility and low TIS content, such as eviscerated ones, may need the DC-PER and predicted digestibility procedures rather than C-PER, and the four-enzyme in vitro digestibility procedure could offer more advantages in predicting the protein quality of whole shellfish samples, which have poor in vitro digestibility and high TIS content.


An Exploratory Study of RFID Benefits for Apparel Retailing (의류소매업에서의 RFID 이점에 대한 탐색적 연구)

  • Kim, Hae-Jung;Kim, Eun-Young
    • Journal of the Korean Society of Clothing and Textiles / v.30 no.12 s.159 / pp.1697-1707 / 2006
  • Relentless advances in information technology are constantly transforming the market dynamics of the retail industry. RFID is an emerging innovative technology that can reduce labor costs, improve inventory control, and increase sales through effective business processes. Apparel retailers need to recognize the benefits of RFID and identify critical success factors. Focusing on apparel retailers, this study attempts (1) to identify the reality of RFID in terms of its benefits and (2) to assess the prospects for implementing RFID in apparel retailing. We conducted a focus group interview with six selected panelists who were retail industry experts in the United States to obtain data regarding RFID attributes. Content analysis was used to generate related excerpts and classify 31 attributes of RFID benefits from the 173 meaningful responses. Regarding experience with RFID, retailers were familiar with RFID technology and expressed the belief that RFID would basically support an existing retail system for speed to market. However, retailers indicated that experience with RFID technology was still at the early adoption stage, limited to a few innovative companies. The content analysis identified five dimensions of RFID benefits for apparel retailing: Visibility and Velocity, Revenue Enhancement, Customer Service, Security, and Employee Productivity. This result lends support to the belief that RFID has significant potential to streamline supply chain management, store operation, and customer service in apparel retailing. This study provides intellectual and managerial implications for practitioners and researchers by postulating the effective use of RFID in the apparel retail industry.