• Title/Summary/Keyword: Embedding

Immunohistochemical study on the atretic and the growing follicles after experimental superovulation in rats I. Number of follicles by superovulation (과배란 유기된 rat 난소에 퇴축난포와 성장난포에 대한 면역조직화학적 연구 I. 동원된 난포수에 대하여)

  • Kwak, Soo-dong
    • Korean Journal of Veterinary Research
    • /
    • v.37 no.1
    • /
    • pp.71-78
    • /
    • 1997
  • This study was designed to investigate the number of growing and mature follicles following gonadotrophin treatments for superovulation in mature rats. Eighteen mature rats (Sprague-Dawley, initially 190~230 g) were randomly allotted into 3 groups: a control group; an FSH-treated group, injected intramuscularly with 0.5 units of follicular stimulating hormone (FSH) per rat; and a PMS- and HCG-treated group, injected intramuscularly with 20~25 IU of pregnant mare serum (PMS) per rat and then, 48 hrs later, with 20~25 IU of human chorionic gonadotrophin (HCG) per rat. The uteri and ovaries of the rats were collected and observed grossly, and serial sections of the paraffin-embedded ovaries were stained with H-E. The numbers of ovarian follicles were investigated by LM photography of the preparations, grading the secondary and tertiary follicles into large, middle and small follicles. Small follicles were classified as secondary (preantral) follicles with more than 2 layers of granulosa cells surrounding the oocyte; middle follicles as secondary follicles with early signs of an antral cavity or with more than one small cavity on either side of the oocyte; and large follicles as tertiary follicles with a single medium-sized antral cavity or a large well-formed antral cavity. In gross findings, the uteri were slightly swollen in the FSH-treated group and markedly swollen or filled with fluid in the uterine lumen in the PMS- and HCG-treated group. In histological findings, the shape and size of the middle and large follicles were diverse in the FSH-treated and PMS- and HCG-treated groups, and the proportion of atretic follicles was higher in both treated groups than in the control group. The uteri of the treated groups were hypertrophied or filled with fluid in the lumens and walls; the wall tissue layers were flattened and their blood and lymph vessels were dilated. The mean numbers of follicles per ovary in the control group were 17.1±5.6 (14.0±4.6%), 37.8±9.1 (30.9±7.4%) and 67.6±30.1 (55.2±24.6%) for large, middle and small follicles respectively, and the total number of these 3 grades of follicles was 122.5±40.0. In the FSH-treated group the corresponding numbers were 22.8±7.0 (17.4±5.3%), 43.4±6.6 (33.2±5.1%) and 64.5±13.0 (49.3±9.9%), with a total of 130.7±16.6. In the PMS- and HCG-treated group they were 29.7±11.0 (16.3±6.0%), 61.9±17.2 (33.9±9.4%) and 91.1±28.2 (49.9±15.4%), with a total of 182.6±32.7. The above findings reveal that large follicles increased by 29.8% in the FSH-treated group and by 73.7% in the PMS- and HCG-treated group compared with the control group, and in histologic findings the proportion of atretic follicles was higher in ovaries with larger numbers of developing follicles.

EFFECT OF LIGHT IRRADIATION MODES ON THE MARGINAL LEAKAGE OF COMPOSITE RESIN RESTORATION (광조사 방식이 복합레진 수복물의 변연누출에 미치는 영향)

  • 박은숙;김기옥;김성교
    • Restorative Dentistry and Endodontics
    • /
    • v.26 no.4
    • /
    • pp.263-272
    • /
    • 2001
  • The aim of this study was to investigate the influence of four different light-curing modes on the marginal leakage of Class V composite resin restorations. Eighty extracted human premolars were used. Wedge-shaped Class V cavities were prepared on the buccal surface of each tooth with a high-speed diamond bur, without bevel. The cavities were positioned with half of the cavity above and half below the cemento-enamel junction. The depth, height, and width of the cavity were 2 mm, 3 mm and 2 mm respectively. The specimens were divided into 4 groups of 20 teeth each. All the specimen cavities were treated with the Prime & Bond® NT dental adhesive system (Dentsply DeTrey GmbH, Germany) according to the manufacturer's instructions and cured for 10 seconds, except group IV, which was cured for 3 seconds. All the cavities were restored in bulk with Spectrum™ TPH A2 resin composite (Dentsply DeTrey GmbH, Germany). The resin composites were light-cured under 4 different modes: a regular-intensity group (600 mW/cm², group I) was irradiated for 30 s, a low-intensity group (300 mW/cm², group II) for 60 s and an ultra-high-intensity group (1930 mW/cm², group IV) for 3 s. A pulse-delay group (group III) was irradiated with 400 mW/cm² for 2 s followed, after a 5-minute delay, by 800 mW/cm² for 10 s. The Spectrum™ 800 (Dentsply DeTrey GmbH, Germany) light-curing unit was used for groups I, II and III, and the Apollo 95E (DMD, U.S.A.) for group IV. The composite resin specimens were finished and polished immediately after light curing, except group III, which was finished and polished during the delay. Specimens were stored in physiologic saline solution at 37°C for 24 hours. After thermocycling (500×, 5-55°C), all teeth were covered with nail varnish up to 0.5 mm from the margins of the restorations, immersed in 2% methylene blue solution at 37°C for 24 hours, and rinsed with tap water for 24 hours. After embedding in clear resin, the specimens were sectioned with a water-cooled diamond saw (Isomet™, Buehler Co., Lake Bluff, IL, U.S.A.) along the longitudinal axis of the tooth so as to pass through the center of the restorations. The cut surfaces were examined under a stereomicroscope (SZ-PT, Olympus, Japan) at ×25 magnification, and the images were captured with a CCD camera (GP-KR222, Panasonic, Japan) and stored in a computer with the Studio Grabber program. Dye penetration depth at the restoration/dentin and restoration/enamel interfaces was measured as a proportion of the entire depth of the restoration using software (Scion Image, Scion Corp., U.S.A.). The data were analysed statistically using one-way ANOVA and Tukey's method. The results were as follows: 1. The pulse-delay group did not show any significant difference in dye penetration rate from the other groups at enamel and dentin margins (p>0.05). 2. At the dentin margin, the ultra-high-intensity group showed a significantly higher dye penetration rate than both the regular-intensity and low-intensity groups (p<0.05). 3. At the enamel margin, there was no statistically significant difference among the four groups (p>0.05). 4. The dentin margin showed a significantly higher dye penetration rate than the enamel margin in all groups (p<0.05).

THE CHANGE OF BONE FORMATION ACCORDING TO MAGNETIC INTENSITY OF MAGNET PLACED INTO TITANIUM IMPLANT SPECIMENS (타이타늄 임플랜트 시편 내부에 설치한 자석의 자성강도에 따른 골형성 변화)

  • Hwang Yun-Tae;Lee Sung-Bok;Choi Dae-Gyun;Choi Boo-Byung
    • The Journal of Korean Academy of Prosthodontics
    • /
    • v.43 no.2
    • /
    • pp.232-247
    • /
    • 2005
  • Purpose. The purpose of this investigation was to explore the possibility of clinical application in the areas of dental implants and bone grafts by histologically investigating bone formation around titanium specimens according to the magnetic field intensity of a neodymium magnet placed inside the specimens. Material and method. 1. Measurement of magnetic intensity - the magnet was placed inside the specimen, and the intensity of the magnetic field around the 1st and 3rd threads of the specimen was measured 20 times using a Gaussmeter (Kanetec Co., Japan). 2. Surgical procedure - male rabbits were anesthetized with constant amounts of ketamine (0.25 ml/kg) and Rompun (0.25 ml/kg). After the flat part of the tibia was incised and the titanium implant specimens were planted, the control group was sutured without a magnet, while in the experimental groups a Magnedisc 500 (Aichi Steel Co., Japan) or Magnedisc 800 (Aichi Steel Co., Japan) was placed inside the specimen, fixed with pattern resin, and sutured. 3. Management after surgery - to prevent bacterial infection and inflammation, gentamycin and Ketopro were injected for 1 week from the day of operation, and the wounds were dressed with Potadine. 4. Preparation for histomorphometric analysis - at 2, 4 and 8 weeks after surgery, the animals were sacrificed with an overdose of ketamine; specimens including the operated part and part of the tibia were obtained and fixed in 10% PBS buffer solution. After embedding the specimens in Technovit 1200 and B.P solution, H-E staining was performed. Sample thickness was 75 µm. In histological examination through the optical microscope, using the Kappa image-based program (Olympus Co., Japan), the bone contact ratio and bone area ratio at each part of the specimens were measured and analyzed. 5. Statistical analysis - statistical analysis was performed with the Mann-Whitney U-test. Results and conclusion. 1. In histomorphometric findings, increased new bone formation was shown in both control and experimental groups throughout the experiment performed for 2, 4 and 8 weeks. After 4 weeks, more osteoblasts and osteoclasts with significant bone remodeling were shown in the experimental groups. 2. In histomorphometric analysis, the bone contact ratios were 38.5% for experimental group 1, 29.5% for experimental group 2 and 11.9% for the control group; the experimental groups were higher than the control group (p<0.05) (Fig. 6, Table IV). The bone area ratios were 60.9% for experimental group 2, 46.4% for experimental group 1 and 36.0% for the control group; there was no statistically significant difference between the experimental groups and the control group (p<0.05) (Fig. 8, Table VII). 3. In comparison of the bone contact ratios at each measurement site according to magnetic intensity, experimental group 2 (5.6 mT) was higher than the control group at the 1st thread (p<0.05), and experimental group 1 (1.8 mT) was higher than the control group at the 3rd thread (p<0.05) (Fig. 7, Table V, VI). 4. In comparison of the bone area ratios at each measurement site according to magnetic intensity, experimental group 2 (5.6 mT) was higher than the control group and experimental group 1 (4.0 mT) at the 1st thread (p<0.1), and experimental group 2 (4.4 mT) was higher than experimental group 1 (1.8 mT) at the 3rd thread (p<0.1) (Fig. 9, Table IX, X). Experimental group 2 was largest, followed by experimental group 1 and the control group, at the 3rd thread of the implant. There was a significant difference between the control group and experimental group 2 at the 1st thread, and between experimental groups 1 and 2 at the 1st and 3rd threads, but not between the control group and experimental group 1 (p<0.1).

CIA-Level Driven Secure SDLC Framework for Integrating Security into SDLC Process (CIA-Level 기반 보안내재화 개발 프레임워크)

  • Kang, Sooyoung;Kim, Seungjoo
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.5
    • /
    • pp.909-928
    • /
    • 2020
  • From the early 1970s, the US government began to recognize that penetration testing could not assure the security quality of products. The results of penetration testing, such as identified vulnerabilities and faults, can vary depending on the capabilities of the team. In other words, no penetration team can give such assurance: "no vulnerabilities were found" is not equal to "the product does not have any vulnerabilities". So the US government realized that, in order to improve the security quality of products, the development process itself should be managed systematically and strictly. Therefore, from the 1980s the US government began to publish various standards related to development methodology and evaluation-procurement systems embedding the "security-by-design" concept. Security-by-design means reducing a product's complexity by considering security from the initial phases of the development lifecycle, such as requirements analysis and design, ultimately to achieve trustworthiness of the product. Since then, the security-by-design concept has spread to the private sector, from 2002 under the name Secure SDLC led by Microsoft and IBM, and is currently used in various fields such as automotive and advanced weapon systems. However, the problem is that it is not easy to implement in the field, because the standards and guidelines related to Secure SDLC contain only abstract and declarative content. Therefore, in this paper, we present a new framework for specifying the level of Secure SDLC desired by enterprises. Our proposed CIA (functional Correctness, safety Integrity, security Assurance)-level-based security-by-design framework combines the evidence-based security approach with the existing Secure SDLC. Using our methodology, first, the gap in Secure SDLC process level between a company and its competitors can be shown quantitatively. Second, it is very useful for building a Secure SDLC in the field, because the detailed activities and documents needed to reach the desired level of Secure SDLC can easily be derived.

The Influence of AH-26 and Zinc Oxide-Eugenol Root Canal Sealer on the Shear Bond Strength of Composite Resin to Dentin (AH-26 및 산화아연유지놀 근관실러가 상아질에 대한 복합레진의 전단결합강도에 미치는 영향)

  • Cho, Ju-Yeon;Jin, Myoung-Uk;Kim, Young-Kyung;Kim, Sung-Kyo
    • Restorative Dentistry and Endodontics
    • /
    • v.31 no.3
    • /
    • pp.147-152
    • /
    • 2006
  • The purpose of this study was to evaluate the influence of the AH-26 root canal sealer on the shear bond strength of composite resin to dentin. One hundred and forty-four (144) extracted, sound human molars were used. After embedding in a cylindrical mold, the occlusal part of the anatomical crown was cut away and trimmed to create a flat dentin surface. The teeth were randomly divided into three groups: the AH-26 sealer was applied to the AH-26 group, zinc-oxide eugenol (ZOE) paste was applied to the ZOE group, and the dentin surface of the control group did not receive any sealer. A mounting jig was placed against the surface of the teeth and the One-Step dentin bonding agent was applied after acid etching. Charisma composite resin was packed into the mold and light cured. After polymerization, the alignment tube and mold were removed and the specimens were placed in distilled water at 37°C for twenty-four hours. The shear bond strength was measured with an Instron testing machine. The data for each group were subjected to one-way ANOVA and Tukey's studentized rank test to make comparisons between the groups. The AH-26 group and the control group showed significantly higher shear bond strength than the ZOE group (p<0.05). There was no significant difference between the AH-26 group and the control group (p>0.05). Under the conditions of this study, the AH-26 root canal sealer did not seem to affect the shear bond strength of the composite resin to dentin, while the ZOE sealer did. Therefore, there may be no decrease in bond strength when a composite resin core is built up immediately after canal filling with AH-26 as the root canal sealer.

Research on hybrid music recommendation system using metadata of music tracks and playlists (음악과 플레이리스트의 메타데이터를 활용한 하이브리드 음악 추천 시스템에 관한 연구)

  • Hyun Tae Lee;Gyoo Gun Lim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.3
    • /
    • pp.145-165
    • /
    • 2023
  • Recommendation systems play a significant role in relieving the difficulty of selecting among the rapidly increasing amount of information caused by the development of the Internet, and in efficiently displaying information that fits individual interests. In particular, without the help of recommendation systems, E-commerce and OTT companies cannot overcome the long-tail phenomenon, in which only popular products are consumed, as the number of products and contents rapidly increases. Therefore, research on recommendation systems is actively conducted to overcome this phenomenon and to provide information or contents aligned with users' individual interests, in order to induce customers to consume various products or contents. Usually, collaborative filtering, which utilizes users' historical behavioral data, shows better performance than content-based filtering, which utilizes users' preferred contents. However, collaborative filtering can suffer from the cold-start problem, which occurs when users' historical behavioral data are lacking. In this paper, a hybrid music recommendation system that can solve the cold-start problem is proposed, based on the playlist data of the Melon music streaming service provided by Kakao Arena for the music playlist continuation competition. The goal of this research is to use the music tracks included in the playlists, together with the metadata of music tracks and playlists, to predict the remaining tracks when half or all of the tracks are masked. Therefore, two different recommendation procedures were conducted for the two different situations. When music tracks are included in the playlist, LightFM is used to utilize the track list of the playlists and the metadata of each track. Then, the result of an Item2Vec model, which uses vector embeddings of music tracks, tags and titles for recommendation, is combined with the result of the LightFM model to create the final recommendation list. When no music tracks are available in a playlist and only the playlist's tags and title are available, recommendation is made by finding similar playlists based on playlist vectors built by aggregating FastText pre-trained embedding vectors of the tags and title of each playlist. As a result, not only was the cold-start problem resolved, but the system also achieved better performance than ALS, BPR and Item2Vec by using the metadata of both music tracks and playlists. In addition, it was found that the LightFM model that uses only artist information as an item feature shows the best performance compared to other LightFM models using other item features of music tracks.
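As a concrete illustration of the cold-start branch described in this abstract, here is a minimal Python sketch. It assumes gensim-style pre-trained FastText vectors; the file name fasttext_ko.kv, the playlist data structures, and the function names are hypothetical, not from the paper.

```python
import numpy as np
from gensim.models import KeyedVectors

# Hypothetical pre-trained Korean FastText vectors; the file name is an
# assumption, not from the paper.
ft = KeyedVectors.load("fasttext_ko.kv")

def playlist_vector(tags, title_tokens):
    """Aggregate a playlist's tag and title token vectors by averaging."""
    tokens = [t for t in tags + title_tokens if t in ft]
    if not tokens:
        return np.zeros(ft.vector_size)
    return np.mean([ft[t] for t in tokens], axis=0)

def cold_start_recommend(query_tags, query_title, playlists, top_n=100):
    """Rank known playlists by cosine similarity to a track-less query
    playlist, then pool their tracks as the recommendation list."""
    q = playlist_vector(query_tags, query_title)
    scored = []
    for tags, title_tokens, tracks in playlists:
        v = playlist_vector(tags, title_tokens)
        denom = (np.linalg.norm(q) * np.linalg.norm(v)) or 1.0
        scored.append((float(q @ v) / denom, tracks))
    scored.sort(key=lambda x: x[0], reverse=True)
    recs, seen = [], set()
    for _, tracks in scored:
        for t in tracks:
            if t not in seen:
                seen.add(t)
                recs.append(t)
                if len(recs) == top_n:
                    return recs
    return recs
```

For playlists that do contain tracks, the paper instead blends LightFM and Item2Vec scores; the same cosine-ranking pattern applies to the Item2Vec side.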

The Effects of Managers on Organizational Performance in NBA and KBL Teams: The Moderating Role of Player Capabilities (프로스포츠 산업 조직 구성원의 역량에 따른 관리자의 역할: 미국프로농구(NBA)와 한국프로농구(KBL)의 감독과 선수단 전력 수준에 관한 실증연구 분석)

  • TAE SUNG, LEE;PHILSOO, KIM;SANG HYUN, LEE;SANG BUM, LEE
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship
    • /
    • v.17 no.6
    • /
    • pp.195-208
    • /
    • 2022
  • The role of a venture CEO and their intrinsic capabilities in organizational performance can be determined by the level of resource synchronization initiated by the focal manager. Despite the important role of venture CEOs, there is a systematic lack of in-depth theoretical and empirical studies examining the relationship between a CEO's capabilities and organizational performance depending on the level of resource synchronization, which motivates this investigation. To address the limitations of previous studies, this research empirically analyzes the role of managers in synchronizing the organizational resources that affect organizational performance in the professional sports industry. Based on entrepreneurship theory and the resource-based view (RBV), this research conceptualizes the roles of a venture CEO and a professional basketball head coach as very similar in terms of organizational structure and the performance mechanism embedding the entrepreneurial characteristics necessary for managing organizational resources. We hypothesized that (1) organizational resource synchronization will mediate the positive relationship between the ability of a professional basketball head coach and organizational performance, and (2) the indirect effect of the head coach's capabilities on organizational performance mediated by resource synchronization will be moderated by the capabilities of the players. To test these hypotheses, we utilized PROCESS macro model 58 with empirical data from 9 seasons (2013-2014 to 2021-2022) of 30 National Basketball Association (NBA) and 10 Korean Basketball League (KBL) teams. The statistical results showed that (1) resource synchronization mediates the positive relationship between head coach capabilities and organizational performance, and (2) the capabilities of the players moderated the indirect effect of head coach abilities on team performance via resource synchronization. This paper contributes to both the academic and practical domains of entrepreneurship by empirically testing the research model with objective professional sports data.
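PROCESS macro model 58 corresponds to a moderated mediation in which the moderator conditions both the X→M and M→Y paths. Below is a minimal regression sketch of that structure in Python with statsmodels; the data file and column names (coach, sync, perf, players) are hypothetical stand-ins, and PROCESS itself additionally bootstraps confidence intervals for the conditional indirect effect.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data file and column names: coach = head coach capability (X),
# sync = resource synchronization (M), perf = team performance (Y),
# players = player capabilities (the moderator W).
df = pd.read_csv("team_seasons.csv")

# Mediator model: M = a0 + a1*X + a2*W + a3*X*W  (W moderates the X->M path)
med = smf.ols("sync ~ coach * players", data=df).fit()

# Outcome model: Y = b0 + c'*X + b1*M + b2*W + b3*M*W  (W moderates M->Y)
out = smf.ols("perf ~ coach + sync * players", data=df).fit()

# Conditional indirect effect of X on Y at moderator level w:
# (a1 + a3*w) * (b1 + b3*w). PROCESS evaluates this with bootstrap CIs;
# here it is probed at the mean of W and at +/- one standard deviation.
a1, a3 = med.params["coach"], med.params["coach:players"]
b1, b3 = out.params["sync"], out.params["sync:players"]
m, s = df["players"].mean(), df["players"].std()
for w in (m - s, m, m + s):
    print(f"W={w:.2f}: indirect effect = {(a1 + a3 * w) * (b1 + b3 * w):.4f}")
```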

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.59-83
    • /
    • 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In deep learning sentiment analysis of English texts, the natural language sentences included in the training and test datasets are usually converted into sequences of word vectors before being entered into the deep learning models. In this case, word vectors generally refer to vector representations of the words obtained by splitting a sentence on space characters. There are several ways to derive word vectors, one of which is Word2Vec, used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data. These have been widely used in studies of sentiment analysis of reviews from various fields such as restaurants, movies, laptops, cameras, etc. Unlike in English, morphemes play an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, for the word '예쁘고', the morphemes are '예쁘' (adjective stem) and '고' (connective ending). Reflecting the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use 'morpheme vectors' as input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors to improve the classification accuracy of a deep learning model? Is it proper to apply a typical word vector model, which relies primarily on the form of words, to Korean, with its high homonym ratio? Will text preprocessing such as correcting spelling or spacing errors affect the classification accuracy, especially when drawing morpheme vectors from Korean product reviews with many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which may be encountered first when applying various deep learning models to Korean texts. As a starting point, we summarize them in three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean regarding the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can a satisfactory level of classification accuracy be achieved when applying deep learning to Korean sentiment analysis? To approach these research questions, we generate various types of morpheme vectors reflecting them and then compare the classification accuracy through a non-static CNN (Convolutional Neural Network) model taking the morpheme vectors as input. As training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive the morpheme vectors, we use data from the same domain as the target and data from another domain: about 2 million of Naver Shopping's cosmetics product reviews, and 520,000 Naver News items arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of three criteria. First, they come from two data sources: Naver News, of high grammatical correctness, and Naver Shopping's cosmetics product reviews, of low grammatical correctness. Second, they are distinguished by the degree of data preprocessing, namely only sentence splitting, or additional spelling and spacing corrections after sentence separation. Third, they vary in the form of input fed into the word vector model: whether the morphemes themselves are entered, or the morphemes with their POS tags attached. The morpheme vectors further vary depending on the consideration range of POS tags, the minimum frequency of morphemes included, and the random initialization range. All morpheme vectors are derived through the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. It appears that utilizing same-domain text even with a lower degree of grammatical correctness, performing spelling and spacing corrections as well as sentence splitting, and incorporating morphemes of all POS tags, including the incomprehensible category, lead to better classification accuracy. POS tag attachment, devised for the high proportion of homonyms in Korean, and the minimum frequency threshold for a morpheme to be included seem not to have any definite influence on the classification accuracy.
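To make the derivation concrete, here is a minimal sketch of building morpheme vectors with the CBOW settings named in this abstract (window 5, 300 dimensions). The paper does not specify a morpheme analyzer, so KoNLPy's Okt is used as an assumption, and the sample reviews are invented.

```python
from gensim.models import Word2Vec
from konlpy.tag import Okt  # morpheme analyzer choice is an assumption

okt = Okt()

# Invented sample reviews standing in for the Naver Shopping corpus.
reviews = ["배송도 빠르고 제품도 예쁘고 좋아요", "향이 별로라서 조금 실망했어요"]

def to_morphemes(sentence, attach_pos=True):
    """Split a sentence into morphemes, optionally attaching POS tags
    (the paper's device for disambiguating Korean homonyms)."""
    pairs = okt.pos(sentence)  # [(morpheme, POS), ...]
    return [f"{m}/{p}" for m, p in pairs] if attach_pos else [m for m, _ in pairs]

corpus = [to_morphemes(r) for r in reviews]

# CBOW (sg=0), context window 5, 300 dimensions, as stated in the abstract;
# min_count=1 only so the toy corpus survives (the paper treats the minimum
# frequency as an experimental variable).
model = Word2Vec(corpus, vector_size=300, window=5, sg=0, min_count=1)

print(model.wv[corpus[0][0]].shape)  # each morpheme now has a 300-dim vector
```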

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.141-154
    • /
    • 2019
  • Internet technology and social media are growing rapidly. Data mining technology has evolved to enable unstructured document representation in a variety of applications. Sentiment analysis is an important technology that can distinguish poor from high-quality content through the text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. It has been studied in various directions in terms of accuracy, from simple rule-based to dictionary-based approaches using predefined labels. Real online reviews are not only easy to collect openly; they also affect business. In marketing, real-world information from customers is gathered on websites, not through surveys. Whether a website's posts are positive or negative is reflected in sales through customer response, so businesses try to identify this information. However, many reviews on a website are not well formed and are difficult to classify. Earlier studies in this research area used review data from the Amazon.com shopping mall, but recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, etc. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity of sentiment analysis into positive and negative categories and to increase the prediction accuracy of polarity analysis using the IMDB review dataset. First, for text classification algorithms related to sentiment analysis, we adopt popular machine learning algorithms such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and gradient boosting as comparative models. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but does not consider the sequential attributes of the data. RNN handles order well because it takes the temporal information of the data into account, but it suffers from the long-term dependency problem. To solve this problem, LSTM is used. For comparison, CNN and LSTM were chosen as the simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated model were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination, and tried to figure out how well the models work for sentiment analysis and how they work. This study proposes an integrated CNN-LSTM algorithm to extract the positive and negative features in text analysis. The reasons for combining these two algorithms are as follows. CNN can extract features for classification automatically by applying convolution layers and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM's input, output, and forget gates can be opened and closed at the desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the long-term dependency problem. Furthermore, when the LSTM is placed after CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be modeled simultaneously. With the CNN-LSTM combination, 90.33% accuracy was measured. This is slower than CNN, but faster than LSTM. The presented model was more accurate than the other models. In addition, the word embedding layer can be improved when training the kernel step by step. CNN-LSTM can compensate for the weaknesses of each model, and the end-to-end structure has the advantage of improving learning layer by layer. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
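The CNN-then-LSTM layout described here (convolution for local features, LSTM after the pooling layer for their order) can be sketched briefly in Keras; the layer sizes and training settings below are assumptions for illustration, not the authors' exact hyperparameters.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000  # assumed vocabulary size
MAX_LEN = 200       # assumed review length after padding

# Conv1D extracts local n-gram features in parallel; the LSTM placed after
# the pooling layer then models the order of those features end to end.
model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 128),
    layers.Conv1D(64, 5, activation="relu"),
    layers.MaxPooling1D(pool_size=4),
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),  # positive vs. negative polarity
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# IMDB reviews, as in the study; preprocessing details are assumptions.
(x_tr, y_tr), (x_te, y_te) = tf.keras.datasets.imdb.load_data(num_words=VOCAB_SIZE)
pad = tf.keras.preprocessing.sequence.pad_sequences
model.fit(pad(x_tr, maxlen=MAX_LEN), y_tr, epochs=2, batch_size=128,
          validation_data=(pad(x_te, maxlen=MAX_LEN), y_te))
```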

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.141-166
    • /
    • 2019
  • Recently, channels like social media and SNS create enormous amounts of data. Among all kinds of data, the portion of unstructured data represented as text has increased geometrically. It is difficult to examine all of this text, so it is important to access the data rapidly and grasp the key points of a text. Due to this need for efficient understanding, many studies on text summarization for handling and using tremendous amounts of text data have been proposed. In particular, many summarization methods using machine learning and artificial intelligence algorithms have been proposed lately to generate summaries objectively and effectively, which is called "automatic summarization". However, almost all text summarization methods proposed to date construct summaries focused on the frequency of contents in the original documents. Such summaries are limited in their coverage of low-weight subjects that are mentioned less in the original text. If summaries include only the contents of major subjects, bias occurs and causes loss of information, making it hard to ascertain every subject the documents contain. To avoid such bias, it is possible to summarize with balance between the topics of a document so that every subject can be ascertained, but an unbalanced distribution between those subjects still remains. To retain the balance of subjects in a summary, it is necessary to consider the proportion of every subject the documents originally have, and also to allocate the portions of subjects equally, so that even sentences of minor subjects are sufficiently included in the summary. In this study, we propose a "subject-balanced" text summarization method that procures balance between all subjects and minimizes the omission of low-frequency subjects. For subject-balanced summaries, we use two summary evaluation concepts, "completeness" and "succinctness". Completeness is the property that a summary should fully include the contents of the original documents, and succinctness means the summary has minimal internal duplication. The proposed method has three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From the derived weights, highly related terms can be identified for every topic, and the subjects of documents can be found from the various topics composed of terms with similar meanings. Then, a few terms that represent each subject well are selected; in this method they are called "seed terms". However, these terms are too few to explain each subject sufficiently, so enough terms similar to the seed terms are needed for a well-constructed subject dictionary. Word2Vec is used for word expansion, to find terms similar to the seed terms. Word vectors are created by Word2Vec modeling, and from those vectors the similarity between all terms can be derived using cosine similarity. The higher the cosine similarity calculated between two terms, the stronger the relationship defined between them. So terms that have high similarity values with the seed terms of each subject are selected, and by filtering those expanded terms the subject dictionary is finally constructed. The next phase allocates subjects to every sentence of the original documents. To grasp the contents of all sentences, frequency analysis is first conducted with the specific terms composing the subject dictionaries. The TF-IDF weight of each subject is calculated after the frequency analysis, making it possible to figure out how much each sentence explains each subject. However, TF-IDF weights have the limitation that they can increase without bound, so by normalizing the TF-IDF weights of every subject a sentence has, all values are rescaled to the range 0 to 1. Then, each sentence is allocated to the subject with the maximum TF-IDF weight among all subjects, and sentence groups are finally constructed for each subject. The last phase is summary generation. Sen2Vec is used to figure out the similarity between subject sentences, and a similarity matrix is formed. By repetitive sentence selection, it is possible to generate a summary that fully includes the contents of the original documents and minimizes duplication within the summary itself. For evaluation of the proposed method, 50,000 TripAdvisor reviews are used for constructing the subject dictionaries and 23,087 reviews are used for generating summaries. A comparison between the proposed method's summary and a frequency-based summary is also performed; as a result, it is verified that the summary from the proposed method better retains the balance of all the subjects the documents originally have.
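As an illustration of the first two phases, here is a minimal Python sketch under stated assumptions: the toy corpus and seed terms are invented (in the paper the seed terms come from topic modeling over the TripAdvisor reviews), and simple normalized term counts stand in for the paper's normalized TF-IDF weighting.

```python
from gensim.models import Word2Vec

# Invented toy corpus and seed terms; real seed terms would come from
# topic modeling (phase 1 of the paper).
corpus = [["room", "clean", "spacious", "bed"],
          ["staff", "friendly", "helpful", "kind"],
          ["breakfast", "tasty", "cheap", "fresh"]]
seeds = {"facility": ["room", "clean"], "service": ["staff", "friendly"]}

w2v = Word2Vec(corpus, vector_size=50, window=5, min_count=1)

def build_dictionaries(seeds, topn=3):
    """Phase 1: expand each subject's seed terms with their most
    cosine-similar terms from the Word2Vec vocabulary."""
    return {subject: set(terms) | {term for t in terms
                                   for term, _ in w2v.wv.most_similar(t, topn=topn)}
            for subject, terms in seeds.items()}

def allocate(sentences, dictionaries):
    """Phase 2: score each sentence per subject (normalized term counts
    stand in for the paper's normalized TF-IDF weights) and allocate it
    to the highest-scoring subject, forming per-subject sentence groups."""
    groups = {s: [] for s in dictionaries}
    for sent in sentences:
        scores = {s: sum(tok in terms for tok in sent) / len(sent)
                  for s, terms in dictionaries.items()}
        groups[max(scores, key=scores.get)].append(sent)
    return groups

print(allocate(corpus, build_dictionaries(seeds)))
```

The final phase would then pick sentences from each group in proportion to the subjects' original weights, using sentence-vector similarity to suppress duplicates.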