• Title/Summary/Keyword: kernel

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.141-154 / 2019
  • Internet technology and social media are growing rapidly, and data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor from high-quality content through text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning them to predefined categories such as positive and negative. It has been studied in various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. Because real online reviews are openly available to others, they are not only easy to collect but also affect business. In marketing, real-world information from customers is gathered on websites rather than through surveys. Whether a website's posts are positive or negative is reflected in sales, so firms try to identify this information. However, many reviews on a website are not always good, and they are difficult to identify. Earlier studies in this research area used review data from the Amazon.com shopping mall, but recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, etc. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity of sentiment into positive and negative categories and to increase the prediction accuracy of polarity analysis using the pretrained IMDB review data set. 
First, as comparative models, the study adopts popular machine learning algorithms for text classification in sentiment analysis: NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost. Second, deep learning has been shown to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider the sequential attributes of data. RNN handles ordered data well because it takes the time information of the data into account, but it suffers from the long-term dependency problem. LSTM is used to solve this problem. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to the classical machine learning algorithms, CNN, LSTM, and the integrated model were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination, and we tried to understand how, and how well, the models work for sentiment analysis. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features in text analysis. The reasons for combining the two algorithms are as follows. CNN can automatically extract features for classification by applying convolution layers and massively parallel processing, whereas LSTM is not capable of highly parallel processing. Like faucets, the input, output, and forget gates of the LSTM can be opened and closed at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. 
Furthermore, when LSTM is attached to CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be learned simultaneously. The combined CNN-LSTM model achieved 90.33% accuracy. It is slower than CNN but faster than LSTM, and it was more accurate than the other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can compensate for the weakness of each model, and the end-to-end structure of LSTM has the advantage of improving learning layer by layer. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
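
The pipeline this abstract describes (convolution, then pooling, then an LSTM with input, forget, and output gates) can be sketched as a forward pass in plain Python. This is a minimal illustration only, not the authors' implementation. The layer sizes, the random untrained weights, the sigmoid output unit, and all function names (`conv1d_relu`, `max_pool`, `lstm_last_hidden`, `predict_positive`, `random_params`) are assumptions made for the example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv1d_relu(seq, kernels):
    """Valid 1D convolution over a list of embedding vectors, followed by ReLU.
    seq: T vectors of length D; kernels: F kernels, each K vectors of length D.
    Returns T-K+1 feature vectors of length F."""
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        out.append([
            max(0.0, sum(w * x for wrow, xrow in zip(kern, window)
                         for w, x in zip(wrow, xrow)))
            for kern in kernels
        ])
    return out

def max_pool(seq, size):
    """Non-overlapping max pooling along the time axis."""
    return [[max(col) for col in zip(*seq[i:i + size])]
            for i in range(0, len(seq) - size + 1, size)]

def lstm_last_hidden(seq, weights, hidden):
    """Run one LSTM layer over seq and return the final hidden state.
    weights[name] holds `hidden` rows over the concatenation [x, h, 1] (bias last)."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in seq:
        z = x + h + [1.0]
        pre = {name: [sum(w * v for w, v in zip(row, z)) for row in weights[name]]
               for name in ("input", "forget", "output", "cand")}
        i_gate = [sigmoid(v) for v in pre["input"]]    # how much new content to write
        f_gate = [sigmoid(v) for v in pre["forget"]]   # how much old memory to keep
        o_gate = [sigmoid(v) for v in pre["output"]]   # how much memory to expose
        cand = [math.tanh(v) for v in pre["cand"]]
        c = [f * cj + i * g for f, cj, i, g in zip(f_gate, c, i_gate, cand)]
        h = [o * math.tanh(cj) for o, cj in zip(o_gate, c)]
    return h

def random_params(embed_dim=4, n_filters=3, kernel_size=3, pool=2, hidden=2, seed=0):
    """Untrained random weights; sizes are illustrative assumptions."""
    rng = random.Random(seed)
    def u():
        return rng.uniform(-0.5, 0.5)
    return {
        "kernels": [[[u() for _ in range(embed_dim)] for _ in range(kernel_size)]
                    for _ in range(n_filters)],
        "pool": pool,
        "hidden": hidden,
        "lstm": {name: [[u() for _ in range(n_filters + hidden + 1)]
                        for _ in range(hidden)]
                 for name in ("input", "forget", "output", "cand")},
        "out": [u() for _ in range(hidden + 1)],
    }

def predict_positive(embedded_review, params):
    """Convolution -> max pooling -> LSTM -> sigmoid score for the positive class."""
    feats = conv1d_relu(embedded_review, params["kernels"])
    pooled = max_pool(feats, params["pool"])
    h = lstm_last_hidden(pooled, params["lstm"], params["hidden"])
    return sigmoid(sum(w * v for w, v in zip(params["out"], h + [1.0])))

if __name__ == "__main__":
    rng = random.Random(1)
    review = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(10)]  # 10 tokens
    print(predict_positive(review, random_params()))
```

Because the weights are untrained, the printed score carries no meaning; the sketch only shows how pooled convolutional features become the time steps that the LSTM gates process.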

A New Medium Maturing and High Quality Rice Variety with Lodging and Disease Resistance, 'Jinbo' (중생 고품질 내도복 내병성 벼 품종 '진보')

  • Kim, Jeong-Il;Park, No-Bong;Lee, Ji-Yoon;Park, Dong-Soo;Yeo, Un-Sang;Chang, Jae-Ki;Kang, Jung-Hun;Oh, Byeong-Geun;Kwon, Oh-Deog;Kwak, Do-Yeon;Lee, Jong-Hee;Yi, Gi-Hwan;Kim, Chun-Song;Song, You-Cheon;Cho, Jun-Hyun;Nam, Min-Hee;Choung, Jin-Il;Shin, Mun-Sik;Jeon, Myeong-Gi;Yang, Sae-Jun;Kang, Hang-Weon;Ahn, Jin-Gon;Kim, Jae-Kyu
    • Korean Journal of Breeding Science / v.43 no.3 / pp.165-171 / 2011
  • 'Jinbo' is a new japonica rice variety (Oryza sativa L.) with good eating quality, lodging tolerance, and resistance to rice stripe virus (RSV) and bacterial blight (BB). It was developed by the rice breeding team of Yeongdeog Substation, National Institute of Crop Science (NICS), RDA in 2009. The variety was derived from a cross made in the 1998 summer season between 'Yeongdeog26', which has good grain quality and wind tolerance, and 'Koshihikari', which has good eating quality. A promising line, YR21324-56-1-1, selected by the pedigree breeding method, was designated 'Yeongdeog45' in 2005. After a local adaptability test carried out at nine locations from 2006 to 2008, 'Yeongdeog45' was released under the name 'Jinbo' in 2009. 'Jinbo' has a short culm length of 74 cm and a medium-maturing growth duration. It is resistant to the $K_1$, $K_2$, and $K_3$ races of bacterial blight and to stripe virus, and moderately resistant to leaf blast with durable resistance; it also tolerates unfavorable environments such as cold and dry wind. 'Jinbo' has translucent, clear milled rice kernels without white core or white belly, and good eating quality according to a panel test. The yield potential of 'Jinbo' in milled rice is about 5.65 MT/ha at the ordinary fertilizer level in the local adaptability test. This cultivar would be adaptable to the middle plain, mid-west coastal area, southeastern coastal area, and southern mid-mountainous area.

A New Medium Maturing and High Quality Rice Variety with Lodging and Disease Resistance, 'Haeoreumi' (중생 고품질 내도복 내병성 벼 품종 '해오르미')

  • Kim, Jeong-Il;Park, No-Bong;Park, Dong-Soo;Lee, Ji-Yoon;Yeo, Un-Sang;Chang, Jae-Ki;Kang, Jung-Hun;Oh, Byeong-Geun;Kwon, Oh-Deog;Kwak, Do-Yeon;Lee, Jong-Hee;Yi, Gihwan;Kim, Chun-Song;Song, You-Cheon;Cho, Jun-Hyun;Nam, Min-Hee;Choung, Jin-Il;Shin, Mun-Sik;Jeon, Myeong-Gi;Yang, Sae-Jun;Kang, Hang-Weon;Ahn, Jin-Gon;Kim, Jae-Kyu
    • Korean Journal of Breeding Science / v.42 no.6 / pp.638-644 / 2010
  • 'Haeoreumi' is a new japonica rice variety (Oryza sativa L.) with lodging tolerance, resistance to rice stripe virus (RSV) and bacterial leaf blight (BLB), and high grain quality. It was developed by the rice breeding team of Yeongdeog Substation, National Institute of Crop Science (NICS), RDA in 2008. The variety was derived from a cross made in the winter season of 2000/2001 between 'Milyang165', which has good grain quality and lodging resistance, and 'Haepyeongbyeo', which has wind tolerance. A promising line, YR22375-B-B-1, selected by the pedigree breeding method, was designated 'Yeongdeog46' in 2005. After a local adaptability test carried out at nine locations from 2006 to 2008, 'Yeongdeog46' was released under the name 'Haeoreumi' in 2008. 'Haeoreumi' has a short culm length of 74 cm and a medium-maturing growth duration. It showed resistance to the $K_1$, $K_2$, and $K_3$ races of bacterial blight and to stripe virus, and moderate resistance to leaf blast with durable resistance; it also tolerates unfavorable environments such as cold, dry, and cold salty wind. 'Haeoreumi' has translucent, clear milled rice kernels without white core or white belly, and good eating quality according to a panel test. The yield potential of 'Haeoreumi' in milled rice is about 5.58 MT/ha at the ordinary fertilizer level in the local adaptability test. This cultivar would be adaptable to the middle plain, mid-west coastal area, and southeastern coastal area.

Robo-Advisor Algorithm with Intelligent View Model (지능형 전망모형을 결합한 로보어드바이저 알고리즘)

  • Kim, Sunwoong
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.39-55 / 2019
  • Recently, banks and large financial institutions have introduced many Robo-Advisor products. A Robo-Advisor is a robot that produces the optimal asset allocation portfolio for investors using financial engineering algorithms without any human intervention. Since its first introduction on Wall Street in 2008, the market has grown to 60 billion dollars and is expected to expand to 2,000 billion dollars by 2020. Because Robo-Advisor algorithms suggest asset allocations to investors, mathematical or statistical asset allocation strategies are applied. The mean-variance optimization model developed by Markowitz is the typical asset allocation model. It is a simple but quite intuitive portfolio strategy: assets are allocated so as to minimize the risk of the portfolio while maximizing its expected return using optimization techniques. Despite its theoretical background, both academics and practitioners find that the standard mean-variance optimization portfolio is very sensitive to the expected returns calculated from past price data, and corner solutions that allocate only to a few assets are often found. The Black-Litterman optimization model overcomes these problems by choosing a neutral Capital Asset Pricing Model equilibrium point. Implied equilibrium returns of each asset are derived from the equilibrium market portfolio through reverse optimization. The Black-Litterman model uses a Bayesian approach to combine subjective views on the price forecast of one or more assets with the implied equilibrium returns, resulting in new estimates of risk and expected returns. These new estimates can produce the optimal portfolio through the well-known Markowitz mean-variance optimization algorithm. If the investor has no views on the asset classes, the Black-Litterman optimization model produces the same portfolio as the market portfolio. What if the subjective views are incorrect? 
A survey of reports on the performance of stocks recommended by securities analysts shows very poor results. Therefore, incorrect views combined with implied equilibrium returns may produce very poor portfolio output for Black-Litterman model users. This paper suggests an objective investor views model based on Support Vector Machines (SVM), which have shown good performance in stock price forecasting. SVM is a discriminative classifier defined by a separating hyperplane. Linear, radial basis, and polynomial kernel functions are used to learn the hyperplanes. Input variables for the SVM are returns, standard deviations, Stochastics %K, and price parity degree for each asset class. The SVM outputs expected stock price movements and their probabilities, which are used as input variables in the intelligent views model. The stock price movements are categorized into three phases: down, neutral, and up. The expected stock returns make up the P matrix and their probability results are used in the Q matrix. The implied equilibrium returns vector is combined with the intelligent views matrix, resulting in the Black-Litterman optimal portfolio. For comparison, the Markowitz mean-variance optimization model and the risk parity model are used. The value-weighted and equal-weighted market portfolios are used as benchmark indexes. We collect the 8 KOSPI 200 sector indexes from January 2008 to December 2018, comprising 132 monthly index values. The training period is from 2008 to 2015 and the testing period is from 2016 to 2018. Our suggested intelligent view model combined with implied equilibrium returns produced the optimal Black-Litterman portfolio. In the out-of-sample period, this portfolio performed better than the well-known Markowitz mean-variance optimization portfolio, the risk parity portfolio, and the market portfolio. The total return of the Black-Litterman portfolio over the 3-year period is 6.4%, which is the highest value. 
The maximum drawdown is -20.8%, which is also the lowest value. The Sharpe ratio, which measures the return-to-risk ratio, is the highest at 0.17. Overall, our suggested view model shows the possibility of replacing subjective analysts' views with an objective view model so that practitioners can apply Robo-Advisor asset allocation algorithms in real trading.
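
The reverse-optimization step and the three reported performance measures (total return, maximum drawdown, Sharpe ratio) can be illustrated with a short Python sketch. It assumes the common textbook form of reverse optimization, in which implied returns equal a risk-aversion coefficient times the covariance matrix times the market weights, and a per-period Sharpe ratio without annualization. The abstract does not state which conventions the paper used, and all function names and input numbers below are hypothetical.

```python
import math

def implied_equilibrium_returns(cov, market_weights, risk_aversion):
    """Reverse optimization (assumed textbook form): pi = delta * Sigma * w_mkt."""
    return [risk_aversion * sum(c * w for c, w in zip(row, market_weights))
            for row in cov]

def total_return(returns):
    """Compound a series of periodic simple returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0

def max_drawdown(returns):
    """Largest peak-to-trough loss of the cumulative wealth path (a value <= 0)."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation (not annualized)."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((e - mean) ** 2 for e in excess) / (len(excess) - 1)
    return mean / math.sqrt(var)

if __name__ == "__main__":
    # Hypothetical two-asset covariance matrix and market weights.
    cov = [[0.04, 0.01], [0.01, 0.09]]
    print(implied_equilibrium_returns(cov, [0.5, 0.5], risk_aversion=2.0))
    # Hypothetical monthly portfolio returns.
    monthly = [0.10, -0.20, 0.05]
    print(total_return(monthly), max_drawdown(monthly), sharpe_ratio(monthly))
```

Combining these implied returns with the SVM-based views requires the Black-Litterman posterior formula, which is outside the scope of this sketch.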

Varietal and Locational Variation of Grain Quality Components of Rice Produced in Middle and Southern Plain Areas in Korea (중ㆍ남부 평야지산 발 형태 및 이화학적 특성의 품종 및 산지간 변이)

  • Choi, Hae-Chune;Chi, Jeong-Hyun;Lee, Chong-Seob;Kim, Young-Bae;Cho, Soo-Yeon
    • KOREAN JOURNAL OF CROP SCIENCE / v.39 no.1 / pp.15-26 / 1994
  • To understand the relative contribution of varietal and environmental variation to various grain quality components in rice, grain appearance, milling recovery, several physicochemical properties of rice grain, and the texture or palatability of cooked rice were evaluated for milled rice of seven cultivars (five japonica and two Tongil-type) produced at six locations in the middle and southern plain areas of Korea in 1989, and the data obtained were analyzed. Highly significant varietal variations were detected in all grain quality components, and marked locational variations, accounting for about 14-54% of total variation, were recognized in grain appearance, milling recovery, alkali digestibility, protein content, K/Mg ratio, gelatinization temperature, and breakdown and setback viscosities. Variety x location interaction was especially large for the overall palatability score of cooked rice and for the consistency or setback viscosities of the amylograph. Tongil-type cultivars showed poor marketing quality, lower milling recovery, slightly lower alkali digestibility and amylose content, slightly higher protein content and K/Mg ratio, relatively higher peak, breakdown, and consistency viscosities, significantly lower setback viscosity, and less desirable palatability of cooked rice compared with japonica rices. The japonica varieties with good palatability of cooked rice were slightly low in protein content and slightly high in K/Mg ratio and stickiness/hardness ratio of cooked rice. The 1000-kernel weight was significantly heavier in rice produced in the Iri lowland than at other locations. Milling recovery from rough to brown rice and ripening quality were lowest in Milyang late-planted rice and highest in Iri lowland and Gyehwa reclaimed-land rice. Amylose content of milled rice was about 1% lower in Gyehwa rice than at other locations. 
Protein content of polished rice was about 1% lower in rice from the middle plain area than in rice from the southern plain regions. The K/Mg ratio of milled rice was lowest in Iri rice and highest in Milyang rice. Alkali digestibility was highest in Milyang rice and lowest in Honam plain rice, whereas the temperature of gelatinization initiation of rice flour in the amylograph was lowest in Suwon and Iri rices and highest in Milyang rice. Breakdown viscosity was lowest in Milyang rice, next lowest in Ichon lowland rice, and highest in Gyehwa and Iri rices; setback viscosity showed the opposite tendency. The stickiness/hardness ratio of cooked rice was slightly lower in southern-plain rices than in middle-plain ones, and the palatability of cooked rice was best in Namyang reclaimed-land rice, followed in order by Suwon $\geq$ Iri $\geq$ Ichon $\geq$ Gyehwa $\geq$ Milyang rices. By principal component analysis, the rice materials can be classified genotypically into two ecotypes, japonica and Tongil-type, and environmentally into three regions, Milyang, middle, and Honam lowland, according to their distribution on the plane of the 1st and 2nd principal components extracted from eleven grain quality properties closely associated with the palatability of cooked rice.
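
The share of total variation attributed to variety, location, and their interaction can be illustrated with a two-way sum-of-squares partition. The sketch below assumes one trait value per variety-location cell, so the interaction term is confounded with residual error. The abstract does not state how the authors computed their percentages, and the function name and input table are hypothetical.

```python
def partition_variation(table):
    """Split total variation of a trait into variety, location, and interaction shares.
    table[i][j] is the trait value for variety i at location j (one value per cell)."""
    n_var, n_loc = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_var * n_loc)
    variety_means = [sum(row) / n_loc for row in table]
    location_means = [sum(table[i][j] for i in range(n_var)) / n_var
                      for j in range(n_loc)]
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_variety = n_loc * sum((m - grand) ** 2 for m in variety_means)
    ss_location = n_var * sum((m - grand) ** 2 for m in location_means)
    # With no replication, what remains is interaction plus error.
    ss_interaction = ss_total - ss_variety - ss_location
    return {
        "variety": ss_variety / ss_total,
        "location": ss_location / ss_total,
        "interaction": ss_interaction / ss_total,
    }

if __name__ == "__main__":
    # Hypothetical protein contents (%) for 2 varieties at 3 locations.
    protein = [[6.8, 7.4, 7.9],
               [7.5, 7.8, 8.9]]
    for source, share in partition_variation(protein).items():
        print(f"{source}: {share:.1%}")
```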
