• Title/Summary/Keyword: Morphological processing

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.25-38 / 2019
  • Selecting high-quality information that meets users' interests and needs from an overflow of content is becoming increasingly important. Amid this flood of information, efforts are under way to better reflect the user's intention in search results rather than treating an information request as a simple string. Large IT companies such as Google and Microsoft also focus on developing knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance in particular is a field where text data analysis is expected to be useful, because new information is generated constantly and the earlier information is obtained, the more valuable it is. Automatic knowledge extraction can be effective in areas such as the financial sector, where the flow of information is vast and new information continues to emerge. However, automatic knowledge extraction faces several practical difficulties. First, it is difficult to build corpora for different fields with the same algorithm and to extract good-quality triples. Second, producing human-labeled text data becomes harder as the extent and scope of knowledge grow and patterns are constantly updated. Third, performance evaluation is difficult because of the characteristics of unsupervised learning. Finally, defining the problem of automatic knowledge extraction is not easy because the concept of knowledge is itself ambiguous. To overcome these limits and improve the semantic performance of stock-related information search, this study extracts knowledge entities using a neural tensor network and evaluates the resulting performance. Unlike previous work, this study aims to extract knowledge entities related to individual stock items.
Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous research and to enhance the effectiveness of the model. This study makes three contributions. First, it presents a practical and simple automatic knowledge extraction method. Second, it shows that performance evaluation is possible through a simple problem definition. Finally, it increases the expressiveness of the knowledge by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. To confirm the usefulness of the presented model, the empirical study uses experts' reports on 30 individual stocks, the top 30 items by publication frequency from May 30, 2017 to May 21, 2018. Of the 5,600 reports in total, 3,074 (about 55%) are designated as the training set and the remaining 45% as the testing set. Before the model is constructed, all reports in the training set are classified by stock, and their entities are extracted using the KKMA named entity recognition tool. For each stock, the top 100 entities by frequency of appearance are selected and vectorized using one-hot encoding. Then, using a neural tensor network, one score function is trained per stock. When a new entity from the testing set appears, its score is calculated with every score function, and the stock whose function gives the highest score is predicted as the item related to that entity. To evaluate the presented model, we assess its predictive power and whether the score functions are well constructed by calculating the hit ratio over all reports in the testing set.
In the empirical study, the presented model shows 69.3% hit accuracy on the testing set of 2,526 reports. This hit ratio is meaningfully high despite some constraints on the research. Looking at the prediction performance for each stock, only three stocks, LG ELECTRONICS, KiaMtr, and Mando, show performance far below the average. This result may be due to interference from other similar items and the generation of new knowledge. In this paper, we propose a methodology for finding the key entities, or combinations of them, needed to search for related information in accordance with the user's investment intention. Graph data is generated using only the named entity recognition tool and fed to the neural tensor network without a learning corpus or word vectors for the field. The empirical test confirms the effectiveness of the presented model as described above. However, some limitations remain to be addressed. Most notably, the fact that model performance is especially poor for only some stocks shows the need for further research. Finally, through the empirical study, we confirmed that the learning method presented here can be used to semantically match new text information with related stocks.
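
The scoring and evaluation steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly used neural tensor network score form u^T tanh(e1^T W e2 + V[e1; e2] + b) applied to a pair of one-hot entity vectors, and it uses random, untrained parameters. The names `ntn_score`, `predict_stock`, and `hit_ratio`, the number of tensor slices `k`, and the choice to score entity pairs are all assumptions made for the example.

```python
import numpy as np

def one_hot(index, size):
    """Return a one-hot vector of length `size` with a 1 at `index`."""
    vec = np.zeros(size)
    vec[index] = 1.0
    return vec

def ntn_score(e1, e2, W, V, b, u):
    """Assumed NTN score: u^T tanh(e1^T W[1..k] e2 + V [e1; e2] + b)."""
    bilinear = np.einsum("i,kij,j->k", e1, W, e2)  # one bilinear form per tensor slice
    linear = V @ np.concatenate([e1, e2])          # standard linear layer on the pair
    return float(u @ np.tanh(bilinear + linear + b))

def predict_stock(e1, e2, params_per_stock):
    """Score the pair with every stock's function and return the best stock index."""
    scores = [ntn_score(e1, e2, *p) for p in params_per_stock]
    return int(np.argmax(scores))

def hit_ratio(predicted, actual):
    """Fraction of predictions that match the true stock labels."""
    return float(np.mean(np.asarray(predicted) == np.asarray(actual)))

# Hypothetical setup: 3 stocks, 100 one-hot entities per stock, 4 tensor slices.
rng = np.random.default_rng(0)
n_stocks, dim, k = 3, 100, 4
params = [
    (
        rng.normal(size=(k, dim, dim)),  # W: bilinear tensor
        rng.normal(size=(k, 2 * dim)),   # V: linear weights
        rng.normal(size=k),              # b: bias
        rng.normal(size=k),              # u: output weights
    )
    for _ in range(n_stocks)             # untrained, random parameters for illustration
]

pred = predict_stock(one_hot(5, dim), one_hot(42, dim), params)
print("predicted stock index:", pred)
```

In a real pipeline the parameters would be trained per stock on the training-set entities, and `hit_ratio` would be computed over the predictions for all testing-set reports.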

Selection and Cultural Characteristics of Whole Chicken Feather-Degrading Bacterium, Bacillus sp. SMMJ-2 (Whole Chicken Feather-Degrading Keratinolytic Protease 생산균주의 분리 및 특성)

  • Park Sung-Min;Jung Hyuck-Jun;Yu Tae-Shick
    • Microbiology and Biotechnology Letters / v.34 no.1 / pp.7-14 / 2006
  • Feather, generated in large quantities as a byproduct of commercial poultry processing, is almost pure keratin, which is not easily degraded by common proteases. Four strains, SMMJ-2, FL-3, NO-4 and RM-12, were isolated from soil for the production of extracellular keratinolytic protease. They were identified as Bacillus sp. based on their morphological and physiological characteristics. They showed high protease activity on 5.0% skim milk agar medium and produced a mucoid-like substance on keratin agar medium. Bacillus sp. SMMJ-2 produced keratinolytic protease faster than the other strains. This strain did not completely degrade whole chicken feather within five days in basal medium, but it completely degraded whole chicken feather within 40 hours when supplied with a nitrogen source in keratinolytic protease-producing medium (0.7% $K_{2}HPO_{4}$, 0.2% $KH_{2}PO_{4}$, 0.1% fructose, 1.2% whole chicken feather, 0.01% $Na_{2}CO_{3}$, pH 7.0). With chicken feather as the nitrogen source, keratinolytic protease activity was 89 units/ml/min. With soybean meal as the nitrogen source, keratinolytic protease production reached a maximum of 106 units/ml/min after 48 hours at $30^{\circ}C$ with agitation at 180 rpm. To isolate the keratinolytic protease, the culture filtrate was precipitated with $(NH_{4})_{2}SO_{4}$ and acetone. The recovery rate of keratinolytic protease was about 96% after treatment with 50% acetone. The enzyme was stable in the range of $30{\sim}50^{\circ}C$ and pH $6.0{\sim}12.0$.

Development of Marker-free TaGlu-Ax1 Transgenic Rice Harboring a Wheat High-molecular-weight Glutenin Subunit (HMW-GS) Protein (벼에서 밀 고분자 글루테닌 단백질(TaGlu-Ax1) 발현을 통하여 쌀가루 가공적성 증진을 위한 마커프리(marker-free) 형질전환 벼의 개발)

  • Jeong, Namhee;Jeon, Seung-Ho;Kim, Dool-Yi;Lee, Choonseok;Ok, Hyun-Choong;Park, Ki-Do;Hong, Ha-Cheol;Lee, Seung-Sik;Moon, Jung-Kyung;Park, Soo-Kwon
    • Journal of Life Science / v.26 no.10 / pp.1121-1129 / 2016
  • High-molecular-weight glutenin subunits (HMW-GSs) are extremely important determinants of the functional properties of wheat dough. To enhance the bread-making quality of rice dough, transgenic rice plants containing the wheat TaGlu-Ax1 gene, which encodes an HMW-GS from the Korean wheat cultivar ‘Jokyeong’, were produced using the Agrobacterium-mediated co-transformation method. Two expression cassettes on separate DNA fragments, one containing only TaGlu-Ax1 and the other only the hygromycin phosphotransferase II (HPTII) resistance gene, were introduced separately into the Agrobacterium tumefaciens EHA105 strain for co-infection. Rice calli were infected with the EHA105 strains harboring TaGlu-Ax1 or HPTII at a 3:1 ratio of TaGlu-Ax1 to HPTII. Among 210 hygromycin-resistant T0 plants, 20 transgenic lines harboring both the TaGlu-Ax1 and HPTII genes in the rice genome were obtained. The integration of the TaGlu-Ax1 gene into the rice genome was reconfirmed by Southern blot analysis. The transcripts and proteins of wheat TaGlu-Ax1 were stably expressed in rice T1 seeds. Finally, marker-free plants harboring only the TaGlu-Ax1 gene were successfully screened in the T1 generation. There were no morphological differences between the wild-type and marker-free transgenic plants. A single HMW-GS (TaGlu-Ax1) alone was not sufficient for bread making with transgenic rice dough. Greater numbers and combinations of wheat HMW-GSs, LMW-GSs, and gliadins are required to further improve the processing qualities of rice dough. TaGlu-Ax1 marker-free transgenic plants could provide good materials for developing transgenic rice with improved bread-making qualities.