• Title/Summary/Keyword: Corpus analysis


Construction of Consumer Confidence index based on Sentiment analysis using News articles (뉴스기사를 이용한 소비자의 경기심리지수 생성)

  • Song, Minchae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.1-27
    • /
    • 2017
  • Economic sentiment indices and macroeconomic indicators are known to be closely related, because economic agents' judgments and forecasts of business conditions affect economic fluctuations. For this reason, consumer sentiment or confidence provides steady fodder for business and is treated as an important piece of economic information. In Korea, the consumer sentiment index is highly relevant to private consumption and is therefore a very important economic indicator for evaluating and forecasting the domestic economic situation. However, despite offering relevant insights into private consumption and GDP, the traditional survey-based approach to measuring consumer confidence has several limitations. One weakness is that researching, collecting, and aggregating the data takes considerable time. If an urgent issue arises, timely information is not announced until the end of the month. In addition, the survey contains only information derived from its questionnaire items, so it can be difficult to capture the direct effects of newly arising issues. The survey also faces potential declines in response rates and erroneous responses. Therefore, it is necessary to find a way to complement it. For this purpose, we construct and assess an index that measures consumer economic sentiment using sentiment analysis. Unlike survey-based measures, our index relies on textual analysis to extract sentiment from economic and financial news articles. In particular, text data such as news articles and SNS posts are timely and cover a wide range of issues; because such sources can quickly capture the economic impact of specific economic issues, they have great potential as economic indicators. Of the two main approaches to the automatic extraction of sentiment from text, we apply the lexicon-based approach, using sentiment lexicons, that is, dictionaries of words annotated with their semantic orientations. 
In creating the sentiment lexicons, we enter the semantic orientation of individual words manually, but we do not attempt a full linguistic analysis (one that involves analysis of word senses or argument structure); this is a limitation of our research, and further work in that direction remains possible. In this study, we generate a time series index of economic sentiment in the news. The construction of the index consists of three broad steps: (1) collecting a large corpus of economic news articles from the web, (2) applying lexicon-based sentiment analysis to score each article in terms of sentiment orientation (positive, negative, or neutral), and (3) constructing an economic sentiment index of consumers by aggregating monthly time series for each sentiment word. In line with existing scholarly assessments of the relationship between the consumer confidence index and macroeconomic indicators, any new index should be assessed for its usefulness. We examine the new index's usefulness by comparing it with the CSI and other economic indicators. To this end, trend and cross-correlation analyses are carried out to examine the relationships and lag structure. Finally, we analyze forecasting power using one-step-ahead out-of-sample prediction. In almost all experiments, the news sentiment index correlates strongly with related contemporaneous key indicators. Furthermore, in most cases, news sentiment shocks predict future economic activity; in head-to-head comparisons, the news sentiment measures outperform the survey-based sentiment index, the CSI. Policy makers want to understand consumer or public opinions about existing or proposed policies. 
Monitoring various web media, SNS, and news articles enables the relevant government decision-makers to respond quickly to such opinions. Textual data, such as news articles and social media (Twitter, Facebook, and blogs), are generated at high speed and cover a wide range of issues; because such sources can quickly capture the economic impact of specific economic issues, they have great potential as economic indicators. Although research using unstructured data in economic analysis is in its early stages, the use of such data is expected to increase greatly once its usefulness is confirmed.
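
The three construction steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the whitespace tokenizer, and the balance formula (positive minus negative articles, divided by all articles in the month) are assumptions made only for illustration, and the sketch aggregates at the article level rather than per sentiment word.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> semantic orientation (+1 positive, -1 negative).
LEXICON = {"growth": 1, "recovery": 1, "recession": -1, "unemployment": -1}


def score_article(text):
    """Return +1, -1, or 0 for an article by summing lexicon orientations."""
    total = sum(LEXICON.get(token, 0) for token in text.lower().split())
    return (total > 0) - (total < 0)


def monthly_index(articles):
    """articles: iterable of (month, text) pairs, month formatted as 'YYYY-MM'.

    Returns {month: (positive - negative) / total}, an assumed balance
    statistic, not the formula used in the paper.
    """
    counts = defaultdict(lambda: [0, 0, 0])  # positive, negative, total
    for month, text in articles:
        score = score_article(text)
        bucket = counts[month]
        bucket[0] += score > 0
        bucket[1] += score < 0
        bucket[2] += 1
    return {m: (pos - neg) / n for m, (pos, neg, n) in sorted(counts.items())}
```

A real pipeline would need a Korean morphological analyzer in place of `split()` and a far larger, manually annotated lexicon, as the abstract notes.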

Effect of Gonadotropin Releasing Hormone-Agonist on Apoptosis of Luteal Cells in Pregnant Rat (Gonadotropin Releasing Hormone-Agonist가 임신된 흰쥐 황체세포의 세포자연사에 미치는 영향)

  • 양현원;김종석;박철홍;윤용달
    • Development and Reproduction
    • /
    • v.6 no.2
    • /
    • pp.131-139
    • /
    • 2002
  • Since GnRH and its receptor genes are expressed in the ovary, it has been suggested that ovarian GnRH might be involved in the regulation of ovarian function and the apoptosis of ovarian cells. However, the expression and function of GnRH and its receptor in the corpus luteum have not been well characterized. The present study was undertaken to investigate whether GnRH and its receptor are expressed in luteal cells and whether GnRH has any effect on the apoptosis of luteal cells. Luteal cells obtained from pregnant rats were cultured and stained for GnRH and its receptor proteins. Cultured luteal cells showed distinct immunoreactivity to both anti-GnRH and anti-GnRH receptor antibodies. In addition, the presence of GnRH receptor protein in cultured cells was confirmed by Western blot analysis. To investigate the effect of GnRH on the apoptosis of luteal cells, luteal cells were cultured in the presence of $10^{-6}$ M GnRH-agonist (GnRH-Ag) for 3, 8, and 12 h. The TUNEL assay showed that the number of cells undergoing apoptosis increased after 12 h of culture (P<0.05). DNA fragmentation analysis confirmed this result: the cells treated for 12 h showed the greatest increase in fragmentation (P<0.05). Further, Western blot analysis of cytochrome c in the mitochondrial and cytoplasmic fractions of the luteal cells showed that GnRH-Ag treatment increased the content of cytochrome c in the cytoplasm. These results demonstrate that luteal cells express GnRH and its receptor and that GnRH-Ag treatment induces apoptosis of luteal cells via mitochondrial release of cytochrome c. The present study suggests that the release of cytochrome c from mitochondria might be involved in the luteal cell apoptosis induced by GnRH-Ag.


The Effects of Unpredictable Stress on the LHR Expression and Reproductive Functions in Mouse Models (실험적 마우스 모델에서 예측 불가능한 스트레스가 황체형성호르몬 수용체의 발현과 생식기능에 미치는 영향에 관한 연구)

  • Choi, Sung-Young;Park, Jin-Heum;Zhu, Yuxia;Kim, Young-Jong;Park, Jae-Ok;Moon, Changjong;Shin, Taekyun;Ahn, Meejung;Kim, Suk-Soo;Park, Young-Sik;Chae, Hyung-Bok;Kim, Tae-Kyun;Kim, Seung-Joon
    • Journal of Veterinary Clinics
    • /
    • v.31 no.5
    • /
    • pp.394-402
    • /
    • 2014
  • The objective of this study was to investigate the effect of chronic unpredictable stress on reproductive function and ovarian luteinizing hormone receptor (LHR) expression. Nine-week-old C57BL/6 female mice were randomly divided into two groups: a control group and a stressed group. Mice were stressed twice a day for 35 days with stressors randomly selected from 12 different types. The results demonstrate a significant increase in anxiety-related behaviors (P < 0.05), a decrease in the rate of body weight gain (P < 0.01), and a decrease in average litter size (P < 0.01) in stressed mice compared with the control group. Furthermore, the rates of primary, secondary, and early antral follicles in stressed mice decreased significantly (P < 0.05), whereas that of atretic follicles increased significantly compared with control mice (P < 0.01). Immunohistochemical analysis revealed reduced LHR expression in the granulosa cells of follicles and the luteal cells of the corpus luteum in response to chronic unpredictable stress. Western blot analysis revealed a significant decrease in LHR expression in the ovaries of stressed mice compared with the control (P < 0.05). These results suggest that ovarian LHR expression is affected by chronic unpredictable stress and that the modulated ovarian LHR is responsible for ovarian follicular maldevelopment and reproductive dysfunction.

KB-BERT: Training and Application of Korean Pre-trained Language Model in Financial Domain (KB-BERT: 금융 특화 한국어 사전학습 언어모델과 그 응용)

  • Kim, Donggyu;Lee, Dongwook;Park, Jangwon;Oh, Sungwoo;Kwon, Sungjun;Lee, Inyong;Choi, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.2
    • /
    • pp.191-206
    • /
    • 2022
  • Recently, using a pre-trained language model (PLM) has become the de facto approach to achieving state-of-the-art performance on various natural language tasks (called downstream tasks), such as sentiment analysis and question answering. However, like any other machine learning method, a PLM tends to depend on the data distribution seen during the training phase and performs worse on unseen (out-of-distribution) domains. For this reason, there have been many efforts to develop domain-specific PLMs for fields such as the medical and legal industries. In this paper, we discuss the training of a finance-specific PLM for the Korean language and its applications. Our finance-specific PLM, KB-BERT, is trained on a carefully curated financial corpus that includes domain-specific documents such as financial reports. We provide extensive performance evaluation results on three natural language tasks: topic classification, sentiment analysis, and question answering. Compared with state-of-the-art Korean PLMs such as KoELECTRA and KLUE-RoBERTa, KB-BERT shows comparable performance on general datasets based on common corpora such as Wikipedia and news articles. Moreover, KB-BERT outperforms the compared models on finance domain datasets that require finance-specific knowledge to solve the given problems.

A Comparative Analysis of Basal Body Temperature to Ultrasound, as a Method of Ovulation Detection in Induced Ovulatory Menstrual Cycles (배란유도주기에 따른 초음파검사와 기초체온표의 비교분석)

  • Choi, W.;Suh, B.H.;Lee, J.H.
    • Clinical and Experimental Reproductive Medicine
    • /
    • v.12 no.2
    • /
    • pp.25-37
    • /
    • 1985
  • Four points on the basal body temperature (BBT) curve were correlated with the estimated time of ovulation, as determined by serial ultrasound, in 50 induced menstrual cycles from 22 subjects. The time of ovulation was estimated by measuring the maximal diameter of follicles and observing the morphologic changes within the ovary from follicle to corpus luteum. The results were as follows. 1. The diameter of the follicle on the day before its disappearance averaged 21.1 mm (S.D.: 2.14). Follicular growth over the 4 days before ovulation averaged 2.8 mm/day, with more rapid growth of 3.1 mm/day observed on the day before. 2. The changes associated with rupture of the follicles were, in order of frequency: decrease in size (94%), disappearance of follicles (64%), fluid in the cul-de-sac (26%), and increased internal echoes (16%). 3. Only 20 of 50 cycles exhibited a BBT dip, which correlated with the ultrasound-estimated time of ovulation in 2 of those cases (10%). A BBT nadir was exhibited in 30 of 50 cycles and correlated in 5 (16.7%). The first day of the hyperthermic plateau (FDHP) and the BBT coverline were exhibited in all cycles and correlated in 41 (82%) and 35 (70%) cases, respectively. 4. The relationship between the diameter of the dominant follicle, measured by ultrasound, and the basal body temperature curve was as follows. In cycles in which a dip was observed on the BBT curve, the follicular diameter was 10.5 $\pm$ 2.12 mm 4 days prior to the dip (D-4), 12.5 $\pm$ 2.12 mm (D-3), 15.5 $\pm$ 2.12 mm (D-2), 17.0 $\pm$ 1.41 mm (D-1), and 21.5 $\pm$ 2.12 mm just prior to the dip (D-0). For the nadir: 9.6 $\pm$ 1.67 mm (N-4), 12.8 $\pm$ 1.79 mm (N-3), 16.2 $\pm$ 1.92 mm (N-2), 18.2 $\pm$ 2.17 mm (N-1), and 21.4 $\pm$ 2.61 mm (N-0). For the first day of the hyperthermic plateau (FDHP): 9.8 $\pm$ 1.36 mm (F-4), 12.4 $\pm$ 1.41 mm (F-3), 15.1 $\pm$ 1.57 mm (F-2), 18.1 $\pm$ 1.67 mm (F-1), and 21.2 $\pm$ 2.25 mm (F-0). For the BBT coverline endpoint: 9.9 $\pm$ .39 mm (C-4), 12.5 $\pm$ 1.44 mm (C-3), 15.2 $\pm$ 1.64 mm (C-2), 18.0 $\pm$ 1.69 mm (C-1), and 21.2 $\pm$ 2.31 mm (C-0). 5. The relationship between the ultrasonographic signs of ovulation and the basal body temperature curve was as follows. The BBT dip correlated with ovulation in 2 cases, which showed a decrease in follicular diameter (100%), a fluid pattern in the cul-de-sac (1 case, 50%), and complete disappearance of the follicle (1 case, 50%). For the nadir (5 cases), the ultrasonographic signs of ovulation were a decrease in follicular diameter (5 cases, 100%), a fluid pattern in the cul-de-sac (1 case, 20%), and complete disappearance of the follicle (3 cases, 60%). For the first day of the hyperthermic plateau (41 cases): a decrease in follicular diameter (40 cases, 97.6%), a fluid pattern in the cul-de-sac (11 cases, 26.8%), appearance of internal echoes and thickening of the wall (6 cases, 14.6%), and complete disappearance of the follicle (28 cases, 68.3%). For the BBT coverline endpoint (35 cases): a decrease in follicular diameter (33 cases, 94.3%), a fluid pattern in the cul-de-sac (9 cases, 25.7%), appearance of internal echoes and thickening of the wall (5 cases, 14.3%), and complete disappearance of the follicle (20 cases, 57.1%).


Age Determination by Tooth Wear and Histological Analysis of Seasonal Variation of Breeding in the Big White-Toothed Shrew, Crocidura lasiura (우수리땃쥐 Crocidura lasiura의 치아 마모에 의한 연령결정과 번식의 계절적 변이의 조직학적 분석)

  • Jeong, Soon-Jeong;Yoon, Myung-Hee;Choi, Jung-Mi;Kim, Hyun-Dae;Lim, Do-Seon;Park, Jin-Ju;Choi, Baik-Dong;Jeong, Moon-Jin
    • Applied Microscopy
    • /
    • v.40 no.1
    • /
    • pp.37-45
    • /
    • 2010
  • Wild-caught specimens of the big white-toothed shrew, Crocidura lasiura, were classified into three age classes by tooth wear and molar height, and seasonal variations in breeding and reproductive organs were examined. Juveniles had no tooth wear on the molars, their third molars were lower than the first and second molars, and they were found only in non-breeding condition. Young adults had little tooth wear, and their third molars had reached the height of the first and second molars; old adults had heavy tooth wear on the molars. Young and old adults were in breeding or non-breeding condition depending on the season. Histological examination confirmed the seasonal variation in breeding: young and old adult males remained in breeding condition from early February to early October, with breeding activity highest in April, whereas females remained in breeding condition from the end of March to October; males reached sexual maturity earlier than females. Breeding seems to cease during the non-breeding season because of a shortage of food resources, namely soil invertebrates. Young and old adult males in the breeding season had large testes with enlarged seminiferous tubules filled with numerous germ cells and expanded caudal epididymides containing a vast number of spermatozoa; they weighed more than 10.0 g in body weight and 0.03 g in testis and epididymis weight. Females in the breeding season were pregnant with 4~6 litters or had Graafian follicles and corpora lutea in the ovary, and weighed more than 9.6 g in body weight.

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

  • Wong, William Xiu Shun;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.21-44
    • /
    • 2018
  • In recent years, the rapid development of internet technology and the popularization of smart devices have resulted in massive amounts of text data. These text data are produced and distributed through various media platforms such as the World Wide Web, Internet news feeds, microblogs, and social media. However, this enormous amount of easily obtained information lacks organization. This problem has attracted the interest of many researchers seeking to manage this huge amount of information, and it also calls for professionals capable of classifying relevant information; hence text classification was introduced. Text classification is a challenging task in modern data analysis, in which a text document must be assigned to one or more predefined categories or classes. Various techniques are available in the text classification field, such as K-Nearest Neighbor, Naïve Bayes, Support Vector Machine, Decision Tree, and Artificial Neural Network. However, when dealing with huge amounts of text data, model performance and accuracy become a challenge. The performance of a text classification model can vary with the type of words used in the corpus and the type of features created for classification. Most previous attempts have proposed a new algorithm or modified an existing one, and this line of research can be said to have reached its limits for further improvement. In this study, rather than proposing a new algorithm or modifying an existing one, we focus on finding a way to modify the use of data. It is widely known that classifier performance is influenced by the quality of the training data on which the classifier is built. Real-world datasets usually contain noise, and this noisy data can affect the decisions made by classifiers built from them. 
In this study, we consider that data from different domains, that is, heterogeneous data, may have noise-like characteristics that can be utilized in the classification process. Machine learning algorithms build a classifier on the assumption that the characteristics of the training data and the target data are the same or very similar. However, in the case of unstructured data such as text, the features are determined by the vocabulary of the documents. If the viewpoints of the training data and the target data differ, the features of the two may also differ. In this study, we attempt to improve classification accuracy by strengthening the robustness of the document classifier through artificially injecting noise into the process of constructing it. Data coming from various kinds of sources are likely to be formatted differently. This causes difficulties for traditional machine learning algorithms, because they were not developed to recognize different types of data representation at one time and to combine them in the same generalization. Therefore, in order to utilize heterogeneous data in the learning process of the document classifier, we apply semi-supervised learning. However, unlabeled data may degrade the performance of the document classifier. Therefore, we further propose a method called the Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA), which selects only the documents that contribute to improving the accuracy of the classifier. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data. The most confident classification rules are selected and applied for the final decision making. 
In this paper, three different types of real-world data sources were used: news, Twitter, and blogs.
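
The idea of keeping only confidently classified unlabeled documents can be sketched generically. The abstract does not specify how RSESLA selects rules, so the snippet below is not RSESLA: it is a plain agreement-plus-threshold filter over several views, and the function name, input layout, and threshold of 0.8 are all hypothetical.

```python
def select_confident(predictions, threshold=0.8):
    """Keep pseudo-labels on which every view agrees with high confidence.

    predictions: {doc_id: [(label, confidence), ...]}, one pair per view
    (for example, one per classification model). This is an assumed input
    layout, not one taken from the paper.
    Returns {doc_id: label} for documents where all views predict the same
    label and every confidence is at least `threshold`.
    """
    selected = {}
    for doc_id, views in predictions.items():
        labels = {label for label, _ in views}
        if len(labels) == 1 and min(conf for _, conf in views) >= threshold:
            selected[doc_id] = labels.pop()
    return selected
```

In a semi-supervised loop, the selected documents would be added to the training set with their pseudo-labels, while the rejected ones stay unlabeled so that they cannot degrade the classifier.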

Age Determination by Tooth Wear and Histological Analysis of Seasonal Variation of Breeding in the Lesser White-Toothed Shrew, Crocidura suaveolens (작은땃쥐 Crocidura suaveolens의 치아 마모에 의한 연령결정과 번식의 계절적 변이의 조직학적 분석)

  • Jeong, Soon-Jeong;Yoon, Myung-Hee;Kim, Sook-Hyang;Ham, Joo-Hyun;Lim, Do-Seon;Choi, Baik-Dong;Park, Jin-Ju;Jeong, Moon-Jin
    • Applied Microscopy
    • /
    • v.40 no.3
    • /
    • pp.125-132
    • /
    • 2010
  • Captured specimens of the lesser white-toothed shrew, Crocidura suaveolens, were classified into three age classes by tooth wear, and seasonal variations in reproductive organs were investigated. The molars of juveniles had no tooth wear, and the third molars were lower than the first and second molars; young adults had smooth tooth wear, and their third molars had reached the height of the first and second molars; old adults had heavy tooth wear, and their third molars had likewise reached the height of the first and second molars. Histological examination confirmed the seasonal variation in breeding: the breeding season of adult males lasted from early February to early October, with breeding peaks in April and July, and the non-breeding season lasted from mid-October to late January. Young and old adult males in the breeding season had large testes with enlarged seminiferous tubules filled with numerous germ cells and expanded caudal epididymides containing a vast number of spermatozoa. Young and old adult males in the non-breeding season had small testes with extremely slender seminiferous tubules containing only spermatogonia and reduced caudal epididymides without spermatozoa. Males weighing more than 3.9 g in body weight and 0.013 g in testis and epididymis weight reached sexual maturity in the breeding season, and females weighing more than 3.8 g in the breeding season were pregnant with 5~6 litters or had Graafian follicles and corpora lutea in the ovary.

Effects of Characteristics of Ovarian Follicular Fluid and Anti-Inhibin Serum on Steroid Hormone Secretion by Hanwoo Granulosa Cells In Vitro (한우 난소의 Follicular Fluid의 특징과 과립막 세포의 스테로이드호르몬 분비에 대한 Anti-Inhibin Serum의 첨가효과)

  • 성환후;민관식;양병철;노환국;최선호;임기순;장유민;박성재;장원경
    • Korean Journal of Animal Reproduction
    • /
    • v.25 no.2
    • /
    • pp.119-124
    • /
    • 2001
  • This study was performed to investigate the effects of the peptide-to-carrier ratio on the immune and biological responses to inhibin immunization in Hanwoo. A peptide sequence from the alpha-subunit (peptide 19~32) of porcine inhibin was synthesized as the antigen and conjugated to human serum albumin (HSA) as the carrier protein. Anti-inhibin sera (AI) were obtained from rabbits 52 days after injection of the inhibin-$\alpha$-subunit peptide conjugate as antigen at 2-week intervals. Immunoblotting analysis using an antibody specific for the inhibin-$\alpha$ subunit revealed that inhibin was detected in bovine follicular fluid (bFF) from 1.0 cm follicles. However, it was not detected in corpora lutea at any stage or in follicular fluid from 0.1 cm follicles. The maximal content of estradiol-17$\beta$ in Hanwoo ovarian follicular fluid was detected at a follicular size (diameter) of 2.0 cm, and the mean total content of this hormone decreased significantly with decreasing follicle diameter. However, the progesterone content of follicular fluid was high in 1.0 cm follicles. Progesterone secretion by Hanwoo granulosa cells cultured for 48 hr in vitro was significantly (p<0.05) inhibited in the 5% bFF and 5% bFF + 5% AI groups compared with the control group. Estradiol-17$\beta$ secretion by Hanwoo granulosa cells cultured for 48 hr in vitro was significantly (p<0.05) increased in the 5% AI and 5% AI + 5% bFF groups compared with the control group. However, the groups to which 5% AI was added showed no change in progesterone and estradiol-17$\beta$ compared with the control groups. Taken together, these results suggest that inhibin in mature FF plays a pivotal role in the biosynthesis of steroid hormones by follicular cells during follicular development.


The Need for Paradigm Shift in Semantic Similarity and Semantic Relatedness : From Cognitive Semantics Perspective (의미간의 유사도 연구의 패러다임 변화의 필요성-인지 의미론적 관점에서의 고찰)

  • Choi, Youngseok;Park, Jinsoo
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.111-123
    • /
    • 2013
  • Measures of semantic similarity/relatedness between two concepts play an important role in research on system integration and database integration. Moreover, current research on keyword recommendation and tag clustering depends strongly on this kind of semantic measure. For this reason, many researchers in various fields, including computer science and computational linguistics, have tried to improve methods for calculating semantic similarity/relatedness. The study of similarity between concepts aims to discover how a computational process can model the way a human determines the relationship between two concepts. Most research on calculating semantic similarity uses ready-made reference knowledge, such as a semantic network or dictionary, to measure concept similarity. The topological method calculates relatedness or similarity between concepts based on various forms of semantic network, including hierarchical taxonomies. This approach assumes that the semantic network reflects human knowledge well. The nodes in a network represent concepts, and ways of measuring the conceptual similarity between two nodes are also regarded as ways of determining the conceptual similarity of two words (i.e., two nodes in a network). Topological methods can be categorized as node-based or edge-based, which are also called the information content approach and the conceptual distance approach, respectively. The node-based approach calculates similarity between concepts based on how much information the two concepts share in terms of a semantic network or taxonomy, while the edge-based approach estimates the distance between the nodes that correspond to the concepts being compared. Both approaches assume that the semantic network is static; that is, the topological approach does not consider changes in the semantic relations between concepts in the semantic network. 
However, as information and communication technologies advance the sharing of knowledge among people, the semantic relations between concepts in a semantic network may change. To explain such change, we adopt cognitive semantics. The basic assumption of cognitive semantics is that humans judge semantic relations based on their cognition and understanding of concepts. This cognition and understanding is called 'world knowledge.' World knowledge can be categorized as personal knowledge and cultural knowledge. Personal knowledge is knowledge from personal experience; everyone can have different personal knowledge of the same concept. Cultural knowledge is the knowledge shared by people who live in the same culture or use the same language. People in the same culture have a common understanding of specific concepts. Cultural knowledge can be the starting point for discussing changes in semantic relations. If the culture shared by people changes for some reason, their cultural knowledge may also change. Today's society and culture are changing at a fast pace, and the change in cultural knowledge is not a negligible issue in research on semantic relationships between concepts. In this paper, we propose future directions for research on semantic similarity. In other words, we discuss how research on semantic similarity can reflect changes in semantic relations caused by changes in cultural knowledge. We suggest three directions for future research on semantic similarity. First, research should include versioning and update methodologies for semantic networks. Second, a dynamically generated semantic network can be used to calculate semantic similarity between concepts. If researchers can develop a methodology for extracting a semantic network from a given knowledge base in real time, this approach can solve many problems related to changes in semantic relations. 
Third, a statistical approach based on corpus analysis can be an alternative to methods that use a semantic network. We believe that these proposed research directions can be a milestone for research on semantic relations.
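
The edge-based (conceptual distance) approach mentioned in this abstract can be illustrated with a short sketch. The toy taxonomy and the conversion from distance to similarity, 1 / (1 + path length), are assumptions chosen for illustration and are not taken from the paper.

```python
from collections import deque

# Hypothetical toy taxonomy stored as child -> parent ("is-a") edges.
IS_A = {"dog": "mammal", "cat": "mammal", "mammal": "animal", "bird": "animal"}


def neighbors(node):
    """Nodes one edge away from `node`, treating is-a edges as undirected."""
    adjacent = [child for child, parent in IS_A.items() if parent == node]
    if node in IS_A:
        adjacent.append(IS_A[node])
    return adjacent


def path_length(a, b):
    """Number of edges on the shortest path between a and b, or None."""
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def edge_similarity(a, b):
    """Assumed conversion: similarity falls as the path gets longer."""
    dist = path_length(a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)
```

Because `IS_A` is a fixed dictionary, the sketch also shows the static-network assumption the abstract criticizes: the similarity of two concepts can only change if the network itself is edited or regenerated.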