• Title/Summary/Keyword: Corpus-based Study

Effects of Ovarian Morphology and Culture Vessel on In vitro Development and Cell Number in Embryos of Korean Native Cows

  • Park, Yong-Soo;Kim, Jae-Myeoung
    • Asian-Australasian Journal of Animal Sciences / v.20 no.1 / pp.31-35 / 2007
  • The main purpose of this study was to improve the efficiency and quality of in vitro embryo production in Korean Native Cows (KNC). We examined the effects of ovarian morphology (Experiment 1) and the culture vessel (Experiment 2) on in vitro maturation (IVM), and measured the subsequent development rates and cell numbers of blastocysts. In Experiment 1, the ovaries of KNC were divided into six groups based on follicle and corpus luteum (CL) morphology. The development rates to the 2- and 8-cell stages were similar among the six groups. The development rates to the blastocyst stage were significantly higher in the group without a CL or follicle (WOCL/F) than in the groups with follicular cysts (FCs), regressive CLs (RCLs) or cystic CLs (CCLs) (p<0.05). The cell number of the inner cell mass (ICM) of blastocysts in the FC and RCL groups, and the number of cells in the trophectoderm (TE) in the WOCL/F, FC, growing CL (GCL) and RCL groups, were significantly higher than in other groups (p<0.05). The total cell number (TCN) in the WOCL/F, FC and RCL groups was also significantly higher than in other groups (p<0.05). The ICM cell number/TCN ratio was significantly higher in the FC and RCL groups than in the GCL and DF groups (p<0.05). In Experiment 2, oocyte IVM was carried out in culture dishes or in 0.25- or 0.5-ml straws of the kind used for freezing sperm. The development rate to the 2-cell stage was significantly higher in the 0.5-ml straw group than in the 0.25-ml straw group. The development rates to the blastocyst stage were similar in the dish and the two straw groups. There were no differences between groups in the ICM, TE or total cell numbers, or in the ICM cell number/TCN ratio.

A Study on the Construction of the Automatic Extracts and Summaries - On the Basis of Scientific Journal Articles - (자동 발췌문/요약 시스템 구축에 관한 연구 - 학술지 논문기사를 중심으로 -)

  • Lee Tae-Young
    • Journal of the Korean Society for Library and Information Science / v.39 no.3 / pp.139-163 / 2005
  • Various corpus-based approaches, rhetorical roles of discourse structure, and unification of similar sentences were applied to construct automatic Ext/Sums (extracts and summaries). Rhetorical roles of sentences, such as objective, method, background, result, and conclusion, were established to make flexible Ext/Sums, and an extraction engine was prepared for each role. A success rate of 90% was achieved in extracting the important sentences of sample articles. To rearrange the selected sentences, the system unified similar sentences using the cosine coefficient, deleted unnecessary modifying and inserted clauses, joined short sentences, and connected sentences that could be linked. The author suggests that further research on the rhetorical roles of sentences, the meaning and signature of nouns and verbs in clauses, and cue words and sentence location is needed to construct more effective Ext/Sums.
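
The unification step in the abstract above relies on the cosine coefficient between sentences. The following is a minimal sketch of that idea, assuming whitespace tokenization and raw term counts; the function names and the 0.8 threshold are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from typing import List


def cosine_coefficient(a: str, b: str) -> float:
    """Cosine coefficient between two sentences using raw term counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the terms the two sentences share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def unify_similar(sentences: List[str], threshold: float = 0.8) -> List[str]:
    """Keep a sentence only if it is not too similar to one already kept.

    The threshold is an assumed value chosen for illustration.
    """
    kept: List[str] = []
    for s in sentences:
        if all(cosine_coefficient(s, k) < threshold for k in kept):
            kept.append(s)
    return kept


if __name__ == "__main__":
    extracted = [
        "the method extracts important sentences",
        "the method extracts important sentences",
        "results are reported for sample articles",
    ]
    print(unify_similar(extracted))
```

A real system for Korean text would likely need morphological analysis instead of whitespace splitting; that choice is left open here.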

The Study on Implementation of Crime Terms Classification System for Crime Issues Response

  • Jeong, Inkyu;Yoon, Cheolhee;Kang, Jang Mook
    • International Journal of Advanced Culture Technology / v.8 no.3 / pp.61-72 / 2020
  • The fear of crime, discussed since the early 1960s in the United States, is a psychological response, such as anxiety or concern, to the possibility of becoming a victim of crime. These anxiety factors burden individuals seeking psychological stability and impose indirect costs of crime on society. Fear of crime is harmful, and policy should manage it, alongside crime response and resolution, so that it is not exaggerated or distorted. This is because fear of crime causes as much harm as the damage caused by criminal acts. Eric Pawson has argued that the popular impression of violent crime is formed not by media reports but by official statistics. Therefore, the police should monitor and analyze news related to fear of crime in order to reduce its social cost and to prepare a preemptive response policy before people develop a 'fear of crime'. In this paper, we propose a deep learning-based news classification system that helps police respond quickly, efficiently, and precisely to crimes reported in the media. The goal is to establish a system that can quickly identify rapidly growing security issues by categorizing crime-related news among news articles. To construct the system, a model was trained on crime data so that news could be classified by type of crime. Deep learning was applied using Google's TensorFlow. Future research should continue to examine keyword importance for the early detection of rapidly increasing issues by crime type and by the influence of the press, and the crime-related corpus should be continually supplemented.

Automatic Construction of a Negative/positive Corpus and Emotional Classification using the Internet Emotional Sign (인터넷 감정기호를 이용한 긍정/부정 말뭉치 구축 및 감정분류 자동화)

  • Jang, Kyoungae;Park, Sanghyun;Kim, Woo-Je
    • Journal of KIISE / v.42 no.4 / pp.512-521 / 2015
  • Internet users purchase goods on the Internet and express their positive or negative emotions about the goods in product reviews. Analysis of product reviews becomes critical data both for potential consumers and for the decision making of enterprises. Therefore, opinion mining techniques, which derive opinions by analyzing meaningful data from large numbers of Internet reviews, are growing in importance. Existing studies were mostly based on comments written in English, and analysis of Korean has not been actively pursued. Unlike English, Korean has complex adjectives and suffixes. Existing studies also did not consider the characteristics of Internet language. This study proposes an emotional classification method that increases classification accuracy by analyzing the characteristics of Internet language that connote feelings. Using Internet emoticons, positive and negative comments about products can be classified automatically. The validity of the proposed algorithm is supported by the high precision, recall and coverage obtained in the evaluation of this method.
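
As a rough illustration of how emotional signs can label reviews automatically, the sketch below counts positive and negative signs in each review and assigns the majority label. The sign lists, function names, and sample reviews are hypothetical; the abstract does not specify the signs or the exact rule the authors used.

```python
from typing import Dict, List, Optional

# Hypothetical sign lists chosen for illustration only.
POSITIVE_SIGNS = ("^^", ":)", "ㅎㅎ")
NEGATIVE_SIGNS = ("ㅠㅠ", ":(", "-_-")


def label_by_sign(review: str) -> Optional[str]:
    """Return 'positive' or 'negative' by majority of signs, or None if tied or absent."""
    pos = sum(review.count(sign) for sign in POSITIVE_SIGNS)
    neg = sum(review.count(sign) for sign in NEGATIVE_SIGNS)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return None


def build_corpus(reviews: List[str]) -> Dict[str, List[str]]:
    """Group reviews into a positive/negative corpus, skipping unlabeled ones."""
    corpus: Dict[str, List[str]] = {"positive": [], "negative": []}
    for review in reviews:
        label = label_by_sign(review)
        if label is not None:
            corpus[label].append(review)
    return corpus


if __name__ == "__main__":
    sample = ["fast delivery ^^", "broke in a week ㅠㅠ", "arrived today"]
    print(build_corpus(sample))
```

Reviews without any sign stay unlabeled in this sketch, which is one simple way to trade coverage for precision.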

Building Specialized Language Model for National R&D through Knowledge Transfer Based on Further Pre-training (추가 사전학습 기반 지식 전이를 통한 국가 R&D 전문 언어모델 구축)

  • Yu, Eunji;Seo, Sumin;Kim, Namgyu
    • Knowledge Management Research / v.22 no.3 / pp.91-106 / 2021
  • With the recent rapid development of deep learning technology, the demand for analyzing huge volumes of text documents in the national R&D field from various perspectives is rapidly increasing. In particular, interest is growing in applying the BERT (Bidirectional Encoder Representations from Transformers) language model, which is pre-trained on a large corpus. However, the terminology used frequently in highly specialized fields such as national R&D is often not sufficiently learned by basic BERT. This has been pointed out as a limitation in understanding documents in specialized fields through BERT. Therefore, this study proposes a method for building an R&D KoBERT language model that transfers national R&D field knowledge to basic BERT through further pre-training. In addition, to evaluate the performance of the proposed model, we performed classification analysis on about 116,000 R&D reports in the health care and information and communication fields. Experimental results showed that the proposed model achieved higher accuracy than the pure KoBERT model.

On the Study of the Interaction between Syntax and Semantics in See Verb Construction in English (영어 '보다(see)' 구문에 나타나는 통사와 의미의 상호관련성 연구)

  • Kim, Mija
    • Cross-Cultural Studies / v.39 / pp.329-354 / 2015
  • The major goals of this paper are to identify the degree to which the meanings of the verb 'see' can be extended, focusing on the extended meanings found in expressions that denote instinctive actions for survival, such as eating or drinking, and to clarify whether any syntactic pattern is associated with meaning in the process of meaning extension of 'see'. To this end, this paper randomly selected 2,000 examples containing the verb 'see' from COCA (Corpus of Contemporary American English) and classified the sentences into thirteen sentence types according to their syntactic patterns. The analysis showed that these thirteen syntactic types reveal the process of meaning extension of the verb 'see'. Based on this result, the paper proposes four steps of meaning extension. In the first step, 'see' denotes purely seeing visible objects. In the second step, it expresses a shifted function in which the agent in subject position performs the act of seeing as a secondary task in order to carry out another, main task. In the third step, it denotes extended meanings unrelated to the act of seeing, because the sentences at this step contain no visible objects. In the last step, the verb functions as a conventional implicature whose meaning does not contribute to the overall meaning of the sentence. In addition, this paper found that syntactic properties are deeply associated with the process of meaning extension of the verb 'see', and attempted to formalize this relationship between syntax and semantics within the framework of Construction Grammar as developed by A. Goldberg.

A Study on Efficient Natural Language Processing Method based on Transformer (트랜스포머 기반 효율적인 자연어 처리 방안 연구)

  • Seung-Cheol Lim;Sung-Gu Youn
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.4 / pp.115-119 / 2023
  • The natural language processing models used in current artificial intelligence are huge, which causes various difficulties in processing and analyzing data in real time. To address these difficulties, we propose a method that improves processing efficiency by using less memory, and we evaluate the performance of the proposed model. The technique applied in this paper reduces the number of attention heads and the embedding size of the BERT[1] model, divides the large corpus into splits, and computes the result by averaging the output values of each forward pass. In this process, a random offset was assigned to the sentences at every epoch to add diversity to the input data. The model was then fine-tuned for classification. We found that the split processing model was about 12% less accurate than the unsplit model, but the number of parameters in the model was reduced by 56%.
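
The split-and-average idea in the abstract above can be sketched as follows, assuming a token sequence is cut into fixed-size chunks starting at a random offset and the per-chunk output vectors are averaged. The helper names and the stand-in `forward` callable are hypothetical; the paper's actual model, chunk size, and offset scheme are not given in the abstract.

```python
import random
from typing import Callable, List, Sequence


def split_with_offset(tokens: Sequence[int], chunk_size: int, offset: int) -> List[Sequence[int]]:
    """Cut tokens into chunks of chunk_size, starting at offset.

    Tokens before the offset are dropped in this sketch (an assumption).
    """
    return [tokens[i:i + chunk_size] for i in range(offset, len(tokens), chunk_size)]


def averaged_forward(
    tokens: Sequence[int],
    forward: Callable[[Sequence[int]], List[float]],
    chunk_size: int,
    rng: random.Random,
) -> List[float]:
    """Run forward on every chunk and average the output vectors element-wise.

    tokens must be non-empty; a new random offset is drawn on each call,
    which mimics drawing one per epoch.
    """
    offset = rng.randrange(min(chunk_size, len(tokens)))
    chunks = split_with_offset(tokens, chunk_size, offset)
    outputs = [forward(chunk) for chunk in chunks]
    dim = len(outputs[0])
    return [sum(out[d] for out in outputs) / len(outputs) for d in range(dim)]


if __name__ == "__main__":
    # Stand-in for a model forward pass: returns two toy scores per chunk.
    def toy_forward(chunk: Sequence[int]) -> List[float]:
        return [float(len(chunk)), float(sum(chunk))]

    print(averaged_forward(list(range(10)), toy_forward, chunk_size=4, rng=random.Random(0)))
```

In practice `forward` would wrap a reduced BERT-style classifier and return logits; averaging logits across chunks is one plausible reading of "averaging the output values of each forward".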

Document Classification Methodology Using Autoencoder-based Keywords Embedding

  • Seobin Yoon;Namgyu Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.9 / pp.35-46 / 2023
  • In this study, we propose a Dual Approach methodology to enhance the accuracy of document classifiers by utilizing both contextual and keyword information. First, contextual information is extracted using Google's BERT, a pre-trained language model known for its outstanding performance in various natural language understanding tasks. Specifically, we employ KoBERT, a model pre-trained on a Korean corpus, to extract contextual information in the form of the CLS token. Second, keyword information is generated for each document by encoding the set of keywords into a single vector using an Autoencoder. We applied the proposed approach to 40,130 documents related to healthcare and medicine from the National R&D Projects database of the National Science and Technology Information Service (NTIS). The experimental results demonstrate that the proposed methodology outperforms existing methods that rely solely on document or word information in terms of document classification accuracy.

Trends in Incidence of Common Cancers in Iran

  • Enayatrad, Mostafa;Mirzaei, Maryam;Salehiniya, Hamid;Karimirad, Mohammad Reza;Vaziri, Siavash;Mansouri, Fiezollah;Moudi, Asieh
    • Asian Pacific Journal of Cancer Prevention / v.17 no.sup3 / pp.39-42 / 2016
  • Cancer is a major public health problem in Iran. The aim of this study was to evaluate trends in the incidence of ten common cancers in Iran, based on the national cancer registry reports from 2004 to 2009. This epidemiological study was carried out using existing age-standardized cancer data from the national cancer registry report of the Ministry of Health in Iran. The data were analyzed by a test for linear trend, and $P \leq 0.05$ was taken as the significance level. Totals of 41,169 and 32,898 cases of cancer were registered in males and females, respectively, during these years. Overall age-standardized incidence rates (ASRs) per 100,000 population according to primary site were 125.6 and 113.4 in males and females, respectively. Between 2004 and 2009, the ten most common cancers (excluding skin cancer) in males were stomach (16.2), bladder (12.6), prostate (11), colon-rectum (10.14), hematopoietic system (7.1), lung (6.1), esophagus (6.4), brain (3.2), lymph node (3.8) and larynx (3.4); in females they were breast (27.4), colon-rectum (9.3), stomach (7.6), esophagus (6.4), hematopoietic system (4.9), thyroid (3.9), ovary (3.6), corpus uteri (2.9), bladder (3.2) and lung (2.6). Moreover, the results showed that skin cancer was the most common cancer in both sexes. The lowest and the highest incidence in females and males were reported in 2004 and 2009, respectively. Over this period, the incidence of cancer in both sexes increased significantly (p<0.01). As in other less developed and epidemiologically transitioning countries, the age-standardized incidence rate of cancer in Iran is rising. Given these increasing trends and the expected growth of the aging population, the future burden of cancer in Iran is likely to be acute. Determining and controlling potential risk factors for cancer should help decrease this burden.

Research on Development of Support Tools for Local Government Business Transaction Operation Using Big Data Analysis Methodology (빅데이터 분석 방법론을 활용한 지방자치단체 단위과제 운영 지원도구 개발 연구)

  • Kim, Dabeen;Lee, Eunjung;Ryu, Hanjo
    • The Korean Journal of Archival Studies / no.70 / pp.85-117 / 2021
  • The purpose of this study is to investigate and analyze the current status of the unit tasks used by local governments, their operation, and the associated record management problems, and to present improvement measures using text-based big data technology based on the implications derived from that analysis. Local governments face serious record management problems arising from errors in preservation periods due to misclassification of unit tasks, inability to identify types of overcommon and institutional affairs, errors in unit tasks, errors in name, referenceable standards, and tools. Moreover, there are about 720,000 unit tasks, too many to control effectively, so strict and controllable tools and standards are needed. To solve these problems, this study developed a system that applies text-based analysis techniques from big data analysis, such as corpus and tokenization technology, to the names and construction terms that constitute the record management standard. These unit task operation support tools are expected to contribute significantly to record management work, as they can support the operation of standards, including uniform preservation periods, identification of delegated office records, control of duplicate and similar unit task creation, and common tasks. If the big data analysis methodology can be linked to BRM and RMS in the future, the quality of record management standard work is expected to improve.