Search | Korea Science

Named Entity Recognition for Patent Documents Based on Conditional Random Fields (조건부 랜덤 필드를 이용한 특허 문서의 개체명 인식)

Lee, Tae Seok;Shin, Su Mi;Kang, Seung Shik
- KIPS Transactions on Software and Data Engineering / v.5 no.9 / pp.419-424 / 2016
Named entity recognition is required to improve the retrieval accuracy of patent documents or similar patents in the claims and patent descriptions. In this paper, we propose an automatic named entity recognizer for patents that uses a conditional random field (CRF), one of the best-performing methods in machine learning research. The system was trained on a tagged corpus of 660,000 words, and 70,000 words were used as a test set for evaluation. The experiment shows an accuracy of 93.6% and a Kappa coefficient of 0.67 between manual tagging and the automatic tagging system. This figure is better than the Kappa coefficient of 0.6 obtained for manually tagged results, which suggests that the automatic named entity tagging system can serve as a practical replacement for manual tagging of patent documents.
https://doi.org/10.3745/KTSDE.2016.5.9.419
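The abstract above reports both an accuracy and a Kappa coefficient. As a rough illustration of how the two can differ, here is a minimal sketch that assumes the coefficient is Cohen's kappa (the abstract does not say which variant was used). The function name `cohen_kappa` and the toy tag sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of positions where both taggers agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each tagger's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[t] * count_b[t] for t in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both taggers used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Toy tag sequences (hypothetical data, not from the paper).
manual    = ["O", "O", "ENT", "O", "ENT", "O", "O", "ENT", "O", "O"]
automatic = ["O", "O", "ENT", "O", "O",   "O", "O", "ENT", "O", "ENT"]
kappa = cohen_kappa(manual, automatic)
```

In this toy case the two taggers agree on 8 of 10 tokens, yet kappa is only about 0.52, because the frequent "O" tag makes chance agreement high. The same effect may explain why a 93.6% accuracy can coexist with a kappa of 0.67.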

KorPatELECTRA: A Pre-trained Language Model for Korean Patent Literature to improve performance in the field of natural language processing (Korean Patent ELECTRA)

Jang, Ji-Mo;Min, Jae-Ok;Noh, Han-Sung
- Journal of the Korea Society of Computer and Information / v.27 no.2 / pp.15-23 / 2022
In the field of patents, NLP (Natural Language Processing) is challenging because of the linguistic specificity of patent literature, so there is an urgent need for research on a language model optimized for Korean patent literature. Recently, in the field of NLP, there have been continuous attempts to build pre-trained language models for specific domains in order to improve performance on various tasks in those domains. ELECTRA, a pre-trained language model released by Google after BERT, uses a new method called RTD (Replaced Token Detection) to increase training efficiency. The purpose of this paper is to propose KorPatELECTRA, which is pre-trained on a large amount of Korean patent literature data. In addition, pre-training was optimized by preprocessing the training corpus according to the characteristics of patent literature and by applying a patent vocabulary and tokenizer. To confirm its performance, KorPatELECTRA was tested on NER (Named Entity Recognition), MRC (Machine Reading Comprehension), and patent classification tasks using actual patent data, and it achieved the best performance in all three tasks compared with general-purpose language models.
https://doi.org/10.9708/jksci.2022.27.02.015
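The abstract above names RTD (Replaced Token Detection) without describing it. In ELECTRA, a discriminator is trained to decide, for every input token, whether it is the original token or a replacement. The toy sketch below only shows how such per-token targets could be derived. The helper `rtd_labels` and the example sentences are hypothetical, and the corrupted sequence is written by hand here, whereas ELECTRA samples replacements from a small generator network.

```python
def rtd_labels(original_tokens, corrupted_tokens):
    """Return 1 where a token was replaced and 0 where it is unchanged."""
    if len(original_tokens) != len(corrupted_tokens):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original_tokens, corrupted_tokens)]

# Hypothetical example: one token ("device") has been replaced by "method".
original  = ["the", "device", "comprises", "a", "sensor"]
corrupted = ["the", "method", "comprises", "a", "sensor"]
labels = rtd_labels(original, corrupted)
```

Because every position receives a label, the discriminator gets a training signal from all tokens rather than only from a masked subset, which is the usual explanation for the training efficiency the abstract mentions.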

Korean Machine Reading Comprehension for Patent Consultation Using BERT (BERT를 이용한 한국어 특허상담 기계독해)

Min, Jae-Ok;Park, Jin-Woo;Jo, Yu-Jeong;Lee, Bong-Gun
- KIPS Transactions on Software and Data Engineering / v.9 no.4 / pp.145-152 / 2020
MRC (machine reading comprehension) is an AI NLP task that predicts the answer to a user's query by understanding a relevant document, and it can be used in automated consultation services such as chatbots. The BERT (Pre-training of Deep Bidirectional Transformers for Language Understanding) model, which has recently shown high performance in various fields of natural language processing, has two phases. The first phase is pre-training on large-scale data from each domain, and the second phase is fine-tuning the model to solve each NLP task as a prediction problem. In this paper, we build a patent MRC dataset and show how to construct patent consultation training data for the MRC task. We also propose a method to improve MRC performance using a Patent-BERT model pre-trained on a patent consultation corpus, together with a language processing algorithm suited to machine learning on patent counseling data. Experiments show that the proposed method improves performance in answering patent counseling queries.
https://doi.org/10.3745/KTSDE.2020.9.4.145

A case of partial trisomy 3p syndrome with rare clinical manifestations

Han, Dong-Hoon;Chang, Ji-Young;Lee, Woo-In;Bae, Chong-Woo
- Clinical and Experimental Pediatrics / v.55 no.3 / pp.107-110 / 2012
Partial trisomy 3p results from either unbalanced translocation or de novo duplication. Common clinical features consist of dysmorphic facial features, congenital heart defects, psychomotor and mental retardation, abnormal muscle tone, and hypoplastic genitalia. In this paper, we report a case of partial trisomy 3p with rare clinical manifestations. A full-term, female newborn was transferred to our clinic. She had cleft lip-palate, dysgenesis of the corpus callosum, patent ductus arteriosus, pulmonary hypertension, and severe right-sided hydronephrosis associated with ureteropelvic junction obstruction. Cytogenetic investigation revealed partial trisomy 3p; 46,XX,der(4)t(3;4)(p21.1;p16). The karyotype of her father showed a balanced translocation, t(3;4)(p21.1;p16). Therefore, the size of duplication can be an important factor.
https://doi.org/10.3345/kjp.2012.55.3.107

Automatic Extraction of Alternative Words using Parallel Corpus (병렬말뭉치를 이용한 대체어 자동 추출 방법)

Baik, Jong-Bum;Lee, Soo-Won
- Journal of KIISE:Computing Practices and Letters / v.16 no.12 / pp.1254-1258 / 2010
In information retrieval, different surface forms of the same object can degrade system performance. In this paper, we propose a method for extracting alternative words that uses translation words as features of each word extracted from a parallel corpus of Korean/English title pairs in patent information. We also propose an association word filtering method that removes association words from an alternative word list. Evaluation results show that the proposed method outperforms other alternative word extraction methods.

A Study on the Performance Analysis of Entity Name Recognition Techniques Using Korean Patent Literature

Gim, Jangwon
- Journal of Advanced Information Technology and Convergence / v.10 no.2 / pp.139-151 / 2020
Entity name recognition is a part of information extraction that extracts entity names from documents and classifies their types. Entity name recognition technologies are widely used in natural language processing applications such as information retrieval, machine translation, and query response systems. Various deep learning-based models exist to improve entity name recognition performance, but studies that compare and analyze these models on Korean data are scarce. In this paper, we compare and analyze, on Korean data, the performance of CRF, LSTM-CRF, BiLSTM-CRF, and BERT, which are actively used for entity name recognition. We also evaluate whether the embedding models widely used in recent natural language processing tasks improve the performance of the entity name recognition model. Experiments on patent data and a Korean corpus confirmed that BiLSTM-CRF combined with the FastText method showed the highest performance.
https://doi.org/10.14801/JAITC.2020.10.2.139

Effects of Crataegii fructus on the Contractile Response of Rabbit Corpus Cavernosum (산사(山査)가 토끼 음경해면체의 수축에 미치는 영향)

Lee, Han Seok;Park, Sun Young
- Journal of Physiology & Pathology in Korean Medicine / v.27 no.5 / pp.602-610 / 2013
This study aimed to evaluate the relaxation effect of Crataegii fructus (CF) on rabbit penile corpus cavernosum contracted by agonists. To study the effect of CF on the vasoconstriction of rabbit penile corpus cavernosum, isolated rabbit penile corpus cavernosum tissues were tested in organ baths containing Krebs solution. To investigate the cavernosal relaxation induced by CF, CF extract at $0.01 \sim 3.0$ mg/mL was added after the penile corpus cavernosum was contracted by norepinephrine (NE) at $1\,\mu$M. To analyze the mechanism of CF's vasorelaxation, CF extract was infused into NE-contracted penile tissues after pretreatment with each of indomethacin (IM), $N\omega$-nitro-L-arginine (L-NNA), methylene blue (MB), and tetraethylammonium chloride (TEA). To study the effect of CF on the influx of extracellular calcium chloride ($Ca^{2+}$) in penile tissues, $Ca^{2+}$ at 1 mM was infused into NE-contracted penile tissues in $Ca^{2+}$-free Krebs solution after pretreatment with CF. The cytotoxic activity of CF on human umbilical vein endothelial cells (HUVEC) was measured by MTT assay, and nitric oxide (NO) production was measured with Griess reagent. CF relaxed cavernosal strips with endothelium contracted by NE, but in strips without endothelium, CF-induced relaxation was significantly inhibited. Pretreatment with L-NNA, MB, or TEA significantly decreased cavernosal relaxation compared with no pretreatment, whereas pretreatment with IM had no significant effect. In $Ca^{2+}$-free Krebs solution, when $Ca^{2+}$ was infused into NE-contracted penile tissues, pretreatment with CF inhibited the contraction induced by adding $Ca^{2+}$. NO production was not increased by CF treatment of HUVEC.
These findings show that CF is effective for the relaxation of rabbit penile corpus cavernosum, and we suggest that CF relaxes rabbit corpus cavernosal smooth muscle through multiple mechanisms of action, including increased release of nitric oxide from the corporal sinusoidal endothelium, inhibition of $Ca^{2+}$ mobilization into the cytosol from the extracellular fluid, and possibly a hyperpolarizing action.

Successful management of absent sternum in an infant using porcine acellular dermal matrix

Semlacher, Roy Alfred;Nuri, Muhammand A.K.
- Archives of Plastic Surgery / v.46 no.5 / pp.470-474 / 2019
Congenital absent sternum is a rare birth defect that requires early intervention for optimal long-term outcomes. Descriptions of the repair of absent sternum are limited to case reports, and no preferred method for management has been described. Herein, we describe the use of porcine acellular dermal matrix to reconstruct the sternum of an infant with sternal infection following attempted repair using synthetic mesh. The patient was a full-term male with trisomy 21, agenesis of corpus callosum, ventricular septal defect, patent ductus arteriosus, right-sided aortic arch, and congenital absence of sternum with no sternal bars. Following removal of the infected synthetic mesh, negative pressure wound therapy with instillation was used to manage the open wound and provide direct antibiotic therapy. When blood C-reactive protein levels declined to $\leq 2$ mg/L, the sternum was reconstructed using porcine acellular dermal matrix. At 21 months postoperatively, the patient demonstrated no respiratory issues. Physical examination and computed tomography imaging identified good approximation of the clavicular heads and sternal cleft and forward curvature of the ribs. This case illustrates the benefits of negative pressure wound therapy and acellular dermal matrix for the reconstruction of absent sternum in the context of an infected sternal surgical site previously repaired with synthetic mesh.
https://doi.org/10.5999/aps.2018.00829

Clustering-based Statistical Machine Translation Using Syntactic Structure and Word Similarity (문장구조 유사도와 단어 유사도를 이용한 클러스터링 기반의 통계기계번역)

Kim, Han-Kyong;Na, Hwi-Dong;Li, Jin-Ji;Lee, Jong-Hyeok
- Journal of KIISE:Software and Applications / v.37 no.4 / pp.297-304 / 2010
Clustering based on sentence type or document genre is a technique used to improve the translation quality of SMT (statistical machine translation) through domain-specific translation. However, no previous research has used sentence type and document genre information simultaneously. In this paper, we propose an integrated clustering method that classifies sentence type by syntactic structure similarity and document genre by word similarity. We interpolated domain-specific models built from the clusters with general models to improve the translation quality of the SMT system. Kernel functions and cosine measures are applied to calculate structural similarity and word similarity. With these similarities, we used machine learning algorithms similar to K-means for clustering. On a Japanese-English patent translation corpus, we obtained a 2.5% point relative improvement in translation quality in the best case.
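The abstract above mentions a cosine measure for word similarity. A minimal sketch of cosine similarity is given here, assuming plain bag-of-words count vectors, which the abstract does not specify. The function name `cosine_similarity` and the sample sentences are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    vec_a, vec_b = Counter(tokens_a), Counter(tokens_b)
    # Only words present in both texts contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical sample texts, tokenized on whitespace.
doc_a = "the valve body is coupled to the housing".split()
doc_b = "the housing is coupled to the valve seat".split()
doc_c = "translation quality improves with clustering".split()

sim_ab = cosine_similarity(doc_a, doc_b)
sim_ac = cosine_similarity(doc_a, doc_c)
```

Here `doc_a` and `doc_b` share most of their vocabulary and score close to 1, while `doc_a` and `doc_c` share no words and score 0. A K-means-like procedure could use such a score to assign each document to its most similar cluster.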

Domain Adaptation Method for LHMM-based English Part-of-Speech Tagger (LHMM기반 영어 형태소 품사 태거의 도메인 적응 방법)

Kwon, Oh-Woog;Kim, Young-Gil
- Journal of KIISE:Computing Practices and Letters / v.16 no.10 / pp.1000-1004 / 2010
A large number of current language processing systems use a part-of-speech tagger for preprocessing, and most of them require a tagger with the highest possible accuracy. In particular, exploiting domain-specific advantages has become a hot issue in the machine translation community as a way to improve translation quality. This paper presents a method for customizing an HMM- or LHMM-based English tagger from the general domain to a specific domain. The proposed method semi-automatically customizes the output and transition probabilities of the HMM or LHMM using a domain-specific raw corpus. In experiments on customization to the patent domain, our LHMM tagger adapted by the proposed method achieves a word tagging accuracy of 98.87% and a sentence tagging accuracy of 78.5%. Compared with the general tagger, our tagger improves word tagging accuracy by 2.24 percentage points (ERR: 66.4%) and sentence tagging accuracy by 41.0 percentage points (ERR: 65.6%).
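The ERR figures in the abstract above can be checked with a little arithmetic. The sketch below assumes that ERR denotes the relative reduction in error rate and that the reported improvements are absolute accuracy gains. Neither assumption is stated in the abstract, and the function name `error_rate_reduction` is hypothetical.

```python
def error_rate_reduction(adapted_acc, gain):
    """Relative reduction in error rate, in percent.

    adapted_acc: accuracy (%) of the adapted tagger
    gain: absolute accuracy gain (percentage points) over the baseline
    """
    baseline_error = 100.0 - (adapted_acc - gain)
    adapted_error = 100.0 - adapted_acc
    return 100.0 * (baseline_error - adapted_error) / baseline_error

# Figures reported in the abstract.
word_err = error_rate_reduction(98.87, 2.24)  # word-level tagging
sent_err = error_rate_reduction(78.5, 41.0)   # sentence-level tagging
```

Under these assumptions the word-level error falls from 3.37% to 1.13% and the sentence-level error from 62.5% to 21.5%, giving reductions of about 66.5% and 65.6%. These values are in line with the reported 66.4% and 65.6%.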

