• Title/Abstract/Keywords: Bi-gram Analysis

Search results: 10 items (processing time 0.028 s)

텍스트 마이닝 기법을 활용한 ECDIS 사고보고서 분석 (Text Mining Analysis Technique on ECDIS Accident Report)

  • 이정석;이보경;조익순
    • 해양환경안전학회지
    • /
    • Vol. 25, No. 4
    • /
    • pp.405-412
    • /
    • 2019
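  • SOLAS requires ships of 500 gross tonnage or more engaged on international voyages to install ECDIS by the first survey falling on or after 1 July 2018. As ECDIS has been fitted as a new primary navigation instrument, various accidents related to its use have occurred. Twelve accident reports issued by MAIB, BSU, BEAmer, DMAIB, and DSB attributed the accidents to navigators' operational inexperience and to the ECDIS system itself. To analyze the words related to these accident causes quantitatively, the report text was analyzed using R. To express the importance of words according to their frequency, three text mining techniques were used: word cloud, word association, and word weighting. A word cloud presents the frequency of the words used in the form of a cloud, and the N-gram model was applied to it. Among the N-gram models, the uni-gram analysis found "ECDIS" and the bi-gram analysis found "Safety Contour" to be the most frequent terms. Based on the bi-gram analysis, the accident-cause words were divided into those for the navigator and those for the ECDIS system, and the related words were presented through word association. Finally, the words related to the navigator and to the ECDIS system were each organized into a corpus, and word weighting was applied to analyze year-by-year changes in corpus frequency. Trend-line graphs of the corpus changes showed that the navigator corpus decreased toward recent years, whereas the ECDIS system corpus gradually increased.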
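
The uni-gram and bi-gram frequency counts described in this abstract can be sketched as follows. The paper used R; this is only a minimal Python illustration, and the function name `ngram_counts` and the sample sentences are hypothetical (they are not taken from any accident report).

```python
import re
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count word n-grams after lowercasing and a simple alphabetic tokenization.

    Note: n-grams are formed over the whole token stream, so they may span
    sentence boundaries. A real analysis would likely also remove stop words.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)


# Hypothetical sample text, invented for illustration only.
sample = (
    "The safety contour was set incorrectly. "
    "The officer did not check the safety contour on the ECDIS. "
    "The ECDIS alarm was disabled."
)

unigrams = ngram_counts(sample, 1)
bigrams = ngram_counts(sample, 2)

print(unigrams["ecdis"])           # 2
print(bigrams["safety contour"])   # 2
print(bigrams.most_common(3))
```

Counting adjacent word pairs rather than single words is what lets a phrase such as "safety contour" surface as one unit instead of two unrelated terms.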

텍스트마이닝 방법론을 활용한 웨어러블 관련 키워드의 트렌드 분석 (Analyzing the Trend of Wearable Keywords using Text-mining Methodology)

  • 김민정
    • 디지털융복합연구
    • /
    • Vol. 18, No. 9
    • /
    • pp.181-190
    • /
    • 2020
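  • This study applied text mining to wearable-related text collected from newspaper articles in order to analyze trends in wearable-related keywords. To this end, 11,952 newspaper articles published from 1992 to 2019 were collected, and frequency analysis and bi-gram analysis were applied. The frequency analysis extracted Samsung Electronics, LG Electronics, and Apple as the most frequent terms, and showed that smartwatch and smart band appeared consistently on the device side. IT exhibitions also appeared as high-frequency terms every year, and articles combined wearables with keywords for next-generation technologies. The bi-gram analysis showed that word pairs such as "world-first" (세계-최초) and "world-largest" (세계-최대) appeared consistently, and that new related word pairs emerged whenever an issue or event occurred. Identifying these trends in wearable-related keywords should be useful for understanding wearable developments and their future direction.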
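
One way to look for word pairs that persist across years and pairs that newly appear in a given year is sketched below. This is an assumption-laden illustration, not the study's procedure: the function names and the tiny article list are hypothetical, and the tokens are assumed to be already extracted (Korean text would normally need morphological analysis first).

```python
from collections import Counter


def bigrams_by_year(articles):
    """Map each year to a Counter of adjacent token pairs.

    `articles` is an iterable of (year, tokens) tuples.
    """
    by_year = {}
    for year, tokens in articles:
        by_year.setdefault(year, Counter()).update(zip(tokens, tokens[1:]))
    return by_year


def new_pairs(by_year, year):
    """Return the pairs seen in `year` that appear in no earlier year."""
    earlier = set()
    for y, counts in by_year.items():
        if y < year:
            earlier.update(counts)
    return set(by_year.get(year, ())) - earlier


# Hypothetical, pre-tokenized articles invented for illustration.
articles = [
    (2014, ["world", "first", "smart", "watch"]),
    (2015, ["world", "first", "smart", "band"]),
    (2015, ["world", "largest", "it", "exhibition"]),
]

by_year = bigrams_by_year(articles)
print(("world", "first") in by_year[2014] and ("world", "first") in by_year[2015])  # True
print(sorted(new_pairs(by_year, 2015)))
```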

채식주의자: 랭귀지 모델 접근 (A Language Model Approach to "The Vegetarian")

  • 김재준;권준혁;김유래;박명관;송상헌
    • 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리)
    • /
    • 한국정보과학회언어공학연구회 2017년도 제29회 한글 및 한국어 정보처리 학술대회
    • /
    • pp.260-263
    • /
    • 2017
  • This paper aims to broaden the range of possible analyses of the Korean-language novel "The Vegetarian" by using a computational linguistics program. A language model of the kind commonly used for bi-gram analysis in corpus linguistics is applied to this Man Booker International Prize-winning novel, and the characteristics of "The Vegetarian" are investigated by comparing it with the English-language novel "A Little Life".

채식주의자: 랭귀지 모델 접근 (A Language Model Approach to "The Vegetarian")

  • 김재준;권준혁;김유래;박명관;송상헌
    • 한국어정보학회:학술대회논문집
    • /
    • 한국어정보학회 2017년도 제29회 한글및한국어정보처리학술대회
    • /
    • pp.260-263
    • /
    • 2017
  • This paper aims to broaden the range of possible analyses of the Korean-language novel "The Vegetarian" by using a computational linguistics program. A language model of the kind commonly used for bi-gram analysis in corpus linguistics is applied to this Man Booker International Prize-winning novel, and the characteristics of "The Vegetarian" are investigated by comparing it with the English-language novel "A Little Life".

토픽 모델링 및 바이그램 네트워크 분석 기법을 통한 여대생의 건강관리 및 웨어러블 디바이스 인식에 관한 연구 (Analyzing Female College Student's Recognition of Health Monitoring and Wearable Device Using Topic Modeling and Bi-gram Network Analysis)

  • 정우경;신동희
    • 정보관리학회지
    • /
    • Vol. 38, No. 4
    • /
    • pp.129-152
    • /
    • 2021
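  • This study used topic modeling and network analysis to analyze female college students' perceptions of and preferences for wearable devices, together with their health-management needs, and on that basis proposed directions for developing wearable devices suited to female college students. To this end, 2,457 posts related to health management and wearable devices were collected from an online community used by students of S Women's University. After the collected posts and comments were preprocessed, LDA-based topic modeling was performed. Topic modeling was used to derive the main issues for female college students regarding health management and wearable devices, and bi-gram analysis and network analysis were performed on the postings containing the related keywords in order to identify the views that female college students hold about wearable devices.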
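
A bi-gram network of the kind mentioned in this abstract treats each adjacent word pair as a weighted edge between two word nodes. The sketch below uses only the Python standard library; the function names and the sample posts are hypothetical, and the study's actual tooling and preprocessing are not stated in the abstract.

```python
from collections import Counter, defaultdict


def bigram_edges(docs):
    """Count adjacent token pairs across tokenized documents as weighted edges."""
    edges = Counter()
    for tokens in docs:
        edges.update(zip(tokens, tokens[1:]))
    return edges


def weighted_degree(edges):
    """Sum the weights of all edges touching each node, ignoring direction."""
    degree = defaultdict(int)
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return dict(degree)


# Hypothetical, pre-tokenized posts invented for illustration.
posts = [
    ["smart", "watch", "sleep", "tracking"],
    ["smart", "watch", "heart", "rate"],
    ["sleep", "tracking", "app"],
]

edges = bigram_edges(posts)
degree = weighted_degree(edges)

print(edges[("smart", "watch")])  # 2
print(degree["watch"])            # 4
```

Words with a high weighted degree sit at the center of the network, which is one simple way to see which terms the discussion is organized around.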

Synthesis, characterization, and biological significance of mixed ligand Schiff base and alizarin dye-metal complexes

  • Laith Jumaah Al-Gburi;Taghreed H. Al-Noor
    • 분석과학
    • /
    • Vol. 37, No. 4
    • /
    • pp.239-250
    • /
    • 2024
  • This study reports the synthesis of a bidentate Schiff base ligand (L), 7-(2-((2-formylbenzylidene) amino)-2-phenylacetamido)-3-methyl-8-oxo-5-thia-1-azabicyclo[4.2.0]oct-2-ene-2-carboxylic acid, prepared from phthalaldehyde and the antibiotic cephalexin. The Schiff base ligand (L) and the secondary ligand alizarin (Az) were used to prepare the new complexes [M(Az)2(L)] and [Cr(Az)2(L)]Cl, where M = Mn(II), Co(II), Ni(II), Cu(II), and Zn(II). The mode of bonding of the Schiff base was characterized by UV-Visible, FT-IR, mass, 1H-, and 13C-NMR spectroscopic techniques and by micro elemental analysis (CHNS). The complexes were characterized using UV-Vis, FT-IR, molar conductance, magnetic moment, and thermal analysis (TG/DTG). The molar conductance data revealed that the complexes are non-electrolytes except for [Cr(Az)2(L)]Cl, which is a 1:1 electrolyte. The Schiff base and its complexes were tested for biological activity against two strains of bacteria and one fungus. When screened against gram-positive and gram-negative pathogens, the Az and L ligands and their complexes showed potential antimicrobial activity.

Modern Methods of Text Analysis as an Effective Way to Combat Plagiarism

  • Myronenko, Serhii;Myronenko, Yelyzaveta
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 8
    • /
    • pp.242-248
    • /
    • 2022
  • The article analyzes modern methods for automatically comparing original and unoriginal text to detect textual plagiarism. The study covers two types of plagiarism: literal, in which the text is copied exactly without any changes, and intelligent, which uses more sophisticated techniques that are harder to detect because the text is manipulated, for example by replacing words and symbols. Standard techniques for extrinsic detection are string-based, vector space, and semantic-based. The most common and most successful models for detecting literal plagiarism, N-gram and Vector Space, are analyzed first, and their advantages and disadvantages are evaluated. The most effective models for detecting intelligent plagiarism, in particular for identifying paraphrases by measuring the semantic similarity of short components of the text, are then investigated. Models that use neural network architectures and are based on natural language sentence matching approaches, such as Densely Interactive Inference Network (DIIN), Bilateral Multi-Perspective Matching (BiMPM), and Bidirectional Encoder Representations from Transformers (BERT) and its family of models, are considered. Progress in improving plagiarism detection systems, techniques, and related models is summarized. Relevant and urgent problems that remain unresolved in detecting intelligent plagiarism, namely the effective recognition of unoriginal ideas and of well-paraphrased text, are outlined.
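
To make the N-gram approach to literal plagiarism concrete, the sketch below compares two texts by the Jaccard overlap of their word trigram sets. This is one common way to use N-grams for this purpose and is not necessarily the formulation evaluated in the article; the function names and example sentences are hypothetical.

```python
import re


def word_ngrams(text: str, n: int = 3) -> set:
    """Return the set of word n-grams (as tuples) in a lowercased text."""
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical example texts invented for illustration.
source = "The quick brown fox jumps over the lazy dog"
suspect = "The quick brown fox jumps over a sleeping dog"
unrelated = "Bi-gram analysis counts adjacent word pairs in a corpus"

score = jaccard(word_ngrams(source), word_ngrams(suspect))
baseline = jaccard(word_ngrams(source), word_ngrams(unrelated))

print(score)     # 0.4 (4 shared trigrams out of 10 distinct ones)
print(baseline)  # 0.0
```

Exact n-gram overlap drops quickly once words are substituted, which is why the abstract treats paraphrase ("intelligent" plagiarism) as needing semantic models instead.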

Bi-LSTM 기반의 한국어 감성사전 구축 방안 (KNU Korean Sentiment Lexicon: Bi-LSTM-based Method for Building a Korean Sentiment Lexicon)

  • 박상민;나철원;최민성;이다희;온병원
    • 지능정보연구
    • /
    • Vol. 24, No. 4
    • /
    • pp.219-240
    • /
    • 2018
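  • A sentiment lexicon is a dictionary of sentiment vocabulary that serves as a basic resource for sentiment analysis. The sentiment vocabulary that makes up such a lexicon can vary in the type or intensity of sentiment depending on the domain. For example, the sentiment word '슬프다' ("sad") generally carries a negative meaning, but it does not carry a negative meaning when applied to the movie domain. To perform accurate sentiment analysis, it is therefore important to build a sentiment lexicon suited to the specific domain. Recent studies have used general-purpose sentiment lexicons such as OpenHangul and SentiWordNet to build domain-specific lexicons. However, the OpenHangul service has been discontinued and can no longer be used, and SentiWordNet does not reflect the characteristics of Korean sentiment vocabulary well once translated, so both are limited as base resources for building domain-specific sentiment lexicons. To address the problems of the existing general-purpose sentiment lexicons, this paper builds a new Korean general-purpose sentiment lexicon and names it the KNU Korean Sentiment Lexicon. For the KNU Korean Sentiment Lexicon, the sentiment of the definitions in the Standard Korean Language Dictionary (표준국어대사전) was classified using Bi-LSTM with an accuracy of 89.45%. Positive sentiment vocabulary is extracted from the definitions classified as positive, and negative sentiment vocabulary from those classified as negative, in various forms such as 1-grams, 2-grams, phrases, and sentence patterns. The sentiment vocabulary was also expanded using various external sources (SentiWordNet, SenticNet, 감정동사, 감성사전0603), and it includes sentiment vocabulary for neologisms and emoticons used in online text. The KNU Korean Sentiment Lexicon built in this paper consists of 14,843 sentiment words that are not affected by any specific domain, and it can serve as a base resource for building domain-specific sentiment lexicons efficiently and quickly. It can also be used as an input feature to improve the performance of deep learning, and to carry out basic sentiment analysis or to build large training data sets for machine learning quickly.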

산불이 토양 미생물 군집과 효소 활성 변화에 미치는 영향 (Effect of Fire on Microbial Community Structure and Enzyme Activities in Forest Soil)

  • 오주환;이슬비;박성은;이용복;김필주
    • 한국환경농학회지
    • /
    • Vol. 27, No. 2
    • /
    • pp.133-138
    • /
    • 2008
  • Fire can affect the microbial community structure of soil through altered environmental conditions, nutrient availability, and biotic sources for microbial re-colonization. We examined the influence of fire on the chemical properties and enzyme activities of soil for 10 months. We also characterized the soil microbial community structure through ester-linked fatty acid methyl ester (EL-FAME) analysis. For this study, we established five burned plots (1 × 1 m) and five unburned plots outside the margin of the fire. Three soil cores were sampled in each plot and composited for analysis at 1, 3, 5, 8, and 10 months after the fire. Compared with unburned sites, the fire increased soil pH, exchangeable Ca and Mg, organic matter, and available $P_2O_5$. The $NH_4-N$ content in burned sites was significantly higher than in unburned sites, and this effect continued for 8 months after the fire. There was no difference in soil $NO_3-N$ content between burned and unburned sites. Fire caused no change in acid phosphatase and arylsulfatase activities, but $\beta$-glucosidase and alkaline phosphatase activities were higher in burned sites than in unburned sites. Microbial biomass, estimated as the total concentration of EL-FAMEs, was significantly higher in burned sites than in unburned sites at one month after the fire. In burned sites, the EL-FAMEs indicative of gram-positive bacteria decreased and the fatty acids associated with gram-negative bacteria tended to increase at one and three months after the fire. The sum of the EL-FAME compounds $18:2{\omega}6,9c$ and $18:1{\omega}9c$, which serve as fungal biomarkers, was lower in burned sites than in unburned sites.

Sesquiterpenoids Bioconversion Analysis by Wood Rot Fungi

  • Lee, Su-Yeon;Ryu, Sun-Hwa;Choi, In-Gyu;Kim, Myungkil
    • 한국균학회소식:학술대회논문집
    • /
    • 한국균학회 2016년도 춘계학술대회 및 임시총회
    • /
    • pp.19-20
    • /
    • 2016
  • Sesquiterpenoids are defined as $C_{15}$ compounds derived from farnesyl pyrophosphate (FPP), and their complex structures are found in the tissue of many diverse plants (Degenhardt et al. 2009). FPP's long chain length and additional double bond enable its conversion to a huge range of mono-, di-, and tri-cyclic structures. A number of cyclic sesquiterpenes with alcohol, aldehyde, and ketone derivatives have key biological and medicinal properties (Fraga 1999). Fungi, such as the wood-rotting Polyporus brumalis, are excellent sources of pharmaceutically interesting natural products such as sesquiterpenoids. In this study, we investigated the biosynthesis of P. brumalis sesquiterpenoids on modified medium. Fungal suspensions of 11 white rot species were inoculated in modified medium containing $C_6H_{12}O_6$, $C_4H_{12}N_2O_6$, $KH_2PO_4$, $MgSO_4$, and $CaCl_2$ for 20 days. Cultivation was stopped by solvent extraction via separation of the mycelium. The metabolites were identified as follows: propionic acid (1), mevalonic acid lactone (2), ${\beta}$-eudesmane (3), and ${\beta}$-eudesmol (4) (Figure 1). The main peaks of ${\beta}$-eudesmane and ${\beta}$-eudesmol, which were indicative of sesquiterpene structures, were consistently detected for 5, 7, 12, and 15 days. These results demonstrated the existence of terpene metabolism in the mycelium of P. brumalis. Polyporus spp. are known to generate flavor components such as methyl 2,4-dihydroxy-3,6-dimethyl benzoate; 2-hydroxy-4-methoxy-6-methyl benzoic acid; 3-hydroxy-5-methyl phenol; and 3-methoxy-2,5-dimethyl phenol in submerged cultures (Hoffmann and Esser 1978). Drimanes of sesquiterpenes were reported as metabolites from P. arcularius and shown to exhibit antimicrobial activity against Gram-positive bacteria such as Staphylococcus aureus (Fleck et al. 1996). The main metabolites of P. brumalis, ${\beta}$-eudesmol and ${\beta}$-eudesmane, were categorized as eudesmane-type sesquiterpene structures. The eudesmane skeleton could be biosynthesized from FPP-derived IPP, and approximately 1,000 structures have been identified in plants as essential oils. The biosynthesis of eudesmol from P. brumalis may thus be an important tool for the production of useful natural compounds as presumed from its identified potent bioactivity in plants. Essential oils comprising eudesmane-type sesquiterpenoids have been previously and extensively researched (Wu et al. 2006). ${\beta}$-Eudesmol is a well-known and important eudesmane alcohol with an anticholinergic effect in the vascular endothelium (Tsuneki et al. 2005). Additionally, recent studies demonstrated that ${\beta}$-eudesmol acts as a channel blocker for nicotinic acetylcholine receptors at the neuromuscular junction, and it can inhibit angiogenesis in vitro and in vivo by blocking the mitogen-activated protein kinase (MAPK) signaling pathway (Seo et al. 2011). Variation of nutrients was conducted to determine an optimum condition for the biosynthesis of sesquiterpenes by P. brumalis. Genes encoding terpene synthases, which are crucial to the terpene synthesis pathway, generally respond to environmental factors such as pH, temperature, and available nutrients (Hoffmeister and Keller 2007, Yu and Keller 2005). Calvo et al. described the effect of major nutrients, carbon and nitrogen, on the synthesis of secondary metabolites (Calvo et al. 2002). P. brumalis did not prefer to synthesize sesquiterpenes under all growth conditions. Differences in the metabolites observed in P. brumalis grown in PDB and in modified medium highlighted the potential effect of inorganic sources such as $C_4H_{12}N_2O_6$, $KH_2PO_4$, $MgSO_4$, and $CaCl_2$ on sesquiterpene synthesis. ${\beta}$-Eudesmol was apparent during cultivation except when P. brumalis was grown on $MgSO_4$-free medium. 
These results demonstrated that $MgSO_4$ can specifically control the biosynthesis of ${\beta}$-eudesmol. Magnesium has been reported as a cofactor that binds to sesquiterpene synthase (Agger et al. 2008). Specifically, the $Mg^{2+}$ ions bind to two conserved metal-binding motifs. These metal ions complex to the substrate pyrophosphate, thereby promoting the ionization of the leaving groups of FPP and resulting in the generation of a highly reactive allylic cation. The effect of the magnesium source on sesquiterpene biosynthesis was also identified via analysis of the concentration of total carbohydrates. Our current study offered further insight that fungal sesquiterpene biosynthesis can be controlled by nutrients. To profile the metabolites of P. brumalis, the cultures were extracted based on the growth curve. Although metabolites were produced during mycelial growth, it was difficult to detect significant changes in metabolite production, especially for those at low concentrations. These compounds may be of interest in understanding their synthetic mechanisms in P. brumalis. The synthesis of terpene compounds began during the growth phase at day 9. Sesquiterpene synthesis occurred after growth was complete. At day 9, drimenol, farnesol, and mevalonic lactone (or mevalonic acid lactone) were identified. Mevalonic acid lactone is the precursor of the mevalonic pathway, and particularly, it is a precursor for a number of biologically important lipids, including cholesterol hormones (Buckley et al. 2002). Farnesol is the precursor of sesquiterpenoids. Drimenol compounds, bi-cyclic-sesquiterpene alcohols, can be synthesized from trans-trans farnesol via cyclization and rearrangement (Polovinka et al. 1994). They have also been identified in the basidiomycota Lentinus lepideus as secondary metabolites. After 12 days in the growth phase, ${\beta}$-elemene, caryophyllene, ${\delta}$-cadinene, and eudesmane were detected with ${\beta}$-eudesmol. 
The data showed the synthesis of sesquiterpene hydrocarbons with bi-cyclic structures. These compounds can be synthesized from FPP by cyclization. Cyclic terpenoids are synthesized through the formation of a carbon skeleton from linear precursors by terpene cyclase, which is followed by chemical modification by oxidation, reduction, methylation, etc. Sesquiterpene cyclase is a key branch-point enzyme that catalyzes the complex intermolecular cyclization of the linear prenyl diphosphate into cyclic hydrocarbons (Toyomasu et al. 2007). After 20 days in stationary phase, the oxygenated structures eudesmol, elemol, and caryophyllene oxide were detected. Thus, after growth, sesquiterpenes were identified. Per these results, we showed that terpene metabolism in wood-rotting fungi occurs in the stationary phase. We also showed that such metabolism can be controlled by magnesium supplementation in the growth medium. In conclusion, we identified P. brumalis as a wood-rotting fungus that can produce sesquiterpenes. To mechanistically understand eudesmane-type sesquiterpene biosynthesis in P. brumalis, further research into the genes regulating the dynamics of such biosynthesis is warranted.
