• Title/Summary/Keyword: N-gram analysis

Search Result 137, Processing Time 0.024 seconds

A Joint Statistical Model for Word Spacing and Spelling Error Correction Simultaneously (띄어쓰기 및 철자 오류 동시교정을 위한 통계적 모델)

  • Noh, Hyung-Jong;Cha, Jeong-Won;Lee, Gary Geun-Bae
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.2
    • /
    • pp.131-139
    • /
    • 2007
  • In this paper, we present a preprocessor that corrects word spacing errors and spelling errors simultaneously. The proposed model extends the noisy-channel model so that it corrects both kinds of error in colloquial-style sentences effectively, whereas existing preprocessing algorithms are limited because they correct each kind of error separately. Using an Eojeol transition pattern dictionary and statistical data such as n-gram and Jaso transition probabilities, it minimizes the use of dictionaries and generates correction candidates effectively. Although the experiments did not yield satisfactory results at the current stage, an analysis of the errors indicated that the proposed methodology is useful. We therefore expect that, with further improvements, the preprocessor will function as an effective error corrector for general colloquial-style sentences.
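
The noisy-channel idea in the abstract above can be illustrated with a minimal sketch: each candidate correction is scored by a language-model term plus a channel term, and the best-scoring candidate is kept. Everything below is hypothetical (the English toy tokens, the probabilities, and the function names); it does not reproduce the paper's Eojeol transition patterns or Jaso transition probabilities.

```python
import math

# Toy bigram probabilities over words (hypothetical values, not from the paper).
BIGRAM_LOGP = {
    ("<s>", "I"): math.log(0.5),
    ("I", "am"): math.log(0.6),
    ("am", "here"): math.log(0.3),
    ("am", "hear"): math.log(1e-4),
    ("<s>", "Iam"): math.log(0.001),
    ("Iam", "here"): math.log(0.001),
}
UNSEEN_LOGP = math.log(1e-6)  # crude floor for bigrams missing from the table


def lm_logp(words):
    """Log-probability of a word sequence under the toy bigram model."""
    total, prev = 0.0, "<s>"
    for word in words:
        total += BIGRAM_LOGP.get((prev, word), UNSEEN_LOGP)
        prev = word
    return total


def channel_logp(observed, candidate, edit_logp=math.log(0.05)):
    """Toy channel model: one fixed penalty per character or spacing difference."""
    a, b = observed.replace(" ", ""), candidate.replace(" ", "")
    # Mismatched character positions plus any length difference.
    edits = sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))
    # Difference in the number of spaces counts as spacing edits.
    edits += abs(observed.count(" ") - candidate.count(" "))
    return edits * edit_logp


def best_correction(observed, candidates):
    """Pick the candidate maximizing log P(candidate) + log P(observed | candidate)."""
    return max(
        candidates,
        key=lambda c: lm_logp(c.split()) + channel_logp(observed, c),
    )


candidates = ["Iam hear", "I am hear", "I am here", "Iam here"]
result = best_correction("Iam hear", candidates)
```

Because spacing and spelling edits are scored in one pass, a candidate that fixes both can win even though it needs more edits than a candidate that fixes only one.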

Development of Text Mining-Based Accounting Terminology Analyzer for Financial Information Utilization (재정정보 활용을 위한 텍스트 마이닝 기반 회계용어 형태소 분석기 구축)

  • Jung, Geon-Yong;Yoon, Seung-Sik;Kang, Ju-Young
    • The Journal of Information Systems
    • /
    • v.28 no.4
    • /
    • pp.155-174
    • /
    • 2019
  • Purpose: Social interest in financial statement notes has recently increased. Despite this interest, there is no morphological analyzer for accounting terms, which makes research in this area considerably difficult. In this study, we build a morphological analyzer for text mining of accounting-related documents. The analyzer can handle accounting terms such as those found in financial statements, and we expect it to serve as a springboard for growth in text mining research. Design/methodology/approach: We build a customized Korean morphological analyzer to extract proper accounting terms. First, we collect company financial statement notes, financial information data published by KPFIS (Korea Public Finance Information Service), and K-IFRS accounting term data. Second, we clean and tokenize the text and remove stopwords. Third, we customize the morphological analyzer using an n-gram methodology. Findings: Existing morphological analyzers cannot extract accounting terms because they split them into many separate nouns. The new customized analyzer detects more appropriate accounting terms than the existing analyzers, including accounting words that the existing analyzers failed to detect.
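
One plausible way n-grams can help with terms that an analyzer splits into several nouns is to count adjacent token sequences and promote the frequent ones to single dictionary entries. The sketch below assumes already-tokenized text; the sample tokens, the `min_count` threshold, and the function names are illustrative and are not taken from the study.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequent_compounds(tokenized_docs, n=2, min_count=2):
    """Count n-grams across documents and keep those seen at least min_count times."""
    counts = Counter()
    for tokens in tokenized_docs:
        counts.update(ngrams(tokens, n))
    return {" ".join(gram): c for gram, c in counts.items() if c >= min_count}


# Hypothetical tokenizer output in which accounting terms were split into nouns.
docs = [
    ["deferred", "tax", "asset", "increased"],
    ["deferred", "tax", "liability", "decreased"],
    ["tax", "asset", "recognized"],
]
compounds = frequent_compounds(docs)
```

The surviving n-grams are only candidates; in practice they would still need review before being added to a user dictionary.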

Tendency and Network Analysis of Diet Using Big Data (빅데이터를 활용한 다이어트 현황 및 네트워크 분석)

  • Jung, Eun-Jin;Chang, Un-Jae
    • Journal of the Korean Dietetic Association
    • /
    • v.22 no.4
    • /
    • pp.310-319
    • /
    • 2016
  • Widely used questionnaire surveys are limited by time and cost, a limited number of participants, biased confidence intervals, and unreliable results. To overcome these limitations, we performed a tendency and network analysis of diet among Koreans using big data. Diet-related keywords were collected from the portal site Naver from January 1, 2015 to December 31, 2015, and the collected data were analyzed by simple frequency analysis, N-gram analysis, keyword network analysis, and seasonality analysis. In the N-gram analysis, diet menu appeared most frequently, even though exercise had the highest frequency in the simple frequency analysis. In the keyword network analysis, keywords fell into four groups: a diet group, an exercise group, a commercial diet program company group, and a commercial diet food group. The seasonality analysis showed that interest in diet increased steadily from February 2015 and peaked in July. These results suggest that the best strategies for weight loss are based on diet menus and on starting a diet before July. As people are especially sensitive to diet trends, further research on year-by-year analysis of big data is needed.
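
A keyword network of the kind mentioned above is commonly built from co-occurrence counts: two keywords share an edge whose weight is the number of records in which they appear together. The following sketch assumes each search record has already been reduced to a list of keywords; the sample records and the function name are hypothetical, and the study's actual network construction is not described in the abstract.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(keyword_lists):
    """Count how often each unordered keyword pair appears in the same record."""
    edges = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical orientation.
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical keyword records.
records = [
    ["diet", "menu"],
    ["diet", "exercise"],
    ["diet", "menu", "exercise"],
]
edges = cooccurrence_edges(records)
```

The weighted edge list can then be passed to a graph library for grouping or centrality measures.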

Treatment and Prognosis according to Causative Organisms in Neonatal Bacterial Meningitis (신생아 세균성 뇌막염의 원인균에 따른 치료와 예후)

  • Kim, Dong Joon;Lee, Gwang Hoon;Lee, Hyung Won;Kim, Gil Hyun;Lee, Hak Soo
    • Pediatric Infection and Vaccine
    • /
    • v.4 no.1
    • /
    • pp.79-89
    • /
    • 1997
  • Purpose: Neonatal bacterial meningitis is a disease whose clinical manifestations are nonspecific and in which several neurologic complications may occur. We studied neonatal bacterial meningitis, particularly treatment and prognosis according to causative organism (gram-positive versus gram-negative bacteria), to assist in its treatment. Methods: We retrospectively analysed twenty-four cases that had been admitted to the NICU or pediatric ward of Chung-ang Gil Hospital from Jan. 1991 to Jun. 1996 and had causative organisms proven by culture or latex agglutination test of the CSF. Results: 1) The male-to-female ratio was 2.4:1. The mean birth weight and gestational age were $2.91 \pm 0.79$ kg and $38.4 \pm 2.74$ weeks in cases with gram-positive bacterial meningitis, and $3.30 \pm 0.90$ kg and $37.7 \pm 3.33$ weeks in cases with gram-negative bacterial meningitis. There was no significant difference between the two groups. 2) The perinatal predisposing factors included prematurity, meconium-stained amniotic fluid, maternal diabetes, and pregnancy-induced hypertension. The clinical manifestations included fever, seizure, poor oral intake, and fontanel bulging. There were eleven cases of early-onset bacterial meningitis (four caused by gram-positive bacteria, seven by gram-negative bacteria) and thirteen cases of late-onset bacterial meningitis (seven caused by gram-positive bacteria, six by gram-negative bacteria). There was no significant difference between the two groups in terms of onset. 3) There were eleven cases of gram-positive bacterial meningitis: coagulase-negative staphylococci (three cases), group B streptococci (three cases), Staphylococcus aureus (two cases), Streptococcus viridans (two cases), and enterococci (one case). There were thirteen cases of gram-negative bacterial meningitis: Escherichia coli (seven cases), Klebsiella pneumoniae (three cases), Pseudomonas aeruginosa (one case), Acinetobacter (one case), and Enterobacter (one case). 4) The initial CSF WBC counts in cases with gram-negative bacterial meningitis were significantly higher than those in cases with gram-positive bacterial meningitis, but the CSF protein and glucose levels did not differ significantly between the two groups. 5) Abnormal findings on brain ultrasonography were seen in seven cases of gram-positive bacterial meningitis and ten cases of gram-negative bacterial meningitis. 6) Gram-positive bacteria showed relatively high sensitivity to penicillin derivatives, first-generation cephalosporins, and vancomycin, and gram-negative bacteria to third-generation cephalosporins and amikacin. 7) The mortality rate was 20.8% (5 cases died or were discharged without hope of recovery). There was no significant difference between the two groups in prognosis. Conclusions: Because the prognosis of neonatal bacterial meningitis is poor, we recommend active treatment to improve it.

Media-based Analysis of Gasoline Inventory with Korean Text Summarization (한국어 문서 요약 기법을 활용한 휘발유 재고량에 대한 미디어 분석)

  • Sungyeon Yoon;Minseo Park
    • The Journal of the Convergence on Culture Technology
    • /
    • v.9 no.5
    • /
    • pp.509-515
    • /
    • 2023
  • Despite the continued development of alternative energies, fuel consumption is increasing. In particular, the price of gasoline fluctuates greatly with international oil prices, and gas stations adjust their gasoline inventory in response. In this study, news datasets are used to analyze gasoline consumption patterns through fluctuations in gasoline inventory. First, we collect news datasets by web crawling. Second, we summarize them using KoBART, a model that summarizes Korean text. Finally, we preprocess the summaries and derive the fluctuation factors using an N-gram language model and TF-IDF. This approach makes it possible to analyze and predict gasoline consumption patterns.
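
For readers unfamiliar with TF-IDF, the sketch below computes one common variant: term frequency normalized by document length, multiplied by the logarithm of the inverse document frequency. The sample tokens and the function name are hypothetical, the documents are assumed to be non-empty token lists, and the abstract does not state which TF-IDF variant or tooling the authors used.

```python
import math
from collections import Counter


def tf_idf(tokenized_docs):
    """Return one {term: score} dict per document using tf * log(N / df)."""
    n_docs = len(tokenized_docs)
    # Document frequency: the number of documents containing each term.
    df = Counter()
    for tokens in tokenized_docs:
        df.update(set(tokens))
    scores = []
    for tokens in tokenized_docs:
        tf = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores


# Hypothetical tokenized news summaries.
docs = [
    ["gasoline", "price", "rises"],
    ["gasoline", "inventory", "falls"],
    ["gasoline", "price", "falls"],
]
scores = tf_idf(docs)
```

A term that occurs in every document, such as "gasoline" here, scores zero, which is why TF-IDF tends to surface the terms that distinguish one article from the rest.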

Brain-Waves Analysis according to Ego-state and OK-gram of Transactional Analysis Theory (교류분석이론의 자아상태와 인생태도에 따른 뇌파 분석)

  • Jeong, Cheon-Soo;Kim, Jung-Sam;Kim, Chong-Yeal
    • The Journal of the Korea Contents Association
    • /
    • v.14 no.11
    • /
    • pp.858-863
    • /
    • 2014
  • This study examined whether Transactional Analysis (TA), in which people evaluate their own growth and health through changes in Ego-state (the personality structure), can be measured objectively using brain waves, which monitor the electrical signals occurring in the brain among human biological signals. According to the brain-wave tests, M-type subjects showed the brain waves of healthy adults, with $\alpha$ and $\beta$ waves dominant in the occipital region while awake. In particular, $\beta$ waves appear widely throughout the brain during tense or concentrated mental activity, and unlike in N-type subjects, $\beta$ was dominant in M-type subjects even in a stable condition. N-type subjects also showed the brain waves of healthy adults, with $\alpha$ and $\beta$ dominant in the occipital region while awake. Unlike the tense or concentrating M-type, however, they showed no noise such as tension or blinking while resting. In addition, subjects with high levels of the A ego did not return quickly to the stable state and showed considerable noise from blinking and swallowing saliva, regardless of Egogram pattern. The brain waves of the 11 people whose OK-gram and Ego-state data were the same in all items or differed by less than -5 showed a generally low amplitude of $20\,\mu\mathrm{V}$. In conclusion, this study found that the personality patterns of Transactional Analysis theory are consistent with the brain-wave findings and that brain waves are associated with each Ego-state of the Egogram.

A Study on the Perception of Artificial Intelligence Literacy and Artificial Intelligence Convergence Education Using Text Mining Analysis Techniques (텍스트 마이닝 분석기법을 활용한 인공지능 리터러시 및 인공지능 융합 교육에 관한 인식 연구)

  • Hyeok Yun;Jeongrang Kim
    • Journal of The Korean Association of Information Education
    • /
    • v.26 no.6
    • /
    • pp.553-566
    • /
    • 2022
  • This study collects social data and academic research data from portal sites and RISS and applies TF-IDF, N-gram, semantic network, and CONCOR analyses to examine social awareness and the current state of 'AI Literacy' and 'AI Convergence Education'. On this basis, we sought to understand social awareness and the current situation and to suggest implications and directions. In the social data, more than twice as many items were collected for 'AI Convergence Education' as for 'AI Literacy', indicating that awareness of 'AI Literacy' was relatively low. For 'AI Literacy', the keyword 'human' in the social data did not belong to any cluster, indicating a lack of philosophical interest in and awareness of the humanities and AI. In addition, the keyword 'Ministry of Education' showed high frequency, importance, and connection centrality only in the social data for 'AI Convergence Education', confirming that 'AI Convergence Education' is closely related to government policy.

Synthesis, spectral, thermal, structural study and theoretical treatment of new complexes of mannich base with Ni(II) and study of cytotoxicity effect on (Hepa-2) cell line and antimicrobial activity

  • Omar H. Al-Obaidi
    • Analytical Science and Technology
    • /
    • v.36 no.2
    • /
    • pp.70-79
    • /
    • 2023
  • This study concerns the synthesis of the Mannich base N-(morpholino(phenyl)methyl)acetamide as a ligand (L). The synthesis of the [Ni(L)2]Cl2 complex was confirmed by elemental analysis, FT-IR, UV-vis, 1H-NMR, magnetic measurements, thermal analysis (TG/DTG), and atomic absorption, and its structure was characterized by scanning electron microscopy (SEM) and X-ray powder diffraction (XRD). The melting point of the complex and its molar conductivity were also measured. According to the data acquired using these techniques, the suggested geometry of the complex is tetrahedral. Theoretical approaches to the complex formation were also investigated, using the HYPERCHEM6 program for molecular mechanics and semi-empirical calculations. The effect of the novel Ni(II) complex on the cancer cell line Hepa-2 (human hepatocellular adenocarcinoma), that is the human laryngeal cancer, was studied. The ligand and the complex were found to have potent effects on the cancer cells. The antibacterial activity of the free ligand and its complex was evaluated by the diffusion method against two kinds of human pathogenic bacteria: Gram-positive (Staphylococcus aureus, Staphylococcus epidermidis) and Gram-negative (Pseudomonas aeruginosa, Escherichia coli). Finally, the compounds were found to have varying growth-inhibiting effects on the bacteria.

Production, isolation and characterization of the antibiotic from Pseudomonas aeruginosa 3120 (Pseudomonas aeruginosa 3120으로부터 항생물질의 생산,분리 및 특성)

  • Ko, Hack-Ryong;Chun, Hyo-Kon;Kho, Yung-Hee;Sung, Nack-Kie
    • Applied Biological Chemistry
    • /
    • v.36 no.6
    • /
    • pp.428-433
    • /
    • 1993
  • A strain that inhibited the growth of Pellicularia sasakii was isolated from soil and identified as Pseudomonas aeruginosa 3120. A dark brownish antibiotic, MRL3120, isolated and purified from the culture broth of P. aeruginosa 3120, was soluble in ethyl acetate, chloroform, and methanol, and was active against gram-positive and gram-negative bacteria as well as fungi. Analysis of UV, IR, and EI-MS spectra and other physico-chemical properties identified MRL3120 as a chelate compound consisting of two N-methyl-N-thioformyl-hydroxylamine molecules and a copper ion, and it was presumed to be a fluopsin C-related compound. Addition of $CuSO_4$ to the fermentation medium containing soybean meal increased antifungal activity, but no activity was found in the presence of EDTA (0.1%, v/v). MRL3120 was not produced in a fermentation medium containing soytone instead of soybean meal, but it was rapidly produced upon addition of $CuSO_4$.

Age and Gender Prediction from Korean Tweets with Stylometric Analysis (문체 분석을 활용한 한국어 트위터 사용자의 연령대 및 성별 예측)

  • Kim, Sang-Chae;Park, Jong-C.
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2012.06b
    • /
    • pp.303-305
    • /
    • 2012
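  • People develop their own distinctive writing styles under the influence of their surroundings, so people of the same age group and gender tend to show similar writing styles. Based on this assumption, this study analyzed the style of tweets written by people of various age groups and genders and conducted experiments to predict the age group and gender of the author of an arbitrary tweet. By combining features built from expressions frequently seen in Korean web language with n-gram features, which are comparatively less dependent on the data, we obtained predictions with accuracy about 25% higher than the maximum-likelihood baseline. We also gained a better understanding of how effectively each feature set contributes to prediction.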
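
Two ingredients mentioned above can be sketched briefly: character n-gram counts as stylometric features, and a baseline against which accuracy is compared. The baseline here is interpreted as always predicting the most frequent class, which is an assumption about what the abstract means by the maximum-likelihood baseline; the sample text, labels, and function names are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Count overlapping character n-grams in a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent label (assumed baseline)."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical tweet fragment and author age-group labels.
features = char_ngrams("hahaha", n=2)
baseline = majority_baseline_accuracy(["20s", "20s", "30s", "10s"])
```

Feature counts like these would be fed to a classifier, and its accuracy would then be reported relative to the baseline value.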