• Title/Summary/Keyword: text extraction

Traffic Data Generation Technique for Improving Network Attack Detection Using Deep Learning (네트워크 공격 탐지 성능향상을 위한 딥러닝을 이용한 트래픽 데이터 생성 연구)

  • Lee, Wooho;Hahm, Jaegyoon;Jung, Hyun Mi;Jeong, Kimoon
    • Journal of the Korea Convergence Society / v.10 no.11 / pp.1-7 / 2019
  • Recently, various approaches to detecting network attacks with machine learning have been studied and applied to detect new attacks and increase precision. However, machine learning methods depend on feature extraction, which is time-consuming and complex. Their performance is also limited by imbalance in the training data. In this study, we propose a method to address the degradation of classification performance caused by training-data imbalance, one of the limitations of detection systems. To do this, we generate data using Generative Adversarial Networks (GANs) and propose a classification method using Convolutional Neural Networks (CNNs). We confirm that this approach improves accuracy on the NSL-KDD and UNSW-NB15 datasets.

The Analysis of Chosun Dynasty Poetry Using 3D Data Visualization (3D 시각화를 이용한 조선시대 시문 분석)

  • Min, Kyoung-Ju;Lee, Byoung-Chan
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.7 / pp.861-868 / 2021
  • With the development of big-data visualization technology, tasks such as intuitively analyzing large amounts of data, detecting errors, and deriving meaning are actively progressing. In this paper, we describe the design and implementation of a 3D analysis system that collects the Chinese-character writings provided by the Korean Classical Database of the Korean Classics Translation Institute, stores and processes the data, and visualizes the writing information in a 3D network diagram. This addresses the problems that arise when a large amount of data is expressed in 2D, and provides intuitive analysis, error detection, extraction of meaningful data such as characteristics, similarities, and differences, and user convenience. In doing so, we improve on the problems of analyzing Chosun dynasty poetry in Chinese characters with 2D visualization encountered in previous studies.

Development of a Blocks Recognition Application for Children's Education using a Smartphone Camera (스마트폰 카메라 기반 아동 교육용 산수 블록 인식 애플리케이션 개발)

  • Park, Sang-A;Oh, Ji-Won;Hong, In-Sik;Nam, Yunyoung
    • Journal of Internet Computing and Services / v.20 no.4 / pp.29-38 / 2019
  • Today's information society is changing rapidly and demands innovation and creativity in various fields. The importance of mathematics, which can be the basis of creativity and logic, is therefore emphasized. The purpose of this paper is to develop a math education application that further develops logical mathematical thinking and encourages voluntary learning by using readily available teaching aids to build children's motivation and interest in learning. The application uses a smartphone and blocks for children. Its main function is to photograph the blocks with the camera and show the calculated value: when a child arranges blocks into a formula and photographs them, the child can immediately see the result of that formula. The preprocessing, text extraction, and character recognition of the photographed images are implemented using the OpenCV and Tesseract-OCR libraries.
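The final step described above, turning the recognized text of a block formula into a displayed value, could look roughly like the following. This is only a sketch under stated assumptions: the OCR stage is taken to have already produced a string, and the function names (`normalize`, `evaluate`), the supported operators, and the look-alike character mapping are hypothetical and not taken from the paper.

```python
import ast
import operator

# Operators a block formula is assumed to use (an assumption, not from the paper).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def normalize(ocr_text):
    """Drop whitespace, map look-alike symbols to ASCII operators, strip a trailing '='."""
    table = str.maketrans({"×": "*", "x": "*", "X": "*", "÷": "/", "−": "-"})
    return "".join(ocr_text.split()).translate(table).rstrip("=")

def _eval(node):
    # Only plain numbers and the four basic binary operations are accepted.
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    raise ValueError("unsupported formula")

def evaluate(ocr_text):
    """Evaluate a recognized arithmetic string without calling eval()."""
    return _eval(ast.parse(normalize(ocr_text), mode="eval").body)

print(evaluate("3 + 4 × 2 ="))
```

Walking the parsed expression tree instead of calling `eval` means misrecognized text can only raise an error; it cannot execute arbitrary code.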

Improvement of Satellite Ocean Information Service for Offshore Marine Industry (연근해 해양산업을 위한 위성해양 정보 서비스 개선방안)

  • Cho, Bo-Hyun;Lee, Gun-Wook;Kim, Dong-Chun;Yang, Keum-Cheol;Kim, SG;Yo, Seung-jae
    • Convergence Security Journal / v.18 no.1 / pp.85-91 / 2018
  • In this study, we design a marine environmental information service system based on satellite images to reduce the damage caused by changes in the marine environment. The system provides satellite oceanographic information such as water temperature, chlorophyll, and float as hierarchical text, and is implemented as a unit-module Web service so that it can be expanded in an OpenAPI environment. System plug-in portability, service hours, and data extraction precision and speed are used as the basis for diagnosing service stability. With its function and performance secured, the service system implemented in this study can be extended into a composite technology that customizes service by user group, adding information about specific areas of interest to the general location-based services of existing systems. In particular, because marine environment information and various other items of interest are developed as modules, we expect that the system can be expanded by plugging them in and can spread through technical linkage with the information systems of related institutions.

Long Song Type Classification based on Lyrics

  • Namjil, Bayarsaikhan;Ganbaatar, Nandinbilig;Batsuuri, Suvdaa
    • Journal of Multimedia Information System / v.9 no.2 / pp.113-120 / 2022
  • Mongolian folk songs are inspired by Mongolian labor songs and are classified into long and short songs. Mongolian long songs have ancient origins, are rich in legends, and are a great source of folklore, and they were inscribed by UNESCO in 2008. Mongolian written literature was formed under the direct influence of oral literature. Mongolian long songs fall into 3 classes by lyrics and structure: ayzam, suman, and besreg. Ayzam long songs perfectly embody the philosophical nature of world phenomena and the nature of human life. Suman long songs cover a wide range of topics such as the common way of life, respect for ancestors, respect for fathers, respect for mountains and water, livestock and animal husbandry, as well as the history of Mongolia. Besreg long songs are dominated by commanded and trained characters. In this paper, we propose a method to classify these 3 types of long songs using machine learning, based on the structure of their lyrics without semantic information. We collected the lyrics of over 80 long songs and extracted 11 features from every song: the name of the song, number of verses, number of lines, number of words, general value, double value, elapsed time of a verse, elapsed time of 5 words, the longest elapsed time of 1 word, full text, and type label. In our experiments, the proposed features achieve an average recognition rate of 78% with function-type machine learning methods when classifying the ayzam, suman, and besreg classes.
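As an illustration of the purely structural features listed above, the sketch below computes three of them (number of verses, lines, and words) from raw lyric text. It assumes verses are separated by blank lines and words by whitespace. The function name is hypothetical, and the timing-based and "general/double value" features are not attempted because the abstract does not define them.

```python
import re

def structural_features(lyrics):
    """Count verses, lines, and words; verses are assumed to be blank-line separated."""
    verses = [v for v in re.split(r"\n\s*\n", lyrics.strip()) if v.strip()]
    lines = [ln for v in verses for ln in v.splitlines() if ln.strip()]
    words = [w for ln in lines for w in ln.split()]
    return {"verses": len(verses), "lines": len(lines), "words": len(words)}

# Placeholder text standing in for a song's lyrics (not real data).
sample = "a b c\nd e\n\nf g\nh"
print(structural_features(sample))
```

A dictionary like this, one per song, could then be joined with the type label to form rows for a classifier.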

Comparison of healing assessments of periapical endodontic surgery using conventional radiography and cone-beam computed tomography: A systematic review

  • Sharma, Garima;Abraham, Dax;Gupta, Alpa;Aggarwal, Vivek;Mehta, Namrata;Jala, Sucheta;Chauhan, Parul;Singh, Arundeep
    • Imaging Science in Dentistry / v.52 no.1 / pp.1-9 / 2022
  • Purpose: This systematic review aimed to compare assessments of the healing of periapical endodontic surgery using conventional radiography and cone-beam computed tomography (CBCT). Materials and Methods: This review of clinical studies was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist. All articles published from 1990 to March 2020 pertaining to clinical and radiographic healing assessments after endodontic surgery using conventional radiography and CBCT were included. The question was "healing assessment of endodontic surgery using cone-beam computed tomography." The review was conducted by manual searching, as well as undertaking a review of electronic literature databases, including PubMed and Scopus. The studies included compared radiographic and CBCT assessments of periapical healing after periapical endodontic surgery. Results: The initial search retrieved 372 articles. The titles and abstracts of these articles were read, leading to the selection of 73 articles for full-text analysis. After the eligibility criteria were applied, 11 articles were selected for data extraction and qualitative analysis. The majority of studies found that CBCT enabled better assessments of healing than conventional radiography, suggesting higher efficacy of CBCT for correct diagnosis and treatment planning. A risk of bias assessment was done for 10 studies, which fell into the low to moderate risk categories. Conclusion: Three-dimensional radiography provides an overall better assessment of healing, which is imperative for correct diagnosis and treatment planning.

Detection of Depression Trends in Literary Cyber Writers Using Sentiment Analysis and Machine Learning

  • Faiza Nasir;Haseeb Ahmad;CM Nadeem Faisal;Qaisar Abbas;Mubarak Albathan;Ayyaz Hussain
    • International Journal of Computer Science & Network Security / v.23 no.3 / pp.67-80 / 2023
  • Nowadays, psychologists consider social media an important tool to examine mental disorders. Among these disorders, depression is one of the most common yet least cured diseases. Since many writers with extensive followings express their feelings on social media and depression is increasing significantly, exploring the literary text shared on social media may reveal multidimensional features of depressive behaviors. (1) Background: Several studies have observed that depressive data contains certain language styles and self-expressing pronouns, but the current study provides evidence that posts with self-expressing pronouns and depressive language styles contain high emotional temperatures. Therefore, the main objective of this study is to examine literary cyber writers' posts to discover symptomatic signs of depression. For this purpose, our research emphasizes extracting data from writers' public social media pages, blogs, and communities. (3) Results: To examine the emotional temperatures and sentence usage of the depressive and non-depressive groups, we employed the SentiStrength algorithm as a psycholinguistic method, TF-IDF and N-Gram for ranked phrase extraction, and Latent Dirichlet Allocation for topic modelling of the extracted phrases. The results reveal a strong connection between depression and negative emotional temperatures in writers' posts. Moreover, we used Naïve Bayes, Support Vector Machines, Random Forest, and Decision Tree algorithms to validate the classification of depressive and non-depressive posts in terms of sentences, phrases, and topics. The results reveal that, compared with the others, the Support Vector Machines algorithm validates the classification best, attaining the highest F-score of 79%. (4) Conclusions: Experimental results show that the proposed system performs well in detecting depression trends in literary cyber writers using sentiment analysis.
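The "TF-IDF and N-Gram for ranked phrase extraction" step can be sketched with the standard library alone. The version below is a minimal illustration, not the authors' pipeline: it uses whitespace tokenization, word bigrams, and the unsmoothed weighting tf × log(N / df). The function names and sample posts are made up for the example.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous word n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rank_phrases(docs, n=2, top_k=3):
    """Rank each document's n-grams by TF-IDF, highest first (ties broken alphabetically)."""
    doc_terms = [Counter(ngrams(d.lower().split(), n)) for d in docs]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(term for terms in doc_terms for term in terms)
    n_docs = len(docs)
    ranked = []
    for terms in doc_terms:
        total = sum(terms.values()) or 1
        scores = {t: (c / total) * math.log(n_docs / df[t]) for t, c in terms.items()}
        ranked.append(sorted(scores, key=lambda t: (-scores[t], t))[:top_k])
    return ranked

# Made-up posts, for illustration only.
posts = ["i feel so alone tonight", "i feel fine today", "i feel happy today"]
print(rank_phrases(posts)[0])
```

Phrases that occur in every post (here "i feel") get a weight of zero, so the ranking surfaces what is distinctive about each post.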

Using Roots and Patterns to Detect Arabic Verbs without Affixes Removal

  • Abdulmonem Ahmed;Aybaba Hancrliogullari;Ali Riza Tosun
    • International Journal of Computer Science & Network Security / v.23 no.4 / pp.1-6 / 2023
  • Morphological analysis, a branch of natural language processing, is now a rapidly growing field. Its fundamental tenet is that it can establish the roots or stems of words and enable comparison to the original term. Arabic is a highly inflected and derivational language with a strong structure. Because of the non-concatenative nature of Arabic morphology, each root or stem can have a large number of affixes attached to it, increasing the number of possible inflected words. Accurate verb recognition and extraction are necessary for nearly all problems in well-known research areas, including Web Search, Information Retrieval, Machine Translation, and Question Answering. In this work, we have designed and implemented an algorithm to detect and recognize Arabic verbs in Arabic text. The suggested technique was created with "Python" and the "pyqt5" visual package, allowing for quick modification and easy addition of new patterns. We employed 17 alternative patterns to represent all verbs in terms of singular, plural, masculine, and feminine pronouns as well as past, present, and imperative verb tenses. All of the verbs that matched these patterns were used when a verb has a root, and the outcomes were reliable. The approach is able to recognize all verbs with the same structure without requiring any alterations to the code or design. The verbs that are not recognized by our method have no antecedents in the Arabic roots. According to our work, the strategy can rapidly and precisely identify verbs with roots, but it cannot be used to identify verbs that are not in the Arabic language. We therefore advise employing a hybrid approach that combines several principles.
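To make the root-and-pattern idea concrete, here is a deliberately tiny sketch. It is not the paper's algorithm or its 17 patterns: the root list, the five prefix/suffix patterns, and the function name are hypothetical. It also assumes undiacritized text and sound triliteral roots whose three consonants appear contiguously in the surface form.

```python
# Hypothetical mini-lexicon of triliteral roots (not from the paper).
ROOTS = {"كتب", "درس"}

# Hypothetical patterns as (prefix, suffix, tense); the root fills the middle.
PATTERNS = [
    ("", "", "past"),
    ("", "ت", "past"),
    ("ي", "", "present"),
    ("ي", "ون", "present"),
    ("ا", "", "imperative"),
]

def detect_verbs(text):
    """Return (token, root, tense) for tokens that fit a pattern around a known root."""
    hits = []
    for token in text.split():
        for prefix, suffix, tense in PATTERNS:
            # The token must be exactly prefix + three root letters + suffix.
            if len(token) != len(prefix) + 3 + len(suffix):
                continue
            if token.startswith(prefix) and token.endswith(suffix):
                core = token[len(prefix):len(token) - len(suffix)]
                if core in ROOTS:
                    hits.append((token, core, tense))
    return hits

print(detect_verbs("يكتب الولد"))
```

Because matching goes through the root list, a token is only reported when the letters left over after fitting a pattern form a known root. This mirrors the abstract's observation that verbs without an entry among the roots are not recognized.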

Keyword Extraction through Text Mining and Open Source Software Category Classification based on Machine Learning Algorithms (텍스트 마이닝을 통한 키워드 추출과 머신러닝 기반의 오픈소스 소프트웨어 주제 분류)

  • Lee, Ye-Seul;Back, Seung-Chan;Joe, Yong-Joon;Shin, Dong-Myung
    • Journal of Software Assessment and Valuation / v.14 no.2 / pp.1-9 / 2018
  • The proportion of users and companies using open source continues to grow. The open source software market is growing rapidly not only in foreign countries but also in Korea. However, compared to the continuous development of open source software, there is little research on open source software subject classification, and no software classification system has been specified either. At present, users enter or tag the subject directly, which leads to misclassification and inconvenience. Research on open source software classification can also serve as a basis for open source software evaluation, recommendation, and filtering. Therefore, in this study, we propose a method to classify open source software using machine learning models and compare the performance of the models.

Text Network Analysis and Topic Modeling of News Articles on Lonely Death (고독사에 관한 언론보도기사의 텍스트네트워크 분석 및 토픽모델링)

  • Kim, Chunmi;Choi, Seungbeom;Kim, Eun Man
    • Journal of Korean Academy of Rural Health Nursing / v.18 no.2 / pp.113-124 / 2023
  • Purpose: The number of households vulnerable to isolation increases rapidly as social ties decrease, raising concerns about the associated increase in lonely deaths. This study aimed to identify issues related to lonely deaths by analyzing South Korean news articles and to provide evidence for their use in preventing and managing lonely deaths via community nursing. Methods: This exploratory study analyzed the structure and trends of meaning of lonely deaths by identifying the association between keywords in news articles and lonely deaths. We searched for all news articles on lonely deaths covering the period from January 1, 2010, to May 31, 2023. Data preprocessing and cleaning were conducted, followed by top-keyword extraction, keyword network analysis, and topic modeling. The retrieved articles were analyzed using R and Python software. Results: Four main topics were identified: "discovering and responding to lonely death cases", "lonely deaths ending in lonely funerals", "supportive policies to prevent lonely deaths among older adults", and "local government activities to prevent lonely deaths and support vulnerable populations." Conclusion: Based on these findings, it can be concluded that lonely death is a complex social phenomenon that can be prevented if society shows concern and care. Education related to lonely deaths should be included in nursing curricula for concrete action plans and professional development.
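The keyword network analysis step can be illustrated with a simple co-occurrence count. In this sketch two keywords are linked whenever they appear in the same article, and the link weight is the number of such articles. It assumes keywords have already been extracted per article. The function names and sample keywords are illustrative only and are not the study's data or tooling.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(docs_keywords):
    """Count, for each unordered keyword pair, the number of articles containing both."""
    edges = Counter()
    for keywords in docs_keywords:
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges

def weighted_degree(edges):
    """Sum of edge weights per keyword, a simple measure of centrality."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree

# Made-up keyword lists, one per article.
articles = [
    ["lonely death", "elderly", "welfare"],
    ["lonely death", "elderly"],
    ["welfare", "policy"],
]
edges = cooccurrence_edges(articles)
print(edges.most_common(1))
print(weighted_degree(edges).most_common())
```

The resulting edge list can be handed to a graph library for visualization or community detection. The weighted degree already gives a first ranking of the most connected keywords.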