• Title/Abstract/Keywords: Text network

Search results: 1,103 items (processing time: 0.028 seconds)

Determining Feature-Size for Text to Numeric Conversion based on BOW and TF-IDF

  • Alyamani, Hasan J.
    • International Journal of Computer Science & Network Security / Vol. 22, No. 1 / pp.283-287 / 2022
  • Machine learning is the most popular method used in data science. Data growth involves not only numeric data but also text data, yet most supervised and unsupervised machine learning algorithms operate on numeric data, so text data must first be converted into numeric form. Many techniques exist for this conversion, and researchers are often unsure which technique is best in which situation. This work studies BOW (Bag-of-Words) and TF-IDF (Term Frequency-Inverse Document Frequency) across different numbers of features to determine the best method. In experiments on text data, both TF-IDF and BOW performed better with roughly 100 to 150 features.
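
As a rough illustration of the two conversions compared in this abstract, the sketch below builds BOW and TF-IDF vectors with a capped feature size using only the Python standard library. The helper names (`build_vocab`, `bow_vectors`, `tfidf_vectors`), the toy documents, and the smoothed idf variant ln((1 + N) / (1 + df)) + 1 are assumptions for illustration; the abstract does not say which implementation or weighting the author used.

```python
import math
from collections import Counter

def build_vocab(docs, max_features):
    """Keep the max_features most frequent tokens across the corpus."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tok for tok, _ in ranked[:max_features]]

def bow_vectors(docs, vocab):
    """Bag-of-Words: raw term counts per document, one column per vocab term."""
    rows = []
    for doc in docs:
        tf = Counter(doc.lower().split())
        rows.append([tf[t] for t in vocab])
    return rows

def tfidf_vectors(docs, vocab):
    """TF-IDF using the smoothed idf = ln((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n = len(docs)
    bow = bow_vectors(docs, vocab)
    # Document frequency: number of documents in which each term occurs.
    df = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n) / (1 + d)) + 1 for d in df]
    return [[c * w for c, w in zip(row, idf)] for row in bow]

# Toy corpus (hypothetical); max_features plays the role of the feature size.
docs = [
    "text data needs numeric conversion",
    "numeric data feeds machine learning",
    "text mining converts text",
]
vocab = build_vocab(docs, max_features=3)
bow = bow_vectors(docs, vocab)
tfidf = tfidf_vectors(docs, vocab)
```

Varying `max_features` (for example between 100 and 150 on a real corpus) and re-evaluating a downstream classifier is one way the feature-size question in the abstract could be explored.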

Helping People with Visual Disability Using AI

  • Naif Al Otaibi;Tariq S Almurayziq
    • International Journal of Computer Science & Network Security / Vol. 24, No. 1 / pp.205-208 / 2024
  • Artificial Intelligence (AI) technology has evolved rapidly in recent years and is used in everything from banking to email management to surgery, but many of the Internet's most engaging features remain hard to use without sight. AI can benefit people with disabilities. The main purpose of this study is to find ways to help people with visual impairments using AI technology. An application is developed for visually impaired users: a special program installed on the mobile phone. For example, when a message arrives, the program notifies the user by voice (it reads the sender's name, reads the message, and replies to it if necessary). The program uses a customized algorithm developed in Python to convert written text to voice, read text aloud, and convert voice to written text when a visually impaired person wants to respond to a message. It then sends the response as a text message. The research should therefore lead to programs for people with visual impairments. This program makes mobile phones easier and more comfortable to use and makes daily life easier for people with visual impairments.

Analysis of Experience Knowledge of Shooting Simulation for Training Using the Text Mining and Network Analysis

  • 김성규;손창호;김종만;정세교;박재현;전정환
    • 한국군사과학기술학회지 / Vol. 20, No. 5 / pp.700-707 / 2017
  • Recently, the military has needed more varied education and training because of the increasing necessity of various operations. However, military education and training faces difficulties such as limits on time, space, and finances. To overcome these difficulties, the military uses Defense Modeling and Simulation (DM&S). Although training participants gain empirical knowledge from simulation-based education and training, that knowledge is not shared because of characteristics particular to the military, such as security and changes of personnel. This situation hinders improvement in the effectiveness of education and training. The purpose of this research is to systematize and analyze this empirical knowledge using text mining and network analysis in order to assist its sharing. Text mining and network analysis were selected to analyze the texts and documents that record the empirical knowledge. We expect this research to improve the effectiveness of DM&S simulation-based education and training.

Text Network Analysis of Korean Trade Stakeholder's Interactions - A Focus on the Trade Ministry and the Legislature

  • 고보민
    • 무역학회지 / Vol. 45, No. 6 / pp.23-43 / 2020
  • This study analyzes the interactions between two of the most significant trade stakeholders in Korea, the Trade Ministry and the Legislature, using text network analysis. Drawing on seven Action and Plan Reports for Requests from Parliamentary Inspection released by the National Assembly, the paper conducts a topic modelling analysis, focusing on the reports for three trade-related institutions: the MOTIE headquarters, Korea Trade Insurance Corporation, and Korea Trade and Investment Promotion Agency. According to the analysis, traditional MOTIE topics such as enterprise, industry, business, management, and development appeared frequently in the reports. Trade-related topics including export, trade, commerce, investment, overseas, domestic, dispute, cooperation, efficiency, negotiation, service, and promotion also appeared repeatedly. Lastly, a case study of the 2019 Parliamentary Inspection Report showed the specific trade-related topics and relevant contents that raised issues in that year. This analysis implies that the text data derived from the Parliamentary Inspection Reports between the MOTIE and the National Assembly can be established as a so-called 'trade policy information system', which is valuable not only for the two parties but also for the rest of the trade stakeholders in Korea.

Text Mining of Wood Science Research Published in Korean and Japanese Journals

  • Eun-Suk JANG
    • Journal of the Korean Wood Science and Technology / Vol. 51, No. 6 / pp.458-469 / 2023
  • Text mining techniques provide valuable insights into research information across various fields. In this study, text mining was used to identify research trends in wood science from 2012 to 2022, with a focus on representative journals published in Korea and Japan. Abstracts from Journal of the Korean Wood Science and Technology (JKWST, 785 articles) and Journal of Wood Science (JWS, 812 articles) obtained from the SCOPUS database were analyzed in terms of the word frequency (specifically, term frequency-inverse document frequency) and co-occurrence network analysis. Both journals showed a significant occurrence of words related to the physical and mechanical properties of wood. Furthermore, words related to wood species native to each country and their respective timber industries frequently appeared in both journals. CLT was a common keyword in engineering wood materials in Korea and Japan. In addition, the keywords "MDF," "MUF," and "GFRP" were ranked in the top 50 in Korea. Research on wood anatomy was inferred to be more active in Japan than in Korea. Co-occurrence network analysis showed that words related to the physical and structural characteristics of wood were organically related to wood materials.
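
The co-occurrence network analysis mentioned in this abstract can be sketched roughly as follows. This is a minimal standard-library illustration, assuming that two words co-occur when they appear in the same abstract; the function names and the three toy abstracts are hypothetical, and the study's actual tooling and co-occurrence window are not stated in the abstract.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(docs, min_count=1):
    """Count, for each unordered word pair, the number of documents containing both."""
    edges = Counter()
    for doc in docs:
        # Use the set of words so a pair is counted once per document.
        words = sorted(set(doc.lower().split()))
        for a, b in combinations(words, 2):
            edges[(a, b)] += 1
    return {pair: c for pair, c in edges.items() if c >= min_count}

def weighted_degree(edges):
    """Sum of edge weights attached to each word (a simple centrality measure)."""
    deg = Counter()
    for (a, b), w in edges.items():
        deg[a] += w
        deg[b] += w
    return deg

# Toy abstracts (hypothetical), already reduced to keywords.
abstracts = [
    "wood density strength",
    "wood strength clt",
    "clt panel strength",
]
edges = cooccurrence_edges(abstracts, min_count=2)      # keep pairs seen in >= 2 abstracts
degree = weighted_degree(cooccurrence_edges(abstracts))  # centrality over all pairs
```

In this toy corpus, "strength" ends up with the highest weighted degree, which is the kind of hub word such an analysis would surface.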

Chinese-clinical-record Named Entity Recognition using IDCNN-BiLSTM-Highway Network

  • Tinglong Tang;Yunqiao Guo;Qixin Li;Mate Zhou;Wei Huang;Yirong Wu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 7 / pp.1759-1772 / 2023
  • Chinese named entity recognition (NER) is a challenging task that seeks to find, recognize, and classify various types of information elements in unstructured text. Because Chinese text has no natural boundaries like the spaces in English text, Chinese named entity identification is much more difficult. At present, most deep learning-based NER models are built on a bidirectional long short-term memory network (BiLSTM), yet their performance still has room for improvement. To further improve performance in Chinese NER tasks, we propose a new NER model, IDCNN-BiLSTM-Highway, which combines the BiLSTM, the iterated dilated convolutional neural network (IDCNN), and the highway network. In our model, IDCNN is used to achieve multiscale context aggregation from a long sequence of words. The highway network is used to effectively connect different layers of networks, allowing information to pass through network layers smoothly without attenuation. Finally, the globally optimal tag sequence is obtained by introducing a conditional random field (CRF). The experimental results show that, compared with other popular deep learning-based NER models, our model achieves superior performance on two Chinese NER data sets, Resume and Yidu-S4k, with F1-scores of 94.98 and 77.59, respectively.
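
To make the highway connection concrete, the sketch below implements a single highway layer in its commonly cited form, y = T(x) * H(x) + (1 - T(x)) * x, in plain Python. The tanh candidate transform, the weight shapes, and all names are assumptions for illustration; the abstract does not specify the layer configuration used in IDCNN-BiLSTM-Highway.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, x):
    """Compute W @ x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def highway_layer(x, W_h, b_h, W_t, b_t):
    """y = T(x) * H(x) + (1 - T(x)) * x, applied element-wise."""
    h = [math.tanh(v) for v in affine(W_h, b_h, x)]  # candidate transform H(x)
    t = [sigmoid(v) for v in affine(W_t, b_t, x)]    # transform gate T(x) in (0, 1)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

x = [0.5, -1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]

# Strongly negative gate bias: the gate closes and the input is carried through.
carried = highway_layer(x, identity, zeros, identity, [-50.0, -50.0])
# Strongly positive gate bias: the gate opens and the output follows H(x).
transformed = highway_layer(x, identity, zeros, identity, [50.0, 50.0])
```

The carry path is what lets information pass through a layer essentially unchanged when the gate is closed, which matches the "without attenuation" behaviour described in the abstract.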

Research on Methods for Processing Nonstandard Korean Words on Social Network Services

  • 이종화;레환수;이현규
    • 한국산업정보학회논문지 / Vol. 21, No. 3 / pp.35-46 / 2016
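  • Social network services (SNS), online services that build networks of people who share particular interests or activities, and blogs, spaces where users can freely post writing, photos, and videos according to their interests, have become established as a social phenomenon of self-promotion and self-expression. Research such as text mining, opinion mining, and semantic analysis is actively under way to analyze the text that users freely post on SNS and blogs and to find meaningful information, value, and patterns. Keyword-based studies have also been carried out to improve researchers' efficiency. However, most studies show many limitations with respect to Korean orthography. This study builds a database of words that are difficult to identify during mining, namely strange "alien language" whose roots are hard to find, indiscriminately used slang, and hard-to-understand Korean emoticons and Internet language, in order to overcome the limitations of data-dictionary-based mining techniques. The study used blogs consisting of subjective opinions on a specific topic as the case for analysis and found that extracting nonstandard words using Unicode is useful for text mining.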
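
One way Unicode can help flag nonstandard Korean is sketched below: tokens containing standalone consonants or vowels from the Hangul Compatibility Jamo block (such as "ㅋㅋㅋ" or "ㅠㅠ") are picked out, since standard words are written with composed Hangul syllables. The rule, the function names, and the sample sentence are assumptions for illustration; the abstract does not describe the exact Unicode criteria the authors applied.

```python
def is_compat_jamo(ch):
    """True for letters in the Hangul Compatibility Jamo block (U+3131 to U+318E)."""
    return 0x3131 <= ord(ch) <= 0x318E

def extract_nonstandard(text):
    """Return whitespace-delimited tokens that contain at least one standalone jamo."""
    return [tok for tok in text.split() if any(is_compat_jamo(c) for c in tok)]

# Hypothetical blog-style sentence mixing standard words with emoticon-like jamo runs.
sample = "오늘 날씨 좋다 ㅋㅋㅋ 진짜ㅠㅠ 최고"
flagged = extract_nonstandard(sample)
```

Flagged tokens could then be stored in a nonstandard-word database of the kind the study describes, rather than being dropped by a dictionary-based tokenizer.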

Senior' Use of Text Messages and SNS and Contact with Informal Social Network Members

  • 정찬우;최희정
    • 디지털융복합연구 / Vol. 19, No. 3 / pp.401-414 / 2021
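  • This study examined how older adults' competence in using text messages and SNS is related to their contact with non-coresident children, siblings and relatives, friends, neighbors, and acquaintances. The sample comprised 8,392 adults aged 65 or older who participated in the 2017 survey of older adults (노인실태조사), divided into four groups according to their ability to receive and send text messages and to use SNS. Regression analysis showed that non-face-to-face contact (keeping in touch) with all members of the informal social network was most frequent when older adults were able to use both text messages and SNS. However, competence in using text messages and SNS was significantly associated mainly with the frequency of face-to-face contact with friends, neighbors, and acquaintances. The results suggest that competence in using text messages and SNS may play a key role in the exchange of emotional and instrumental support with family and friends and thereby contribute to the quality of life of older adults living in the community. They also suggest that digital literacy education can play an important role in sustaining relationships with informal social network members.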

Single Shot Detector for Detecting Clickable Object in Mobile Device Screen

  • 조민석;전혜원;한성수;정창성
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 11, No. 1 / pp.29-34 / 2022
  • We build a dataset for recognizing clickable objects on mobile device screens and propose a new network architecture. Data were collected from a range of applications on devices with various resolutions, based on the clickable objects on the screen. A total of 24,937 annotations were divided into seven categories: text, edit text, image, button, region, status bar, and navigation bar. The model trained on this dataset uses the Deconvolution Single Shot Detector as the baseline; its backbone network is a Squeeze-and-Excitation network, which adds Squeeze-and-Excitation blocks to the existing ResNet, and the Single Shot Detector layers and deconvolution modules are stacked in a Feature Pyramid Network arrangement and connected to the header. In addition, to minimize the feature loss caused by the original 1:1 input resolution ratio, the input was changed to a 1:2 ratio similar to a mobile device screen. In experiments on the constructed dataset, mean average precision improved by up to 101% over the baseline.
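
The Squeeze-and-Excitation block named in this abstract can be sketched in plain Python as below: a per-channel global average (squeeze), a small bottleneck that produces a weight per channel (excitation), and a channel-wise rescaling. The weights, the omission of bias terms, and the tiny 2-channel feature map are assumptions for illustration only; the abstract does not give the block's reduction ratio or where it sits in the authors' ResNet backbone.

```python
import math

def se_block(feature_maps, W1, W2):
    """Apply a bias-free Squeeze-and-Excitation step.

    feature_maps: list of C channels, each an H x W list of lists.
    W1: (C/r) x C weights of the reducing layer; W2: C x (C/r) weights of the expanding layer.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excitation: reducing layer + ReLU, then expanding layer + sigmoid.
    hidden = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in W1]
    scale = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in W2]
    # Reweight: multiply every value in a channel by that channel's weight.
    scaled = [[[v * s for v in row] for row in ch] for ch, s in zip(feature_maps, scale)]
    return scaled, scale

# Two 2x2 channels with hypothetical weights (C = 2, one hidden unit).
features = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
W1 = [[1.0, 1.0]]
W2 = [[0.0], [1.0]]
scaled, scale = se_block(features, W1, W2)
```

Here the first channel is scaled by exactly 0.5 (its expanding weight is zero) while the second is scaled by sigmoid(4.0), showing how the block emphasises some channels over others.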