• Title/Summary/Keyword: natural language process

Application of AI Technology in Requirements Analysis and Architecture Definition - status and prospects (요구사항 분석 및 아키텍처 정의 분야의 인공지능 적용 현황 및 방향)

  • Jin Il, Kim;Choong Sub, Yeum;Joong Uk, Shin
    • Journal of the Korean Society of Systems Engineering / v.18 no.2 / pp.50-57 / 2022
  • Along with the development of the 4th Industrial Revolution technology, artificial intelligence technology is also being used in the field of systems engineering. This study analyzed the development status of artificial intelligence technology in the areas of systems engineering core processes such as stakeholder needs and requirements definition, system requirement analysis, and system architecture definition, and presented future technology development directions. In the definition of stakeholder needs and requirements, technology development is underway to compensate for the shortcomings of the existing requirement extraction methods. In the field of system requirement analysis, technology for automatically checking errors in individual requirements and technology for analyzing categories of requirements are being developed. In the field of system architecture definition, a technology for automatically generating architectures for each system sector based on requirements is being developed. In this study, these contents were summarized and future development directions were presented.

Suggested social media big data consulting chatbot service for restaurant start-ups

  • Jong-Hyun Park;Jun-Ho Park;Ki-Hwan Ryu
    • International journal of advanced smart convergence / v.12 no.3 / pp.68-74 / 2023
  • The food industry has been hit hard since the first outbreak of COVID-19 in 2019. However, as social distancing was lifted as of April 2022 and the restaurant industry has gradually recovered, interest in restaurant start-ups is increasing. Therefore, in this paper, 'restaurant start-up' was selected as the key keyword for social media big data analysis using TexTom, and word frequency and cone analysis were conducted. The keyword collection period ran from May 1, 2022, when social distancing due to COVID-19 was lifted, to May 23, 2023, and based on the results, a plan to develop chatbot services for restaurant start-ups was proposed. The paper considers what to take into account when starting a restaurant and a chatbot service that allows prospective restaurant founders to receive information more conveniently. Based on these analysis results, we expect to contribute to the development of chatbots for prospective restaurant founders in the future.

A study on NLP Text Preprocessing for digital forensic investigation (디지털 포렌식 조사를 위한 NLP의 텍스트 전처리 연구)

  • Lee, Sung-won;Kim, Dohyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.189-191 / 2022
  • In modern society, messenger services are essential for communicating with others, and criminals are no exception. In representative cases such as Burning Sun Gate (2018) and NthRoom (2019), messenger data analysis provided the smoking gun that solved the case. Messenger text analytics is therefore critical for resolving crimes in a modern environment. However, analyzing messenger data takes a lot of time in the digital forensic investigation process, so text mining research needs to respond to the current situation more effectively. In this paper, we study various natural language processing (NLP) text preprocessing methods suited to the characteristics of instant messages, so that NLP analysis of instant messengers can proceed effectively.

A Study on Detection of Abnormal Patterns Based on AI·IoT to Support Environmental Management of Architectural Spaces (건축공간 환경관리 지원을 위한 AI·IoT 기반 이상패턴 검출에 관한 연구)

  • Kang, Tae-Wook
    • Journal of KIBIM / v.13 no.3 / pp.12-20 / 2023
  • Deep learning-based anomaly detection technology is used in various fields such as computer vision, speech recognition, and natural language processing. In particular, it is applied to monitoring manufacturing equipment abnormalities, detecting financial fraud, detecting network hacking, and detecting anomalies in medical images. However, in the field of construction and architecture, research on deep learning-based data anomaly detection is difficult because late digital conversion has left domain knowledge poorly digitized, training data are lacking, and field data are hard to collect and process in real time. From the viewpoint of monitoring for environmental management of architectural spaces, this study proposes an implementation process that acquires the necessary data through IoT (Internet of Things), stores them in a database, trains a deep learning model, and then supports anomaly pattern detection using AI (Artificial Intelligence) deep learning-based anomaly detection. The results of this study suggest an effective environmental anomaly pattern detection solution architecture for environmental management of architectural spaces and demonstrate its feasibility. The proposed method enables quick response through real-time processing and analysis of data collected from IoT. To confirm the effectiveness of the proposed method, performance analysis is carried out on a prototype implementation.

Effects of Preprocessing on Text Classification in Balanced and Imbalanced Datasets

  • Mehmet F. Karaca
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.3 / pp.591-609 / 2024
  • In this study, all combinations of preprocessing methods were examined in terms of their effects on reducing the number of words, shortening processing time, and classification success, in balanced datasets and in datasets imbalanced at different ratios. The decreases in the number of words and in processing time provided by preprocessing were interrelated. Classification was more successful with Turkish datasets, and English datasets were affected more by whether the dataset was balanced. In highly imbalanced datasets, incorrect classifications in classes with few documents were assigned to the class closest in topic in Turkish datasets, and to the class with many documents in English datasets. In terms of average scores, the highest classification success was obtained in Turkish datasets without lowercasing, with stemming, and with stop word removal, and in English datasets with lowercasing, stemming, and stop word removal. Stemming was the most important preprocessing method for increasing success in Turkish datasets, whereas stop word removal was the most important in English datasets. The maximum scores revealed that feature selection, feature size, and classifier are more influential than preprocessing in classification success. It was concluded that preprocessing is necessary for text classification because it shortens processing time and can achieve high classification success, that a preprocessing method does not have the same effect in all languages, and that different preprocessing methods are more successful for different languages.
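
The idea of examining every combination of lowercasing, stemming, and stop word removal can be sketched as follows. This is a minimal illustration, not the study's pipeline: the stop word list, the `toy_stem` suffix stripper, and all function names are hypothetical stand-ins, and a real experiment would use language-specific stemmers and stop word lists.

```python
from itertools import product

# Tiny illustrative stop word list (assumption); real lists are much longer
# and language specific.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in"}


def toy_stem(word):
    """Crude suffix stripper standing in for a real stemmer (hypothetical)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def preprocess(text, lowercase, stem, remove_stop_words):
    """Apply the selected preprocessing steps to whitespace-split tokens."""
    tokens = text.split()
    if lowercase:
        tokens = [t.lower() for t in tokens]
    if remove_stop_words:
        # Compare case-insensitively so the step works without lowercasing.
        tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
    if stem:
        tokens = [toy_stem(t) for t in tokens]
    return tokens


def all_combinations(text):
    """Run all 2^3 on/off combinations; keys are (lowercase, stem, stop)."""
    return {
        flags: preprocess(text, *flags)
        for flags in product([False, True], repeat=3)
    }


def vocabulary_size(tokens):
    """Number of distinct tokens, a simple proxy for 'word number'."""
    return len(set(tokens))
```

Comparing `vocabulary_size` across the eight outputs of `all_combinations` gives a rough view of how much each combination shrinks the vocabulary before any classifier is trained.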

Incorporating Deep Median Networks for Arabic Document Retrieval Using Word Embeddings-Based Query Expansion

  • Yasir Hadi Farhan;Mohanaad Shakir;Mustafa Abd Tareq;Boumedyen Shannaq
    • Journal of Information Science Theory and Practice / v.12 no.3 / pp.36-48 / 2024
  • The information retrieval (IR) process often encounters a challenge known as query-document vocabulary mismatch, where user queries do not align with document content, impacting search effectiveness. Automatic query expansion (AQE) techniques aim to mitigate this issue by augmenting user queries with related terms or synonyms. Word embedding, particularly Word2Vec, has gained prominence for AQE due to its ability to represent words as real-number vectors. However, AQE methods typically expand individual query terms, potentially leading to query drift if not carefully selected. To address this, researchers propose utilizing median vectors derived from deep median networks to capture query similarity comprehensively. Integrating median vectors into candidate term generation and combining them with the BM25 probabilistic model and two IR strategies (EQE1 and V2Q) yields promising results, outperforming baseline methods in experimental settings.
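
To make the median-vector idea concrete, the sketch below expands a query using a single vector that summarizes the whole query rather than each term separately. It is a simplification under stated assumptions: a plain coordinate-wise median replaces the deep median network described in the abstract, the two-dimensional `TOY_EMBEDDINGS` are invented values rather than Word2Vec output, and all names are hypothetical.

```python
import math
from statistics import median

# Invented 2-D vectors for illustration only (assumption); real embeddings
# would come from a trained model such as Word2Vec.
TOY_EMBEDDINGS = {
    "car": [1.0, 0.0],
    "engine": [0.8, 0.2],
    "vehicle": [0.9, 0.1],
    "motor": [0.85, 0.15],
    "banana": [0.0, 1.0],
}


def median_vector(vectors):
    """Coordinate-wise median of equal-length vectors."""
    return [median(dim) for dim in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(query_terms, embeddings, k=2):
    """Append the k vocabulary terms closest to the query's median vector."""
    known = [t for t in query_terms if t in embeddings]
    if not known:
        return list(query_terms)
    center = median_vector([embeddings[t] for t in known])
    candidates = [
        (cosine(center, vec), term)
        for term, vec in embeddings.items()
        if term not in query_terms
    ]
    candidates.sort(reverse=True)  # most similar first
    return list(query_terms) + [term for _, term in candidates[:k]]
```

Because candidates are ranked against one query-level vector, a term related to only one query word is less likely to be selected, which is the intuition behind limiting query drift. The expanded query would then be passed to a ranking function such as BM25.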

Korean Sentence Generation Using Phoneme-Level LSTM Language Model (한국어 음소 단위 LSTM 언어모델을 이용한 문장 생성)

  • Ahn, SungMahn;Chung, Yeojin;Lee, Jaejoon;Yang, Jiheon
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.71-88 / 2017
  • Language models were originally developed for speech recognition and language processing. Using a set of example sentences, a language model predicts the next word or character based on sequential input data. N-gram models have been widely used, but they cannot model the correlation between input units efficiently because they are probabilistic models based on the frequency of each unit in the training set. Recently, with the development of deep learning algorithms, recurrent neural network (RNN) and long short-term memory (LSTM) models have been widely used for neural language models (Ahn, 2016; Kim et al., 2016; Lee et al., 2016). These models can reflect dependency between objects that are entered sequentially into the model (Gers and Schmidhuber, 2001; Mikolov et al., 2010; Sundermeyer et al., 2012). To train a neural language model, texts need to be decomposed into words or morphemes. However, since a training set of sentences generally includes a huge number of words or morphemes, the dictionary is very large, which increases model complexity. In addition, word-level or morpheme-level models can generate only vocabulary contained in the training set. Furthermore, with highly morphological languages such as Turkish, Hungarian, Russian, Finnish, or Korean, morpheme analyzers are more likely to make errors in the decomposition process (Lankinen et al., 2016). Therefore, this paper proposes a phoneme-level language model for Korean based on LSTM models. A phoneme, such as a vowel or a consonant, is the smallest unit that comprises Korean texts. We construct the language model using three or four LSTM layers. Each model was trained using the stochastic gradient algorithm and more advanced optimization algorithms such as Adagrad, RMSprop, Adadelta, Adam, Adamax, and Nadam. A simulation study was done with Old Testament texts using the deep learning package Keras, based on Theano. After pre-processing the texts, the dataset included 74 unique characters, including vowels, consonants, and punctuation marks. We then constructed input vectors of 20 consecutive characters, each with the following 21st character as output. In total, 1,023,411 input-output pairs were included in the dataset, and we divided them into training, validation, and test sets in the proportion 70:15:15. All simulations were conducted on a system equipped with an Intel Xeon CPU (16 cores) and an NVIDIA GeForce GTX 1080 GPU. We compared the loss function evaluated on the validation set, the perplexity evaluated on the test set, and the time taken to train each model. As a result, all the optimization algorithms except the stochastic gradient algorithm showed similar validation loss and perplexity, which were clearly superior to those of the stochastic gradient algorithm. The stochastic gradient algorithm also took the longest time to train for both the 3- and 4-LSTM layer models. On average, the 4-LSTM layer model took 69% longer to train than the 3-LSTM layer model. However, its validation loss and perplexity were not improved significantly, and even became worse under specific conditions. On the other hand, when comparing the automatically generated sentences, the 4-LSTM layer model tended to generate sentences closer to natural language than the 3-LSTM layer model. Although there were slight differences in the completeness of the generated sentences between the models, the sentence generation performance was quite satisfactory under all simulation conditions: the models generated only legitimate Korean letters, and the use of postpositions and the conjugation of verbs were almost perfect grammatically. The results of this study are expected to be widely used for the processing of Korean in the fields of language processing and speech recognition, which are the basis of artificial intelligence systems.
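
The data preparation described in the abstract above (phoneme-level units, 20-character inputs with the 21st character as target, a 70:15:15 split, and perplexity as a test metric) can be sketched in plain Python. This is an illustration under assumptions, not the authors' code: it assumes phonemes correspond to the jamo obtained by standard Unicode Hangul syllable decomposition, that the split is sequential rather than shuffled, and that perplexity is the exponential of the mean negative log-likelihood. All function names are hypothetical.

```python
import math

# Jamo tables in Unicode Hangul syllable order (19 initials, 21 medials,
# 27 finals plus "no final").
CHO = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
JUNG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
JONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def decompose(text):
    """Split precomposed Hangul syllables into jamo; keep other characters."""
    out = []
    for ch in text:
        idx = ord(ch) - 0xAC00
        if 0 <= idx < 11172:  # 19 * 21 * 28 precomposed syllables
            out.append(CHO[idx // 588])
            out.append(JUNG[(idx % 588) // 28])
            if JONG[idx % 28]:
                out.append(JONG[idx % 28])
        else:
            out.append(ch)
    return out


def make_windows(seq, n=20):
    """Pair every run of n consecutive units with the unit that follows it."""
    return [(seq[i:i + n], seq[i + n]) for i in range(len(seq) - n)]


def split_70_15_15(pairs):
    """Sequential 70:15:15 split (assumption: no shuffling)."""
    n_train = len(pairs) * 70 // 100
    n_val = len(pairs) * 15 // 100
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])


def perplexity(target_probs):
    """exp of the mean negative log-probability assigned to the true targets."""
    nll = [-math.log(p) for p in target_probs]
    return math.exp(sum(nll) / len(nll))
```

For example, `decompose("한글")` yields the six jamo ㅎ, ㅏ, ㄴ, ㄱ, ㅡ, ㄹ, and a sequence of 120 units produces 100 input-output pairs, split 70/15/15.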

A Study on Architectural Design Factors for Tall Office Buildings with Regional Climates based on Sustainability

  • Cho, Jong Soo
    • Architectural research / v.7 no.2 / pp.13-21 / 2005
  • Throughout history, buildings have been interrelated with certain indigenous characteristics such as regional climate, culture, and religion. In particular, the control of regional climate has been primarily a concern for compatibility with nature. In our modern age, technologies to control climate have been successfully developed in architecture, but the consumption of large quantities of natural resources can also produce environmental problems. This study is based on the proposition that this negative trend can be minimized with architectural design that is motivated to coexist with the regional climate. The study develops such design strategies for tall office buildings by analyzing various combinations of building design configurations based on regional climates. The objective is to determine, during the initial design process, the optimum architecture of tall office buildings that will reduce energy consumption under regional climatic conditions. The eQUEST energy simulation program, based on DOE-2.2, was used for this comparative analysis of energy use in tall office buildings across architectural design variables and regional climates. The results are statistically analyzed and presented in functional architectural design decision-making tables and charts. The comparison of architectural design considerations for tall office buildings in relation to regional climates shows that buildings need less energy when the architecture takes the regional climate into account, which yields a more reasonable design methodology. In reality, imbalanced planning, in which architectural design lacks regional characteristics, requires additional natural resources to maintain comfortable indoor conditions. Therefore, integrating architectural design with regional nature should be the first architectural design stage, and this research provides the rationale. This architectural design language approach must be a starting point for sustaining long-term planning.

An Emotional Gesture-based Dialogue Management System using Behavior Network (행동 네트워크를 이용한 감정형 제스처 기반 대화 관리 시스템)

  • Yoon, Jong-Won;Lim, Sung-Soo;Cho, Sung-Bae
    • Journal of KIISE:Software and Applications / v.37 no.10 / pp.779-787 / 2010
  • As robots have recently come into wide use, research on human-robot communication is being actively conducted. Typically, natural language processing or gesture generation has been applied to human-robot interaction. However, existing methods for communication between robots and humans are limited to static communication, so a method for more natural and realistic interaction is required. In this paper, an emotional gesture-based dialogue management system is proposed for sophisticated human-robot communication. The proposed system communicates using Bayesian networks and pattern matching, and generates the robot's emotional gestures in real time while the user communicates with the robot. Through emotional gestures, the robot can communicate with the user more efficiently and realistically. We used behavior networks as the gesture generation method to deal with dialogue situations that change dynamically. Finally, we designed a usability test to confirm the usefulness of the proposed system by comparing it with an existing dialogue system.

Development of an Energy Model of Rice Processing Complex(II) -Simulation Model Development and Analysis of Energy Requirement- (미곡종합처리장의 에너지 모델 개발(II) -시뮬레이션 모델 개발 및 소요 에너지 분석-)

  • 장홍희;장동일;김만수
    • Journal of Biosystems Engineering / v.20 no.3 / pp.275-287 / 1995
  • The rice processing complex (RPC) consists of rice handling, drying, storage, and milling processes. RPCs had been established at 83 locations domestically by April 1994, and 200 more will be built throughout the country. Therefore, this study was performed to achieve the following two objectives: 1) development of mathematical models that can assess the requirements for electricity, fuel, and labor for four model systems of the rice processing complex; 2) development of a computer simulation model that can produce improved RPC designs from the evaluation results of the energy requirements of the four RPC models. The results of this study are summarized as follows: 1) Mathematical models were developed on the basis of the results of mass balance analysis and the required power of the machines for each process. 2) A computer simulation model was developed that can produce improved RPC designs from the evaluation results of energy requirements. The simulation model was written in Borland C++. 3) The simulation results showed that total energy requirements ranged from 75.94 kWh/t to 124.30 kWh/t. 4) From the computer analysis of energy requirements classified by drying type, the energy requirement of drying type A {paddy rice (PR) for storage: natural air drying (15%); PR for milling: heated air drying (16%)} was found to be less than that of drying type B {step 1: natural air drying (PR for storage: 18%, PR for milling: 20%); step 2: heated air drying (PR for storage: 15%, PR for milling: 16%)}. 5) The energy-efficient drying method is to dry all the rough rice coming into the RPC with natural air drying systems. If the amount exceeds the capacity of the natural air drying system, the surplus rough rice is recommended to be dried by the heated air drying method.
