• Title/Summary/Keyword: Dataset for AI


Prediction Model of Real Estate ROI with the LSTM Model based on AI and Bigdata

  • Lee, Jeong-hyun;Kim, Hoo-bin;Shim, Gyo-eon
    • International journal of advanced smart convergence / v.11 no.1 / pp.19-27 / 2022
  • Across the world, housing comprises a significant portion of wealth and assets. For this reason, fluctuations in real estate prices are a highly sensitive issue for individual households. In Korea, housing prices have steadily increased over the years, and thus many Koreans view the real estate market as an effective channel for investment. However, purchasing real estate as an investment carries several risks once prices begin to fluctuate. The purpose of this study is to design a real estate 'return rate' prediction model to help mitigate the risks involved in real estate investment and promote reasonable real estate purchases. Various approaches are explored to develop a model capable of predicting real estate prices based on an understanding of the immovability of the real estate market. This study employs the LSTM method, which is based on artificial intelligence and deep learning, to predict real estate prices and validate the model. LSTM networks are based on recurrent neural networks (RNN) but add cell states (which act as a type of conveyor belt) to the hidden states. LSTM networks obtain cell states and hidden states recursively. Data on the actual trading prices of apartments in autonomous districts between January 2006 and December 2019 are collected from the Actual Trading Price Disclosure System of the Ministry of Land, Infrastructure and Transport (MOLIT). Additionally, basic data on apartments and commercial buildings are collected from the Public Data Portal and the Seoul Metropolitan Government's data portal. The collected actual trading price data are scaled to monthly average trading amounts, and each data entry is pre-processed according to address to produce 168 data entries. An LSTM model for return rate prediction is prepared based on a time series dataset where the training period is set as April 2015~August 2017 (29 months), the validation period is set as September 2017~September 2018 (13 months), and the test period is set as December 2018~December 2019 (13 months). The results of the return rate prediction study are as follows. First, the model achieved a prediction similarity level of almost 76%; this level was confirmed after the time series data had been collected and the final prediction model prepared. All in all, the results demonstrate the reliability of the LSTM-based model for return rate prediction.
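The recursive update of hidden and cell states described in the abstract can be sketched as follows. This is a minimal NumPy illustration of a standard LSTM step, not the authors' model: the gate layout, weight shapes, random inputs, and the names `lstm_step` and `run_lstm` are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One standard LSTM time step.

    Assumed shapes: W is (4H, D), U is (4H, H), b is (4H,),
    with the four gates stacked in the order input, forget, output, candidate.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    o = sigmoid(z[2 * H:3 * H])    # output gate
    g = np.tanh(z[3 * H:4 * H])    # candidate cell values
    c_t = f * c_prev + i * g       # cell state: the "conveyor belt"
    h_t = o * np.tanh(c_t)         # hidden state derived from the cell state
    return h_t, c_t

def run_lstm(xs, W, U, b, hidden_size):
    """Apply lstm_step recursively over a sequence, starting from zero states."""
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, U, b)
    return h, c

# Hypothetical input: 12 monthly values with one feature each (random stand-ins).
rng = np.random.default_rng(0)
D, H, T = 1, 4, 12
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
xs = rng.normal(size=(T, D))
h_final, c_final = run_lstm(xs, W, U, b, H)
```

In practice a trained output layer would map the final hidden state to a predicted return rate; that layer and the training loop are omitted here.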

Performance Evaluation of Object Detection Deep Learning Model for Paralichthys olivaceus Disease Symptoms Classification (넙치 질병 증상 분류를 위한 객체 탐지 딥러닝 모델 성능 평가)

  • Kyung won Cho;Ran Baik;Jong Ho Jeong;Chan Jin Kim;Han Suk Choi;Seok Won Jung;Hvun Seung Son
    • Smart Media Journal / v.12 no.10 / pp.71-84 / 2023
  • Paralichthys olivaceus (olive flounder) accounts for more than half of Korea's aquaculture industry. However, about 25-30% of the total annual breeding volume is lost to disease, which severely harms the economic feasibility of fish farms. For the economic growth of Paralichthys olivaceus farms, it is necessary to diagnose disease symptoms quickly and accurately by automating the diagnosis of Paralichthys olivaceus diseases. In this study, we create training data using innovative data collection methods, data refinement algorithms, and dataset partitioning techniques, and compare the Paralichthys olivaceus disease symptom detection performance of four object detection deep learning models (YOLOv8, Swin, ViTDet, and MViTv2). The experimental findings indicate that the YOLOv8 model is superior in terms of mean average precision (mAP) and Estimated Time of Arrival (ETA). If the performance of the AI model proposed in this study is verified, Paralichthys olivaceus farms will be able to diagnose disease symptoms in real time, and farm productivity is expected to improve greatly through rapid preventive measures based on the diagnosis results.
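For readers unfamiliar with the mAP metric mentioned above, a minimal sketch of its two building blocks follows: box overlap (IoU) and per-class average precision. The box format `(x1, y1, x2, y2)`, the all-point interpolation, and the function names are assumptions for illustration; the paper's exact evaluation protocol is not stated in the abstract.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def average_precision(is_tp, num_gt):
    """AP for one class.

    is_tp: true-positive flags for detections sorted by descending confidence.
    num_gt: number of ground-truth boxes for the class.
    """
    is_tp = np.asarray(is_tp, dtype=bool)
    tp = np.cumsum(is_tp)
    fp = np.cumsum(~is_tp)
    recall = tp / num_gt
    precision = tp / (tp + fp)
    # Replace each precision by the best precision at any higher recall.
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    recall_prev = np.concatenate(([0.0], recall[:-1]))
    return float(np.sum((recall - recall_prev) * precision))

def mean_average_precision(ap_per_class):
    """mAP is the mean of the per-class AP values."""
    return float(np.mean(ap_per_class))
```

A detection is usually counted as a true positive when its IoU with an unmatched ground-truth box exceeds a chosen threshold such as 0.5; that matching step is left out of this sketch.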

A Basic Study on User Experience Evaluation Based on User Experience Hierarchy Using ChatGPT 4.0 (챗지피티 4.0을 활용한 사용자 경험 계층 기반 사용자 경험 평가에 관한 기초적 연구)

  • Soomin Han;Jae Wan Park
    • The Journal of the Convergence on Culture Technology / v.10 no.2 / pp.493-498 / 2024
  • With the rapid advancement of generative artificial intelligence technology, there is growing interest in how to utilize it in practical applications. Additionally, the importance of prompt engineering for generating results that meet user demands is being newly highlighted. Exploring the new possibilities of generative AI can hold significant value. This study aims to utilize ChatGPT 4.0, a leading generative AI, to propose an effective method for evaluating user experience through the analysis of online customer review data. The user experience evaluation method was based on the six-layer elements of user experience: 'functionality', 'reliability', 'usability', 'convenience', 'emotion', and 'significance'. For this study, a literature review was conducted to deepen the understanding of prompt engineering and to clarify the concept of the user experience hierarchy. Based on this, prompts were crafted, and experiments on the user experience evaluation method were carried out by analyzing collected online customer review data. We find that, when provided with accurate definitions and descriptions of the classification processes for user experience factors, ChatGPT demonstrated excellent performance in evaluating user experience. However, it was also found that, due to time constraints, there were limitations in analyzing large volumes of data. By proposing a method to utilize ChatGPT 4.0 for user experience evaluation, we expect to contribute to the advancement of the UX field.

Korean Machine Reading Comprehension for Patent Consultation Using BERT (BERT를 이용한 한국어 특허상담 기계독해)

  • Min, Jae-Ok;Park, Jin-Woo;Jo, Yu-Jeong;Lee, Bong-Gun
    • KIPS Transactions on Software and Data Engineering / v.9 no.4 / pp.145-152 / 2020
  • MRC (machine reading comprehension) is an AI NLP task that predicts the answer to a user's query by understanding the relevant document, and it can be used in automated consultation services such as chatbots. The BERT (Bidirectional Encoder Representations from Transformers) model, which has recently shown high performance in various fields of natural language processing, has two phases. The first phase is pre-training on the big data of each domain, and the second phase is fine-tuning the model to solve each NLP task as a prediction. In this paper, we build a patent MRC dataset and show how to construct patent consultation training data for the MRC task. We also propose a method to improve the performance of the MRC task using a Patent-BERT model pre-trained on the patent consultation corpus, together with a language processing algorithm suited to machine learning on patent counseling data. The experimental results show that the proposed method improves performance in answering patent counseling queries.

Edge Computing Model based on Federated Learning for COVID-19 Clinical Outcome Prediction in the 5G Era

  • Ruochen Huang;Zhiyuan Wei;Wei Feng;Yong Li;Changwei Zhang;Chen Qiu;Mingkai Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.4 / pp.826-842 / 2024
  • As 5G and AI continue to develop, their adoption in the healthcare industry has surged. The COVID-19 pandemic has posed immense challenges to the global health system. This study proposes an edge computing model based on federated learning (FL) for predicting clinical outcomes of COVID-19 patients during hospitalization. The model aims to address the challenges posed by the pandemic, such as the need for sophisticated predictive models, privacy concerns, and the non-IID nature of COVID-19 data. The model utilizes the FATE framework, known for its privacy-preserving technologies, to enhance predictive precision while ensuring data privacy and effectively managing data heterogeneity. The model's ability to generalize across diverse datasets and its adaptability in real-world clinical settings are highlighted by the use of SHAP values, which streamline the training process by identifying influential features, thus reducing computational overhead without compromising predictive precision. The study demonstrates that the proposed model achieves precision comparable to specific machine learning models when dataset sizes are identical and surpasses traditional models when larger training data volumes are employed. The model's performance is further improved when it is trained on datasets from diverse nodes, leading to superior generalization and overall performance, especially in scenarios with insufficient node features. The integration of FL with edge computing contributes significantly to the reliable prediction of COVID-19 patient outcomes with greater privacy. The research contributes to healthcare technology by providing a practical solution for early intervention and personalized treatment plans, leading to improved patient outcomes and efficient resource allocation during public health crises.
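The core idea of federated learning, where nodes share model parameters rather than patient records, can be sketched with a size-weighted parameter average in the style of FedAvg. This is a generic NumPy illustration, not the FATE framework's API and not necessarily the aggregation rule used in the paper; `fedavg` and the sample values are hypothetical.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Size-weighted average of client parameter vectors.

    client_weights: list of 1-D arrays, one per edge node, all the same length.
    client_sizes: number of local training samples held by each node.
    Only parameters leave the node; the raw records stay local.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)            # shape (n_clients, n_params)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

# Hypothetical round: two hospitals holding 1 and 3 local samples.
global_weights = fedavg(
    [np.array([1.0, 0.0]), np.array([3.0, 2.0])],
    [1, 3],
)
```

Weighting by local sample count lets nodes with more data pull the global model harder, which is one simple way to cope with unevenly sized node datasets.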

Science and Technology Networks for Disaster and Safety Management: Based on Expert Survey Data (재난안전관리 과학기술 네트워크: 전문가 수요조사를 중심으로)

  • Heo, Jungeun;Yang, Chang Hoon
    • The Journal of the Korea Contents Association / v.18 no.11 / pp.123-134 / 2018
  • Recently, due to the rising incidence of disasters in the nation, there has been growing interest in the relevance and role of science and technology in solving disaster and safety related issues. In addition, securing the human rights of all citizens in disaster risk reduction, identifying fields of technology development for effective disaster response, and improving the efficiency of R&D investment for disaster and safety are becoming more important as the different types of disasters and the stages of the disaster and safety management process are taken into account. In this study, we analyzed bipartite (two-mode) networks constructed from an expert survey dataset on technology development for disaster and safety management. The results reveal that earthquake and fire are the two disasters affecting individuals and society at large, and demonstrate that AI and big data analytics provide effective support in managing disaster and safety. We believe that such a network analytic approach can be used to explore important implications for the national science and technology effort and for successful disaster and safety management practices in Korea.
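A two-mode network of the kind described above can be represented as an incidence matrix and projected onto either node set. The sketch below assumes the two modes are disaster types and technologies; the labels, the matrix entries, and the function `project` are made up for illustration and do not come from the survey data.

```python
import numpy as np

# Hypothetical incidence matrix: rows are disaster types, columns are technologies.
# An entry of 1 means respondents linked that technology to that disaster.
disasters = ["earthquake", "fire", "flood"]
technologies = ["AI", "big data analytics", "sensors"]
B = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
])

def project(incidence):
    """One-mode projection onto the column nodes.

    Entry (i, j) counts how many row nodes are shared by columns i and j.
    """
    P = incidence.T @ incidence
    np.fill_diagonal(P, 0)   # drop self-links
    return P

tech_net = project(B)            # technologies linked through shared disasters
disaster_net = project(B.T)      # disasters linked through shared technologies
tech_degree = B.sum(axis=0)      # number of disasters each technology is tied to
```

Degree counts and projected edge weights like these are the usual starting point for ranking which disasters or technologies are most central in such a network.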

A Study on Prediction of EPB shield TBM Advance Rate using Machine Learning Technique and TBM Construction Information (머신러닝 기법과 TBM 시공정보를 활용한 토압식 쉴드TBM 굴진율 예측 연구)

  • Kang, Tae-Ho;Choi, Soon-Wook;Lee, Chulho;Chang, Soo-Ho
    • Tunnel and Underground Space / v.30 no.6 / pp.540-550 / 2020
  • Machine learning has been actively used in the field of automation due to the development and establishment of AI technology. An important point in utilizing machine learning is that the appropriate algorithm depends on the characteristics of the data, so the dataset needs to be analyzed before machine learning techniques are applied. In this study, the advance rate is predicted using geotechnical and machine data from a TBM tunnel section passing through soil ground beneath a stream. Although there were no problems in applying statistical techniques to the linear regression model, its coefficient of determination was 0.76, whereas the ensemble model and the support vector machine showed a prediction performance of 0.88 or higher. This indicates that, for the analyzed dataset, the support vector machine was the model suitable for predicting the advance rate of the EPB shield TBM. As a result, the suitability of a prediction model that uses data including mechanical data and ground information is judged to be high. In addition, further research is needed to increase the diversity of ground conditions and the amount of data.
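The coefficient of determination used above to compare the models can be computed as follows. This is a minimal sketch of the standard definition; the function name `r_squared` and the sample numbers are illustrative and are not TBM data.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    return 1.0 - ss_res / ss_tot

# Made-up advance rates and predictions.
score = r_squared([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
```

A value of 1 means the predictions match the observations exactly, while 0 means the model does no better than always predicting the mean.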

Research on Deep Learning Performance Improvement for Similar Image Classification (유사 이미지 분류를 위한 딥 러닝 성능 향상 기법 연구)

  • Lim, Dong-Jin;Kim, Taehong
    • The Journal of the Korea Contents Association / v.21 no.8 / pp.1-9 / 2021
  • Deep learning in computer vision has improved rapidly over a short period, but large-scale training data and computing power are still essential, and time-consuming trial and error is required to derive an optimal network model. In this study, we propose a method for improving similar-image classification performance based on CR (Confusion Rate), which considers only the characteristics of the data itself, regardless of network optimization or data augmentation. The proposed method improves the performance of a deep learning model by calculating the CRs for images in a dataset with similar characteristics and reflecting them in the weights of the loss function. The CR-based recognition method is also advantageous for identifying highly similar images because it takes the similarity between classes into account. When applied to the ResNet18 model, the proposed method improved performance by 0.22% on HanDB and 3.38% on Animal-10N. The proposed method is expected to serve as a basis for artificial intelligence research using the noisily labeled data that accompanies large-scale training data.
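One way a confusion rate could be folded into the loss is sketched below. The abstract does not give the CR formula or the weighting rule, so both are assumptions here: CR is taken as the per-class share of samples predicted as another class, and the class weight is set to `1 + CR`. The function names and the confusion matrix are hypothetical.

```python
import numpy as np

def confusion_rates(conf_matrix):
    """Assumed CR: for each true class, the share of samples predicted as another class."""
    cm = np.asarray(conf_matrix, dtype=float)
    return 1.0 - np.diag(cm) / cm.sum(axis=1)

def weighted_cross_entropy(probs, labels, class_weights):
    """Mean cross-entropy with a per-class weight applied to each sample's loss."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    return float(np.mean(-class_weights[labels] * np.log(picked)))

# Hypothetical confusion matrix: classes 0 and 1 are visually similar.
cm = [[90, 10, 0],
      [20, 80, 0],
      [0, 0, 100]]
cr = confusion_rates(cm)
weights = 1.0 + cr    # assumed rule: classes that are confused more weigh more

loss = weighted_cross_entropy(
    [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]],
    [0, 1],
    weights,
)
```

Under this assumed rule, errors on easily confused classes are penalized more heavily, so training effort shifts toward telling similar classes apart.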

Valid Data Conditions and Discrimination for Machine Learning: Case study on Dataset in the Public Data Portal (기계학습에 유효한 데이터 요건 및 선별: 공공데이터포털 제공 데이터 사례를 통해)

  • Oh, Hyo-Jung;Yun, Bo-Hyun
    • Journal of Internet of Things and Convergence / v.8 no.1 / pp.37-43 / 2022
  • The fundamental basis of AI technology is learnable data. Recently, the types and amounts of data collected and produced by the government and private companies have been increasing exponentially; however, this growth has not yet led to verified data that can be used for actual machine learning. This study discusses the conditions that data must meet to be usable for machine learning and identifies factors that degrade data quality through case studies. To this end, two representative cases of developing a prediction model using public big data were selected, and data for actual problem solving were collected from the public data portal. The cases show that the results differ once valid-data screening criteria and post-processing are applied. The ultimate purpose of this study is to argue for the importance of data quality management, which must fundamentally precede the development of machine learning technology, the core of artificial intelligence, and of accumulating valid data.

Control-Path Driven Process-Group Discovery Framework and its Experimental Validation for Process Mining and Reengineering (프로세스 마이닝과 리엔지니어링을 위한 제어경로 기반 프로세스 그룹 발견 프레임워크와 실험적 검증)

  • Thanh Hai Nguyen;Kwanghoon Pio Kim
    • Journal of Internet Computing and Services / v.24 no.5 / pp.51-66 / 2023
  • In this paper, we propose a new type of process discovery framework, named the control-path-driven process group discovery framework, to be used for process mining and process reengineering in support of life-cycle management of business process models. In addition, we develop a process mining system based on the proposed framework and perform experimental verification with it. The process execution event logs used for the experimental validation are specially defined as Process BIG-Logs, and we use them as the input datasets for the proposed discovery framework. As the eventual goal of this paper, we design and implement a control-path-driven process group discovery algorithm and framework improved from the ρ-algorithm, and we verify the functional correctness of the proposed algorithm and framework by running the implemented system on a BIG-Log dataset. Note that all the process mining algorithms, frameworks, and systems developed in this paper are based on the structural information control net process modeling methodology.
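As a rough illustration of grouping process instances by the path they took, the sketch below buckets cases in an event log by their ordered activity sequence. It is not the ρ-algorithm or the paper's framework: treating a control path as the plain activity sequence of a case is an assumption, and the log format, activity names, and `group_by_control_path` are hypothetical.

```python
from collections import defaultdict

def group_by_control_path(events):
    """Group case ids by the ordered sequence of activities they executed.

    events: iterable of (case_id, timestamp, activity) tuples.
    Returns a dict mapping each activity sequence (tuple) to a list of case ids.
    """
    traces = defaultdict(list)
    # Order events per case by timestamp to rebuild each execution trace.
    for case_id, _timestamp, activity in sorted(events, key=lambda e: (e[0], e[1])):
        traces[case_id].append(activity)

    groups = defaultdict(list)
    for case_id, activities in traces.items():
        groups[tuple(activities)].append(case_id)
    return dict(groups)

# Hypothetical log: cases c1 and c3 follow A-B-D, case c2 follows A-C-D.
log = [
    ("c1", 1, "A"), ("c1", 2, "B"), ("c1", 3, "D"),
    ("c2", 1, "A"), ("c2", 2, "C"), ("c2", 3, "D"),
    ("c3", 1, "A"), ("c3", 2, "B"), ("c3", 3, "D"),
]
groups = group_by_control_path(log)
```

Each group could then be mined separately, which is one plausible reading of discovering a process model per group of instances that share a control path.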