• Title/Summary/Keyword: Precision-recall


IoT-Based Automatic Water Quality Monitoring System with Optimized Neural Network

  • Anusha Bamini A M;Chitra R;Saurabh Agarwal;Hyunsung Kim;Punitha Stephan;Thompson Stephan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.1
    • /
    • pp.46-63
    • /
    • 2024
  • Water contamination is one of the greatest global threats, and water is a necessity for human survival. In most cities the digging of borewells is restricted, and in some it is permitted only for drinking water, so the scarcity of drinking water is a vital issue for industries and residences. Most water sources in and around cities are also polluted, which can cause significant health issues. Real-time quality observation is necessary to guarantee a secure supply of drinking water. To address this issue, we offer a low-cost IoT-based system for real-time water quality monitoring. The introduction of IoT and related sensors has expanded the potential for supporting such real-world applications. The proposed system comprises multiple sensors that capture the physical and chemical features of the water, measuring parameters such as temperature, pH, and turbidity. The measured values are processed by a core controller implemented on an Arduino, and the sensor data is forwarded over Wi-Fi to a cloud-based database, where it is stored for further processing. Because manually analyzing the water quality at every interval is impractical, an Optimized Neural Network-based automation system identifies water quality from remote locations. The performance of the feed-forward neural network classifier is further enhanced with a hybrid GA-PSO algorithm. The optimized neural network yields 91% accuracy in water quality prediction, and optimizing the network parameters increases the accuracy of the developed model by 20% compared to the traditional feed-forward neural network. Significant improvement in precision and recall is also evidenced in the proposed work.
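The abstract does not spell out the hybrid GA-PSO procedure. Below is a minimal sketch of one common hybridization, standard PSO velocity updates plus a GA-style crossover-and-mutation step each generation; the quadratic `loss` is a hypothetical stand-in for the network's validation error, and all constants are illustrative assumptions rather than values from the paper.

```python
import random

def loss(w):
    # Hypothetical objective: stands in for the network's validation error
    return sum(x * x for x in w)

def hybrid_ga_pso(dim=4, swarm=20, iters=100, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm)]
    vel = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in pos]                 # each particle's best position
    gbest = min(pbest, key=loss)                # swarm-wide best position
    for _ in range(iters):
        for i in range(swarm):
            for d in range(dim):
                # PSO update: inertia + cognitive pull + social pull
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if loss(pos[i]) < loss(pbest[i]):
                pbest[i] = pos[i][:]
        # GA step: crossover of the two best personal bests, then mutation,
        # replacing the worst personal best only if the child improves on it
        a, b = sorted(range(swarm), key=lambda i: loss(pbest[i]))[:2]
        child = [pbest[a][d] if rng.random() < 0.5 else pbest[b][d]
                 for d in range(dim)]
        child[rng.randrange(dim)] += rng.gauss(0, 0.1)
        worst = max(range(swarm), key=lambda i: loss(pbest[i]))
        if loss(child) < loss(pbest[worst]):
            pbest[worst] = child
        gbest = min(pbest, key=loss)
    return gbest
```

In the paper's setting, each particle would encode the feed-forward network's weights and `loss` would evaluate classification error on held-out water-quality samples.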

Analysis of Ammunition Inspection Record Data and Development of Ammunition Condition Code Classification Model (탄약검사기록 데이터 분석 및 탄약상태기호 분류 모델 개발)

  • Young-Jin Jung;Ji-Soo Hong;Sol-Ip Kim;Sung-Woo Kang
    • Journal of the Korea Safety Management & Science
    • /
    • v.26 no.2
    • /
    • pp.23-31
    • /
    • 2024
  • Ammunition and explosives stored and managed by the military can cause serious damage if mishandled, so securing safety through the utilization of ammunition reliability data is necessary. In this study, exploratory data analysis of ammunition inspection record data is conducted to extract reliability information on stored ammunition and to predict the ammunition condition code, which represents the lifespan information of the ammunition. The study consists of three stages: collection and preprocessing of ammunition inspection record data, exploratory data analysis, and classification of ammunition condition codes. For the classification, five models based on boosting algorithms are employed (AdaBoost, GBM, XGBoost, LightGBM, CatBoost), and the best model is selected based on performance metrics including Accuracy, Precision, Recall, and F1-score. The ammunition in this study was primarily produced from the 1980s to the 1990s, with increased inspection volume in the early stages of production and around 30 years after production. Pre-issue inspections (PII) were predominant, and the grade of the ammunition condition code tended to decrease as the storage period increased. In the classification of ammunition condition codes, the CatBoost model exhibited the best performance, with an Accuracy of 93% and an F1-score of 93%. This study emphasizes the safety and reliability of ammunition and proposes a model for classifying ammunition condition codes by analyzing ammunition inspection record data. The model can serve as a tool to assist ammunition inspectors and is expected to enhance not only the safety of ammunition but also the efficiency of ammunition storage management.
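The selection step described above, scoring each boosting model and keeping the one with the best metrics, can be sketched as follows. The labels and model names are toy placeholders, not data from the study, and the metrics are shown in their binary form (the study's multi-class condition codes would use macro- or weighted-averaged variants).

```python
def scores(y_true, y_pred):
    """Binary classification metrics used for model selection."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"accuracy": acc, "precision": prec, "recall": rec, "f1": f1}

def select_best(y_true, model_preds):
    """Return the name of the candidate model with the highest F1-score."""
    return max(model_preds,
               key=lambda name: scores(y_true, model_preds[name])["f1"])
```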

Development of AI and IoT-based smart farm pest prediction system: Research on application of YOLOv5 and Isolation Forest models (AI 및 IoT 기반 스마트팜 병충해 예측시스템 개발: YOLOv5 및 Isolation Forest 모델 적용 연구)

  • Mi-Kyoung Park;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.4
    • /
    • pp.771-780
    • /
    • 2024
  • In this study, we implemented a real-time pest detection and prediction system for a strawberry farm using a computer vision model based on the YOLOv5 architecture and an Isolation Forest Classifier. The model performance evaluation showed that the YOLOv5 model achieved a mean average precision (mAP 0.5) of 78.7%, an accuracy of 92.8%, a recall of 90.0%, and an F1-score of 76%, indicating high predictive performance. This system was designed to be applicable not only to strawberry farms but also to other crops and various environments. Based on data collected from a tomato farm, a new AI model was trained, resulting in a prediction accuracy of over 85% for major diseases such as late blight and yellow leaf curl virus. Compared to the previous model, this represented an improvement of more than 10% in prediction accuracy.
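The Isolation Forest side of such a system rests on one idea: anomalous readings isolate in fewer random splits than normal ones. Below is a minimal one-dimensional sketch with invented sensor values (a real deployment would use multi-feature vectors, e.g. via scikit-learn's `IsolationForest`); the score follows the usual path-length normalization.

```python
import math
import random

def build_tree(data, depth, limit, rng):
    """Random binary partition; anomalies end up in shallow leaves."""
    if depth >= limit or len(data) <= 1:
        return len(data)                      # leaf: remember its size
    lo, hi = min(data), max(data)
    if lo == hi:
        return len(data)
    split = rng.uniform(lo, hi)
    left = [x for x in data if x < split]
    right = [x for x in data if x >= split]
    return (split, build_tree(left, depth + 1, limit, rng),
                   build_tree(right, depth + 1, limit, rng))

def _c(n):
    """Average unsuccessful-search path length in a tree of n points."""
    return 2 * (math.log(n - 1) + 0.5772) - 2 * (n - 1) / n if n > 1 else 0.0

def path_length(tree, x, depth=0):
    if not isinstance(tree, tuple):           # leaf holding `tree` points
        return depth + _c(tree)
    split, left, right = tree
    return path_length(left if x < split else right, x, depth + 1)

def anomaly_score(data, x, trees=50, seed=0):
    """Isolation Forest score; values near 1 indicate anomalies."""
    rng = random.Random(seed)
    limit = math.ceil(math.log2(max(len(data), 2)))
    avg = sum(path_length(build_tree(data, 0, limit, rng), x)
              for _ in range(trees)) / trees
    return 2 ** (-avg / _c(len(data)))
```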

Integrated Deep Learning Models for Precise Disease Diagnosis in Pepper Crops: Performance Analysis of YOLOv8, ResNet50, and Faster R-CNN (고추 작물의 정밀 질병 진단을 위한 딥러닝 모델 통합 연구: YOLOv8, ResNet50, Faster R-CNN의 성능 분석)

  • Ji-In Seo;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.4
    • /
    • pp.791-798
    • /
    • 2024
  • The purpose of this study is to diagnose diseases in pepper crops using YOLOv8, ResNet50, and Faster R-CNN models and compare their performance. The first model utilizes YOLOv8 for disease diagnosis, the second model uses ResNet50 alone, the third model combines YOLOv8 and ResNet50, and the fourth model uses Faster R-CNN. The performance of each model was evaluated using metrics such as accuracy, precision, recall, and F1-Score. The results show that the combined YOLOv8 and ResNet50 model achieved the highest performance, while the YOLOv8 standalone model also demonstrated high performance.

Semantic Process Retrieval with Similarity Algorithms (유사도 알고리즘을 활용한 시맨틱 프로세스 검색방안)

  • Lee, Hong-Joo;Klein, Mark
    • Asia pacific journal of information systems
    • /
    • v.18 no.1
    • /
    • pp.79-96
    • /
    • 2008
  • One of the roles of Semantic Web services is to execute dynamic intra-organizational services, including the integration and interoperation of business processes. Since different organizations design their processes differently, retrieval of similar semantic business processes is necessary to support inter-organizational collaboration. Most approaches for finding services that have certain features and support certain business processes have relied on some type of logical reasoning and exact matching. This paper presents our approach of using imprecise matching to expand the results of an exact matching engine that queries the OWL (Web Ontology Language) version of the MIT Process Handbook. The MIT Process Handbook is an electronic repository of best-practice business processes, intended to help people (1) redesign organizational processes, (2) invent new processes, and (3) share ideas about organizational practices. To use the Process Handbook for process retrieval experiments, we exported it into an OWL-based format: we model the Handbook's meta-model in OWL and export the processes in the Handbook as instances of that meta-model. Next, we need a sizable number of queries and their corresponding correct answers in the Process Handbook. Many previous studies devised artificial datasets composed of randomly generated numbers without real meaning and used subjective ratings for correct answers and for similarity values between processes. To generate a semantics-preserving test data set, we create 20 variants of each target process that are syntactically different but semantically equivalent, using mutation operators. These variants represent the correct answers for the target process. We devise diverse similarity algorithms based on the values of process attributes and the structures of business processes.
We use simple text-retrieval similarity algorithms such as TF-IDF and Levenshtein edit distance in our approaches, and utilize a tree edit distance measure because semantic processes appear to have a graph structure. We also design similarity algorithms that consider the similarity of process structure, such as part processes, goals, and exceptions. Since we can identify relationships between a semantic process and its subcomponents, this information can be utilized to calculate similarities between processes. Dice's coefficient and the Jaccard similarity measure are utilized to calculate the portion of overlap between processes in diverse ways. We perform retrieval experiments to compare the performance of the devised similarity algorithms, measuring retrieval performance in terms of precision, recall, and F-measure, the harmonic mean of precision and recall. The tree edit distance shows the poorest performance on all measures. TF-IDF and the method combining TF-IDF with Levenshtein edit distance perform better than the other devised methods; these two measures focus on the similarity of process names and descriptions. In addition, we calculate a rank correlation coefficient, Kendall's tau-b, between the number of process mutations and the ranking of similarity values within each mutation set. In this experiment, similarity measures based on process structure, such as Dice's, Jaccard, and their derivatives, show greater coefficients than measures based on the values of process attributes. However, the Lev-TFIDF-JaccardAll measure, which considers process structure and attribute values together, shows reasonably better performance across both experiments. For retrieving semantic processes, it therefore appears better to consider diverse aspects of process similarity, such as process structure and the values of process attributes.
We generate semantic process data and a dataset for retrieval experiments from the MIT Process Handbook repository. We suggest imprecise query algorithms that expand the retrieval results of an exact matching engine such as SPARQL, and we compare the retrieval performance of the similarity algorithms. As limitations and future work, experiments with datasets from other domains are needed; also, since diverse measures yield many similarity values, we may find better ways to identify relevant processes by applying these values simultaneously.
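The overlap and edit-distance measures named above are standard; a compact sketch, with process attributes modeled as plain sets and process names as strings (illustrative inputs only):

```python
def jaccard(a, b):
    """|A ∩ B| / |A ∪ B| over attribute sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def dice(a, b):
    """2|A ∩ B| / (|A| + |B|) over attribute sets."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 1.0

def levenshtein(s, t):
    """Classic dynamic-programming edit distance over characters."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,        # deletion
                           cur[j - 1] + 1,     # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]
```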

Ontology-based Course Mentoring System (온톨로지 기반의 수강지도 시스템)

  • Oh, Kyeong-Jin;Yoon, Ui-Nyoung;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.149-162
    • /
    • 2014
  • Course guidance is a mentoring process performed before students register for coming classes. It plays a very important role in checking students' degree audits and in mentoring the classes to be taken in the coming semester. It is also intimately involved with graduation assessment and the completion of ABEEK certification. Currently, course guidance is performed manually by advisers at most universities in Korea because they have no electronic systems for it. Lacking such systems, advisers must analyze each student's degree audit along with the curriculum information of their own departments, a process whose complexity often causes human error. An electronic system is thus essential to avoid human error in course guidance. If a relational data model-based system were applied to the mentoring process, the problems of the manual approach could be solved. However, relational data model-based systems have some limitations. Departmental curriculums and certification systems can change depending on new university policies or surrounding circumstances, and if they change, the schema of the existing system must be changed accordingly. Such systems are also insufficient for semantic search because of the difficulty of extracting semantic relationships between subjects. In this paper, we model a course mentoring ontology based on an analysis of the computer science department's curriculum, the structure of the degree audit, and ABEEK certification. An ontology-based course guidance system is also proposed to overcome the limitations of existing methods and to make the course mentoring process effective for both advisors and students. In the proposed system, all data consist of ontology instances.
To create ontology instances, an ontology population module is developed using the JENA framework for building semantic web and linked data applications. In the ontology population module, mapping rules are designed to connect parts of the degree audit to the corresponding parts of the course mentoring ontology. All ontology instances are generated from the degree audits of students who participate in the course mentoring test. The generated instances are saved to JENA TDB as a triple repository after an inference process using the JENA inference engine. A user interface for course guidance is implemented using Java and the JENA framework. Once an advisor or a student inputs the student's information, such as name and student number, into the information request form of the user interface, the proposed system provides mentoring results based on the student's current degree audit and rules that check the scores for each part of the curriculum, such as special cultural subjects, major subjects, and MSC subjects covering math and basic science. Recall and precision are used to evaluate the performance of the proposed system: recall checks that the system retrieves all relevant subjects, and precision checks whether the retrieved subjects are relevant to the mentoring results. An officer of the computer science department attended the verification of the results derived from the proposed system. Experimental results using real data from the participating students show that the proposed course guidance system based on the course mentoring ontology provides correct course mentoring results to students at all times. Advisors can also reduce the time spent analyzing each student's degree audit and calculating the score for each part. As a result, the proposed ontology-based system resolves the difficulties of manual mentoring methods and derives mentoring results as correct as those produced by humans.

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.77-92
    • /
    • 2014
  • Recently, numerous documents, including unstructured data and text, have been created due to the rapid increase in the use of social media and the Internet. Each document is usually assigned a specific category for the convenience of users. In the past, this categorization was performed manually; however, manual categorization not only fails to guarantee accuracy but also requires a large amount of time and high costs. Many studies have been conducted on the automatic assignment of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics because they assume that each document can be assigned to only one category. To overcome this limitation, some studies have attempted to assign each document to multiple categories, but they are limited in that their learning process requires training on a multi-categorized document set; they therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets are provided. To lift the requirement of a multi-categorized training set imposed by traditional multi-categorization algorithms, we propose a new methodology that can extend the category of a single-categorized document to multiple categories by analyzing the relationships among categories, topics, and documents. First, we find the relationship between documents and topics using the result of topic analysis on single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate a matching score from each document to each of multiple categories.
A document is classified into a certain category if and only if its matching score is higher than a predefined threshold; for example, a document can be classified into the three categories whose matching scores exceed the threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles, which are clearly categorized by theme and contain less vulgar language and slang than other ordinary text documents. We collected news articles from July 2012 to June 2013. The articles exhibit large variation in the number per category, both because readers have different levels of interest in each category and because events occur with different frequencies in each category. To minimize the distortion caused by the differing numbers of articles across categories, we extracted 3,000 articles equally from each of eight categories, for a total of 24,000 articles. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." Using the collected news articles, we calculated document/category correspondence scores from topic/category and document/topic correspondence scores. The document/category correspondence score indicates the degree of correspondence of each document to a certain category. As a result, we could present two additional categories for each of 23,089 documents.
Precision, recall, and F-score were 0.605, 0.629, and 0.617, respectively, when only the top 1 predicted category was evaluated, and 0.838, 0.290, and 0.431 when the top 1-3 predicted categories were considered. Interestingly, precision, recall, and F-score varied widely across the eight categories.
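The scoring scheme the abstract describes, propagating document/topic weights through a topic/category correspondence table and thresholding the result, can be sketched as follows; the topics, categories, weights, and the 0.3 threshold are invented for illustration, not taken from the paper.

```python
def category_scores(doc_topics, topic_categories):
    """doc_topics: {topic: weight}; topic_categories: {topic: {category: weight}}.
    Each category's score accumulates weight along every topic path."""
    scores = {}
    for topic, w in doc_topics.items():
        for cat, cw in topic_categories.get(topic, {}).items():
            scores[cat] = scores.get(cat, 0.0) + w * cw
    return scores

def assign_categories(doc_topics, topic_categories, threshold=0.3):
    """Keep every category whose matching score clears the threshold,
    ranked by descending score."""
    s = category_scores(doc_topics, topic_categories)
    return sorted((c for c, v in s.items() if v >= threshold),
                  key=lambda c: -s[c])
```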

Water resources monitoring technique using multi-source satellite image data fusion (다종 위성영상 자료 융합 기반 수자원 모니터링 기술 개발)

  • Lee, Seulchan;Kim, Wanyub;Cho, Seongkeun;Jeon, Hyunho;Choi, Minhae
    • Journal of Korea Water Resources Association
    • /
    • v.56 no.8
    • /
    • pp.497-508
    • /
    • 2023
  • Agricultural reservoirs are crucial structures for water resources monitoring, especially in Korea, where the resources are seasonally unevenly distributed. Optical and Synthetic Aperture Radar (SAR) satellites, used as tools for monitoring the reservoirs, have inherent limitations: optical sensors are sensitive to weather conditions, and SAR sensors are sensitive to noise and multiple scattering over dense vegetation. In this study, we tried to improve water body detection accuracy through optical-SAR data fusion and to quantitatively analyze the complementary effects. We first detected water bodies at the Edong and Cheontae reservoirs by applying K-means clustering to the Normalized Difference Water Index (NDWI) derived from the Compact Advanced Satellite 500 (CAS500), Kompsat-3/3A, and Sentinel-2, and to the SAR backscattering coefficient from Sentinel-1. We then analyzed the improvements in accuracy obtained by applying K-means clustering to the 2-D grid space consisting of NDWI and SAR values. Kompsat-3/3A had the best accuracy (0.98 at both reservoirs), followed by Sentinel-2 (0.83 at Edong, 0.97 at Cheontae), Sentinel-1 (0.93 at both), and CAS500 (0.69 and 0.78). By applying K-means clustering to the 2-D space at the Cheontae reservoir, the accuracy of CAS500 improved by around 22% (resulting accuracy: 0.95), with an improvement in precision (85%) and a degradation in recall (14%). The precision of Kompsat-3A (Sentinel-2) improved by 3% (5%), while recall degraded by 4% (7%). More precise water resources monitoring is expected to become possible with the development of high-resolution SAR satellites, including CAS500-5, and of image fusion and water body detection techniques.
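The fusion step, clustering pixels in the 2-D (NDWI, backscatter) feature space with K-means (K=2, water vs. non-water), can be sketched as below. The feature values are invented, and a real pipeline would normalize the two axes to comparable scales before clustering, since NDWI spans roughly [-1, 1] while backscatter is in dB.

```python
def kmeans2(points, iters=20):
    """Two-cluster Lloyd's algorithm on (NDWI, SAR backscatter) pairs."""
    c0, c1 = points[0], points[-1]            # naive init: two samples
    for _ in range(iters):
        g0, g1 = [], []
        for p in points:                      # assignment step
            d0 = (p[0] - c0[0]) ** 2 + (p[1] - c0[1]) ** 2
            d1 = (p[0] - c1[0]) ** 2 + (p[1] - c1[1]) ** 2
            (g0 if d0 <= d1 else g1).append(p)
        if g0:                                # update step
            c0 = (sum(p[0] for p in g0) / len(g0),
                  sum(p[1] for p in g0) / len(g0))
        if g1:
            c1 = (sum(p[0] for p in g1) / len(g1),
                  sum(p[1] for p in g1) / len(g1))
    return c0, c1

def nearest(p, c0, c1):
    """Label a pixel by the closer of the two cluster centers."""
    d0 = (p[0] - c0[0]) ** 2 + (p[1] - c0[1]) ** 2
    d1 = (p[0] - c1[0]) ** 2 + (p[1] - c1[1]) ** 2
    return 0 if d0 <= d1 else 1
```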

A Study of the Behavioral Characteristics of the Primary and Secondary Searches on Online Databases (온라인 데이터베이스의 1차탐색과 2차탐색의 특성 연구)

  • Noh Dong-Jo
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.32 no.2
    • /
    • pp.189-209
    • /
    • 1998
  • The purpose of this study is to verify the differences in behavioral characteristics between primary and secondary searches on online databases, and the influence of primary search results on secondary searches. Data were collected by surveying professional and end-user searchers in 33 online information service centers; 262 valid responses out of 308 questionnaires were analyzed by t-test, ANOVA, and χ²-test using the SAS program. The major findings are as follows. (1) Preparatory search characteristics (level of expectation, degree of question comprehension) and search results (number of output documents, precision ratio, recall ratio, degree of satisfaction) differ significantly between primary and secondary searches on online databases, but the numbers of search terms, Boolean operators, files, systems, and relevant documents do not. (2) The results of the primary search affect the secondary searches: the number of output documents in the primary search affects the modification of search strategies and objectives in the secondary searches, the number of relevant documents affects the variation of search scope, and the precision ratio affects the change of search strategies and scope.
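The t-test used to compare primary and secondary searches reduces, in its Welch form, to a simple statistic; a sketch with invented satisfaction ratings (the study's actual data are not reproduced here):

```python
import math

def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances allowed)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)  # sample variance
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))
```

Comparing |t| against the critical value for the chosen significance level (roughly 2 at the 5% level for moderate samples) yields significance decisions of the kind reported in the abstract.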


Query Expansion and Term Weighting Method for Document Filtering (문서필터링을 위한 질의어 확장과 가중치 부여 기법)

  • Shin, Seung-Eun;Kang, Yu-Hwan;Oh, Hyo-Jung;Jang, Myung-Gil;Park, Sang-Kyu;Lee, Jae-Sung;Seo, Young-Hoon
    • The KIPS Transactions:PartB
    • /
    • v.10B no.7
    • /
    • pp.743-750
    • /
    • 2003
  • In this paper, we propose a query expansion and term weighting method for document filtering that increases the precision of Web search engine results. Query expansion for document filtering uses ConceptNet, an encyclopedia, and the documents in the top 10% by similarity; the term weighting method is used to calculate query-document similarity. In the first step, we expand an initial query into the first expanded query using ConceptNet and the encyclopedia, weight the first expanded query, and calculate the similarity between the first expanded query and the documents. Next, we create the second expanded query using the top 10% most similar documents and calculate the similarity between the second expanded query and the documents. We then combine the two similarities from the first and second steps, re-rank the documents according to the combined similarities, and filter out non-relevant documents whose similarity is lower than the threshold. Our experiments showed that our document filtering method yields a notable improvement in retrieval effectiveness when measured using both precision-recall and F-measure.
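The final combination-and-filtering step can be sketched as a weighted sum of the two stage similarities followed by re-ranking and thresholding. The `alpha` weight and the threshold are illustrative assumptions, since the abstract does not state the exact combination scheme.

```python
def combine_and_filter(sim1, sim2, alpha=0.5, threshold=0.4):
    """sim1/sim2: {doc_id: similarity} from the first and second expansion
    stages. Combine, re-rank descending, and drop below-threshold docs."""
    docs = set(sim1) | set(sim2)
    combined = {d: alpha * sim1.get(d, 0.0) + (1 - alpha) * sim2.get(d, 0.0)
                for d in docs}
    ranked = sorted(combined, key=lambda d: -combined[d])
    return [d for d in ranked if combined[d] >= threshold]
```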