• Title/Summary/Keyword: the unstructured dataset

An Effective Control Scheme for Unstructured Dataset in the Communication Environments (통신 환경에서 비정형적 구조를 갖는 데이터세트의 효과적인 제어 방법)

  • Bae, Myung-Nam;Choi, Wan;Lee, Dong-Chun
    • The KIPS Transactions:PartC / v.9C no.1 / pp.31-38 / 2002
  • Communication systems, such as switching systems, operate under restrictive conditions in which events must finish within time constraints. Data access in these systems must therefore be not only fast but also completed within a bounded time. Many existing data systems have been developed and used in communication environments, but these systems construct a structured schema and provide users with basic data services only. Recently, as the complexity of data in the communication area has rapidly increased, a data system is required that can represent unstructured datasets and complete data access on them within the time constraints. In this paper, we propose a data model suited to the unstructured multi-dataset environment. The data model supports rapid data access for unstructured datasets and enables users to easily retrieve the data needed at execution time. In addition, we define several algorithms to clarify the structure of our model.

Proposal of Standardization Plan for Defense Unstructured Datasets based on Unstructured Dataset Standard Format (비정형 데이터셋 표준포맷 기반 국방 비정형 데이터셋 표준화 방안 제안)

  • Yun-Young Hwang;Jiseong Son
    • Journal of Internet Computing and Services / v.25 no.1 / pp.189-198 / 2024
  • AI is accepted not only in the private sector but also in the defense sector as a cutting-edge technology that must be introduced for the development of national defense. In particular, artificial intelligence has been selected as a key task in defense science and technology innovation, and the importance of data is increasing. As the national defense department shifts from a closed data policy to data sharing and activation, efforts are being made to secure the high-quality data necessary for the development of national defense. A review of the business budget system for securing data is also being promoted, so that related procedures can be improved to reflect the unique characteristics of AI and big data, and research and development can begin with sufficiently large quantities of high-quality data. Standardization and quality standards are needed for both structured and unstructured data at the national defense level, but the defense department is still proposing them for structured data, so this needs to be supplemented. In this paper, we propose an unstructured dataset standard format for defense unstructured datasets, which are most needed in defense artificial intelligence, and based on this format we propose a standardization method for defense unstructured datasets.

Implementation of YOLOv5-based Forest Fire Smoke Monitoring Model with Increased Recognition of Unstructured Objects by Increasing Self-learning data

  • Gun-wo, Do;Minyoung, Kim;Si-woong, Jang
    • International Journal of Advanced Culture Technology / v.10 no.4 / pp.536-546 / 2022
  • Society suffers heavy losses when a forest fire breaks out. If a forest fire can be detected early, damage caused by its spread can be prevented. We therefore studied how to detect forest fires using currently installed CCTV. In this paper, we present a deep learning model based on YOLOv5 for monitoring forest fire smoke, which is unstructured data, built through efficient image data construction. The study aims to accurately detect forest fire smoke, an amorphous object that takes various forms, with YOLOv5. We introduce a self-learning method in which the model produces the data it lacks on its own, to increase accuracy in unstructured object recognition. The method constructs a dataset with fixed labelling positions for images containing objects that can be extracted from the original images, using the original images and a model trained on them. By training the deep learning model on this dataset, the performance (mAP) improved and the errors caused by detecting objects other than the target object were reduced, compared with a model trained on the original images only.

Memory Efficient Parallel Ray Casting Algorithm for Unstructured Grid Volume Rendering on Multi-core CPUs (비정렬 격자 볼륨 렌더링을 위한 다중코어 CPU기반 메모리 효율적 광선 투사 병렬 알고리즘)

  • Kim, Duksu
    • Journal of KIISE / v.43 no.3 / pp.304-313 / 2016
  • We present a novel memory-efficient parallel ray casting algorithm for unstructured grid volume rendering on multi-core CPUs. Our method is based on the Bunyk ray casting algorithm. To solve the high memory overhead problem of the Bunyk algorithm, we allocate a fixed-size local buffer for each thread; the local buffers hold information about recently visited faces. The stored information is either reused by other rays or replaced by another face's information. To improve the utilization of the local buffers, we propose an image-plane-based ray grouping algorithm that gives ray groups high coherency. The ray groups are then distributed to computing threads, and each thread processes its groups independently. We also propose a novel hash function that uses face indices as keys to calculate the buffer index at which each face stores its information. To see the benefits of our method, we applied it to three unstructured grid datasets of different sizes and measured the performance. We found that our method requires just 6% of the memory space that the Bunyk algorithm uses for storing face information. It also shows performance comparable to the Bunyk algorithm even though it uses less memory. In addition, our method achieves up to 22% higher performance for a large-scale unstructured grid dataset with less memory than the Bunyk algorithm. These results show the robustness and efficiency of our method and demonstrate that it is suitable for volume rendering of large-scale unstructured grid datasets.
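
The per-thread buffer described in this abstract behaves like a small direct-mapped cache: a face index is hashed to a slot, and a colliding face simply overwrites the previous occupant. The following is a minimal sketch of that idea only, not the paper's implementation. The class name `FaceBuffer`, the `compute` callback, and the modulo hash are all assumptions made for illustration; the abstract does not state the actual hash function.

```python
from typing import Callable, List, Optional, Tuple


class FaceBuffer:
    """Hypothetical fixed-size, direct-mapped buffer keyed by face index.

    One instance would be owned by each worker thread, so no locking is needed.
    """

    def __init__(self, size: int) -> None:
        self.size = size
        # Each slot holds (face_index, face_info) or None when empty.
        self.slots: List[Optional[Tuple[int, object]]] = [None] * size
        self.hits = 0
        self.misses = 0

    def _slot(self, face_index: int) -> int:
        # Assumed hash: plain modulo. The paper's hash function is not given.
        return face_index % self.size

    def get(self, face_index: int, compute: Callable[[int], object]) -> object:
        """Return cached info for a face, computing and storing it on a miss."""
        slot = self._slot(face_index)
        entry = self.slots[slot]
        if entry is not None and entry[0] == face_index:
            self.hits += 1
            return entry[1]
        # Miss: compute the face information and overwrite whatever was there.
        self.misses += 1
        info = compute(face_index)
        self.slots[slot] = (face_index, info)
        return info


if __name__ == "__main__":
    buf = FaceBuffer(size=4)
    for face in (1, 1, 5, 1):  # faces 1 and 5 collide in a 4-slot buffer
        buf.get(face, lambda i: f"info-for-face-{i}")
    print(buf.hits, buf.misses)
```

Memory stays bounded by the buffer size regardless of how many faces the mesh has, at the cost of recomputing information for faces that were evicted. Grouping coherent rays, as the abstract proposes, is what would keep the hit rate high in such a scheme.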

Fat Client-Based Abstraction Model of Unstructured Data for Context-Aware Service in Edge Computing Environment (에지 컴퓨팅 환경에서의 상황인지 서비스를 위한 팻 클라이언트 기반 비정형 데이터 추상화 방법)

  • Kim, Do Hyung;Mun, Jong Hyeok;Park, Yoo Sang;Choi, Jong Sun;Choi, Jae Young
    • KIPS Transactions on Computer and Communication Systems / v.10 no.3 / pp.59-70 / 2021
  • With recent advancements in the Internet of Things, context-aware systems that provide customized services have become important. Existing context-aware systems analyze data generated around the user and abstract context information that expresses the state of a situation. However, these datasets are mostly unstructured and are difficult to process with simple approaches, so the datasets used to provide context-aware services should be managed in a simplified way. One example of a workload over unstructured datasets is a deep learning application. Processes in deep learning applications are strongly coupled in the way they abstract a dataset from the acquisition phase to the analysis phase, which makes them inflexible, in terms of functional scalability, when the target analysis model or application is modified. Therefore, an abstraction model that separates the phases and processes the unstructured dataset for analysis is proposed. The proposed abstraction uses a description language named Analysis Model Description Language (AMDL) to deploy the analysis phases on each fat client, a specifically designed instance for resource-oriented tasks in edge computing environments, and describes how to handle different analysis applications and their factors using AMDL and fat client profiles. The experiment shows functional scalability through examples of AMDL and fat client profiles targeting a vehicle image recognition model for a vehicle access control notification service, and conducts process-by-process monitoring of the collection, preprocessing, and analysis of unstructured data.

Segmentation-Based Depth Map Adjustment for Improved Grasping Pose Detection (물체 파지점 검출 향상을 위한 분할 기반 깊이 지도 조정)

  • Hyunsoo Shin;Muhammad Raheel Afzal;Sungon Lee
    • The Journal of Korea Robotics Society / v.19 no.1 / pp.16-22 / 2024
  • Robotic grasping in unstructured environments poses a significant challenge, demanding precise estimation of gripping positions for diverse and unknown objects. Generative Grasping Convolution Neural Network (GG-CNN) can estimate, from a three-dimensional depth map, the position and direction at which a robot gripper can grip an unknown object. Since GG-CNN uses only a depth map as input, the precision of the depth map is the most critical factor affecting the result. To address the challenge of depth map precision, we integrate the Segment Anything Model, renowned for its robust zero-shot performance across various segmentation tasks. We adjust the components corresponding to the segmented areas in the depth map aligned through external calibration. The proposed method was validated on the Cornell dataset and the SurgicalKit dataset. Quantitative analysis compared to existing methods showed a 49.8% improvement on the dataset that includes surgical instruments. The results highlight the practical importance of our approach, especially in scenarios involving thin and metallic objects.

Case Study on Managing Dataset Records in Government Information System: Focusing on Establishing Records Management Reference Table for Electronic Human Resource Management System (행정정보 데이터세트 기록관리 적용 사례 분석: 전자인사관리시스템 데이터세트 관리기준표 작성을 중심으로)

  • Shin, Jeongyeop
    • Journal of Korean Society of Archives and Records Management / v.21 no.3 / pp.227-246 / 2021
  • The study analyzes, from the records manager's perspective, the procedures and methods for preparing the records management reference table of the electronic human resource management system dataset, the roles of participating organizations, and the contents of each area of the management reference table, in order to help the person in charge of establishing such a table. Improvement plans are suggested based on the problems that appeared while preparing the reference table. As a major improvement, a separate selection policy should be designed at the level of the national archives for nationally important dataset records in government information systems, and it should be operated so that the entire dataset is preserved rather than only a part. The mapping among unit functions, data tables, and unstructured data should be set as a mandatory item, and selection and management criteria should additionally be prepared for unstructured data that significantly influence system operation. Because setting the disposition delay period increases complexity, it is deemed desirable to handle it by integrating related unit functions or by setting a longer retention period.

Grammatical Structure Oriented Automated Approach for Surface Knowledge Extraction from Open Domain Unstructured Text

  • Tissera, Muditha;Weerasinghe, Ruvan
    • Journal of information and communication convergence engineering / v.20 no.2 / pp.113-124 / 2022
  • News in the form of web data generates increasingly large amounts of information as unstructured text. The capability of understanding the meaning of news is limited to humans; thus, it causes information overload. This hinders the effective use of the knowledge embedded in such texts. Therefore, Automatic Knowledge Extraction (AKE) has now become an integral part of the Semantic Web and Natural Language Processing (NLP). Although recent literature shows that AKE has progressed, the results still fall short of expectations. This study proposes a method to auto-extract surface knowledge from English news into a machine-interpretable semantic format (triple). The proposed technique was designed using the grammatical structure of the sentence, and 11 original rules were discovered. The initial experiment extracted triples from the Sri Lankan news corpus, of which 83.5% were meaningful. The experiment was extended to the British Broadcasting Corporation (BBC) news dataset to demonstrate its generic nature, and yielded a higher meaningful triple extraction rate of 92.6%. These results were validated using the inter-rater agreement method, which confirmed high reliability.

Tobacco Retail License Recognition Based on Dual Attention Mechanism

  • Shan, Yuxiang;Ren, Qin;Wang, Cheng;Wang, Xiuhui
    • Journal of Information Processing Systems / v.18 no.4 / pp.480-488 / 2022
  • Images of tobacco retail licenses have complex unstructured characteristics, which poses an urgent technical problem for the robotic process automation of tobacco marketing. In this paper, a novel recognition approach using a dual attention mechanism is presented to realize automatic recognition and information extraction from such images. First, we utilized a DenseNet network to extract the license information from the input tobacco retail license data. Second, bi-directional long short-term memory was used for coding and decoding, with a continuous decoder integrating dual attention, to realize the recognition and information extraction of tobacco retail license images without segmentation. Finally, several performance experiments were conducted using a large-scale dataset of tobacco retail licenses. The experimental results show that the proposed approach achieves a correction accuracy of 98.36% on the ZY-LQ dataset, outperforming most existing methods.

AraProdMatch: A Machine Learning Approach for Product Matching in E-Commerce

  • Alabdullatif, Aisha;Aloud, Monira
    • International Journal of Computer Science & Network Security / v.21 no.4 / pp.214-222 / 2021
  • Recently, the growth of e-commerce in Saudi Arabia has been exponential, bringing remarkable new challenges. A naive approach for product matching and categorization is needed to help consumers choose the right store to purchase a product. This paper presents a machine learning approach for product matching that combines deep learning techniques with standard artificial neural networks (ANNs). Existing methods focused on product matching, whereas our model compares products based on unstructured descriptions. We evaluated our model on an electronics dataset from three business-to-consumer (B2C) online stores by putting the matched products together in one dataset. The performance evaluation, based on k-means classifier prediction on the three real-world online stores, demonstrates that the proposed algorithm outperforms the benchmarked approach by 80% on average F1-measure.