• Title/Summary/Keyword: Unstructured data

Implementation of CNN-based Classification Training Model for Unstructured Fashion Image Retrieval using Preprocessing with MASK R-CNN (비정형 패션 이미지 검색을 위한 MASK R-CNN 선형처리 기반 CNN 분류 학습모델 구현)

  • Seunga, Cho;Hayoung, Lee;Hyelim, Jang;Kyuri, Kim;Hyeon-Ji, Lee;Bong-Ki, Son;Jaeho, Lee
    • Journal of Korea Society of Industrial Information Systems / v.27 no.6 / pp.13-23 / 2022
  • In this paper, we propose an algorithm that classifies the detailed components of each fashion item, for unstructured data retrieval in the fashion field. AI-based online shopping malls have been increasing recently because of the COVID-19 environment. However, existing keyword search and personalized style recommendations based on user browsing behavior are limited in how accurately they can search unstructured data. In this study, images crawled from online shopping sites were preprocessed using Mask R-CNN, and the components of each fashion item were then classified with a CNN. We obtained accuracies of 93.28% for the shirt collar, 98.10% for the shirt pattern, 91.73% for the 3-class jeans fit, 81.59% for the 4-class jeans fit, and 93.91% for the jeans color. For the decorative attributes, we obtained accuracies of 91.20% for jeans washing and 92.96% for jeans damage.

Property-based Hierarchical Clustering of Peers using Mobile Agent for Unstructured P2P Systems (비구조화 P2P 시스템에서 이동에이전트를 이용한 Peer의 속성기반 계층적 클러스터링)

  • Salvo, Michael Angel G.;Mateo, Romeo Mark A.;Lee, Jae-Wan
    • Journal of Internet Computing and Services / v.10 no.4 / pp.189-198 / 2009
  • Unstructured peer-to-peer (P2P) systems are the most commonly used on today's Internet. However, file placement in these systems is random, and no correlation exists between peers and their contents, so there is no guarantee that flooding queries will find the desired data. In this paper, we propose clustering the nodes of unstructured P2P systems with the agglomerative hierarchical clustering algorithm to improve search. We compared the delay time of clustering the nodes between our proposed algorithm and the k-means clustering algorithm. We also simulated the delay time of locating data in a network topology and recorded the system overhead with our proposed algorithm, with k-means clustering, and without clustering. Simulation results show that the delay time of our proposed algorithm is shorter than that of the other methods and that resource overhead is also reduced.
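
The agglomerative hierarchical clustering named in this abstract can be illustrated with a minimal sketch. It assumes each peer is described by a numeric property vector and uses single-linkage Euclidean distance. The property values, the linkage choice, and the function names are illustrative assumptions, not details taken from the paper.

```python
from itertools import combinations


def euclidean(a, b):
    """Distance between two peer property vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerative(points, num_clusters):
    """Repeatedly merge the two closest clusters (single linkage)
    until only num_clusters clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > num_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(euclidean(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]  # b > a, so index a stays valid
    return [sorted(c) for c in clusters]


# Hypothetical peer properties (assumed): (bandwidth, shared-file count), scaled.
peers = [(1.0, 1.0), (1.2, 0.9), (8.0, 8.5), (8.3, 8.1), (4.0, 15.0)]
result = agglomerative(peers, 3)
print(result)
```

Unlike k-means, this bottom-up procedure needs no initial centroids; the number of clusters only determines when merging stops.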

An Automatically Extracting Formal Information from Unstructured Security Intelligence Report (비정형 Security Intelligence Report의 정형 정보 자동 추출)

  • Hur, Yuna;Lee, Chanhee;Kim, Gyeongmin;Jo, Jaechoon;Lim, Heuiseok
    • Journal of Digital Convergence / v.17 no.11 / pp.233-240 / 2019
  • To predict and respond to cyber attacks, a number of security companies quickly identify the methods, types, and characteristics of attack techniques and publish Security Intelligence Reports (SIRs) on them. However, the SIRs distributed by each company are large and unstructured. In this paper, we propose a framework that applies five analysis techniques to structure a report and extract its key information, reducing the time required to extract information from large unstructured SIRs. Because SIR data have no ground-truth labels, four of the techniques (Keyword Extraction, Topic Modeling, Summarization, and Document Similarity) rely on unsupervised learning. Finally, we built a dataset for extracting threat information from SIRs and applied Named Entity Recognition (NER) to recognize words belonging to the IP, Domain/URL, Hash, and Malware types and to determine which type each word belongs to. Together these make up the five analysis techniques of the proposed framework.
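
As a rough illustration of typing tokens into the IP, Domain/URL, and Hash categories mentioned above, the sketch below uses plain regular expressions. This is a deliberately simpler stand-in, not the trained NER model the abstract describes. The patterns and the name `classify_token` are assumptions made for illustration. The Malware category is omitted because malware names have no fixed surface form that a pattern could capture.

```python
import re

# Assumed patterns, checked in order; the first match wins.
PATTERNS = [
    # MD5 (32), SHA-1 (40), or SHA-256 (64) hexadecimal digests.
    ("Hash", re.compile(r"^(?:[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$")),
    # Dotted-quad IPv4 shape; octet ranges (0-255) are not validated here.
    ("IP", re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$")),
    # Optional scheme, dotted host name, optional path.
    ("Domain/URL", re.compile(
        r"^(?:https?://)?(?:[a-z0-9-]+\.)+[a-z]{2,}(?:/\S*)?$", re.IGNORECASE)),
]


def classify_token(token):
    """Return the indicator type of a token, or None if no pattern matches."""
    for label, pattern in PATTERNS:
        if pattern.match(token):
            return label
    return None


for token in ["192.168.0.1", "d41d8cd98f00b204e9800998ecf8427e",
              "http://example.com/payload", "report"]:
    print(token, "->", classify_token(token))
```

A learned NER model can use the surrounding context to resolve ambiguous tokens, which fixed patterns like these cannot.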

A Study on the Procedure of Using Big Data to Solve Smart City Problems Based on Citizens' Needs and Participation (시민 니즈와 참여 기반의 스마트시티 문제해결을 위한 빅 데이터 활용 절차에 관한 연구)

  • Chang, Hye-Jung
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.2 / pp.102-112 / 2020
  • The goal of a smart city is to solve urban problems through smart city component technologies, thereby developing an eco-friendly, sustainable economy and improving citizens' quality of life. Until now, smart cities have evolved around component technologies, but it is time to focus on the needs and participation of citizens. In this paper, we present a big data procedure for solving smart city problems based on citizens' needs and participation. To this end, we examine the smart city project market by region and major industry, as well as the development stages of the smart city market by sector. We also review the definition and necessity of citizen participation in each sector, and propose a method for solving problems with big data through a seven-step big data problem-solving process. This seven-step process derives tasks by analyzing structured and unstructured data in each sector of the smart city and then derives policy programs accordingly. To attract citizen participation in these procedures, the empathy stage of the design thinking methodology is used in the unstructured data collection process. In addition, to identify citizens' needs in solving urban problems, the problem definition stage of the design thinking methodology was incorporated into the unstructured data analysis process.

Strategies for the Development of Watermelon Industry Using Unstructured Big Data Analysis

  • LEE, Seung-In;SON, Chansoo;SHIM, Joonyong;LEE, Hyerim;LEE, Hye-Jin;CHO, Yongbeen
    • The Journal of Industrial Distribution & Business / v.12 no.1 / pp.47-62 / 2021
  • Purpose: This study examines strategies for developing the watermelon industry using unstructured big data analysis. Specifically, it traces changes in issues and consumer perceptions of watermelon using big data and social network analysis, and on that basis investigates ways to strengthen the competitiveness of the watermelon industry. Methodology: Data were collected from Naver (blog, news) and Daum (blog, news) with TEXTOM 4.5, and the analysis period was divided into 2015-2016, 2017-2018, and 2019-2020 in order to understand changes in issues and consumer perceptions of watermelon and the watermelon industry. TEXTOM 4.5 was used for keyword frequency analysis, word cloud analysis, and extraction of metrics data. UCINET 6.0 and its NetDraw function were used to find the connection structure of words, visualize the network relations, and cluster the words. Results: The extracted keywords related to watermelon included 'the stalk end of a watermelon', 'E-mart', 'Haman', 'Gochang', and 'Lotte Mart' (news: 2015-2016); 'apple watermelon', 'Haman', 'E-mart', 'Gochang', and 'Mudeungsan watermelon' (news: 2017-2018); 'E-mart', 'apple watermelon', 'household', 'chobok', and 'donation' (news: 2019-2020); 'watermelon salad', 'taste', 'the heat', 'baby', and 'effect' (blog: 2015-2016); 'taste', 'watermelon juice', 'method', 'watermelon salad', and 'baby' (blog: 2017-2018); and 'taste', 'effect', 'watermelon juice', 'method', and 'apple watermelon' (blog: 2019-2020). The results of the frequency and TF-IDF analyses are also presented. In the CONCOR analysis, the keywords were grouped into four types in each case. Conclusions: Based on the results, the authors discuss strategies and policies for boosting the watermelon industry, the limitations of this study, and future research directions. The results will help prioritize strategies and policies for boosting watermelon consumption and contribute to improving the competitiveness of the watermelon industry in Korea. The study is also expected to serve as an important basis for future agricultural big data studies and to offer watermelon producers and policy-makers practical points for crafting tailor-made marketing strategies.
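
The TF-IDF weighting mentioned in this abstract can be sketched as follows. The sketch assumes the plain variant, term frequency multiplied by log(N/df); the exact weighting TEXTOM applies is not stated in the abstract, and the sample documents are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document,
    using tf = count / document length and idf = log(N / df)."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical tokenized posts (assumed), not data from the study.
docs = [
    ["watermelon", "juice", "taste"],
    ["watermelon", "salad", "taste"],
    ["watermelon", "donation"],
]
scores = tf_idf(docs)
for doc_scores in scores:
    print(doc_scores)
```

A term that occurs in every document, such as "watermelon" here, scores zero, which is why TF-IDF surfaces different keywords than raw frequency counts do.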

Qualitative Data Analysis using Computers (컴퓨터를 이용한 질적 자료 분석)

  • Yi Myung-Sun
    • Journal of Korean Academy of Fundamentals of Nursing / v.6 no.3 / pp.570-582 / 1999
  • Although computers cannot analyze textual data in the same way as they analyze numerical data, they can nevertheless be of great assistance to qualitative researchers. Thus, the use of computers in analyzing qualitative data has increased since the 1980s. The purpose of this article was to explore the advantages and disadvantages of using computers to analyze textual data and to suggest strategies for preventing the problems of using them. In addition, it illustrated the characteristics and functions of software designed to analyze qualitative data, to help researchers choose a program wisely. It also demonstrated the specific functions and procedures of the NUDIST program, which was designed to develop a conceptual framework or grounded theory from unstructured data. The major advantage of using computers in qualitative research is the management of huge amounts of unstructured data. By managing this overload of data, the researcher can keep track of emerging ideas, arguments, and theoretical concepts and can organize these tasks more efficiently than with the traditional 'cut-and-paste' technique. Additional advantages are the ability to increase the trustworthiness of the research, the transparency of the research process, and the intuitional creativity of the researcher, and to facilitate team and secondary research. On the other hand, the disadvantages of using computers were identified as worries that the machine could conquer human understanding and as the probability of orthodoxy in the analytical process. To overcome these problems, the article suggested strategies such as 1) deep understanding of the philosophical and theoretical background of the qualitative research method, 2) deep understanding of the data as a whole before using software, 3) use of software only after becoming familiar with it, 4) continuous evaluation of the software and feedback from it, and 5) continuous awareness of the limitations of the machine, that is, the computer, in interpretive analysis.

Big Data using Artificial Intelligence CNN on Unstructured Financial Data (비정형 금융 데이터에 관한 인공지능 CNN 활용 빅데이터 연구)

  • Ko, Young-Bong;Park, Dea-Woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.232-234 / 2022
  • Big data is widely used in customer relationship management, relationship marketing, financial business improvement, credit information, and risk management. Moreover, as non-face-to-face financial transactions have become more active recently because of COVID-19, financial big data is in greater demand for managing relationships with customers. In customer relationships, financial big data has reached a point that calls for an emotional rather than a technical approach, and relationship marketing likewise needs to emphasize the emotional aspect rather than the cognitive and rational aspects. Traditional financial data was collected and used in the form of text-type customer transaction data, corporate financial information, and questionnaires. In this study, customers' emotional image data, that is, unstructured data based on their cultural and leisure activities, is acquired through SNS, and the customers' activity images are analyzed with a CNN algorithm. The activity analysis is then applied to the annotated AI, and an AI big data model is designed to analyze the behavior model shown in the annotations.

Expression and Purification of Unstructured Protein, IMUP-1, using Chaperone Co-expression System for NMR Study

  • Yi, Jong-Jae;Yoo, Jung Ki;Kim, Jin Kyeoung;Son, Woo Sung
    • Journal of the Korean Magnetic Resonance Society / v.17 no.1 / pp.30-39 / 2013
  • Immortalization-upregulated protein-1 (IMUP-1) genes have been cloned and are known to be involved in SV40-mediated immortalization. The IMUP-1 gene is highly expressed in various cancer cell lines and tumors, suggesting that it might be involved in tumorigenicity. Previously, overexpression of IMUP-1 in bacterial expression systems suffered from several problems, including low solubility and aggregation caused by its unstructured nature. Investigating its structural properties requires a large amount of pure, soluble protein. Accordingly, a co-expression system with the bacterial chaperone proteins GroEL-GroES was used to increase the solubility of IMUP-1. Analysis of the NMR and CD data suggests that the protein adopts typical random-coil properties in solution.

NUMERICAL SIMULATION OF THE INTERFERENCE EFFECT OF EXTERNAL STORES AND TAIL WING SURFACES OF A GENERIC FIGHTER AIRCRAFT (전투기 형상의 외부 장착물과 꼬리 날개 공력 간섭에 대한 수치적 연구)

  • Kim, M.J.;Kwon, O.J.;Kim, J.H.
    • Proceedings of the Korean Society for Computational Fluids Engineering Conference / 2007.10a / pp.149-156 / 2007
  • A three-dimensional inviscid flow solver has been developed based on unstructured meshes for the simulation of steady and unsteady flowfields around a generic fighter aircraft and for the investigation of the aerodynamic interference between the external stores and the tail surfaces. The flow solver is based on a vertex-centered finite-volume method and an implicit point Gauss-Seidel relaxation scheme. To validate the flow solver, calculations were made for a steady flow and the computed results were compared with experimental data. An unsteady time-accurate computation of the generic fighter aircraft with external stores at transonic flight conditions showed that the external stores cause undesirable vibration on the horizontal tail surface due to the mutual interference between their wake and the horizontal tail surface. It was shown that downward deflection of the trailing edge flap significantly reduces the undesirable interference effect.
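
The point Gauss-Seidel relaxation named in this abstract can be illustrated on a small linear system. This is only a sketch of the relaxation idea under simplifying assumptions: the matrix, right-hand side, and tolerance are made up, and the paper's solver applies the scheme to the implicit flow equations on an unstructured mesh rather than to a dense 3x3 system.

```python
def gauss_seidel(a, b, sweeps=50, tol=1e-10):
    """Solve a x = b by point Gauss-Seidel relaxation.
    Convergence is guaranteed when a is strictly diagonally dominant."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        max_change = 0.0
        for i in range(n):
            # Use the newest available values of the other unknowns.
            off_diag = sum(a[i][j] * x[j] for j in range(n) if j != i)
            new_value = (b[i] - off_diag) / a[i][i]
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            break
    return x


# Assumed diagonally dominant test system with exact solution (1, 2, 3).
a = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = gauss_seidel(a, b)
print(x)
```

Each unknown is updated in place, so later updates within the same sweep already see the new values; that is what distinguishes Gauss-Seidel from Jacobi iteration.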

An Implicit Unstructured Finite Element Method for Diffraction of Water Waves by Two-Dimensional Floating Breakwaters (부유체 주위의 2차원 회절 문제를 위한 내율적 비정렬 격자 유한요소해법)

  • 정구창
    • Journal of Ocean Engineering and Technology / v.11 no.4 / pp.90-101 / 1997
  • A hybrid element method is presented for the two-dimensional diffraction problem of water waves. In this method, only a limited fluid domain close to the irregular bodies is discretized into conventional finite elements, while the remaining infinite domain is treated as one element with high-accuracy analytical representations. A finite element grid is generated automatically by Delaunay triangulation based on Bowyer's algorithm, and the linear system of equations is solved approximately with the ILU-CGS algorithm. To validate the present scheme, computational results are compared with existing experimental data and other numerical solutions.
