• Title/Summary/Keyword: Text-based classification

Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.1-19 / 2018
  • Large amounts of data are now available to the research and business sectors for knowledge extraction. Such data can be unstructured, for example audio, text, and images, and can be analyzed with deep learning. Deep learning is now widely used for estimation, classification, and prediction problems. The fashion business in particular has adopted deep learning for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model behind these applications is image classification using Convolutional Neural Networks (CNN). A CNN is made up of neurons that learn parameters such as weights as inputs pass through the network to the outputs. Its layer structure is well suited to image classification: convolutional layers generate feature maps, pooling layers reduce the dimensionality of the feature maps, and fully-connected layers classify the extracted features. However, most classification models have been trained on online product images taken under controlled conditions, showing either the apparel itself or a professional model wearing it. Such images may not train a classifier effectively for street fashion or walking images, which are taken under uncontrolled conditions and involve movement and unexpected poses. We therefore propose training the model on a runway apparel image dataset, which captures mobility. This exposes the classification model to far more variable data and improves its adaptation to diverse query images. To achieve both convergence and generalization, we apply Transfer Learning to our training network. Because Transfer Learning in a CNN consists of pre-training and fine-tuning stages, we divide training into two steps. 
First, we pre-train our architecture on a large-scale dataset, ImageNet, which consists of 1.2 million images in 1000 categories including animals, plants, activities, materials, instrumentations, scenes, and foods. We use GoogLeNet as our main architecture because it achieved high accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network on our own runway image dataset. Because we could not find any existing public runway image dataset, we collected one from Google Image Search, obtaining 2426 images of 32 major fashion brands: Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of training data, and our proposed model achieves an accuracy of 67.2% on the final test. Our research offers several advantages over previous related studies: to the best of our knowledge, no previous study has trained a network for apparel image classification on a runway image dataset. We propose training the model on images that capture all possible postures, which we denote as mobility, using our own runway apparel image dataset. Moreover, by applying Transfer Learning and using the checkpoint and parameters provided by TensorFlow Slim, we reduce the time spent training the classifier to 6 minutes per experiment. This model can be used in many business applications where the query image may be a runway image, a product image, or a street fashion image. 
Specifically, runway query images can support a mobile application service during fashion week to facilitate brand search; street style query images can be classified during fashion editorial work to label the brand or style; and website query images can be processed by an e-commerce multi-complex service that provides item information or recommends similar items.
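
The convolution, pooling, and fully-connected stages named in the abstract above can be illustrated with a toy forward pass. This is only a minimal sketch in plain Python, not the authors' GoogLeNet or TensorFlow Slim setup: the 5x5 "image", the 2x2 kernel, the two-class weights, and all function names are assumptions chosen for illustration.

```python
# Toy forward pass: convolution -> ReLU -> max pooling -> fully-connected layer.
# Every shape, weight, and name here is an illustrative assumption.

def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation producing one feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(fmap):
    """Element-wise rectifier: negative responses become zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; shrinks each spatial dimension by `size`."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]

def fully_connected(features, weights, biases):
    """One score per class: dot(weights[c], features) + biases[c]."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]

def forward(image, kernel, weights, biases):
    pooled = max_pool(relu(conv2d(image, kernel)))
    flat = [v for row in pooled for v in row]  # flatten the pooled feature map
    return fully_connected(flat, weights, biases)

# A 5x5 image containing a vertical edge, and a 2x2 vertical-edge kernel.
image = [[0, 0, 1, 1, 1]] * 5
kernel = [[-1, 1], [-1, 1]]
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # 2 classes x 4 features
biases = [0.0, 0.0]
scores = forward(image, kernel, weights, biases)
print(scores)
```

In a transfer-learning setting such as the one the abstract describes, the convolutional weights would come from pre-training and mainly the final classifier would be re-fitted on the new dataset; the sketch above fixes all weights by hand and performs no training.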

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.1-23 / 2018
  • Since the start of the 21st century, various high-quality services have emerged with the growth of the internet and information and communication technologies. The E-commerce industry in particular, led by Amazon and eBay, is growing explosively. As E-commerce grows, customers can easily find what they want to buy and compare products, because more products are registered at online shopping malls. This growth, however, has created a problem: with so many products registered, customers find it difficult to locate what they really need. A search with a general keyword returns too many products, whereas a search with specific product details returns few, because concrete product attributes are rarely registered. Automatically recognizing text in images can be a solution. Because most product details appear in catalogs in image format, most product information cannot be found by the current text-based search system. If the information in images can be converted to text, customers can search by product details, which makes shopping more convenient. Various existing OCR (Optical Character Recognition) programs can recognize text in images, but they are hard to apply to catalogs because they struggle in certain circumstances, such as when the text is small or the fonts are inconsistent. Therefore, this research proposes a way to recognize keywords in catalogs with deep learning, which has been the state of the art in image recognition since the 2010s. 
Single Shot Multibox Detector (SSD), a well-regarded object detection model, can be used with a structure redesigned to account for the differences between text and objects. However, because the SSD model, like other deep learning models of this kind, is trained by supervised learning, it needs a large amount of labeled training data. To collect data, the location and class of the text in catalogs could be labeled manually, but manual collection raises several problems. Some keywords would be missed because humans make mistakes while labeling. Collection becomes too time-consuming given the scale of data needed, or too costly if many workers are hired to shorten the time. Furthermore, if specific keywords need to be trained, finding images that contain them is also difficult. To solve the data issue, this research developed a program that creates training data automatically. The program generates catalog-like images containing various keywords and pictures, and saves the location information of the keywords at the same time. With this program, data can be collected efficiently and the performance of the SSD model also improves. The SSD model recorded a recognition rate of 81.99% with 20,000 samples created by the program. Moreover, this research tested the SSD model on different data configurations to analyze which features of the data influence text recognition performance. The results show that the number of labeled keywords, the addition of overlapped keyword labels, the presence of unlabeled keywords, the spacing between keywords, and the differences in background images are related to the performance of the SSD model. These findings can guide performance improvements of the SSD model or other deep learning based text recognizers through high-quality data. 
The SSD model redesigned to recognize text in images and the program developed to create training data are expected to contribute to improving search systems in E-commerce. Suppliers can spend less time registering keywords for products, and customers can search for products using the product details written in the catalog.
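
The idea of generating training samples together with their keyword locations can be sketched as follows. This is a hedged illustration, not the authors' program: it only produces the label records (keyword plus bounding box) and leaves out the actual drawing of text onto a background image. The fixed glyph size, the canvas size, the example keywords, and the names `make_sample` and `overlaps` are all assumptions.

```python
import random

# Assumed fixed glyph size in pixels; a real renderer would measure the font.
CHAR_W, CHAR_H = 12, 20

def overlaps(a, b):
    """True if two boxes given as (x, y, w, h) intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def make_sample(keywords, width=640, height=480, max_tries=100, seed=None):
    """Choose a random, non-overlapping position for each keyword and
    return the label records a detector could be trained on."""
    rng = random.Random(seed)
    labels = []
    for word in keywords:
        w, h = CHAR_W * len(word), CHAR_H
        for _ in range(max_tries):
            box = (rng.randint(0, width - w), rng.randint(0, height - h), w, h)
            if not any(overlaps(box, lab["box"]) for lab in labels):
                labels.append({"keyword": word, "box": box})
                break  # placed; a keyword is skipped if no free spot is found
    return labels

labels = make_sample(["cotton", "waterproof", "slim fit"], seed=0)
for lab in labels:
    print(lab["keyword"], lab["box"])
```

Because the generator knows where it placed each keyword, every label is exact by construction, which is the property that manual annotation cannot guarantee. Parameters such as keyword spacing or label overlap, which the abstract reports as affecting performance, would become explicit knobs in a generator of this kind.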

Quantification of Schedule Delay Risk of Rain via Text Mining of a Construction Log (공사일지의 텍스트 마이닝을 통한 우천 공기지연 리스크 정량화)

  • Park, Jongho;Cho, Mingeon;Eom, Sae Ho;Park, Sun-Kyu
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.1 / pp.109-117 / 2023
  • Schedule delays are a major risk factor because they can adversely affect construction projects through increased construction costs, claims from the client, and/or reduced construction quality when stages are trimmed to make up lost time. Risk management has been conducted according to the importance and priority of schedule delay risk, but quantification of the depth of schedule delay tends to be inadequate due to limitations in data collection. Therefore, this research used the BERT (Bidirectional Encoder Representations from Transformers) language model to convert the contents of a construction log, which comprised unstructured data, into WBS (Work Breakdown Structure)-based structured data, and to build a model for classifying and quantifying risk. The process was applied to eight highway construction sites, and 75 cases of rain schedule delay risk were obtained from 8 of 39 detailed work types. Through a K-S test, a significant probability distribution was derived for four work types, and the risk impact was compared. The process presented in this study can be used to derive various schedule delay risks in construction projects and to quantify their depth.
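
The K-S (Kolmogorov-Smirnov) test mentioned above compares the empirical distribution of a sample with a candidate distribution. The following is a minimal sketch of the one-sample K-S statistic only; the delay values are hypothetical, and the exponential candidate is an assumption, since the abstract does not state which distributions were fitted.

```python
import math

def ks_statistic(sample, cdf):
    """One-sample K-S statistic D = sup_x |F_n(x) - F(x)|, where F_n is the
    empirical CDF of `sample` and F is the candidate CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def exponential_cdf(rate):
    """CDF of an exponential distribution with the given rate."""
    return lambda x: 1.0 - math.exp(-rate * x) if x > 0 else 0.0

# Hypothetical rain-delay durations in days for one work type (not from the paper).
delays = [0.5, 1.0, 1.5, 2.0, 3.0, 4.5]
rate = len(delays) / sum(delays)  # maximum-likelihood rate estimate
d = ks_statistic(delays, exponential_cdf(rate))
print(round(d, 4))
```

A smaller D means the candidate distribution tracks the data more closely. Turning D into an accept or reject decision requires a critical value or p-value, which this sketch deliberately leaves out.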

Feasibility Study of Product Information Design at Internet shopping sites (인터넷 쇼핑 사이트에서 제품 정보 설계의 타당성 검토)

  • Lee, Joo-Hee
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.15 no.1 / pp.283-289 / 2015
  • This study examines which information on the product detail pages of Internet shopping malls affects purchase decisions. To this end, first, classifications of Internet shopping malls, product information, and purchasing factors were determined from previous studies; second, a questionnaire was constructed on this basis and the validity of each factor was verified; and finally, the information with the greatest influence was examined. Knowing what information consumers really want and what information leads to purchases will assist in the design of Internet shopping sites. The results identified user reviews, site reliability, information architecture, reserve, 3D images, and the availability of product images as factors affecting the use of Internet shopping sites, and, among them, user reviews and the availability of product images were revealed to be influential factors. In site design, layout, color systems, text, and many other design factors are important, but sites must also be designed to lead to purchases by providing sufficient product information.

A Comparative Study on Deep Learning Topology for Event Extraction from Biomedical Literature (생의학 분야 학술 문헌에서의 이벤트 추출을 위한 심층 학습 모델 구조 비교 분석 연구)

  • Kim, Seon-Wu;Yu, Seok Jong;Lee, Min-Ho;Choi, Sung-Pil
    • Journal of the Korean Society for Library and Information Science / v.51 no.4 / pp.77-97 / 2017
  • The recent sharp increase in biomedical literature makes it hard for researchers to grasp current research trends and conduct creative studies based on previous results. To ease the difficulty of keeping up with the latest scholarly trends, numerous attempts have been made to develop specialized analytic services that provide direct, intuitive, and formalized scholarly information using text mining technologies such as information extraction and event detection. This paper introduces and evaluates a total of 8 Convolutional Neural Network (CNN) models for extracting biomedical events from academic abstracts by applying various feature utilization approaches, and compares the performance of the proposed models. The comparison confirmed that the Entity-Type-Fully-Connected model, one of the models introduced in the paper, showed the most promising performance (72.09% in F-score) in the event classification task, while it achieved a relatively low but comparable result (21.81%) in the entire event extraction process due to the imbalance of the training collections and the low performance of the event identification model.
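
For readers unfamiliar with the F-score reported above, it is the harmonic mean of precision and recall. The sketch below computes it from true-positive, false-positive, and false-negative counts; the counts are hypothetical and are not taken from the paper, and the function name `f_score` is an assumption.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives, and false negatives.
    With beta=1 this is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical classification-only result: most candidate events are labeled correctly.
print(f_score(tp=80, fp=20, fn=20))

# Hypothetical end-to-end result: events missed by an upstream identification
# stage count as false negatives, so recall and the F-score drop sharply.
print(f_score(tp=20, fp=30, fn=80))
```

This is one way to see why a pipeline can score well on classification alone yet much lower end to end: errors from the identification stage are added on top of classification errors.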

A study on the indications of Five Viscera Source Point Acupuncture extended from Taegeuk Acupuncture : Focused on Yeoungchu(靈樞) (태극침법(太極鍼法)의 확장형인 오장원혈침법(五臟原穴鍼法)의 적응증 연구 - "황제내경(黃帝內經).영추(靈樞)"를 중심으로 -)

  • Moh, Han Young;Lim, Gyo-Min;Baek, Jin-Ung
    • Journal of Korean Medical classics / v.25 no.4 / pp.123-147 / 2012
  • Objective : With Five Viscera Source Point Acupuncture established as the target acupuncture treatment for standardization, this study, as a first step, sorted the indications of each acupuncture remedy, one of the most important factors in acupuncture treatment, based on Yeoungchu. Method : This study selected only the contents related to indications of the five viscera by extracting the relevant sentences from Yeoungchu using the search words Liver (Liver Meridian, First Yin), Heart (Pericardium, Heart Meridian, Second Yin), Spleen (Spleen Meridian, Third Yin), Lung (Lung Meridian, Third Yin), and Kidney (Kidney Meridian, Second Yin). Result & Conclusion : 1. We selected and extracted text related to liver disease from Chapter 16, heart (pericardium) disease from Chapter 16, spleen disease from Chapter 19, lung disease from Chapter 17, and kidney disease from Chapter 17 of Yeoungchu. 2. The basic theory of applying Five Viscera Source Point Acupuncture to diseases of the five viscera is to first sort the diseases according to their state (i.e. deficiency or excess), then drain the source point of the appropriate viscus in case of excess, or supplement it in case of deficiency. 3. For the correct application of Five Viscera Source Point Acupuncture, the classification of the disease, and not only the judgement of its state, must be presented systematically and synthetically in combination with the Four Examinations. Follow-up studies therefore need to be conducted.

The Trend in Clinical Study on Atopic Dermatitis Over the Last 3 Years (아토피 피부염 임상 연구의 최근 3년간 동향)

  • Choi, In-Hwa
    • The Journal of Korean Medicine Ophthalmology and Otolaryngology and Dermatology / v.20 no.3 / pp.138-146 / 2007
  • Objective : To observe the trend in clinical studies on atopic dermatitis (AD) over the last 3 years in order to develop a study methodology for AD in Oriental Medicine. Methods : I searched the PubMed online site with the search term atopic dermatitis in the title/abstract field, limited to items published in the last 3 years, only items with links to full text, Humans, Clinical Trial, and English. I reviewed the study contents of all this research and focused on the classification of treatments. I also reviewed the AD clinical trials registered on a clinical trial site (www.clinicaltrial.gov) on 23 June 2007, covering study contents, localization, and study designs. Results : Through PubMed, I found 169 articles. Classified by study subject, the studies related to treatment numbered 114 (67.5%); physiology, pathology and prevention 12 (7.1%); diagnosis and evaluation (7.1%); psychological aspects including quality of life 10 (5.9%); diet and management 10 (5.9%); epidemiology 7 (4.1%); and others 7 (4.1%). However, only 1 article studied herb-medicine as the treatment intervention. Furthermore, it was not based on Oriental Medicine philosophy. On the clinical trial site, there were 31 studies that were recruiting patients or planned for the future. These comprised 14 trials on the efficacy and safety of medicines, 5 comparative trials, 2 on phototherapy, 2 on diagnosis, 6 on physiology and pathology, and 2 on epidemiology. The trial institutions were concentrated in the U.S.A. Conclusion : I suggest that we develop a sound clinical guideline and standardize diagnosis and herb-medicine as soon as possible in order to establish a clinical study methodology for AD. Even though it is very hard to find a suitable study methodology, we should aim to achieve positive results and show evidence of the efficacy and safety of herb-medicine treatment for AD using Oriental Medicine.

Study on the Meaning of Yin-Yang and Sasang in the "Huangdineijing" ("황제내경(黃帝內經)"의 '음양(陰陽)'과 '태양(太陽).소양(少陽).소음(少陰).태음(太陰)'의 의미 고찰)

  • Lee, Ok Youn;Jung, Yun Im;Bae, Go Eun;Kwon, Young Kyu
    • Journal of Physiology & Pathology in Korean Medicine / v.28 no.6 / pp.577-584 / 2014
  • The purpose of this study was to examine how the terms Yin-Yang and Sasang are categorized in the book "Huangdi neijing". To investigate how the terms are used, we reviewed the text containing the terms expressed in the forms [(Yin/Yang) within (Yin/Yang)] and [Sasang]. We found three forms of expression: [(Yin/Yang) within (Yin/Yang)], [Sasang], and [(Sasang) within (Yin/Yang)]. Two paragraphs of [(Yin/Yang) within (Yin/Yang)] were found in one chapter, two paragraphs of [(Sasang)] were found in two chapters, and three paragraphs of [(Sasang) within (Yin/Yang)] were found in three chapters. We found five types of relation between [(Yin/Yang) within (Yin/Yang)], [Sasang], and the five phases in "Huangdi neijing", as follows: (1) Yang within Yin, lesser Yang, and wood (2) Yang within Yang, greater Yang, and fire (3) ( ), ( ), and extreme Yin (4) Yin within Yang, lesser Yin, and metal, and (5) Yin within Yin, greater Yin, and water. For [(Yin/Yang) within (Yin/Yang)] and [(Sasang) within (Yin/Yang)], the classification criteria for Yin-Yang were brightness, abdomen/back or lumbar. The order of Sasang in the description forms [Sasang] or [(Sasang) within (Yin/Yang)] in "Siqi Tiaoshen Dalun" and "Liu Jie ZangXiang Theory" was lesser Yang, greater Yang, greater Yin, and lesser Yin, which is based on the meridian system or a plant-shaped change order. We discuss the results and their implications for the analysis of medical classics, taking into account previous studies on Yin-Yang theories in "Huangdi neijing".

A Study on the Factors Obstructing Prostitutes' Escape from Prostitution (성매매 여성들의 탈성매매 저해요인에 관한 연구)

  • Lee, Keun-Moo;Yu, Eun-Ju
    • Korean Journal of Social Welfare / v.58 no.2 / pp.5-31 / 2006
  • Since the enforcement of the anti-prostitution law, and in spite of the institutional arrangements that help women escape prostitution, women engaged in prostitution have continued to show the will to remain in it. Therefore, this study aimed to devise an intervention plan to help them escape prostitution and return to society by examining the individual and structural factors obstructing their escape. The data were collected through in-depth interviews and texts, and were analysed through coding, construction of concepts, matching, and construction of an explanation of the phenomenon. Nine women engaged in prostitution participated in this study. The data analysis generated 46 concepts and 10 categories. Classified into individual and structural factors, the interpretation yielded the following causes obstructing prostitutes' escape from prostitution: (1) distrust of government policy, (2) a life-script made by reaction-formation, (3) predestined resignation caused by anxiety, (4) the body as capital goods, and (5) the commensal model with the pimp. Based on these results, we proposed practical and policy alternatives for prostitutes.

Clinical Practice Guideline for Soyangin Disease of Sasang Constitutional Medicine: Chest-Heat congested (Hyunggyeok-yeol) Symptomatology (소양인체질병증 임상진료지침: 흉격열병)

  • Park, Hye-Sun;Hwang, Min-Woo;Lee, Eui-Ju
    • Journal of Sasang Constitutional Medicine / v.26 no.3 / pp.262-271 / 2014
  • Objectives : This research was proposed to present a Clinical Practice Guideline (CPG) for Soyangin Disease of Sasang Constitutional Medicine (SCM): Chest-Heat congested (Hyunggyeok-yeol) Symptomatology. Methods : This CPG was developed by a nationwide expert committee consisting of SCM professors. First, literature related to SCM, such as Donguisusebowon, Text book of SCM, Clinical Guidebook of SCM, and Fundamental research to standardize diagnosis of Sasang Constitutional Medicine, was collected and organized. Second, journals related to clinical trials or human complementary medicine of SCM were searched. Finally, 4 articles were selected and included in the CPG for Chest-Heat congested (Hyunggyeok-yeol) Symptomatology of Stomach Heat-based Interior Heat disease in Soyangin disease. Results & Conclusions : The CPG for Chest-Heat congested (Hyunggyeok-yeol) symptomatology in Soyangin disease includes the classification, definition, and standard symptoms of each pattern. Chest-Heat congested (Hyunggyeok-yeol) symptomatology is classified by severity into mild and moderate patterns. The Chest-Heat congested (Hyunggyeok-yeol) mild pattern is classified into the Chest-Heat congested (Hyunggyeok-yeol) initial pattern and the Chest-Heat congested (Hyunggyeok-yeol) advanced pattern. The Chest-Heat congested (Hyunggyeok-yeol) moderate pattern is classified into the Clear Yang Failure of Stomach (Weguck-cheongyang Bulsangseung) pattern (Upper wasting-thirst (Sangso) pattern) and the Clear Yang Failure of Large Intestine (Daejang-cheongyang Bulsangseung) pattern (Middle wasting-thirst (Jungso) pattern).