• Title/Summary/Keyword: Feature generation


Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.1-19 / 2018
  • A large amount of data is now available for the research and business sectors to extract knowledge from. This data can take the form of unstructured data such as audio, text, and images and can be analyzed with deep learning methodology. Deep learning is now widely used for various estimation, classification, and prediction problems. In particular, the fashion business adopts deep learning techniques for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model of these applications is image classification using Convolutional Neural Networks (CNN). A CNN is made up of neurons that learn parameters such as weights as inputs pass through the network toward the outputs. Its layer structure is well suited to image classification, comprising convolutional layers that generate feature maps, pooling layers that reduce the dimensionality of the feature maps, and fully connected layers that classify the extracted features. However, most classification models have been trained on online product images, which are taken under controlled conditions, such as images of the apparel alone or of professional models wearing it. Such images may not train the classifier effectively when one wants to classify street-fashion or walking images, which are taken in uncontrolled situations and involve people's movement and unexpected poses. Therefore, we propose training the model with a runway apparel image dataset, which captures mobility. This allows the classification model to be trained on far more variable data and improves its adaptation to diverse query images. To achieve both convergence and generalization of the model, we apply transfer learning to our training network. Since transfer learning in CNNs consists of pre-training and fine-tuning stages, we divide training into two steps. First, we pre-train our architecture on the large-scale ImageNet dataset, which consists of 1.2 million images in 1,000 categories including animals, plants, activities, materials, instruments, scenes, and foods. We use GoogLeNet as our main architecture because it achieved high accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network with our own runway image dataset. Since no publicly available runway image dataset existed, we collected one from Google Image Search, obtaining 2,426 images of 32 major fashion brands including Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of training data, and the proposed model achieves an accuracy of 67.2% on the final test. Our research offers several advantages over related studies: to the best of our knowledge, no previous study has trained an apparel image classification network on a runway image dataset. We suggest training the model with images that capture all possible postures, which we denote as mobility, by using our own runway apparel image dataset.
Moreover, by applying transfer learning and using the checkpoint and parameters provided by TensorFlow Slim, we reduced the time spent training the classification model to about 6 minutes per experiment. This model can be used in many business applications where the query image may be a runway, product, or street-fashion image. Specifically, runway query images can support a mobile application service during fashion week to facilitate brand search, street-style query images can be classified and labeled by brand or style during fashion editorial work, and website query images can be processed by multi-service e-commerce platforms that provide item information or recommend similar items.
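
The two-stage procedure described above can be illustrated with a short, hedged sketch. The snippet below uses Keras with an ImageNet-pretrained InceptionV3 (an open-source descendant of GoogLeNet) rather than the authors' TensorFlow Slim checkpoint; the dataset directory, image size, and all hyperparameters are illustrative assumptions, not the paper's configuration, and only the class count of 32 brands comes from the abstract.

```python
# Hedged sketch of ImageNet pre-training + fine-tuning for 32 runway brands.
# InceptionV3 stands in for GoogLeNet; paths and hyperparameters are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_BRANDS = 32  # the 32 fashion brands in the collected runway dataset

# Stage 1 (pre-training): reuse convolutional weights learned on ImageNet.
base = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", pooling="avg")
base.trainable = False  # freeze the pre-trained feature extractor

# Stage 2 (fine-tuning): train a new classification head on runway images.
model = models.Sequential([
    layers.Input(shape=(299, 299, 3)),
    layers.Rescaling(1.0 / 127.5, offset=-1),  # Inception expects [-1, 1] inputs
    base,
    layers.Dropout(0.4),
    layers.Dense(NUM_BRANDS, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Hypothetical directory of runway images arranged one folder per brand.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "runway_images/train", image_size=(299, 299), batch_size=32)
model.fit(train_ds, epochs=10)
```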

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.163-177 / 2019
  • As smartphones become widely used, human activity recognition (HAR) tasks that recognize the personal activities of smartphone users from multimodal data have been actively studied. The research area is expanding from recognizing the simple body movements of an individual user to recognizing low-level and high-level behavior. However, HAR tasks that recognize interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have received less attention so far. Previous research on recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data, whereas physical sensors including the accelerometer, magnetic field sensor, and gyroscope are less privacy-sensitive and can collect a large amount of data within a short time. In this paper, a deep learning method for detecting accompanying status using only multimodal physical sensor data (accelerometer, magnetic field, and gyroscope) was proposed. The accompanying status was defined as a subset of user interaction behavior, namely whether the user is accompanying an acquaintance at close distance and whether the user is actively communicating with that acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation was proposed. First, a data preprocessing method consisting of time synchronization of the multimodal data from different physical sensors, normalization, and sequence generation was introduced. Nearest-neighbor interpolation was applied to synchronize the timestamps of data collected from different sensors, normalization was performed for each x, y, and z axis value of the sensor data, and sequence data were generated with a sliding-window method. The sequence data then became the input to the CNN, which extracted feature maps representing local dependencies of the original sequence. The CNN consisted of 3 convolutional layers and had no pooling layer, so as to maintain the temporal information of the sequence data. Next, the LSTM recurrent networks received the feature maps, learned long-term dependencies from them, and extracted features. The LSTM networks consisted of two layers, each with 128 cells. Finally, the extracted features were classified by a softmax classifier. The loss function was cross-entropy, and the weights were randomly initialized from a normal distribution with mean 0 and standard deviation 0.1. The model was trained with the adaptive moment estimation (Adam) optimizer with a mini-batch size of 128. Dropout was applied to the inputs of the LSTM layers to prevent overfitting. The initial learning rate was set to 0.001 and decayed exponentially by a factor of 0.99 at the end of each training epoch. An Android smartphone application was developed and released to collect data from a total of 18 subjects. Using this data, the model classified accompanying and conversation with accuracies of 98.74% and 98.83%, respectively. Both the F1 score and accuracy of the model were higher than those of a majority-vote classifier, a support vector machine, and a deep recurrent neural network.
In future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize timestamp differences. In addition, we will further study transfer learning methods that enable models tailored to the training data to transfer to evaluation data that follows a different distribution. We expect to obtain a model that exhibits robust recognition performance against changes in the data that are not considered during model training.
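
A minimal Keras sketch of the architecture and training settings summarized in the abstract is given below. The window length, number of sensor channels, convolution widths, and steps per epoch are assumptions (the abstract does not specify them); the layer counts, 128-cell LSTMs, dropout placement, initializer, optimizer, batch size, and learning-rate schedule follow the description above.

```python
# Hedged Keras sketch of the CNN-LSTM accompanying-status classifier:
# 3 conv layers without pooling, two 128-cell LSTM layers with dropout on
# their inputs, softmax output, cross-entropy loss, Adam, batch size 128,
# initial LR 0.001 decayed by 0.99 per epoch. WIN/CHANNELS/filters are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models, initializers

WIN, CHANNELS, NUM_CLASSES = 128, 9, 2   # assumed window of 9-axis sensor data
STEPS_PER_EPOCH = 500                     # assumed optimizer steps per epoch
init = initializers.RandomNormal(mean=0.0, stddev=0.1)

model = models.Sequential([
    layers.Input(shape=(WIN, CHANNELS)),
    layers.Conv1D(64, 5, padding="same", activation="relu", kernel_initializer=init),
    layers.Conv1D(64, 5, padding="same", activation="relu", kernel_initializer=init),
    layers.Conv1D(64, 5, padding="same", activation="relu", kernel_initializer=init),
    layers.Dropout(0.5),                  # dropout on the LSTM inputs
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(128),
    layers.Dense(NUM_CLASSES, activation="softmax", kernel_initializer=init),
])

# Learning rate decays by a factor of 0.99 once per epoch, as described.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001, decay_steps=STEPS_PER_EPOCH,
    decay_rate=0.99, staircase=True)
model.compile(optimizer=tf.keras.optimizers.Adam(schedule),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(x_windows, y_labels, batch_size=128, epochs=50)
```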

Development of a Prototype System for Aquaculture Facility Auto Detection Using KOMPSAT-3 Satellite Imagery (KOMPSAT-3 위성영상 기반 양식시설물 자동 검출 프로토타입 시스템 개발)

  • KIM, Do-Ryeong;KIM, Hyeong-Hun;KIM, Woo-Hyeon;RYU, Dong-Ha;GANG, Su-Myung;CHOUNG, Yun-Jae
    • Journal of the Korean Association of Geographic Information Studies / v.19 no.4 / pp.63-75 / 2016
  • Korea, surrounded by the ocean on three sides, has historically relied on aquaculture for marine products. Surveys on production have recently been conducted to manage aquaculture facilities systematically. Based on the survey results, pricing controls on marine products have been implemented to stabilize local fishery resources and to ensure a minimum income for fishermen. Such surveys of aquaculture facilities depend on the manual digitization of aerial photographs each year. Surveys that incorporate manual digitization of high-resolution aerial photographs can evaluate aquaculture accurately with the knowledge of experts who are familiar with the characteristics and deployment of each facility. However, using aerial photographs has monetary and time limitations for monitoring aquaculture resources with different life cycles, and it also requires a number of experts. Therefore, in this study, we investigated a prototype system for automatically detecting boundary information and monitoring aquaculture facilities based on satellite images. KOMPSAT-3, a Korean high-resolution satellite, provided the imagery (13 scenes), collected between October and April, a period in which many aquaculture facilities were operating. An ANN classification method was used to automatically detect facility types such as cage, longline, and buoy. Furthermore, shapefiles were generated using an image processing method that incorporates polygon generation techniques. The newly developed prototype detected aquaculture facilities at a rate of 93%. The suggested method not only overcomes the limits of the existing monitoring method based on aerial photographs but also assists experts in detecting aquaculture facilities. Aquaculture facility detection systems should be developed further through the application of image processing techniques and the classification of aquaculture facilities; such systems will assist related decision-making through aquaculture facility monitoring.
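
The abstract gives only the outline of the pipeline (ANN classification of facility types followed by polygon generation into shapefiles), so the following is a loose, illustrative sketch rather than the authors' implementation. It assumes per-pixel classification with scikit-learn's MLPClassifier, vectorization with rasterio and shapely, and hypothetical file names for the scene and the expert-labelled training pixels.

```python
# Loose sketch: per-pixel ANN classification of a satellite scene into facility
# classes (e.g. cage / longline / buoy / background), then polygonising the
# classified raster into shapefile geometries. All file names are placeholders.
import numpy as np
import rasterio
from rasterio import features
from shapely.geometry import shape
from sklearn.neural_network import MLPClassifier
import geopandas as gpd

with rasterio.open("kompsat3_scene.tif") as src:   # hypothetical KOMPSAT-3 scene
    img = src.read()                               # shape: (bands, rows, cols)
    transform, crs = src.transform, src.crs
bands, rows, cols = img.shape
X_all = img.reshape(bands, -1).T                   # one sample per pixel

# Hypothetical expert-labelled training pixels (features and class ids).
X_train = np.load("train_pixels.npy")
y_train = np.load("train_labels.npy")
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300)
clf.fit(X_train, y_train)

pred = clf.predict(X_all).reshape(rows, cols).astype(np.int32)

# Vectorise contiguous regions of each facility class into polygons.
records = [{"geometry": shape(geom), "class_id": int(val)}
           for geom, val in features.shapes(pred, transform=transform)
           if val != 0]                            # 0 = background class
gpd.GeoDataFrame(records, geometry="geometry", crs=crs).to_file(
    "aquaculture_facilities.shp")
```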

Distribution and Bacteriological Characteristics of Vibrio vulnificus (Vibrio vulnificus 균의 분포 및 세균학적 특성)

  • CHANG Dong-Suck;SHIN Il-Shik;CHOI Seung-Tae;KIM Young-Man
    • Korean Journal of Fisheries and Aquatic Sciences / v.19 no.2 / pp.118-126 / 1986
  • Vibrio vulnificus is a recently recognized halophilic organism that may cause serious human infections. Patients infected with V. vulnificus often have a history of exposure to the sea, suggesting that the organism may be a common inhabitant of the marine environment. The purpose of this experiment was to investigate the distribution and bacteriological characteristics of V. vulnificus. The strains used in this experiment were isolated from seawater and sea products such as common octopus (Octopus variabilis), ark shell (Anadara broughtonii), blue crab (Ericheir japonica), and sea squirt (Synthia roretzi) collected in the Pusan area from July to October 1985. V. vulnificus was frequently isolated in August, when the seawater temperature was around $26^{\circ}C$, and rarely isolated in October, when the seawater temperature was around $18.5^{\circ}C$. The distinctive biochemical characteristics of V. vulnificus were positive ONPG hydrolysis, lactose fermentation, and no growth in peptone water containing $8\%$ NaCl. The optical density at 660 nm of V. vulnificus cultures reached its maximum level after 8 hours at $35^{\circ}C$ in brain heart infusion broth, but increased little over 14 hours at $15^{\circ}C$. The optimum temperature and pH for the growth of V. vulnificus were around $35^{\circ}C$ and 8.0. The specific growth rate and generation time of V. vulnificus isolated from the samples were $1.21\;hr^{-1}$ and 34 min at $35^{\circ}C$, and $0.61\;hr^{-1}$ and 69 min at $25^{\circ}C$, respectively. V. vulnificus did not grow on eosin-methylene-blue agar, Salmonella-Shigella agar, or deoxycholate agar, but grew well on Endo agar, xylose-lysine-deoxycholate agar, and Hektoen enteric agar. On Endo agar, the colonies of V. vulnificus were red and reached a diameter of 2 to 4 mm, a feature enabling differentiation of V. vulnificus from other Vibrio spp. V. vulnificus grew well on TCBS agar, forming green colonies. V. vulnificus refrigerated at $4^{\circ}C$ exhibited a linear decline in viability of 1 log cycle for every 16 hours of storage, while V. vulnificus frozen at $-18^{\circ}C$ almost completely died off.
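
A quick consistency check on the growth figures: for exponential growth, the generation time follows from the specific growth rate via $t_d = \ln 2 / \mu$, giving $0.693 / 1.21\;hr^{-1} \approx 0.57\;hr \approx 34$ min at $35^{\circ}C$ and $0.693 / 0.61\;hr^{-1} \approx 1.14\;hr \approx 68$ min at $25^{\circ}C$, in close agreement with the reported 34 min and 69 min.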

Combustion Characteristic Study of LNG Flame in an Oxygen Enriched Environment (산소부화 조건에 따른 LNG 연소특성 연구)

  • Kim, Hey-Suk;Shin, Mi-Soo;Jang, Dong-Soon;Lee, Dae-Geun
    • Journal of Korean Society of Environmental Engineers / v.29 no.1 / pp.23-30 / 2007
  • The ultimate objective of this study is to develop oxygen-enriched combustion techniques applicable to practical industrial boiler systems. To this end, the combustion characteristics of a lab-scale LNG combustor were investigated as a first step by numerical simulation, analyzing the flame characteristics and pollutant emission behaviour as a function of the oxygen enrichment level. Several useful conclusions could be drawn from this study. First of all, increasing the oxygen enrichment level relative to air produced a long, thin flame with laminar flame features. This was in good agreement with experimental results reported in the open literature and is explained by reduced turbulent mixing, caused by the smaller absolute oxidizer flow rate when nitrogen is absent. Further, as expected, oxygen enrichment increased the flame temperature to a significant degree, together with the concentrations of $CO_2$ and $H_2O$, because the heat-sink and dilution effects of inert $N_2$ are eliminated. However, the increased flame temperature with $O_2$-enriched air indicated a high possibility of thermal $NO_x$ formation if nitrogen species were present. To remedy the problems caused by oxygen-enriched combustion, an appropriate amount of recirculated $CO_2$ gas was desirable to enhance turbulent mixing and thereby flame stability, and optimal operating conditions needed to be determined. For example, adjusting the burner to a swirl angle of $30\sim45^{\circ}$ increased the combustion efficiency of the LNG fuel and simultaneously reduced $NO_x$ formation.
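
The heat-sink and dilution argument above can be made concrete with the overall stoichiometry (a back-of-the-envelope illustration, treating LNG as pure methane, which is an approximation). In air, each mole of $O_2$ is accompanied by roughly 3.76 moles of inert $N_2$: $CH_4 + 2(O_2 + 3.76\,N_2) \rightarrow CO_2 + 2H_2O + 7.52\,N_2$, so about 7.5 moles of nitrogen per mole of fuel absorb heat and dilute the products. With pure oxygen the reaction reduces to $CH_4 + 2O_2 \rightarrow CO_2 + 2H_2O$, which removes that thermal ballast and raises both the flame temperature and the $CO_2$/$H_2O$ mole fractions, consistent with the simulation results described above.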

Design of Translator for generating Secure Java Bytecode from Thread code of Multithreaded Models (다중스레드 모델의 스레드 코드를 안전한 자바 바이트코드로 변환하기 위한 번역기 설계)

  • 김기태;유원희
    • Proceedings of the Korea Society for Industrial Systems Conference / 2002.06a / pp.148-155 / 2002
  • Multithreaded models improve the efficiency of parallel systems by combining inner parallelism, asynchronous data availability, and the locality of the von Neumann model. These models execute thread code that is generated by a compiler, and the quality of that code depends on the generation method. However, multithreaded models have the drawback that the execution model is restricted to a specific platform. In contrast, Java is platform independent, so if thread code can be translated into Java bytecode, the advantages of multithreaded models can be used on many platforms. Java executes Java bytecode, the intermediate language format for the Java virtual machine. In our translator, Java bytecode plays the role of the intermediate language and the Java virtual machine works as the back end. However, Java bytecode translated from multithreaded models has the drawback that it is not secure. In this paper, multithreaded code is made platform independent so that it can execute on the Java virtual machine: we design and implement a translator that translates the thread code of multithreaded models into Java bytecode and checks the resulting bytecode for security problems.

Incorporation of RAPD Linkage Map into RFLP Map in Glycine max (L.) Merr. (콩의 RAPD 연관지도를 RFLP 연관지도와 합병)

  • Choi, In-Soo;Kim, Yong-Chul
    • Journal of Life Science / v.13 no.3 / pp.280-290 / 2003
  • The incorporation of RAPD markers into the previous classical and RFLP genetic linkage maps will facilitate the generation of a detailed genetic map by compensating for the lack of one type of marker in the region of interest. The objective of this paper was to present the features we observed when we associated a RAPD map from an intraspecific cross of Glycine max$\times$G. max, 'Essex'$\times$PI 437654, with the public RFLP map developed from an interspecific cross of G. max$\times$G. soja. Among the 27 linkage groups of the RAPD map, eight contained probe/enzyme combination RFLP markers, which allowed us to incorporate RAPD markers into the public RFLP map. Map position rearrangement was observed: in incorporating L.G. C-3 into the public RFLP linkage groups a1 and a2, the pSAC3 and pA136 region and the pA170/EcoRV and pB170/HindIII region were each in the opposite order. In addition, pk400 was localized 1.8 cM from pA96-1 and 8.4 cM from pB172 in the public RFLP map, but 9.9 cM from the i locus and 18.9 cM from pA85 in our study. A noticeable expansion of the map distances in the intraspecific cross of Essex and PI 437654 was also observed. The map distance between probes pA890 and pK493 in L.G. C-1 was 48.6 cM, but only 13.3 cM in the public RFLP map. The distances from probe pB32-2 to pA670 and from pA670 to pA668 in L.G. C-2 were 50.9 cM and 31.7 cM, but 35.9 cM and 13.5 cM in the public RFLP map. The detection of duplicate loci from the same probe mapped to the same and/or different linkage groups was another feature we observed.

On the Persistence of Warm Eddies in the East Sea (동해 난수성 에디의 장기간 지속에 관하여)

  • JIN, HYUNKEUN;PARK, YOUNG-GYU;PAK, GYUNDO;KIM, YOUNG HO
    • The Sea: Journal of the Korean Society of Oceanography / v.24 no.2 / pp.318-331 / 2019
  • In this study, a comparative analysis is performed on the long-lived warm eddies generated in 2003 (WE03) and 2014 (WE14) in the East Sea, using HYCOM reanalysis data. Overshooting of the East Korea Warm Current (EKWC) appeared during the formation period of these warm eddies. The warm eddies were produced in the shallow Korea Plateau region through the interaction of the EKWC and the sub-polar front. In the interior of both warm eddies, a homogeneous water mass of about $13^{\circ}C$ and 34.1 psu was generated over the upper 150 m by winter mixing. In 2004, the year after the generation of WE03, the inflow through the western channel of the Korea Strait was larger than its climatology, while during 2015, the development period of WE14, the inflow was smaller. These results suggest that heat and salt are supplied to the warm eddies by the Tsushima Warm Current (TWC); however, the amount of inflow through the Korea Strait has a negligible impact on their long-term persistence. Both warm eddies persisted for more than 18 months near Ulleung Island, although their pathways showed no common feature. In the vicinity of the Ulleung Basin, large and small eddies are continuously created by the meandering of the EKWC. The long-lived warm eddies near Ulleung Island appear to result from the interaction between pre-existing eddies located south of the sub-polar front and new eddies induced by meanderings of the EKWC. This conclusion is also consistent with the fact that long-lived warm eddies were not always created when EKWC overshooting appeared.

Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.175-197 / 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis has been actively conducted and is showing remarkable results in various fields such as classification, summarization, and generation. Among the various text analysis fields, text classification is the most widely used technology in academia and industry. Text classification includes binary classification with one label from two classes, multi-class classification with one label from several classes, and multi-label classification with multiple labels from several classes. In particular, multi-label classification requires a different training method from binary and multi-class classification because each instance can carry multiple labels. In addition, as the number of labels and classes grows, the number of labels to be predicted increases, making prediction more difficult and performance improvement harder to achieve. To overcome these limitations, research on label embedding is being actively conducted, in which (i) the initially given high-dimensional label space is compressed into a low-dimensional latent label space, (ii) a model is trained to predict the compressed label, and (iii) the predicted label is restored to the high-dimensional original label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, since these techniques consider only linear relationships between labels or compress the labels by random transformation, they cannot capture non-linear relationships between labels and therefore cannot create a latent label space that sufficiently preserves the information of the original labels. Recently, there have been increasing attempts to improve performance by applying deep learning to label embedding. Label embedding using an autoencoder, a deep learning model that is effective for data compression and restoration, is representative. However, traditional autoencoder-based label embedding suffers a large amount of information loss when compressing a high-dimensional label space with a myriad of classes into a low-dimensional latent label space, which is related to the vanishing gradient problem that occurs during backpropagation. To solve this problem, skip connections were devised: by adding a layer's input to its output, gradient vanishing during backpropagation is prevented and efficient learning is possible even in deep networks. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies applying them to autoencoders or the label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. In addition, the proposed methodology was applied to actual paper keywords to derive the high-dimensional keyword label space and the low-dimensional latent label space.
Using this, we conducted an experiment in which the compressed keyword vector in the latent label space was predicted from the paper abstract, and multi-label classification was evaluated by restoring the predicted keyword vector to the original label space. As a result, multi-label classification based on the proposed methodology showed far superior accuracy, precision, recall, and F1 score compared to traditional multi-label classification methods. This indicates that the low-dimensional latent label space derived by the proposed methodology reflects the information of the high-dimensional label space well, which ultimately improved the performance of multi-label classification itself. In addition, the utility of the proposed methodology was confirmed by comparing its performance across domain characteristics and different numbers of latent label space dimensions.
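
A minimal Keras sketch of the idea is shown below. It is not the authors' architecture: the layer sizes, depth, and the exact placement of the skip connections (here, residual additions inside the encoder and inside the decoder) are assumptions used only to illustrate how a skip-connection autoencoder can embed a high-dimensional multi-hot keyword label vector into a low-dimensional latent label space and restore it.

```python
# Hedged sketch of a skip-connection autoencoder for label embedding.
# Dimensions are assumed; the skip additions ease gradient flow when
# compressing and reconstructing a high-dimensional multi-hot label vector.
import tensorflow as tf
from tensorflow.keras import layers, Model

LABEL_DIM, HIDDEN, LATENT = 1000, 256, 32   # assumed dimensionalities

labels_in = layers.Input(shape=(LABEL_DIM,))

# Encoder with an internal skip connection
e1 = layers.Dense(HIDDEN, activation="relu")(labels_in)
e2 = layers.Dense(HIDDEN, activation="relu")(e1)
e2 = layers.Add()([e1, e2])                 # skip connection in the encoder
latent = layers.Dense(LATENT, activation="relu", name="latent_labels")(e2)

# Decoder with an internal skip connection
d1 = layers.Dense(HIDDEN, activation="relu")(latent)
d2 = layers.Dense(HIDDEN, activation="relu")(d1)
d2 = layers.Add()([d1, d2])                 # skip connection in the decoder
labels_out = layers.Dense(LABEL_DIM, activation="sigmoid")(d2)

autoencoder = Model(labels_in, labels_out)
encoder = Model(labels_in, latent)          # produces the compressed label targets
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
# autoencoder.fit(Y, Y, ...)  where Y is the multi-hot keyword label matrix;
# a text model is then trained to predict encoder(Y) from abstracts, and its
# output is decoded back to the original keyword space for evaluation.
```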

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.1-23 / 2018
  • Since the beginning of the 21st century, various high-quality services have emerged with the growth of the internet and information and communication technologies. In particular, the e-commerce industry, in which Amazon and eBay stand out, has been growing explosively. As e-commerce grows, customers can easily find what they want to buy while comparing various products, because more products have been registered at online shopping malls. However, a problem has arisen with this growth: with so many products registered, it has become difficult for customers to find what they really need in the flood of products. When customers search with a general keyword, too many products are returned; conversely, few products are found when customers type in product details, because concrete product attributes are rarely registered. In this situation, automatically recognizing text in images can be a solution. Because the bulk of product details is written in catalogs in image format, most product information cannot be found with text input in the current text-based search systems. If the information in images can be converted to text, customers can search for products by product details, which makes shopping more convenient. Various existing OCR (Optical Character Recognition) programs can recognize text in images, but they are hard to apply to catalogs because they have trouble recognizing text in certain circumstances, for example when the text is not big enough or the fonts are not consistent. Therefore, this research suggests a way to recognize keywords in catalogs with deep learning algorithms, which have been state of the art in image recognition since the 2010s. The Single Shot MultiBox Detector (SSD), a well-regarded model for object detection, can be used with its structure redesigned to account for the differences between text and objects. However, the SSD model needs a large amount of labeled training data, because deep learning algorithms of this kind are trained by supervised learning. To collect data, one could manually label the location and class of text in catalogs, but manual collection raises many problems: some keywords would be missed because humans make mistakes while labeling; collecting training data at the required scale is too time-consuming, or costly if many workers are hired to shorten the time; and if specific keywords need to be trained, finding images that contain those words is also difficult. To solve this data issue, this research developed a program that creates training data automatically. The program generates catalog-like images containing various keywords and pictures and saves the location information of the keywords at the same time. With this program, data can be collected efficiently and the performance of the SSD model improves. The SSD model recorded a recognition rate of 81.99% with 20,000 data items created by the program. Moreover, this research tested the efficiency of the SSD model on different data variations to analyze which features of the data influence the performance of recognizing text in images.
As a result, it was found that the number of labeled keywords, the addition of overlapping keyword labels, the existence of unlabeled keywords, the spacing among keywords, and differences in background images are all related to the performance of the SSD model. This test can lead to performance improvements of the SSD model, or of other deep-learning-based text recognizers, through higher-quality data. The SSD model redesigned to recognize text in images and the program developed for creating training data are expected to contribute to improving e-commerce search systems. Suppliers can spend less time registering keywords for products, and customers can search for products by the product details written in the catalog.
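
The kind of automatic training-data generator described above can be sketched briefly with Pillow; this is an illustration rather than the authors' program, and the background image, font file, keyword list, and output paths are placeholders. The key point is that because the generator places the text itself, the bounding box and class label of every keyword are known exactly and can be saved alongside the rendered image for SSD training.

```python
# Illustrative sketch of automatic training-data generation: paste keyword
# text onto a catalog-like background at random positions and record each
# keyword's bounding box as a label. File names, font and sizes are assumed.
import json
import random
from PIL import Image, ImageDraw, ImageFont

keywords = ["cotton", "waterproof", "slim fit"]            # example keywords
background = Image.open("background.jpg").convert("RGB")   # hypothetical large catalog-like image
draw = ImageDraw.Draw(background)
font = ImageFont.truetype("NanumGothic.ttf", size=28)      # assumed font file

annotations = []
for word in keywords:
    x = random.randint(0, background.width - 200)          # assumes background wider than 200 px
    y = random.randint(0, background.height - 40)
    draw.text((x, y), word, fill="black", font=font)
    left, top, right, bottom = draw.textbbox((x, y), word, font=font)
    annotations.append({"label": word, "bbox": [left, top, right, bottom]})

background.save("train_00001.jpg")
with open("train_00001.json", "w") as f:
    json.dump(annotations, f)   # location + class labels for the SSD trainer
```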