• Title/Summary/Keyword: SELECT Model


Development of Cloud Detection Method Considering Radiometric Characteristics of Satellite Imagery (위성영상의 방사적 특성을 고려한 구름 탐지 방법 개발)

  • Won-Woo Seo;Hongki Kang;Wansang Yoon;Pyung-Chae Lim;Sooahm Rhee;Taejung Kim
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1211-1224 / 2023
  • Clouds cause many difficult problems when observing land surface phenomena with optical satellites, for purposes such as national land observation, disaster response, and change detection. Because the presence of clouds affects not only the image processing stage but also the quality of the final data, it is necessary to identify and remove them. Therefore, in this study we developed a new cloud detection technique that automatically performs a series of processes: searching for and extracting the pixels closest to the spectral pattern of clouds in a satellite image, selecting the optimal threshold, and producing a cloud mask based on that threshold. The cloud detection technique consists of three main steps. In the first step, the Digital Number (DN) image is converted into top-of-atmosphere (TOA) reflectance. In the second step, preprocessing such as Hue-Saturation-Value (HSV) transformation, triangle thresholding, and maximum likelihood classification is applied to the TOA reflectance image, and the threshold for generating the initial cloud mask is determined for each image. In the third, post-processing step, the noise included in the initial cloud mask is removed and the cloud boundaries and interior are refined. As experimental data for cloud detection, CAS500-1 L2G images acquired over the Korean Peninsula from April to November, which show the diversity of the spatial and seasonal distribution of clouds, were used. To verify the performance of the proposed method, its results were compared with those generated by a simple thresholding method. In the experiments, the proposed method detected clouds more accurately than the existing method by considering the radiometric characteristics of each image through the preprocessing process. In addition, the influence of bright objects other than clouds (panel roofs, concrete roads, sand, etc.) was minimized. The proposed method showed more than 30% improvement in F1-score over the existing method, but showed limitations in certain images containing snow.
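
The per-image thresholding idea in this abstract can be sketched roughly as follows. This is a minimal illustration assuming generic calibration coefficients and a simple morphological clean-up, not the CAS500-1 processing chain or the paper's maximum likelihood step; the gains, offsets, and channel choices are placeholders.

```python
# Hedged sketch: DN -> TOA reflectance, HSV 'Value' channel, triangle
# thresholding per image, then morphological post-processing of the mask.
import numpy as np
from skimage.color import rgb2hsv
from skimage.filters import threshold_triangle
from scipy.ndimage import binary_opening, binary_closing

def dn_to_toa_reflectance(dn, gain, offset, esun, sun_elev_deg, d=1.0):
    """Convert DN to top-of-atmosphere reflectance (generic formula,
    illustrative calibration constants)."""
    radiance = gain * dn.astype(np.float32) + offset
    theta = np.deg2rad(90.0 - sun_elev_deg)          # solar zenith angle
    return np.pi * radiance * d**2 / (esun * np.cos(theta))

def detect_clouds(toa_rgb):
    """Initial cloud mask from the Value channel via triangle thresholding,
    followed by simple noise removal and hole filling."""
    value = rgb2hsv(toa_rgb)[..., 2]                 # HSV 'Value' channel
    t = threshold_triangle(value)                    # per-image threshold
    mask = value > t
    mask = binary_opening(mask, np.ones((3, 3)))     # remove small noise
    mask = binary_closing(mask, np.ones((5, 5)))     # fill small holes
    return mask
```

Triangle thresholding derives the cut-off from the shape of each image's brightness histogram, which is why the threshold adapts to the radiometric characteristics of every scene rather than being fixed in advance.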

A Study on the Revitalization of the Competency Assessment System in the Public Sector : Compare with Private Sector Operations (공공부문 역량평가제도의 활성화 방안에 대한 연구 : 민간부분의 운영방식과의 비교 연구)

  • Kwon, Yong-man;Jeong, Jang-ho
    • Journal of Venture Innovation / v.4 no.1 / pp.51-65 / 2021
  • HR policy in the public sector used to be closed and operated mainly around written tests, but in 2006 a new competency-based evaluation, promotion, and education system was introduced into the promotion and selection system for civil servants. In particular, the seniority-oriented promotion system was supplemented with competency-based evaluation by operating an Assessment Center for promotion. Competency evaluation is known to be the most reliable and valid of the evaluation methods used to date and is also known to have high predictive validity for performance. In 2001, 19 government standard competency models were designed, and in 2006 competency assessment was implemented together with the senior civil service system. In the public sector, the purpose of competency evaluation is mainly to select third-grade civil servants, assign fourth-grade civil servants, and promote fifth-grade civil servants. However, competency assessment in the public sector differs from that in the private sector in its objectives, assessment processes, and assessment programs. In terms of purpose, the public sector uses competency assessment for the promotion of candidates, whereas the private sector focuses on career development and fostering. As a result, the public sector develops competencies less continuously than the private sector and does not use the assessment to enhance job performance. Regarding evaluation items, the public sector generally operates a system that requires a score of 2.5 out of 5 across six competencies to pass and provides little feedback on which competencies are lacking, whereas the private sector uses each individual's competency scores. Regarding the selection and operation of evaluators, the public sector focuses on fairness in evaluation and the private sector on usability, which is inconsistent with developing capabilities and placing human resources where they fit best. Therefore, the public sector should also improve measures to identify outstanding people and motivate them through competency evaluation, and change the operation of the competency evaluation system so that civil servants can grow into better managers through accurate reports and individual feedback.

A Study on Consumer Eco-friendly Behavior Utilizing the Photovoice Methodology : Focus Group Study (포토보이스(Photovoice) 기법을 활용한 소비자의 친환경 행동에 대한 연구 : Focus Group Study)

  • Lee, Il-han
    • Journal of Venture Innovation / v.6 no.4 / pp.63-81 / 2023
  • The purpose of this study was to apply the Photovoice qualitative research method to university students in order to understand their perceptions of environmental issues, environmental barriers, and eco-friendly behaviors. By employing the Photovoice methodology, we sought to share the perspectives of university students on eco-friendly behaviors, explore the motivations and manifestations of these behaviors, and reflect on their significance. The ultimate goal was to provide practical suggestions for fostering eco-friendly behaviors through an in-depth examination of the students' visual narratives and reflections. Under the overarching theme of the environment, participants individually selected and explored three specific sub-themes: 'My Concept of the Environment,' 'Environmental Barriers in My Life,' and 'My Eco-friendly Behaviors.' Participants captured photographs from their daily lives related to each theme, expressed their thoughts and perspectives through the selected images, and then shared and discussed their insights while listening to the opinions of others in the group. The results revealed several key findings. First, participants assigned meaning to the photographs they selected about the environment with notions such as 'waste,' 'discomfort,' 'fine dust = environmental pollution,' and 'indifference.' Second, participants associated the photographs related to environmental barriers with concepts such as 'invisibility,' 'apathy,' 'social stigma,' 'inefficiency,' and 'compulsion.' Lastly, participants linked the photographs selected in the context of eco-friendly behaviors to themes such as 'recycling,' 'energy conservation,' 'reuse,' and 'reducing the use of disposable items.' Based on these findings, the V-A-B (Values-Attitudes-Behavior) model was confirmed: consumers structure a hierarchical relationship between their personal values, attitudes, and behaviors. The study also identified clear impediments in consumers' daily lives that hinder the practice of eco-friendly behaviors. In light of this, the research highlights the need for strategies that address the discomfort or inconvenience associated with environmentally friendly consumer behavior, and it implies that interventions are necessary to alleviate these barriers and promote a more seamless integration of eco-friendly practices into consumers' daily routines.

A Study on the Forest Yield Regulation by Systems Analysis (시스템분석(分析)에 의(依)한 삼림수확조절(森林收穫調節)에 관(關)한 연구(硏究))

  • Cho, Eung-hyouk
    • Korean Journal of Agricultural Science / v.4 no.2 / pp.344-390 / 1977
  • The purpose of this paper was to schedule an optimum cutting strategy that maximizes total yield under certain restrictions on periodic timber removals and harvest areas from an industrial forest, based on a linear programming (LP) technique. The sensitivity of the regulation model to variations in the restrictions has also been analyzed to obtain information on changes of total yield over the planning period. The regulation procedure was applied to the experimental forest of the College of Agriculture, Seoul National University. The forest is composed of 219 cutting units and is characterized by younger age groups, which is very common in Korea. The planning period is divided into 10 cutting periods of five years each, and cutting is permissible only on stands of age groups 5-9. It is also assumed in the study that subsequent forests are established immediately after cutting the existing forests, that non-stocked forest lands are planted in the first cutting period, and that established forests are fully stocked until the next harvest. All feasible cutting regimes have been defined for each unit depending on its age group. The total yield (Vi,k) of each regime expected in the planning period has been projected using stand yield tables and forest inventory data, and the regime that gives the highest Vi,k has been selected as an optimum cutting regime. After calculating the periodic yields, cutting areas, and total yield from the optimum regimes selected without any restrictions, the upper and lower limits of periodic yields (Vj-max, Vj-min) and of periodic cutting areas (Aj-max, Aj-min) were decided. The optimum regimes under these restrictions were then selected by linear programming. The results of the study may be summarized as follows: 1. The fluctuations of periodic harvest yields and areas under cutting regimes selected without restrictions were very large, because of the irregular composition of age classes and growing stocks of the existing stands. About 68.8 percent of the total yield is expected in period 10, while no yield is expected in periods 6 and 7. 2. After inspection of the above solution, restricted optimum cutting regimes were obtained under the restrictions Amin = 150 ha, Amax = 400 ha, Vmin = 5,000 m³, and Vmax = 50,000 m³, using the LP regulation model. As a result, a stable harvest yield of about 50,000 m³ per period and a relatively balanced age group distribution are expected from period 5. In this case, the loss in total yield was about 29 percent of that of the unrestricted regimes. 3. A thinning schedule could be easily handled by the model presented in the study, and the thinnings made it possible to select optimum regimes that smooth the wood flows, not to mention increasing total yield in the planning period. 4. The stronger the restrictions in the optimum solution, the earlier the period in which balanced harvest yields and age group distribution can be achieved. In this particular case, the periodic yields were strongly affected by the constraints, and the fluctuations of harvest areas depended upon the amount of the periodic yields. 5. Because total yield decreased at an increasing rate as stronger restrictions were imposed, the loss would be very great where strict sustained yield and a normal age group distribution are required in the earlier periods. 6. Total yield under the same restrictions in a period was increased by lowering the felling age and extending the range of cutting age groups. Therefore, it seems advantageous for producing maximum timber yield to adopt a wider range of cutting age groups, with the lower limit at the age at which the smallest utilizable size of timber can be produced. 7. The LP regulation model presented in the study seems useful in the Korean situation from the following points of view: (1) The model can provide forest managers with the solution of where, when, and how much to cut in order to best fulfill the owner's objective. (2) Planning is visualized as a continuous process in which new strategies are automatically evolved as changes in the forest environment are recognized. (3) The cost (measured as the decrease in total yield) of imposing restrictions can be easily evaluated. (4) A thinning schedule can be treated without difficulty. (5) The model can be applied to irregular forests. (6) Traditional regulation methods can be reinforced by the model.
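
The harvest-scheduling formulation described above maps naturally onto a linear program: for each cutting unit, choose a regime so that total yield is maximized subject to periodic yield (and area) bounds. The sketch below is a hedged LP relaxation with made-up unit counts, yields, and bounds, not the paper's 219-unit data; periodic area constraints would be added as rows analogous to the yield rows.

```python
# Minimal LP-relaxation sketch of the regulation model: x[i,k] is the
# fraction of unit i assigned to cutting regime k. All data are illustrative.
import numpy as np
from scipy.optimize import linprog

n_units, n_regimes, n_periods = 4, 3, 2
V = np.random.default_rng(0).uniform(1e3, 5e3,        # V[i,k,j]: yield of unit i
        size=(n_units, n_regimes, n_periods))         # under regime k in period j
total_V = V.sum(axis=2)                                # V_{i,k} over the plan

c = -total_V.ravel()                                   # maximize total yield
A_eq = np.kron(np.eye(n_units), np.ones(n_regimes))    # one regime per unit
b_eq = np.ones(n_units)

# Periodic yield bounds: Vj_min <= sum_{i,k} V[i,k,j] * x[i,k] <= Vj_max
Vj_min, Vj_max = 2e3, 2e4
A_ub, b_ub = [], []
for j in range(n_periods):
    row = V[:, :, j].ravel()
    A_ub += [row, -row]
    b_ub += [Vj_max, -Vj_min]

res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
              A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
print("total yield over the planning period:", -res.fun)
```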


A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led researchers to apply text-mining techniques to analyze big social media data in a more rigorous manner. Even though social media text analysis algorithms have improved, previous approaches still have limitations. In the field of sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies add grammatical factors to the feature sets used to train the classification model. The other adopts semantic analysis for sentiment analysis, but this approach has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to handle the more extensive semantic features that were underestimated in existing sentiment analysis. The result of adopting the Word2Vec algorithm is compared to that of co-occurrence analysis to identify the difference between the two approaches. The results show that the Word2Vec algorithm extracts about three times more related words expressing some emotion about the keyword than co-occurrence analysis does. The difference between the two results comes from Word2Vec's vectorization of semantic features; the Word2Vec algorithm is therefore able to catch hidden related words that traditional analysis does not find. In addition, Part-Of-Speech (POS) tagging for Korean is used to detect adjectives as "emotional words." The emotion words extracted from the text are converted into word vectors by the Word2Vec algorithm to find related words, and among these related words, nouns are selected because each of them may have a causal relationship with the emotional word in the sentence. The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets were collected by searching with three keywords that carry rich public emotion and opinion: professor, prosecutor, and doctor. Advanced data collection was conducted to select secondary keywords for data gathering. The secondary keywords used to gather the data for each keyword were: professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), doctor (Shin Hae-chul sky hospital, drinking and plastic surgery, rebate), prosecutor (lewd behavior, sponsor). The text data comprise about 100,000 documents (professor: 25,720; doctor: 35,110; prosecutor: 43,225) gathered from news, blogs, and Twitter to reflect various levels of public emotion. Gephi (http://gephi.github.io) was used for visualization, and every program used for text processing and analysis was written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding Emotion Triggers can detect hidden connections to public emotion that existing methods cannot. Finally, the approach used in this study can be generalized regardless of the type of text data. 
The limitation of this study is that it is hard to claim that a word extracted by Emotion Trigger processing has a significant causal relationship with the emotional word in a sentence. Future work will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, part of the text data used for Emotion Trigger comes from Twitter, which has a number of distinct features that we did not deal with in this study; these features will be considered in further work.
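
The "Emotion Trigger" step, retrieving words related to an emotional word with Word2Vec and keeping only nouns as candidate triggers, can be sketched as below. The study used Korean POS tagging and Java tooling; this Python sketch uses gensim with placeholder tokens and a stand-in noun filter, so every name and token here is illustrative.

```python
# Hedged sketch of the Emotion Trigger idea: neighbours of an emotion word
# in the Word2Vec space, filtered to nouns as candidate trigger factors.
from gensim.models import Word2Vec

# Placeholder corpus of pre-tokenized sentences (the study used Korean
# news/blog/Twitter text tagged with a Korean POS tagger).
sentences = [
    ["professor", "research_fund", "misuse", "anger"],
    ["professor", "recruitment", "irregularity", "disappointment"],
    ["doctor", "rebate", "anger"],
]
model = Word2Vec(sentences, vector_size=50, window=5, min_count=1, sg=1)

def emotion_triggers(model, emotion_word, is_noun, topn=20):
    """Return noun neighbours of an emotional word as candidate triggers."""
    neighbours = model.wv.most_similar(emotion_word, topn=topn)
    return [(word, score) for word, score in neighbours if is_noun(word)]

# is_noun would normally come from a Korean POS tagger; here a trivial
# placeholder accepts every token.
print(emotion_triggers(model, "anger", is_noun=lambda w: True, topn=5))
```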

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia Pacific Journal of Information Systems / v.21 no.1 / pp.103-122 / 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents, and in this situation it is virtually impossible for users to examine all of them to determine whether they might be useful. For this reason, some on-line documents are accompanied by a list of keywords specified by the authors in an effort to guide users by facilitating the filtering process. In this way, a set of keywords is often considered a condensed version of the whole document and therefore plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents do not benefit from keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, implementation is the obstacle: manually assigning keywords to all documents is a daunting, even impractical, task in that it is extremely tedious, time-consuming, and requires a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are two main approaches to this aim: the keyword assignment approach and the keyword extraction approach. Both use machine learning methods and require, for training, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its terms to the texts; that is, the keyword assignment approach selects the words from a controlled vocabulary that best describe a document. Although this approach is domain dependent and not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. In the latter approach, the aim is to extract keywords with respect to their relevance in the text without a prior vocabulary. Here, automatic keyword generation is treated as a classification task, and keywords are commonly extracted using supervised learning techniques: keyword extraction algorithms classify candidate keywords in a document into positive or negative examples. Several systems such as Extractor and Kea were developed using the keyword extraction approach. The most indicative words in a document are selected as its keywords, so keyword extraction is limited to terms that appear in the document and cannot generate implicit keywords. According to Turney's experimental results, about 64% to 90% of keywords assigned by authors can be found in the full text of an article; conversely, 10% to 36% of author-assigned keywords do not appear in the article and therefore cannot be generated by keyword extraction algorithms. Our preliminary experiment also shows that 37% of author-assigned keywords are not included in the full text. This is why we have adopted the keyword assignment approach. In this paper, we propose a new approach for automatic keyword assignment, IVSM (Inverse Vector Space Model). 
The model is based on the vector space model, a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. The keyword assignment process of IVSM is as follows: (1) calculate the vector length of each keyword set based on each keyword's weight; (2) preprocess and parse a target document that does not have keywords; (3) calculate the vector length of the target document based on term frequency; (4) measure the cosine similarity between each keyword set and the target document; and (5) generate the keywords that have high similarity scores. Two keyword generation systems were implemented using IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The former is embedded in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers and has been tested on a number of academic papers, including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. In our experiments, the precision of IVSM applied to the Web-based community service and to academic journals was 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability, and IVSM shows performance comparable to Extractor, a representative keyword extraction system developed by Turney. As electronic documents increase, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
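
Steps (1)-(5) of the IVSM assignment process can be sketched with ordinary term-frequency vectors and cosine similarity. The keyword sets, weights, and preprocessing below are illustrative stand-ins, not the paper's controlled vocabulary or weighting scheme.

```python
# Hedged sketch of IVSM-style keyword assignment: compare a keyword-less
# document against pre-built keyword-set vectors by cosine similarity.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# (1) candidate keyword sets, each represented by its (weighted) terms
keyword_sets = {
    "information retrieval": "retrieval query index document ranking",
    "text mining":           "mining text cluster classification corpus",
    "logistics":             "shipping port cargo distribution logistics",
}

def assign_keywords(document, keyword_sets, top_k=2):
    names = list(keyword_sets)
    vec = CountVectorizer()
    # (2)-(3) parse the target document and keyword sets into term vectors
    X = vec.fit_transform(list(keyword_sets.values()) + [document])
    ks_vecs, doc_vec = X[:-1], X[-1]
    # (4) cosine similarity between the document and every keyword set
    sims = cosine_similarity(doc_vec, ks_vecs).ravel()
    # (5) keyword sets with the highest similarity become the keywords
    order = np.argsort(sims)[::-1][:top_k]
    return [(names[i], float(sims[i])) for i in order]

print(assign_keywords("a study on document ranking and index structures "
                      "for retrieval of web documents", keyword_sets))
```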

The Ability of Anti-Tumor Necrosis Factor Alpha (TNF-α) Antibodies Produced in Sheep Colostrums

  • Yun, Sung-Seob
    • 한국유가공학회:학술대회논문집 / 2007.09a / pp.49-58 / 2007
  • The inflammatory process leads to the well-known mucosal damage and thereby a further disturbance of epithelial barrier function, resulting in abnormal intestinal wall function and further accelerating the inflammatory process [1]. Despite these records, the etiology and pathogenesis of inflammatory bowel disease (IBD) remain rather unclear. Many studies over the past several years have led to great advances in understanding IBD and its underlying pathophysiologic mechanisms. From the current understanding, it is likely that chronic inflammation in IBD is due to aggressive cellular immune responses, including increased serum concentrations of different cytokines. Therefore, targeted molecules can be specifically eliminated in their expression directly at the transcriptional level, and interesting therapeutic trials are expected against adhesion molecules and pro-inflammatory cytokines such as TNF-α. The future development of immune therapies in IBD therefore holds great promise for better treatment modalities and will also open important new insights into a further understanding of inflammation pathophysiology. Cytokine inhibitors such as Immunex (Enbrel) and J&J/Centocor (Remicade), which are mouse-derived monoclonal antibodies, have been shown in several studies to modulate patients' symptoms; however, these TNF inhibitors also cause immune-related adverse effects, are costly, and must be administered by injection. Because of the eventual development of unwanted side effects, these two products are used in only a select patient population. The present study was performed to elucidate the ability of TNF-α antibodies produced in sheep colostrums to neutralize TNF-α action in a cell-based bioassay and in a small animal model of intestinal inflammation. In the in vitro study, the inhibitory effect of anti-TNF-α antibody from the sheep was determined by cell bioassay. The antibody from the sheep at a 1:10,000 dilution was able to completely inhibit TNF-α activity in the cell bioassay. Antibodies from the same sheep, but from different milkings, exhibited some variability in inhibition of TNF-α activity, but all were greater than the control sample. In the in vivo study, the degree of inflammation was too severe for the experiment: despite the initial pilot trial, main trial 1 was unable to show any effect of the antibody in reducing the impact of PAF and LPS, and main rat trial 2 produced no significant colitis symptoms such as the characteristic acute diarrhea and weight loss. This study suggests that colostrums from sheep immunized against TNF-α significantly inhibited TNF-α bioactivity in the cell-based assay, while the higher-than-anticipated variability in the two animal models precluded assessment of the ability of the antibody to prevent TNF-α-induced intestinal damage in the intact animal. Further study will be required to find an alternative animal model that is more suitable for testing anti-TNF-α IgA therapy for reducing the impact of inflammation on gut dysfunction. Subsequent pre-clinical and clinical testing will also need the generation of more antibody, as current supplies are low.


Analysis of promising countries for export using parametric and non-parametric methods based on ERGM: Focusing on the case of information communication and home appliance industries (ERGM 기반의 모수적 및 비모수적 방법을 활용한 수출 유망국가 분석: 정보통신 및 가전 산업 사례를 중심으로)

  • Jun, Seung-pyo;Seo, Jinny;Yoo, Jae-Young
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.175-196 / 2022
  • The information and communication and home appliance industries, which were among South Korea's main industries, are gradually losing their export share as their export competitiveness weakens. This study objectively analyzed export competitiveness and suggested export-promising countries in order to help South Korea's information communication and home appliance industries improve exports. Network properties, centrality, and structural hole analysis were performed during the network analysis to evaluate export competitiveness. In order to select promising export countries, we proposed a new variable that takes into account the characteristics of the already established International Trade Network (ITN), that is, the Global Value Chain (GVC), in addition to the existing economic factors. The conditional log-odds of individual links derived from the Exponential Random Graph Model (ERGM) in the analysis of the cross-border trade network were used as a proxy variable for export potential. Based on these ERGM link estimates, a parametric approach and a non-parametric approach were each used to recommend export-promising countries. In the parametric method, a regression analysis model was developed to predict the export value of South Korea's information and communication and home appliance industries by adding the link-specific network characteristics derived from the ERGM to the existing economic factors. In the non-parametric approach, an anomaly detection algorithm based on clustering was used, and promising export countries were proposed by finding outliers that deviate from their peers. According to the results, the structural characteristic of the industry's export network was high transferability. The centrality analysis showed that South Korea's influence on exports was weak compared to its size, and the structural hole analysis showed that export efficiency was weak. According to the model for recommending promising export countries proposed in this study, the parametric analysis identified Iran, Ireland, North Macedonia, Angola, and Pakistan as promising export countries, while the non-parametric analysis identified Qatar, Luxembourg, Ireland, North Macedonia, and Pakistan; the two models differed for some countries. The results revealed that the export competitiveness of South Korea's information and communication and home appliance industries within the GVC was not high relative to the size of exports, suggesting that exports could decline further. This study is also meaningful in that it proposed a method for finding promising export countries by considering GVC networks with other countries as a way to increase export competitiveness. From a policy point of view, the study showed that the international trade network of the information communication and home appliance industries has important mutual relationships and that, although transferability is high, it may not easily expand into three-party relationships. In addition, it was confirmed that South Korea's export competitiveness and status were lower than its export size ranking. The paper suggested that, in order to improve the low out-degree centrality, it is necessary to increase exports to Italy or Poland, which had significantly higher in-degrees, 
and that, in order to improve out-closeness centrality, it is necessary to increase exports to countries with particularly high in-closeness; in particular, Morocco, the UAE, Argentina, Russia, and Canada deserve attention as export destinations. The study also provided practical implications for companies expecting to expand exports: such companies need to pay attention to countries whose potential for export expansion is relatively high compared to the existing export volume by country. In particular, for companies that export daily necessities, countries with large populations are highlighted, and for companies that export high-end or durable products, countries with high GDP or purchasing power but relatively low current exports are highlighted. Since the process and results of this study can easily be extended and applied to other industries, we also expect public-sector services to be developed that utilize these results.
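
The non-parametric recommendation step, flagging countries that deviate from their peers once an ERGM-derived link log-odds is added to the usual economic factors, might look roughly like the following. The feature names, synthetic data, and the choice of k-means with distance-to-centroid scoring are assumptions for illustration, not the study's actual specification.

```python
# Hedged sketch: cluster countries on economic features plus an assumed
# ERGM link log-odds, then score how far each country sits from its
# cluster centre; large distances mark candidate outliers.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "gdp":            rng.lognormal(3, 1, 50),
    "population":     rng.lognormal(2, 1, 50),
    "ergm_log_odds":  rng.normal(0, 1, 50),    # proxy for GVC export potential
    "current_export": rng.lognormal(1, 1, 50),
}, index=[f"country_{i}" for i in range(50)])

X = StandardScaler().fit_transform(df)
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)
dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)

df["outlier_score"] = dist                      # higher = deviates from peers
print(df.sort_values("outlier_score", ascending=False).head(5))
```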

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.205-225 / 2018
  • The Convolutional Neural Network (ConvNet) is one class of powerful deep neural networks that can analyze and learn hierarchies of visual features. An early predecessor, the Neocognitron, was introduced in the 1980s, but at that time neural networks were not broadly used in either industry or academia because of the shortage of large-scale datasets and low computational power. A few decades later, in 2012, Krizhevsky made a breakthrough in the ILSVRC-12 visual recognition competition using a Convolutional Neural Network, and that breakthrough revived interest in neural networks. The success of the Convolutional Neural Network rests on two main factors: the emergence of advanced hardware (GPUs) for sufficient parallel computation, and the availability of large-scale datasets such as the ImageNet (ILSVRC) dataset for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains it is difficult and laborious to gather a large-scale dataset for training a ConvNet, and even with such a dataset, training a ConvNet from scratch requires expensive resources and is time-consuming. These two obstacles can be addressed by transfer learning, a method for transferring knowledge from a source domain to a new domain. There are two major transfer learning cases: using the ConvNet as a fixed feature extractor, and fine-tuning the ConvNet on a new dataset. In the first case, a pre-trained ConvNet (for example, trained on ImageNet) is used to compute feed-forward activations of the image, and activation features are extracted from specific layers. In the second case, the ConvNet classifier is replaced and retrained on the new dataset, and the weights of the pre-trained network are fine-tuned with backpropagation. In this paper, we focus on using multiple ConvNet layers as a fixed feature extractor only. However, applying the high-dimensional features extracted directly from multiple ConvNet layers is still a challenging problem. We observe that features extracted from different ConvNet layers capture different characteristics of the image, which means a better representation could be obtained by finding the optimal combination of multiple ConvNet layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single ConvNet layer representation. Overall, our pipeline has three steps. First, images from the target task are fed forward through a pre-trained AlexNet, and the activation features of its three fully connected layers are extracted. Second, the activation features of the three layers are concatenated to obtain a multiple ConvNet layer representation, because it carries more information about the image; when the three fully connected layer features are concatenated, the resulting image representation has 9192 (4096 + 4096 + 1000) dimensions. However, features extracted from multiple layers of the same ConvNet are redundant and noisy. Thus, in a third step, we use Principal Component Analysis (PCA) to select salient features before the training phase. With salient features, the classifier can classify images more accurately and the performance of transfer learning can be improved. 
To evaluate the proposed method, experiments were conducted on three standard datasets (Caltech-256, VOC07, and SUN397) to compare the multiple ConvNet layer representation against single ConvNet layer representations, using PCA for feature selection and dimension reduction. Our experiments demonstrated the importance of feature selection for the multiple ConvNet layer representation. Moreover, our proposed approach achieved 75.6% accuracy compared to 73.9% achieved by the FC7 layer on the Caltech-256 dataset, 73.1% compared to 69.2% achieved by the FC8 layer on VOC07, and 52.2% compared to 48.7% achieved by the FC7 layer on SUN397. We also showed that our approach achieved superior performance, with accuracy improvements of 2.8%, 2.1%, and 3.1% on Caltech-256, VOC07, and SUN397, respectively, compared to existing work.
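
The fixed-feature-extractor pipeline described above, concatenating the three fully connected AlexNet activations (4096 + 4096 + 1000 = 9192 dimensions) and compressing them with PCA before a linear classifier, can be sketched as follows. Dataset loading, preprocessing, the number of PCA components, and the classifier choice are placeholders; the paper's exact training setup is not reproduced.

```python
# Hedged sketch: multi-layer AlexNet features + PCA for transfer learning.
import torch
import torchvision.models as models
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

# Pre-trained AlexNet in eval mode (torchvision >= 0.13 weights API).
alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()

@torch.no_grad()
def multi_layer_features(images):
    """Concatenate fc6, fc7 and fc8 activations for a batch of images."""
    x = alexnet.avgpool(alexnet.features(images)).flatten(1)
    fc6 = alexnet.classifier[:3](x)        # Dropout, Linear(9216->4096), ReLU
    fc7 = alexnet.classifier[3:6](fc6)     # Dropout, Linear(4096->4096), ReLU
    fc8 = alexnet.classifier[6](fc7)       # Linear(4096->1000)
    return torch.cat([fc6, fc7, fc8], dim=1)   # (batch, 9192)

# Assumed: train_images/test_images are preprocessed (224x224, normalized)
# tensors and train_y/test_y are label arrays for the target dataset.
# feats_train = multi_layer_features(train_images).numpy()
# feats_test  = multi_layer_features(test_images).numpy()
# pca = PCA(n_components=512).fit(feats_train)     # keep salient components
# clf = LinearSVC().fit(pca.transform(feats_train), train_y)
# print(clf.score(pca.transform(feats_test), test_y))
```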

Development of Yóukè Mining System with Yóukè's Travel Demand and Insight Based on Web Search Traffic Information (웹검색 트래픽 정보를 활용한 유커 인바운드 여행 수요 예측 모형 및 유커마이닝 시스템 개발)

  • Choi, Youji;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.155-175 / 2017
  • As social data have come into the spotlight, mainstream web search engines provide data that indicate how many people have searched for a specific keyword: web search traffic data. Web search traffic information aggregates the searches of everyone who looked up a specific keyword, and in various areas it can be used as a variable representing the attention of ordinary users to a specific interest. Many studies use web search traffic data to nowcast or forecast social phenomena such as epidemics, consumer patterns, product life cycles, and financial investment models, and such data have also begun to be applied to predicting inbound tourism. Proper demand prediction is needed because tourism is a high value-added industry that increases employment and foreign exchange earnings. Among these tourists, Chinese tourists (Youke) in particular are continuously growing: Youke have been the largest inbound tourist group for Korean tourism for many years, and tourism profits per Youke are large as well. Research into proper approaches for predicting Youke demand is therefore important in both the public and private sectors, since accurate tourism demand prediction supports efficient decision making under limited resources. This study suggests an improved model that reflects the latest issues of society by capturing the attention of groups of individuals. Traveling abroad is generally a high-involvement activity, so potential tourists are likely to search deeply for information about their trip, and web search traffic data present tourists' attention during trip preparation in an instantaneous and dynamic way. This study therefore attempted to select the keywords that potential Chinese tourists are likely to search for on the internet. Baidu, China's biggest web search engine with over 80% market share, provides users with access to web search traffic data. Qualitative interviews with potential tourists helped us understand information search behavior before a trip and identify the keywords for this study. The selected keywords were categorized into three levels according to how directly they are related to "Korean tourism"; this classification helps identify which keywords can explain Youke inbound demand, from closely related ones to distant ones. Web search traffic data for each keyword were gathered by a web crawler developed to collect data from the Baidu Index. Using the automatically gathered variables, a linear model was designed by multiple regression analysis, which is suitable for operational application to decision and policy making because the effects of the variables are easy to explain. After the linear regression models were built, a model composed of traditional variables was compared, in terms of significance and R-squared, with a model that adds the web search traffic variables to the traditional variables; after comparing the performance of the models, the final model was composed. The final regression model has improved explanatory power and the advantages of real-time immediacy and convenience over the traditional model. Furthermore, this study presents a system visualized intuitively for general use: the Youke Mining solution provides several tourism decision-making functions, including the embedded final regression model, and combines a data-science-based algorithm with a well-designed, simple interface. In the end, this research offers significant theoretical, practical, and political implications. 
Theoretically, the Youke Mining system and the model in this research are a first step toward Youke inbound prediction using an interactive and instant variable: web search traffic information, which represents tourists' attention while they prepare their trips. Practically, since Baidu holds more than 80% of the web search engine market, Baidu data can represent the attention of potential tourists preparing their own tours in real time. Finally, from a policy perspective, the Chinese tourist demand prediction model based on web search traffic can be used in tourism decision making for efficient management of resources and for optimizing opportunities for successful policy.
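
The model comparison described above, a baseline regression on traditional variables versus one augmented with Baidu search traffic indices and judged by R-squared, can be sketched as below. All variable names and the synthetic data are illustrative placeholders, not the study's actual predictors or observations.

```python
# Hedged sketch: compare a baseline OLS model with one that adds web search
# traffic variables, using R-squared as in the abstract's model selection.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 60  # e.g. monthly observations (synthetic)
df = pd.DataFrame({
    "exchange_rate": rng.normal(170, 10, n),     # traditional variables
    "flight_seats":  rng.normal(1e5, 1e4, n),
    "search_korea":  rng.normal(50, 15, n),      # Baidu index, Korea-trip terms
    "search_visa":   rng.normal(30, 10, n),      # Baidu index, visa-related terms
})
df["arrivals"] = (0.5 * df.flight_seats + 800 * df.search_korea
                  + rng.normal(0, 5e3, n))       # synthetic target variable

def fit(cols):
    X = sm.add_constant(df[cols])
    return sm.OLS(df["arrivals"], X).fit()

baseline = fit(["exchange_rate", "flight_seats"])
extended = fit(["exchange_rate", "flight_seats", "search_korea", "search_visa"])
print("R2 baseline:", round(baseline.rsquared, 3),
      "| R2 with search traffic:", round(extended.rsquared, 3))
```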