• Title/Summary/Keyword: Generating


Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian; Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.1-19 / 2018
  • A large amount of data is now available for the research and business sectors to extract knowledge from. This data can take the form of unstructured data such as audio, text, and images, and can be analyzed with deep learning methods. Deep learning is now widely used for various estimation, classification, and prediction problems. In particular, the fashion business adopts deep learning techniques for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model of these applications is image classification using Convolutional Neural Networks (CNN). A CNN is made up of neurons that learn parameters such as weights as inputs pass through the network toward the outputs. Its layered structure is well suited to image classification: convolutional layers generate feature maps, pooling layers reduce the dimensionality of those feature maps, and fully connected layers classify the extracted features. However, most classification models have been trained on online product images, which are taken under controlled conditions, such as images of the apparel itself or of professional models wearing it. Such images may not be effective for training a classifier that must handle street-fashion or walking images, which are taken in uncontrolled conditions and involve people's movement and unexpected poses. Therefore, we propose to train the model with a runway apparel image dataset that captures mobility. This allows the classification model to be trained with far more variable data and enhances its adaptation to diverse query images. To achieve both convergence and generalization of the model, we apply Transfer Learning to our training network. Since Transfer Learning in CNNs consists of pre-training and fine-tuning stages, we divide the training step into two. First, we pre-train our architecture on a large-scale dataset, the ImageNet dataset, which consists of 1.2 million images in 1,000 categories including animals, plants, activities, materials, instruments, scenes, and foods. We use GoogLeNet as our main architecture because it achieved high accuracy with good efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network with our own runway image dataset. Since we could not find any previously published runway image dataset, we collected one from Google Image Search, obtaining 2,426 images of 32 major fashion brands including Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of training data, and our proposed model achieves an accuracy of 67.2% on the final test. Our work offers several advantages over previous related studies: to the best of our knowledge, there have been no previous studies that trained an apparel image classification network on a runway image dataset. We suggest the idea of training the model with images capturing all possible postures, denoted as mobility, by using our own runway apparel image dataset. Moreover, by applying Transfer Learning and using the checkpoints and parameters provided by TensorFlow-Slim, we reduce the time spent training the classification model to about 6 minutes per experiment. This model can be used in many business applications where the query image can be a runway image, product image, or street-fashion image. Specifically, runway query images can support a mobile application service during fashion week to facilitate brand search, street-style query images can be classified during fashion editorial work to label the brand or style, and website query images can be processed by e-commerce services that provide item information or recommend similar items.
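As a rough illustration of the two-stage transfer-learning recipe the abstract describes (pre-train on ImageNet, then fine-tune a new classification head on the 32-brand runway data), the sketch below uses TensorFlow/Keras with InceptionV3 as a readily available Inception/GoogLeNet-family stand-in; the paper itself used TF-Slim GoogLeNet checkpoints, and the directory layout and hyperparameters here are illustrative assumptions, not the authors' settings.

```python
# Minimal transfer-learning sketch, assuming TensorFlow/Keras and InceptionV3
# as a stand-in for the paper's TF-Slim GoogLeNet checkpoint.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

NUM_BRANDS = 32  # 32 fashion brands in the runway dataset

# Stage 1 (pre-training) is replaced by loading ImageNet weights.
base = InceptionV3(weights="imagenet", include_top=False, pooling="avg",
                   input_shape=(299, 299, 3))
base.trainable = False  # freeze the convolutional feature extractor

# Stage 2 (fine-tuning): a new classification head for the 32 brand classes.
model = models.Sequential([
    base,
    layers.Dropout(0.5),
    layers.Dense(NUM_BRANDS, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Hypothetical directory of runway images arranged one folder per brand.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "runway_images/train", image_size=(299, 299), batch_size=32)
model.fit(train_ds, epochs=10)
```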

Performance Analysis of Frequent Pattern Mining with Multiple Minimum Supports (다중 최소 임계치 기반 빈발 패턴 마이닝의 성능분석)

  • Ryang, Heungmo; Yun, Unil
    • Journal of Internet Computing and Services / v.14 no.6 / pp.1-8 / 2013
  • Data mining techniques are used to find important and meaningful information in huge databases, and pattern mining is one of the most significant data mining techniques. Pattern mining is a method of discovering useful patterns in huge databases. Frequent pattern mining, one such technique, extracts patterns whose frequencies exceed a minimum support threshold; these patterns are called frequent patterns. Traditional frequent pattern mining relies on a single minimum support threshold for the whole database. This single-support model implicitly assumes that all items in the database have the same nature. In real-world applications, however, each item can have its own characteristics, and thus a pattern mining technique that reflects these characteristics is required. In the frequent pattern mining framework, where the natures of items are not considered, the single minimum support threshold must be set to a very low value in order to mine patterns containing rare items, which yields an overwhelming number of patterns containing meaningless items. In contrast, if the threshold is set too high, no such patterns can be mined at all. This dilemma is called the rare item problem. To solve it, early studies proposed approximate approaches that split the data into several groups according to item frequencies or that group related rare items. However, these methods cannot find all frequent patterns, including rare frequent patterns, because they rely on approximation. Hence, the pattern mining model with multiple minimum supports was proposed to solve the rare item problem. In this model, each item has its own minimum support threshold, called MIS (Minimum Item Support), calculated from the item's frequency in the database. By applying the MIS, the multiple minimum supports model finds all rare frequent patterns without generating meaningless patterns or losing significant patterns. Meanwhile, candidate patterns are extracted during the mining process, and in the single minimum support model only the single threshold is compared with the frequencies of these candidates; the characteristics of the items that constitute a candidate pattern are therefore not reflected, and the rare item problem persists. To address this issue, the multiple minimum supports model uses the minimum MIS value among all items in a candidate pattern as the support threshold for that pattern, thereby taking its characteristics into account. To efficiently mine frequent patterns, including rare frequent patterns, with this concept, tree-based algorithms of the multiple minimum supports model sort items in the tree in MIS-descending order, in contrast to the single minimum support model, where items are ordered in frequency-descending order. In this paper, we study the characteristics of frequent pattern mining based on multiple minimum supports and conduct a performance evaluation against a general frequent pattern mining algorithm in terms of runtime, memory usage, and scalability. Experimental results show that the multiple minimum supports based algorithm outperforms the single minimum support based one but demands more memory to store the MIS information. Moreover, both compared algorithms show good scalability.
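The core idea of the multiple minimum supports model, assigning each item its own MIS and testing a candidate pattern against the smallest MIS among its items, can be sketched as follows. This is a toy illustration rather than the paper's tree-based algorithm; the MIS formula (a fraction of item support with a lower bound LS) and the parameter values are assumptions.

```python
# Toy sketch of the multiple-minimum-supports rule, not the paper's algorithm.
from itertools import combinations

transactions = [
    {"bread", "milk"}, {"bread", "diamond"}, {"bread", "milk", "butter"},
    {"milk", "butter"}, {"bread", "butter"},
]
n = len(transactions)

def support(itemset):
    return sum(itemset <= t for t in transactions) / n

# MIS(i) = max(beta * freq(i), LS): frequent items get higher thresholds,
# rare items are protected by the lower bound LS; beta and LS are assumed.
beta, LS = 0.5, 0.1
items = {i for t in transactions for i in t}
mis = {i: max(beta * support({i}), LS) for i in items}

def is_frequent(pattern):
    # compare the pattern's support against the minimum MIS of its items
    return support(set(pattern)) >= min(mis[i] for i in pattern)

# naive enumeration of 2-itemsets just to demonstrate the check
for pair in combinations(sorted(items), 2):
    if is_frequent(pair):
        print(pair, round(support(set(pair)), 2))
```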

The Relationship Between DEA Model-based Eco-Efficiency and Economic Performance (DEA 모형 기반의 에코효율성과 경제적 성과의 연관성)

  • Kim, Myoung-Jong
    • Journal of Environmental Policy / v.13 no.4 / pp.3-49 / 2014
  • The growing interest of stakeholders in corporate responsibility for the environment and tightening environmental regulations are highlighting the importance of environmental management more than ever. However, companies' awareness of the importance of the environment still lags behind, and related academic work has not reached consistent conclusions on the relationship between environmental performance and economic performance. One of the reasons is the different ways of measuring the two performances. The evaluation scope of economic performance is relatively narrow and it can be measured in a single unified unit such as price, while the scope of environmental performance is diverse and a wide range of units is used instead of a single unified unit. Therefore, results can differ depending on the performance indicators selected. To resolve this problem, generalized and standardized performance indicators should be developed. In particular, such indicators should cover the concepts of both environmental and economic performance, because the recent idea of environmental management has expanded to encompass the concept of sustainability. Another reason is that most current research tends to focus on the motives for environmental investment and on environmental performance, and does not offer a guideline for an effective implementation strategy for environmental management. For example, a process improvement strategy or a market differentiation strategy can be deployed by comparing environmental competitiveness among companies in the same or similar industries, so that a virtuous cyclical relationship between environmental and economic performance can be secured. This report proposes a novel method for measuring eco-efficiency using Data Envelopment Analysis (DEA), which can combine multiple environmental and economic performance measures. Based on the eco-efficiencies, environmental competitiveness is analyzed and the optimal combinations of inputs and outputs are recommended for improving the eco-efficiencies of inefficient firms. Furthermore, panel analysis is applied to the causal relationship between eco-efficiency and economic performance, and a pooled regression model is used to investigate this relationship. The four-year eco-efficiencies of 23 companies between 2010 and 2013 are obtained from the DEA analysis; the efficiencies of the 23 companies are compared in terms of technical efficiency (TE), pure technical efficiency (PTE), and scale efficiency (SE), and a set of recommendations on the optimal combination of inputs and outputs is suggested for the inefficient companies. The results of the panel analysis demonstrate causality from eco-efficiency to economic performance, and the results of the pooled regression show that eco-efficiency positively affects the financial performance (ROA and ROS) of the companies as well as firm value (Tobin's Q, stock price, and stock returns). This report proposes a novel approach for generating standardized performance indicators from multiple environmental and economic performance measures, enhancing the generality of related research and providing insight into the sustainability of environmental management. Furthermore, using the efficiency indicators obtained from the DEA model, the causes of change in eco-efficiency can be investigated and an effective strategy for environmental management can be suggested. Finally, this report can motivate environmental management by providing empirical evidence that environmental investment can improve economic performance.
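For readers unfamiliar with DEA, the sketch below shows how an efficiency score of the kind used above can be computed as an input-oriented CCR linear program, one LP per decision-making unit (DMU). The toy data, the SciPy solver, and the CCR variant are illustrative assumptions; the report itself also reports BCC-based pure technical and scale efficiencies.

```python
# Input-oriented CCR DEA sketch: minimize theta subject to the efficient
# frontier dominating DMU k's scaled inputs and matching its outputs.
import numpy as np
from scipy.optimize import linprog

# rows = DMUs (firms); columns = inputs / outputs (toy numbers, not real data)
X = np.array([[20.0, 300], [30.0, 200], [40.0, 100], [20.0, 200]])  # inputs
Y = np.array([[100.0], [80.0], [70.0], [60.0]])                     # outputs

def ccr_efficiency(k):
    n, m = X.shape[0], X.shape[1]   # number of DMUs, number of inputs
    s = Y.shape[1]                  # number of outputs
    c = np.zeros(n + 1)             # decision variables: [theta, lambda_1..n]
    c[0] = 1.0                      # minimize theta
    A_ub, b_ub = [], []
    for i in range(m):              # sum_j lambda_j * x_ij <= theta * x_ik
        A_ub.append(np.r_[-X[k, i], X[:, i]])
        b_ub.append(0.0)
    for r in range(s):              # sum_j lambda_j * y_rj >= y_rk
        A_ub.append(np.r_[0.0, -Y[:, r]])
        b_ub.append(-Y[k, r])
    bounds = [(None, None)] + [(0, None)] * n   # theta free, lambdas >= 0
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=bounds, method="highs")
    return res.x[0]

for k in range(X.shape[0]):
    print(f"DMU {k}: efficiency = {ccr_efficiency(k):.3f}")
```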


Performance analysis of Frequent Itemset Mining Technique based on Transaction Weight Constraints (트랜잭션 가중치 기반의 빈발 아이템셋 마이닝 기법의 성능분석)

  • Yun, Unil; Pyun, Gwangbum
    • Journal of Internet Computing and Services / v.16 no.1 / pp.67-74 / 2015
  • In recent years, frequent itemset mining that considers the importance of each item has been studied intensively as one of the important issues in the data mining field. According to the strategy used to exploit item importance, such approaches are classified as weighted frequent itemset mining, frequent itemset mining using transaction weights, and utility itemset mining. In this paper, we perform an empirical analysis of frequent itemset mining algorithms based on transaction weights. These algorithms compute transaction weights from the weight of each item in large databases and discover weighted frequent itemsets on the basis of item frequency and the weight of each transaction. Consequently, the importance of a given transaction can be seen from the database analysis, because a transaction receives a higher weight when it contains many items with high weights. We not only analyze the advantages and disadvantages but also compare the performance of the best-known algorithms in the field of frequent itemset mining based on transaction weights. As a representative of frequent itemset mining using transaction weights, WIS introduced the concept and strategies of transaction weights. In addition, there are several other state-of-the-art algorithms, WIT-FWIs, WIT-FWIs-MODIFY, and WIT-FWIs-DIFF, for extracting itemsets with weight information. To mine weighted frequent itemsets efficiently, these three algorithms use a special lattice-like data structure called the WIT-tree. The algorithms do not need an additional database scan after the WIT-tree has been constructed, since each node of the WIT-tree stores item information such as the item and its transaction IDs. In particular, whereas traditional algorithms perform many database scans to mine weighted itemsets, the WIT-tree-based algorithms avoid this overhead by reading the database only once. Additionally, the algorithms generate each new itemset of length N+1 from two different itemsets of length N. To discover new weighted itemsets, WIT-FWIs performs the itemset combination process by using the information of the transactions that contain both itemsets. WIT-FWIs-MODIFY reduces the number of operations needed to calculate the frequency of the new itemset, and WIT-FWIs-DIFF uses a technique based on the difference of two itemsets. To compare and analyze the performance of the algorithms in various environments, we use real datasets of two types (dense and sparse) and measure runtime and maximum memory usage. Moreover, a scalability test is conducted to evaluate the stability of each algorithm as the database size changes. As a result, WIT-FWIs and WIT-FWIs-MODIFY show the best performance on the dense dataset, while on the sparse dataset WIT-FWIs-DIFF mines more efficiently than the other algorithms. Compared with the WIT-tree-based algorithms, WIS, which is based on the Apriori technique, has the worst efficiency because it requires far more computations than the others on average.
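A minimal sketch of the transaction-weight idea behind these algorithms is given below: each transaction's weight is taken here as the mean of its item weights, tidsets reduce support counting to set intersection, and an (N+1)-itemset is formed from two N-itemsets by intersecting their tidsets, in the spirit of the WIT-tree. The data, weights, and threshold are illustrative assumptions, not the published algorithms.

```python
# Toy sketch of transaction-weighted frequent itemset mining with tidsets.
item_weight = {"a": 0.9, "b": 0.7, "c": 0.4, "d": 0.2}   # assumed item weights
transactions = [{"a", "b"}, {"a", "b", "c"}, {"b", "c", "d"}, {"a", "d"}]

# each transaction's weight: mean of the weights of the items it contains
t_weight = [sum(item_weight[i] for i in t) / len(t) for t in transactions]
total_weight = sum(t_weight)

# tidset: which transactions contain each single item
tidset = {i: {tid for tid, t in enumerate(transactions) if i in t}
          for i in item_weight}

def weighted_support(tids):
    return sum(t_weight[tid] for tid in tids) / total_weight

def join(itemset1, tids1, itemset2, tids2):
    """Combine two N-itemsets into an (N+1)-itemset by tidset intersection."""
    return itemset1 | itemset2, tids1 & tids2

min_wsup = 0.3                                             # assumed threshold
# weighted-frequent 1-itemsets, then one join step as a demonstration
f1 = {i: t for i, t in tidset.items() if weighted_support(t) >= min_wsup}
items = sorted(f1)
for x in range(len(items)):
    for y in range(x + 1, len(items)):
        pair, tids = join({items[x]}, f1[items[x]], {items[y]}, f1[items[y]])
        if weighted_support(tids) >= min_wsup:
            print(sorted(pair), round(weighted_support(tids), 2))
```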

A Study of Current Perception Threshold of Trigeminal Nerve after Tooth Implantation (치아임플란트 시술 후 삼차신경에서의 전류인지역치에 대한 연구)

  • Lim, Hyun-Dae; Lee, Jung-Hyun; Lee, You-Mee
    • Journal of Oral Medicine and Pain / v.32 no.2 / pp.187-200 / 2007
  • This study attempted to contribute to the clinical application of implant surgery by making a quantitative nerve examination with a neurometer for the evaluation of sensory disturbances that can occur after implantation in dental clinics, and it intended to establish an objective guideline for evaluating the sensory nerves after implant surgery. Examinations were performed at frequencies of 2000 Hz, 250 Hz, and 5 Hz before and after tooth implant operations using the Neurometer® CPT/C (Neurotron, Inc., Baltimore, Maryland, USA) on 44 patients who underwent implant surgery at Daejeon Sun Dental Hospital in 2006 and on a control group of 30 people. The measurement sites were the maxillary and mandibular nerve endings of the trigeminal nerve, according to the implant site. The current perception threshold (CPT) of each nerve fiber responds selectively to electric stimulation: the Aβ fiber at 2000 Hz, the Aδ fiber at 250 Hz, and the C fiber at 5 Hz. The CPT test can be performed to assess damage to peripheral nerves in the trigeminal nerve area; it stimulates selected nerve fibers by generating electricity of a specific frequency in the peripheral nerve area. Nerve fibers of different thicknesses respond selectively to electric stimulation at different frequencies, so by applying stimulation at different frequencies, the reaction thresholds of the Aβ, Aδ, and C fibers can be evaluated individually. Because both increases and decreases of the CPT can be measured, sensory disturbances such as hyperaesthesia or hypoaesthesia can be diagnosed. The assessment of the CPT before and after implant surgery yielded the following results. 1. In the assessment before and after the operation, the CPT at 2000 Hz, 250 Hz, and 5 Hz for the maxillary branch increased overall after the operation, and the CPT for the mandibular branch in the Aβ fiber (2000 Hz) and C fiber (5 Hz) increased with statistical significance. 2. For the group of patients who were medically compromised or taking related medication, there were no significant differences before and after the operation; for the control group, a significantly higher CPT was found after the operation in the left Aβ fiber (2000 Hz) and C fiber (5 Hz). 3. In the comparison of pre-operative CPT values between the control group and the implant group, the implant group had significantly higher values in the right Aβ fiber (2000 Hz) and C fiber (5 Hz), and there were significant differences in the Aβ fiber (2000 Hz) in the post-operative CPT assessment relative to the control group. 4. Male participants had higher CPT than female participants, but the differences were not statistically significant. In the CPT evaluation before and after the operation, there were no statistical differences in the male group, while the right C fiber (5 Hz) and left Aβ fiber (2000 Hz) were significantly higher in the female group. 5. In the comparison between the group who complained of sensory disturbance and the other group, the CPT increased overall in the former group, but without statistical significance. In the group in which the VAS increased, the CPT after the operation in the right C fiber (5 Hz) increased significantly; meanwhile, when the VAS score was 0 before and after the operation, the CPT after the operation in the left Aβ fiber (2000 Hz) increased significantly. This study suggests that CPT measurements using the Neurometer® CPT/C provide useful objective and quantitative information on sensory disturbances related to tooth implantation.

Ensemble of Nested Dichotomies for Activity Recognition Using Accelerometer Data on Smartphone (Ensemble of Nested Dichotomies 기법을 이용한 스마트폰 가속도 센서 데이터 기반의 동작 인지)

  • Ha, Eu Tteum; Kim, Jeongmin; Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.19 no.4 / pp.123-132 / 2013
  • As smartphones are equipped with various sensors such as the accelerometer, GPS, gravity sensor, gyroscope, ambient light sensor, and proximity sensor, there has been much research on using these sensors to create valuable applications. Human activity recognition is one such application, motivated by various welfare uses such as support for the elderly, measurement of calorie consumption, analysis of lifestyles, and analysis of exercise patterns. One of the challenges in using smartphone sensors for activity recognition is that the number of sensors used should be minimized to save battery power. When the number of sensors is restricted, it is difficult to build a highly accurate activity recognizer, because subtly different activities are hard to distinguish from limited information. The difficulty becomes especially severe when the number of activity classes to be distinguished is large. In this paper, we show that a fairly accurate classifier can be built to distinguish ten different activities using data from only a single sensor, the smartphone accelerometer. Our approach to this ten-class problem is the ensemble of nested dichotomies (END) method, which transforms a multi-class problem into multiple two-class problems. END builds a committee of binary classifiers in a nested fashion using a binary tree. At the root of the tree, the set of all classes is split into two subsets by a binary classifier. At each child node, a subset of classes is again split into two smaller subsets by another binary classifier. Continuing in this way, we obtain a binary tree in which each leaf node contains a single class; this tree can be viewed as a nested dichotomy that makes multi-class predictions. Depending on how the classes are split at each node, the resulting tree can differ, and because some classes may be correlated, a particular tree may perform better than others. However, the best tree can hardly be identified without deep domain knowledge. The END method copes with this problem by building multiple dichotomy trees randomly during learning and combining the predictions of the trees during classification. END is generally known to perform well even when the base learner cannot model complex decision boundaries. As the base classifier at each node of a dichotomy, we use another ensemble classifier, the random forest. A random forest is built by repeatedly generating decision trees, each time from a bootstrap sample and with a different random subset of features. By combining bagging with random feature-subset selection, a random forest has more diverse ensemble members than simple bagging. Overall, our ensemble of nested dichotomies can be seen as a committee of committees of decision trees that handles a multi-class problem with high accuracy. The ten activity classes distinguished in this paper are 'Sitting', 'Standing', 'Walking', 'Running', 'Walking Uphill', 'Walking Downhill', 'Running Uphill', 'Running Downhill', 'Falling', and 'Hobbling'. The features used for classification include not only the magnitude of the acceleration vector at each time point but also the maximum, minimum, and standard deviation of the vector magnitude within a time window covering the last 2 seconds. For experiments comparing END with other methods, accelerometer data were collected every 0.1 second for 2 minutes per activity from 5 volunteers. Of the 5,900 (= 5 × (60 × 2 - 2) / 0.1) samples collected for each activity (the data for the first 2 seconds are discarded because they lack time-window data), 4,700 were used for training and the rest for testing. Although 'Walking Uphill' is often confused with similar activities, END classified all ten activities with a high accuracy of 98.4%. In comparison, the accuracies achieved by a decision tree, a k-nearest neighbor classifier, and a one-versus-rest support vector machine were 97.6%, 96.5%, and 97.6%, respectively.
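The nested-dichotomy construction can be sketched compactly as below: a single randomly generated dichotomy tree with random-forest base classifiers at each node. An END would build several such trees with different random splits and combine their predictions by voting. The toy data, class labels, and hyperparameters are illustrative assumptions rather than the paper's experimental setup.

```python
# One randomly drawn nested-dichotomy tree with random-forest base learners.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

class NestedDichotomy:
    def __init__(self, rng):
        self.rng = rng

    def fit(self, X, y, classes=None):
        self.classes_ = np.array(sorted(set(y))) if classes is None else np.asarray(classes)
        if len(self.classes_) == 1:
            return self
        # randomly split the remaining classes into two non-empty groups
        perm = self.rng.permutation(self.classes_)
        cut = self.rng.integers(1, len(self.classes_))
        self.left_classes_, self.right_classes_ = perm[:cut], perm[cut:]
        mask = np.isin(y, self.classes_)
        Xn, yn = X[mask], y[mask]
        side = np.isin(yn, self.left_classes_).astype(int)      # 1 = left group
        self.clf_ = RandomForestClassifier(n_estimators=100).fit(Xn, side)
        self.left_ = NestedDichotomy(self.rng).fit(Xn, yn, self.left_classes_)
        self.right_ = NestedDichotomy(self.rng).fit(Xn, yn, self.right_classes_)
        return self

    def predict(self, X):
        if len(self.classes_) == 1:
            return np.full(len(X), self.classes_[0])
        out = np.empty(len(X), dtype=self.classes_.dtype)
        side = self.clf_.predict(X)                              # route each sample
        if (side == 1).any():
            out[side == 1] = self.left_.predict(X[side == 1])
        if (side == 0).any():
            out[side == 0] = self.right_.predict(X[side == 0])
        return out

# toy usage: 3 activity classes with 2-D window features
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2)) + np.repeat([[0, 0], [3, 0], [0, 3]], 100, axis=0)
y = np.repeat(np.array(["Sitting", "Walking", "Running"]), 100)
nd = NestedDichotomy(rng).fit(X, y)
print((nd.predict(X) == y).mean())
```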

A study on the strategies to lower technologist occupational exposure according to the performance form in PET scan procedure (PET 검사실 종사자의 업무 행위 별 방사선피폭 조사에 따른 피폭선량 저감화를 위한 연구)

  • Ko, Hyun Soo; Kim, Ho Sung; Nam-Kung, Chang Kyeoung; Yoon, Soon Sang; Song, Jae Hyuk; Ryu, Jae Kwang; Jung, Woo Young; Chang, Jung Chan
    • The Korean Journal of Nuclear Medicine Technology / v.19 no.1 / pp.17-29 / 2015
  • Purpose: For nuclear medicine technologists, it is difficult to keep away from radiation sources, compared with workers who use radiation-generating devices, and their work practices are considered optimized once they become familiar with them. The aims of this study are to measure the radiation exposure of technologists working in PET and to evaluate the occupational radiation dose after implementing strategies to lower exposure. Materials and Methods: We divided the PET scan procedure into four types of work: QC, injection, scan, and others. For PET QC, we compared the radiation exposure when controlling the table directly, next to the ⁶⁸Ge cylinder phantom, with that when controlling the table remotely from the console room. For injection, we compared the radiation exposure when guiding the patient in the waiting room before injection with that when guiding the patient after injection. For the scan procedure, we compared the radiation exposure when moving the table using the control button located next to the patient with that when moving the table using the control button located farther away. A personal electronic dosimeter (PED, Tracerco™) was used to measure the exposed radiation doses. Results: During QC, the average exposed doses were 0.27 ± 0.04 μSv when controlling the table directly and 0.13 ± 0.14 μSv when controlling it remotely. During injection guidance, they were 0.97 ± 0.36 μSv when guiding the patient after injection and 0.62 ± 0.17 μSv when guiding the patient before injection. During image acquisition, they were 1.33 ± 0.54 μSv when using the control button next to the patient and 0.94 ± 0.50 μSv when using the control button at a distance. These differences were statistically significant (P < 0.05). Conclusion: This study shows how much radiation technologists are exposed to, on average, at each step of the PET procedure, and how the occupational radiation dose can be reduced after implementing strategies to lower exposure. If further methods of reducing technologists' occupational exposure are sought, radiation doses in the department of nuclear medicine can be minimized and optimized.


Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil; Ko, Eunjung; Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.141-166 / 2019
  • Recently, channels such as social media and SNS have been creating enormous amounts of data, and the portion of unstructured data represented as text has grown geometrically. Because it is difficult to examine all of this text, it is important to access it rapidly and grasp its key points. Given this need for efficient understanding, many studies on text summarization for handling and using tremendous amounts of text data have been proposed. In particular, many summarization methods using machine learning and artificial intelligence algorithms have recently been proposed to generate summaries objectively and effectively, an approach called "automatic summarization". However, most text summarization methods proposed to date construct summaries based on the frequency of content in the original documents. Such summaries tend to omit low-weight subjects that are mentioned less often in the original text. If a summary includes only the major subjects, bias occurs and information is lost, so it becomes hard to ascertain all the subjects the documents contain. To avoid this bias, one can summarize with attention to the balance between the topics a document contains so that every subject is represented, but an unbalanced distribution among the subjects may still remain. To retain the balance of subjects in a summary, it is necessary to consider the proportion of each subject in the original documents and to allocate portions of the summary to the subjects evenly, so that even sentences about minor subjects are sufficiently included. In this study, we propose a "subject-balanced" text summarization method that secures balance among all subjects and minimizes the omission of low-frequency subjects. For subject-balanced summaries, we use two summary evaluation criteria, "completeness" and "succinctness": completeness means the summary should fully cover the content of the original documents, and succinctness means the summary should contain minimal duplication. The proposed method has three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From these weights, highly related terms for every topic can be identified, and the subjects of the documents can be found from topics composed of terms with similar meanings. A few terms that represent each subject well are then selected; these are called "seed terms". However, the seed terms alone are too few to describe each subject, so enough terms similar to the seed terms must be added to build a well-constructed subject dictionary. Word2Vec is used for this word expansion: word vectors are created by Word2Vec modeling, and the similarity between all terms is derived from these vectors using cosine similarity. The higher the cosine similarity between two terms, the more strongly they are considered related. Terms with high similarity to the seed terms of each subject are selected, and after filtering these expanded terms, the subject dictionary is finally constructed. The second phase allocates a subject to every sentence in the original documents. To grasp the content of each sentence, frequency analysis is first conducted with the terms in the subject dictionaries. TF-IDF weights for each subject are then calculated, making it possible to determine how much each sentence says about each subject. However, TF-IDF weights can grow without bound, so the subject weights of each sentence are normalized to values between 0 and 1. Each sentence is then assigned to the subject with the maximum normalized weight, and sentence groups are finally constructed for each subject. The last phase generates the summary. Sen2Vec is used to compute the similarity between the sentences of each subject, forming a similarity matrix, and sentences are selected iteratively to generate a summary that fully covers the content of the original documents while minimizing duplication within itself. For the evaluation of the proposed method, 50,000 TripAdvisor reviews are used to construct the subject dictionaries and 23,087 reviews are used to generate summaries. A comparison between the summaries of the proposed method and frequency-based summaries verifies that the proposed method better preserves the balance of the subjects that the documents originally contain.
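A condensed sketch of the dictionary-expansion and sentence-allocation steps might look like the following, assuming gensim's Word2Vec and a toy review corpus. The seed terms, the corpus, and the simple dictionary-term frequency used here as a stand-in for the normalized TF-IDF weights are all illustrative assumptions.

```python
# Toy sketch: expand seed terms with Word2Vec, then assign each sentence to
# the subject whose expanded dictionary it mentions most.
from gensim.models import Word2Vec

corpus = [
    "the room was clean and the bed was comfortable".split(),
    "breakfast was tasty and the restaurant food was great".split(),
    "the staff were friendly and check in service was fast".split(),
    "the bed and room were spacious and clean".split(),
    "the food at the restaurant and the breakfast were delicious".split(),
    "friendly staff and quick service at check in".split(),
]
model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1,
                 sg=1, epochs=200, seed=1)

# seed terms per subject, expanded with their most similar terms (cosine)
seeds = {"room": ["room", "bed"], "food": ["food", "breakfast"],
         "service": ["staff", "service"]}
dictionary = {subj: set(terms) for subj, terms in seeds.items()}
for subj, terms in seeds.items():
    for t in terms:
        dictionary[subj].update(w for w, _ in model.wv.most_similar(t, topn=3))

# allocate each sentence to the subject whose dictionary terms it mentions most
# (a simple stand-in for the normalized TF-IDF weights described above)
for tokens in corpus:
    scores = {subj: sum(tok in d for tok in tokens) / len(tokens)
              for subj, d in dictionary.items()}
    print(max(scores, key=scores.get), " ".join(tokens))
```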

Converting Ieodo Ocean Research Station Wind Speed Observations to Reference Height Data for Real-Time Operational Use (이어도 해양과학기지 풍속 자료의 실시간 운용을 위한 기준 고도 변환 과정)

  • Byun, Do-Seong; Kim, Hyowon; Lee, Jooyoung; Lee, Eunil; Park, Kyung-Ae; Woo, Hye-Jin
    • The Sea: Journal of the Korean Society of Oceanography / v.23 no.4 / pp.153-178 / 2018
  • Most operational uses of wind speed data require measurements at, or estimates generated for, the reference height of 10 m above mean sea level (AMSL). On the Ieodo Ocean Research Station (IORS), wind speed is measured by instruments installed on the lighthouse tower of the roof deck at 42.3 m AMSL. This preliminary study indicates how these data can best be converted into synthetic 10 m wind speed data for operational use via the Korea Hydrographic and Oceanographic Agency (KHOA) website. We tested three well-known conventional empirical neutral wind profile formulas (a power law (PL); a drag coefficient based logarithmic law (DCLL); and a roughness height based logarithmic law (RHLL)), and compared their results to those generated using a well-known, highly tested and validated logarithmic model (LMS) with a stability function (ψᵥ), to assess the potential use of each method for accurately synthesizing reference level wind speeds. From these experiments, we conclude that the reliable LMS technique and the RHLL technique are both useful for generating reference wind speed data from IORS observations, since these methods produced very similar results: comparisons between the RHLL and LMS results showed relatively small bias values (-0.001 m s⁻¹) and Root Mean Square Deviations (RMSD, 0.122 m s⁻¹). We also compared the synthetic wind speed data generated using each of the four neutral wind profile formulas under examination with Advanced SCATterometer (ASCAT) data. These comparisons revealed that the LMS without ψᵥ produced the best results, with a bias of only 0.191 m s⁻¹ and an RMSD of 1.111 m s⁻¹. As well as comparing these four approaches, we also explored potential refinements that could be applied within or through each approach. First, we tested the effect of tidal variations in sea level height on wind speed calculations by comparing results generated with and without the adjustment of sea level heights for tidal effects. Tidal adjustment of the sea levels used in reference wind speed calculations resulted in remarkably small bias (< 0.0001 m s⁻¹) and RMSD (< 0.012 m s⁻¹) values when compared to calculations performed without adjustment, indicating that this tidal effect can be ignored for the purposes of IORS reference wind speed estimates. We also estimated surface roughness heights (z₀) based on the RHLL and LMS calculations in order to explore the best parameterization of this factor, with results leading to our recommendation of a new z₀ parameterization derived from observed wind speed data. Lastly, we suggest the necessity of including a suitable, experimentally derived surface drag coefficient and z₀ formulas within conventional wind profile formulas for situations characterized by strong wind (≥ 33 m s⁻¹) conditions, since without this inclusion the wind adjustment approaches used in this study are only optimal for wind speeds ≤ 25 m s⁻¹.
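For reference, the two simplest neutral-profile conversions discussed above (the power law and the roughness-height logarithmic law) reduce to one-line formulas; the sketch below applies them to a hypothetical 42.3 m observation, with the exponent and roughness length set to commonly used open-sea values rather than the parameterization recommended in the paper.

```python
# Neutral-stability height adjustment from the anemometer height to 10 m.
import math

def power_law(u_z, z, z_ref=10.0, alpha=0.11):
    """Power law: U(z_ref) = U(z) * (z_ref / z) ** alpha (alpha assumed)."""
    return u_z * (z_ref / z) ** alpha

def log_law(u_z, z, z_ref=10.0, z0=2e-4):
    """Roughness-height log law: U(z_ref) = U(z) * ln(z_ref/z0) / ln(z/z0)."""
    return u_z * math.log(z_ref / z0) / math.log(z / z0)

u_obs, z_obs = 12.0, 42.3       # hypothetical wind speed (m/s) at 42.3 m AMSL
print(power_law(u_obs, z_obs))  # ~10.2 m/s at 10 m
print(log_law(u_obs, z_obs))    # ~10.6 m/s at 10 m
```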

Virtuous Concordance of Yin and Yang and Tai-Ji in Joseon art: Focusing on Daesoon Thought (조선 미술에 내재한 음양합덕과 태극 - 대순사상을 중심으로 -)

  • Hwang, Eui-pil
    • Journal of the Daesoon Academy of Sciences / v.35 / pp.217-253 / 2020
  • This study analyzes the principles of the 'Earthly Paradise' (仙境, the realm of immortals), the 'Virtuous Concordance of Yin and Yang' (陰陽合德), and the 'Reordering Works of Heaven and Earth' (天地公事) while combining them with Joseon art. In doing so, it aims to discover the context in which the concept of Tai-Ji in 'Daesoon Truth' deeply penetrates Joseon art, revealing how 'Daesoon Thought' is embedded in the lives and customs of the Korean people. The study also reviews the sentiments and intellectual traditions of the Korean people on the basis of 'Daesoon Thought' and creative works, and brings all of this to the forefront of academics and art at the cosmological level. The purpose of this research is to vividly reveal the core of 'Daesoon Thought' as a visual image; through this, the combination of 'Daesoon Thought' and Joseon art secures both data and reality at the same time. As part of this, the study treats the world of 'Daesoon Thought' as a cosmological Tai-Ji principle, which is revealed in Joseon art and analyzed and examined from the viewpoint of art philosophy. First, as a way to make use of 'Daesoon Thought', 'Daesoon Truth' was developed and applied directly to Joseon art; in this way, reflections on Korean life within 'Daesoon Thought' can be revealed. In this regard, the selection of Joseon art used in this study highlights creative works that were deeply ingrained in people's lives. For example, as 'Daesoon Thought' appears to center on the genre painting, folk painting, and landscape painting of the Joseon Dynasty, attention is given to verifying these cases. The study analyzes 'Daesoon Thought', as it borrows from Joseon art, from the perspective of art philosophy, and accordingly attempts to find examples of the 'Virtuous Concordance of Yin and Yang' and Tai-Ji in Joseon art, which became a basis by which 'Daesoon Thought' was communicated to people. In addition, appreciating 'Daesoon Thought' in Joseon art is an opportunity to examine vividly not only the Joseon art style but also the life, consciousness, and mental world of the Korean people. Chapter 2 presents findings related to the formation of 'Daesoon Thought'. Chapter 3 finds support for the structures of the ideas of the 'Earthly Paradise' and the 'Virtuous Concordance of Yin and Yang', and locates the 'Reordering Works of Heaven and Earth' and Tai-Ji in depictions of metaphysical laws; to this end, the laws of the 'Reordering Works of Heaven and Earth' and the structure of Tai-Ji were combined. Chapter 4 analyzes 'Daesoon Thought' in the life and work of the Korean people at the level of the convergence of 'Daesoon Thought' and Joseon art. The analysis of works provides a glimpse into the identity of 'Daesoon Thought' as observable in Joseon art, which is useful for generating empirical data. For example, works such as Tai-Jido, Ssanggeum Daemu, Jusachaebujeokdo, Hwajogi Myeonghwabundo, and Gyeongdodo are objects that inspired descriptions of the 'Earthly Paradise', the 'Virtuous Concordance of Yin and Yang', and the 'Reordering Works of Heaven and Earth'. As a result, Tai-Ji, as it appears in 'Daesoon Thought', demonstrated the status of the people in Joseon art. Taken together, the Tai-Ji idea pursued by Daesoon Thought is a providence that follows change as all things are mutually created. In other words, the Tai-Ji idea sits profoundly in the lives of the Korean people and responds mutually to the providence that converges with 'Mutual Beneficence'.