• Title/Summary/Keyword: Non-extraction


Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong; Kim, Namgyu
    • Journal of Intelligence and Information Systems, v.27 no.3, pp.175-197, 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis has been actively conducted and has shown remarkable results in various fields such as classification, summarization, and generation. Among the various text analysis tasks, text classification is the most widely used in academia and industry. Text classification includes binary classification, which assigns one label from two classes; multi-class classification, which assigns one label from several classes; and multi-label classification, which assigns multiple labels from several classes. Multi-label classification in particular requires a training method different from binary and multi-class classification because each instance can carry multiple labels. In addition, since the number of labels to be predicted grows as the number of labels and classes increases, performance improvement becomes difficult owing to the increased prediction difficulty. To overcome these limitations, research on label embedding has been actively conducted, in which (i) the initially given high-dimensional label space is compressed into a low-dimensional latent label space, (ii) a model is trained to predict the compressed label, and (iii) the predicted label is restored to the original high-dimensional label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, because these techniques consider only linear relationships between labels or compress the labels by random transformation, they cannot capture non-linear relationships between labels, and thus cannot create a latent label space that sufficiently preserves the information of the original label space. Recently, there have been increasing attempts to improve performance by applying deep learning to label embedding; label embedding using an autoencoder, a deep learning model effective for data compression and restoration, is representative. However, traditional autoencoder-based label embedding suffers a large amount of information loss when compressing a high-dimensional label space with a myriad of classes into a low-dimensional latent label space. This is related to the vanishing gradient problem that occurs during backpropagation. The skip connection was devised to solve this problem: by adding the input of a layer to its output, it prevents gradients from vanishing during backpropagation and enables efficient learning even in deep networks. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies applying them to autoencoders or to the label embedding process are still lacking. Therefore, in this study we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. The proposed methodology was applied to actual paper keywords to derive a high-dimensional keyword label space and a low-dimensional latent label space. Using these, we conducted an experiment in which the compressed keyword vector in the latent label space was predicted from the paper abstract and the predicted keyword vector was restored to the original label space to evaluate multi-label classification. As a result, the accuracy, precision, recall, and F1 score used as performance indicators were far superior for multi-label classification based on the proposed methodology compared with traditional multi-label classification methods. This indicates that the low-dimensional latent label space derived through the proposed methodology reflected the information of the high-dimensional label space well, which ultimately improved the performance of multi-label classification itself. In addition, the utility of the proposed methodology was examined by comparing its performance across domain characteristics and across different numbers of latent label space dimensions.
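
A minimal sketch of the core idea, assuming PyTorch: an autoencoder that compresses a multi-hot keyword label vector into a low-dimensional latent label space, with an additive skip connection in both the encoder and the decoder. The layer sizes, latent dimension, and reconstruction loss below are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class SkipBlock(nn.Module):
    """Fully connected block whose input is added to its output (skip connection)."""
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        # Adding the input to the output keeps gradients flowing in deep stacks.
        return torch.relu(x + self.fc(x))

class LabelAutoencoder(nn.Module):
    def __init__(self, n_labels, latent_dim=64, hidden_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_labels, hidden_dim), nn.ReLU(),
            SkipBlock(hidden_dim),                       # skip connection in the encoder
            nn.Linear(hidden_dim, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            SkipBlock(hidden_dim),                       # skip connection in the decoder
            nn.Linear(hidden_dim, n_labels))

    def forward(self, y):
        z = self.encoder(y)            # low-dimensional latent label vector
        return self.decoder(z), z      # reconstruction logits and latent code

# One training step on a dummy batch of multi-hot label vectors.
model = LabelAutoencoder(n_labels=1000, latent_dim=64)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
y = (torch.rand(32, 1000) < 0.01).float()
opt.zero_grad()
recon, z = model(y)
loss_fn(recon, y).backward()
opt.step()
```

In the full pipeline described above, a separate text classifier would be trained to predict the latent code z from the paper abstract, and the trained decoder would restore the predicted code to the original keyword label space.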

An Empirical Study on the Improvement of In Situ Soil Remediation Using Plasma Blasting, Pneumatic Fracturing and Vacuum Suction (플라즈마 블라스팅, 공압파쇄, 진공추출이 활용된 지중 토양정화공법의 정화 개선 효과에 대한 실증연구)

  • Jae-Yong Song; Geun-Chun Lee; Cha-Won Kang; Eun-Sup Kim; Hyun-Shic Jang; Bo-An Jang; Yu-Chul Park
    • The Journal of Engineering Geology, v.33 no.1, pp.85-103, 2023
  • The in situ remediation of a solidified stratum in contaminated soil containing a large amount of fine-textured material such as clay or organic matter faces limitations such as increased remediation cost resulting from decreased purification efficiency. Even when soil conditions are favorable, remediation generally takes a long time to complete because of non-uniform soil properties and low permeability. This study assessed the remediation effect and evaluated the field applicability of a methodology that combines pneumatic fracturing, vacuum extraction, and plasma blasting (the PPV method) to overcome the limitations of existing in situ remediation methods. For comparison, underground remediation was performed over 80 days using the experimental PPV method and chemical oxidation (the control method). The control group showed no decrease in the degree of contamination because of poor delivery of the soil remediation agent, whereas the PPV method clearly reduced the degree of contamination during the remediation period. The remediation effect, assessed by the reduction of the highest TPH (Total Petroleum Hydrocarbons) concentration by distance from the injection well, was unclear in the control group, whereas the PPV method showed a remediation effect of 62.6% within a 1 m radius of the injection well, 90.1% within 1.1~2.0 m, and 92.1% within 2.1~3.0 m. When remediation efficiency was evaluated by the average rate of TPH concentration reduction by distance from the injection well, no clear trend appeared in the control group; in contrast, the PPV method showed a remediation effect of 53.6% within 1 m of the injection well, 82.4% within 1.1~2.0 m, and 68.7% within 2.1~3.0 m. Both ways of assessing purification efficiency (based on changes in maximum and average TPH contamination concentration) found that the PPV method increased the remediation effect by 149.0~184.8% compared with the control group, with an average increase of ~167%. The time taken to reduce contamination by 80% of the initial concentration was evaluated by deriving a correlation equation from the TPH concentration data: the PPV method could reduce the purification time by 184.4% compared with chemical oxidation. However, this evaluation of a single site cannot be applied equally to all strata, so additional research is necessary to explore the proposed method's effect more clearly.
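
The time to reach an 80% reduction is obtained from a correlation equation fitted to the measured TPH concentrations. As a rough illustration only, the sketch below fits a first-order (exponential) decay to dummy concentration values and solves for the 80% reduction time; the paper's actual correlation equation and measurements are not reproduced here.

```python
import numpy as np
from scipy.optimize import curve_fit

# Dummy TPH time series (mg/kg) over the 80-day remediation period; not measured data.
days = np.array([0, 10, 20, 40, 60, 80], dtype=float)
tph = np.array([5000, 3600, 2700, 1500, 900, 550], dtype=float)

def decay(t, c0, k):
    # First-order decay model, assumed here purely for illustration.
    return c0 * np.exp(-k * t)

(c0, k), _ = curve_fit(decay, days, tph, p0=(tph[0], 0.01))

# Time at which the concentration falls to 20% of its initial value (an 80% reduction).
t80 = np.log(5.0) / k
print(f"fitted k = {k:.4f} /day, estimated time to 80% reduction = {t80:.1f} days")
```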

Application of Amplitude Demodulation to Acquire High-sampling Data of Total Flux Leakage for Tendon Nondestructive Estimation (덴던 비파괴평가를 위한 Total Flux Leakage에서 높은 측정빈도의 데이터를 획득하기 위한 진폭복조의 응용)

  • Joo-Hyung Lee; Imjong Kwahk; Changbin Joh; Ji-Young Choi; Kwang-Yeun Park
    • Journal of the Korea Institute for Structural Maintenance and Inspection, v.27 no.2, pp.17-24, 2023
  • A post-processing technique for the measurement signal of a solenoid-type sensor is introduced. The solenoid-type sensor nondestructively evaluates an external tendon of prestressed concrete using the total flux leakage (TFL) method. The TFL solenoid sensor consists of primary and secondary coils. Sinusoidal AC electricity is input to the primary coil, and a signal proportional to the derivative of the input is induced in the secondary coil. Because the amplitude of the induced signal is proportional to the cross-sectional area of the tendon, sectional loss of the tendon caused by rupture or corrosion can be identified from the induced signal. It is therefore important to extract amplitude information from the measurement signal of the TFL sensor. Previously, the amplitude was extracted using local maxima, which is the simplest way to obtain amplitude information. However, because amplitude extraction using local maxima dramatically decreases the sampling rate, the previous method places many restrictions on the direction of TFL sensor development, such as applying additional signal processing and/or artificial intelligence. In contrast, the proposed method uses amplitude demodulation to obtain the signal amplitude from the TFL sensor, and the sampling rate of the amplitude information is the same as that of the raw TFL sensor data. The proposed method provides ample freedom for development by eliminating restrictions on the primary coil input frequency of the TFL sensor and on the speed at which the sensor is applied to the external tendon. It also maintains a high measurement sampling rate, which is advantageous for applying additional signal processing or artificial intelligence. The proposed method was validated through experiments, and its advantages were verified through comparison with the previous method. For example, in this study the amplitudes extracted by amplitude demodulation provided a sampling rate 100 times greater than that of the previous method. There may be differences depending on the given situation and specific equipment settings; however, in most cases, extracting amplitude information using amplitude demodulation yields more satisfactory results than previous methods.
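
The abstract does not specify the demodulation scheme in detail; one standard way to recover the amplitude at every raw sample is envelope extraction via the analytic signal (Hilbert transform). The sketch below contrasts that approach with local-maxima extraction on a simulated secondary-coil signal; the sampling rate, excitation frequency, and signal model are illustrative assumptions, not the authors' equipment settings.

```python
import numpy as np
from scipy.signal import hilbert, find_peaks

fs = 10_000                      # sampling rate of the TFL sensor output (Hz), assumed
f_in = 100                       # primary-coil excitation frequency (Hz), assumed
t = np.arange(0.0, 2.0, 1.0 / fs)

# Simulated secondary-coil signal: a slowly varying amplitude (proportional to the
# tendon cross-section) modulating a sinusoidal carrier; the dip mimics section loss.
amplitude = 1.0 - 0.3 * np.exp(-((t - 1.0) ** 2) / 0.02)
signal = amplitude * np.sin(2 * np.pi * f_in * t)

# Previous method: keep only local maxima, roughly one amplitude sample per carrier period.
peaks, _ = find_peaks(signal)
peak_amplitude = signal[peaks]

# Amplitude demodulation: the envelope keeps one amplitude sample per raw sample.
envelope = np.abs(hilbert(signal))

print(f"raw samples: {t.size}, local-maxima samples: {peaks.size}, "
      f"envelope samples: {envelope.size}")
```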

Current Status and Perspectives in Varietal Improvement of Rice Cultivars for High-Quality and Value-Added Products (쌀 품질 고급화 및 고부가가치화를 위한 육종현황과 전망)

  • 최해춘
    • Korean Journal of Crop Science, v.47, pp.15-32, 2002
  • Efforts to enhance the grain quality of high-yielding japonica rice continued steadily through the 1980s and 1990s, along with the achievement of self-sufficiency in rice production and increasing demand for high-quality rice. During this time, considerable progress and success were achieved in the development of high-quality japonica cultivars and quality evaluation techniques, including elucidation of the interrelationships between the physicochemical properties of rice grain and the physical or palatability components of cooked rice. In the 1990s, high-quality japonica cultivars and special rices suitable for food processing, such as large-kernel, chalky-endosperm, aromatic, and colored rices, were developed, and their objective preference and utility were examined using a palatability meter, rapid visco analyzer, and texture analyzer. Recently, new special rices such as extremely low-amylose dull or opaque non-glutinous endosperm mutants have been developed, and a high-lysine rice variety was developed for higher nutritional utility. The water uptake rate and the maximum water absorption ratio showed significantly negative correlations with the K/Mg ratio and alkali digestion value (ADV) of milled rice. Rice materials with greater hot-water absorption exhibited larger volume expansion of cooked rice. Harder rices with lower moisture content showed a higher rate of water uptake at twenty minutes after soaking and a higher maximum water uptake ratio at room temperature. These water uptake characteristics were not associated with the protein and amylose contents of milled rice or with the palatability of cooked rice. The water/rice ratio (on a w/w basis) for optimum cooking averaged 1.52 for dry milled rices (12%, wet basis), with a varietal range from 1.45 to 1.61, and the expansion ratio of milled rice after proper boiling averaged 2.63 (on a v/v basis). The major physicochemical components of rice grain associated with the palatability of cooked rice were examined using japonica rice materials showing narrow varietal variation in grain size and shape, alkali digestibility, gel consistency, and amylose and protein contents, but considerable differences in the appearance and texture of cooked rice. The glossiness or gross palatability score of cooked rice was closely associated with the peak, hot paste, and consistency viscosities, with differences between years. The high-quality rice variety "Ilpumbyeo" showed a smaller proportion of amylose in the outer layer of the milled rice grain and a smaller, slower change in the iodine blue value of the extracted paste during twenty minutes of boiling. Compared with a poorly palatable rice, this highly palatable rice also exhibited a very fine net structure in the outer layer and fine, spongy, well-swollen gelatinized starch granules in the inner layer and core of the cooked rice kernel, as observed in scanning electron microscope images. The gross sensory score of cooked rice could be estimated, with a high coefficient of determination, by a multiple linear regression formula deduced from the relationships between the rice quality components mentioned above and the eating quality of cooked rice. The α-amylase-iodine method was adopted to check varietal differences in the retrogradation of cooked rice. The cultivars showing relatively slow retrogradation of aged cooked rice were Ilpumbyeo, Chucheongbyeo, Sasanishiki, Jinbubyeo, and Koshihikari. A Tongil-type rice, Taebaegbyeo, and a japonica cultivar, Seomjinbyeo, showed relatively fast deterioration of cooked rice. In general, rice cultivars with better eating quality showed less retrogradation and more sponginess in cooled cooked rice. The varieties exhibiting less retrogradation in cooled cooked rice also showed higher hot viscosity and lower cool viscosity of rice flour in the amylogram. The sponginess of cooled cooked rice was closely associated with the magnesium content and volume expansion of cooked rice. The ratio of hardness change of cooked rice on cooling was negatively correlated with the amount of solids extracted during boiling and with the volume expansion of cooked rice. The major physicochemical properties of rice grain closely related to the palatability of cooked rice may thus be directly or indirectly associated with its retrogradation characteristics. Milled rices with softer gel consistency and lower amylose content showed a higher popping ratio and larger bulk density of popped rice. Harder rice grains showed a relatively higher popping ratio, and more chalky or less translucent rices exhibited a lower ratio of intact popped brown rice. The potassium and magnesium contents of milled rice were negatively associated with the gross score for noodle making (mixed half-and-half with wheat flour), and rices better suited for noodle making extracted relatively fewer solids during boiling. Greater volume expansion of the batter for brown rice bread resulted in better loaf formation and more springiness in the rice bread, and higher-protein rices produced relatively moister white rice bread. The springiness of rice bread was also significantly correlated with high amylose content and hard gel consistency. Completely chalky and large-grain rices showed better suitability for fermentation and brewing. The glutinous rices were classified into nine varietal groups based on various physicochemical and structural characteristics of the endosperm, and there were some close associations among these grain properties as well as large varietal differences in suitability for various traditional food processing. Future breeding efforts to improve rice quality for high palatability and for processing utility or value-added products should focus not only on the continuous enhancement of marketing and eating qualities but also on diversifying the morphological, physicochemical, and nutritional characteristics of rice grain suitable for processing various value-added rice foods.
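
As a purely hypothetical illustration of estimating the gross sensory score of cooked rice by multiple linear regression on grain-quality components, the sketch below fits an ordinary least-squares model to dummy values; the predictor set, the numbers, and the resulting coefficients are placeholders, not the paper's data or its fitted formula.

```python
import numpy as np

# Hypothetical predictors: amylose (%), protein (%), peak viscosity, consistency viscosity.
X = np.array([
    [18.2, 6.5, 250.0,  95.0],
    [19.0, 6.8, 240.0, 100.0],
    [17.5, 6.2, 265.0,  90.0],
    [20.1, 7.4, 220.0, 110.0],
    [18.8, 6.9, 245.0,  98.0],
    [17.9, 6.4, 258.0,  92.0],
    [19.5, 7.1, 230.0, 105.0],
])
y = np.array([78.0, 74.0, 82.0, 65.0, 72.0, 80.0, 69.0])   # dummy sensory scores

# Ordinary least squares: y ~ b0 + b1*x1 + ... + b4*x4.
A = np.hstack([np.ones((X.shape[0], 1)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print("coefficients:", np.round(coef, 3), " R^2 =", round(r2, 3))
```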

A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)

  • Jeong, Dami; Kim, Jaeseok; Kim, Gi-Nam; Heo, Jong-Uk; On, Byung-Won; Kang, Mijung
    • Journal of Intelligence and Information Systems, v.19 no.3, pp.1-23, 2013
  • To discover significant social issues, such as unemployment, economic crisis, and social welfare, that urgently need to be solved in modern society, the existing approach is for researchers to collect opinions from professional experts and scholars through online or offline surveys. However, such a method is not always effective. Owing to the expense involved, a large number of survey replies is seldom gathered, and in some cases it is also hard to find experts dealing with specific social issues. Thus, the sample set is often small and may carry some bias. Furthermore, regarding a given social issue, several experts may reach totally different conclusions because each has a subjective point of view and a different background. In this case, it is considerably hard to figure out what the current social issues are and which of them are really important. To surmount the shortcomings of the current approach, in this paper we develop a prototype system that semi-automatically detects social issue keywords representing social issues and problems from about 1.3 million news articles issued by about 10 major domestic presses in Korea from June 2009 until July 2012. Our proposed system consists of (1) collecting and extracting texts from the news articles, (2) identifying only the news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models. The goal of the matching algorithm is to best match paragraphs to each topic. Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which has relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA produces a set of topic clusters, and each topic cluster is then labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 as "Unemployment Problem". In this example, it is non-trivial to understand what happened to the unemployment problem in our society; in other words, looking only at social keywords, we have no idea of the detailed events occurring in society. To tackle this matter, we develop a matching algorithm that computes the probability of a paragraph given a topic, relying on (i) the topic terms and (ii) their probability values. Given a set of text documents, we segment each document into paragraphs; in the meantime, using LDA, we extract a set of topics from the documents. Based on our matching process, each paragraph is assigned to the topic it best matches, so that each topic ends up with several best-matched paragraphs. For example, suppose there are a topic (e.g., Unemployment Problem) and its best-matched paragraph (e.g., "Up to 300 workers lost their jobs at XXX company in Seoul"). In this case, we can grasp the detailed information behind the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time. Therefore, through our matching process and keyword visualization, most researchers will be able to detect social issues easily and quickly. Using this prototype system, we have detected various social issues appearing in our society and also shown the effectiveness of our proposed methods in experimental results. Note that the proof-of-concept system is also available at http://dslab.snu.ac.kr/demo.html.
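
A minimal sketch of steps (4) and (5), assuming gensim for LDA: topics are extracted from a toy corpus, and each paragraph is then matched to the topic most likely to have generated its words. The toy paragraphs, the number of topics, and the log-likelihood scoring rule are illustrative assumptions, not the authors' exact matching algorithm.

```python
import numpy as np
from gensim import corpora, models

paragraphs = [
    "workers lost jobs as the company announced layoffs",
    "unemployment rose while the economy slowed",
    "the welfare budget for elderly care was expanded",
    "new welfare programs support low income families",
]
texts = [p.split() for p in paragraphs]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

# Step (4): probabilistic topic modeling with LDA.
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary,
                      random_state=0, passes=20)
topic_word = lda.get_topics()          # shape: (num_topics, vocabulary size)

# Step (5): assign each paragraph to the topic that best "generates" its words.
def match_topic(tokens):
    ids = [dictionary.token2id[w] for w in tokens if w in dictionary.token2id]
    log_lik = np.log(topic_word[:, ids] + 1e-12).sum(axis=1)
    return int(np.argmax(log_lik))

for p, tokens in zip(paragraphs, texts):
    print(match_topic(tokens), "<-", p)
```

A human annotator would then label each topic with a social keyword (e.g., "Unemployment Problem"), and the best-matched paragraphs attached to that topic would supply the detailed events behind the keyword.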