• Title/Summary/Keyword: terms extraction


Optimal Organic Solvent Extraction Method for Dewaxing of Beeswax-treated Hanji (밀랍도포한지의 탈랍을 위한 최적 유기용매 추출기법 탐색)

  • Choi, Do-Chim;Choi, Eun-Yeon;Jo, Byoung-Muk;Cho, Byoung-Uk
    • Journal of Korea Technical Association of The Pulp and Paper Industry / v.44 no.6 / pp.50-57 / 2012
  • In this study, beeswax extraction methods using organic solvents were examined to develop an optimal dewaxing technology for beeswax-treated Hanji. Thermally aged beeswax-treated Hanji was dewaxed using four extraction methods: dipping, Soxhlet extraction, ultrasonic washing, and shaking. The aging stability of the dewaxed Hanji was then evaluated in terms of variations in paper strength and in the color of the area printed with muk. The experimental results suggested that the dewaxing methods that allow the solvent to flow during extraction showed superior extraction efficiency. The dipping method, in which the organic solvent does not flow, showed the slowest beeswax extraction rate of the four methods. In terms of variations in tensile strength and folding endurance, however, no obvious differences in aging stability were observed among the four extraction methods. Regarding the aging stability of the color of the area printed with muk, the Soxhlet extraction method showed the best dewaxing performance.

Evaluation of English Term Extraction based on Inner/Outer Term Statistics

  • Kang, In-Su
    • Journal of the Korea Society of Computer and Information / v.25 no.4 / pp.141-148 / 2020
  • Automatic term extraction aims to recognize domain-specific terms given a collection of domain-specific text. Previous term extraction methods operate effectively in an unsupervised manner, which involves extracting candidate terms and assigning importance scores to them. Regarding the calculation of term importance scores, the study focuses on utilizing the sets of inner and outer terms of a candidate term. For a candidate term, its inner terms are shorter terms that belong to the candidate term as components, and its outer terms are longer terms that include the candidate term as a component. This work presents various functions that compute, for a candidate term, a term strength from either its set of inner terms or its set of outer terms. In addition, a scoring method for term importance is devised based on the C-value score and the term strength values obtained from the sets of inner and outer terms. Experimental evaluations using the GENIA and ACL RD-TEC 2.0 datasets compare and analyze the effectiveness of the proposed term extraction methods for English. The proposed method performed better than the baseline method by up to 1% and 3% for the GENIA and ACL datasets, respectively.
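
To make the inner/outer term idea concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes candidate terms are given as tuples of words with raw frequencies, collects each candidate's inner and outer terms by contiguous containment, and computes the standard C-value (which uses only the outer terms). The function names `contains`, `inner_outer`, and `c_value` are illustrative, and the paper's own term strength functions are not reproduced here.

```python
import math


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous word sequence strictly inside `longer`."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def inner_outer(candidates):
    """Map each candidate to its inner terms (shorter, contained in it)
    and its outer terms (longer, containing it)."""
    inner = {c: [] for c in candidates}
    outer = {c: [] for c in candidates}
    for a in candidates:
        for b in candidates:
            if contains(b, a):
                inner[b].append(a)   # a is a component of b
                outer[a].append(b)   # b includes a as a component
    return inner, outer


def c_value(freq):
    """Standard C-value: log2(length) * (own frequency minus the mean
    frequency of the outer terms, if any). Single-word terms score 0
    under this form because log2(1) = 0."""
    _, outer = inner_outer(list(freq))
    scores = {}
    for term, f in freq.items():
        longer = outer[term]
        nested = sum(freq[t] for t in longer) / len(longer) if longer else 0.0
        scores[term] = math.log2(len(term)) * (f - nested)
    return scores


# Hypothetical candidate frequencies, for illustration only.
freq = {
    ("basal", "cell", "carcinoma"): 5,
    ("cell", "carcinoma"): 8,
    ("basal", "cell"): 6,
}
scores = c_value(freq)
```

A combined score in the spirit of the abstract would mix these C-value scores with a strength computed from `inner` and `outer`; how to weight them is left open here because the abstract does not specify it.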

Arabic Text Clustering Methods and Suggested Solutions for Theme-Based Quran Clustering: Analysis of Literature

  • Bsoul, Qusay;Abdul Salam, Rosalina;Atwan, Jaffar;Jawarneh, Malik
    • Journal of Information Science Theory and Practice / v.9 no.4 / pp.15-34 / 2021
  • Text clustering is one of the most commonly used methods for detecting themes or types of documents. Text clustering is used in many fields, but its effectiveness is still not sufficient for understanding Arabic text, especially with respect to terms extraction, unsupervised feature selection, and clustering algorithms. In most cases, terms extraction focuses on nouns. Clustering simplifies the understanding of an Arabic text such as the Quran, which is important not only for Muslims but for all people who want to know more about Islam. This paper discusses the complexity and limitations of theme-based clustering of Arabic text in the Quran. Unsupervised feature selection does not consider the relationships between the selected features. One weakness of clustering algorithms is that the selection of the optimal initial centroid still depends on chance and manual settings. Consequently, this paper reviews the literature on the three major stages of Arabic clustering: terms extraction, unsupervised feature selection, and clustering. Six experiments were conducted to demonstrate previously undiscussed problems related to the metrics used for feature selection and clustering. Suggestions to improve theme-based clustering of the Quran are presented and discussed.

Study on the Generation Methods of Composition Noun for Efficient Index Term Extraction (효율적인 색인어 추출을 위한 합성명사 생성 방안에 대한 연구)

  • Kim, Mi-Jin;Park, Mi-Seong;Choe, Jae-Hyeok;Lee, Sang-Jo
    • The Transactions of the Korea Information Processing Society / v.7 no.4 / pp.1122-1131 / 2000
  • The efficiency of an information retrieval or automatic indexing system depends on its ability to extract index terms accurately. Therefore, accurate extraction of index terms is of utmost importance. This report presents methods of generating composition nouns for efficient index term extraction using high-frequency words, so that the right documents can be found during information search. In this method, composition-noun index terms are extracted by applying composition and disintegration rules to the nouns that appear with high frequency in the documents, such as those in the upper 30%∼40% by frequency ratio. In addition, to validate the composition of high-frequency nouns in the upper 30%∼40% by frequency ratio, the report proposes an adequate frequency ratio for noun composition. In the experiments, short documents of fewer than 300 syllables showed low-frequency omissions when composed with nouns in the upper 30% by frequency ratio, whereas documents of more than 30 syllables, when composed with nouns in the upper 40% by frequency ratio, showed a considerable reduction in low-frequency omissions. As a result, the total number of index terms decreased to 57.7% of the existing number, and accurate extraction of index terms with an 85.6% adequacy ratio became possible.
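
The frequency-ratio idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not the report's algorithm: the input is assumed to be an already extracted noun sequence, "upper 30%" is interpreted as the top 30% of distinct nouns ranked by frequency, and the composition rule is simplified to joining adjacent high-frequency nouns. The report's actual composition and disintegration rules are not given in the abstract, and the names `high_frequency_nouns` and `compound_index_terms` are hypothetical.

```python
import math
from collections import Counter


def high_frequency_nouns(nouns, ratio=0.3):
    """Return the top `ratio` share of distinct nouns, ranked by frequency.
    (Assumed reading of 'upper 30% of frequency ratio'.)"""
    ranked = [noun for noun, _ in Counter(nouns).most_common()]
    k = max(1, math.ceil(len(ranked) * ratio))
    return set(ranked[:k])


def compound_index_terms(nouns, ratio=0.3):
    """Count adjacent pairs in which both nouns are high-frequency.
    (Simplified stand-in for the report's composition rule.)"""
    frequent = high_frequency_nouns(nouns, ratio)
    compounds = Counter()
    for left, right in zip(nouns, nouns[1:]):
        if left != right and left in frequent and right in frequent:
            compounds[(left, right)] += 1
    return compounds


# Hypothetical noun sequence from one short document.
nouns = ["information", "retrieval", "system", "information", "retrieval",
         "index", "information", "retrieval", "term", "document"]
compounds = compound_index_terms(nouns, ratio=0.3)
```

Raising `ratio` from 0.3 to 0.4 admits more nouns into composition, which is the trade-off the report evaluates against document length.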


Comparison of Significant Term Extraction Based on the Number of Selected Principal Components (주성분 보유수에 따른 중요 용어 추출의 비교)

  • Lee Chang-Beom;Ock Cheol-Young;Park Hyuk-Ro
    • The KIPS Transactions:PartB / v.13B no.3 s.106 / pp.329-336 / 2006
  • In this paper, we propose a method of significant term extraction within a document. The technique used is Principal Component Analysis (PCA), which is one of the multivariate analysis methods. PCA can make full use of term-term relationships within a document through term-term correlations. We use a correlation matrix instead of a covariance matrix between terms for performing PCA. We also try to find thresholds for both the number of components to be selected and the correlation coefficients between the selected components and terms. The experimental results on 283 Korean newspaper articles show that the first six components with correlation coefficients of |0.4| give the best condition for extracting sentences based on the selected significant terms.
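
The selection rule described above can be sketched with NumPy as follows. This is a minimal illustration rather than the authors' code: it assumes the document is represented as a sentence-by-term count matrix `X` in which every term column varies, and it uses the standard fact that, for PCA on a correlation matrix, the correlation between a term and a component equals the eigenvector entry times the square root of the eigenvalue. The function name `significant_terms` and the toy data are hypothetical.

```python
import numpy as np


def significant_terms(X, terms, n_components=6, threshold=0.4):
    """Select terms whose absolute correlation with any of the first
    `n_components` principal components is at least `threshold`.

    X: (n_sentences, n_terms) array; rows as sentences is an assumption.
    """
    R = np.corrcoef(X, rowvar=False)             # term-term correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Term-component correlations (loadings); clip guards tiny negative eigenvalues.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    keep = (np.abs(loadings) >= threshold).any(axis=1)
    return [term for term, flag in zip(terms, keep) if flag]


# Toy matrix: terms "a" and "b" co-occur perfectly, "c" is uncorrelated with them.
X = np.array([[1, 1, 1],
              [2, 2, 0],
              [3, 3, 0],
              [4, 4, 1]], dtype=float)
selected = significant_terms(X, ["a", "b", "c"], n_components=1)
```

With one component retained, only the correlated pair passes the threshold; retaining a second component also admits the third term, which shows how the number of components and the correlation threshold interact.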

Development of Dye Natural Batik Based on Fiber Coconut Waste and Leaf Avocado through Extraction Method in Supporting Green Business

    • Asian Journal of Business Environment / v.14 no.1 / pp.15-22 / 2024
  • Purpose: The development of natural batik dyes based on a combination of coconut fiber waste and avocado leaves using the extraction method is important to support the green economy and reduce chemical waste in Indonesia. Research design, data and methodology: The research explores the use of extracts from coconut fiber and avocado leaf waste as a natural batik dye and conducts market testing to assess consumer satisfaction. Results: Indonesian batik exports are growing, but synthetic dye practices are causing a decline in demand. To address this, natural dyes are being explored, including coconut fiber waste and avocado leaf waste. Conclusion: Test results for washing at 40 degrees Celsius (color change and color staining), sweat (acid and base color change), sunlight (color fastness value), and hot ironing (color change and color staining) show values of 3-4 (quite good) and 4-5 (good), meaning that coconut fiber and avocado leaf waste can be used as natural batik dye.

The Solubility Characteristics of Organic Compounds in Urban Aerosol Samples

  • Kim, Young-Min;Peter Brimblecombe;Tim Jickells;Baek, Sung-Ok
    • Journal of Korean Society for Atmospheric Environment / v.14 no.E / pp.27-40 / 1998
  • The solubility characteristics of organic compounds in urban aerosol samples collected at the University of East Anglia (UEA), Norwich, England, were studied in terms of extraction efficiency as a function of the polarity of the organic solvent and the acidity of water. The extraction efficiency of organic compounds was evaluated with respect to organic carbon, nitrogen, and hydrogen using a wide range of solvents, including polar and nonpolar organic solvents as well as acidic and alkaline water. In addition, the aqueous chemistry of the organic compounds after dissolution was studied in terms of organic-metal complexes in aerosol, using oxalic acid, copper, and zinc. The results indicate that the solubility characteristics of organic compounds depend on the polarity and acidity of the solvents. In particular, some organic compounds are water soluble, although the water-soluble fraction is much smaller than the acetone-soluble fraction. Comparing polar and nonpolar organic solvent extraction, a significant fraction of the organic compounds analysed in the aerosol samples appears to be polar, because extraction efficiencies were higher with polar organic solvents than with nonpolar ones. Regarding the oxalic-metal complexes, most oxalic acid in the aerosols collected at UEA appears to be present in the form of oxalic-copper complexes.


TAKES: Two-step Approach for Knowledge Extraction in Biomedical Digital Libraries

  • Song, Min
    • Journal of Information Science Theory and Practice / v.2 no.1 / pp.6-21 / 2014
  • This paper proposes a novel knowledge extraction system, TAKES (Two-step Approach for Knowledge Extraction System), which integrates advanced techniques from Information Retrieval (IR), Information Extraction (IE), and Natural Language Processing (NLP). In particular, TAKES adopts a novel keyphrase extraction-based query expansion technique to collect promising documents. It also uses a Conditional Random Field-based machine learning technique to extract important biological entities and relations. TAKES is applied to biological knowledge extraction, particularly retrieving promising documents that contain Protein-Protein Interaction (PPI) and extracting PPI pairs. TAKES consists of two major components: DocSpotter, which is used to query and retrieve promising documents for extraction, and a Conditional Random Field (CRF)-based entity extraction component known as FCRF. The present paper investigated research problems addressing the issues with a knowledge extraction system and conducted a series of experiments to test our hypotheses. The findings from the experiments are as follows: First, the author verified, using three different test collections to measure the performance of our query expansion technique, that DocSpotter is robust and highly accurate when compared to Okapi BM25 and SLIPPER. Second, the author verified that our relation extraction algorithm, FCRF, is highly accurate in terms of F-Measure compared to four other competitive extraction algorithms: Support Vector Machine, Maximum Entropy, Single POS HMM, and Rapier.

The Analysis of Physicochemical and Sensory Characteristics in Brown Stock - Comparison of Traditional Method and High-Pressure Extracted Method - (갈색 육수의 이화학적 및 관능적 특성 분석 - 전통 방식과 고압 가열 방식 비교 -)

  • Choi, Soo-Keun;Jang, Hyuk-Rae;Rha, Young-Ah
    • Culinary science and hospitality research / v.14 no.3 / pp.196-209 / 2008
  • This study was conducted to mass-produce brown stock optimized by using a high-pressure heating extractor and to use brown stock as a material for developing various products. For these purposes, we attempted to produce standardized brown stock by extracting brown stock using a high-pressure heating extractor and compared it with brown stock extracted by the traditional method in terms of general elements and mechanical and sensory characteristics. With regard to how to prepare optimal brown stock, the best brown stock was that extracted seven times repeatedly by the traditional method, but the method had a large economic loss in terms of material consumption and took a long time in extraction. Thus, considering time and labor, it was concluded that extraction at 120°C for 15 hours using a high-pressure heating extractor is the optimal extraction condition in terms of economic efficiency and quality. The results of this study are expected to be useful as a practical material for making brown stock production process more convenient, applying cooks' traditional cooking techniques to mass production, maintaining standardized superior quality and taste, and improving shelf life.


Extraction of voice signal embedded in 1/f noise using wavelet

  • Toyama, Naoki;Sasaya, Takashi;Akizuki, Kageo
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 1997.10a / pp.564-567 / 1997
  • This paper deals with the problem of extracting a voice signal embedded in 1/f noise. We propose an extraction method using wavelets. The method is based on Wornell's modelling, which constructs a 1/f process in terms of uncorrelated variables and is well suited to treating 1/f processes. Finally, we further demonstrate our method through simulation.
