• Title/Summary/Keyword: Seed Word

Search Results: 30

Status of Plasma Technology Applied to Agriculture and Foods (플라즈마 기술의 농식품 분야 활용)

  • Yoo, Suk Jae
    • Vacuum Magazine / v.2 no.4 / pp.4-8 / 2015
  • Recently, plasma technology has been increasingly applied to agriculture and foods. Owing to three important plasma characteristics, namely activation, sterilization, and catalysis, plasma technology can be applied throughout agriculture, in other words, across the whole cycle from farm to table: seed germination, plant growth, harvest and storage, washing, packaging, transport, in-store handling, household use, cooking, garbage treatment, etc. Representative case studies of plasma activation, sterilization, and catalysis show that plasma technology can be successfully applied to this whole cycle of agriculture.

Rice Gruel in Chinese Food and Culture

  • Fan, Zhihong
    • Proceedings of the EASDL Conference / 2003.04a / pp.55-60 / 2003
  • Rice is one of the most important cereals in China. Seeds of rice unearthed from ancient tombs prove that rice culture has a history of more than 7,000 years in south China. The word "rice" was found engraved on turtle-bone scriptures of 1,500 BC. In many ancient Chinese scriptures, rice is among the most important "Five Cereals", which include millet, wheat, soybean, rice and sorghum. (abridged)

A Novel Classification Model for Efficient Patent Information Research (효율적인 특허정보 조사를 위한 분류 모형)

  • Kim, Youngho;Park, Sangsung;Jang, Dongsik
    • Journal of Korea Society of Digital Industry and Information Management / v.15 no.4 / pp.103-110 / 2019
  • A patent contains detailed information about a developed technology and is published to the public, so patents can be used to overcome the limitations of traditional technology trend research and prediction techniques. Recently, owing to the advantages of patent-based analytical methodology, IP R&D is carried out worldwide. Patent data are big data: huge in volume, spanning various domains, and containing both structured and unstructured data. For this reason, there are many difficulties in collecting and researching patent information. Patent research generally starts by writing a search formula to collect patent documents from a database. The collected documents contain some noise patents that are irrelevant to the purpose of the analysis, so these must be removed. However, eliminating noise patents is a manual task of reading and classifying the technology, which is time consuming and expensive. In this study, we propose a model that automatically classifies noise patents for efficient patent information research. The proposed method performs patent embedding using Word2Vec and generates noise seed labels, and then classifies noise patents using a Random Forest. The experimental data are patents related to Ocean Surveillance & Tracking Network technology that were published and registered with the USPTO. In experiments, the proposed model showed 73% accuracy against the labels actually given by experts.
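
A rough sketch of the pipeline described in this abstract (Word2Vec patent embedding, a handful of seed-labeled noise patents, Random Forest classification) might look as follows. Averaging word vectors into a patent vector and the source of the seed labels are assumptions for illustration, not the authors' exact design.

```python
# Sketch of the noise-patent filtering idea: embed patent abstracts with
# Word2Vec, then train a Random Forest on a small set of seed-labeled patents.
# The vector averaging and the seed-label source are illustrative assumptions.
import numpy as np
from gensim.models import Word2Vec
from sklearn.ensemble import RandomForestClassifier

def patent_vector(tokens, w2v):
    """Average the Word2Vec vectors of the tokens present in the vocabulary."""
    vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

# corpus: list of tokenized patent abstracts; seed_idx / seed_labels: a few
# expert- or rule-labeled patents (1 = noise, 0 = relevant) -- hypothetical inputs.
def classify_noise(corpus, seed_idx, seed_labels):
    w2v = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=2)
    X = np.vstack([patent_vector(doc, w2v) for doc in corpus])
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[seed_idx], seed_labels)   # learn from the seed labels
    return clf.predict(X)               # 1 = likely noise patent
```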

Performance Improvement of Bilingual Lexicon Extraction via Pivot Language and Word Alignment Tool (중간언어와 단어정렬을 통한 이중언어 사전의 자동 추출에 대한 성능 개선)

  • Kwon, Hong-Seok;Seo, Hyeung-Won;Kim, Jae-Hoon
    • Annual Conference on Human and Language Technology / 2013.10a / pp.27-32 / 2013
  • This paper proposes a method for automatically extracting bilingual lexicons from a parallel corpus for little-known language pairs. The method uses a pivot language as an intermediary and uses Anymalign, a publicly available word alignment tool, to build context vectors. As a result, no seed dictionary is needed to translate the context vectors, and translation accuracy is improved for low-frequency words, a weakness of statistical methods. In addition, using bidirectional translation probabilities above a certain threshold as the elements of the context vectors greatly improves the top-5 translation accuracy. We ran bilingual lexicon extraction experiments on two different language pairs, Korean-Spanish and Korean-French, in both directions. For high-frequency words, translation accuracy improved by at least 3.41% and at most 67.91% over previously reported results; for low-frequency words, it improved by at least 5.06% and at most 990%.
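
The context-vector idea summarized above might be sketched as follows, assuming the bidirectional translation probabilities over the pivot language have already been produced by an alignment tool such as Anymalign; the data structures, the probability combination, and the threshold value are illustrative assumptions rather than the paper's exact formulation.

```python
# Sketch: build context vectors over a pivot-language vocabulary from
# bidirectional translation probabilities, keep only entries above a threshold,
# and rank target words for a source word by cosine similarity.
import math

def context_vector(word, p_fwd, p_bwd, threshold=0.1):
    """p_fwd[word][pivot] and p_bwd[pivot][word] are translation probabilities."""
    vec = {}
    for pivot, p in p_fwd.get(word, {}).items():
        q = p_bwd.get(pivot, {}).get(word, 0.0)
        score = p * q                    # combine both directions (illustrative)
        if score >= threshold:
            vec[pivot] = score
    return vec

def cosine(u, v):
    shared = set(u) & set(v)
    num = sum(u[k] * v[k] for k in shared)
    den = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def top_translations(src_word, src_probs, tgt_vectors, n=5):
    """Rank target words whose pivot context vectors are closest to the source word's."""
    src_vec = context_vector(src_word, *src_probs)
    ranked = sorted(tgt_vectors.items(), key=lambda kv: cosine(src_vec, kv[1]), reverse=True)
    return ranked[:n]
```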

Headword Finding System Using Document Expansion (문서 확장을 이용한 표제어 검색시스템)

  • Kim, Jae-Hoon;Kim, Hyung-Chul
    • Journal of Information Management / v.42 no.4 / pp.137-154 / 2011
  • A headword finding system is defined as an information retrieval system that uses a word gloss as a query. To implement such a system, we use the gloss as a document. Generally a gloss is very short, which makes it very difficult to find the proper headword for a given query. To alleviate this problem, we expand the document using the concept of query expansion from information retrieval. In this paper, we use two document expansion methods: gloss expansion and similar word expansion. The former inserts the glosses of the words appearing in the document into a seed document, and the latter inserts similar words into a seed document. We use a featureless clustering algorithm to obtain the similar words. The performance (r-inclusion rate) amounts to almost 100% when the queries are word glosses and r is 16, and to 66.9% when the queries are written by users themselves. Through several experiments, we have observed that document expansion is very useful for the headword finding system. In the future, new measures beyond the r-inclusion rate proposed here are required for evaluating headword finding systems, and new evaluation sets are also needed for objective assessment.
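
The two document-expansion steps described in this abstract might be sketched roughly as follows; the gloss and similar-word dictionaries and the overlap-based retrieval score are illustrative assumptions, not the paper's actual system.

```python
# Sketch: expand each headword's gloss with (1) the glosses of its own words and
# (2) similar words, then retrieve the headwords whose expanded documents best
# match a gloss-like query. Scoring by token overlap is an illustrative choice.
from collections import Counter

def expand_document(gloss_tokens, glosses, similar_words):
    """gloss_tokens: tokens of the seed gloss; glosses: word -> gloss tokens;
    similar_words: word -> list of similar words (e.g. from clustering)."""
    doc = list(gloss_tokens)
    for w in gloss_tokens:
        doc.extend(glosses.get(w, []))          # gloss expansion
        doc.extend(similar_words.get(w, []))    # similar-word expansion
    return Counter(doc)

def find_headwords(query_tokens, expanded_docs, r=16):
    """Return the top-r headwords whose expanded documents share the most tokens
    with the query (the paper checks whether the answer appears within rank r)."""
    q = Counter(query_tokens)
    score = lambda doc: sum(min(q[t], doc[t]) for t in q)
    ranked = sorted(expanded_docs.items(), key=lambda kv: score(kv[1]), reverse=True)
    return [headword for headword, _ in ranked[:r]]
```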

Bootstrapping-based Bilingual Lexicon Induction by Learning Projection of Word Embedding (부트스트래핑 기반의 단어-임베딩 투영 학습에 의한 대역어 사전 구축)

  • Lee, Jongseo;Wang, JiHyun;Lee, Seung Jin
    • Annual Conference on Human and Language Technology / 2020.10a / pp.462-467 / 2020
  • Building a bilingual lexicon is important for improving the quality of machine translation between low-resource language pairs. It is widely known that most previously proposed word-embedding-based methods for bilingual lexicon construction perform well for morphologically and syntactically similar language pairs such as English-French, but not for dissimilar pairs such as English-Chinese. This paper proposes a bootstrapping method for building a bilingual lexicon on top of word embeddings. Starting from a small seed dictionary, the proposed method builds the bilingual lexicon automatically through an iterative process. We then run experiments on the Korean-English language pair and compare F1-Scores against the method used in Moses, a tool widely used for bilingual lexicon construction. The experiments show an improvement of about 42 percentage points in F1-Score, and the constructed lexicon is seven times larger than the initial seed dictionary.
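
The iterative bootstrapping described above might be sketched as follows, using a least-squares projection between embedding spaces and mutual nearest neighbours to grow the dictionary each round; both choices are illustrative assumptions rather than the authors' exact procedure.

```python
# Sketch of bootstrapped bilingual lexicon induction: learn a linear projection
# W from source to target embeddings on the current pair list, add mutual
# nearest neighbours as new pairs, and repeat.
import numpy as np

def induce_lexicon(src_emb, tgt_emb, seed_pairs, rounds=5):
    """src_emb/tgt_emb: word -> unit-normalised vector; seed_pairs: list of (src, tgt)."""
    src_words, tgt_words = list(src_emb), list(tgt_emb)
    S = np.vstack([src_emb[w] for w in src_words])
    T = np.vstack([tgt_emb[w] for w in tgt_words])
    pairs = list(seed_pairs)
    for _ in range(rounds):
        X = np.vstack([src_emb[s] for s, t in pairs])
        Y = np.vstack([tgt_emb[t] for s, t in pairs])
        W, *_ = np.linalg.lstsq(X, Y, rcond=None)    # fit projection on current pairs
        P = S @ W
        P /= np.linalg.norm(P, axis=1, keepdims=True) + 1e-9
        sims = P @ T.T                               # cosine similarity to target words
        best_tgt = sims.argmax(axis=1)
        best_src = sims.argmax(axis=0)
        mutual = [(src_words[i], tgt_words[j])
                  for i, j in enumerate(best_tgt) if best_src[j] == i]
        pairs = list({*seed_pairs, *mutual})         # grow the dictionary each round
    return pairs
```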

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.141-166 / 2019
  • Recently, channels such as social media and SNS create enormous amounts of data. Among all kinds of data, the portion of unstructured data represented as text has increased geometrically. Because it is difficult to check all text data, it is important to access those data rapidly and grasp the key points of the text. Due to this need for efficient understanding, many studies on text summarization for handling and using tremendous amounts of text data have been proposed. In particular, many summarization methods using machine learning and artificial intelligence algorithms, called "automatic summarization", have recently been proposed to generate summaries objectively and effectively. However, almost all text summarization methods proposed to date construct summaries based on the frequency of contents in the original documents. Such summaries tend to omit low-weight subjects that are mentioned less often in the original text. If a summary covers only the major subjects, bias occurs and causes a loss of information, so it is hard to ascertain every subject the documents contain. To avoid this bias, it is possible to summarize with a balance between the topics a document has, so that every subject can be ascertained, but an unbalanced distribution between those subjects still remains. To retain the balance of subjects in a summary, it is necessary to consider the proportion of every subject the documents originally have and also to allocate the portions of subjects equally, so that even sentences of minor subjects are sufficiently included in the summary. In this study, we propose a "subject-balanced" text summarization method that secures balance between all subjects and minimizes the omission of low-frequency subjects. For subject-balanced summarization, we use two summary evaluation metrics, "completeness" and "succinctness". Completeness means that a summary should fully include the contents of the original documents, and succinctness means that a summary has minimal duplication within itself. The proposed method has three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From the derived weights, highly related terms can be identified for every topic, and the subjects of documents can be found from topics composed of terms with similar meanings. A few terms that represent each subject well are then selected; in this method, they are called "seed terms". However, these terms are too few to explain each subject sufficiently, so enough terms similar to the seed terms are needed for a well-constructed subject dictionary. Word2Vec is used for word expansion, finding terms similar to the seed terms. Word vectors are created by Word2Vec modeling, and from those vectors the similarity between all terms can be derived using cosine similarity: the higher the cosine similarity between two terms, the stronger their relationship. Terms with high similarity to the seed terms of each subject are selected, and by filtering these expanded terms the subject dictionary is finally constructed. The next phase allocates subjects to every sentence of the original documents. To grasp the contents of all sentences, frequency analysis is first conducted with the specific terms composing the subject dictionaries. The TF-IDF weight of each subject is calculated after the frequency analysis, making it possible to figure out how much each sentence explains each subject. However, TF-IDF weights can grow without bound, so the TF-IDF weights of every subject in each sentence are normalized to values between 0 and 1. Each sentence is then assigned the subject with the maximum TF-IDF weight, and sentence groups are constructed for each subject. The last phase is summary generation. Sen2Vec is used to compute the similarity between subject sentences, forming a similarity matrix. Through repeated sentence selection, it is possible to generate a summary that fully covers the contents of the original documents while minimizing duplication within the summary itself. To evaluate the proposed method, 50,000 TripAdvisor reviews were used to construct the subject dictionaries and 23,087 reviews were used to generate summaries. A comparison between the summary from the proposed method and a frequency-based summary verified that the proposed summary better retains the balance of all the subjects the documents originally have.
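
Two of the steps described above, expanding seed terms with Word2Vec cosine similarity and assigning each sentence to the subject with the highest normalized TF-IDF weight, might look roughly like the sketch below; the parameters and data structures are illustrative assumptions, not the paper's implementation.

```python
# Sketch of two phases of the subject-balanced summarizer: (1) expand each
# subject's seed terms via Word2Vec similarity, (2) assign each sentence to the
# subject with the highest normalised TF-IDF mass over its dictionary.
import numpy as np
from gensim.models import Word2Vec
from sklearn.feature_extraction.text import TfidfVectorizer

def build_subject_dicts(sentences_tokens, seed_terms, topn=20):
    """seed_terms: subject -> list of seed terms; returns subject -> expanded term set."""
    w2v = Word2Vec(sentences=sentences_tokens, vector_size=100, min_count=2)
    dicts = {}
    for subject, seeds in seed_terms.items():
        expanded = set(seeds)
        for s in seeds:
            if s in w2v.wv:
                expanded.update(t for t, _ in w2v.wv.most_similar(s, topn=topn))
        dicts[subject] = expanded
    return dicts

def allocate_subjects(sentences, subject_dicts):
    """Assign each sentence (a string) to the subject with the largest TF-IDF mass."""
    vect = TfidfVectorizer()
    X = vect.fit_transform(sentences)          # sentence x term TF-IDF matrix
    vocab = vect.vocabulary_
    groups = {subject: [] for subject in subject_dicts}
    for i, sent in enumerate(sentences):
        row = X[i].toarray().ravel()
        scores = {subj: sum(row[vocab[t]] for t in terms if t in vocab)
                  for subj, terms in subject_dicts.items()}
        total = sum(scores.values()) or 1.0    # normalisation shown for clarity
        best = max(scores, key=lambda s: scores[s] / total)
        groups[best].append(sent)
    return groups
```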

Improvement of Security Cryptography Algorithm in Transport Layer (전달 계층의 보안 암호화 알고리즘 개선)

  • Choi Seung-Kwon;Kim Song-Young;Shin Dong-Hwa;Lee Byong-Rok;Cho Yong-Hwan
    • Proceedings of the Korea Contents Association Conference / 2005.05a / pp.107-111 / 2005
  • As the Internet grows rapidly and electronic commerce applications increase, security is becoming more important. Information security for secure and reliable information transfer is based on cryptographic techniques. The proposed ISEED (Improved SEED) algorithm is a block cipher, which belongs to the class of secret-key algorithms. For efficiency, a round-key generation algorithm is proposed that reduces the time required for encryption and decryption. The algorithm is implemented as follows: the 128-bit key is divided into two 64-bit groups, each of which is rotated by 8 bits to the left and to the right respectively, and then basic arithmetic operations and the G function are applied to the resulting 4-word outputs. The algorithm that converts the encryption key into the subkey form required for encryption and decryption in the key generation algorithm is also analyzed. As a result, the time consumed in encryption and decryption is reduced while the number of plaintexts required for differential analysis is minimized.
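
Going only by the description in this abstract, the round-key generation step could be sketched as below. The G function here is a simple placeholder (the real SEED G function uses S-boxes), and the way the four 32-bit words are combined is an assumption for illustration.

```python
# Sketch of the described round-key generation: split a 128-bit key into two
# 64-bit halves, rotate them 8 bits left/right each round, and combine the
# resulting 32-bit words through simple arithmetic and a G function.
MASK32, MASK64 = (1 << 32) - 1, (1 << 64) - 1

def rotl64(x, n):
    return ((x << n) | (x >> (64 - n))) & MASK64

def rotr64(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK64

def g_placeholder(x):
    # Stand-in for the cipher's G function (illustration only, not the real S-box based G).
    return (x * 0x9E3779B1 + 0x7F4A7C15) & MASK32

def round_keys(key128, rounds=16):
    a, b = key128 >> 64, key128 & MASK64        # two 64-bit halves
    keys = []
    for _ in range(rounds):
        a, b = rotl64(a, 8), rotr64(b, 8)       # opposite 8-bit rotations
        w = [a >> 32, a & MASK32, b >> 32, b & MASK32]   # four 32-bit words
        k0 = g_placeholder((w[0] + w[2]) & MASK32)
        k1 = g_placeholder((w[1] - w[3]) & MASK32)
        keys.append((k0, k1))
    return keys
```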

Product Evaluation Summarization Through Linguistic Analysis of Product Reviews (상품평의 언어적 분석을 통한 상품 평가 요약 시스템)

  • Lee, Woo-Chul;Lee, Hyun-Ah;Lee, Kong-Joo
    • The KIPS Transactions:PartB / v.17B no.1 / pp.93-98 / 2010
  • In this paper, we introduce a system that summarizes product evaluations through linguistic analysis, to make effective use of explosively increasing product reviews. Our system analyzes the polarity of product reviews by product feature, such as 'design' and 'material' for a skirt product category, on which customers evaluate each product. The system shows customers a graph as a review summary that represents the percentages of positive and negative reviews. We build an opinion word dictionary for each product feature through context-based automatic expansion from a small set of seed words, and judge the polarity of reviews by product feature with the extracted dictionary. In experiments using product reviews from online shopping malls, our system shows an average accuracy of 69.8% in extracting the opinion word dictionary and 81.8% in sentence-level polarity resolution.
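
The feature-wise polarity summarization described above might be sketched like this, assuming a per-feature opinion dictionary that has already been expanded from seed words; the dictionary format and the scoring are illustrative assumptions.

```python
# Sketch: score each review against per-feature opinion dictionaries (expanded
# from seed words) and report the share of positive vs. negative reviews per feature.
def summarize_reviews(reviews, feature_lexicon):
    """reviews: list of token lists; feature_lexicon: feature -> {word: +1 or -1}."""
    counts = {f: {"pos": 0, "neg": 0} for f in feature_lexicon}
    for tokens in reviews:
        for feature, lexicon in feature_lexicon.items():
            score = sum(lexicon.get(t, 0) for t in tokens)
            if score > 0:
                counts[feature]["pos"] += 1
            elif score < 0:
                counts[feature]["neg"] += 1
    summary = {}
    for feature, c in counts.items():
        total = c["pos"] + c["neg"]
        summary[feature] = {k: round(100 * v / total, 1) if total else 0.0
                            for k, v in c.items()}
    return summary

# Hypothetical usage for a skirt category:
# summarize_reviews([["design", "pretty"], ["material", "rough"]],
#                   {"design": {"pretty": 1, "ugly": -1},
#                    "material": {"soft": 1, "rough": -1}})
```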

The Comparison between the Tastes of Food in "Naekyeong(內經)" and them in "Euhakibmun(醫學入門)", "Dongeuibogam(東醫寶鑑)" ("내경(內經)"과 "의학입문(醫學入門)", "동의보감(東醫寶鑑)" 에 나타난 식이(食餌)의 오미(五味) 비교)

  • Jo, Hak-Jun
    • Journal of Korean Medical classics / v.23 no.6 / pp.27-44 / 2010
  • In order to set up a diet guideline of five grains, five meats, five fruits, and five vegetables for diseases of the five organs, I reviewed their tastes by comparing "Naekyeong" with "Euhakibmun" and "Dongeuibogam". 'Ma(麻)' in "Naekyeong" means not a hemp, a ramie or a jute, but a sesame(胡麻). 'Maik(麥)' in it means both a barley(大麥) and a wheat(小麥). 'Guak(藿)' in it means bean leaves, leaves of a red-bean, or brown seaweed(海藻). 'Gyu(葵)' in "Euhakibmun Jangbujobun(臟腑條分)" is a miswritten word for 'Welsh onion' caused by the similar shapes of the characters. Food of a salty taste according to the five-element arrangement in "Naekyeong" is indeed salty according to "Euhakibmun" and "Dongeuibogam". But according to them, a barley(大麥) and a wheat(小麥) of sour taste are bitter, a chicken of sour or hot taste is sweet, nonglutinous millet of sour taste is sweet, an apricot of bitter taste is hot, a sesame seed of sweet taste is sour, a nonglutinous rice of hot taste is sweet, and horsemeat of hot taste is bitter. There are two ways to recommend food for diseases of the five organs: one is to promote or control the Qi(氣) of the five organs according to "Somun(素問)" and "Euhakibmun Jangbujobun", and the other is to build up the Yin(陰血) of the five organs according to "Yungchu(靈樞) five tastes(五味)". The two ways are not contradictory but complementary from the viewpoint of their substances(體) and actions(用).