Search | Korea Science

Multi-Vector Document Embedding Using Semantic Decomposition of Complex Documents (복합 문서의 의미적 분해를 통한 다중 벡터 문서 임베딩 방법론)

Park, Jongin;Kim, Namgyu
- Journal of Intelligence and Information Systems / v.25 no.3 / pp.19-41 / 2019
As demand for text data analysis grows rapidly, research and investment in text mining are being actively pursued not only in academia but also across industries. Text mining is generally conducted in two steps. In the first step, the text of each collected document is tokenized and structured to convert the original document into a computer-readable form. In the second step, tasks such as document classification, clustering, and topic modeling are carried out according to the purpose of the analysis. Until recently, text mining studies focused on the second step. However, with the discovery that the text structuring process substantially influences the quality of the analysis results, various embedding methods have been actively studied to improve that quality by preserving the meaning of words and documents when text data is represented as vectors. Unlike structured data, which can be directly subjected to a variety of operations and traditional analysis techniques, unstructured text must first go through a structuring task that transforms the original document into a form the computer can understand. Mapping arbitrary objects into a space of a specific dimension while maintaining their algebraic properties, in order to structure text data, is called "embedding". Recently, attempts have been made to embed not only words but also sentences, paragraphs, and entire documents. In particular, as demand for document embedding has increased rapidly, many algorithms have been developed to support it. Among them, doc2Vec, which extends word2Vec and embeds each document into one vector, is the most widely used. However, traditional document embedding methods, represented by doc2Vec, generate the vector for each document using all of the words in the document. As a result, the document vector is affected not only by core words but also by miscellaneous words. Additionally, traditional document embedding schemes usually map each document to a single vector, so it is difficult to represent a complex document with multiple subjects accurately. In this paper, we propose a new multi-vector document embedding method to overcome these limitations. This study targets documents that explicitly separate body content and keywords. For a document without keywords, the method can be applied after keywords are extracted through various analysis methods; however, since keyword extraction is not the core subject of the proposed method, we describe how to apply it to documents whose keywords are predefined in the text. The proposed method consists of (1) Parsing, (2) Word Embedding, (3) Keyword Vector Extraction, (4) Keyword Clustering, and (5) Multiple-Vector Generation. The specific process is as follows. First, all text in a document is tokenized, and each token is represented as an N-dimensional real-valued vector through word embedding. Next, to overcome the limitation that traditional document embedding is affected by miscellaneous words as well as core words, the vectors corresponding to the keywords of each document are extracted to form a set of keyword vectors for each document. Clustering is then conducted on each document's set of keyword vectors to identify the multiple subjects included in the document. Finally, a multi-vector is generated from the vectors of the keywords constituting each cluster. Experiments on 3,147 academic papers revealed that the traditional single-vector approach cannot properly map complex documents because of interference among subjects in each vector. With the proposed multi-vector method, we confirmed that complex documents can be vectorized more accurately by eliminating this interference.
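The steps from keyword vector extraction to multi-vector generation can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy two-dimensional `word_vectors` table stands in for a trained word embedding, the hand-written `kmeans` stands in for whatever clustering algorithm the paper uses, and taking the cluster mean as the subject vector is an assumption, since the abstract does not say how each cluster is turned into a vector. All function and variable names are hypothetical.

```python
import numpy as np


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from every centroid chosen so far.
    centroids = [points[0]]
    for _ in range(1, k):
        dists = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(dists)])
    centroids = np.array(centroids, dtype=float)

    for _ in range(iters):
        # Assign every point to its nearest centroid.
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids


def multi_vector_embedding(keywords, word_vectors, n_subjects):
    """Return one vector per keyword cluster instead of one vector per document."""
    # Keyword vector extraction: keep only keywords that have an embedding.
    kept = [w for w in keywords if w in word_vectors]
    vecs = np.array([word_vectors[w] for w in kept], dtype=float)
    # Keyword clustering: each cluster is treated as one subject.
    k = min(n_subjects, len(kept))
    labels, centroids = kmeans(vecs, k)
    clusters = [[w for w, lab in zip(kept, labels) if lab == j] for j in range(k)]
    # Multi-vector generation (assumption): the cluster mean is the subject vector.
    return centroids, clusters


# Toy embedding table (hypothetical values, assumed to come from word embedding).
word_vectors = {
    "embedding": [0.9, 0.1],
    "doc2vec": [1.0, 0.0],
    "melanin": [0.0, 1.0],
    "antioxidant": [0.1, 0.9],
}
keywords = ["embedding", "doc2vec", "melanin", "antioxidant"]

subject_vectors, clusters = multi_vector_embedding(keywords, word_vectors, n_subjects=2)
print(clusters)
print(subject_vectors)
```

In this toy case the document mixes two unrelated subjects, so a single averaged vector would land between them, while the two subject vectors stay close to their own keywords.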
https://doi.org/10.13088/jiis.2019.25.3.019

Physiological Activity of Supercritical Poria cocos Bark Extract and Its Skin Delivery Application using Epidermal Penetrating Peptide (초임계 복령피 추출물의 생리활성 및 경피투과 펩티드를 이용한 경피 약물전달의 응용)

Kim, Min Gi;Park, Su In;An, Gyu Min;Heo, Soo Hyeon;Shin, Moon Sam
- Journal of the Korean Applied Science and Technology / v.36 no.3 / pp.766-778 / 2019
In this study, Poria cocos bark was extracted by a supercritical process, and its anti-inflammatory, whitening, and antioxidant effects were measured in comparison with those of an ethanol extract. In addition, an effective percutaneous permeation method using a selected formulation of the extract and a drug delivery peptide was proposed. Pachymic acid, known as an anti-cancer and anti-inflammatory compound of Poria cocos, served as the indicator component, and HPLC analysis showed that its content in the supercritical extract of Poria cocos bark was more than twice that in the Poria cocos bark ethanol extract. In DPPH and ABTS scavenging tests conducted to confirm the antioxidative effect of Poria cocos bark, the ethanol extract showed scavenging ability at a lower concentration than the supercritical extract. However, measurements of nitric oxide (NO) production in RAW 264.7 cells showed that the supercritical extract produced less NO than the ethanol extract at the same concentration. In addition, after B16 melanoma cells were treated with 20 μg/mL of Poria cocos bark extract for 72 hours, both extracts were effective on intracellular and extracellular melanin, and the supercritical extract gave the lower melanin content. No toxicity was observed at a concentration of 800 μg/mL in the RAW 264.7 cells used in the NO production experiments. However, in B16 melanoma cells, both the ethanol extract and the supercritical extract showed a survival rate of less than 60% even at 50 μg/mL. A liposome formulation combined with a drug delivery peptide was shown to be useful for percutaneous permeation of the supercritical extract of Poria cocos bark. Poria cocos bark is therefore expected to have great potential for development as a variety of cosmetic materials.
https://doi.org/10.12925/jkocs.2019.36.3.766

Knowledge Extraction Methodology and Framework from Wikipedia Articles for Construction of Knowledge-Base (지식베이스 구축을 위한 한국어 위키피디아의 학습 기반 지식추출 방법론 및 플랫폼 연구)

Kim, JaeHun;Lee, Myungjin
- Journal of Intelligence and Information Systems / v.25 no.1 / pp.43-61 / 2019
The development of artificial intelligence technologies has accelerated with the Fourth Industrial Revolution, and AI research is being actively conducted in fields such as autonomous vehicles, natural language processing, and robotics. Since the 1950s, this research has focused on solving cognitive problems related to human intelligence, such as learning and problem solving. Owing to recent interest in the technology and research on various algorithms, the field has achieved more technological advances than ever. The knowledge-based system is a sub-domain of artificial intelligence that aims to enable AI agents to make decisions using machine-readable and processable knowledge constructed from complex and informal human knowledge and rules in various fields. A knowledge base is used to optimize information collection, organization, and retrieval, and it is now also used together with statistical artificial intelligence such as machine learning. More recently, knowledge bases have been used to express, publish, and share knowledge on the web by describing and connecting web resources such as pages and data. They support intelligent processing in various AI applications, such as the question answering systems of smart speakers. However, building a useful knowledge base is time-consuming and still requires considerable effort from experts. In recent years, much knowledge-based AI research and technology has used DBpedia, one of the largest knowledge bases, which aims to extract structured content from Wikipedia. DBpedia contains various information extracted from Wikipedia, such as titles, categories, and links, but its most useful knowledge comes from Wikipedia infoboxes, which are user-created summaries of some unifying aspect of an article. This knowledge is created by mapping rules between infobox structures and the DBpedia ontology schema, defined in the DBpedia Extraction Framework. Because it generates knowledge from semi-structured, user-created infobox data, DBpedia can expect high reliability in terms of accuracy. However, since only about 50% of all pages in Korean Wikipedia contain an infobox, DBpedia is limited in terms of knowledge scalability. This paper proposes a method that uses machine learning to extract knowledge from text documents according to the ontology schema. To demonstrate the appropriateness of this method, we describe a knowledge extraction model that follows the DBpedia ontology schema and is trained on Wikipedia infoboxes. The model consists of three steps: classifying documents into ontology classes, classifying the sentences suitable for extracting triples, and selecting values and transforming them into RDF triple structure. The structure of Wikipedia infoboxes is defined by infobox templates, which provide standardized information across related articles, and the DBpedia ontology schema can be mapped to these templates. Based on these mapping relations, we classify the input document according to infobox categories, which correspond to ontology classes. After determining the class of the input document, we classify the appropriate sentences according to the attributes belonging to that class. Finally, we extract knowledge from the sentences classified as appropriate and convert it into triples. To train the models, we generated a training data set from a Wikipedia dump by adding BIO tags to sentences, and trained about 200 classes and about 2,500 relations for knowledge extraction. Furthermore, we conducted comparative experiments with CRF and Bi-LSTM-CRF for the knowledge extraction process. Through the proposed process, structured knowledge can be utilized by extracting knowledge from text documents according to the ontology schema. In addition, this methodology can significantly reduce the effort experts need to construct instances according to the ontology schema.
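The last step, turning BIO-tagged sentences into triples, might look something like the sketch below. It assumes a sequence tagger (for example a CRF or Bi-LSTM-CRF) has already produced one BIO tag per token, and that the tag suffix names an ontology property. The sentence, the `ex:Alice` subject, the tag names, and the helper functions are all hypothetical, and real RDF output would normally go through an RDF library rather than plain tuples.

```python
def bio_spans(tokens, tags):
    """Collect (label, text) spans from BIO tags such as B-birthPlace / I-birthPlace / O."""
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span begins; close the previous one if it is still open.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the span that is currently open.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag: close any open span.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


def to_triples(subject, tokens, tags):
    """Build (subject, predicate, object) triples from one tagged sentence."""
    return [(subject, f"dbo:{label}", value) for label, value in bio_spans(tokens, tags)]


# Hypothetical sentence and tagger output.
tokens = ["Alice", "was", "born", "in", "New", "York", "in", "1990"]
tags = ["O", "O", "O", "O", "B-birthPlace", "I-birthPlace", "O", "B-birthYear"]

triples = to_triples("ex:Alice", tokens, tags)
for triple in triples:
    print(triple)
```

Keeping span collection separate from triple construction means the same `bio_spans` helper can be reused whichever tagger produces the tags.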
https://doi.org/10.13088/jiis.2019.25.1.043

A Study on a Method to Use Activation and Awareness on Archives of University Student (대학생의 기록관 인식현황 및 이용 활성화 방안 연구)

Lee, Jung-eun;Gang, Juyeon;Kim, Eun-Sil;Kim, Yong
- The Korean Journal of Archival Studies / no.51 / pp.133-173 / 2017
Records and archives centers provide a variety of archival information services in an effort to get closer to the public. However, public awareness of records and archives is still lacking. In order to increase the use of archives, it is necessary to understand their users. This study therefore investigates the awareness of records and archives among university students, who are potential users of archives, and suggests methods to increase the use of records and archives that reflect the characteristics of that awareness. To this end, the study surveyed 182 university students at J university. The questionnaire items were based on the survey by Market & Opinion Research International (MORI) (2003), conducted as part of the projects of the Museums Libraries Archives Council (MLA), and on Cho's study (2008). The items covered four major areas: awareness of records and archives, experience with records and archives or reasons for not using them, requirements of potential users for the use of archives, and efficient methods of promoting archives. The survey showed that most of the university students are indifferent to records. However, they recognized that it is highly important to manage records with historical value and archives with informational value. In addition, they showed a positive intention to use archives in the future; thus, they are highly likely to become active users if appropriate services are provided. Based on the results, this study proposed important considerations for increasing the use of archives by university students and suggested methods in terms of user education, program development, and user segmentation.
https://doi.org/10.20923/kjas.2017.51.133

Behavior of Nutrients and Heavy Metals (Cu, Zn) and Applicability Evaluation from Swine Wastewater Treatment Using Microalga Scenedesmus obliquus (미세조류 Scenedesmus obliquus 영양염류와 중금속(Cu, Zn) 거동특성 및 축산 폐수 처리 적용성 평가)

Park, Ji-Su;Hwang, In-Sung;Oh, Eun-Ji;Yoo, Jin;Chung, Keun-Yook
- Applied Chemistry for Engineering / v.30 no.2 / pp.226-232 / 2019
Biological wastewater treatment is more eco-friendly than conventional treatment and can be applied effectively to wastewater for a variety of purposes. In particular, wastewater treatment using microalgae has attracted great attention because it removes nutrients from wastewater economically and has many advantages as a renewable energy source. This study sought to establish the optimal growth conditions for the microalga Scenedesmus obliquus and to evaluate its removal efficiencies for nutrients (N, P) and heavy metals (Cu, Zn) in synthetic wastewater. As a result, the optimal growth conditions were established as 28°C, pH 7, and a light/dark cycle of 14:10 h. In the evaluation of nutrient removal at concentrations of 500, 1,000, 5,000, and 10,000 mg/L, the removal rates were 17.6~70% N and 8.4~34% P in the single treatment and 12.0~58.0% N and 3.0~40.3% P in the binary mixture treatment. In the evaluation of heavy metal removal at concentrations of 10, 30, and 50 mg/L, the removal rates were 13.7~40.3% Cu and 10.0~30.0% Zn in the single treatment and 16.0~40.0% Cu and 12.0~20.0% Zn in the binary mixture treatment. Based on these results, Scenedesmus obliquus appears usable for the removal of nutrients and heavy metals from swine wastewater.
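Removal rates like those quoted above are conventionally computed from the initial and final concentrations of a pollutant. The abstract does not state its formula or report final concentrations, so the following is only a sketch of the usual definition, with a made-up final concentration chosen for illustration and a hypothetical function name.

```python
def removal_efficiency(initial_mg_per_l, final_mg_per_l):
    """Percentage of a pollutant removed: 100 * (C_initial - C_final) / C_initial."""
    if initial_mg_per_l <= 0:
        raise ValueError("initial concentration must be positive")
    return 100.0 * (initial_mg_per_l - final_mg_per_l) / initial_mg_per_l


# Hypothetical measurement: 500 mg/L at the start, 150 mg/L at the end.
rate = removal_efficiency(500.0, 150.0)
print(f"{rate:.1f}% removed")
```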
https://doi.org/10.14478/ace.2019.1003

Development of HRM Markers Based on SNPs Identified from Next Generation Resequencing of Susceptible and Resistant Parents to Gummy Stem Blight in Watermelon (수박에서 덩굴마름병 감수성 및 저항성 양친에 대한 차세대 염기서열 재분석으로 탐색된 SNP 기반 HRM 분자표지 개발)

Lee, Eun Su;Kim, Jinhee;Hong, Jong Pil;Kim, Do-Sun;Kim, Minkyong;Huh, Yun-Chan;Back, Chang-Gi;Lee, Jundae;Lee, Hye-Eun
- Korean Journal of Breeding Science / v.50 no.4 / pp.424-433 / 2018
Watermelon (Citrullus lanatus) is an economically important vegetable crop worldwide and contains functional compounds such as lycopene and citrulline. Gummy stem blight, caused by Didymella bryoniae, is one of the most devastating diseases of watermelon. Single nucleotide polymorphisms (SNPs), which are genetic variations between individuals at a single base, are often used to construct genetic linkage maps and to develop molecular markers linked to a variety of horticultural traits and to resistance to several diseases. In this study, we developed high-resolution melting (HRM) markers based on SNPs generated from NGS resequencing of two watermelon parents. The plant materials were C. lanatus '920533' (female and susceptible parent), C. amarus 'PI 189225' (male and resistant parent), and their $F_1$ and $F_2$ progenies. A total of 13.6 Gbp ('920533') and 13.1 Gbp ('PI 189225') of genomic sequences were obtained using NGS analysis. A total of 6.09 million SNPs between '920533' and 'PI 189225' were detected, and 354,860 SNPs were identified as potential HRM primer sets. From these, a total of 330 primer sets for HRM analysis were designed. As a result, a total of 61 HRM markers with polymorphic melting curves were developed. These HRM markers can be used for the construction of SNP-based linkage maps and for the analysis of quantitative trait loci (QTLs) related to gummy stem blight resistance.
https://doi.org/10.9787/KJBS.2018.50.4.424

Comparison of Isoflavone Content in 43 Soybean Varieties Adapted to Highland Cultivation Areas (고랭지 적응 콩 43개 품종의 해발고도별 이소플라본 함량 비교)

Hong, Su-Young;Kim, Su-Jeong;Sohn, Hwang-Bae;Kim, Yul-Ho;Cho, Kwang-Soo
- Korean Journal of Breeding Science / v.50 no.4 / pp.442-452 / 2018
In this study, we analyzed the growth characteristics and isoflavone content of 43 soybean varieties highly adaptable to highland areas. The flowering period in each cultivation zone was from July 15 to August 12 at Daegwallyeong, from July 18 to August 11 at Jinbu, and from July 23 to August 13 at Gangneung. The accumulated temperature from flowering to maturity was 1,297°C for Daegwallyeong, 1,391°C for Jinbu, and 1,685°C for Gangneung. The forty-three varieties were classified into four use categories: soy sauce and tofu, bean sprouts, cooking with rice, and vegetable and early maturity. The isoflavone content was highest, at 2,579 μg/g, in varieties used for soy sauce and tofu. Five varieties ("Paldalkong," "Sinpaldal2," "Ilmikong," "Sinpaldalkong," and "Daepung") cultivated in Daegwallyeong had over 4,000 μg/g of isoflavone. The isoflavone content in Daegwallyeong differed from that in Gangneung at the 0.1 significance level (p=0.061). There was no significant difference between Gangneung and Jinbu. The low temperature during the maturation stage of the growing period is thought to have affected isoflavone accumulation. The varieties with more than 3,000 μg/g of isoflavone in Daegwallyeong, Jinbu, and Gangneung were "L29," "Williams82," "Ilmikong," and "Daepung." These varieties were genetically and environmentally stable in isoflavone content. This study is expected to serve as basic data for the functional breeding and selection of soybean varieties highly adaptable to a specific region, and to help expand soybean cultivation areas in highlands.
https://doi.org/10.9787/KJBS.2018.50.4.442

The Effect of Influencer's Characteristics and Contents Quality on Brand Attitude and Purchase Intention: Trust and Self-congruity as a Mediator (소셜미디어 인플루언서의 개인특성과 콘텐츠 특성이 브랜드 태도와 구매의도에 미치는 영향: 신뢰와 자아일치성을 매개로)

Lee, Myung Jin;Lee, Sang Won
- Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.16 no.5 / pp.159-175 / 2021
This study analyzed how influencer characteristics (professionalism, authenticity, and interactivity) and content quality factors (accuracy, completeness, and diversity) affect brand attitude and purchase intention through trust and self-consistency. To reveal the structural relationships among the main variables, a survey was conducted with 201 users. Exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and reliability analysis were performed to confirm reliability and validity, and structural equation modeling was used to test the hypotheses. The main results are as follows. First, professionalism and interactivity had a significant positive effect on trust, and accuracy, completeness, and diversity all had a significant positive effect on trust. Second, among the individual characteristic factors, professionalism and authenticity had a significant positive effect on self-consistency. In addition, among the content quality factors, accuracy, completeness, and diversity had a positive effect on self-consistency, as they did on trust. Third, both trust and self-consistency had a statistically significant positive effect on brand attitude, whereas only self-consistency and brand attitude had a statistically significant positive effect on purchase intention. These findings show that when users perceive professionalism and interactivity in an influencer, trust increases, and that professionalism and authenticity increase self-consistency with the influencer. In addition, trust and self-consistency responded positively when users perceived content quality through the accuracy, completeness, and diversity of the content. Trust and self-consistency also improved attitudes toward brands and could influence consumption behavior such as purchase intention. Therefore, for effective marketing performance in influencer marketing, which has strong power to deliver information about products and brands, not only personal characteristics such as professionalism, authenticity, and interactivity but also the quality of content should be considered. These results are expected to offer implications for marketing strategies and practices and to serve as basic data for achieving the expected effect of influencer marketing.

Yield Characteristics and Related Agronomic Traits Affected by the Transplanting Date in Early Maturing Varieties of Rice in the Central Plain Area of Korea (중부 평야지에서 조생종 벼의 이앙시기에 따른 수량 특성 변화와 작물학적 요인 분석)

Yang, Woonho;Park, Jeong-Hwa;Choi, Jong-Seo;Kang, Shingu;Kim, Sukjin
- Korean Journal of Crop Science / v.64 no.3 / pp.165-175 / 2019
In response to elevated temperature, a shift in the rice planting period has been proposed as a promising option in temperate regions. To understand the yield response of early maturing rice to different transplanting dates and to analyze the related agronomic traits in the central plain area, we performed a two-year study using different transplanting dates and six varieties in Suwon, Korea. The maximum head rice weight was achieved in the treatments transplanted between June 14 and 29, depending on the variety. The optimal mean temperature during the 40 days from the heading stage for attaining the maximum head rice weight was 21.8°C, averaged across the six varieties. The index of head rice weight was positively correlated with the indices of both milled rice weight and head rice percentage, the latter showing a higher coefficient of determination. The highest milled rice weight was commonly achieved in the treatment transplanted on June 29, where the head rice weight was also the highest. The index of milled rice weight was significantly correlated with the indices of grain filling percentage and number of spikelets per area, but not with the index of 1000-brown rice weight. The transplanting date with the highest milled rice yield produced the largest number of spikelets per area, the greatest biomass at the heading and harvesting stages, and the highest harvest index. We suggest that the optimal transplanting date for early maturing rice varieties in the central plain area is from June 14 to 29. The high head rice yield in this study was attributed to increased spikelets owing to increased biomass production at the heading stage, enhanced grain filling due to high biomass production and harvest index at maturity, and improved head rice percentage.
https://doi.org/10.7740/kjcs.2019.64.3.165

A Study on the Painting's Aesthetic of Gongjae Yoon Duseo (공재(恭齋) 윤두서(尹斗緖)의 회화심미(繪畵審美) 고찰)

Kim, Doyoung
- The Journal of the Convergence on Culture Technology / v.7 no.1 / pp.175-183 / 2021
Gongjae Yoon Duseo (1668~1715), from Haenam, was a scholar-painter active during the reign of King Sukjong in the late Joseon Dynasty. As a pioneer of realist painting, he laid its foundation during the transition from the middle to the late Joseon period. He was born into a prestigious family of the Namin faction, but his official career ended amid partisan strife, and he immersed himself in painting and learning. The 18th century, the beginning of the late Joseon Dynasty, was a period when Silhak emerged and the Jinkyung era opened with an awareness of nationalism. At this time, by incorporating Silhak thought into the art world, he demonstrated a realistic, reform-minded aesthetic consciousness, pioneering the depiction of common people's customs, the application of Western painting methods, the pursuit of realist techniques, and the introduction of Namjongmuninhwa. Having thoroughly studied the old while pursuing change, he held that a painting must have both form and spirit, and that 'HwaDo' can be achieved only when the scholarly elements of 'learning and knowledge' are combined with the technical elements of 'practice and quality'. He worked in a variety of painting genres. In particular, his portraits are characterized by the realistic aesthetic expression of ihyeongsasin. His masterpiece, 「Self-portrait」, excels in extremely realistic depiction and innovative composition, and stands out for an unconventional experimental spirit that expresses his mind and thoughts, with a sense of resentment, in a painting. His landscape paintings combine the depiction of form as it is with mental notions and beautifully embody the Do in form, thus achieving ihyeongmido and reaching the level of 'joyfulness forgotten even the heart of joy'. On the other hand, his paintings of common people, which open-mindedly took various aspects of common people's lives as their subject, aimed at the truthful depiction of ihyeongsajin and were a passive protest against corrupt power and an expression of a spirit of love. His painting style was later passed down to his eldest son Yoon Deok-hee and his grandson Yoon Yong, leading the change and revival of calligraphy art in the late Joseon Dynasty.
https://doi.org/10.17703/JCCT.2021.7.1.175
