Multi-Vector Document Embedding Using Semantic Decomposition of Complex Documents (복합 문서의 의미적 분해를 통한 다중 벡터 문서 임베딩 방법론)

  • Park, Jongin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.19-41 / 2019
  • With the rapidly increasing demand for text data analysis, research and investment in text mining are being actively pursued not only in academia but also in various industries. Text mining is generally conducted in two steps. In the first step, the text of the collected documents is tokenized and structured to convert the original documents into a computer-readable form. In the second step, tasks such as document classification, clustering, and topic modeling are conducted according to the purpose of the analysis. Until recently, text mining studies focused on applications of the second step. However, with the discovery that the text structuring process substantially influences the quality of the analysis results, various embedding methods have been actively studied to improve that quality by preserving the meaning of words and documents when representing text data as vectors. Unlike structured data, which can be directly applied to a variety of operations and traditional analysis techniques, unstructured text must first undergo a structuring task that transforms the original document into a form the computer can understand. Mapping arbitrary objects into a space of a specific dimension while maintaining their algebraic properties, in order to structure text data, is called "embedding." Recently, attempts have been made to embed not only words but also sentences, paragraphs, and entire documents. In particular, as the demand for document embedding has increased rapidly, many algorithms have been developed to support it. Among them, doc2Vec, which extends word2Vec and embeds each document into one vector, is the most widely used. However, traditional document embedding methods, represented by doc2Vec, generate a vector for each document using the entire text of the document. As a result, the document vector is affected not only by core words but also by miscellaneous words. Additionally, traditional document embedding schemes usually map each document to a single vector. Therefore, it is difficult to accurately represent a complex document with multiple subjects as a single vector using the traditional approach. In this paper, we propose a new multi-vector document embedding method to overcome these limitations. This study targets documents that explicitly separate body content and keywords. For a document without keywords, the method can be applied after extracting keywords through various analysis methods. However, since keyword extraction is not the core subject of the proposed method, we describe the process of applying the method to documents with predefined keywords. The proposed method consists of (1) Parsing, (2) Word Embedding, (3) Keyword Vector Extraction, (4) Keyword Clustering, and (5) Multiple-Vector Generation. The specific process is as follows. First, all text in a document is tokenized, and each token is represented as an N-dimensional real-valued vector through word embedding. Then, to overcome the limitation that traditional document embedding is affected by miscellaneous words as well as core words, the vectors corresponding to the keywords of each document are extracted to form a set of keyword vectors for each document. Next, clustering is conducted on the set of keywords of each document to identify the multiple subjects included in the document. Finally, a multi-vector is generated from the keyword vectors constituting each cluster. Experiments on 3,147 academic papers revealed that the traditional single-vector approach cannot properly map complex documents because of interference among subjects in each vector. With the proposed multi-vector method, we ascertained that complex documents can be vectorized more accurately by eliminating the interference among subjects.
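
The five steps listed in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the word vectors are hand-written 2-D toy values standing in for a trained word embedding (steps 1 and 2), a plain k-means is assumed for the keyword clustering step because the abstract does not name the clustering algorithm, and each cluster is summarized by the mean of its keyword vectors, which is also an assumption. The names `mean_vector`, `kmeans`, and `multi_vector_embedding` are hypothetical.

```python
from math import dist


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialization (assumed)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from its nearest existing centroid.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean_vector(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return clusters


def multi_vector_embedding(keywords, word_vectors, k):
    """Steps 3-5: keyword vector extraction, clustering, multi-vector generation."""
    keyword_vectors = [word_vectors[w] for w in keywords if w in word_vectors]
    clusters = kmeans(keyword_vectors, k)
    return [mean_vector(c) for c in clusters if c]


# Toy 2-D "word embedding"; real vectors would come from a trained model.
word_vectors = {
    "embedding": (0.9, 0.1), "doc2vec": (1.0, 0.0), "vector": (0.8, 0.2),
    "clustering": (0.1, 0.9), "k-means": (0.0, 1.0),
    "the": (0.5, 0.5),  # miscellaneous word, ignored because it is not a keyword
}
keywords = ["embedding", "doc2vec", "vector", "clustering", "k-means"]

# One vector per keyword cluster (the multi-vector representation).
doc_vectors = multi_vector_embedding(keywords, word_vectors, k=2)
# For comparison: a single vector averaging all keywords blends both subjects.
single_vector = mean_vector([word_vectors[w] for w in keywords])
print(doc_vectors)
print(single_vector)
```

In this toy case the two cluster vectors stay close to their own subject, while the single averaged vector lands between the two, which is the kind of interference among subjects the abstract describes.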

Hanseong Period of Baekje and Mahan (한성시대(漢城時代)의 백제(百濟)와 마한(馬韓))

  • Choi, Mong-Lyong
    • Korean Journal of Heritage: History & Science / v.36 / pp.5-38 / 2003
  • The history of the Baekje Kingdom, one of the Three Kingdoms, is divided into three periods according to changes in its sociopolitical center, including its capital: the Hanseong Period (18 BC ~ AD 475), the Ungjin Period (AD 475~538), and the Sabi Period (AD 538~660). Though the Hanseong Period (493 years) covers more than two thirds of the whole history of the Baekje Kingdom, the history and archaeological culture of the Hanseong Period are still unclear and even ambiguous compared with the Ungjin and Sabi periods. Above all, this is because the available historical records and archaeological data are quite limited. In addition, the negative attitude of Korean academic circles toward the early records of the Samguksaki (三國史記) has been a critical obstacle to the study of the early history of the Three Kingdoms, including the Hanseong Period of Baekje. The author, who has attempted to combine historical records and archaeological data in order to reconstruct the history and archaeological culture of early Baekje, specifically the Hanseong Period, has held a positive attitude toward the early records of the Samguksaki as far as possible. The author came to realize that a comprehensive understanding of Mahan (馬韓) society, one of the Three Han (三韓) societies, is essential to the study of Baekje. According to historical records and archaeological data, Mahan society, represented by Mojiguk (目支國) ruled by King Jin (辰王), was located in the middle and/or southwestern parts of the Korean Peninsula from the 3rd~2nd century BC through the end of the 5th century or early 6th century AD. Mahan already occupied the central portion of the Korean Peninsula, including the Han River Valley, when King Onjo (溫祖王) first set up the capital of the Baekje Kingdom at Wiryeseong (慰禮城), considered to be the modern Jungrang~Songpa-gu area of the Han River Valley. From the beginning of Baekje history, there were quite close interrelationships between Baekje and Mahan, and these lasted for around 500 years. In other words, it is impossible to understand and study the Hanseong Period of Baekje without considering the historical and archaeological identity of Mahan. According to the Samguksaki, Baekje moved its capital three times during the Hanseong Period (18 BC ~ AD 475) within the Han River Valley, as follows: Wiryeseong in the Jungrang-gu area of the Han River (河北慰禮城, 18 ~ 5 BC), Wiryeseong in the Songpa-gu area of the Han River (河南慰禮城, 5 BC ~ AD 371), Hansan at the Iseongsan fortress site (Historical site No. 422, 漢山, AD 371~391), and Hanseong at Chungung-dong of Hanam city (漢城, AD 391~475). Before the 1990s, archaeological data on the Hanseong Period were quite limited, and the archaeological culture of Mahan was not well defined. Only a few burial and fortress sites were reported as archaeological remains of early Baekje, and a few settlement and jar burial sites were assumed to be those of Mahan without a clear definition of Mahan culture. Since the 1990s, fortunately, a number of new archaeological sites of Hanseong Baekje and Mahan have been reported and investigated. Thanks to the new discoveries, there has been significant progress in the study of early Baekje and Mahan. In particular, a number of excavations at the Pungnap-dong Fortress site (Historical site No. 11, 1996~2003), considered to be the Wiryeseong south of the Han River, the second capital of Hanseong Baekje, provided critical archaeological evidence for the study of the Hanseong Period of Baekje. Since the end of the 1990s, a number of sites have been reported in Gyeonggi, Chungcheong, and Jeolla provinces as well. From these sites, archaeological features and artifacts representing the distinctive cultural tradition of Mahan have been identified, such as unstamped fortresses, pit houses cut into the rock, houses with raised floors (掘立柱 buildings), and pottery decorated with toothed-wheel and bird's-footprint designs. These cultural traditions reflected in the archaeological remains played a critical role in defining and understanding the archaeological identity of Mahan society. Moreover, archaeological data from these new sites in the middle and southwestern parts of the Korean Peninsula made it possible to postulate that the history of Mahan can be divided into three periods according to changes in its sociopolitical center in relation to the Baekje Kingdom's political situation: the Cheonan (天安) Period, the Iksan (益山) Period, and the Naju (羅州) Period. The change of Mahan's sociopolitical center is closely related to the sociopolitical expansion of Hanseong Baekje.

Isolation of Wild Yeasts from the Water and Riverside Soil of Geumgang Midstream in Sejong City, Korea, and Characterization of Unrecorded Wild Yeasts (세종특별자치시 주변의 금강 중류 물과 토양에서 야생 효모의 분리 및 국내 미기록 효모의 특성)

  • Han, Sang-Min;Kim, Ji-Yoon;Lee, Jong-Soo
    • The Korean Journal of Mycology / v.47 no.1 / pp.51-61 / 2019
  • The goal of this study was to elucidate the wild yeast diversity of the Geumgang midstream near Sejong metropolitan autonomous city, Korea. Thirty-seven strains of 32 species of wild yeasts were isolated from 43 water and soil samples collected under the Bulti bridge in Sejong city, Korea. Candida spp. and Cryptococcus spp., with seven strains each, were the predominant yeasts isolated from samples near the Bulti bridge. Holtermanniella takashimae SW048 (NNIBRFG9314), Cystofilobasidium infirmominiatum SW013 (NNIBRFG9310), Mrakia cryoconite SW015 (NNIBRFG9316), Pichia sporocuriosa SW085 (NNIBRFG9326), and Cryptococcus aspenensis SW008 (NNIBRFG9309) represented novel yeast strains found in Korea for the first time. All of these previously unrecorded yeasts, except for Mrakia cryoconite SW015, had ascospores and grew well in yeast extract-peptone-dextrose (YPD), yeast extract-malt extract (YM), and potato-dextrose (PD) media. Pichia sporocuriosa SW085 grew well in vitamin-free medium, and Holtermanniella takashimae SW048, a halotolerant wild yeast, grew well in YPD medium containing 5% NaCl. Twenty-six strains representing eight species of wild yeast were isolated from 22 water and soil samples collected under the Haetmuri bridge in Sejong city, Korea. Candida pseudolambica (12 strains) and Aureobasidium pullulans (11 strains) were the predominant isolates from samples near the Haetmuri bridge. Occultifur kilbournensis HB060 (NNIBRFG9317), Sampaiozyma vanillica HB014 (NNIBRFG9332), Xenoramularia neerlandica HB039 (NNIBRFG9335), Candida norvegica HB315 (NNIBRFG9306), C. melibiosica HB316 (NNIBRFG9305), C. quercuum GB014 (NNIBRFG9307), and C. succiphila GB015 (NNIBRFG9308) represented novel yeast strains recorded in Korea for the first time. O. kilbournensis HB060 and X. neerlandica HB039 did not form ascospores or pseudo-mycelia. All of these previously unrecorded yeasts, except S. vanillica HB014 and X. neerlandica HB039, grew well in vitamin-free medium, and C. norvegica HB315 and C. succiphila GB015, which were halotolerant wild yeasts, grew well in YPD medium containing 5% NaCl.

A Plan to Activate the Archive of Maeul Communities (마을공동체 아카이브 활성화 방안)

  • Sohn, Dong-you;Lee, Kyoung-juhn
    • The Korean Journal of Archival Studies / no.35 / pp.161-206 / 2013
  • 'Maeul' is a concept connoting a community. As a place where ordinary people's lives are planned and realized, a Maeul is the foundation of their daily lives as well as a place where they work, rest, and enjoy pastimes. In Korea, however, most Maeul communities were dismantled while going through the modern period, represented by colonization and developmental dictatorship. Growth-oriented industrialization and urbanization produced such adverse effects as individualization, a sense of loss, and a sense of alienation. Recently, through innovations from below, Maeuls are being restored, and through the Maeul communities restored this way, Maeuls and many researchers are carrying out activities to build a healthy civil society. This study was conducted against this background. For the healthy restoration and sustainable operation of Maeul communities, it is necessary to establish archives that record the traces of Maeul members' daily lives and the relations between those members. The archive of a Maeul community is a place that contains each Maeul's local characteristics as well as its human relations. This is because it can be a space where Maeul members record their history, communicate with each other, and make a better future. The archive of Maeul communities can take various models, which can be operated in ways that reflect the identity of a community, such as its main agents and the characteristics, objectives, and orientation of the objects recorded. Maeul communities can perform more important functions and achieve better effects when they form a network than when they exist individually. Therefore, it is necessary to provide various creative methodologies different from the existing government-led records management. Maeul and Maeul residents' norms, orientation, and realistic conditions should be thoroughly reflected not only in the form of archives but also in all of their functions, such as collection, arrangement, classification, evaluation, management, and utilization. Starting from a chance to look back at individuals' lives, the archive of Maeul communities will open a new chapter in restoring and building a healthy community in our society and overcoming social contradictions from below. Moreover, the archive of Maeul communities is significant in that it will creatively broaden its prospects with a new paradigm, rather than mechanically transferring the existing public sector-centered records management to the non-governmental sector.

A Status Analysis for the Standards on Permission of Altering Cultural Heritage's Current State Focusing on the Results of Handling Application Cases on Permission of State-Designated Cultural Heritage (Historic Site) for the Last Five Years (2015~2019) (문화재 현상변경 인·허가 검토기준 마련을 위한 실태분석 연구 - 최근 5년(2015~2019)간 국가지정문화재(사적)의 허가신청 안건 처리결과를 중심으로 -)

  • CHO, Hongseok;SUH, Hyunjung;CHOI, Jisu
    • Korean Journal of Heritage: History & Science / v.54 no.3 / pp.24-51 / 2021
  • Since June 2006, there have been active efforts to systematize the permission system, including the amendment of the [Cultural Heritage Protection Act]. The Cultural Heritage Administration prepared standards for reviewing each type of cultural heritage (CH) in 2015, promoted a project to modify the permission standards, and showed remarkable performance in quantitative terms. However, as there has been little change in the cases applied for permission, additional policy studies are required to improve management efficiency and reduce citizens' inconvenience. In response, this study aims to identify the actual management status of the current-state alteration permission system and to establish practically usable reference materials for permission review. While historic sites (HS) constitute a relatively small proportion of state-designated CHs, they are subject to the designation of permission standards. Also, because they are located in downtown areas, their application rate is high (51.4%), and the results are commonly applicable to other types of CH. We constructed a DB based on the minutes of the Cultural Heritage Committee (CHC) on HS and categorized similar features in the permission handling results. The results of the analysis are as follows. Of a total of 5,243 permission applications for HS, 1,734 were for cultural heritage areas (CHA) and 3,509 were for historic and cultural environment preservation areas (HCEPA). CHA has a large proportion of applications for events and festivals that are highly related to CHs or represent the local area. There is a high permission rate for applications made by local governments for public-service purposes. Meanwhile, HCEPA has a high proportion of private-level applications for the installation and extension of buildings and facilities. Thus, negative decisions were made for tall buildings, massed facilities, or cases where a spread of similar acts was suspected. Our analysis of actual conditions identified a total of 78 types of harmful acts that may affect the preservation of CHs. Of these, 31 types in CHA and 37 types in HCEPA were categorized. In particular, 10 common types of permission were confirmed in both sectors. As a result, this study is expected to secure consistency in permission administration, enhance management efficiency, and improve public satisfaction with regulatory administration by providing practically usable reference materials for altering the current state of CH and for decision making by the CHC.

Influence of Microcracks in Geochang Granite on Brazilian Tensile Strength (거창화강암의 미세균열이 압열인장강도에 미치는 영향)

  • Park, Deok-Won
    • Korean Journal of Mineralogy and Petrology / v.34 no.3 / pp.193-208 / 2021
  • The characteristics of the microcrack lengths (①), microcrack spacings (②), and Brazilian tensile strengths (③) related to the six directions of rock cleavages (H2~R1) in Geochang granite were analyzed. First, 18 cumulative graphs for the above three major factors, representing the unique characteristics of the rock cleavages, were made. Through the general chart for these graphs, classified into three planes and three rock cleavages, 28 parameters on the length, spacing, and Brazilian tensile strength were determined. The results of the correlation analysis among these parameters are summarized as follows. Second, the above parameters were classified into six groups (I~VI) according to the order of magnitude of the parameter values among the three rock cleavages and three planes. The values of the parameters belonging to groups I and II are in the order of R (rift) < G (grain) < H (hardway) and H < G < R. The values of the 8 parameters on the length of line (os2, Δs, ΔL and oSmean), the exponent (λLmean and λSmean), the slope (amean), and the anisotropy coefficient (Anmean) are in the order of R < G < H and H' (hardway plane) < G' (grain plane) < R' (rift plane). Third, the noticeable differences in distribution patterns among the six types of charts for the three planes and three rock cleavages are as follows. From the chart for the three planes, the values of ΔL, Δs, and Δσt, corresponding to the distance between the two points where the two fitting lines meet on the X-axis, increase in the order of R' < H' < G'. In particular, the two graphs of R2 and G2 related to the length and Brazilian tensile strength are almost parallel to each other and show the distribution characteristics of the hardway plane. Among the graphs related to the Brazilian tensile strength, the overall shape for the hardway plane is similar to that for the grain. From the chart for the three rock cleavages, the slopes of the graphs related to the length increase in the order of R < G < H, while those of the graphs related to the spacing and Brazilian tensile strength decrease in the order of R < G < H. Lastly, the characteristics of variation among the six rock cleavages, the three planes, and the three rock cleavages were visualized through the correlation chart among the above parameters.

Evaluation of Stabilization Capacity for Typical Amendments based on the Scenario of Heavy Metal Contaminated Sites in Korea (국내 중금속 부지오염시나리오를 고려한 안정화제의 중금속 안정화 효율 규명)

  • Yang, Jihye;Kim, Danu;Oh, Yuna;Jeon, Soyoung;Lee, Minhee
    • Economic and Environmental Geology / v.54 no.1 / pp.21-33 / 2021
  • The purpose of this study is to determine the order of priority for the use of amendments, matching the optimal amendment to specific sites in Korea. This decision-making process must prioritize the stabilization efficiency and economic efficiency of amendments for heavy metals and metalloids based on domestic site contamination scenarios. For this study, a total of 5 domestic heavy metal contaminated sites were selected based on different pollution scenarios, along with 13 amendments previously studied as soil stabilizers. Batch extraction experiments were performed to quantify the stabilization efficiency for 8 heavy metals (including As and Hg) in 5 soil samples representing the 5 different pollution scenarios. For each amendment, XRD and XRF analyses were conducted to identify its properties, and the toxicity characteristic leaching procedure (TCLP) test and the synthetic precipitation leaching procedure (SPLP) test were conducted to evaluate its leaching safety at the applied site. From the results of the batch experiments, the amendments showing > 20% extraction-lowering efficiency for each heavy metal (metalloid) were selected, and the top 5 ranked amendments were determined for different amendment amounts and extraction times. For each amendment, the total number of times it ranked in the top 5 was counted to prioritize feasible amendments for specific contaminated sites in Korea. Mine drainage treatment sludge, iron oxide, calcium oxide, calcium hydroxide, calcite, iron sulfide, and biochar showed high extraction-decreasing efficiency for heavy metals, in descending order. When the economic efficiency of these amendments was analyzed, mine drainage treatment sludge, limestone, steelmaking slag, calcium oxide, and calcium hydroxide were determined to be the priority amendments for field application in Korea, in descending order.
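
The selection-and-counting procedure described in the abstract above (filter amendments above a 20% efficiency threshold, rank them per condition, then count top-rank appearances) might look roughly like this. It is a sketch only: the function name `count_top_rankings`, the condition labels, and every efficiency value are made up for illustration and are not data or code from the study. The example uses a top 2 instead of a top 5 simply because it has only three hypothetical amendments.

```python
from collections import Counter


def count_top_rankings(efficiency_by_condition, threshold=20.0, top_n=5):
    """Count how often each amendment ranks in the top_n across conditions.

    efficiency_by_condition maps a condition label (metal, dose, time) to a
    dict of {amendment: extraction-lowering efficiency in percent}.
    """
    counts = Counter()
    for efficiencies in efficiency_by_condition.values():
        # Keep only amendments strictly above the efficiency threshold.
        qualified = {a: e for a, e in efficiencies.items() if e > threshold}
        # Rank the qualified amendments from highest to lowest efficiency.
        ranked = sorted(qualified, key=qualified.get, reverse=True)
        counts.update(ranked[:top_n])
    return counts


# Made-up efficiencies (percent), purely illustrative.
example = {
    ("Pb", "3% dose", "24 h"): {"sludge": 95.0, "iron oxide": 80.0, "biochar": 15.0},
    ("As", "3% dose", "24 h"): {"sludge": 90.0, "iron oxide": 60.0, "biochar": 30.0},
    ("Cd", "5% dose", "48 h"): {"sludge": 70.0, "iron oxide": 10.0, "biochar": 25.0},
}

priority = count_top_rankings(example, top_n=2)
print(priority.most_common())
```

Sorting the resulting counts in descending order gives a priority list of the kind the abstract reports; an economic-efficiency ranking would need cost data, which this sketch does not model.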

Toponymic Practices for Creating and Governing of Cultural Heritage (문화유산 관리를 위한 지명(地名)의 가치와 활용 방안)

  • KIM, Sunbae
    • Korean Journal of Heritage: History & Science / v.54 no.2 / pp.56-77 / 2021
  • Toponyms are located not only at the site between human cognition and the physical environment but also in the names of cultural heritage. Accordingly, the identities and ideologies that human groups and communities have sought, their holistic way of life, and all cultural symbols and cosmos, such as sense of place and genius loci, are included in their toponymic heritage. Toponyms denote, symbolize, integrate, and represent the culture and nature that belong to the human community. Based on these perceptions of toponymic heritage, the aims of this article are to examine the values of a toponym as Intangible Cultural Heritage (ICH) and to suggest ways of applying toponymic functions to the governing of tangible cultural heritage. In the first chapter, on the values of toponyms as a part of ICH, this article discusses the multivocality, diversity, and non-representational theory of landscape phenomenology intrinsic to the terms culture and cultural landscape, and then the domestic and international issues concerning toponymic heritage. In particular, it analyzes the preceding research in the field of toponymy, as well as the resolutions of UNCSGN and UNGEGN on "Geographical names as culture, heritage and identity," including indigenous, minority and regional language names, since 1992, which are related to UNESCO's 2003 Convention for the Safeguarding of the Intangible Cultural Heritage. Based on this, I suggest the traits of toponymic cultural heritage and five standards for its selection, i.e., the cultural, historical, spatial, socio-economic, and linguistic traits of toponyms, with some examples. The second chapter, which discusses methods of using the denoting functions of toponyms for creating and governing tangible cultural heritage, underlines the need to maintain a systematic and unified principle for naming official cultural heritage and for its governing. Lastly, I introduce possible ways of establishing a conservation area for the historical and cultural environment using the toponymic scale and multi-toponymic territory. Considering both the spatial and participatory turns in the field of heritage studies, in addition to the multiple viewpoints and senses of cultural heritage, I suggest that the conservation area for cultural heritage and the historical and cultural environment should be set up by choosing a certain toponymic scale and multi-toponymic territory.

The Morphologic Characteristics of Step-pool Structures in a Steep Mountain Stream, Chuncheon, Gangwon-do (강원도 춘천시 근교의 산지계류에 형성된 계단상 하상구조의 특징)

  • Kim, Suk Woo;Chun, Kun Woo;Park, Chong Min;Nam, Soo Youn;Lim, Young Hyup;Kim, Young Seol
    • Journal of Korean Society of Forest Science / v.100 no.2 / pp.202-211 / 2011
  • The geometric characteristics of step-pool structures and how they are influenced by channel characteristics were investigated in a steep mountain stream in the Experimental Forests of Kangwon National University in Chuncheon, Gangwon-do. Average values of steps for the study reaches were as follows: step spacing, 4.69 m; step height, 0.47 m; step drop, 0.71 m; step-forming particle size, 0.68 m; number of steps, 21 steps/100 m; ratio of step spacing to channel width, 0.5; and step steepness, 0.13. Step spacing and step height showed negative and positive correlations with channel gradient, respectively, whereas all geometric variables of steps correlated poorly with channel width. Therefore, step steepness, expressed as the ratio of step height to step spacing, increased as channel gradient increased. The ratio of step steepness to channel gradient, which represents the criterion of maximum flow resistance, was 1.2, indicating that the channel bed is in a stable condition. In particular, the ratio of step drop to step height showed a significant negative correlation with channel gradient, suggesting the influence of step-pool geometry in trapping sediment and providing aquatic habitat. Positive correlations also exist between the spacing and drop of steps and the size of step-forming particles. Our findings suggest that the dynamics of step-pool structures may strongly control physical and ecological environments in steep mountain streams, so understanding them is essential for stream management.
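
The two ratios defined in the abstract above, step steepness (step height divided by step spacing) and the ratio of step steepness to channel gradient, can be computed as in the sketch below. The function names are hypothetical, and the reach values in the example are invented round numbers rather than measurements from the study; how the authors averaged these ratios over steps and reaches is not stated in the abstract.

```python
def step_steepness(step_height_m, step_spacing_m):
    """Step steepness: ratio of step height to step spacing (dimensionless)."""
    if step_spacing_m <= 0:
        raise ValueError("step spacing must be positive")
    return step_height_m / step_spacing_m


def steepness_to_gradient_ratio(step_height_m, step_spacing_m, channel_gradient):
    """Ratio of step steepness to channel gradient (gradient given in m/m)."""
    if channel_gradient <= 0:
        raise ValueError("channel gradient must be positive")
    return step_steepness(step_height_m, step_spacing_m) / channel_gradient


# Hypothetical reach: 0.5 m high steps spaced 4.0 m apart on a 0.10 m/m gradient.
steepness = step_steepness(0.5, 4.0)
ratio = steepness_to_gradient_ratio(0.5, 4.0, 0.10)
print(steepness, ratio)
```

A ratio computed this way can then be compared with the value of 1.2 reported in the abstract for the study reaches.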

The study of Zhu-xi(朱熹) and Dai-zhen(戴震)'s filthy poetry interpretation - Centering around 15Guo-feng(國風) (주희(朱熹)와 대진(戴震)의 음시해석(淫詩解釋)에 관한 고찰(考察) - 15국풍(國風)을 중심으로 -)

  • Park, Sun-cheul
    • (The)Study of the Eastern Classic / no.37 / pp.249-278 / 2009
  • Zhu-zi (朱子) represented the study of The Book of Odes in the Song dynasty, and Dai-zhen (戴震) was a researcher of The Book of Odes representing Wan-pai (a school of scholars) in the Qing dynasty. In particular, Dai-zhen took a critical position toward Zhu-xi. Comparing Zhu-zi with Dai-zhen in their interpretation of The Book of Odes, this thesis reviews the differences between the two interpretations and the reasons for them. Specifically, it compares Zhu-zi's and Dai-zhen's interpretations regarding thirty poems that Zhu-zi considered filthy poetry and investigates how they differ. Regarding the poetry that Zhu-zi considered filthy as refined satire, Dai-zhen took a negative position on Zhu-zi's theory of filthy poetry. Because Zhu-zi interpreted the poetry in the first person from a literary viewpoint when he interpreted the lyrics in the Feng-shi, he regarded the purpose and usefulness of poetry as the feeling of words. But because Dai-zhen interpreted the poetry in the third person from the viewpoint of the Confucian classics, he regarded the purpose and usefulness of poetry as refined satire. In brief, Zhu-zi made literary interpretations based on the feeling of words, whereas Dai-zhen made Confucian classic interpretations based on 'Si-wu-xie' (思無邪). The differences between these two men's interpretations of The Book of Odes are of great importance in the history of its study. That is, Dai-zhen took a bibliographical approach and described the meaning of the poetry objectively, following the Mao-shi (毛詩) theory of interpreting the meaning of poetry and criticizing Zhu-zi's literary view. Dai-zhen's interpretation of The Book of Odes described above arose from the long vitality of the Mao-shi theory and the method of the Confucian classics. Considering the historical stream of Zhu-zi's and Dai-zhen's interpretations of The Book of Odes, The Book of Odes will be interpreted and analyzed from various viewpoints in the future.