• Title/Summary/Keyword: SIMILARITY ANALYSIS

Fish Fauna and Community Structure in the Deogyusan National Park, Korea (덕유산국립공원의 어류상과 군집구조)

  • Yun, Seung Woon;Park, Jong Young
    • Korean Journal of Ichthyology / v.33 no.2 / pp.126-141 / 2021
  • The freshwater fish fauna and community structure were investigated at 13 sites in the Deogyusan National Park, Korea, from 2014 to 2018. During this period, a total of 8,716 individuals belonging to 8 families and 21 species were collected. The yearly catches from 2014 to 2018 were 17 species and 2,280 individuals, 17 species and 1,579 individuals, 17 species and 1,905 individuals, 17 species and 1,384 individuals, and 15 species and 1,568 individuals, respectively. Thirteen Korean endemic species were recorded, including Iksookimia koreensis and Coreoleuciscus splendidus. Two endangered species were identified, both only in Wondangcheon Stream: Hemibarbus mylodon was collected every year except 2015, and Pseudopungtungia nigra was observed every year. Two exotic species, Oncorhynchus masou masou and Oncorhynchus mykiss, occurred at the Gucheongdongcheon Stream sites. The dominant species was Rhynchocypris oxycephalus and the sub-dominant species was Zacco koreanus, with no difference between years. The fish community structure of the Deogyusan National Park varied with site and year. Most of the upstream survey sites, where the river structure is of the Aa type, showed poor community analysis results, whereas the upper-mid stream sites, including those of the Bb type, showed better results. The Wondangcheon Stream sites had the most diverse and stable community structure. The similarity dendrogram was divided into 4 groups, mainly reflecting habitat characteristics. The flagship species of the Deogyusan National Park, Rhynchocypris kumgangensis, was observed consistently throughout the investigation period. Compared with previous surveys, the number of species was higher than in 2004 (12 species) and lower than in 2009 (22 species).

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.127-146 / 2022
  • Recently, as word embedding has shown excellent performance in various deep learning-based natural language processing tasks, research on advancing and applying word, sentence, and document embedding has been actively conducted. Among these efforts, cross-language transfer, which enables semantic exchange between different languages, is growing along with the development of embedding models. Academic interest in vector alignment is growing with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to be applied to mapping between specialized and general domains. In other words, it should become possible to map the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pre-trained language model trained on a huge volume of general-purpose documents, or to provide a clue for mapping vocabulary between different specialized fields. However, the linear vector alignment that has mainly been studied in academia assumes statistical linearity and therefore tends to oversimplify the vector space. It essentially assumes that different vector spaces are geometrically similar, which inevitably introduces distortion in the alignment process. To overcome this limitation, we propose a deep learning-based vector alignment methodology that effectively learns the nonlinearity of the data. The proposed methodology sequentially trains a skip-connected autoencoder and a regression model to align the specialized word embedding expressed in each space with the general embedding space. Finally, through inference with the two trained models, the specialized vocabulary can be aligned in the general space.
To verify the performance of the proposed methodology, an experiment was performed on a total of 77,578 documents in the field of 'health care' among national R&D tasks performed from 2011 to 2020. As a result, it was confirmed that the proposed methodology showed superior performance in terms of cosine similarity compared to the existing linear vector alignment.
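
The evaluation criterion named in this abstract, cosine similarity between an aligned specialized-word vector and its counterpart in the general space, can be sketched as follows. This is only an illustration: the function name `cosine_similarity` and all three vectors are made up here and are not data or code from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional embeddings, for illustration only.
target_general = [1.0, 0.0, 1.0]     # the word's vector in the general space
aligned_linear = [1.0, 1.0, 0.0]     # output of some linear alignment
aligned_nonlinear = [1.0, 0.0, 0.9]  # output of some nonlinear alignment

score_linear = cosine_similarity(aligned_linear, target_general)
score_nonlinear = cosine_similarity(aligned_nonlinear, target_general)
```

A score closer to 1 means the aligned vector points in nearly the same direction as the target, which is the sense in which one alignment method can be said to outperform another on this metric.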

The Heading Response and Regional Adaptability of Rice Varieties under the Temperature and Day-Length Conditions of Major Rice Production Areas in North Korea (북한 주요 벼 재배지역의 기온과 일장 환경에서 품종의 출수 반응과 지역 적응성 분석)

  • Woonho Yang;Shingu Kang;Jong-Seo Choi;Dae-Woo Lee
    • KOREAN JOURNAL OF CROP SCIENCE / v.67 no.4 / pp.222-233 / 2022
  • The heading responses of rice varieties that originated from South Korea, North Korea, and northern China were examined under the temperature and day-length conditions of 13 major rice production areas in North Korea. Kenjiandao3 and Nongdae3 originated from China, Olbyeo1, Olbyeo2 and Sonbong9 from North Korea, and Joun from South Korea demonstrated the earliest heading stage depending on the regional environment. Out of 40 rice varieties, 34 reached the heading stage within the regional safe marginal heading date (SMHD) under Haeju and Sariwon environmental conditions, while 16 to 17 varieties reached the heading stage under Wonsan, Changjon, Supung, and Yongyon environmental conditions. Some middle and mid-late maturing varieties that originated from South Korea reached the heading stage within the SMHD under the temperature and day-length conditions of Kaesong, Haeju, Sariwon, Nampo, and Pyongyang that are located in the west-southern plain. The majority of early maturing varieties, but not the middle or mid-late ones, reached the heading stage within the SMHD under the environmental conditions of Singye, Anju, Kusong, and Sinuiju. Only a few early maturing varieties demonstrated the heading stage within the SMHD under Yongyon, Changjon, and Wonsan environments. The number of days to heading was highly positively correlated among all regions; however, it was not consistent among the rice varieties. The 40 rice varieties that had been tested were classified into seven groups according to their heading responses to the temperature and day-length variations of the 13 regional conditions at 65% similarity level in cluster analysis.
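
Grouping items "at a 65% similarity level" can be sketched as below. The abstract does not state which linkage method or similarity measure was used, so this sketch assumes single linkage, for which cutting the dendrogram at a threshold is equivalent to joining every pair whose similarity meets the threshold. The item names and similarity values are hypothetical.

```python
def clusters_at_similarity(names, similarity, threshold):
    """Group items connected by chains of pairs with similarity >= threshold.

    This matches cutting a single-linkage dendrogram at `threshold`.
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), s in similarity.items():
        if s >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical items and similarity values (0 to 1), not from the study.
varieties = ["A", "B", "C", "D"]
pairwise = {
    ("A", "B"): 0.90,
    ("A", "C"): 0.70,
    ("B", "C"): 0.60,
    ("A", "D"): 0.30,
    ("B", "D"): 0.20,
    ("C", "D"): 0.40,
}
groups = clusters_at_similarity(varieties, pairwise, threshold=0.65)
```

Raising the threshold splits groups apart and lowering it merges them, so the number of groups reported at a given similarity level depends directly on that cut.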

Benchmark Test Study of Localized Digital Streamer System (국산화 디지털 스트리머 시스템의 벤치마크 테스트 연구)

  • Jungkyun Shin;Jiho Ha;Gabseok Seo;Young-Jun Kim;Nyeonkeon Kang;Jounggyu Choi;Dongwoo Cho;Hanhui Lee;Seong-Pil Kim
    • Geophysics and Geophysical Exploration / v.26 no.2 / pp.52-61 / 2023
  • The use of ultra-high-resolution (UHR) seismic surveys to precisely characterize coastal and shallow structures has increased recently. UHR surveys derive a spatial resolution of 3.125 m using a high-frequency source (80 Hz to 1 kHz). A digital streamer system is an essential module for acquiring high-quality UHR seismic data. Localization studies have focused on reducing purchase costs and shortening maintenance periods. Basic performance verification and application tests of the developed streamer have been carried out successfully; however, a comparative analysis with the existing benchmark model had not been conducted. In this study, we characterized data obtained by using the developed streamer and a benchmark model simultaneously. Tamhae 2 and auxiliary equipment of the Korea Institute of Geoscience and Mineral Resources were used to acquire 2D seismic data, which were analyzed from different perspectives. The data obtained using the developed streamer differed in sensitivity, by frequency band, from those obtained using the benchmark model. However, both types of data had a very high level of similarity in the range corresponding to the central frequency band of the seismic source. In the low-frequency band below 60 Hz, by contrast, data obtained using the developed streamer showed a lower signal-to-noise ratio than those obtained using the benchmark model. This lower ratio can degrade the quality of data acquired with low-frequency sound sources such as cluster air guns. Three causes for this difference were identified, and streamers developed in the future will attempt to reflect these improvements.

Identify the Type of Exercise to Prevent Falls for Healthy Elderly Life (고령자의 건강한 삶을 위한 낙상 예방 운동유형 확인)

  • Park, Yang-Sun;Kim, Mi-Ye;Park, Seong-Won;Lee, Ok-Jin
    • Journal of Korea Entertainment Industry Association / v.13 no.7 / pp.361-373 / 2019
  • Falls threaten not only the physical health of the elderly but also their overall quality of life. The purpose of this study was to identify which type of exercise is effective for improving balance in the elderly and to obtain basic data for developing a fall prevention exercise intervention program. We compared the differential effects of rhythmic step exercise and core muscle strengthening exercise on a functional balance test and self-reported balance tests. Women aged over 65 and under 80 years were assigned to a step exercise group (n = 21), a core muscle exercise group (n = 20), or a control group (n = 21), and the exercise was administered for 20-30 minutes, twice per week, for 8 weeks. All participants performed a one-foot static balance test with eyes open and eyes closed, and they completed self-reported balance tests such as the Fall Efficacy Scale (FES) and the Activities-specific Balance Confidence (ABC) Scale. The results of the statistical analysis are summarized as follows. First, rhythmic step exercise was more effective in improving functional balance than core muscle strengthening exercise. In particular, the effect of step exercise was clear in the one-foot static balance test with eyes open. Second, step exercise produced better self-reported balance results than core muscle exercise. Specifically, rhythmic step exercise was more effective in enhancing fall efficacy than core muscle exercise. In conclusion, rhythmic step exercise was more effective in improving the balance ability of the elderly than core muscle exercise. Rhythmic step exercise involves the lower extremity muscles more, and because it is performed over varied ground changes, it appears to closely resemble the situations in which falls occur.
For future research, we recommend developing a task-oriented ankle proprioceptive exercise intervention program and exercise equipment based on the specific motion situations in which falls occur in the elderly.

Characterization of a cDNA Encoding Transmembrane Protein 258 from a Two-spotted Cricket Gryllus bimaculatus (쌍별귀뚜라미(Gryllus bimaculatus)의 GbTmem258 cDNA 클로닝과 발현분석)

  • Kisang Kwon;Honggeun Kim;Hyewon Park;O-Yu Kwon
    • Journal of Life Science / v.33 no.10 / pp.828-834 / 2023
  • The cDNA that encodes transmembrane protein 258 (Tmem258) was cloned from Gryllus bimaculatus and named GbTmem258. This protein comprises 80 amino acids, has no N-glycosylation site, and contains five potential phosphorylation sites at two serines, two threonines, and one tyrosine. The predicted molecular mass of GbTmem258 is 9.06 kDa, and its theoretical isoelectric point is 5.5. The tertiary structure of GbTmem258 was predicted using the available secondary structure information, which suggests the presence of alpha helices (52.5%), random coils (22.5%), extended strands (16.25%), and beta turns (8.75%). Homology analysis revealed that GbTmem258 exhibits high similarity at the amino-acid level to Tmem258 found in other species. The effect of starvation and refeeding on GbTmem258 mRNA expression was also examined in this study. It was found that GbTmem258 mRNA expression in the hindgut progressively increased throughout the starvation period, peaking at almost 1.5 times the control level after six days of starvation. However, refeeding for one to two days after the six-day starvation period restored GbTmem258 mRNA expression to the control level. In fat body, GbTmem258 mRNA expression was almost 3-fold higher during starvation compared to the control level. Refeeding for one to two days after the six-day fast resulted in a decline in the expression to about a 2.5-fold increase over the control level. Throughout the starving and refeeding periods, no other tissues showed any discernible alterations in GbTmem258 mRNA expression.
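
The secondary-structure percentages in this abstract can be checked against the stated length of 80 amino acids. The short calculation below converts each percentage into a residue count; the variable names are chosen here for illustration, while the numbers are those given in the abstract.

```python
from fractions import Fraction

SEQUENCE_LENGTH = 80  # amino acids, as stated in the abstract

# Percentages as stated in the abstract, kept exact with Fraction.
composition_percent = {
    "alpha helix": Fraction("52.5"),
    "random coil": Fraction("22.5"),
    "extended strand": Fraction("16.25"),
    "beta turn": Fraction("8.75"),
}

# Convert each percentage into a number of residues.
residue_counts = {
    name: pct * SEQUENCE_LENGTH / 100
    for name, pct in composition_percent.items()
}
total_percent = sum(composition_percent.values())
total_residues = sum(residue_counts.values())
```

The percentages correspond to whole residue counts (42, 18, 13, and 7) that add up to 80, so the reported composition is internally consistent with the stated protein length.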

A Study on the Digital Drawing of Archaeological Relics Using Open-Source Software (오픈소스 소프트웨어를 활용한 고고 유물의 디지털 실측 연구)

  • LEE Hosun;AHN Hyoungki
    • Korean Journal of Heritage: History & Science / v.57 no.1 / pp.82-108 / 2024
  • With the transition of archaeological recording methods from analog to digital, 3D scanning technology has been actively adopted in the field. Research on the digital archaeological data gathered from 3D scanning and photogrammetry is ongoing. However, because of cost and manpower issues, most buried cultural heritage organizations hesitate to adopt such digital technology. This paper presents a digital recording method for relics that uses open-source software and photogrammetry, which is believed to be the most efficient of the 3D scanning methods. The digital recording process consists of three stages: acquiring a 3D model, creating a joining map with the edited 3D model, and creating a digital drawing. To enhance accessibility, the method uses only open-source software throughout the entire process. In the quantitative evaluation, the results confirm that the deviation in numerical measurements between the actual artifact and the 3D model was minimal. In addition, the quantitative quality analysis of results from the open-source software and the commercial software showed high similarity. However, data processing was far faster with the commercial software, which is believed to result from the high computational speed of its improved algorithm. In the qualitative evaluation, some differences in mesh and texture quality occurred. The 3D models generated by open-source software showed the following problems: noise on the mesh surface, a rough mesh surface, and difficulty in confirming the production marks and pattern expression of relics. However, some of the open-source software did generate quality comparable to that of commercial software in both the quantitative and qualitative evaluations.
Open-source software for editing 3D models was able not only to post-process, match, and merge the 3D model, but also to adjust scale, produce join surfaces, and render the images necessary for the actual measurement of relics. The final drawing was traced in a CAD program, which is also open-source software. In archaeological research, photogrammetry is applicable to various processes, including excavation, report writing, and research on numerical data from 3D models. With the breakthrough development of computer vision, the types of open-source software have diversified and their performance has improved significantly. With such digital technology highly accessible, 3D model data acquired in archaeology will be used as basic data for the preservation of and active research on cultural heritage.

Environmental Changes after Timber Harvesting in (Mt.) Paekunsan (백운산(白雲山) 성숙활엽수림(成熟闊葉樹林) 개벌수확지(皆伐收穫地)에서 벌출직후(伐出直後)의 환경변화(環境變化))

  • Park, Jae-Hyeon
    • Journal of Korean Society of Forest Science / v.84 no.4 / pp.465-478 / 1995
  • The objective of this study was to investigate the impacts of large-scale timber harvesting on the environment of a mature hardwood forest. To this end, the effects of harvesting on forest environmental factors were analyzed quantitatively using field data measured at the study sites of the Seoul National University Research Forests [(Mt.) Paekunsan] for two years (1993-1994) following timber harvesting. The field data include information on vegetation, soil mesofauna, physicochemical characteristics of soil, surface water runoff, stream water quality, and hillslope erosion. For comparison, field data for each environmental factor were collected separately in forest areas disturbed by logging and in undisturbed areas. The results of this study were as follows. The diversity of plant species increased in the harvested sites. However, the species similarity index showed that harvested and non-harvested sites remained close to each other. Soil bulk density and soil hardness increased after timber harvesting. The levels of organic matter, total N, available $P_2O_5$, and CEC ($K^+$, $Na^+$, $Ca^{2+}$, $Mg^{2+}$) in the harvested area decreased. The populations of Collembola spp. and Acari spp. among the soil mesofauna in harvested sites increased to two to seven times those of non-harvested sites during the first year, and the rate of increase declined in the second year. However, these soil mesofauna populations in harvested sites were still higher than those in non-harvested sites in the second year. Statistical analysis using stepwise regression indicated that the diversity of soil mesofauna was significantly affected by, in order of importance, soil moisture, soil bulk density, $Mg^{2+}$, CEC, and soil temperature at a soil depth of 5 cm (0-10 cm).
The amount of surface water runoff on harvested sites was larger than that on non-harvested sites by 28% in the first year and 24.5% in the second year after timber harvesting. The levels of BOD, COD, and pH in the stream water at the harvested sites met the standard for domestic drinking water in both the first and second years after timber harvesting. Heavy metals such as Cd, Pb, and Cu, as well as organic P, were not detected. Moreover, the levels of eight drinking water quality factors designated by the Ministry of Health and Welfare of Korea were within the first-class drinking water quality standard. The study also showed that hillslope erosion in harvested sites was 4.77 ton/ha/yr in the first year after timber harvesting. In the second year, the amount decreased rapidly to 1.0 ton/ha/yr. Hillslope erosion in the harvested sites was seven times that in non-harvested sites in the first year and twice that in the second year. These results indicate that large-scale timber harvesting causes significant changes in environmental factors. However, the results are based on only two years of field observation. Further field observation and analysis are needed to better understand the impacts of timber harvesting on environmental change. With that understanding, it may be possible to improve timber harvesting operations so as to reduce the environmental impacts of large-scale timber harvesting.

The Ontology Based, the Movie Contents Recommendation Scheme, Using Relations of Movie Metadata (온톨로지 기반 영화 메타데이터간 연관성을 활용한 영화 추천 기법)

  • Kim, Jaeyoung;Lee, Seok-Won
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.25-44 / 2013
  • Access to movie content has become easier and more frequent with the advent of smart TV, IPTV, and web services that can be used to search for and watch movies. As a result, users increasingly search for movie content that matches their preferences. However, because the amount of available movie content is so large, users need considerable time and effort to find it. Hence, there has been much research on recommending personalized items through analysis and clustering of user preferences and user profiles. In this study, we propose a recommendation system that uses an ontology-based knowledge base. Our ontology can represent not only relations between movie metadata but also relations between metadata and the user profile. The relations among metadata can indicate similarity between movies. To build the knowledge base, our ontology model considers two aspects: the movie metadata model and the user model. For the ontology-based movie metadata model, we selected genre, actor/actress, keywords, and synopsis as the main metadata, since these influence which movies users choose. The user model contains the user's demographic information and the relations between the user and movie metadata. In our model, the movie ontology consists of seven concepts (Movie, Genre, Keywords, Synopsis Keywords, Character, and Person), eight attributes (title, rating, limit, description, character name, character description, person job, person name), and ten relations between concepts. For our knowledge base, we entered individual data for 14,374 movies for each concept in the contents ontology model. This movie metadata knowledge base is used to search for movies related to the metadata a user is interested in, and it can find similar movies through the relations between concepts. We also propose an architecture for movie recommendation. The proposed architecture consists of four components.
The first component searches for candidate movies based on the demographic information of the user. In this component, we define rules that assign users to groups according to their demographic information, so that movies can be recommended for each group, and we generate the query used to search for candidate movies. The second component searches for candidate movies based on user preference. When choosing a movie, users consider metadata such as genre, actor/actress, synopsis, and keywords. Users enter their preferences, and the system then searches for movies based on them. Unlike existing movie recommendation systems, the proposed system can find similar movies through the relations between concepts. Each metadata item of a recommended candidate movie has a weight that is used to decide the recommendation order. The third component merges the results of the first and second components. In this step, we calculate the weight of each movie from the weight values of its metadata and then sort the movies by weight. The fourth component analyzes the result of the third component and determines the level of contribution of each metadata item; we then apply the contribution weights to the metadata. Finally, we use the result of this step as the recommendation for users. We tested the usability of the proposed scheme with a web application, which we implemented for the experiment using JSP, JavaScript, and the Protégé API. In our experiment, we collected results from 20 men and women ranging in age from 20 to 29, and we used 7,418 movies with a rating of at least 7.0. We provided Top-5, Top-10, and Top-20 recommended movies to each user, and the users then chose the movies that interested them. On average, users chose 2.1 movies from the Top-5, 3.35 from the Top-10, and 6.35 from the Top-20 recommendations.
This is better than the results obtained with each metadata type individually.
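
The merge-and-rank step described for the third component can be sketched as follows. The abstract does not give the weighting formula, so this sketch assumes that a movie's weight is the sum of the weights of the metadata types through which it was matched. The function name `merge_candidates`, the movie titles, and the weight values are all hypothetical.

```python
def merge_candidates(demographic_hits, preference_hits, metadata_weights):
    """Merge two candidate lists and rank movies by summed metadata weight.

    Each hit list maps a movie title to the set of metadata types
    (for example "genre" or "actor") through which the movie was matched.
    """
    matched = {}
    for hits in (demographic_hits, preference_hits):
        for title, metadata_types in hits.items():
            matched.setdefault(title, set()).update(metadata_types)

    scores = {
        title: sum(metadata_weights[m] for m in types)
        for title, types in matched.items()
    }
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical weights and candidates, for illustration only.
weights = {"genre": 4, "actor": 3, "keywords": 2, "synopsis": 1}
from_demographics = {"Movie A": {"genre"}, "Movie B": {"genre", "keywords"}}
from_preferences = {"Movie A": {"actor"}, "Movie C": {"synopsis"}}

ranking = merge_candidates(from_demographics, from_preferences, weights)
```

For reference, the reported averages correspond to hit ratios of 2.1/5 = 42% for Top-5, 3.35/10 = 33.5% for Top-10, and 6.35/20 = 31.75% for Top-20.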

A study on the classification of research topics based on COVID-19 academic research using Topic modeling (토픽모델링을 활용한 COVID-19 학술 연구 기반 연구 주제 분류에 관한 연구)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.155-174 / 2022
  • From January 2020 to October 2021, more than 500,000 academic studies related to COVID-19 (Coronavirus-2, a fatal respiratory syndrome) were published. The rapid increase in the number of COVID-19 papers imposes time and technical constraints on healthcare professionals and policy makers who need to find important research quickly. Therefore, in this study, we propose a method of extracting useful information from the text of an extensive literature using LDA and the Word2vec algorithm. Papers related to the keywords of interest were extracted from papers related to COVID-19, and detailed topics were identified. The data came from the CORD-19 data set on Kaggle, a free academic resource prepared by major research groups and the White House to respond to the COVID-19 pandemic and updated weekly. The research method has two main parts. First, 41,062 articles were collected through data filtering and pre-processing of the abstracts of 47,110 academic papers including full text. For this purpose, the number of COVID-19 publications by year was analyzed through exploratory data analysis using a Python program, and the top 10 journals under active research were identified. LDA and the Word2vec algorithm were used to derive research topics related to COVID-19, and after related words were analyzed, similarity was measured. Second, papers containing 'vaccine' and 'treatment' were extracted from among the topics derived from all papers, yielding a total of 4,555 papers related to 'vaccine' and 5,971 papers related to 'treatment'. For each collected paper, detailed topics were analyzed using LDA and Word2vec, and a clustering method based on PCA dimension reduction was applied to visualize groups of papers with similar themes using the t-SNE algorithm. A noteworthy result of this study is that topics that did not emerge from topic modeling over all COVID-19 papers did emerge from the topic modeling results for each research topic. For example, topic modeling of the papers related to 'vaccine' extracted a new topic, Topic 05 'neutralizing antibodies'. A neutralizing antibody is an antibody that protects cells from infection when a virus enters the body, and it is said to play an important role in the production of therapeutic agents and in vaccine development. In addition, extracting topics from the papers related to 'treatment' revealed a new topic, Topic 05 'cytokine'. A cytokine storm occurs when the body's immune cells, instead of defending against attacks, attack normal cells. Hidden topics that could not be found in the entire collection of papers were classified according to keywords, and topic modeling was performed to find detailed topics. In this study, we proposed a method of extracting topics from a large body of literature using the LDA algorithm and extracting similar words using Skip-gram, the Word2vec model that predicts surrounding words from a central word. Combining the LDA model with the Word2vec model was intended to improve performance by using both the document-topic relationships from LDA and the relationships captured by Word2vec. In addition, as a clustering method based on PCA dimension reduction, we presented a way to classify documents intuitively by using the t-SNE technique to group documents with similar themes into a structured organization. In a situation where the efforts of many researchers to overcome COVID-19 cannot keep up with the rapid publication of related academic papers, we hope this approach will save healthcare professionals and policy makers precious time and effort and help them gain new insights rapidly. It is also expected to serve as basic data for researchers exploring new research directions.
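
The Skip-gram idea mentioned in this abstract, predicting surrounding words from a central word, starts from (center, context) training pairs. The sketch below only generates those pairs and does not train a model; the function name `skipgram_pairs`, the window size, and the example tokens are assumptions made here for illustration.

```python
def skipgram_pairs(tokens, window):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical tokenized phrase, for illustration only.
sentence = ["neutralizing", "antibodies", "protect", "cells"]
pairs = skipgram_pairs(sentence, window=1)
```

Words that repeatedly share contexts across many such pairs end up with nearby vectors after training, which is what makes similar-word lookup possible.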

