• Title/Summary/Keyword: large database

Co-authorship patterns and networks of Korean radiation oncologists

  • Choi, Jin-Hyun;Kang, Jin-Oh;Park, Seo-Hyun;Kim, Sang-Ki
    • Radiation Oncology Journal / v.29 no.3 / pp.164-173 / 2011
  • Purpose: This research aimed to analyze the patterns of co-authorship networks among Korean radiation oncologists and to identify the factors contributing to the formation of these networks. Materials and Methods: A total of 1,447 articles including contents of 'Radiation Oncology' and 'Therapeutic Radiology' were searched from the KoreaMed database. Co-authorship was sorted by the author's full name, affiliation and specialty. UCINET 6.0 was used to compute the authors' network centrality and the cluster analysis, and the KeyPlayer 1.44 program was used to obtain the key player index. The sociogram was analyzed with Netdraw 2.090. Statistical comparisons were performed by t-test and ANOVA using SPSS 16.0, with p-value < 0.05 considered significant. Results: The number of articles written by a radiation oncologist as the first author was 1,025 out of 1,447. The pattern of co-authorship was classified into five groups. For articles of which the first author was a radiation oncologist, the number of single-author articles (type-A) was 81; single-institution articles (type-B) was 687; and multiple-author articles (type-C) was 257. For the articles in which radiation oncologists participated as a co-author, the number of single-institution articles (type-D) was 280 while multiple-institution articles (type-E) were 142. There were 8,895 authors from 1,366 co-authored articles, thus the average number of authors per article was 6.51. The average number of authors per article was 5.73 for type-B, 6.44 for type-C, 7.90 for type-D, and 7.67 for type-E (p = 0.000). The number of authors for articles from hospitals that published more than 100 articles was 7.23, while for the others it was 5.94 (p = 0.005). It was 5.94 and 7.16 for the articles published before and after 2001, respectively (p = 0.000). The articles written by a radiation oncologist as the first author had 5.92 authors while the others had 7.82 (p = 0.025). It was 5.57 and 7.71 for the Journal of the Korean Society for Therapeutic Radiology and Oncology and others (p = 0.000), respectively. Each of these comparisons indicated a significant difference in the average number of authors per article. The out-degree centrality of the network among authors was 4.26% (2.03-7.09%) while the in-degree centrality was 1.31% (0.53-2.84%). The three significant nodes were classified and listed as follows: Choi, Eun Kyung for 1991-1995; Kim, Dae Young for 1998-2001; Park, Won and Lee, Sang Wook for 2003-2010. Choi, Eun Kyung and Kim, Dae Young appeared in two cases, and ranked highest in degree centrality. In the key player analysis, Choi, Eun Kyung and Lee, Sang Wook appeared in two cases, and ranked highest. In the cluster analysis, Sungkyunkwan University, Seoul National University and Yonsei University emerged as the three large clusters, while Ulsan University, Chonnam National University, and Korea Institute of Radiological & Medical Science formed the medium clusters. Conclusion: The Korean radiation oncologists' society shows a closed network with numerous relationships among particular clusters, and the results indicate that the pattern of co-authorship formation in the major hospitals differs from that in other institutions.

Advanced Improvement for Frequent Pattern Mining using Bit-Clustering (비트 클러스터링을 이용한 빈발 패턴 탐사의 성능 개선 방안)

  • Kim, Eui-Chan;Kim, Kye-Hyun;Lee, Chul-Yong;Park, Eun-Ji
    • Journal of Korea Spatial Information System Society / v.9 no.1 / pp.105-115 / 2007
  • Data mining extracts interesting knowledge from a large database. Among numerous data mining techniques, research work is primarily concentrated on clustering and association rules. Clustering, an active research topic, mainly deals with analyzing spatial and attribute data, while the technique of association rules deals with identifying frequent patterns. An advanced apriori algorithm using an existing bit-clustering algorithm has been proposed previously. In an effort to identify an alternative algorithm to improve apriori, we investigated FP-Growth and discussed the possibility of adopting bit-clustering as the alternative method to solve the problems with FP-Growth. FP-Growth using bit-clustering demonstrated better performance than the existing method. We used chess data, which are used in pattern mining evaluation, in our experiments. We built FP-Trees with different minimum support values. In the case of high minimum support values, results similar to those of the existing techniques were obtained. In other cases, however, the technique proposed in this paper showed better performance than the existing technique. As a result, the technique proposed in this paper was considered to lead to higher performance. In addition, a method to apply bit-clustering to GML data was proposed.

Impacts of Introduced Fishes (Carassius cuvieri, Micropterus salmoides, Lepomis macrochirus) on Stream Fish Communities in South Korea (외래어류가 우리나라 하천생태계 어류 군집에 미치는 영향: 떡붕어(Carassius cuvieri), 배스(Micropterus salmoides), 블루길(Lepomis macrochirus)을 대상으로)

  • Lee, Dae-Seong;Lee, Da-Yeong;Ji, Chang Woo;Kwak, Ihn-Sil;Hwang, Soon-Jin;Lee, Hae-Jin;Park, Young-Seuk
    • Korean Journal of Ecology and Environment / v.53 no.3 / pp.241-254 / 2020
  • Three introduced fish species, Japanese white crucian carp (Carassius cuvieri Temminck and Schlegel, 1846), bass (Micropterus salmoides Lacepède, 1802) and bluegill (Lepomis macrochirus Rafinesque, 1819), are dominant fishes in Korean freshwater ecosystems. In this study, we analyzed the habitat conditions of these three species and their impacts on fish communities in streams across South Korea. Fish community data were obtained from the database of the Stream/River Ecosystem Survey and Health Assessment program maintained by the Ministry of Environment and the National Institute of Environmental Research, Korea. Our results showed that species richness and Shannon diversity of fish were higher at sites where introduced fish were present than at sites where they were absent. However, when the abundance of these introduced fish species increased, the species richness and abundance of fish decreased. An association analysis showed that the introduced fish species had a low similarity in their appearance with some indigenous fishes such as Siniperca scherzeri and Channa argus and some endemic fishes of Korea such as Zacco koreanus, Sarcocheilichthys variegatus wakiyae, and Acheilognathus yamatsutae. In addition, the introduced fish species had a low appearance similarity with a large number of fishes in their association networks. Finally, our results showed that these introduced fish species had negative impacts on the stream fish communities, and that they were potential risk factors for fish communities in Korean freshwater ecosystems. Therefore, continuous monitoring and the establishment of a management strategy for introduced fish species are necessary to preserve fish resources and biodiversity in Korean streams.

Design of a Crowd-Sourced Fingerprint Mapping and Localization System (군중-제공 신호지도 작성 및 위치 추적 시스템의 설계)

  • Choi, Eun-Mi;Kim, In-Cheol
    • KIPS Transactions on Software and Data Engineering / v.2 no.9 / pp.595-602 / 2013
  • WiFi fingerprinting is well known as an effective localization technique for indoor environments. However, this technique requires a large number of pre-built fingerprint maps covering the entire space. Moreover, due to environmental changes, these maps have to be rebuilt or updated periodically by experts. As a way to avoid this problem, crowd-sourced fingerprint mapping has attracted considerable interest from researchers. This approach allows many volunteer users to share their WiFi fingerprints collected in a specific environment. Therefore, crowd-sourced fingerprinting can automatically keep fingerprint maps up to date. In most previous systems, however, individual users were asked to enter their positions manually to build their local fingerprint maps. Moreover, those systems do not have any principled mechanism to keep fingerprint maps clean by detecting and filtering out erroneous fingerprints collected from multiple users. In this paper, we present the design of a crowd-sourced fingerprint mapping and localization (CMAL) system. The proposed system can not only automatically build and/or update WiFi fingerprint maps from fingerprint collections provided by multiple smartphone users, but also simultaneously track their positions using the up-to-date maps. The CMAL system consists of multiple clients that run on individual smartphones to collect fingerprints and a central server that maintains a database of fingerprint maps. Each client contains a particle filter-based WiFi SLAM engine, which tracks the smartphone user's position and builds a local fingerprint map. The server of our system adopts a Gaussian interpolation-based error filtering algorithm to maintain the integrity of fingerprint maps. Through various experiments, we show the high performance of our system.

Development of Evaluation Model for ITS Project using the Probabilistic Risk Analysis (확률적 위험도분석을 이용한 ITS사업의 경제성평가모형)

  • Lee, Yong-Taeck;Nam, Doo-Hee;Lim, Kang-Won
    • Journal of Korean Society of Transportation / v.23 no.3 s.81 / pp.95-108 / 2005
  • The purpose of this study is to develop an ITS evaluation model using the Probabilistic Risk Analysis (PRA) methodology and to demonstrate its goodness-of-fit for large ITS projects through a comparative analysis between the DEA and PRA models. The results of this study are summarized below. First, the evaluation model using PRA with Monte-Carlo Simulation (MCS) and Latin-Hypercube Sampling (LHS) is developed and applied to one of the ITS projects initiated by a local government. The risk factors are categorized into cost, benefit and social-economic factors. Then, the PDF (Probability Density Function) parameters of these factors are estimated. The log-normal, beta and triangular distributions fit the market and delivered price well. The triangular and uniform distributions are valid for the benefit data from the simulation analysis based on several deployment scenarios. Second, decision making rules for the risk analysis of projects in cost and economic feasibility studies are suggested. The developed PRA model is applied to the Daejeon metropolitan ITS model deployment project to validate the model. The results of the cost analysis show that the Deterministic Project Cost (DPC) and Deterministic Total Project Cost (DTPC) are biased percentile values of the CDF produced by the PRA model, and that this project needs a Contingency Budget (CB) because these values turned out to be less than the Target Value (TV; 85% value). Also, this project has high risk of DTPC and DPC because the coefficients of variation (C.V) of DTPC and DPC are 4 and 15, which are less than those of DTPC (19-28) and DPC (22-107) in construction and transportation projects. The results of the economic analysis show that the total system and the subsystems of this project are of type II, which means the project is economically feasible with high risk. Third, the goodness-of-fit of the PRA model is verified by comparing the differences between the results of the PRA and DEA models. The difference in evaluation indices is up to 68%. Because of this, the deployment priority of the ITS subsystems changes in each model. As a result, the ITS evaluation model using PRA, which considers project risk through probability distributions, is superior to DEA. It supports proper decision making, and the risk factors estimated by the PRA model can be controlled by the risk management program suggested in this paper. Further research, not only to build a database of deployment data but also to develop methodologies for estimating ITS effects with the PRA model, is needed to broaden the usage of the PRA model for the evaluation of ITS projects.
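The cost-side reasoning in this abstract (simulate total cost, locate the deterministic cost on the simulated CDF, and size a contingency budget against the 85% target value) can be sketched as below. This is only an illustrative sketch and not the authors' model: the three cost items and their (low, mode, high) figures are hypothetical, only triangular distributions are used, and plain Monte Carlo sampling stands in for the Latin-Hypercube Sampling mentioned above.

```python
import bisect
import random


def simulate_total_costs(cost_items, n_trials=10_000, seed=0):
    """Return sorted simulated total costs; each item is (low, mode, high)."""
    rng = random.Random(seed)
    totals = [
        # random.triangular takes its arguments in the order (low, high, mode)
        sum(rng.triangular(low, high, mode) for low, mode, high in cost_items)
        for _ in range(n_trials)
    ]
    totals.sort()
    return totals


def value_at_percentile(sorted_totals, q):
    """Empirical quantile: the simulated total cost at cumulative probability q."""
    index = min(len(sorted_totals) - 1, int(q * len(sorted_totals)))
    return sorted_totals[index]


def percentile_of_value(sorted_totals, value):
    """Fraction of simulated totals that do not exceed the given value."""
    return bisect.bisect_right(sorted_totals, value) / len(sorted_totals)


# Hypothetical cost items (low, mode, high); not taken from the paper.
cost_items = [(80, 100, 150), (40, 50, 90), (20, 25, 40)]

# Deterministic estimate: the sum of the most likely (mode) values.
deterministic_cost = sum(mode for _, mode, _ in cost_items)

totals = simulate_total_costs(cost_items)
target_value = value_at_percentile(totals, 0.85)  # 85% target value
deterministic_percentile = percentile_of_value(totals, deterministic_cost)
contingency_budget = max(0.0, target_value - deterministic_cost)

print(f"deterministic cost sits at percentile {deterministic_percentile:.2f}")
print(f"85% target value: {target_value:.1f}, contingency: {contingency_budget:.1f}")
```

Because the hypothetical distributions are skewed toward higher costs, the deterministic estimate falls below the 85% target value, which is the situation in which the abstract says a contingency budget is needed.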

Is the BRCA Germline Mutation a Prognostic Factor in Korean Patients with Early-onset Breast Carcinomas? (한국의 젊은 여성유방암 환자에서 BRCA 배선유전자 돌연변이는 예후인자인가?)

  • Choi Doo Ho;Lee Min Hyuk;Haffty Bruce G.
    • Radiation Oncology Journal / v.21 no.2 / pp.149-157 / 2003
  • Purpose: The purpose of this study was to determine if there were prognostic differences between BRCA related and BRCA non-related Korean patients with early-onset breast carcinomas. Materials and Methods: Sixty women who had developed breast cancers before the age of 40, and who were treated at the Soonchunhyang University Hospital, were studied independently of their family histories. The age range was 18 to 40 with a median of 34.5 years. Lymphocyte specimens from peripheral blood were studied for heterozygous mutations of BRCA1 and BRCA2 using direct sequencing methods. Immunohistochemistry was performed on the paraffin-embedded tissue blocks that were available. Results: Eleven deleterious mutations (18.3%, 6 in BRCA1 and 5 in BRCA2) and 7 missense mutations of unknown significance (11.7%) were found among the 60 patients. More than half of the mutations were novel, and were not reported in the database. Most of the BRCA-associated patients had no history of breast cancer. No treatment related failures were observed in the BRCA carriers, with the exception of one patient who had experienced a new primary tumor of the contralateral breast. The seven year relapse free survival rates were 50 and 79% in the BRCA carrier and BRCA negative patients, respectively. Although the expression of estrogen and progesterone receptors was less common, and histological features more aggressive, in the BRCA associated tumors, the outcome of the patients with BRCA mutations was not poorer than that of the patients without deleterious mutations. Conclusion: Despite the BRCA mutation carriers having adverse prognostic features, the recurrence rate was relatively lower than that in the BRCA non-carrying Korean patients with early-onset breast carcinomas. In addition, although the prevalence of the BRCA mutation in Korean patients was higher than that in white patients, the penetrance of the cancer seemed to be relatively low in Korean women carrying BRCA mutations. 
A large population based study of the BRCA mutation, with a long-term follow-up of the study patients will be required to confirm these results.

Analysis and Performance Evaluation of Pattern Condensing Techniques used in Representative Pattern Mining (대표 패턴 마이닝에 활용되는 패턴 압축 기법들에 대한 분석 및 성능 평가)

  • Lee, Gang-In;Yun, Un-Il
    • Journal of Internet Computing and Services / v.16 no.2 / pp.77-83 / 2015
  • Frequent pattern mining, one of the major areas actively studied in data mining, is a method for extracting useful pattern information hidden in large data sets or databases. Moreover, frequent pattern mining approaches have been actively employed in a variety of application fields because the results obtained from them allow us to analyze various important characteristics within databases more easily and automatically. However, traditional frequent pattern mining methods, which simply extract all of the possible frequent patterns such that each of their support values is not smaller than a user-given minimum support threshold, have the following problems. First, traditional approaches have to generate an enormous number of patterns depending on the features of a given database and the threshold settings, and the number can also increase geometrically. In addition, such approaches waste runtime and memory resources. Furthermore, the excessive number of patterns generated by these methods makes the mining results difficult to analyze. In order to solve these issues of traditional frequent pattern mining approaches, the concept of representative pattern mining and various related works have been proposed. In contrast to the traditional approaches that find all the possible frequent patterns from databases, representative pattern mining approaches selectively extract a smaller number of patterns that represent general frequent patterns. In this paper, we describe the details and characteristics of pattern condensing techniques that consider the maximality or closure property of generated frequent patterns, and compare and analyze the techniques. Given a frequent pattern, satisfying maximality means that all of the possible supersets of the pattern have smaller support values than a user-specified minimum support threshold; meanwhile, satisfying the closure property means that no superset of the pattern has a support equal to that of the pattern. By mining maximal frequent patterns or closed frequent ones, we can achieve effective pattern compression and also perform mining operations with much smaller time and space resources. In addition, compressed patterns can be converted back into the original frequent pattern forms if necessary; in particular, the closed frequent pattern notation can convert representative patterns back into the original ones without any information loss. That is, we can obtain a complete set of original frequent patterns from closed frequent ones. Although the maximal frequent pattern notation does not guarantee complete recovery in the process of pattern conversion, it has the advantage of extracting a smaller number of representative patterns more quickly than the closed frequent pattern notation. In this paper, we show the performance results and characteristics of the aforementioned techniques in terms of pattern generation, runtime, and memory usage by conducting a performance evaluation on various real-world data sets. For a more exact comparison, we also run the algorithms implementing these techniques on the same platform and implementation level.
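The maximality and closure definitions in this abstract can be made concrete with a small sketch. The code below is a brute-force illustration under assumed inputs, not one of the algorithms evaluated in the paper: the four toy transactions, the minimum support of 2, and all function names are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute force: map every frequent itemset to its support count."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            pattern = frozenset(candidate)
            support = sum(1 for t in transactions if pattern <= t)
            if support >= min_support:
                frequent[pattern] = support
                found_any = True
        if not found_any:
            break  # no frequent itemset of this size, so none of a larger size
    return frequent


def closed_patterns(frequent):
    """Keep patterns with no proper superset of equal support."""
    return {p: s for p, s in frequent.items()
            if not any(p < q and s == sq for q, sq in frequent.items())}


def maximal_patterns(frequent):
    """Keep patterns with no frequent proper superset."""
    return {p: s for p, s in frequent.items()
            if not any(p < q for q in frequent)}


def support_from_closed(pattern, closed):
    """Recover a frequent pattern's support from the closed patterns alone."""
    return max(s for q, s in closed.items() if pattern <= q)


# Hypothetical toy database of four transactions.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("a")]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_patterns(frequent)
maximal = maximal_patterns(frequent)

print(len(frequent), len(closed), len(maximal))
print(support_from_closed(frozenset("b"), closed))
```

In this toy database the seven frequent patterns condense to three closed patterns ({a}, {a, b}, {a, b, c}) and a single maximal pattern ({a, b, c}). The support of every frequent pattern can still be read off the closed set, whereas the maximal set only tells us which patterns are frequent, which matches the lossless versus lossy distinction drawn in the abstract.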

A Study on the Forest Vegetation of Odaesan National Park, Korea (오대산국립공원 삼림식생에 관한 연구)

  • Kim, Chang-Hwan;Oh, Jang-Geun;Lee, Nam-Sook;Choi, Young-Eun;Song, Myoung-Jun
    • Korean Journal of Ecology and Environment / v.48 no.1 / pp.61-67 / 2015
  • This study, which was conducted from Apr. 2013 to Jan. 2014, was carried out as part of a project to make a more detailed ecological zoning map at a scale of 1/5,000. The need for a large-scale electronic vegetation map has arisen in order to make the best use of basic research findings on resource monitoring of National Parks and to enhance efficiency in National Park management. In order to improve the accuracy and speed of the vegetation research process, the database for vegetation research was categorized into five groups, namely broad-leaved forest, coniferous forest, mixed forest, rock vegetation and miscellaneous. A vegetation map was then created for the on-site research. The database for vegetation research and the vegetation map reflecting the findings from vegetation research showed similar distribution rates for broad-leaved forest, with 71.965% and 71.184%, respectively. The distribution rates of coniferous forest (16.010%, 15.747%), mixed forest (10.619%, 12.085%), and rock vegetation (0.015%, 0.002%) did not differ much. In the detailed vegetation map reflecting vegetation research findings, the broad-leaved mountain forest was the most widely distributed with 60.096% based on the physiognomy classification. It was followed by mountain coniferous forest (16.332%), mountain valley forest (15.887%), and plantation forest (3.558%). As for the vegetation conservation classification evaluated in the national park, grade I and grade II areas took up 200.44 km² (61.80%) and 108.80 km² (33.55%), respectively. The combined area of these two amounts to 95.35%, making this area the first grade area in ecological nature status. This means that the vegetation of this area is highly worth preserving. 
The high rate of grade I area such as climax forests, unique vegetation, and subalpine vegetation seems to be attributable to diverse innate characteristics of Odaesan National Park, high altitude, low level of artificial disturbance, the subalpine zone formed on the ridge of the mountain top, and their vegetation formation, which reflects climatic and geological characteristics, despite continuous disturbance by mountain climbing.

Conflict of Interests and Analysts' Forecast (이해상충과 애널리스트 예측)

  • Park, Chang-Gyun;Youn, Taehoon
    • KDI Journal of Economic Policy / v.31 no.1 / pp.239-276 / 2009
  • The paper investigates the possible relationship between earnings predictions by security analysts and the special ownership ties that link the security companies those analysts belong to and the firms under analysis. Security analysts are best known for their role as information producers in stock markets where imperfect information is prevalent and transaction costs are high. In such a market, changes in the fundamental value of a company are not spontaneously reflected in the stock price, and security analysts actively produce and distribute the relevant information crucial for the price mechanism to operate efficiently. Therefore, securing the fairness and accuracy of the information they provide is very important for the efficiency of resource allocation as well as for the protection of investors who are excluded from the special relationship. Evidence of systematic distortion of information by the special tie, if found, naturally calls for regulatory intervention. However, one cannot presuppose the existence of distorted information based on common ownership between the appraiser and the appraisee. The reputation effect is especially cherished by security firms and analysts as an indispensable intangible asset in the industry, and the incentive to maintain a good reputation by providing accurate earnings predictions may outweigh the incentive to offer favorable ratings or stock recommendations for firms that are affiliated by common ownership. This study shares the theme of the existing literature concerning the effect of conflicts of interest on the accuracy of analysts' predictions. This study, however, focuses on the potential conflict of interest that may originate from the Korea-specific ownership structure of large conglomerates. Utilizing an extensive database of analysts' reports provided by WiseFn(R) in Korea, we perform an empirical analysis of the potential relationship between earnings prediction and common ownership. We first analyzed the prediction bias index, which tells how optimistic or friendly the analyst's prediction is compared to the realized earnings. It is shown that there exists no statistically significant relationship between the prediction bias and common ownership. This is a rather surprising result since it is observed that the frequency of positive prediction bias is higher with such an ownership tie. Next, we analyzed the prediction accuracy index, which shows how accurate the analyst's prediction is compared to the realized earnings regardless of its sign. It is also concluded that there is no significant association between the accuracy of earnings prediction and the special relationship. We interpret the results as implying that market discipline based on the reputation effect is working in the Korean stock market, in the sense that security companies do not seem to be influenced by an incentive to offer distorted information on affiliated firms. While many of the existing studies confirm the relationship between the ability of the analyst and the accuracy of the analyst's prediction, these factors cannot be controlled in the above analysis due to the lack of relevant data. As an indirect way to examine the possibility that such a relationship might have distorted the result, we perform an additional but identical analysis based on a sub-sample consisting only of reports by best analysts. The result also confirms the earlier conclusion that the common ownership structure does not affect the accuracy and bias of earnings prediction by the analyst.

Avifauna and Management of Breeding Season in Taeanhaean National Park (태안해안국립공원의 번식기 조류상과 관리)

  • Paik, In-Hwan;Jin, Seon-Deok;Yu, Jae-Pyoung;Paek, Woon-Kee
    • Korean Journal of Environment and Ecology / v.24 no.2 / pp.139-146 / 2010
  • The survey was done in order to find what kinds of birds visit Taeanhaean National Park during the breeding season, for which we selected 10 coastal areas and islands within the National Park. Three groups concurrently performed the field research from the 5th to the 9th of July in 2009. A total of 58 species and 7,323 individuals were recorded in Taeanhaean National Park. 48 species including 6,187 individuals were observed in coastal areas and 33 species including 1,136 individuals in island areas. The most dominant species in the National Park is Larus crassirostris, which accounts for 60% of the birds inhabiting there, and they seem to have bred on the islands near the National Park. The birds observed only around the coastal areas include Anas poecilorhyncha, Fulica atra, Egretta intermedia and others, which consist of 25 species and amount to 318 individuals, and the birds found exclusively in island areas include Phalacrocorax filamentosus, Apus pacificus, Locustella pleskei and other birds, which consist of 10 species and amount to 308 individuals. The inhabited island areas such as Gauido were characterized by a high ratio of waterbird population, which seems to be correlated with factors such as the extent of the island, the richness of water resources, and the diversity of habitats. Based on the data collected during the research and other data from previous observations, the dominant species remain nearly unchanged. In spite of the oil spill accident in 2007, the increase in the number of waterbirds compared to 2004 may be evidence that the area is recovering from the environmental pollution. At present, tidal power plants are being built or scheduled to be built and large-scale reclamation is also under way. What is worse, those areas are seeing an increase in pension construction, which is likely to be a potential cause of damage and disturbance to some key habitats for the waterbirds. 
Therefore, it is a major priority that we build the bird information system to efficiently manage the knowledge-based asset collected from bird-watching groups and to better monitor the areas that need enhanced database through which the National Park can be appropriately administered.