• Title/Summary/Keyword: 분류 알고리즘

Search Result 3,147, Processing Time 0.03 seconds

A Study on Electron Dose Distribution of Cones for Intraoperative Radiation Therapy (수술중 전자선치료에 있어서 선량분포에 관한 연구)

  • Kang, Wee-Saing;Ha, Sung-Whan;Yun, Hyong-Geun
    • Progress in Medical Physics
    • /
    • v.3 no.2
    • /
    • pp.1-12
    • /
    • 1992
  • For intraoperative radiation therapy using electron beams, a cone system to deliver a large dose to the tumor during surgical operation and to save the surrounding normal tissue should be developed and dosimetry for the cone system is necessary to find proper X-ray collimator setting as well as to get useful data for clinical use. We developed a docking type of a cone system consisting of two parts made of aluminum: holder and cone. The cones which range from 4cm to 9cm with 1cm step at 100cm SSD of photon beam are 28cm long circular tubular cylinders. The system has two 26cm long holders: one for the cones larger than or equal to 7cm diamter and another for the smaller ones than 7cm. On the side of the holder is an aperture for insertion of a lamp and mirror to observe treatment field. Depth dose curve. dose profile and output factor at dept of dose maximum. and dose distribution in water for each cone size were measured with a p-type silicone detector controlled by a linear scanner for several extra opening of X-ray collimators. For a combination of electron energy and cone size, the opening of the X-ray collimator was caused to the surface dose, depths of dose maximum and 80%, dose profile and output factor. The variation of the output factor was the most remarkable. The output factors of 9MeV electron, as an example, range from 0.637 to 1.549. The opening of X-ray collimators would cause the quantity of scattered electrons coming to the IORT cone system. which in turn would change the dose distribution as well as the output factor. Dosimetry for an IORT cone system is inevitable to minimize uncertainty in the clinical use.

  • PDF

A Case Study of a Text Mining Method for Discovering Evolutionary Patterns of Mobile Phone in Korea (국내 휴대폰의 진화패턴 규명을 위한 텍스트 마이닝 방안 제안 및 사례 연구)

  • On, Byung-Won
    • Journal of the Korea Society of Computer and Information
    • /
    • v.20 no.2
    • /
    • pp.29-45
    • /
    • 2015
  • Systematic theory, concepts, and methodology for the biological evolution have been developed while patterns and principles of the evolution have been actively studied in the past 200 years. Furthermore, they are applied to various fields such as evolutionary economics, evolutionary psychology, evolutionary linguistics, making significant progress in research. In addition, existing studies have applied main biological evolutionary models to artifacts although such methods do not fit to them. These models are also limited to generalize evolutionary patterns of artifacts because they are designed in terms of a subjective point of view of experts who know well about the artifacts. Unlike biological organisms, because artifacts are likely to reflect the imagination of the human will, it is known that the theory of biological evolution cannot be directly applied to artifacts. In this paper, beyond the individual's subjective, the aim of our research is to present evolutionary patterns of a given artifact based on peeping the idea of the public. For this, we propose a text mining approach that presents a systematic framework that can find out the evolutionary patterns of a given artifact and then visualize effectively. In particular, based on our proposal, we focus mainly on a case study of mobile phone that has emerged as an icon of innovation in recent years. We collect and analyze review posts on mobile phone available in the domestic market over the past decade, and discuss the detailed results about evolutionary patterns of the mobile phone. Moreover, this kind of task is a tedious work over a long period of time because a small number of experts carry out an extensive literature survey and summarize a huge number of materials to finally draw a diagram of evolutionary patterns of the mobile phone. However, in this work, to minimize the human efforts, we present a semi-automatic mining algorithm, and through this research we can understand how human creativity and imagination are implemented. In addition, it is a big help to predict the future trend of mobile phone in business and industries.

Construction of Gene Network System Associated with Economic Traits in Cattle (소의 경제형질 관련 유전자 네트워크 분석 시스템 구축)

  • Lim, Dajeong;Kim, Hyung-Yong;Cho, Yong-Min;Chai, Han-Ha;Park, Jong-Eun;Lim, Kyu-Sang;Lee, Seung-Su
    • Journal of Life Science
    • /
    • v.26 no.8
    • /
    • pp.904-910
    • /
    • 2016
  • Complex traits are determined by the combined effects of many loci and are affected by gene networks or biological pathways. Systems biology approaches have an important role in the identification of candidate genes related to complex diseases or traits at the system level. The gene network analysis has been performed by diverse types of methods such as gene co-expression, gene regulatory relationships, protein-protein interaction (PPI) and genetic networks. Moreover, the network-based methods were described for predicting gene functions such as graph theoretic method, neighborhood counting based methods and weighted function. However, there are a limited number of researches in livestock. The present study systemically analyzed genes associated with 102 types of economic traits based on the Animal Trait Ontology (ATO) and identified their relationships based on the gene co-expression network and PPI network in cattle. Then, we constructed the two types of gene network databases and network visualization system (http://www.nabc.go.kr/cg). We used a gene co-expression network analysis from the bovine expression value of bovine genes to generate gene co-expression network. PPI network was constructed from Human protein reference database based on the orthologous relationship between human and cattle. Finally, candidate genes and their network relationships were identified in each trait. They were typologically centered with large degree and betweenness centrality (BC) value in the gene network. The ontle program was applied to generate the database and to visualize the gene network results. This information would serve as valuable resources for exploiting genomic functions that influence economically and agriculturally important traits in cattle.

Response Modeling for the Marketing Promotion with Weighted Case Based Reasoning Under Imbalanced Data Distribution (불균형 데이터 환경에서 변수가중치를 적용한 사례기반추론 기반의 고객반응 예측)

  • Kim, Eunmi;Hong, Taeho
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.29-45
    • /
    • 2015
  • Response modeling is a well-known research issue for those who have tried to get more superior performance in the capability of predicting the customers' response for the marketing promotion. The response model for customers would reduce the marketing cost by identifying prospective customers from very large customer database and predicting the purchasing intention of the selected customers while the promotion which is derived from an undifferentiated marketing strategy results in unnecessary cost. In addition, the big data environment has accelerated developing the response model with data mining techniques such as CBR, neural networks and support vector machines. And CBR is one of the most major tools in business because it is known as simple and robust to apply to the response model. However, CBR is an attractive data mining technique for data mining applications in business even though it hasn't shown high performance compared to other machine learning techniques. Thus many studies have tried to improve CBR and utilized in business data mining with the enhanced algorithms or the support of other techniques such as genetic algorithm, decision tree and AHP (Analytic Process Hierarchy). Ahn and Kim(2008) utilized logit, neural networks, CBR to predict that which customers would purchase the items promoted by marketing department and tried to optimized the number of k for k-nearest neighbor with genetic algorithm for the purpose of improving the performance of the integrated model. Hong and Park(2009) noted that the integrated approach with CBR for logit, neural networks, and Support Vector Machine (SVM) showed more improved prediction ability for response of customers to marketing promotion than each data mining models such as logit, neural networks, and SVM. This paper presented an approach to predict customers' response of marketing promotion with Case Based Reasoning. The proposed model was developed by applying different weights to each feature. We deployed logit model with a database including the promotion and the purchasing data of bath soap. After that, the coefficients were used to give different weights of CBR. We analyzed the performance of proposed weighted CBR based model compared to neural networks and pure CBR based model empirically and found that the proposed weighted CBR based model showed more superior performance than pure CBR model. Imbalanced data is a common problem to build data mining model to classify a class with real data such as bankruptcy prediction, intrusion detection, fraud detection, churn management, and response modeling. Imbalanced data means that the number of instance in one class is remarkably small or large compared to the number of instance in other classes. The classification model such as response modeling has a lot of trouble to recognize the pattern from data through learning because the model tends to ignore a small number of classes while classifying a large number of classes correctly. To resolve the problem caused from imbalanced data distribution, sampling method is one of the most representative approach. The sampling method could be categorized to under sampling and over sampling. However, CBR is not sensitive to data distribution because it doesn't learn from data unlike machine learning algorithm. In this study, we investigated the robustness of our proposed model while changing the ratio of response customers and nonresponse customers to the promotion program because the response customers for the suggested promotion is always a small part of nonresponse customers in the real world. We simulated the proposed model 100 times to validate the robustness with different ratio of response customers to response customers under the imbalanced data distribution. Finally, we found that our proposed CBR based model showed superior performance than compared models under the imbalanced data sets. Our study is expected to improve the performance of response model for the promotion program with CBR under imbalanced data distribution in the real world.

Oceanic Application of Satellite Synthetic Aperture Radar - Focused on Sea Surface Wind Retrieval - (인공위성 합성개구레이더 영상 자료의 해양 활용 - 해상풍 산출을 중심으로 -)

  • Jang, Jae-Cheol;Park, Kyung-Ae
    • Journal of the Korean earth science society
    • /
    • v.40 no.5
    • /
    • pp.447-463
    • /
    • 2019
  • Sea surface wind is a fundamental element for understanding the oceanic phenomena and for analyzing changes of the Earth environment caused by global warming. Global research institutes have developed and operated scatterometers to accurately and continuously observe the sea surface wind, with the accuracy of approximately ${\pm}20^{\circ}$ for wind direction and ${\pm}2m\;s^{-1}$ for wind speed. Given that the spatial resolution of the scatterometer is 12.5-25.0 km, the applicability of the data to the coastal area is limited due to complicated coastal lines and many islands around the Korean Peninsula. In contrast, Synthetic Aperture Radar (SAR), one of microwave sensors, is an all-weather instrument, which enables us to retrieve sea surface wind with high resolution (<1 km) and compensate the sparse resolution of the scatterometer. In this study, we investigated the Geophysical Model Functions (GMF), which are the algorithms for retrieval of sea surface wind speed from the SAR data depending on each band such as C-, L-, or X-band radar. We reviewed in the simulation of the backscattering coefficients for relative wind direction, incidence angle, and wind speed by applying LMOD, CMOD, and XMOD model functions, and analyzed the characteristics of each GMF. We investigated previous studies about the validation of wind speed from the SAR data using these GMFs. The accuracy of sea surface wind from SAR data changed with respect to observation mode, GMF type, reference data for validation, preprocessing method, and the method for calculation of relative wind direction. It is expected that this study contributes to the potential users of SAR images who retrieve wind speeds from SAR data at the coastal region around the Korean Peninsula.

A study of Artificial Intelligence (AI) Speaker's Development Process in Terms of Social Constructivism: Focused on the Products and Periodic Co-revolution Process (인공지능(AI) 스피커에 대한 사회구성 차원의 발달과정 연구: 제품과 시기별 공진화 과정을 중심으로)

  • Cha, Hyeon-ju;Kweon, Sang-hee
    • Journal of Internet Computing and Services
    • /
    • v.22 no.1
    • /
    • pp.109-135
    • /
    • 2021
  • his study classified the development process of artificial intelligence (AI) speakers through analysis of the news text of artificial intelligence (AI) speakers shown in traditional news reports, and identified the characteristics of each product by period. The theoretical background used in the analysis are news frames and topic frames. As analysis methods, topic modeling and semantic network analysis using the LDA method were used. The research method was a content analysis method. From 2014 to 2019, 2710 news related to AI speakers were first collected, and secondly, topic frames were analyzed using Nodexl algorithm. The result of this study is that, first, the trend of topic frames by AI speaker provider type was different according to the characteristics of the four operators (communication service provider, online platform, OS provider, and IT device manufacturer). Specifically, online platform operators (Google, Naver, Amazon, Kakao) appeared as a frame that uses AI speakers as'search or input devices'. On the other hand, telecommunications operators (SKT, KT) showed prominent frames for IPTV, which is the parent company's flagship business, and 'auxiliary device' of the telecommunication business. Furthermore, the frame of "personalization of products and voice service" was remarkable for OS operators (MS, Apple), and the frame for IT device manufacturers (Samsung) was "Internet of Things (IoT) Integrated Intelligence System". The econd, result id that the trend of the topic frame by AI speaker development period (by year) showed a tendency to develop around AI technology in the first phase (2014-2016), and in the second phase (2017-2018), the social relationship between AI technology and users It was related to interaction, and in the third phase (2019), there was a trend of shifting from AI technology-centered to user-centered. As a result of QAP analysis, it was found that news frames by business operator and development period in AI speaker development are socially constituted by determinants of media discourse. The implication of this study was that the evolution of AI speakers was found by the characteristics of the parent company and the process of co-evolution due to interactions between users by business operator and development period. The implications of this study are that the results of this study are important indicators for predicting the future prospects of AI speakers and presenting directions accordingly.

Development of Topic Trend Analysis Model for Industrial Intelligence using Public Data (텍스트마이닝을 활용한 공개데이터 기반 기업 및 산업 토픽추이분석 모델 제안)

  • Park, Sunyoung;Lee, Gene Moo;Kim, You-Eil;Seo, Jinny
    • Journal of Technology Innovation
    • /
    • v.26 no.4
    • /
    • pp.199-232
    • /
    • 2018
  • There are increasing needs for understanding and fathoming of business management environment through big data analysis at industrial and corporative level. The research using the company disclosure information, which is comprehensively covering the business performance and the future plan of the company, is getting attention. However, there is limited research on developing applicable analytical models leveraging such corporate disclosure data due to its unstructured nature. This study proposes a text-mining-based analytical model for industrial and firm level analyses using publicly available company disclousre data. Specifically, we apply LDA topic model and word2vec word embedding model on the U.S. SEC data from the publicly listed firms and analyze the trends of business topics at the industrial and corporate levels. Using LDA topic modeling based on SEC EDGAR 10-K document, whole industrial management topics are figured out. For comparison of different pattern of industries' topic trend, software and hardware industries are compared in recent 20 years. Also, the changes of management subject at firm level are observed with comparison of two companies in software industry. The changes of topic trends provides lens for identifying decreasing and growing management subjects at industrial and firm level. Mapping companies and products(or services) based on dimension reduction after using word2vec word embedding model and principal component analysis of 10-K document at firm level in software industry, companies and products(services) that have similar management subjects are identified and also their changes in decades. For suggesting methodology to develop analysis model based on public management data at industrial and corporate level, there may be contributions in terms of making ground of practical methodology to identifying changes of managements subjects. However, there are required further researches to provide microscopic analytical model with regard to relation of technology management strategy between management performance in case of related to various pattern of management topics as of frequent changes of management subject or their momentum. Also more studies are needed for developing competitive context analysis model with product(service)-portfolios between firms.

A Study on the Effect of the Document Summarization Technique on the Fake News Detection Model (문서 요약 기법이 가짜 뉴스 탐지 모형에 미치는 영향에 관한 연구)

  • Shim, Jae-Seung;Won, Ha-Ram;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.201-220
    • /
    • 2019
  • Fake news has emerged as a significant issue over the last few years, igniting discussions and research on how to solve this problem. In particular, studies on automated fact-checking and fake news detection using artificial intelligence and text analysis techniques have drawn attention. Fake news detection research entails a form of document classification; thus, document classification techniques have been widely used in this type of research. However, document summarization techniques have been inconspicuous in this field. At the same time, automatic news summarization services have become popular, and a recent study found that the use of news summarized through abstractive summarization has strengthened the predictive performance of fake news detection models. Therefore, the need to study the integration of document summarization technology in the domestic news data environment has become evident. In order to examine the effect of extractive summarization on the fake news detection model, we first summarized news articles through extractive summarization. Second, we created a summarized news-based detection model. Finally, we compared our model with the full-text-based detection model. The study found that BPN(Back Propagation Neural Network) and SVM(Support Vector Machine) did not exhibit a large difference in performance; however, for DT(Decision Tree), the full-text-based model demonstrated a somewhat better performance. In the case of LR(Logistic Regression), our model exhibited the superior performance. Nonetheless, the results did not show a statistically significant difference between our model and the full-text-based model. Therefore, when the summary is applied, at least the core information of the fake news is preserved, and the LR-based model can confirm the possibility of performance improvement. This study features an experimental application of extractive summarization in fake news detection research by employing various machine-learning algorithms. The study's limitations are, essentially, the relatively small amount of data and the lack of comparison between various summarization technologies. Therefore, an in-depth analysis that applies various analytical techniques to a larger data volume would be helpful in the future.

An Outlier Detection Using Autoencoder for Ocean Observation Data (해양 이상 자료 탐지를 위한 오토인코더 활용 기법 최적화 연구)

  • Kim, Hyeon-Jae;Kim, Dong-Hoon;Lim, Chaewook;Shin, Yongtak;Lee, Sang-Chul;Choi, Youngjin;Woo, Seung-Buhm
    • Journal of Korean Society of Coastal and Ocean Engineers
    • /
    • v.33 no.6
    • /
    • pp.265-274
    • /
    • 2021
  • Outlier detection research in ocean data has traditionally been performed using statistical and distance-based machine learning algorithms. Recently, AI-based methods have received a lot of attention and so-called supervised learning methods that require classification information for data are mainly used. This supervised learning method requires a lot of time and costs because classification information (label) must be manually designated for all data required for learning. In this study, an autoencoder based on unsupervised learning was applied as an outlier detection to overcome this problem. For the experiment, two experiments were designed: one is univariate learning, in which only SST data was used among the observation data of Deokjeok Island and the other is multivariate learning, in which SST, air temperature, wind direction, wind speed, air pressure, and humidity were used. Period of data is 25 years from 1996 to 2020, and a pre-processing considering the characteristics of ocean data was applied to the data. An outlier detection of actual SST data was tried with a learned univariate and multivariate autoencoder. We tried to detect outliers in real SST data using trained univariate and multivariate autoencoders. To compare model performance, various outlier detection methods were applied to synthetic data with artificially inserted errors. As a result of quantitatively evaluating the performance of these methods, the multivariate/univariate accuracy was about 96%/91%, respectively, indicating that the multivariate autoencoder had better outlier detection performance. Outlier detection using an unsupervised learning-based autoencoder is expected to be used in various ways in that it can reduce subjective classification errors and cost and time required for data labeling.

A Statistical Analysis of Phenotypic Diversity Based on Genetic Traits in Barley Germplasms (특성평가 정보를 활용한 보리 유전자원 형태적 형질 다양성의 통계적 분석)

  • Yu, Dong Su;Shin, Myoung-Jae;Park, Jin-Cheon;Kang, Manjung
    • Korean Journal of Plant Resources
    • /
    • v.35 no.5
    • /
    • pp.641-651
    • /
    • 2022
  • The biodiversity research of barley, a functional food, is proceeding to conserve germplasms and develop new cultivar of barley to improve its functional effects. In this study, with 25,104 barley germplasms in the National Agrobiodiversity Center, South Korea, the biodiversity index of species was much lower (1.17) than the origins (24.73) because of the presence of a biased species, Hordeum vulgare subsp. vulgare, but the species and origin of germplasms were significantly different with regard to genetic traits. In the clustering analysis based on genetic traits, we found that 97% barley germplasms could mostly be distributed between 1~7 clusters out of a total of 15 clusters; 'normal and uzu type', 'lodging', and 'loose smut' were commonly represented in the 1~7 clusters and some clusters showed specific differences in five genetic traits including 'growth habit'. In correlation of each genetic trait, the infection of 'barley yellow mosaic virus' was highly correlated to 'number of grains per spike'. '1000 grain weight' was weakly correlated with seven genetic traits including 'number of grains per spike'. Our analysis for barley's biodiversity can provide a useful guide to the species' phenotypes that need to be collected to conserve biodiversity and to breed new barley varieties.