• Title/Summary/Keyword: Measure


The Study on the Debris Slope Landform in the Southern Taebaek Mountains (태백산맥 남부산지의 암설사면지형)

  • Jeon, Young-Gweon
    • Journal of the Korean Geographical Society
    • /
    • v.28 no.2
    • /
    • pp.77-98
    • /
    • 1993
  • The intent of this study is to analyze the characteristics of the distribution, pattern, and deposits of the exposed debris slope landforms in the southern Taebaek Mountains by aerial photograph interpretation, measurement on topographical maps, and field surveys. It also aims to examine the arrangement types of mountain slopes and the landform development of debris slopes in this area. The main observations can be summed up as follows. 1. Distribution characteristics: 1) From the viewpoint of bedrock, the distribution density of talus is high where the bedrock is hard and has a high density of joints and sheeting structures, whereas that of block streams is high in the case of intrusive rocks with a talus line. 2) From the viewpoint of distribution altitude, talus is mainly distributed at 301~500 meters above sea level, while block streams are distributed at 101~300 meters. 3) From the viewpoint of slope orientation, the distribution density of talus on south-facing slopes (S, SE, SW) is slightly higher than on north-facing slopes (N, NE, NW). 2. Pattern characteristics: 1) The tongue-shaped type is the most common of the four types. 2) The average length of talus slopes is 99 meters; talus composed of hornfels or granodiorite is longer, since the former readily forms a free face and the latter readily produces rounded stones. The average length of block stream slopes is 145 meters, the longest being one kilometer (granodiorite). 3) The gradient of talus slopes is 20~45°, mostly 26~30°, but talus composed of intrusive rocks is gentler. 4) Talus slopes show a concave profile, which indicates readjustment of the constituent debris. Some block stream slopes are concave at the upper and lower slope but convex at the middle slope; others have uneven profiles. 3. Deposit characteristics: 1) The average diameter of the constituent debris is 48~172 centimeters, and the sorting of the debris is fair, with no matrix. The debris of block streams is larger than that of talus; this difference in average debris diameter is fundamentally caused by the joint spacing of the bedrock. 2) The shape of the constituent debris in talus is mainly angular, but debris derived from intrusive rocks is sub-angular. The constituent debris in block streams is mainly sub-rounded. 3) In the case of talus, debris diameter generally increases downslope, although some taluses are disordered, and the debris diameter at the sides is larger than at the middle of the landform surface. In block streams, debris diameter variation is perpendicularly disordered, and the debris diameter at the middle is generally larger than at the sides of the landform surface. 4) The long-axis orientation of debris is fairly sorted at the lower part of the slope in talus (only 2 of 6 taluses). In block streams (2 of 3), one is well sorted and the other fairly sorted; the researcher thinks the latter was caused by the collapse of constituent debris. 5) Most debris is weathered and some is secondarily weathered in situ, but talus composed of fresh debris is still developing.
4. The landform development of debris slopes and the arrangement types of the mountain slope: 1) The formation and development of talus is divided into two periods: the first is the formation period of talus (the last glacial period), and the second is the adjustment period (postglacial age). That of block streams is divided into three periods: the first is the production period of blocks (Tertiary and interglacial periods), the second the formation period of block streams (the last glacial period), and the third the adjustment period of block streams (postglacial age). 2) The arrangement types of mountain slopes in this research area are divided into six types, as follows. Type I: high-level convex slope-free face-talus-block stream-alluvial surface; Type II: high-level convex slope-free face-talus-alluvial surface; Type III: free face-talus-block stream-alluvial surface; Type IV: free face-talus-alluvial surface; Type V: talus-alluvial surface; Type VI: block stream-alluvial surface. In particular, Type IV is the basic type; the others are modifications of it.


A Study on Aviation Safety and Third Country Operator of EU Regulation in light of the Convention on international Civil Aviation (시카고협약체계에서의 EU의 항공법규체계 연구 - TCO 규정을 중심으로 -)

  • Lee, Koo-Hee
    • The Korean Journal of Air & Space Law and Policy
    • /
    • v.29 no.1
    • /
    • pp.67-95
    • /
    • 2014
  • Some Contracting States of the Chicago Convention issue FAOCs (Foreign Air Operator Certificates) and conduct various safety assessments of the foreign operators that operate into their territory. These FAOCs and safety audits of foreign operators are being expanded to other parts of the world. While this trend is a measure to strengthen aviation safety and reduce aircraft accidents, the FAOC also burdens the other Contracting States to the Chicago Convention with additional requirements and delayed permissions. EASA (European Aviation Safety Agency) is a body governed by the European Basic Regulation. EASA was set up in 2003 and conducts specific regulatory and executive tasks in the field of civil aviation safety and environmental protection. EASA's mission is to promote the highest common standards of safety and environmental protection in civil aviation. The tasks of EASA have been expanded from airworthiness to air operations and currently include rulemaking and standardization for airworthiness, air crew, air operations, TCO, ATM/ANS safety oversight, aerodromes, etc. According to the Implementing Rule, Commission Regulation (EU) No 452/2014, EASA has the mandate to issue safety authorizations to commercial air carriers from outside the EU as of 26 May 2014. Third country operators (TCO) flying to any of the 28 EU Member States and/or to the 4 EFTA States (Iceland, Norway, Liechtenstein, Switzerland) must apply to EASA for a so-called TCO authorization. EASA will only take over the safety-related part of foreign operator assessment; operating permits will continue to be issued by the national authorities. A 30-month transition period ensures smooth implementation without interrupting the international air operations of foreign air carriers to the EU/EASA. Operators that are currently flying to Europe can continue to do so, but must submit an application for a TCO authorization before 26 November 2014. After the transition period, which lasts until 26 November 2016, a valid TCO authorization will be a mandatory prerequisite, in the absence of which an operating permit cannot be issued by a Member State. The European TCO authorization regime does not, in principle, differentiate between scheduled and non-scheduled commercial air transport operations; all TCOs engaged in commercial air transport need to apply for a TCO authorization. Operators with a potential need to operate to the EU at some time in the near future are advised to apply for a TCO authorization in due course, even when the date of operations is unknown. For all the issues mentioned above, I have studied the functions of EASA and the EU Regulations, including the newly introduced TCO Implementing Rule, and suggested some proposals. I hope that this paper will 1) help with preparation for TCO authorization, 2) help understanding of the international issues, 3) help improve Korean aviation regulations and government organizations, and 4) help compliance with international standards and contribute to the promotion of aviation safety.

Intelligent VOC Analyzing System Using Opinion Mining (오피니언 마이닝을 이용한 지능형 VOC 분석시스템)

  • Kim, Yoosin;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.113-125
    • /
    • 2013
  • Every company wants to know its customers' requirements and makes an effort to meet them. Because of this, communication between customers and companies has become a core competency of business, and its importance is increasing continuously. There are several strategies for finding customers' needs, but VOC (Voice of Customer) is one of the most powerful communication tools, and VOC gathered through several channels such as telephone, post, e-mail, and websites is highly meaningful. Thus, almost every company gathers VOC and operates a VOC system. VOC is important not only to business organizations but also to public organizations such as governments, educational institutes, and medical centers that need to raise public service quality and customer satisfaction. Accordingly, they build VOC gathering and analyzing systems and use them to create and upgrade products and services. In recent years, innovations in the internet and ICT have created diverse channels such as SNS, mobile, websites, and call centers for collecting VOC data. Although a lot of VOC data is collected through these channels, proper utilization is still difficult. This is because VOC data consists of highly emotional content in informal voice or text, and its volume is very large. Such unstructured big data is difficult for humans to store and analyze, so organizations need systems that automatically collect, store, classify, and analyze unstructured VOC data. This study proposes an intelligent VOC analyzing system based on opinion mining that classifies unstructured VOC data automatically and determines the polarity as well as the type of VOC. The basis of the VOC opinion analyzing system, a domain-oriented sentiment dictionary, is then created, and the corresponding stages are presented in detail. An experiment is conducted with 4,300 VOC records collected from a medical website to measure the effectiveness of the proposed system; they are used to develop the sentiment dictionary by determining the special sentiment vocabulary and its polarity values in the medical domain. The experiment shows that positive terms such as "칭찬, 친절함, 감사, 무사히, 잘해, 감동, 미소" have high positive opinion values, and negative terms such as "퉁명, 뭡니까, 말하더군요, 무시하는" have strongly negative opinion values. These terms are in general use, and the experimental results indicate a high probability of correct opinion polarity. Furthermore, the accuracy of the proposed VOC classification model is compared, and the highest classification accuracy of 77.8% is confirmed at an opinion classification threshold of -0.50. Through the proposed intelligent VOC analyzing system, real-time opinion classification and the response priority of VOC can be predicted. Ultimately, the system is expected to catch customer complaints at an early stage and deal with them quickly with fewer staff operating the VOC system, freeing up human resources and time in the customer service department. Above all, this study is a new attempt at automatically analyzing unstructured VOC data using opinion mining, and it shows that the system can be used to classify the positive or negative polarity of VOC opinions. It is expected to suggest a practical framework for VOC analysis for diverse uses, and the model could serve as a real VOC analyzing system if implemented. Despite the experimental results and expectations, this study has several limitations. First of all, the sample data was collected from only one hospital website, which means the sentiment dictionary built from the sample data may lean too heavily toward that hospital and website. Therefore, future research should incorporate several channels such as call centers and SNS, and other domains such as government, financial companies, and educational institutes.
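A minimal sketch (Python, not from the paper) of the lexicon-based polarity classification described above. The dictionary entries and example texts are illustrative assumptions; only the -0.50 threshold and a few of the listed sentiment terms come from the abstract.

```python
# Minimal sketch of lexicon-based VOC polarity classification.
# Dictionary values and example texts are illustrative; the paper builds a
# domain-oriented sentiment dictionary from medical-site VOC and reports its
# best accuracy at a classification threshold of -0.50.

sentiment_dict = {          # hypothetical polarity values
    "친절함": 0.9, "감사": 0.8, "감동": 0.9, "미소": 0.7,
    "퉁명": -0.8, "무시하는": -0.9,
}

def score_voc(text: str, dictionary: dict) -> float:
    """Sum the polarity values of dictionary terms found in the VOC text."""
    return sum(v for term, v in dictionary.items() if term in text)

def classify_voc(text: str, threshold: float = -0.50) -> str:
    """Label a VOC record as positive or negative relative to the threshold."""
    return "positive" if score_voc(text, sentiment_dict) > threshold else "negative"

print(classify_voc("상담원이 친절함에 감사합니다"))   # -> positive
print(classify_voc("직원이 퉁명하고 무시하는 태도"))  # -> negative
```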

Construction of Consumer Confidence index based on Sentiment analysis using News articles (뉴스기사를 이용한 소비자의 경기심리지수 생성)

  • Song, Minchae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.1-27
    • /
    • 2017
  • It is known that the economic sentiment index and macroeconomic indicators are closely related because economic agents' judgments and forecasts of business conditions affect economic fluctuations. For this reason, consumer sentiment or confidence provides steady fodder for business and is treated as an important piece of economic information. In Korea, private consumption and the consumer sentiment index are highly relevant to each other, making the index a very important economic indicator for evaluating and forecasting the domestic economic situation. However, despite offering relevant insights into private consumption and GDP, the traditional approach of measuring consumer confidence with a survey has several limits. One possible weakness is that it takes considerable time to research, collect, and aggregate the data: if urgent issues arise, timely information will not be announced until the end of the month. In addition, the survey only contains information derived from questionnaire items, which means it can be difficult to capture the direct effects of newly arising issues. The survey also faces potential declines in response rates and erroneous responses. Therefore, it is necessary to find a way to complement it. For this purpose, we construct and assess an index designed to measure consumer economic sentiment using sentiment analysis. Unlike the survey-based measures, our index relies on textual analysis to extract sentiment from economic and financial news articles. In particular, text data such as news articles and SNS are timely and cover a wide range of issues; because such sources can quickly capture the economic impact of specific economic issues, they have great potential as economic indicators. There are two main approaches to the automatic extraction of sentiment from text; we apply the lexicon-based approach, using sentiment lexicon dictionaries of words annotated with their semantic orientations. In creating the sentiment lexicon dictionaries, we enter the semantic orientation of individual words manually, though we do not attempt a full linguistic analysis (one that involves analysis of word senses or argument structure); this is a limitation of our research, and further work in that direction remains possible. In this study, we generate a time series index of economic sentiment in the news. The construction of the index consists of three broad steps: (1) collecting a large corpus of economic news articles on the web, (2) applying lexicon-based methods of sentiment analysis to score each article in terms of sentiment orientation (positive, negative, or neutral), and (3) constructing an economic sentiment index of consumers by aggregating the monthly time series of each sentiment word. In line with existing scholarly assessments of the relationship between the consumer confidence index and macroeconomic indicators, any new index should be assessed for its usefulness. We examine the new index's usefulness by comparing other economic indicators to the CSI. To check the usefulness of the newly constructed index based on sentiment analysis, trend and cross-correlation analyses are carried out to analyze the relationships and lagged structure. Finally, we analyze the forecasting power using one-step-ahead out-of-sample prediction.
As a result, the news sentiment index correlates strongly with related contemporaneous key indicators in almost all experiments, and news sentiment shocks predict future economic activity in most cases; in head-to-head comparisons, the news sentiment measures outperform the survey-based sentiment index (CSI). Policy makers want to understand consumer or public opinions about existing or proposed policies, and monitoring various web media, SNS, and news articles enables relevant government decision-makers to respond quickly to such opinions. Textual data such as news articles and social networks (Twitter, Facebook, and blogs) are generated at high speed, and although research using unstructured data in economic analysis is in its early stages, the utilization of such data is expected to increase greatly once its usefulness is confirmed.
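A minimal sketch, under stated assumptions, of the three-step index construction summarized above: score each article with a sentiment lexicon, then aggregate the scores into a monthly index. The lexicon, article texts, and the monthly-mean aggregation rule are illustrative, not the authors' actual corpus or method details.

```python
# Minimal sketch of building a monthly news-sentiment index from scored articles.
# The tiny lexicon, the placeholder articles, and the mean-per-month aggregation
# rule are illustrative assumptions.
from collections import defaultdict

lexicon = {"growth": 1, "recovery": 1, "recession": -1, "slump": -1}  # hypothetical

def article_score(text: str) -> int:
    """Lexicon-based orientation: +1/-1 per matched word, summed over the article."""
    return sum(v for w, v in lexicon.items() if w in text.lower())

articles = [  # (year-month, text) pairs; placeholder data
    ("2017-01", "Signs of recovery and export growth"),
    ("2017-01", "Retail slump deepens"),
    ("2017-02", "Broad-based growth in manufacturing"),
]

monthly = defaultdict(list)
for month, text in articles:
    monthly[month].append(article_score(text))

# The index value for a month is the average article sentiment in that month.
sentiment_index = {m: sum(s) / len(s) for m, s in sorted(monthly.items())}
print(sentiment_index)   # e.g. {'2017-01': 0.5, '2017-02': 1.0}
```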

The Effects of Pergola Wisteria floribunda's LAI on Thermal Environment (그늘시렁 Wisteria floribunda의 엽면적지수가 온열환경에 미치는 영향)

  • Ryu, Nam-Hyong;Lee, Chun-Seok
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.45 no.6
    • /
    • pp.115-125
    • /
    • 2017
  • This study investigated the thermal environments experienced by users under a pergola ($L\;7,200{\times}W\;4,200{\times}H\;2,700mm$) covered with Wisteria floribunda (Willd.) DC. according to the variation of leaf area index (LAI). We carried out detailed measurements with two human-biometeorological stations on a popular square in Jinju, Korea ($N35^{\circ}10^{\prime}59.8^{\prime\prime}$, $E128^{\circ}05^{\prime}32.0^{\prime\prime}$, elevation: 38 m). One of the stations stood under the pergola, while the other stood in the sun. The measurement spots were instrumented with microclimate monitoring stations to continuously measure air temperature, relative humidity, wind speed, and shortwave and longwave radiation from the six cardinal directions at a height of 0.6 m, so as to calculate the Universal Thermal Climate Index (UTCI) from 9 April to 27 September 2017. The LAI was measured using the LAI-2200C Plant Canopy Analyzer. The analysis of 18 days of 1-minute-interval human-biometeorological data for a man in a sitting position from 10 a.m. to 4 p.m. showed the following. During the whole observation period, daily average air temperatures under the pergola were $0.7{\sim}2.3^{\circ}C$ lower than those in the sun, while daily average wind speed and relative humidity under the pergola were 0.17~0.38 m/s and 0.4~3.1% higher, respectively, than those in the sun. There was a significant relationship between LAI and Julian day number, expressed by the equation $y=-0.0004x^2+0.1719x-11.765$ ($R^2=0.9897$). The average $T_{mrt}$ under the pergola was $11.9{\sim}25.4^{\circ}C$ lower, and the maximum ${\Delta}T_{mrt}$ under the pergola was $24.1{\sim}30.2^{\circ}C$, compared with those in the sun. There was a significant relationship between LAI and the reduction ratio (%) of daily average $T_{mrt}$ relative to that in the sun, expressed by the equation $y=0.0678{\ln}(x)+0.3036$ ($R^2=0.9454$). The average UTCI under the pergola was $4.1{\sim}8.3^{\circ}C$ lower, and the maximum ${\Delta}UTCI$ under the pergola was $7.8{\sim}10.2^{\circ}C$, compared with those in the sun. There was a significant relationship between LAI and the reduction ratio (%) of daily average UTCI relative to that in the sun, expressed by the equation $y=0.0322{\ln}(x)+0.1538$ ($R^2=0.8946$). The shading provided by the vine-covered pergola was very effective in reducing the daytime UTCI absorbed by a man in a sitting position in summer, largely through a reduction in mean radiant temperature due to sun protection, lowering thermal stress from very strong (UTCI > $38^{\circ}C$) and strong (UTCI > $32^{\circ}C$) down to strong (UTCI > $32^{\circ}C$) and moderate (UTCI > $26^{\circ}C$). Therefore, pergolas covered with vines used for shading outdoor spaces are essential to mitigate heat stress and can create better human thermal comfort, especially in cities during summer. However, the thermal environment under the vine-covered pergola during heat waves still exposed users to very strong heat stress (UTCI > $38^{\circ}C$). Therefore, users must refrain from outdoor activities during heat waves.
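A worked example applying the regression equations reported above. The equations and R² values are taken from the abstract; the interpretation of the reduction ratios as fractions of the in-sun values, and the sample Julian day, are assumptions.

```python
# Worked use of the regression equations reported in the abstract.
# x is the Julian day number for the LAI fit; for the reduction-ratio fits,
# x is the measured LAI. Treating the ratios as fractions of the in-sun value
# is an assumption based on the abstract's wording.
import math

def lai_from_julian_day(x: float) -> float:
    """LAI vs. Julian day: y = -0.0004x^2 + 0.1719x - 11.765 (R^2 = 0.9897)."""
    return -0.0004 * x**2 + 0.1719 * x - 11.765

def tmrt_reduction_ratio(lai: float) -> float:
    """Daily-average T_mrt reduction ratio: y = 0.0678*ln(x) + 0.3036 (R^2 = 0.9454)."""
    return 0.0678 * math.log(lai) + 0.3036

def utci_reduction_ratio(lai: float) -> float:
    """Daily-average UTCI reduction ratio: y = 0.0322*ln(x) + 0.1538 (R^2 = 0.8946)."""
    return 0.0322 * math.log(lai) + 0.1538

day = 200                      # mid-July, within the observation period (assumed)
lai = lai_from_julian_day(day)
print(f"LAI ~ {lai:.2f}")
print(f"T_mrt reduction ~ {tmrt_reduction_ratio(lai):.1%}")
print(f"UTCI reduction ~ {utci_reduction_ratio(lai):.1%}")
```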

Research Trend Analysis Using Bibliographic Information and Citations of Cloud Computing Articles: Application of Social Network Analysis (클라우드 컴퓨팅 관련 논문의 서지정보 및 인용정보를 활용한 연구 동향 분석: 사회 네트워크 분석의 활용)

  • Kim, Dongsung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.195-211
    • /
    • 2014
  • Cloud computing services provide IT resources as services on demand. This is considered a key concept that will lead a shift from an ownership-based paradigm to a new pay-for-use paradigm, which can reduce the fixed cost of IT resources and improve flexibility and scalability. As IT services, cloud services have evolved from earlier similar computing concepts such as network computing, utility computing, server-based computing, and grid computing. Research into cloud computing is therefore highly related to and combined with various relevant computing research areas. To seek promising research issues and topics in cloud computing, it is necessary to understand the research trends in cloud computing more comprehensively. In this study, we collect bibliographic information and citation information for cloud computing related research papers published in major international journals from 1994 to 2012, and analyze macroscopic trends and network changes in the citation relationships among papers and the co-occurrence relationships of keywords by utilizing social network analysis measures. Through the analysis, we can identify the relationships and connections among research topics in cloud computing related areas, and highlight new potential research topics. In addition, we visualize dynamic changes of research topics relating to cloud computing using a proposed cloud computing "research trend map." A research trend map visualizes the positions of research topics in two-dimensional space. The frequency of a keyword (X-axis) and the rate of increase in the degree centrality of the keyword (Y-axis) are used as the two dimensions of the research trend map. Based on the values of the two dimensions, the two-dimensional space of a research map is divided into four areas: maturation, growth, promising, and decline. An area with high keyword frequency but a low rate of increase of degree centrality is defined as a mature technology area; the area where both keyword frequency and the increase rate of degree centrality are high is defined as a growth technology area; the area where keyword frequency is low but the rate of increase in degree centrality is high is defined as a promising technology area; and the area where both keyword frequency and the rate of increase of degree centrality are low is defined as a declining technology area. Based on this method, cloud computing research trend maps make it possible to easily grasp the main research trends in cloud computing and to explain the evolution of research topics. According to the results of the analysis of citation relationships, research papers on security, distributed processing, and optical networking for cloud computing rank at the top based on the PageRank measure. From the analysis of keywords in research papers, cloud computing and grid computing showed high centrality in 2009, and keywords dealing with main elemental technologies such as data outsourcing, error detection methods, and infrastructure construction showed high centrality in 2010~2011. In 2012, security, virtualization, and resource management showed high centrality. Moreover, it was found that interest in the technical issues of cloud computing is increasing gradually. From the annual cloud computing research trend maps, it was verified that security is located in the promising area, virtualization has moved from the promising area to the growth area, and grid computing and distributed systems have moved to the declining area.
The study results indicate that distributed systems and grid computing received a lot of attention as similar computing paradigms in the early stage of cloud computing research. The early stage of cloud computing research was a period focused on understanding and investigating cloud computing as an emergent technology, linking it to relevant established computing concepts. After the early stage, security and virtualization technologies became the main issues in cloud computing, which is reflected in the movement of security and virtualization technologies from the promising area to the growth area in the cloud computing research trend maps. Moreover, this study revealed that current research in cloud computing has rapidly shifted from a focus on technical issues to a focus on application issues, such as SLAs (Service Level Agreements).
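A minimal sketch of the four-quadrant classification rule behind the research trend map described above. The keyword statistics and cut-off values are illustrative assumptions; the paper derives keyword frequencies and degree-centrality growth rates from its own corpus.

```python
# Minimal sketch of the "research trend map" quadrant rule.
# Keyword statistics and cut-offs below are illustrative placeholders.

def trend_area(frequency: float, centrality_growth: float,
               freq_cut: float, growth_cut: float) -> str:
    """Classify a keyword by frequency (X-axis) and degree-centrality growth rate (Y-axis)."""
    if frequency >= freq_cut and centrality_growth >= growth_cut:
        return "growth"
    if frequency >= freq_cut:
        return "maturation"
    if centrality_growth >= growth_cut:
        return "promising"
    return "decline"

keywords = {  # hypothetical (frequency, growth-rate) pairs
    "security": (40, 0.8),
    "virtualization": (75, 0.7),
    "grid computing": (30, -0.2),
    "SLA": (15, 0.6),
}
for kw, (freq, growth) in keywords.items():
    print(kw, "->", trend_area(freq, growth, freq_cut=50, growth_cut=0.3))
```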

Product Community Analysis Using Opinion Mining and Network Analysis: Movie Performance Prediction Case (오피니언 마이닝과 네트워크 분석을 활용한 상품 커뮤니티 분석: 영화 흥행성과 예측 사례)

  • Jin, Yu;Kim, Jungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.49-65
    • /
    • 2014
  • Word of mouth (WOM) is a behavior used by consumers to transfer or communicate their product or service experience to other consumers. Due to the popularity of social media such as Facebook, Twitter, blogs, and online communities, electronic WOM (e-WOM) has become important to the success of products and services. As a result, most enterprises pay close attention to e-WOM about their products or services. This is especially important for movies, as these are experiential products. This paper aims to identify the network factors of an online movie community that impact box office revenue using social network analysis. In addition to traditional WOM factors (the volume and valence of WOM), network centrality measures of the online community are included as influential factors in box office revenue. Based on previous research results, we develop five hypotheses on the relationships between the potential influential factors (WOM volume, WOM valence, degree centrality, betweenness centrality, closeness centrality) and box office revenue. The first hypothesis is that the accumulated volume of WOM in online product communities is positively related to the total revenue of movies. The second hypothesis is that the accumulated valence of WOM in online product communities is positively related to the total revenue of movies. The third hypothesis is that the average degree centrality of reviewers in online product communities is positively related to the total revenue of movies. The fourth hypothesis is that the average betweenness centrality of reviewers in online product communities is positively related to the total revenue of movies. The fifth hypothesis is that the average closeness centrality of reviewers in online product communities is positively related to the total revenue of movies. To verify our research model, we collect movie review data from the Internet Movie Database (IMDb), a representative online movie community, and movie revenue data from the Box-Office-Mojo website. The movies in this analysis include the weekly top-10 movies from September 1, 2012, to September 1, 2013. We collect movie metadata such as screening periods and user ratings, and community data in IMDb including reviewer identification, review content, review times, responder identification, reply content, reply times, and reply relationships. For the same period, the revenue data from Box-Office-Mojo is collected on a weekly basis. Movie community networks are constructed based on reply relationships between reviewers. Using a social network analysis tool, NodeXL, we calculate the averages of three centralities, namely degree, betweenness, and closeness centrality, for each movie. Correlation analysis of the focal variables and the dependent variable (final revenue) shows that the three centrality measures are highly correlated, prompting us to perform multiple regressions separately with each centrality measure. Consistent with previous research results, our regression analysis results show that the volume and valence of WOM are positively related to the final box office revenue of movies. Moreover, the average betweenness centralities from the initial community networks impact the final movie revenues. However, neither the average degree centralities nor the average closeness centralities influence final movie performance. Based on the regression results, three hypotheses (1, 2, and 4) are accepted, and two hypotheses (3 and 5) are rejected.
This study tries to link the network structure of e-WOM in online product communities with product performance. Based on the analysis of a real online movie community, the results show that online community network structures can work as a predictor of movie performance. In particular, the betweenness centralities of the reviewer community are critical for the prediction of movie performance, while degree centralities and closeness centralities do not influence movie performance. As future research topics, similar analyses are required for other product categories, such as electronic goods and online content, to generalize the study results.
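A minimal sketch of the per-movie network feature highlighted above: the average betweenness centrality of reviewers, computed from reply relationships. The paper used NodeXL; networkx is substituted here, and the toy reply edges are placeholders.

```python
# Minimal sketch: average betweenness centrality of a movie's reviewer network,
# built from reviewer-responder reply relationships. Edges are placeholder data.
import networkx as nx

reply_edges = [            # (reviewer, responder) pairs for one movie's community
    ("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("bob", "dave"),
]

G = nx.Graph()
G.add_edges_from(reply_edges)

betweenness = nx.betweenness_centrality(G)
avg_betweenness = sum(betweenness.values()) / G.number_of_nodes()
print(f"average betweenness centrality: {avg_betweenness:.3f}")
```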

Rough Set Analysis for Stock Market Timing (러프집합분석을 이용한 매매시점 결정)

  • Huh, Jin-Nyung;Kim, Kyoung-Jae;Han, In-Goo
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.77-97
    • /
    • 2010
  • Market timing is an investment strategy used to obtain excess returns from the financial market. In general, detection of market timing means determining when to buy and sell so as to obtain excess returns from trading. In many market timing systems, trading rules have been used as an engine to generate trade signals. On the other hand, some researchers have proposed rough set analysis as a proper tool for market timing because, by using its control function, it does not generate a trade signal when the market pattern is uncertain. Numeric data for rough set analysis should be discretized because rough set analysis only accepts categorical data. Discretization searches for proper "cuts" in numeric data that determine intervals; all values that lie within an interval are transformed into the same value. In general, there are four methods of data discretization in rough set analysis: equal frequency scaling, expert knowledge-based discretization, minimum entropy scaling, and naïve and Boolean reasoning-based discretization. Equal frequency scaling fixes a number of intervals, examines the histogram of each variable, and then determines cuts so that approximately the same number of samples fall into each interval. Expert knowledge-based discretization determines cuts according to the knowledge of domain experts, gained through literature review or interviews with experts. Minimum entropy scaling implements an algorithm based on recursively partitioning the value set of each variable so that a local measure of entropy is optimized. Naïve and Boolean reasoning-based discretization searches for categorical values by naïvely scaling the data and then finds the optimized discretization thresholds through Boolean reasoning. Although rough set analysis is promising for market timing, there is little research on the impact of the various data discretization methods on trading performance using rough set analysis. In this study, we compare stock market timing models using rough set analysis with various data discretization methods. The research data used in this study are the KOSPI 200 from May 1996 to October 1998. The KOSPI 200 is the underlying index of the KOSPI 200 futures, the first derivative instrument in the Korean stock market. The KOSPI 200 is a market-value-weighted index consisting of 200 stocks selected by criteria on liquidity and their status in corresponding industries, including manufacturing, construction, communication, electricity and gas, distribution and services, and financing. The total number of samples is 660 trading days. In addition, this study uses popular technical indicators as independent variables. The experimental results show that the most profitable method for the training sample is naïve and Boolean reasoning-based discretization, but expert knowledge-based discretization is the most profitable method for the validation sample. In addition, expert knowledge-based discretization produced robust performance for both the training and validation samples. We also compared rough set analysis and decision trees, using C4.5 for comparison purposes. The results show that rough set analysis with expert knowledge-based discretization produced more profitable rules than C4.5.
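A minimal sketch of equal frequency scaling, the first of the four discretization methods compared above: cut points are chosen so that roughly the same number of samples fall into each interval. The sample values and the number of intervals are illustrative.

```python
# Minimal sketch of equal-frequency discretization for rough set analysis.
# Toy indicator values and the number of intervals are illustrative assumptions.

def equal_frequency_cuts(values: list[float], n_intervals: int) -> list[float]:
    """Return cut points that split the sorted values into equal-count intervals."""
    ordered = sorted(values)
    step = len(ordered) / n_intervals
    return [ordered[int(i * step)] for i in range(1, n_intervals)]

def discretize(value: float, cuts: list[float]) -> int:
    """Map a numeric value to the index of the interval it falls into."""
    return sum(value >= c for c in cuts)

prices = [101.2, 98.7, 103.5, 99.1, 100.4, 102.8, 97.9, 104.1]  # toy indicator values
cuts = equal_frequency_cuts(prices, n_intervals=4)
print(cuts)                                    # three cut points
print([discretize(p, cuts) for p in prices])   # interval index 0..3 per value
```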

A Hybrid SVM Classifier for Imbalanced Data Sets (불균형 데이터 집합의 분류를 위한 하이브리드 SVM 모델)

  • Lee, Jae Sik;Kwon, Jong Gu
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.125-140
    • /
    • 2013
  • We call a data set in which the number of records belonging to a certain class far outnumbers the number of records belonging to the other class an 'imbalanced data set'. Most classification techniques perform poorly on imbalanced data sets. When we evaluate the performance of a certain classification technique, we need to measure not only 'accuracy' but also 'sensitivity' and 'specificity'. In a customer churn prediction problem, 'retention' records account for the majority class, and 'churn' records account for the minority class. Sensitivity measures the proportion of actual retentions that are correctly identified as such; specificity measures the proportion of churns that are correctly identified as such. The poor performance of classification techniques on imbalanced data sets is due to the low value of specificity. Many previous studies on imbalanced data sets employed an 'oversampling' technique in which members of the minority class are sampled more than those of the majority class in order to make a relatively balanced data set. When a classification model is constructed using this oversampled balanced data set, specificity can be improved but sensitivity is decreased. In this research, we developed a hybrid model of support vector machine (SVM), artificial neural network (ANN), and decision tree that improves specificity while maintaining sensitivity. We named this hybrid model the 'hybrid SVM model'. The process of construction and prediction of our hybrid SVM model is as follows. By oversampling from the original imbalanced data set, a balanced data set is prepared. The SVM_I model and ANN_I model are constructed using the imbalanced data set, and the SVM_B model is constructed using the balanced data set. The SVM_I model is superior in sensitivity, and the SVM_B model is superior in specificity. For a record on which both the SVM_I model and the SVM_B model make the same prediction, that prediction becomes the final solution. If they make different predictions, the final solution is determined by the discrimination rules obtained from the ANN and decision tree. For records on which the SVM_I model and SVM_B model make different predictions, a decision tree model is constructed using the ANN_I output value as input and actual retention or churn as the target. We obtained the following two discrimination rules: 'IF ANN_I output value < 0.285, THEN Final Solution = Retention' and 'IF ANN_I output value ${\geq}0.285$, THEN Final Solution = Churn.' The threshold 0.285 is the value optimized for the data used in this research. The result we present in this research is the structure or framework of our hybrid SVM model, not a specific threshold value such as 0.285; the threshold value in the above discrimination rules can therefore be changed to any value depending on the data. In order to evaluate the performance of our hybrid SVM model, we used the 'churn data set' in the UCI Machine Learning Repository, which consists of 85% retention customers and 15% churn customers. The accuracy of the hybrid SVM model is 91.08%, which is better than that of the SVM_I model or SVM_B model. The points worth noticing here are its sensitivity, 95.02%, and specificity, 69.24%. The sensitivity of the SVM_I model is 94.65%, and the specificity of the SVM_B model is 67.00%. Therefore, the hybrid SVM model developed in this research improves the specificity of the SVM_B model while maintaining the sensitivity of the SVM_I model.
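A minimal sketch of the hybrid decision logic described above. Trained SVM_I, SVM_B, and ANN_I models are assumed to exist; only the combination rule (agreement first, then the ANN_I output threshold rule) is shown, using the 0.285 threshold reported in the abstract.

```python
# Minimal sketch of the hybrid SVM combination rule. Upstream model training is
# assumed; only the prediction-combination step described in the abstract is shown.

ANN_THRESHOLD = 0.285   # value optimized for the paper's data; data-dependent

def hybrid_predict(svm_i_pred: str, svm_b_pred: str, ann_i_output: float) -> str:
    """Combine SVM_I and SVM_B; fall back to the ANN_I-based rule on disagreement."""
    if svm_i_pred == svm_b_pred:
        return svm_i_pred                       # both models agree
    # Discrimination rule obtained from the ANN/decision-tree step:
    return "Retention" if ann_i_output < ANN_THRESHOLD else "Churn"

print(hybrid_predict("Churn", "Churn", 0.91))        # agreement -> Churn
print(hybrid_predict("Retention", "Churn", 0.12))    # disagreement -> Retention
```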

A Review of Personal Radiation Dose per Radiological Technologists Working at General Hospitals (전국 종합병원 방사선사의 개인피폭선량에 대한 고찰)

  • Jung, Hong-Ryang;Lim, Cheong-Hwan;Lee, Man-Koo
    • Journal of radiological science and technology
    • /
    • v.28 no.2
    • /
    • pp.137-144
    • /
    • 2005
  • To determine the personal radiation dose of radiological technologists, a survey was conducted of 623 radiological technologists who had been working at 44 general hospitals in Korea's 16 cities and provinces from 1998 to 2002. A total of 2,624 collected cases of personal radiation dose were analyzed by region, year, and hospital, with the following results: 1. The average radiation dose per capita by region and year for the 5 years was 1.61 mSv. By region, Daegu showed the highest amount at 4.74 mSv, followed by Gangwon at 4.65 mSv and Gyeonggi at 2.15 mSv. The lowest amounts were recorded in Chungbuk 0.91 mSv, Jeju 0.94 mSv, and Busan 0.97 mSv, in that order. By year, 2000 showed the highest radiation dose at 1.80 mSv, followed by 2002 at 1.77 mSv, 1999 at 1.55 mSv, 2001 at 1.50 mSv, and 1998 at 1.36 mSv. 2. In 1998, Gangwon had the highest radiation dose per capita at 3.28 mSv, followed by Gwangju 2.51 mSv and Daejeon 2.25 mSv, while Jeju 0.86 mSv and Chungbuk 0.85 mSv belonged to the areas where the radiation dose remained less than 1.0 mSv. In 1999, Gangwon also topped the list with 5.67 mSv, followed by Daegu with 4.35 mSv and Gyeonggi with 2.48 mSv. In the same year, the radiation dose was kept below 1.0 mSv in Ulsan 0.98 mSv, Gyeongbuk 0.95 mSv, and Jeju 0.91 mSv. 3. In 2000, Gangwon was again at the top of the list with 5.73 mSv. Ulsan had remained below 1.0 mSv in 1998 and 1999 consecutively, whereas in 2000 the amount rose sharply to 5.20 mSv. Chungbuk remained below the level of 1.0 mSv with 0.79 mSv. 4. In 2001, Daegu recorded the highest radiation dose of the 5 years analyzed with 9.05 mSv, followed by Gangwon with 4.01 mSv. The areas with less than 1.0 mSv included Gyeongbuk 0.99 mSv and Jeonbuk 0.92 mSv. In 2002, Gangwon again led the list with 4.65 mSv, while Incheon 0.88 mSv, Jeonbuk 0.96 mSv, and Jeju 0.68 mSv belonged to the regions with less than 1.0 mSv of radiation dose. 5. By hospital, KMH in Daegu showed the highest average radiation dose over the 5-year period at 6.82 mSv, followed by GAH in Gangwon at 5.88 mSv and CAH in Seoul at 3.66 mSv. YSH in Jeonnam at 0.36 mSv had the lowest average dose, followed by GNH in Gyeongnam at 0.39 mSv and DKH in Chungnam at 0.51 mSv. The present study is limited in that it focuses on radiological technologists working at tertiary referral hospitals, which are regarded as stable in terms of working conditions, while radiological technologists working at small hospitals are excluded from the survey. In addition, some hospitals established less than 5 years earlier are included in the survey, and radiological technologists who have worked at a hospital for less than 5 years are also surveyed. We cannot exclude the possibility that the differences in personal average radiation dose by region, hospital, and year are attributable to different working conditions and facilities at different medical institutions. It therefore seems desirable to develop standardized instruments to measure the working environment objectively and to devise methods to compare and analyze it by region and hospital more accurately in the future.
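A minimal sketch of the aggregation behind the reported figures: averaging personal dose records (mSv) by region and by year. The records below are placeholders, not the surveyed data.

```python
# Minimal sketch of per-region and per-year averaging of personal dose records.
# The (region, year, dose) tuples are illustrative placeholders only.
from collections import defaultdict

records = [  # (region, year, dose_mSv); placeholder values
    ("Daegu", 2001, 9.05), ("Gangwon", 2001, 4.01),
    ("Gangwon", 2002, 4.65), ("Jeju", 2002, 0.68),
]

by_region = defaultdict(list)
by_year = defaultdict(list)
for region, year, dose in records:
    by_region[region].append(dose)
    by_year[year].append(dose)

print({r: round(sum(d) / len(d), 2) for r, d in by_region.items()})
print({y: round(sum(d) / len(d), 2) for y, d in by_year.items()})
```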
