• Title/Summary/Keyword: K-means clustering method (K-평균 군집방법)

Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.1-19 / 2018
  • Because forecasting stock movements matters both academically and practically, stock price prediction has been studied actively. Stock price forecasting research is classified by whether it uses structured or unstructured data, and more specifically into technical analysis, fundamental analysis, and media effect analysis. In the big data era, research that combines big data with stock price prediction is actively underway, and studies based on large volumes of data mainly rely on machine learning techniques. Methods that incorporate media effects have attracted particular attention recently, and among them, studies that analyze online news to forecast stock prices are becoming mainstream. Most previous studies that predict stock prices from online news apply sentiment analysis to the news, build a separate corpus for each company, and construct a dictionary that predicts stock prices from recorded responses of past stock prices. Existing studies have therefore examined the impact of online news on individual companies; for example, stock movements of Samsung Electronics are predicted using only online news about Samsung Electronics. A method that considers influences among highly relevant companies has also been studied recently; for example, stock movements of Samsung Electronics are predicted using news about Samsung Electronics and a highly related company such as LG Electronics. These studies examine the effect of news from a homogeneous industrial sector on the individual company, and they classify homogeneous industries according to the Global Industrial Classification Standard. In other words, they assume that industries defined by the Global Industrial Classification Standard are homogeneous. However, existing studies are limited in that they do not take into account influential companies with high relevance or reflect heterogeneity within the same Global Industrial Classification Standard sector. Our examination of various sectors shows that some industrial sectors are not homogeneous groups. To overcome this limitation, our study proposes a methodology that applies k-means clustering to reflect the heterogeneous effects within an industrial sector that affect stock prices. Multiple Kernel Learning is mainly used to integrate data with different characteristics; it has several kernels, each of which receives different data and makes predictions from it. We used Multiple Kernel Learning to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel was assigned to predict stock prices from financial news variables of the target firm and of the industrial groups divided by k-means cluster analysis. To validate the proposed methodology, experiments were conducted on three years of online news and stock prices. The results are as follows. (1) We confirmed that information about industrial sectors related to the target company is also meaningful for predicting the target company's stock movements, and that the machine learning algorithm predicts better when news about relevant companies is considered together with the target company's news. (2) It is important to predict stock movements with a number of clusters that varies according to the level of homogeneity in the industrial sector. That is, when stock prices within an industrial sector are homogeneous, it is important to use the relational effect at the industry-group level without clustering, or with a small number of clusters. When stock prices within an industry group are heterogeneous, it is important to cluster the firms into groups. This study contributes by demonstrating that firms classified under the Global Industrial Classification Standard are heterogeneous and by suggesting that relevance should be defined through machine learning and statistical analysis rather than simply by the Global Industrial Classification Standard. It also contributes by demonstrating the effectiveness of a prediction model that reflects heterogeneity.
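
The clustering step mentioned in this abstract can be illustrated with a minimal k-means sketch (Lloyd's algorithm) in plain Python. The abstract does not say which features the authors clustered or how they initialized the algorithm, so the return vectors, firm labels, and the `kmeans` helper below are hypothetical and serve only to show how firms in one sector might be split into more homogeneous groups.

```python
import math


def kmeans(points, k, iters=100):
    """Minimal Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Assumption: deterministic start using the first k points as centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


# Hypothetical three-day return vectors for four firms in one sector.
returns = [
    (0.010, 0.020, -0.010),   # firm A
    (-0.020, -0.010, 0.030),  # firm B
    (0.012, 0.018, -0.008),   # firm C
    (-0.018, -0.012, 0.028),  # firm D
]
labels, centroids = kmeans(returns, k=2)
print(labels)  # firms A and C share one cluster, B and D the other
```

In a setup like the one the abstract describes, news from each resulting cluster could then feed a separate kernel; how the authors actually wired this is not stated in the abstract.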

Impact of Difference in Korean Wave Awareness among Chinese Women on Quality Perception and Purchasing Behavior of Korean Cosmetic Products (중국여성의 한류 인지도 차이가 한국 화장품에 대한 품질인식과 구매행동에 미치는 영향)

  • Lee, Jeong-Suk
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.10 / pp.5097-5104 / 2013
  • To derive implications for the marketing strategy of Korean cosmetic products in China, differences in quality perception and purchasing behavior were analyzed between two groups of Chinese women classified by their awareness of the Korean Wave. The k-means clustering method, independent-samples t-tests, and factor analysis were applied to survey results from Chinese women residing in Guangzhou. The positive impact of the Korean Wave on quality perception and brand image is much stronger in the higher-awareness group than in the lower-awareness group, which leads to higher product satisfaction and greater willingness to recommend purchases. Marketing strategies therefore need to be adjusted to customers' awareness of the Korean Wave. However, because low price is the primary inducement to purchase for both groups, stronger efforts to enhance brand image and product quality as premium products are required, together with use of the Korean Wave.

Marine Algal Flora and Community Structure of Igidea Area in Busan, Korea (부산 이기대 지역의 해조상 및 군집구조)

  • Shin, Bong-Kyun;Kwon, Chun-Jung;Lee, Suk-Mo;Choi, Chang-Geun
    • Journal of the Korean Society of Marine Environment & Safety / v.20 no.2 / pp.121-129 / 2014
  • Marine algal flora and community structure were investigated seasonally at four sites in the vicinity of Igidae on the southeastern coast of Korea from May 2010 to February 2011. A total of 66 species were found during the survey period, comprising 9 Chlorophyta, 14 Phaeophyta, and 43 Rhodophyta. Of these, 16 species were found throughout the year. Seasonal mean biomass in wet weight was 123.6 (spring), 2,061.6 (summer), 412.0 (autumn), and 678.9 (winter) $g{\cdot}m^{-2}$, so the maximum was recorded in summer and the minimum in spring. The highest species number was recorded at stations 3 and 4 (50 species) and the lowest at station 1 (47 species). At stations 1 and 2, which are directly exposed to runoff from the Yongho and Daeyeon streams (cheon) and to discharge from the Nambu sewage treatment plants near the coast, species diversity was relatively low and the dominant species were similar throughout the four seasons. The R/P, C/P, and (R+C)/P values reflecting the flora characteristics were 3.07, 0.64, and 3.71, respectively. The flora could be classified into six functional groups during the survey period: coarsely branched form (39.39%), sheet form (30.30%), thick leathery form (13.64%), filamentous form (12.12%), crustose form (3.03%), and jointed calcareous form (1.52%). The number of marine algal species at Igidae was 96 in 1996 to 1997 and 66 in 2010 to 2011. The change in seaweed species is attributed to pollution loads from the sewage treatment plant and streams. We therefore recommend that active management measures, such as sewage treatment, be applied to nearby coastal areas to protect seaweed beds.
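
The floristic ratios and functional-group shares reported in this abstract can be cross-checked from the species counts it gives. The short sketch below assumes that R, P, and C stand for the Rhodophyta, Phaeophyta, and Chlorophyta species counts; the per-group species counts are not stated in the abstract and are back-calculated here from the published percentages.

```python
# Species counts per division, as reported in the abstract.
species = {"Chlorophyta": 9, "Phaeophyta": 14, "Rhodophyta": 43}
total = sum(species.values())

c = species["Chlorophyta"]
p = species["Phaeophyta"]
r = species["Rhodophyta"]

# Floristic ratios, rounded to two decimals as in the abstract.
rp = round(r / p, 2)
cp = round(c / p, 2)
rcp = round((r + c) / p, 2)

# Functional-group shares (percent of all species), as reported.
form_pct = {
    "coarsely branched": 39.39,
    "sheet": 30.30,
    "thick leathery": 13.64,
    "filamentous": 12.12,
    "crustose": 3.03,
    "jointed calcareous": 1.52,
}
# Back-calculated species counts per functional group (not in the abstract).
form_counts = {name: round(pct * total / 100) for name, pct in form_pct.items()}

print(total, rp, cp, rcp)
print(form_counts)
```

The recomputed ratios agree with the reported 3.07, 0.64, and 3.71, and the back-calculated group counts add up to the 66 species found.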

Effect of Sand Extraction on Meiobenthic Community of Jangbong-do in the Eastern Yellow Sea of Korea (서해 주문도 연안 사질 조하대에서의 해사채취가 중형저서동물 군집에 미치는 영향 연구)

  • Kang, Teawook;Min, Won-Gi;Hong, Jae-Sang;Kim, Dongsung
    • Korean Journal of Environmental Biology / v.32 no.2 / pp.138-152 / 2014
  • The objective of this study was to determine the effect of marine sand extraction on the community composition of the meiobenthos and on its rate of recolonization following the cessation of mining activities. Because of their distribution in nature, high abundance, intimate association with sediments, fast reproduction, benthic larval period, sensitivity to pollution, and rapid life histories, meiobenthos are widely regarded as ideal organisms for use as ecological indicators of natural and anthropogenic stresses. The community structure of meiobenthos was studied at seven stations within sandy tidal and subtidal zones at Jangbongdo in the Yellow Sea, Korea, from Aug. 2006 to Dec. 2007. Meiobenthic samples were collected as three core samples, each 3.6 cm in diameter, from each sediment sample taken with a Smith-McIntyre grab. Sand mining was found to often cause complete removal of the sediment and damage to the habitats of meiobenthos. The study showed that sand mining reduced the total abundance and biomass of meiobenthos in the mining area. It further showed initial restoration of abundance and biomass within one year of the cessation of sand mining.

RAPD Analysis for Genetic Diversity of Melon Species (참외와 멜론의 유전적 다양성에 대한 RAPD 분석)

  • Mo, Suk-Youn;Im, Sung-Hee;Go, Gwan-DaI;Ann, Chong-Mun;Kim, Doo Hwan
    • Horticultural Science & Technology / v.16 no.1 / pp.21-24 / 1998
  • RAPD markers were analyzed to detect genetic variation and diversity among fifty-two melon lines. The SDS extraction method yielded more and purer DNA than the CTAB method. The optimized RAPD reaction conditions were: 10 ng template DNA, 270 nM primer, $200{\mu}M$ each of dATP, dCTP, dGTP, and dTTP, $0.3{\mu}unit$ dynazyme, and 10x buffer, brought to a $15{\mu}l$ final volume with distilled water. The appropriate annealing temperature was $39^{\circ}C$, and forty cycles of amplification produced the best RAPD band patterns. Of a total of 123 bands from 12 random primers, 25 polymorphic bands (20%) were selected as reliable markers. The average number of polymorphic bands per primer was 2.1 among the 52 lines. Genetic relationships based on marker differences were closer within groups than between groups. The 52 lines could be grouped into two major groups (Korean landraces and melon lines), and the melon group was subdivided into two subgroups (net melon lines and no-net melon lines). This result corresponded to the morphological grouping. Eight RAPD markers separated the Korean landrace and melon groups, and four RAPD markers separated the net melon and no-net melon groups.
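
The band statistics in this abstract follow directly from the three counts it reports. As a quick arithmetic check (variable names are illustrative only):

```python
# Counts reported in the abstract.
total_bands = 123       # bands amplified across all primers
primers = 12            # random primers used
polymorphic_bands = 25  # bands selected as reliable markers

# Share of polymorphic bands, in percent (the abstract rounds this to 20%).
polymorphic_share = round(100 * polymorphic_bands / total_bands, 1)

# Average number of polymorphic bands per primer.
per_primer = round(polymorphic_bands / primers, 1)

print(polymorphic_share, per_primer)
```

Both values are consistent with the reported 20% and 2.1 bands per primer.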

Recognition of Flat Type Signboard using Deep Learning (딥러닝을 이용한 판류형 간판의 인식)

  • Kwon, Sang Il;Kim, Eui Myoung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.4 / pp.219-231 / 2019
  • Specifications are set for each type of signboard, but the shapes and sizes of the signboards actually installed are not uniform. In addition, because signboard colors are not defined, various colors are applied. Signboard recognition might seem similar to the recognition of road signs and license plates, but owing to the nature of signboards, they cannot be recognized in the same way. In this study, we propose a methodology that recognizes flat-type signboards, which are the main targets among illegal and old signboards, and automatically extracts signboard areas using the deep learning-based Faster R-CNN algorithm. The process of recognizing flat-type signboards in images captured with smartphone cameras is divided into two steps. First, deep learning was used to recognize the signboard type so that flat-type signboards could be identified among images of various signboard types, with an accuracy of about 71%. Next, a boundary recognition algorithm was applied to the flat-type signboards, and their boundary was recognized with an accuracy of 85%.

Phytosociological Community Type Classification and Stand Structure in the Forest Vegetation of Hongdo Island, Jeollanam-do Province (전라남도 홍도 산림식생의 식물사회학적 군락유형분류와 임분 구조)

  • Kim, Ho-Jin;Shin, Jae-Kwon;Lee, Cheul-Ho;Yun, Chung-Weon
    • Journal of Korean Society of Forest Science / v.107 no.3 / pp.245-257 / 2018
  • This study was carried out to determine the forest vegetation structure of Hongdo Island, Jeollanam-do Province. Vegetation data were collected from a total of forty-one quadrat plots using the Z-M phytosociological method from June to August 2017 and analyzed in terms of vegetation classification, mean importance value, and species diversity. In the vegetation type classification, the Castanopsis sieboldii community group was classified at the top level of the vegetation hierarchy. At the community level, it was divided into the Neolitsea sericea community and the Carpinus turczaninowii community. The N. sericea community was subdivided into the Ficus erecta group (vegetation unit 1, VU 1) and the Arisaema ringens group (VU 2). The C. turczaninowii community was subdivided into the Fraxinus sieboldiana group (VU 3) and the C. turczaninowii typical group (VU 4). The vegetation was therefore classified into a total of four vegetation units (one community group, three communities and four groups). In terms of mean importance value, Castanopsis sieboldii was the highest in VU 1, VU 2, VU 4, and C. turczaninowii in VU 4, respectively. VU 3 had the highest species diversity index among the four units. In conclusion, the forest vegetation of Hongdo Island was classified into four units and seven species groups. Hongdo Island could be managed through a community-ecological approach based on these units and groups. Research on succession to evergreen broad-leaved forest should also be pursued more intensively in the near future.

The Concentration Distribution and Source Identification of Polychlorinated Biphenyls in River Sediment (하천 퇴적물 중 PCBs 농도분포 및 발생원 해석)

  • Jin, Ronghu;Oh, Jung-Keun;Kim, Jong-Guk;Kim, Kyoung-Soo
    • Journal of Korean Society of Environmental Engineers / v.32 no.11 / pp.995-1000 / 2010
  • To investigate the relationship between polychlorinated biphenyl (PCB) sources and concentration levels in sediment, a total of 63 sediment samples, three from each of 21 sites in the Nakdong River, were measured. The analysis showed that the total concentrations and toxic equivalent (TEQ) concentrations of dioxin-like PCBs ranged from 3.0 to 6,600 pg/g-dry with a mean value of 440 pg/g-dry and

Characteristics of Groundwater Quality for Agricultural Irrigation in Plastic Film House Using Multivariate Analysis (다변량분석법을 이용한 시설재배지 지하수 수질 특성)

  • Kim, Jin-Ho;Choi, Chul-Mann;Lee, Jong-Sik;Yun, Sun-Gang;Lee, Jung-Taek;Cho, Kwang-Rae;Lim, Su-Jung;Choi, Seung-Chul;Lee, Gyeong-Ja;Kwon, Yeu-Seok;Kyung, Ki-Chon;Uhm, Mi-Jeong;Kim, Hee-Kwon;Lee, You-Seok;Kim, Chan-Yong;Lee, Seong-Tae;Ryu, Jong-Su
    • Korean Journal of Environmental Agriculture / v.27 no.1 / pp.1-9 / 2008
  • The main purpose of this study is to accumulate fundamental data on the groundwater of plastic film houses through water quality analysis and multivariate statistical analysis. Groundwater samples were collected every two years from 2000 to 2004 from a total of 211 sites. According to the water quality analysis, groundwater quality was, on average, suitable for irrigation. Correlation analysis showed that EC had its highest positive correlation with $Mg^{2+}$, at 0.810 (p<0.01) in April and 0.776 (p<0.01) in July. $NO_3-N$ had its highest positive correlation with T-N, at 0.794 (p<0.01) in October. This shows that results can sometimes differ even between similar cases. Four factors were extracted through factor analysis in April and July, but five factors were extracted in October. The cumulative proportions of variance explained by the factors were 64.9, 60.2, and 70.7 in April, July, and October, respectively. The first factor was highly related to anions and cations such as $Ca^{2+},\;Mg^{2+},\;Cl^-,\;{SO_4}^{2-}$, and to EC, in contrast to that of stream water. According to the cluster analysis, the 211 sites were classified into four groups. Group A showed the common type of groundwater quality. The pH and $PO_4-P$ were highest in Group B. The anions and cations were highest in Group C. $COD_{Cr}$ was highest in Group D.

Redescription and Multivariate Analysis of Genus Phintella (Araneae, Salticidae) from Korea (한국산 Phintella속(거미목, 깡충거미과)의 재기재와 다변량분석)

  • Bo-Keun Seo
    • Animal Systematics, Evolution and Diversity / v.11 no.2 / pp.183-197 / 1995
  • Descriptions and identifications of the 6 species belonging to the genus Phintella from Korea are insufficient and inaccurate. In the present paper, redescriptions, illustrations, and an identification key are provided for 7 species of the genus Phintella, including P. popovi, newly recorded in the Korean spider fauna, and Ocius munitus described by Wesolowska (1981s) is synonymized with P. cavaleriei. To validate the author's identification and pairing, multivariate analysis was performed on 134 individuals using 13 RVCs with STD below 0.05. The result of the discriminant analysis carried out with the 13 RVCs of the 134 individuals was not satisfactory, but the cluster analysis performed with the mean ratio values of 14 OTUs for the 13 RVCs gave the same result as the author's pairing except for P. abnormis, which showed larger dissimilarity than the other pairs. Pairing of the 7 species was therefore possible as a whole, because only one species failed to pair, even though this is an imperfect result. This method would be helpful for pairing tests and identification if it were improved.
