• Title/Summary/Keyword: Analyzing


Correlation Analysis of Factors Affecting the Collimator Size used during Lumbar Spine Lateral Examination in Digital Radiography System (디지털 방사선 장비에서 요추 측면 검사 시 사용되는 조사야 크기에 영향을 미치는 요인의 상관관계 분석)

  • Young-Cheol Joo;Sin-Young Yu
    • Journal of the Korean Society of Radiology / v.18 no.4 / pp.345-353 / 2024
  • The purpose of this study was to suggest an appropriate collimation size and central X-ray incidence point by analyzing the correlation between the collimation size used in lumbar lateral examination and the factors affecting it. The lumbar lateral examination results of 148 patients suitable for the purpose of this study were analyzed. The collimation size (CS) used during the examination was taken to be the total horizontal width shown in the image. Three distances were also measured: the distance connected vertically from the dorsal end of the image field to the apophyseal joint of the third lumbar vertebra (AJD), the distance from the dorsal end of the image field to the center of the body of the third lumbar vertebra (BD), and the distance from the dorsal end of the image field to the center of the pedicle of the third lumbar vertebra (PD). To compare the mean values of the dependent variables according to gender, age, height, weight, and body mass index (BMI), the independent samples t test and one-way ANOVA were used, with Duncan's test for post hoc analysis. The correlation between independent and dependent variables was analyzed using Pearson correlation analysis. Statistical significance was set at a p value of 0.05 or lower. The average collimation size during the lumbar spine lateral examination was 252.45 mm; AJD was 102.11 mm, BD was 141.17 mm, and PD was 119.73 mm. The mean values of collimation size, AJD, BD, and PD were larger in men than in women, but the difference by gender was statistically significant only for BD (p<0.05). There was a slight difference in the mean value of each group according to age, but it was not statistically significant (p>0.05).
The mean values of collimation size, AJD, BD, and PD differed according to height, weight, and body mass index, and the differences were all statistically significant (p<0.05). In the correlation analysis, collimation size, AJD, BD, and PD showed no correlation with gender and age, a weak positive correlation with height, and a moderate positive correlation with weight and body mass index. The results of this study showed that CS was correlated with height, weight, and BMI during lumbar lateral examination. If the entrance point of the central X-ray is moved to the apophyseal joint, taking weight and BMI into account when adjusting the collimation size in clinical practice, the collimation size is expected to be reducible by about 5%.
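The Pearson correlation analysis mentioned in the abstract above can be illustrated with a minimal sketch. The function name `pearson_r` and the sample values are assumptions made purely for illustration; they are not data or code from the study.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical illustration values only, NOT measurements from the study.
bmi = [19.5, 22.0, 24.3, 26.8, 30.1]
collimation_mm = [238.0, 247.0, 251.0, 259.0, 266.0]
r = pearson_r(bmi, collimation_mm)
print(f"r = {r:.3f}")
```

A coefficient near +1 indicates a strong positive linear relationship and a value near 0 indicates none. The abstract does not state the cut-offs it uses for "weak" and "moderate", so none are assumed here.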

Evaluation of Fruit Yield and Quality of Netted Melon, Water and Nutrient Use Efficiency in a Closed Hydroponic System (순환식 수경재배 멜론의 수량과 품질, 관개수 및 양분 이용 효율성 평가)

  • Minju Shin;Seungri Yoon;Jin Hyun Kim;Ho Jeong Jeong;Sung Kyeom Kim
    • Journal of Bio-Environment Control / v.32 no.4 / pp.492-500 / 2023
  • This study examined the closed hydroponic cultivation of netted melons (Cucumis melo L.) using coir substrate, analyzing the impact of this cultivation method on melon yield, fruit quality, and the efficiency of water and nutrient usage. The experimental results showed that the average fruit weight of melons grown in the closed system was 71.4 g higher than in the open system, and the fruit width was on average 0.2 cm larger, showing a statistically significant difference. However, there was no difference in the average sugar content of the fruit flesh or in fruit height. Although there is no substantial commercial difference, it is conjectured that the change in the macronutrient ratio of the irrigation solution played a role in the statistically significant increase in fruit weight, which is attributed to changes in the crops' nutrient uptake concentrations. Further research is needed for a more comprehensive understanding. In terms of the productivity of the irrigation water required to produce the fruit, applying the closed system resulted in an increase of 7.6 kg/ton compared to the open system, saving 31.6% of water resources. Additionally, cultivating in a closed system allowed for savings of approximately 59, 25, 55, 83, 76, and 87% of N, P, K, Ca, Mg, and S, respectively, over the entire cultivation period. As the drainage was reused, the ratios of NO₃⁻ and Ca²⁺ increased by up to 9.6 and 9.1%, respectively, while the ratios of other ions gradually decreased. In summary, these results suggest that closed hydroponic cultivation can effectively optimize the use of water and fertilizer while maintaining excellent fruit quality in melon cultivation.

Analysis of Fruit Quality and Productivity of 'Kawanakajima Hakuto' Peach according to the Different Irrigation Starting Point (관수 개시점에 따른 복숭아 '천중도백도'의 과실 품질 및 생산성 변화 분석)

  • Seul Ki Lee;Jung Gun Cho;Jae Hoon Jeong;Dongyong Lee;Jeom Hwa Han;Si Hyeong Jang;Suhyun Ryu;Heetae Kim;Sang-Hyeon Kang
    • Journal of Bio-Environment Control / v.32 no.4 / pp.475-483 / 2023
  • This study was conducted to determine the optimal irrigation starting point by analyzing tree growth, physiological responses, fruit quality, and productivity in peach orchards. Seven-year-old 'Kawanakajima Hakuto' peach trees were used in an experimental field (35°49'30.4"N, 127°01'33.2"E) at the National Institute of Horticultural and Herbal Science in Wanju-gun, Jeollabuk-do. The irrigation starting point was set at four levels, -20, -40, -60, and -80 kPa, from June to September 2022. While there were no significant differences among treatments in the increase of trunk cross-section area or in leaf area, shoot length and diameter decreased in the -80 kPa and -20 kPa treatments. The photosynthetic rate measured in August was highest for -60 kPa (17.7 μmol·m⁻²·s⁻¹), followed by -40 kPa (15.6 μmol·m⁻²·s⁻¹), -20 kPa (14.5 μmol·m⁻²·s⁻¹), and -80 kPa (14.0 μmol·m⁻²·s⁻¹). The SPAD value measured in May and August was lower in the -80 kPa and -20 kPa treatments than in the -60 kPa and -40 kPa treatments. Harvest was reached three days earlier in the -20 kPa treatment than in the other treatments. Fruit weight was highest in the -60 kPa treatment (379.1 g), followed by -40 kPa (344.0 g), -80 kPa (321.0 g), and -20 kPa (274.9 g). Firmness was lowest in the -20 kPa treatment. The soluble solid content was highest in the -60 kPa treatment (13.3°Bx). The ratio of marketable fruits was highest in the -60 kPa treatment (50.7%) and lowest in the -80 kPa treatment (23.4%). In conclusion, we suggest that setting the irrigation starting point at -60 kPa could improve fruit quality and yield in peach orchards.

The Effect of Empathy Value of Chinese Female University Students on Affection with Sustainable Fashion Products on Affection and Purchase Intention (중국 여대생의 지속가능한 패션제품에 대한 공감가치가 호감도와 구매의사에 미치는 영향)

  • Yi-Fei Wu;Young-Sook Lee
    • Journal of Internet of Things and Convergence / v.10 no.3 / pp.35-48 / 2024
  • This study analyzed the value empathy of environmentally sustainable fashion products, encompassing environmental, economic, and social values, drawing from existing literature. We sought to verify the relationship between empathic value and the likability and purchase intention towards these products. To validate these relationships, we formulated research hypotheses and conducted an online survey targeting female college students residing in Guangzhou, Guangdong Province, China, who have experience purchasing environmentally sustainable fashion products. The survey was conducted from August 10th to August 20th, 2023, with a total distribution of 352 questionnaires. Among the collected responses, 313 valid responses were utilized for data analysis. The collected survey data underwent frequency analysis, exploratory factor analysis, reliability and validity analysis, correlation analysis, and multiple regression analysis using SPSS 26.0 software. The analysis yielded the following results. First, the empathy value of environmentally sustainable fashion products was classified into environmental protection values, economic values, and social values. Second, the economic and social values of environmentally sustainable fashion products were found to have a positive effect on favorability. Third, it was found that the environmental protection value and social value of environmentally sustainable fashion products had a positive effect on purchase intention. Fourth, it was found that Chinese female college students' favorability toward environmentally sustainable fashion products had a positive effect on their purchase intention. Based on these results, it is judged that companies need to emphasize the characteristics of products such as environmental protective value, economic value, and social value in order to promote consumers' purchase of environmentally sustainable fashion products. 
The purpose of this study is to help develop marketing strategies for environmentally sustainable fashion products by providing basic data, development ideas, and methods useful for environmentally sustainable fashion-related industries and companies by analyzing the relationship between empathy value, favorability, and purchase intention.

Exploring the Effects of Corporate Organizational Culture on Financial Performance: Using Text Analysis and Panel Data Approach (기업의 조직문화가 재무성과에 미치는 영향에 대한 연구: 텍스트 분석과 패널 데이터 방법을 이용하여)

  • Hansol Kim;Hyemin Kim;Seung Ik Baek
    • Information Systems Review / v.26 no.1 / pp.269-288 / 2024
  • The main objective of this study is to empirically explore how the organizational culture influences financial performance of companies. To achieve this, 58 companies included in the KOSPI 200 were selected from an online job platform in South Korea, JobPlanet. In order to understand the organizational culture of these companies, data was collected and analyzed from 81,067 reviews written by current and former members of these companies on JobPlanet over a period of 9 years from 2014 to 2022. To define the organizational culture of each company based on the review data, this study utilized well-known text analysis techniques, namely Word2Vec and FastText analysis methods. By modifying, supplementing, and extending the keywords associated with the five organizational culture values (Innovation, Integrity, Quality, Respect, and Teamwork) defined by Guiso et al. (2015), this study created a new Culture Dictionary. By using this dictionary, this study explored which cultural values-related keywords appear most often in the review data of each company, revealing the relative strength of specific cultural values within companies. Going a step further, the study also investigated which cultural values statistically impact financial performance. The results indicated that the organizational culture focusing on innovation and creativity (Innovation) and on customers and the market (Quality) positively influenced Tobin's Q, an indicator of a company's future value and growth. For the indicator of profitability, ROA, only the organizational culture emphasizing customers and the market (Quality) showed statistically significant impact. This study distinguishes itself from traditional surveys and case analysis-based research on organizational culture by analyzing large-scale text data to explore organizational culture.

Analysis of Greenhouse Thermal Environment by Model Simulation (시뮬레이션 모형에 의한 온실의 열환경 분석)

  • 서원명;윤용철
    • Journal of Bio-Environment Control / v.5 no.2 / pp.215-235 / 1996
  • Thermal analysis by mathematical model simulation makes it possible to reasonably predict the heating and/or cooling requirements of greenhouses located in various geographical and climatic environments. Another advantage of the model simulation technique is that it makes it possible to select an appropriate heating system, set up an energy utilization strategy, schedule seasonal cropping patterns, and determine new greenhouse ranges. In this study, the control pattern for greenhouse microclimate is categorized as cooling and heating. A dynamic model was adopted to simulate heating requirements and/or energy conservation effectiveness, such as energy saving by a night-time thermal curtain, estimation of Heating Degree-Hours (HDH), and long-term prediction of greenhouse thermal behavior. On the other hand, the cooling effects of ventilation, shading, and a pad & fan system were partly analyzed by a static model. Experimental work with a small model greenhouse of 1.2 m $\times$ 2.4 m showed that cooling the greenhouse by spraying cold water directly on the greenhouse cover surface or by recirculating cold water through heat exchangers would be effective for summer cooling. The mathematical model developed for greenhouse simulation is highly applicable because it can reflect various climatic factors such as temperature, humidity, beam and diffuse solar radiation, and wind velocity. The model was closely verified against weather data obtained through long-period greenhouse experiments. Most of the data relating to greenhouse heating or cooling components were obtained from the mathematically simulated model greenhouse using typical-year (1987) data for Jinju, Gyeongnam. Some of the data relating to greenhouse cooling, however, were obtained from model experiments, which included analyzing the cooling effect of water sprayed directly on the greenhouse roof surface. The results are summarized as follows:
1. The heating requirements of the model greenhouse were highly related to the minimum temperature set for the greenhouse. The night-time setting temperature is much more influential on the heating energy requirement than the day-time setting. Therefore, it is highly recommended that the night-time setting temperature be carefully determined and controlled. 2. The HDH data obtained by the conventional method were estimated on the basis of considerably long-term average weather temperature together with the standard base temperature (usually 18.3$^{\circ}C$). This kind of data can merely be used as a relative comparison criterion for heating load, and is not applicable to the calculation of greenhouse heating requirements because of the limited consideration of climatic factors and the inappropriate base temperature. Comparing the HDH data with the simulation results shows that a heating system designed from HDH data will probably overshoot the actual heating requirement. 3. The energy saving effect of the night-time thermal curtain, as well as the estimated heating requirement, is found to be sensitive to weather conditions. The thermal curtain adopted for simulation showed high effectiveness in energy saving, amounting to more than 50% of the annual heating requirement. 4. Ventilation performance during warm seasons is mainly influenced by the air exchange rate, even though there are some variations depending on greenhouse structure, weather, and cropping conditions. For air exchange rates above 1 volume per minute, the reduction in temperature rise in both types of greenhouse considered becomes modest with additional increases in ventilation capacity. Therefore, the desirable ventilation capacity is assumed to be 1 air change per minute, which is the recommended ventilation rate for common greenhouses.
5. In a glass-covered greenhouse with full production, under clear weather of 50% RH and continuous ventilation of 1 air change per minute, the temperature drops in the 50% shaded greenhouse and in the pad & fan greenhouse are 2.6$^{\circ}C$ and 6.1$^{\circ}C$, respectively. The temperature in the control greenhouse under continuous air change at this time was 36.6$^{\circ}C$, which was 5.3$^{\circ}C$ above ambient temperature. As a result, the greenhouse temperature can be maintained 3$^{\circ}C$ below ambient temperature. When RH is 80%, however, it was impossible to drop the greenhouse temperature below ambient temperature because the possible temperature reduction by the pad & fan system at this time is not more than 2.4$^{\circ}C$. 6. During the 3 months of the hot summer season, if the greenhouse is assumed to be cooled only when the greenhouse temperature rises above 27$^{\circ}C$, the relationship between the RH of ambient air and the greenhouse temperature drop ($\Delta T$) was formulated as follows: $\Delta T = -0.077\,RH + 7.7$. 7. The time-dependent cooling effects of ventilation, 50% shading, and a pad & fan system of 80% efficiency, operated individually or in combination, were continuously predicted over one typical summer day. When the greenhouse was cooled only by 1 air change per minute, greenhouse air temperature was 5$^{\circ}C$ above outdoor temperature. Neither method alone can drop greenhouse air temperature below outdoor temperature, even under fully cropped conditions. When both systems were operated together, however, greenhouse air temperature could be controlled to about 2.0-2.3$^{\circ}C$ below ambient temperature. 8. When cool water of 6.5-8.5$^{\circ}C$ was sprayed on the greenhouse roof surface at a flow rate of 1.3 liter/min per unit greenhouse floor area, greenhouse air temperature could be dropped to 16.5-18.0$^{\circ}C$, which is about 10$^{\circ}C$ below the ambient temperature of 26.5-28.0$^{\circ}C$ at that time.
The most important factor in cooling greenhouse air effectively with water spray may be obtaining a plentiful source of cool water, such as ground water itself or cold water produced by a heat pump. Future work will focus not only on analyzing the feasibility of heat pump operation but also on finding the relationships between greenhouse air temperature ($T_{g}$), spraying water temperature ($T_{w}$), water flow rate ($Q$), and ambient temperature ($T_{o}$).
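The linear relationship reported in result 6 of the abstract above, ΔT = -0.077·RH + 7.7, can be evaluated with a short sketch. The function name `cooling_drop_celsius` is hypothetical, and treating RH as a percentage between 0 and 100 is an assumption (it makes ΔT fall to zero at 100% RH); the abstract does not state the units explicitly.

```python
def cooling_drop_celsius(rh_percent):
    """Greenhouse temperature drop in deg C from the abstract's fit dT = -0.077*RH + 7.7.

    Assumption: RH is the ambient relative humidity in percent (0-100).
    """
    if not 0.0 <= rh_percent <= 100.0:
        raise ValueError("relative humidity must be between 0 and 100 percent")
    return -0.077 * rh_percent + 7.7

# Evaluate the fit at two humidity levels discussed in the abstract.
for rh in (50, 80):
    print(f"RH = {rh}% -> expected drop of about {cooling_drop_celsius(rh):.2f} deg C")
```

Under that assumption the fit gives a drop of about 3.85 °C at 50% RH and about 1.54 °C at 80% RH, which matches the abstract's qualitative point that cooling becomes much less effective as ambient humidity rises.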


A Store Recommendation Procedure in Ubiquitous Market for User Privacy (U-마켓에서의 사용자 정보보호를 위한 매장 추천방법)

  • Kim, Jae-Kyeong;Chae, Kyung-Hee;Gu, Ja-Chul
    • Asia pacific journal of information systems / v.18 no.3 / pp.123-145 / 2008
  • Recently, as the information communication technology develops, the discussion regarding the ubiquitous environment is occurring in diverse perspectives. Ubiquitous environment is an environment that could transfer data through networks regardless of the physical space, virtual space, time or location. In order to realize the ubiquitous environment, the Pervasive Sensing technology that enables the recognition of users' data without the border between physical and virtual space is required. In addition, the latest and diversified technologies such as Context-Awareness technology are necessary to construct the context around the user by sharing the data accessed through the Pervasive Sensing technology and linkage technology that is to prevent information loss through the wired, wireless networking and database. Especially, Pervasive Sensing technology is taken as an essential technology that enables user oriented services by recognizing the needs of the users even before the users inquire. There are lots of characteristics of ubiquitous environment through the technologies mentioned above such as ubiquity, abundance of data, mutuality, high information density, individualization and customization. Among them, information density directs the accessible amount and quality of the information and it is stored in bulk with ensured quality through Pervasive Sensing technology. Using this, in the companies, the personalized contents(or information) providing became possible for a target customer. Most of all, there are an increasing number of researches with respect to recommender systems that provide what customers need even when the customers do not explicitly ask something for their needs. Recommender systems are well renowned for its affirmative effect that enlarges the selling opportunities and reduces the searching cost of customers since it finds and provides information according to the customers' traits and preference in advance, in a commerce environment. 
Recommender systems have proved their usability through several methodologies and experiments conducted in many different fields since the mid-1990s. Most research on recommender systems until now has taken products or information in internet or mobile contexts as its object, and there is not enough research concerned with recommending an adequate store to customers in a ubiquitous environment. In a ubiquitous environment, it is possible to track customers' behavior even when they are purchasing in an offline marketplace, in the same way as in an online market space. Unlike the existing internet space, in a ubiquitous environment there is increasing interest in stores that provide information according to the traffic line of the customers. In other words, the same product can be purchased in several different stores, and the preferred store can differ among customers according to personal preferences such as the traffic line between stores, location, atmosphere, quality, and price. Krulwich(1997) developed Lifestyle Finder, which recommends a product and a store by using the demographic information and purchasing information generated in internet commerce. Fano(1998) created Shopper's Eye, an information providing system that shows information about the store closest to the customer's present location when the customer has sent a to-buy list. Sadeh(2003) developed MyCampus, which recommends appropriate information and a store in accordance with the schedule saved in a customer's mobile device. Moreover, Keegan and O'Hare(2004) came up with EasiShop, which provides suitable store information including price, after-sales service, and accessibility after analyzing the to-buy list and the current location of customers.
However, Krulwich(1997) does not indicate the characteristics of physical space, being based on the online commerce context, and Keegan and O'Hare(2004) only provide information about stores related to a product, while Fano(1998) does not fully consider the relationship between the preference toward the stores and the store itself. The research by Sadeh(2003) experimented on campus with recommender systems that reflect situation and preference information in addition to the characteristics of the physical space. Yet there is a potential problem, since this research is based on the location and preference information of customers, which is connected to the invasion of privacy. The primary point of controversy in a ubiquitous environment is the invasion of privacy and individual information, according to research conducted by Al-Muhtadi(2002), Beresford and Stajano(2003), and Ren(2006). Additionally, individuals want to remain anonymous to protect their own personal information, as mentioned in Srivastava(2000). Therefore, in this paper, we suggest a methodology to recommend stores in a U-market, on the basis of a ubiquitous environment, without using personal information, in order to protect individual information and privacy. The main idea behind our suggested methodology is based on the Feature Matrices model (FM model, Shahabi and Banaei-Kashani, 2003), which uses clusters of customers' similar transaction data and is similar to Collaborative Filtering. Unlike Collaborative Filtering, however, this methodology overcomes the problems of personal information and privacy since it does not know exactly who the customers are. The methodology is compared with a single-trait model (vector model) based on data such as visitor logs, looking at the actual improvement in recommendation when context information is used. It is not easy to find real U-market data, so we experimented with factual data from a real department store with context information.
The recommendation procedure of U-market proposed in this paper is divided into four major phases. First phase is collecting and preprocessing data for analysis of shopping patterns of customers. The traits of shopping patterns are expressed as feature matrices of N dimension. On second phase, the similar shopping patterns are grouped into clusters and the representative pattern of each cluster is derived. The distance between shopping patterns is calculated by Projected Pure Euclidean Distance (Shahabi and Banaei-Kashani, 2003). Third phase finds a representative pattern that is similar to a target customer, and at the same time, the shopping information of the customer is traced and saved dynamically. Fourth, the next store is recommended based on the physical distance between stores of representative patterns and the present location of target customer. In this research, we have evaluated the accuracy of recommendation method based on a factual data derived from a department store. There are technological difficulties of tracking on a real-time basis so we extracted purchasing related information and we added on context information on each transaction. As a result, recommendation based on FM model that applies purchasing and context information is more stable and accurate compared to that of vector model. Additionally, we could find more precise recommendation result as more shopping information is accumulated. Realistically, because of the limitation of ubiquitous environment realization, we were not able to reflect on all different kinds of context but more explicit analysis is expected to be attainable in the future after practical system is embodied.
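The third and fourth phases described above (matching a customer to a representative pattern, then recommending the next store by physical distance) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: phases 1 and 2 (preprocessing and clustering) are assumed to have already produced the `representatives` list, a plain Euclidean distance over matrix entries is swapped in for the Projected Pure Euclidean Distance named in the abstract, and all function names, store names, and coordinates are hypothetical.

```python
import math

def matrix_distance(a, b):
    """Plain Euclidean distance over the entries of two equally shaped matrices.

    Assumption: used here as a stand-in for the Projected Pure Euclidean Distance.
    """
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def nearest_pattern(customer_matrix, representatives):
    """Phase 3: pick the cluster representative closest to the customer's pattern."""
    return min(representatives,
               key=lambda rep: matrix_distance(customer_matrix, rep["matrix"]))

def recommend_next_store(customer_matrix, visited, location, representatives, store_xy):
    """Phase 4: among the matched pattern's unvisited stores, return the physically closest."""
    pattern = nearest_pattern(customer_matrix, representatives)
    candidates = [s for s in pattern["stores"] if s not in visited]
    if not candidates:
        return None  # nothing left to recommend from this pattern
    return min(candidates, key=lambda s: math.dist(location, store_xy[s]))

# Hypothetical representative patterns (output of phases 1 and 2) and store coordinates.
representatives = [
    {"matrix": [[3, 0], [1, 0]], "stores": ["shoes", "sportswear", "cafe"]},
    {"matrix": [[0, 2], [0, 3]], "stores": ["cosmetics", "jewelry", "cafe"]},
]
store_xy = {"shoes": (0, 0), "sportswear": (5, 1), "cafe": (2, 2),
            "cosmetics": (9, 9), "jewelry": (8, 7)}

customer = [[2, 0], [1, 1]]  # anonymous shopping pattern observed so far
choice = recommend_next_store(customer, {"shoes"}, (4, 1), representatives, store_xy)
print(choice)
```

The sketch never uses a customer identifier: only the anonymous shopping pattern and the current location enter the recommendation, which is the privacy property the abstract emphasizes.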

Evaluation of Error Factors in Quantitative Analysis of Lymphoscintigraphy (Lymphoscintigraphy의 정량분석 시 오류 요인에 관한 평가)

  • Yeon, Joon-Ho;Kim, Soo-Yung;Choi, Sung-Ook;Seok, Jae-Dong
    • The Korean Journal of Nuclear Medicine Technology / v.15 no.2 / pp.76-82 / 2011
  • Purpose: Lymphoscintigraphy is used as a standard examination in lymphatic diagnosis and post-treatment evaluation, and it is useful for planning therapy for lymphedema. In lymphoscintigraphy of lower-extremity lymphedema, the results are affected if the patient does not hold the same position for the examinations at 1 min, 1 hour, and 2 hours after injection. We therefore studied methods to improve reliability by minimizing the quantitative analysis errors caused by these factors. Materials and Methods: Using an Infinia (GE), we injected $^{99m}Tc$-phytate 37 MBq (1.0 mCi) in 4 syringes hypodermically into the feet of 40 people from June to August 2010 at Samsung Medical Center. After acquiring images in fixed and unfixed conditions, we confirmed the change in count values caused by attenuation of soft tissue and bone according to different foot positions. We also measured counts five times, increasing the distance between a $^{99m}Tc$ point source and the detector by 2 cm each time, to check the count difference caused by distance changes due to different foot positions. Finally, we compared 1 min and 6 min lymphoscintigraphy images taken in the same position to check how the quantitative analysis results are affected by differences in the amount of movement of $^{99m}Tc$-phytate in the lymphatic duct. Results: The percentage difference in error values ranged from a minimum of 2.7% to a maximum of 25.8% when comparing fixed and unfixed foot positions in the lymphoscintigraphy examination at 1 min after injection. Count values as the distance was increased in 2 cm intervals were 173,661 (2 cm), 172,095 (4 cm), 170,996 (6 cm), 167,677 (8 cm), and 169,208 counts (10 cm), against a mean basal value of 176,587 counts, and the percentage differences did not exceed 2.5% (1.27, 1.79, 2.04, 2.42, and 2.35%). In addition, the amount of movement in the lymphatic duct within the 6 min between injection and scanning ranged from a minimum of 0.15% to a maximum of 2.3%.
Error values of over 20% were thus attributable to attenuation of soft tissue and bone alone, in contrast to distance difference (2.42%) and the amount of movement in the lymphatic duct (2.3%). Conclusion: When the same patients held different foot positions for the examinations at 1 min, 1 hour, and 2 hours after injection in lymphoscintigraphy, which evaluates the lymphatic flow of patients with lymphedema and analyzes the amount of uptake by the lymphatic system, the maximum error value was 25.8% due to attenuation of soft tissue and bone, and analysis with PASW (Predictive Analytics Software) showed that the fixed and unfixed foot positions differed from each other. The difference in distance between the detector and the feet, and the change in count values caused by differences in examination start time after injection, partially influence the quantitative analysis results. Therefore, we will make an effort to fix the foot position and make the most of a fixing board in lymphoscintigraphy with quantitative analysis.


Public Sentiment Analysis of Korean Top-10 Companies: Big Data Approach Using Multi-categorical Sentiment Lexicon (국내 주요 10대 기업에 대한 국민 감성 분석: 다범주 감성사전을 활용한 빅 데이터 접근법)

  • Kim, Seo In;Kim, Dong Sung;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.45-69 / 2016
  • Recently, sentiment analysis using open Internet data has been actively performed for various purposes. As online communication channels become popular, companies try to capture public sentiment toward them from open online information sources. This research analyzes public sentiment toward the Korean Top-10 companies using a multi-categorical sentiment lexicon. Whereas existing research on big-data-based public sentiment measurement classifies sentiment into dimensions, this research classifies public sentiment into multiple categories. The dimensional sentiment structure has been commonly applied in sentiment analysis for various applications because it is academically proven and has the clear advantage of capturing the degree of sentiment and the interrelation of each dimension. However, the dimensional structure is not effective for measuring public sentiment because human sentiment is too complex to be divided into a few dimensions. In addition, ordinary people need special training to express their feelings in a dimensional structure. People do not divide their sentiment into dimensions, nor do they need psychological training in order to feel. People do not express their feelings in a dimensional structure such as positive/negative or active/passive; rather, they express them in categories such as sadness, rage, and happiness. That is, the categorical approach to sentiment analysis is more natural than the dimensional approach. Accordingly, this research suggests a multi-categorical sentiment structure as an alternative way to measure social sentiment from the public's point of view. The multi-categorical sentiment structure classifies sentiments in the way ordinary people do, although it may contain some subjectivity.
In this research, nine categories, 'Sadness', 'Anger', 'Happiness', 'Disgust', 'Surprise', 'Fear', 'Interest', 'Boredom' and 'Pain', are used as the multi-categorical sentiment structure. To capture public sentiment toward the Korean Top-10 companies, Internet news data about the companies were collected over the past 25 months from a representative Korean portal site. Based on the sentiment words extracted from previous research, we created a sentiment lexicon and analyzed the frequency of the words appearing in the news data. The frequency of each sentiment category was calculated as a ratio of the total sentiment words in order to rank the distributions. Sentiment comparisons among the top-4 companies, 'Samsung', 'Hyundai', 'SK', and 'LG', were visualized separately. As a next step, the research tested hypotheses to prove the usefulness of the multi-categorical sentiment lexicon. It tested how effectively categorical sentiment can be used as a relative comparison index in cross-sectional and time series analysis. To test the effectiveness of the sentiment lexicon as a cross-sectional comparison index, a pair-wise t-test and a Duncan test were conducted. Two pairs of companies, 'Samsung' and 'Hanjin', and 'SK' and 'Hanjin', were chosen to compare whether each categorical sentiment is significantly different in the pair-wise t-test. Since the category 'Sadness' has the largest vocabulary, it was chosen to determine whether the subgroups of the companies are significantly different in the Duncan test. Five sentiment categories were shown to differ significantly between Samsung and Hanjin, and four between SK and Hanjin. In the category 'Sadness', six significantly different subgroups were found. To test the effectiveness of the sentiment lexicon as a time series comparison index, the 'nut rage' incident of Hanjin was selected as an example case.
The term frequency of sentiment words in the month when the incident happened was compared with that of the month before. The sentiment categories were regrouped into positive and negative sentiment to determine whether the event actually had a negative impact on public sentiment toward the company. The difference in each category was visualized, and the variation in the word list for the sentiment 'Rage' was shown to make the result more concrete. As a result, there was a large before-and-after difference in the sentiment that ordinary people felt toward the company. Both hypotheses turned out to be statistically significant; therefore, sentiment analysis in the business area using multi-categorical sentiment lexicons is persuasive. This research implies that categorical sentiment analysis can be used as an alternative method to supplement dimensional sentiment analysis when gauging public sentiment in a business environment.
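
The category-ratio step described in this abstract (counting lexicon words in news text and expressing each category as a share of all sentiment words) might be sketched roughly as follows. This is only an illustration: the tiny English lexicon, the whitespace tokenizer, and the function name `category_ratios` are assumptions made for the example and are not taken from the paper, which worked with a Korean lexicon and news corpus.

```python
from collections import Counter

# Hypothetical toy lexicon mapping a sentiment word to its category.
# The paper's actual lexicon is not reproduced here.
LEXICON = {
    "grief": "Sadness",
    "sorrow": "Sadness",
    "fury": "Anger",
    "joy": "Happiness",
    "shock": "Surprise",
}

def category_ratios(documents, lexicon=LEXICON):
    """Return each category's share of all lexicon-word occurrences."""
    counts = Counter()
    for doc in documents:
        # Naive lowercase/whitespace tokenization (an assumption for this sketch).
        for token in doc.lower().split():
            category = lexicon.get(token)
            if category is not None:
                counts[category] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders categories from most to least frequent.
    return {cat: n / total for cat, n in counts.most_common()}

news = [
    "Fury and sorrow after the incident",
    "Joy at record profits",
    "grief and fury",
]
ratios = category_ratios(news)
print(ratios)
```

Computing the same ratios separately for each company, or for the month before and the month of an event, would give the kind of cross-sectional and time-series comparison the abstract describes.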

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.123-138
    • /
    • 2017
  • Since the stock market is driven by traders' expectations, studies have attempted to predict stock price movements by analyzing various sources of text data. To predict stock price movements, research has examined not only the relationship between text data and stock price fluctuations but also stock trading based on news articles and social media responses. Studies that predict stock price movements have also applied classification algorithms by constructing a term-document matrix, in the same way as other text mining approaches. Because a document contains many words, it is better to select the words that contribute more when building the term-document matrix. Based on word frequency, words with too little frequency or importance are removed. Words are also selected according to their contribution, measured as the degree to which a word contributes to correctly classifying a document. The basic idea of constructing a term-document matrix has been to collect all the documents to be analyzed and to select and use the words that influence the classification. In this study, we analyze the documents for each individual stock and select the words that are irrelevant to all categories as neutral words. We extract the words around each selected neutral word and use them to generate the term-document matrix. The approach starts from the idea that stock movement is less related to the presence of the neutral words themselves, and that the words surrounding a neutral word are more likely to affect stock price movements. The generated term-document matrix is then applied to an algorithm that classifies stock price fluctuations. In this study, we first removed stop words and selected neutral words for each stock. We then excluded, from the selected words, those that also appear in news articles about other stocks. 
Through an online news portal, we collected four months of news articles on the top 10 stocks by market capitalization. We used three months of news articles as training data and applied the remaining one month of articles to the model to predict the stock price movements of the next day. We used SVM, Boosting, and Random Forest to build models and predict stock price movements. The stock market was open for a total of 80 days over the four months (2016/02/01 ~ 2016/05/31); the first 60 days were used as the training set and the remaining 20 days as the test set. The word-based algorithm proposed in this study showed better classification performance than the word selection method based on sparsity. This study predicted stock price volatility by collecting and analyzing news articles on the top 10 stocks by market capitalization. We used a classification model based on the term-document matrix to estimate stock price fluctuations and compared the performance of the existing sparsity-based word extraction method with the suggested method of removing words from the term-document matrix. The suggested method differs from the existing word extraction method in that it uses not only the news articles for the corresponding stock but also news about other stocks to determine the words to extract. In other words, it removed not only the words that appeared in both the rise and fall categories but also the words that appeared commonly in the news for other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. A limitation of this study is that stock price prediction was set up only to classify rises and falls, and the experiment was conducted only for the top ten stocks. The 10 stocks used in the experiment do not represent the entire stock market. In addition, it is difficult to show investment performance, because stock price fluctuation and profit rate may differ. 
Therefore, further research is needed using more stocks and predicting yields through trading simulation.
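
The idea of building a term-document matrix only from the words surrounding neutral words could be sketched along the following lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the window size of 2, the whitespace tokenizer, the choice to drop the neutral words themselves, the sample sentences, and the helper names `context_terms` and `term_document_matrix` are all hypothetical.

```python
from collections import Counter

def context_terms(tokens, neutral_words, window=2):
    """Keep tokens lying within `window` positions of a neutral word.

    Neutral words themselves are dropped (an assumption of this sketch).
    """
    kept = []
    for i, tok in enumerate(tokens):
        if tok in neutral_words:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        if any(t in neutral_words for t in tokens[lo:hi]):
            kept.append(tok)
    return kept

def term_document_matrix(docs, neutral_words, window=2):
    """Return (vocabulary, matrix) with one row of term counts per document."""
    rows = [
        Counter(context_terms(doc.lower().split(), neutral_words, window))
        for doc in docs
    ]
    vocab = sorted(set().union(*rows))
    matrix = [[row[term] for term in vocab] for row in rows]
    return vocab, matrix

# Hypothetical example documents and neutral word.
docs = [
    "the company announced strong quarterly earnings today",
    "analysts said the company faces weak demand",
]
vocab, matrix = term_document_matrix(docs, neutral_words={"company"})
print(vocab)
print(matrix)
```

The resulting rows could then be passed, together with next-day rise/fall labels, to a classifier such as the SVM, Boosting, or Random Forest models mentioned in the abstract.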