• Title/Summary/Keyword: Sentence analyze

A Study and Investigation for the Attitude about Smoking of Boys' and Girls' High School in Seoul (서울시내 남녀고교생의 흡연에 관한 태도 조사연구)

  • Sim Young Ae
    • Journal of Korean Public Health Nursing / v.3 no.1 / pp.74-100 / 1989
  • Despite the many studies published since 1950 on the harm that cigarette smoking does to the body, the number of smokers keeps increasing, and smoking among young people and women in particular has become a serious problem. An examination of the psychological motives of young people shows that the impulse to imitate adults, curiosity, and a defiant urge to escape the prohibitions of traditional group norms work together. Because adolescence is a period of mental and physical growth in which young people accumulate knowledge and build character through learning, they should be properly educated about smoking before they become addicted, and it is desirable to help them understand how harmful smoking is to their growth and mental health rather than simply presenting non-smoking as a social norm. The main aim of this study of young people's attitudes toward smoking is to provide basic data for this field. Of the 720 questionnaires distributed to two classes in each grade at three high schools in Seoul (a boys' school, a girls' school, and a co-educational school) from Oct. 21 through Oct. 26, 1988, the 647 that could be analyzed were used as study material. The study instrument consisted of 40 items (20 affirmative and 20 negative statements) rated on a 5-point Likert scale, selected by factor analysis from 60 simple-sentence items that students had produced in a preliminary study. Scores ranged from 1 point (smoking is very bad) to 5 points (smoking is really good). Descriptive statistics (frequency, percentage, mean, and standard deviation) were used for general characteristics and smoking attitude; the $\chi^2$-test for examining the independent variables in the physical, emotional, ethical, and other areas; Pearson's correlation coefficient for the direction and degree of association; and stepwise regression analysis for the relative contribution of the variables affecting smoking attitude. The results of this study are as follows. 1. The mean smoking-attitude scores of the high school boys and girls were 1.78 in the physical area, 2.63 in the emotional area, 2.61 in the ethical area, and 2.29 in the other area, indicating a generally negative attitude that was expressed most strongly in the physical area. These results suggest that the students regard smoking as harmful to their health, and that young people preparing to become adults have a strong curiosity in the emotional, ethical, and other areas. 2. In each area, the most influential variable related to students' smoking attitude was the degree of smoking experience, whose explanatory power ranged from 13.2 in the physical area (the lowest) to 25.2 in the emotional area (the highest). That students with more smoking experience are more accepting of smoking, and that this variable was important in every area, underlines the importance of health education for the recently growing population of student smokers. Grade, age, and the number of smokers in the household showed significant positive correlations, whereas acceptance of teachers' smoking, sex, and mother's education showed significant negative correlations. That is, students in higher grades, older students, and students with more smokers in the family held a more affirmative attitude, while students who were indifferent to teachers' smoking or wanted it stopped, and students whose mothers had more education, held a more negative attitude. 3. The most negatively answered items across the physical, emotional, ethical, and other areas were: first, "too much smoking is harmful to our health" (1.12 points); second, "smoking has an ill effect on pregnancy and the embryo" (1.13 points); third, "smoking is harmful to our health" (1.27 points); fourth, "smoking in crowded places such as a bus or subway should be prohibited" (1.27 points); and fifth, "smoking can ruin the lungs" (1.31 points). The most affirmatively answered items were: first, "we should smoke depending on time and place" (3.96 points); second, "smoking is just a habit" (3.83 points); third, "people who smoke seem incapable and deplorable" (3.69 points); fourth, "smoking should be prohibited by law" (3.56 points); and fifth, "high school students' smoking is imitation of adults" (3.52 points).
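The descriptive statistics and Pearson correlation mentioned in this abstract can be illustrated with a minimal sketch. The ratings and the smoking-experience levels below are hypothetical and are not taken from the study; the sketch only shows how an area mean, a standard deviation, and a correlation coefficient might be computed from 1-to-5 Likert ratings.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical data: each row holds one student's 1-5 ratings on four items
# belonging to a single attitude area (1 = very bad, 5 = really good).
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 1],
    [4, 3, 4, 5],
    [1, 1, 2, 2],
]
# Hypothetical smoking-experience level per student (0 = never smoked).
experience = [0, 1, 3, 0]

# One attitude score per student: the mean of that student's item ratings.
student_scores = [mean(row) for row in responses]

area_mean = mean(student_scores)   # average attitude in this area
area_sd = stdev(student_scores)    # sample standard deviation
r = pearson_r(experience, student_scores)

print(f"mean={area_mean:.2f} sd={area_sd:.2f} r={r:.2f}")
```

A positive `r` here would correspond to the abstract's finding that more smoking experience goes with a more accepting attitude; the stepwise regression step is not covered by this sketch.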

Financial Fraud Detection using Text Mining Analysis against Municipal Cybercriminality (지자체 사이버 공간 안전을 위한 금융사기 탐지 텍스트 마이닝 방법)

  • Choi, Sukjae;Lee, Jungwon;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.119-138 / 2017
  • Recently, SNS has become an important channel for marketing as well as personal communication. However, cybercrime has also evolved with the development of information and communication technology, and illegal advertising is distributed on SNS in large quantities. As a result, personal information is lost and monetary damage occurs more frequently. In this study, we propose a method to analyze which sentences and documents posted to SNS are related to financial fraud. First, as a conceptual framework, we developed a matrix of the conceptual characteristics of cybercriminality on SNS and emergency management. We also suggested an emergency management process consisting of Pre-Cybercriminality (e.g. risk identification) and Post-Cybercriminality steps. Of these, this paper focuses on risk identification. The main process consists of data collection, preprocessing, and analysis. First, we selected two words, 'daechul (loan)' and 'sachae (private loan)', as seed words and collected data containing these words from SNS such as Twitter. The collected data were given to two researchers, who decided whether each item was related to cybercriminality, particularly financial fraud. We then selected as keywords those vocabulary items that were nominals or symbols. With the selected keywords, we searched web sources such as Twitter, news, and blogs, and collected more than 820,000 articles. The collected articles were refined through preprocessing and made into learning data. Preprocessing is divided into three steps: morphological analysis, stop-word removal, and selection of valid parts of speech. In the morphological analysis step, a complex sentence is split into morpheme units to enable mechanical analysis. In the stop-word removal step, non-lexical elements such as numbers, punctuation marks, and double spaces are removed from the text.
In the step of selecting valid parts of speech, only two kinds, nouns and symbols, are considered. Since nouns refer to things, they express the intent of a message better than other parts of speech. Moreover, the more illegal the text is, the more frequently symbols are used. To turn the preprocessed data into learning data, each item must be classified as legitimate or not, so each is labeled 'legal' or 'illegal'. The processed data is then converted into a corpus and a document-term matrix. Finally, the 'legal' and 'illegal' files were mixed and randomly divided into a learning data set and a test data set; in this study, 70% of the data was used for learning and 30% for testing. SVM was used as the discrimination algorithm. Since SVM requires gamma and cost values as its main parameters, we set gamma to 0.5 and cost to 10, based on the optimal value function. The cost is set higher than in general cases. To show the feasibility of the idea proposed in this paper, we compared the proposed method with MLE (Maximum Likelihood Estimation), Term Frequency, and Collective Intelligence methods, using overall accuracy as the metric. The overall accuracy of the proposed method was 92.41% for illegal loan advertisements and 77.75% for illegal door-to-door sales, which is superior to that of Term Frequency, MLE, etc. Hence, the result suggests that the proposed method is valid and practically usable. In this paper, we propose a framework for managing crises caused by abnormalities in unstructured data sources such as SNS. We hope this study will contribute to academia by identifying what to consider when applying an SVM-like discrimination algorithm to text analysis. Moreover, the study will also be of use to practitioners in the fields of brand management and opinion mining.
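The pipeline described in this abstract (token selection, document-term matrix, random 70/30 split) can be sketched as follows. This is only an illustration under stated assumptions: the documents and labels are hypothetical, the regex tokenizer is a stand-in for the morphological analysis and part-of-speech selection the authors performed on Korean text, and the SVM training step itself is omitted.

```python
import random
import re
from collections import Counter


def tokenize(text):
    """Keep alphabetic word tokens and single symbols; drop digits and spaces.

    A stand-in for morphological analysis plus noun/symbol selection.
    """
    return re.findall(r"[^\W\d_]+|[^\w\s]", text.lower())


def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in doc i."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    index = {term: i for i, term in enumerate(vocab)}
    matrix = []
    for doc in docs:
        row = [0] * len(vocab)
        for term, n in Counter(tokenize(doc)).items():
            row[index[term]] = n
        matrix.append(row)
    return vocab, matrix


def split_70_30(rows, labels, seed=0):
    """Shuffle the rows and split them 70% learning / 30% test."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = len(order) * 7 // 10
    train, test = order[:cut], order[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


# Hypothetical documents and labels, for illustration only.
docs = [
    "instant loan $$$ call now!!",
    "private loan, no credit check ###",
    "bank announces new savings product",
    "loan approved in 5 minutes ***",
    "city library extends opening hours",
    "cheap private loan >>> click",
    "weather forecast for the weekend",
    "loan loan loan !!! best rate",
    "local team wins the final",
    "new bus route opens downtown",
]
labels = ["illegal", "illegal", "legal", "illegal", "legal",
          "illegal", "legal", "illegal", "legal", "legal"]

vocab, dtm = document_term_matrix(docs)
x_train, y_train, x_test, y_test = split_70_30(dtm, labels)
print(len(vocab), len(x_train), len(x_test))
```

The learning rows could then be passed to any SVM implementation, with gamma and cost set to the values reported in the abstract (0.5 and 10).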

Verification the Systems Thinking Factor Structure and Comparison of Systems Thinking Based on Preferred Subjects about Elementary School Students' (초등학생의 시스템 사고 요인 구조 검증과 선호 과목에 따른 시스템 사고 비교)

  • Lee, Hyonyong;Jeon, Jaedon;Lee, Hyundong
    • Journal of The Korean Association For Science Education / v.39 no.2 / pp.161-171 / 2019
  • The purposes of this study are 1) to verify the systems thinking factor structure of elementary school students and 2) to compare systems thinking according to their preferred subjects in order to draw implications for subsequent research. In the pre-test, data from 732 elementary school students were collected using the STMI (Systems Thinking Measuring Instrument) developed by Lee et al. (2013), and exploratory factor analysis was conducted to identify the factor structure. Based on the results of the pre-test, an expert group revised the STMI so that elementary school students could respond in line with the 5-factor structure that the STMI intended. In the post-test, 503 responses to the modified STMI were analyzed by exploratory factor analysis. The results of the study are as follows. First, in the pre-test, elementary school students responded to the STMI as an instrument consisting of two factors (personal internal factors and personal external factors). The total reliability of the instrument was .932, and the reliabilities of the two factors were .857 and .894. Second, students responded to the modified STMI as a 4-factor instrument: Team Learning, Shared Vision, and Personal Mastery emerged as independent factors, while Mental Model and Systems Analysis merged into a single factor. The total reliability of the instrument was .886, and the reliability of each factor ranged from .686 to .864. Finally, a comparison of systems thinking according to preferred subjects showed a significant difference between students who selected the science (engineering) group and those who selected the art (music and physical education) group. In conclusion, it was confirmed that statistically meaningful results could be obtained using the STMI modified with terms and sentence structures appropriate for elementary school students, and it is necessary to study the relation of systems thinking to various student variables such as preferred subjects.
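The abstract reports reliability values without naming the coefficient. Assuming it is Cronbach's alpha (a common choice for questionnaire instruments, but an assumption here), a minimal sketch of the calculation looks like this. The ratings are hypothetical and unrelated to the STMI data.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k ratings.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(item_scores[0])
    item_vars = [pvariance([resp[i] for resp in item_scores]) for i in range(k)]
    total_var = pvariance([sum(resp) for resp in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: five respondents answering three items of one factor.
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items of a factor vary together, which is how figures such as .932 or .686 in the abstract would be read under this assumption.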

Analysis of the 2022 Revised Science Curriculum Grades 3-4 Achievement Standards Based on Bloom's New Taxonomy of Educational Objectives and Comparison to the 2015 Revised Curriculum (Bloom의 신교육목표분류에 따른 2022 개정 과학과 교육과정 초등학교 3~4학년군 성취기준 분석 및 2015 개정 교육과정과의 비교)

  • Kim, Woo-Joong;Kim, Dong-Suk;Shin, Young-Joon;Kwon, Nan-Joo;Oh, Phil-Seok
    • Journal of Korean Elementary Science Education / v.43 no.3 / pp.353-364 / 2024
  • The purpose of this study is to analyze the achievement standards for grades 3-4 of the 2022 revised science curriculum, identify the goals of science education for these grades, and provide implications for the development of science textbooks for grades 3-4 and for the direction of teaching in the field. For this purpose, 57 achievement standards of the 2022 revised science curriculum for grades 3-4 were analyzed as to their knowledge dimensions and cognitive processes according to Bloom's new taxonomy of educational objectives. In cases where an achievement standard was a compound sentence or combined two or more knowledge dimensions or cognitive process dimensions, we separated the sentences after consulting a group of experts and divided the achievement standards into 57 sentences. We then analyzed the frequency of the categories of concepts and descriptors, comparing them with the previously studied elementary science standards from the 2015 revised curriculum. The main findings of the study are as follows. First, in the knowledge dimension, "factual knowledge" accounted for 50 items (86%), compared with "conceptual knowledge" (10%) and "procedural knowledge" (4%), and "metacognitive knowledge" did not appear at all. Second, in terms of the cognitive processes, "understanding" was the highest at 60% with 34 items. It was followed by "applying" with 11%, "creating" with 19%, "evaluating" with 15%, and "analyzing" and "remembering" with 6%. Third, when analyzing the descriptors, "I can explain" was the highest with 9%, followed by "comparison" with 6%, and "practice" and "classification" with 5%. Fourth, compared to the 2015 revised curriculum, "conceptual knowledge" was reduced and "factual knowledge" increased overwhelmingly.
Fifth, in the cognitive process dimension, "understanding" has increased significantly, while the other cognitive process dimensions have decreased. Conclusions and implications based on these findings are as follows: the focus of the science curriculum for grades 3-4 in the 2022 revision is heavily weighted toward "factual knowledge," with "understanding" dominating the cognitive process dimensions. As a result, many concepts and applications have been reduced. Based on the comparison of the descriptors with those of the 2015 revised curriculum, the implications for the development of science textbooks for grades 3-4 of the 2022 revised curriculum were discussed, as were the implications of the curriculum for the field.

Impact of Semantic Characteristics on Perceived Helpfulness of Online Reviews (온라인 상품평의 내용적 특성이 소비자의 인지된 유용성에 미치는 영향)

  • Park, Yoon-Joo;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.29-44 / 2017
  • In Internet commerce, consumers are heavily influenced by product reviews written by other users who have already purchased the product. However, as product reviews accumulate, it takes a lot of time and effort for consumers to check the massive number of reviews individually. Moreover, carelessly written product reviews actually inconvenience consumers. Thus many online vendors provide mechanisms to identify the reviews that customers perceive as most helpful (Cao et al. 2011; Mudambi and Schuff 2010). For example, some online retailers, such as Amazon.com and TripAdvisor, allow users to rate the helpfulness of each review, and use this feedback to rank and re-order the reviews. However, many reviews have little or no feedback, which makes it hard to identify their helpfulness. Also, because feedback takes time to accumulate, newly written reviews do not have enough of it. For example, only 20% of the reviews in the Amazon Review Dataset (Mcauley and Leskovec, 2013) have more than 5 feedback votes (Yan et al, 2014). The purpose of this study is to analyze the factors affecting the usefulness of online product reviews and to derive a forecasting model that selectively provides product reviews that can be helpful to consumers. To do this, we extracted the various linguistic, psychological, and perceptual elements included in product reviews using text-mining techniques and identified the determinants among these elements that affect the usefulness of product reviews. In particular, considering that the characteristics of product reviews and the determinants of usefulness can differ between apparel products (which are experience goods) and electronic products (which are search goods), the characteristics of the product reviews were compared between the product groups and the determinants were established for each. This study used 7,498 apparel product reviews and 106,962 electronic product reviews from Amazon.com.
To understand the review texts, we first extracted linguistic and psychological characteristics, such as word count and the level of emotional tone and analytical thinking embedded in the text, using the widely adopted text analysis software LIWC (Linguistic Inquiry and Word Count). We then explored the descriptive statistics of the review texts for each category and statistically compared their differences using a t-test. Lastly, we performed regression analysis using the data mining software RapidMiner to find the determinant factors. Comparing the review characteristics of electronic products and apparel products, we found that reviewers used more words as well as longer sentences when writing reviews for electronic products. As for content characteristics, the electronic product reviews included more analytic words, carried more clout, and related to cognitive processes (CogProc) more than the apparel product reviews did, in addition to including many words expressing negative emotions (NegEmo). On the other hand, the apparel product reviews included more personal and authentic expressions, positive emotions (PosEmo), and perceptual processes (Percept) than the electronic product reviews. Next, we analyzed the determinants of the usefulness of product reviews in the two product groups. In both product groups, the reviews perceived as useful had high product ratings from the reviewers and contained a larger number of total words, many expressions involving perceptual processes, and fewer negative emotions. In addition, apparel product reviews with a large number of comparative expressions, a low expertise index, and concise content with fewer words in each sentence were perceived to be useful.
In the case of electronic product reviews, those that were analytical with a high expertise index, along with containing many authentic expressions, cognitive processes, and positive emotions (PosEmo) were perceived to be useful. These findings are expected to help consumers effectively identify useful product reviews in the future.
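The group comparison by t-test described above can be illustrated with a short sketch. The abstract does not say which t-test variant was used; Welch's unequal-variance statistic is assumed here, and the word counts are hypothetical rather than taken from the Amazon data. Only the test statistic is computed, not the p-value.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference between two sample means."""
    va, vb = variance(sample_a), variance(sample_b)
    standard_error = sqrt(va / len(sample_a) + vb / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical word counts per review in each product group.
electronics_word_counts = [120, 95, 140, 110, 130]
apparel_word_counts = [40, 55, 35, 60, 50]

t = welch_t(electronics_word_counts, apparel_word_counts)
print(f"t = {t:.2f}")
```

A large positive statistic would match the direction of the reported finding that electronic product reviews use more words; the same comparison could be repeated for each LIWC feature.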

Multi-Dimensional Analysis Method of Product Reviews for Market Insight (마켓 인사이트를 위한 상품 리뷰의 다차원 분석 방안)

  • Park, Jeong Hyun;Lee, Seo Ho;Lim, Gyu Jin;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.57-78 / 2020
  • With the development of the Internet, consumers can easily check product information through e-commerce. Product reviews used in the purchasing process are based on user experience, allowing consumers to act as producers of information as well as to consult it. From the consumer's perspective this can make purchasing decisions more efficient, and from the seller's point of view it can help develop products and strengthen competitiveness. However, for the products they want to compare, consumers need a lot of time and effort to read the vast number of reviews offered by e-commerce sites and to understand both the overall assessment and the assessment dimensions they consider important. This is because product reviews are unstructured information, and it is difficult to grasp the sentiment and assessment dimension of a review immediately. For example, consumers who want to purchase a laptop would like to check how the products being compared are assessed on each dimension, such as performance, weight, delivery, speed, and design. Therefore, in this paper, we propose a method to automatically generate multi-dimensional product assessment scores from the reviews of the products to be compared. The method presented in this study consists of two phases: a pre-preparation phase and an individual product scoring phase. In the pre-preparation phase, a dimension classification model and a sentiment analysis model are created from the reviews of a large product category group. By combining word embedding and association analysis, the dimension classification model addresses a limitation of existing studies, in which word embedding methods for finding the relevance between dimensions and words consider only the distance between words in sentences.
For sentiment analysis, a CNN model is trained on learning data tagged as positive or negative at the phrase level for accurate polarity detection. The individual product scoring phase then applies the pre-prepared models to reviews split into phrases. Multi-dimensional assessment scores are obtained by grouping the phrases judged to describe each dimension and aggregating them by assessment dimension according to the proportion of reviews. In the experiment of this paper, approximately 260,000 reviews of the large product category group were collected to build the dimension classification model and the sentiment analysis model. In addition, reviews of laptops from companies S and L sold through e-commerce were collected and used as experimental data. The dimension classification model classified individual product reviews, broken down into phrases, into six assessment dimensions, and combined the existing word embedding method with an association analysis indicating the frequency between words and dimensions. Combining word embedding and association analysis increased the accuracy of the model by 13.7%. The sentiment analysis model analyzed assessments more closely when trained on phrase units rather than sentences, and its accuracy was 29.4% higher than that of the sentence-based model. Through this study, both sellers and consumers can expect more efficient decision making in purchasing and product development, given that they can make multi-dimensional comparisons of products. In addition, text reviews, which are unstructured data, were transformed into objective values such as frequency and morpheme and analyzed together using word embedding and association analysis, improving the objectivity of the multi-dimensional analysis.
This will be an attractive analysis model, not only because it enables more effective service deployment in an evolving and fiercely competitive e-commerce market, but also because it can satisfy both sellers and customers.
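The individual product scoring phase described in this abstract can be sketched as a simple aggregation. The sketch assumes that each phrase has already been assigned a dimension and a polarity by the two pre-prepared models, and it scores each dimension as the share of its phrases that are positive. The phrase predictions, the dimension names used as data, and the scoring rule are illustrative assumptions, not the authors' exact formula.

```python
from collections import defaultdict


def dimension_scores(phrase_predictions):
    """Map each dimension to the share of its phrases predicted positive."""
    counts = defaultdict(lambda: [0, 0])  # dimension -> [positive, total]
    for dimension, polarity in phrase_predictions:
        counts[dimension][1] += 1
        if polarity == "positive":
            counts[dimension][0] += 1
    return {dim: pos / total for dim, (pos, total) in counts.items()}


# Hypothetical (dimension, polarity) pairs, one per review phrase, standing in
# for the outputs of the dimension classifier and the sentiment model.
predictions = [
    ("performance", "positive"),
    ("performance", "positive"),
    ("performance", "negative"),
    ("weight", "negative"),
    ("design", "positive"),
    ("weight", "positive"),
]

scores = dimension_scores(predictions)
for dimension, score in sorted(scores.items()):
    print(f"{dimension}: {score:.2f}")
```

Running the same aggregation on the reviews of two products would give per-dimension scores that can be compared side by side, which is the kind of multi-dimensional comparison the abstract describes.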