• Title/Summary/Keyword: Feature analyze

Performance analysis of Frequent Itemset Mining Technique based on Transaction Weight Constraints (트랜잭션 가중치 기반의 빈발 아이템셋 마이닝 기법의 성능분석)

  • Yun, Unil;Pyun, Gwangbum
    • Journal of Internet Computing and Services
    • /
    • v.16 no.1
    • /
    • pp.67-74
    • /
    • 2015
  • In recent years, frequent itemset mining that considers the importance of each item has been studied intensively as one of the important issues in data mining. Depending on how item importance is used, itemset mining approaches are classified into weighted frequent itemset mining, frequent itemset mining using transactional weights, and utility itemset mining. In this paper, we perform an empirical analysis of frequent itemset mining algorithms based on transactional weights. These algorithms compute transactional weights from the weight of each item in large databases and discover weighted frequent itemsets on the basis of item frequency and the weight of each transaction. Consequently, database analysis reveals the importance of a given transaction, because a transaction's weight is higher when it contains many items with high weights. We analyze the advantages and disadvantages of the best-known algorithms in this area and compare their performance. As a representative of frequent itemset mining using transactional weights, WIS introduces the concept and strategies of transactional weights. There are also several state-of-the-art algorithms, WIT-FWIs, WIT-FWIs-MODIFY, and WIT-FWIs-DIFF, for extracting itemsets with weight information. To mine weighted frequent itemsets efficiently, these three algorithms use a special lattice-like data structure called the WIT-tree. They need no additional database scan once the WIT-tree has been constructed, since each node of the WIT-tree holds item information such as item and transaction IDs.
In particular, traditional algorithms scan the database many times to mine weighted itemsets, whereas the WIT-tree-based algorithms avoid this overhead by reading the database only once. In addition, the algorithms generate each new itemset of length N+1 from two different itemsets of length N. To discover new weighted itemsets, WIT-FWIs combines itemsets by using the information of the transactions that contain all of the itemsets. WIT-FWIs-MODIFY reduces the operations needed to calculate the frequency of a new itemset. WIT-FWIs-DIFF uses the difference of two itemsets. To compare the algorithms in various environments, we measure runtime and maximum memory usage on two types of real datasets (dense and sparse). A scalability test is also conducted to evaluate the stability of each algorithm as the database size changes. As a result, WIT-FWIs and WIT-FWIs-MODIFY show the best performance on the dense dataset, and on the sparse dataset WIT-FWIs-DIFF mines more efficiently than the other algorithms. Compared to the algorithms using the WIT-tree, WIS, which is based on the Apriori technique, has the worst efficiency because on average it requires far more computation than the others.
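
The general idea described above (one database scan, transaction-ID sets per itemset, and joining two length-N itemsets into one of length N+1) can be sketched roughly as follows. This is a minimal illustration, not the WIT-FWIs implementation from the paper: the function names and the toy data are hypothetical, and defining a transaction's weight as the mean weight of its items is an assumption that the abstract does not state.

```python
from itertools import combinations


def transaction_weights(db, item_weights):
    # ASSUMPTION: a transaction's weight is the mean weight of its items.
    return {tid: sum(item_weights[i] for i in items) / len(items)
            for tid, items in db.items()}


def weighted_support(tids, tw):
    # Share of the total transaction weight carried by the given transactions.
    return sum(tw[t] for t in tids) / sum(tw.values())


def mine_weighted_itemsets(db, item_weights, min_ws):
    tw = transaction_weights(db, item_weights)
    # Single pass over the database: record the transaction IDs of each item.
    tidsets = {}
    for tid, items in db.items():
        for item in items:
            tidsets.setdefault((item,), set()).add(tid)
    level = {k: v for k, v in tidsets.items()
             if weighted_support(v, tw) >= min_ws}
    result = {}
    while level:
        for itemset, tids in level.items():
            result[itemset] = weighted_support(tids, tw)
        next_level = {}
        # Join two length-N itemsets that share their first N-1 items.
        for a, b in combinations(sorted(level), 2):
            if a[:-1] == b[:-1]:
                tids = level[a] & level[b]  # no rescan, only set intersection
                if tids and weighted_support(tids, tw) >= min_ws:
                    next_level[a + (b[-1],)] = tids
        level = next_level
    return result


# Hypothetical toy database: transaction ID -> items.
db = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"a", "b", "c"}, 4: {"b"}}
weights = {"a": 0.5, "b": 0.3, "c": 0.7}
for itemset, ws in sorted(mine_weighted_itemsets(db, weights, 0.4).items()):
    print(itemset, round(ws, 3))
```

A difference-based variant in the spirit of WIT-FWIs-DIFF would store, for each joined itemset, only the transaction IDs lost relative to its parent instead of the full intersection; that variant is not shown here.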

Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.141-156
    • /
    • 2013
  • Social media is a representative form of Web 2.0 that shapes changes in users' information behavior by allowing users to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to share their opinions and thoughts with both the masses and their acquaintances. Social media data plays a significant role in the emerging Big Data arena. A variety of research areas, such as social network analysis and opinion mining, have therefore paid attention to discovering meaningful information from the vast amounts of data buried in social media. Social media has recently become a main focus of the fields of Information Retrieval and Text Mining because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading. However, most previous studies have adopted broad-brush and limited approaches, which have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing big stream datasets of Twitter in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets mentioning the candidates' names and the election on Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects societal trends effectively. The system also retrieves the list of terms that co-occur with given query terms.
We compare the results of term co-occurrence retrieval using the names of influential candidates, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park', 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon', and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows the topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic shows one of two types of patterns, a rising tendency or a falling tendency, depending on the change in its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compare topic trends with related news articles. We find that Twitter can track an issue faster than other media such as newspapers. The user network on Twitter differs from those of other social media because of the distinctive way relationships are formed on Twitter: users can form relationships by exchanging mentions. We visualize and analyze mention-based networks of 136,754 users, using the three candidates' names, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. The results show that Twitter users mention all of the candidates' names regardless of their political tendencies. This case study shows that Twitter could be an effective tool for detecting and predicting dynamic changes in social issues, and that mention-based user networks could reveal different aspects of user behavior as a network unique to Twitter.
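
Term co-occurrence retrieval of the kind described above can be illustrated with a small sketch. It is not the authors' system: the function name, the whitespace tokenization, the restriction to single-token queries, and the sample tweets are all assumptions made for illustration.

```python
from collections import Counter


def cooccurring_terms(tweets, query, top_n=3):
    """Return the terms that most often share a tweet with `query`."""
    query = query.lower()
    counts = Counter()
    for text in tweets:
        # ASSUMPTION: whitespace tokenization; each term counts once per tweet.
        tokens = set(text.lower().split())
        if query in tokens:
            counts.update(tokens - {query})
    return counts.most_common(top_n)


# Made-up sample tweets, not data from the study.
tweets = [
    "moon poll election",
    "moon election debate",
    "park election poll",
    "moon election",
]
print(cooccurring_terms(tweets, "moon", top_n=1))
```

A production system would need a Korean morphological analyzer and multi-word query support rather than whitespace splitting.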

A Study on Analyzing Sentiments on Movie Reviews by Multi-Level Sentiment Classifier (영화 리뷰 감성분석을 위한 텍스트 마이닝 기반 감성 분류기 구축)

  • Kim, Yuyoung;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.3
    • /
    • pp.71-89
    • /
    • 2016
  • Sentiment analysis identifies the emotions or sentiments embedded in user-generated data such as customer reviews from blogs and social network services. Various research fields, such as computer science and business management, can use it to analyze customer-generated opinions. Previous studies have treated the star rating of a review as equivalent to the sentiment embedded in its text. However, the star rating does not always correspond to the sentiment polarity, and this assumption has limited the accuracy of previous studies. To address this issue, the present study uses a supervised sentiment classification model to measure sentiment polarity more accurately. The study aims to propose an advanced sentiment classifier and to discover the correlation between movie reviews and box-office success. The advanced sentiment classifier is based on two supervised machine learning techniques, Support Vector Machines (SVM) and a Feedforward Neural Network (FNN). The sentiment scores of the movie reviews are measured by the classifier, and the statistical correlations between movie reviews and box-office success are analyzed. Movie reviews were collected along with their star ratings. The dataset consists of 1,258,538 reviews of 175 films gathered from the Naver Movie website (movie.naver.com). The results show that the proposed sentiment classifier outperforms a Naive Bayes (NB) classifier, with accuracy about 6% higher than NB. Furthermore, the results indicate a positive correlation between the star rating and audience size, which can be regarded as the box-office success of a movie. The study also shows a mild positive correlation between the sentiment scores estimated by the classifier and audience size. To verify the applicability of the sentiment scores, an independent-samples t-test was conducted.
For this test, the movies were divided into two groups using the average of the sentiment scores. The two groups differ significantly in their star ratings.
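
The grouping and test described above can be sketched as follows. The field names and the six toy movies are made up, and the pooled-variance (Student) form of the independent-samples t statistic is an assumption, since the abstract does not say which variant was used.

```python
from math import sqrt
from statistics import mean, variance


def split_by_mean_sentiment(movies):
    """Split star ratings into two groups at the mean sentiment score."""
    cutoff = mean(m["sentiment"] for m in movies)
    high = [m["stars"] for m in movies if m["sentiment"] >= cutoff]
    low = [m["stars"] for m in movies if m["sentiment"] < cutoff]
    return high, low


def student_t(a, b):
    """Independent-samples t statistic with pooled variance (assumed variant)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical toy data: per-movie sentiment score and mean star rating.
movies = [
    {"sentiment": 0.9, "stars": 8}, {"sentiment": 0.8, "stars": 9},
    {"sentiment": 0.7, "stars": 10}, {"sentiment": 0.3, "stars": 4},
    {"sentiment": 0.2, "stars": 5}, {"sentiment": 0.1, "stars": 6},
]
high, low = split_by_mean_sentiment(movies)
print(round(student_t(high, low), 3))
```

The statistic would then be compared against a t distribution with `len(high) + len(low) - 2` degrees of freedom to obtain a p-value.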

Analysis of Teachers' Perceptions on the Subject Competencies of Integrated Science (통합과학 교과 역량에 대한 교사들의 인식 분석)

  • Ahn, Yumin;Byun, Taejin
    • Journal of The Korean Association For Science Education
    • /
    • v.40 no.2
    • /
    • pp.97-111
    • /
    • 2020
  • In the 2015 revised curriculum, 'Integrated Science' was established to foster convergent thinking and was designated as a common subject for all students, regardless of career path. The 2015 revised curriculum also introduced 'competence' as a feature that distinguishes it from the previous curriculum. Competencies are divided into core competencies, which are cross-curricular, and subject competencies, which are based on the academic knowledge and skills of each subject. The science curriculum contains five subject competencies: scientific thinking, scientific inquiry, scientific problem solving, scientific communication, and scientific participation and life-long learning. However, the description of competencies in the curriculum documents is insufficient, and experts' perceptions of the competencies are not uniform. This study therefore examines high school science teachers' perceptions of the science subject competencies, on the premise that the competencies must first be understood if competency-based education is to be applied properly in schools. First, we analyzed the relationship between the achievement standards and the subject competencies of integrated science by operating an expert working group with a strong understanding of the integrated science achievement standards. Next, we surveyed the perceptions of 31 high school science teachers regarding the five subject competencies using a descriptive questionnaire. Semantic network analysis was used to analyze the teachers' responses. The results show that for three competencies (scientific inquiry, scientific communication, and scientific participation and life-long learning) the teachers' definitions are similar to those in the curriculum documents, but for scientific thinking and scientific problem solving there are some gaps between the teachers' perceptions and the definitions in the curriculum documents.
In addition, a comprehensive analysis of the teachers' perceptions shows that the five competencies are interrelated rather than mutually exclusive or independent.

Bilateral retinoblastoma: Long-term follow-up results from a single institution (단일기관의 장기추적 결과)

  • Choi, Sang Yul;Kim, Dong Hwan;Lee, Kang Min;Lee, Hyun Jae;Kim, Mi-Sook;Lee, Tai-Won;Choi, Sang Wook;Kim, Dong Ho;Park, Kyung Duk;Lee, Jun Ah
    • Clinical and Experimental Pediatrics
    • /
    • v.52 no.6
    • /
    • pp.674-679
    • /
    • 2009
  • Purpose: The authors aimed to analyze the long-term effects of treatments, especially external beam radiotherapy (EBRT), in bilateral retinoblastoma patients. Methods: This retrospective study analyzed the medical records of 22 bilateral retinoblastoma patients who were registered between October 1987 and October 1998 and followed up for more than 10 years. They were treated by enucleation, EBRT, and systemic chemotherapy. Age at diagnosis, sex, delay prior to treatment, Reese-Ellsworth (RE) classification, and the local treatment modalities were analyzed in relation to recurrence-free survival (RFS) and complications. Results: Median age at diagnosis was 7.0 months (range 1.7-31.6 months). Leukocoria was the most common presenting feature. Two patients had a familial history. The RE classifications of the 44 eyes were group II in 4, III in 14, IV in 4, and V in 22. At the end of a median follow-up period of 141 months (range 55-218 months), 20 patients were alive. The 10-year ocular survival rate of the 44 eyes was $56.8 \pm 7.5\%$. The 10-year RFS and ocular survival rate of the 29 eyes treated by combined EBRT and chemotherapy were 75.9% and 86.2%, respectively. Treatment delay (>3 months) was found to be related to higher risk of recurrence. Complications after EBRT were cataract, retinal detachment, phthisis bulbi, and facial asymmetry. No patient developed a second malignancy during the follow-up period. Conclusion: Early detection and prompt treatment can increase ocular survival rates. In addition, careful attention should be paid to possible long-term sequelae in these patients.

Estimation of the Superelevation Safety Factor Considering Operating Speed at 3-Dimensional Alignment (입체선형의 주행속도를 고려한 편경사 안전율 산정에 관한 연구)

  • Park, Tae-Hoon;Kim, Joong-Hyo;Park, Je-Jin;Park, Ju-Won;Ha, Tae-Jun
    • Journal of Korean Society of Transportation
    • /
    • v.23 no.7 s.85
    • /
    • pp.159-163
    • /
    • 2005
  • In geometric design, a proper balance between suppliers (designers) and demanders (drivers) is very important. Although the ultimate purpose of road construction is to serve drivers' comfort, this has unfortunately not been given sufficient consideration so far. Regularity and speed of travel have been considered as part of driver comfort, but safety against accidents should be considered as well. If drivers travel faster than the designer intended, supplementary measures are needed to raise the safety factor of the road. Real roads have a three-dimensional (3-D) alignment in which upgrade and downgrade sections exist at the same location, yet 3-D alignment sections have so far been designed as flat 2-D alignments, and the minimum horizontal curve radius and the maximum superelevation (cant) have been determined by calculation alone, without considering the difference in operating speed between the upgrade and downgrade directions at the same point. This study therefore estimates a safety factor for superelevation that reflects the speed characteristics of 3-D alignment, for the first lane of four-lane rural roads. First, using Nc-97, we measured the operating speed of vehicles not influenced by a preceding vehicle in order to analyze the influence of road geometry. Second, we compared 6 sections, measuring operating speed characteristics at 12 points, and statistically analyzed the operating characteristics of 3-D alignment by combining the influence of the vertical and horizontal alignment with previously studied operating speeds on horizontal curves. Finally, because the basic statistical analysis showed differences between design speed and operating speed, we estimated a superelevation value that reflects operating speed by introducing a new safety factor, $\alpha$, instead of using the uniformly applied value that does not take operating speed into account.
As a result, we identified the operating characteristics of vehicles and proposed a safety factor that reflects them. Applying this safety factor, which accounts for 3-D alignment, when determining the maximum superelevation should improve driver safety in road design. In addition, models that consider 3-D alignment should be developed by extending this study to collect and analyze data for road sections with a variety of geometric structures.
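
The link between speed, curve radius, and superelevation that underlies this kind of analysis can be sketched with the standard point-mass curve relation e + f = V²/(127R), with V in km/h and R in m. The abstract does not define $\alpha$, so the sketch does not attempt to reproduce the paper's safety factor; the function names, speeds, radius, and side friction value below are hypothetical.

```python
def required_superelevation(speed_kmh, radius_m, side_friction):
    """Superelevation e needed on a curve, from e + f = V^2 / (127 R)."""
    return speed_kmh ** 2 / (127.0 * radius_m) - side_friction


def minimum_radius(speed_kmh, max_superelevation, side_friction):
    """Minimum horizontal curve radius R = V^2 / (127 (e + f))."""
    return speed_kmh ** 2 / (127.0 * (max_superelevation + side_friction))


# Hypothetical values: 80 km/h design speed, 90 km/h observed operating speed.
radius_m, friction = 300.0, 0.14
e_design = required_superelevation(80, radius_m, friction)
e_operating = required_superelevation(90, radius_m, friction)
print(f"design speed:    e = {e_design:.4f}")
print(f"operating speed: e = {e_operating:.4f}")
print(f"minimum radius at 80 km/h, e_max 0.06: {minimum_radius(80, 0.06, friction):.1f} m")
```

The gap between the two superelevation values shows why a curve designed only for the design speed can be under-banked for drivers who travel faster, for example in the downgrade direction.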

Analysis of Radiation Treatment Planning by Dose Calculation and Optimization Algorithm (선량계산 및 최적화 알고리즘에 따른 치료계획의 영향 분석)

  • Kim, Dae-Sup;Yoon, In-Ha;Lee, Woo-Seok;Baek, Geum-Mun
    • The Journal of Korean Society for Radiation Therapy
    • /
    • v.24 no.2
    • /
    • pp.137-147
    • /
    • 2012
  • Purpose: To analyze how the dose calculation and optimization algorithms affect radiation treatment planning, apply the findings to actual treatment planning, and suggest the best approach for a treatment planning protocol. Materials and Methods: The treatment planning system was Eclipse 10.0 (Varian, USA). PBC (Pencil Beam Convolution) and AAA (Anisotropic Analytical Algorithm) were applied for dose calculation. DVO (Dose Volume Optimizer 10.0.28) was used as the optimization algorithm for Intensity Modulated Radiation Therapy (IMRT), and PRO II (Progressive Resolution Optimizer V 8.9.17) and PRO III (Progressive Resolution Optimizer V 10.0.28) were used as the optimization algorithms for VMAT. A $30 \times 30 \times 30$ cm phantom was created virtually in the treatment planning system, with homogeneous density (HU: 0) and with heterogeneous density produced by inserting a material assumed to be air (HU: -1,000). The general treatment planning characteristics identified from the phantom plans were then applied to clinical treatment planning. Results: In the homogeneous phantom, PBC and AAA both showed a PDD of 65.2% (6 MV, 10 cm). In the heterogeneous phantom, they showed similar PDD values before reaching the low-density material but different dose curves in the air region, and the PDD at 10 cm after penetration was 75% and 73%, respectively. For 3D treatment plans with the same MU, the AAA plan showed a lower dose in regions that include lung. For a 2D POP treatment plan at 15 MV for a cervical vertebral region that includes the trachea and lung, the Conformity Index (ICRU 62) was 0.95 with PBC and 0.93 with AAA. In the IMRT treatment plan, the DVO DVH and the dose calculation DVH showed equal values, but the AAA calculation showed a lack of dose compared with the DVO result, which satisfied the conditions. VMAT plans optimized with PRO II gave satisfactory optimization results, but the dose calculation showed a lack of dose in the lower-density region. With PRO III, however, the dose calculation results were similar to the result of optimizing once more under the same conditions.
Conclusion: This study does not judge which dose calculation algorithm is correct. However, by analyzing the characteristics of the dose distribution produced by each algorithm, and by taking the optimization algorithm into account when planning IMRT or VMAT, which require optimization, a method for producing an optimal treatment plan can be presented.
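
Two of the quantities compared above, the dose-volume histogram (DVH) and the conformity index, can be illustrated with a short sketch. It is not taken from the study or from Eclipse: the function names and the numbers are hypothetical, the DVH is the usual cumulative form, and the conformity index is taken as treated volume divided by planning target volume.

```python
def cumulative_dvh(voxel_doses, dose_levels):
    """Fraction of voxels receiving at least each dose level."""
    n = len(voxel_doses)
    return [sum(d >= level for d in voxel_doses) / n for level in dose_levels]


def conformity_index(treated_volume_cc, ptv_volume_cc):
    """Treated volume divided by planning target volume (assumed definition)."""
    return treated_volume_cc / ptv_volume_cc


# Hypothetical voxel doses in Gy for a small structure.
doses = [10.0, 20.0, 30.0, 40.0]
print(cumulative_dvh(doses, [0.0, 25.0, 50.0]))
print(conformity_index(95.0, 100.0))
```

Comparing the DVH produced by the optimizer with the DVH from the final dose calculation, level by level, is one simple way to expose the kind of dose deficit the abstract reports for low-density regions.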

Studies on Sericin Fixation by Use of Alum Meal (명반처리에 의한 견직물개선연구 -Sericin 정착을 중심으로 하여-)

  • 최병희;남중희
    • Journal of Sericultural and Entomological Science
    • /
    • v.21 no.2
    • /
    • pp.11-19
    • /
    • 1979
  • This study examined how the sericin of raw silk can be fixed in an insoluble form with potassium alum. The approach is borrowed from leather tanning, a process that acts on collagen, which is also a protein. Earlier reports described such fixation using chromium alum, formalin, or vinyl acetate grafting, but they did not consider moisture absorbency after treatment. This report pays attention to preserving moisture absorbency as well as to sericin fixation, so it may be useful for practical applications of silk. To clarify how the sericin is fixed by such chemicals, the fundamental mechanism of the wetting process and the chemical reactions with proteins are also discussed. The results are as follows. 1. Unlike in other reports, raw silk should not be treated with alum in a high-temperature bath, because such treatment leaves the raw silk stiff. 2. It is recommended that raw silk be treated with alum solution at room temperature for more than three hours. Even in this case, alum alone could fix the sericin to some extent, but it increased the water repellency of the silk. 3. A 1% alum solution was found to be able to fix the sericin of raw silk. 4. When only sericin fixation is considered, a combined treatment of 1% alum for three hours and 0.5% NaOH for ten minutes gave the best result. 5. When both insoluble sericin fixation and moisture absorbency are considered, the same combination applied in reverse order gave the best result. 6. Evidence of sericin fixation was shown by the drying characteristic curves of each wetted, treated silk, from which the change in chemical nature after each treatment could be analyzed. 7. A degumming ratio of up to 4.3% may be obtained after the combined alum treatment of regular raw silk. This ratio was considered good enough for textiles washed in warm soapy water. 8. The moisture absorbency after the combined alum and NaOH treatment was as good as that of untreated silk. 9. The tenacity and elongation of the treated silk did not change even after three months. 10. Above all, this method is considered a better process than fixing methods that color the silk (tannin method, Cr-alum method) or leave an odor (formalin method, vinyl acetate method).

When Robots Meet the Elderly: The Contexts of Interaction and the Role of Mediators (노인과 로봇은 어떻게 만나는가: 상호작용의 조건과 매개자의 역할)

  • Shin, Heesun;Jeon, Chihyung
    • Journal of Science and Technology Studies
    • /
    • v.18 no.2
    • /
    • pp.135-179
    • /
    • 2018
  • How do robots interact with the elderly? In this paper, we analyze the contexts of interaction between robots and the elderly and the role of mediators in initiating, facilitating, and maintaining the interaction. We do not attempt to evaluate the robot's performance or measure the impact of robots on the elderly. Instead, we focus on the circumstances and contexts within which a robot is situated as it interacts with the elderly. Our premise is that the success of human-robot interaction does not depend solely on the robot's technical capability, but also on the pre-arranged settings and local contingencies at the site of interaction. We select three television shows that feature robots for the elderly and one "dementia-prevention" robot in a regional healthcare center as our sites for observing robot-elderly interaction: "Grandma's Robot" (tvN), "Co-existence Experiment" (JTBC), "Future Diary" (MBC), and the Silbot class in Suwon. By analyzing verbal and non-verbal interactions between the elderly and the robots in these programs, we point out that in most cases the robots and the elderly do not meet one-to-one; the interaction is usually mediated by an actor who is not an old person. These mediators are not temporary or secondary components in the robot-elderly interaction; they play a key role in the relationship by arranging the first meeting, triggering initial interactions, and carefully observing unfolding interactions. At critical moments, the mediators prevent the interaction from falling apart by intervening verbally or physically. Based on our observation of the robot-elderly interaction, we argue that we can better understand and evaluate the human-robot interaction in general by paying attention to the existence and role of the mediators. We suggest that researchers in human-robot interaction should expand their analytical focus from one-to-one interactions between humans and robots to human-robot-human interactions in diverse real-world situations.

Management of the Nakdong-Jeongmaek based on the Characteristics of Cold Air - Focused on Busan, Ulsan, Pohang - (찬공기 특성을 고려한 낙동정맥 관리방안 연구 - 부산, 울산, 포항 인근을 대상으로 -)

  • Eum, Jeong-Hee;Son, Jeong-Min
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.44 no.5
    • /
    • pp.103-115
    • /
    • 2016
  • This study aims to analyze the characteristics of cold air production and flow in the Nakdong-Jeongmaek (mountain range) and to suggest management strategies that enhance the green air conditioning functions of the Jeongmaek. For this purpose, three study sites were selected: Gudeoksan Mountain and the vicinity in Busan, Goheonsan Mountain and the vicinity in Ulsan, and Unjusan Mountain and the vicinity in Pohang. The cold air flow and the height of the cold air layer at the three sites were analyzed based on topographic properties and land use, and management strategies for preserving and enhancing their temperature reduction functions were suggested. The cold air produced in the vicinity of Gudeoksan was not fully developed and did not spread because of the high-density development at the border of the Jeongmaek. Since high development pressure is expected at the border, strong conservation policies are required. In the vicinity of Goheonsan, where an agricultural complex and an industrial park are located, cold air flows well throughout the entire study site thanks to fully developed cold air in the wide, flat valley. Hence, plans to maintain the current cold air flow are required, and conservation plans to mitigate future development are also needed in the flat valley. The cold air in Unjusan and the vicinity, with its complex and narrow mountain valleys, develops gradually toward the valley bottoms. To take advantage of the terrain, the valleys near the cold air production areas should be preserved. In particular, special plans are required to prevent damage to the cold air layer near Youngcheonho Lake, where the greatest cold air height was recorded because of the closed, low-lying terrain. This study could support the establishment of systematic management plans for the Nakdong-Jeongmaek to preserve and enhance its green air conditioning functions.