• Title/Summary/Keyword: problem features

Set Covering-based Feature Selection of Large-scale Omics Data (Set Covering 기반의 대용량 오믹스데이터 특징변수 추출기법)

  • Ma, Zhengyu;Yan, Kedong;Kim, Kwangsoo;Ryoo, Hong Seo
    • Journal of the Korean Operations Research and Management Science Society / v.39 no.4 / pp.75-84 / 2014
  • This paper addresses the feature selection problem for large-scale, high-dimensional biological data such as omics data. Most previous approaches use a simple score function to reduce the number of original variables and then select features from the small number of remaining variables. Methods that do not rely on such filtering either ignore the interactions between variables or generate approximate solutions to a simplified problem. In contrast, by combining set covering and clustering techniques, we developed a new method that handles the full set of variables and considers the combinatorial effects of variables when selecting features. To demonstrate the efficacy and effectiveness of the method, we downloaded gene expression datasets from TCGA (The Cancer Genome Atlas) and compared our method with other algorithms, including the feature selection algorithms embedded in WEKA. The experimental results show that our method selects high-quality features that yield more accurate classifiers than those of the other feature selection algorithms.
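
As a rough illustration of how feature selection can be posed as set covering, the sketch below treats every pair of samples from different classes as an element to be covered, and says a feature covers a pair when the two samples differ on it. It is a generic greedy heuristic under stated assumptions (already binarized features, toy data, the hypothetical function name `greedy_set_cover_features`), not the authors' method, and it omits the clustering step the abstract mentions.

```python
from itertools import combinations

def greedy_set_cover_features(samples, labels):
    """Greedily pick features until every pair of samples with different
    labels differs on at least one chosen feature."""
    n_features = len(samples[0])
    # Elements to cover: index pairs (i, j) of samples from different classes.
    pairs = {(i, j)
             for i, j in combinations(range(len(samples)), 2)
             if labels[i] != labels[j]}
    # covers[f] is the set of pairs that feature f distinguishes.
    covers = [{(i, j) for (i, j) in pairs if samples[i][f] != samples[j][f]}
              for f in range(n_features)]
    chosen, uncovered = [], set(pairs)
    while uncovered:
        # Take the feature that distinguishes the most still-uncovered pairs.
        best = max(range(n_features), key=lambda f: len(covers[f] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            break  # the remaining pairs are identical on every feature
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered

# Toy binarized data (assumed): 4 samples, 4 features; feature 3 is constant.
samples = [(1, 0, 1, 0), (1, 1, 0, 0), (0, 0, 1, 0), (1, 1, 1, 0)]
labels = [1, 1, 0, 0]
chosen, uncovered = greedy_set_cover_features(samples, labels)
print(chosen, uncovered)
```

The greedy rule only approximates a minimum cover; an exact formulation would solve the corresponding integer program instead.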

The High School Students' Problem Solving Patterns and Their Features in Scientific Inquiry (고등학생의 탐구 사고력 문제 해결 과정에 나타난 유형과 특징)

  • Kim, Ik-Gyun;Hwang, Yu-Jeong
    • Journal of The Korean Association For Science Education / v.13 no.2 / pp.152-162 / 1993
  • This study investigated high school students' problem solving patterns and their features in scientific inquiry, focusing on controlling variables and stating hypotheses. Eight problems on controlling variables and stating hypotheses were selected from the scientific inquiry area of the experimental tryout of the Aptitude Assessment for College Education and used to identify the patterns and their features. The findings are as follows. There were seven patterns in the problem solving process: five of the seven were found in right answers and four in wrong answers, with two appearing in both. Some students could solve the problems even though they did not understand the elements of scientific inquiry, namely controlling variables and stating hypotheses. Wrong answers were characterized by false application of physics concepts, misunderstanding of the elements of scientific inquiry, and use of unrelated experience and conjecture. Right answers, on the other hand, were characterized by correct application of physics concepts, correct understanding and application of the inquiry elements, and inference of answers from the tables and figures given in the problem statements. Further studies of this kind may help to identify the higher mental abilities related to scientific inquiry and to develop tools for testing students' scientific inquiry thinking skills.

Document Clustering Using Semantic Features and Fuzzy Relations

  • Kim, Chul-Won;Park, Sun
    • Journal of information and communication convergence engineering / v.11 no.3 / pp.179-184 / 2013
  • Traditional clustering methods are usually based on the bag-of-words (BOW) model. A disadvantage of the BOW model is that it ignores the semantic relationship among terms in the data set. To resolve this problem, ontology or matrix factorization approaches are usually used. However, a major problem of the ontology approach is that it is usually difficult to find a comprehensive ontology that can cover all the concepts mentioned in a collection. This paper proposes a new document clustering method using semantic features and fuzzy relations for solving the problems of ontology and matrix factorization approaches. The proposed method can improve the quality of document clustering because the clustered documents use fuzzy relation values between semantic features and terms to distinguish clearly among dissimilar documents in clusters. The selected cluster label terms can represent the inherent structure of a document set better by using semantic features based on non-negative matrix factorization, which is used in document clustering. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.

An Artificial Neural Network-Based Drug Proarrhythmia Assessment Using Electrophysiological Characteristics of Cardiomyocytes (심근 세포의 전기생리학적 특징을 이용한 인공 신경망 기반 약물의 심장독성 평가)

  • Yoo, Yedam;Jeong, Da Un;Marcellinus, Aroli;Lim, Ki Moo
    • Journal of Biomedical Engineering Research / v.42 no.6 / pp.287-294 / 2021
  • Cardiotoxicity assessment of all drugs has been performed according to the ICH guidelines since 2005. The non-clinical evaluation guideline S7B has focused on the hERG assay, which suffers from low specificity. The comprehensive in vitro proarrhythmia assay (CiPA) project was initiated to correct this problem and presented a model that classifies the Torsade de pointes (TdP)-induced risk of drugs using biomarkers calculated with an in silico ventricular model. In this study, we propose an artificial neural network (ANN)-based classifier of TdP-induced risk groups. The model was trained with 12 drugs and tested with 16 drugs. Separate ANN models were built with nine, seven, and five input features; the model with the highest performance was selected and compared with the classification performance of a logistic regression model that takes qNet as input. With the five-feature model, the AUC was 0.93 for the high-risk group, 0.73 for the intermediate-risk group, and 0.92 for the low-risk group. The performance of the qNet-based model was lower than that of the ANN model by 17.6% in the high-risk group and by 29.5% in the low-risk group. The proposed model reports performance for all three risk groups and addresses the low specificity of the hERG assay.

Text-Confidence Feature Based Quality Evaluation Model for Knowledge Q&A Documents (텍스트 신뢰도 자질 기반 지식 질의응답 문서 품질 평가 모델)

  • Lee, Jung-Tae;Song, Young-In;Park, So-Young;Rim, Hae-Chang
    • Journal of KIISE:Software and Applications / v.35 no.10 / pp.608-615 / 2008
  • In Knowledge Q&A services, where information is created by unspecified users, document quality is an important factor in user satisfaction with search results. Previous work on quality prediction for Knowledge Q&A documents evaluates document quality using non-textual information, such as click counts and recommendation counts, and focuses on enhancing retrieval performance by incorporating the quality measure into the retrieval model. Although experiments have shown this non-textual information to be useful, a data sparseness problem may occur when it is used to predict the quality of newly created documents. To solve the data sparseness problem of non-textual features, this paper proposes new features for document quality prediction, namely text-confidence features, which indicate how trustworthy the content of a document is. Because the proposed features are extracted directly from the document content, they are robust to data sparseness, whereas non-textual features can be collected only through the participation of service users. Experiments conducted on real-world Knowledge Q&A documents suggest that text-confidence features perform comparably to non-textual features. We believe the proposed features can serve as effective features for document quality prediction and improve the performance of Knowledge Q&A services in the future.

Enhancing the Creative Problem Solving Skill by Using the CPS Learning Model for Seventh Grade Students with Different Prior Knowledge Levels

  • Cojorn, Kanyarat;Koocharoenpisal, Numphon;Haemaprasith, Sunee;Siripankaew, Pramuan
    • Journal of The Korean Association For Science Education / v.32 no.8 / pp.1333-1344 / 2012
  • This study aimed to enhance creative problem solving skill by using the Creative Problem Solving (CPS) learning model, which was developed based on the creative problem solving approach and the five essential features of inquiry. The key strategy of the CPS learning model is to use real-life problem situations to give students opportunities to practice creative problem solving through five learning steps: engaging, problem exploring, solutions creating, plan executing, and concepts examining. The science content used to examine the CPS learning model was "matter and properties of matter", which consists of three learning units: Matter, Solution, and Acid-Base Solution. The effectiveness of the learning model was assessed with a pretest-posttest control-group design. Seventh-grade students in the experimental group learned with the CPS learning model, while students at the same grade level in the control group learned with a conventional learning model. The learning models and the students' prior knowledge levels served as the independent variables. Creative problem solving skill was classified into four aspects: fluency, flexibility, originality, and reasoning. The results indicated that, in all aspects, the mean creative problem solving scores of the experimental and control groups differed significantly at the .05 level. Students' creative problem solving skills also progressed markedly in the later instructional periods. When the creative problem solving scores of groups of students with different levels of prior knowledge were compared, differences were found at the .05 level. The findings of this study confirm that the CPS learning model is effective in enhancing students' creative problem solving skill.

Research for Distinctive Features of Geometry Problem Solving According to Achievement Level on Middle School Students (중학생의 성취수준에 따른 기하 문제해결의 특징 탐색)

  • Kim Ki-Yoen;Kim Sun-Hee
    • School Mathematics / v.8 no.2 / pp.215-237 / 2006
  • In this study, we investigate the distinctive features of geometry problem solving among middle school students whose mathematical achievement levels were determined by the National Assessment of Educational Achievement. We classified 9 students into 3 groups according to their level: advanced, proficient, and basic. The students solved an atypical geometry problem while all of their problem solving stages were observed; these were then analyzed in terms of the development of geometrical concepts and access to the route of problem solving. Based on these analyses, we offer some suggestions for teaching mathematics according to students' achievement levels.

Performance Analysis of Brightness-Combined LLAH (밝기 정보를 결합한 LLAH의 성능 분석)

  • Park, Hanhoon;Moon, Kwang-Seok
    • Journal of Korea Multimedia Society / v.19 no.2 / pp.138-145 / 2016
  • LLAH (Locally Likely Arrangement Hashing) is a method that describes image features by exploiting the geometric relationships among their neighbors. Inherently, it is more robust to large view changes and poor scene texture than conventional texture-based feature description methods. However, LLAH requires that image features be detected with high repeatability, a requirement that is difficult to satisfy in real applications. To alleviate this problem, this paper proposes a method that improves the matching rate of LLAH by additionally exploiting the brightness of features. Experiments with synthetic images in the presence of Gaussian noise verify that the matching rate increases by about 5%.

Automatic Word Spacing for Korean Using CRFs with Korean Features (한국어 특성과 CRFs를 이용한 자동 띄어쓰기 시스템)

  • Lee, Hyun-Woo;Cha, Jeong-Won
    • MALSORI / no.65 / pp.125-141 / 2008
  • In this work, we propose an automatic word spacing system for Korean that uses conditional random fields (CRFs) with Korean features. We cast the word spacing problem as a classification problem. We first build a basic system that uses CRFs and Eumjeol (syllable) bigrams and analyze the results of an inner test. We then extend the basic system with Korean features: Josa, Eomi, and the two head Eumjeols of words extracted from a lexicon. The experimental results show that the proposed method performs better than previous methods. In addition, because the model is very small, the proposed method can be used in mobile and speech applications.
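
To make the "word spacing as classification" framing concrete, the sketch below converts a correctly spaced sentence into per-syllable labels (1 if a space follows the syllable, 0 otherwise) and builds simple syllable-bigram context features. The function names, the feature names, and the example sentence are assumptions for illustration; the CRF training step and the Josa/Eomi lexicon features described in the abstract are not shown.

```python
def to_labeled_syllables(spaced_text):
    """Turn a correctly spaced sentence into (syllable, label) pairs.
    Label 1 means a space follows the syllable, 0 means no space."""
    pairs = []
    for word in spaced_text.split():
        for k, syllable in enumerate(word):
            pairs.append((syllable, 1 if k == len(word) - 1 else 0))
    if pairs:
        pairs[-1] = (pairs[-1][0], 0)  # no space after the final syllable
    return pairs

def bigram_features(syllables, i):
    """Syllable-bigram context features for position i (hypothetical names)."""
    prev_s = syllables[i - 1] if i > 0 else "<s>"
    next_s = syllables[i + 1] if i + 1 < len(syllables) else "</s>"
    return {"prev+cur": prev_s + syllables[i],
            "cur+next": syllables[i] + next_s}

def apply_labels(syllables, labels):
    """Rebuild a spaced string from syllables and 0/1 spacing labels."""
    return "".join(s + (" " if y == 1 else "")
                   for s, y in zip(syllables, labels)).strip()

# Assumed example sentence; a sequence labeler such as a CRF would be
# trained to predict `labels` from the features of the unspaced syllables.
pairs = to_labeled_syllables("나는 학교에 간다")
syllables = [s for s, _ in pairs]
labels = [y for _, y in pairs]
print(labels)
print(bigram_features(syllables, 0))
print(apply_labels(syllables, labels))
```

Under this encoding, restoring the spacing of an unspaced input reduces to predicting one binary label per syllable.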

Use of Word Clustering to Improve Emotion Recognition from Short Text

  • Yuan, Shuai;Huang, Huan;Wu, Linjing
    • Journal of Computing Science and Engineering / v.10 no.4 / pp.103-110 / 2016
  • Emotion recognition is an important component of affective computing, and is significant in the implementation of natural and friendly human-computer interaction. An effective approach to recognizing emotion from text is based on a machine learning technique, which deals with emotion recognition as a classification problem. However, in emotion recognition, the texts involved are usually very short, leaving a very large, sparse feature space, which decreases the performance of emotion classification. This paper proposes to resolve the problem of feature sparseness, and largely improve the emotion recognition performance from short texts by doing the following: representing short texts with word cluster features, offering a novel word clustering algorithm, and using a new feature weighting scheme. Emotion classification experiments were performed with different features and weighting schemes on a publicly available dataset. The experimental results suggest that the word cluster features and the proposed weighting scheme can partly resolve problems with feature sparseness and emotion recognition performance.
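
The idea of representing short texts with word cluster features can be sketched as follows: each word is replaced by the identifier of its cluster, so two texts that share no words can still share features. The word-to-cluster map here is hand-made and purely hypothetical, and plain counts stand in for the paper's clustering algorithm and weighting scheme, neither of which is reproduced.

```python
from collections import Counter

# Hypothetical word-to-cluster map; in practice it would be produced by a
# word clustering algorithm rather than written by hand.
WORD_TO_CLUSTER = {
    "happy": 0, "glad": 0, "joyful": 0,
    "sad": 1, "unhappy": 1, "gloomy": 1,
    "angry": 2, "furious": 2,
}
N_CLUSTERS = 3

def cluster_features(text):
    """Map a short text to a dense vector of per-cluster word counts."""
    counts = Counter(WORD_TO_CLUSTER[w]
                     for w in text.lower().split()
                     if w in WORD_TO_CLUSTER)
    return [counts.get(c, 0) for c in range(N_CLUSTERS)]

a = cluster_features("so glad today")
b = cluster_features("feeling joyful")
c = cluster_features("sad and angry")
print(a, b, c)
```

In this toy case the first two texts have no word in common yet receive the same vector, which is the sense in which cluster features reduce sparseness compared with one dimension per word.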