• Title/Summary/Keyword: 정보의 구조 (structure of information)

Search Result 22,725, Processing Time 0.051 seconds

Experimental Models of Schizophrenia (정신분열병의 실험적 모델)

  • Cheon, Jin-Sook
    • Korean Journal of Biological Psychiatry
    • /
    • v.6 no.2
    • /
    • pp.153-160
    • /
    • 1999
  • Animal models provide a useful tool for studying some aspects of psychiatric disorders and their treatment. The four criteria for evaluating animal models of psychiatric disorders are as follows: 1) similarity of inducing conditions, 2) similarity of behavioral state, 3) common underlying neurobiological mechanisms, and 4) reversal by clinically effective treatment techniques. Several animal models have been proposed for schizophrenia: the phenylethylamine model, L-dopa model, hallucinogen model, cocaine model, amphetamine model, phencyclidine model, noradrenergic reward system lesion model, reticular stimulation model, social isolation model, conditioned avoidance reaction, catalepsy test, paw test, self-stimulation paradigms, latent inhibition paradigms, blocking paradigms, prepulse inhibition of the startle reflex, rodent interaction, social behavior in monkeys, hippocampal damage, high ambient pressure, and models using selective breeding. Among them, animals with bilateral lesions of the hippocampus may provide an adequate animal model for several symptoms of schizophrenia, and the ketamine model can reproduce negative symptoms and cognitive deficits as well as positive symptoms of schizophrenia. In conclusion, no model of schizophrenia is entirely representative of the disease, and findings gleaned from model systems must be interpreted cautiously. Furthermore, the process of developing and validating animal models must work in concert with the process of identifying reliable measures of human phenomenology.


Hierarchical Overlapping Clustering to Detect Complex Concepts (중복을 허용한 계층적 클러스터링에 의한 복합 개념 탐지 방법)

  • Hong, Su-Jeong;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.111-125
    • /
    • 2011
  • Clustering is the process of grouping similar or relevant documents into a cluster and assigning a meaningful concept to the cluster. Clustering thereby facilitates fast and accurate search for relevant documents by narrowing the search range to the documents belonging to related clusters. Effective clustering requires techniques for identifying similar documents and grouping them into a cluster, and for discovering the concept that is most relevant to the cluster. One problem that often appears in this context is the detection of a complex concept that overlaps with several simple concepts at the same hierarchical level. Previous clustering methods were unable to identify and represent a complex concept that belongs to several different clusters at the same level in the concept hierarchy, and could not validate the semantic hierarchical relationship between a complex concept and each of its simple concepts. To solve these problems, this paper proposes a new clustering method that identifies and represents complex concepts efficiently. We developed the Hierarchical Overlapping Clustering (HOC) algorithm, which modifies the traditional Agglomerative Hierarchical Clustering algorithm to allow overlapping clusters at the same level in the concept hierarchy. The HOC algorithm represents the clustering result not as a tree but as a lattice in order to detect complex concepts. We developed a system that employs the HOC algorithm for complex concept detection. This system operates in three phases: 1) preprocessing of documents, 2) clustering using the HOC algorithm, and 3) validation of the semantic hierarchical relationships among the concepts in the lattice obtained from clustering. The preprocessing phase represents the documents as x-y coordinate values in a 2-dimensional space by considering the weights of the terms appearing in the documents.
First, it refines the documents by applying stopword removal and stemming to extract index terms. Then, each index term is assigned a TF-IDF weight, and the x-y coordinate value of each document is determined by combining the TF-IDF values of its terms. The clustering phase uses the HOC algorithm, in which the similarity between documents is calculated using the Euclidean distance. Initially, a cluster is generated for each document by grouping the documents that are closest to it. Then, the distance between any two clusters is measured, and the closest clusters are grouped into a new cluster. This process is repeated until the root cluster is generated. In the validation phase, feature selection is applied to check whether the cluster concepts built by the HOC algorithm have meaningful hierarchical relationships. Feature selection is a method of extracting key features from a document by identifying important and representative terms in the document and assigning weights to them. To select key features correctly, a method is needed to determine how much each term contributes to the class of the document. Among several methods for this purpose, this paper adopts the $\chi^2$ statistic, which measures the degree of dependency between a term t and a class c and represents the relationship between t and c as a numerical value. To demonstrate the effectiveness of the HOC algorithm, a series of performance evaluations was carried out using the well-known Reuters-21578 news collection. The results showed that the HOC algorithm contributes greatly to detecting and producing complex concepts by generating the concept hierarchy as a lattice structure.
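The $\chi^2$ statistic mentioned in the validation phase can be illustrated with a minimal sketch. It assumes the common 2x2 contingency-table form used in text feature selection; the abstract does not give the exact formula the authors used, and the function and parameter names here are hypothetical.

```python
def chi_square(in_with: int, out_with: int, in_without: int, out_without: int) -> float:
    """Chi-square dependency between a term t and a class c (2x2 table form).

    in_with:     documents in class c that contain t
    out_with:    documents outside class c that contain t
    in_without:  documents in class c that do not contain t
    out_without: documents outside class c that do not contain t
    """
    n = in_with + out_with + in_without + out_without
    denom = (
        (in_with + in_without)      # documents in class c
        * (out_with + out_without)  # documents outside class c
        * (in_with + out_with)      # documents containing t
        * (in_without + out_without)  # documents not containing t
    )
    if denom == 0:
        # A degenerate table (an empty row or column) carries no dependency signal.
        return 0.0
    return n * (in_with * out_without - in_without * out_with) ** 2 / denom
```

A value of zero means the term and the class look independent in the sample; larger values indicate stronger dependency, so terms can be ranked by this score per class.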

T-Cache: a Fast Cache Manager for Pipeline Time-Series Data (T-Cache: 시계열 배관 데이타를 위한 고성능 캐시 관리자)

  • Shin, Je-Yong;Lee, Jin-Soo;Kim, Won-Sik;Kim, Seon-Hyo;Yoon, Min-A;Han, Wook-Shin;Jung, Soon-Ki;Park, Se-Young
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.13 no.5
    • /
    • pp.293-299
    • /
    • 2007
  • Intelligent pipeline inspection gauges (PIGs) are inspection vehicles that move along the inside of a (gas or oil) pipeline and acquire signals (also called sensor data) from their surrounding rings of sensors. By analyzing the signals captured by intelligent PIGs, we can detect pipeline defects, such as holes and curvatures, and other potential causes of gas explosions. Two major data access patterns are apparent when an analyst accesses the pipeline signal data. The first is the sequential pattern, in which an analyst reads the sensor data only once, in sequential order. The second is the repetitive pattern, in which an analyst repeatedly reads the signal data within a fixed range; this is the dominant pattern in analyzing the signal data. Existing PIG software reads signal data directly from the server at every user's request, incurring network transfer and disk access costs. It works well only for the sequential pattern, not for the more dominant repetitive pattern. This problem becomes very serious in a client/server environment where several analysts analyze the signal data concurrently. To tackle this problem, we devise a fast in-memory cache manager, called T-Cache, by treating pipeline sensor data as multiple time-series data and by efficiently caching the time-series data in T-Cache. To the best of the authors' knowledge, this is the first research on caching pipeline signals on the client side. We propose a new concept, the signal cache line, as the caching unit; it is a set of time-series signal data for a fixed distance. We also provide the data structures, including smart cursors, and the algorithms used in T-Cache. Experimental results show that T-Cache performs much better for the repetitive pattern in terms of disk I/Os and elapsed time. Even with the sequential pattern, T-Cache shows almost the same performance as a system that does not use any caching, indicating that the caching overhead in T-Cache is negligible.
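The idea of a signal cache line as the caching unit can be sketched as follows. This is an illustration only, not the T-Cache implementation: the class name, the `fetch_line` callback, and the least-recently-used eviction policy are assumptions (the abstract does not state an eviction policy), and smart cursors are not modeled.

```python
from collections import OrderedDict
from typing import Callable, List


class SignalLineCache:
    """Client-side cache whose unit is one fixed-distance segment of signal data."""

    def __init__(self, fetch_line: Callable[[int], List[float]],
                 line_length: float, capacity: int) -> None:
        self._fetch_line = fetch_line      # hypothetical server read for one line
        self._line_length = line_length    # distance covered by one cache line
        self._capacity = capacity          # maximum number of lines kept in memory
        self._lines: "OrderedDict[int, List[float]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def _get(self, index: int) -> List[float]:
        if index in self._lines:
            self.hits += 1
            self._lines.move_to_end(index)  # mark as most recently used
            return self._lines[index]
        self.misses += 1
        line = self._fetch_line(index)      # only misses reach the server
        self._lines[index] = line
        if len(self._lines) > self._capacity:
            self._lines.popitem(last=False)  # evict the least recently used line
        return line

    def read(self, start: float, end: float) -> List[List[float]]:
        """Return every cache line overlapping the distance range [start, end]."""
        first = int(start // self._line_length)
        last = int(end // self._line_length)
        return [self._get(i) for i in range(first, last + 1)]
```

Under the repetitive pattern, a second `read` over the same range is served entirely from memory, which is the effect the abstract attributes to client-side caching.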

Optimization of Support Vector Machines for Financial Forecasting (재무예측을 위한 Support Vector Machine의 최적화)

  • Kim, Kyoung-Jae;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.241-254
    • /
    • 2011
  • Financial time-series forecasting is one of the most important issues because it is essential for the risk management of financial institutions. Therefore, researchers have tried to forecast financial time series using various data mining techniques such as regression, artificial neural networks, decision trees, and k-nearest neighbor. Recently, support vector machines (SVMs) have been widely applied to this research area because they do not require huge training data and have a low possibility of overfitting. However, a user must determine several design factors heuristically in order to use an SVM. For example, the selection of an appropriate kernel function and its parameters and proper feature subset selection are major design factors of SVMs. Beyond these factors, the proper selection of an instance subset may also improve the forecasting performance of SVMs by eliminating irrelevant and distorting training instances. Nonetheless, few studies have applied instance selection to SVMs, especially in the domain of stock market prediction. Instance selection tries to choose proper instance subsets from the original training data. It may be considered a method of knowledge refinement, and it maintains the instance base. This study proposes a novel instance selection algorithm for SVMs. The proposed technique uses a genetic algorithm (GA) to optimize the instance selection process and the kernel parameters simultaneously. We call the model ISVM (SVM with Instance selection). Experiments on stock market data are carried out using ISVM. In this study, the GA searches for optimal or near-optimal values of the kernel parameters and relevant instances for SVMs. The chromosomes in the GA therefore carry two sets of parameters: the codes for the kernel parameters and the codes for instance selection.
For the controlling parameters of the GA search, the population size is set at 50 organisms, the crossover rate at 0.7, and the mutation rate at 0.1. As the stopping condition, 50 generations are permitted. The application data used in this study consist of technical indicators and the direction of change in the daily Korea stock price index (KOSPI). The total number of samples is 2218 trading days. We separate the whole data set into three subsets: training, test, and hold-out data sets. The number of data in each subset is 1056, 581, and 581, respectively. This study compares ISVM with several comparative models, including logistic regression (logit), backpropagation neural networks (ANN), nearest neighbor (1-NN), conventional SVM (SVM), and SVM with optimized parameters (PSVM). In particular, PSVM uses kernel parameters optimized by the genetic algorithm. The experimental results show that ISVM outperforms 1-NN by 15.32%, ANN by 6.89%, Logit and SVM by 5.34%, and PSVM by 4.82% on the hold-out data. For ISVM, only 556 of the 1056 original training data are used to produce the result. In addition, the two-sample test for proportions is used to examine whether ISVM significantly outperforms the other comparative models. The results indicate that ISVM outperforms ANN and 1-NN at the 1% statistical significance level, and performs better than Logit, SVM, and PSVM at the 5% statistical significance level.
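The two-sample test for proportions used to compare hit ratios can be sketched as below. This is a generic pooled z-test, assumed here because the abstract does not specify the variant; the function name and the counts in the usage note are hypothetical and are not the paper's results.

```python
import math
from typing import Tuple


def two_proportion_z_test(hits_a: int, n_a: int, hits_b: int, n_b: int) -> Tuple[float, float]:
    """Pooled two-sample z-test for the difference between two proportions.

    Returns (z, two_sided_p_value).
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0.0:
        raise ValueError("standard error is zero; both samples are all hits or all misses")
    z = (p_a - p_b) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

For example, `two_proportion_z_test(360, 581, 330, 581)` would compare two models on a hold-out set of 581 days; the hit counts 360 and 330 are made up for illustration, and only the set size 581 comes from the abstract.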

Social Network Analysis for the Effective Adoption of Recommender Systems (추천시스템의 효과적 도입을 위한 소셜네트워크 분석)

  • Park, Jong-Hak;Cho, Yoon-Ho
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.305-316
    • /
    • 2011
  • A recommender system uses automated information filtering technology to recommend products or services to customers who are likely to be interested in them. Such systems are widely used by Web retailers such as Amazon.com, Netflix.com, and CDNow.com. Various recommender systems have been developed. Among them, Collaborative Filtering (CF) is known as the most successful and commonly used approach. CF identifies customers whose tastes are similar to those of a given customer and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. However, the relative performance of CF algorithms is known to be domain and data dependent. Implementing and launching a CF recommender system is very time-consuming and expensive, and a system unsuited to the given domain provides customers with poor-quality recommendations that easily annoy them. Therefore, predicting in advance whether the performance of a CF recommender system will be acceptable is practically important. In this study, we propose a decision-making guideline that helps decide whether CF is adoptable for a given application with certain transaction data characteristics. Several previous studies reported that sparsity, gray sheep, cold-start, coverage, and serendipity could affect the performance of CF, but theoretical and empirical justification of these factors is lacking. Recently, many studies have paid attention to Social Network Analysis (SNA) as a method for analyzing social relationships among people. SNA measures and visualizes the linkage structure and status of a network, focusing on the interactions among objects within a communication group. CF analyzes the similarity among the previous ratings or purchases of each customer, finds the relationships among customers who have similarities, and then uses those relationships for recommendations.
Thus CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. Under the assumption that SNA can facilitate an exploration of the topological properties of the network structure implicit in transaction data for CF recommendations, we focus on density, clustering coefficient, and centralization, which are among the most commonly used measures for capturing the topological properties of a social network structure. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. We explore how these SNA measures affect the performance of CF and how they interact with each other. Our experiments used sales transaction data from H department store, one of the well-known department stores in Korea. A total of 396 data sets were sampled to construct various types of social networks. The variable measurement process consists of three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used UCINET 6.0 for SNA. The experiments conducted a 3-way ANOVA that employs the three SNA measures as independent variables and the recommendation accuracy, measured by the F1-measure, as the dependent variable. The experiments show that 1) each of the three SNA measures affects the recommendation accuracy, 2) the effect of density on performance overrides those of clustering coefficient and centralization (i.e., CF adoption is not a good decision if the density is low), and 3) even when the density is low, the performance of CF is comparatively good if the clustering coefficient is low.
We expect that these experimental results will help firms decide whether a CF recommender system is adoptable for their business domain with certain transaction data characteristics.
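The three network measures named above can be made concrete with a small sketch for an undirected graph stored as an adjacency dictionary. The specific definitions are assumptions: the average of local clustering coefficients and Freeman's degree centralization are used here, whereas the abstract only says UCINET 6.0 was used and does not state which variants were computed.

```python
from itertools import combinations
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected: each link appears in both endpoints' sets


def density(graph: Graph) -> float:
    """Links present as a proportion of the maximum possible number of links."""
    n = len(graph)
    links = sum(len(neighbors) for neighbors in graph.values()) / 2
    return 2 * links / (n * (n - 1))


def average_clustering(graph: Graph) -> float:
    """Mean local clustering coefficient; nodes with fewer than 2 neighbors count as 0."""
    total = 0.0
    for neighbors in graph.values():
        k = len(neighbors)
        if k < 2:
            continue
        closed = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
        total += 2 * closed / (k * (k - 1))
    return total / len(graph)


def degree_centralization(graph: Graph) -> float:
    """Freeman degree centralization: 1.0 for a star, 0.0 when all degrees are equal."""
    n = len(graph)  # requires n >= 3
    degrees = [len(neighbors) for neighbors in graph.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))
```

A star network (one customer linked to all others) gives the maximum centralization with zero clustering, while a fully connected triangle gives maximum density and clustering with zero centralization, which shows how the three measures capture different aspects of the same network.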

Improving Bidirectional LSTM-CRF model Of Sequence Tagging by using Ontology knowledge based feature (온톨로지 지식 기반 특성치를 활용한 Bidirectional LSTM-CRF 모델의 시퀀스 태깅 성능 향상에 관한 연구)

  • Jin, Seunghee;Jang, Heewon;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.253-266
    • /
    • 2018
  • This paper proposes a sequence tagging methodology to improve the performance of NER (Named Entity Recognition) used in QA systems. To retrieve the correct answers stored in a database, the user's query must be converted into a database language such as SQL (Structured Query Language) so that the computer can recognize the language of the user. This is the process of identifying the class or data name contained in the database. The existing method, which looks up the words of the query in the database and recognizes the objects, cannot identify homophones and word phrases because it does not consider the context of the user's query. If there are multiple search results, all of them are returned, so the query can have many interpretations and the time complexity of the calculation becomes large. To overcome these problems, this study reflects the contextual meaning of the query using a Bidirectional LSTM-CRF. We also address a disadvantage of the neural network model, which cannot identify untrained words, by using an ontology knowledge based feature. Experiments were conducted on an ontology knowledge base of the music domain, and the performance was evaluated. To evaluate the performance of the L-Bidirectional LSTM-CRF proposed in this study accurately, we converted words included in the learned queries into untrained words in order to test whether the model correctly identified words that were included in the database but not trained. As a result, the model could recognize objects in consideration of the context and could recognize untrained words without re-training the L-Bidirectional LSTM-CRF model, and the overall performance of object recognition was confirmed to be improved.
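One way an ontology knowledge based feature could be attached to each token is sketched below. The abstract does not describe how the feature is constructed, so everything here is an assumption for illustration: the tiny `ONTOLOGY` dictionary, the class names, the longest-match lookup, and the BIO-style feature labels. Context-dependent disambiguation would still be left to the Bidirectional LSTM-CRF, which would receive these labels as an extra input alongside the word representations.

```python
from typing import Dict, List, Tuple

# Hypothetical music-domain entries: surface phrase (lowercased tokens) -> ontology class.
ONTOLOGY: Dict[Tuple[str, ...], str] = {
    ("yesterday",): "Track",
    ("the", "beatles"): "Artist",
}


def ontology_features(tokens: List[str],
                      ontology: Dict[Tuple[str, ...], str],
                      max_len: int = 3) -> List[str]:
    """Tag each token with the ontology class of the longest phrase it belongs to."""
    features = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first so multi-word names win.
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = tuple(t.lower() for t in tokens[i:i + length])
            cls = ontology.get(phrase)
            if cls is not None:
                features[i] = "B-" + cls
                for j in range(i + 1, i + length):
                    features[j] = "I-" + cls
                i += length
                matched = True
                break
        if not matched:
            i += 1
    return features
```

Because the lookup depends only on the knowledge base, a name added to the ontology after training still produces a feature, which is the kind of signal that could help a tagger handle untrained words without re-training.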

The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms (다중 클래스 데이터셋의 메타특징이 판별 알고리즘의 성능에 미치는 영향 연구)

  • Kim, Jeonghun;Kim, Min Yong;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.23-45
    • /
    • 2020
  • Big data is being created in a wide variety of fields such as medical care, manufacturing, logistics, sales sites, and SNS, and dataset characteristics are correspondingly diverse. To secure the competitiveness of companies, it is necessary to improve decision-making capacity using classification algorithms. However, most companies do not have sufficient knowledge of which classification algorithm is appropriate for a specific problem area. In other words, determining which classification algorithm is appropriate for the characteristics of a dataset has been a task that requires expertise and effort. This is because the relationship between the characteristics of datasets (called meta-features) and the performance of classification algorithms has not been fully understood. Moreover, there has been little research on meta-features that reflect the characteristics of multi-class datasets. Therefore, the purpose of this study is to analyze empirically whether meta-features of multi-class datasets have a significant effect on the performance of classification algorithms. In this study, meta-features of multi-class datasets were grouped into two factors (data structure and data complexity), and seven representative meta-features were selected. Among those, we included the Herfindahl-Hirschman Index (HHI), originally a market concentration index, to replace the IR (Imbalanced Ratio). We also developed a new index called the Reverse ReLU Silhouette Score and added it to the meta-feature set. From the UCI Machine Learning Repository, six representative datasets (Balance Scale, PageBlocks, Car Evaluation, User Knowledge-Modeling, Wine Quality (red), and Contraceptive Method Choice) were selected. The classes of each dataset were classified using the classification algorithms selected in the study (KNN, Logistic Regression, Naïve Bayes, Random Forest, and SVM). For each dataset, we applied 10-fold cross validation.
Oversampling of 10% to 100% was applied to each fold, and the meta-features of the dataset were measured. The meta-features selected are HHI, Number of Classes, Number of Features, Entropy, Reverse ReLU Silhouette Score, Nonlinearity of Linear Classifier, and Hub Score. The F1-score was selected as the dependent variable. The results showed that six meta-features, including the Reverse ReLU Silhouette Score and HHI proposed in this study, have a significant effect on classification performance. (1) The HHI meta-feature proposed in this study has a significant effect on classification performance. (2) The number of variables has a significant effect on classification performance and, unlike the number of classes, its effect is positive. (3) The number of classes has a negative effect on classification performance. (4) Entropy has a significant effect on classification performance. (5) The Reverse ReLU Silhouette Score also significantly affects classification performance at the 0.01 significance level. (6) The nonlinearity of linear classifiers has a significant negative effect on classification performance. In addition, the results of the analysis by classification algorithm were consistent. In the regression analysis by classification algorithm, the number of variables does not have a significant effect for the Naïve Bayes algorithm, unlike the other classification algorithms. This study makes two theoretical contributions: (1) two new meta-features (HHI and the Reverse ReLU Silhouette Score) were shown to be significant, and (2) the effects of data characteristics on classification performance were investigated using meta-features. The practical contributions are as follows. (1) The results can be used in developing a classification algorithm recommendation system based on the characteristics of datasets.
(2) Because the characteristics of data differ, many data scientists repeatedly test and adjust algorithm parameters to find the optimal algorithm for a given situation. In this process, resources such as hardware, cost, time, and manpower are wasted excessively. This study is expected to be useful for machine learning and data mining researchers, practitioners, and developers of machine learning-based systems. The paper consists of an introduction, related research, research model, experiment, and conclusion and discussion.
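Two of the listed meta-features can be illustrated with a short sketch. It assumes that HHI is computed over class shares expressed as fractions (so the value lies between 1/k for k balanced classes and 1.0 for a single class) and that Entropy means the Shannon entropy of the class distribution; the abstract does not confirm either detail, and the function names are hypothetical. The Reverse ReLU Silhouette Score is not sketched because its definition is not given here.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def class_shares(labels: Sequence[Hashable]) -> List[float]:
    """Fraction of instances belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def hhi(labels: Sequence[Hashable]) -> float:
    """Herfindahl-Hirschman Index of class concentration: sum of squared shares."""
    return sum(share * share for share in class_shares(labels))


def class_entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the class distribution."""
    return -sum(share * math.log2(share) for share in class_shares(labels))
```

Unlike a ratio between only the largest and smallest classes, the sum of squared shares uses every class, which may be why it suits multi-class datasets as a replacement for the imbalance ratio.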

A Study on the Tempo Direction of Narrative Webtoons -Focusing on <묘진전>- (서사 웹툰에서 템포 연출의 재미 요소에 대한 연구 -<묘진전>을 중심으로-)

  • Kim, Seong-jae
    • Cartoon and Animation Studies
    • /
    • s.47
    • /
    • pp.193-215
    • /
    • 2017
  • This study examines tempo as an element that influences the fun of narrative webtoons. Although many elements can create fun in narrative webtoons, the theory this study focuses on is the accumulation and relief of tension. Lee Hyun-bee said in his book that the accumulation and relief of tension is the element that creates fun. Tension in a story creates immersion by drawing readers into the story. However, if tension is maintained throughout the whole story, readers become insensitive to it, so the accumulation and relief of tension should alternate to maintain immersion. One of the directing techniques that creates the accumulation and relief of tension in narrative webtoons is tempo direction. When creating a full-length narrative webtoon, it is not easy to depict a whole incident from beginning to end in chronological order. Therefore, differences between story time and narrative time are inevitable, and this difference is called 'tempo'. Tempo creates fun while readers are immersed in the work by adjusting the pacing of the story in the direction of a narrative webtoon. This role of tempo direction is based on the relation between where tempo direction occurs and the information of the story. The information that actually drives the story creates the accumulation and relief of tension, which is the essential element in forming fun, while tempo maximizes the effects of that accumulation and relief. Tempo direction in narrative webtoons uses panels and the gaps between them. Scene direction using panels and gaps takes tempo and dynamics into account because panels and gaps are temporal. This paper analyzes the use of tempo direction in narrative webtoons through an analysis of the 1st episode of <묘진전>.
The significance of this study is that it shows that tempo direction is one of the factors creating fun in narrative webtoons, and that it suggests theoretical grounds for future research on direction that creates fun.

Opportunity Tree Framework Design For Optimization of Software Development Project Performance (소프트웨어 개발 프로젝트 성능의 최적화를 위한 Opportunity Tree 모델 설계)

  • Song Ki-Won;Lee Kyung-Whan
    • The KIPS Transactions:PartD
    • /
    • v.12D no.3 s.99
    • /
    • pp.417-428
    • /
    • 2005
  • Today, IT organizations perform projects with a vision related to marketing and financial profit. The objective in realizing the vision is to improve project performance in terms of QCD. Organizations have made great efforts to achieve this objective through process improvement. Large companies such as IBM, Ford, and GE have achieved success rates of over 80% through business process re-engineering using information technology, rather than relying on the business improvement effect of computers. To achieve the objective, it is important to collect, analyze, and manage data on performed projects, but quantitative measurement is difficult because software is invisible and the effect and efficiency resulting from process change are not visibly identified. Therefore, it is not easy to derive an improvement strategy. This paper measures and analyzes project performance, focusing on organizations' external effectiveness and internal efficiency (Quality, Delivery, Cycle time, and Waste). Based on the measured project performance scores, an OT (Opportunity Tree) model was designed for optimizing project performance. The design process is as follows. First, meta data are derived from projects and analyzed using a quantitative GQM (Goal-Question-Metric) questionnaire. Then, the project performance model is designed with the data obtained from the quantitative GQM questionnaire, and the organization's performance score for each area is calculated. The value is revised by integrating the measured scores with the per-area vision weights from all stakeholders (CEO, middle managers, developers, investors, and customers). Through this, routes for improvement are presented and an optimized improvement method is suggested. Existing methods for improving software processes have been highly effective in dividing processes but somewhat unsatisfactory in providing a structural function for developing and systematically managing strategies by applying the processes to projects.
The proposed OT model provides a solution to this problem. The OT model is useful for providing an optimal improvement method in line with the organization's goals and, if applied with the proposed methods, can reduce the risks that may occur in the course of process improvement. In addition, satisfaction with the improvement strategy can be improved by obtaining input on vision weights from all stakeholders through the qualitative questionnaire and reflecting it in the calculation. The OT is also useful for optimizing market expansion and financial performance by controlling Quality, Delivery, Cycle time, and Waste.
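The step of revising area scores with stakeholder vision weights can be sketched as a weighted average. This is one plausible reading only: the abstract does not give the formula, so the averaging of weights across stakeholders, the normalization, and all names below are assumptions.

```python
from typing import Dict


def weighted_performance(area_scores: Dict[str, float],
                         stakeholder_weights: Dict[str, Dict[str, float]]) -> float:
    """Combine per-area scores using vision weights averaged over stakeholders.

    area_scores:         measured score per area, e.g. Quality, Delivery
    stakeholder_weights: for each stakeholder, the weight given to each area
    """
    count = len(stakeholder_weights)
    # Assumption: every stakeholder counts equally when weights are averaged.
    mean_weight = {
        area: sum(weights[area] for weights in stakeholder_weights.values()) / count
        for area in area_scores
    }
    total = sum(mean_weight.values())
    # Normalize so the result stays on the same scale as the area scores.
    return sum(area_scores[area] * mean_weight[area] / total for area in area_scores)
```

With equal weights from every stakeholder, the result reduces to the plain mean of the area scores; areas that stakeholders weight more heavily pull the overall score toward their own value.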

Study on the Usefulness about Molecular Breast Imaging In Dense Breast (치밀형 유방에서 Molecular Breast Imaging 검사의 유용성에 관한 고찰)

  • Baek, Song Ee;Kang, Chun Goo;Lee, Han Wool;Park, Min Soo;Choi, Young Sook;Kim, Jae Sam
    • The Korean Journal of Nuclear Medicine Technology
    • /
    • v.20 no.1
    • /
    • pp.42-46
    • /
    • 2016
  • Purpose: Mammography is the most widely used examination for early diagnosis because it allows the anatomy of the breast to be observed. However, its sensitivity is markedly reduced in high-risk patients with dense breasts. Molecular Breast Imaging (MBI) provides high-resolution functional images, and this new nuclear medicine technique yields improved diagnostic information because it is useful for confirming tumor location in dense breasts. The purpose of this study is to evaluate the usefulness of MBI for tumor diagnosis in patients with dense breasts. Materials and Methods: We studied 10 female breast cancer patients with dense breasts who visited the hospital from September 1 to October 10, 2015. The patients underwent both MBI and mammography. For the MBI scan (Discovery 750B; General Electric Healthcare, USA), 20 mCi of 99mTc-MIBI was injected into the arm on the side opposite the lesion, and bilateral breast CC (craniocaudal) and MLO (mediolateral oblique) views were acquired 20 minutes later. Mammography was conducted in the same positions. In blind tests, the MBI and mammography images were compared to evaluate sensitivity and specificity for each modality alone and for the two images used together. Results: In the blind test for breast cancer, the sensitivity of mammography and MBI was 63% and 89%, respectively, and their specificity was 38% and 87%, respectively. Using mammography and MBI together gave a sensitivity of 92% and a specificity of 90%. Conclusion: This study found that tumors in dense tissue, which cannot be easily distinguished on mammography, can be diagnosed more accurately with MBI because they are easy to evaluate visually. However, MBI has difficulty imaging microcalcifications, so it is thought to provide more diagnostic information when used in conjunction with mammography.
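The sensitivity and specificity figures reported above follow the standard definitions, which can be written out as a short sketch. The function names are hypothetical, and the abstract does not report the underlying reading counts, so no counts from the study are reproduced here.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of actual tumor cases that the reading correctly flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of tumor-free cases that the reading correctly clears."""
    return true_negatives / (true_negatives + false_positives)
```

For instance, with made-up counts, 8 detected tumors and 2 missed tumors give a sensitivity of 0.8, and 9 correctly cleared cases with 1 false alarm give a specificity of 0.9.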
