• Title/Summary/Keyword: Tree-search


Hierarchical Overlapping Clustering to Detect Complex Concepts (중복을 허용한 계층적 클러스터링에 의한 복합 개념 탐지 방법)

  • Hong, Su-Jeong;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.111-125 / 2011
  • Clustering is the process of grouping similar or relevant documents into a cluster and assigning a meaningful concept to that cluster. Clustering facilitates fast and accurate search for relevant documents by narrowing the search range to the documents belonging to related clusters. Effective clustering requires techniques for identifying similar documents and grouping them into a cluster, and for discovering the concept most relevant to the cluster. One problem that often arises in this context is the detection of a complex concept that overlaps with several simple concepts at the same hierarchical level. Previous clustering methods were unable to identify and represent a complex concept that belongs to several different clusters at the same level in the concept hierarchy, and could not validate the semantic hierarchical relationship between a complex concept and each of the simple concepts. To solve these problems, this paper proposes a new clustering method that identifies and represents complex concepts efficiently. We developed the Hierarchical Overlapping Clustering (HOC) algorithm, which modifies the traditional Agglomerative Hierarchical Clustering algorithm to allow overlapping clusters at the same level in the concept hierarchy. The HOC algorithm represents the clustering result not as a tree but as a lattice in order to detect complex concepts. We developed a system that employs the HOC algorithm for complex concept detection. This system operates in three phases: 1) preprocessing of the documents, 2) clustering using the HOC algorithm, and 3) validation of the semantic hierarchical relationships among the concepts in the lattice obtained from clustering. The preprocessing phase represents each document as an x-y coordinate in a 2-dimensional space by considering the weights of the terms appearing in it. First, index terms are extracted by applying stopword removal and stemming. Then each index term is assigned a TF-IDF weight, and the x-y coordinate of each document is determined by combining the TF-IDF values of its terms. The clustering phase uses the HOC algorithm, in which the similarity between documents is calculated using the Euclidean distance. Initially, a cluster is generated for each document by grouping the documents that are closest to it. Then the distance between every pair of clusters is measured, and the closest clusters are grouped into a new cluster. This process is repeated until the root cluster is generated. In the validation phase, feature selection is applied to check whether the cluster concepts built by the HOC algorithm have meaningful hierarchical relationships. Feature selection extracts key features from a document by identifying important and representative terms and assigning weights to them. To select key features correctly, a method is needed to determine how much each term contributes to the class of the document. Among the several methods available, this paper adopts the $\chi^2$ statistic, which measures the degree of dependency between a term t and a class c and represents the relationship between t and c as a numerical value. To demonstrate the effectiveness of the HOC algorithm, a series of performance evaluations was carried out using the well-known Reuters-21578 news collection. The results showed that the HOC algorithm contributes greatly to detecting and producing complex concepts by generating the concept hierarchy in a lattice structure.
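
The abstract names the chi-square statistic for term-class dependency but does not give its formula. The sketch below uses a common textbook formulation based on a 2x2 contingency table, which may differ in detail from what the paper implements. The names `Doc`, `contingency`, and `chi_square`, and the toy documents, are illustrative assumptions and are not taken from the paper or from the Reuters collection.

```python
from typing import Iterable, Set, Tuple

# Hypothetical document representation: (set of index terms, class label).
Doc = Tuple[Set[str], str]


def contingency(docs: Iterable[Doc], term: str, cls: str) -> Tuple[int, int, int, int]:
    """Count the 2x2 table for a term t and a class c.

    a: documents in c that contain t
    b: documents outside c that contain t
    c: documents in c that do not contain t
    d: documents outside c that do not contain t
    """
    a = b = c = d = 0
    for terms, label in docs:
        has_term = term in terms
        in_class = label == cls
        if has_term and in_class:
            a += 1
        elif has_term:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square score N(ad - cb)^2 / ((a+c)(b+d)(a+b)(c+d)).

    Returns 0.0 when a row or column of the table is empty (assumed convention).
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


# Toy data for illustration only.
toy_docs = [
    ({"oil", "price"}, "crude"),
    ({"oil", "export"}, "crude"),
    ({"wheat", "price"}, "grain"),
    ({"wheat", "harvest"}, "grain"),
]
oil_score = chi_square(*contingency(toy_docs, "oil", "crude"))
price_score = chi_square(*contingency(toy_docs, "price", "crude"))
```

In this toy set, "oil" occurs only in the "crude" documents and so scores higher than "price", which is spread evenly across both classes and scores zero. A higher score indicates stronger dependency between the term and the class.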

Location Service Modeling of Distributed GIS for Replication Geospatial Information Object Management (중복 지리정보 객체 관리를 위한 분산 지리정보 시스템의 위치 서비스 모델링)

  • Jeong, Chang-Won;Lee, Won-Jung;Lee, Jae-Wan;Joo, Su-Chong
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.985-996 / 2006
  • As internet technologies develop, the geographic information system (GIS) environment is shifting to web-based services. Because the geospatial information of existing Web-GIS services was developed independently, there is no interoperability to support diverse map formats. The same geospatial information object can be used for various purposes and is therefore duplicated separately across GIS. Intelligent strategies are needed for optimal replica selection, that is, for identifying replicated geospatial information objects. OMG, GLOBE, and GRID computing have suggested related frameworks for managing replicated objects, but this research is not thorough enough in the case of geospatial information objects. This paper presents a location service model that supports optimal selection among replicas and management of replicated objects. It consists of three main services. The first is a binding service, which saves the names and properties of objects defined by users according to service offers and enables clients to search for them among the service offers. The second is a location service, which manages location information with contact records. It also obtains performance information through the Load Sharing Facility on each system, independently, along with the contact address. The third is an intelligent selection service, which obtains basic and performance information from the binding service and the location service and provides both faster access and better performance characteristics through rules from an intelligent model based on rough sets. To validate the location service model, this research presents the execution processes of the location service with a graphical user interface.

Analysis of User′s Satisfaction to the Small Urban Spaces by Environmental Design Pattern Language (환경디자인 패턴언어를 통해 본 도심소공간의 이용만족도 분석에 관한 연구)

  • 김광래;노재현;장동주
    • Journal of the Korean Institute of Landscape Architecture / v.16 no.3 / pp.21-37 / 1989
  • The environmental design patterns of nine small urban spaces in the central business district (C.B.D.) of Seoul were surveyed and analyzed for user satisfaction and behavior, with the environmental design evaluated using Christopher Alexander's Pattern Language. Small urban spaces, as part of the streetscape, are formed by physical factors as well as by the visual environment and interacting user behavior. User satisfaction and behavior at the nine small urban spaces were therefore investigated to explore possible applications of the Pattern Language. A pattern language has the structure of a network. It is used in sequence, going through the patterns, moving always from large patterns to smaller, always from the ones which create ... comes simply from the observation that most of the wonderful places of the city were not made by architects but by the people. It defines the limited number of arrangements of spaces that make sense in any given culture, and it actually gives us the power to generate these coherent arrangements of space. As a result, design patterns related to 'Plaza', 'Seats' and 'Accessibility' were rated highly by Pattern Frequency, Pattern Interaction and their Composition ranks, reconfirming Whyte's praise of small urban spaces in our inner-city design environments. According to the multiple regression analysis of user evaluations, the environmental functions related to satisfaction were 'Plaza', 'Accessibility' and 'Paving'. According to the free responses, users prefer visually pleasing environmental design objects such as 'Waterscape' and 'Setting'. In addition, the basic needs in small urban spaces are amenity facilities such as benches, drinking water and shade for rest.


Feature Analysis of Metadata Schemas for Records Management and Archives from the Viewpoint of Records Lifecycle (기록 생애주기 관점에서 본 기록관리 메타데이터 표준의 특징 분석)

  • Baek, Jae-Eun;Sugimoto, Shigeo
    • Journal of Korean Society of Archives and Records Management / v.10 no.2 / pp.75-99 / 2010
  • Digital resources are widely used in modern society. However, we face fundamental problems in maintaining and preserving digital resources over time. Several standard methods for preserving digital resources have been developed and are in use. It is widely recognized that metadata is one of the most important components of digital archiving and preservation. There are many metadata standards for the archiving and preservation of digital resources, each with its own features in accordance with its primary application. This means that each schema has to be appropriately selected and tailored to a particular application. In some cases, these schemas are combined in a larger framework and container metadata, such as the DCMI application framework and METS. We used the following metadata standards in this study for the feature analysis: AGLS Metadata, which is defined to improve searching of both digital and non-digital resources; ISAD(G), which is a commonly used standard for archives; EAD, which is widely used for digital archives; OAIS, which defines a metadata framework for preserving digital objects; and PREMIS, which is designed primarily for the preservation of digital resources. In addition, we extracted attributes from the decision tree defined for the digital preservation process by the Digital Preservation Coalition (DPC) and compared this set of attributes with these metadata standards. This paper shows the features of these metadata standards obtained through the feature analysis based on the records lifecycle model. The features are shown in a single framework, which makes it easy to relate the tasks in the lifecycle to the metadata elements of these standards. As a result of the detailed analysis of the metadata elements, we clarified the features of the standards from the viewpoint of the relationships between the elements and the lifecycle stages. Mapping between metadata schemas is often required in the long-term preservation process because different schemas are used across the records lifecycle. Therefore, it is crucial to build a unified framework to enhance the interoperability of these schemas. This study presents a basis for the interoperability of different metadata schemas used in digital archiving and preservation.

Construction of Genetic Linkage Map and Identification of Quantitative Trait Loci in Populus davidiana using Genotyping-by-sequencing (Genotyping-by-sequencing 기법을 이용한 사시나무(Populus davidiana) 유전연관지도 작성 및 양적형질 유전자좌 탐색)

  • Suvi Kim;Yang-gil Kim;Dayoung Lee;Hye-jin Lee;Kyu-Suk Kang
    • Journal of Korean Society of Forest Science / v.112 no.1 / pp.40-56 / 2023
  • Tree species in the genus Populus grow rapidly and have an excellent capacity to absorb carbon, giving them a substantial ability to purify the environment effectively. Poplar breeding can be achieved rapidly and efficiently if a genetic linkage map is constructed and quantitative trait loci (QTLs) are identified. Here, a high-density genetic linkage map was constructed for control-pollinated progeny using genotyping-by-sequencing (GBS), a next-generation sequencing method. A search was also performed for genes associated with quantitative traits located in the genetic linkage map by examining height, diameter at root collar, and resilience to insect damage. Height and diameter at root collar were measured directly, while the ability to recover from insect damage was scored, in a 4-year-old breeding population of aspen hybrids (Odae19 × Bonghyeon4 F1) established in the research forest of Seoul National University. After DNA extraction, paternity was confirmed using five microsatellite markers, and only the individuals for which paternity was confirmed were used for the analysis. The DNA was cut using restriction enzymes, and the resulting DNA fragments were prepared as a GBS library and sequenced. The sequencing results were aligned using Populus trichocarpa as the reference genome. Overall, 58,040 aligned single-nucleotide polymorphism (SNP) markers were identified, 17,755 of which were used for mapping genetic linkages. The genetic linkage map was divided into 19 linkage groups, with a total length of 2,129.54 cM. The analysis failed to identify any growth-related QTLs, but a gene presumed to be related to recovery from insect damage was identified on linkage group (chromosome) 4 through a genome-wide association study.

Interpreting Bounded Rationality in Business and Industrial Marketing Contexts: Executive Training Case Studies (집행관배훈안례연구(阐述工商业背景下的有限合理性):집행관배훈안례연구(执行官培训案例研究))

  • Woodside, Arch G.;Lai, Wen-Hsiang;Kim, Kyung-Hoon;Jung, Deuk-Keyo
    • Journal of Global Scholars of Marketing Science / v.19 no.3 / pp.49-61 / 2009
  • This article provides training exercises for executives in interpreting subroutine maps of executives' thinking when processing business and industrial marketing problems and opportunities. This study builds on premises that Schank proposes about learning and teaching: (1) learning occurs by experiencing, and the best instruction offers learners opportunities to distill their knowledge and skills from interactive stories in the form of goal-based scenarios, team projects, and understanding stories from experts; and (2) telling does not lead to learning because learning requires action: training environments should emphasize active engagement with stories, cases, and projects. Each training case study includes executive exposure to decision system analysis (DSA). The training case requires the executive to write a "Briefing Report" of a DSA map. Instructions to the executive trainee for writing the briefing report call for coverage of (1) details of the essence of the DSA map and (2) a statement of the warnings and opportunities that the executive map reader interprets within the DSA map. The maximum length of a briefing report is 500 words, an arbitrary rule that works well in executive training programs. Following this introduction, section two of the article briefly summarizes relevant literature on how humans think within contexts in response to problems and opportunities. Section three illustrates the creation and interpretation of DSA maps using a training exercise in pricing a chemical product to different OEM (original equipment manufacturer) customers. Section four presents a training exercise in pricing decisions by a petroleum manufacturing firm. Section five presents a training exercise in marketing strategies by an office furniture distributor along with buying strategies by business customers. 
Each of the three training exercises is based on research into information processing and decision making of executives operating in marketing contexts. Section six concludes the article with suggestions for use of this training case and for developing additional training cases for honing executives' decision-making skills. Todd and Gigerenzer propose that humans use simple heuristics because they enable adaptive behavior by exploiting the structure of information in natural decision environments: "Simplicity is a virtue, rather than a curse". Bounded rationality theorists emphasize the centrality of Simon's proposition, "Human rational behavior is shaped by a scissors whose blades are the structure of the task environments and the computational capabilities of the actor". Gigerenzer's view is relevant to Simon's environmental blade and to the environmental structures in the three cases in this article: "The term environment, here, does not refer to a description of the total physical and biological environment, but only to that part important to an organism, given its needs and goals." The present article directs attention to research that combines reports on the structure of task environments with the use of adaptive toolbox heuristics of actors. The DSA mapping approach here concerns the match between strategy and an environment: the development and understanding of ecological rationality theory. Aspiration adaptation theory is central to this approach. Aspiration adaptation theory models decision making as a multi-goal problem without aggregation of the goals into a complete preference order over all decision alternatives. The three case studies in this article permit the learner to apply propositions in aspiration level rules in reaching a decision. Aspiration adaptation takes the form of a sequence of adjustment steps. An adjustment step shifts the current aspiration level to a neighboring point on an aspiration grid by a change in only one goal variable. 
An upward adjustment step is an increase and a downward adjustment step is a decrease of a goal variable. Creating and using aspiration adaptation levels is integral to bounded rationality theory. The present article increases understanding and expertise of both aspiration adaptation and bounded rationality theories by providing learner experiences and practice in using propositions in both theories. Practice in ranking CTSs and writing TOP gists from DSA maps serves to clarify and deepen Selten's view, "Clearly, aspiration adaptation must enter the picture as an integrated part of the search for a solution." The body of "direct research" by Mintzberg, Gladwin's ethnographic decision tree modeling, and Huff's work on mapping strategic thought are suggestions on where to look for research that considers both the structure of the environment and the computational capabilities of the actors making decisions in these environments. Such research on bounded rationality permits both further development of theory in how and why decisions are made in real life and the development of learning exercises in the use of heuristics occurring in natural environments. The exercises in the present article encourage learning skills and principles of using fast and frugal heuristics in contexts of their intended use. The exercises respond to Schank's wisdom, "In a deep sense, education isn't about knowledge or getting students to know what has happened. It is about getting them to feel what has happened. This is not easy to do. Education, as it is in schools today, is emotionless. This is a huge problem." The three cases and accompanying set of exercise questions adhere to Schank's view, "Processes are best taught by actually engaging in them, which can often mean, for mental processing, active discussion."
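
The adjustment step described in this abstract can be pictured with a small sketch. It assumes, purely for illustration, that an aspiration level is a tuple of integer grid indices, one per goal variable. The abstract does not specify a representation, and the names `adjust` and `is_neighbor` are hypothetical.

```python
from typing import Tuple

# Assumed representation: one integer grid index per goal variable.
AspirationLevel = Tuple[int, ...]


def adjust(level: AspirationLevel, goal: int, direction: int) -> AspirationLevel:
    """Shift the aspiration level to a neighboring grid point.

    Exactly one goal variable changes: direction=+1 is an upward
    adjustment step (increase), direction=-1 a downward one (decrease).
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 (upward) or -1 (downward)")
    if not 0 <= goal < len(level):
        raise IndexError("goal index out of range")
    shifted = list(level)
    shifted[goal] += direction
    return tuple(shifted)


def is_neighbor(p: AspirationLevel, q: AspirationLevel) -> bool:
    """True if q differs from p by one grid step in exactly one goal variable."""
    if len(p) != len(q):
        return False
    diffs = [abs(x - y) for x, y in zip(p, q)]
    return sum(diffs) == 1


# Three goal variables, e.g. price, volume, margin (illustrative labels only).
start = (3, 2, 5)
up = adjust(start, goal=1, direction=+1)    # upward step on the second goal
down = adjust(start, goal=0, direction=-1)  # downward step on the first goal
```

A sequence of such steps traces a path on the grid in which each move changes a single goal, so no complete preference ordering over all alternatives is needed to describe it.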
