• Title/Summary/Keyword: Data Tree


A Study on University Big Data-based Student Employment Roadmap Recommendation (대학 빅데이터 기반 학생 취업 로드맵 추천에 관한 연구)

  • Park, Sangsung
    • Journal of Korea Society of Digital Industry and Information Management / v.17 no.3 / pp.1-7 / 2021
  • The number of new students at many domestic universities is declining. Private universities, which depend heavily on tuition, face an existential crisis. Amid the shrinking school-age population, universities are trying to attract new students by improving the quality of education and raising the student employment rate. Increasingly, they use their accumulated big data to prepare such measures; a representative example is the analysis of factors that affect student employment. Existing studies of employment-influencing factors have applied quantitative models such as regression analysis to university big data. However, the factors affecting employment differ by major, and the analysis needs to reflect this. This paper analyzes the factors affecting employment by major using data from University C and a decision tree model and, based on the results, recommends a student employment roadmap for each major. In the experiment, four decision tree models were constructed for each major, and the factors affecting employment and a roadmap were derived for each major.

Extraction of the Tree Regions in Forest Areas Using LIDAR Data and Ortho-image (라이다 자료와 정사영상을 이용한 산림지역의 수목영역추출)

  • Kim, Eui Myoung
    • Journal of Korean Society for Geospatial Information Science / v.21 no.2 / pp.27-34 / 2013
  • Growing interest in global warming has increased interest in forest resources as a means of reducing greenhouse gases. Until now, forest resource data have been obtained by plotting from aerial photographs or satellite images. However, image data have a drawback: measurements such as tree height are inaccurate in dense forest areas. In this context, this study presents a data processing method that isolates individual trees in forested areas using LIDAR data and ortho-images, which yields more efficient and accurate tree height data. For the LIDAR data, a normalized digital surface model was generated and tree points were extracted by local maxima filtering; for the ortho-images, object-oriented image classification was applied to extract forest areas. The final tree points were determined by combining the LIDAR and ortho-image results. In an experiment in the Yongin area, the authors analyzed the merits and demerits of using LIDAR data or ortho-images alone and obtained information on individual trees in forested areas by combining the two data sources, thereby verifying the efficiency of the proposed method.
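The local maxima filtering step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `local_maxima`, the 3x3 window, the 2 m height threshold, and the toy grids are all assumptions made for illustration. The normalized surface model is taken here as the surface elevation minus the ground elevation, cell by cell.

```python
def local_maxima(ndsm, window=3, min_height=2.0):
    """Return (row, col, height) for cells strictly higher than all window neighbours.

    ndsm is a list of rows of heights above ground. window and min_height
    are hypothetical parameters, not values taken from the paper.
    """
    half = window // 2
    rows, cols = len(ndsm), len(ndsm[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            h = ndsm[r][c]
            if h < min_height:
                continue  # too low to count as a tree top
            neighbours = [
                ndsm[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
                if (rr, cc) != (r, c)
            ]
            if all(h > n for n in neighbours):
                peaks.append((r, c, h))
    return peaks


# Toy surface model (DSM) and flat terrain model (DTM), in metres.
dsm = [
    [101, 102, 101, 100, 100],
    [102, 108, 102, 101, 100],
    [101, 102, 101, 103, 102],
    [100, 101, 103, 112, 103],
    [100, 100, 102, 103, 102],
]
dtm = [[100] * 5 for _ in range(5)]

# Normalized DSM: height above ground for every cell.
ndsm = [[s - t for s, t in zip(s_row, t_row)] for s_row, t_row in zip(dsm, dtm)]

tree_points = local_maxima(ndsm)
print(tree_points)
```

In practice the window size would have to be tuned to the expected crown diameter and the grid resolution; a window that is too small splits one crown into several tree points, and one that is too large merges neighbouring trees.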

Podiatric Clinical Diagnosis using Decision Tree Data Mining (결정트리 데이터마이닝을 이용한 족부 임상 진단)

  • Kim, Jin-Ho;Park, In-Sik;Kim, Bong-Ok;Yang, Yoon-Seok;Won, Yong-Gwan;Kim, Jung-Ja
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.2 / pp.28-37 / 2011
  • Although growing interest in healthy living has recently drawn wide attention to podiatry, which covers the diagnosis, treatment, and prevention of foot and leg disorders, research in Korea is not active. Also, because most previous data analysis studies took quantitative approaches, they could not guarantee a level of reliability adequate for clinical application. Clinical data mining applies various data mining methods to clinical data and provides decision support for experts' diagnosis and treatment of patients. Because a decision tree explains and describes the analysis procedure well and its results are easy to interpret, it is simple to apply to clinical problems. To apply decision trees, this study collected diagnostic data on 2,620 feet of 1,310 patients (633 males, 677 females) at a shoe clinic (Department of Rehabilitation Medicine, Chungnam National University Hospital) and investigated the rules for diagnostic items by disease type. We classified 15 foot diseases according to 22 foot disease factors and derived 64 diagnostic rules. We also built decision trees for five class types (infant, child, adolescent, adult, total) and used them to analyze and compare the correlations between disease characteristics and factors by type. The results can serve clinical experts as qualitative, useful knowledge and as a tool for effective and accurate diagnosis.

RSP-DS: Real Time Sequential Patterns Analysis in Data Streams (RSP-DS: 데이터 스트림에서의 실시간 순차 패턴 분석)

  • Shin Jae-Jyn;Kim Ho-Seok;Kim Kyoung-Bae;Bae Hae-Young
    • Journal of Korea Multimedia Society / v.9 no.9 / pp.1118-1130 / 2006
  • Existing pattern analysis algorithms for data stream environments have focused on improving performance and using memory effectively. However, when new data streams arrive, these algorithms must analyze the patterns again and regenerate the pattern tree, which requires a great deal of computation in real situations that need real-time pattern analysis. This paper proposes a method that continuously analyzes the patterns of incoming data streams in real time. The method analyzes patterns quickly and thereafter obtains real-time patterns by updating the previously analyzed patterns. Incoming data streams are divided into sequences by a time-based window, and information about the sequences is entered into a hash table. When the number of sequences exceeds a predefined bound, patterns are analyzed from the hash table. The patterns form a pattern tree, and patterns created later update it, so real-time patterns are always maintained in the pattern tree. During pattern analysis, a new pattern and an existing pattern in the tree may share the same suffix; in that case, a pointer is created from the new pattern to the existing pattern, which reduces the computation time spent on duplicated pattern analysis. Old patterns in the tree are easily deleted in FIFO order. The advantage of the algorithm is demonstrated by a performance comparison with an existing method, MILE, under conditions in which patterns change continuously. We also examine how performance varies as several parameters of the algorithm are changed.


ANALYSIS OF NEIGHBOR-JOINING BASED ON BOX MODEL

  • Cho, Jin-Hwan;Joe, Do-Sang;Kim, Young-Rock
    • Journal of applied mathematics & informatics / v.25 no.1_2 / pp.455-470 / 2007
  • In phylogenetic tree construction, the neighbor-joining algorithm is the best-known method; it constructs a trivalent tree from pairwise distance data measured from DNA sequences. The core part of the algorithm is its cherry picking criterion based on the tree structure of each quartet. We give a generalized version of the criterion based on the exact box model of quartets, known as the tight span of a metric. We also show by experiment why neighbor-joining and the quartet consistency count method give similar performance.
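For orientation, the cherry picking step of standard neighbor-joining can be sketched as follows. This shows only the usual Q-criterion, Q(i, j) = (n - 2) d(i, j) - Σ_k d(i, k) - Σ_k d(j, k), and not the generalized box-model criterion the paper proposes. The function name `nj_pick_pair` and the five-taxon distance matrix are illustrative assumptions.

```python
def nj_pick_pair(dist):
    """Return ((i, j), q) for the pair minimising the standard NJ Q-criterion.

    dist is a symmetric matrix (list of lists) with a zero diagonal.
    """
    n = len(dist)
    row_sum = [sum(row) for row in dist]
    best_pair, best_q = None, float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sum[i] - row_sum[j]
            if q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair, best_q


# Hypothetical pairwise distances between five taxa (indices 0..4).
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

pair, q_value = nj_pick_pair(distances)
print(pair, q_value)  # (0, 1) -50
```

Note that the selected pair (0, 1) is not the pair with the smallest raw distance, which is (3, 4). The criterion corrects each distance by how far the two taxa are from everything else, and this is why it can recover cherries that a naive closest-pair rule would miss.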

An Improvement Video Search Method for VP-Tree by using a Trigonometric Inequality

  • Lee, Samuel Sangkon;Shishibori, Masami;Han, Chia Y.
    • Journal of Information Processing Systems / v.9 no.2 / pp.315-332 / 2013
  • This paper presents an approach for improving the use of the VP-tree in video indexing and searching. A vantage-point tree, or VP-tree, is one of the metric space-based indexing methods used in multimedia database searches and data retrieval. Instead of relying on the Euclidean distance as a measure of the search space, the proposed approach uses the trigonometric inequality to compress the search range, which improves search performance. A test on 10,000 video files shows that this method reduced the search time by 5-12% compared with the existing method that uses the AESA algorithm.
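As background, a plain VP-tree with range search can be sketched as below. The sketch prunes subtrees with the ordinary triangle inequality of a metric space; it does not reproduce the trigonometric-inequality refinement the paper proposes. The dictionary node layout, the function names, and the use of 2-D points in place of video feature vectors are assumptions for illustration.

```python
import math


def build(points, dist=math.dist):
    """Build a VP-tree; each node keeps a vantage point and a median radius mu."""
    if not points:
        return None
    vp, rest = points[0], points[1:]
    if not rest:
        return {"vp": vp, "mu": 0.0, "inner": None, "outer": None}
    ds = sorted(dist(vp, p) for p in rest)
    mu = ds[len(ds) // 2]  # median distance from the vantage point
    inner = [p for p in rest if dist(vp, p) < mu]
    outer = [p for p in rest if dist(vp, p) >= mu]
    return {"vp": vp, "mu": mu,
            "inner": build(inner, dist), "outer": build(outer, dist)}


def range_search(node, query, radius, dist=math.dist, out=None):
    """Collect every stored point within `radius` of `query`."""
    if out is None:
        out = []
    if node is None:
        return out
    d = dist(query, node["vp"])
    if d <= radius:
        out.append(node["vp"])
    # Inner points lie closer than mu to the vantage point, so they are
    # farther than d - mu from the query; skip them if that exceeds radius.
    if d - radius < node["mu"]:
        range_search(node["inner"], query, radius, dist, out)
    # Outer points lie at least mu from the vantage point, so they are at
    # least mu - d from the query; skip them if that exceeds radius.
    if d + radius >= node["mu"]:
        range_search(node["outer"], query, radius, dist, out)
    return out


# Toy data: a 5x5 grid of 2-D points standing in for feature vectors.
points = [(x, y) for x in range(5) for y in range(5)]
tree = build(points)
hits = sorted(range_search(tree, (2, 2), 1.0))
print(hits)
```

Choosing the first point as the vantage point keeps the sketch short; practical implementations usually sample candidates and pick the one that spreads distances most widely.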

Enhanced Routing Algorithm for ZigBee using a Family Set of a Destination Node (목적지의 가족집합을 이용한 향상된 ZigBee 라우팅 알고리즘)

  • Shin, Hyun-Jae;Ahn, Sae-Young;Jo, Young-Jun;An, Sun-Shin
    • The Transactions of The Korean Institute of Electrical Engineers / v.59 no.12 / pp.2329-2336 / 2010
  • Hierarchical tree routing is an inefficient method of transmitting data in a wireless sensor network. ZigBee routing, which was designed to mitigate this inefficiency, falls back to tree routing only when the destination node is not among a router's neighbor nodes. We propose the TFSR algorithm, which improves on ZigBee routing. TFSR generates a family set of the destination node, consisting of its parent node and its child nodes and beyond, and uses this information for routing. According to simulation results, TFSR reduces routing costs by more than 30 percent compared with hierarchical tree routing and ZigBee routing.

Multicast Tree to Minimize Maximum Delay in Dynamic Overlay Network

  • Lee Chae-Y.;Baek Jin-Woo
    • Proceedings of the Korean Operations and Management Science Society Conference / 2006.05a / pp.1609-1615 / 2006
  • Overlay multicast is an effective alternative to IP multicast. Traditional IP multicast is not widely deployed because of the complexity of the technology and a lack of applications, whereas overlay multicast can be deployed easily because it reduces the complexity required of network routers. Overlay multicast resides on top of a densely connected IP network, and when a multimedia streaming service runs over an overlay multicast tree, the real-time data is sensitive to end-to-end delay. Developing an algorithm suited to this network environment is therefore very important. In this paper, we are interested in minimizing the maximum end-to-end delay in an overlay multicast tree. The problem is formulated as a degree-bounded minimum delay spanning tree, which is known to be NP-hard. We develop a tabu search heuristic with intensification and diversification strategies. Robust experimental results show that the heuristic is comparable to the optimal solution and applicable in real time.


Multi-Interval Discretization of Continuous-Valued Attributes for Constructing Incremental Decision Tree (증분 의사결정 트리 구축을 위한 연속형 속성의 다구간 이산화)

  • Baek, Jun-Geol;Kim, Chang-Ouk;Kim, Sung-Shick
    • Journal of Korean Institute of Industrial Engineers / v.27 no.4 / pp.394-405 / 2001
  • Since most real-world application data involve continuous-valued attributes, properly handling discretization when constructing a decision tree is an important problem. A continuous-valued attribute is typically discretized during decision tree generation by recursively partitioning its range into two intervals. In this paper, by removing the restriction to binary discretization, we present a hybrid multi-interval discretization algorithm that discretizes the range of a continuous-valued attribute into multiple intervals. An experiment using a semiconductor etching machine verified that our discretization algorithm constructs a more efficient incremental decision tree than previously proposed discretization algorithms.
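The conventional binary discretization that this abstract takes as its starting point can be sketched as a search for the single cut point with the highest information gain. This is the baseline the paper generalizes, not its hybrid multi-interval algorithm; the function names and the toy measurements are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_binary_cut(values, labels):
    """Return (cut_point, information_gain) for the best two-interval split.

    Candidate cuts are midpoints between consecutive distinct sorted values.
    Returns (None, 0.0) when no split improves on the unsplit data.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    ys = [y for _, y in pairs]
    base = entropy(ys)
    best_cut, best_gain = None, 0.0
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        left, right = ys[:i], ys[i:]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(ys)
        gain = base - weighted
        if gain > best_gain:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_gain = gain
    return best_cut, best_gain


# Hypothetical measurements of one continuous attribute with binary class labels.
values = [1, 2, 3, 10, 11, 12]
labels = [0, 0, 0, 1, 1, 1]

cut, gain = best_binary_cut(values, labels)
print(cut, gain)
```

Applying `best_binary_cut` recursively to each side produces more intervals, but only through a sequence of two-way splits, which is the restriction the paper removes.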


Discovery and Recommendation of User Search Patterns from Web Data (웹 데이터에서의 사용자 탐색 패턴 발견 및 추천)

  • 구흠모;양재영;홍광희;최중민
    • Proceedings of the Korea Intelligent Information System Society Conference / 2002.11a / pp.287-296 / 2002
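  • Web usage mining builds on data mining and uses information in user log files to discover patterns in how the web is used. These patterns can be used to improve a website so that users find the content they want more quickly, and to give system administrators information for building an efficient web structure. The data used in web usage mining are not in a regular form and even contain noise that hinders the analysis of web usage patterns, which makes it difficult to apply the many existing data mining techniques. To address this difficulty, this paper proposes SPMiner, which introduces a new method. SPMiner is a system that uses the structure of the web to reduce log file preprocessing and to analyze users' search patterns efficiently. SPMiner uses a WebTree agent to analyze the website structure and generate a WebTree, and it analyzes user log files to extract information on how frequently each web page is used. The WebTree and the web page information extracted from the log files are merged by SPMiner into WebTree$^{+}$, a form that can be used for pattern analysis. WebTree$^{+}$ makes pattern discovery easier and makes it possible to proactively recommend information or web pages to users.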
