Search | Korea Science

Main Content Extraction from Web Pages Based on Node Characteristics

Liu, Qingtang;Shao, Mingbo;Wu, Linjing;Zhao, Gang;Fan, Guilin;Li, Jun
- Journal of Computing Science and Engineering / v.11 no.2 / pp.39-48 / 2017
Main content extraction from web pages is widely used in search engines, web content aggregation, and mobile Internet browsing. However, web pages contain a large amount of irrelevant information, such as advertisements, unrelated navigation, and junk content, which reduces the efficiency of web content processing in content-based applications. This paper proposes an automatic method for extracting the main content of web pages. The method uses two indicators to describe the characteristics of web pages: text density and hyperlink density. Because similar content tends to be distributed continuously on a page, an estimation algorithm judges whether a node is a content node or a noise node based on the characteristics of the node and its neighboring nodes. This algorithm makes it possible to filter out advertisement nodes and irrelevant navigation. Experimental results on 10 news websites show that the algorithm achieves an average acceptable rate of 96.34%.
https://doi.org/10.5626/JCSE.2017.11.2.39
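The abstract above names two indicators and a neighbor-aware decision rule but gives no formulas. The following is a minimal sketch of the idea, not the authors' algorithm: the `Node` fields, the density definitions (text characters per tag, link characters per text character), the thresholds, and the voting rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    text_chars: int  # characters of visible text under the node
    link_chars: int  # characters of that text that sit inside hyperlinks
    tag_count: int   # number of tags under the node


def text_density(n: Node) -> float:
    # Assumed definition: visible characters per tag.
    return n.text_chars / max(n.tag_count, 1)


def hyperlink_density(n: Node) -> float:
    # Assumed definition: share of the text that is link text.
    return n.link_chars / n.text_chars if n.text_chars else 1.0


def is_content(nodes, i, min_text=20.0, max_link=0.5):
    """Classify node i; its own vote counts twice, each neighbor once."""
    def looks_like_content(n):
        return text_density(n) >= min_text and hyperlink_density(n) <= max_link

    own = looks_like_content(nodes[i])
    neighbours = [nodes[j] for j in (i - 1, i + 1) if 0 <= j < len(nodes)]
    votes = 2 * own + sum(looks_like_content(n) for n in neighbours)
    return 2 * votes >= 2 + len(neighbours)


# Hypothetical page: navigation, paragraph, short paragraph, paragraph, ad.
page = [Node(30, 28, 6), Node(400, 10, 2), Node(15, 0, 1),
        Node(350, 0, 2), Node(40, 40, 4)]
labels = [is_content(page, i) for i in range(len(page))]
```

In this toy page the short third node fails the text-density threshold on its own but is kept because both neighbors look like content, which is the continuity assumption the abstract describes.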

A Study on Taxi Route Extraction Based on a Node-Link Model for Aircraft Movements on Airport Surface (노드링크 모델 기반 항공기 공항 지상이동 경로 추출 기법에 대한 연구)

Jeong, Myeongsook;Eun, Yeonju;Kim, Hyounkyoung;Jeon, Daekeun
- Journal of the Korean Society for Aviation and Aeronautics / v.25 no.3 / pp.51-60 / 2017
Estimating the taxi-out and taxi-in times of aircraft on an airport surface is one of the essential features of a Departure Manager (DMAN). Especially for an airport with multiple runways and large ramp areas, the estimated taxi-out and taxi-in times depend mainly on the taxi routes on the airport surface. This paper describes a method for automatically extracting taxi routes using ASDE track data and the Dijkstra algorithm, based on a node-link model of airport surface movements. In addition, we analyze the ground operation status of Incheon International Airport using the extracted taxi routes.
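As an illustration of how the Dijkstra algorithm operates on a node-link model, here is a minimal sketch. The node names, link lengths, and the function `shortest_taxi_route` are hypothetical; how the paper combines ASDE track data with the search is not stated in the abstract and is not modeled here.

```python
import heapq


def shortest_taxi_route(links, start, goal):
    """Dijkstra search over a node-link model.

    links maps a node to a list of (neighbor, link_length) pairs.
    Returns (route, total_length), or None if the goal is unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, length in links.get(node, []):
            nd = d + length
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if goal not in dist:
        return None
    route = [goal]
    while route[-1] != start:
        route.append(prev[route[-1]])
    return route[::-1], dist[goal]


# Hypothetical surface layout: a gate, two taxiway nodes, and a runway entry.
links = {
    "GATE": [("A", 100.0), ("B", 250.0)],
    "A": [("B", 100.0), ("RWY", 400.0)],
    "B": [("RWY", 150.0)],
}
result = shortest_taxi_route(links, "GATE", "RWY")
```

Link weights here are lengths; using expected traversal times instead would change only the numbers in `links`, not the search.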

A Development of Feature Extraction and Condition Diagnosis Algorithm for Lens Injection Molding Process (렌즈 사출성형 공정 상태 특징 추출 및 진단 알고리즘의 개발)

Baek, Dae Seong;Nam, Jung Soo;Lee, Sang Won
- Journal of the Korean Society for Precision Engineering / v.31 no.11 / pp.1031-1040 / 2014
In this paper, a new condition diagnosis algorithm for the lens injection molding process is developed using features extracted from cavity pressure, nozzle pressure, and screw position signals, with the aid of the probabilistic neural network (PNN) method. A new feature extraction method identifies five, seven, and two critical features from the cavity pressure, nozzle pressure, and screw position signals, respectively. Node energies extracted from the cavity and nozzle pressure signals by wavelet packet decomposition (WPD) are also considered. The PNN method is used to build the condition diagnosis model from the extracted features and node energies. A series of lens injection molding experiments is conducted to validate the model, and the proposed condition diagnosis model is shown to achieve high diagnosis accuracy.
https://doi.org/10.7736/KSPE.2014.31.11.1031
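To make the term "node energy" concrete, the sketch below computes the energy of each node at one level of a wavelet packet decomposition. It assumes a Haar wavelet and a signal whose length is divisible by 2 to the power of the level; the abstract does not say which wavelet or decomposition level the authors used, and the function names are illustrative.

```python
import math


def haar_split(x):
    """One orthonormal Haar step: return (approximation, detail) halves."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wpd_node_energies(signal, level):
    """Energy (sum of squares) of every node at the given WPD level.

    Unlike a plain wavelet transform, both halves are split again at each
    level. Nodes are returned in natural (filter-bank) order, which is not
    the same as frequency order.
    """
    nodes = [list(signal)]
    for _ in range(level):
        nodes = [half for node in nodes for half in haar_split(node)]
    return [sum(v * v for v in node) for node in nodes]


# A constant signal puts all of its energy into the first (lowest) node.
energies = wpd_node_energies([1.0] * 8, level=2)
```

Because the Haar step is orthonormal, the node energies at any level sum to the energy of the original signal, which makes them convenient as normalized features.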

Text-Independent Speaker Identification System Based On Vowel And Incremental Learning Neural Networks

Heo, Kwang-Seung;Lee, Dong-Wook;Sim, Kwee-Bo
- Institute of Control, Robotics and Systems: Conference Proceedings / 2003.10a / pp.1042-1045 / 2003
In this paper, we propose a speaker identification system that uses vowels, which carry speaker-specific characteristics. The system is divided into a speech feature extraction part and a speaker identification part. The speech feature extraction part extracts the speaker's features. Voiced speech has characteristics that distinguish speakers. For vowel extraction, formants obtained from voiced speech through frequency analysis are used. The vowel "a", which has distinctive formants, is extracted from the text. Pitch, formants, intensity, log area ratio, LP coefficients, and cepstral coefficients can all be used to represent these characteristics. The cepstral coefficients, which show the best performance in speaker identification among these methods, are used. The speaker identification part distinguishes speakers using a neural network. Twelfth-order cepstral coefficients are used as the training input data. The network is a multilayer perceptron (MLP) trained with backpropagation (BP). Hidden nodes and output nodes are added incrementally. The nodes in the incremental learning neural network are interconnected via weighted links, and each node in a layer is generally connected to each node in the succeeding layer, with the output nodes providing the output of the network. Through vowel extraction and incremental learning, the proposed system requires little training data, reduces training time, and improves the identification rate.

A Study on the Feature Extraction for High Speed Character Recognition -By Using Iterative Extraction and Hierarchical Formation of Directional Information- (고속 문자 인식을 위한 특징량 추출에 관한 연구 - 방향정보의 반복적 추출과 특징량의 계층성을 이용하여 -)

강선미;이기용;양윤모;김덕진
- Journal of the Korean Institute of Telematics and Electronics B / v.29B no.11 / pp.102-110 / 1992
In this paper, a new method of character recognition is proposed. It uses density information, in addition to the positional and directional information generally used, to recognize a character. Four directional feature primitives are extracted from the thinning templates, based on the observation that the outputs of the templates generally have a directional property. This makes a simple and fast feature extraction scheme possible. Features are organized in a recursive nonary tree (N-tree) that corresponds to the normalized character area. Each node of the N-tree has four directional features, each of which is the sum of the corresponding features of its nine sub-nodes. Every feature primitive from the templates is added to the corresponding leaf and then summed into the upper nodes successively. Recognition can be accomplished by using the appropriate feature level of the N-tree. The effectiveness of each node's feature vector was also tested by experiment. A method for implementing the proposed feature vector organization algorithm in hardware is proposed as well. The third-generation node, which is $4 \times 4$, is used as the unit processing element for feature extraction, and it was implemented in hardware. As a result, we observed that feature vectors can be extracted for real-time processing.
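The bottom-up summation in the N-tree can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: it assumes the leaves form a square grid whose side is a power of 3, that each leaf holds a 4-component directional feature vector, and that each parent covers a 3×3 block of children. The names `parent_level` and `build_ntree` are hypothetical.

```python
def parent_level(grid):
    """Sum each 3x3 block of 4-component feature vectors into one parent."""
    n = len(grid) // 3
    return [
        [
            tuple(
                sum(grid[3 * r + dr][3 * c + dc][k]
                    for dr in range(3) for dc in range(3))
                for k in range(4)
            )
            for c in range(n)
        ]
        for r in range(n)
    ]


def build_ntree(leaves):
    """Return the tree levels from the root (1x1) down to the leaf grid."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        levels.append(parent_level(levels[-1]))
    return levels[::-1]


# Hypothetical 9x9 leaf grid where every leaf has the same features.
leaves = [[(1, 0, 2, 0)] * 9 for _ in range(9)]
levels = build_ntree(leaves)  # levels[0] is the root, levels[-1] the leaves
```

Choosing which entry of `levels` to compare against stored templates corresponds to selecting the "feature level" mentioned in the abstract: coarser levels give shorter feature vectors and faster matching.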

Detection of Mycobacterium bovis in the lymph node of tuberculin positive cattle by guanidium isothiocyanate/silica DNA extraction and polymerase chain reaction

Cho, Yun-Sang;Jung, Suk-Chan;Yoo, Han-Sang;Kim, Jong-Man
- Korean Journal of Veterinary Service / v.30 no.2 / pp.233-241 / 2007
Tuberculin-positive cattle without gross tubercle lesions should be confirmed by bacteriological examination to determine the state of the infection. To overcome the time-consuming and laborious identification by culture and biochemical tests, polymerase chain reaction (PCR) has been used to identify Mycobacterium bovis. Because of the various lipids in the cell wall of Mycobacterium spp., novel methods of DNA extraction from Mycobacterium spp. have been developed. In this study, a newly developed guanidium isothiocyanate/silica DNA extraction method was applied directly to specimens from tuberculin-positive cattle. DNA was extracted directly from the lymph nodes, and the major polymorphic tandem repeat (MPTR) and mycobacterial protein of BCG 70 (MPB70) were amplified using PCR. The DNA extraction method using guanidium isothiocyanate/silica was efficient and safe, and the MPTR and MPB70 primers were specific to M. bovis. Therefore, MPTR and MPB70 PCRs will be useful for detecting M. bovis in the lymph nodes of skin-test-positive cattle.

Automatic Extraction of 2-Dimensional Finite Element Connectivities by Search Technique (탐색기법을 이용한 2차원 유한요소 연결관계의 자동추출)

김한수
- Journal of the Computational Structural Engineering Institute of Korea / v.13 no.3 / pp.329-336 / 2000
A method was developed for the automatic extraction of 2-dimensional finite element connectivities by searching for the shortest closed path from a given node back to the starting node. Only the best path among the possible paths was probed. The uniqueness and validity of the extracted path were examined. The proposed method was proved to be complete. Examples show that the proposed method can extract elements exactly from irregular meshes that cannot be handled easily by conventional automatic mesh generation.

Low Power Design Using the Extraction of kernels (커널 추출을 이용한 저전력설계)

이귀상;정미경
- Proceedings of the IEEK Conference / 1999.06a / pp.369-372 / 1999
In this paper, we propose a new method for power estimation at the nodes of multi-level combinational circuits and describe its application to the extraction of common expressions for low-power design. Each node is assumed to be implemented as a complex gate, and the capacitance and switching activity of the nodes are considered in the power estimation. The extraction of common expressions, which is accomplished mostly by extracting kernels, can be transformed into the rectangle covering problem. We describe how the proposed estimation method can be applied to the rectangle covering problem and present experimental results compared with those of SIS-1.2.

Improving Embedding Model for Triple Knowledge Graph Using Neighborliness Vector (인접성 벡터를 이용한 트리플 지식 그래프의 임베딩 모델 개선)

Cho, Sae-rom;Kim, Han-joon
- The Journal of Society for e-Business Studies / v.26 no.3 / pp.67-80 / 2021
The node embedding technique for learning graph representations plays an important role in obtaining good-quality results in graph mining. Until now, representative node embedding techniques have been studied for homogeneous graphs, and thus it is difficult for them to learn knowledge graphs in which each edge has its own meaning. To resolve this problem, the conventional Triple2Vec technique builds an embedding model by learning a triple graph in which a node pair and an edge of the knowledge graph together form one node. However, the Triple2Vec embedding model has limitations in improving performance because it calculates the relationship between triple nodes with a simple measure. Therefore, this paper proposes a feature extraction technique based on a graph convolutional neural network to improve the Triple2Vec embedding model. The proposed method extracts the neighborliness vector of the triple graph and learns the relationship between neighboring nodes for each node in the triple graph. We show that the embedding model applying the proposed method is superior to the existing Triple2Vec model through category classification experiments using the DBLP, DBpedia, and IMDB datasets.
https://doi.org/10.7838/jsebs.2021.26.3.067

An algorithm for the image improvement in the multi-view images coding (Multi-view 영상 코딩에서 영상 개선 알고리듬)

김도현;최동준;양영일
- Journal of the Korean Institute of Telematics and Electronics S / v.35S no.7 / pp.53-61 / 1998
In this paper, we propose an efficient multi-view image coding algorithm that finds the optimal depth and texture from a set of multi-view images. The proposed algorithm consists of two consecutive steps, i) the depth extraction step and ii) the texture extraction step, in contrast to the traditional algorithm, which finds the depth and texture concurrently. The X-Y plane of the normalized object space is divided into triangular patches; the Z value of each node is determined in the first step, and the texture of each patch is extracted in the second step. In the depth extraction step, the depth of a node is determined by applying the block-based disparity compensation method to the windowed area centered at the node. In the second step, the texture of the triangular patches is extracted from the multi-view images by applying the affine-transformation-based disparity compensation method to the triangular patches, using the depth extracted in the first step. Experimental results show that the SNR (signal-to-noise ratio) of images encoded by our algorithm is about 4 dB better than that of images encoded by the traditional algorithm for the multi-view test sets called dragon, kid, city, and santa.

