
Triangulation Based Skeletonization and Trajectory Recovery for Handwritten Character Patterns

  • Phan, Dung (University of Science-Ho Chi Minh City) ;
  • Na, In-Seop (Department of Computer Science, Chonnam National University) ;
  • Kim, Soo-Hyung (Department of Computer Science, Chonnam National University) ;
  • Lee, Guee-Sang (Department of Computer Science, Chonnam National University) ;
  • Yang, Hyung-Jeong (Department of Computer Science, Chonnam National University)
  • Received : 2014.10.11
  • Accepted : 2014.11.18
  • Published : 2015.01.31

Abstract

In this paper, we propose a novel approach for trajectory recovery. Our system uses a triangulation procedure for skeletonization and graph theory to extract the trajectory. Skeletonization extracts the polyline skeleton according to the polygonal contours of the handwritten characters; as a result, the junctions become clear and characters that are touching each other are separated. The approach for the trajectory recovery is based on graph theory to find the optimal path in the graph that best represents the trajectory. An undirected graph model consisting of one or more strokes is constructed from the polyline skeleton. By using the polyline skeleton, our approach accelerates the search for an optimal path. In order to evaluate the performance, we built our own dataset, which includes test data and the ground truth. The dataset consists of thousands of handwritten character and word images, which were extracted from five handwritten documents. To show the relative advantage of our skeletonization method, we first compare the results against those of Zhang-Suen, a widely used skeletonization method. For the trajectory recovery, we conduct a comparison using the Root Mean Square Error (RMSE) and Dynamic Time Warping (DTW) to measure the error between the ground truth and the actual output. The comparison reveals that our approach performs better in both the skeletonization stage and the trajectory recovery stage. Moreover, the processing time comparison shows that our system is faster than the existing systems.


1. Introduction

The drawing order of static handwritten characters can be recovered from images, and doing so plays an important role in a wide range of applications [1-5], including offline character recognition, character structure analysis, and the design of character fonts. To this end, many studies have investigated handwritten trajectory recovery methods over the past decades. Such algorithms are generally classified into three main approaches: local tracing, global tracing, and hybrid tracing methods.

A heuristic rule method was proposed in Ref. [6] for local tracing; this method applies heuristic rules at feature points where strokes intersect. Although a local tracing algorithm usually has a low computation cost, it is very sensitive to noise. Refs. [7-16] developed global tracing methods that improve the performance of the trajectory recovery. The recovery problem is formulated as a combinatorial optimization problem using graph theory, and such problems are then reduced to the Traveling Salesman Problem. However, this approach may lead to a computational explosion if the given handwritten image is complex.

Hybrid tracing algorithms have been studied in order to resolve the issues presented by the above methods [17-19]. This approach uses a graph theory-based algorithm that is executed in two steps. First, the graph structure is analyzed at every crossing point on the given handwritten image, and this information is evaluated against template models in order to judge the types of the corresponding edges and vertices. Then, the final drawing order is obtained from the labeling information. However, this method depends on the local structures of the given image, and the smoothest possible trajectory is not guaranteed.

Most approaches used for trajectory recovery are based on the skeleton obtained from the handwritten patterns, and skeletonization is one of the most important steps because it directly affects the trajectory results. Skeletonization operates using one of three main approaches, based on pixel elimination [20-21], modeling [22-23], or contouring [24-25]. Pixel elimination and the model-based approach are robust when used with low-resolution images; however, these two approaches can produce many spurious results in the junction area. The contour-based method is often used to handle high-resolution images. The existing contour-based methods depend on the medial axis transform (MAT), which requires a large number of sample points to obtain better skeletonization results; therefore, the complexity of the algorithm increases.

In the past few decades, many contour-based approaches that use triangulation for skeletonization have been proposed [26]. Skeletonization from contour analysis depends on the medial axis transformation, and the skeleton is obtained by searching for the local symmetry of the edges. The locus of the points of symmetry is extracted as the medial axis, and these axes are then connected at nodes to form the skeleton. The advantage of this approach is that it can generate an accurate skeleton; however, extensive sample points are required to estimate the skeleton, and the computational complexity increases.

In 2001, Melhi proposed a novel triangulation procedure for thinning handwritten text [27]. In this method, the boundaries of the words are converted into polygons, the polygons are then converted into sequences of non-overlapping triangles, and finally, each triangle is replaced by its skeleton to form a fully connected word skeleton. If invariance to rotation and translation is necessary, it can be achieved by rotating each connected component of a word to its principal horizontal direction; the chain encoding of the boundary then starts from the left intersection with the principal axis. The method offers several advantages relative to alternative techniques. Since the operation is based on word boundaries, the method's running time does not increase rapidly as the resolution of the images increases, unlike methods that iteratively remove the outer layers of pixels from the pattern object. Although this method was created specifically for cursive handwriting, it might also be useful for the skeletonization of other line-like patterns, such as those found in chromosomes, fingerprints, and line drawings. In 2007, Lam and Yam proposed a skeletonization technique based on Delaunay triangulation and piecewise Bezier interpolation [28]. The objective of this study was to develop an efficient skeletonization method applicable to high-resolution planar shapes. The algorithm is a medial axis transformation approach that assures skeletal accuracy. Instead of linear interpolation, a Bezier curve is used to connect the inconsecutive sample points; the Bezier curve can approximate the gradient change of the symmetry point locus. Therefore, the method is applicable to skeleton gradient-dependent applications.

In 2007, Morrison and Zou used triangle refinement through a constrained Delaunay triangulation (CDT) to improve the skeletonization method [29], and in 2011, Bag and Harit [30] proposed another improvement. The results show a vast improvement from the refinement algorithm compared with the unmodified CDT technique, although skeletonizing an image with the refinement algorithm took roughly twice as much time as with the unmodified CDT technique. In order to improve the performance and to overcome the current drawbacks of skeletonization and handwritten trajectory recovery, we propose a novel approach for handwritten skeletonization and trajectory recovery based on triangulation and graph theory. Our system is composed of two stages: handwritten skeletonization and trajectory recovery. In our approach, we first propose a skeletonization method based on the polygonal contours of the handwritten character. In order to produce cleaner and smoother contours, the handwritten contours are represented using a polygonal contour approximation with a line-fitting algorithm. Then, Delaunay triangulation is used to generate the triangle mesh of the handwritten characters. From the triangle mesh, the skeleton is estimated based on a moving average of the triangles on the mesh. This method not only makes the junctions clear, but it also separates characters that are simply touching, resulting in a polyline skeleton.

The second stage is used for handwritten trajectory recovery. The approach for this algorithm is based on graph theory, and it can be used to find the optimal path in the graph that best represents the trajectory. An undirected graph model is constructed from the polyline skeleton of the handwritten character, which consists of one or more strokes. Then, the graph is decomposed into single strokes by searching for the optimal path. The smoothest path with the lowest cost on the graph represents the final trajectory of the handwritten character, and by using the polyline skeleton, our approach accelerates the optimal path search step. The trajectory of the handwritten character can be estimated more accurately by using this new approach for skeletonization and by combining the geometric characteristics into a graph theory-based approach, which is very useful for the recognition task.

We built our own dataset to evaluate the performance of the proposed method, and the dataset includes testing samples and the ground truth. The dataset contains thousands of handwritten character and word images that were extracted from five handwritten documents. For our evaluation, we first compared our skeletonization method against the Zhang-Suen method. The relative improvement of our method was expressed as an improvement in the match between the restored image and the input image. In order to evaluate the trajectory recovery, we made a comparison using two measurements: the Root Mean Square Error (RMSE) and Dynamic Time Warping (DTW). The comparison indicated that our approach performed better in both the skeletonization stage and the trajectory recovery stage. Moreover, a comparison of the processing time indicates that our system is faster than the system against which it was compared.

The rest of this paper is organized as follows. In Section 2, we discuss the background related to our work. Our proposed system is presented in Section 3, and Section 4 presents the results of the performance evaluation. In the final section, we discuss the conclusions and our future work.


2. Background and Related Works

2.1 Delaunay Triangulation

In two dimensions, a triangulation of a set V of vertices is a set T of triangles whose vertices are collectively V, whose interiors do not intersect each other, and whose union is the convex hull of V, provided that every triangle intersects V only at the triangle's vertices. The Delaunay triangulation D of V, introduced by Delaunay in 1934, is a graph defined as follows. Any circle in the plane is said to be empty if it encloses no vertex of V. (Vertices are permitted on the circle itself.) Let u and v be any two vertices of V. A circumcircle (circumscribing circle) of the edge uv is any circle that passes through u and v. Any edge has an infinite number of circumcircles, and the edge uv is in D if and only if there exists an empty circumcircle of uv. An edge that satisfies this property is said to be Delaunay. The Delaunay triangulation of a vertex set is clearly unique, because the definition specifies an unambiguous test for the presence or absence of an edge in the triangulation. Every edge that lies on the boundary of the convex hull of a vertex set is Delaunay: for any convex hull edge e, it is always possible to find an empty circumcircle of e by starting with the smallest circumcircle of e and "growing" it away from the triangulation. Every edge connecting a vertex to its nearest neighbor is also Delaunay: if w is the vertex nearest v, the smallest circle passing through v and w does not enclose any vertices.

It is not at all obvious that the set of Delaunay edges of a vertex set collectively forms a triangulation. By the definition given above, the Delaunay triangulation is guaranteed to be a triangulation only if the vertices of V are in general position, here meaning that no four vertices of V lie on a common circle. The first step in proving this claim is to describe the notion of a Delaunay triangle. The circumcircle of a triangle is the unique circle that passes through all three of its vertices. A triangle is said to be Delaunay if and only if its circumcircle is empty.
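To make the empty-circumcircle test concrete, the following Python sketch (assuming SciPy is available) triangulates a random point set with scipy.spatial.Delaunay and verifies numerically that no triangle's circumcircle strictly encloses another vertex; the circumcircle helper is our own, not a library call.

```python
# Illustrative sketch: compute a Delaunay triangulation with SciPy and
# verify the empty-circumcircle property for each triangle.
import numpy as np
from scipy.spatial import Delaunay

def circumcircle(a, b, c):
    """Return (center, radius) of the circle through points a, b, c."""
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    center = np.array([ux, uy])
    return center, np.linalg.norm(center - a)

rng = np.random.default_rng(0)
points = rng.random((20, 2))
tri = Delaunay(points)

for simplex in tri.simplices:
    center, radius = circumcircle(*points[simplex])
    dists = np.linalg.norm(points - center, axis=1)
    others = np.setdiff1d(np.arange(len(points)), simplex)
    # Delaunay property: no other vertex strictly inside the circumcircle.
    assert np.all(dists[others] >= radius - 1e-9)
```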

2.2 Dynamic Time Warping (DTW) for Time Series Matching

A distance measurement between time series is needed in order to determine their similarity and to classify them. The Euclidean distance is an efficient distance measurement that can be used: the Euclidean distance between two time series is simply the sum of the squared distances from each nth point in one time series to the nth point in the other. The main disadvantage of using the Euclidean distance for time series data is that its results can be very unintuitive. If two time series are identical, but one is shifted slightly along the time axis, then the Euclidean distance may report them as being very different from each other. Dynamic time warping (DTW) was therefore introduced [34] to overcome this limitation; it provides intuitive distance measurements between time series by ignoring both global and local shifts in the time dimension. Fig. 1 illustrates the matching of a query and a reference using a normal distance comparison, which provides one-to-one matching, and a DTW-based method, which provides one-to-many matching.

Fig. 1. Illustration of matching: (a) one-to-one matching; (b) one-to-many matching using DTW. DTW looks for the minimum-distance mapping between the query and the reference.

2.2.1 Problem Formulation

The dynamic time warping problem is stated as follows. Given two time series X and Y, with lengths |X| and |Y|,

X = x1, x2, ..., xi, ..., x|X|,  Y = y1, y2, ..., yj, ..., y|Y|    (1)

construct a warp path W, defined in Equation (2) as

W = w1, w2, ..., wK,  max(|X|, |Y|) ≤ K < |X| + |Y|    (2)

where K is the length of the warp path and the kth element of the warp path is wk = (i,j), where i is an index from time series X, and j is an index from time series Y. The warp path must start at the beginning of each time series at w1 = (1,1) and must finish at the end of both time series at wK = (|X|,|Y|). This ensures that every index of both time series is used in the warp path. There is also a constraint that forces i and j to increase monotonically in the warp path, which is why the lines representing the warp path in Fig. 1 do not overlap. Every index of each time series must be used, which can be stated in Eq. (3) as

wk = (i, j), wk+1 = (i', j'),  where i ≤ i' ≤ i + 1 and j ≤ j' ≤ j + 1    (3)

The optimal warp path is the one with the minimum distance, where the distance of a warp path W is defined in Equation (4) as

Dist(W) = Σ k=1..K Dist(wki, wkj)    (4)

where Dist(W) is the distance (typically the Euclidean distance) of warp path W, and Dist(wki, wkj) is the distance between the two data point indices (one from X and one from Y) in the kth element of the warp path.

2.2.2 DTW Algorithm

A dynamic programming approach is used to find this minimum-distance warp path. Instead of attempting to solve the entire problem at once, solutions to sub-problems (portions of the time series) are found and used repeatedly to solve slightly larger problems until the solution is found for the entire time series. A two-dimensional |X| by |Y| cost matrix D is constructed, where the value at D(i,j) is the minimum warp distance between the prefixes X' = x1, x2, ..., xi and Y' = y1, y2, ..., yj. The value at D(|X|,|Y|) will contain the minimum warp distance between the time series X and Y, and both axes of D represent time: the x-axis is the time of time series X, and the y-axis is the time of time series Y.

To find the minimum-distance warp path, every cell of the cost matrix must be filled. The rationale behind using a dynamic programming approach for this problem is the following. Since the value at D(i,j) is the minimum warp distance of two time series with lengths i and j, if the minimum warp distances are already known for all slightly smaller portions of those time series that are a single data point away from lengths i and j, then the value at D(i,j) is the minimum distance of all possible warp paths for the time series that are one data point smaller than i and j, plus the distance between the two points xi and yj. Since the warp path must either increase by one or stay the same along the i and j axes, the distances of the optimal warp paths one data point smaller than lengths i and j are contained in the matrix at D(i-1, j), D(i, j-1), and D(i-1, j-1). So the value of a cell in the cost matrix is defined in Equation (5) as

D(i, j) = Dist(xi, yj) + min[D(i-1, j), D(i, j-1), D(i-1, j-1)]    (5)

After the entire cost matrix is filled, the warp path must be found from D(1,1) to D(|X|,|Y|). The warp path is actually calculated in reverse order, starting at D(|X|,|Y|). A greedy search evaluates the cells to the left, below, and diagonally to the bottom-left. Whichever of these three adjacent cells has the smallest value is added to the beginning of the warp path found so far, and the search continues from that cell. The search stops when D(1,1) is reached.
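As a minimal sketch of this procedure (not the authors' implementation), the following Python function fills the cost matrix per Equation (5), using a one-dimensional absolute difference as the point distance, and then backtracks greedily to recover the warp path:

```python
# A minimal DTW sketch following Equations (4)-(5): fill the cost matrix
# with dynamic programming, then backtrack greedily from D(|X|,|Y|).
import numpy as np

def dtw(x, y):
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)   # padded with inf for boundaries
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])           # pointwise distance
            D[i, j] = cost + min(D[i - 1, j],         # step in x only
                                 D[i, j - 1],         # step in y only
                                 D[i - 1, j - 1])     # step in both
    # Backtrack from the top corner to recover the warp path.
    path, i, j = [], n, m
    while i > 1 or j > 1:
        path.append((i, j))
        k = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if k == 0:
            i, j = i - 1, j - 1
        elif k == 1:
            i -= 1
        else:
            j -= 1
    path.append((1, 1))
    return D[n, m], path[::-1]

dist, path = dtw([1, 2, 3, 4, 3], [1, 1, 2, 3, 4, 4, 3])
print(dist, path)
```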


3. Proposed Method

Our system consists of two main stages: handwritten skeletonization and trajectory recovery. In the first stage, we propose a skeletonization method based on the polygonal contours of the handwritten character. In order to make the contours cleaner and smoother, the contours of the handwritten character are represented by polygonal contours approximated using a line-fitting algorithm. Then, Delaunay triangulation is used to generate the triangle mesh for the handwritten character. From the triangle mesh, the skeleton is estimated according to a moving average of the triangles on the mesh, and this method generates the polyline skeleton. The second stage involves handwritten trajectory recovery. The approach for this algorithm is based on graph theory, which finds the optimal path in the graph with the best representation of the trajectory. An undirected graph model is then constructed from the polyline skeleton of the handwritten character, which consists of one or more strokes. Then, the graph is decomposed into single strokes by searching for the optimal path. The final trajectory of the handwritten character is represented by the smoothest path on the graph with the minimum cost. The general flowchart of the system is presented in Fig. 2.

Fig. 2. Flowchart of the Proposed System

3.1 Triangulation Procedure based Skeletonization

Our proposed skeletonization method has four main steps: contour extraction, polygonal contour approximation, Delaunay Triangulation-based Triangular Mesh Generation, and polyline skeleton estimation.

3.1.1 Contours Estimation and Polygonal Contours Approximation

This step references Dilip K. Prasad’s work [31, 32]. He proposed an analytical formula that can be used to choose the optimal threshold value for the line-fitting algorithm, such that the user does not need to prescribe a control parameter, and the line fitting method can automatically choose the threshold value depending on the digital curve itself. Fig. 3 shows pseudo-code of Prasad’s method, and as an example, the approximation of the polygonal contours is shown in Fig. 4.

Fig. 3. Pseudo-code of the Polygonal Contour Approximation Algorithm [31]

Fig. 4. Example of Polygonal Contours of a Handwritten Character
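As a rough illustration of this step: Prasad's line fitting [31, 32] chooses its threshold automatically, which we do not reproduce here. The sketch below substitutes OpenCV's Douglas-Peucker approximation (cv2.approxPolyDP) with an epsilon proportional to the contour length; the eps_ratio parameter is therefore a stand-in that Prasad's parameter-free method would not need.

```python
# Sketch of the contour extraction and polygonal approximation step,
# using OpenCV's Douglas-Peucker as a stand-in for Prasad's line fitting.
import cv2
import numpy as np

def polygonal_contours(binary_img, eps_ratio=0.01):
    """binary_img: uint8 binary image (0/255).
    Returns a list of polygons (N x 2 arrays) approximating each contour."""
    # RETR_CCOMP keeps both outer contours and hole contours of characters.
    contours, _ = cv2.findContours(binary_img, cv2.RETR_CCOMP,
                                   cv2.CHAIN_APPROX_SIMPLE)
    polygons = []
    for c in contours:
        eps = eps_ratio * cv2.arcLength(c, True)   # tolerance ~ perimeter
        poly = cv2.approxPolyDP(c, eps, True)
        polygons.append(poly.reshape(-1, 2))
    return polygons
```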

3.1.2 Triangular Mesh Generation and Constraints for Delaunay Triangulation

The Delaunay triangulation (DT) is a method that converts a set of Cartesian coordinates into a triangular mesh. It associates a circumcircle with each triangle, such that each circle passes through three vertices; three mutually connected vertices form a triangle, provided that no other vertex lies inside their circumcircle.

A 2D Delaunay triangulation of a set of points ensures that the circumcircle associated with each triangle contains no other points in its interior. In 2D triangulations, we can impose edge constraints between pairs of points. In our problem, we define the constraints for the DT based on the polygonal boundary. In order to produce a fine triangle mesh, we partition the polygon edges into small edges based on the stroke width value. The stroke width is estimated using the method introduced in Ref. [35], and an example of this step is shown in Fig. 5.

Fig. 5. Example of triangle mesh generation: (a) Red lines define the constraints for the Delaunay triangulation; (b) Triangle mesh on the 'A' character.
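A minimal sketch of the mesh step, under simplifying assumptions: SciPy provides no constrained Delaunay triangulation, so the code densifies the polygon boundary according to the stroke width, triangulates the boundary points, and keeps only the triangles whose centroids fall inside the character polygon (a single outer polygon is assumed); a true CDT library could replace this filtering.

```python
# Sketch: stroke-width-based boundary densification plus a filtered
# Delaunay triangulation, approximating the constrained mesh.
import numpy as np
from matplotlib.path import Path
from scipy.spatial import Delaunay

def densify(polygon, step):
    """Split each polygon edge into pieces no longer than `step`."""
    pts = []
    for p, q in zip(polygon, np.roll(polygon, -1, axis=0)):
        n = max(1, int(np.ceil(np.linalg.norm(q - p) / step)))
        for t in np.arange(n) / n:       # includes p, excludes q
            pts.append((1 - t) * p + t * q)
    return np.array(pts)

def character_mesh(polygon, stroke_width):
    pts = densify(np.asarray(polygon, dtype=float), stroke_width)
    tri = Delaunay(pts)
    centroids = pts[tri.simplices].mean(axis=1)
    inside = Path(polygon).contains_points(centroids)
    return pts, tri.simplices[inside]    # triangles inside the character
```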

3.1.3 Polyline Skeleton Estimation

To estimate the polyline skeleton, we first classify the triangles into three classes: terminal, connection, and junction. A terminal triangle contains two border edges that lie on the polygonal boundary. A triangle that contains one border edge is detected as a connection triangle. A junction triangle is a triangle that does not contain any border edges. See Fig. 6 for an illustration of the triangle classification. The estimated skeleton is a polyline composed of the line segments estimated for each triangle.

Fig. 6. Three types of triangles: Thick lines indicate the edges of the polygon boundary.
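The classification itself reduces to counting border edges per triangle, as in this sketch (the boundary_edges representation is our own choice):

```python
# Classify each mesh triangle by the number of its edges that lie on the
# polygon boundary: 2 -> terminal, 1 -> connection, 0 -> junction.
def classify_triangles(triangles, boundary_edges):
    """triangles: iterable of (i, j, k) vertex-index triples.
    boundary_edges: set of frozenset({i, j}) pairs on the polygon."""
    labels = []
    for i, j, k in triangles:
        n = sum(frozenset(e) in boundary_edges
                for e in ((i, j), (j, k), (k, i)))
        # n == 3 would be an isolated triangle; we treat it as terminal.
        labels.append("junction" if n == 0 else
                      "connection" if n == 1 else "terminal")
    return labels
```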

(1) Skeleton Estimation for Terminal and Connection Triangles

We consider the terminal triangles to be spurious segments, and they are excluded from the skeleton. However, they are used to estimate the line segments at an adjacent junction triangle. For the connection triangles, the line segments are estimated as a moving average (see Fig. 7).

Fig. 7. Line segment estimation for the connection triangle: The red line is the estimated line segment.

(2) Skeleton Estimation for Junction Triangles

The most important step in our method is to estimate the line segments for the junction triangles. A handwritten character often has a junction of two lines; however, a junction of three, four, or more lines can also appear. First, we solve the problem at a junction of two lines. There are two types of two-line junctions (Fig. 8). For the first type, we search for the pair of line segments [blue lines in Fig. 8(a)] with the smallest direction change. The connection of these two line segments is the estimated line segment, and the remaining line is extended forward to meet the estimated line segment.

Fig. 8. (a) Two Types of Junction Triangles; (b) Final Polyline Skeleton.

The second problem is the estimation of the line segments for a junction of more than two lines. In this case, many junction triangles are adjacent, so we find the convex polygon that contains these adjacent junction triangles. Then, the circumcenter of the convex polygon is extracted, and the estimated line segments connect the middle of each edge of the convex polygon to the circumcenter. Fig. 8(b) shows an example of the final skeleton.

3.2 Trajectory Recovery for Handwritten Characters

Our approach for this step is based on graph theory and finds the optimal path in the graph with the best representation of the trajectory. An undirected graph model is constructed from the polyline skeleton of the handwritten character, which consists of one or more strokes. Then, the graph is decomposed into single strokes by searching for the optimal path. The final trajectory of the handwritten character is represented by the smoothest path on the graph with the minimum cost. By using the polyline skeleton of the handwritten character, our approach accelerates the optimal path search.

3.2.1 Assumption for Trajectory Recovery

In our trajectory recovery approach, we assume that there is no double trace in the handwritten trajectory; therefore, multiple strokes can be decomposed through an optimal path search. Another assumption, used for detecting the start point and the end point, is that the handwritten trajectory starts from left to right without any redundant trace. For the junction process, our proposed method handles junctions well when a limited number of strokes intersect at a junction.

3.2.2 Graph Representation of the Handwritten Skeleton

We define an undirected graph G = (V, E), where V = {v1, v2, ..., vn} indicates a set of vertices and E = {e1, e2, ..., em} indicates a set of edges. Each vertex vi ∈ V, i = 1, 2, ..., n is a geometric feature point that corresponds to a terminal point or an intersection of stroke segments. Each edge ei ∈ E, i = 1, 2, ..., m corresponds to a segment of a stroke in the handwritten skeleton.

From the polyline skeleton, we can easily construct a graph representation: the edges of the graph are the polyline segments, and the vertices are the points of the polyline. The sets of vertices and edges can then be normalized by removing unnecessary vertices with a degree of two, as shown in the sketch below.
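A sketch of this construction, assuming the skeleton is given as a list of two-point segments; the vertex keys and helper names are ours:

```python
# Sketch: build an undirected graph from polyline segments, then merge
# chains through degree-2 vertices so only terminals/junctions remain.
from collections import defaultdict

def build_graph(segments):
    """segments: iterable of ((x1, y1), (x2, y2)) polyline pieces."""
    adj = defaultdict(set)
    for p, q in segments:
        adj[p].add(q)
        adj[q].add(p)
    return adj

def simplify(adj):
    """Remove degree-2 vertices by joining their two neighbours
    (a sketch for open strokes; closed loops need extra care)."""
    for v in list(adj):
        if len(adj[v]) != 2:
            continue
        a, b = adj[v]
        adj[a].discard(v); adj[b].discard(v)
        adj[a].add(b); adj[b].add(a)
        del adj[v]
    return adj
```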

3.2.3 Optimal Path Search For Trajectory Recovery

Whenever we search for an optimal path in the graph, we need to identify the vertex from which the trace starts, and identifying it correctly is important for finding the optimal path. Human writing frequently starts at a new point, which means that the start point should be a vertex with a degree of one. However, a vertex of degree one may be either a start or an end point, and the start point may also touch other strokes.

To select the start point, we look for candidates among all vertices with an odd degree in the graph. The start point is the leftmost vertex in the list of odd-degree vertices. When other odd-degree vertices lie near the start point candidate, we consider the positions of these points in order to select the best candidate. After the start point is detected, the optimal path search is performed in order to obtain the final trajectory.

From the start point, we begin to search for the optimal path in the graph. The search algorithm is known as the Basic Trace Algorithm (BTA). Given a start vertex and one of the edges connected to it, we search for the candidate adjacent edge that has the minimum cost value, which represents the direction difference between the current edge and the adjacent edge. However, the stroke segments in a handwritten character are usually curved, which may cause a mismatch between the cost value and the smoothness at the connection of the two edges if we use the entire edge for the cost calculation. Therefore, we first estimate a unit vector that indicates the local direction of an edge. The cost value between e1 and e2 is defined in Equation (6) as

where e1 and e2 indicate edges in the graph, and v1 = (x1, x2) and v2 = (y1, y2) indicate the unit direction vectors corresponding to e1 and e2.
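Since the exact form of Equation (6) is not reproduced above, the sketch below uses one natural choice consistent with the text: unit direction vectors estimated from the last few polyline points of each edge, and a cost of 1 - v1·v2, which is zero for a straight continuation and maximal for a full reversal.

```python
# Sketch of the edge-transition cost: small cost means the stroke
# continues smoothly through the shared vertex. The exact Equation (6)
# may differ; 1 - cos(angle) is one natural direction-difference measure.
import numpy as np

def unit_direction(edge_points, k=5):
    """Local direction of an edge near its endpoint, taken over the
    last k polyline points to be robust to curvature."""
    pts = np.asarray(edge_points[-k:], dtype=float)
    d = pts[-1] - pts[0]
    return d / np.linalg.norm(d)

def transition_cost(v1, v2):
    # v1 points into the shared vertex, v2 out of it; the cost is 0 for
    # a straight continuation and grows to 2 for a full reversal.
    return 1.0 - float(np.dot(v1, v2))
```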

3.2.4 Multi-stroke decomposition through optimal path searching

Based on the optimal path search, a trace (single stroke) is extracted from the graph until an odd-degree vertex is reached. Then, the start point is updated from the list of remaining vertices in the graph. The optimal path search is repeated until all of the edges in the graph have been passed. Fig. 9 shows a result of the trajectory recovery with multi-stroke decomposition, and a sketch of this loop follows below.

Fig. 9. Example of the decomposition of multiple strokes: each color marks a separate stroke.
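The decomposition loop can be sketched as follows; this is our reading of the procedure (the data structures and the cost callback are assumptions), not the authors' implementation:

```python
# Sketch of the multi-stroke decomposition: repeatedly trace the
# smoothest path (BTA) from the leftmost odd-degree vertex until every
# edge of the graph has been consumed.
def _vertices(edges):
    return {v for e in edges for v in e}

def decompose(adj, positions, cost):
    """adj: vertex -> set of neighbours; positions: vertex -> (x, y);
    cost(u, v, w): direction-change cost of turning u -> v -> w."""
    remaining = {frozenset((u, v)) for u in adj for v in adj[u]}
    strokes = []
    while remaining:
        # Odd-degree vertices (w.r.t. untraced edges) are stroke ends.
        odd = [v for v in _vertices(remaining)
               if sum(frozenset((v, w)) in remaining for w in adj[v]) % 2]
        start = min(odd or _vertices(remaining),
                    key=lambda v: positions[v][0])        # leftmost
        stroke, prev, v = [start], None, start
        while True:
            cands = [w for w in adj[v] if frozenset((v, w)) in remaining]
            if not cands:
                break
            w = min(cands, key=lambda c: 0.0 if prev is None
                    else cost(prev, v, c))                # smoothest turn
            remaining.discard(frozenset((v, w)))
            stroke.append(w)
            prev, v = v, w
        strokes.append(stroke)
    return strokes
```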


4. Performance Evaluation

4.1 System Environment and Database

Our system was implemented in MATLAB R2012b on a workstation configured with an Intel® Core™ i3-3220 CPU @ 3.30 GHz, 8 GB RAM, and Windows 8.1 64-bit. Our database was generated from five online documents in XML format, handwritten by five people. The online handwritten documents contain a list of point sequences for each stroke of the characters/words in the whole document. In order to evaluate our system, we generated two types of test images from these documents: single-stroke images and multi-stroke (character/word) images. The test images were restored from the point sequences of the online dataset. The first dataset includes 2,180 single-stroke images, and the second dataset includes 1,640 multi-stroke images, which can be either handwritten characters or words. A ground-truth dataset was extracted corresponding to each test image.

4.2 Method for Performance Evaluation

(1) Precision, Recall, and Accuracy for Skeleton Evaluation

In order to measure the accuracy and the error of the output skeleton, we propose a procedure that consists of the following three steps: stroke width estimation [35]; image reconstruction from the polyline skeleton with the estimated stroke width; and calculation of the precision, recall, and accuracy from the input image and the reconstructed image. The following formulas define the precision, recall, and accuracy [Equations (7), (8), and (9)]:

precision = tp / (tp + fp)    (7)

recall = tp / (tp + fn)    (8)

accuracy = (tp + tn) / (tp + tn + fp + fn)    (9)

where tp, tn, fp, and fn correspond to the true positives, true negatives, false positives, and false negatives obtained by comparing the restored image I (test) with the given input image J (ground truth).
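As a concrete illustration, these pixel-level scores can be computed directly from the two binary images; the function below is a minimal sketch (array names are ours):

```python
# Pixel-level precision, recall, and accuracy of a restored image I
# against the input image J, per Equations (7)-(9).
import numpy as np

def skeleton_scores(I, J):
    """I, J: boolean NumPy arrays of equal shape (restored, ground truth)."""
    tp = np.sum(I & J)     # stroke pixels present in both images
    tn = np.sum(~I & ~J)   # background pixels in both images
    fp = np.sum(I & ~J)    # spurious stroke pixels in the restored image
    fn = np.sum(~I & J)    # stroke pixels the restoration missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, accuracy
```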

(2) Root Mean Square Error (RMSE) for Handwritten Trajectory Matching

The RMSE is a measure of precision, and it is used to determine the accuracy of the transformation from one coordinate system to another. It is the difference between the desired output coordinates from the ground truth of the trajectory and the actual ones. The RMSE of the point sequences is defined in Equation (10) as

RMSE_i = sqrt( (1 / li) Σ t=1..li [ (xt - x̂t)² + (yt - ŷt)² ] )    (10)

where xt and yt are the actual coordinates of handwritten sample i at time t; x̂t and ŷt are the predicted coordinates of handwritten sample i at time t; and li is the length of handwritten sample i.
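For equal-length point sequences, Equation (10) can be computed as in this short sketch:

```python
# RMSE between a recovered trajectory and its ground truth (Eq. (10));
# both inputs are (l_i x 2) arrays of (x, y) points of equal length.
import numpy as np

def trajectory_rmse(pred, truth):
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    # Mean of the squared point-to-point distances, then the square root.
    return float(np.sqrt(np.mean(np.sum((pred - truth) ** 2, axis=1))))
```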

(3) Dynamic Time Warping (DTW) for Handwritten Trajectory Matching

In general, DTW is a method used to calculate the optimal match between two given sequences (e.g., time series) under certain restrictions. The sequences are "warped" non-linearly in the time dimension in order to determine a measure of their similarity that is independent of certain non-linear variations in the time dimension. The details of this method were presented in Section 2.2.

In our system evaluation, we apply the DTW method to the trajectory evaluation in order to find the optimal matching between the ground truth point sequences and the real-output stroke point sequences of our system. The DTW measurement provides the minimum error cost possible for the two point sequences. We apply the DTW on both the x-coordinate and y-coordinate to obtain the final error cost between the two point sequences.

4.3 Evaluation of the Skeletonization Method

In order to evaluate our skeletonization method, the datasets were also processed with the Zhang-Suen method, a widely used skeletonization method. We estimate the precision, recall, and accuracy values, as presented in Section 4.2(1). We also evaluate the processing time in order to show that our algorithm is fast, even though our system spends time on polygonal contour estimation. According to the evaluation and comparison reported in Table 1, our proposed method shows a competitive result compared against the Zhang-Suen method. Moreover, the comparison in Fig. 10 shows that our proposed method is more robust in cases of junction distortion. In the Zhang-Suen result, the skeletons around the junctions of the 'A', 't', and 'e' characters are distorted, and these kinds of distorted skeletons may mislead the trajectory recovery process. In contrast, our result provides a good skeleton, which is extremely useful for the subsequent trajectory recovery step.

Table 1. Comparison of Skeletonization Methods

Fig. 10. Experimental Result Comparison: (a) Binary Image; (b) Result of the Zhang-Suen Method, where the skeletons of the 'A', 't', and 'e' characters are distorted around the junctions; (c) Result of Our Method, where the correct skeletons around the junctions of these characters are estimated.

4.4 Evaluation of Trajectory Recovery

In order to evaluate the trajectory recovery, we developed two systems for comparison. The first system uses our proposed method, which includes the proposed polyline skeleton estimation and the proposed handwritten trajectory recovery methods. In the second system, we re-implemented the graph-based trajectory recovery method using the output of the Zhang-Suen skeletonization method; we denote it as the 'Zhang + Graph' approach. The two error measures explained in Sections 4.2(2) and 4.2(3) are used to evaluate these two systems.

We calculate the average error of these two systems on the two datasets, i.e., the single-stroke and multi-stroke datasets. We also measure the processing time of each system in order to make a comparison. The results for these two systems are presented in Tables 2 and 3: Table 2 shows the average errors and the processing time per image, and Table 3 shows the average errors per pixel, which provides more detailed information on the errors.

Table 2. Comparison Evaluation: Average Errors and Processing Time per Image

Table 3. Comparison Evaluation: Average Errors per Pixel

The results indicate that our proposed system performs better, with smaller errors in both error measurements, RMSE and DTW. Our system had fewer errors than the 'Zhang + Graph' approach. Moreover, our proposed system is much faster: the average processing time of our system is 405 ms per image, while the 'Zhang + Graph' approach takes 930 ms. In order to make the observations clearer, we summarize the evaluation in a chart in Fig. 11. Finally, Table 4 presents a summary of the evaluation over all of the datasets. In the end, trajectory evaluation poses many challenges because human writing is not unique; therefore, it is not easy to recover the exact handwriting in some cases. Fig. 12 shows some examples of the handwriting trajectory recovery performed by our proposed system.

Fig. 11. Performance Evaluation Comparison for All Datasets

Table 4. Performance Evaluation for All Datasets

Fig. 12. Experimental Results: (a) Zhang+Graph Based Method and (b) Our Proposed Method


5. Conclusion

In this paper, we proposed a novel approach for handwriting skeletonization and trajectory recovery based on triangulation and graph theory. In our approach, we first proposed a skeletonization method that is based on the polygonal contours of the handwritten character and a triangulation procedure. This method not only makes the junctions clear, but also separates characters that are simply touching, and it generates a polyline skeleton. The second stage recovers the handwritten trajectory. The approach used in this algorithm is based on graph theory and finds the optimal path in the graph with the best representation of the trajectory. The final trajectory of the handwritten character is represented by the smoothest path on the graph with the minimum cost value. By using the polyline skeleton of the handwritten character, our approach accelerates the optimal path search. By using this new approach for skeletonization and combining the geometric characteristics into the graph theory approach, the trajectory of the handwritten character can be estimated more accurately, which is very useful for the recognition task. Moreover, the polyline skeleton leads the trajectory recovery toward a trajectory close to that of human writing. The performance evaluation of the skeletonization and trajectory recovery with the two error measures quantifies the errors of our system and provides a comparison with other methods that shows the power of our proposed system.

References

  1. Plamondon, R. and S.N. Srihari, "Online and off-line handwriting recognition: a comprehensive survey," PAMI, IEEE Trans. on, 22(1), 63-84, 2000. https://doi.org/10.1109/34.824821
  2. Srihari, S.N. and E.J. Kuebert, "Integration of hand-written address interpretation technology into the United States Postal Service Remote Computer Reader system," in Proc. of 4th ICDAR, IEEE Press, Ulm, Germany, 2, 892-896, 1997.
  3. Viard-Gaudin, C., P.-M. Lallican, and S. Knerr, "Recognition directed recovering of temporal information from handwriting images," Pattern Recognition Letters, 26, 2537-2548, 2005. https://doi.org/10.1016/j.patrec.2005.04.019
  4. Plamondon, R. and C.M. Privitera, "The segmentation of cursive handwriting: an approach based on off-line recovery of the motor-temporal information," Image Processing, IEEE Transactions on, 8(1), 80-91, 1999.
  5. Babcock, M.K. and J.J. Freyd, "Perception of Dynamic Information in Static Handwritten Forms," American Journal of Psychology, 101(1), 111-130, 1988. https://doi.org/10.2307/1422797
  6. S. Lee and J. C. Pan, "Offline Tracing and Representation of Signatures," IEEE Trans. Systems, Man, and Cybernetics, 22(4), 755-771, 1992. https://doi.org/10.1109/21.156588
  7. T. Huang and M. Yasuhara, "Recovery of Information on the Drawing Order of Single-Stroke Cursive Handwritten Characters from Their 2D Images," IPSJ Trans., 36(9), 2132-2143, 1995.
  8. S. Jager, "Recovering Writing Traces in Off-Line Handwriting Recognition: Using a Global Optimization Technique," in Proc. of 13th Int. Conf.on Pattern Recognition, 931-935, 1996.
  9. Kivari, B., "Time-Efficient Stroke Extraction Method for Handwritten Signatures," in Proc. of 7th WSEAS International Conference on APPLIED COMPUTER SCIENCE, Venice, Italy, 2007.
  10. Nel, E.M., J.A. du Preez, and B.M. Herbst, "Estimating the pen trajectories of static signatures using hidden Markov models," PAMI, IEEE Transactions on, 27(11), 1733-1746, 2005.
  11. Lee, S. and J.C. Pan, "Offline tracing and representation of signatures," Systems, Man and Cybernetics, IEEE Transactions on, 22(4), 755-771, 1992. https://doi.org/10.1109/21.156588
  12. L'Homer, E., "Extraction of strokes in handwritten characters," Pattern Recognition, 33(7), 1147- 1160, 2000. https://doi.org/10.1016/S0031-3203(99)00103-X
  13. Yu Qiao; Nishiara, M.; Makoto Yasuhara, "A Framework Toward Restoration of Writing Order from Single-Stroked Handwriting Image," Pattern Analysis and Machine Intelligence, IEEE Transactions on, 28(11), 1724-1737, 2006. https://doi.org/10.1109/TPAMI.2006.216
  14. Steinherz, T., et al., "Offline Loop Investigation for Handwriting Analysis," PAMI, IEEE Transactions on, 31(2), 193-209, 2009. https://doi.org/10.1109/TPAMI.2008.68
  15. Niels, R. and L. Vuurpijl, "Automatic Trajectory Extraction and Validation of Scanned Handwritten Characters," in Proc. of 10th IWFHR. 2006, IRISA/INSA - Campus Universitaire de Beaulieu: La Baule, France, 2006.
  16. Qiao, Y. and M. Yasuhara, "Recovering Drawing Order From Offline Handwritten Image Using Direction Context and Optimal Euler Path," in Proc. of 31st ICASSP, Toulouse, France, 2006.
  17. Y. Kato and M. Yasuhara, "Recovery of Drawing Order from Scanned Images of Multi-Stroke Handwriting," in Proc. of Fifth Int. Conf. on Document Analysis and Recognition, 261-264, 1999.
  18. Y. Kato and M. Yasuhara, "Recovery of drawing order from single-stroke handwriting images," IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(9), 938-949, 2000. https://doi.org/10.1109/34.877517
  19. H. Fujioka and T. Nagoya, "A Graph Theoretic Algorithm for Recovering Stroke Order from Multi-Stroke Character Images," in Proc. of the 2011 2nd Int. Conf. on Innovative Computing and Communication, and 2011 2nd Asia-Pacific Conference on Information Technology and Ocean Engineering, 34-37, 2014.
  20. T.Y.Zhang, C.Y.Suen, "A Fast Parallel Algorithm for Thinning Digital Patterns," Communication of the ACM, 27, 236-239, 1984. https://doi.org/10.1145/357994.358023
  21. L. Lam, S. -W. Lee, and C. Y. Suen, "Thinning Methodologies - A Comprehensive Survey," IEEE Trans. Pattern Analysis and Machine Intelligence, 14, 869-885, 1992. https://doi.org/10.1109/34.161346
  22. M. Senthilanyaki, S. Veni, K. A. N. Kutty, "Hexagonal Pixel Grid Modeling for Edge Detection and Design of Cellular Architecture for Binary Image Skeletonization," Annual India Conference (INIDCON), 1-6, 2006.
  23. R. M. Ralenichka, M. B. Zaremba, "Multi-scale model-based skeletonization of object shapes using self-organizing maps," in Proc. of 16th Int. Conf. on Pattern Recognition, 1, 143-146, 2002.
  24. H. Rom and G. Medioni., "Hierarchical decomposition and axial shape description," IEEE Trans. on Pattern Analysis and Machine Intelligence, 15, 973-981, 1993. https://doi.org/10.1109/34.254054
  25. J. J. Zou and H. Yan, "Skeletonization of ribbon-like shapes based on regularity and singularity analyses," IEEE Trans. on Systems, Man and Cybernetics, 31, 401-407, 2001. https://doi.org/10.1109/3477.931528
  26. Lakshmi, J.K., Punithavalli, M., "A Survey on Skeletons in Digital Image Processing," in Proc. of 2009 International Conference on Digital Image Processing, 260-269, 2009.
  27. M Melhi, S.S Ipson, W Booth, "A novel triangulation procedure for thinning hand-written text," Pattern Recognition Letters, 22(10), 1059-1071, 2001. https://doi.org/10.1016/S0167-8655(01)00038-1
  28. Lam, J.H.M.; Yam, Y., "A skeletonization technique based on Delaunay triangulation and piecewise bezier interpolation," in Proc. of 2007 6th International Conference on Information, Communications & Signal Processing, 1-5, 2007.
  29. Paul Morrison, Ju Jia Zou, "Triangle refinement in a constrained Delaunay triangulation skeleton," Pattern Recognition, 40(10), 2754-2765, 2007. https://doi.org/10.1016/j.patcog.2006.12.021
  30. Soumen Bag, Gaurav Harit, "An improved contour-based thinning method for character images," Pattern Recognition Letters, 32(14), 1836-1842, 2011. https://doi.org/10.1016/j.patrec.2011.07.001
  31. Dilip K. Prasad, Chai Quek, Maylor K.H. Leung, and Siu-Yeung Cho, "Parameter independent line fitting method," in Proc. of 1st Asian Conference on Pattern Recognition (ACPR 2011), 28- 30, 2011.
  32. Dilip K. Prasad and Maylor K. H. Leung, "Polygonal Representation of Digital Curves," Digital Image Processing, Stefan G. Stanciu (Ed.), InTech, 2012.
  33. Paul Morrison, Ju Jia Zou, "Triangle refinement in a constrained Delaunay triangulation skeleton," Pattern Recognition, 40(10), 2754-2765, 2007. https://doi.org/10.1016/j.patcog.2006.12.021
  34. Kruskall, J. & M. Liberman, "The Symmetric Time Warping Problem: From Continuous to Discrete," In Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparison, Addison-Wesley Publishing Co., Reading, Massachusetts, 125-161, 1983.
  35. Yoshiyuki Yamashita, Koichi Higuchi, Youichi Yamada, Yunosuke Haga, "Classification of hand-printed Kanji characters by the structured segment matching method," Pattern Recognition Letters, 1, 475-479, 1983. https://doi.org/10.1016/0167-8655(83)90089-2
  36. Huy Hoang Nguyen, GueeSang Lee, SooHyung Kim, and HyungJeong Yang, "An Effective Orientation-based Method and Parameter Space Discretization for Defined Object Segmentation," KSII Trans. Internet and Information Systems, 7(12), 180-3199, 2013.
  37. Jun Sik Lim, In Seop Na, and Soo Hyung Kim, "Correction of Signboard Distortion by Vertical Stroke Estimation," KSII Transactions on Internet and Information Systems, 7(9), 2312-2325, 2013. https://doi.org/10.3837/tiis.2013.09.014
  38. Sang Cheol Park, In Seop Na, Tae Ho Han, Soo Hyung Kim, Guee Sang Lee, "Lane detection and tracking in PCR gel electrophoresis images," Computers and Electronics in Agriculture, 83, 85-91, 2012. https://doi.org/10.1016/j.compag.2012.01.016