1. Introduction
In industry, robots are used in various applications such as object handling, parts assembly, and welding [1-2]. They typically operate under relatively fixed and inflexible working conditions. Recently, diverse sensors, including cameras and laser-based 3D sensing systems, have been used to give robots greater flexibility through visual feedback [3-8].
Park and Mills [3] proposed an algorithm that computes the pose of thin-walled sheet metal parts using seven laser sensors; it can be used for robotic assembly in the automotive industry. Their algorithm consists of two steps: an off-line step in which sensor measurements are correlated with part mislocation information, and an on-line step that iteratively refines the pose using the mapping obtained off-line. Bone and Capson [4] presented an algorithm for robotic fixtureless assembly using 2D and 3D computer vision, where 2D vision is used to pick up the parts and 3D vision is used to align them before joining. Watanabe et al. [5] proposed an algorithm for the accurate setting of workpieces in a robotic cell using visual-feedback motion control with a camera attached to the robot. Mario et al. [6] presented a method for online recognition and classification of objects for robotic assembly tasks using a robot-mounted camera. Shauri and Nonami [7] presented assembly manipulation of small objects by a dual-arm manipulator using a stereo camera system. Hvilshoj et al. [8] reviewed the past, present, and future of autonomous industrial mobile manipulation.
The algorithm of Park and Mills [3] is the most relevant to our approach. Their algorithm requires an off-line stage to compute the mapping between part mislocations and sensor measurements. The proposed algorithm requires no off-line computation, so it can operate directly on the sensor measurements.
Four 3D sensing systems are attached to the robot arm. Several steps are required to use the information from multiple 3D sensing systems together with the robot. First, extrinsic calibration among the 3D sensing systems is required to fuse their measurements into one common coordinate system. Then, hand/eye calibration is required to transform measurements from the sensor coordinate system into the robot coordinate system. Finally, a pose estimation algorithm is required to align one part to the other. This paper deals with the automatic registration of two parts, which can be used for assembly processes in industry.
2. Proposed Approach
Each 3D sensing system consists of a camera and a laser stripe system. Fig. 1 shows the demo system, which includes four 3D sensing systems, a robot arm, and two parts, where one part lies on the ground and the other is attached to the end of the robot arm. The four 3D sensing systems are mounted on the end of the robot arm. The final goal is to align the upper body to the lower body such that the gap and height difference along the boundary of the two parts are within predefined limits. This can be used for diverse applications such as parts alignment in an automotive assembly line.
Fig. 1. Demo system consisting of two parts and four 3D sensing systems mounted on the end of the robot arm.
2.1 Extrinsic calibration among sensing systems
The demo system with two parts and four sensing systems shown in Fig. 1 is used to demonstrate the presented algorithm. A 3D sensing system consisting of a camera and a slit laser is shown in Fig. 2. The sensing systems are attached to the end of the robot arm as shown in Fig. 1.
Fig. 2. 3D sensing system consisting of a camera and a laser.
The transformation between the robot tool coordinate system and the sensor coordinate system is required to use the 3D measurements of the sensing systems for robot manipulation; computing it is usually called hand/eye calibration. Before computing the hand/eye transform, the relative poses among the 3D sensing systems must be known so that the 3D measurements of each sensing system can be transformed into a common coordinate system. We select one of the four 3D sensing systems as the reference sensor coordinate system.
Extrinsic calibration among the sensor systems is done using Tsai's camera calibration algorithm [9]. Each camera of the four 3D sensing systems is calibrated with respect to the same world coordinate system, as shown in Fig. 3. Then, the relative pose of each camera with respect to this common world coordinate system is computed from the calibration results.
Fig. 3. Calibration structure used in the extrinsic calibration among the four 3D sensing systems.
Xi = Ri Xw + Ti   (i = 1, …, 4)   (1)
Xi and Xw are the 3D coordinates of a point with respect to the i-th camera coordinate system and the world coordinate system, respectively. Ri and Ti are the rotation and translation of the i-th camera with respect to the world coordinate system. The 3D coordinates measured by each sensor system are transformed into the first sensor system using the extrinsic calibration information, and finally converted into the robot tool coordinate system using the hand/eye calibration information.
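For illustration, the following minimal sketch (in Python with NumPy; the function and variable names are hypothetical) shows how a point measured by the i-th sensing system can be mapped into the reference (first) sensor frame by inverting Eq. (1):

import numpy as np

def to_reference_sensor(X_i, R_i, T_i, R_1, T_1):
    """Map a 3D point from the i-th sensor frame to the reference (first) sensor frame.

    X_i      : (3,) point measured by the i-th sensing system
    R_i, T_i : rotation/translation of the i-th camera w.r.t. the world frame (Eq. 1)
    R_1, T_1 : rotation/translation of the reference camera w.r.t. the world frame
    """
    X_w = R_i.T @ (X_i - T_i)   # invert Eq. (1): world coordinates of the point
    X_1 = R_1 @ X_w + T_1       # re-express the point in the reference sensor frame
    return X_1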
2.2 Hand/eye calibration
In the previous section, the 3D coordinates measured by the four 3D sensing systems were converted into the reference sensor coordinate system using the extrinsic calibration information. They must finally be transformed into the robot tool coordinate system so that the robot can be operated according to the sensor information. Computing this transform is usually called hand/eye calibration.
Hand/eye calibration can be formulated as AX = XB, where A corresponds to the robot motion, B to the camera motion, and X is the unknown hand/eye transformation between the camera and robot coordinate systems. Many approaches have been proposed, and they typically compute the unknown transformation from known motions of the robot [10].
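For reference, the general AX = XB formulation can be solved with off-the-shelf routines; the sketch below uses OpenCV's calibrateHandEye function (OpenCV 4.1 or later) and is not the simplified procedure adopted in this paper. It assumes that paired lists of robot tool poses and calibration target poses have already been collected:

import cv2

def solve_hand_eye(R_gripper2base, t_gripper2base, R_target2cam, t_target2cam):
    """Solve AX = XB for the camera-to-gripper transform.

    Each argument is a list of 3x3 rotations / 3x1 translations, one per robot
    station; robot poses and target observations must be paired.
    """
    R_cam2gripper, t_cam2gripper = cv2.calibrateHandEye(
        R_gripper2base, t_gripper2base,
        R_target2cam, t_target2cam,
        method=cv2.CALIB_HAND_EYE_TSAI)
    return R_cam2gripper, t_cam2gripper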
We designed a simple hand/eye calibration procedure using a planar calibration structure, reflecting the constraints of our application. In our application, the four 3D sensing systems can be installed on the robot arm so that their rotation with respect to the robot tool coordinate system is small. The laser of each 3D sensing system is aligned with the coordinate axes of the robot as shown in Fig. 4. Under this assumption, the rotation between the two coordinate systems can be neglected, so only the translation between the 3D sensing system and the robot tool coordinate system needs to be computed. To compute this translation, a planar calibration structure is installed on the end of the robot arm as shown in Fig. 4.
Fig. 4. Hand/eye calibration structure.
The translation between the two coordinate systems is computed from image processing results and manually measured values. Fig. 5(a) shows the original image of the hand/eye calibration structure captured by a camera of a 3D sensing system, and Fig. 5(b) shows the detected laser points. The laser contours are found using binarization and connected component analysis [11]. We assume that the rotation between the R-xyz (hand) frame and the L-xyz (eye) frame is small, so the remaining unknowns are the translation between the two frames. The 3D coordinates of two laser points are used to compute this translation: the x value of the center point of the laser contour and the y value of the laser point corresponding to the end of the hand/eye calibration structure, as follows.
Fig. 5. Hand/eye calibration result: (a) original image; (b) processing result.
Tx and Ty are the unknown translation components along the x and y axes between the sensing system and the robot tool. xs is the distance along the x direction from the tool center to the hand/eye calibration structure; its value is measured manually. xm is the x coordinate of the center point of the detected laser contour. ys is the distance along the y direction from the tool center to the end of the hand/eye calibration structure; its value is also measured by hand. ye is the y coordinate of the end point of the detected laser contour. The two translation components along the x and y axes are then computed using Eq. (2). The unknown translation Tz along the z direction is set by directly measuring the distance manually. Finally, the laser coordinates of each 3D sensing system can be transformed into the tool center coordinate system. Pose computation is done with respect to the robot tool frame so that the result can be applied directly to the robot.
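A minimal sketch of this simplified computation is given below, assuming Tx and Ty are obtained as the difference between the manually measured offsets and the corresponding laser-point coordinates expressed in the same frame; the subtraction order is an assumption, since Eq. (2) is not reproduced here:

def hand_eye_translation(x_s, y_s, x_m, y_e, T_z):
    """Translation between the sensing system and the robot tool frame.

    x_s, y_s : manually measured offsets from the tool center to the
               calibration structure / its end (along x and y)
    x_m      : x coordinate of the laser-contour center point
    y_e      : y coordinate of the laser-contour end point
    T_z      : z offset, measured directly by hand
    """
    T_x = x_s - x_m   # assumed sign convention
    T_y = y_s - y_e   # assumed sign convention
    return T_x, T_y, T_z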
2.3 Pose estimation using greedy search
Pose estimation is done using control points on the images, which are detected automatically among the laser points. Four 3D sensing systems are installed at the end of the robot arm as shown in Fig. 1. Images acquired at the initial position of the robot are shown in Fig. 6. In each image there are two separated laser lines, projected onto the upper and lower body.
Fig. 6. Original images acquired at the starting position (from left to right: sensor systems 1~4).
First, the laser contour on each image is extracted using binarization and connected component analysis [11]. Then, the boundary points on the upper and lower body are selected as control points. Eight control points are used to compute the pose of the robot, two from each 3D sensing system.
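An illustrative sketch of the binarization and connected-component step using OpenCV is shown below; the threshold and minimum-area values are arbitrary, and selecting the actual boundary control points from the returned contours depends on the part geometry and is not shown:

import cv2
import numpy as np

def extract_laser_contours(image, threshold=200, min_area=50):
    """Extract laser stripe contours via binarization and connected components."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    contours = []
    for i in range(1, n):                      # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            ys, xs = np.where(labels == i)     # pixels of one laser segment
            contours.append(np.column_stack([xs, ys]))
    return contours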
If the correspondences between the two 3D point sets were known, the pose could be computed using standard algorithms such as [12], which compute the transformation from 3D-3D correspondences. In our application, however, it is difficult to find correspondences between control points on the upper and lower body because there are no distinct features in the images.
One could consider a registration algorithm such as Iterative Closest Point (ICP). However, there are only a few sample points, so ICP can get caught in a local minimum. ICP also usually assumes that the two point sets partially correspond, but in our application it is difficult to find correspondences between the upper and lower body because there is a gap and height difference between the two parts.
We overcome these difficulties with a greedy search-based pose estimation algorithm. The cost term is designed to reflect the geometric constraints among the four 3D sensing systems. We find the solution that minimizes the gap and height difference by searching all combinations of rotation and translation within a predefined range. We do not use a gradient-based search algorithm because it can get caught in a local minimum.
We assume that the initial pose of the upper body does not deviate much from the best pose. This is a reasonable assumption in a production line, where robot positions are taught in advance using a teaching pendant.
The cost of Eq. (3) is defined to penalize the deviation of the computed gap and height difference at each control point from their target values. We find the transformation that gives the minimum cost by greedy search.
c is the cost value at the current rotation and translation. gi and hi are the gap and height difference of the i-th control point computed using the current rotation and translation. gi* is the target gap value of the i-th control point, and hi* is the target height difference of the i-th control point.
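Since Eq. (3) is not reproduced above, the sketch below assumes a sum-of-squared-deviations form of the cost over the eight control points; the exact form used in the paper may differ:

import numpy as np

def cost(g, h, g_target, h_target):
    """Assumed cost for one candidate (R, T): sum of squared deviations of the
    computed gaps g and height differences h from their target values."""
    g, h = np.asarray(g), np.asarray(h)
    return np.sum((g - np.asarray(g_target)) ** 2) + np.sum((h - np.asarray(h_target)) ** 2)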
The current gap and height difference gi and hi are computed as shown in Fig. 7. Two control points are detected per 3D sensing system; they correspond to the end position of each part. Fig. 6 shows the original images acquired by the cameras of the 3D sensing systems at the starting position, before alignment. The coordinates of the control points are transformed into the robot tool coordinate system using the hand/eye calibration information, and all computation is done in the robot tool coordinate system.
Fig. 7. Cost terms in pose computation using eight control points.
In Fig. 7, the i-th control points on the upper and lower body are shown. li represents the i-th line on the lower body; it is computed by line fitting using the laser points. l'i is the estimated i-th line on the upper body. The gradients of one pair of estimated lines are set to the mean gradient of lines l2 and l4, and the gradients of the other pair are set to the mean gradient of lines l1 and l3. Each estimated line l'i is set to pass through the corresponding control point on the upper body using this gradient. gi represents the gap value at the i-th control point; it is set as the distance from the control point on the lower body to the corresponding estimated line l'i.
The height difference hi is set as the difference of the x values of the two corresponding control points on the upper and lower body; in our coordinate frames (Fig. 4), height corresponds to the x axis.
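The sketch below illustrates how gi and hi could be evaluated in the tool frame for one control-point pair; representing the estimated line l'i by a point and a 2D direction in the y-z plane is an assumption, and the helper names are hypothetical:

import numpy as np

def point_line_distance_2d(p, line_p, direction):
    """Distance from 2D point p to the line through line_p with the given direction."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    v = np.asarray(p, dtype=float) - np.asarray(line_p, dtype=float)
    return abs(v[0] * d[1] - v[1] * d[0])      # magnitude of the 2D cross product

def gap_and_height(upper_pt, lower_pt, upper_line_dir):
    """Gap and height difference for one control-point pair (tool frame).

    upper_pt, lower_pt : (x, y, z) control points on the upper / lower body
    upper_line_dir     : assumed (y, z) direction of the estimated line l'i,
                         i.e. the averaged gradient of the lower-body lines
    """
    g = point_line_distance_2d(lower_pt[1:], upper_pt[1:], upper_line_dir)
    h = upper_pt[0] - lower_pt[0]              # height is measured along the x axis (Fig. 4)
    return g, h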
The solution is found through greedy search. The cost function of Eq. (3) is evaluated for candidate values of R and T chosen within a predefined range around the initial value, with a fixed increment. The values of R and T that give the minimum cost are chosen as the solution. Only the control points on the upper body are transformed using the current R and T; the coordinates of the control points on the lower body are fixed during the computation, as they are computed once at the starting position.
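A condensed sketch of the search loop is given below. It assumes the rotation is composed from small Euler angles and that a user-supplied cost_fn evaluates Eq. (3) against the fixed lower-body control points (for example, by combining the gap/height and cost sketches above); all names are illustrative:

import itertools
import numpy as np

def euler_to_R(rx, ry, rz):
    """Rotation matrix from small Euler angles (radians) about x, y, z."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def search_pose(upper_pts, cost_fn, rot_candidates, trans_candidates):
    """Exhaustive search over a rotation/translation grid around the initial pose.

    upper_pts        : (N, 3) control points on the upper body (tool frame)
    cost_fn          : callable(transformed_upper_pts) -> cost value (Eq. 3)
    rot_candidates   : three 1D arrays of candidate angles (rad) about x, y, z
    trans_candidates : three 1D arrays of candidate translations along x, y, z
    """
    best_cost, best_R, best_T = np.inf, None, None
    for rx, ry, rz in itertools.product(*rot_candidates):
        R = euler_to_R(rx, ry, rz)
        for tx, ty, tz in itertools.product(*trans_candidates):
            T = np.array([tx, ty, tz])
            moved = upper_pts @ R.T + T        # transform only the upper-body points
            c = cost_fn(moved)
            if c < best_cost:
                best_cost, best_R, best_T = c, R, T
    return best_R, best_T, best_cost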
In summary, we propose a search-based pose estimation algorithm whose cost terms reflect the geometric configuration of the sensing systems.
3. Experimental Results
Experiments were done using the demo system shown in Fig. 1. Four 3D sensing systems are attached to the end of the robot arm. An industrial 6-DOF robot by Hyundai Heavy Industries is used. Robot motion is controlled via RS-232 and is integrated into the program that processes the sensing units.
Each 3D sensing system, consisting of a camera and a slit laser, is calibrated using our earlier work [13]. Extrinsic calibration among the four sensor systems is done using the calibration structure shown in Fig. 3. A planar chessboard pattern is used to calibrate the cameras with Tsai's algorithm [9]. The four cameras of the 3D sensing systems are calibrated with respect to the same world coordinate system, so the relative pose of each camera with respect to the reference camera can be computed.
Table 1 shows the relative error of the extrinsic calibration among the sensor systems, obtained by comparing the computed distances between sensor systems to the distances in the original design. The extrinsic calibration gives accurate results.
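For example, this check can be reproduced by comparing the spacing of the camera centers recovered from Eq. (1) with the designed spacing; the sketch below uses hypothetical names and reports the error as a percentage of the designed distance:

import numpy as np

def camera_center(R, T):
    """Camera center in world coordinates from Eq. (1): 0 = R C + T, so C = -R^T T."""
    return -R.T @ np.asarray(T)

def relative_distance_error(R_a, T_a, R_b, T_b, designed_distance):
    """Percent error between the calibrated and designed spacing of two sensors."""
    d = np.linalg.norm(camera_center(R_a, T_a) - camera_center(R_b, T_b))
    return 100.0 * abs(d - designed_distance) / designed_distance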
Table 1. Relative error in the computation of the extrinsic calibration among the four 3D sensing systems.
Experiments were done by choosing the initial position of the robot randomly within a predefined range, so that the initial position of the upper body deviates somewhat from the best pose. The translations along the X and Y directions were selected randomly within ±3 mm, and the translation along the Z direction between 0 and 15 mm. The rotations about the X and Y axes were selected randomly within ±1°, and the rotation about the Z axis within ±0.6°. These ranges were chosen to reflect the actual deviations in a production line.
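For completeness, a sketch of sampling the initial deviation with the ranges above is given below (units and axis conventions follow the text; the function name is illustrative):

import numpy as np

rng = np.random.default_rng()

def sample_initial_deviation():
    """Random initial deviation of the upper body matching the experimental ranges."""
    tx, ty = rng.uniform(-3.0, 3.0, size=2)    # mm, along X and Y
    tz = rng.uniform(0.0, 15.0)                # mm, along Z
    rx, ry = rng.uniform(-1.0, 1.0, size=2)    # degrees, about X and Y
    rz = rng.uniform(-0.6, 0.6)                # degrees, about Z
    return (tx, ty, tz), (rx, ry, rz)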
Fig. 6 shows the original images acquired at the starting position. There is a distinct deviation between the upper and lower parts, with a large gap and height difference between them. Black paper is attached to the surfaces of the parts to make detection of the laser contour easy; without it, scattering of the laser on the object surface makes robust detection of the laser contour difficult. Further research is required for stable detection of the laser contour on metal surfaces.
Fig. 8 shows the control points detected on the upper and lower body from the images of Fig. 6. For each image, the two control points located at the boundary of each body are successfully extracted; they are displayed as circles. Fig. 9 shows the four images from the 3D sensing systems after the final alignment. The gaps measured by sensor systems 1 and 3, and those measured by sensor systems 2 and 4, have similar values.
Fig. 8. Control points extracted from the images of the upper and lower body (from left to right: sensor systems 1~4).
Fig. 9. Images after aligning the upper body to the lower body (from left to right: sensor systems 1~4).
The accuracy of the proposed algorithm is evaluated as follows. The gap and height difference after alignment were measured manually; these values show no significant difference from the measurements given by the 3D sensing systems at the final position. Therefore, the error statistics are computed using the values given by the sensor systems at the final position.
Error statistics were obtained from 30 trials with randomly chosen initial positions. Table 2 shows the mean error of the gap and height difference at each stage of the proposed algorithm, and Table 3 shows the standard deviation of the error of the gap and height difference after the final alignment. In all cases, alignment was completed within three adjustment steps.
Table 2. Mean error of gap and height difference after the final alignment.
Table 3. Standard deviation of the error of gap and height difference after the final alignment.
We configured the demo system shown in Fig. 1 to test the feasibility of the proposed algorithm for an automotive assembly line; in particular, it is configured for a parts alignment application. The management specification is as follows: the gap error should be within ±0.5 mm and the height difference within ±1.0 mm. These specifications were satisfied throughout all experiments.
4. Conclusion
In this paper, an algorithm for the automatic registration of two parts using multiple 3D sensing systems on a robot is proposed. Pose estimation is based on greedy search, with cost terms that reflect the geometric configuration of the four 3D sensing systems. The proposed algorithm uses greedy search within a predefined range to avoid the local minima that can occur with gradient-based algorithms. The experimental results show the feasibility of the proposed algorithm. For further research, we are considering robust detection of the laser contour on metal surfaces, where specular reflection and surface irregularities cause difficulties.
References
- A. Hormann, “Development of an Advanced Robot for Autonomous Assembly,” in Proceedings of International Conference on Robotics and Automation, pp. 2452-2457, 1991.
- S. Jorg, J. Langwald, J. Stelter, G. Hirzinger, and C. Natale, “Flexible Robot-Assembly using a Multi-Sensor Approach,” in Proceedings of International Conference on Robotics and Automation, pp. 3687-3694, 2000.
- E.J. Park, and J.K. Mills, “Three-Dimensional Localization of Thin-Walled Sheet Metal Parts for Robotic Assembly,” Journal of Robotic Systems, vol. 19, no. 5, pp. 207-217, 2002. https://doi.org/10.1002/rob.10035
- G.M. Bone, and D. Capson, “Vision-guided fixtureless assembly of automotive components,” Robotics and Computer Integrated Manufacturing, vol. 19, pp. 79-87, 2003. https://doi.org/10.1016/S0736-5845(02)00064-9
- A. Watanabe, S. Sakakibara, K. Ban, M. Yamada, and G. Shen, “Autonomous Visual Measurement for Accurate Setting of Workpieces in Robotic Cells,” CIRP Annals-Manufacturing Technology, vol. 54, no. 1, pp. 13-18, 2005. https://doi.org/10.1016/S0007-8506(07)60039-0
- P.C. Mario, L.J. Ismael, R.C. Reyes, and C.C. Jorge, “Machine vision approach for robotic assembly,” Assembly Automation, vol. 25, pp. 204-216, 2011.
- R.L.A. Shauri, and K. Nonami, “Assembly manipulation of small objects by dual-arm manipulator,” Assembly Automation, vol. 31, pp. 263-274, 2011. https://doi.org/10.1108/01445151111150604
- M. Hvilshoj, S. Bogh, O.S. Nielsen, and O. Madsen, “Autonomous industrial mobile manipulation (AIMM): past, present and future,” Industrial Robot: An International Journal, vol. 39, pp. 120-135, 2012. https://doi.org/10.1108/01439911211201582
- R.Y. Tsai, “A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses,” IEEE J. Robotics and Automation, pp. 323-344, 1987.
- K.H. Strobl and G. Hirzinger, “Optimal Hand-Eye Calibration,” Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006.
- R. M. Haralick and L. G. Shapiro, Computer and Robot Vision, Addison Wesley, 1992.
- K.S. Arun, T.S. Huang, and S.D. Blostein, “Least-squares fitting of two 3-D point sets,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, pp. 698-700, 1987.
- J.E. Ha and K.W. Her, “Calibration of structured light stripe system using plane with slits,” Optical Engineering, vol. 52, no.1, 2013.