Adjusting Edit Scripts on Tree-structured Documents

  • 이석균 (Department of Software, College of SW Convergence, Dankook University) ;
  • 엄현민 (Department of Software, College of Engineering, Dankook University)
  • Received : 2018.12.23
  • Accepted : 2019.01.31
  • Published : 2019.04.30

Abstract

Because most documents used on the web, in XML, and in office applications are tree-structured, diff, merge, and version control for tree-structured documents in multi-user environments are crucial tasks. However, research on edit scripts, which are the basis for these tasks, is still at an early stage. In this paper, we present a document model for understanding how tree-structured documents change as edit scripts are executed, and we propose a method for switching adjacent edit operations on tree-structured documents, based on an analysis of the effects of the edit operations. Edit scripts produced by diff on tree-structured documents mostly consist only of the basic operations update, insert, and delete. When move and copy are included, however, the complexity of these composite operations often leads to edit scripts that are designed to execute in two passes. Using the proposed switching method, we present an algorithm that transforms the two-pass edit scripts of X-treeESgen into edit scripts that can be executed in one pass.
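
To give a rough feel for what switching two adjacent edit operations involves, the following Python sketch reorders insert and delete operations on the ordered children of a single parent node. It is an illustration only, not the switching rules of Tables 2 to 7: the tuple encoding `("I", index, label)` / `("D", index)`, the 0-based sibling indexes, and the helper names `apply_op`, `switch`, and `check` are all assumptions made for this example, and update, move, and copy are left out.

```python
from itertools import product


def apply_op(children, op):
    """Apply one operation to a copy of an ordered list of child labels."""
    out = list(children)
    if op[0] == "I":      # ("I", index, label): insert label before position index
        out.insert(op[1], op[2])
    elif op[0] == "D":    # ("D", index): delete the child at position index
        del out[op[1]]
    else:
        raise ValueError(f"unknown operation {op!r}")
    return out


def switch(a, b):
    """Return (b2, a2) such that running b2 then a2 has the same effect as
    running a then b, or None when the pair cannot be reordered."""
    i, j = a[1], b[1]
    kinds = a[0] + b[0]
    if kinds == "DD":
        # b's index is expressed after a's deletion; map it back first.
        return (("D", j), ("D", i - 1)) if j < i else (("D", j + 1), ("D", i))
    if kinds == "DI":
        if j <= i:
            return ("I", j, b[2]), ("D", i + 1)
        return ("I", j + 1, b[2]), ("D", i)
    if kinds == "ID":
        if j == i:
            return None   # b deletes the very node that a inserted
        if j < i:
            return ("D", j), ("I", i - 1, a[2])
        return ("D", j - 1), ("I", i, a[2])
    if kinds == "II":
        if j <= i:
            return ("I", j, b[2]), ("I", i + 1, a[2])
        return ("I", j - 1, b[2]), ("I", i, a[2])
    raise ValueError(f"unsupported pair {a!r}, {b!r}")


def check(n=4):
    """Exhaustively compare both orders on a parent with n children and
    return the number of pairs that were switched."""
    base = [f"n{k}" for k in range(n)]
    switched = 0
    for kind_a, kind_b in product("DI", repeat=2):
        for i in range(n + 1 if kind_a == "I" else n):
            a = ("I", i, "x") if kind_a == "I" else ("D", i)
            mid = apply_op(base, a)
            for j in range(len(mid) + 1 if kind_b == "I" else len(mid)):
                b = ("I", j, "y") if kind_b == "I" else ("D", j)
                pair = switch(a, b)
                if pair is None:
                    continue
                b2, a2 = pair
                assert apply_op(apply_op(base, b2), a2) == apply_op(mid, b)
                switched += 1
    return switched
```

Calling `check()` walks through every valid insert/delete pair on a four-child parent and asserts that the switched order yields the same child list. The only pair this sketch refuses to reorder is an insert immediately followed by a delete of the inserted node.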

Fig. 1 Example of Source Tree

Fig. 2 Two-step pass-based edit script and its application to the source tree

Table 1 List of Edit Operations

Table 2 Switching two adjacent edit operations in [U, D], [U, I], or [U, U].

Table 3 Switching two adjacent edit operations in [D, D], [D, I], or [D, U].

Table 4 Switching two adjacent edit operations in [I, I].

Table 5 Transform composite operations (C, M) into simple operations.

Table 6 Switching two adjacent edit operations in [M_c, _] or [C_c, _].

Table 7 Switching two adjacent edit operations in [_, M_p] or [_, C_p].
