http://dx.doi.org/10.13088/jiis.2018.24.3.243

An Algorithm for Finding a Relationship Between Entities: Semi-Automated Schema Integration Approach  

Kim, Yongchan (College of Business Administration, Seoul National University)
Park, Jinsoo (College of Business Administration, Seoul National University)
Suh, Jihae (Big Data Institute, Seoul National University)
Publication Information
Journal of Intelligence and Information Systems / v.24, no.3, 2018, pp. 243-262
Abstract
Database schema integration is a significant issue in information systems. Because schema integration is a time-consuming and labor-intensive task, many studies have attempted to automate it. Researchers typically use XML as the source schema and leave much of the work to DBA intervention. For example, schema integration produces various naming conflicts related to relationship names, and in the past the DBA had to intervene to resolve them. In this paper, we introduce an algorithm that automatically generates relationship names to resolve the relationship name conflicts that occur during schema integration. The algorithm is based on an Internet collocation and English example-sentence dictionary. The relationship between two entities is generated by applying natural language processing to example sentences extracted from the dictionary data. We built a semi-automated schema integration system to test the algorithm and found that it achieved about 90% accuracy. Using this algorithm, the naming conflicts that occur during schema integration can be resolved automatically, without DBA intervention.
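The idea of deriving a relationship name from example sentences can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a tiny hand-written corpus (`EXAMPLE_SENTENCES`) in place of the collocation and example-sentence dictionary, and it replaces real natural language processing with a naive heuristic that takes the words appearing between the two entity names. The function names `candidate_relationships` and `suggest_relationship` and the `STOPWORDS` list are hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy corpus standing in for dictionary example sentences.
EXAMPLE_SENTENCES = [
    "The customer places an order online.",
    "A customer places an order by phone.",
    "The customer cancels an order before shipping.",
    "Each employee manages a project.",
]

# Assumed stopword list: determiners skipped when reading the linking words.
STOPWORDS = {"a", "an", "the", "each", "every", "this", "that"}


def candidate_relationships(entity_a, entity_b, sentences):
    """Count the word sequences found between entity_a and entity_b."""
    counts = Counter()
    for sentence in sentences:
        # Naive tokenization: lowercase alphabetic runs only.
        tokens = re.findall(r"[a-z]+", sentence.lower())
        if entity_a not in tokens or entity_b not in tokens:
            continue
        start = tokens.index(entity_a)
        end = tokens.index(entity_b)
        if start >= end:
            # Only consider sentences where entity_a precedes entity_b.
            continue
        between = [t for t in tokens[start + 1:end] if t not in STOPWORDS]
        if between:
            counts[" ".join(between)] += 1
    return counts


def suggest_relationship(entity_a, entity_b, sentences):
    """Return the most frequent candidate name, or None if none is found."""
    counts = candidate_relationships(entity_a, entity_b, sentences)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    print(suggest_relationship("customer", "order", EXAMPLE_SENTENCES))
    print(suggest_relationship("employee", "project", EXAMPLE_SENTENCES))
```

A realistic implementation would need part-of-speech tagging or parsing to isolate the verb, plus handling for plurals and synonyms of the entity names; the frequency count here only shows how competing candidates could be ranked.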
Keywords
Schema Integration; Naming Conflicts; Natural Language Processing; XML; Entity Relationship Diagram (ERD)