Automated Schema Matching and Data Projection to Enhance Master Data Management Data Quality

Sarbaree Mishra; Sairamesh Konidala

Authors

Sarbaree Mishra, Program Manager at Molina Healthcare Inc., USA
Sairamesh Konidala, Vice President, JP Morgan & Chase, USA

Keywords:

Automated Data Mapping, Schema Matching, Data Quality, Machine Learning

Abstract

In master data management (MDM), data quality is crucial for preserving the correctness, consistency, and dependability of an organization's information. The main challenge companies face is combining data from many sources, each with a different structure and format, which causes discrepancies and makes it difficult to create a coherent view of the data. Automated data mapping and schema matching address this challenge by aligning data structures across systems. By detecting correspondences between data fields with algorithms and machine learning models, these techniques significantly reduce the manual effort and errors usually associated with the process. They also keep data structures consistent across systems, which improves data quality and, in turn, operational efficiency and decision-making. Finally, they help eliminate duplicates and discrepancies, making it possible to maintain a single, consistent source of truth for vital organizational information.
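
To make the idea of detecting correspondences between data fields concrete, the following is a minimal sketch of name-based schema matching followed by mapping a record onto the target schema. It is an illustration only, not the authors' method: the function names, the sample field names, and the 0.6 threshold are all assumptions, and plain string similarity from Python's standard `difflib` stands in for the machine learning models the abstract mentions.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase a field name and drop separators, so 'Cust_Name' becomes 'custname'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] between two normalized field names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_schemas(source, target, threshold=0.6):
    """Map each source field to its most similar target field.

    A source field maps to None when no target field reaches the
    (hypothetical) similarity threshold, leaving it for manual review.
    """
    mapping = {}
    for s in source:
        best = max(target, key=lambda t: similarity(s, t))
        mapping[s] = best if similarity(s, best) >= threshold else None
    return mapping


def apply_mapping(record, mapping):
    """Rename a source record's keys to the target schema, dropping unmatched fields."""
    return {mapping[k]: v for k, v in record.items() if mapping.get(k)}


# Hypothetical field names from two systems holding customer master data.
source_fields = ["cust_name", "EmailAddress", "zip"]
target_fields = ["customer_name", "email_address", "postal_code"]

mapping = match_schemas(source_fields, target_fields)
mapped = apply_mapping({"cust_name": "Ada", "zip": "12345"}, mapping)
```

In this sketch `zip` and `postal_code` share almost no characters, so the pair falls below the threshold and is left unmatched. Cases like this, where names differ but meanings agree, are where purely lexical matching falls short and where instance-based or learned matchers would be expected to help.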

