Chitanut Tachepun. A data-masking guideline for optimizing insights and privacy under GDPR compliance. Master's Degree (Information Technology Management). Mahidol University. Mahidol University Library and Knowledge Center. Mahidol University, 2020.
A data-masking guideline for optimizing insights and privacy under GDPR compliance
Abstract:
Since the General Data Protection Regulation (GDPR) came into force, it has become a disruptive issue for every organization because of its severe penalties, so protecting data is important to every organization. Data pseudonymization and anonymization are among the techniques employed to protect data and reduce the privacy risks of a data breach, as described in Article 32. However, these techniques destroy patterns in the data that could otherwise be analyzed or monetized to gain useful insights through data analytics or data science approaches. This thesis focuses on optimizing the criteria for choosing protection methods so as to balance privacy and insight, keeping the data useful for analysis while complying with the GDPR. It proposes a guideline consisting of three techniques, Tokenization, Suppression, and Generalization, which are applied to personal data according to risk scores calculated by two methods: data classification and data uniqueness. The resulting criteria are evaluated experimentally on five open datasets, comparing classification performance on the protected data with that on the original data using three data mining algorithms with hyperparameter tuning. The results show that data protected under the proposed guideline is adequately protected, with an insignificant difference in classification performance compared with the unprotected data.
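The three techniques named in the abstract, together with a uniqueness measure, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the demo key, the sample fields, the 10-year age bands, and the definition of uniqueness as the fraction of distinct values in a column are all assumptions made here for illustration.

```python
import hashlib
import hmac

# Hypothetical demo key; a real deployment would manage keys securely.
SECRET_KEY = b"demo-key-not-for-production"


def tokenize(value, key=SECRET_KEY):
    """Tokenization: replace a value with a keyed, deterministic token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated only to keep the example readable


def suppress(value):
    """Suppression: drop the value entirely, leaving a placeholder."""
    return "*"


def generalize_age(age, width=10):
    """Generalization: map an exact age to a band, e.g. 34 -> "30-39"."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def uniqueness(values):
    """Assumed uniqueness score: fraction of distinct values in a column."""
    return len(set(values)) / len(values)


def mask_record(record):
    """Apply one technique per field (the field-to-technique choice is illustrative)."""
    return {
        "name": tokenize(record["name"]),
        "age": generalize_age(record["age"]),
        "phone": suppress(record["phone"]),
    }


if __name__ == "__main__":
    rows = [
        {"name": "Ann", "age": 34, "phone": "555-0101"},
        {"name": "Bob", "age": 38, "phone": "555-0102"},
        {"name": "Cid", "age": 51, "phone": "555-0103"},
        {"name": "Ann", "age": 47, "phone": "555-0104"},
    ]
    print("name uniqueness:", uniqueness([r["name"] for r in rows]))
    for row in rows:
        print(mask_record(row))
```

In this sketch, tokenization is deterministic, so equal inputs still map to equal tokens and can be grouped or joined, while generalization keeps coarse numeric structure for analysis. That trade-off between preserved pattern and reduced identifiability is the balance the abstract describes.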