S. Sujatha, M. Yuvarani
A large number of cloud services require users to share private data, such as electronic health records, for data analysis or mining, which raises privacy concerns. Anonymizing data sets via generalization to satisfy privacy requirements such as k-anonymity is a widely used category of privacy-preserving techniques. Existing anonymization approaches, however, struggle to preserve privacy on large-scale, privacy-sensitive data sets because they do not scale sufficiently. We propose a scalable Bottom-Up Generalization (BUG) approach that anonymizes large-scale data sets using the MapReduce framework in the cloud. We design a group of MapReduce jobs that carry out the computation in a highly scalable way. We aim to show that this approach significantly improves the scalability and efficiency of BUG over existing approaches.
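To make the idea of generalizing a data set until it satisfies k-anonymity concrete, the following is a minimal single-machine sketch written in a map/reduce style. It is not the MapReduce jobs proposed here: the record layout (`age`, `zip`), the helper names, and the choice of age-range width as the only generalization level are all assumptions made for illustration. The loop starts from the most specific level and generalizes upward, which mirrors the bottom-up direction only loosely.

```python
from collections import Counter

def generalize_age(age, width):
    """Replace an exact age with a range label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def map_record(record, width):
    """Map step (hypothetical): emit (generalized quasi-identifier, 1)."""
    key = (generalize_age(record["age"], width), record["zip"][:3] + "**")
    return key, 1

def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the counts for each key."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts

def is_k_anonymous(records, k, width):
    """True if every generalized group contains at least k records."""
    counts = reduce_counts(map_record(r, width) for r in records)
    return all(n >= k for n in counts.values())

def smallest_sufficient_width(records, k, widths):
    """Try generalization levels from most to least specific."""
    for width in sorted(widths):
        if is_k_anonymous(records, k, width):
            return width
    return None  # no candidate level satisfies k-anonymity

# Made-up sample records, for illustration only.
records = [
    {"age": 31, "zip": "60601"},
    {"age": 36, "zip": "60615"},
    {"age": 42, "zip": "60622"},
    {"age": 47, "zip": "60640"},
]

chosen = smallest_sufficient_width(records, k=2, widths=[5, 10, 20])
```

With these sample records, 5-year ranges leave every record in its own group, while 10-year ranges place two records in each group, so `chosen` is the first level at which 2-anonymity holds. In a real MapReduce deployment the map and reduce functions would run in parallel over partitions of the data; here they are ordinary Python functions.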