Abstract
Data is the fuel, the glue, and the product of online collaboration. Big Data is the driving force behind collaborative computing, enabling and facilitating the next wave of innovation. Unfortunately, privacy is one of the core weaknesses of the entire ecosystem. The prevailing wisdom is that sensitive data can be protected in Big Data sets. In this paper, we decompose the problem space and analyze mathematically the implications for privacy when one connects the many large data sets that make up a Big Data collection.