As the Olympics drew to a close, the business of collating and analysing the data began in earnest. Take, for instance, the medal table. Some interesting statistics have already emerged.
When you look at the normal medal table, ordered by medal count, you see the usual names at the top: China, the USA and so on. But when you divide each country's medal count by its population, you get a very different view. Measured in medals per capita, New Zealand come top, with Slovenia, Denmark and Australia close behind. Their small populations work in their favour here: the same number of medals is shared across far fewer people.
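To make the re-ranking concrete, here is a minimal Python sketch. The country names and figures are invented placeholders, not real Olympic or population data, and the function names are my own; the point is only to show how dividing by population can reverse the order.

```python
# Illustrative only: made-up countries and figures, NOT real Olympic data.
medals = {"Bigland": 90, "Midland": 40, "Smallia": 12}
population_millions = {"Bigland": 1000.0, "Midland": 60.0, "Smallia": 4.0}


def rank_by_count(medals):
    """Order countries by raw medal count, highest first."""
    return sorted(medals, key=medals.get, reverse=True)


def rank_per_capita(medals, population_millions):
    """Order countries by medals per million inhabitants, highest first."""
    per_million = {
        country: medals[country] / population_millions[country]
        for country in medals
    }
    return sorted(per_million, key=per_million.get, reverse=True)


by_count = rank_by_count(medals)
per_capita = rank_per_capita(medals, population_millions)

print(by_count)    # ['Bigland', 'Midland', 'Smallia']
print(per_capita)  # ['Smallia', 'Midland', 'Bigland']
```

With these placeholder numbers the two tables are exact mirror images of each other, which is the effect described above taken to its extreme.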
The education system and the way sport is developed will soon come under scrutiny, as recent data also shows that 30% of Great Britain's medals were won by people who attended a private, fee-paying school. That is not representative of GB, as these schools account for only 7% of the school population, and it suggests that privileged children are more likely to go on to Olympic success.
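The size of that gap is easy to put a number on. The short sketch below uses only the two percentages quoted above; the variable names are mine, and the ratio describes over-representation in the medal table, not a proven cause.

```python
# Figures taken from the two percentages quoted in the text.
medal_share = 0.30   # share of GB medals won by privately educated athletes
pupil_share = 0.07   # share of the school population at fee-paying schools

# How many times larger the medal share is than the pupil share.
over_representation = medal_share / pupil_share

print(round(over_representation, 1))  # 4.3
```

In other words, privately educated athletes appear in the medal table at roughly four times the rate their share of the school population would predict.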
As the Olympic data becomes available to more and more people, expect more insights to emerge as it gets joined to other, proprietary data sets. Which brings me to the crux of my point.
When you share your information, what you don't know is which data sets are going to be joined to it. How will your data be extrapolated, and will that extrapolation be correct? What kind of business and personal decisions could be made that affect your future happiness, comfort and freedom?
So, while collecting data for one purpose may be perfectly ethical, joining it to an unrelated source to make unrelated assumptions may not be (like trying to guess someone's height by weighing them!). A good question to ask about your business is: what safeguards, if any, can be put in place to ensure your analysts do not draw incorrect insights from tenuously joined data?