Can Data Be Democratized?


I have just returned from the Open Data Science Conference in Boston, MA. The word “open” in the title refers to open source software: software that is not proprietary, and hence is contributed to and supported by a community instead of being the sole property of a for-profit corporation. This appeals to our sense of fairness, does it not? After all, information wants to be free.

As its title suggests, the conference focuses on data science, which includes such sub-specialties as machine learning, artificial intelligence, and deep learning. Like all true science, these fields use the scientific method to derive knowledge, and, as in other scientific fields, that knowledge can be used for good or for evil.

Thankfully, there is much good being done. I attended a talk given by DataKind, an organization that gets data scientists to work pro bono on projects that promote the public good. Among other things, they have marshaled their resources to try to eliminate deaths from pedestrian accidents and to help farmers know when it is best to irrigate their crops, thus conserving water. Machine learning has also made it possible to deploy portable ultrasound machines to remote areas that have limited access to doctors and hospitals. These machines can diagnose disease without a doctor, or even a technician, to read the scans.

Machine learning has likewise been used to investigate the human genome, leading to the discovery of drugs that use our own immune system to fight certain cancers, thus sparing some patients the toxic effects of chemotherapy. Clearly, there is more to data science than helping Target know who is pregnant or helping Wall Street make a killing.

However, where is all this data coming from? I think you know the answer. To quote Donovan, “it comes from here and there and you and me”. We are giving up our data freely every day, and we are getting little directly in return for it. Yes, we benefit from new scientific discoveries and perhaps even new consumer products. Nevertheless, there is a social cost involved in this data revolution. Many low-skilled and, yes, even middle-class, semi-skilled jobs will be eliminated by these technologies within the next few years.

One way to deal with this significant social dislocation might be to pay people for the use of their data. This would not be a government welfare benefit or a guaranteed minimum income, but would instead be a payment to individuals for something of value that they are providing. It is much like selling a stock, which, by the way, is a fictitious creation of human imagination, not an attribute of the physical world.

Almost every businessperson can relate to how businesses transform raw inputs into value-added outputs. They would also readily acknowledge that they must pay something for the raw materials (inputs). So why is paying somehow not the fair and reasonable thing to do when the input is information?

What do you think?


