The Eighth Roddam Narasimha Distinguished Lecture, titled "Data Science: The Good, the Bad and the Ugly", was held on August 5. Prof Jayant R Haritsa, a data scientist and professor in the Department of Computational & Data Sciences at the Indian Institute of Science, Bengaluru, delivered the lecture and helped the audience understand various aspects of data science.
During his talk, Professor Haritsa gave some background and examples of how data science is used across sectors, by government entities and companies alike. Turning first to the good aspects of data science, he spoke about sectors where companies use data science and analytics to understand their customers' preferences and offer them better choices and more advanced services. He gave the example of the power sector, where deploying condition monitoring and predictive analytics solutions helps a power company’s managers take informed maintenance decisions quickly, which avoids large losses and improves service delivery.
He then introduced the audience to some lesser-known perils and realities of data science. He noted that only a few enterprises really have curated big data, though many others claim to have it because of the hype surrounding it. Talking about some of the interesting methodological issues of data science, Professor Haritsa said, “Big data encourages to ask the wrong questions. People get big answers to such questions, but coming up with the right questions is more difficult than coming up with the right answers. Nowadays, people tend to compare incomparable things because of the big data available on the web.”
The audience was amused by the example of one such big data analysis, which claimed to calculate a musician's age from the genre of music he or she plays. In analyses of this kind, the statistical and probabilistic subtleties get lost in the big picture.
He also gave some examples of design errors in the implementation of big data systems and showed how they can lead to the breakdown of critical web infrastructure. Explaining the limitations of data science systems, he said, “Everybody loves big data systems but you can not test them. So, many big data systems will be prone to failure by definition because they are too difficult to test due to the large scale of the data.”
Professor Haritsa also pointed to the methodological misuse of data science, saying, “Data science can be used to bend us to preconceived biases and reinforce them. The directed behaviour is being made possible because of data science, through news feed algorithms to selectively push an opinion, confuse the issues with fake news, and apply peer pressure on social media.”
In his concluding remarks, he emphasised the ideal way to use data science for the benefit of mankind and said, “It is very important to use data science, it should be a tool of last resort, not the first, to validate a hypothesis, and should be used as a support tool and not substitute for domain expertise. Because data science, like nuclear power, has enormous potential for benefiting mankind, if used with care, otherwise it has equally destructive power for ruining the society.”
Data science is trying to look into the future. It can be a win-win for customers as well as for companies. Data science can play a very direct and positive role in services across sectors such as transport, consumer electronics, banking and power. There is scope for data science to do a lot of public good and, at the same time, to help the corporate world through better services and more projects.
Prof Jayant R Haritsa