Data visualization is always about communicating information. Most of us fall into the trap of visualizing for the sake of visualizing, to the point that the purpose of the analysis is lost. For example, plain, bold text stating an important statistic is far more effective than a fancy pie chart with more than seven slices, simply because it communicates the information better. Likewise, visualizing a small slice of the data is better than visualizing everything if you’re only really concerned with that small portion.
Our exercises in applying traditional statistical methods to tricky data sets really stuck with me. Had I been on my own and not known any better, I would have committed the exact mistakes that the course has taught me to avoid. Before this course, all I had was a hammer and everything looked like a nail. All in all, I would sum up this course as a tool kit. It has given me more than enough to work with, so that now, when I am exploring new data, I can present it more confidently.
Building on the idea of people coming from various backgrounds, it has been established that the industries around us today work in a dynamic and interdisciplinary way. One must be able to appeal to and communicate with a wide variety of audiences across different markets and industries. This is where the union of communication and data analysis is emphasized. Data visualization is instrumental in communicating research analysis and results comprehensively, but for this to translate into concrete action, it must be understood by the different stakeholders involved.
One of the problems technical people (usually scientists) have in communicating data is that they confuse summary statistics with storytelling. Thousands of infographics are being made, yet very few insights are drawn from them. A call to action is perhaps the most important part of our stories. So what? What do you want your audience to do with, or take away from, the story you share with them? As we’ve said time and time again, data analytics is not enough; we must be able to CONSUME the results and not just PRODUCE them.
How you tell a story and design your presentation is limited only by your imagination and will depend on the type of audience you have. There are many tools for creating a presentation, from Excel and PowerPoint to more sophisticated ones like Tableau and Gephi. It does not need to be the grandest presentation; what is important is that you are able to convey your message to the audience.
I now realize that a true data scientist is one who is in the middle of the action, someone who ties together the different aspects of a problem, whether it be the business, finance, technical, or even HR side of things, and who, after taking all of these into consideration, aims to solve the problem using a data-driven approach. These, to me, are radically different notions: the notion of simply outputting data that will “magically” solve a problem, versus realizing that we are first and foremost problem solvers, whose main weapon and tool is the data-driven approach that we bring into the fray. And I hope that the next batch will find this as exciting and as fulfilling as I do now.
One must also not forget that all of these data-driven initiatives have a potential impact on the client and on society in general. Most data scientists are so excited about the potential benefits of a project that they fail to think about the risks to data privacy and other ethical issues. If these issues are not addressed, the project will fail, no matter how revolutionary it is.
A student of the MSDS program must always remember this: despite their proficiency in the classroom, innate business acumen, immense daily pressure, and heavy workload, it is people’s lives that we are analyzing, it is an organization’s future that we are augmenting, and it is a society that we are making an impact on. Data science is a powerful tool whose impact many have not yet realized.
Beyond the real-world applications, I also greatly value the exposure we got to other components of data science such as AI Ethics, the Data Privacy Act, and Data Strategy. It is good that even early on, we were not automatically immersed in the technical side of things. The MSDS program ensured that we got a full picture of how data science relates to other fields. It's good that we are exposed to the business side of things, the legal side, and the human-centric side as well. I must say the MSDS program is turning out to be greater than expected. I look forward to our remaining months here.
Personally, this course has managed to set my expectations for my career as a data scientist. It is both a ride on a rough road and a slide on slippery ice, and I need to remind myself that data science is the car to cross the rainbow and not the actual pot of gold.