Data analysis is hard mainly because few people can describe exactly how to do it. This has nothing to do with how regularly they do data analysis; the process by which we state a question, explore data, model it formally, interpret the results, and then communicate our findings is simply difficult to generalize and to abstract. Fundamentally, data analysis is an art that cannot easily be automated.
Data analysts have many tools at their disposal, from linear regression to classification trees to random forests, and these tools are carefully implemented on computers. Ultimately, though, it takes a data analyst to find a way to assemble these tools and apply them to data to answer the questions that are of real interest to people.
This book describes the process of analyzing data in simple terms. It distills the authors' extensive experience in both managing and conducting their own data analyses, together with their careful observation of what produces results and what fails to produce useful insights into data, in a format that applies to both practitioners and managers in data science.
The book is a worthy read not only for those who aspire to become data scientists but also for those who simply wish to understand how data analysis works and how this art can be cultivated.
About The Book:
The Art of Data Science does not spend its pages on the latest technologies; it focuses instead on the enduring fundamentals of data analysis.
Despite its topical title, the guide contains no exercises and only a few snatches of R code. It is not meant to be used as a handbook for data analysis; it aims instead to teach readers how to think like a productive data analyst so that they can put that knowledge to full use.
Along the way, Peng and Matsui break the analysis of data into a list of core activities, which begins with defining the question and ends with communicating the results. They describe an "epicycle of data analysis": a pattern of thinking and acting that is repeated in every core activity. The authors also explain how an analyst will often cycle through this pattern several times during a single activity, and how the process will very often send the analyst back to earlier steps.
The book is a quick read that is worth a look, and it is offered at a pay-what-you-like price, so no one has an excuse for not getting their hands on it.
By the end of this broad, deep book, readers will have learned that data analysis in the real world is like playing Snakes and Ladders on a board with no ladders.
About The Authors:
Roger D. Peng is an Associate Professor at the Johns Hopkins Bloomberg School of Public Health. He works in environmental biostatistics and researches the effects of air pollution and climate change on health. He is also a co-founder of the Johns Hopkins Data Science Specialization, which is said to have enrolled over 1.5 million students.
Elizabeth Matsui is a Professor of Pediatrics at the Johns Hopkins University School of Medicine. Together with Roger D. Peng, she runs a data management and analysis center that supports clinical trials and epidemiologic studies.
Image Source: Learndatasci