A Programming Guide with Probability and Statistics

Aug. 6, 2017, 10:24 a.m. By: Prakarsh Saxena

Think Stats

Probability and statistics underpin much of data science and machine learning, and interest in both fields has grown enormously in recent years. Turning the concepts into working code, however, can be tricky and often takes careful thought. Allen B. Downey has written a book in his ‘Think’ series to address exactly this problem.

Think Stats

Think Stats is one of the books in the ‘Think’ series authored by Allen B. Downey and published by O’Reilly Media. It covers probability and statistics for Python programmers. The book helps readers translate the concepts into Python code, using simple techniques to explore real data sets and answer interesting questions. It contains a case study using data from the National Institutes of Health, and it asks readers to work on real-life projects using realistic datasets.

Of course, readers need basic Python skills to make full use of the book. Think Stats is built around a Python library for probability distributions, namely Probability Mass Functions (PMFs) and Cumulative Distribution Functions (CDFs), which includes tools to represent and plot them. Many of the exercises involve short programs that readers run as experiments to develop a deeper understanding of the topic.
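To give a feel for what a PMF and a CDF look like in code, here is a minimal sketch in plain Python. It does not use the book's own library; the function names make_pmf and make_cdf and the sample data are assumptions made up for illustration.

```python
from bisect import bisect_right
from collections import Counter


def make_pmf(values):
    """Map each distinct value to its relative frequency (hypothetical helper)."""
    counts = Counter(values)
    total = len(values)
    return {x: n / total for x, n in sorted(counts.items())}


def make_cdf(values):
    """Return a function cdf(x) giving the fraction of values <= x (hypothetical helper)."""
    ordered = sorted(values)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return cdf


# Made-up sample data, not taken from the book
sample = [1, 2, 2, 3, 5]

pmf = make_pmf(sample)
cdf = make_cdf(sample)

print(pmf[2])   # share of the sample equal to 2
print(cdf(2))   # share of the sample less than or equal to 2
```

The PMF answers "how often does this exact value occur?", while the CDF answers "what fraction of the data is at or below this value?".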

Other important topics include outliers, conditional probability, plotting histograms, different types of distributions (e.g. exponential, Pareto, normal), Bayes’ theorem, hypothesis testing and estimation.

What sets the book apart from others is its inclusion and careful explanation of Bayesian statistics for programmers, a subject the author feels is too important to neglect. By taking advantage of the PMF and CDF libraries, even beginners can learn the concepts and solve challenging problems with them.
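As a rough illustration of how a PMF-style mapping can support Bayesian reasoning, the sketch below applies Bayes’ theorem to a dictionary of hypotheses. The function bayes_update, the coin hypotheses and all the numbers are assumptions invented for this example; they are not taken from the book or its library.

```python
def bayes_update(prior, likelihood):
    """Apply Bayes' theorem: posterior is proportional to prior times likelihood.

    prior and likelihood are dicts keyed by hypothesis (hypothetical helper).
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("the data has zero probability under every hypothesis")
    # Dividing by the total makes the posterior probabilities sum to 1
    return {h: p / total for h, p in unnormalized.items()}


# Made-up scenario: a coin is either fair or biased towards heads
prior = {"fair": 0.5, "biased": 0.5}

# Probability of observing heads under each hypothesis (assumed values)
likelihood_heads = {"fair": 0.5, "biased": 0.75}

posterior = bayes_update(prior, likelihood_heads)
print(posterior)
```

After one observed head, the biased hypothesis becomes more probable than the fair one. Calling bayes_update again with the posterior as the new prior would fold in further observations.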

About the Author

Allen B. Downey is an American computer scientist and Professor of Computer Science at Franklin W. Olin College of Engineering in Needham, Massachusetts. After receiving his BS and MA in Civil Engineering from MIT, he received his PhD in Computer Science from the University of California, Berkeley. He started his career as a Research Fellow at the San Diego Supercomputer Center and was associated with Colby College, Wellesley College and Boston University before becoming a professor of Computer Science at Olin College. He has also served as a Visiting Scientist at Google Inc.

Downey has published several books, which are freely available online from Green Tea Press. ‘Think Python’, ‘Think Bayes’ and ‘Think Stats’ are among the most popular in the collection and have continued to inspire programmers to reach new levels in development. His latest book, ‘Think DSP’, was launched in August 2016.

PDF Link : Think Stats - Probability and Statistics for Programmers (Version 1.6.0)

PDF Link : Think Stats - Exploratory Data Analysis in Python (Version 2.0.35)