I own copies of the following two books. They hold the answers to all the major concepts and come in handy whenever I need to explore one of those topics. Studied on their own, though, these topics are so raw and dry that you may not feel comfortable starting with them. Nevertheless, these books are a must-have if you are an aspiring data scientist:
“Applied Business Statistics: Making Better Business Decisions”
7th Edition by Ken Black
~850 pages
“Statistics for Business and Economics”
12th Edition by Anderson, Sweeney, Williams, Camm, Cochran
~1,088 pages
Just add up the pages: that's 1,900+, and imagine how boring it can get with all that theory and math. These were my forward learning books. I did go through them, but I couldn't quite get a good grip on the concepts, as they seemed too technical and deep to understand with all the mathematics and statistics constantly around you. You can read about forward and backward learning in my previous post, which outlines the difficulties you go through when you come across terms you have never heard before.
The real breakthrough (I would say this is where it all began) in developing an interest in understanding the concepts came when I searched the web for a book that would explain analytical concepts in an easy-to-understand way, along with the mathematics and statistics of course (there is no data science without them). I will make it simpler for you by giving out the name of the book:
“Programming Collective Intelligence: Building Smart Web 2.0 Applications”
1st Edition by Toby Segaran
~360 pages
Don't get confused by the phrase "Web 2.0 applications": the book focuses on machine learning and data science in a constructive manner. I have a PDF version of the book at present, and I print out chapters and go through them. I am planning to get a hard copy at a later stage, as it would be easier to handle and carry along. The book was published in 2007 (10 years ago), yet the topics read as if they were published yesterday. It's probably the first book you should get started with. The first chapter will make you comfortable, and before you know it you are sailing through the second chapter, working with Euclidean distance and Pearson correlation. The book's examples are written in Python, so if you do not have Python installed on your system, better get started now.
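To give a feel for what "Euclidean & Pearson" means in practice, here is a minimal sketch of the two similarity scores in plain Python. This is my own illustration, not the book's code: the `ratings` data, the names in it, and the function names are all made up, and mapping the Euclidean distance to a 0-to-1 score with `1 / (1 + distance)` is just one common convention I am assuming here.

```python
from math import sqrt

# Hypothetical ratings, invented purely for illustration.
ratings = {
    "alice": {"film_a": 4.0, "film_b": 3.0, "film_c": 5.0},
    "bob":   {"film_a": 2.0, "film_b": 1.5, "film_c": 2.5},
    "carol": {"film_a": 5.0, "film_b": 1.0, "film_c": 2.0},
}


def shared_items(prefs, a, b):
    """Return the items that both a and b have rated."""
    return [item for item in prefs[a] if item in prefs[b]]


def euclidean_similarity(prefs, a, b):
    """Score in (0, 1]: 1 means identical ratings, lower means further apart."""
    items = shared_items(prefs, a, b)
    if not items:
        return 0.0
    distance = sqrt(sum((prefs[a][i] - prefs[b][i]) ** 2 for i in items))
    return 1.0 / (1.0 + distance)


def pearson_similarity(prefs, a, b):
    """Pearson correlation in [-1, 1] over the items both people rated."""
    items = shared_items(prefs, a, b)
    n = len(items)
    if n == 0:
        return 0.0
    xs = [prefs[a][i] for i in items]
    ys = [prefs[b][i] for i in items]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined when one side never varies
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    for other in ("bob", "carol"):
        print(other,
              euclidean_similarity(ratings, "alice", other),
              pearson_similarity(ratings, "alice", other))
```

In this made-up data, bob rates every film at exactly half of what alice gives it. The Euclidean score treats them as fairly far apart, while the Pearson score comes out at essentially 1, because Pearson only cares whether the ratings move together and not about the scale each person uses.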
There are a lot of good reviews of the book, and I am planning to go through all the chapters and blog about the topics I find interesting. I turn to Ken Black and to Anderson, Sweeney et al. when I feel I need more input on a concept explained in Programming Collective Intelligence (PCI), thereby actually putting backward learning into practice.
If you are reading PCI or have read it before, let me know in the comments section which part of the book you felt was the best.