I was emboldened to write this book after my video series, Data Science With Julia[1], got some traction, and all the more so after a tweet about Decision Tree[2] was liked by Julia Language itself. So I thought, why not give it more?

This book should be seen as my attempt to explain Data Science to myself and nothing more. Whether this book will rise to professional stature remains to be seen.

Front Cover

The front cover showcases a scene from Indian mythology where the forces of Good (Devars) and those of Evil (Asurans) churn the cosmic ocean, revealing things like poison and elixir. In a similar way, a Data Scientist can churn the vast amount of data at their disposal for a good purpose, like finding a new drug that might work, or for an evil purpose, like tracking someone and invading their privacy.

Back Cover

We thank Richard M. Stallman and the Free Software Foundation for making this book possible.

1. What you need to know

1.1. GNU/Linux

You need to know GNU/Linux. If you have not used it, one of the best places to learn it is

1.2. Math

Data Science is a place where data processing meets computer science. Computers are very fast at math, and data science is math. To learn math, one can look into the courses offered by Khan Academy[3]. One can go through these courses:

  • Precalculus

  • Calculus

  • Matrix

  • Probability and Statistics

One can also read the book Mathematics for Machine Learning [mml].

2. What you need to have

A decently powered computer might be needed to run the programs in this book. I would say it's preferable to have a GNU/Linux[4] machine so that you can explore the field of data science.


3. What is Data Science?

There is a lot of data; in fact, we have coined the term data explosion, and more often than we realize, this data indicates something valuable. With data, people have become great stock traders[5], and they have made machines win competitions that it was thought only humans could win[6].