As a beginner, it is easy to get lost in the details and the sheer overwhelming nature of learning machine learning.
Cross the beginner’s block
I get a lot of emails from readers asking:
 How do I get started with Machine Learning?
 I do not have a background in math, how can I learn data science?
More often than not, the material in blog posts and courses is targeted at intermediate learners. But remember, it is easier to get started without the math. You will still need the math, but it can come later. Below is a step-by-step guide to get started, but remember...
Be Curious
When I first started with machine learning, I read anything that had data science or machine learning in the title. I often did not understand most of it, but slowly I built up chunks of knowledge that I later pieced together. The important skill here is to stay curious and keep believing.
Learn a tool
Do not get overwhelmed by the choice of tool. Just pick one! Beginners are often torn between R and Python. Here is a list of resources to get started with each.
Python
 Intro to Python for Data Science
 Google Developers Python Course
 Think Python: How to Think Like a Computer Scientist
 Real Python
 Codecademy: Python
 Head First Python
 Data Science from Scratch: First Principles with Python
R
 Hands-On Programming with R
 Google Tutorial
 Swirl Tutorial
 Data Camp tutorial
 Coursera R Programming
 R for Everyone
 Quick-R
 The Art of R Programming – A Tour of Statistical Software Design
 Beginning R: The Statistical Programming Language
 Advanced Analytics and Graphics (Addison-Wesley Data & Analytics Series)
 Machine Learning With R
Get your hands dirty
A good place to find data is the UCI Machine Learning Repository, which holds many small real-world datasets. Start with the simple Iris Data Set.
Learn to explore the data and try the following with your tool of choice. Preparing data for data science problems is an art in its own right. Below is a list of techniques you should try your hand at.
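If you picked Python, a first step could look like the sketch below. It parses text laid out like the Iris file (four measurements in cm followed by the species, no header row) using only the standard library. The six rows are typed in here for illustration, and the column names and the `load_rows` helper are my own naming, not something defined by the repository.

```python
import csv
import io

# A few rows in the layout of the Iris file: sepal length, sepal width,
# petal length, petal width (cm), species. They are typed in here for
# illustration; with the real file you would open it instead of using
# io.StringIO.
RAW = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
"""

# Column names are an assumption; the file itself has no header row.
COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]


def load_rows(text):
    """Parse CSV text into a list of dicts, converting measurements to float."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:  # skip blank lines
            continue
        row = dict(zip(COLUMNS, record))
        for name in COLUMNS[:-1]:
            row[name] = float(row[name])
        rows.append(row)
    return rows


rows = load_rows(RAW)
print(len(rows), "rows loaded")
print(rows[0])
```

Once the rows are plain dictionaries, every later step (subsetting, transforming, counting) is ordinary Python, which keeps the focus on the data rather than on a library.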

Wrangle
Start by dicing the data into subsets. Understand the variables and their types. Take a look at the variables that might impact the machine learning problem at hand.
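Here is a minimal sketch of what that can look like in plain Python. The `flowers` rows are hypothetical sample values in the spirit of the Iris data, and the 4.0 cm cut-off is an arbitrary choice for the example.

```python
from collections import Counter

# Hypothetical sample rows (petal measurements in cm), for illustration only.
flowers = [
    {"petal_length": 1.4, "petal_width": 0.2, "species": "setosa"},
    {"petal_length": 1.3, "petal_width": 0.2, "species": "setosa"},
    {"petal_length": 4.7, "petal_width": 1.4, "species": "versicolor"},
    {"petal_length": 4.5, "petal_width": 1.5, "species": "versicolor"},
    {"petal_length": 6.0, "petal_width": 2.5, "species": "virginica"},
]

# Understand the variables and their types by inspecting the first row.
variable_types = {name: type(value).__name__ for name, value in flowers[0].items()}

# Dice the data into one subset per species.
subsets = {}
for row in flowers:
    subsets.setdefault(row["species"], []).append(row)

# Subset by a condition on a numeric variable (4.0 is an arbitrary cut-off).
long_petals = [row for row in flowers if row["petal_length"] > 4.0]

# Count how many rows fall into each category.
counts = Counter(row["species"] for row in flowers)

print(variable_types)
print(dict(counts))
print(len(long_petals), "rows with petal_length > 4.0")
```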

Transform
Try simple data transformations like aggregation, decomposition (splitting the variables), and log transforms.
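The three transformations can be sketched in a few lines. The `orders` records below are made up for the example (they are not from any real dataset), and splitting a date into year and month is just one possible reading of decomposition.

```python
import math
from collections import defaultdict

# Made-up records, for illustration only.
orders = [
    {"date": "2015-01-14", "region": "north", "amount": 120.0},
    {"date": "2015-01-30", "region": "south", "amount": 15.0},
    {"date": "2015-02-02", "region": "north", "amount": 980.0},
    {"date": "2015-02-19", "region": "south", "amount": 45.0},
]

# Aggregation: total amount per region.
totals = defaultdict(float)
for order in orders:
    totals[order["region"]] += order["amount"]

# Decomposition: split the date variable into separate year and month variables.
for order in orders:
    year, month, _day = order["date"].split("-")
    order["year"] = int(year)
    order["month"] = int(month)

# Log transform: compress the wide range of amounts (values must be positive).
for order in orders:
    order["log_amount"] = math.log(order["amount"])

print(dict(totals))
print(orders[0])
```

The log transform is worth trying whenever a variable spans several orders of magnitude, because it pulls the large values closer to the rest.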

Visualize
A key part of solving data problems is understanding the data at hand. Visualization is a wonderful way to understand the data and the hidden gold in it.
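You would normally reach for a plotting library here, but even a text histogram can reveal structure. The sketch below is a rough stand-in that needs nothing beyond the standard library. The `text_histogram` helper and the sample petal lengths are hypothetical, and the helper assumes a whole-number bin width.

```python
def text_histogram(values, bin_width):
    """Return one line per bin, with a '#' for every value in that bin.

    bin_width is assumed to be a positive whole number.
    """
    counts = {}
    for value in values:
        bin_start = int(value // bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1

    lines = []
    # Walk every bin between the smallest and largest so gaps stay visible.
    for bin_start in range(min(counts), max(counts) + bin_width, bin_width):
        bar = "#" * counts.get(bin_start, 0)
        lines.append(f"{bin_start:>2} to {bin_start + bin_width:<2} | {bar}")
    return lines


# Hypothetical petal lengths in cm, for illustration only.
petal_lengths = [1.4, 1.4, 1.3, 1.5, 4.7, 4.5, 4.9, 4.0, 6.0, 5.1, 5.9, 5.6]

histogram = text_histogram(petal_lengths, 1)
for line in histogram:
    print(line)
```

The empty bins between the short and the long petals are the kind of hidden gold a plot makes obvious: the values fall into clearly separated groups.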

Question
Most data science problems come down to looking for answers. Practice asking questions and looking for the answers in the data.
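As an example of turning a question into code, take "can petal length alone tell setosa apart from the other species?". The sketch below answers it for a handful of hypothetical measurements. Both the numbers and the midpoint rule for the threshold are assumptions made for the example.

```python
# Hypothetical (species, petal length in cm) pairs, for illustration only.
measurements = [
    ("setosa", 1.4), ("setosa", 1.3), ("setosa", 1.5), ("setosa", 1.7),
    ("versicolor", 4.7), ("versicolor", 4.5), ("versicolor", 4.0),
    ("virginica", 6.0), ("virginica", 5.1), ("virginica", 5.6),
]

setosa = [length for species, length in measurements if species == "setosa"]
others = [length for species, length in measurements if species != "setosa"]

largest_setosa = max(setosa)
smallest_other = min(others)

# The answer is yes if every setosa petal is shorter than every other petal.
separable = largest_setosa < smallest_other

# If so, the midpoint of the gap is one reasonable threshold to use.
threshold = (largest_setosa + smallest_other) / 2 if separable else None

print("Separable by petal length alone:", separable)
print("Suggested threshold:", threshold)
```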
Applied Data Science Process
Understand the process behind solutions to data science problems. The most common approach to solving data science problems is as follows.
 Define the problem: Understand the problem that is being solved
 Analyze data: Analyze the data for patterns and information that could be used to develop a model.
 Data preparation: Prepare the data for modelling.
 Model: Start applying machine learning algorithms and validate.
 Evaluate: Evaluate the performance of the model and choose the best performing model.
 Deploy: Implement the model in production.
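To make the steps above concrete, here is a toy end-to-end sketch in plain Python. It uses a nearest-centroid classifier, which I picked only because it fits in a few lines. The data values, the function names, the 25% test split, and the seed are all assumptions made for the example.

```python
import random

# Define the problem: predict the species from two petal measurements.
# Analyze and prepare: hypothetical (petal_length, petal_width, species) rows.
DATA = [
    (1.4, 0.2, "setosa"), (1.3, 0.2, "setosa"),
    (1.5, 0.2, "setosa"), (1.7, 0.4, "setosa"),
    (4.0, 1.3, "versicolor"), (4.5, 1.5, "versicolor"),
    (4.7, 1.4, "versicolor"), (4.2, 1.3, "versicolor"),
    (6.0, 2.5, "virginica"), (5.9, 2.1, "virginica"),
    (5.6, 2.4, "virginica"), (6.1, 2.3, "virginica"),
]


def split(rows, test_fraction, seed):
    """Shuffle a copy of the rows and split it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_centroids(rows):
    """Model: compute the mean (length, width) point for each species."""
    sums = {}
    for length, width, label in rows:
        total = sums.setdefault(label, [0.0, 0.0, 0])
        total[0] += length
        total[1] += width
        total[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}


def predict(centroids, length, width):
    """Return the species whose centroid is closest to the given point."""
    def squared_distance(label):
        center = centroids[label]
        return (center[0] - length) ** 2 + (center[1] - width) ** 2
    return min(centroids, key=squared_distance)


def accuracy(centroids, rows):
    """Evaluate: fraction of rows whose species is predicted correctly."""
    hits = sum(predict(centroids, l, w) == label for l, w, label in rows)
    return hits / len(rows)


train, test = split(DATA, test_fraction=0.25, seed=0)
centroids = fit_centroids(train)
test_accuracy = accuracy(centroids, test)
print("Accuracy on held-out rows:", test_accuracy)

# Deploy: in production, predict() would sit behind whatever serves requests.
print(predict(centroids, 1.4, 0.2))
```

Holding some rows back for evaluation is the important habit here. A model scored only on the rows it was fitted to will look better than it really is.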
Practice, Practice, Practice
Once you have started learning the tools, gotten your hands on the data, and practiced the applied data science process, it is important to rinse and repeat this process on different datasets across different domains.
Diving Deep
As you start learning the tricks of the trade, it is important to get down to the details. The next step is to dive deeper into the algorithms and to understand why and how they work. Understand when one algorithm is better than another, and under what circumstances each performs best.
Summary
In this post you learned a step-by-step approach to learning data science, along with simple ways to learn and get better at applied data science.