As a beginner, it is easy to get lost in the details and the sheer overwhelming nature of learning machine learning.
Cross the beginner’s block
I get a lot of emails from readers asking:
- How do I get started with Machine Learning?
- I do not have a background in math, how can I learn data science?
More often than not, the material in blog posts and courses is targeted at intermediate learners. But remember, it is easier to get started without the math. You will still need the math, but it can come later. Below is a step-by-step guide to get started, but remember...
When I first started with machine learning, I read anything that had data science or machine learning in the title. I often did not understand most of it, but slowly I built up chunks of knowledge, which I later pieced together. The important skill here is to be curious and to believe in yourself.
Learn a tool
Don't get overwhelmed by the choice of tool. Just pick one! Beginners are often torn between R and Python. Here is a list of resources to get started.
- Google Developers Python Course
- Think Python: How to Think Like a Computer Scientist
- Real Python
- Data Science from Scratch: First Principles with Python
Get your hands dirty
Learn to explore the data, and try the following with your tool of choice. Preparing data for data science problems is an art in its own right. Below is a list of techniques you should try your hand at.
Start by dicing the data into subsets. Understand the variables and their types. Take a look at the variables that might impact the machine learning problem at hand.
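To make this concrete, here is a minimal sketch in plain Python, assuming a tiny made-up housing dataset. The column names, values, and the helper functions `variable_types` and `subset` are all hypothetical; a library such as pandas offers the same operations once you are comfortable with the basics.

```python
# Hypothetical toy dataset: every name and value here is invented for illustration.
rows = [
    {"region": "north", "rooms": 2, "area_sqft": 650.0, "price": 48000},
    {"region": "north", "rooms": 3, "area_sqft": 900.0, "price": 71000},
    {"region": "south", "rooms": 1, "area_sqft": 400.0, "price": 52000},
    {"region": "south", "rooms": 3, "area_sqft": 1100.0, "price": 125000},
    {"region": "east", "rooms": 2, "area_sqft": 700.0, "price": 39000},
]


def variable_types(rows):
    """Map each variable name to the Python type name of its value in the first row."""
    return {name: type(value).__name__ for name, value in rows[0].items()}


def subset(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


types = variable_types(rows)
south = subset(rows, lambda r: r["region"] == "south")
large = subset(rows, lambda r: r["rooms"] >= 3)

print(types)
print(len(south), "rows in the south;", len(large), "rows with 3+ rooms")
```

Looking at each subset separately is often the quickest way to spot which variables might matter for the problem.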
Try simple data transformations such as aggregation, decomposition (splitting the variables), and log transforms.
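The three transformations could look something like the sketch below. It uses only the Python standard library and an invented `sales` dataset, so the field names and numbers are assumptions made purely for illustration.

```python
import math
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical sales records; names and values are invented for illustration.
sales = [
    {"sold_on": "2021-01-15", "store": "A", "amount": 120.0},
    {"sold_on": "2021-01-20", "store": "B", "amount": 80.0},
    {"sold_on": "2021-02-03", "store": "A", "amount": 1000.0},
    {"sold_on": "2021-02-11", "store": "B", "amount": 10.0},
]

# Aggregation: average amount per store.
by_store = defaultdict(list)
for row in sales:
    by_store[row["store"]].append(row["amount"])
avg_by_store = {store: mean(amounts) for store, amounts in by_store.items()}

# Decomposition: split one date string into year, month and weekday variables.
for row in sales:
    d = date.fromisoformat(row["sold_on"])
    row["year"], row["month"], row["weekday"] = d.year, d.month, d.weekday()

# Log transform: compress a skewed variable (base 10 here, an arbitrary choice).
for row in sales:
    row["log_amount"] = math.log10(row["amount"])

print(avg_by_store)
print(sales[2])
```

Each transformation produces a new variable from existing ones, which is why it helps to keep the original columns around while you experiment.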
A key part of solving data problems is understanding the data at hand. Visualization is a wonderful way to understand the data and the hidden gold in it.
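A plotting library is the usual choice for this, but even a text histogram shows the shape of a variable. The sketch below assumes a made-up list of ages, and the helpers `histogram` and `render` are hypothetical names introduced only for this example.

```python
from collections import Counter

# Hypothetical ages; the values are invented for illustration.
ages = [22, 25, 27, 31, 33, 34, 35, 38, 41, 44, 45, 47, 52, 58, 63]


def histogram(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def render(counts, bin_width):
    """Turn bin counts into one line of text per bin, with one '#' per value."""
    return [
        f"{start:>3}-{start + bin_width - 1:<3} {'#' * n}"
        for start, n in counts.items()
    ]


counts = histogram(ages, bin_width=10)
for line in render(counts, bin_width=10):
    print(line)
```

Try changing `bin_width` and see how the picture changes; choosing a sensible bin size is part of learning to read a histogram.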
Most data science problems come down to looking for answers. Practice asking questions and looking for the answers in the data.
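As a small illustration, suppose the question is "do customers who received a discount spend more on average?". The sketch below assumes an invented `orders` dataset, and `average_spend` is a hypothetical helper, not part of any library.

```python
from statistics import mean

# Hypothetical orders; the values are invented for illustration.
orders = [
    {"discount": True, "spend": 54.0},
    {"discount": True, "spend": 61.0},
    {"discount": False, "spend": 40.0},
    {"discount": True, "spend": 47.0},
    {"discount": False, "spend": 52.0},
    {"discount": False, "spend": 37.0},
]


def average_spend(orders, discount):
    """Average spend over the orders whose discount flag matches."""
    return mean(o["spend"] for o in orders if o["discount"] == discount)


with_discount = average_spend(orders, True)
without_discount = average_spend(orders, False)
difference = with_discount - without_discount

print(f"with discount: {with_discount:.1f}")
print(f"without discount: {without_discount:.1f}")
print(f"difference: {difference:.1f}")
```

A difference like this in six rows proves nothing on its own, but it tells you which question to ask next, and that is the habit worth practicing.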
Applied Data Science Process
Understand the process behind solutions to data science problems. The most common approach to solving data science problems is as follows.
- Define the problem: Understand the problem that is being solved.
- Analyze data: Analyze the data for patterns and information that could be used to develop a model.
- Data preparation: Prepare the data for modelling.
- Model: Start applying machine learning algorithms and validate.
- Evaluate: Evaluate the performance of the model and choose the best performing model.
- Deploy: Implement the model in production.
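The steps above can be sketched end to end on a toy problem. Everything in this example is an assumption made for illustration: the data is invented, the two candidate models (a mean baseline and a least-squares straight line) are just simple stand-ins, and "deploy" is reduced to exposing a `predict` function.

```python
from statistics import mean

# 1. Define the problem: predict an exam score from hours studied.
#    Hypothetical data, invented for illustration: (hours, score) pairs.
data = [(1, 52), (2, 55), (3, 61), (4, 64), (5, 70), (6, 73), (7, 79), (8, 82)]

# 2. Analyze the data: look at simple summaries first.
hours = [h for h, _ in data]
scores = [s for _, s in data]
print("hours:", min(hours), "to", max(hours), "| mean score:", mean(scores))

# 3. Prepare the data: hold out every fourth point for evaluation.
train = [p for i, p in enumerate(data) if i % 4 != 3]
test = [p for i, p in enumerate(data) if i % 4 == 3]


# 4. Model: fit two simple candidates on the training data only.
def fit_mean(points):
    """Baseline that always predicts the average training score."""
    avg = mean(y for _, y in points)
    return lambda x: avg


def fit_line(points):
    """Least-squares straight line through the training points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


# 5. Evaluate: compare candidates on the held-out points and keep the best.
def mean_absolute_error(model, points):
    """Average absolute gap between predictions and true values."""
    return mean(abs(model(x) - y) for x, y in points)


candidates = {"mean baseline": fit_mean(train), "straight line": fit_line(train)}
errors = {name: mean_absolute_error(m, test) for name, m in candidates.items()}
best_name = min(errors, key=errors.get)
print(errors, "-> best:", best_name)

# 6. Deploy: here this only means exposing the chosen model as a function.
predict = candidates[best_name]
print("predicted score for 10 hours:", predict(10))
```

Comparing against a trivial baseline is a useful habit: if a model cannot beat "always predict the average", it is not yet worth deploying.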
Practice, Practice, Practice
Once you have started learning the tools, gotten your hands on the data, and practiced the applied data science process, it is important to rinse and repeat this process on different datasets across different domains.
As you start learning the tricks of the trade, it is important to get down to the details. The next step is to dive deeper into the algorithms and understand how and why they work. Understand when one is better than another, and under what circumstances each performs best.
In this post you will learn a step-by-step approach to learning data science, along with simple ways to learn and get better at applied data science.