The idea that computers can be programmed to learn, that you can write code that lets a program understand some aspect of the world through data, is nothing short of amazing.

Many years ago, when I was about fourteen and working at an event for a magazine I was publishing at the time, I got to know a computer science graduate who was working on his final year project. As we were talking, his laptop sat on a step, running a program that kept printing endless strings of characters as it worked toward a result. Obviously, I was curious and had to ask him about it.

He was using the biological principles of crossover and mutation to write algorithms that could “evolve” a dataset into a valid solution. If you're immediately wondering what the point of that is: genetic algorithms are often used in fields like robotics, for example to train a robot to behave like a human, or to optimize things like shipping or traffic routes.

It instantly resonated with me.

At the time, I had a rudimentary understanding of evolution, mostly based on exercises about cross-pollinating pea plants to breed different kinds of peas. The idea of using basic principles of Darwinian evolution to write code that can solve problems by programmatically evolving data blew me away. Six years later, some residual effect of that moment led me to study Bioinformatics, a field at the intersection of genetics and computer science.


As with most things you learn, you can develop an understanding and retain the core concepts, but the practice and implementation of those concepts, and the practical workflows around them, are constantly evolving. I'm hoping to use this document to keep track of relevant tools, a full stack of machine learning if you will, organized by a few higher-level concepts.

Learn machine learning

It's easy to get overwhelmed trying to get a practical grasp of machine learning. Outcomes are rarely predictable, and it's seldom the case that you can model something and expect it to behave exactly as you predicted. Creating and tuning models is an art that requires a thorough understanding of the tools at your disposal and the statistical modelling behind them. You'll often adapt your techniques as you explore the problem space you're trying to solve for.

What works for me has always been repeatedly applying what I learn in projects until I get a good feel for not just the underlying concepts, but how they are handled in software.

It's also tough to learn everything about such a vast and rapidly evolving topic. Ideally, once you feel you've got a sound introduction to machine learning, figure out what specific area(s) you want to specialize in and start doing your own research to understand how to use them in practice in a reasonably current way.

Areas of machine learning (Source)

I'm going to try and organize this sparse collection of knowledge with enough context and information for it to make sense to anyone reading. When in doubt, Google...

What is machine learning?

It's a sub-area of artificial intelligence that allows computers to self-learn without having to be explicitly programmed. Machine learning aims to understand patterns in large sets of input data and then predict outputs based on the models it generates.

Workflow

The machine learning workflow (Source)

What is a machine learning algorithm?

Machine learning employs algorithms that can learn from and make predictions on data. These are typically borrowed from statistics and range from simple regression algorithms to decision trees and more.

Here's a good resource to learn more about different ML algorithms and where to use them: Essentials of Machine Learning Algorithms.

What is a machine learning model?

Generally, it refers to the model artifact (the solution) created after training an ML algorithm. Once you have a properly trained ML model you can use it to make predictions for new inputs. The goal of machine learning is to properly train ML algorithms to create such models. When I say 'models' in this post I'm always referring to this definition.

That said, there really isn't a single consistent definition of the term 'model' within the ML community. The term gets thrown around a lot and can refer to anything from statistical models to the data models used in ML (columns, data types, and sources) or even specifications of neural nets. Be wary of this when you read up on ML across technical and mathematical guides.
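To make that definition concrete, here's a minimal sketch of the algorithm-versus-model distinction. It uses scikit-learn and joblib purely for illustration (this post doesn't prescribe a library, so treat those choices, and the Iris dataset, as my own): an algorithm is trained on data, and the resulting model artifact is what you keep and reuse for predictions.

```python
# A minimal sketch of "train an algorithm, keep the model artifact" using
# scikit-learn and joblib (library and dataset choices are illustrative only).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
import joblib

X, y = load_iris(return_X_y=True)               # labeled example data

algorithm = LogisticRegression(max_iter=1000)   # the learning algorithm
model = algorithm.fit(X, y)                     # training produces the model artifact

joblib.dump(model, "iris_model.joblib")         # the artifact can be saved...
restored = joblib.load("iris_model.joblib")     # ...loaded back later...
print(restored.predict(X[:3]))                  # ...and used to predict labels for new inputs
```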


ML algorithms

There are a lot of them, and each one has its own set of appropriate use cases. You can classify ML algorithms by learning style or by similarity. The diagram below (open it in a new tab) does a great job of summarising the popular ones by similarity. For the purposes of this post, I'll group them based on learning style: supervised and unsupervised learning.

Supervised vs. unsupervised learning

Supervised learning
This is where the machine learning algorithm is trained using example scenarios. The training data comes tagged with known labels that allow the algorithm to build a model from it. Once the model is trained sufficiently, it will be able to determine the labels for unseen instances.

Problems solved with supervised learning can be further broken down into classification and regression problems.
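As a rough sketch of the classification case (scikit-learn and the Iris dataset are used here only as an example), labeled training data teaches the algorithm, and a held-out test set stands in for the unseen instances:

```python
# A small supervised classification sketch: train on labeled examples,
# then predict labels for instances the model hasn't seen (illustrative only).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)                   # features plus known labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)  # learn from labeled data
print(clf.predict(X_test[:5]))                      # labels for unseen instances
print(clf.score(X_test, y_test))                    # accuracy on the held-out set
```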

Unsupervised learning
In contrast to supervised learning, unsupervised learning uses training data that is not labeled. This essentially means the algorithm figures out how to make sense of the data (recognize patterns in it) on its own.

Unsupervised learning can be grouped into clustering and association problems.
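For clustering, here's a minimal sketch on made-up, unlabeled two-dimensional data; k is set to 3 only because the data is generated around three centres, and in practice you'd have to pick or tune it yourself:

```python
# A tiny clustering sketch with k-means on made-up, unlabeled 2D data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# three blobs of points; note that no labels are given to the algorithm
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 2)) for c in (0, 5, 10)])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])        # cluster assignments discovered from the data alone
print(kmeans.cluster_centers_)    # the three centres it found
```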

Semi-Supervised learning
This is a mix of the two previous approaches - only some of the input data is labeled.
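A rough way to see this in code (scikit-learn's LabelPropagation, with labels artificially hidden, is just one possible illustration): unlabeled examples are marked with -1 and the algorithm spreads the known labels onto them.

```python
# A semi-supervised sketch: hide ~70% of the Iris labels (marked as -1),
# then let LabelPropagation infer them from the labeled minority.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelPropagation

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
mask = rng.random(len(y)) < 0.7          # pretend ~70% of the labels are missing
y_partial = np.copy(y)
y_partial[mask] = -1                     # -1 marks an unlabeled example

model = LabelPropagation().fit(X, y_partial)
recovered = model.transduction_[mask]    # labels the algorithm assigned to the hidden points
print((recovered == y[mask]).mean())     # fraction of hidden labels recovered correctly
```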


Linear Regression for Supervised Learning

This is essentially the "Hello World" tutorial for machine learning. Linear regression is used to understand the relationship between input (x) and output (y) variables. When there is only one input variable (x), it's called simple linear regression. You've probably seen this technique used in simple statistics.

The most common technique used to train a linear regression equation is called Ordinary Least Squares. So, when we use this process to train a model in machine learning it's usually referred to as Ordinary Least Squares Linear Regression.

A simple regression model for input (x) and output (y) can be modeled as such:

y = B0 + B1*x

The coefficient B1 is an estimate of the regression slope, and the additional coefficient B0 estimates the regression intercept, giving the line an additional degree of freedom.
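To make that concrete, here's a small sketch of the OLS estimates on made-up data; the "true" intercept and slope below are arbitrary, and plain NumPy is used rather than any ML library:

```python
# Ordinary Least Squares for simple linear regression, y = B0 + B1*x,
# on made-up data (true B0 = 3 and B1 = 2 are arbitrary choices).
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=100)

B1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)  # slope estimate
B0 = y.mean() - B1 * x.mean()                                               # intercept estimate
print(B0, B1)            # should land close to 3 and 2

y_pred = B0 + B1 * x     # the fitted line can now predict y for new x
```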

Follow this tutorial to learn four techniques used to prepare a linear regression model: Simple Linear Regression, Ordinary Least Squares, Gradient Descent and Regularization.
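For contrast with the closed-form OLS solution above, here's a rough gradient descent version of the same fit; the learning rate and iteration count are arbitrary guesses rather than tuned values:

```python
# Fitting y = B0 + B1*x by gradient descent on the mean squared error,
# using the same kind of made-up data as the OLS sketch above.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=100)

B0, B1 = 0.0, 0.0           # start from zero
lr, n_iters = 0.01, 5000    # learning rate and iterations (arbitrary choices)

for _ in range(n_iters):
    error = (B0 + B1 * x) - y
    B0 -= lr * 2 * error.mean()          # gradient of the MSE w.r.t. the intercept
    B1 -= lr * 2 * (error * x).mean()    # gradient of the MSE w.r.t. the slope

print(B0, B1)    # should converge toward the OLS estimates (roughly 3 and 2)
```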

You'll soon notice that a lot of machine learning these days is just different ways of curve fitting using basic statistics. Machine learning (at least in my opinion) only gets really exciting when you step into the world of deep learning.


Deep learning

This is a sub-field of machine learning that's shown a lot of promise in recent years. It's concerned with algorithms that are based on the structure and function of neurons in the brain.

One of the most exciting features of deep learning is its performance in feature learning; the algorithms are particularly good at detecting features from raw data. A good example is a deep learning model's ability to pick out the wheels in an image of a car.

The diagram below illustrates the difference between typical machine learning and deep learning:

Machine learning vs. deep learning

Deep learning models usually consist of multiple layers. They typically combine simpler representations to build more complicated ones by passing data along from one layer to the next, which is one of the primary reasons deep learning outperforms other learning algorithms as the amount of data increases.
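To make the layer-to-layer idea concrete, here's a toy forward pass in plain NumPy; the weights are random (untrained) and the layer sizes are arbitrary, so this only shows how data flows through stacked layers, not how they learn:

```python
# A toy forward pass through three stacked fully connected layers.
# Weights are random and untrained; the point is just the data flow.
import numpy as np

rng = np.random.default_rng(0)

def layer(inputs, out_dim):
    """One fully connected layer: linear transform followed by a ReLU."""
    W = rng.normal(size=(inputs.shape[-1], out_dim))
    b = np.zeros(out_dim)
    return np.maximum(0, inputs @ W + b)

x = rng.normal(size=(1, 8))    # raw input features
h1 = layer(x, 16)              # the first layer's output becomes...
h2 = layer(h1, 16)             # ...the second layer's input, and so on
out = layer(h2, 1)             # the final layer produces the prediction
print(out)
```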

For a definitive introduction to deep learning, read The Deep Learning Book, available online for free through MIT Press.


TensorFlow

TensorFlow is a Python library for fast numerical computing that was designed specifically for machine learning. It was open-sourced by Google with the hope of putting deep learning capabilities in the hands of a lot more researchers and developers around the world.

The official tutorials can be somewhat confusing to a beginner, so I recommend starting with this series to get what the writer calls the gentlest introduction to TensorFlow.

How to use TensorFlow

Once installed, TensorFlow provides multiple APIs for training ML models. The higher-level APIs, built on top of what is called TensorFlow Core (the lowest-level API, which gives you the most control), are the easiest to learn and should be where you start.
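As a small taste of those higher-level APIs (and not a substitute for the tutorials below), here's a rough sketch using the Keras API that ships with TensorFlow; the made-up data, layer sizes, and training settings are all arbitrary choices:

```python
# A minimal sketch of training a tiny model with TensorFlow's high-level Keras API.
# The data is made up and the architecture, optimizer, and epochs are arbitrary.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")        # a made-up binary target

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=10, verbose=0)

print(model.predict(X[:3]))    # predicted probabilities for new inputs
```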

It's counterintuitive to include a full TensorFlow tutorial within this post when there already exist countless resources online that do it perfectly well... start with the official one:

Getting started with TensorFlow

While TensorFlow is the most popular machine learning library, there are several great alternatives like Torch (used by Facebook), Caffe (a deep learning framework from Berkeley AI Research), and many more.


What next?

Once you understand the basics thoroughly, you should have some idea of where your interest in machine learning lies: do you want to use it in your app, or for research?

Based on your interests, you should be able to delve deeper into any of the different areas by following the links and topics in this post, or dig up what you need with a few Google searches.

Machine learning is a tough topic to master. But if you've read up to this point, chances are you'd agree it's an extremely valuable asset to have by your side. The hard part is getting a good foundation in machine learning. After that, it's a matter of knowing what you want to accomplish and iterating your way to solutions.

“Torture the data, and it will confess to anything.” – Ronald Coase

Be cautious when you apply ML in the field - given the inherent nature of these algorithms, it can sometimes be hard to tell whether an algorithm reached a conclusion by following a meaningful, replicable set of steps or simply arrived at a result that 'smells right' through a faulty process.

Questions, thoughts, and feedback? Tweet me at @devudara.

References: