This is the story of a software engineer's head-first dive into the "deep" end of machine learning. Amplifr's CTO shares notes from his ongoing journey from Ruby developer to deep learning enthusiast and offers tips on starting from scratch and making the most of a life-changing experience.
Let's start with an imaginary portrait and see if you recognize yourself or someone you know.
You are a software engineer who works with code daily, building complex things, turning business requirements into application logic and shipping mostly on time. You have tried your hand at different programming languages and chosen your main weapon. You're pretty confident in what you're doing and you're ready to learn something entirely new, something that will give you new powers and keep you relevant in the profession.
You keep hearing about the "new electricity" that is artificial intelligence. Whole industries are being transformed by advances in machine learning, while the most optimistic researchers compare the rise of AI to the industrial revolution.
That revolution is happening in your lifetime and you want to take part. You feel an itch that you do not know how to scratch. You want to learn more about AI / ML / DL and experiment with it. You have no idea where to start.
Hi, my name is Alexey and I was that person two years ago. Today I oversee machine learning at Amplifr, a social media management startup where I serve as CTO. I'm still very involved with "traditional" code: after all, we're a Ruby on Rails application. When I was not working on our core code base, I took part in machine learning competitions (and recently won an NLP competition), attended AI conferences, read scientific papers, ran experiments every day, and explored how I could apply machine learning to make Amplifr stand out from the competition (more on this in our future posts).
A general rule of thumb: all modern artificial neural networks (RNN, CNN, DBN, DQN; there is always an "N", as in "Network") belong to deep learning.
First, let's set the record straight. The "correct" use of the terms Artificial Intelligence, Machine Learning and Deep Learning is a topic of endless debate that quickly becomes heated and confusing for beginners. To keep things simple, we consider ML a set of tools that come from AI, and DL a particular subset of them. I will use all three terms somewhat interchangeably, but mainly in the context of deep learning.
Since I still remember well what it is like to be a newcomer, I want to offer some advice to those who are starting to explore the new field.
1. Don't panic
Suppose you are not much of a math person. Personally, I graduated from a technical university eight years ago and had not opened a math book since (at least not before starting with DL). You know how it goes: you read your language or framework documentation more often than anything else.
After some initial research on Google and talking to the more mathematically inclined people around you, you get an impression of the amount of knowledge you need to accumulate before trying to solve real-world problems with neural networks. At least, that was my impression two years ago:
- Get a good grasp of linear algebra. Textbooks. Operations with matrices. Pen and paper, rows of numbers in tall square brackets.
- Embrace probability theory. More textbooks. The Reverend Thomas Bayes is your friend again: P(A|B) = P(B|A) P(A) / P(B), and similar.
- Study all the classic ML concepts, starting with linear regression.
- Learn how to implement all those algorithms in Python, C, C++ or Java.
- Find out how to prepare data sets, extract features, fine-tune hyperparameters and develop an intuition for which algorithm fits the task at hand.
- Get familiar with DL frameworks / libraries (in my time, it was Theano and Torch, now it's probably PyTorch, TensorFlow and Keras).
Dogs vs. Cats and Dogs vs. Cats Redux are famous Kaggle competitions on image classification.
And only after having learned all this, according to some experts, would you be fit to solve a practical problem such as telling cats from dogs.
If you're anything like me, the list above is enough to shrink your ego, and this leads to nothing but sweet, sweet procrastination.
But do not worry! Although technically everything on the list is true, those are not entry-level requirements. If you know how to program, you already know how to train a model.
2. Remember that it is still code
Take a look at this code:
    # a group of Python imports like:
    from fastai.imports import *

    data = ImageClassifierData.from_paths("./dogscats", tfms=tfms_from_model(resnet34, 224))
    learn = ConvLearner.pretrained(resnet34, data, precompute=True)
    learn.fit(0.01, 3)
    # Wait about 17 seconds ...
What can we tell from this?
- It's Python.
- It uses the fastai deep learning library.
- It is three lines long (not counting imports).
- resnet34 seems to be important here. Quick Googling explains why.
This is an example from the fantastic (and free!) fast.ai course by Jeremy Howard. The idea that Jeremy promotes: start with ready-made abstractions and dig deeper only after some hands-on practice.
The code above fine-tunes a pre-trained image classification model (trained on ImageNet, a data set of about 15 million images) so that it can solve the already mentioned Dogs vs. Cats task. It achieves 98% accuracy in just three epochs (passes over the data). Training takes 17 seconds on a computer equipped with a GPU. These results blow the first attempts to solve the same problem out of the water.
Of course, behind those three lines of code lie years of research, dozens of academic papers and thousands of hours of trial and error. But these are lines you can use right now. And once you get the gist of it, classifying images for your own use case (and using the result in production) is not very different from separating dogs from cats.
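To make the idea of reusing a pre-trained model a bit more concrete, here is a minimal, dependency-free sketch. It is not the fastai code above: `frozen_features` is a hypothetical stand-in for a pre-trained network whose weights are never updated, and only a small logistic "head" is trained on top of its cached outputs, which is roughly the idea that `precompute=True` hints at. The toy task, the names and the hyperparameters are all assumptions made for illustration.

```python
import math
import random

random.seed(0)

def frozen_features(point):
    """Hypothetical stand-in for a pre-trained network: fixed, never trained here."""
    x, y = point
    return [x * x + y * y, 1.0]  # squared distance from the origin, plus a bias term

def make_data(n):
    """Toy task: label 1 if the point lies outside the unit circle, else 0."""
    data = []
    for _ in range(n):
        p = (random.uniform(-2, 2), random.uniform(-2, 2))
        data.append((p, 1 if p[0] ** 2 + p[1] ** 2 > 1.0 else 0))
    return data

def predict(weights, feats):
    """Logistic regression: sigmoid of the weighted sum of features."""
    z = sum(w * f for w, f in zip(weights, feats))
    return 1.0 / (1.0 + math.exp(-z))

def train_head(data, epochs=10, lr=0.5):
    """Train only the final layer on features that are computed once and cached."""
    cached = [(frozen_features(p), label) for p, label in data]
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for feats, label in cached:
            error = predict(weights, feats) - label
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
    return weights

def accuracy(weights, data):
    """Fraction of points whose thresholded prediction matches the label."""
    hits = sum((predict(weights, frozen_features(p)) > 0.5) == bool(label)
               for p, label in data)
    return hits / len(data)

train_set, test_set = make_data(500), make_data(200)
head = train_head(train_set)
print(f"test accuracy: {accuracy(head, test_set):.2f}")
```

Because the expensive part (the feature extractor) is evaluated once and left untouched, each epoch only adjusts the two weights of the head. The same division of labor is why fine-tuning on top of a large pre-trained network can take seconds rather than days.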
3. Find a partner in crime
In my free time, I have played the double bass and the guitar, taken up oil painting and learned to surf.
Procrastination, however, is no stranger to me either: in my less productive periods I binge-watch TV series and waste hours on MMORPGs and fantasy books. I'm a nerd, or rather a human, after all.
When I was ready to start with machine learning and, in particular, with deep learning, my friend and I were deep into Heroes of the Storm.
To trick myself into taking the first step on a long road to new knowledge, I had to make a pact with my friend, who also dreamed of getting into artificial intelligence. We decided to stop procrastinating together, enroll in the same course and keep an eye on each other's progress. We now regularly take part in competitions together.
If none of your offline friends is willing to learn by your side, the Internet is your friend: there are many places online where you can find other beginners to work with.
4. Avoid cognitive overload
It is well known that learning something that is too difficult is a frustrating experience. As humans, we are wired to avoid frustration. At the same time, learning something that is too easy is not satisfying: you quickly lose all motivation. The trick is to bite off exactly as much as you can chew.
The first online course I took was Udacity's Deep Learning Nanodegree: an expensive ($999 now, about $400 at the time I took it) program that promises a four-month primer on theory plus hands-on practice in applying what you have learned to the real world. As a bonus, completing the course unlocks discounted enrollment in the Self-Driving Car Nanodegree.
My mistake was that I tried to take in too much without first covering my bases. Whenever I felt I was falling behind on a concept introduced in the course, I panicked and started reading everything I could find online: articles, books, other courses.
As a result, I could not stay focused on the material that should have given me a foundation to build on. With hindsight, I strongly recommend that you stick with one course and do not try to learn things in parallel. People, after all, are notoriously bad at multi-tasking.
If I were starting now, I would first take a look at Jeremy Howard's fast.ai, which I have already mentioned, and at Andrew Ng's latest offering on Coursera (there is a cost attached to the certificate, but you can take it for free). It consists of five courses: from an introduction to neural networks and deep learning, through convolutional neural networks (which are essential for working with image data), to sequence models (speech recognition, music synthesis, any kind of time-series data).
The latter is more focused on theory, while the former emphasizes "quick and dirty" implementations, which, I believe, is the best way to start. Just remember to pace yourself, avoid multi-tasking and take small steps.
5. Set your sights
Instead of trying to learn it all at once, pick the areas where deep learning techniques produce results that are most satisfying to you personally. Working with something that you can relate to (as opposed to dealing with random abstract data points) will keep you motivated. You need a feedback loop, a way to get tangible results from your experiments.
Here are some ideas for starter projects:
- If you are passionate about visual arts (cinema, photography, video, fine arts), dive into Computer Vision. Neural networks are used to classify objects in images, highlight areas of interest (anomalies on MRI scans or pedestrians on the road), detect emotions or age in portraits, transfer artistic styles, even generate original works of art.
- If you are more interested in sound, you can compose music with neural networks, classify genres and recommend new tracks like Spotify does. Or you can explore voice style transfer and pretend to speak in another person's voice.
- If you are into video games, you should definitely take a look at reinforcement learning. You can train a game AI to be better than you at your favorite game. Besides, you get to play, and nobody can blame you for it, because, you know, research.
- If you are passionate about user experience and customer service, look into Natural Language Processing and chatbots, so you can automate (to some extent) interactions with your customers: extract the intent from a message, classify support tickets, offer instant answers to the most common questions.
After trying our hand at computer vision, my friend and I shifted our focus to automatic speech recognition (ASR) and natural language processing. Computer vision, with industry giants (Google, Apple) backing self-driving projects, is now probably the best-funded research area and also the one where deep learning techniques have cemented their position the most: in image classification, the accuracy of neural network predictions grew from less than 75% in 2010 to 98% and above in 2018.
The challenges related to language (especially those dealing with the written word), on the other hand, have only recently started to benefit from neural networks. The hottest field right now is Machine Translation (MT). Everyone has probably noticed that the quality of Google Translate has improved drastically in recent years. Deep learning has played an important role in this since 2015.
The main reason for this late blooming was hardware limitations: machine translation tasks require an enormous amount of memory and processing power to train large neural networks.
To get an idea of how fast DL can turn around a research area that had not seen significant progress for decades, here's a fun fact:
Neural networks first appeared in machine translation competitions three years ago, in 2015. In 2016, 90% of the contenders in such competitions were based on neural networks.
There is a huge amount of knowledge to be extracted from academic papers on the subject and applied to real-world tasks, especially if your startup has something to do with text (and Amplifr does).
If this convinces you to try applying deep learning to NLP, take a look at Stanford's CS224n course, "Natural Language Processing with Deep Learning". You do not need to be a Stanford student to follow the course: all lecture videos are available on YouTube. If you make better progress in a group setting, there is a whole subreddit dedicated to the course where you can find online study partners.
6. Be competitive
The field of machine learning is inherently competitive. In 2010, Kaggle, now the world's largest community of data scientists and machine learners, brought the spirit of a hackathon into what had been mostly an academic field. Since then, the competitive way of solving ML tasks has become standard practice. Companies and institutions, from Microsoft to CERN, offer prizes for solving challenges in exchange for a royalty-free license to use the technique behind the winning entry.
An ML competition is the best way to assess your skills, get a feel for the "baseline" in a particular field, get inspired by more advanced competitors, find colleagues to work with and simply get your name out there.
Consider taking part in a competition a rite of passage for an amateur machine learner. For my friend and me, this happened in 2017, a year into our self-study. We chose the Understanding the Amazon from Space competition on Kaggle, as this was our chance to play with multi-class image classification (and we also care about the environment). For over two months we spent every weekend solving the problem: detecting deforestation in satellite images and telling its causes apart.
Another sign of rapid progress in DL technology: a year ago we spent a lot of time and effort figuring out how to get Google Cloud Platform to run our experiments for less money. Today, Google offers a free Jupyter notebook environment with GPUs, and there are a host of services that will happily train your models for you.
We did not win a prize. We made it into the top 15% of the leaderboard (which is nothing to brag about) and made every beginner's mistake in the book, but the experience proved invaluable: we gained the confidence to continue our efforts and picked our next competition, this time in the field of NLP.
Russian is a morphologically rich language that is considered under-resourced in terms of NLP research; some details are in this document.
The challenge of developing a question answering system was hosted by a large Russian bank, and the contenders worked with a unique data set created in the spirit of the famous SQuAD (the Stanford reading comprehension data set of 150,000 questions created by volunteers on the basis of a set of Wikipedia articles), but for the Russian language.
The task was to train a system to answer questions based on a text. The model, submitted as a Docker container (with RAM limited to 8 GB), had to be able, given a natural language question, to highlight the relevant part of a paragraph of text. As often happens in the most demanding competitions, we had to submit not an already trained model, but a solution that had to train from scratch and answer the test questions (the test data set was only partially public, to ensure fair play) in less than 2 hours of machine time.
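To make the shape of the task concrete, here is a toy sketch of extractive question answering. It is not our competition solution: it simply returns the sentence that shares the most words with the question. The example paragraph, the function names and the word-overlap heuristic are all assumptions chosen for illustration; real systems predict an answer span with a trained neural network.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def split_sentences(paragraph):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

def answer(question, paragraph):
    """Return the sentence with the largest word overlap with the question."""
    q_words = set(tokenize(question))
    return max(split_sentences(paragraph),
               key=lambda s: len(q_words & set(tokenize(s))))

paragraph = (
    "Kaggle was founded in 2010. "
    "It hosts machine learning competitions. "
    "Many companies sponsor prizes for the winners."
)
print(answer("When was Kaggle founded?", paragraph))
```

A baseline like this is useful mostly as a sanity check: if a trained model cannot beat plain word overlap, something is wrong with the pipeline rather than with the architecture.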
Our solution came in second on the public leaderboard, but we were so focused on solving the task that we forgot to read the rules properly: they said that teaming up was forbidden and only individual entries were accepted. We had to come clean and were offered a "consolation" bronze award (a bit like at the Cannes festival, when a film is shown "out of competition").
We were lucky to avoid disqualification, but we learned our lesson, and now I urge everyone to fight the first impulse to tackle the problem head-on and to take the time to read the rules carefully.
Since my interest in deep learning is mainly production-oriented (I look for solutions that can be applied to the real-world needs of my startup), I have also noticed that looking at the leaderboard gives you a good reference point for how "close to production" you are. The top solutions are generally academic and not yet ready to be deployed commercially, while silver, bronze and a few places below are often the most promising for application.
7. Stay in the loop
The blessing and the curse of deep learning is the pace at which the field is evolving. Even this article (as introductory, personal and non-academic as it is) was probably outdated in some respects before it was published.
The best way to stay up to date is to become part of a big online community full of ML enthusiasts. If you're lucky enough to understand Russian, be sure to join the Open Data Science community: a public Slack server with over 12,000 users and over 140 public channels. You can always find smaller and more local groups through Reddit's r/MachineLearning or Meetup.com. If you know of any international Slack group that matches the scale of ODS (not to be confused with ODSC, which is also an AI resource), be sure to let us know!
Setting up a Twitter feed and e-mail subscriptions is also essential for staying in the loop. You can also invest your time in various offline boot camps that take place all over the world.
I thought summer camps in Spain were for surfing until I visited the Deep Learning International Summer School in Bilbao. It was a rash decision, but I do not regret it: it was perfect for my level (after a year in the field). With no hands-on sessions, the school was more of a conference, though a very intense one: nine to six, five days in a row. The entire program was divided into sections that ran in parallel, with each speaker presenting a three-and-a-half-hour series of classes.
Once you feel more confident, try attending one of the main conferences on AI, ML and DL: this year I was lucky to attend ICLR. Other important international conferences are CVPR (specifically on computer vision) and NIPS. Yes, in the field of AI your life consists almost entirely of acronyms.
8. Use your programming chops
Let's state the obvious: Python has completely conquered the AI and data science community. Today there is probably no reason to start with a different language unless you are really good at it or plan to deal with some really low-level optimizations.
For me, as a Ruby developer, switching to Python was an easy and enjoyable experience overall. It only takes a couple of weeks of practice (and learning the array slicing tricks and comprehensions) to feel quite comfortable. Still, I took the time to complete a free intermediate-level Python programming course, and it certainly did not hurt.
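For a Rubyist, comprehensions and slicing are probably the two idioms that take the most getting used to. A minimal sketch in plain Python, with rough Ruby equivalents in the comments (the sample data is made up for illustration):

```python
scores = [0.91, 0.42, 0.77, 0.15, 0.98]

# List comprehension: roughly scores.select { |s| s > 0.5 }.map { |s| (s * 100).round }
percentages = [round(s * 100) for s in scores if s > 0.5]

# Slicing: start is inclusive, stop is exclusive, and both are optional
first_two = scores[:2]          # like scores.first(2) or scores[0...2]
last_two = scores[-2:]          # like scores.last(2)
every_other = scores[::2]       # every second element, starting from the first
reversed_scores = scores[::-1]  # like scores.reverse

# Dict comprehension: roughly scores.each_with_index.to_h { |s, i| [i, s] }
by_index = {i: s for i, s in enumerate(scores)}

print(percentages, first_two, last_two, every_other)
```

The same slicing syntax carries over to NumPy arrays and to tensors in the DL frameworks, which is where it really pays off.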
For a software engineer, a language "barrier" is not a problem. For enthusiasts coming from non-programming backgrounds, getting into DL is more difficult. So you already have an advantage.
However, do not expect stellar OOP design or intuitive APIs. Most public code examples would not even pass a serious code review on my team. It's not about software engineering, after all, it's about mathematics: matrix multiplication code must first of all multiply matrices, and a clean DSL (and Ruby gets you used to good DSLs) is always an afterthought.
The same functionality can have different APIs even within the same library. It might seem strange that creating an array of ones is done with np.ones((2,3)) (it takes a tuple), while creating an array of random numbers of the same shape is done with two separate integer arguments: np.random.rand(2,3).
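The inconsistency is easy to see in an interactive session. A small sketch, assuming NumPy is installed; the newer `Generator` API shown at the end is my addition and is not mentioned in the text above, and it takes the shape as a tuple again:

```python
import numpy as np

ones = np.ones((2, 3))          # shape passed as a single tuple
noise = np.random.rand(2, 3)    # shape passed as separate integers

# np.random.rand((2, 3)) would raise a TypeError: it does not accept a tuple

rng = np.random.default_rng(0)  # newer Generator API (NumPy 1.17+)
noise_new = rng.random((2, 3))  # shape as a tuple, like np.ones

print(ones.shape, noise.shape, noise_new.shape)
```

All three arrays have the same shape; only the calling conventions differ.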
Also, do not count on the documentation or code style. Once you run into some non-trivial details while translating an academic paper into code, you will need to read the library source code, and it will not be easy. Test coverage is also lacking.
However! This is your opportunity to capitalize on your best programming practices: feel free to build good reusable libraries out of publicly available Jupyter notebooks.
9. Finally, brush up on your math
Naturally, I have saved the best for last. In the end, you will have to fill in all the gaps in your math. Especially if, after covering your bases, you want to stay at the forefront and follow academic publications.
Fortunately, machine learning has its "bible" in the form of an ultra-dense 800-page book, "Deep Learning (Adaptive Computation and Machine Learning)" by Ian Goodfellow, Yoshua Bengio and Aaron Courville, known as the Deep Learning Book. Also fortunately, it is available online, free and in full.
Part I (linear algebra, probability and information theory, numerical computation, machine learning basics) is the bare minimum introduction, and it is surprisingly enough to feel much less intimidated when following current research. Yes, it is 130 pages of not exactly light reading, but you will not regret reading it.
Thanks for reading!
I hope this article has been able to convey my passion for deep learning and has made the field seem accessible to people like me who come from applied programming. I sincerely believe that with the recent advances in artificial intelligence and deep learning the world is approaching another light bulb moment, and yes, I mean Edison's light bulb.
"Curious software developers" like you and me will be the main driving force of the revolution that has already begun. Perhaps not exactly at the forefront of science (otherwise you would probably not be reading this article), but with the ability to implement the best ideas from the academic world, one application at a time. That is how we change the world.
So go ahead, browse through some of the resources from the text above and from the list below, build up your confidence and get started!
Deep Learning Book: everything you need to get started with a bit of formal math. Be warned, the text is quite dense, but you can easily find more concrete explanations of the concepts it describes online.
deeplearning.ai: a Coursera offering by Andrew Ng. It can be taken for free. In my opinion, the only prerequisite is knowing how to program.
Practical Deep Learning for Coders by Jeremy Howard. Completely free. A seven-week course for programmers who want to get into deep learning but do not know where to start.
Blogs to read for AI and deep learning enthusiasts: start building your reading list!
Some email subscriptions to stay informed.
Distill.pub: a platform that presents machine learning research in the best possible, interactive way. Perfect for "visual" learners.
A great summary of the matrix calculus needed for deep learning. Hosted at explain.ai: a noteworthy (but still rather small) collection of clear explanations of ML-related topics.