You want to switch to a data science career, but don’t know how to get started.
If you have ever considered entering a data science career but haven’t done so, chances are you have been sold on the idea that you need a PhD, or a formal education such as a bachelor’s or master’s degree in data science.
Or you wonder whether self-learning is a viable way to pick up the required skills and get a data science job.
Today, let’s talk about whether you should self-learn data science, and the pros and cons of doing so.
Pros of self-learning data science
Self-learning is a quick and flexible way to gain knowledge in today’s world.
Data science videos, podcasts, tutorials, and lectures are just a Google search away.
You can buy a data science book, sign up for an online course, or read some articles from medium.com to begin with.
These resources are designed to give you a basic understanding of data science. You can further explore other topics on your own afterwards.
The upfront investment for self-learning is usually low. The average cost of a data science book is around $80 – $120 USD.
The price of an online course ranges from $9 USD (sales are frequent) to $250 USD. Coursera MOOCs are also available for free if you choose to audit and don’t need a certificate.
Self-learning is good if you are unsure if data science is for you (try out our career series or skill series for a start?).
You can try out different subject areas (i.e. test the waters) before deciding whether you want to continue. This way you don’t risk wasting your time and money in case data science is not for you.
Cons of self-learning data science
Any beginner who has tried to self-learn a technical subject will tell you that it is challenging to do so.
Learning data science by yourself takes a lot more effort than learning through a course. You may not get the same results as you would if you were following a course.
Most online courses lack interactivity with an instructor. They also do not provide any mentorship. It can be hard to get the support you need if you get stuck while learning (which happens quite often) or need feedback on a project.
Learning at your own pace also means you need excellent time management skills and motivation.
There are always other priorities like family, social gatherings, and work. Many studies show that the completion rate of self-learning programs is below 20%.
It requires a high level of discipline to achieve the learning goals you set, since there are no hard deadlines or exams. You are responsible for your own learning progress.
Another challenge of self-learning data science is that it is hard to know what you need to learn.
Data science is a broad and ever-evolving field, and you may not know which skills to pick up or in what order.
Trying to draft your own learning plans through blog posts and YouTube videos can be frustrating.
Every blogger and YouTuber has a different pathway, and it won’t be a perfect fit for you, especially if you are serious about a career in data science.
What are the steps to self-learn data science?
Before you begin, make sure that you have at least read a data science book or some blog posts.
This is to make sure you have a broad overview of data science and the required skills, tools, and terminology. It will reduce friction before you start watching tutorial videos or signing up for an online program.
Then, learn a programming language like Python. This will help you to perform data processing tasks that ordinary software (e.g. MS Excel) can’t handle. We use Python to write code that automates our tasks at scale and builds machine learning models.
At this point, many people will recommend picking up mathematics or statistics. Some will also suggest algorithms, machine learning, or even big data analysis.
There is nothing wrong with learning these first, but they can come later, after you have identified the problem you want to solve.
What I would suggest is to think of a business problem and start looking for data related to your problem.
The business problem can be anything, as long as it is something that interests you.
It is ideal if you are currently working and have access to your company’s marketing, finance, or HR data.
If not, go to Kaggle.com and look for a dataset related to your problem.
It can be a topic like greenhouse gas effects or global warming, if you are into environmental studies. Or it can be something like forecasting the price of petrol over the next 7 days.
Only after you have identified the business problem and gathered some initial data should you move on to learning the analysis techniques and algorithms.
For example, learn linear regression if you want to forecast revenue based on ad budget.
Learn K-means clustering if you want to group similar customers based on their shopping behavior.
Learn classification if you want to predict which employees are likely to resign next month.
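To illustrate the first of these cases, here is a minimal sketch of fitting a straight line to ad budget and revenue figures with ordinary least squares. The numbers are made up for illustration, and `fit_line` and `forecast` are hypothetical helper names. In practice you would probably reach for a library such as scikit-learn rather than writing the formula by hand.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # slope = sum of co-deviations of x and y / sum of squared deviations of x
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast(budget, slope, intercept):
    """Predict revenue for a given ad budget using the fitted line."""
    return slope * budget + intercept


# Made-up monthly figures (thousands of USD), for illustration only.
ad_budget = [10, 20, 30, 40, 50]
revenue = [25, 45, 65, 85, 105]

slope, intercept = fit_line(ad_budget, revenue)
print(f"revenue ~ {slope:.2f} * budget + {intercept:.2f}")
print("forecast for a budget of 60:", forecast(60, slope, intercept))
```

Starting from a concrete question ("what revenue should I expect if I spend 60?") makes it clear which part of the technique matters: the fitted slope and intercept, and how far you can trust them outside the range of your data.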
The reason is something I have learned after teaching many classes and training students from different backgrounds.
The best way to learn is to work on projects. It helps to develop skills. It also helps you to retain the knowledge a lot longer because you actually learn how to apply it.
Remember:
Don’t learn the techniques before the problem. It is like carrying a hammer around looking for a nail.
Is it possible to master data science in 3 months?
Yes, you certainly can. But like I said, it depends on whether you are disciplined enough, and whether you have guidance and support when you encounter roadblocks.
And, like everything else in life, you need a plan.
Here is an example of a 12-week data science learning plan:
- Week 1: Learn how to write Python code using a notebook (Jupyter or Google Colab). Learn how to read data from CSV files, and retrieve data from a SQL database.
- Week 2: Pick a business problem and look for a dataset from Kaggle, learn how to perform EDA (exploratory data analysis).
- Week 3: Apply supervised learning and predictive modelling to the dataset. Learn how to evaluate the performance using different metrics.
- Week 4: Create a presentation on your business case study. You can record a 5-minute video and upload it to social media, blog about it, or present it to a group of friends who don’t know about your topic.
- Week 5: Learn how to analyze text using Natural Language Processing (NLP). Start from pre-processing, tokenization, and feature extraction. Then, look into applications like sentiment analysis, and topic modelling.
- Week 6: Learn supervised learning. Practice how to use supervised learning to perform classification on binary and categorical data, and prediction using regressions.
- Week 7: Learn unsupervised learning. Start by using K-Means clustering to group similar items (e.g. customers, documents). You may also look into recommendation systems using collaborative filtering.
- Week 8: Learn the basics of data types and the statistical functions for EDA. Start with the mean, median, standard deviation, correlation heatmap, and boxplot.
- Week 9: Learn discrete mathematics and probability. Start with combinations, permutations, p-values, conditional probability, Bayes’ theorem, and distributions (binomial, Poisson, normal).
- Week 10: Learn the concepts of evaluation (false positive, false negative) and how to evaluate machine learning models using precision, recall, ROC curve, confusion matrix.
- Week 11: Learn visualization. Understand the charts that suit different data types. Make sure you understand the potential pitfalls so that you won’t mislead your audience.
- Week 12: Build your portfolio. It doesn’t need to be fancy at first. Include a clear title, engaging description, and a link to your notebook or presentation slides. Share your portfolio on social media and ask for feedback.
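To make Weeks 1 and 8 a little more concrete, here is a minimal sketch that uses only Python’s standard library: it reads CSV data, loads it into a SQL database, queries it back, and computes a few summary statistics. The sales figures, the `sales` table, and its column names are invented for illustration. An in-memory string and an in-memory SQLite database stand in for a real file and a real database server.

```python
import csv
import io
import sqlite3
import statistics

# Made-up data standing in for a CSV file on disk; with a real file you
# would pass open("your_file.csv", newline="") instead of io.StringIO.
CSV_TEXT = """region,amount
north,120
south,80
north,100
east,150
south,90
"""

# Week 1: read the CSV rows into a list of dictionaries.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Week 1: load the rows into an in-memory SQLite database and query them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [(row["region"], float(row["amount"])) for row in rows],
)
totals = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
)
amounts = [amount for (amount,) in conn.execute("SELECT amount FROM sales")]
conn.close()

# Week 8: basic descriptive statistics for EDA.
summary = {
    "mean": statistics.mean(amounts),
    "median": statistics.median(amounts),
    "stdev": statistics.stdev(amounts),  # sample standard deviation
}

print("total per region:", totals)
print("summary:", summary)
```

Once this feels comfortable, libraries such as pandas offer the same steps with less code, but working through the standard-library version first shows what those libraries are doing for you.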
You should only spend about 30 minutes per day reading a book chapter, blog post, or watching one to two tutorial videos at most. Spend the remaining time working on hands-on practicals.
If you get stuck on a particular topic, write it down on a separate sheet, and move on. Most of the time it is because you lack some other related knowledge, which you can unlock down the road.
You should not spend more than 2 hours per day, in order to prevent fatigue and burnout. Also, do spend time resting and doing other activities. Learning data science should be an enjoyable journey, and that depends on keeping a balanced life.
Will I have problems applying for a job?
In general, you shouldn’t. You should have the confidence to apply for a data science job if you have picked up the topics listed above.
Of course, your success rate depends on a couple of factors: how you show your skill level, the technical test, and the interview.
Make sure you research the company that you want to join before even starting the application process. You should know the company’s goals, what they are looking for in their employees, their work culture, and their application process.
Don’t worry even if this is the first time you are applying for a data science job. You might worry that you will not be able to get the job because you do not have enough experience.
This is not the case. There are many ways to prove your skills and experience. And the portfolio is one of them.
A good portfolio can give you a competitive edge when you are applying for data science jobs. Make sure your portfolio contains a good mix of different projects, including some that relate to the company’s core business or industry.
If the hiring manager or recruiter is happy with your resume and portfolio, they will then schedule a technical test with you.
During the technical test, you will be given a use case, a problem statement, a sample dataset, and a set of questions.
The purpose of the test is to assess your ability to identify problems and propose solutions. Most technical tests are take-home. You will need to complete the test within 2-3 days, before they decide whether to schedule an interview with you.
During the interview, you will be asked a series of questions that help the hiring manager to assess you further.
They may ask about your knowledge, your working style (team or solo player), and your fit with the culture. Do expect some further technical questions.
But these are usually meant to test your thought process rather than your memory.
Check out the data science interview recipe.
Why consider joining a data science boot camp?
If you want to switch to a data career in a more effective manner, consider joining a data science boot camp.
An instructor-led boot camp gives you the skills and knowledge you need to enter the field of data science.
It’s focused and action-packed. Before you start the boot camp, make sure the instructor is also a practitioner.
This way, they will be able to relate the data science topics to their own experience. Learning through step-by-step guidance is more effective than simply following the course material.
A boot camp also needs a solid support system. There should be enough facilitators, rather than only one trainer, throughout the program.
Facilitators will make sure you receive the personalized attention that you need. This way you don’t have to worry about being stuck on problems for days before moving on to the next topic.
Lastly, find out the application process before signing up. This is to make sure you are not learning with others who are not committed. You want to learn together with other like-minded people.
Check out the data science uncut boot camp, and see how you can learn data science and transform your career in 30 days.
Learn from trainers with industry experience. Receive guidance and mentorship from experienced mentors, fast-track your progress toward a data science job, and start transforming today!
Conclusion
Self-learning is a great option when you want to pick up new skills while keeping costs low.
Yet it takes time to learn on your own, and you need to be disciplined. It is also challenging to decide which topics to cover and how much time to spend on each, unless you have researched them thoroughly.
It is always good to start with a book, blog posts, or tutorial videos to get a broad understanding of data science. A training program or boot camp is a more effective choice if you can invest the time and afford it.
It is also possible to work full time while you learn data science. You can apply most of the skills at work and see results immediately.