How to build a data science portfolio: an interview with the founder of LEAD.
The first thing you think of when you apply to a Data Science position is:
Is my portfolio good enough?
I found many commonly asked questions plastered all over Quora.
In the hope of addressing these questions, I interviewed the founder of LEAD, Dr Lau.
As a professional in the field, a founder, and a genuine person, Dr Lau shared his insights with me, to share with you.
You may be a student, a fresh grad, someone looking to update their portfolio, or you could be applying for a new data science position.
First things first:
What is a data science portfolio?
A data science portfolio is a great way to showcase your skills in lieu of work experience.
It also demonstrates your passion for data science. Assuming that passion is genuine, you will also have a lot of fun completing your own projects and learning new data science skills through them.
I started off by asking Dr Lau the first question on everyone’s mind, including mine:
“Why should you make a portfolio anyway?”
He replied with a smile and explained.
“Well, first off, if you are looking for a job, it increases your chances of getting hired. It helps you illustrate the things you may not be able to show in your resume.
Even if you aren’t looking for a job, it is important.
It helps to think of it as a sort of journal for you to gauge your progress and document your journey.”
He expressed his regret for not starting his portfolio earlier in his career.
“As they say, if you write it down, it sticks in your head!”
Dr Lau believes that by noting down and actively tracking your growth, you are able to use your knowledge more effectively.
I agree with him. He broke my question down perfectly.
But what about your projects?
Which projects should you include in a data science portfolio, and which should you leave out?
His big no-nos were the Titanic dataset, MNIST, and Iris.
You should also steer away from common projects from YouTube tutorials, Udemy courses, or even DataCamp.
These are good but do not effectively showcase how impressive you can be.
If you really want to put them in, Dr Lau strongly advises that they shouldn’t be on the first page, at least.
“What should appear in the top section or first page then?” was my follow-up question.
“It should be a reflection of yourself. By looking at your portfolio, we should get a rough image of you (and your skills).”
Fair enough.
Dr Lau shared an example:
“For my own portfolio, I’d be doing more on the technical side because my background is text processing and natural language processing.
So, I put those as my top projects so that they are easily visible to my site visitors or to my audience.
If you are more on the visualizations or more towards business, then you should include some of the visualizations as well.
You can either go very specialized or you can go with something with more variety.”
He did offer one tip, though:
“Word of advice, if you’re applying for a data science job, I would prefer you to go broad in terms of showcasing different varieties of projects. It should be in a way that the person who reads it can understand and appreciate your depth of experience.”
The next thing I needed to know was,
“Which skills to show in a data science portfolio?”
There is so much an individual would want to put into it that it gets hard to prioritise.
Here’s what Dr Lau had to say about it.
“So if you remember, there are key skills in data science like programming, mathematics, statistics, and domain knowledge.
The first thing that came to mind, following my train of thought from the previous question, is to have something that showcases your business knowledge.
Make sure to elaborate clearly.
What is a business question that you are trying to solve?
So, the technique behind it is always the same.
A lot of people confuse the technical question with the business question.
So, I’ll give you an example.
I can use the same algorithm to predict whether somebody is going to die in the Titanic disaster, whether somebody is going to quit a job, or whether a telco subscriber is going to switch to another telco.
In all these three cases, we can use the same technique.
But if I just put the technique, let’s say “Decision Tree”, a non-technical person wouldn’t know what it is. Even “logistic regression” would read like a big buzzword.
So why don’t you just put a business question upfront: how to predict which employee is going to quit next month?
This is easy to understand.
It is easy to comprehend, and people appreciate that.
It triggers their curiosity to read further.
The bottom line would be that business questions/domain knowledge are a top priority followed by your techniques.
Always remember to elaborate on the intentions and use basic writing skills.
You need to have a good opening and you need to have a good closing rather than just saying, ‘this is my business question, this is my code, and these are the results.’”
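To make Dr Lau’s point concrete, here is a minimal sketch of one technique answering two different business questions. It is an illustration only, not something from the interview: the `Stump` class (a one-split stand-in for a full decision tree), the `fit_stump` helper, the feature names, and every number in the toy data are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """A one-split decision tree over a single numeric feature."""
    feature: int      # index of the column to test
    threshold: float  # split point
    above: int        # label (0 or 1) predicted when value > threshold

    def predict(self, row):
        return self.above if row[self.feature] > self.threshold else 1 - self.above


def fit_stump(rows, labels):
    """Try every feature, threshold and direction; keep the most accurate split."""
    best, best_correct = None, -1
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            for above in (0, 1):
                stump = Stump(feature, threshold, above)
                correct = sum(stump.predict(r) == y for r, y in zip(rows, labels))
                if correct > best_correct:
                    best, best_correct = stump, correct
    return best


# Invented toy rows: [months_since_last_raise, overtime_hours_per_week]
employees = [[2, 5], [30, 12], [4, 3], [28, 15], [6, 4], [26, 10]]
quit_next_month = [0, 1, 0, 1, 0, 1]

# Invented toy rows: [complaints_last_quarter, months_as_customer]
subscribers = [[0, 40], [5, 3], [1, 36], [4, 6], [0, 50], [6, 2]]
switched_telco = [0, 1, 0, 1, 0, 1]

# Same code path, two different business questions.
quit_model = fit_stump(employees, quit_next_month)
churn_model = fit_stump(subscribers, switched_telco)

print("Which employee is going to quit next month?", quit_model.predict([24, 8]))
print("Which subscriber is going to switch telco?", churn_model.predict([0, 24]))
```

The fitting code is identical in both cases; only the data and the question printed up front change. That is the part a non-technical reader of your portfolio will actually follow.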
The question I personally was looking forward to asking was,
“Should you have single page resumes?”
Dr Lau gave me a candid answer, as always.
“I think that it depends.
I always recommend to both students and mentees that when they go for interviews, they should customize their resumes.
For example, if you are applying to a start-up, you can safely assume that, because it’s a small company, the CEO may be reading it. They tend to have a bit more time to actually read your resume.
If you’re applying to a bigger company, chances are HR is looking through it to check their boxes before letting your application reach a manager.
The worst thing to do, in my opinion, is to have a resume that’s more than 3 pages long.
So, one page is fine, but make sure it’s readable.
Make sure to leave a lot of white space.
But in Asian/Malaysian culture, I would say you had better use two pages to show that you are not lazy, but that’s the only reason.”
His final pointer was on:
How do you build your own personal brand in data science?
Dr Lau says that the best way to do it is by amping up your LinkedIn.
“Post about your projects.
Describe them and remember to add in some screenshots.
Finally, link it to your GitHub!
Remember that hiring managers/employers probably won’t have the time to google you, but they definitely will check out your LinkedIn.
So, make sure your work is easy to find.
Don’t make the mistake of simply putting a link in your resume; no one is going to click on it.
As for growing within data science, join Quora and Kaggle!
Quora is a good place to get information and go in depth on the discussions you are interested in.
Kaggle, on the other hand, is a great way to grow within the community and find like-minded individuals.
It’s a place for you to get datasets, participate in the forum discussions, and take part in competitions.”
So, there you go!
These are straightforward and easily actionable tips to ramp up your portfolio quickly.
For more ways to prepare for your data science interview, check out The Data Science Interview Recipe, designed to help you get that job.
We hope this interview was insightful.
As always, shoot us an email at chelsea@lead.io for any questions or feedback.
We love hearing from you.
Wishing you growth, always.