In Data Science, learning by doing is the key.

So you have just finished your course and are now trying to find a job. You have applied to multiple jobs.

You know the drill. You need experience to get a job. You need a job to get experience. So most people do data science projects to fill this requirement on a CV.

Every time you reach the interview process, the recruiter or the hiring manager looks at your past projects/experience and asks you to explain your solutions to some problems.

There are projects that will land your application in the bin. So let's talk about them.


Why you need project experience

The internet is filled with advice on how to get a job in data science. Most of this advice is about how you should learn more and more about data science, which is very true, by the way. But another aspect is often forgotten: project experience. In my experience, most companies are looking for people with some kind of project experience, because they want to know that you can apply what you learned in school to real data. So do not underestimate the importance of project experience.

If you want to get hired at a top company, you need to be able to point to a portfolio of projects you have done. These projects should showcase your ability to do the work the company needs done. For example, you might have a data science project that builds a machine learning model from a dataset; in the interview, you can point to it as evidence that you can do that kind of work. A project also shows how you learn new skills and how you work and, if it was a group effort, your ability to work in a team.

Don't showcase overly simple projects

We all start out somewhere. Many try out the Titanic dataset, build a small neural network to predict MNIST, dabble in the wine prediction dataset, or predict house prices in Boston.

These are very good projects to get started with. I started my machine learning journey by building a neural network from scratch to predict MNIST. They are natural learning points. However, these problems are also considered solved for the most part and don't showcase your skills adequately.

If you demonstrate a novel approach to a problem on one of these datasets, that may be an exception to this advice. Santiago Valdarrama, for example, used MNIST to showcase contrastive learning on computer vision problems. There, the focus is obviously on the novel application rather than on solving MNIST.

Basically, if it's a Kaggle playground project, it doesn't belong on your CV.

Clearly Coursework

In the same vein as simple projects, you will no longer be able to pass off capstone projects from Andrew Ng's machine learning course as project work. You used to be able to, but it won't work anymore, because everyone has taken that course by this point.

You can mention this capstone under coursework, but passing it off as project experience can negatively impact the evaluation of your CV during screening.

Remember, in the first pass, recruiters often filter negatively: they look for reasons to discard your CV rather than reasons to give you a chance.

Projects with questionable ethics

There are many projects that sound interesting at first. But if your interviewers are ethically conscious, you may be in hot water with certain projects.

The Iris dataset is already on the "too simple" list, but it was also popularised by Ronald Fisher, a eugenicist, who published it in the Annals of Eugenics. Not a great look.

Facemask detection may seem like a great project in times of COVID. But what is it used for? Can you use facemask detection in shops? No. Karens already don't care when people tell them off. This technology will only be used by law enforcement to monitor peaceful protests in the end. Very questionable.

In the same vein, crime prediction based on historic data will not reflect well on your ability to critically evaluate information. While it may seem fun to play Sherlock Holmes, using machine learning to dispatch law enforcement has been shown to disproportionately target low-income neighbourhoods and historically disadvantaged groups.

NeurIPS and many other conferences are increasing their focus on the ethical impact of papers, and so are employers, as the ethical implications become more apparent.

Projects without a clear goal

Sometimes we dig into data and explore a dataset. This exploration can be an important part of a project. However, a business usually needs to have a specific question answered.

Your CV should probably be written with the STAR method (Situation, Task, Action, Result) in mind. Having a clear question and a clear outcome in a project description is essential to conveying the value of a project.

This can often mean that we need to trim down a project, because we explored beyond the bounds of the question we finally settled on. However, this will often be the case in a job environment too, and it is something to get used to. Not every avenue of exploration bears fruit.

Projects not demonstrating your skills

This problem can be twofold and often stems from not having a clear goal. But it's an additional problem you find in some project descriptions on CVs and portfolios.

How did you actually solve the problem and how do these skills relate to the job description and eventually the job you are interviewing for?

When you build a chatbot, for example, rather than only naming the project, you can go into detail about the techniques you used, like text processing, question-answer filtering, or causal modelling. These relate better to actual applications and directly tell the person screening your résumé what you did.

Communicate Well

Choose good projects and describe them well

In the end, it's best to build a portfolio that relates to the specific domain of expertise that you want to break into as a data scientist.

Want to work in healthcare? Work on healthcare-related projects (with ethics in mind).

In case you have already built a rather generic portfolio, you can always go into more detail in the project descriptions and explain how each project relates to the job you are applying for.

There are many transferable techniques that you may have used in a different capacity from the one in the job description. Describing and focusing on these in a project description can set you apart in the hiring process.

Frequently Asked Questions

  1. What are the common mistakes Data Scientists make while writing their resumes?

They tend to include everything related to their profile.

The resume should make the employer take notice, not turn them away. It has to focus on specific skills and achievements. A resume with too much information confuses recruiters about what you have to offer, and they often lose interest.

As a data scientist, you are responsible for many tasks. It is important to study the job description to figure out what your resume should include.

  2. What are the must-haves for Data Scientist resumes?

The resume should have the following sections - Education, Experience, Skills, Certifications, Projects.

Education - Always list the degrees you have received, the year you got them, and the name of the college or university.

Skills - The list of skills is often the first thing a reader looks at, so make sure the strengths most relevant to the job posting are listed first.

Experience - List your jobs in reverse chronological order, starting with your current or most recent one.

Certifications - Only list your most recent and most important certifications. Include a verification link and the year you got it.

Projects - The section should show your accomplishments, or your contribution to a project.

  3. What are some skills you need for getting a job in data science?

The most important skills for a data scientist job are:

  • Computer programming knowledge.
  • Knowledge of statistical techniques and mathematics.
  • Knowledge of machine learning techniques.
  • Knowledge of database languages like SQL.
  • Knowledge of data visualization techniques and communication.