Written by
Matthew Weaver

Is it possible to recruit great data scientists?

Creative thinking and a little mathematics provide some interesting insights.

Data and artificial intelligence seem to be high on every CIO’s wish list. Data scientists and associated roles are, figuratively speaking, worth their weight in gold. A quick search on Google Trends shows that interest has never been greater. Finding talented people can be difficult: they are snapped up before most of us even realise they’re available.

The key is to act quickly, find people with potential and improve their capabilities over time. As a saying often attributed to Mark Twain goes, “Progressive improvement beats delayed perfection”. It’s a simple concept, but in a world where demand significantly exceeds supply, first-mover advantage is at a premium. This was certainly the case for Bargain RAIders Ltd*.

Figure 1: Google trend for data scientists, June 2019.

 

Building capability at Bargain RAIders

Bargain RAIders (BR) Ltd is a new start-up company focused on finding the best consumer products at the right prices. The idea is to build a recommendation engine that predicts consumers’ buying habits. It’s a well trodden path, but there’s plenty of room for improvement.

At least that’s what Mark Denman, the CEO at Bargain RAIders, thought. He had convinced his investors that this was a golden opportunity. “The last piece of the puzzle is hiring a chief data scientist”, he said, as he ticked the final box on his hastily scribbled whiteboard list. Mark nodded confidently to his audience and noticed that his reflection in the office window was doing exactly the same.

It was hard to know if Emily’s grimace was due to her coffee or Mark’s request. “Mark – you’ve had my recruitment report for two weeks”, she said, still eying her coffee suspiciously. “Right now, finding a good data scientist is harder than finding a golf ball in a blizzard”. Emily was right, but she wasn’t giving up easily. Her tenacity and optimism were two of the many reasons Mark had married her 11 years ago. They had first met at Oxford University where she was studying for a master’s degree in economics.

The next morning, Emily grabbed Mark and declared, “Hireathon!”. Data scientists, she’d found, have easily damaged egos. If not hired on the spot, they will simply wait for the next headhunter to offer them careers with almost limitless potential. If the right person comes along, hire them immediately. Otherwise they are gone forever.

As she described this to Mark, she explained how someone else had obviously arrived at the same conclusion. “A national recruitment firm is running a speed dating event for data scientists”, she exclaimed. “I would like to register us and I really want you to be there”. Emily easily handled his initial objections. “How does it work?”, he asked submissively.

What on earth is a hireathon?

“A hireathon is a little like speed dating”, Emily explained. “As recruiters, we have a limited time with… let me check… 24 applicants. There is one key constraint”, she added. “Demand is far greater than supply. If we find the right person, we must hire them immediately or we can assume they are gone forever”.

Emily summarised the conditions of the event for Mark:

  • There are 24 talented data scientists who have agreed to see us over the next four days.
  • We have, at most, one hour to interview each applicant.
  • We can only rank each applicant with respect to the other candidates we have already interviewed.
  • We will assign each applicant a number that defines how we rate them from 1 to 24. Lower numbers mean higher relative rankings – ideally we want to hire candidate number 1.
  • We can change people’s rankings at any time during the interviews.
  • We will know nothing about formal qualifications. We can only relate each of the interviewees to each other.
  • The interviews will take place in a random order, dictated by the organiser.
  • After each interview, we must hire the applicant immediately or accept they are gone forever.

“There are some other conditions that may interest you”, she said. “But I’ll explain them over your favourite dessert. I’ll just go and check how it’s coming along”. Dessert was apple pie and cream, simple and appreciated. Emily served it up along with a few diagrams she had put together a little earlier. Mark shared his time between the pie and Emily’s drawings, although the pie was certainly getting far more attention.

“Enjoy your pudding first darling”, Emily said, clearly noticing his priority.

“I’ll talk to you a little later”.

Optimal stopping to the rescue

Mark trusted his wife – if not for her, he would not have started his current venture. He also tortured himself about the promises he’d made to his investors. They were not intentionally inflated – but perhaps a little optimistic. He asked Emily to describe her diagrams to him.

“It’s based on a theory called optimal stopping”, she said excitedly. “It’s a subject I studied at university after refusing your offer to bunk off for the last 2 weeks of my first term”. She didn’t add how tough a decision it had actually been. Mark had adorned the apartment they shared for 2 years with photos of his road trip along the west coast of the U.S., while she had stayed at home creating lecture notes for both of them. As interesting as optimal stopping was, it could not quite match the splendour of a Pacific sunset across the Santa Monica horizon.

“We start with a pool of 24 applicants, let me take you through one possible sequence of events”, she explained. She slowly nudged her first diagram (Figure 2) under Mark’s nose. For demonstration purposes, she had assigned a relative ranking to each applicant. “Of course, we can’t rank any applicant before seeing them”, she said. “But this will help to explain the process”. Emily went on to describe the features of a ‘look before you leap’ strategy. She continued, “We look without hiring anyone for a certain period of time. After this point, we hire the first candidate that is better than anyone we saw during the look phase”.
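The rule Emily describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `look_then_leap` and the interview order below are made up for this example and are not the order shown in Emily's diagrams. The true ranks are used purely to simulate the comparisons an interviewer would make.

```python
from typing import Optional, Sequence


def look_then_leap(ranks: Sequence[int], look: int) -> Optional[int]:
    """Return the 0-based position of the applicant hired, or None if nobody qualifies.

    ranks[i] is the true overall rank of the i-th interviewee (1 = best).
    The interviewer never sees these numbers; only comparisons between
    applicants already interviewed are used.
    """
    # Look phase: interview without hiring, remembering the best rank seen.
    best_seen = min(ranks[:look], default=float("inf"))
    # Leap phase: hire the first applicant who beats everyone from the look phase.
    for position in range(look, len(ranks)):
        if ranks[position] < best_seen:
            return position
    return None  # nobody in the leap phase beat the best of the look phase


# Hypothetical interview order, for illustration only.
order = [7, 12, 3, 20, 15, 9, 24, 5, 18, 11, 4, 2,
         21, 1, 6, 8, 10, 13, 14, 16, 17, 19, 22, 23]
hired = look_then_leap(order, look=9)
print(hired, order[hired])
```

With this made-up order the best applicant seen during the look phase has rank 3, so the hire is the 12th interviewee (position 11, overall rank 2), and the applicant ranked 1, who is 14th in line, is never interviewed. When the function returns None, the choice is between taking the last applicant and walking away.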

Emily pointed out the obvious implication of a look before you leap strategy. “Of course, we’re in trouble if any of the applicants find out what we’re doing. It’s unlikely they’ll turn up for interview if they find they’re part of the look phase.” On reflection, Mark felt this balanced their own constraint of having to offer a role immediately after an interview had finished.

Figure 2: The initial pool of applicants

 

Emily’s second diagram (Figure 3) showed their position after interviewing the first 9 applicants. Predicting Mark’s question, Emily explained, “Nine applicants equates to just over 37% of the total number of applicants. That’s an important and fascinating number”.

For this type of challenge, the look phase should cover roughly 37% of the total number of applicants (the exact fraction approaches 1/e, about 36.8%, as the pool grows). This gives us the greatest chance of finding the best applicant. Using this strategy, we will find the best applicant about 37% of the time. It’s a mathematical quirk that the size of the look phase and the chance of success turn out to be the same number.
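For readers who want to check the figure for 24 applicants, here is a small sketch that computes the exact chance of success for every possible look-phase size. It assumes the standard setting of the problem (applicants arrive in a uniformly random order and only the single best counts as success); the function name `success_probability` is chosen for this example.

```python
from fractions import Fraction


def success_probability(n: int, look: int) -> Fraction:
    """Exact chance of hiring the single best of n applicants when the first
    `look` applicants are interviewed but never hired."""
    if look == 0:
        return Fraction(1, n)  # we simply hire the first applicant
    # The best applicant sits at position k (1-based) with probability 1/n.
    # We reach and hire them only if the best of the first k-1 applicants
    # fell inside the look phase, which happens with probability look/(k-1).
    return sum(Fraction(1, n) * Fraction(look, k - 1)
               for k in range(look + 1, n + 1))


n = 24
table = {look: success_probability(n, look) for look in range(n)}
best_look = max(table, key=table.get)
print(best_look, float(table[best_look]))
```

For 24 applicants the maximum falls at a look phase of 9, with a chance of success of a little over 38%, slightly better than the headline 37%.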

Figure 3: The situation directly following the look phase

 

“Our best possible strategy only gives us a 37% chance of success?”, Mark asked. He felt a little deflated by the odds. This kind of news could transform an excellent dessert into indigestion – and that’s the last thing he needed. Emily comforted him and explained it was way better than a random choice of a little over 4% (that is, the odds of finding the best candidate by randomly choosing someone from a group of 24 people).
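Emily's comparison can also be checked by brute force. The sketch below is a simple Monte Carlo simulation, not a proof: it shuffles 24 applicants many times and counts how often each approach lands the top-ranked one. The function name, the trial count and the seed are arbitrary choices for this example, and it assumes that the last applicant is hired when nobody in the leap phase qualifies.

```python
import random
from typing import Tuple


def simulate(n: int = 24, look: int = 9, trials: int = 20_000,
             seed: int = 1) -> Tuple[float, float]:
    """Estimate how often each approach hires the best applicant (requires look >= 1)."""
    rng = random.Random(seed)
    leap_wins = 0
    random_wins = 0
    for _ in range(trials):
        ranks = list(range(1, n + 1))  # 1 = best applicant
        rng.shuffle(ranks)             # random interview order
        best_seen = min(ranks[:look])
        # First applicant after the look phase who beats it, else the last one.
        hired = next((r for r in ranks[look:] if r < best_seen), ranks[-1])
        leap_wins += hired == 1
        random_wins += rng.choice(ranks) == 1  # pick someone at random
    return leap_wins / trials, random_wins / trials


leap_rate, random_rate = simulate()
print(f"look before you leap: {leap_rate:.1%}, random choice: {random_rate:.1%}")
```

The two estimates should land close to 38% and 4% respectively, with some run-to-run noise if the seed is changed.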

Following the process through, Emily presented Mark with her third and final diagram (Figure 4). “So now, we enter the leap phase”, she explained. “For our scenario, candidate 15 has an overall ranking of 2nd. If this was happening now, we would simply know that this is the best candidate so far and hire him or her immediately”.

Figure 4: Hiring during the leap phase

 

Mark interrupted, “But the best possible candidate is 17th in line – we would not get to see him at all?”, he half asked and half stated. “That’s true, love”, Emily replied. “Furthermore, if we interviewed the highest ranked candidate during the look phase, we would end up interviewing everyone, in which case we’d offer the job to the last candidate, ranked 10th”. “Either that or walk away without a chief data scientist”, Mark added, not fully convinced of their approach.

All’s well that ends well

Mark and Emily debated their position for most of the evening. It seemed that, given the constraints they were facing, the ‘look before you leap’ strategy really was their best option. Ironically, this became clearer after a second glass of wine – it had been a long and rewarding day.

Emily registered Bargain RAIders for the Hireathon and they did recruit a new chief data scientist. It turned out to be the 12th person they interviewed on the day. In reality, they could never know where their new recruit ranked amongst all 24 potential employees. After all, they didn’t even meet the last 12 applicants. They only knew that she was the first applicant after the look phase who was better than everyone they had seen before her.

Bargain RAIders are enjoying their new adventures, and Gemma, their chief data scientist, turned out to be an excellent choice. She now has a team of data and AI scientists and, luckily, has not been subject to the same constraints that Mark and Emily initially faced when recruiting her.

It seems you really can hire great data scientists

Whilst our story above is purely fictitious, the optimal stopping strategy is real. This particular example is often known as the secretary problem. It’s believed that it first appeared in the February 1960 edition of Scientific American. Figure 5 illustrates the best approach given the constraints we defined.

Figure 5: Summarising the look before you leap strategy

 

The strategy is proven mathematically and, quite remarkably, the numbers barely change for larger groups of applicants. Whether we have 24 candidates, 1000 or even a million, the look before you leap strategy has roughly a 37% chance of finding the best possible applicant.
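A quick way to see how stable the numbers are is to compute the best look-phase size for pools of different sizes. The sketch below uses the standard closed-form success probability for this problem, (look / n) × (1/look + 1/(look+1) + … + 1/(n−1)), evaluated in floating point; the function name `best_look_phase` is an assumption of this example.

```python
import math
from typing import Tuple


def best_look_phase(n: int) -> Tuple[int, float]:
    """Return (look, probability) maximising the chance of hiring the best of n."""
    best_look, best_p = 0, 1.0 / n  # look = 0 means hiring the first applicant
    tail = 0.0  # running value of 1/look + 1/(look+1) + ... + 1/(n-1)
    for look in range(n - 1, 0, -1):
        tail += 1.0 / look
        p = look / n * tail
        if p > best_p:
            best_look, best_p = look, p
    return best_look, best_p


for n in (24, 1000, 1_000_000):
    look, p = best_look_phase(n)
    print(f"n={n}: look phase {look} ({look / n:.1%}), success {p:.1%}")
print(f"1/e = {1 / math.e:.4f}")
```

Both the look-phase fraction and the chance of success drift towards 1/e (about 36.8%) as the pool grows, which is where the rounded 37% comes from.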

There are many derivatives of this strategy based on different constraints. For example, you can offer the role to a previous candidate but there is a 50% chance they will reject it. In this case, a look before you leap strategy is still best although the relative sizes of the look and leap phases will be different.

If more information is known, then the best strategy may change significantly. For example, if you have information that allows a percentile to be determined, such as an industry recognised qualification, you may simply choose the first person above a certain percentile. Back in the real world, we might think it very unlucky to find ourselves faced with this particular scenario. However, data comes in many forms, and unexpected challenges almost always appear.

Great data scientists are creative thinkers. They have experience and capabilities that promote thought leadership. They strive to deliver expected outcomes and business value rather than focusing on academic models that are difficult or impractical to deploy. More than anything else, talented data scientists ask valuable questions as well as providing the outcomes you need.

At Objectivity, we like to think a little differently. Our customers’ goals are sometimes similar, yet always unique. Finding the best outcomes means we stray from the beaten path now and again. We always match our approach to your challenge rather than the other way around. Because the most challenging problems require the most creative solutions.

*Bargain RAIders is a fictitious company. All names, characters, and incidents portrayed in this document are fictitious. No identification with actual persons (living or deceased), places, buildings, and products is intended or should be inferred.

2 thoughts in Comments

  1. PBS

    Great blog Matt!

    My LOL 🤣 moment was:

    “Mark nodded confidently to his audience, and noticed his reflection in the office window, was doing exactly the same”!

    And the article got me thinking…

    At the speed dating events I’ve been to (!), I wasn’t the only person trying to find a match. There were just about as many seekers as candidates.

    In that scenario, the pool of available candidates would be shrinking with each round and the pressure would really be on.

    Some modified version might be more optimal in those conditions, possibly with a reduced “look” phase???

    And we might assume that it’s not necessary to find the best but any candidate in the top 25% should be good enough??

    Devising an optimising strategy to succeed under challenging conditions is always a great thing to practice and thanks for shining the light on this one for us 😃👍

  2. Matt

    Thanks Peter.

    I’m just toying with a few lines of code and a little mathematics to look at your scenario and a few others. The dating scene is about to be disrupted!

    🙂
