Top 5 considerations when creating an in-house Data Science Team!
Many of our clients ask us what to consider when setting up a data science team and what mistakes they should avoid. Although there are many considerations, we have addressed the ones we believe are key and will determine the unit's success.
This article assumes the organisation has already decided to have an in-house department. In other words, the need for an internal department has been established and quantified, which means:
that the difference between descriptive, predictive, and prescriptive analytics is fully understood; and
that the portfolio size and modelling/analytical requirements warrant the on-going costs of the department.
Let’s delve deeper into the dos and don’ts when setting up a Data Science Team.
1. Effective management is critical
When structured and managed correctly, a data science team can benefit the organisation enormously. Unfortunately, we have come across too many departments that have no actual manager (all team members are senior but have no direct reporting line to one particular individual) and/or no link to the business needs of the organisation. This results in the Data Science team working on what is cool rather than what’s needed.
The team manager must be sufficiently technical AND have a good grasp of the business requirements and how work should be prioritised. Specific use cases should be tackled in a deliberate order of priority, balancing objective-driven output against R&D tasks to maintain job satisfaction in the team.
Key to all of this is ensuring the team is structured and works towards providing a measurable return on investment (ROI) for the business. Team members need to be aware of what actually provides benefits, and they must be accountable for their time, effort, and delivery. Data scientists who understand and aim to deliver benefits can make a significant impact on an organisation's bottom line.
When deciding the objectives, remember that data scientists are not credit analysts or reporting resources. Monitoring, ongoing business reports, etc. call for a different skillset, and a team with that skillset is much easier to recruit and manage. Focus your highly skilled resources on crucial projects.
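As a rough illustration of the ROI focus described in this section, the sketch below ranks candidate projects by a simple ROI ratio, (benefit - cost) / cost. The project names and all figures are hypothetical and exist only to show the arithmetic; a real assessment would use the organisation's own estimates.

```python
def simple_roi(annual_benefit, annual_cost):
    """Return ROI as a ratio: (benefit - cost) / cost."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    return (annual_benefit - annual_cost) / annual_cost


# Hypothetical projects: (estimated annual benefit, estimated annual cost)
projects = {
    "collections model": (400_000, 150_000),
    "exploratory R&D": (50_000, 100_000),
}

# Rank projects from highest to lowest ROI
ranked = sorted(
    projects,
    key=lambda name: simple_roi(*projects[name]),
    reverse=True,
)

for name in ranked:
    print(f"{name}: ROI = {simple_roi(*projects[name]):.2f}")
```

A negative ratio does not automatically rule a project out, since some R&D time helps retain the team, but it makes the trade-off visible when work is prioritised.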
2. Data Access and Standards
Data scientists are expensive resources, even at junior levels. They are also challenging to recruit and keep.
This means that frustrating tasks, such as fighting simply to get access to the data or repeating the same manual steps over and over, cost the organisation time, money and employee engagement.
Ensure data scientists can access the data and understand the systems and processes on which the solutions they design will be implemented.
Also, ensuring that processes are streamlined from the very beginning means the team can focus on the challenging tasks and deliver high-quality results.
3. Proper Due Diligence
Don’t believe it when you hear ‘That’s the old-fashioned way’. Due diligence may be old-fashioned, but it’s still a must. Too many times we are told ‘That’s the way it used to be done’ or ‘That’s not needed anymore’.
Quality checks, proper data validation, modelling standards, understanding how the techniques work with the data you have, etc. are all still needed, even when using the latest technology.
When tasks are not properly understood or checked, the results are overfitted models and sub-optimal solutions that simply don’t work. New technologies are about doing things faster, not ignoring fundamental steps in the process.
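One basic check of this kind is to compare a model's performance on the data it was trained on with its performance on data it has never seen. The sketch below is a minimal illustration, not a full validation standard: the function names, the sample labels and the `max_gap` tolerance of 0.05 are all assumptions made for the example.

```python
def accuracy(actual, predicted):
    """Share of predictions that match the actual outcomes."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def overfitting_check(train_score, holdout_score, max_gap=0.05):
    """Flag a model whose holdout score falls well below its training score.

    max_gap is a hypothetical tolerance; the right value depends on the use case.
    """
    gap = train_score - holdout_score
    return {"gap": gap, "flag": gap > max_gap}


# Made-up outcomes and predictions, for illustration only
train_acc = accuracy([1, 0, 1, 1, 0, 1, 0, 0, 1, 1],
                     [1, 0, 1, 1, 0, 1, 0, 0, 1, 1])
holdout_acc = accuracy([1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
                       [1, 0, 0, 0, 1, 1, 1, 0, 0, 1])

result = overfitting_check(train_acc, holdout_acc)
print(train_acc, holdout_acc, result["flag"])
```

A model that is perfect on its training data but much weaker on the holdout sample, as in this made-up case, is exactly the kind of result a manager should ask the team to explain.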
We understand that it’s sometimes difficult to manage what is not understood, but, as with any department, accountability is essential (and universal!). You should not be afraid to confront the team and ask them to explain what they are doing and, most importantly, why they are doing it that way.
You really don’t want to end up 1-2 years down the line with only costs, a big team, one or two ‘bad’ models delivered and a bad reputation for the Data Science team. Believe it or not, this is more common than you think.
4. Adequate team size
The right team size obviously depends on the portfolio size and the various use cases. Still, a good balance of experience and size will determine the team's ability to share information, quality-check each other's work and establish contingency.
Good, experienced Data Scientists are not easy to find, but they are very important to ensure the more junior members are mentored correctly and the team is sustainable over the long term.
Too many times, analytical teams end up with juniors only because the most experienced person leaves, and essential knowledge is lost in the process. As managers of analytical teams, we have often felt like a child whose sandcastle has been washed away by the sea and who has to start all over again.
5. Platform costs
Data Science teams will want and need a good development and execution platform. If you are considering using one of the many cloud platforms, understand that this can end up costing even more than the team itself.
Teams should have a budget for the technology side. They should also be accountable for that cost and ensure that instances are not left running for no reason.
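To make the cost of idle instances concrete, here is a minimal sketch that splits a monthly bill into hours actually used and hours left running. The hourly rate and usage figures are hypothetical and do not reflect any particular cloud provider's pricing.

```python
def monthly_instance_cost(hourly_rate, hours_running, hours_used):
    """Split an instance's cost into total spend and spend on idle hours."""
    if hours_used > hours_running:
        raise ValueError("hours_used cannot exceed hours_running")
    total = hourly_rate * hours_running
    idle = hourly_rate * (hours_running - hours_used)
    idle_share = idle / total if total else 0.0
    return {"total": total, "idle": idle, "idle_share": idle_share}


# Hypothetical case: an instance left on for a 30-day month (720 hours)
# but only used during roughly 160 working hours.
cost = monthly_instance_cost(hourly_rate=2.50, hours_running=720, hours_used=160)
print(f"Total: {cost['total']:.2f}, idle: {cost['idle']:.2f} "
      f"({cost['idle_share']:.0%} of spend)")
```

Reviewing a figure like the idle share each month is one simple way to hold the team accountable for its technology budget.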
Consider starting with simple objectives that can be completed using existing computing resources, then move on to automation and better use of existing cloud technologies. If the use cases are known and the data is structured, there is no need for high processing costs. Ask the team to justify the costs of moving to more sophisticated setups before you do so.
Remember, ‘techies’ will get very excited about new ways to do things or new software available. This additional cost may not be warranted if you are developing a traditional scorecard, for example.
Understanding and addressing the above areas will ensure your team is set up correctly and built from the ground up, using sound principles and limiting costs until the team can deliver adequate results for the organisation.
Not every single team and organisation will face the same challenges, but structure and an ROI focus will ensure you maximise the output of your Data Science team.
Until next time…