Pandora's Sid Patil on How to Design Your Data Science Team


There is no one more interesting to learn from about Artificial Intelligence, Machine Learning and Data Science than my good friend Sid Patil, Head of Data Science, Listeners at Pandora. Why? Because not only does he use words like “nutritious” and “delicious” to describe data, but he was also one of those guys who took Las Vegas by storm and were portrayed in the movie “21.”

So, I wouldn’t recommend challenging him to a game of Blackjack, but I highly recommend tuning in to these short clips of wisdom to help you create your perfect recipe for data science success!

 

What to know about Machine Learning

“So everyone is using machine learning today, right? And so is Pandora. But the problem with machine learning is that these algorithms tend to be very greedy. So what these algorithms tend to do is they will keep exposing experiences to you that they are really, really certain that you're going to like, right? And so you keep seeing things that you already believe in very strongly and you already love. So it deepens into what you're already doing and how you're already thinking and that's what's making society more divisive and that's what you see today.

It's not that different for music. So three to four years ago we realized that diversity in musical experiences was really important. It was not just important for our listener, but it was also important to emerging artists who wanted to reach their new fans. So we can't just let machine learning go wild. So we have to introduce systems and processes in there which will actually lead to more discovery for our listeners and it will also help the artists, so it's not all exploitation-heavy. Internally, we call that delicious versus nutritious, because machine learning will tend to always deliver delicious things to you. Actually, I stole this phrase from a friend so I can't take credit for it, but machine learning always tends to actually offer you delicious stuff, but the nutritious stuff is also important. Right? And people are realizing that today and Pandora is heavily investing in that too.”

 
The problem with machine learning is that these algorithms tend to be very greedy. ML will tend to always deliver delicious things to you, but the nutritious stuff is also important.
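
To make the "greedy" behavior concrete, here is a minimal sketch in Python. It is not Pandora's system, which the interview does not describe. It uses epsilon-greedy selection, a standard textbook way to mix exploitation with exploration, and the track names, scores, and the `recommend` function are all hypothetical. With `epsilon = 0` the recommender only ever serves the "delicious" top-scoring item; a small nonzero `epsilon` lets some "nutritious" discovery through.

```python
import random

def recommend(scores, epsilon, rng):
    """Pick one item: explore at random with probability epsilon, else exploit the top score."""
    if rng.random() < epsilon:
        # Exploration: any item can surface, including lesser-known ones.
        return rng.choice(list(scores))
    # Exploitation (the greedy choice): the item the model is most certain about.
    return max(scores, key=scores.get)

# Hypothetical predicted-affinity scores for one listener (illustrative values only).
scores = {"favorite_hit": 0.95, "familiar_artist": 0.80, "emerging_artist": 0.40}

rng = random.Random(0)  # seeded so the sketch is repeatable
greedy_plays = [recommend(scores, 0.0, rng) for _ in range(1000)]
mixed_plays = [recommend(scores, 0.2, rng) for _ in range(1000)]

print("distinct tracks, fully greedy:", len(set(greedy_plays)))
print("distinct tracks, 20% exploration:", len(set(mixed_plays)))
```

The fully greedy run never leaves `favorite_hit`, while the run with exploration also surfaces the emerging artist. In practice the exploration rate would be a product decision to tune, not a fixed constant.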
 

 

What is the role of the Data Scientist?

“So what I tell aspiring data scientists is that data science is not about taking a highly curated, clean dataset and applying math to it. There's a lot more that goes into it and the way I see it, there are four pillars of data science.

  • The first is that the data scientist needs to understand the context of the problem. They need to understand the business and the context because they are there to solve the business problem. They're not there to solve a math problem. They're going to use math to solve it, but it's really important that they understand the context of the problem, so they have to wear the product hat.

  • The second pillar is that a lot of their time is going to go into cleaning the data, sorting the data. I mean these data sets tend to be massive today and they're very unstructured, so a lot of their time goes into cleaning, sorting, organizing the data so that it's in a place where you can actually apply the math techniques on them.

  • Once the datasets are ready, the third pillar is to actually build and optimize your models, using machine learning or whatever math techniques you want.

  • But your work doesn't end there. There's a fourth aspect to it, which is socializing what you did, why you did it, what were the results, what were the methodologies used.

So a data scientist has to aspire to excel in all of these four compartments and actually concentrate on all these different aspects.”

 
A data scientist has to aspire to excel in all four compartments: business context, engineering, science, and socialization.
 

How to Build a Data Science Team? 

“Look, I think it is very well covered that companies need to make data-driven decisions today. I think that part is well accepted, but here's how people should go about building their teams.

  • Number one, they really, really need to concentrate on building diverse teams. Diversity is really important, and I'm not saying this for the sake of saying it, but here's why it's important. Whatever business you are building, your audience or your shoppers are going to be very diverse, right? So it only makes sense that the people who are building the product are diverse too. I can give you an example. I mean, at Pandora, we have more women who use Pandora than men, so you can see where I'm going with my data science team. We're not there yet, but our data science team needs to reflect our audience to some extent. So that's really important.

  • The second thing is people should concentrate on hiring data scientists with different skill levels and it's important to establish a strong mentorship program so that data scientists are actually learning from each other. 

  • Third is, as Reid Hoffman alluded to, we expect a lot from data scientists. In general we expect a lot from our employees, but we expect a lot from our data scientists. And there has to be a mutual investment. So we have to invest in our data scientists, and the two things that are really important to our data scientists are education and ownership. Data scientists need to feel that they're always learning and you need to empower them so that they have the feeling of owning what they're working on. Owning certain product features, for example. So at Pandora we actually invest very heavily by sponsoring our scientists to attend two or more conferences every year and they get the chance to actually learn from other scientists, even academics, but they also go and actually do workshops and do presentations. They get to take ownership of what they're working on.

  • And the last part I'll say is that this is a mistake I see often in the valley, which is businesses have data science teams, but data science exists as a shiny toy on the side. I think it's important that people build these teams not to inform the business but to drive the business. And it's going back to the four pillars. That's why it's important that you hire scientists who are not just statisticians, but who are also builders, so that they can actually drive the business forward.”

 
A mistake I see often in the valley is data science exists as a shiny toy on the side. I think it’s important that people build these teams not to inform the business but to drive the business.
 
 

How to Foster Growth within Your Data Science Team? 

“There is an education and a learning process involved with almost all the data scientists that we hire. That's why actually establishing a strong mentorship program is important. So we tend to hire from everywhere. We invest pretty heavily in our internship program and so a lot of our interns actually convert into full time data scientists. We hire scientists straight out of Grad school. But again, that's fine as long as you're ready to invest in them. And the way to invest in them is to have scientists at different levels so that they can learn from each other. Honestly, even the principal data scientist and even the staff level data scientists can learn from scientists who are coming straight out of school because they bring this fresh perspective. They have dabbled with the latest techniques and they have spent years and years working on it. So education goes both ways and learning goes both ways.”

 
Education goes both ways and learning goes both ways.
 


To hear more from Sid, check out our full interview with him and how he is trying to capture the holy grail of digital music to deliver your ultimate playlist.