David Azcona. Image: Zalando

In data science, uncertainty is one of the most challenging aspects

7 Feb 2022

Zalando’s David Azcona discusses his work in data science and machine learning, and why it’s important to stay curious in this field.


David Azcona completed a PhD at Dublin City University and spent a year as a Fulbright fellow at Arizona State University before taking up his current role at Zalando.

He now works as a senior applied scientist in the market insights team at the fashion e-commerce company. His role focuses on improving machine learning systems and working with computer vision and natural language processing techniques.

‘Many roles in the data science space have moved from pure research scientists to machine learning engineers’

If there is such a thing, can you describe a typical day in the job?

My day almost always starts with our regular stand-up, which is a very short catch-up meeting where each team member discusses what they are working on and any blockers they may have. This was typically done in person, literally standing up, to make it as quick as possible. These days we do it online but still try to keep it very short.

Afterwards, my day can vary a lot. I generally work on improving our machine learning systems, reviewing my colleagues’ code, researching new infrastructure approaches for our data pipelines, collaborating with engineers and product specialists, reviewing state-of-the-art papers in the literature and interviewing new candidates for our open roles.

What types of data science projects do you work on?

When I started at Zalando, I worked on personalisation by recommending brands to our customers that are relevant to their personal style. However, when a new brand is onboarded onto Zalando, we suffer from the cold-start problem in recommendation systems as these are very data-hungry environments.

That is, the system cannot draw any inferences for users about items on which it has not yet gathered sufficient information. Our solution has been to learn compact representations for brands, building on Zalando Research’s existing work, and to leverage those embeddings to find customers who might be interested in these new brands.
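To make the embedding idea concrete, here is a minimal sketch in Python. It is not Zalando's system. It assumes, purely for illustration, that every brand (including a newly onboarded one) already has a small embedding vector, that a customer's taste can be summarised as the mean of the embeddings of brands they have interacted with, and that cosine similarity is a reasonable way to compare the two. The names `customers_for_new_brand`, `brand_vecs` and `customer_history`, and all of the toy data, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def customers_for_new_brand(new_brand_vec, brand_vecs, customer_history, top_k=3):
    """Rank customers by how close their taste profile is to a new brand.

    Assumption: a customer's profile is the mean embedding of the brands
    in their history. The new brand needs no interaction data, only a vector.
    """
    scored = []
    for customer, brands in customer_history.items():
        profile = mean_vector([brand_vecs[b] for b in brands])
        scored.append((cosine(profile, new_brand_vec), customer))
    scored.sort(reverse=True)  # highest similarity first
    return [customer for _, customer in scored[:top_k]]


# Hypothetical two-dimensional embeddings: roughly (sporty, formal).
brand_vecs = {
    "sporty_a": [0.9, 0.1],
    "sporty_b": [0.8, 0.2],
    "formal_a": [0.1, 0.9],
}
customer_history = {
    "alice": ["sporty_a", "sporty_b"],
    "bob": ["formal_a"],
    "carol": ["sporty_a", "formal_a"],
}

# A newly onboarded brand with no purchase history yet.
new_brand_vec = [0.85, 0.15]
ranked = customers_for_new_brand(new_brand_vec, brand_vecs, customer_history)
print(ranked)
```

In this toy example the customer whose history is entirely sporty brands ranks first for the new sporty brand, even though nobody has interacted with that brand yet, which is the point of sidestepping the cold start with embeddings.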

Since then, I have switched teams and I am now working on market insights, where we look for product matches between Zalando’s assortment and its competitors’. For that, we use state-of-the-art machine learning methods on large amounts of multimodal data, including images, text and structured data, together with human-in-the-loop systems.

We get to use computer vision and natural language processing techniques to provide insights for Zalando’s traffic and pricing platform strategy.

What skills do you use on a daily basis?

As applied scientists, we work on problems with a high degree of ambiguity, and there are a number of soft skills, such as critical thinking and problem solving, that we use daily to plan our milestones, design the next phase of experiments and develop our machine learning pipelines.

At Zalando, I have found my colleagues to be very open minded. [They] listen to their peers’ ideas and propose solutions to our challenges. In addition, we present our results and approaches to upper management and other stakeholders.

These roles require a good understanding of machine learning fundamentals, building on some maths and probability background. We mostly work with programming languages such as Python and machine learning libraries such as PyTorch or TensorFlow, on top of a cloud provider’s infrastructure. This is something we learn and get better at on the job, so these tools are not a prerequisite.

What are the hardest parts of working in data science?

In my personal opinion, one of the most challenging aspects of working in data science is uncertainty. Planning the milestones of one of these projects and estimating the amount of work it may take is generally harder than it is for an engineering project that implements a particular feature.

However, this makes it much more rewarding too as we get to implement a novel machine learning algorithm on a new domain that customers love!

The tooling to train and deploy models at scale also has a lot of room for improvement. We are still a long way from being able to deploy and serve online real-time predictions in a smooth manner.

Do you have any productivity tips that help you through the day?

I usually make a list of things I want to complete by the end of the day, which keeps me on track. One of my former managers, Anthony Brew, had a multitude of great tips to keep us productive and motivated, such as: write often, record your achievements for future reference, and finish what you start.

I often take a break in the middle of the day and go for a short walk. This helps me to clear my head and get back to work with more energy! In the evenings, I practise some sport, usually a workout outside with some friends that helps me get outside again and socialise after working from home during the day.

How has this role changed as the data science sector has grown and evolved?

Many roles in the data science space have moved from pure research scientists to machine learning engineers.

These engineers are in charge not only of reviewing state-of-the-art approaches and training predictive models, but also of deploying machine learning systems in production using data pipelines and cloud infrastructure, and then monitoring their predictions and measuring model drift, among other tasks.
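As an illustration of what measuring model drift can look like in practice, the sketch below computes the population stability index (PSI), one commonly used drift score, between a reference batch of prediction scores and a live batch. The interview does not say which drift metric Zalando uses, so the choice of PSI, the function names `bin_fractions` and `psi`, the ten equal-width bins and the toy scores are all assumptions. Scores are assumed to lie between 0 and 1.

```python
import math


def bin_fractions(scores, n_bins=10, eps=1e-6):
    """Fraction of scores falling in each equal-width bin over [0, 1].

    Empty bins are floored at eps so the logarithm in PSI stays defined.
    """
    counts = [0] * n_bins
    for s in scores:
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        counts[idx] += 1
    total = len(scores)
    return [max(c / total, eps) for c in counts]


def psi(reference, live, n_bins=10):
    """Population stability index between two batches of scores.

    PSI = sum over bins of (live - ref) * ln(live / ref).
    It is 0 when the binned distributions match and grows as they diverge.
    """
    ref = bin_fractions(reference, n_bins)
    cur = bin_fractions(live, n_bins)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical prediction scores captured at training time.
reference_scores = [0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85]

# A live batch that looks like training data, and one that has collapsed to high scores.
stable_scores = list(reference_scores)
shifted_scores = [0.95] * 8

print(psi(reference_scores, stable_scores))
print(psi(reference_scores, shifted_scores))
```

A monitoring job could run a check like this on a schedule and raise an alert when the score crosses a threshold the team has agreed on. Which threshold is appropriate depends on the model and the data.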

What do you enjoy most about working in data science?

I love working with some of the most talented engineers on very challenging problems that range from computer vision for fashion, such as detecting garments, to natural language understanding for customer reviews.

These solutions impact millions of customers in a meaningful way, making their shopping journey easier and more enjoyable.

What advice would you give to someone who wants to work in data science?

I would suggest staying curious. Start off by exploring what can be done using machine learning and how companies and research institutes are leveraging it to make an impact, and then dive deep into the fundamentals using the many online resources available, such as MIT or Stanford online courses.

I am a very hands-on learner and I love to build small projects to solidify my understanding and put my knowledge into practice. Hackathons are a great way to get started while solving a business problem in a short period of time.
