Using Google Cloud and machine learning to improve fraud detection
As the world's largest online-only supermarket, our systems handle millions of events every minute as our customers navigate our website and apps, add items to their trolleys, choose delivery slots, and check out their orders.
These interactions result in petabytes of data collecting in our data lake stored in Google Cloud. One challenge facing any retailer operating online is isolating and recognizing the rare incidents classified as fraud in a smart and efficient way.
For those unfamiliar with online fraud, it typically covers any instance where an order is delivered but not paid for. Fraud can happen as a result of a genuine mistake (a customer entering the wrong personal details or using an expired card accidentally) but, occasionally, it can also be the result of malicious intent. If left unchecked, fraud can propagate to other systems and companies and affect our customer service.
Therefore, we needed a clever way of predicting and recognizing these incidents among millions of other normal events. The answer to this complex challenge was to use the cloud and machine learning (ML). Our data science team had already successfully deployed many ML projects into production, so it made sense to design our own solution using the experience and competencies we had gained from elsewhere in the business.
In addition to augmenting our contact center, machine learning pervades our end-to-end e-commerce, fulfilment and logistics platform. For example, ML already powers the way we recommend products on our webshop and how we generate search results, which are designed to avoid suggesting meat to vegetarians or products containing gluten to celiacs.
The motivation behind using ML for fraud detection was twofold: speed and adaptability. Machines can detect patterns in large volumes of data far faster than humans can. Also, as fraudsters change their tactics, machines can learn the new patterns much more quickly.
Traditionally, fraud detection agents are employed to make judgement calls on whether a certain interaction is likely to be fraud or not. Decisions are based largely on intuition and can leave companies playing a cat-and-mouse game with fraudsters. For example, if fraud agents notice a correlation between baskets containing an unusually large order of alcohol and confirmed instances of fraud, they might then continue to look out for this trend in future. However, once fraudsters pick up on this, a new trend may start for, say, household goods, and so the game of catch-up continues.
A machine learning model, on the other hand, can learn and adapt far more quickly, evolving based on the current environment and even predicting future trends; the model can also look at many more factors than a human or a fixed rule-based engine can. The work of fraud agents is then made more manageable, as they no longer have to frantically analyze thousands of data points to establish fraud. Instead, they simply perform a final check to confirm whether they should cancel the order or not based on the prediction made by the model; it’s a perfect case of humans and machines working together in harmony.
However, just because we could improve our fraud detection process with an ML model didn’t make it easy to implement. Confirmed fraud cases are incredibly rare; given a typical fraud rate of one in every thousand orders (0.1%), a machine learning model that is only 99.9% accurate could still miss many instances of fraud. In the extreme, a model that labels every order as genuine scores 99.9% accuracy while catching none of them.
Therefore, our fraud detection ML model had to be incredibly accurate.
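To make that arithmetic concrete, here is a minimal sketch in Python. The order counts are illustrative, chosen only to match the 0.1% rate mentioned above, and the "always genuine" predictor is a hypothetical baseline rather than our production model.

```python
def confusion_counts(labels, predictions):
    """Count true/false positives and negatives, where True means fraud."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fp = sum(1 for y, p in pairs if not y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    tn = sum(1 for y, p in pairs if not y and not p)
    return tp, fp, fn, tn


# Illustrative data (an assumption): 100,000 orders, 1 in 1,000 fraudulent.
labels = [i % 1000 == 0 for i in range(100_000)]

# Hypothetical baseline that labels every order as genuine.
predictions = [False] * len(labels)

tp, fp, fn, tn = confusion_counts(labels, predictions)
accuracy = (tp + tn) / len(labels)  # share of orders labelled correctly
recall = tp / (tp + fn)             # share of fraud cases actually caught

print(f"accuracy={accuracy:.3f} recall={recall:.3f} missed={fn}")
# prints: accuracy=0.999 recall=0.000 missed=100
```

This is why accuracy alone is a poor yardstick for rare events, and why measures such as precision and recall are more informative for a problem like this.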
So, how did we do it? From the data we had collected from past orders, including cases of fraud, we created a list of features which included the number of past deliveries, the cost of baskets, and other information. The more features we included in the training data, the more reliable the model could be, so we made sure that we were providing our model with as much information as possible (and we will continue to add more as time goes on).
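As a rough illustration of what such features might look like, the sketch below derives two of them (number of past deliveries and basket cost) from raw order records. The field names (`delivered`, `price`, `quantity`) and the function `build_features` are hypothetical and exist only for this example; they are not our actual schema.

```python
def build_features(past_orders, basket):
    """Turn raw order history and the current basket into model features.

    Field names are hypothetical, chosen for illustration only.
    """
    delivered = [order for order in past_orders if order["delivered"]]
    basket_cost = sum(item["price"] * item["quantity"] for item in basket)
    return {
        "past_deliveries": len(delivered),
        "basket_cost": basket_cost,
    }


# Made-up example records.
past_orders = [
    {"delivered": True},
    {"delivered": True},
    {"delivered": False},  # e.g. a cancelled order
]
basket = [
    {"price": 2.5, "quantity": 2},
    {"price": 10.0, "quantity": 1},
]

features = build_features(past_orders, basket)
```

Adding a new feature then amounts to adding a new key to the returned dictionary, which is one simple way to keep growing the feature set over time.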
After collating our data, we had to decide on an algorithm capable of learning from it. We eventually implemented a deep neural network in TensorFlow, as it was precise and easy to deploy into production. TensorFlow was a natural choice: we had already moved our data analytics to Google Cloud, so it worked well alongside our data stored on Google Cloud Platform. It also made our model scalable and transferable, which has in turn empowered our developers.
In order to brainstorm ideas and improve our proof-of-concept model, we hosted hackdays where our multidisciplinary team of data scientists and software engineers explored new features, tested new models, analysed and visualised new data and explored monitoring. The main goal of getting everyone in a room was to manipulate our data in order to gain insights into the problem and provide more information for our model. These hackdays allowed us to focus solely on the task at hand and took our model to the next level.
We now have a model that predicts results in real-time and provides the likelihood of fraud as a probability, using the following process:
The customer order information is stored and analysed using BigQuery.
The information is then processed using Dataflow, where the data is normalised (a process whereby numerical features are re-scaled around the origin, and categorical features are transformed into a sequence of integers). This reformatting is necessary for many ML algorithms, including deep neural networks built using TensorFlow.
Dataflow is also used to transfer the data from BigQuery to Datastore and Cloud Storage.
Datastore provides fast data access, allowing for pre-computed features to be accessed when running real time predictions.
Cloud Storage efficiently stores data using a file system.
Cloud ML consumes the data from Cloud Storage and produces models as APIs.
The Ocado fraud detection model, powered by TensorFlow, then reads the data from Datastore and then, using the Cloud ML APIs, makes real-time predictions.
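The normalisation step in the process above can be sketched in a few lines of plain Python. This is a simplified illustration, not our Dataflow pipeline: we assume here that "re-scaled around the origin" means subtracting the mean and dividing by the standard deviation, and that categories are mapped to integers in order of first appearance. The function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def standardise(values):
    """Re-scale numbers to zero mean and unit variance (assumed scheme)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: centre them and avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def build_vocabulary(categories):
    """Assign each distinct category an integer, in order of first appearance."""
    vocab = {}
    for category in categories:
        vocab.setdefault(category, len(vocab))
    return vocab


def encode(categories, vocab, unknown=-1):
    """Replace each category with its integer; unseen categories get `unknown`."""
    return [vocab.get(category, unknown) for category in categories]


# Made-up sample values.
basket_costs = [20.0, 40.0, 60.0]
payment_types = ["card", "voucher", "card"]

scaled = standardise(basket_costs)      # centred on zero
vocab = build_vocabulary(payment_types)
encoded = encode(payment_types, vocab)  # [0, 1, 0]
```

Whatever scheme is used, the mean, standard deviation and vocabulary computed on the training data need to be reused unchanged at prediction time, so that a real-time order is transformed in exactly the same way as the orders the model learned from.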
The model has been a great success, improving Ocado’s precision in detecting fraud by a factor of 15. However, we are keen to continue improving. We are now tackling our next challenges: investigating algorithms that could allow us to explain our predictions in more detail, assessing whether we can transfer learnings from one retailer to another, and considering what tools could help us to streamline our process.
Looking back, this project wouldn’t have been possible without the close collaboration between many technology and retail teams in Ocado: the ML model has been consuming data from our order management, payments, CRM, and e-commerce teams. We have been using the tools developed by our data engineering, data platform, and machine learning services teams. We also relied on the data governance team to help us set up the Google Cloud projects and interact with our colleagues in the fraud team.
In such a fast-changing industry, we are always trying to stay ahead of the game, exploring the latest technologies and thinking of creative ways to implement our cutting-edge ideas.
Have you found any innovative uses for machine learning models using TensorFlow?