Vision picking - AI-enabled object detection

Enterprise - GTP Auto

When it comes to supply chain operations, the vast majority of warehouses in the developed world still use a pick-by-paper approach. Paper-based picking requires cost-intensive training, is relatively slow, and is prone to errors. Vision picking in warehousing operations can help address these challenges and increase productivity.

Model design

This project explores one of the most widely validated object detection models published in the field of Machine Learning. Using the YOLO (You Only Look Once) object detection framework, we built an Android mobile application that detects beverage packages by typology and packaging size. YOLO is a Machine Learning framework that uses Convolutional Neural Networks (CNNs) and Deep Learning for real-time object detection in images. It is pre-trained on the COCO dataset, a large-scale object detection, segmentation, and captioning dataset sorted into 80 object categories. In this case, the model was fine-tuned on a privately produced dataset capturing a wide range of sample variability.
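A core step in YOLO-style post-processing is filtering overlapping detections with non-maximum suppression (NMS), which relies on intersection-over-union (IoU) between candidate boxes. The sketch below illustrates that step in plain Python; the function names and box format are illustrative, not the actual framework API or the application's code.

```python
# Minimal sketch of the IoU + non-maximum suppression step used in
# YOLO-style post-processing. Boxes are (x1, y1, x2, y2, confidence);
# names and thresholds are illustrative assumptions.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, iou_threshold=0.5):
    """Keep the highest-confidence box, drop overlapping duplicates."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_threshold for k in kept):
            kept.append(box)
    return kept

detections = [
    (10, 10, 50, 50, 0.9),      # strong detection
    (12, 12, 48, 52, 0.6),      # overlapping duplicate -> suppressed
    (100, 100, 140, 150, 0.8),  # separate object -> kept
]
print(nms(detections))  # two boxes survive
```

In a real deployment this filtering runs on every frame after the network produces its raw predictions, so only one box per physical package reaches the operator's display.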

How does it work?

A pick operator wearing smart glasses loads a batch of orders to be picked and, after product recognition, receives information about the next location where a pick is to be performed. As the operator navigates toward the location, the smart glasses scan the location barcode and validate that the operator is in the right place. The built-in vision system then recognizes the product and verifies that the correct item is picked.
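The two-step check described above (location barcode first, then product recognition) can be sketched as a small validation function. All names here (`validate_pick`, the barcode and label strings) are hypothetical stand-ins, not the actual application code.

```python
# Hedged sketch of the pick-validation logic: a pick is accepted only if
# the scanned location barcode matches the instruction AND the vision
# model's detected label matches the ordered item. All identifiers are
# illustrative assumptions.

def validate_pick(scanned_barcode, expected_barcode,
                  detected_label, expected_label):
    """Return True only when both checks pass."""
    if scanned_barcode != expected_barcode:
        return False  # operator is at the wrong shelf
    return detected_label == expected_label

# Right location, right product -> pick accepted
print(validate_pick("LOC-042", "LOC-042", "cola_6pack", "cola_6pack"))
# Right location, wrong product -> pick rejected
print(validate_pick("LOC-042", "LOC-042", "water_1L", "cola_6pack"))
```

Ordering the checks this way mirrors the workflow: there is no point running product recognition until the operator is confirmed to be at the correct location.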

Check out the demo video! 😎


This POC was co-created with André Pilastri, and Hugo Andrade.

Jessica Delmoral
Data Scientist

I am a Data Scientist who loves both the academic and industry worlds of applied analytics. I am currently working at the intersection of AI and medical image diagnostic algorithms.