Making sense of images
Data that describes the world and all the people and things in it is circulating in increasingly visual form.
Smartphone cameras, CCTV, maps, drones and spatial sensors are generating more imagery than human beings can ever hope to monitor or organise. Automated searching of photos, video and even live scenes is therefore vital to locate the ‘needle’ of relevant content in this visual haystack.
Facebook, Apple and Amazon are among the first to put this advanced capability in the hands of consumers. Facebook’s photo-sharing app Moments uses facial recognition to match the faces in an image with the relevant contact details, although it is currently unavailable in the European Union due to privacy concerns. Apple’s facial recognition for photo sharing and phone unlock, by contrast, is processed securely on the device, keeping it off the cloud, at least for now.
The ability to identify visual cues has tremendous implications for the retail sector, crime prevention, medical procedures and more. Recognising the important developments in this area, we commissioned global research consultants Frost & Sullivan to produce a report on the opportunities in visual search.
The Shazam of shopping
The retail sector is first to take commercial advantage of AI applied to unstructured visual data. Amazon encourages consumers to scan real-world objects and locate the associated products online using its Firefly phone app.
Start-ups like Toronto-based Slyce describe their solution as ‘the Shazam of Shopping’. The company’s apps recognise products from smartphone cameras, web images and physical product tags and direct the consumer to the closest match in the retailer’s catalogue, or to a range of similar products for browsing. Neiman Marcus, an American department store, is already using Slyce technology in its mobile ecommerce app.
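To make the ‘closest match’ idea concrete, here is a minimal sketch in Python. It is not Slyce’s method, which is not described here. It assumes that some image model has already turned each product photo into a short list of numbers (a feature vector), and it ranks catalogue items by cosine similarity to the shopper’s photo. The function names, the three-number vectors and the product names are all hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def rank_catalogue(query, catalogue, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in catalogue.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical feature vectors; a real system would use hundreds of
# dimensions produced by a trained image model.
CATALOGUE = {
    "red wool jumper": [0.9, 0.1, 0.8],
    "red cotton shirt": [0.8, 0.7, 0.2],
    "blue denim jacket": [0.1, 0.9, 0.3],
}

query_vector = [0.85, 0.15, 0.75]  # assumed features of the shopper's photo
matches = rank_catalogue(query_vector, CATALOGUE, top_k=2)
best_match = matches[0][0]
```

The first entry in `matches` plays the role of the closest match, and the rest of the ranked list stands in for the range of similar products offered for browsing.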
By moving consumers quickly from their initial interest in a product to final purchase online, visual search removes the clunky text search stage and should reduce ‘abandoned basket’ syndrome. However, object recognition is technically extremely challenging. To limit complexity, start-ups are specialising in specific domains. Fashion retail contains a well-defined class of objects, but even distinguishing one item from another – a jumper from a shirt – can be difficult. Then the application must make a further judgement about texture and colour.
Machine learning is the key to accuracy
Breakthroughs in machine learning are the key to improving accuracy. Autonomous, self-optimising algorithms accumulate errors and successes from their attempts at recognising patterns. Analysis of this data requires massive scale computation and industrial quantities of training data before reliable results start to emerge – plus detailed human oversight in the initial stages.
Sentient Technologies of San Francisco boldly claims it will exceed Google’s computing power by recruiting spare resources from data centres around the world. This ambitious start-up supplies the ‘visual intelligence technology’ behind Shoes.com and derives its methods from a combination of evolutionary analysis and deep learning (essentially multiple, interacting layers of machine learning). Sentient goes a step further by linking visual search with analysis of the individual’s intent to purchase and personal preferences.
All this might seem an excess of technical and analytical resources just to select a pair of pixie boots with the right buckle. But as accuracy improves, and consumers become familiar with using visual search, retailers will benefit from increased sales and more effective inventory management. Competition then centres on getting access to sufficient high-quality training data to continuously improve the results.
Visual search will transform business processes
Cost is becoming less of a barrier as leaders in the field make their artificial intelligence tools more affordable. Google, for example, recently made its machine learning libraries available as open source, and Microsoft offers machine learning as a service on its Azure platform.
Access to training data may be more of a challenge. Opportunities will arise for data aggregators in specialist domains to supply validated, high-quality data via an as-a-service business model.
Solution providers that can build accurate visual search at massive scale will be in a strong position to add value that competitors cannot easily replicate. Customers in many sectors will quickly realise the benefits in automated operations, new services, and deeper engagement with consumers.
The beauty of visual search in Scotland
A number of Scottish companies are making important advances in artificial intelligence and machine learning, two essential components of visual search. These developments often begin in the highly specialised departments of Scottish universities such as Edinburgh University and Dundee University.
For example, Scotland has its own pioneer in mobile visual search – Mobile Acuity, a spin-out from Edinburgh University’s Artificial Intelligence department. Artificial intelligence skills are an essential resource as visual search depends on the underlying, leading-edge techniques of machine learning and computer vision.
Glasgow is using facial recognition for crime prevention as part of its £24 million future city initiative, and Edinburgh-based Toshiba Medical Visualization Systems specialises in automated scanning of medical imaging and has launched a postgraduate research practice with Dundee University.
Could Scotland play a role in your data science project? Find out how we can help.