Vanilla Computer Vision.

Computer Vision (CV) is everywhere: in the military, in medicine, in academia, in logistics… CV touches virtually all segments of the economy and society. (Military) Drones in Ukraine play a vital role in detecting and neutralising potential threats and identifying objects of interest - they rely on CV. (Logistics) Automated drone delivery leverages computer perception to manoeuvre in remote areas or affluent suburbs - that relies on CV. (Social) Startups help you find clothing items in retailers’ catalogues by matching pieces detected in your pictures - they rely on CV. (Medical) Doctors increasingly depend on computer-assisted diagnostics and VR to identify suspected cancerous cells and operate on remote brain areas - they rely on CV. Beyond CV’s business impact, individuals should be able to leverage computer perception to serve their social projects, concepts, and interests, beyond Apple’s or Google’s shy attempts. CV has a clear impact on growth, revenues, profits and society.

All industries are trying to build sophisticated CV products that depend heavily on what we call Vanilla CV (VCV), namely classification, detection and segmentation. While essential, the implementation of VCV in industry has been plagued by high attrition rates, due for the most part to labelling outsourcing, over-reliance on big data, poor data quality, [absence of clear internal data strategy](https://sloanreview.mit.edu/projects/reshaping-business-with-artificial-intelligence/) within companies and misalignment between ML engineers and subject matter experts. We believe that, at the root of it all, is the lack of standardised, efficient ML tools for humans to teach machines. At l`école, we are setting CV back on the right path by implementing Machine Teaching: infusing human expertise, distribution frequency and business consequences into the selection of training data so that the most relevant outputs / predictions are produced.

Let’s backtrack to clarify why we think that standardisation is key. Imagine a world where each construction company has to manufacture its own nails in addition to building structures. Results would vary from one construction company to another, as the talent hired and the knowledge of nail manufacturing would differ, resulting in buildings with unreliable safety standards, delayed completion, and increased costs. The building industry has standardised and streamlined nail production so that it can focus on the practice of construction, with reliable and consistently well-built nails. Companies dealing with large datasets face that very issue with CV, a tool as fundamental as nails. They allocate too many resources, for meagre results, to building CV tools internally, getting access to third-party tools or outsourcing CV altogether. By the time they get to every company’s purpose - building revenue-generating products or services - acquiring the necessary CV has at best diminished their ROI. At worst, the company has wasted some of its precious resources and failed. Said plainly, companies keep overpaying for underperforming AI.

L`école is building the first standardised and consolidated ML platform implementing the concept of Machine Teaching: ML done right, cheaper, and better by the subject matter expert to achieve business-impacting results. That means we are developing reliable, business-oriented ML products for companies that depend on ML. Our vision does not stop at businesses. As strong supporters of open source and AI for good, we are building the first fun, accessible, non-threatening AI platform for everyone’s needs, through a gamified interface that is libre for non-profit projects. Our straightforward value prop materialises into a consolidated Machine Teaching platform for visually enhanced human-in-the-loop data evaluation, continuous ground truthing, clear calls to action and actionable performance reporting. Our product is built upon two interfaces - data and user. We’ll get back to that, but first we would like to take a short detour and discuss why 9 out of 10 ML projects fail. If you are in a hurry or if you are in the know, please jump to the UX section below.

Prediction. Machine Learning.
Representation Learning.

Since their beginnings, organisms have tried to capture a finer picture of their environment’s statistical regularities - labels - with increasing sophistication. Humans, first with the emergence of their associative cortex, then with the development of mathematics and computation, have been able to abstract their environment with ever higher and finer predictive power. Until recently, and with the notable exception of parts of physics and maths, our predictive power was limited to ad hoc recipes combining linear relations between variables selected by intuition and experience and, sometimes, reflecting biased assumptions. Following developments in neuroscience in the 60s, researchers formalised the principles that would break the limitations of legacy mathematics, capturing non-linear relationships between observations and giving rise to the Convolutional Neural Network (CNN). Combined with the exponential growth of computation, CNNs allowed the development of prediction algorithms that relate variables non-linearly. Humanity had finally overcome its longstanding hard limit on making inferences from large datasets, and ML was within arm’s reach. Stemming directly from CNNs, Deep Learning is a true step toward general machine intelligence, enabling a unique type of statistical learning: representation learning, AKA learning what is stationary in a distribution of observations. Applied to image classification, representation learning does not classify images per se; it learns features that compress images from millions of pixels to vectors of a few hundred values, which are then analysed with classical maths approaches to predict classes. Unsupervised and supervised ML using Deep Learning learn the features’ non-linear relations and interdependencies that humans cannot grasp, no more, no less.
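To make the two stages above concrete, here is a minimal sketch in plain Python. It is an illustration under loud assumptions, not a real model: `encode` is a hypothetical, hand-written stand-in for a learned encoder (a real one would be a trained deep network producing hundreds of values), and the "classical maths" step is a simple nearest-centroid rule. The toy images and class names are made up.

```python
import math

def encode(image):
    """Hypothetical stand-in for a learned encoder (assumption, not a trained
    network): compress a 2D grid of pixel intensities into a 2-value vector."""
    rows, cols = len(image), len(image[0])
    brightness = sum(sum(row) for row in image) / (rows * cols)
    # Mean absolute difference between horizontally adjacent pixels.
    edges = sum(abs(row[c + 1] - row[c]) for row in image for c in range(cols - 1))
    edges /= rows * (cols - 1)
    return (brightness, edges)

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dims = len(vectors[0])
    return tuple(sum(v[d] for v in vectors) / len(vectors) for d in range(dims))

def fit(labelled_images):
    """Classical step: store one centroid per class in feature space."""
    by_class = {}
    for image, label in labelled_images:
        by_class.setdefault(label, []).append(encode(image))
    return {label: centroid(vectors) for label, vectors in by_class.items()}

def predict(centroids, image):
    """Assign the class whose centroid is closest to the image's features."""
    features = encode(image)
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

# Made-up 2x3 "images" with intensities between 0 and 1.
flat_a = [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1]]
flat_b = [[0.2, 0.2, 0.2], [0.2, 0.2, 0.2]]
striped_a = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
striped_b = [[0.1, 0.9, 0.1], [0.9, 0.1, 0.9]]

model = fit([(flat_a, "flat"), (flat_b, "flat"),
             (striped_a, "striped"), (striped_b, "striped")])
query = [[0.0, 0.8, 0.1], [0.9, 0.0, 1.0]]
prediction = predict(model, query)
```

The classifier never looks at pixels, only at the compressed vector, so the quality of the prediction is bounded by the quality of the representation.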

Evaluation. Decision. Calibration.

The most widely used and operational form of supervised ML, and the core of our business focus today, is representation-learning-based computer perception, commonly known as Computer Vision (CV). While its mathematical and computational aspects are challenging, CV’s working principle is simple: show your machine the objects of the world you wish to predict (e.g. cats) and it will extract features and their underlying relationships to produce labels and detect objects accurately. Behind this seemingly simple concept, companies using CV face data challenges and make systematic errors that have wasted capital at best, and caused human tragedies at worst. These errors are common: they lead to catastrophes even within the best tech companies and are responsible for the >87% [attrition rates](https://breakdowndata.com/top-10-reasons-why-87-of-machine-learning-projects-fail/) observed in CV development. As a team of ML pioneers, we classify common CV mistakes into 3 categories - Evaluation, Production and Curation.

Evaluation is key. Businesses’ hard problem with CV is not to produce predictions but to rigorously evaluate their risks and accuracy in light of the harm they could do to the business and its customers. Your CV must be evaluated within its data constraints (e.g. geographical, cultural, technical compatibility…), and the calibration threshold must be selected in accordance with your business specificities. Where is your threshold when detecting cancerous cells in your patient’s bloodstream? Is your threshold low enough to boost your consumer reach while still avoiding customer frustration? Having a clear view of the implications of your predictions’ accuracy, and of the different outcomes when building your model, has become CV’s sinews of war. We believe that putting the subject matter expert, be it the product developer or the oncologist, in charge of building and testing their CV model is the only way to clear that risk. An experienced oncologist evaluating their model’s predictions will never be replaced by an ML engineer.
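One way to read "selecting the threshold in accordance with your business specificities" is as a cost minimisation. The sketch below assumes the expert can put a relative cost on a false positive and on a false negative; the function names, scores, labels and costs are all hypothetical and chosen only to illustrate the idea.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Cost on a validation set when every score >= threshold is flagged."""
    cost = 0.0
    for score, is_positive in zip(scores, labels):
        flagged = score >= threshold
        if flagged and not is_positive:
            cost += cost_fp  # false alarm
        elif not flagged and is_positive:
            cost += cost_fn  # missed case
    return cost

def pick_threshold(scores, labels, cost_fp, cost_fn):
    """Try each observed score (plus 'flag nothing') and keep the cheapest."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates,
               key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn))

# Made-up model scores and expert-verified labels.
scores = [0.05, 0.20, 0.35, 0.50, 0.65, 0.80, 0.95]
labels = [False, False, True, False, True, True, True]

# Screening-style use case: a missed case is far costlier than a false alarm.
screening = pick_threshold(scores, labels, cost_fp=1, cost_fn=50)
# Recommendation-style use case: a false alarm frustrates the customer.
marketing = pick_threshold(scores, labels, cost_fp=5, cost_fn=1)
```

The same model and the same scores lead to a lower threshold in the screening setting than in the recommendation setting. Only the costs changed, and those costs are exactly what the subject matter expert knows and the engineer usually does not.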

The right production in the right hands. Today’s CV model building is akin to discrete manufacturing: produce one model every few months, update it every few months and lose touch with your customers / patients / fleet in the meantime. We believe CV should be akin to continuous process manufacturing, where the model’s accuracy and predictions are updated in sync with your business. We see your data stream as a resource to be tapped continuously, not discretely. Model discontinuity is compounded by misalignment and siloed visions between your engineering and business teams. While the former will focus mainly on accuracy and technical metrics, the latter will be more interested in financial benefits or business insights.
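As a rough illustration of the continuous view, the sketch below tracks accuracy over a sliding window of expert-verified predictions and flags when the model has drifted below an agreed floor. The class name, window size, floor and data stream are hypothetical; a production setup would also need to decide how and when to retrain.

```python
from collections import deque

class RollingAccuracy:
    """Accuracy over the most recent `window` expert-verified predictions."""

    def __init__(self, window):
        # Old outcomes fall out automatically once the window is full.
        self.outcomes = deque(maxlen=window)

    def record(self, predicted, verified):
        self.outcomes.append(predicted == verified)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_update(self, floor):
        """True once the window is full and accuracy has dropped below `floor`."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < floor

# Made-up stream of (model prediction, expert-verified label) pairs.
stream = [("cat", "cat"), ("dog", "dog"), ("cat", "cat"), ("dog", "dog"),
          ("cat", "dog"), ("cat", "dog"), ("dog", "cat")]

monitor = RollingAccuracy(window=4)
flags = []
for predicted, verified in stream:
    monitor.record(predicted, verified)
    flags.append(monitor.needs_update(floor=0.75))
```

Here the flag only turns on after the sixth item, when two of the last four predictions are wrong. With a model shipped every few months, that drop would go unnoticed until the next release.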

Curate your data through the eyes of the expert. Today, nobody looks into the data, and large, uncurated, sometimes irrelevant datasets are handled by engineers who most likely know neither the nature of the data they handle nor the keys to their company’s business. This lack of oversight results in a poor target data distribution and poor predictive power when your model is confronted with the rest of the world. If you only showed your machine images of cats and images of dogs and you submit a car picture to it, it will predict that car to be a cat or a dog. If you train a model to distinguish vehicles in the US and you submit a tuk-tuk to it, it will be wrong. Because we cannot show a machine all the existing and upcoming objects of the world, you need to adapt to your target distribution with proper data sampling. The alternative, increasing the sample size and outsourcing labelling, will cost you capital while never entirely derisking your model’s predictions. Only through the eyes of the expert can we ensure the use of business-oriented, business-impactful, representative data. An experienced oncologist looking at something as idiosyncratic as cell biopsies will never be replaced by a data scientist.
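The cat / dog / car example can be sketched in a few lines. A closed-set classifier ends in a softmax over the classes it was trained on, so it must answer with one of them. The logits below are made up, and the reject rule (a minimum confidence on the top class) is a crude, commonly used baseline shown here as an assumption, not as a fix: unfamiliar inputs can still receive high confidence, which is why sampling for the target distribution remains necessary.

```python
import math

CLASSES = ["cat", "dog"]

def softmax(logits):
    """Turn raw scores into probabilities that always sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def closed_set_predict(logits):
    """Always returns one of CLASSES, whatever the input was."""
    probs = softmax(logits)
    best = max(range(len(CLASSES)), key=lambda i: probs[i])
    return CLASSES[best], probs[best]

def predict_with_reject(logits, min_confidence=0.9):
    """Baseline reject option: answer 'unknown' when the top class is weak."""
    label, confidence = closed_set_predict(logits)
    return label if confidence >= min_confidence else "unknown"

# Hypothetical logits: strong evidence for a cat picture, weak evidence
# for both classes on a car picture the model was never trained on.
cat_logits = [4.0, 0.5]
car_logits = [0.3, 0.1]

forced_label, forced_confidence = closed_set_predict(car_logits)
car_answer = predict_with_reject(car_logits)
cat_answer = predict_with_reject(cat_logits)
```

Without the reject option, the car is labelled with one of the two known classes at barely better than coin-flip confidence. With it, the picture can be routed to an expert, who decides whether a new class belongs in the training data.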

ML issues are akin to sending your child into Wikipedia’s library and expecting them to come back with Voltaire’s critical mind. Rather, you would want your child to be taught the necessary, albeit limited, knowledge **by the best teacher** to be adaptable to their likely future environment. The very same principles apply to machines. Machines could predict. Machines can learn. It is time for machines to be taught the best of human expertise. Welcome to l`école, the first Machine Teaching platform.
