Machine learning (ML) has proven to be a versatile framework for solving a wide variety of novel challenges. From self-driving cars to healthcare, it has tackled problems that were previously thought to require human intuition and understanding. One area of computer science that has benefited greatly from this versatility is computer vision (CV), which deals with extracting high-level, meaningful information from digital images. Example CV problems include object detection and recognition, optical character recognition, artistic image generation, and many more. Recent innovations in machine learning have produced solutions in many of these domains that regularly outperform human-driven solutions, something many thought impossible just a few decades ago.
Before machine learning became popular, computer vision was mostly done with what are now called ‘classical techniques’. These techniques often required custom-made solutions for individual problems. For example, the classical technique for detecting a face in an image would be very different from the technique for detecting a bird. Machine learning, on the other hand, offers a uniform framework for solving a huge class of similar problems, which makes it much easier to adapt an existing solution to a new problem. In addition to being adaptable, it often gives more reliable and accurate results than classical solutions. This has caused the popularity of classical approaches to dwindle as more professionals opt for ML-based solutions.
However, we at Sertis Vision Lab believe that classical approaches are still very much relevant today for professionals engaged in computer vision. Past experience from various projects has shown us the advantages of computer vision engineers being well-versed in both classical and modern techniques. In this blog post, we share that experience and argue why you are missing out if you have been ignoring non-ML-based computer vision.
Classical computer vision — why it’s still pretty cool
Use classical techniques to speed up data annotation
Data annotation is a common case where classical computer vision can be used to great effect. Annotation, in general, is an expensive and labor-intensive process. However, knowledge of classical computer vision can sometimes ease the pain by automatically producing a preliminary annotation. Of course, these annotations will not be as good as human labels (if they were, the problem would already be solved), but they can greatly speed up manual annotation.
At Sertis, we encountered this problem when we were annotating the edges of Thai ID cards for our state-of-the-art Thai ID OCR pipeline. We had to annotate the corners of the cards in a dataset used to train a card detection module. Doing this manually would have been a lot of work, which we managed to reduce by around 80% by automatically pre-labelling the card corners with a line detection algorithm. The annotation process was then simply to go through the dataset once and correct the errors wherever they occurred.
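As a rough illustration of the pre-labelling idea, the sketch below turns detected lines into candidate corners. It is not our actual pipeline: it assumes a line detector has already produced the card edges in Hough normal form (rho, theta), such as the pairs returned by OpenCV's `cv2.HoughLines`, and the function names, the 30-degree angle cutoff, and the in-bounds filter are all hypothetical choices made for the example.

```python
import math
from itertools import combinations


def intersect(line_a, line_b, min_angle_deg=30.0):
    """Intersect two lines given in Hough normal form (rho, theta).

    Each line satisfies x*cos(theta) + y*sin(theta) = rho.
    Returns None when the lines are too close to parallel
    (the angle cutoff is an assumption, not a tuned value).
    """
    rho_a, theta_a = line_a
    rho_b, theta_b = line_b
    # Determinant of the 2x2 system; equals sin(theta_b - theta_a).
    det = math.cos(theta_a) * math.sin(theta_b) - math.sin(theta_a) * math.cos(theta_b)
    if abs(det) < math.sin(math.radians(min_angle_deg)):
        return None
    x = (rho_a * math.sin(theta_b) - rho_b * math.sin(theta_a)) / det
    y = (rho_b * math.cos(theta_a) - rho_a * math.cos(theta_b)) / det
    return (x, y)


def prelabel_corners(lines, width, height):
    """Propose corner labels as in-image intersections of detected lines."""
    corners = []
    for line_a, line_b in combinations(lines, 2):
        point = intersect(line_a, line_b)
        if point is None:
            continue
        x, y = point
        # Keep only intersections that fall inside the image.
        if 0 <= x < width and 0 <= y < height:
            corners.append((x, y))
    return corners
```

Calling `prelabel_corners` with the four edge lines of a card yields four candidate corners, which an annotator can then accept or drag into place instead of clicking each one from scratch.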
Before spending a lot of resources on annotation, check if the process can be sped up by using classical techniques.
Useful when lacking data
A lack of data is an obvious case where classical computer vision can be very effective. In situations where collecting data is difficult, classical techniques might be the best way of solving the problem. Unlike classical techniques, ML techniques tend to perform poorly when given only a small amount of data. ML does scale very well when fed more data, so if there is a cheap and effective way to gather more, it might still be the way to go. However, when that is not possible, quickly prototyping your product with classical techniques might be a smarter move than waiting for more data to come in.
When lacking data, don’t give up. Check to see if the problem can be solved using any classical computer vision techniques.
Explainability is built in
More often than not, explainability is a built-in feature of classical computer vision techniques. ML models are infamously difficult to decipher, and it is often unclear why a given model makes the decisions it does. Although there are some explainable ML models, and there are ways to make even a black-box ML model partially explainable, the field of interpretable ML is still emerging, and a lot more research is needed before these techniques are production-ready for high-stakes environments.
Classical techniques, on the other hand, make interpretability much more straightforward. This is why they are still widely used in areas such as finance, banking, and medicine. Hence, if you are working on these types of problems, before diving deep into the world of Local Interpretable Model-Agnostic Explanations and Centered Kernel Alignment, check if the problem is better served by a simple corner-finding algorithm.
When working on problems that require transparency and interpretability, classical techniques are a great addition to your toolbox on top of interpretable ML.
Fast prototyping allows downstream modules to be developed in parallel
Suppose you are building an ML pipeline that chains multiple components together. Sometimes the later modules cannot be meaningfully built until the earlier ones exist. If the ML solution for an earlier module takes a long time to complete (because of data collection, for example), the whole project is forced onto an inefficient timeline. Classical computer vision offers a way out here as well. Those time-consuming early modules can be quickly developed with classical computer vision, simply to unblock the development of the later modules. Later on, when the better-performing ML solution is ready, it can replace the classical module. This can lead to faster completion of the project.
Once again, we encountered this problem when working on our state-of-the-art Thai ID OCR pipeline. Because our text-detection module assumed that the card images were cropped and aligned, it required card detection to run first. However, we did not want to wait until the ML-based card detection was implemented before starting work on text detection. So we used a classical card-detection system as a temporary module, which was later replaced by a more accurate ML-based system.
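One way to make such a swap painless is to have the pipeline depend only on a small detector interface. The sketch below is illustrative and not our production code: the image type, the function names, and the brightness-threshold detector are all hypothetical stand-ins, chosen only to show how a classical placeholder and a later ML module can share one call signature.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]            # placeholder image type: rows of grayscale pixels
Box = Tuple[int, int, int, int]    # (left, top, right, bottom), right/bottom exclusive
CardDetector = Callable[[Image], Box]
TextDetector = Callable[[Image], List[Box]]


def classical_card_detector(image: Image, threshold: int = 128) -> Box:
    """Temporary stand-in: bounding box of all pixels brighter than threshold."""
    coords = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value > threshold]
    if not coords:
        # Nothing bright found: fall back to the whole image.
        return (0, 0, len(image[0]), len(image))
    xs, ys = zip(*coords)
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def crop(image: Image, box: Box) -> Image:
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def run_pipeline(image: Image, detect_card: CardDetector,
                 detect_text: TextDetector) -> List[Box]:
    """Downstream code only sees the cropped card, not how it was found."""
    card = crop(image, detect_card(image))
    return detect_text(card)
```

Because `run_pipeline` takes the detector as an argument, replacing `classical_card_detector` with an ML-based function of the same signature requires no change to the text-detection code.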
Do not wait for a lengthy ML module to be completed before starting work on a downstream module. See if a quick classical algorithm can serve as a temporary stand-in.
Use as baselines for ML
If the problem you are trying to solve is not well established, it can be difficult to know what accuracy you can reasonably expect from an ML model. For example, if the ML model is performing poorly, it’s hard to tell whether the cause is a bug in the code, sub-optimal hyperparameters, a bad network architecture, a lack of data, or whether the problem is fundamentally difficult and this is the best performance you can hope for. Classical techniques can shed some light on the last question. Once you have some classical baselines, ML development efforts can be focused on beating them. If the ML model is performing worse than these baselines, you can conclude that the difficulty of the problem is not the issue and focus on the other possible causes.
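A baseline comparison can be as simple as scoring both approaches on the same held-out samples. The sketch below assumes a classification-style task where accuracy is a meaningful metric; the function names and the result dictionary are hypothetical, and a real project would substitute its own metric.

```python
def accuracy(predict, samples):
    """Fraction of (input, label) pairs that predict() gets right."""
    correct = sum(1 for x, label in samples if predict(x) == label)
    return correct / len(samples)


def compare_to_baseline(model, baseline, samples):
    """Score an ML model and a classical baseline on the same samples."""
    model_acc = accuracy(model, samples)
    baseline_acc = accuracy(baseline, samples)
    return {
        "model": model_acc,
        "baseline": baseline_acc,
        # False here suggests a bug, bad hyperparameters, poor architecture,
        # or too little data, rather than an inherently hard problem.
        "model_beats_baseline": model_acc > baseline_acc,
    }
```

If `model_beats_baseline` comes back false, the classical result shows that better performance is achievable, so attention can shift to the other possible causes.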
Use classical techniques as a baseline for ML models. It makes analyzing the performance of ML models easier.
These are some of the reasons why classical techniques are still worth keeping at the forefront of your computer vision toolbox. They can help when data or annotations are scarce, make solutions easier to interpret, and can even speed up your workflow significantly if used cleverly and with foresight. So the next time you are stuck on a problem and unsure what direction to take, give classical techniques a thought. You might find some unexpected insight into how to proceed.
Written by: Sertis Vision Lab