What is it like to specialize in computer vision and work with one of the most promising human-like technologies of the future? Let’s ask the expert!
In this episode of Ask the Experts, we introduce you to Kit-Ukrit Watchareruethai, our senior AI researcher, who spends his days working with AI and machine learning to develop technology that performs tasks that would typically require human intelligence. He has built his expertise in researching and developing state-of-the-art computer vision methods.
In this interview, you will learn how to develop a specialization in computer vision: what he currently does, what his journey has been like, what skills are needed to become proficient in computer vision, and his tips and tricks on how to learn and practice. Don’t miss the chance to learn from the expert!
What do you do currently?
Currently, I’m working as a Senior AI Researcher on the Sertis AI team. I have been involved in several research projects, for example, face recognition and video analytics systems. My responsibilities mainly focus on researching and developing state-of-the-art computer vision methods. Our team has recently published research papers in a peer-reviewed journal and at a conference. I also sometimes work with the business development team to help gather requirements from our clients and design AI solutions to solve their business problems. Developing proofs of concept (PoCs) for our AI solutions is also part of my responsibilities.
How did you get here?
It started when I was an undergraduate student. Although my major was electrical engineering, I had been interested in computer programming since my first year of study. Before I graduated, I had to complete a graduation project, and it was my first computer vision project. It was a fingerprint recognition system, which required programming skills and computer vision knowledge. Doing this project made me realize that computer vision has great potential to be a key technology for the world. There are countless applications of computer vision, many of which we now see in our daily lives.
Since then, I have continued doing research projects in the field of computer vision, first in my graduate-level studies and then in many more projects when I was working as a university lecturer. Examples include medical image analysis, automatic weed detection, plant species classification, and nutrient deficiency identification. Each of these projects had its own challenges and required different computer vision techniques to solve, and some of them also required domain-specific knowledge.
What is computer vision and how does it differ from other artificial intelligence fields?
Computer vision is a technology that involves all stages of image and video processing pipelines, from acquisition through processing and analysis to understanding. One of its ultimate goals is to enable computers to understand scenes in the real world through image and video signals.
Computer vision is actually a branch of AI, and what distinguishes it from other AI fields is the data that it processes. While other AI fields process other kinds of data such as speech, text, graphs, or tabular data, computer vision focuses on image and video data or, more generally, higher-dimensional data that carries both layout and color information. The former tells us where objects are in an image, while the latter tells us what color they are.
Because of the nature of this data, the methods that computer vision uses to process images and videos are also different. Basically, an image is represented as an array of numbers, a lot of numbers. Sometimes there are millions of numbers in one image, and each number has some relationship with the numbers near it. So a method for processing image data needs to process groups of neighboring numbers together in order to capture both kinds of information.
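To make the idea of layout and color concrete, here is a minimal sketch in Python with NumPy. The tiny 2x3 image and the helper find_color are hypothetical, made up for illustration only. The position of a value in the array gives the layout, and the three channel values at that position give the color.

```python
import numpy as np

# A tiny hypothetical 2x3 RGB image: shape is (height, width, channels).
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[0, 2] = [255, 0, 0]   # top-right pixel is pure red
image[1, 0] = [0, 0, 255]   # bottom-left pixel is pure blue

def find_color(img, color):
    """Return (row, col) positions of pixels that exactly match `color`."""
    # Compare every pixel with the target color across the channel axis.
    mask = np.all(img == np.array(color, dtype=img.dtype), axis=-1)
    return [tuple(int(i) for i in pos) for pos in np.argwhere(mask)]

red_positions = find_color(image, [255, 0, 0])
blue_positions = find_color(image, [0, 0, 255])
```

Here the array indices answer "where" and the channel values answer "what color", which is the combination that sets image data apart from, say, a table of independent columns.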
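As a rough illustration of processing neighboring numbers together, the sketch below applies a 3x3 mean filter, one of the simplest classical neighborhood operations, using NumPy. The function name mean_filter_3x3 and the 5x5 test image are assumptions chosen for this example, not something taken from a specific project.

```python
import numpy as np

def mean_filter_3x3(gray):
    """Replace each pixel with the mean of its 3x3 neighborhood."""
    gray = gray.astype(np.float64)
    # Pad by repeating edge values so border pixels also have 8 neighbors.
    padded = np.pad(gray, 1, mode="edge")
    h, w = gray.shape
    out = np.zeros_like(gray)
    # Sum the nine shifted copies of the image, then divide by 9.
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0

# Hypothetical 5x5 grayscale image with a single bright pixel in the center.
image = np.zeros((5, 5))
image[2, 2] = 9.0
smoothed = mean_filter_3x3(image)
```

Each output value depends on a small group of nearby input values rather than on one number alone. Convolutional layers in deep learning build on the same idea, except that the weights are learned instead of being fixed at 1/9.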
What do I need to become a computer vision expert?
In my opinion, three things are essential for computer vision. The first is computer programming, since it allows us to implement and test computer vision methods. Python, C/C++, and MATLAB are popular programming languages in computer vision. The second is knowing how to apply mathematics, image processing, and computer vision techniques to solve real-world problems. Libraries such as OpenCV, PIL, and scikit-image are good tools for computer vision. The third is the ability to develop machine learning and deep learning pipelines, since these have become mainstream approaches in many computer vision technologies. Notable frameworks for machine learning and deep learning include scikit-learn, PyTorch, TensorFlow, and MXNet.
How did you learn and practice computer vision?
I took image processing and computer vision courses when I was a university student. After that, I continued through self-study. Reading textbooks is a good way to learn the techniques and foundations of computer vision. Some online resources are also very helpful for learning computer vision, for example, LearnOpenCV (https://learnopencv.com/) and PyImageSearch (https://pyimagesearch.com/). These two sites explain how computer vision problems can be solved and provide step-by-step example code.
To learn advanced, state-of-the-art computer vision techniques, reading research papers is vital. Simpler explanations of state-of-the-art methods are sometimes available as blog posts, for example, on Medium. However, implementing those methods can be very difficult, since they are usually highly complicated. Fortunately, many papers share their code repositories online. Papers With Code (https://paperswithcode.com/) is a good place to find the code repositories of published papers, where available. Benchmark datasets for many computer vision problems are also listed on the site.
More importantly, I gained a lot of experience from doing and supervising many computer vision projects.
What are your tricks and tips?
Mathematics is the first thing I would like to recommend. Strong background knowledge in mathematics will help you a lot when reading books and research papers. Since computer vision processes images and videos, which are technically multi-dimensional arrays of numbers, mathematical knowledge such as linear algebra offers an efficient and effective way to handle and analyze this kind of data. Calculus and statistics are also important tools in computer vision.
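As one small example of linear algebra at work, converting a color image to grayscale can be written as a dot product between each pixel's RGB values and a weight vector. The sketch below uses NumPy and the ITU-R BT.601 luma weights, which are one common convention among several. The function name to_grayscale and the sample pixels are assumptions for illustration.

```python
import numpy as np

# ITU-R BT.601 luma weights for R, G, B (one common convention).
WEIGHTS = np.array([0.299, 0.587, 0.114])

def to_grayscale(rgb):
    """Convert an (H, W, 3) RGB array to an (H, W) grayscale array."""
    # The @ operator takes the dot product of every pixel with WEIGHTS.
    return rgb.astype(np.float64) @ WEIGHTS

# Hypothetical 1x2 image: one white pixel and one pure red pixel.
image = np.array([[[255, 255, 255], [255, 0, 0]]], dtype=np.uint8)
gray = to_grayscale(image)
```

Writing the operation as a single matrix product, instead of looping over pixels, is what makes it both compact to express and fast to run on large images.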
Moreover, although the current mainstream approaches in many computer vision systems rely on machine learning and deep learning, a good understanding of classical image processing and computer vision methods is still useful.
Lastly, when learning new techniques or methods, try to understand the intuition and rationale behind them (that is, why they work), not just the process (how they work). Understanding both why and how methods work will allow you to choose the right tool when solving a new computer vision problem.
If this sparks your interest in the field, feel free to visit the Sertis Vision Lab tab on our website for more awesome research and blog posts, mainly about computer vision. Enjoy! https://www.sertiscorp.com/sertis-vision-lab
Written by: Kit Watchareruethai
Explore yourself and learn by doing at a leading technology company, surrounded by talented people from around the globe. Check out our open positions here: https://www.careers.sertiscorp.com/jobs