What do Facebook’s face recognition feature, Amazon’s Go store, Google Lens’s object detection, and Tesla’s Autopilot have in common? If you said that they all use some sort of artificial intelligence (AI) algorithm, then you were pretty close. More specifically, all of those things are based on computer vision (CV). This AI-based process is an impressive advancement that, though not precisely new, is becoming increasingly common in sophisticated products.
That’s because CV seeks to achieve something that sounds easy but is far from it: mimicking human sight. It’s hard to understand how something that comes so naturally to us humans can be so complicated to replicate in machines. However, that’s precisely the case with CV, a technology that is only now starting to get closer to the mechanism we use to see and identify things.
Its current level of development has made CV relatively common across businesses and products of all kinds. The best is yet to come, though. Once developers overcome the challenges CV faces today, its potential will only increase. But before we get ahead of ourselves, let’s go over some of the basics of how it works.
Computer vision, defined
Computer vision is the process a computer uses to identify and process visual content such as photos and videos. To do so, it uses AI algorithms that analyze what the computer “sees” and then generate an output the computer itself can understand. CV comprises three main steps:
- See: the ability of sensors and devices to record the world around them is the most basic capability, and it was achieved some time ago. The high-definition cameras we enjoy today are proof of that. Getting devices to see was probably the easiest stage, as they only needed to record the light coming from the things in front of them.
- Describe: the second stage is already a complex one. After recording the objects, the computer needs to identify the things it has captured. This is done through algorithms that break the images down to basic geometric features by detecting edges and looking for perspectives. Though this is a hard feat to achieve, modern computers are able to do it thanks to recent advances in GPU-powered parallel computing.
[Figure: How CV divides reality to identify objects. Source: Welker Media]
- Understand: a computer that only covers the steps above can capture the image of a car and name it for what it is. However, without the ability to understand, that computer wouldn’t be able to tell that said car is a vehicle that can be used for transportation or that it can be grouped with other vehicles such as bikes and airplanes. Understanding is paramount for CV to actually become the digital version of our sight. But since we comprehend what we see on the basis of our memories, the information coming from other senses, our attention and cognition, and our interactions, CV developers still have to find a way to replicate that context for computers.
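The edge detection mentioned in the Describe step can be illustrated with a small sketch. The article doesn’t name a specific algorithm, so this example assumes the Sobel operator, one classic way to find edges by measuring how quickly brightness changes around each pixel. The function name, the threshold value, and the tiny test image are all hypothetical and chosen only for illustration; real CV systems rely on optimized libraries and far richer features.

```python
# Sobel kernels: approximate the horizontal and vertical brightness gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(image, threshold=100):
    """Return a binary edge map (1 = edge) for a 2D grayscale image.

    `image` is a list of rows of brightness values (0-255). Border pixels
    stay 0 because the 3x3 kernel does not fit around them.
    The threshold of 100 is an arbitrary value picked for this sketch.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            # Weight the 3x3 neighbourhood around (x, y) with both kernels.
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            # A large gradient magnitude means brightness changes sharply here.
            magnitude = (gx * gx + gy * gy) ** 0.5
            edges[y][x] = 1 if magnitude > threshold else 0
    return edges


# Hypothetical 5x6 test image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
for row in sobel_edges(image):
    print(row)
```

In the printed edge map, the interior rows mark the two columns on either side of the dark-to-bright boundary, while the uniform regions stay at 0. That boundary is the kind of basic geometric feature the later stages build on.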
All three steps of the CV process require many complex operations that are only possible thanks to machine learning. That’s why CV has only recently started to become widespread: the computing power and resources this technology needs only became available in recent years.
However, as advanced as CV is today, there’s still a lot of room for growth. That’s mainly because we don’t truly understand the mechanisms we use to see the reality around us and how our brains interpret the complex objects we perceive. Without that knowledge, it’s almost impossible to replicate our sight through machines. In other words, we still need to work on the Understand part of CV. Still, researchers are working tirelessly to further explore our own vision and to comprehend its inner workings.
Modern uses of CV
The fact that CV isn’t fully developed might sound discouraging, but don’t worry. CV, as it is today, is a very powerful technology that can serve many purposes. Some of its current uses are perfect examples of what you can achieve when using it for a product, service, or platform. Here are some of the most notable ones:
- Facial recognition: the algorithms used in advanced surveillance cameras and security systems, as well as the ones embedded in Facebook that let you tag your friends in photos, are all powered by CV. The revolutionary Amazon Go store, which lets customers shop in a cashier-less environment, is also made possible by CV. The technology links each customer to a virtual shopping list that is updated as they pick items up from the shelves. Cameras all around the store “see” what customers put in their carts and automatically send that information to the list. Once the customer walks out of the store, the service automatically generates a receipt and charges it to the person’s Amazon account.
- Driverless cars: these vehicles depend on artificial intelligence to move around roads and streets safely and in an orderly manner. It’s no surprise, then, that Tesla and Ford are using CV techniques to improve how their self-driving cars get to their destinations. In this context, CV processes the visual data that vehicles collect from the road to make better decisions. That includes reading road signs to act accordingly and identifying pedestrians and other vehicles to interact with them safely while on the go.
- Healthcare businesses: CV is also being used throughout the medical industry, a field that has traditionally depended on images like X-rays and scans for many of its treatments. An application called DermLens is an innovative example of how medicine is using CV. This app uses the camera to monitor a patient’s psoriasis and gather data about how the condition evolves. Based on the images it captures, the app can determine the severity of the condition, which can lead to more precise treatments.
- Agricultural uses: agriculture is among the most technologically advanced of the traditional industries, so it’s only natural for it to use CV to improve its operations. Artificial intelligence is helping farmers adopt more efficient growing methods, increase crop yields, and detect potential issues. An example of the latter can be found in Slantrange, a drone-based system that captures images of the fields to detect common stressors like pests, nutrient deficiencies, and dehydration. Those readings are then transmitted to software so farmers can analyze them and make better decisions to limit the stressors’ impact.
- Banking institutions: banking is another industry that is using many AI-based solutions to improve its services, especially because this sophisticated technology provides answers to sensitive issues regarding security and privacy.
- Industrial uses: finally, the manufacturing industry is also using CV for a wide variety of purposes. For instance, there are apps that use this technology to monitor infrastructure, especially sites that are hard to access (such as remote wells or facilities working with dangerous materials). Though these applications can be used in practically any manufacturing industry, they come in especially handy for industries that work with oil platforms, chemical plants, refineries, and power plants.
Conclusion
As you can see, the numerous uses of CV already in place prove that this technology is ready for many purposes. From facial recognition and driverless cars to preventing fraud in banking institutions, computer vision is proving to be a vital ally for all kinds of industries.
And while it’s true that there’s still some way to go before CV reaches maturity, we can safely say that its revolution is already here. Judging by its results, it’s a revolution that is only beginning and that holds enormous potential for many businesses in the near future.