
Computer vision applications: The power and limits of deep learning


15:14, 20 January 2020

From the early days of the development of artificial intelligence, computer scientists have always dreamed of creating machines that can see and understand the world like us, and these efforts have promoted the emergence of computer vision.

This article was originally published on the TechTalks blog and was translated and shared by InfoQ Chinese site with the authorization of the original author.

Photo courtesy of Depositphotos

This article is part of "Demystifying AI", a series of posts that tries to disambiguate the jargon and myths surrounding artificial intelligence.

Since the early days of artificial intelligence, computer scientists have dreamed of creating machines that can see and understand the world as we do. These efforts have led to the emergence of computer vision, a vast subfield of artificial intelligence and computer science that deals with processing the content of visual data.

In recent years, computer vision has taken great leaps thanks to advances in deep learning and artificial neural networks. Deep learning is a branch of artificial intelligence that is especially well suited to processing unstructured data such as images and videos.

These advances have paved the way for expanding the use of computer vision in existing domains and introducing it to new ones. In many cases, computer vision algorithms have become a very important component of the applications we use every day.

A few notes on the current state of computer vision

Don't get too excited about the advances in computer vision. It's important for us to understand the limitations of current artificial intelligence technology. Although there have been significant improvements, we still have a long way to go before there are computer vision algorithms that can understand photos and videos like humans do.

At present, deep neural networks (the core of computer vision systems) are very good at pixel-level pattern matching. They are particularly effective at image classification and object localization. However, they often fail when it comes to understanding the context of visual data and describing the relationships between different objects.

Recent work in the field shows the limits of computer vision algorithms and the need for new evaluation methods. Nonetheless, current applications of computer vision show how much can be accomplished with pattern matching alone. In this article, we will explore some of these applications, but we will also discuss their limits.

Commercial applications of computer vision

You use computer vision applications every day, but in some cases you may not notice it. Here are some practical and popular applications of computer vision that make life interesting and convenient.

Image search

Computer vision has made great progress in image classification and object detection. Given enough labeled data, a trained neural network can detect and highlight many different objects with impressive accuracy.

Few companies have as much user data as Google. The company has been using its almost unlimited (and growing) store of user data to develop some of the most efficient artificial intelligence models. When you upload photos to Google Photos, it uses its computer vision algorithms to label them with content information about scenes, objects, and people. You can then search your images based on this information.

For example, if you search for "dog", Google will automatically return all the images in your library that contain dogs.

Google uses machine learning and computer vision to search the contents of your images, even if you haven't tagged them.
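To make the idea of label-based search concrete, here is a minimal sketch. It is not how Google Photos is implemented; it only assumes that some vision model has already produced a list of labels for each photo. The file names, labels, and the functions `build_label_index` and `search` are all hypothetical.

```python
from collections import defaultdict


def build_label_index(photo_labels):
    """Map each (lowercased) label to the set of photo IDs that carry it."""
    index = defaultdict(set)
    for photo_id, labels in photo_labels.items():
        for label in labels:
            index[label.lower()].add(photo_id)
    return index


def search(index, query):
    """Return the photo IDs labeled with the query word, sorted for stable output."""
    return sorted(index.get(query.lower(), set()))


# Hypothetical labels, standing in for the output of a vision model.
photo_labels = {
    "IMG_001.jpg": ["dog", "park", "grass"],
    "IMG_002.jpg": ["beach", "sunset"],
    "IMG_003.jpg": ["Dog", "sofa"],
}

index = build_label_index(photo_labels)
print(search(index, "dog"))  # ['IMG_001.jpg', 'IMG_003.jpg']
```

The search itself is a plain dictionary lookup. All of the hard work, and all of the potential for mislabeling, sits in the model that produces the labels.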

However, Google's image recognition is not perfect. On one occasion, its computer vision algorithm mistakenly labeled a photo of two dark-skinned people as "gorillas", which caused the company considerable embarrassment.

Google also uses computer vision to extract text from images in your photo library, Drive, and Gmail attachments. For example, when you search your inbox for a term, Gmail also looks at the text inside images. Not long ago, I searched Gmail for my home address, and the results included an email with an attached picture of an Amazon package showing my address.

Image editing and enhancement

Many companies now use machine learning to automatically enhance photos. Google's Pixel phones use on-device neural networks to make automatic enhancements such as white balancing and to add effects such as blurred backgrounds.

Another significant improvement brought about by advances in computer vision is smart zooming. Traditional zooming functions usually blur images because they fill the enlarged areas by interpolating between pixels. Instead of simply enlarging pixels, zooming based on computer vision focuses on edges, patterns, and other features, which produces a crisper image.
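To see why interpolation blurs an image, consider the following minimal sketch. It upscales a single row of grayscale pixels with plain linear interpolation, which is a simplified stand-in for the traditional zoom described above (the function name and the sample row are illustrative assumptions, not any product's implementation).

```python
def upscale_row_linear(row, factor):
    """Upscale a 1-D row of pixel intensities by linear interpolation."""
    if factor < 1 or len(row) < 2:
        raise ValueError("need factor >= 1 and at least two pixels")
    out = []
    for i in range(len(row) - 1):
        left, right = row[i], row[i + 1]
        for step in range(factor):
            t = step / factor  # position between the two source pixels
            out.append(round(left + (right - left) * t))
    out.append(row[-1])  # keep the final source pixel
    return out


# A hard edge: black on the left, white on the right.
edge = [0, 0, 255, 255]
print(upscale_row_linear(edge, 4))
# [0, 0, 0, 0, 0, 64, 128, 191, 255, 255, 255, 255, 255]
```

The original row jumps straight from 0 to 255. After upscaling, the jump is spread over several intermediate gray values (64, 128, 191), which the eye perceives as a soft, blurry edge. Learned super-resolution methods try to predict a sharp edge instead of averaging across it.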

Many startups and long-established graphics companies have turned to deep learning to enhance images and videos. Adobe's Enhance Details technology, featured in Lightroom CC, uses machine learning to create sharper zoomed images.

Adobe uses deep learning to enhance the details of the scaled image.

The image editing tool Pixelmator Pro offers an ML Super Resolution feature, which uses convolutional neural networks to provide crisp zooming and enhancement.

Facial recognition applications

Until recently, facial recognition was a clunky and costly technology, limited to police research labs. But in recent years, thanks to advances in computer vision algorithms, facial recognition has found its way into a variety of computing devices.

The iPhone X introduced Face ID, an authentication system that uses an on-device neural network to unlock the phone when it sees its owner's face. During setup, Face ID trains its artificial intelligence model on images of the owner's face, and it works well under different lighting conditions and with changes in facial hair, hairstyle, hats, and glasses.

In China, many stores now use facial recognition technology to provide customers with a smoother payment experience (at the expense of their privacy). Instead of using credit cards or mobile payment apps, customers only need to show their face to a camera equipped with a computer vision system.

However, despite these advances, the current facial recognition is not perfect. Artificial intelligence and security researchers have found many ways to cause facial recognition systems to go wrong. In one case, researchers at Carnegie Mellon University found that they could trick facial recognition systems into thinking they were celebrities by wearing special glasses.

Researchers at Carnegie Mellon University have found that by wearing special glasses, they can trick facial recognition algorithms into thinking they are celebrities (photo source: ww.cs.cmu.edu).

Data-efficient home security

With the chaotic growth of the Internet of Things (IoT), internet-connected home security cameras are becoming more and more popular. You can now easily install security cameras and monitor your home online at any time.

Each camera sends a large amount of data to the cloud. However, most of the footage recorded by security cameras is uneventful, which results in a huge waste of network, storage, and power resources. Computer vision algorithms can help home security cameras use these resources more efficiently.

Smart cameras remain idle until they detect an object or movement in their video feed, and only then do they start sending data to the cloud or alerts to the camera's owner. Note, however, that computer vision is still not very good at understanding context. So don't expect it to tell the difference between benign events (such as a ball rolling across the room) and things that need your attention (such as a thief breaking into your house).
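The "stay idle until something moves" behavior can be sketched with simple frame differencing. This is a deliberately naive illustration, not how any particular camera works: frames are tiny grayscale grids, and the function names and both thresholds are assumptions chosen for the example.

```python
def changed_fraction(prev_frame, frame, pixel_threshold=25):
    """Fraction of pixels whose intensity changed by more than pixel_threshold."""
    total = 0
    changed = 0
    for prev_row, row in zip(prev_frame, frame):
        for before, after in zip(prev_row, row):
            total += 1
            if abs(before - after) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def should_upload(prev_frame, frame, motion_threshold=0.1):
    """Wake up and send data only if enough of the frame has changed."""
    return changed_fraction(prev_frame, frame) > motion_threshold


empty_room = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
sensor_noise = [
    [12, 9, 11, 10],
    [10, 11, 10, 9],
    [11, 10, 12, 10],
]
moving_object = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
]

print(should_upload(empty_room, sensor_noise))   # False: stay idle
print(should_upload(empty_room, moving_object))  # True: start sending data
```

Notice that `moving_object` could be a burglar or a rolling ball. The detector only knows that pixels changed, which is exactly the lack of context described above.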

Interacting with the real world

Augmented reality is a technology that overlays real-world videos and images with virtual objects. In the past few years, it has become a growing market. The development of augmented reality technology is largely due to the progress of computer vision algorithms. AR applications use machine learning to detect and track target locations and objects, and place virtual objects accordingly. You can see the combination of AR and computer vision in many applications, such as Snapchat filters and Warby Parker Virtual Try-On.

Computer vision also allows you to extract information from the real world through the lens of your phone's camera. A striking example is Google Lens, which uses computer vision algorithms to perform tasks such as reading business cards, identifying styles of furniture and clothing, translating street signs, and connecting your phone to Wi-Fi networks based on router labels.

Advanced applications of computer vision

Thanks to advances in deep learning, computer vision is now solving problems that were previously difficult or even impossible for computers to tackle. In some cases, well-trained computer vision algorithms can perform on par with humans who have years of experience and training.

Medical image processing

Before the advent of deep learning, creating computer vision algorithms that could process medical images took a great deal of work from software engineers and subject matter experts. They had to collaborate to develop code that extracted relevant features from radiology images and then examined them for diagnosis. (Artificial intelligence researcher Jeremy Howard has an interesting discussion about this.)

Deep learning algorithms provide an end-to-end solution that makes the process much simpler. Engineers create a suitable neural network architecture and then train it on X-rays, magnetic resonance imaging (MRI) images, or CT scans that have been labeled with their outcomes. The neural network then finds the features associated with each outcome, so that it can diagnose future images with impressive accuracy.
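The end-to-end idea (feed in labeled images, let the model work out which features matter) can be illustrated at toy scale. The sketch below trains a single-neuron perceptron, not a deep network, on made-up four-pixel "scans"; the data, the labels (1 for "finding present", 0 for "clear"), and the function names are all hypothetical and have nothing to do with real medical imaging.

```python
def predict(weights, bias, pixels):
    """Return 1 if the weighted sum of pixels is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, pixels)) + bias
    return 1 if score > 0 else 0


def train(samples, epochs=20, lr=1):
    """Perceptron learning rule: nudge the weights after every wrong answer."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for pixels, label in samples:
            error = label - predict(weights, bias, pixels)
            if error:
                weights = [w + lr * error * x for w, x in zip(weights, pixels)]
                bias += lr * error
    return weights, bias


# Hypothetical labeled data: (pixels, label).
samples = [
    ((0, 0, 0, 0), 0),
    ((1, 0, 0, 0), 0),
    ((1, 1, 1, 0), 1),
    ((1, 1, 1, 1), 1),
    ((0, 1, 0, 0), 0),
    ((0, 1, 1, 1), 1),
]

weights, bias = train(samples)
correct = sum(predict(weights, bias, p) == y for p, y in samples)
print(f"{correct}/{len(samples)} training samples classified correctly")
```

Nobody told the model which pixels to look at; it adjusted its weights from the labels alone. Real systems apply the same principle with millions of parameters and far more data, and they must be evaluated on images they have never seen, which this toy does not do.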

Computer vision has found its way into many areas of medicine, including cancer detection and prediction, radiology, and diabetic retinopathy.

Some artificial intelligence researchers even say that deep learning will soon replace radiologists. But those with extensive experience in the field disagree. Diagnosing and treating diseases involves far more than looking at slides and images. Let's not forget that deep learning is about extracting patterns from pixels; it doesn't replicate all the duties of a human doctor.

Playing games

Teaching computers to play games has always been a hot area of artificial intelligence research. Most game-playing programs use reinforcement learning, an artificial intelligence technique in which a program improves its behavior through trial and error.
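Trial and error can be shown with one of the simplest reinforcement learning setups, a two-armed bandit with an epsilon-greedy agent. This is only a sketch under simplifying assumptions: the "game" has two possible moves with fixed, deterministic rewards, and the function name and parameters are made up for illustration. Real game-playing agents face vastly larger state spaces.

```python
import random


def run_bandit(rewards, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly pick the best-known move, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # the agent's current guess of each move's value
    counts = [0] * len(rewards)       # how often each move has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(rewards))  # explore: try a random move
        else:
            # exploit: pick the move with the highest estimated value
            action = max(range(len(rewards)), key=lambda a: estimates[a])
        reward = rewards[action]
        counts[action] += 1
        # incremental average of the rewards seen for this move
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# Hypothetical game: move 0 never pays off, move 1 always does.
estimates, counts = run_bandit(rewards=[0.0, 1.0])
print("estimated values:", estimates)
print("times each move was tried:", counts)
```

The agent starts with no knowledge, stumbles on the rewarding move through random exploration, and from then on chooses it most of the time. Scaling this idea up to a complex video game is what demands the enormous amounts of gameplay mentioned below.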

Computer vision algorithms play an important role in helping these programs analyze the content of a game's graphics. It is important to note, however, that in many cases the graphics are "simplified" to make them easier for neural networks to make sense of. In addition, current artificial intelligence algorithms need huge amounts of data to learn a game. For example, OpenAI's Dota-playing AI needed 45,000 years' worth of gameplay to reach championship level.

Cashierless retail stores

In 2016, Amazon introduced Go, a store where you can walk in, pick up anything you want, and walk out without being arrested for shoplifting. Go uses a variety of artificial intelligence systems to eliminate the need for cashiers.

As customers walk around the store, cameras equipped with advanced computer vision algorithms monitor their behavior and keep track of the items they pick up or put back on the shelves. When they leave the store, the contents of their shopping cart are automatically charged to their Amazon account.

Three years later, Amazon has opened 18 Go stores, and the project is still a work in progress. But there are signs that computer vision (with the help of other technologies) will one day make checkout lines a thing of the past.

Self-driving cars

Driverless cars have been one of the longest-standing dreams and biggest challenges in the field of artificial intelligence. Today, we are still a long way from self-driving cars that can handle any road under all lighting and weather conditions. However, thanks to advances in deep neural networks, we have made a lot of progress.

One of the biggest challenges in creating self-driving cars is to enable them to understand their surroundings. Although different companies are solving this problem in different ways, one thing remains the same, and that is computer vision.

Cameras installed around the car monitor its surroundings. Deep neural networks analyze the video feeds and extract information about nearby objects and people. This information is combined with data from other sensors such as lidar to form a map of the area that helps the car navigate and avoid collisions.
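As a rough illustration of combining camera and lidar data into a map, the sketch below places both kinds of readings on a coarse grid around the car. Everything here is an assumption made for the example: the coordinates (in metres, relative to the car), the one-metre cell size, the labels, and the function names. Production systems rely on calibrated sensors and probabilistic fusion rather than a simple dictionary.

```python
def world_to_cell(x, y, cell_size):
    """Convert a position in metres to integer grid coordinates."""
    return (int(x // cell_size), int(y // cell_size))


def build_occupancy_map(camera_detections, lidar_points, cell_size=1.0):
    """Lidar says that something is there; the camera says what it is."""
    grid = {}
    for x, y in lidar_points:
        cell = world_to_cell(x, y, cell_size)
        grid.setdefault(cell, "unknown obstacle")
    for label, x, y in camera_detections:
        cell = world_to_cell(x, y, cell_size)
        grid[cell] = label  # a recognized object replaces "unknown obstacle"
    return grid


# Hypothetical sensor readings.
camera_detections = [("pedestrian", 2.4, 5.1), ("car", -3.2, 10.7)]
lidar_points = [(2.6, 5.3), (0.5, 8.0), (-3.1, 10.2)]

grid = build_occupancy_map(camera_detections, lidar_points)
for cell, content in sorted(grid.items()):
    print(cell, content)
```

A planner could then treat every occupied cell as something to steer around, and use the camera's labels to decide how cautiously to behave near each one.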

Creepy computer vision applications

Like all other technologies, artificial intelligence is not pleasant in every respect. Advanced computer vision algorithms can also power malicious applications. Here are some computer vision applications that are causing concern.

Surveillance

It is not just mobile phone and computer makers that are interested in facial recognition technology. In fact, the biggest customers of facial recognition technology are government agencies, who are interested in using the technology to automatically identify criminals in surveillance videos.

But the question is, where do you draw the line between national security and citizen privacy? Too much of the former and too little of the latter leads to a surveillance state that gives the government too much control. The widespread use of security cameras based on facial recognition technology enables the government to closely track the movements of millions of citizens, whether they are criminal suspects or not.

In the United States and Europe, the situation is more complicated. Technology companies face resistance from employees and digital activists when it comes to providing facial recognition technology to law enforcement. Some American states and cities have banned the public use of facial recognition technology.

Autonomous weapons

Computer vision can also give eyes to weapons. Military drones can use artificial intelligence algorithms to identify objects and select targets. In the past few years, the military's use of artificial intelligence has caused a lot of controversy. Faced with criticism from its employees, Google had to give up renewing its contract to develop computer vision technology for the Department of Defense.

There are no fully autonomous weapons yet. Most military institutions keep a human in the loop when using artificial intelligence and computer vision systems.

But people worry that with advances in computer vision and growing military involvement, sooner or later we will have weapons that choose their own targets and pull the trigger without a human making the decision.

Stuart Russell, a famous computer scientist and artificial intelligence researcher, set up an organization dedicated to stopping the development of autonomous weapons.

Check out the original English text: Computer vision applications: The power and limits of deep learning

https://www.infoq.cn/article/wLSpoj2eOQF7ujcHZqzf
