What Is Computer Vision?

Computer vision uses artificial intelligence (AI) to interpret visual data, helping organizations improve processes, respond to situations faster and more proactively, and deliver more value to the business and its customers.

Computer Vision Takeaways

  • Computer vision is a type of AI that enables computers and systems to act on insights derived from images and videos.

  • Computer vision combines edge computing, cloud computing, software, and AI deep learning models to recognize, classify, analyze, and take action on collected visual data.

  • Organizations are applying computer vision to a range of use cases to unlock improved automation, efficiency, and value.

What Is Computer Vision?

Computer vision is a type of AI that trains computers to emulate how humans see, make sense of what they see, and act on that processed and analyzed information. Organizations that use computer vision can achieve a variety of business outcomes, including streamlined processes, improved performance, enhanced customer experiences, and greater competitive differentiation in their market.

Computer vision is often used for tasks that are time consuming, error prone, or nearly impossible for humans to accomplish, such as defect and anomaly identification and classification, machine condition monitoring, automated medical imaging analysis, and disease detection. These types of tasks require organizations to monitor operations, processes, or other parts of the business from multiple touchpoints, creating an enormous amount of collected visual data from which insights must be extracted—often in near-real time—and acted upon.

How Computer Vision Is Used

Computer vision systems rely on machine learning and deep learning models that are trained to recognize aspects of an image or video and make predictions about them. Types of computer vision models include:

  • Image classification for inspecting an image and assigning it a class label based on the content. For example, an image classification model can be used to predict which images contain a dog, cat, or angry customer.
  • Image segmentation for identifying objects and separating them from their background, such as isolating a tumor from surrounding brain tissue in an MRI scan.
  • Object detection for scanning images or videos and finding target objects. Object detection models commonly highlight multiple objects simultaneously and can be used for tasks such as identifying items on shelves for improved inventory management or anomalies in items on a production line.
  • Object tracking for following the movements of detected objects as they move through an environment. For example, object tracking can be used in autonomous driving to track pedestrians on sidewalks or as they cross the road.
  • Feature extraction for isolating useful characteristics captured in an image or video and passing them to a second AI algorithm, such as one that performs image search and retrieval. For example, feature extraction can be used to automate traffic monitoring and incident detection.
  • Optical character recognition for extracting and converting text from an image into a format that a machine can read. This is often used in banking and healthcare to process important documents and patient records.
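
To make two of these model types more concrete, the sketch below shows one common way to compare bounding boxes produced by an object detector (intersection over union, or IoU) and a very simple way to link detections across two video frames for tracking. It is a minimal illustration rather than a production tracker: the box format `(x1, y1, x2, y2)`, the function names `iou` and `match_tracks`, and the 0.3 overlap threshold are all assumptions chosen for this example.

```python
# Illustrative sketch: boxes are (x1, y1, x2, y2) tuples in pixel coordinates.
# The function names and the 0.3 threshold are assumptions for this example.

def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (0.0 to 1.0)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def match_tracks(previous, current, threshold=0.3):
    """Greedily link each box in the previous frame to its best-overlapping
    unclaimed box in the current frame. Returns {prev_index: curr_index}."""
    matches = {}
    used = set()
    for i, prev_box in enumerate(previous):
        best_j, best_score = None, threshold
        for j, curr_box in enumerate(current):
            if j in used:
                continue
            score = iou(prev_box, curr_box)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            matches[i] = best_j
            used.add(best_j)
    return matches


# Two detections in one frame, and the same two objects slightly moved
# (and reported in a different order) in the next frame.
frame_1 = [(10, 10, 50, 50), (100, 100, 140, 140)]
frame_2 = [(102, 101, 142, 141), (12, 11, 52, 51)]
links = match_tracks(frame_1, frame_2)
print(links)
```

Here `links` maps object 0 in the first frame to detection 1 in the second frame, and object 1 to detection 0, because those pairs overlap the most. Real trackers typically add motion prediction and appearance features on top of this kind of overlap test.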

Computer Vision Applications across Industries

Computer vision enables a range of new use cases, helping companies across industries reduce operational costs, automate business processes, and create new services or revenue streams. The sections below describe several industries that use computer vision and how they apply it.

Industrial Automation and Manufacturing

Manufacturers use computer vision to enable automation, which helps make production processes more efficient, reduce human error, improve worker safety, and increase output at lower cost. Some common applications of computer vision in manufacturing include:

  • Automated product inspection: Visual product inspections are critical to quality control. By automating optical inspections using production line cameras, AI models for defect classification and anomaly detection, and edge computing, manufacturers can improve quality assurance accuracy and speed.
  • Safety monitoring: Computer vision can be used to monitor factory floors to help ensure employee safety. For example, analysis of real-time video can help identify and alert staff to accidents or spills or detect access to restricted, hazardous areas.

Healthcare

Healthcare organizations use computer vision in many ways, from preventive care to cancer treatment planning, to help improve patient outcomes, enhance accuracy, and accelerate disease detection. Examples of how computer vision is being applied in healthcare include:

  • Medical imaging: Equipping CT scanners, X-ray systems, endoscopy cameras, and other medical imaging technology with computer vision systems can help enable rapid processing of massive amounts of data, streamlined workflows, and accurate and efficient image evaluation. Deep learning technology is also being applied to assist with whole slide imaging in digital pathology.
  • Remote patient monitoring: Cameras and sensors equipped with computer vision applications can be used to collect and analyze data about patient movement, such as gait or body positioning, to identify deviations from established norms and alert care team members to possible urgent needs.

Retail

Computer vision can help retailers decide where to place products, determine the best time to restock inventory, and understand in-store customer behavior, yielding insights that support better-informed business decisions. Some applications of computer vision in retail include:

  • Loss prevention: Computer vision models can analyze data from existing store cameras or self-checkout kiosks to identify suspicious behavior and send real-time alerts to managers so they can intervene and help stop fraud.
  • Touchless self-service checkout terminals: Retailers looking to increase efficiency and enhance the customer experience can leverage 3D smart scan technology and computer vision models to capture, detect, and recognize nonbarcoded food items, enabling quick and convenient checkout with minimal staff intervention.

Smart Cities

Smart city technologies can gather video feeds from street cameras so city leaders can make more-informed operational decisions that help improve citizen safety, mobility, and quality of life. Here are a few ways computer vision can be applied in smart cities:

  • Traffic management: City governments can use computer vision systems to monitor intersections and traffic patterns and to detect and track vehicles and pedestrians, which helps optimize traffic flow and enhance safety at intersections.
  • Infrastructure maintenance: Computer vision models can be trained to recognize road and bridge problems, such as potholes or cracked pavement, throughout a city or entire county and inform crews of locations needing maintenance.

How Computer Vision Works

Computer vision combines components like edge computing, cloud computing, software, and AI deep learning models to enable computers to “see” data collected from cameras and videos; quickly recognize specific objects, people, and patterns; make predictions about them; and take action if necessary.

The Role of Convolutional Neural Networks

Many computer vision systems use deep learning models from a family of algorithms known as convolutional neural networks (CNNs) to guide image processing and analysis. These models analyze the RGB values of digital image pixels to detect identifiable patterns. CNNs can be developed to evaluate pixels based on a wide range of features, including color distribution, shape, texture, and depth, and to accurately recognize and classify objects.
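
The core operation inside a CNN can be sketched in a few lines. The example below slides a small grid of weights (a kernel) over an image and records a weighted sum at each position, which is how a convolutional layer detects a local pattern. It is a simplified illustration under several assumptions: it uses a single grayscale channel instead of three RGB channels, plain Python lists instead of an optimized library, and a hand-written edge-detection kernel, whereas a CNN learns its kernel values during training. The function name `convolve2d` is chosen for this example.

```python
# Illustrative sketch in plain Python; real systems use optimized libraries.
# The kernel values below are a hand-written vertical-edge filter, whereas a
# CNN learns its kernel values during training.

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the grid
    of weighted sums. As in most CNN libraries, the kernel is not flipped, so
    this is technically cross-correlation."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            total = 0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[row + i][col + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


# A 4x6 grayscale image: dark (0) on the left, bright (9) on the right.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# Responds strongly where brightness increases from left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The resulting feature map is `[[0, 27, 27, 0], [0, 27, 27, 0]]`: the output is zero over the flat dark and bright regions and large where the window straddles the boundary between them. A CNN stacks many such filters in layers so that later layers can respond to shapes and textures built from these simpler patterns.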

Training Computer Vision Models

Before a computer vision system can be put to work, data scientists and developers must train the system’s deep learning model for its specific use case. This requires supplying large amounts of application-specific data from which the model can learn to recognize what it has been developed to identify. For example, for a computer vision application designed to recognize a dog, the model must first learn what a dog looks like. It does this by being trained on thousands, or even millions, of images of dogs of different breeds, sizes, colors, and characteristics.
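
The idea of learning from labeled examples can be illustrated with a deliberately tiny model. The sketch below is not a deep learning model: it swaps in logistic regression trained with stochastic gradient descent, and it assumes each image has already been reduced to two made-up numeric features. The feature meanings, the data values, and the names `predict` and `train` are all hypothetical and exist only to show the training loop, in which the model's parameters are adjusted repeatedly to reduce its error on labeled examples.

```python
import math

# Illustrative sketch: each example is a hypothetical pair of image features
# (for instance, scores between 0 and 1 from an earlier processing step) with
# a label of 1 for "dog" and 0 for "not dog". Real models learn from raw
# pixels and far more data.
examples = [
    ((0.9, 0.8), 1),
    ((0.8, 0.9), 1),
    ((0.7, 0.7), 1),
    ((0.2, 0.1), 0),
    ((0.1, 0.3), 0),
    ((0.3, 0.2), 0),
]


def predict(weights, bias, features):
    """Return the model's estimated probability that the example is a dog."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=2000, learning_rate=0.5):
    """Fit the weights with stochastic gradient descent on the logistic loss."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, bias, features) - label
            # Nudge each parameter against the gradient of the loss.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(examples)
p_dog = predict(weights, bias, (0.8, 0.8))    # resembles the dog examples
p_other = predict(weights, bias, (0.2, 0.2))  # resembles the other examples
print(f"dog-like input: {p_dog:.3f}, other input: {p_other:.3f}")
```

After training, the model assigns a high probability to the dog-like input and a low probability to the other one. Deep learning models follow the same loop of predicting, measuring error, and adjusting parameters, but with millions of parameters and images instead of three parameters and six examples.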

Most commonly, training takes place in data centers or cloud environments. For especially complex training initiatives, GPUs and AI accelerators can be applied to expedite the process and better handle the increased number of parameters involved. Once the model has completed the training phase, it has the knowledge needed to interpret and infer information from digital images. The model may also be further fine-tuned or retrained over time.

Teams building computer vision solutions can also start from off-the-shelf foundation models and fine-tune them, which shortens development time and avoids training from scratch.

Deploying Computer Vision Models

Once trained, computer vision models can be deployed to computer systems to perform inference and interpret conditions in the field, continuously assessing image and video data to extract insights and information. While computer vision solutions can run inferencing workloads in the cloud or data center, many organizations today are exploring edge AI applications, where computer vision models run closer to where data is generated on lightweight, optimized edge hardware or embedded devices.

Moving AI inferencing capabilities closer to the edge can offer several key benefits:

  • Increased speed and lower latency: Moving data processing and analysis to where the data is generated helps speed system response, enabling the faster transactions and better experiences that are vital in many computer vision applications.
  • Improved network traffic management: Minimizing the amount of data sent over a network to the cloud can reduce the bandwidth required and the cost of transmitting and storing large volumes of data.
  • Greater reliability: The amount of data that networks can transmit simultaneously is limited. For locations with subpar internet connectivity, storing and processing data at the edge lets the system keep working when the connection is slow or unavailable.
  • Enhanced security: With proper implementation, an edge computing solution may increase data security by limiting data transmission over the internet.
  • Support for privacy compliance: Some governments, customers, or industries may require that data being used for computer vision applications remain in the jurisdiction where it was created. Edge computing can help businesses stay compliant with such rules and regulations.
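
The network traffic benefit can be made concrete with some back-of-the-envelope arithmetic. The sketch below compares the uplink needed when every camera streams video to the cloud for inference with the uplink needed when inference runs at the edge and only the results are sent. Every number in it (camera count, video bitrate, detection rate, and result size) is an assumed value for a hypothetical site, not a measurement, and the function names are chosen for this example.

```python
# Illustrative arithmetic only: every number below is an assumed value for a
# hypothetical deployment, not a measurement.

CAMERAS = 20
VIDEO_BITRATE_MBPS = 4.0      # assumed compressed video stream per camera
DETECTIONS_PER_SECOND = 5     # assumed results reported per camera
BYTES_PER_DETECTION = 200     # assumed size of one small result message


def cloud_uplink_mbps(cameras, bitrate_mbps):
    """Uplink needed when every camera streams full video to the cloud."""
    return cameras * bitrate_mbps


def edge_uplink_mbps(cameras, detections_per_second, bytes_per_detection):
    """Uplink needed when inference runs locally and only results are sent."""
    bits_per_second = cameras * detections_per_second * bytes_per_detection * 8
    return bits_per_second / 1_000_000


cloud = cloud_uplink_mbps(CAMERAS, VIDEO_BITRATE_MBPS)
edge = edge_uplink_mbps(CAMERAS, DETECTIONS_PER_SECOND, BYTES_PER_DETECTION)
print(f"Cloud inference uplink: {cloud:.2f} Mbps")
print(f"Edge inference uplink:  {edge:.2f} Mbps")
```

Under these assumptions, streaming video to the cloud needs about 80 Mbps of uplink, while sending only detection results needs about 0.16 Mbps. The actual ratio depends heavily on the video settings and on how much data the application reports.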

Get Started with Computer Vision

Computer vision, like other forms of AI, is affecting many aspects of business. It is helping companies across a wide variety of industries reduce operational costs, unlock business automation, and identify potential new services or revenue streams. Businesses that can harness computer vision to unlock new use cases, capabilities, and innovations will be well positioned to lead their industries.

As your business moves through your computer vision AI journey, Intel is here and ready to help your AI initiatives succeed.

Find scalable, ready-to-deploy AI solutions optimized with Intel® hardware and software

Frequently Asked Questions

What is computer vision?

Computer vision is a type of AI that enables computers to “see” data collected from images and videos. Computer vision systems are used in a wide range of environments and industries, such as robotics, smart cities, manufacturing, healthcare, and brick-and-mortar retail stores.

How does computer vision work?

Computer vision combines cameras, edge computing, cloud-based computing, software, and AI models to help systems “see” and identify objects. It uses deep learning to form neural networks that guide system image processing and analysis, helping to teach a computer to recognize aspects of an image or video and make predictions about them.

Convolutional neural network (CNN) techniques enable deep learning inference for image classification and object detection. Once fully trained, computer vision models can perform object recognition; detect and recognize people, things, or visual details; and even track movement.

How is computer vision used?

Computer vision is used in a range of industries, including manufacturing, retail, healthcare, and smart cities, to help businesses react and respond to situations in near-real time. This enables process improvements and automation, early detection of potential issues, rapid response to critical situations, and improved customer experiences.