Enhance AI Upscaling with Intel® AI Boost Neural Processing Unit (NPU)


Ramya Ravi, AI Software Marketing Engineer, Intel
Gururaj Deshpande, Graduate Intern Technical, Intel
Chandan Damannagari, Director, AI Software, Intel

AI Upscaling, also referred to as Image Upscaling or Super-Resolution, is a technique that takes an input image and increases its resolution using deep neural networks. One of the most common applications of AI Upscaling is restoring photos and videos from the 1990s or 2000s that were recorded at lower resolutions. AI Upscaling can convert images at 360p resolution or lower to 1080p and even 4K resolution with minimal quality loss.
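Classical interpolation methods such as nearest-neighbor or bicubic also increase resolution, but they cannot reconstruct plausible detail the way a trained network can. As a point of reference (this is not part of the code sample), the following NumPy sketch shows the naive nearest-neighbor baseline and the 3x factor that takes a 640x360 (360p) frame to 1920x1080 (1080p):

```python
import numpy as np

def nearest_neighbor_upscale(image: np.ndarray, factor: int) -> np.ndarray:
    """Upscale an H x W x C image by an integer factor by repeating pixels."""
    return image.repeat(factor, axis=0).repeat(factor, axis=1)

# A 360p frame (640x360) upscaled 3x reaches 1080p (1920x1080).
frame = np.zeros((360, 640, 3), dtype=np.uint8)
upscaled = nearest_neighbor_upscale(frame, 3)
print(upscaled.shape)  # (1080, 1920, 3)
```

A super-resolution model like BSRGAN performs the same resolution change, but predicts the missing high-frequency detail instead of merely repeating or blending neighboring pixels.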

In this article, we will demonstrate how to run a popular AI Upscaling model on the Intel® AI Boost Neural Processing Unit (NPU) included in the Intel® AI PC (powered by the Intel® Core™ Ultra processor). The Intel Core Ultra processor accelerates AI on the PC by combining CPU, GPU, and NPU via a 3D performance hybrid architecture, together with high-bandwidth memory and cache. We will use the Intel® Distribution of OpenVINO™ toolkit and the Neural Network Compression Framework (NNCF) for weight compression, allowing our AI Upscaling model to run performantly on the Intel AI PC.

Get Started

AI Upscaling Model

The AI Upscaling model that will be used is BSRGAN (Blind Super-Resolution Generative Adversarial Network). BSRGAN retains the same architecture as ESRGAN (Enhanced Super-Resolution Generative Adversarial Network). Both BSRGAN and ESRGAN use a Generative Adversarial Network (GAN) structure that lets the generator create more representative upscaled images that align with the ground truth. Unlike ESRGAN, BSRGAN is trained on synthetic images that represent diverse image degradations, in other words, the reasons why lower-resolution images are often lower quality (noise, blurring, etc.). This allows BSRGAN to generalize better to real-world use cases where different degradations occur.

OpenVINO toolkit and NNCF

OpenVINO is an open-source deep learning toolkit that allows users to easily deploy and accelerate deep learning models with hardware optimizations. OpenVINO can convert PyTorch, TensorFlow, ONNX, and other models into OpenVINO’s Intermediate Representation (IR) format. OpenVINO then uses this IR, together with the available hardware accelerators, to efficiently run deep learning inference.

The Neural Network Compression Framework (NNCF) is a deep learning framework that allows users to easily apply post-training and training-aware compression algorithms (e.g., quantization techniques) to their deep learning models. It supports OpenVINO models, PyTorch models, and other model formats.
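To give an intuition for what weight compression does, the following NumPy sketch illustrates the idea behind data-free symmetric 4-bit quantization: each float32 weight is mapped to a small signed integer via a per-tensor scale, with the zero-point fixed at zero. This is a simplified illustration only, not NNCF's actual implementation:

```python
import numpy as np

def quantize_int4_symmetric(weights: np.ndarray):
    """Map float32 weights to signed 4-bit integers with a fixed zero-point of 0."""
    scale = np.abs(weights).max() / 7.0  # symmetric 4-bit range: [-7, 7]
    q = np.clip(np.round(weights / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from the 4-bit integers."""
    return q.astype(np.float32) * scale

w = np.array([0.31, -0.12, 0.07, -0.29], dtype=np.float32)
q, scale = quantize_int4_symmetric(w)
w_hat = dequantize(q, scale)  # close to w, within one quantization step
```

Because no calibration data is involved (the scale comes from the weights themselves), this style of compression is "data-free," which is what makes it convenient to apply to a pretrained model like BSRGAN.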

AI PC

AI PCs represent a new generation of personal computers that include a central processing unit (CPU), a graphics processing unit (GPU), and a neural processing unit (NPU) to provide power-efficient AI acceleration and handle AI tasks locally. AI PCs powered by Intel® Core™ Ultra processors can balance power and performance for fast and efficient AI experiences. NPUs are specialized hardware designed for AI capabilities and allow the AI PC to perform a variety of AI tasks efficiently.

Code Sample

This code sample is available in the AI PC Notebooks GitHub Repository. Here, the BSRGAN model is used to AI-upscale an image and a video, optimized for the Intel AI Boost NPU using OpenVINO and NNCF.

The following steps are implemented in the code sample:
 

  1. Convert the PyTorch model to an OpenVINO model: After loading the BSRGAN model, we convert it to the OpenVINO format. Currently, OpenVINO only supports static shapes for the NPU, which is why we need to provide an explicit input shape for the BSRGAN model. With future updates, OpenVINO will support dynamic shapes for the NPU.
    ov_model = ov.convert_model(
        cpu_model,
        input=[1, 3, width, height],
        example_input=torch.randn(1, 3, width, height),
    )
  2. Compress model weights: NNCF is used to run data-free 4-bit symmetric quantization on the BSRGAN model. This data-free quantization maps the 32-bit floating-point weights into a 4-bit integer range with a fixed zero-point.
    compressed_model = compress_weights(ov_model, mode=CompressWeightsMode.INT4_SYM)
  3. Compile OpenVINO model: Run the OpenVINO compilation on NPU.
    core = ov.Core()
    compiled_model = core.compile_model(compressed_model, device_name="NPU")
  4. Visualize Upscaled Image: The code sample will create an interactive visualization using Plotly displaying the original image and the AI-upscaled image from the CPU and NPU.
  5. Visualize performance between CPU and NPU: The code sample will create an interactive visualization using Plotly displaying the performance gains the NPU achieves over the CPU.
  6. Run the AI Upscaling model on video: Using OpenVINO’s asynchronous inference queue, we can run the AI Upscaling model performantly on a video. This runs at around 5 processed frames per second.
    def callback(infer_request, userdata):
        res = infer_request.get_output_tensor(0).data[0]
        frame = postprocess(res)
        pbar, postprocessed_frames = userdata
        pbar.update(1)
        postprocessed_frames.append(frame)

    infer_queue = ov.AsyncInferQueue(compiled_model)
    infer_queue.set_callback(callback)

    pbar = tqdm(total=len(original_frames), desc="Inferencing frames")
    postprocessed_frames = []
    for frame in original_frames:
        new_frame = preprocess(frame)
        infer_queue.start_async(
            inputs={input_layer.any_name: new_frame},
            userdata=(pbar, postprocessed_frames)
        )
    infer_queue.wait_all()
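Conceptually, the asynchronous queue keeps several inference requests in flight at once and fires a callback as each one completes, so preprocessing and inference overlap instead of running strictly one frame at a time. A minimal pure-Python analogue of that pattern using a thread pool (illustrative only; this is not OpenVINO's API) could look like:

```python
from concurrent.futures import ThreadPoolExecutor

def infer(frame):
    # Stand-in for the compiled model's inference call.
    return ("upscaled", frame)

frames = list(range(8))
results = [None] * len(frames)

def callback(index, frame):
    # Store each result by frame index so output order matches input order.
    results[index] = infer(frame)

# Mimic AsyncInferQueue: submit every frame, then wait for all to finish.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(callback, i, f) for i, f in enumerate(frames)]
    for fut in futures:
        fut.result()  # analogous to infer_queue.wait_all()

print(results[:2])  # [('upscaled', 0), ('upscaled', 1)]
```

One detail worth noting: because completion order is not guaranteed with asynchronous execution, tracking each frame's index (as above) is one way to keep the output video's frames in the original order.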

Try out the above code sample for yourself. The code sample output illustrates the difference between original and upscaled images and videos and also shows the performance improvement while running the model on the NPU compared to CPU.

What’s Next

Many AI Upscaling technologies have typically run performantly only on discrete GPUs. With the new AI PC, however, AI Upscaling can now run performantly on everyday laptops with the help of the Intel AI Boost NPU. Additionally, Intel AI PCs elevate your speed, efficiency, privacy, and security for various AI tasks such as creating content, drafting emails, improving reports, managing schedules, and auto-summarizing meetings with detailed notes.

We encourage you to also check out and incorporate Intel’s other AI/ML Framework optimizations and tools into your AI workflow and learn about the unified, open, standards-based oneAPI programming model that forms the foundation of Intel’s AI Software Portfolio to help you prepare, build, deploy, and scale your AI solutions.

Additional Resources: