
Qualcomm Linux Sample Apps – AI Object Detection and Parallel AI Fusion

Artificial Intelligence · 18 Sep 2025

In this article, learn about live-stream object detection and parallel sensor fusion capabilities in the Qualcomm Intelligent Multimedia SDK.

Due to the constantly changing variables that affect an autonomous drone’s ability to navigate, powerful AI processing is needed for decision-making. Because these AI controls are so complex, they can be difficult to implement. But Qualcomm’s Linux-based sample applications give product developers a faster path to market.
 
Last time we walked you through two building-block sample apps from the Qualcomm Intelligent Multimedia SDK. The SDK is based on Qualcomm® Linux® software, our distribution that lets you write once and run on many of our IoT systems-on-chip (SoCs). In this post we’ll explore two more of the 22 sample applications in the SDK to demonstrate how you can write apps for our IoT chipsets.

Seeing through AI: Live stream object detection

This command-line application, gst-ai-object-detection, takes a live video stream from a camera and hands it off to open-source YOLO (You Only Look Once) AI models for object detection. It performs pre-processing and AI inference on dedicated hardware blocks, executing YOLOv5, YOLOv8 or YOLO-NAS using the Snapdragon Neural Processing SDK. It then displays a preview with overlaid output, such as labels and bounding boxes, based on the model.
 
The application pipeline looks like this:

A technical diagram illustrating the workflow of the Qualcomm Neural Processing SDK.

  • qtiqmmfsrc – Using this GStreamer plugin, the application captures the live camera stream, then uses a tee element to split the stream.
  • qtimlvconverter – This preprocessing plugin performs tasks like color conversion, down-/upscaling and normalization on the stream data. It converts the video stream to a tensor stream for inference later.
  • qtimlsnpe – This machine learning inference plugin applies YOLO-NAS (default), YOLOv8 or YOLOv5 to detect objects in the stream. It executes the Snapdragon Neural Processing Engine (SNPE) runtime in hardware on the CPU, GPU or DSP-based neural processing unit (NPU).
  • The SNPE runtime performs inference on the tensor stream and produces a tensor stream with the inference results.
  • qtimlvdetection – For post-processing, this plugin applies a confidence threshold and keeps the desired number of results. It loads the YOLO-NAS module for post-processing, produces video frames containing only bounding boxes (for overlay) and hands the frames off for video composition.
  • qtivcomposer – This plugin overlays frames, with the bounding boxes and labels, onto frames from the live camera stream, then hands off gst buffers with the combined layers.
  • waylandsink – This sink submits the received video stream to Weston, which renders the video stream on a local display.
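The flow above maps naturally onto a gst-launch-1.0 pipeline description. The sketch below is illustrative, not the exact command shipped with the SDK: the property names (model, module, threshold) and file paths are assumptions based on common GStreamer conventions, so consult the SDK documentation for the real options.

```shell
# Hypothetical sketch of the gst-ai-object-detection pipeline.
# Property names and model paths are illustrative assumptions.
gst-launch-1.0 \
  qtiqmmfsrc ! tee name=split \
  split. ! queue ! comp.sink_0 \
  split. ! queue ! qtimlvconverter \
         ! qtimlsnpe model=/opt/yolonas.dlc \
         ! qtimlvdetection module=yolo-nas threshold=0.5 \
         ! comp.sink_1 \
  qtivcomposer name=comp ! waylandsink
```

One tee branch carries the raw camera frames; the other runs pre-processing, inference and post-processing, and qtivcomposer overlays the bounding-box frames onto the live stream before waylandsink renders the result via Weston.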

Here’s an example of using gst-ai-object-detection to detect a person in a camera stream:

A person wearing a red sweater and yellow lanyard is seen working on a laptop in a server room.

When would you use this application?

gst-ai-object-detection has dozens of classes that you can build into your own applications for detecting objects (people, vehicles, animals, etc.) and locating them in a camera frame. Examples include detecting helmets, fire/smoke and intruders.

Parallel AI Fusion: Four AI Inferences on Live Camera

This command-line application, gst-ai-parallel-inference, extends the one-channel app above to four channels of parallel processing of AI models on dedicated hardware blocks. Besides object detection, it adds classification, pose detection and segmentation, then displays scaled-down, composed previews of the live camera stream with overlaid output from all four models.
 
The application pipeline is a variation of the one above, with separate flow for each AI inference as shown below:

A technical diagram showcasing a video processing pipeline for AI applications.

  • qtiqmmfsrc – Using this GStreamer plugin, the application captures the live camera stream, then uses tee elements to generate four parallel streams.
  • qtimlvconverter – This preprocessing plugin performs tasks like color conversion, down-/upscaling and normalization on the stream data. It converts the video stream to a tensor stream for inference later.
  • qtimlsnpe – This machine learning inference plugin applies YOLO-NAS for object detection and DeepLabv3 for image segmentation. The plugin executes the Snapdragon Neural Processing Engine (SNPE) runtime in hardware on the CPU, GPU or DSP-based neural processing unit (NPU).
  • qtimltflite – This plugin applies PoseNet for pose detection and Inception V3 for object classification. The plugin executes the TFLite runtime in hardware on the CPU, GPU or DSP-based neural processing unit (NPU).
  • Post-processing uses a different plugin for each model.
  • qtimlvdetection – For object detection, this plugin applies a confidence threshold and keeps the desired number of results. It loads the YOLO-NAS post-processing module, produces video frames containing only bounding boxes (for overlay) and hands the frames off for video composition.
  • qtimlvclassification – For classification, this plugin applies a confidence threshold and keeps the desired number of results. It loads the Inception V3 post-processing module, produces video frames with classification labels (for overlay) and hands the frames off for video composition.
  • qtimlvpose – For pose estimation, this plugin applies a confidence threshold and keeps the desired number of results. It can load modules for different pose estimation models; in this use case, it loads the PoseNet module, produces video frames with poses drawn (for overlay) and hands the frames off for video composition.
  • qtimlvsegmentation – For segmentation, this plugin converts the inference tensors it receives into video formats that our multimedia plugins can consume downstream.
  • qtivcomposer – This plugin overlays frames from the AI models onto frames from the live camera stream, then hands off gst buffers with the combined layers.
  • waylandsink – This sink submits the received video stream to Weston, which renders the video stream on a local display.
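The four-channel variant follows the same pattern, with one tee feeding four inference branches that all converge on the composer. As with the earlier sketch, this is an illustrative approximation: element properties and model paths are assumptions, not the SDK's exact options.

```shell
# Hypothetical sketch of gst-ai-parallel-inference: one tee, four AI branches.
# Properties and paths are illustrative assumptions.
gst-launch-1.0 \
  qtiqmmfsrc ! tee name=split \
  split. ! queue ! comp.sink_0 \
  split. ! queue ! qtimlvconverter ! qtimlsnpe model=/opt/yolonas.dlc \
         ! qtimlvdetection ! comp.sink_1 \
  split. ! queue ! qtimlvconverter ! qtimlsnpe model=/opt/deeplabv3.dlc \
         ! qtimlvsegmentation ! comp.sink_2 \
  split. ! queue ! qtimlvconverter ! qtimltflite model=/opt/posenet.tflite \
         ! qtimlvpose ! comp.sink_3 \
  split. ! queue ! qtimlvconverter ! qtimltflite model=/opt/inceptionv3.tflite \
         ! qtimlvclassification ! comp.sink_4 \
  qtivcomposer name=comp ! waylandsink
```

Each branch gets its own queue so the four inferences run in parallel without stalling the camera source, and qtivcomposer layers all four overlay streams onto the live preview.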

Here’s an example of the rendered video stream from gst-ai-parallel-inference:

A male cyclist rides a mountain bike on a rugged dirt trail surrounded by greenery under a clear blue sky.

When would you use this application?

As a superset of gst-ai-object-detection, gst-ai-parallel-inference allows you to detect people, vehicles, animals and other objects – even smoke and fire – in a camera frame.
 
With pose detection, you can determine, for example, whether a person is lying, sitting or standing, and potentially whether the person has fallen. A gym trainer or yoga instructor can use pose detection to check whether a student is holding a pose correctly. An ergonomics application can monitor posture in a chair or at a desk and issue reminders.
 
Scenarios for classification apps include product categorization, and for segmentation they include manufacturing, healthcare and logistics.

Next Steps

Those are two more of the compelling applications we’ve built to showcase Qualcomm Linux. You can get them, and the entire Qualcomm Intelligent Multimedia SDK with its 20 other applications for AI and multimedia, as open source. Then you can start incorporating them into your own applications.
 
We’ve designed Qualcomm Linux so you can write once and run on multiple IoT chipsets with the same source code. The Qualcomm Intelligent Multimedia SDK marks the first time we’ve opened all of our multimedia subsystems – including camera, artificial intelligence and audio – to developers via APIs.
As we give more to open source, you can customize, try out and contribute to this work. It’s a big step in our developer-first mindset, in which we make it easier for you to develop the kinds of customizations you want in your IoT applications running on Linux.

Article Tags

Drones
Internet of Things (IoT)
QUALCOMM
Artificial Intelligence (AI)
