In this post we pair a 12th Gen Intel CPU (Alder Lake) with Intel OpenVINO, run the Image Segmentation demo, and compare inference performance on different hardware.
Test environment
CPU: Intel® Core™ i7-1265U processor
GPU: Intel® Iris® Xe graphics
Storage: 256GB
Memory: 16GB
OS: Ubuntu 20.04 LTS (64 bit)
OpenVINO: Intel® Distribution of OpenVINO* toolkit 2022.3 LTS
Demo documentation on the Intel website
Image Segmentation Python* Demo
Path to the demo after installation
path: open_model_zoo/demos/segmentation_demo/python
Testing Model: semantic-segmentation-adas-0001
omz_downloader --name semantic-segmentation-adas-0001
Testing Video: head-pose-face-detection-male.mp4
Inference on CPU
Command
python segmentation_demo.py -d CPU -i ~/Downloads/head-pose-face-detection-male.mp4 -at segmentation -m ./intel/semantic-segmentation-adas-0001/FP32/semantic-segmentation-adas-0001.xml
Result
Latency: 402.0ms
FPS: 2.5
Inference on GPU
Command
python segmentation_demo.py -d GPU -i ~/Downloads/head-pose-face-detection-male.mp4 -at segmentation -m ./intel/semantic-segmentation-adas-0001/FP32/semantic-segmentation-adas-0001.xml
Result
Latency: 78.4ms
FPS: 12.2
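As a quick sanity check on the CPU and GPU figures above, the ratio between them can be worked out in a few lines of Python. The numbers are the ones reported by the demo in this post; the helper name `speedup` is only for illustration.

```python
# Results reported by segmentation_demo.py in this post (FP32 model).
results = {
    "CPU": {"latency_ms": 402.0, "fps": 2.5},
    "GPU": {"latency_ms": 78.4, "fps": 12.2},
}


def speedup(baseline: float, improved: float) -> float:
    """Return how many times larger `baseline` is than `improved`."""
    return baseline / improved


# Latency: lower is better, so divide the CPU value by the GPU value.
latency_gain = speedup(results["CPU"]["latency_ms"], results["GPU"]["latency_ms"])
# Throughput: higher is better, so divide the GPU value by the CPU value.
fps_gain = speedup(results["GPU"]["fps"], results["CPU"]["fps"])

print(f"GPU latency is {latency_gain:.1f}x lower than CPU")
print(f"GPU throughput is {fps_gain:.1f}x higher than CPU")
```

On this machine the integrated GPU comes out roughly 5x faster than the CPU for this model.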
Inference on AUTO (picks the device that gives the best inference performance based on CPU and GPU workload)
Command
python segmentation_demo.py -d AUTO -i ~/Downloads/head-pose-face-detection-male.mp4 -at segmentation -m ./intel/semantic-segmentation-adas-0001/FP32/semantic-segmentation-adas-0001.xml
Result
Latency: 79.8ms
FPS: 12.1
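The `-d` argument of the demo is the device name handed to OpenVINO Runtime. Below is a minimal sketch of the same choice through the Python API, assuming OpenVINO 2022.3 is installed. The helpers `auto_device` and `compile_for` are hypothetical names introduced here, not part of the demo. The `AUTO:GPU,CPU` form spells out a device priority list instead of leaving the choice entirely to AUTO.

```python
def auto_device(priorities=None):
    """Build a device string for OpenVINO's AUTO plugin.

    With no priorities, AUTO chooses among the devices it finds.
    With a list, candidates are restricted and ordered, e.g. "AUTO:GPU,CPU".
    """
    if not priorities:
        return "AUTO"
    return "AUTO:" + ",".join(priorities)


def compile_for(device, model_xml):
    """Read an IR model and compile it for `device` (needs the openvino package)."""
    # Imported lazily so auto_device() can be used without OpenVINO installed.
    from openvino.runtime import Core

    core = Core()
    model = core.read_model(model_xml)
    return core.compile_model(model, device)


# Device strings that could be passed to compile_for() or to the demo's -d option.
print(auto_device())                 # AUTO
print(auto_device(["GPU", "CPU"]))   # AUTO:GPU,CPU
```

With the model path from the commands above, `compile_for(auto_device(["GPU", "CPU"]), "./intel/semantic-segmentation-adas-0001/FP32/semantic-segmentation-adas-0001.xml")` would be one way to try it. The AUTO results in this post are close to the GPU results, which suggests the GPU was the device selected.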
Appendix I
How to check that the GPU is set up correctly
Download hello_query_device.py
curl -LO https://github.com/openvinotoolkit/openvino/raw/master/samples/python/hello_query_device/hello_query_device.py
Run
python hello_query_device.py
The output shows which devices are available for inference.
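The same check can be done directly from Python. This is a minimal sketch, assuming the `openvino` package (2022.3) is installed; `describe_devices` and `has_gpu` are hypothetical helper names, and `FULL_DEVICE_NAME` is assumed to be readable on every listed device.

```python
def describe_devices(core):
    """Map each device name reported by `core` to its full device name."""
    return {
        name: core.get_property(name, "FULL_DEVICE_NAME")
        for name in core.available_devices
    }


def has_gpu(devices):
    """True if any device name starts with "GPU" (e.g. "GPU" or "GPU.0")."""
    return any(name.startswith("GPU") for name in devices)


if __name__ == "__main__":
    try:
        from openvino.runtime import Core
    except ImportError:
        print("openvino is not installed in this environment")
    else:
        devices = describe_devices(Core())
        for name, full_name in devices.items():
            print(f"{name}: {full_name}")
        if not has_gpu(devices):
            print("No GPU device found; the GPU driver may be missing.")
```

If no GPU entry appears, the driver is a likely place to look first.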
Appendix II
How to install the GPU driver
If you installed OpenVINO Runtime via PyPI, you can download the install_NEO_OCL_driver.sh script from the OpenVINO GitHub repository:
curl -L https://raw.githubusercontent.com/openvinotoolkit/openvino/releases/2022/3/scripts/install_dependencies/install_NEO_OCL_driver.sh --output install_NEO_OCL_driver.sh
Run the script:
chmod +x install_NEO_OCL_driver.sh
sudo -E ./install_NEO_OCL_driver.sh
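This post measured AI inference performance for Image Segmentation on the CPU and GPU of the Intel i7-1265U. With the FP32 model, the GPU reaches about 12 FPS, compared with 2.5 FPS on the CPU, which is already enough for most applications.
Next time, if there is a chance, we will compare performance across different precisions.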