Overview
We will use the pre-converted models from the official Model Zoo directly on the board to quickly try out the functionality.
Downloading Model Resources
- Official Download Link: https://console.box.lenovo.com/l/l0tXb8
- Extraction Password: rkllm
After accessing the link, download the following resources:
- The entire directory: rkllm_model_zoo/quickstart/demo_Linux_aarch64/
- These two files from rkllm_model_zoo/1.2.3/RK3576/Qwen3-VL-2B/:
  - qwen3-vl-2b-instruct_w4a16_g128_rk3576.rkllm
  - qwen3-vl-2b_vision_rk3576.rknn
Pushing Files to the Board
We recommend using ADB for file transfer. The LCSC-TaishanPi-3M has ADB enabled by default, so only the PC side needs to be set up. Refer to: Debian12 System ADB Usage
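Before pushing anything, it is worth confirming that the board is visible to ADB (the serial number shown will differ on your machine):
# List connected ADB devices; the board should show up with state "device"
adb devices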
Create a directory to store the files we will push:
adb shell mkdir -p /home/lckfb/rkllm_demo
Use the following command to push the environment files from the PC to the LCSC-TaishanPi-3M development board under /home/lckfb/rkllm_demo/:
# Push the entire demo_Linux_aarch64 directory to the board at /home/lckfb/rkllm_demo/
adb push quickstart/demo_Linux_aarch64/ /home/lckfb/rkllm_demo/
Push model files to the LCSC-TaishanPi-3M development board under /home/lckfb/rkllm_demo/:
Two files need to be pushed: the .rknn vision model and the .rkllm language model.
adb push 1.2.3/RK3576/Qwen3-VL-2B/qwen3-vl-2b-instruct_w4a16_g128_rk3576.rkllm /home/lckfb/rkllm_demo/
adb push 1.2.3/RK3576/Qwen3-VL-2B/qwen3-vl-2b_vision_rk3576.rknn /home/lckfb/rkllm_demo/
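After both pushes complete, a quick listing confirms everything arrived:
# Sanity check: list the target directory on the board
adb shell ls -lh /home/lckfb/rkllm_demo/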
The final file structure on the board should look like this:
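A sketch of the expected layout, inferred from the files pushed above (the exact contents of demo_Linux_aarch64/ depend on the Model Zoo release):
/home/lckfb/rkllm_demo/
├── demo_Linux_aarch64/
│   ├── demo
│   ├── demo.jpg
│   └── lib/
├── qwen3-vl-2b-instruct_w4a16_g128_rk3576.rkllm
└── qwen3-vl-2b_vision_rk3576.rknn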
Running Models on the Board
Open a terminal on the LCSC-TaishanPi-3M development board (for example via adb shell) and navigate to the /home/lckfb/rkllm_demo/demo_Linux_aarch64/ directory:
# Navigate to the directory
cd /home/lckfb/rkllm_demo/demo_Linux_aarch64/
Set the dynamic library path (located in the ./lib subdirectory under the current directory):
# Set the dynamic library path (very important, otherwise errors will occur)
export LD_LIBRARY_PATH=./lib:$LD_LIBRARY_PATH
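You can verify that the demo's shared-library dependencies now resolve against ./lib before launching:
# Any line reporting "not found" means a library is missing from ./lib
ldd ./demo | grep "not found" || echo "all libraries resolved"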
If you want to view performance statistics, also set the following environment variable:
export RKLLM_LOG_LEVEL=1
Grant executable permission to the demo:
sudo chmod +x demo
Run the demo. Usage:
./demo [Image] [Vision Model] [Language Model] [Generation Length] [Context Length] [NPU Core Count] [Special Prompt Tokens...]
Note: Because the model files are in the parent directory, we reference them with ../:
./demo demo.jpg \
../qwen3-vl-2b_vision_rk3576.rknn \
../qwen3-vl-2b-instruct_w4a16_g128_rk3576.rkllm \
2048 4096 3 "<|vision_start|>" "<|vision_end|>" "<|image_pad|>"
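To avoid retyping the long command, the steps above can be bundled into a small launch script. This is a sketch; the script name run_demo.sh is our own, and the parameter values are simply the ones used above:
#!/bin/bash
# run_demo.sh -- hypothetical helper wrapping the steps above.
# Run it from inside demo_Linux_aarch64/.
set -e

export LD_LIBRARY_PATH=./lib:$LD_LIBRARY_PATH
export RKLLM_LOG_LEVEL=1   # optional: print performance statistics

# First argument selects the input image; defaults to demo.jpg
IMAGE="${1:-demo.jpg}"

./demo "$IMAGE" \
    ../qwen3-vl-2b_vision_rk3576.rknn \
    ../qwen3-vl-2b-instruct_w4a16_g128_rk3576.rkllm \
    2048 4096 3 "<|vision_start|>" "<|vision_end|>" "<|image_pad|>"
Make it executable with chmod +x run_demo.sh, then launch with ./run_demo.sh or ./run_demo.sh your_image.jpg.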
The three tokens "<|vision_start|>", "<|vision_end|>", and "<|image_pad|>" are special placeholder tokens in the multimodal LLM input:
- "<|vision_start|>": visual start token. Marks the starting position of image information in the LLM input sequence, telling the model "an image's content will be inserted next."
- "<|vision_end|>": visual end token. Marks where the image information ends, telling the model "image input ends here."
- "<|image_pad|>": image padding token. Image patch/token counts may vary between inputs (for example when batching multiple images), so pad tokens are used to align inputs to a consistent length; this token fills those positions.
In short, these special string tokens tell the LLM where the image content starts, where it ends, and what fills the image positions; they are part of the multimodal inference input.
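For intuition, the prompt the runtime assembles conceptually places the image between these delimiters. The sketch below is illustrative only; the real template is built inside the demo and is model-specific:
# Conceptual layout of a multimodal prompt (illustrative, not the demo's exact template)
PROMPT='<|vision_start|><|image_pad|><|vision_end|>Describe this image.'
echo "$PROMPT"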
After successful execution, you can engage in Q&A with the model. The terminal will output the model's description of, or answers about, the demo.jpg image.
Let's look at this image and see if it matches the model's response: