Feat/sam3cpp video backend #1351

Open
omar-A-hassan wants to merge 2 commits into CVHub520:main from omar-A-hassan:feat/sam3cpp-video-backend
omar-A-hassan commented on Apr 19, 2026

Summary

Adds a second video tracking backend, sam3cpp_video, powered by PABannier/sam3.cpp — a C++14/ggml engine with no PyTorch dependency. This brings SAM 3 and
EdgeTAM to X-AnyLabeling for the first time and gives Apple Silicon / CPU-only users a video-tracking option without installing torch or sam-2.

The existing segment_anything_2_video backend is untouched.
This is to show some love to the Mac people out there.

What's added

  • New model class anylabeling/services/auto_labeling/segment_anything_3_video.py — graceful ImportError if the sam3cpp Python module isn't installed, matching the existing SAM 2 video class's
    pattern.

  • 5 model entries in the AI model picker (auto-download from HuggingFace on first use):

    | Model | Size | Notes |
    | --- | --- | --- |
    | sam3cpp_video-edgetam-q4_0 | 15 MB | mobile / lowest latency |
    | sam3cpp_video-edgetam-f16 | 27 MB | best EdgeTAM quality |
    | sam3cpp_video-sam3_visual-q4_0 | 289 MB | SAM 3 visual, quantized |
    | sam3cpp_video-sam3_visual-f16 | 946 MB | SAM 3 visual, full precision |
    | sam3cpp_video-sam3-q4_0 | 707 MB | SAM 3 full (text + visual), quantized |

  • Setup guide at examples/interactive_video_object_segmentation/sam3cpp/README.md — mirrors the existing SAM 2 example folder structure (uv venv, build sam3.cpp from source, install xlabel).

  • Registry test tests/test_models/test_auto_labeling_registry.py — asserts no duplicate entries in any _*_MODELS list.
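
For reviewers who want a feel for the registry check, here is a minimal sketch of a duplicate-entry assertion. It is not the contents of the test file in this PR: the helper names `find_duplicates` and `check_registry` and the example registry are hypothetical, and it assumes each `_*_MODELS` list holds plain model-name strings.

```python
from collections import Counter


def find_duplicates(entries):
    """Return the entries that appear more than once, in first-seen order."""
    counts = Counter(entries)
    return [name for name in dict.fromkeys(entries) if counts[name] > 1]


def check_registry(namespace):
    """Map each `_*_MODELS` list in `namespace` to its duplicated entries."""
    problems = {}
    for attr, value in namespace.items():
        is_registry = attr.startswith("_") and attr.endswith("_MODELS")
        if is_registry and isinstance(value, (list, tuple)):
            dupes = find_duplicates(value)
            if dupes:
                problems[attr] = dupes
    return problems


# Hypothetical registry contents, for illustration only.
example_registry = {
    "_VIDEO_MODELS": ["segment_anything_2_video", "sam3cpp_video"],
    "_BROKEN_MODELS": ["a", "b", "a"],
}

# A test would assert that this mapping is empty for the real registry.
print(check_registry(example_registry))
```

Returning the offending names, instead of a bare boolean, keeps the failure message useful when someone adds a model entry twice.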

How a user installs it

The interaction model is identical to the existing SAM 2 video tutorial — see the new README for the full walkthrough. In short:

curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv --python 3.10 .venv && source .venv/bin/activate
git clone --recursive https://github.com/PABannier/sam3.cpp
cd sam3.cpp && mkdir build && cd build && cmake .. && make -j
# build sam3cpp Python bindings (see README; upstream PR pending)
# clone & install X-AnyLabeling normally

If sam3cpp is not importable, the model picker reports a clean error and the rest of the app keeps working.
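
The optional-dependency behaviour described above can be sketched roughly as follows. This is an illustration of the general pattern only, not the code in segment_anything_3_video.py: the helper `load_optional_backend` and the wording of the message are assumptions.

```python
import importlib


def load_optional_backend(module_name):
    """Try to import an optional backend; return (module, error_message)."""
    try:
        return importlib.import_module(module_name), None
    except ImportError as exc:
        message = (
            f"The '{module_name}' module is not installed, so this model "
            f"cannot be loaded ({exc}). See the setup guide for build steps."
        )
        return None, message


# The caller decides how to surface the error; here it is just printed.
module, error = load_optional_backend("sam3cpp")
if module is None:
    print(error)
```

Deferring the import to load time and returning the error, instead of raising at module import, is what lets the rest of the app start normally when the bindings are absent.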

Pending upstream work

The sam3cpp Python wrapper (pybind11) is being upstreamed to PABannier/sam3.cpp in a parallel PR. Once that lands, the bindings build collapses to a single CMake flag (-DSAM3_BUILD_PYBIND=ON) and the
README here will be updated. A link to that PR will be added to this description once it's open.

Verification

  • All 5 HuggingFace .ggml URLs verified to resolve (HEAD 200).
  • tests/test_models/test_auto_labeling_registry.py passes — no duplicate registry entries.
  • bash scripts/format_code.sh produces no diff on the new file (black -l 79).
  • Manual end-to-end on Apple Silicon: load sam3cpp_video-edgetam-q4_0 → click point on video frame → mask appears → "Run All" tracks across frames.
  • The original segment_anything_2_video still loads (no regression).
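
The URL check in the first item can be reproduced with a short standard-library script along these lines. It is a sketch under assumptions: the function names are hypothetical, the model URLs are not listed in this description and must be supplied by the caller, and the `opener` parameter exists only so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return the HTTP status for a HEAD request, or None if unreachable.

    urlopen follows redirects, so this is the status of the final response.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as exceptions; report their code.
        return exc.code
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout, and similar.
        return None


def unresolved(urls, **kwargs):
    """Return the URLs whose HEAD request did not end in a 200."""
    return [url for url in urls if head_status(url, **kwargs) != 200]
```

Calling `unresolved(model_urls)` with the five download URLs should return an empty list if every file resolves.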

Thanks for your contribution!
Please confirm that you agree to the Contributor License Agreement (CLA) by checking the box below:

  • I have read and agree to the CLA.

CVHub520 self-assigned this on Apr 19, 2026
CVHub520 added the enhancement and planned labels on Apr 19, 2026