Resolve the Windows linker C runtime mismatch by implementing a custom
tokenizer that doesn't depend on esaxx-rs (which links the static C
runtime, /MT).
Changes:
- Remove the tokenizers crate dependency (it pulled in esaxx-rs, which
  caused the /MT vs /MD conflict)
- Add custom SimpleTokenizer in src/ai/tokenizer.rs
  - Loads vocab.txt files directly
  - Implements WordPiece-style subword tokenization
  - Pure Rust, no C++ dependencies
  - Handles [CLS], [SEP], [PAD], [UNK] special tokens
- Update OnnxClassifier to use SimpleTokenizer
- Update ModelConfig to use vocab.txt instead of tokenizer.json
- Rename distilbert_tokenizer() to distilbert_vocab()
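
For reviewers unfamiliar with WordPiece, here is a minimal sketch of the
greedy longest-match approach described in the list above. It is not the
code in src/ai/tokenizer.rs: the type name `WordPieceSketch`, its methods,
the lowercasing step, and whitespace-only splitting are all assumptions made
for illustration. It assumes vocab.txt holds one token per line, with the
line index used as the token id.

```rust
use std::collections::HashMap;

/// Hypothetical minimal WordPiece-style tokenizer (illustration only).
struct WordPieceSketch {
    vocab: HashMap<String, i64>,
    unk_id: i64,
    cls_id: i64,
    sep_id: i64,
}

impl WordPieceSketch {
    /// Build from the contents of a vocab.txt file: one token per line,
    /// line index = token id. Returns None if a special token is missing.
    fn from_vocab_text(text: &str) -> Option<Self> {
        let vocab: HashMap<String, i64> = text
            .lines()
            .enumerate()
            .map(|(i, line)| (line.trim().to_string(), i as i64))
            .collect();
        let unk_id = *vocab.get("[UNK]")?;
        let cls_id = *vocab.get("[CLS]")?;
        let sep_id = *vocab.get("[SEP]")?;
        Some(Self { vocab, unk_id, cls_id, sep_id })
    }

    /// Greedy longest-match-first split of a single word into sub-token ids.
    /// Continuation pieces are looked up with the "##" prefix.
    fn encode_word(&self, word: &str, out: &mut Vec<i64>) {
        let chars: Vec<char> = word.chars().collect();
        let mut pieces = Vec::new();
        let mut start = 0;
        while start < chars.len() {
            let mut end = chars.len();
            let mut found = None;
            while end > start {
                let mut piece: String = chars[start..end].iter().collect();
                if start > 0 {
                    piece.insert_str(0, "##");
                }
                if let Some(&id) = self.vocab.get(&piece) {
                    found = Some(id);
                    break;
                }
                end -= 1;
            }
            match found {
                Some(id) => {
                    pieces.push(id);
                    start = end;
                }
                None => {
                    // No prefix matched: the whole word maps to [UNK].
                    out.push(self.unk_id);
                    return;
                }
            }
        }
        out.extend(pieces);
    }

    /// Lowercase, split on whitespace, and wrap the ids in [CLS] ... [SEP].
    fn encode(&self, text: &str) -> Vec<i64> {
        let lower = text.to_lowercase();
        let mut ids = vec![self.cls_id];
        for word in lower.split_whitespace() {
            self.encode_word(word, &mut ids);
        }
        ids.push(self.sep_id);
        ids
    }
}
```

With a toy vocabulary containing `play` and `##ing`, the word "playing"
splits into two ids, and a word with no matching pieces becomes a single
[UNK] id. A real BERT-style tokenizer would also split punctuation and
normalize accents, which this sketch leaves out.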
Build status:
✅ Compiles successfully
✅ Links without C runtime conflicts
✅ Executable works correctly
✅ All previous functionality preserved
This resolves the LNK2038 error completely while maintaining full
ONNX inference capability with NPU acceleration.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implement complete ONNX inference pipeline with NPU acceleration:
- Add OnnxClassifier for text classification via ONNX Runtime
- Integrate HuggingFace tokenizers for text preprocessing
- Support tokenization with padding/truncation
- Implement classification with probabilities (softmax)
- Add distilbert_tokenizer() model config for download
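
As a rough illustration of the padding/truncation step, the sketch below
brings a sequence of token ids to a fixed length and builds the matching
attention mask. The function name `pad_or_truncate` and its parameters are
hypothetical, and re-terminating a truncated sequence with [SEP] is an
assumption about the desired behavior, not something this commit confirms.

```rust
/// Hypothetical helper: pad or truncate `ids` to exactly `max_len` entries.
/// Returns (input_ids, attention_mask); the mask is 1 for real tokens and
/// 0 for padding.
fn pad_or_truncate(
    ids: &[i64],
    max_len: usize,
    pad_id: i64,
    sep_id: i64,
) -> (Vec<i64>, Vec<i64>) {
    let mut input_ids: Vec<i64> = ids.iter().copied().take(max_len).collect();

    // Assumption: when truncating, keep the sequence terminated with [SEP].
    if ids.len() > max_len {
        if let Some(last) = input_ids.last_mut() {
            *last = sep_id;
        }
    }

    // Number of real (non-padding) tokens before padding is appended.
    let real = input_ids.len();
    input_ids.resize(max_len, pad_id);

    let attention_mask: Vec<i64> = (0..max_len)
        .map(|i| if i < real { 1 } else { 0 })
        .collect();

    (input_ids, attention_mask)
}
```

Both vectors always have length `max_len`, which keeps the tensor shape
passed to the model fixed regardless of input length.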
Features:
- Tokenize text input to input_ids and attention_mask
- Run NPU-accelerated inference via DirectML
- Extract logits and convert to probabilities
- RefCell pattern for session management
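
The last two items can be sketched together. The example below uses a mock
session in place of ONNX Runtime, so `MockSession`, `ClassifierSketch`, and
the fake logits are hypothetical. It assumes the real session's run method
needs mutable access, which is one common reason to wrap it in a RefCell so
that `classify` can take `&self`. The softmax subtracts the maximum logit
before exponentiating for numerical stability.

```rust
use std::cell::RefCell;

/// Convert raw logits to probabilities with a numerically stable softmax.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Stand-in for an inference session whose `run` requires `&mut self`
/// (an assumption made for this sketch; not the ONNX Runtime API).
struct MockSession {
    calls: u32,
}

impl MockSession {
    fn run(&mut self, input_ids: &[i64]) -> Vec<f32> {
        self.calls += 1;
        // Fake two-class logits derived from the input length.
        vec![input_ids.len() as f32, 0.0]
    }
}

/// Hypothetical classifier holding the session behind a RefCell.
struct ClassifierSketch {
    session: RefCell<MockSession>,
}

impl ClassifierSketch {
    /// Takes `&self` even though the session needs mutable access; the
    /// mutable borrow is checked at runtime and released when this
    /// statement finishes.
    fn classify(&self, input_ids: &[i64]) -> Vec<f32> {
        let logits = self.session.borrow_mut().run(input_ids);
        softmax(&logits)
    }
}
```

RefCell panics if the session is borrowed mutably twice at once, so this
pattern suits single-threaded use; sharing a classifier across threads would
need a Mutex instead.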
Note: The current blocker is a Windows linker C runtime mismatch
(LNK2038) between esaxx-rs (static, /MT) and ONNX Runtime (dynamic,
/MD). The code compiles but linking fails. Resolution in progress.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>