Algorithms AnyNet C++ Containers Effective STL FlashAttention2 Functor GQA HuggingFace Iterator LLM LLaMa2 LayerNorm MHA MNN MQA ONNX Python Qwen2.5-VL RMSNorm STL String TTS Tokenizers Transformer Transformers VLM Vector ViT Vision Transformer Whisper bfloat16 c++ clip cosplay deepseek llama.cpp opencl self-attention std::threads transformer video summarization multimodal large models numerical precision survey