Intel and Hugging Face are building powerful optimization tools to accelerate training and inference with Hugging Face libraries.
Get started deploying Intel's models on Intel® architecture with these hands-on tutorial blogs, written by engineers from Hugging Face and Intel:
Blog | Description |
---|---|
Building Cost-Efficient Enterprise RAG applications with Intel Gaudi 2 and Intel Xeon | Develop and deploy RAG applications as part of OPEA, the Open Platform for Enterprise AI |
Running Large Multimodal Models on an AI PC's NPU | Run the llava-gemma-2b model on an AI PC's NPU |
A Chatbot on your Laptop: Phi-2 on Intel Meteor Lake | Deploy Phi-2 on your local laptop with Intel OpenVINO in the Optimum Intel library |
To get started with Hugging Face Transformers software on Intel, visit the resources listed below.
Optimum Intel - To deploy on Intel® Xeon, Intel® Max Series GPU, and Intel® Core Ultra, check out optimum-intel, the interface between Intel architectures and the 🤗 Transformers and Diffusers libraries. You can use these backends:
Backend | Installation |
---|---|
OpenVINO™ | `pip install --upgrade --upgrade-strategy eager "optimum[openvino]"` |
Intel® Extension for PyTorch* | `pip install --upgrade --upgrade-strategy eager "optimum[ipex]"` |
Intel® Neural Compressor | `pip install --upgrade --upgrade-strategy eager "optimum[neural-compressor]"` |
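
Once the OpenVINO™ backend is installed, running a model through it could look like the sketch below. It is a minimal illustration rather than an official recipe: the helper names `backend_available` and `generate_with_openvino` are hypothetical, and the choice of model is left to the caller. The heavy imports are deferred so the file still loads on a machine where optimum-intel is missing.

```python
import importlib.util


def backend_available(module_name: str) -> bool:
    """Return True if `module_name` can be found in this environment.

    Hypothetical helper, used here to check for "optimum.intel" before
    attempting to load a model.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example "optimum") is missing.
        return False


def generate_with_openvino(model_id: str, prompt: str, max_new_tokens: int = 32) -> str:
    """Generate text from a causal language model using the OpenVINO backend.

    Assumes `model_id` names a causal LM checkpoint that optimum-intel
    is able to export.
    """
    # Imported lazily so this module loads even without optimum-intel.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the original checkpoint to OpenVINO format on load.
    model = OVModelForCausalLM.from_pretrained(model_id, export=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A caller would typically check `backend_available("optimum.intel")` first and then call `generate_with_openvino` with a model ID from the Hub and a prompt. The first call downloads and converts the model, so it can take a while.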
Optimum Habana - To deploy on Intel® Gaudi® AI accelerators, check out optimum-habana, the interface between Gaudi and the 🤗 Transformers and Diffusers libraries. To install the latest stable release:
pip install --upgrade-strategy eager "optimum[habana]"
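
With optimum-habana installed, training on Gaudi usually means swapping the Transformers `Trainer` and `TrainingArguments` for their Gaudi counterparts. The sketch below shows one way this might be wired up. The helper names `gaudi_training_kwargs` and `build_gaudi_trainer` are hypothetical, the output directory and Gaudi configuration name are supplied by the caller, and the code assumes a machine with Gaudi hardware and drivers available.

```python
def gaudi_training_kwargs(output_dir: str, gaudi_config_name: str) -> dict:
    """Collect the Gaudi-specific training arguments in one place.

    Hypothetical helper: it only builds a dict, so it can be inspected
    without Gaudi hardware.
    """
    return {
        "output_dir": output_dir,
        "use_habana": True,      # run on Gaudi devices
        "use_lazy_mode": True,   # lazy execution mode instead of eager mode
        "gaudi_config_name": gaudi_config_name,
    }


def build_gaudi_trainer(model, train_dataset, output_dir: str, gaudi_config_name: str):
    """Create a GaudiTrainer for `model`; requires optimum-habana and a Gaudi device."""
    # Imported lazily so this module loads even without optimum-habana.
    from optimum.habana import GaudiTrainer, GaudiTrainingArguments

    args = GaudiTrainingArguments(**gaudi_training_kwargs(output_dir, gaudi_config_name))
    # The Gaudi configuration is resolved from args.gaudi_config_name.
    return GaudiTrainer(model=model, args=args, train_dataset=train_dataset)
```

For example, `gaudi_training_kwargs("out", "Habana/bert-base-uncased")` returns the argument dictionary for a BERT fine-tuning run, where both values are illustrative. Calling `.train()` on the object returned by `build_gaudi_trainer` then behaves like the regular Transformers `Trainer`.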
Check out the Intel® Tiber™ Developer Cloud to run your latest GenAI or LLM workload on Intel architecture.
Want to share your model fine-tuned on Intel architecture? Visit the Powered-by-Intel LLM Leaderboard. Its "Deployment Tips" tab also offers more detailed deployment tips and sample code.
Join us on the Intel DevHub Discord to ask questions and interact with our AI developer community.