
A series of web articles on Tenstorrent, essential reading for running LLMs locally


Software

TT-NN™ + VSCode + Gemini Code Assist

1 - Gemini Code Assist: A VS Code extension that supports the first step of Tenstorrent development

Tenstorrent's software stack is available as open source on GitHub, but developers new to it may find the code difficult to understand and work with.

In this article and video, we introduce Gemini Code Assist for Visual Studio Code and share our experience using it to support development on Tenstorrent's software stack.

Please take a look at the video.
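
To give a sense of the kind of boilerplate such an assistant can help generate, here is a minimal TT-NN sketch: opening a device, moving two tensors onto it, and adding them. This is an illustrative sketch based on the public ttnn Python API; exact function names and defaults may vary between tt-metal releases.

    # Minimal TT-NN sketch (illustrative; ttnn API details may vary by tt-metal release)
    import torch
    import ttnn

    device = ttnn.open_device(device_id=0)  # open the first Tenstorrent device

    a = torch.rand(32, 32)
    b = torch.rand(32, 32)

    # Move host tensors to the device in bfloat16, tile layout (TT-NN's compute layout)
    a_tt = ttnn.from_torch(a, dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)
    b_tt = ttnn.from_torch(b, dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)

    out = ttnn.to_torch(ttnn.add(a_tt, b_tt))  # elementwise add on device, then read back

    ttnn.close_device(device)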

Hardware

Wormhole™ n300 × 1

1 - I tried out Tenstorrent's AI accelerator card, the Wormhole™ n300

(Llama 3.2 11B / Meta model edition)

We tried out Tenstorrent's Wormhole™ n300 accelerator card, which is designed for image and generative AI workloads.
It offers superior performance per watt and cost-effectiveness compared to existing GPU cards, so if you're interested, please take a look.
It supports a variety of network models; this time we used Llama 3.2 11B.


Hardware

TT-LoudBox (Wormhole™ n300s × 4)

2 - Testing Tenstorrent's AI accelerator card, the Wormhole™ n300

(Llama 3.3 70B / Meta model edition)

We tried out Tenstorrent's Wormhole™ n300 accelerator card, which is designed for image and generative AI workloads.
It offers superior performance per watt and cost-effectiveness compared to existing GPU cards, so if you're interested, please take a look.
It supports a variety of network models; this time we used Llama 3.3 70B.

3 - Running vLLM on Tenstorrent's AI accelerator card, the Wormhole™ n300

(Llama 3.1 70B / HF model edition)

We ran vLLM on Tenstorrent's Wormhole™ n300 accelerator card, which is designed for image and generative AI workloads.
It offers superior performance per watt and cost-effectiveness compared to existing GPU cards, so if you're interested, please take a look.
It supports a variety of network models; this time we used Llama 3.1 70B.
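
As a rough idea of what running such a model through vLLM looks like in code, here is a short offline-inference sketch using vLLM's standard Python API. It assumes Tenstorrent's vLLM fork keeps the upstream LLM/SamplingParams interface; any device-specific setup or flags are omitted here.

    # Offline inference sketch using vLLM's standard Python API.
    # Assumes Tenstorrent's vLLM fork keeps the upstream interface;
    # device-specific setup is omitted and may be required in practice.
    from vllm import LLM, SamplingParams

    llm = LLM(model="meta-llama/Llama-3.1-70B-Instruct")
    params = SamplingParams(temperature=0.7, max_tokens=128)

    for out in llm.generate(["What does an AI accelerator card do?"], params):
        print(out.outputs[0].text)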

4 - Running vLLM on Tenstorrent's AI accelerator card, the Wormhole™ n300

(Qwen 2.5 72B / HF model edition)

We ran vLLM on Tenstorrent's Wormhole™ n300 accelerator card, which is designed for image and generative AI workloads.
It offers superior performance per watt and cost-effectiveness compared to existing GPU cards, so if you're interested, please take a look.
It supports a variety of network models; this time we used Qwen 2.5 72B.
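
vLLM can also run as an OpenAI-compatible server, which is a common way to use it in practice. The sketch below queries such a server from the standard openai Python client; the endpoint URL and exact model name are placeholders for this example.

    # Querying a vLLM OpenAI-compatible server (endpoint and model name are placeholders).
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    resp = client.chat.completions.create(
        model="Qwen/Qwen2.5-72B-Instruct",
        messages=[{"role": "user", "content": "Say hello in Japanese."}],
        max_tokens=64,
    )
    print(resp.choices[0].message.content)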

5 - Running vLLM on Tenstorrent's AI accelerator card, the Wormhole™ n300

(Llama 3.3 Swallow 70B / HF model edition)

We ran Llama 3.3 Swallow 70B with vLLM on Tenstorrent's Wormhole™ n300 accelerator card, which is designed for image and generative AI workloads.
It offers superior performance per watt and cost-effectiveness compared to existing GPU cards, so if you're interested, please take a look.
It supports a variety of network models; this time we used Llama 3.3 Swallow 70B Instruct v0.4.
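
For interactive chat with a model like this, the same OpenAI-compatible endpoint supports token streaming. A short sketch follows; the Hugging Face repository id for Swallow is our assumption, so check the model card of the model you actually deploy.

    # Streaming tokens from a vLLM OpenAI-compatible server.
    # The model id below is an assumed Hugging Face repo name; verify before use.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    stream = client.chat.completions.create(
        model="tokyotech-llm/Llama-3.3-Swallow-70B-Instruct-v0.4",
        messages=[{"role": "user", "content": "Briefly introduce yourself in Japanese."}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)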

6 - I tried running vLLM on Tenstorrent's desktop workstation (TT-LoudBox)

(Llama 3.3 70B / HF model edition)

We ran vLLM on Tenstorrent's TT-LoudBox, a desktop workstation designed for image and generative AI workloads.
It offers superior performance per watt and cost-effectiveness compared to existing GPU-based systems, so if you're interested, please take a look.
It supports a variety of network models; this time we used Llama 3.3 70B.

Inquiry

If you have any questions about this article, please contact us using the form below.
