Releases: huggingface/transformers.js

3.7.6

20 Oct 19:44
4c908ec
This commit was created on GitHub.com and signed with GitHub's verified signature.
GPG key ID: B5690EEEBB952194

What's new?

New Contributors

Full Changelog: 3.7.5...3.7.6

Contributors

  • @nico-martin

3.7.5

02 Oct 13:58
c670bb9

What's new?

  • Add support for GraniteMoeHybrid in #1426

Full Changelog: 3.7.4...3.7.5


3.7.4

29 Sep 17:40
d6b3998

What's new?

  • Correctly assign logits warpers in _get_logits_processor in #1422
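For readers unfamiliar with the term, a logits warper reshapes the next-token scores before sampling (for example with temperature or top-k). The sketch below is a plain JavaScript illustration of that idea only. The function names are hypothetical and this is not the library's `_get_logits_processor` code.

```javascript
// Illustrative only: not the Transformers.js implementation.

// Temperature warper: divide every logit by the temperature.
function applyTemperature(logits, temperature) {
  return logits.map((x) => x / temperature);
}

// Top-k warper: keep the k largest logits, mask the rest with -Infinity.
function applyTopK(logits, k) {
  const sorted = [...logits].sort((a, b) => b - a);
  const threshold = sorted[Math.min(k, sorted.length) - 1];
  return logits.map((x) => (x >= threshold ? x : -Infinity));
}

// Softmax turns the warped logits into sampling probabilities.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((x) => x / sum);
}

const logits = [2.0, 1.0, 0.5, -1.0];
const warped = applyTopK(applyTemperature(logits, 0.5), 2);
console.log(warped); // [ 4, 2, -Infinity, -Infinity ]
console.log(softmax(warped)); // masked entries get probability 0
```

Masked entries end up with zero probability, so only the two highest-scoring tokens can be sampled in this toy case.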

Full Changelog: 3.7.3...3.7.4


3.7.2

15 Aug 17:58
28852a2

What's new?

  • Add support for DINOv3 in #1390

    See here for the full list of supported models.

    Example: Compute image embeddings

        import { pipeline } from '@huggingface/transformers';

        const image_feature_extractor = await pipeline(
          'image-feature-extraction',
          'onnx-community/dinov3-vits16-pretrain-lvd1689m-ONNX',
        );
        const url = 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png';
        const features = await image_feature_extractor(url);
        console.log(features);

    Try it out using our online demo:

    dinov3.mp4

Full Changelog: 3.7.1...3.7.2


3.7.1

01 Aug 21:14
8d6c400

What's new?

New Contributors

Full Changelog: 3.7.0...3.7.1

Contributors

  • @Honry

3.7.0

23 Jul 03:12
0feb5b7

🚀 Transformers.js v3.7 — Voxtral, LFM2, ModernBERT Decoder

🤖 New models

This update adds support for 3 new architectures:

Voxtral

Voxtral Mini is an enhancement of Ministral 3B, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. ONNX weights for Voxtral-Mini-3B-2507 can be found here. Learn more about Voxtral in the release blog post.

Try it out with our online demo:

Voxtral.WebGPU.demo.mp4

Example: Audio transcription

    import { VoxtralForConditionalGeneration, VoxtralProcessor, TextStreamer, read_audio } from "@huggingface/transformers";

    // Load the processor and model
    const model_id = "onnx-community/Voxtral-Mini-3B-2507-ONNX";
    const processor = await VoxtralProcessor.from_pretrained(model_id);
    const model = await VoxtralForConditionalGeneration.from_pretrained(
      model_id,
      {
        dtype: {
          embed_tokens: "fp16", // "fp32", "fp16", "q8", "q4"
          audio_encoder: "q4", // "fp32", "fp16", "q8", "q4", "q4f16"
          decoder_model_merged: "q4", // "q4", "q4f16"
        },
        device: "webgpu",
      },
    );

    // Prepare the conversation
    const conversation = [
      {
        "role": "user",
        "content": [
          { "type": "audio" },
          { "type": "text", "text": "lang:en [TRANSCRIBE]" },
        ],
      }
    ];
    const text = processor.apply_chat_template(conversation, { tokenize: false });
    const audio = await read_audio("http://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/mlk.wav", 16000);
    const inputs = await processor(text, audio);

    // Generate the response
    const generated_ids = await model.generate({
      ...inputs,
      max_new_tokens: 256,
      streamer: new TextStreamer(processor.tokenizer, { skip_special_tokens: true, skip_prompt: true }),
    });

    // Decode the generated tokens
    const new_tokens = generated_ids.slice(null, [inputs.input_ids.dims.at(-1), null]);
    const generated_texts = processor.batch_decode(
      new_tokens,
      { skip_special_tokens: true },
    );
    console.log(generated_texts[0]);
    // I have a dream that one day this nation will rise up and live out the true meaning of its creed.

Added in #1373 and #1375.
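The decoding step in the transcription example slices the generated ids from the prompt length onward, which implies the generated sequences start with the prompt tokens. The sketch below shows the same idea on plain arrays rather than tensors. `stripPrompt` and the token ids are hypothetical and only illustrate the slicing.

```javascript
// Hypothetical helper: keep only the tokens generated after the prompt.
// `sequences` is a batch of token id arrays; `promptLength` is the number
// of prompt tokens at the start of each sequence.
function stripPrompt(sequences, promptLength) {
  return sequences.map((ids) => ids.slice(promptLength));
}

// Made-up token ids: 3 prompt tokens followed by newly generated tokens.
const promptLength = 3;
const generated = [
  [101, 7, 8, 42, 43, 2],
  [101, 9, 9, 50, 2, 0],
];

console.log(stripPrompt(generated, promptLength));
// [ [ 42, 43, 2 ], [ 50, 2, 0 ] ]
```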

LFM2

LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency.

The models, which we have converted to ONNX, come in three different sizes: 350M, 700M, and 1.2B parameters.

Example: Text-generation with LFM2-350M:

    import { pipeline, TextStreamer } from "@huggingface/transformers";

    // Create a text generation pipeline
    const generator = await pipeline(
      "text-generation",
      "onnx-community/LFM2-350M-ONNX",
      { dtype: "q4" },
    );

    // Define the list of messages
    const messages = [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "What is the capital of France?" },
    ];

    // Generate a response
    const output = await generator(messages, {
      max_new_tokens: 512,
      do_sample: false,
      streamer: new TextStreamer(generator.tokenizer, { skip_prompt: true, skip_special_tokens: true }),
    });
    console.log(output[0].generated_text.at(-1).content);
    // The capital of France is Paris. It is a vibrant city known for its historical landmarks, art, fashion, and gastronomy.

Added in #1367 and #1369.
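The final `console.log` in the chat example reads `output[0].generated_text.at(-1).content`. The sketch below uses a mocked output object to show the shape that expression implies: the conversation with the model's reply as the last message. The mock values and the `lastReply` helper are illustrative assumptions, not output captured from the library.

```javascript
// Mock of the pipeline output shape implied by the example (illustrative values).
const output = [
  {
    generated_text: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "What is the capital of France?" },
      { role: "assistant", content: "The capital of France is Paris." },
    ],
  },
];

// Hypothetical helper: content of the last message of the first result.
function lastReply(pipelineOutput) {
  return pipelineOutput[0].generated_text.at(-1).content;
}

console.log(lastReply(output)); // The capital of France is Paris.
```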

ModernBERT Decoder

These models form part of the Ettin suite: the first collection of paired encoder-only and decoder-only models trained with identical data, architecture, and training recipes. Ettin enables fair comparisons between encoder and decoder architectures across multiple scales, providing state-of-the-art performance for open-data models in their respective size categories.

The list of supported models can be found here.

    import { pipeline, TextStreamer } from "@huggingface/transformers";

    // Create a text generation pipeline
    const generator = await pipeline(
      "text-generation",
      "onnx-community/ettin-decoder-150m-ONNX",
      { dtype: "fp32" },
    );

    // Generate a response
    const text = "Q: What is the capital of France?\nA:";
    const output = await generator(text, {
      max_new_tokens: 128,
      streamer: new TextStreamer(generator.tokenizer, { skip_prompt: true, skip_special_tokens: true }),
    });
    console.log(output[0].generated_text);

Added in #1371.

🛠️ Other improvements

  • Add special tokens in the text-generation pipeline if the tokenizer requires them in #1370

Full Changelog: 3.6.3...3.7.0


3.6.3

11 Jul 20:11
467f59c

What's new?

  • Bump @huggingface/jinja to version 0.5.1 for new chat template functionality in #1364

Full Changelog: 3.6.2...3.6.3


3.6.2

08 Jul 17:46
6f026f3

What's new?

  • Add support for SmolLM3 in #1359

    SmolLM3 is a 3B parameter language model designed to push the boundaries of small models. It supports 6 languages, advanced reasoning and long context. SmolLM3 is a fully open model that offers strong performance at the 3B–4B scale.

    Example:

        import { pipeline, TextStreamer } from "@huggingface/transformers";

        // Create a text generation pipeline
        const generator = await pipeline(
          "text-generation",
          "HuggingFaceTB/SmolLM3-3B-ONNX",
          { dtype: "q4f16" },
        );

        // Define the list of messages
        const messages = [
          { role: "system", content: "You are SmolLM, a language model created by Hugging Face. If asked by the user, here is some information about you: SmolLM has 3 billion parameters and can converse in 6 languages: English, Spanish, German, French, Italian, and Portuguese. SmolLM is a fully open model and was trained on a diverse mix of public datasets./think" },
          { role: "user", content: "Solve the equation x^2 - 3x + 2 = 0" },
        ];

        // Generate a response
        const output = await generator(messages, {
          max_new_tokens: 1024,
          do_sample: false,
          streamer: new TextStreamer(generator.tokenizer, { skip_prompt: true, skip_special_tokens: true }),
        });
        console.log(output[0].generated_text.at(-1).content);

  • Add support for ERNIE-4.5 in #1354

    Example:

        import { pipeline, TextStreamer } from "@huggingface/transformers";

        // Create a text generation pipeline
        const generator = await pipeline(
          "text-generation",
          "onnx-community/ERNIE-4.5-0.3B-ONNX",
          { dtype: "fp32" }, // Options: "fp32", "fp16", "q8", "q4", "q4f16"
        );

        // Define the list of messages
        const messages = [
          { role: "system", content: "You are a helpful assistant." },
          { role: "user", content: "What is the capital of France?" },
        ];

        // Generate a response
        const output = await generator(messages, {
          max_new_tokens: 512,
          do_sample: false,
          streamer: new TextStreamer(generator.tokenizer, { skip_prompt: true, skip_special_tokens: true }),
        });
        console.log(output[0].generated_text.at(-1).content);
        // The capital of France is Paris.
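A note on the SmolLM3 example: its system prompt ends with a `/think` flag. The sketch below shows one way to toggle such a flag when building the message list. The `withReasoningFlag` helper is hypothetical, and the `/no_think` counterpart is an assumption taken from the SmolLM3 model card rather than from these release notes, so check the model card before relying on it.

```javascript
// Hypothetical helper: append a reasoning-mode flag to a system prompt.
// "/think" appears in the SmolLM3 example; "/no_think" is assumed here.
function withReasoningFlag(systemPrompt, think) {
  return `${systemPrompt}${think ? "/think" : "/no_think"}`;
}

const messages = [
  { role: "system", content: withReasoningFlag("You are a helpful assistant.", false) },
  { role: "user", content: "Solve the equation x^2 - 3x + 2 = 0" },
];

console.log(messages[0].content); // You are a helpful assistant./no_think
```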

Full Changelog: 3.6.1...3.6.2


3.6.1

02 Jul 05:20
fc2847c

What's new?

  • Add support for NeoBERT in #1350

        import { pipeline } from "@huggingface/transformers";

        // Create feature extraction pipeline
        const extractor = await pipeline("feature-extraction", "onnx-community/NeoBERT-ONNX");

        // Compute embeddings
        const text = "NeoBERT is the most efficient model of its kind!";
        const embedding = await extractor(text, { pooling: "cls" });
        console.log(embedding.dims); // [1, 768]

  • Improve webworker detection to support ServiceWorker and SharedWorker by @aungKhantPaing in #1346

  • Pin numpy version for scripts by @fidoriel in #1351

  • Fix optional from_pretrained types in #1352
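To illustrate the webworker detection item above: dedicated, shared, and service workers each expose their own global scope constructor, so a check that covers all three could look like the sketch below. This is a minimal sketch and not the code from #1346. `isWebWorkerScope` and the fake scope are hypothetical.

```javascript
// Hypothetical sketch: not the library's actual detection code.
// Returns true if `scope` is an instance of any worker global scope
// constructor that the scope itself exposes.
function isWebWorkerScope(scope = globalThis) {
  const names = [
    "DedicatedWorkerGlobalScope",
    "ServiceWorkerGlobalScope",
    "SharedWorkerGlobalScope",
  ];
  return names.some(
    (name) => typeof scope[name] === "function" && scope instanceof scope[name],
  );
}

// Fake scope standing in for a service worker global, for demonstration only.
class FakeServiceWorkerGlobalScope {}
const fakeScope = new FakeServiceWorkerGlobalScope();
fakeScope.ServiceWorkerGlobalScope = FakeServiceWorkerGlobalScope;

console.log(isWebWorkerScope(fakeScope)); // true
console.log(isWebWorkerScope({})); // false
```

Passing the scope as a parameter keeps the check testable outside a real worker.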

New Contributors

Full Changelog: 3.6.0...3.6.1

Contributors

  • @aungKhantPaing
  • @fidoriel

3.6.0

26 Jun 15:48
7b45042

🚀 Transformers.js v3.6 — Gemma 3n, Qwen3-Embedding, Llava-Qwen2

🤖 New models

Gemma 3n

Gemma 3n, which was announced as a preview during Google I/O, is a model designed from the ground up to run locally on your hardware. On top of that, it's natively multimodal, supporting image, text, audio, and video inputs 🤯

Gemma 3n models have multiple architecture innovations:

  • They are available in two sizes based on effective parameters. While the raw parameter count of this model is 6B, the architecture design allows the model to be run with a memory footprint comparable to a traditional 2B model by offloading low-utilization matrices from the accelerator.
  • They use a MatFormer architecture that allows nesting sub-models within the E4B model. We provide one sub-model (this model repository), or you can access a spectrum of custom-sized models using the Mix-and-Match method.

Learn more about these techniques in the technical blog post and the Gemma documentation.

As part of this release, we are publishing ONNX weights for the gemma-3n-E2B-it variant (link), making it compatible with Transformers.js:

Warning

Due to the model's large size, we currently only support Node.js, Deno, and Bun execution.
In-browser WebGPU support is actively being worked on, so stay tuned for an update!

Example: Caption an image

    import {
      AutoProcessor,
      AutoModelForImageTextToText,
      load_image,
      TextStreamer,
    } from "@huggingface/transformers";

    // Load processor and model
    const model_id = "onnx-community/gemma-3n-E2B-it-ONNX";
    const processor = await AutoProcessor.from_pretrained(model_id);
    const model = await AutoModelForImageTextToText.from_pretrained(model_id, {
      dtype: {
        embed_tokens: "q8",
        audio_encoder: "q8",
        vision_encoder: "fp16",
        decoder_model_merged: "q4",
      },
      device: "cpu", // NOTE: WebGPU support coming soon!
    });

    // Prepare prompt
    const messages = [
      {
        role: "user",
        content: [
          { type: "image" },
          { type: "text", text: "Describe this image in detail." },
        ],
      },
    ];
    const prompt = processor.apply_chat_template(messages, {
      add_generation_prompt: true,
    });

    // Prepare inputs
    const url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg";
    const image = await load_image(url);
    const audio = null;
    const inputs = await processor(prompt, image, audio, {
      add_special_tokens: false,
    });

    // Generate output
    const outputs = await model.generate({
      ...inputs,
      max_new_tokens: 512,
      do_sample: false,
      streamer: new TextStreamer(processor.tokenizer, {
        skip_prompt: true,
        skip_special_tokens: false,
        // callback_function: (text) => { /* Do something with the streamed output */ },
      }),
    });

    // Decode output
    const decoded = processor.batch_decode(
      outputs.slice(null, [inputs.input_ids.dims.at(-1), null]),
      { skip_special_tokens: true },
    );
    console.log(decoded[0]);
See example output
The image is a close-up, slightly macro shot of a cluster of vibrant pink cosmos flowers in full bloom. The flowers are the focal point, with their delicate, slightly ruffled petals radiating outwards. They have a soft, almost pastel pink hue, and their edges are subtly veined. A small, dark-colored bee is actively visiting one of the pink flowers, its body positioned near the center of the bloom. The bee appears to be collecting pollen or nectar. The flowers are attached to slender, brownish-green stems, and some of the surrounding foliage is visible in a blurred background, suggesting a natural outdoor setting. There are also hints of other flowers in the background, including some red ones, adding a touch of contrast to the pink. The lighting in the image seems to be natural daylight, casting soft shadows and highlighting the textures of the petals and the bee. The overall impression is one of delicate beauty and the gentle activity of nature.
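The captioning example leaves a commented-out `callback_function` option in the `TextStreamer` settings. The sketch below shows one way such a callback could accumulate streamed text. `createCollector` is a hypothetical helper, and the loop only simulates a streamer calling the callback with made-up chunks.

```javascript
// Hypothetical helper: collects streamed text chunks into one string.
function createCollector() {
  const chunks = [];
  return {
    callback_function: (text) => {
      chunks.push(text);
    },
    text: () => chunks.join(""),
  };
}

const collector = createCollector();

// Stand-in for the streamer invoking the callback as text becomes available.
for (const chunk of ["The image ", "shows a bee ", "on a flower."]) {
  collector.callback_function(chunk);
}

console.log(collector.text()); // The image shows a bee on a flower.
```

In the example above, this would be wired in by passing `callback_function: collector.callback_function` where the commented-out option sits.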

Example: Transcribe audio

    import {
      AutoProcessor,
      AutoModelForImageTextToText,
      TextStreamer,
    } from "@huggingface/transformers";
    import wavefile from "wavefile";

    // Load processor and model
    const model_id = "onnx-community/gemma-3n-E2B-it-ONNX";
    const processor = await AutoProcessor.from_pretrained(model_id);
    const model = await AutoModelForImageTextToText.from_pretrained(model_id, {
      dtype: {
        embed_tokens: "q8",
        audio_encoder: "q4",
        vision_encoder: "fp16",
        decoder_model_merged: "q4",
      },
      device: "cpu", // NOTE: WebGPU support coming soon!
    });

    // Prepare prompt
    const messages = [
      {
        role: "user",
        content: [
          { type: "audio" },
          { type: "text", text: "Transcribe this audio verbatim." },
        ],
      },
    ];
    const prompt = processor.apply_chat_template(messages, {
      add_generation_prompt: true,
    });

    // Prepare inputs
    const url = "https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav";
    const buffer = Buffer.from(await fetch(url).then((x) => x.arrayBuffer()));
    const wav = new wavefile.WaveFile(buffer);
    wav.toBitDepth("32f"); // Pipeline expects input as a Float32Array
    wav.toSampleRate(processor.feature_extractor.config.sampling_rate);
    let audioData = wav.getSamples();
    if (Array.isArray(audioData)) {
      if (audioData.length > 1) {
        for (let i = 0; i < audioData[0].length; ++i) {
          audioData[0][i] = (Math.sqrt(2) * (audioData[0][i] + audioData[1][i])) / 2;
        }
      }
      audioData = audioData[0];
    }

    const image = null;
    const audio = audioData;
    const inputs = await processor(prompt, image, audio, {
      add_special_tokens: false,
    });

    // Generate output
    const outputs = await model.generate({
      ...inputs,
      max_new_tokens: 512,
      do_sample: false,
      streamer: new TextStreamer(processor.tokenizer, {
        skip_prompt: true,
        skip_special_tokens: false,
        // callback_function: (text) => { /* Do something with the streamed output */ },
      }),
    });

    // Decode output
    const decoded = processor.batch_decode(
      outputs.slice(null, [inputs.input_ids.dims.at(-1), null]),
      { skip_special_tokens: true },
    );
    console.log(decoded[0]);
See example output
And so, my fellow Americans, ask not what your country can do for you. Ask what you can do for your country.
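The transcription example merges stereo audio into a single channel inline before passing it to the processor. The same step can be sketched as a standalone function, using the same √2-scaled average of the two channels. `toMono` is a hypothetical name, and unlike the inline version this sketch writes to a new array instead of overwriting the first channel.

```javascript
// Hypothetical helper mirroring the inline downmix in the example above.
// `samples` is either a single typed array (already mono) or an array of
// per-channel typed arrays.
function toMono(samples) {
  if (!Array.isArray(samples)) return samples; // already a single channel
  if (samples.length === 1) return samples[0];
  const [left, right] = samples;
  const mono = new Float32Array(left.length);
  for (let i = 0; i < left.length; ++i) {
    // Same formula as the example: sqrt(2) * (L + R) / 2
    mono[i] = (Math.sqrt(2) * (left[i] + right[i])) / 2;
  }
  return mono;
}

// Made-up two-sample stereo signal.
const stereo = [Float32Array.of(0.5, -0.5), Float32Array.of(0.5, 0.5)];
console.log(toMono(stereo)); // first sample is about 0.7071, second is 0
```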

Qwen3-Embedding

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. Building upon the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embeddings and reranking models in various sizes (0.6B, 4B, and 8B). This series inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundational model.

You can run it with Transformers.js as follows:

    import { pipeline, matmul } from "@huggingface/transformers";

    // Create a feature extraction pipeline
    const extractor = await pipeline(
      "feature-extraction",
      "onnx-community/Qwen3-Embedding-0.6B-ONNX",
      {
        dtype: "fp32", // Options: "fp32", "fp16", "q8"
        // device: "webgpu",
      },
    );

    function get_detailed_instruct(task_description, query) {
      return `Instruct:${task_description}\nQuery:${query}`;
    }

    // Each query must come with a one-sentence instruction that describes the task
    const task = "Given a web search query, retrieve relevant passages that answer the query";
    const queries = [
      get_detailed_instruct(task, "What is the capital of China?"),
      get_detailed_instruct(task, "Explain gravity"),
    ];

    // No need to add instruction for retrieval documents
    const documents = [
      "The capital of China is Beijing.",
      "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
    ];
    const input_texts = [...queries, ...documents];

    // Extract embeddings for queries and documents
    const output = await extractor(input_texts, {
      pooling: "last_token",
      normalize: true,
    });
    const scores = await matmul(
      output.slice([0, queries.length]), // Query embeddings
      output.slice([queries.length, null]).transpose(1, 0), // Document embeddings
    );
    console.log(scores.tolist());
    // [
    //   [ 0.7645590305328369, 0.14142560958862305 ],
    //   [ 0.13549776375293732, 0.599955141544342 ]
    // ]
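Because the embeddings are extracted with `normalize: true`, the matrix product of query and document embeddings is a table of cosine similarities. The sketch below reproduces that computation on tiny plain arrays. The vectors are made up for illustration, and `l2Normalize` and `similarityMatrix` are hypothetical helpers, not library functions.

```javascript
// Scale a vector to unit length (what `normalize: true` does conceptually).
function l2Normalize(v) {
  const norm = Math.hypot(...v);
  return v.map((x) => x / norm);
}

// Dot product of every query with every document: queries x documents^T.
function similarityMatrix(queries, documents) {
  return queries.map((q) =>
    documents.map((d) => q.reduce((sum, x, i) => sum + x * d[i], 0)),
  );
}

// Made-up 2-dimensional "embeddings".
const queryVectors = [[3, 4], [1, 0]].map(l2Normalize);
const documentVectors = [[3, 4], [0, 2]].map(l2Normalize);

console.log(similarityMatrix(queryVectors, documentVectors));
// approximately [ [ 1, 0.8 ], [ 0.6, 0 ] ]
```

Each row corresponds to a query and each column to a document, matching the layout of the scores printed in the example above.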

Llava-Qwen2

Finally, we also added support for Llava models with a Qwen2 text backbone:

    import {
      AutoProcessor,
      AutoModelForImageTextToText,
      load_image,
      TextStreamer,
    } from "@huggingface/transformers";

    // Load processor and model
    const model_id = "onnx-community/FastVLM-0.5B-ONNX";
    const processor = await AutoProcessor.from_pretrained(model_id);
    const model = await AutoModelForImageTextToText.from_pretrained(model_id, {
      dtype: {
        embed_tokens: "fp16",
        vision_encoder: "q4",
        decoder_model_merged: "q4",
      },
    });

    // Prepare prompt
    const messages = [
      {
        role: "user",
        content: "<image>Describe this image in detail.",
      },
    ];
    const prompt = processor.apply_cha...