- Notifications
You must be signed in to change notification settings - Fork14.2k
feat: Support bloom models#3553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Merged
Merged
Uh oh!
There was an error while loading.Please reload this page.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
ContributorAuthor
xingchensong commentedOct 9, 2023
PPL TestTest script for torch fp32importargparseimporttorchfromtransformersimportAutoConfig,BloomForCausalLM,BloomTokenizerFastdefcalculate_ppl(device,model,tokenizer,sentence:str,max_length:int=100,stride:int=50)->float:sentence_ids=tokenizer.encode(sentence)# do not add bos_token_idprint(sentence_ids)seq_len=len(sentence_ids)nlls= []forbegin_locinrange(0,seq_len,stride):end_loc=min(begin_loc+max_length+stride//2,seq_len)if (end_loc-begin_loc)!= (max_length+stride//2):breakinput_ids=sentence_ids[begin_loc:end_loc]input_ids=torch.tensor([input_ids])input_ids=input_ids.to(device)target_ids=input_ids.clone()target_ids[:, :-stride]=-100withtorch.no_grad():outputs=model(input_ids,labels=target_ids)neg_log_likelihood=outputs.lossnlls.append(neg_log_likelihood)ifend_loc==seq_len:breakggml_nlls=torch.cumsum(torch.stack(nlls)*stride,dim=0)count=torch.arange(stride,len(nlls)*stride+stride,stride)chunk_ppls=torch.exp(ggml_nlls/count).cpu().tolist()fori,pplinenumerate(chunk_ppls):print("[{}] {}".format(i+1,ppl))if__name__=="__main__":parser=argparse.ArgumentParser()parser.add_argument("--model_name_or_path",type=str,default="")args=parser.parse_args()device=torch.device("cpu")tokenizer=BloomTokenizerFast.from_pretrained(args.model_name_or_path)model_config=AutoConfig.from_pretrained(args.model_name_or_path,trust_remote_code=True )model=BloomForCausalLM.from_pretrained(args.model_name_or_path,torch_dtype=torch.float16,config=model_config,device_map="auto", )model.to(device).float()# type: ignoremodel.eval()# type: ignoresens= ["About six million children are reported to child protection agencies in America each year. About 400,000 of those children are placed in protective custody because of severe neglect or abuse. About 500,000 children are placed into foster care and adoptive placements. Abused and neglected children are all around us. These children are invisible in our community, yet each one of us is directly responsible for their plight. They live under our laws; they go to our schools; they are convicted by our courts; many of them spend lifetimes in our prisons. They have no say in the laws and policies that rule their lives. Just like they had no say in the neglect and abuse that was their childhood. Neglected and abused children make up a great majority of the crime, drugs, and violence we experience in our communities. Over fifty percent of the children in the juvenile justice system have diagnosable mental illness, about thirty percent of children in child protection services are proscribed psychotropic medications, & almost eighty percent of youth aging out of foster care lead dysfunctional lives. Ninety percent of the juveniles in the Juvenile Justice System have come out of the Child Protection System (Minnesota’s Chief Justice, Kathleen Blatz). Over 90 percent of the adults in the Criminal Justice System come out of the Juvenile Justice System. Justice Blatz (and others) call it a prison “feeder” system. The United States is the only nation in the world to build prisons based on failed third grade reading scores or the number of children in Child Protection. Children are not aware of the rightness or wrongness of their own abuse. They do not know that abuse is abnormal, or even that it is wrong. To a five-year-old, no matter how painful and frightening her life is, her life is normal. A sad and lasting fact of child abuse is that children blame themselves for the abuse they receive. How can sex, drugs, and violence be unlearned by a ten year old child whose entire life has been just that? It takes years of therapy to change a child’s perception of an abusive past. It takes a great deal longer for an abused child to develop a healthy view of the world and a positive self-image. There is no book a child can go to, or code they are born with, that explains the abnormality of what is happening to them. Children can’t call their senators, or complain to the authorities (they can’t even tell their parents). Behaviors learned by abused children to stay alive in toxic homes are terribly counter-productive once the child is out of the abusive circumstances and trying to live a normal life. The behaviors developed for staying alive and avoiding pain dominate and thus can become significant detriments to getting along in society. As a matter of fact, for many troubled youth, their explosive responses and pain avoidance behaviors define them as uneducated social misfits with criminal histories."]for_,seninenumerate(sens):calculate_ppl(device,model,tokenizer,sen,max_length=100,stride=50 ) Test script for ggml fp16/q4_1./build/bin/perplexity -m models/bloom-1b7.fp16.gguf \ -p "About six million children are reported to child protection agencies in America each year. About 400,000 of those children are placed in protective custody because of severe neglect or abuse. About 500,000 children are placed into foster care and adoptive placements. Abused and neglected children are all around us. These children are invisible in our community, yet each one of us is directly responsible for their plight. They live under our laws; they go to our schools; they are convicted by our courts; many of them spend lifetimes in our prisons. They have no say in the laws and policies that rule their lives. Just like they had no say in the neglect and abuse that was their childhood. Neglected and abused children make up a great majority of the crime, drugs, and violence we experience in our communities. Over fifty percent of the children in the juvenile justice system have diagnosable mental illness, about thirty percent of children in child protection services are proscribed psychotropic medications, & almost eighty percent of youth aging out of foster care lead dysfunctional lives. Ninety percent of the juveniles in the Juvenile Justice System have come out of the Child Protection System (Minnesota’s Chief Justice, Kathleen Blatz). Over 90 percent of the adults in the Criminal Justice System come out of the Juvenile Justice System. Justice Blatz (and others) call it a prison “feeder” system. The United States is the only nation in the world to build prisons based on failed third grade reading scores or the number of children in Child Protection. Children are not aware of the rightness or wrongness of their own abuse. They do not know that abuse is abnormal, or even that it is wrong. To a five-year-old, no matter how painful and frightening her life is, her life is normal. A sad and lasting fact of child abuse is that children blame themselves for the abuse they receive. How can sex, drugs, and violence be unlearned by a ten year old child whose entire life has been just that? It takes years of therapy to change a child’s perception of an abusive past. It takes a great deal longer for an abused child to develop a healthy view of the world and a positive self-image. There is no book a child can go to, or code they are born with, that explains the abnormality of what is happening to them. Children can’t call their senators, or complain to the authorities (they can’t even tell their parents). Behaviors learned by abused children to stay alive in toxic homes are terribly counter-productive once the child is out of the abusive circumstances and trying to live a normal life. The behaviors developed for staying alive and avoiding pain dominate and thus can become significant detriments to getting along in society. As a matter of fact, for many troubled youth, their explosive responses and pain avoidance behaviors define them as uneducated social misfits with criminal histories." \ --ppl-stride 50 -c 100 -b 512 -s 2023 Results
|
ContributorAuthor
xingchensong commentedOct 9, 2023
PPL results look good to me, I think this PR is ready for a final review :)),@ggerganov |
Member
ggerganov commentedOct 9, 2023
Nice job. This still lacks tensor offloading for GPU support, but we can fix this later. |
Member
ggerganov commentedOct 10, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Tested on M2 Ultra using Metal - seems to work as expected: ./main -m ./models/bloom-1b/ggml-model-f16.gguf -p"I believe the meaning of life is" --ignore-eos -n64 -t4 -ngl1 -s1llama_new_context_with_model:computebuffertotalsize =500.13MBllama_new_context_with_model:maxtensorsize =980.00MBggml_metal_add_buffer:allocated'data 'buffer,size =4279.47MB, (4280.09 /147456.00)ggml_metal_add_buffer:allocated'kv 'buffer,size =98.00MB, (4378.09 /147456.00)ggml_metal_add_buffer:allocated'alloc 'buffer,size =494.02MB, (4872.11 /147456.00)system_info:n_threads =4 /24 |AVX =0 |AVX2 =0 |AVX512 =0 |AVX512_VBMI =0 |AVX512_VNNI =0 |FMA =0 |NEON =1 |ARM_FMA =1 |F16C =0 |FP16_VA =1 |WASM_SIMD =0 |BLAS =1 |SSE3 =0 |SSSE3 =0 |VSX =0 |sampling:repeat_last_n =64,repeat_penalty =1.100000,presence_penalty =0.000000,frequency_penalty =0.000000,top_k =40,tfs_z =1.000000,top_p =0.950000,typical_p =1.000000,temp =0.800000,mirostat =0,mirostat_lr =0.100000,mirostat_ent =5.000000generate:n_ctx =512,n_batch =512,n_predict =64,n_keep =0Ibelievethemeaningoflifeisdeterminednotbyanindividual's physical, spiritual or mental well-being but rather their place in a more meaningful context.Thetermholisticwellbeingwasfirstcoinedtodescribetheconceptthatpeopleshouldbehealthyandhappyasindividualswithoutbeingforcedintohealthcareprograms (Barnett &Jones,2006) .Toachievethisllama_print_timings:loadtime =216.92msllama_print_timings:sampletime =330.09ms /64runs (5.16mspertoken,193.89tokenspersecond)llama_print_timings:promptevaltime =20.13ms /7tokens (2.88mspertoken,347.81tokenspersecond)llama_print_timings:evaltime =609.28ms /63runs (9.67mspertoken,103.40tokenspersecond)llama_print_timings:totaltime =1002.27ms//////////////////./main -m ./models/bloom-1b/ggml-model-q4_0.gguf -p"I believe the meaning of life is" --ignore-eos -n64 -t4 -ngl1 -s1llama_new_context_with_model:computebuffertotalsize =500.13MBllama_new_context_with_model:maxtensorsize =401.95MBggml_metal_add_buffer:allocated'data 'buffer,size =1341.05MB, (1341.67 /147456.00)ggml_metal_add_buffer:allocated'kv 'buffer,size =98.00MB, (1439.67 /147456.00)ggml_metal_add_buffer:allocated'alloc 'buffer,size =494.02MB, (1933.69 /147456.00)system_info:n_threads =4 /24 |AVX =0 |AVX2 =0 |AVX512 =0 |AVX512_VBMI =0 |AVX512_VNNI =0 |FMA =0 |NEON =1 |ARM_FMA =1 |F16C =0 |FP16_VA =1 |WASM_SIMD =0 |BLAS =1 |SSE3 =0 |SSSE3 =0 |VSX =0 |sampling:repeat_last_n =64,repeat_penalty =1.100000,presence_penalty =0.000000,frequency_penalty =0.000000,top_k =40,tfs_z =1.000000,top_p =0.950000,typical_p =1.000000,temp =0.800000,mirostat =0,mirostat_lr =0.100000,mirostat_ent =5.000000generate:n_ctx =512,n_batch =512,n_predict =64,n_keep =0Ibelievethemeaningoflifeisfindingtherightpartner.Andthatyoujustdon't know how to find it when you'reyoungormature," she said."Youreallyneedamentorwhowillgiveyouguidanceanddirectioninyourrelationships -whetherit's with friends, family, partners, children."My advice would be to trust yourself enough so not to letllama_print_timings: load time = 167.38 msllama_print_timings: sample time = 319.66 ms / 64 runs ( 4.99 ms per token, 200.21 tokens per second)llama_print_timings: prompt eval time = 21.48 ms / 7 tokens ( 3.07 ms per token, 325.84 tokens per second)llama_print_timings: eval time = 402.63 ms / 63 runs ( 6.39 ms per token, 156.47 tokens per second)llama_print_timings: total time = 786.13 ms |
ggerganov approved these changesOct 10, 2023
Merged
joelkuiper added a commit to vortext/llama.cpp that referenced this pull requestOct 12, 2023
…example* 'master' of github.com:ggerganov/llama.cpp: (34 commits) examples: support LLaVA v1.5 (multimodal model) (ggml-org#3436) docs : fix typo GOMP_CPU_AFFINITY (ggml-org#3597) cmake : fix add_compile_options on macOS typo : it is `--n-gpu-layers` not `--gpu-layers` (ggml-org#3592) ci : check if there is enough VRAM (ggml-org#3596) server : add completion mode (no chat) (ggml-org#3582) prompts : add mnemonics.txt server : fix kv cache management (ggml-org#3588) main : fix session loading bug (ggml-org#3400) server : add parameter -tb N, --threads-batch N (ggml-org#3584) common : fix mirostat state when using multiple sequences (ggml-org#3543) batched : add bench tool (ggml-org#3545) examples : add batched.swift + improve CI for swift (ggml-org#3562) Add MPT model to supported models in README.md (ggml-org#3574) Minor improvements in GPT2 tokenizer (ggml-org#3567) readme : add bloom (ggml-org#3570) llm : add bloom models (ggml-org#3553) swift : improvements and fixes (ggml-org#3564) llm : add MPT support (ggml-org#3417) infill. : fix tokenization (ggml-org#3508) ...
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading.Please reload this page.
This is a follow-up PR, plz seeggml-org/ggml#543
Test Script
./build/bin/main -m models/bloom-1b7.fp16.gguf \ -p"Building a website can be done in 10 simple steps:\nStep 1:" \ -n 100 -e --temp 1.0 --top-k 1 --top-p 1.0 \ --repeat-last-n 0 -s 2023Tested Models
https://huggingface.co/bigscience/bloom-1b7


(fp16)
(q4_1)
https://huggingface.co/bigscience/bloom-3b


(fp16)
(q4_1)
https://huggingface.co/bigscience/bloom-7b1


(fp16)
(q4_1)
https://huggingface.co/Langboat/bloom-1b4-zh


(fp16)
(q4_1)
TODO