feat: Support bloom models #3553


Merged
ggerganov merged 3 commits into ggml-org:master from xingchensong:xcsong-bloom
Oct 10, 2023

Conversation

@xingchensong (Contributor) commented Oct 9, 2023 (edited)

This is a follow-up PR; please see ggml-org/ggml#543.

Test Script

    ./build/bin/main -m models/bloom-1b7.fp16.gguf \
      -p "Building a website can be done in 10 simple steps:\nStep 1:" \
      -n 100 -e --temp 1.0 --top-k 1 --top-p 1.0 \
      --repeat-last-n 0 -s 2023
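With `--top-k 1`, only the single most likely token survives filtering, so generation should be effectively greedy and reproducible regardless of temperature or seed. The sketch below illustrates this with a generic top-k sampler. `sample_top_k` is a hypothetical helper written for this example and is not llama.cpp's sampler implementation.

```python
import math
import random


def sample_top_k(logits, k, temperature=1.0, rng=None):
    """Generic top-k sampling over a list of logits (illustrative only)."""
    rng = rng or random.Random()
    # Keep the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Temperature-scaled softmax weights over the surviving candidates.
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


# Made-up logits; index 1 holds the largest value.
logits = [0.1, 2.5, -1.0, 2.4]

# With k=1 the choice is the argmax for every seed.
picks = {sample_top_k(logits, k=1, rng=random.Random(seed)) for seed in range(20)}
print(picks)
```

With a larger `k` the result depends on the seed, which is presumably why the test command pins both `--top-k 1` and `-s 2023`.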

Tested Models

TODO

  • PPL test

@xingchensong (Contributor, Author)

PPL Test

Test script for torch fp32

    import argparse

    import torch
    from transformers import AutoConfig, BloomForCausalLM, BloomTokenizerFast


    def calculate_ppl(
        device,
        model,
        tokenizer,
        sentence: str,
        max_length: int = 100,
        stride: int = 50,
    ) -> None:
        """Print the running perplexity after each window of max_length + stride // 2 tokens."""
        sentence_ids = tokenizer.encode(sentence)  # do not add bos_token_id
        print(sentence_ids)
        seq_len = len(sentence_ids)
        nlls = []
        for begin_loc in range(0, seq_len, stride):
            end_loc = min(begin_loc + max_length + stride // 2, seq_len)
            # Skip the trailing window if it is shorter than a full window.
            if (end_loc - begin_loc) != (max_length + stride // 2):
                break
            input_ids = sentence_ids[begin_loc:end_loc]
            input_ids = torch.tensor([input_ids])
            input_ids = input_ids.to(device)
            target_ids = input_ids.clone()
            # Score only the last `stride` tokens; the rest serve as context.
            target_ids[:, :-stride] = -100
            with torch.no_grad():
                outputs = model(input_ids, labels=target_ids)
                neg_log_likelihood = outputs.loss  # mean NLL over the scored tokens
            nlls.append(neg_log_likelihood)
            if end_loc == seq_len:
                break
        # Running sum of NLL over all scored tokens, and the matching token counts.
        ggml_nlls = torch.cumsum(torch.stack(nlls) * stride, dim=0)
        count = torch.arange(stride, len(nlls) * stride + stride, stride)
        chunk_ppls = torch.exp(ggml_nlls / count).cpu().tolist()
        for i, ppl in enumerate(chunk_ppls):
            print("[{}] {}".format(i + 1, ppl))


    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument("--model_name_or_path", type=str, default="")
        args = parser.parse_args()
        device = torch.device("cpu")
        tokenizer = BloomTokenizerFast.from_pretrained(args.model_name_or_path)
        model_config = AutoConfig.from_pretrained(
            args.model_name_or_path, trust_remote_code=True
        )
        model = BloomForCausalLM.from_pretrained(
            args.model_name_or_path,
            torch_dtype=torch.float16,
            config=model_config,
            device_map="auto",
        )
        model.to(device).float()  # type: ignore
        model.eval()  # type: ignore
        # The test passage is one string, split here into adjacent literals.
        sens = [
            "About six million children are reported to child protection agencies in America each year. About 400,000 of those children are placed in protective custody because of severe neglect or abuse. About 500,000 children are placed into foster care and adoptive placements. Abused and neglected children are all around us. These children are invisible in our community, yet each one of us is directly responsible for their plight. "
            "They live under our laws; they go to our schools; they are convicted by our courts; many of them spend lifetimes in our prisons. They have no say in the laws and policies that rule their lives. Just like they had no say in the neglect and abuse that was their childhood. Neglected and abused children make up a great majority of the crime, drugs, and violence we experience in our communities. Over fifty percent of the children in the juvenile justice system have diagnosable mental illness, about thirty percent of children in child protection services are proscribed psychotropic medications, & almost eighty percent of youth aging out of foster care lead dysfunctional lives. Ninety percent of the juveniles in the Juvenile Justice System have come out of the Child Protection System (Minnesota’s Chief Justice, Kathleen Blatz). Over 90 percent of the adults in the Criminal Justice System come out of the Juvenile Justice System. Justice Blatz (and others) call it a prison “feeder” system. The United States is the only nation in the world to build prisons based on failed third grade reading scores or the number of children in Child Protection. Children are not aware of the rightness or wrongness of their own abuse. They do not know that abuse is abnormal, or even that it is wrong. To a five-year-old, no matter how painful and frightening her life is, her life is normal. A sad and lasting fact of child abuse is that children blame themselves for the abuse they receive. How can sex, drugs, and violence be unlearned by a ten year old child whose entire life has been just that? It takes years of therapy to change a child’s perception of an abusive past. It takes a great deal longer for an abused child to develop a healthy view of the world and a positive self-image. There is no book a child can go to, or code they are born with, that explains the abnormality of what is happening to them. "
            "Children can’t call their senators, or complain to the authorities (they can’t even tell their parents). Behaviors learned by abused children to stay alive in toxic homes are terribly counter-productive once the child is out of the abusive circumstances and trying to live a normal life. The behaviors developed for staying alive and avoiding pain dominate and thus can become significant detriments to getting along in society. As a matter of fact, for many troubled youth, their explosive responses and pain avoidance behaviors define them as uneducated social misfits with criminal histories."
        ]
        for _, sen in enumerate(sens):
            calculate_ppl(device, model, tokenizer, sen, max_length=100, stride=50)
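The windowing and the running-perplexity arithmetic in the script above can be checked without loading a model. The following is a minimal pure-Python sketch under that assumption. `window_bounds` and `running_ppl` are hypothetical helper names, and the NLL values are made up for illustration.

```python
import math


def window_bounds(seq_len, max_length=100, stride=50):
    """Return (begin, end) token ranges for full-size windows only."""
    window = max_length + stride // 2
    bounds = []
    for begin in range(0, seq_len, stride):
        end = min(begin + window, seq_len)
        if end - begin != window:
            break  # a short trailing window is dropped, as in the script
        bounds.append((begin, end))
        if end == seq_len:
            break
    return bounds


def running_ppl(chunk_mean_nlls, stride=50):
    """Running perplexity: exp(total NLL so far / tokens scored so far)."""
    total = 0.0
    result = []
    for i, nll in enumerate(chunk_mean_nlls, start=1):
        total += nll * stride  # each chunk scores `stride` tokens
        result.append(math.exp(total / (i * stride)))
    return result


print(window_bounds(300))
print(running_ppl([2.0, 3.0]))  # made-up mean NLLs for two chunks
```

Because every chunk scores the same number of tokens, each reported value is the exponential of the average of the per-chunk mean NLLs seen so far, so later chunks in the results are cumulative rather than independent.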

Test script for ggml fp16/q4_1

    ./build/bin/perplexity -m models/bloom-1b7.fp16.gguf \
      -p "About six million children are reported to child protection agencies in America each year. About 400,000 of those children are placed in protective custody because of severe neglect or abuse. About 500,000 children are placed into foster care and adoptive placements. Abused and neglected children are all around us. These children are invisible in our community, yet each one of us is directly responsible for their plight. They live under our laws; they go to our schools; they are convicted by our courts; many of them spend lifetimes in our prisons. They have no say in the laws and policies that rule their lives. Just like they had no say in the neglect and abuse that was their childhood. Neglected and abused children make up a great majority of the crime, drugs, and violence we experience in our communities. Over fifty percent of the children in the juvenile justice system have diagnosable mental illness, about thirty percent of children in child protection services are proscribed psychotropic medications, & almost eighty percent of youth aging out of foster care lead dysfunctional lives. Ninety percent of the juveniles in the Juvenile Justice System have come out of the Child Protection System (Minnesota’s Chief Justice, Kathleen Blatz). Over 90 percent of the adults in the Criminal Justice System come out of the Juvenile Justice System. Justice Blatz (and others) call it a prison “feeder” system. The United States is the only nation in the world to build prisons based on failed third grade reading scores or the number of children in Child Protection. Children are not aware of the rightness or wrongness of their own abuse. They do not know that abuse is abnormal, or even that it is wrong. To a five-year-old, no matter how painful and frightening her life is, her life is normal. A sad and lasting fact of child abuse is that children blame themselves for the abuse they receive. How can sex, drugs, and violence be unlearned by a ten year old child whose entire life has been just that? It takes years of therapy to change a child’s perception of an abusive past. It takes a great deal longer for an abused child to develop a healthy view of the world and a positive self-image. There is no book a child can go to, or code they are born with, that explains the abnormality of what is happening to them. Children can’t call their senators, or complain to the authorities (they can’t even tell their parents). Behaviors learned by abused children to stay alive in toxic homes are terribly counter-productive once the child is out of the abusive circumstances and trying to live a normal life. The behaviors developed for staying alive and avoiding pain dominate and thus can become significant detriments to getting along in society. As a matter of fact, for many troubled youth, their explosive responses and pain avoidance behaviors define them as uneducated social misfits with criminal histories." \
      --ppl-stride 50 -c 100 -b 512 -s 2023

Results

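| Model | chunk-0 | chunk-1 | chunk-2 | chunk-3 | chunk-4 | chunk-5 | chunk-6 | chunk-7 | chunk-8 | chunk-9 |
|---|---|---|---|---|---|---|---|---|---|---|
| torch fp32 | 11.9603 | 11.6657 | 14.0454 | 14.4815 | 15.9778 | 16.6827 | 16.0810 | 15.7785 | 16.6674 | 17.1874 |
| ggml fp16 (4.2GB) | 11.9615 | 11.6673 | 14.0466 | 14.4828 | 15.9786 | 16.9979 | 16.1121 | 15.9386 | 17.3866 | 17.9251 |
| ggml q4_1 (1.5GB) | 12.4996 | 12.3940 | 14.6400 | 15.3892 | 16.9743 | 18.2565 | 17.1820 | 17.0910 | 18.4840 | 18.8846 |

| Model | chunk-0 | chunk-1 | chunk-2 | chunk-3 | chunk-4 | chunk-5 | chunk-6 | chunk-7 | chunk-8 | chunk-9 |
|---|---|---|---|---|---|---|---|---|---|---|
| torch fp32 | 24.8323 | 26.7774 | 30.0789 | 39.0886 | 41.8020 | 40.5517 | 37.7743 | 38.0244 | 38.2512 | 39.4991 |
| ggml fp16 (2.7GB) | 24.8268 | 26.7805 | 30.0809 | 39.0914 | 41.8034 | 40.5512 | 37.7735 | 38.0241 | 38.2522 | 39.5002 |
| ggml q4_1 (855MB) | 26.0862 | 28.5914 | 31.6534 | 40.4253 | 43.3038 | 42.3877 | 39.7752 | 40.3762 | 40.6543 | 41.8351 |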
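To make the comparison easier to read, the per-chunk gap to the torch fp32 reference can be expressed as a percentage. The sketch below does this for the first results table only. The numbers are copied from that table, and `relative_gap_percent` is a hypothetical helper name.

```python
# Perplexities copied from the first results table (chunk-0 .. chunk-9).
torch_fp32 = [11.9603, 11.6657, 14.0454, 14.4815, 15.9778,
              16.6827, 16.0810, 15.7785, 16.6674, 17.1874]
ggml_fp16 = [11.9615, 11.6673, 14.0466, 14.4828, 15.9786,
             16.9979, 16.1121, 15.9386, 17.3866, 17.9251]
ggml_q4_1 = [12.4996, 12.3940, 14.6400, 15.3892, 16.9743,
             18.2565, 17.1820, 17.0910, 18.4840, 18.8846]


def relative_gap_percent(reference, candidate):
    """Per-chunk difference from the reference, in percent of the reference."""
    return [100.0 * (c - r) / r for r, c in zip(reference, candidate)]


fp16_gap = relative_gap_percent(torch_fp32, ggml_fp16)
q4_1_gap = relative_gap_percent(torch_fp32, ggml_q4_1)

for i, (a, b) in enumerate(zip(fp16_gap, q4_1_gap)):
    print(f"chunk-{i}: fp16 {a:+.3f}%  q4_1 {b:+.3f}%")
```

For this table, the fp16 values stay within about 0.02% of the reference for chunk-0 through chunk-4 and drift further from chunk-5 onward, while q4_1 is higher throughout, as expected for 4-bit quantization.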

@xingchensong (Contributor, Author)

PPL results look good to me. I think this PR is ready for a final review :)) @ggerganov


@ggerganov (Member)

Nice job. This still lacks tensor offloading for GPU support, but we can fix that later.
I'll review and merge this PR after #3417 is merged.


ggerganov added the "model" (Model specific) and "need feedback" (Testing and feedback with results are needed) labels on Oct 9, 2023
@ggerganov (Member) commented Oct 10, 2023 (edited)

Tested on M2 Ultra using Metal - seems to work as expected:

    ./main -m ./models/bloom-1b/ggml-model-f16.gguf -p "I believe the meaning of life is" --ignore-eos -n 64 -t 4 -ngl 1 -s 1

    llama_new_context_with_model: compute buffer total size = 500.13 MB
    llama_new_context_with_model: max tensor size = 980.00 MB
    ggml_metal_add_buffer: allocated 'data            ' buffer, size = 4279.47 MB, (4280.09 / 147456.00)
    ggml_metal_add_buffer: allocated 'kv              ' buffer, size = 98.00 MB, (4378.09 / 147456.00)
    ggml_metal_add_buffer: allocated 'alloc           ' buffer, size = 494.02 MB, (4872.11 / 147456.00)

    system_info: n_threads = 4 / 24 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
    sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
    generate: n_ctx = 512, n_batch = 512, n_predict = 64, n_keep = 0

    I believe the meaning of life is determined not by an individual's physical, spiritual or mental well-being but rather their place in a more meaningful context. The term holistic wellbeing was first coined to describe the concept that people should be healthy and happy as individuals without being forced into healthcare programs (Barnett & Jones, 2006) . To achieve this

    llama_print_timings:        load time =     216.92 ms
    llama_print_timings:      sample time =     330.09 ms /    64 runs   (    5.16 ms per token,   193.89 tokens per second)
    llama_print_timings: prompt eval time =      20.13 ms /     7 tokens (    2.88 ms per token,   347.81 tokens per second)
    llama_print_timings:        eval time =     609.28 ms /    63 runs   (    9.67 ms per token,   103.40 tokens per second)
    llama_print_timings:       total time =    1002.27 ms

    //////////////////

    ./main -m ./models/bloom-1b/ggml-model-q4_0.gguf -p "I believe the meaning of life is" --ignore-eos -n 64 -t 4 -ngl 1 -s 1

    llama_new_context_with_model: compute buffer total size = 500.13 MB
    llama_new_context_with_model: max tensor size = 401.95 MB
    ggml_metal_add_buffer: allocated 'data            ' buffer, size = 1341.05 MB, (1341.67 / 147456.00)
    ggml_metal_add_buffer: allocated 'kv              ' buffer, size = 98.00 MB, (1439.67 / 147456.00)
    ggml_metal_add_buffer: allocated 'alloc           ' buffer, size = 494.02 MB, (1933.69 / 147456.00)

    system_info: n_threads = 4 / 24 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
    sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
    generate: n_ctx = 512, n_batch = 512, n_predict = 64, n_keep = 0

    I believe the meaning of life is finding the right partner. And that you just don't know how to find it when you're young or mature," she said. "You really need a mentor who will give you guidance and direction in your relationships - whether it's with friends, family, partners, children." My advice would be to trust yourself enough so not to let

    llama_print_timings:        load time =     167.38 ms
    llama_print_timings:      sample time =     319.66 ms /    64 runs   (    4.99 ms per token,   200.21 tokens per second)
    llama_print_timings: prompt eval time =      21.48 ms /     7 tokens (    3.07 ms per token,   325.84 tokens per second)
    llama_print_timings:        eval time =     402.63 ms /    63 runs   (    6.39 ms per token,   156.47 tokens per second)
    llama_print_timings:       total time =     786.13 ms
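The per-token figures in the timing lines follow directly from the total time and the run count, which also gives a quick way to compare the two runs. A small sketch using the eval-time numbers from the logs above; `per_token_stats` is a hypothetical helper name.

```python
def per_token_stats(total_ms, n_tokens):
    """Return (ms per token, tokens per second) for a timing line."""
    ms_per_token = total_ms / n_tokens
    tokens_per_second = 1000.0 / ms_per_token
    return ms_per_token, tokens_per_second


# Eval time and run counts taken from the two logs above.
f16_ms, f16_tps = per_token_stats(609.28, 63)
q4_0_ms, q4_0_tps = per_token_stats(402.63, 63)

print(f"f16 : {f16_ms:.2f} ms/token, {f16_tps:.2f} tokens/s")
print(f"q4_0: {q4_0_ms:.2f} ms/token, {q4_0_tps:.2f} tokens/s")
print(f"q4_0 speedup over f16: {q4_0_tps / f16_tps:.2f}x")
```

On this single M2 Ultra run, q4_0 generation comes out roughly 1.5 times faster than f16. One run per model is too little to treat that ratio as a benchmark.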

ggerganov merged commit 02d2875 into ggml-org:master on Oct 10, 2023
xingchensong deleted the xcsong-bloom branch on October 10, 2023 14:50
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request on Oct 12, 2023
Reviewers

ggerganov approved these changes
