saschaschramm/github-copilot
This is an analysis of the GitHub Copilot extension for Visual Studio Code.
Under macOS, the VS Code extensions are located in the following directory:
`~/.vscode/extensions`
Analysis of version 1.92.177
For an analysis of Copilot Chat, see README_COPILOT_CHAT.md.
The GitHub Copilot extension generates three types of prompts.
We start with the simplest case, with only one file, `file1.py`.
- filename: `file1.py`
- file content: `# Print hello, world`

If the user presses enter after `# Print hello, world`, the extension generates the following prompt:

    # Path: file1.py
    # Print hello, world

The path to the file is part of the prompt.
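The single-file case can be sketched in a few lines of Python. This is only an illustration of the prompt shape shown above, not the extension's code: the function name `build_prompt` is hypothetical, and the `#` comment marker is assumed because the example file is Python.

```python
def build_prompt(relative_path: str, prefix: str) -> str:
    """Prepend a path comment to the text before the cursor.

    Hypothetical helper: it only reproduces the prompt layout shown
    above for a Python file.
    """
    return f"# Path: {relative_path}\n{prefix}"


prompt = build_prompt("file1.py", "# Print hello, world")
print(prompt)
```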
Now let's consider a slightly more complex two-file case where the file `file2.py` is edited.
- filename: `file1.py`
- file content: `# Print hello, world`
- filename: `file2.py`
- file content: `# Print he`

In this case, the extension generates the following prompt:

    # Path: file2.py
    # Compare this snippet from file1.py:
    # # Print hello, world
    # Print he

Files with similar content are also included in the prompt.
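A minimal sketch of how such a prompt could be assembled is shown below. The helper names `comment_out` and `build_prompt_with_snippet` are hypothetical, and the sketch deliberately leaves out how the extension decides which files count as similar, since that is not covered here.

```python
def comment_out(text: str, marker: str = "#") -> str:
    """Turn every line of a snippet into a comment line."""
    return "\n".join(f"{marker} {line}" for line in text.splitlines())


def build_prompt_with_snippet(
    path: str, prefix: str, snippet_path: str, snippet: str
) -> str:
    """Assemble path header, commented-out snippet, and the edited text.

    Hypothetical helper that only mirrors the layout of the prompt
    shown above.
    """
    header = f"# Path: {path}"
    intro = f"# Compare this snippet from {snippet_path}:"
    return "\n".join([header, intro, comment_out(snippet), prefix])


prompt = build_prompt_with_snippet(
    "file2.py", "# Print he", "file1.py", "# Print hello, world"
)
print(prompt)
```

Because the snippet is itself a comment, commenting it out yields the doubled marker `# # Print hello, world` seen in the prompt.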
Copilot supports Fill in the Middle. That means the extension sends the code before and after the cursor position to the model.
- filename: `file3.py`
- file content: `# Test prefix\n# Test suffix`
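Splitting a document at the cursor into a prefix and a suffix can be sketched as follows, using the `file3.py` contents above. The function name `split_at_cursor` is hypothetical, and dropping the newline at the start of the suffix is an assumption made so the output matches this example.

```python
def split_at_cursor(relative_path: str, source: str, offset: int) -> tuple[str, str]:
    """Split a document into a prompt prefix and suffix at the cursor offset.

    Assumptions (not taken from the extension's code): the path header
    is added to the prefix only, and leading newlines are stripped from
    the suffix.
    """
    prefix = f"# Path: {relative_path}\n{source[:offset]}"
    suffix = source[offset:].lstrip("\n")
    return prefix, suffix


source = "# Test prefix\n# Test suffix"
offset = len("# Test prefix")  # cursor at the end of the first line
prefix, suffix = split_at_cursor("file3.py", source, offset)
print(prefix)
print(suffix)
```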
If the user presses enter after `# Test prefix`, the extension generates the prefix

    # Path: file3.py
    # Test prefix

and the suffix

    # Test suffix

To generate a completion, the extension sends a POST request to the endpoint https://copilot-proxy.githubusercontent.com/v1/engines/copilot-codex/completions.
After sending the request, the endpoint returns the following response.
The GitHub Copilot extension sends telemetry data to the endpoint https://dc.services.visualstudio.com.
The extension contains two vocabulary files:
| Filename | Vocabulary Size | Comment |
|---|---|---|
| vocab_cushman001.bpe | 50,276 | This vocabulary is based on the GPT-2 vocabulary |
| vocab_cushman002.bpe | 100,000 | This vocabulary is new and no longer based on the GPT-2 vocabulary |
The length of the prompt has to be >= 10 characters before the prompt is sent to the model.
    if ((_ > 0 ? n.length : d) < t.MIN_PROMPT_CHARS) return t._contextTooShort;
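The length check can be illustrated with a simplified Python sketch. It only models the threshold of 10 characters stated above; the minified condition also chooses between two length values (`n.length` and `d`), whose meaning is not clear from the snippet, so that part is left out. The function name `is_context_too_short` is hypothetical.

```python
MIN_PROMPT_CHARS = 10  # threshold stated in the text above


def is_context_too_short(prompt_text: str) -> bool:
    """Return True when the prompt is too short to be sent to the model.

    Simplified assumption: the length that is checked is the number of
    characters in the prompt text.
    """
    return len(prompt_text) < MIN_PROMPT_CHARS


print(is_context_too_short("# Print"))     # 7 characters
print(is_context_too_short("# Print he"))  # exactly 10 characters
```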
The following information is collected about the file being edited:
    const m = {
      uri: d.toString(), // The absolute path of the file
      source: t, // Content of the file
      offset: n, // The offset of the cursor
      relativePath: u, // The relative path of the file
      languageId: p // The programming language of the file
    }
The extension remembers the files that have been accessed before. The function `getNeighborFiles` calls the function `truncateDocs`. The input of `truncateDocs` is the list of files sorted by access time.
When the combined size of all files exceeds 200,000, any additional files will be disregarded. The function `truncateDocs` returns a truncated list of files.
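The truncation rule can be sketched as below. This is a guess at the behavior described above, not a port of `truncateDocs`: the `Doc` class is hypothetical, size is assumed to be the character count of each file, and the file that pushes the total over the limit is assumed to be kept while later files are dropped.

```python
from dataclasses import dataclass

MAX_TOTAL_SIZE = 200_000  # limit from the text; the unit is assumed to be characters


@dataclass
class Doc:
    """Hypothetical stand-in for a remembered file."""
    relative_path: str
    source: str


def truncate_docs(docs_by_access_time: list[Doc], limit: int = MAX_TOTAL_SIZE) -> list[Doc]:
    """Keep files in the given order until the combined size exceeds the limit.

    Assumption: the file that crosses the limit is still kept, and every
    file after it is disregarded.
    """
    kept: list[Doc] = []
    total = 0
    for doc in docs_by_access_time:
        kept.append(doc)
        total += len(doc.source)
        if total > limit:
            break
    return kept


docs = [
    Doc("a.py", "x" * 150_000),
    Doc("b.py", "y" * 100_000),
    Doc("c.py", "z"),
]
print([d.relative_path for d in truncate_docs(docs)])
```

In this example the combined size passes 200,000 after `b.py`, so `c.py` is not returned.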
We have evaluated the Copilot model cushman-ml with the HumanEval dataset. Out of 164 programming problems, the model can solve 56.10%.
| Model name | Pass@1 | Date | Comment |
|---|---|---|---|
| code-cushman-001 | 32.93% | 2022-10-23 | https://openai.com/api/ |
| code-davinci-002 | 46.95% | 2022-10-23 | https://openai.com/api/ |
| cushman-ml | 56.10% | 2022-10-23 | Copilot |
Completions of the evaluation run: 2022-10-23-samples-cushman-ml.jsonl
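The percentages in the table can be converted back into problem counts. The sketch below assumes one completion per problem, so that Pass@1 is simply solved problems divided by 164; the helper names `solved_count` and `pass_at_1` are illustrative.

```python
TOTAL_PROBLEMS = 164  # size of the HumanEval dataset, as stated above

# Pass@1 values copied from the table, in percent.
reported = {
    "code-cushman-001": 32.93,
    "code-davinci-002": 46.95,
    "cushman-ml": 56.10,
}


def solved_count(pass_at_1_percent: float, total: int = TOTAL_PROBLEMS) -> int:
    """Recover the number of solved problems from a Pass@1 percentage."""
    return round(pass_at_1_percent / 100 * total)


def pass_at_1(solved: int, total: int = TOTAL_PROBLEMS) -> float:
    """Pass@1 in percent, assuming one sample per problem."""
    return round(100 * solved / total, 2)


for model, percent in reported.items():
    print(model, solved_count(percent), pass_at_1(solved_count(percent)))
```

Under this assumption the three rows correspond to 54, 77, and 92 solved problems.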