refactor: consistent cloning & pattern-handling #388
base:main
7ab4af2 to 831a36d
cf1aa6f to d1224a6
979c88a to 3031e7c
LGTM
    if not await check_repo_exists(url, token=token):
        msg = "Repository not found. Make sure it is public or that you have provided a valid token."
        raise ValueError(msg)
    commit = await resolve_commit(config, token=token)
Why not populate the CloneConfig directly?
My reasoning was that we should be able to reuse it afterwards when manipulating the object.
TODO:
Add `_handle_remove_readonly` on-error callback for `shutil.rmtree` that removes the read-only attribute and retries, preventing WinError 5 during temp-repo cleanup in tests.
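A minimal sketch of what such a callback might look like. The function body and the `remove_tree` wrapper are assumptions for illustration, not the code from this PR. The wrapper picks `onexc` on Python 3.12+ and `onerror` on older versions, since `onerror` is deprecated from 3.12 onwards.

```python
import os
import shutil
import stat
import sys
import tempfile
from pathlib import Path


def _handle_remove_readonly(func, path, exc_info):
    """Clear the read-only bit on `path`, then retry the operation that failed.

    `func` is the os-level function that raised (for example os.unlink),
    and `exc_info` is whatever shutil passes along; it is unused here.
    """
    os.chmod(path, stat.S_IWRITE)
    func(path)


def remove_tree(path):
    """Hypothetical wrapper: delete `path`, retrying read-only entries."""
    if sys.version_info >= (3, 12):
        shutil.rmtree(path, onexc=_handle_remove_readonly)
    else:
        shutil.rmtree(path, onerror=_handle_remove_readonly)


if __name__ == "__main__":
    # Simulate a read-only Git object inside a temporary clone directory.
    tmp = tempfile.mkdtemp()
    obj = Path(tmp) / "pack-1234.idx"
    obj.write_text("dummy")
    os.chmod(obj, stat.S_IREAD)
    remove_tree(tmp)
    print(os.path.exists(tmp))
```

On POSIX systems a read-only file in a writable directory can usually be deleted without the callback firing; the retry path matters mainly on Windows, where deleting a read-only file raises a permission error.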
fix blob/tag added test
f98250f to 8438492
✨ Refactor: consistent cloning & pattern-handling
Why
- … → Caching and reproducibility suffered.
- … `query_parser` and duplicated ignore/include logic. → Hard to unit-test and reuse.
- … packed-object files. → Shallow-clone cleanup broke with WinError 5.
What’s new
- `utils.git_utils.resolve_commit()` guarantees we always fetch the exact SHA (HEAD / branch / tag) before checkout. Deterministic → enables caching (feat: Implement caching on a per-commit basis #343).
- `utils.pattern_utils.process_patterns()` centralises include/exclude parsing (moved out of `query_parser`). Adds thorough tests.
- `_handle_remove_readonly` on-error callback makes read-only Git objects writable and retries `shutil.rmtree()`, preventing WinError 5 in CI.
- `parse_query()` removed in favour of `parse_remote_repo()` (URLs/slugs) and `parse_local_dir_path()` (local paths).
- `clone_repo`, `ingest_query`, `parse_query` are no longer re-exported from `gitingest.__init__`.
- … (optional) → fetch commit → checkout → submodule update (optional).
- `_checkout_partial_clone` renamed & moved → `git_utils.checkout_partial_clone`.
- `query_processor` uses new pattern utilities and passes a typed enum.
- `IngestRequest.validate_input_text()` now removes any `.git` suffix.
- … `_is_safe_symlink`
- … `_is_valid_pattern`, `InvalidPatternError`, and `test_parse_patterns_invalid_characters`
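
To illustrate the idea behind resolving HEAD, a branch, or a tag to an exact SHA before checkout, here is a rough sketch built on `git ls-remote`. It is not the PR's `resolve_commit()` implementation: the function names, the lookup order, and the choice of `git ls-remote` are all assumptions made for this example.

```python
import subprocess
from typing import Optional


def sha_from_ls_remote(output: str, ref: str = "HEAD") -> Optional[str]:
    """Pick the SHA for `ref` out of `git ls-remote` output.

    Each output line has the form "<sha>\t<refname>". Annotated tags also
    appear with a "^{}" suffix that points at the underlying commit.
    """
    refs = {}
    for line in output.splitlines():
        sha, _, name = line.partition("\t")
        if name:
            refs[name.strip()] = sha.strip()

    # Assumed lookup order: exact name, peeled tag, tag, then branch.
    candidates = (
        ref,
        f"refs/tags/{ref}^{{}}",
        f"refs/tags/{ref}",
        f"refs/heads/{ref}",
    )
    for name in candidates:
        if name in refs:
            return refs[name]
    return None


def resolve_remote_sha(url: str, ref: str = "HEAD") -> Optional[str]:
    """Hypothetical helper: ask the remote for its refs and resolve `ref`."""
    result = subprocess.run(
        ["git", "ls-remote", url],
        capture_output=True,
        text=True,
        check=True,
    )
    return sha_from_ls_remote(result.stdout, ref)
```

Once the SHA is known, the fetch and checkout steps can target that commit directly, which is what makes the result usable as a cache key.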
File changes
gitingest
- `__init__.py`: … `clone_repo`, `ingest_query`, `parse_query`)
- `clone.py`: … `clone_repo` to call `resolve_commit`, sparse checkout, and uniform fetch/checkout steps; move helper & rename
- `entrypoint.py`: … `_handle_remove_readonly` callback for Windows temp-dir cleanup; `ingest_async` now uses `parse_remote_repo` / `parse_local_dir_path`
- `output_formatter.py`: … `Tag` prefix in `_create_summary_prefix`
- `query_parser.py`: … `parse_query`; move pattern helpers to `pattern_utils.py`
- `utils/exceptions.py`: … `InvalidPatternError`
- `utils/git_utils.py`: … `resolve_commit` helper
- `utils/os_utils.py`: … `ensure_directory` → `ensure_directory_exists_or_create`
- `utils/pattern_utils.py`: … `process_patterns`
- `utils/query_parser_utils.py`: … `_is_valid_pattern`
- `path_utils.py`: (removed) `_is_safe_symlink`
server
- `models.py`: … `.git` suffix in `validate_input_text`
- `query_processor.py`: … `parse_query` with `parse_remote_repo`, integrate `PatternType`, use `process_patterns`
- `routers_utils.py`: … `pattern_type` into `PatternType` enum
tests
- `conftest.py`
- `query_parser/test_git_host_agnostic.py`: … `parse_remote_repo`
- `query_parser/test_query_parser.py`: … `parse_remote_repo`; add `parse_local_dir_path` coverage
- `test_clone.py`
- `test_pattern_utils.py`: … `_parse_patterns` and `process_patterns`; remove `test_parse_patterns_invalid_characters` (pattern validation no longer enforced)
- `test_summary.py`: … `gitingest.ingest()` emits correct summaries
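
The server-side changes listed above (a typed `PatternType` enum and stripping a trailing `.git`) could look roughly like the sketch below. The enum member names and values, the helper names `strip_git_suffix`, `parse_patterns`, and `split_patterns`, and the comma/whitespace separator rule are all assumptions for illustration and are not taken from the PR.

```python
import re
from enum import Enum


class PatternType(str, Enum):
    # Member names and values are assumed for this sketch.
    INCLUDE = "include"
    EXCLUDE = "exclude"


def strip_git_suffix(text: str) -> str:
    """Trim whitespace and drop one trailing ".git", if present."""
    return text.strip().removesuffix(".git")


def parse_patterns(raw: str) -> set[str]:
    """Split a comma- or whitespace-separated pattern string into a set."""
    return {p for p in re.split(r"[,\s]+", raw.strip()) if p}


def split_patterns(pattern_type: str, raw: str) -> tuple[set[str], set[str]]:
    """Return (include, exclude) sets for a raw form value.

    Converting to the enum first means an unknown `pattern_type`
    raises ValueError instead of being silently ignored.
    """
    kind = PatternType(pattern_type)
    patterns = parse_patterns(raw)
    if kind is PatternType.INCLUDE:
        return patterns, set()
    return set(), patterns
```

Converting the raw string to an enum at the router boundary means downstream code can compare against `PatternType` members rather than bare strings.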