rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.
phiresky/ripgrep-all
rga is a line-oriented search tool that allows you to look for a regex in a multitude of file types. rga wraps the awesome ripgrep and enables it to search in pdf, docx, sqlite, jpg, movie subtitles (mkv, mp4), etc.
For more detail, see this introductory blog post: https://phiresky.github.io/blog/2019/rga--ripgrep-for-zip-targz-docx-odt-epub-jpg/
rga will recursively descend into archives and match text in every file type it knows.
Here is an example directory with different file types:
demo/
├── greeting.mkv
├── hello.odt
├── hello.sqlite3
└── somearchive.zip
├── dir
│ ├── greeting.docx
│ └── inner.tar.gz
│ └── greeting.pdf
└── greeting.epub
See the wiki for instructions on integrating rga with fzf.
Linux x64, macOS and Windows binaries are available in GitHub Releases.
pacman -S ripgrep-all
emerge sys-apps/ripgrep-all
nix-env -iA nixpkgs.ripgrep-all
Download the rga binary and get the dependencies like this:
apt install ripgrep pandoc poppler-utils ffmpeg
If ripgrep is not included in your package sources, get it from here.
rga will search for all binaries it calls in $PATH and in the directory the rga binary itself is in.
Note that installing via chocolatey or scoop is the only supported download method. If you download the binary from releases manually, you will not get the dependencies (for example pdftotext from poppler).
If you get an error like VCRUNTIME140.DLL could not be found, you need to install vc_redist.x64.exe.
choco install ripgrep-all
scoop install rga
rga can be installed with Homebrew:
brew install rga
To install the dependencies, none of which is strictly necessary but all of which are very useful:
brew install pandoc poppler ffmpeg
rga can also be installed on macOS via MacPorts:
sudo port install ripgrep-all
rga should compile with stable Rust (v1.75.0+, check with rustc --version). To build it, run the following (or the equivalent in your OS):
~$ apt install build-essential pandoc poppler-utils ffmpeg ripgrep cargo
~$ cargo install --locked ripgrep_all
~$ rga --version # this should work now
rga works with adapters that adapt various file formats. It comes with a few adapters integrated:
rga --rga-list-adapters
You can also add custom adapters. See the wiki for more information.
Adapters:
- pandoc
  Uses pandoc to convert binary/unreadable text documents to plain markdown-like text
  Runs: pandoc --from= --to=plain --wrap=none --markdown-headings=atx
  Extensions: .epub, .odt, .docx, .fb2, .ipynb, .html, .htm
- poppler
  Uses pdftotext (from poppler-utils) to extract plain text from PDF files
  Runs: pdftotext - -
  Extensions: .pdf
  Mime Types: application/pdf
- postprocpagebreaks
  Adds the page number to each line for an input file that specifies page breaks as ascii page break character.
  Mainly to be used internally by the poppler adapter.
  Extensions: .asciipagebreaks
- ffmpeg
  Uses ffmpeg to extract video metadata/chapters, subtitles, lyrics, and other metadata
  Extensions: .mkv, .mp4, .avi, .mp3, .ogg, .flac, .webm
- zip
  Reads a zip file as a stream and recurses down into its contents
  Extensions: .zip, .jar
  Mime Types: application/zip
- decompress
  Reads compressed file as a stream and runs a different extractor on the contents.
  Extensions: .als, .bz2, .gz, .tbz, .tbz2, .tgz, .xz, .zst
  Mime Types: application/gzip, application/x-bzip, application/x-xz, application/zstd
- tar
  Reads a tar file as a stream and recurses down into its contents
  Extensions: .tar
- sqlite
  Uses sqlite bindings to convert sqlite databases into a simple plain text format
  Extensions: .db, .db3, .sqlite, .sqlite3
  Mime Types: application/x-sqlite3
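To make the postprocpagebreaks description more concrete, here is a minimal Python sketch of the idea: text whose pages are separated by the ASCII form feed character gets a page number prefixed to every line. The function name `add_page_numbers` and the exact `Page N: ` prefix format are assumptions made for this illustration, not taken from rga's source.

```python
def add_page_numbers(text: str, prefix: str = "Page {n}: ") -> str:
    """Prefix every line with the number of the page it came from.

    Pages are separated by the ASCII form feed character (0x0C).
    The default prefix format is an assumption for illustration only.
    """
    out = []
    # Splitting on form feed yields one chunk per page, numbered from 1.
    for page_no, page in enumerate(text.split("\x0c"), start=1):
        for line in page.splitlines():
            out.append(prefix.format(n=page_no) + line)
    return "\n".join(out)


sample = "first page\nstill first\x0csecond page"
print(add_page_numbers(sample))
```

With a prefix like this on every line, a line-oriented match can show which page it came from.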
The following adapters are disabled by default, and can be enabled using '--rga-adapters=+foo,bar':
- mail
  Reads mailbox/mail files and runs extractors on the contents and attachments.
  Extensions: .mbox, .mbx, .eml
  Mime Types: application/mbox, message/rfc822
rga [RGA OPTIONS] [RG OPTIONS] PATTERN [PATH ...]
--rga-accurate
Use more accurate but slower matching by mime type
By default, rga will match files using file extensions. Some programs, such as sqlite3, don't care about the file extension at all, so users sometimes use any or no extension at all. With this flag, rga will try to detect the mime type of input files using the magic bytes (similar to the `file` utility), and use that to choose the adapter. Detection is only done on the first 8KiB of the file, since we can't always seek on the input (in archives).
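The following Python sketch illustrates what detection by magic bytes means in general. It is not rga's implementation: the table `MAGIC_BYTES` and the function `sniff_mime` are hypothetical names, and only a handful of well-known signatures are listed. It reads at most the first 8 KiB and never seeks, which is why the approach also works on non-seekable streams such as archive members.

```python
import io

# A few well-known file signatures mapped to mime types named in this README.
MAGIC_BYTES = [
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
    (b"\x1f\x8b", "application/gzip"),
    (b"SQLite format 3\x00", "application/x-sqlite3"),
]

SNIFF_LEN = 8 * 1024  # only the first 8 KiB are inspected


def sniff_mime(stream):
    """Return a mime type guessed from the leading bytes, or None if unknown."""
    head = stream.read(SNIFF_LEN)  # a single forward read, no seeking
    for magic, mime in MAGIC_BYTES:
        if head.startswith(magic):
            return mime
    return None


# A sqlite database stored without any file extension is still recognised.
fake_db = io.BytesIO(b"SQLite format 3\x00" + b"\x00" * 100)
print(sniff_mime(fake_db))
```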
--rga-no-cache
Disable caching of results
By default, rga caches the extracted text, if it is small enough, to a database in ${XDG_CACHE_DIR-~/.cache}/ripgrep-all on Linux, ~/Library/Caches/ripgrep-all on macOS, or C:\Users\username\AppData\Local\ripgrep-all on Windows. This way, repeated searches on the same set of files will be much faster. If you pass this flag, all caching will be disabled.
-h, --help
Prints help information
--rga-list-adapters
List all known adapters
--rga-print-config-schema
Print the JSON Schema of the configuration file
--rg-help
Show help for ripgrep itself
--rg-version
Show version of ripgrep itself
-V, --version
Prints version information
--rga-adapters=<adapters>...
Change which adapters to use and in which priority order (descending)
"foo,bar" means use only adapters foo and bar. "-bar,baz" means use all default adapters except for bar and baz. "+bar,baz" means use all default adapters and also bar and baz.
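The three forms of the adapter list can be modelled with a few lines of Python. This is only a sketch of the rules as described above, not rga's code: `select_adapters` is a hypothetical helper, and where "+" places the extra adapters relative to the defaults (here: after them) is an assumption.

```python
DEFAULT_ADAPTERS = [
    "pandoc", "poppler", "postprocpagebreaks", "ffmpeg",
    "zip", "decompress", "tar", "sqlite",
]
DISABLED_BY_DEFAULT = ["mail"]


def select_adapters(spec, default=DEFAULT_ADAPTERS, disabled=DISABLED_BY_DEFAULT):
    """Resolve an adapter list string into the ordered list of adapters to use."""
    names = [n for n in spec.lstrip("+-").split(",") if n]
    known = set(default) | set(disabled)
    unknown = [n for n in names if n not in known]
    if unknown:
        raise ValueError("unknown adapters: " + ", ".join(unknown))
    if spec.startswith("-"):
        # "-bar,baz": all default adapters except the named ones
        return [a for a in default if a not in names]
    if spec.startswith("+"):
        # "+bar,baz": all default adapters plus the named ones
        # (appending them at the end is an assumption of this sketch)
        return list(default) + [n for n in names if n not in default]
    # "foo,bar": only the named adapters, in the order given
    return names


print(select_adapters("poppler,zip"))
print(select_adapters("-ffmpeg,sqlite"))
print(select_adapters("+mail"))
```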
--rga-cache-compression-level=<compression-level>
ZSTD compression level to apply to adapter outputs before storing in cache db
Ranges from 1 - 22 [default: 12]
--rga-config-file=<config-file-path>
--rga-max-archive-recursion=<max-archive-recursion>
Maximum nestedness of archives to recurse into [default: 5]
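As an illustration of what a recursion limit on nested archives does, the Python sketch below walks zip files held in memory and stops descending once a maximum depth is reached. It is not how rga is implemented: `walk_archive` and `make_zip` are hypothetical helpers, only zip is handled, and counting the outermost archive as depth 1 is an assumption of this sketch.

```python
import io
import zipfile


def make_zip(members):
    """Build an in-memory zip from a {name: bytes} mapping (test helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in members.items():
            zf.writestr(name, content)
    return buf.getvalue()


def walk_archive(data, max_depth=5, depth=1, prefix=""):
    """Yield (path, bytes) for each member, descending into nested zips.

    Nested archives deeper than max_depth are yielded as opaque files
    instead of being opened.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            content = zf.read(name)
            path = prefix + name
            if name.endswith(".zip") and depth < max_depth:
                yield from walk_archive(content, max_depth, depth + 1, path + "/")
            else:
                yield path, content


inner = make_zip({"greeting.txt": b"hello"})
outer = make_zip({"inner.zip": inner, "readme.txt": b"hi"})
for path, content in walk_archive(outer):
    print(path, len(content))
```

A limit like this bounds the work done on pathological inputs such as archives nested many levels deep.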
--rga-cache-max-blob-len=<max-blob-len>
Max compressed size to cache
Longest byte length (after compression) to store in cache. Longer adapter outputs will not be cached and recomputed every time.
Allowed suffixes on command line: k M G [default: 2000000]
--rga-cache-path=<path>
Path to store cache db [default: /home/phire/.cache/ripgrep-all]
-h shows a concise overview, --help shows more detail and advanced options.
All other options not shown here are passed directly to rg, especially [PATTERN] and [PATH ...]
The config file location leverages the mechanisms defined by
- the XDG base directory and the XDG user directory specifications on Linux (ex: ~/.config/ripgrep-all/config.jsonc)
- the Known Folder API on Windows (ex: C:\Users\Alice\AppData\Roaming\ripgrep-all/config.jsonc)
- the Standard Directories guidelines on macOS (ex: ~/Library/Application Support/ripgrep-all/config.jsonc)
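The per-platform locations above can be approximated with the following Python sketch. It only mirrors the example paths given in this section: `config_path` is a hypothetical helper, and using the `XDG_CONFIG_HOME` and `APPDATA` environment variables as stand-ins for the XDG and Known Folder lookups is an assumption of this sketch rather than a description of rga's behavior.

```python
def config_path(platform, env, home):
    """Return the likely config file path for a platform.

    platform: "linux", "windows" or "macos"
    env:      mapping of environment variables
    home:     the user's home directory as a string
    """
    if platform == "linux":
        # XDG base directory spec: $XDG_CONFIG_HOME, falling back to ~/.config
        base = env.get("XDG_CONFIG_HOME") or home + "/.config"
        return base + "/ripgrep-all/config.jsonc"
    if platform == "windows":
        # Roaming application data folder (assumed to match %APPDATA%)
        base = env.get("APPDATA") or home + "\\AppData\\Roaming"
        return base + "\\ripgrep-all\\config.jsonc"
    if platform == "macos":
        return home + "/Library/Application Support/ripgrep-all/config.jsonc"
    raise ValueError("unsupported platform: " + platform)


print(config_path("linux", {}, "/home/alice"))
print(config_path("windows", {}, "C:\\Users\\Alice"))
print(config_path("macos", {}, "/Users/alice"))
```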
To enable debug logging:
export RUST_LOG=debug
export RUST_BACKTRACE=1
Also remember to disable caching with --rga-no-cache or clear the cache (~/Library/Caches/rga on macOS, ~/.cache/rga on other Unixes, or C:\Users\username\AppData\Local\rga on Windows) to debug the adapters.
You can use the provided flake.nix to set up all build- and run-time dependencies:
- Enable Flakes in your Nix configuration.
- Add direnv to your profile: nix profile install nixpkgs#direnv
- cd into the directory where you have cloned this repository.
- Allow use of .envrc: direnv allow
- After the dependencies have been installed, your shell will have all of the necessary development dependencies.