- Notifications
You must be signed in to change notification settings - Fork21
SQL Syntax without any database
License
Puchaczov/Musoq
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
Stop writing throwaway scripts. Use SQL instead.
Every developer has been there: you need to process some files, check git history, or transform data. Do you write a bash one-liner that breaks on edge cases? A Python script you'll delete in 5 minutes?
With Musoq, just write a query.
# Install CLI# Follow instructions in [CLI repository](https://github.com/Puchaczov/Musoq.CLI)# Generate some GUIDsMusoq run query"select NewId() from #system.range(1, 5)"# Find your largest filesMusoq run query"select Name, Length from #os.files('/home/user', true) where Length > 1000000 order by Length desc take 10"# Check git commits this monthMusoq run query"select Count(1) from #git.repository('.') r cross apply r.Commits c where c.CommittedWhen > '2024-11-01'"
Instead of this bash nightmare:
find. -name"*.js" -exec wc -l {}\;| awk'{sum+=$1} END {print sum}'
Write this:
selectSum(Length(f.GetFileContent()))as TotalLinesfrom#os.files('.', true) fwheref.Extension='.js'
Instead of throwaway Python:
importos,subprocess,re# 15 lines of file handling, regex, and loops
Write this:
selectf.Name,f.Directoryfrom#os.files('/project', true) fwheref.GetFileContent() rlike'TODO.*urgent'
- Quick utilities: Generate data, check file sizes, count lines
- File system queries: Find files, compare directories, analyze disk usage
- Git insights: Who changed what, commit patterns, file history
- Code analysis: Search patterns, extract metrics, find dependencies
- Data transformation: Convert between formats, clean up data
- System administration: Process queries, log analysis, monitoring
All with the declarative power of SQL instead of imperative loops and conditionals.
-- Generate test dataselect'User'+ ToString(Value)as Username, NewId()as UserIdfrom#system.range(1, 100)-- Find duplicate files by nameselect Name,Count(1)as Duplicatesfrom#os.files('/downloads', true)group by NamehavingCount(Name)>1-- Disk space by file typeselect Extension,Sum(Length)/1024/1024as SizeMB,Count(1)as FileCountfrom#os.files('/project', true)group by Extensionorder bySum(Length)/1024/1024desc
-- Git contributors last monthselectc.Author,Count(1)as Commitsfrom#git.repository('.') rcross applyr.Commits cwherec.CommittedWhen> SubtractDateTimeOffsets(GetDate(), FromString('30.00:00:00'))group byc.Authororder byc.Commitsdesc-- Find TODOs in codebaseselectfiles.Name,f.LineNumber,f.Linefrom#os.files('./src', true) filescross apply#flat.file(files.FullName) fwherefiles.Extensionin ('.cs','.js','.py')andf.Linelike'%TODO%'-- Complex C# classes (requires solution analysis)selectc.Name,c.MethodsCount,c.LinesOfCodefrom#csharp.solution('./MyProject.sln') scross applys.Projects pcross applyp.Documents dcross applyd.Classes cwherec.MethodsCount>20order byc.LinesOfCodedesc
-- Transform CSV dataselect Category,Sum(ToDecimal(Amount))as Totalfrom#separatedvalues.csv('./sales.csv', true, 0)group by Category-- Extract structured data from text using AIselectt.ProductName,t.Price,t.Descriptionfrom#stdin.text('OpenAI', 'gpt-4o') twhere ToDecimal(t.Price)>100
- No context switching: Stay in SQL instead of jumping between bash/Python/tools
- Declarative: Say what you want, not how to get it
- Composable: Complex queries from simple building blocks
- Familiar: You probably already know SQL
- Fast: No need to write, debug, and maintain scripts
- Powerful: Joins, aggregations, and complex logic built-in
Start with basic utilities:
select NewId()from#system.dual()
Scale up to complex analysis:
-- Analyze codebase evolution over timewith MonthlyStatsas (select ToString(c.CommittedWhen,'yyyy-MM')as Month,Count(d.Path)as FilesChanged,Sum(p.LinesAdded)as LinesAdded,Sum(p.LinesDeleted)as LinesDeletedfrom#git.repository('./large-project') r cross applyr.Commits c cross applyr.DifferenceBetween(c,r.CommitFrom(c.Sha+'^')) d cross applyr.PatchBetween(c,r.CommitFrom(c.Sha+'^')) pgroup by ToString(c.CommittedWhen,'yyyy-MM'))select Month, FilesChanged, LinesAdded- LinesDeletedas NetLinesfrom MonthlyStatsorder by Monthdesctake12
File Management:
-- Find files not accessed in 6 monthsselect FullName, LastAccessTimefrom#os.files('/old-project', true)where LastAccessTime< SubtractDateTimeOffsets(GetDate(), FromString('180.00:00:00'))-- Compare two directoriesselect FullName, Statusfrom#os.dirscompare('/backup', '/current')where Status!='The Same'
Development Workflow:
-- Which files change together most often?selectf1.Path,f2.Path,Count(1)as CoChangesfrom#git.repository('.') r cross apply r.Commits ccross applyr.DifferenceBetween(c,r.CommitFrom(c.Sha+'^')) f1cross applyr.DifferenceBetween(c,r.CommitFrom(c.Sha+'^')) f2wheref1.Path<f2.Pathgroup byf1.Path,f2.PathhavingCount(1)>5order by CoChangesdesc
Data Processing:
-- Extract and count imports from protobuf fileswith Importsas (select Replace(Replace(Line,'import "',''),'";','')as ImportPathfrom#flat.file('./proto/service.proto') fwhereLinelike'import "%' )select ImportPath,Count(1)as Usagefrom Importsgroup by ImportPath
Query everything with the same SQL syntax:
File System & OS
- Files, directories, processes, metadata
- Archives (ZIP contents without extraction)
- Text files, CSVs, JSON
Development Tools
- Git repositories (commits, diffs, branches)
- C# codebases (classes, methods, complexity)
- Docker containers, Kubernetes resources
AI & Analysis
- OpenAI/GPT integration for text/image analysis
- Local LLMs via Ollama
- Extract structure from unstructured data
Specialized
- Time-series data and schedules
- CAN bus data for automotive
- Airtable, databases, APIs
- Install CLI - Quick setup guide
- Try basic queries - Generate data, list files, check git
- Explore data sources - See what you can query
- Replace your next script - Use SQL instead
- Documentation - Guide and examples
- Data Sources - All available plugins
- CLI Tool - Command-line interface
SQL power including:
- Common Table Expressions (CTEs)
- JOINs across different data sources
- Set operations (UNION, EXCEPT, INTERSECT)
- Regular expressions and pattern matching
- Aggregations
- Custom data type handling through plugins
✅ Perfect for:
- One-off data tasks that would need a script
- Combining data from multiple sources
- Quick analysis and reporting
- File system operations beyond basic commands
- Git repository insights
- Code pattern searches
❌ Not ideal for:
- Large-scale data processing (>memory size)
- Real-time/streaming data
- Production ETL pipelines
- Applications requiring millisecond performance
Musoq is designed around one principle:eliminate developer friction.
Stop deciding whether a task is "worth writing a script for." Stop context-switching between tools. Stop debugging bash pipes.
Just write a query.
"Why write loops when you can write queries?"
MIT License - see theLICENSE file for details.
About
SQL Syntax without any database
Topics
Resources
License
Code of conduct
Uh oh!
There was an error while loading.Please reload this page.
Stars
Watchers
Forks
Packages0
Uh oh!
There was an error while loading.Please reload this page.
Contributors8
Uh oh!
There was an error while loading.Please reload this page.