Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

A fusion of SQL and awk: Applying SQL to text-based data files

License

NotificationsYou must be signed in to change notification settings

jgarzik/sqawk

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Crates.ioDocs.rsMIT licensed

Sqawk is an SQL-based command-line tool for processing delimiter-separated files (CSV, TSV, etc.), inspired by the classicawk command. It loads data into in-memory tables, executes SQL queries against these tables, and writes the results back to the console or files.

Features

  • Powerful SQL Query Engine

    • Support for SELECT, INSERT, UPDATE, and DELETE operations
    • WHERE clause filtering with comparison operators
    • DISTINCT keyword for removing duplicate rows
    • ORDER BY for sorting results (ASC/DESC)
    • Column aliases with the AS keyword
    • Aggregate functions (COUNT, SUM, AVG, MIN, MAX)
    • GROUP BY for data aggregation
  • Multi-Table Operations

    • Cross joins between tables
    • INNER JOIN with ON conditions for precise join criteria
    • Support for joining multiple tables
    • Table-qualified column names
  • Smart Data Handling

    • Automatic type inference (Integer, Float, Boolean, String)
    • Type coercion for comparisons
    • Null value support
  • File Format Support

    • Process CSV, TSV, and custom-delimited files
    • Custom field separator support with -F option (like awk)
    • Fast in-memory execution
    • Process multiple files in a single command
    • Table name customization
    • Chain multiple SQL statements
  • Safe Operation

    • Doesn't modify files without explicit request (--write flag)
    • Only writes back tables that were modified
    • Verbose mode for operation transparency

Installation

cargo install sqawk

Usage

Basic SELECT query

sqawk -s"SELECT * FROM data" data.csv

This loadsdata.csv into an in-memory table called "data" and performs a SELECT query.

Filtering data with WHERE clause

sqawk -s"SELECT * FROM employees WHERE salary > 50000" employees.csv

Updating rows

sqawk -s"UPDATE data SET status = 'active' WHERE id = 5" data.csv --write

This updates the status field to 'active' for rows with id = 5 and saves the changes back to data.csv.

Deleting rows

sqawk -s"DELETE FROM data WHERE id = 5" data.csv --write

This removes rows with id = 5 and saves the changes back to data.csv.

Multiple operations

sqawk -s"UPDATE data SET status = 'inactive' WHERE last_login < '2023-01-01'" -s"DELETE FROM data WHERE status = 'inactive' AND last_login < '2022-01-01'" -s"SELECT * FROM data" data.csv --write

This executes multiple SQL statements in sequence: first marking recent inactive accounts, then removing very old inactive accounts, and finally showing the results.

Multiple files

sqawk -s"SELECT * FROM users" -s"SELECT * FROM orders" users.csv orders.csv

Finding unique values with DISTINCT

# Get unique values from a single columnsqawk -s"SELECT DISTINCT category FROM products" products.csv# Get unique combinations of columnssqawk -s"SELECT DISTINCT department, role FROM employees" employees.csv# Use DISTINCT with ORDER BY for sorted unique valuessqawk -s"SELECT DISTINCT region FROM customers ORDER BY region" customers.csv

Join tables with INNER JOIN

# Join users and orders using INNER JOIN with ON conditionsqawk -s"SELECT users.name, orders.product_id, orders.date FROM users INNER JOIN orders ON users.id = orders.user_id" users.csv orders.csv# Join with additional WHERE filteringsqawk -s"SELECT users.name, orders.product_id, orders.date FROM users INNER JOIN orders ON users.id = orders.user_id WHERE orders.product_id > 100" users.csv orders.csv# Using DISTINCT with JOINs to find unique customer-product pairssqawk -s"SELECT DISTINCT users.name, products.name FROM users INNER JOIN orders ON users.id = orders.user_id INNER JOIN products ON orders.product_id = products.product_id" users.csv orders.csv products.csv

Custom field separators

# Process a tab-delimited file (TSV)sqawk -F'\t' -s"SELECT * FROM employees WHERE salary > 70000" employees.tsv# Process a colon-delimited filesqawk -F':' -s"SELECT id, name, email FROM contacts" contacts.txt

Verbose mode

sqawk -s"SELECT * FROM data WHERE value > 100" data.csv -v

Write mode

sqawk -s"DELETE FROM data WHERE status = 'expired'" data.csv --write

By default, sqawk doesn't modify input files. Use the--write flag to save changes back to the original files.

Documentation

For more detailed information, see:

License

Licensed under the MIT License (LICENSE orhttp://opensource.org/licenses/MIT).

Contribution

Unless you explicitly state otherwise, any contribution intentionally submittedfor inclusion in the work by you shall be licensed as MIT, without any additionalterms or conditions.

About

A fusion of SQL and awk: Applying SQL to text-based data files

Topics

Resources

License

Stars

Watchers

Forks


[8]ページ先頭

©2009-2025 Movatter.jp