CSV Data Analysis in the Terminal: 7 Tasks You Don't Need pandas For
If you work with CSV files, you probably reach for pandas, Excel, or a Jupyter notebook. Those tools work great until you're on a bare server, inside a Docker container, or SSH'd into a machine with a minimal Python install. Installing pandas means downloading tens of megabytes, since it pulls in NumPy and a handful of other packages. Sometimes you just want to run a single command and get an answer.
This post covers 7 practical CSV tasks you can do entirely from the terminal with tools built on Python's standard library: no pandas, no Jupyter, no Excel, no npm packages. Every command shown here runs on any machine with Python 3.8+ once evolver-tools is installed.
If you have Python but not pandas, you're one pip install evolver-tools away from all of these. Each tool is a standalone command that reads from stdin or a file and writes structured output to your terminal.
1. Get a Statistical Summary of Any CSV File
The first thing you do with a new dataset is understand it: column types, value ranges, missing data, distribution. In pandas this is df.describe(). In the terminal:
    $ evtool csv-stats sales.csv --all
    ╔═══════════════════════════════════════════════════════════╗
    ║ CSV STATISTICAL REPORT                                    ║
    ╠═══════════════════════════════════════════════════════════╣
    ║ File: sales.csv                                           ║
    ║ Rows: 1,847                                               ║
    ║ Columns: 7                                                ║
    ╠═══════════════════════════════════════════════════════════╣
    ║ Column       Type     Non-Null   Unique   Missing   %     ║
    ║ ───────────────────────────────────────────────────────── ║
    ║ date         date     1,847      347      0         0%    ║
    ║ product      string   1,847      24       0         0%    ║
    ║ region       string   1,847      4        0         0%    ║
    ║ units        int      1,840      12       7         0.4%  ║
    ║ unit_price   float    1,847      89       0         0%    ║
    ║ total        float    1,840      312      7         0.4%  ║
    ║ discount     float    1,632      45       215       11.6% ║
    ╚═══════════════════════════════════════════════════════════╝
In one command you get the total row count, auto-detected column types, null counts, unique value counts, and missing percentages. That covers the usual first steps of a data analysis workflow: loading the data, checking types, and looking for missing values.
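If you're curious what a summary like this involves, here is a minimal standard-library sketch of the per-column counts. It is not evtool's implementation: the function name summarize_columns and the rule that an empty cell counts as missing are assumptions made for illustration.

```python
import csv
import io


def summarize_columns(text):
    """Return {column: (non_null, unique, missing)} for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    values = {name: [] for name in columns}
    missing = {name: 0 for name in columns}
    for row in reader:
        for name in columns:
            cell = (row.get(name) or "").strip()
            if cell:
                values[name].append(cell)
            else:
                missing[name] += 1  # assumption: an empty cell means "missing"
    return {
        name: (len(values[name]), len(set(values[name])), missing[name])
        for name in columns
    }


# Tiny made-up sample, not the sales.csv from this post
sample = "region,units\nEast,120\nWest,\nEast,85\n"
for name, (non_null, unique, n_missing) in summarize_columns(sample).items():
    print(f"{name}: non-null={non_null} unique={unique} missing={n_missing}")
```

Type detection is left out on purpose; a real tool would also need to guess whether each column holds dates, integers, floats, or plain strings.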
You can drill deeper into any column:
    $ evtool csv-stats sales.csv --column total --histogram
    Column: total (float) — 1,840 non-null values
    Min: 12.50  Max: 3,847.20  Mean: 342.18  Median: 198.75  Std: 411.63
    Histogram (10 bins):
       0 — 400   ████████████████████████████████ 1,428
     400 — 800   ██████ 254
     800 — 1200  ███ 110
    1200 — 1600  ██ 32
    1600 — 2000  █ 12
    2000 — 2400  ▏ 2
    2400 — 2800  ▏ 1
    2800 — 3200  ▏ 1
    3200 — 3600  ▏ 0
    3600 — 4000  ▏ 0
This is df.describe() + df.hist() in one command. No imports, no notebook.
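The numbers behind a report like this are all within reach of the standard library. The sketch below uses the statistics module and equal-width bins. The helper names describe and histogram are hypothetical, and evtool may pick its bin edges differently (the output above uses round edges such as 0 and 400, while this version spans the exact minimum to maximum).

```python
import statistics


def describe(values):
    """Basic summary statistics for a list of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation
    }


def histogram(values, bins=10):
    """Count values in equal-width bins spanning min..max."""
    low, high = min(values), max(values)
    width = (high - low) / bins or 1  # fall back to 1 when all values are equal
    counts = [0] * bins
    for v in values:
        index = min(int((v - low) / width), bins - 1)  # the maximum lands in the last bin
        counts[index] += 1
    return counts


# Made-up numbers for illustration
totals = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]
print(describe(totals))
print(histogram(totals, bins=5))
```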
2. Filter Rows by Any Condition
Filtering data is the most common CSV operation. Maybe you need all sales over $500, or all customers from a specific region, or records from Q4 2025. In pandas you write a boolean mask. In the terminal:
    $ evtool csv-filter 'total > 500' sales.csv | head -5
    date,product,region,units,unit_price,total,discount
    2025-10-12,Widget Pro,East,120,14.99,1798.80,0.00
    2025-10-14,Gadget X,West,85,24.99,2124.95,10.00
    2025-10-15,Premium Suite,North,50,49.99,2499.80,5.00
    2025-10-18,Widget Pro,South,200,14.99,2998.00,0.00
The filter syntax is SQL-like and supports >, <, =, !=, >=, <=, substring matching with ~, and compound expressions with AND/OR:
    # Multiple conditions
    $ evtool csv-filter 'region = "West" AND total > 200' sales.csv

    # String matching
    $ evtool csv-filter 'product ~ "Pro"' sales.csv

    # Date range
    $ evtool csv-filter 'date >= "2025-10-01" AND date < "2025-11-01"' sales.csv
The output is always valid CSV, so you can pipe it into the next tool — no temporary files needed.
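The reason filters compose so well is that they are just "read rows, keep some, write rows". Here is a rough standard-library equivalent that uses a Python predicate in place of evtool's expression syntax. The filter_rows helper is an illustration, not part of evolver-tools, and it makes no attempt to parse the SQL-like strings.

```python
import csv
import io
import sys


def filter_rows(source, predicate, out=None):
    """Copy the header and every row for which predicate(row) is true."""
    if out is None:
        out = sys.stdout
    reader = csv.DictReader(source)
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, lineterminator="\n", extrasaction="ignore"
    )
    writer.writeheader()
    for row in reader:
        if predicate(row):
            writer.writerow(row)


# Made-up sample data
sample = io.StringIO(
    "region,total\n"
    "West,650.00\n"
    "East,120.50\n"
    "West,90.00\n"
)
# Roughly: region = "West" AND total > 200
filter_rows(sample, lambda r: r["region"] == "West" and float(r["total"]) > 200)
```

Because rows are handled one at a time, memory use stays flat no matter how large the input is. Pass sys.stdin as the source to use it in a shell pipe.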
3. Sort by Any Column (or Multiple Columns)
Quick sort without Excel's click-and-drag:
    $ evtool csv-sort -k total sales.csv | head -5
    date,product,region,units,unit_price,total,discount
    2025-08-03,Sticker Pack,South,1,4.99,2.50,0
    2025-09-12,Bumper Sticker,East,2,3.99,7.98,0
    2025-07-22,Magnet,North,3,4.99,14.97,0
    2025-11-04,Keychain,West,5,3.99,19.95,0

    # Descending order (highest first)
    $ evtool csv-sort -k total --desc sales.csv | head -5

    # Multiple columns: sort by region, then by total descending
    $ evtool csv-sort -k region,total --desc sales.csv
Performance note: csv-sort handles files larger than memory by sorting them in chunks, so it works on datasets that make Excel sluggish (100K+ rows) or exceed its limit of 1,048,576 rows per sheet.
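One common way to sort more data than fits in memory is an external merge sort: sort fixed-size chunks, spill each to a temporary file, then merge the sorted chunks lazily. The post does not say that csv-sort works exactly this way, so treat the following as a sketch of the general technique. The name external_sort and the chunk_size default are assumptions.

```python
import csv
import heapq
import tempfile


def external_sort(rows, key, chunk_size=50_000):
    """Yield rows (lists of strings) in sorted order, holding at most
    chunk_size rows in memory while sorting."""
    chunk_files = []

    def spill(chunk):
        # Sort one chunk in memory and write it to its own temporary file
        chunk.sort(key=key)
        f = tempfile.TemporaryFile("w+", newline="", encoding="utf-8")
        csv.writer(f).writerows(chunk)
        f.seek(0)
        chunk_files.append(f)

    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) >= chunk_size:
            spill(chunk)
            chunk = []
    if chunk:
        spill(chunk)

    try:
        # heapq.merge reads the already-sorted chunks lazily, one row at a time
        yield from heapq.merge(*(csv.reader(f) for f in chunk_files), key=key)
    finally:
        for f in chunk_files:
            f.close()


# Made-up rows: (product, total)
rows = [["a", "3"], ["b", "1"], ["c", "2"], ["d", "0"], ["e", "5"]]
for row in external_sort(rows, key=lambda r: float(r[1]), chunk_size=2):
    print(row)
```

In real use, rows would come from csv.reader over the input file after reading the header, so only one chunk is ever in memory.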
4. Chart Data Right in the Terminal
Need a quick visual without spinning up matplotlib? Terminal charts give you instant visual feedback:
    $ evtool csv-chart sales.csv -k region -v total --agg mean
    East   ████████████████████████████████████ $387.40
    West   ███████████████████████████████ $340.20
    North  ██████████████████████████ $298.60
    South  ██████████████████████████████████████████████ $429.10
You can chart by category, time series, or distribution:
    # Time series: average total by month
    $ evtool csv-chart sales.csv -k date -v total --agg mean --time month

    # Stacked bar: product sales by region
    $ evtool csv-chart sales.csv -k product -v total --agg sum --group region

    # Histogram of any numeric column
    $ evtool csv-chart sales.csv -k total --hist
These charts render using Unicode block characters — they work in any terminal, inside Docker logs, over SSH, in CI/CD output. No display server required.
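Drawing a bar chart from block characters takes very little code. The sketch below groups values by category, averages them, and scales each bar against the largest mean. The bar_chart function is a hypothetical helper, not evtool's renderer, and it assumes at least one row and non-negative values.

```python
from collections import defaultdict


def bar_chart(pairs, width=40):
    """Return the lines of a horizontal bar chart of the mean value per category."""
    groups = defaultdict(list)
    for category, value in pairs:
        groups[category].append(value)
    means = {category: sum(vals) / len(vals) for category, vals in groups.items()}
    longest = max(means.values())
    label_width = max(len(category) for category in means)
    lines = []
    for category, mean in means.items():
        # Scale each bar relative to the largest mean
        bar = "█" * round(width * mean / longest) if longest > 0 else ""
        lines.append(f"{category:<{label_width}} {bar} {mean:.2f}")
    return lines


# Made-up data: (region, total)
data = [("East", 400.0), ("East", 200.0), ("West", 150.0)]
print("\n".join(bar_chart(data, width=20)))
```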
5. Join and Merge CSV Files
One of pandas' superpowers is pd.merge(). The terminal equivalent handles the most common join patterns:
    $ evtool csv-join sales.csv products.csv --on product_id
    date,product_id,units,total,product_name,category,price
    2025-10-12,WP-001,120,1798.80,Widget Pro,Efficiency,14.99
    2025-10-14,GX-002,85,2124.95,Gadget X,Electronics,24.99
    2025-10-15,PS-003,50,2499.80,Premium Suite,Business,49.99

    # Left join (keep all rows from left file)
    $ evtool csv-join sales.csv customers.csv --on email --how left

    # Inner join (only matching rows)
    $ evtool csv-join employees.csv departments.csv --on dept_id --how inner

    # Join on multiple columns
    $ evtool csv-join inventory.csv warehouse.csv --on sku,location
No more loading two CSV files into pandas, writing a merge statement, then saving the result. One command, done.
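Under the hood, a join like this is usually a hash join: index one file by the key, then stream the other file past that index. Here is a minimal single-key sketch with the standard library. The hash_join name is illustrative, and how evtool handles column-name collisions or multi-column keys is not covered here.

```python
import csv
import io


def hash_join(left, right, on, how="inner"):
    """Join two CSV sources on a shared column; how is "inner" or "left"."""
    right_reader = csv.DictReader(right)
    extra = [c for c in right_reader.fieldnames if c != on]
    index = {}
    for row in right_reader:
        index.setdefault(row[on], []).append(row)  # keep every match per key

    joined = []
    for row in csv.DictReader(left):
        matches = index.get(row[on], [])
        if not matches and how == "left":
            matches = [dict.fromkeys(extra, "")]  # pad unmatched left rows with blanks
        for match in matches:
            # Assumption: right-hand columns overwrite same-named left columns
            joined.append({**row, **{c: match[c] for c in extra}})
    return joined


# Made-up sample data
sales = "product_id,units\nWP-001,120\nZZ-999,5\n"
products = "product_id,product_name\nWP-001,Widget Pro\n"
for row in hash_join(io.StringIO(sales), io.StringIO(products), on="product_id", how="left"):
    print(row)
```

Only the right-hand file is held in memory, so it pays to put the smaller file (typically the lookup table) on the right.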
6. Convert CSV to JSON (and Back)
CSV ↔ JSON is one of the most common data pipeline tasks. You need it when integrating with APIs, feeding data into databases, or generating config files from spreadsheets.
    $ evtool csv-to-json sales.csv --pretty | head -20
    [
      {
        "date": "2025-10-12",
        "product": "Widget Pro",
        "region": "East",
        "units": "120",
        "unit_price": "14.99",
        "total": "1798.80",
        "discount": "0.00"
      },
      {
        "date": "2025-10-14",
        "product": "Gadget X",
        "region": "West",
        "units": "85",
        "unit_price": "24.99",
        "total": "2124.95",
        "discount": "10.00"
      }

    # Flatten nested JSON keys into column headers
    $ evtool json-to-csv data.json --flatten > output.csv

    # Pipe directly: filter CSV, convert to JSON
    $ evtool csv-filter 'region = "East"' sales.csv | evtool csv-to-json > east.json
This is particularly useful in CI/CD pipelines — extract data from a CSV report, convert to JSON, and POST it to an API endpoint, all in one shell pipeline.
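Both directions are short with the csv and json modules. In the sketch below every value stays a string, as in the output shown earlier, and nested keys are flattened with dots. The dotted naming is an assumption; the post does not say how --flatten names its columns.

```python
import csv
import io
import json


def csv_to_json(source, pretty=False):
    """Convert CSV with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(source))
    return json.dumps(rows, indent=2 if pretty else None, ensure_ascii=False)


def flatten(obj, prefix=""):
    """Turn nested dicts into one level, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


# Made-up sample data
print(csv_to_json(io.StringIO("region,total\nEast,1798.80\n"), pretty=True))
print(flatten({"customer": {"name": "Ada", "city": "Leeds"}, "total": 12.5}))
```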
7. Validate and Clean Your Data
Before you analyze, you need to know if your data is trustworthy. The terminal gives you instant integrity checks:
    # Find duplicate rows
    $ evtool csv-dedup sales.csv --show-duplicates
    row_46   2025-10-12,Widget Pro,East,120,14.99,1798.80,0.00
    row_287  2025-10-12,Widget Pro,East,120,14.99,1798.80,0.00

    # Check for null values across columns
    $ evtool csv-stats sales.csv --null-summary
    Null values found:
    units: 7 missing (0.4%)
    total: 7 missing (0.4%)
    discount: 215 missing (11.6%)

    # Remove duplicates
    $ evtool csv-dedup sales.csv -k date,product,region --output clean.csv
If you're processing CSV files from external sources (client exports, government data, scraped datasets), this validation step catches issues before they propagate through your pipeline.
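Duplicate detection comes down to remembering a signature for each row you have already seen. A minimal sketch follows, with optional key columns in the spirit of -k above. The dedup helper and its row numbering (data rows counted from 1) are assumptions. Unlike the --show-duplicates output, which lists both copies, this version reports only the later occurrences.

```python
import csv
import io


def dedup(source, keys=None):
    """Return (unique_rows, duplicates); duplicates are (row_number, row) pairs."""
    reader = csv.DictReader(source)
    seen = set()
    unique, duplicates = [], []
    for number, row in enumerate(reader, start=1):
        columns = keys or reader.fieldnames  # compare whole rows unless keys are given
        signature = tuple(row[c] for c in columns)
        if signature in seen:
            duplicates.append((number, row))
        else:
            seen.add(signature)
            unique.append(row)
    return unique, duplicates


# Made-up sample data
sample = (
    "date,product\n"
    "2025-10-12,Widget Pro\n"
    "2025-10-13,Gadget X\n"
    "2025-10-12,Widget Pro\n"
)
unique, duplicates = dedup(io.StringIO(sample))
print(len(unique), "unique rows;", "duplicate row numbers:", [n for n, _ in duplicates])
```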
Putting It All Together: A Real-World Pipeline
Here's a complete data analysis pipeline that would otherwise require a Jupyter notebook:
    # Download a CSV from an API
    $ evtool http-get "https://api.example.com/reports/sales.csv" > sales.csv

    # Check the structure
    $ evtool csv-stats sales.csv --all

    # Filter to high-value transactions in a specific region
    $ evtool csv-filter 'region = "West" AND total > 500' sales.csv > west_high.csv

    # Sort by total descending
    $ evtool csv-sort -k total --desc west_high.csv | head -20

    # Generate a chart
    $ evtool csv-chart west_high.csv -k date -v total --time month

    # Convert to JSON for the team's API
    $ evtool csv-to-json west_high.csv > west_high.json
One workflow, no dependencies beyond evolver-tools itself, and it runs anywhere Python 3.8+ runs.
When NOT to Use Terminal CSV Tools
Let's be honest about limitations. Terminal tools are great for exploration and lightweight pipelines, but not for:
- Machine learning — You still need scikit-learn, TensorFlow, or PyTorch for model training
- Multi-gigabyte datasets — For millions of rows, use DuckDB, Polars, or Spark
- Interactive visualization — If you need zoomable, multi-panel charts, stick with matplotlib/plotly
- Statistical modeling — Regression, clustering, and hypothesis testing need proper libraries
But for most everyday data tasks (filtering, sorting, joining, summarizing, converting, validating), terminal tools are faster to start, lighter, and more composable than firing up a full analysis environment.
Quick Comparison: Terminal vs. pandas
| Task | pandas | Terminal (evtool) |
|---|---|---|
| Install size | Tens of MB, including NumPy and a few other packages | ~2MB, zero deps |
| Stats summary | df.describe() | csv-stats data.csv --all |
| Filter rows | df[df.col > 100] | csv-filter 'col > 100' data.csv |
| Sort | df.sort_values('col') | csv-sort -k col data.csv |
| Chart | df.plot() | csv-chart data.csv -k cat -v val |
| Join | pd.merge(a, b, on='key') | csv-join a.csv b.csv --on key |
| CSV → JSON | df.to_json() | csv-to-json data.csv |
| Null check | df.isnull().sum() | csv-stats data.csv --null-summary |
| Docker-friendly | ❌ Heavy | ✅ 2MB |
| Air-gapped | ❌ Larger install to transfer | ✅ Works offline once installed |
Get Started in 10 Seconds
Install evolver-tools with pip — it works on any machine with Python 3.8+:
$ pip install evolver-tools
Then try the CSV tools on any file:
$ evtool csv-stats your-file.csv --all
No install needed? Try it in Google Colab — it runs in your browser with zero setup.
Or see the full list of 261 tools:
$ evtool list
One pip install. 261 tools. Zero dependencies.
From CSV analysis to network debugging to system monitoring — everything your terminal should have had, in one package.
pip install evolver-tools · 261 tools · 18 categories · pure Python stdlib