CSV Data Analysis in the Terminal: 7 Tasks You Don't Need pandas For

June 2026 · 7 min read · evolver-tools

If you work with CSV files, you probably reach for pandas, Excel, or a Jupyter notebook. Those tools work great until you're on a bare server, inside a Docker container, or SSH'd into a machine with a minimal Python install. Installing pandas also pulls in NumPy and a few other packages, tens of megabytes in total. Sometimes you just want to run a single command and get an answer.

This post covers 7 practical CSV tasks you can do entirely from the terminal, using only Python's standard library — no pandas, no Jupyter, no Excel, no npm packages. Every command shown here works on any machine with Python 3.8+ installed.

If you have Python but not pandas, you're one pip install evolver-tools away from all of these. Each tool is a standalone command that reads from stdin or a file and writes structured output to your terminal.

1. Get a Statistical Summary of Any CSV File

The first thing you do with a new dataset is understand it: column types, value ranges, missing data, distribution. In pandas this is df.describe(). In the terminal:

$ evtool csv-stats sales.csv --all

╔═══════════════════════════════════════════════════════════╗
║                    CSV STATISTICAL REPORT                 ║
╠═══════════════════════════════════════════════════════════╣
║ File: sales.csv                                          ║
║ Rows: 1,847                                              ║
║ Columns: 7                                                ║
╠═══════════════════════════════════════════════════════════╣
║ Column        Type     Non-Null  Unique     Missing  %    ║
║ ───────────────────────────────────────────────────────── ║
║ date          date     1,847     347        0        0%   ║
║ product       string   1,847     24         0        0%   ║
║ region        string   1,847     4          0        0%   ║
║ units         int      1,840     12         7        0.4% ║
║ unit_price    float    1,847     89         0        0%   ║
║ total         float    1,840     312        7        0.4% ║
║ discount      float    1,632     45         215      11.6%║
╚═══════════════════════════════════════════════════════════╝

In one command you get: total row count, column types (auto-detected), null counts, unique value counts, and missing percentage. This alone replaces the first 15 minutes of any data analysis workflow — loading the data, checking types, looking for missing values.
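If you are curious what a summary like this involves, here is a minimal sketch using only the csv module. It is not the evtool implementation; the function name summarize_columns and the rule that an empty cell counts as missing are assumptions made for illustration.

```python
import csv
import io


def summarize_columns(csv_text):
    """Return {column: {"non_null", "unique", "missing"}} for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    summary = {name: {"non_null": 0, "missing": 0, "values": set()}
               for name in fieldnames}
    for row in reader:
        for name in fieldnames:
            # Short rows yield None for absent cells; treat those as empty.
            value = (row[name] or "").strip()
            if value == "":
                summary[name]["missing"] += 1
            else:
                summary[name]["non_null"] += 1
                summary[name]["values"].add(value)
    return {name: {"non_null": s["non_null"],
                   "unique": len(s["values"]),
                   "missing": s["missing"]}
            for name, s in summary.items()}
```

Calling summarize_columns(open("sales.csv", newline="").read()) would give one small dict per column. Type detection (int, float, date) is left out to keep the sketch short.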

You can drill deeper into any column:

$ evtool csv-stats sales.csv --column total --histogram

Column: total (float) — 1,840 non-null values
Min:    2.50
Max:    3,847.20
Mean:   342.18
Median: 198.75
Std:    411.63

Histogram (10 bins):
    0 —   400   ████████████████████████████████  1,428
  400 —   800   ██████  254
  800 —  1200   ███  110
 1200 —  1600   ██  32
 1600 —  2000   █  12
 2000 —  2400   ▏  2
 2400 —  2800   ▏  1
 2800 —  3200   ▏  1
 3200 —  3600   ▏  0
 3600 —  4000   ▏  0

This is df.describe() + df.hist() in one command. No imports, no notebooks, no Jupyter dependency.
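For reference, the same numbers can be computed with the standard library's statistics module. This is a rough sketch, not evtool's code: describe and histogram are hypothetical names, the bins here run from the minimum to the maximum value (the report above uses round bin edges instead), and the standard deviation is the sample version, which is also what pandas reports by default.

```python
import statistics


def describe(values):
    """Min, max, mean, median, and sample standard deviation of a list of floats.

    Needs at least two values, because statistics.stdev requires them.
    """
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),
    }


def histogram(values, bins=10):
    """Count values in equal-width bins spanning min(values)..max(values)."""
    low, high = min(values), max(values)
    # Fall back to a width of 1.0 when every value is identical.
    width = (high - low) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # The maximum value would land one past the end, so clamp it.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts
```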

2. Filter Rows by Any Condition

Filtering data is the most common CSV operation. Maybe you need all sales over $500, or all customers from a specific region, or records from Q4 2025. In pandas you write a boolean mask. In the terminal:

$ evtool csv-filter 'total > 500' sales.csv | head -5
date,product,region,units,unit_price,total,discount
2025-10-12,Widget Pro,East,120,14.99,1798.80,0.00
2025-10-14,Gadget X,West,85,24.99,2124.95,10.00
2025-10-15,Premium Suite,North,50,49.99,2499.80,5.00
2025-10-18,Widget Pro,South,200,14.99,2998.00,0.00

The filter syntax is SQL-like: it supports the comparison operators >, <, =, !=, >=, and <=, substring matching with ~, and compound expressions with AND/OR:

# Multiple conditions
$ evtool csv-filter 'region = "West" AND total > 200' sales.csv

# String matching
$ evtool csv-filter 'product ~ "Pro"' sales.csv

# Date range
$ evtool csv-filter 'date >= "2025-10-01" AND date < "2025-11-01"' sales.csv

The output is always valid CSV, so you can pipe it into the next tool — no temporary files needed.
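The "valid CSV in, valid CSV out" idea is easy to sketch with csv.DictReader and csv.DictWriter. The version below takes a plain Python function as the condition instead of parsing a SQL-like expression, so it illustrates the pattern rather than how csv-filter works internally; filter_csv and to_float are hypothetical names.

```python
import csv
import io


def to_float(text):
    """Parse a number, returning None for empty or malformed cells."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def filter_csv(csv_text, predicate):
    """Return CSV text with the header plus every row where predicate(row) is true."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames or [],
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if predicate(row):
            writer.writerow(row)
    return out.getvalue()
```

A condition like region = "West" AND total > 200 then becomes lambda r: r["region"] == "West" and (to_float(r["total"]) or 0) > 200. Because the result is itself CSV text, it can be fed straight into the next step.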

3. Sort by Any Column (or Multiple Columns)

Quick sort without Excel's click-and-drag:

$ evtool csv-sort -k total sales.csv | head -5
date,product,region,units,unit_price,total,discount
2025-08-03,Sticker Pack,South,1,4.99,2.50,0
2025-09-12,Bumper Sticker,East,2,3.99,7.98,0
2025-07-22,Magnet,North,3,4.99,14.97,0
2025-11-04,Keychain,West,5,3.99,19.95,0

# Descending order (highest first)
$ evtool csv-sort -k total --desc sales.csv | head -5

# Multiple columns: sort by region, then by total descending
$ evtool csv-sort -k region,total --desc sales.csv

Performance note: csv-sort handles files larger than memory by processing them in chunks, so it also works on datasets that Excel cannot open in full (a worksheet is capped at 1,048,576 rows).
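One common way to sort a file that does not fit in memory is an external merge sort: sort fixed-size chunks, spill each to a temporary file, then merge the sorted chunks. The sketch below shows that technique with heapq.merge. Whether csv-sort works this way is not stated here, so treat it as an illustration; sort_csv_external and chunk_rows are hypothetical names, and the key column is assumed to contain a number in every row.

```python
import csv
import heapq
import tempfile


def sort_csv_external(in_file, out_file, key_column, chunk_rows=50_000):
    """Sort CSV rows by a numeric column using sorted chunks and a k-way merge."""
    reader = csv.reader(in_file)
    header = next(reader)
    col = header.index(key_column)

    def key(row):
        return float(row[col])  # assumes every key cell is numeric

    chunks = []

    def spill(rows):
        # Sort one chunk in memory and write it to its own temporary file.
        rows.sort(key=key)
        tmp = tempfile.TemporaryFile("w+", newline="")
        csv.writer(tmp).writerows(rows)
        tmp.seek(0)
        chunks.append(tmp)

    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) >= chunk_rows:
            spill(batch)
            batch = []
    if batch:
        spill(batch)

    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(header)
    # heapq.merge reads the sorted chunks lazily, one row at a time per chunk.
    writer.writerows(heapq.merge(*(csv.reader(c) for c in chunks), key=key))
    for c in chunks:
        c.close()
```

Only one chunk of rows is held in memory at a time, plus one row per chunk during the merge. The trade-off is one open temporary file per chunk, so very small chunk sizes on very large files would need a multi-pass merge.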

4. Chart Data Right in the Terminal

Need a quick visual without spinning up matplotlib? Terminal charts give you instant visual feedback:

$ evtool csv-chart sales.csv -k region -v total --agg mean

  East   ████████████████████████████████████  $387.40
  West   ███████████████████████████████  $340.20
  North  ██████████████████████████  $298.60
  South  ██████████████████████████████████████████████  $429.10

You can chart by category, time series, or distribution:

# Time series: average total by month
$ evtool csv-chart sales.csv -k date -v total --agg mean --time month

# Stacked bar: product sales by region
$ evtool csv-chart sales.csv -k product -v total --agg sum --group region

# Histogram of any numeric column
$ evtool csv-chart sales.csv -k total --hist

These charts render using Unicode block characters, so they work in any UTF-8 terminal: inside Docker logs, over SSH, or in CI/CD output. No display server required.
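Drawing a bar chart out of block characters takes very little code. Here is a minimal sketch that averages one column per category and scales the longest bar to a fixed width. The function name bar_chart and its parameters are made up for this example, and it skips cells that are empty or not numeric.

```python
import csv
import io
from collections import defaultdict


def bar_chart(csv_text, key, value, width=40):
    """Return the lines of a horizontal bar chart: mean of `value` per `key`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            amount = float(row[value])
        except (TypeError, ValueError):
            continue  # skip missing or non-numeric cells
        totals[row[key]] += amount
        counts[row[key]] += 1
    means = {k: totals[k] / counts[k] for k in totals}
    if not means:
        return []
    largest = max(means.values())
    label_width = max(len(k) for k in means)
    lines = []
    for k, mean in means.items():
        # Scale each bar relative to the largest mean.
        length = round(width * mean / largest) if largest > 0 else 0
        bar = "█" * length
        lines.append(f"{k:<{label_width}}  {bar}  {mean:.2f}")
    return lines
```

Printing the result is just print("\n".join(bar_chart(text, "region", "total"))).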

5. Join and Merge CSV Files

One of pandas' superpowers is pd.merge(). The terminal equivalent handles the most common join patterns:

$ evtool csv-join sales.csv products.csv --on product_id
date,product_id,units,total,product_name,category,price
2025-10-12,WP-001,120,1798.80,Widget Pro,Efficiency,14.99
2025-10-14,GX-002,85,2124.95,Gadget X,Electronics,24.99
2025-10-15,PS-003,50,2499.80,Premium Suite,Business,49.99

# Left join (keep all rows from left file)
$ evtool csv-join sales.csv customers.csv --on email --how left

# Inner join (only matching rows)
$ evtool csv-join employees.csv departments.csv --on dept_id --how inner

# Join on multiple columns
$ evtool csv-join inventory.csv warehouse.csv --on sku,location

No more loading two CSV files into pandas, writing a merge statement, then saving the result. One command, done.
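Under the hood, a join like this is usually a hash join: index one file by the key column, then stream the other file past that index. The sketch below shows inner and left joins on a single column. It is an illustration only, not evtool's code; join_csv is a hypothetical name, and it assumes the two files share no column names other than the key.

```python
import csv
import io


def join_csv(left_text, right_text, on, how="inner"):
    """Hash join of two CSV texts on one column; `how` is "inner" or "left"."""
    right_reader = csv.DictReader(io.StringIO(right_text))
    extra_cols = [c for c in right_reader.fieldnames if c != on]
    index = {}
    for row in right_reader:
        # Several right-hand rows may share the same key.
        index.setdefault(row[on], []).append(row)

    left_reader = csv.DictReader(io.StringIO(left_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=left_reader.fieldnames + extra_cols,
                            lineterminator="\n")
    writer.writeheader()
    for row in left_reader:
        matches = index.get(row[on], [])
        if not matches and how == "left":
            # Left join: keep the row and leave the right-hand columns empty.
            matches = [{c: "" for c in extra_cols}]
        for match in matches:
            writer.writerow({**row, **{c: match[c] for c in extra_cols}})
    return out.getvalue()
```

Only the right-hand file is held in memory, so it pays to pass the smaller file (typically the lookup table) as the second argument.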

6. Convert CSV to JSON (and Back)

CSV ↔ JSON is one of the most common data pipeline tasks. You need it when integrating with APIs, feeding data into databases, or generating config files from spreadsheets.

$ evtool csv-to-json sales.csv --pretty | head -20
[
  {
    "date": "2025-10-12",
    "product": "Widget Pro",
    "region": "East",
    "units": "120",
    "unit_price": "14.99",
    "total": "1798.80",
    "discount": "0.00"
  },
  {
    "date": "2025-10-14",
    "product": "Gadget X",
    "region": "West",
    "units": "85",
    "unit_price": "24.99",
    "total": "2124.95",
    "discount": "10.00"
  },
  {

# Flatten nested JSON keys into column headers
$ evtool json-to-csv data.json --flatten > output.csv

# Pipe directly: filter CSV, convert to JSON
$ evtool csv-filter 'region = "East"' sales.csv | evtool csv-to-json > east.json

This is particularly useful in CI/CD pipelines — extract data from a CSV report, convert to JSON, and POST it to an API endpoint, all in one shell pipeline.
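The conversion itself is a few lines with the csv and json modules. In this sketch (function names are hypothetical) every value stays a string, as in the sample output above, because CSV carries no type information. The reverse direction assumes a JSON array of flat objects and does not attempt the nested-key flattening mentioned earlier.

```python
import csv
import io
import json


def csv_to_json(csv_text, pretty=False):
    """Convert CSV text to a JSON array of objects (all values stay strings)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2 if pretty else None)


def json_to_csv(json_text):
    """Convert a JSON array of flat objects back to CSV text."""
    rows = json.loads(json_text)
    # Collect column names in first-seen order across all objects.
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```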

7. Validate and Clean Your Data

Before you analyze, you need to know if your data is trustworthy. The terminal gives you instant integrity checks:

# Find duplicate rows
$ evtool csv-dedup sales.csv --show-duplicates
row_46  2025-10-12,Widget Pro,East,120,14.99,1798.80,0.00
row_287 2025-10-12,Widget Pro,East,120,14.99,1798.80,0.00

# Check for null values across columns
$ evtool csv-stats sales.csv --null-summary

Null values found:
  units:          7 missing (0.4%)
  total:          7 missing (0.4%)
  discount:     215 missing (11.6%)

# Remove duplicates, comparing rows on date, product, and region only
$ evtool csv-dedup sales.csv -k date,product,region --output clean.csv

If you're processing CSV files from external sources (client exports, government data, scraped datasets), this validation step catches issues before they propagate through your pipeline.
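Duplicate detection comes down to remembering which keys you have already seen. A minimal sketch, assuming the first occurrence is the one to keep and that data rows are numbered from 1 (dedup_csv is a hypothetical name, and this is not evtool's implementation):

```python
import csv
import io


def dedup_csv(csv_text, keys=None):
    """Keep the first row per key; return (clean CSV text, dropped row numbers).

    `keys` is a list of column names to compare on; by default the
    whole row must match for a row to count as a duplicate.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    key_cols = keys or fieldnames
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    seen = set()
    dropped = []
    for number, row in enumerate(reader, start=1):
        key = tuple(row[c] for c in key_cols)
        if key in seen:
            dropped.append(number)  # duplicate of an earlier row
        else:
            seen.add(key)
            writer.writerow(row)
    return out.getvalue(), dropped
```

Comparing on a subset of columns is stricter about what counts as "the same record", so it can drop rows that differ elsewhere. It is worth looking at the dropped row numbers before overwriting anything.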

Putting It All Together: A Real-World Pipeline

Here's a complete data analysis pipeline that would otherwise require a Jupyter notebook:

# Download a CSV from an API
$ evtool http-get "https://api.example.com/reports/sales.csv" > sales.csv

# Check the structure
$ evtool csv-stats sales.csv --all

# Filter to high-value transactions in a specific region
$ evtool csv-filter 'region = "West" AND total > 500' sales.csv > west_high.csv

# Sort by total descending
$ evtool csv-sort -k total --desc west_high.csv | head -20

# Generate a chart
$ evtool csv-chart west_high.csv -k date -v total --time month

# Convert to JSON for the team's API
$ evtool csv-to-json west_high.csv > west_high.json

One pipeline, zero dependencies, runs anywhere Python runs.

When NOT to Use Terminal CSV Tools

Let's be honest about limitations. Terminal tools are great for exploration and lightweight pipelines, but not for:

Machine learning — You still need scikit-learn, TensorFlow, or PyTorch for model training
Multi-gigabyte datasets — For millions of rows, use DuckDB, Polars, or Spark
Interactive visualization — If you need zoomable, multi-panel charts, stick with matplotlib/plotly
Statistical modeling — Regression, clustering, and hypothesis testing need proper libraries

But for most everyday data tasks (filtering, sorting, joining, summarizing, converting, validating), terminal tools are faster to start, lighter, and more composable than firing up a full analysis environment.

Quick Comparison: Terminal vs. pandas

Task	pandas	Terminal (evtool)
Install size	Tens of MB (pandas, NumPy, a few other deps)	~2MB, zero deps
Stats summary	`df.describe()`	`csv-stats data.csv --all`
Filter rows	`df[df.col > 100]`	`csv-filter 'col > 100' data.csv`
Sort	`df.sort_values('col')`	`csv-sort -k col data.csv`
Chart	`df.plot()`	`csv-chart data.csv -k cat -v val`
Join	`pd.merge(a, b, on='key')`	`csv-join a.csv b.csv --on key`
CSV → JSON	`df.to_json()`	`csv-to-json data.csv`
Null check	`df.isnull().sum()`	`csv-stats data.csv --null-summary`
Docker-friendly	❌ Heavy	✅ 2MB
Air-gapped	❌ Needs network	✅ Works offline

Get Started in 10 Seconds

Install evolver-tools with pip — it works on any machine with Python 3.8+:

$ pip install evolver-tools

Then try the CSV tools on any file:

$ evtool csv-stats your-file.csv --all

Don't want to install anything? Try it in Google Colab, which runs in your browser with zero setup.

Or see the full list of 261 tools:

$ evtool list

One pip install. 261 tools. Zero dependencies.

From CSV analysis to network debugging to system monitoring — everything your terminal should have had, in one package.


pip install evolver-tools · 261 tools · 18 categories · pure Python stdlib