ETL vs ELT: See the Difference
Interactive Demo (Synthetic)
Click ETL or ELT to watch the pipeline order change. Same dataset, different flow.
Business Question
When data arrives from multiple systems with inconsistent formats, where should transformation happen to maximize reliability and speed (before storage vs in-warehouse)?
How this is used
This mini-simulator is used to explain pipeline tradeoffs to stakeholders (analytics, data engineering, and product) and to justify where validation + standardization should live.
Note: Dataset is synthetic for demonstration. No real customer or company data is shown.
Definitions
Extract = ingest raw records from source systems.
Transform = standardize fields (names, casing), parse types (revenue), validate constraints.
Load = write to warehouse tables for downstream querying.
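The Transform step defined above can be sketched in a few lines. This is an illustrative Python sketch, not the demo's actual implementation: the field names (`id`, `item`, `revenue`, `channel`) come from the sample table, and the item-name casing rule is deliberately naive.

```python
import re

def parse_revenue(raw: str) -> float:
    """Pull a numeric amount out of messy strings like
    'usd 29.9', '$29.90', or '19$'. Raises ValueError if no number is found."""
    match = re.search(r"\d+(?:\.\d+)?", raw)
    if match is None:
        raise ValueError(f"unparseable revenue: {raw!r}")
    return float(match.group())

def standardize_channel(raw: str) -> str:
    # Collapse casing variants (WEB / web / Web -> Web).
    return raw.strip().capitalize()

# One raw record from the Extract table, transformed in place.
row = {"id": 101, "item": "usb-c hub", "revenue": "usd 29.9", "channel": "WEB"}
clean = {
    "id": row["id"],
    "item": row["item"].strip().title(),       # naive casing; real mapping is lookup-based
    "revenue": parse_revenue(row["revenue"]),
    "channel": standardize_channel(row["channel"]),
}
```

In a real pipeline, item-name standardization would use a mapping table rather than string casing, since "keyboard mech" cannot be turned into "Mechanical Keyboard" by formatting alone.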
Extract
Ingest from sources (apps, logs, SaaS)
id  | item           | revenue   | channel
----|----------------|-----------|--------
101 | usb-c hub      | usd 29.9  | WEB
102 | USB C HUB      | $29.90    | web
103 | wireless mouse | 19$       | Retail
104 | Wireless Mouse | $19.00    | RETAIL
105 | desk mat - xl  | 12.5$     | partner
106 | Desk Mat XL    | USD 12.50 | Partner
107 | keyboard mech  | 79$       | WEB
Transform
Standardize names, formats, and types
id  | item                | revenue | channel
----|---------------------|---------|--------
101 | USB-C Hub           | $29.90  | Web
102 | USB-C Hub           | $29.90  | Web
103 | Wireless Mouse      | $19.00  | Retail
104 | Wireless Mouse      | $19.00  | Retail
105 | Desk Mat (XL)       | $12.50  | Partner
106 | Desk Mat (XL)       | $12.50  | Partner
107 | Mechanical Keyboard | $79.00  | Web
Load
Load curated tables
id  | item                | revenue | channel
----|---------------------|---------|--------
101 | USB-C Hub           | $29.90  | Web
102 | USB-C Hub           | $29.90  | Web
103 | Wireless Mouse      | $19.00  | Retail
104 | Wireless Mouse      | $19.00  | Retail
105 | Desk Mat (XL)       | $12.50  | Partner
106 | Desk Mat (XL)       | $12.50  | Partner
107 | Mechanical Keyboard | $79.00  | Web
Data Quality Snapshot (Raw vs Clean)
Raw
Metric             | Value   | Status
-------------------|---------|-------
Rows               | 7       | OK
Invalid revenue    | 0       | OK
Revenue sum        | $201.80 | OK
Channel variants   | 6 → 3   | OK
Item variants      | 7 → 4   | OK
Logical duplicates | 3       | CHECK

Clean
Metric             | Value   | Status
-------------------|---------|-------
Rows               | 7       | OK
Invalid revenue    | 0       | OK
Revenue sum        | $201.80 | OK
Channel variants   | 3 → 3   | OK
Item variants      | 4 → 4   | OK
Logical duplicates | 3       | CHECK
Professional note: in production, these checks typically live in dbt tests / Great Expectations or pipeline validation steps, and failures block deployment or alert owners.
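As a sketch of what such a validation step might compute, here is a minimal Python version of the snapshot metrics above, run against the cleaned sample rows. The duplicate definition (same item/revenue/channel under a different id) is an assumption inferred from the demo, not a standard rule.

```python
def quality_snapshot(rows):
    """Compute the demo's data-quality metrics over cleaned rows.
    Each row is a dict with id, item, revenue (float), and channel."""
    revenues = [r["revenue"] for r in rows]
    invalid = sum(1 for v in revenues if not isinstance(v, float) or v < 0)
    # "Logical duplicate": same (item, revenue, channel) under a different id.
    keys = [(r["item"], r["revenue"], r["channel"]) for r in rows]
    return {
        "rows": len(rows),
        "invalid_revenue": invalid,
        "revenue_sum": round(sum(revenues), 2),
        "channel_variants": len({r["channel"] for r in rows}),
        "item_variants": len({r["item"] for r in rows}),
        "logical_duplicates": len(keys) - len(set(keys)),
    }

rows = [
    {"id": 101, "item": "USB-C Hub", "revenue": 29.90, "channel": "Web"},
    {"id": 102, "item": "USB-C Hub", "revenue": 29.90, "channel": "Web"},
    {"id": 103, "item": "Wireless Mouse", "revenue": 19.00, "channel": "Retail"},
    {"id": 104, "item": "Wireless Mouse", "revenue": 19.00, "channel": "Retail"},
    {"id": 105, "item": "Desk Mat (XL)", "revenue": 12.50, "channel": "Partner"},
    {"id": 106, "item": "Desk Mat (XL)", "revenue": 12.50, "channel": "Partner"},
    {"id": 107, "item": "Mechanical Keyboard", "revenue": 79.00, "channel": "Web"},
]
snapshot = quality_snapshot(rows)
```

In dbt or Great Expectations these would be declared as tests on the model rather than computed inline, but the checks themselves are the same shape.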
Key takeaway: ETL transforms before storage (stronger upstream validation, consistent curated tables). Use when schemas are stable and you want strict guarantees for downstream consumers.
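For contrast, an ELT flow lands raw records first and runs the transformation inside the warehouse. A minimal sketch using SQLite as a stand-in warehouse (the table and view names are invented for illustration, and the string cleanup is deliberately simplistic and case-sensitive):

```python
import sqlite3

# ELT: land raw records untouched, then transform with SQL in the warehouse.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE raw_orders (id INT, item TEXT, revenue TEXT, channel TEXT)")
con.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
    [(101, "usb-c hub", "usd 29.9", "WEB"), (102, "USB C HUB", "$29.90", "web")],
)
# The transformation lives as a view over the raw landing table,
# so it can be changed and re-run without re-ingesting anything.
con.execute("""
    CREATE VIEW curated_orders AS
    SELECT
        id,
        UPPER(SUBSTR(channel, 1, 1)) || LOWER(SUBSTR(channel, 2)) AS channel,
        CAST(REPLACE(REPLACE(REPLACE(revenue, 'usd', ''), '$', ''), ' ', '')
             AS REAL) AS revenue
    FROM raw_orders
""")
print(con.execute("SELECT id, channel, revenue FROM curated_orders ORDER BY id").fetchall())
```

The tradeoff versus ETL: raw history is always available for re-transformation, but bad records reach storage before any validation runs.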
SQL Efficiency Playground
Engine: Warehouse
Business question: Which customer segments drove revenue over the last 90 days?
Recruiter takeaway
I optimize analytics queries by pruning partitions, pushing predicates down, joining on stable keys, and leveraging rollups—then validate the impact using execution plans and cost signals.
Estimated runtime: 1270 ms
Estimated scan: 120.0M rows
Estimated CPU: 65%
Estimated spill: 3.5 GB
Efficiency: 0/6 toggles
Toggle techniques ON/OFF to update SQL and the execution summary.
Optimization toggles
Summary
Toggles change the likely execution strategy.
Scan: Broader time scan
Filter: Filter later
Join: String join
Layout: Default layout
Aggregation: Raw aggregation
Optimizer: Stats stale
Estimates are illustrative cost proxies, not benchmarks.
After
Demo query pattern (illustrative)
SQL
-- Optimized (illustrative pattern; adjust to your warehouse/db)
-- Reduce scan, filter early, join on keys, and optionally query a summary model.

WITH scoped_orders AS (
  SELECT
    DATE_TRUNC('day', o.event_time) AS event_date,
    o.customer_email,
    o.amount
  FROM orders o
  WHERE o.event_time >= NOW() - INTERVAL '90 days'
)
SELECT
  s.event_date AS day,
  c.segment,
  SUM(s.amount) AS revenue
FROM scoped_orders s
JOIN customers c
  ON s.customer_email = c.email
GROUP BY 1, 2
ORDER BY 1 DESC;
Before
Demo query pattern (illustrative)
SQL
-- Naive (anti-patterns)
SELECT
  DATE_TRUNC('day', o.event_time) AS day,
  c.segment,
  SUM(o.amount) AS revenue
FROM orders o
JOIN customers c
  ON LOWER(o.customer_email) = LOWER(c.email)
WHERE DATE_TRUNC('day', o.event_time) >= DATE_TRUNC('day', NOW()) - INTERVAL '90 days'
GROUP BY 1, 2
ORDER BY 1 DESC;
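One way to sanity-check a rewrite like this is to run both shapes against the same synthetic data and confirm identical results. A sketch using SQLite as a stand-in warehouse (dates are fixed strings and `substr` replaces `DATE_TRUNC`, so this checks semantics only, not cost):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (event_time TEXT, customer_email TEXT, amount REAL);
    CREATE TABLE customers (email TEXT, segment TEXT);
    INSERT INTO orders VALUES
        ('2026-01-10', 'a@x.com', 30.0),
        ('2026-01-10', 'b@x.com', 20.0),
        ('2025-01-01', 'a@x.com', 99.0);  -- outside the 90-day window
    INSERT INTO customers VALUES ('a@x.com', 'pro'), ('b@x.com', 'free');
""")
cutoff = "2025-12-01"

# Naive shape: function-wrapped predicate, case-folding string join.
naive = con.execute("""
    SELECT substr(o.event_time, 1, 10) AS day, c.segment, SUM(o.amount)
    FROM orders o JOIN customers c ON LOWER(o.customer_email) = LOWER(c.email)
    WHERE substr(o.event_time, 1, 10) >= ?
    GROUP BY 1, 2 ORDER BY 1 DESC, 2
""", (cutoff,)).fetchall()

# Optimized shape: bare predicate applied before the join, plain key equality.
optimized = con.execute("""
    WITH scoped AS (
        SELECT substr(event_time, 1, 10) AS day, customer_email, amount
        FROM orders WHERE event_time >= ?
    )
    SELECT s.day, c.segment, SUM(s.amount)
    FROM scoped s JOIN customers c ON s.customer_email = c.email
    GROUP BY 1, 2 ORDER BY 1 DESC, 2
""", (cutoff,)).fetchall()

assert naive == optimized  # same answer, cheaper plan on a real warehouse
```

On a real warehouse the two differ in partition pruning and join strategy; here the point is only that the rewrite preserves results.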
Build a KPI — Guided Wizard
KPI Builder
Define a KPI like a real analytics team: metric contract, slicing guardrails, targets, and export-ready spec.
Demo preview
Status: At risk
Step 1: KPI
Define, validate, and export a KPI spec teams can trust.
Outcome + unit (rate, time, $, users).
Stakeholder framing: Exec · Product · Growth · Ops (controls the “so what” narrative).
Contract
Activation Rate for Product in Product Analytics. Goal: Understand weekly activation performance and identify drivers of lift.
Trend — Activation Rate (Weekly)
Hover to inspect points · Area shows trajectory
[Trend chart: weekly activation rate, roughly 24.6%–29.8% over recent weeks]
KPI Overview
Demo values (simulated).
Current
At risk
25.6%
+0.7% vs previous
Target 35% · Watch 28%
Metric
(COUNT_DISTINCT(activated_users)) / (COUNT_DISTINCT(new_users))
Direction: Higher is better
Owner: Data / Analytics
Refresh: Weekly
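The metric contract above can be expressed as a small function. A sketch, assuming user-level deduplication and that activated users only count when they also appear in the new-user window (that intersection is an assumption, not stated in the contract):

```python
def activation_rate(activated_users, new_users):
    """KPI contract: COUNT_DISTINCT(activated_users) / COUNT_DISTINCT(new_users).
    Inputs are iterables of user ids; duplicates are collapsed."""
    new = set(new_users)
    if not new:
        return 0.0  # avoid division by zero in an empty window
    return len(set(activated_users) & new) / len(new)
```

Pinning the metric to one function (or one dbt model) is what keeps the "attribution logic stable" note in the ops section enforceable.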
Data integrity
Demo (Simulated)
Dims: Channel, Region.
Filters: Country = US • Status = active.
Ops notes: Metric defined at user-level. Exclude internal traffic. Keep attribution logic stable.
Alerts
Rule-based
If Activation Rate crosses 22% (risk threshold), trigger owner workflow.
Owner: Data / Analytics
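The alert rule and status bands can be sketched as a simple threshold check. The 22% / 28% / 35% values come from the demo card; the labels for the bands above "At risk" are assumptions for illustration.

```python
# Thresholds from the demo's KPI card and alert rule.
RISK = 0.22    # crossing this triggers the owner workflow
WATCH = 0.28
TARGET = 0.35

def kpi_status(rate: float) -> str:
    """Map a weekly activation rate to a status band."""
    if rate < RISK:
        return "Alert"      # risk threshold crossed
    if rate < WATCH:
        return "At risk"    # matches the demo's 25.6% reading
    if rate < TARGET:
        return "Watch"      # assumed label for this band
    return "On track"       # assumed label at/above target
```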
© 2026 Shayan Bhatti. All rights reserved.
Data Analyst | Business Intelligence | Machine Learning