Tech

Structuring a Data Pipeline: Physical Environment Separation and Layer Separation
A well-architected data pipeline is not just about writing good transformation logic…

Z-ORDER vs Liquid Clustering: Why You Should Switch
Z-ORDER was a huge step forward for Delta Lake query performance —…

Reading Cloud Files in Spark: Directory Listing vs SQS
When building data pipelines on cloud platforms like AWS (S3), GCP (GCS),…

Why broadcast() doesn’t work inside Delta Lake’s merge() — and what to do instead
A common gotcha when optimizing Delta merge performance: the broadcast hint you…
