All Articles (12) ✨

I often come across articles and resources that deeply resonate with my work or challenge my perspectives. This section is a collection of the most interesting articles I’ve read.

The Builder: Outliving the Modern Data Stack

By Adrian Brudaru

But code is the only interface that scales with AI. If you want to leverage the acceleration of LLMs, your stack must be defined in code.

Let's normalize removing complexity

By Juan Luis Cano Rodríguez

Brilliant as always. Short and to the point.

The Law of Leaky Abstractions

By Joel Spolsky

Code generation tools which pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don’t save us time learning. And all this means that paradoxically, even as we have higher and higher level programming tools with better and better abstractions, becoming a proficient programmer is getting harder and harder.

Is DuckLake a Step Backward?

By Alireza Sadeghi

- Solid historical recap.
- Thoughtful technical reasoning that weighs the pros and cons.
- No strong bias.

How we raised a trilingual child by accident

By Mehdi Ouazza

Training a kid's ear before it's too late without being a native speaker

Stop Pretending Your Data Catalog Is Working

By Francesco Mucio

If you want to solve your self-service analytics problem, you need a data catalog. Now you have a data catalog problem and you forgot about the self-service analytics.

Why Semantic Layers Matter — and How to Build One with DuckDB

By Simon Späti

Many ask themselves, "Why would I use a semantic layer? What is it anyway?" In this hands-on guide, we’ll build the simplest possible semantic layer using just a YAML file and a Python script—not as the goal itself, but as a way to understand the value of semantic layers. We’ll then query 20 million NYC taxi records with consistent business metrics executed using DuckDB and Ibis. By the end, you’ll know exactly when a semantic layer solves real problems and when it’s overkill.

The Postmodern Data Stack & Action-Oriented Architecture

By Joe Reis

The old data stack was built for hindsight. This one is built for foresight, action, and even autonomy. Welcome to the Postmodern Data Stack.

AI Denialists are the Data Industry’s Flat Earthers

By Joe Reis

The article advocates for embracing AI in the data industry to drastically improve delivery speed, efficiency, and project success rates. It challenges outdated practices and resistance to change, highlighting how AI can accelerate tasks like data modeling, coding, and stakeholder alignment.

The Data Engineer's Guide to Efficient Log Parsing with DuckDB/MotherDuck

By Simon Späti

As data engineers, we spend countless hours combing through logs - tracking pipeline states, monitoring Spark cluster performance, reviewing SQL queries, investigating errors, and validating data quality. These logs are the lifeblood of our data platforms, but parsing and analyzing them efficiently remains a persistent challenge. This comprehensive guide explores why data stacks are fundamentally built on logs and why skilled log analysis is critical for the data engineer's success.

A Second Look at the Cathedral and Bazaar

By Nikolai Bezroukov

This paper provides an overview of the weaknesses of Eric Raymond's paper The Cathedral and the Bazaar, as well as a more coherent demonstration that the bazaar metaphor is internally contradictory.

The Analytics Development Lifecycle (ADLC)

By Tristan Handy

The founder of dbt is trying to establish the foundation for a general framework for a mature analytics workflow.