Forem

# dataengineering

Posts

Building a Football Analytics Pipeline: Patterns, Tradeoffs, and What Production Would Look Like

2 comments
9 min read
How Hard Is It to Add an Index to an Open Format? Lessons from the Apache Iceberg Community

Comments
16 min read
The Data Engineering Lifecycle

Comments
3 min read
You Don't Need to Write Data Tests

Comments
8 min read
Data Anomaly Detection: The Complete Guide for Data Engineers

Comments
7 min read
El Ciclo de Vida de la Ingeniería de Datos

Comments
4 min read
Why Shannon Entropy Catches What Schema Validation Misses

Comments
5 min read
dbt snapshots: moving from merges to native history

1 comment
5 min read
PySpark to Pandas/scikit-learn: A Practical Migration Guide for Data Engineers Learning ML

Comments
7 min read
Apache Parquet File Anatomy: Row Groups, Column Chunks, Pages, and Metadata Explained 🧱📦

Comments
8 min read
ETL vs ELT: Which One Should You Use and Why?

Comments
7 min read
🚀 DB Explorer 3.0.1 — The AI‑First SQL Editor You’ll Want to Try

Comments
1 min read
My first data pipeline

Comments
1 min read
ETL vs ELT: Which One Should You Use and Why?

1 comment
6 min read
Entity Resolution at Scale: Matching Products Across Amazon, Reddit, and RTINGS

Comments
4 min read