#dataengineering
Building a Football Analytics Pipeline: Patterns, Tradeoffs, and What Production Would Look Like
ayoabass777 · Apr 12 · 9 min read · 2 reactions
#dataengineering #python #aws #dbt

How Hard Is It to Add an Index to an Open Format? Lessons from the Apache Iceberg Community
Mingyu Chen · Apr 12 · 16 min read
#architecture #database #dataengineering #opensource

The Data Engineering Lifecycle
Mirina-Gonzales · Apr 11 · 3 min read
#architecture #data #dataengineering

You Don't Need to Write Data Tests
Blaine Elliott · Apr 11 · 8 min read
#dataengineering #dataquality

Data Anomaly Detection: The Complete Guide for Data Engineers
Blaine Elliott · Apr 11 · 7 min read
#dataengineering #dataquality

El Ciclo de Vida de la Ingeniería de Datos
Mirina-Gonzales · Apr 11 · 4 min read
#dataengineering #data #architecture

Why Shannon Entropy Catches What Schema Validation Misses
Anthony Johnson II · Apr 11 · 5 min read
#dataquality #databricks #dataengineering #opensource

dbt snapshots: moving from merges to native history
Philip Hern · Apr 10 · 5 min read · 1 reaction
#dbt #dataengineering #snowflake #snapshots

PySpark to Pandas/scikit-learn: A Practical Migration Guide for Data Engineers Learning ML
Nyson Markus · Apr 10 · 7 min read
#dataengineering #datascience #machinelearning #python

Apache Parquet File Anatomy: Row Groups, Column Chunks, Pages, and Metadata Explained 🧱📦
Kumaravelu Saraboji Mahalingam · Apr 10 · 8 min read
#dataengineering #apacheparquet #iceberg #analytics

ETL vs ELT: Which One Should You Use and Why?
Lawrence Murithi · Apr 11 · 7 min read
#architecture #data #database #dataengineering

🚀 DB Explorer 3.0.1 — The AI‑First SQL Editor You’ll Want to Try
Ashish Srivastava · Apr 10 · 1 min read
#sql #database #postgres #dataengineering

My first data pipeline
Ajay M · Apr 10 · 1 min read
#showdev #beginners #dataengineering #sideprojects

ETL vs ELT: Which One Should You Use and Why?
John Wakaba · Apr 10 · 6 min read · 1 reaction
#architecture #beginners #data #dataengineering

Entity Resolution at Scale: Matching Products Across Amazon, Reddit, and RTINGS
Daniel Rozin · Apr 10 · 4 min read
#ai #webdev #dataengineering #tutorial