Data Berlin #27

Longer days, fresh opportunities and new things to read, build and explore in the Berlin data scene.

Mar 02, 2026

If your browser has 27 open tabs, three half-read articles, a job post you keep “saving for later”, and at least one dataset you swear you’ll explore this weekend, this newsletter is for you.

We collected the best data jobs, upcoming meetups, hands-on events, and fresh reads from the Berlin data scene so you can spend less time searching and more time actually building, learning, and meeting people.

Let’s get into it 🚀

📖 Interesting Readings

Upgrading from DuckDB 1.0 to 1.5 is worth it: benchmarks show clear speedups across common analytics workloads, with earlier regressions fixed in 1.5, so teams on older versions are leaving performance on the table. – How Much Faster is DuckDB 1.5 vs 1.0? A lot

Joe Reis’s follow-up to his “2028 Data Reckoning” piece: models are doing production-quality work now, procedure-following roles are at risk, and the moat is judgment and business impact, not tool fluency. – The Reckoning Is Already Here

Spark 4.1’s real-time mode brings millisecond-level streaming to Structured Streaming with a small code change, so you can stay on Spark instead of moving to Flink. – 7 Minutes to Understand the New Spark Streaming Feature that Changes Everything

Anusha Kovi built one pipeline four different ways (dbt, Airflow, Spark, etc.) and shares what she’d avoid next time. – I Built the Same Data Pipeline 4 Ways. Here's What I'd Never Do Again

Text-to-SQL works, but chat-BI still fails on ambiguous metrics, out-of-scope questions, and common sense. The fix: gold-answer evals and clear metric definitions. – SQL Is Solved. Here's Where Chat-BI Still Breaks

☕ Upcoming Meetups & Events

🧠 Hackathon – Data Berlin

Ready to build instead of just bookmarking tutorials? This community hackathon by Data Berlin is a full day of hands-on data work. Form a team, pick an idea, and turn raw datasets into something real: a pipeline, a dashboard, an experiment, or a beautifully over-engineered MVP.

Expect a mix of hacking, learning, and spontaneous knowledge-sharing, whether you come with a project in mind or just curiosity. Perfect if you want to try new tools, meet the Berlin data crowd, and leave with fresh skills (and probably new friends).

Hosted by Snowflake and sponsored by Collate and Lightdash, with support and a hack challenge from our friends at JustWatch.

→ Register here

March 10 — Hands-On Data Engineering: From Zero to Billion-Row Analytics — Berlin DataTalks Club ⚙️

March 10 — March Members Talk Evening — PyLadies Berlin 🧾

March 17 — Data Berlin — From Bits to Blossoms🐻

March 18 — PyBerlin 59 — March event🗣️

March 23 — Data Berlin presents: Berlin Analytics & AI Hackathon👾