Data Berlin #37
May tried to cancel work; we put Data Berlin on the calendar anyway.
May in Berlin has a personality: public holidays stack up, half the company is mysteriously “OOO,” and by Tuesday you can already hear “let’s sync next week.” Maybe it’s the right time to get things done on your own (you and your AI, of course).
Huge thanks to Adsquare for hosting us on the 28th of May. You will hear two talks from their side of the house and a third from Snap on what it takes to measure campaigns when the story is not just clicks in a dashboard, but stores, streets, and stubborn offline reality. Think audience data at scale, impact evaluation, and experiments that respect the physical world. Plus the usual food, drinks, and nice people.
If you like your evenings like your coffee (without LinkedIn crap), this crowd is for you. Sneak us onto the calendar between the random days off and we will make it worth the bike ride.
The Data Berlin official job board now lists 500+ open roles in data and AI across Berlin. If you are hiring in this space and you are not on databerlin.net, are you sure you are actually reaching the people you want?
And if you are looking for a new job, why are you still here?
📖Interesting Readings
Snowflake-side “skills” are basically small, shareable instruction bundles with warehouse-aware defaults, closer to a packaging ecosystem than a magic agent, and if you live in dbt, semantic YAML, and governance, watch the QA gap: bad packs can fail quietly while bad SQL usually fails loud. – Cortex Code skills, 90 days in: plugins, not prompts
Cheap drafting exploded internal doc volume, but reading and judgment did not get cheaper, so teams drown in fluff and rework, and for data work that is mostly norms: shorter artifacts, real editing before publish, and fewer “AI-shaped” specs nobody will defend. – How AI Changed Our Work Without Asking
The “tokens are basically free” era is ending; agent loops spend money the way forgotten big warehouses do, and Berlin-style platform teams should care because the fix is routing, governed metadata, trace visibility, and keeping core logic out of giant prompts. – The End of the AI Subsidy
Fast agent output plus heavy human oversight is a recipe for mental fatigue, especially when you are still on the hook for outcomes, and the practical takeaway for builders is to slow the parts that matter (architecture, big diffs) instead of treating merge speed as understanding. – The Cognitive Overload of AI Development
Sick of the same hype carousel? One post is a human-researched hit of ten charts on war optics, crude in crises, arms and chip money loops, layoffs, and US health rot; the other drags the stack into a Sydney gallery where artists make hidden labeling wages, insurer automation, consent fights, and data-centre thirst hard to ignore. – Data Dump: Iran War, Oil Prices, the AI Bubble, and Health · Artificial Intelligence on Exhibit at an Art Gallery Near You
If you are curious about what happens behind the scenes of databerlin.net, a self-led dialogue traces long-tail ATS ingestion with dlt into MotherDuck, free-tier Actions-to-Astro publishing, and pragmatic filters with room to grow. – Building a Data Job Board for Berlin: An Interview at the Mirror
🚀 Shipping Now
AWS Agent Toolkit rolls out 40+ curated agent skills, a managed IAM-guarded MCP server, and three installable plugins so coding agents ship AWS work with fewer token-wasting guesses.
Amazon SageMaker adds an agentic model-customization flow (skills + IDE plugins) that walks fine-tuning, eval, and deployment with reusable code hooks for Bedrock or SageMaker endpoints.
GPT‑5.5 Instant replaces ChatGPT’s default free-tier model with tighter answers, stronger factuality on sensitive domains, and clearer memory-source controls for personalization.
Realtime voice models ship GPT‑Realtime‑2 (reasoning-grade speech), GPT‑Realtime‑Translate (70+ input languages), and GPT‑Realtime‑Whisper (streaming STT) on the Realtime API.
Higher Claude limits double Claude Code five-hour caps on paid tiers, drop peak-hour throttling for Pro/Max Code users, and raise Opus API rate limits alongside a big SpaceX compute deal.
Lakeflow Pipelines Editor is GA with an agent-style editing loop (Genie Code) for building and debugging production pipelines next to graphs and metrics.
Databricks Runtime 18.2 is generally available alongside %uv pip in serverless notebooks for faster Python installs on current environments.
dbt platform May updates include Fusion release-track rollout, native private packages GA, Fusion + Snowflake connection GA, and a Developer agent preview for NL-assisted models, tests, docs, and semantic YAML.
dbt Core 1.11.9 patches manifest handling for static_analysis: off and fixes state:modified detection when UDF YAML properties change.
☕ Upcoming Meetups & Events
May 26 — Women Techmakers Berlin · AI & tech career with SPICED 🐍
May 27 — AI Agent Builders Berlin · powered by Dataiku 🛰️
May 28 — Tech Europe · Applied AI Conf Berlin 🤖
May 28 — Data Berlin · May We Talk About Data? 🐻
💼 This Week’s Job Picks
📊 Data Analyst
Product Data Analyst – FREE NOW → Apply
Data Analyst Global Finance Technology & Data – BMG → Apply
Full Stack Engineer - Analytics (f/m/d) – Contentful → Apply
And more here.
📈 Analytics Engineer
Senior Analytics Engineer – SumUp → Apply
Senior Analytics Engineer - Growth (m/f/d) – 1KOMMA5° → Apply
Senior Analytics Engineer (m/f/d) – Smoobu → Apply
And more here.
⚙️ Data Engineer
Senior Data Engineer – Risk & Compliance (CNP Protect) – SumUp → Apply
Junior Backend Engineer – Data Marketplace – Taktile → Apply
Senior Data Engineer - Internal Platform – Parloa → Apply
And more here.
🤖 AI / ML / LLM + Data Scientist
Senior Platform Engineer (Agentic Experience) (f/m/d) – Moss → Apply
Principal Applied Scientist – Parloa → Apply
Staff Data Scientist, Selection Analytics – Wolt → Apply
And more AI/ML roles here and Data Scientist roles here.
If anything in here clicked, give the post a like, leave a comment if you disagree with us (or agree loudly), and share it with someone who still thinks “offline measurement” is a fairy tale.
We know May is a game of calendar Tetris. If you are around, just join us and Adsquare on the 28th of May. We would rather see you in person than wonder if you only skimmed this between holidays.
— Data Berlin team








