Lots of Machine Learning Libraries to assess image quality, produce explanations for your models, or forecast & classify time series.

Data will power every piece of our existence in the near future. I collect “Data Points” to help understand & shape this future.

If you want to support this, please share it on Twitter, LinkedIn, or Facebook.

(1) 🚀 Image Quality Assessment Implementation

The German price comparison website idealo.de provides an implementation of some interesting applied Google research from 2018 called “NIMA: Neural Image Assessment”. The team open-sourced the two neural networks the paper describes. The first network rates the aesthetic quality of an image, while the second estimates its technical quality.

So basically, these two networks help you determine how pretty…


Why functional data engineering is the right approach to batch ETL, how machine learning can use a functional approach as well, and how to build evolutionary data architectures.

🎄 (1) Functional Data Engineering

Two years ago, Maxime Beauchemin, the creator of both Apache Airflow and Superset, published an article about why the functional paradigm is as important in data engineering as it is in software engineering. I very much agree, and I feel this idea is still not completely absorbed by the community. …


What the future of BI looks like, how to generate proper unique keys in SQL, and a final look at how to build data platforms.

🔥 (1) The Future of BI is Open Source

Maxime Beauchemin, the creator of both Apache Airflow and Superset, just published a great piece about why the future of business intelligence is open source. I totally agree with him and still find it mind-boggling that open source is just now catching up to this. …


Data teams today strive to build platforms, X-as-a-Service. But how? And how do you measure their success?

This week I got to think a lot about data platforms and X-as-a-Service provided by data teams.

(1) 🎁 (Data) Team Topologies

I recently enjoyed both the book “Team Topologies” and Pedro Mota's post about applying the same concepts to data teams. …


How Meltano works, how to increase the frequency of your data updates, and what the benefits of EL(T) are.

This week I got to think a lot about ELT, ETL, LET, TEL, L….

(1) 🎁 From ETL to EL (T)

Few things in data are as clear as the fact that ETL is an outdated practice. That’s what it is, a practice. All “ETL” tools can actually also do “EL (T)”, but some tools can only do the “T” and thus enforce the EL (T) pattern.
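
To make the EL (T) pattern concrete, here is a minimal sketch, assuming an in-memory SQLite database stands in for the warehouse. The table names, columns, and rows are hypothetical and not taken from any particular tool: the raw data is loaded untouched first, and the transformation then runs as SQL inside the database.

```python
import sqlite3

# Hypothetical extracted rows; amounts arrive as untyped strings.
raw_rows = [
    ("2021-01-01", "19.99"),
    ("2021-01-02", "5.00"),
    ("2021-01-02", "7.50"),
]

conn = sqlite3.connect(":memory:")

# E + L: load the data as-is, without cleaning it on the way in.
conn.execute("CREATE TABLE raw_orders (order_date TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", raw_rows)

# (T): transform afterwards, inside the database, with plain SQL.
conn.execute(
    """
    CREATE TABLE daily_revenue AS
    SELECT order_date, SUM(CAST(amount AS REAL)) AS revenue
    FROM raw_orders
    GROUP BY order_date
    """
)

daily = conn.execute(
    "SELECT order_date, revenue FROM daily_revenue ORDER BY order_date"
).fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0]
```

Because the raw table stays in place, the transformation can be changed and rerun later without extracting the data again.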


The business value of data products is often miscalculated. Learn these two rules to calculate it correctly.

WSJF and the problematic Value for data products. Image by the author.

For me, a product manager, weighted shortest job first (WSJF), and the cost of delay it builds on, changed my perspective on understanding value. That’s what we want to do as product managers: maximize value. And the essential ingredient in that formula is the business value of a task or job.
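
As a rough sketch of that formula: WSJF divides the cost of delay by the job's duration and schedules the highest score first. The job names and numbers below are made up for illustration, and the cost of delay is treated as a single estimate in which business value is one input.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cost_of_delay: float  # estimated value lost per unit of time the job waits
    duration: float       # estimated time (or size) needed to finish the job


def wsjf(job: Job) -> float:
    """Weighted shortest job first score: cost of delay per unit of duration."""
    return job.cost_of_delay / job.duration


# Hypothetical backlog of data work.
jobs = [
    Job("dashboard", cost_of_delay=8, duration=4),
    Job("churn model", cost_of_delay=20, duration=5),
    Job("data fix", cost_of_delay=3, duration=1),
]

# Highest WSJF first.
ranked = sorted(jobs, key=wsjf, reverse=True)
ranked_names = [job.name for job in ranked]
```

If the business value is misjudged, the cost of delay and with it the whole ranking shifts, which is why the estimate matters so much.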

For data products, data-heavy products, machine learning solutions, business intelligence systems, in short everything that has data at its core, I often misjudge the business value by a lot.

It’s not because you have to go through a thorough calculation, it’s because a data product’s business value…


Enter data mesh learning mode, meet a new data tool category, and learn why you need data SLAs.

This week I stumbled through a series of things: the Data Mesh Learning Slack community, a primer on Reverse ETL, and Data Product SLAs.

1 Data Mesh Learning Slack Community

I stumbled over the Data Mesh Learning Slack community, which is an excellent place to learn about data meshes and ask questions of people already deeply involved in building these things. Scott Hirleman et al. did a great job of…


The 101 of data meshes, testing data with auto profiling, and developing data pipelines in a test-driven way.

Created by the author with pitch.com. The image has nothing to do with the newsletter, it just came to my mind while observing the late covid-19 statistics in Germany, and it’s a data point.

This week I stumbled over a couple of important topics I mentioned before: data meshes, test-driven development for data teams, and testing data with great-expectations.

1 TDD for Data People

I was about to write an article myself about a test-driven workflow for data people’s “daily bread”. But lo and behold, Marcos Marx was faster. …


Three tips on data product management: how data errors cascade down, why one-piece workflows are much better than context switching, and why you shouldn’t fix all data bugs.

As you might’ve noticed from the title, I got to think about data product management this week. In particular, I’d like to share my favorite resources on three topics, because I think product management is so important for data teams.

1 Data Errors Cascade, Watch Your Data or Your Data Project Dies

I recently picked up a great paper published by Google, found via the Data Engineering Weekly newsletter, about data, quality, and the impact on…


Version Data Lakes, Declarative DAGs and shared SQL stuff with SQLPad.

The three data points for today are next-gen data lakes with lakeFS, declarative DAGs with boundary-layer, and fast data engineer onboarding with SQLPad.

1 lakeFS, Versioning and Branching Data

lakeFS is a tool that provides a layer on top of your AWS S3 or GCS data lake. It allows automatic versioning and branching of your data. The team provides lots of best practices, e.g. showing how to set up a data…

Sven Balnojan

Ph.D., Product Manager, DevOps & Data enthusiast, and author of “Three Data Point Thursday”: https://www.getrevue.co/profile/svenbalnojan.
