Dashboards, Graphs, Reports, Spreadsheets, OLAP Cubes, or direct SQL Access?

The six BI artifacts. The usual company goes through a journey of spreadsheets -> Reports & Dashboards -> “somewhere else…”.

Artifact, from the two Latin words arte, “by skill”, and factum, “something made”. Something skillfully created on purpose. A great word to describe something created by a development process.

In the discipline of Business Intelligence, the collection of “technologies & processes” a company uses to systematically analyze data, I find that six artifacts are the cornerstone of most processes.

In most articles on data architectures, some of them are missing, or at least the appropriate tooling is, maybe intentionally, maybe not (see for instance the great article at a16z, which mostly focuses on reports, dashboards & ad hoc access). …


Understanding DataOps Testing, doing it with dbt, and how central initiatives are key to data.

Data will power every piece of our existence in the near future. I collect “Data Points” to help understand this near future.

If you want to support this, please share it on Twitter, LinkedIn, or Facebook.

Here are your weekly three data points: DataOps Testing, Airbnb’s Quality Initiative, Testing with dbt.

1 DataOps, Value and Innovation Pipelines, DataKitchen

In DataOps, companies aim for one error or fewer per year. To an average data guy that might sound crazy! …


Opinion

On Redshifts, Data Catalogs, Query Engines like Presto, and the trouble machine learning engineers have getting their data.

Image by the author. The author, confused between lots of different data mesh architectures.

Data Meshes are the hot & trending topic in data & analytics departments. They have been implemented at big companies like Zalando and moved from the “Assess” to the “Trial” status of the ThoughtWorks Technology Radar within just one year. Yet the results I’ve seen are not overly impressive.

Quite a few articles raising concerns have appeared throughout the past year, and I, at least, have received quite a few questions & seen quite a bit of confusion about the topic after publishing my first article about data meshes.

Most of the concerns & confusion seem to share one idea: that there is just one kind of data mesh, which closely resembles the ones described by the BMW Group and Zalando. …


Refactoring, Working effectively with Legacy Code, and Test-Driven Development for Data Guys.

…on software engineering.

Hi, I’m Sven. I think data will power every piece of our existence in the near future. I collect “Data Points” to help understand this near future.

Here are your weekly three data points: Refactoring, Working effectively with Legacy Code, and Test-Driven Development.

Why three software engineering books for data guys? Because I believe every data team should be treated as an agile development team.

1 Refactoring by Martin Fowler

Whenever I take a look at SQL statements, they remind me of this…

(Photo by Taylor Brandon, Unsplash)

… these organically grown buildings you find in slums. They do their job. But no one would build them this way from scratch because you cannot find anything here unless you’re one of the residents! …


Hi, I’m Sven Balnojan. I think data will power every piece of our existence in the near future. I collect “Data Points” to help understand this near future.

Here are your weekly three data points: Simple Data Discovery, Modern Data Architectures & Data Meshes.

1 Whale, a Dead Simple Data Discovery Tool

“What does this column mean? Where can I find the order data?” Questions that bug every data engineer, every machine learner, and every analyst every day.

There are a lot of powerhouse tools to solve these problems, like DataHub, Alation, etc. But none of these allow a company to start small. …


Office Hours

There are four ways to decentralize and structure data teams. Learn how to choose the right one.

(The four typical data team organization forms. Image by the author.)

Introducing Data Organizations

Data organizations within companies look like snowflakes. From close up, they are all unique, but if you step back, they all kind of look alike. They all deal with data and are usually organized around some data or analytics department.

That makes it hard to make organizational changes because it’s really hard to see the overarching picture. I’d like to propose a simple viewpoint that might make this easier.

I think these snowflakes come in four buckets. And really only one feature distinguishes them: where in your data flow you make the cut and go from a central unit working on the data to multiple decentralized ones embedded into other units. …


The three organization forms Uber, LinkedIn, and Airbnb explored to integrate machine learning into the organization

Functional, central, or hybrid team org? Image by the author.

Uber, LinkedIn, and Airbnb are all massive machine learning success stories, not because they produce cool research or have lots of talent, but because they actually manage to turn data into money, and lots of it.

All three companies spent years exploring different ways of organizing machine learning work.

I’d like to share the key insights that I took away from their efforts, so you don’t have to spend years exploring different models.

Three Ways to Organize Machine Learning Work

There are really just three ways of organizing machine learning work, and all three have strengths & weaknesses that can either put your machine learning engineers out of work or make them the most valuable asset you have. …


Photo by Fernando Hernandez on Unsplash. Let’s update our productivity instead of updating our software packages again and again…

“Ever thought of updating yourself instead of updating your mobile” — Yash Gupta

Whenever I clone a new public repository and start the setup process, I think about the concept of the ./go script. I now start every new public repository I release with a “./go” script, which makes the setup process for everyone as simple as typing “./go”. Recently, however, I started to think that a ./get-up-to-date script should equally be part of every repository!

A ./go script is an amazing practice, but for a tech team to get into it, well, it needs practice! It means using ./go scripts in lots of repositories. …
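To make the idea a bit more concrete, here is a minimal sketch of what a ./go-style entry point with a "get-up-to-date" task could look like, written in Python. The task names, the commands behind them, and the choice to list tasks when called without an argument are all assumptions made for illustration; they are not taken from any particular repository, and a real ./go script might well run the setup directly.

```python
#!/usr/bin/env python3
"""Hypothetical ./go-style entry point: one script, a few named tasks."""
import subprocess
import sys

# Task names and commands are illustrative assumptions, not a real project's setup.
TASKS = {
    "setup": [
        [sys.executable, "-m", "venv", ".venv"],
    ],
    "get-up-to-date": [
        ["git", "pull", "--rebase"],
        [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
    ],
}


def plan(task):
    """Return the list of commands for a task, or raise on an unknown name."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; choose from {sorted(TASKS)}")
    return TASKS[task]


def run(task, dry_run=False):
    """Print each command of a task and, unless dry_run is set, execute it.

    Execution stops at the first failing command because check=True raises.
    """
    for command in plan(task):
        print("+", " ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    if len(sys.argv) < 2:
        # Without an argument, only list the available tasks (an assumption of this sketch).
        print("usage: ./go <task> [--dry-run]; tasks:", ", ".join(sorted(TASKS)))
    else:
        run(sys.argv[1], dry_run="--dry-run" in sys.argv[2:])
```

Saved as an executable file named go, this would be called as ./go setup or ./go get-up-to-date, and adding --dry-run prints the commands without running them.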


Understanding decentralization will help you understand, evaluate & adapt to current technology trends.

Photo credit: Nhia Moua

“Decentralization is based on the simple notion that it is easier to macrobull***t than microbull***t. Decentralization reduces large structural asymmetries.” ― Nassim Nicholas Taleb, Skin in the Game: Hidden Asymmetries in Daily Life

The human body is an amazing system. Optimized over a couple of million years, it seems to work pretty well. Scholar and statistician Nassim Taleb is a huge fan of nature, and as I recently reread some of his work, what stuck with me is the level of decentralization that nature built into our bodies.

We have two kidneys, and if one fails, things will be fine. I tore one of the ligaments in my foot, and luckily nature provided me with three, so I can go on as if nothing happened. …


“Leveraging exponential technology to tackle big goals and using rapid iteration and fast feedback to accelerate progress toward those goals is about innovation at warp speed. But if entrepreneurs can’t upgrade their psychology to keep pace with this technology, then they have little chance of winning this race.”

Peter H. Diamandis, Bold: How to Go Big, Create Wealth and Impact the World

Photo by Franki Chamaki on Unsplash.

Introduction

It’s still Day 1 for data. Companies, governments, and non-profits around the world are already extracting a whole lot of value from data. But compared to the things that are coming, today is really just Day 1. Data is growing exponentially, as is our ability to extract knowledge from it. If you imagine the amount of data available to us today as an apple, then by 2030, this apple will have turned into a soccer ball. …

About

Sven Balnojan

Ph.D., Product Manager, DevOps & Data enthusiast, and author of “Three Data Point Thursday”: https://www.getrevue.co/profile/svenbalnojan.
