Stuck in the Middle: The Data Aspect

Harjeet Singh
4 min read · Apr 8, 2023

I was recently in a project discussion when this term came to mind, and it resonates with many of the people I talk to. It is especially visible in data engineering and data platform work, so I thought I would write a brief viewpoint on it.

What do data engineers do? A simple question with very simple answers. Data engineers are usually doing one of the following:

  • Writing data pipelines: your classic ETL or, these days, its nemesis ELT. This basically means taking raw data, transforming it (or not), and moving it to another location (cloud, database, data store, data lake) for consumption by internal or external stakeholders.
  • Maintaining a data platform: a platform that allows access to data in a controlled manner (auth, governance, sensitive/PII data, etc.). This access could be through querying, an API, a bulk API, or some other protocol or mechanism. To allow access, there have to be provisions for multiple types of ingestion and corresponding storage (cloud, DBs, etc.) that fit the use case. Lastly, this whole saga can run in real time, in batch, or at some other frequency.
  • Product: supporting or creating products over raw data, a data platform, or a data science platform that require computation over large quantities of data, where related expertise (such as Spark, distributed computing, file formats, etc.) is required.
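
To make the ETL versus ELT distinction from the first bullet a little more concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for the target data store. The order records, table names, and function names are all made up for illustration, and a real pipeline would look quite different.

```python
import sqlite3

# Hypothetical raw records, as some producer system might emit them.
RAW_ORDERS = [
    {"order_id": 1, "amount": "19.50", "country": "us"},
    {"order_id": 2, "amount": "5.00", "country": "IN"},
    {"order_id": 3, "amount": "12.25", "country": "us"},
]


def transform(row):
    """Clean one raw record: numeric amount, upper-case country code."""
    return (row["order_id"], float(row["amount"]), row["country"].upper())


def etl(conn, rows):
    """ETL: transform inside the pipeline, then load only the cleaned rows."""
    conn.execute(
        "CREATE TABLE orders_clean (order_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders_clean VALUES (?, ?, ?)",
        [transform(r) for r in rows],
    )


def elt(conn, rows):
    """ELT: load the raw rows as-is, then transform inside the store with SQL."""
    conn.execute(
        "CREATE TABLE orders_raw (order_id INTEGER, amount TEXT, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders_raw VALUES (:order_id, :amount, :country)", rows
    )
    conn.execute(
        "CREATE TABLE orders_clean AS "
        "SELECT order_id, CAST(amount AS REAL) AS amount, "
        "UPPER(country) AS country FROM orders_raw"
    )


def revenue_by_country(conn):
    """What a downstream consumer might query, however the data arrived."""
    cur = conn.execute(
        "SELECT country, SUM(amount) FROM orders_clean "
        "GROUP BY country ORDER BY country"
    )
    return cur.fetchall()


def run(pipeline):
    """Run one pipeline against a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    pipeline(conn, RAW_ORDERS)
    return revenue_by_country(conn)


etl_result = run(etl)
elt_result = run(elt)
print(etl_result)
print(elt_result)
```

Both paths end with the same orders_clean table for the consumer. The only difference is where the transformation runs: in the pipeline code (ETL) or inside the data store after loading (ELT), which also leaves the raw table available for later use.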

If you look closely, in all of the above cases the data team, or the engineer working on a specific set of problems, is neither the producer nor the consumer, either at the raw stage or…
