If you currently work in data engineering, it would be hard not to be at least aware of Apache Spark and Apache Kafka, each a big data titan in its own right. For those not familiar, here is a brief explanation:

Apache Spark is currently one of the most popular open-source large-scale data processing frameworks, widely used for both batch and stream processing of unstructured data. The open-source framework spawned the proprietary, open-core Databricks platform, which is used by over “five thousand organisations worldwide”.

Apache Kafka is a massively popular open-source distributed event…
