Data Masters Internship Program

A head start on your future in Data Science

Begin your professional journey in Data Science with our innovative program that offers exciting opportunities to get a head start on your career in data.

Why complete our internship program

Data Masters offers a unique opportunity for interns to experience real-life projects and work on Business Intelligence solutions and Data Engineering challenges, fast-tracking their career into Data Consultancy.

Are you a curious problem solver with a passion for working with data? Do you have some experience with programming and statistics? Are you interested in how Business Intelligence and Data Engineering are implemented in the real world? Then this is the program for you.

At this internship, you will gain valuable insights into what it’s like to solve meaningful challenges with our diverse and forward-thinking team at Data Masters. The program will show you what kind of projects we work on at Data Masters and will attempt to simulate the challenges our consultants face every day – new terminology, ambiguity about the client goals, and challenging data analysis. All those aspects form an integral part of our day-to-day work.

Data Masters is transforming the businesses of today through our unique Self-Service methodology. Our goal is to help companies generate a competitive advantage by implementing Data Democratization. This program is designed to give you a feel for what it is like to work at Data Masters.

The program itself is divided into two segments: Business Intelligence and Data Engineering. Each task in the program aligns with a stage of a real project and follows a real-world example to bring it to life.

We recognize that these tasks are challenging and that there are undoubtedly phrases and terminology you may not have heard before – don’t worry. We have tried to make this experience as true to life as possible, and therefore our ask is that you seek out independent sources of information and do your own research, as required, to help guide you through the tasks.

Skills you will learn and practice:

Business understanding

Hypothesis framing

Communication

Programming

Exploratory Data Analysis

Data Visualization

Creativity

Mathematical Modelling

Model Evaluation

Client Communication

How it works:

Apply for a non-technical interview and see if you are selected.
Choose two of the three exams (SQL, Python, and Power BI) to evaluate your current knowledge.
Get a Business Intelligence or Data Engineering project to work on with a dedicated mentor.

Introduction from Data Masters

Internship program tasks

Choose the project that fits you best

Objective:

Understand the architecture and tools required for streaming data, and set up the infrastructure.

Actions:

Learn about the overall architecture: streaming services, orchestration, data lake, data warehouse, and visualization.

Study the tools and technologies being used, including GCP, Terraform, Docker, Kafka, Spark Streaming, Airflow, dbt, and BigQuery.

Set up the foundational infrastructure by configuring GCP and Terraform.
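
If you want to sanity-check your setup once Terraform has run, the short Python sketch below is one illustrative way to confirm the core resources exist. The project, bucket, and dataset names are placeholders rather than values the program prescribes, and it assumes the google-cloud-storage and google-cloud-bigquery client libraries plus application-default credentials.

```python
# Sanity-check the Terraform-provisioned GCP resources.
# Assumes a GCS bucket (data lake) and a BigQuery dataset were created by Terraform,
# and that application-default credentials are configured.
# The project, bucket, and dataset names are hypothetical placeholders.
from google.cloud import bigquery, storage

PROJECT_ID = "my-streaming-project"      # placeholder
BUCKET_NAME = "my-streaming-data-lake"   # placeholder
DATASET_ID = "analytics"                 # placeholder


def check_infrastructure() -> None:
    # Confirm the data-lake bucket exists and is reachable.
    storage_client = storage.Client(project=PROJECT_ID)
    bucket = storage_client.get_bucket(BUCKET_NAME)
    print(f"Bucket OK: gs://{bucket.name} (location: {bucket.location})")

    # Confirm the BigQuery dataset that the transformations will populate exists.
    bq_client = bigquery.Client(project=PROJECT_ID)
    dataset = bq_client.get_dataset(f"{PROJECT_ID}.{DATASET_ID}")
    print(f"Dataset OK: {dataset.full_dataset_id}")


if __name__ == "__main__":
    check_infrastructure()
```

A check like this is also handy after any later Terraform change, since it fails fast if a resource was renamed or dropped.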

Objective:

Set up a real-time streaming pipeline.

Actions:

Create a Kafka instance to receive messages from the streaming service.
Stream data using Kafka and process it in real time with Spark Streaming.
Periodically store the processed data in the data lake.
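
As a rough illustration of what this stage might look like, here is a minimal PySpark Structured Streaming sketch that reads JSON messages from Kafka and appends Parquet files to a GCS data lake. The broker address, topic name, event schema, and paths are assumptions made for the example, and the job needs the spark-sql-kafka connector package available to Spark.

```python
# Minimal Spark Structured Streaming sketch: read from Kafka, parse JSON,
# and periodically write Parquet files to the data lake.
# Topic, brokers, schema, and GCS paths are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("streaming-ingest").getOrCreate()

# Hypothetical schema for the incoming events.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_timestamp", TimestampType()),
])

# Read the raw Kafka stream (requires the spark-sql-kafka connector on the classpath).
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
    .option("subscribe", "events")                         # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka values arrive as bytes; cast to string and parse the JSON payload.
events = (
    raw.selectExpr("CAST(value AS STRING) AS json")
    .select(from_json(col("json"), event_schema).alias("e"))
    .select("e.*")
)

# Append Parquet files to the data lake roughly once a minute.
query = (
    events.writeStream.format("parquet")
    .option("path", "gs://my-streaming-data-lake/events/")                     # placeholder
    .option("checkpointLocation", "gs://my-streaming-data-lake/checkpoints/")  # placeholder
    .trigger(processingTime="1 minute")
    .outputMode("append")
    .start()
)

query.awaitTermination()
```

The checkpoint location is what lets the job restart without reprocessing or losing messages, which matters once the pipeline runs unattended.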

Objective:

Implement an hourly batch job that transforms streaming data for analytics.

Actions:

Configure an Apache Airflow instance to trigger the hourly batch processing job.
Use Airflow to execute data transformations using dbt.
Populate tables in BigQuery with the transformed data to support dashboard analytics.
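
One possible shape for this step, assuming a recent Airflow 2.x install and a dbt project configured to target BigQuery, is the small DAG sketched below. The DAG id, dbt project path, and profiles directory are placeholders, not the program's actual setup.

```python
# Sketch of an hourly Airflow DAG that runs dbt transformations into BigQuery.
# DAG id, dbt project path, and profiles directory are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="hourly_dbt_transformations",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",                   # trigger the batch job every hour
    catchup=False,
) as dag:
    # Build the BigQuery tables used by the dashboard.
    run_dbt = BashOperator(
        task_id="dbt_run",
        bash_command="cd /opt/airflow/dbt && dbt run --profiles-dir .",
    )

    # Validate the results with dbt tests after the build.
    test_dbt = BashOperator(
        task_id="dbt_test",
        bash_command="cd /opt/airflow/dbt && dbt test --profiles-dir .",
    )

    run_dbt >> test_dbt
```

Keeping the transformation logic in dbt and using Airflow only for scheduling keeps the two concerns separate, which is the pattern this task is aiming at.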

Objective:

Create a dashboard to visualize and monitor key metrics.

Actions:

Design and implement a Google Data Studio dashboard to visualize the processed data.
Define and analyze key metrics.
Monitor data freshness and pipeline performance.
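
Google Data Studio itself is configured in the UI, but a lightweight freshness check such as the Python sketch below could back the monitoring part of this task. The table name, timestamp column, and one-hour threshold are illustrative assumptions.

```python
# Minimal data-freshness check for the dashboard's source table in BigQuery.
# Table name, timestamp column, and the one-hour threshold are assumptions.
from datetime import datetime, timedelta, timezone

from google.cloud import bigquery

PROJECT_ID = "my-streaming-project"               # placeholder
TABLE = f"{PROJECT_ID}.analytics.hourly_events"   # hypothetical dbt-built table
MAX_LAG = timedelta(hours=1)                      # hourly pipeline, so expect <= 1h lag


def check_freshness() -> None:
    client = bigquery.Client(project=PROJECT_ID)
    query = f"SELECT MAX(event_timestamp) AS latest FROM `{TABLE}`"
    latest = next(iter(client.query(query).result())).latest

    lag = datetime.now(timezone.utc) - latest
    if lag > MAX_LAG:
        print(f"STALE: last event {lag} ago (threshold {MAX_LAG})")
    else:
        print(f"FRESH: last event {lag} ago")


if __name__ == "__main__":
    check_freshness()
```

A check like this can be scheduled alongside the Airflow DAG so that stale data is flagged before it shows up on the dashboard.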

Objective:

Showcase the completed data pipeline and analytics dashboard.

Actions:

Record a video walkthrough explaining your project and presenting the dashboard.
Highlight how the pipeline ingests, processes, and transforms data for meaningful insights.
Share best practices, challenges encountered, and lessons learned.

Apply here