StreamSets DataOps Platform Training

HANDS-ON TRAINING

BUILD PROVEN SKILLS

2-DAY COURSE

StreamSets DataOps Platform Course Overview

High-performance deployments deserve high-caliber training. The fastest, most reliable way to build proven skills in StreamSets is via expert instructor-led hands-on classroom training in a structured learning environment.

This two-day hands-on training course provides comprehensive coverage of the StreamSets DataOps Platform.

  • Use the StreamSets Data Collector (SDC) engine to create complex pipelines that ingest data from a variety of sources, manipulate that data, and then export it to destinations including Apache Kafka, relational database management systems, and Apache Hadoop.
  • Configure and use the StreamSets Transformer engine to access the various environments, transfer and transform data, run jobs, and monitor the performance of pipelines across all instances of StreamSets products running in the organization.
  • Manage users, design and share pipelines, use the Pipeline Repository, configure and run jobs, and monitor the performance of pipelines across the organization with StreamSets Control Hub.

Requirements

Students should preferably have a general knowledge of operating systems, networking, programming concepts, and databases.

Audience

The course is designed for Data Engineers who will be building, managing, and monitoring data flow pipelines.

Objectives

Introduction
Lab environment
Course Resources
Data Operations Platform Overview

Getting Started
Set Up a Deployment
Build a Pipeline
Run a Job
Monitor a Job
Schedule a Job

Build Pipelines with Data Collector
JDBC Pipeline
CDC Pipeline
Snowflake Pipeline
Databricks Pipeline
Kafka Pipeline

Build Pipelines with Transformer
Build Transformer Pipelines
Understand Apache Spark
Tuning Transformer Pipelines
Origins, Operators, Destinations

Managing Pipelines
Create a Topology
Set up Alerts and Subscriptions
Installing Packages, Libraries, and Drivers
Collaboration and Version Control
Implementing Pipeline CI/CD

Extending the DataOps Platform
Working with the REST API
Advanced Processing with Script Evaluators

Logging and Troubleshooting
Logging and Troubleshooting with Data Collector
Logging and Troubleshooting with Transformer