Solutions Architect & Instructor

Zoltan C. Toth

Scalable Data Architecture Expert with 20 years of experience, trusted by Databricks, dbt Labs, T-Mobile, and Fortune 500 companies worldwide. As a Principal Solutions Architect at Databricks, I designed and delivered some of the company's earliest and most strategic engagements. I help organizations build cloud-native data platforms, modernize their architectures, and enable their teams to operate at scale.

What I Can Help You With

Architecture, design, and training to help your team build and scale data platforms. Available as a freelance architect or with a team of contractors.

AI Systems & MLOps

Architecture and evaluation of AI systems, MLOps, and LLMOps platforms.

AI Integration & MCP

AI agent architecture, MCP development, and LLM-powered application design.

Training & Enablement

Teaching your team to architect, build, and scale data systems independently.

Training Catalog

Every training is fully hands-on, with a dedicated training environment provided. Can also be delivered on your own infrastructure. Available on-site or fully remote.

Spark Programming with Databricks

2–3 Days

A comprehensive introduction to Apache Spark on Databricks, covering everything from DataFrames and SQL to Structured Streaming and performance optimization. Aligned with the Databricks Spark Programming certification.

Module 1: Introduction to Databricks Apache Spark

  • Databricks Overview
  • Spark Runtime Architecture
  • Exploring Apache Spark Architecture in Databricks
  • Introduction to Spark DataFrames and SQL
  • Reading and Writing Data with DataFrames
  • Distributed System Programming Fundamentals

Module 2: Developing Applications with Apache Spark

  • Introduction to the SQL & DataFrame API
  • DataFrame API Fundamentals
  • Grouping and Aggregating Data
  • Relational Operations in Apache Spark
  • Working with Complex Data Types

Module 3: Stream Processing and Analysis

  • Introduction to Stream Processing
  • Spark Structured Streaming
  • Window Aggregation in Spark Structured Streaming

Module 4: Monitoring and Optimizing Spark on Databricks

  • Delta Lake Introduction and Deep Dive
  • Introduction to the Unity Catalog
  • Understanding and Optimizing Apache Spark Workloads
  • Performance Tuning

Based on the official Databricks Apache Spark Programming Certification requirements. Content is customized based on your team's experience level.

Machine Learning Operations with MLflow

1 Day

Learn to manage the complete ML lifecycle — from experiment tracking and model registry to deployment and production monitoring. Includes hands-on coverage of LLM evaluation with MLflow.

Module 1: Experimentation

  • Experiment Tracking with MLflow
  • Recording Parameters, Metrics, and Artifacts
  • Advanced Tracking: Autologging, Nested Runs, and Hyperparameter Tuning

Module 2: Model Management

  • MLflow Models and Model Flavors
  • Custom Models with pyfunc
  • MLflow Model Registry
  • Model Versioning and Stage Transitions
  • Webhooks, Automated Testing, and CI/CD Integration

Module 3: Deployment Paradigms

  • Batch Inference with Spark
  • Real-Time Serving with REST APIs
  • Databricks Model Serving and Managed Endpoints

Module 4: Production

  • Monitoring for Data, Feature, and Concept Drift
  • Statistical Drift Detection Methods
  • CI/CD for Machine Learning Pipelines

Module 5: LLM Operations

  • Tracing AI Agents
  • Evaluating LLMs with MLflow
  • Using LLM-as-a-Judge methods for evaluating LLM outputs

Based on the official Databricks ML in Production curriculum. Content is customized based on your team's experience level.

dbt (Data Build Tool)

2–3 Days

A comprehensive, hands-on deep dive into dbt covering the full development lifecycle — from models and testing to macros, documentation, and the latest dbt Fusion tooling.

Module 1: Introduction to dbt

  • What is dbt and why it matters
  • Setting up your dbt project
  • dbt Core vs. dbt Cloud vs. dbt Fusion
  • Project structure and data flow overview

Module 2: Models and Materializations

  • Building models with CTEs and the ref function
  • Materialization strategies: views, tables, incremental, and ephemeral
  • Model dependencies and the DAG

Module 3: Seeds, Sources & Snapshots

  • Working with seeds and sources
  • Source freshness checks
  • Snapshots for slowly changing dimensions

Module 4: Testing and Data Quality

  • Generic, singular, and unit tests
  • Data contracts
  • Custom generic tests with parameters
  • Test severity configuration
  • Advanced data quality with dbt-expectations

Module 5: Jinja, Macros & Packages

  • Jinja templating fundamentals
  • Writing custom macros
  • Installing and using third-party packages

Module 6: Documentation, Hooks & Exposures

  • Writing and exploring documentation
  • The lineage graph (DAG)
  • Hooks and grants
  • Exposures and BI tool integration

Module 7: dbt Fusion & Tooling

  • dbt Fusion overview and feature matrix
  • VSCode extension and development workflow
  • Column-level lineage
  • Orchestration with Dagster

Based on the best-selling Udemy course (60,000+ students). Comprehensive curriculum covering dbt fundamentals through advanced production patterns. Adapted for instructor-led delivery with hands-on exercises.

Cloud Computing and Data Engineering on AWS

2–4 Days

A practical, hands-on course on cloud computing fundamentals and AWS services for data engineering. Covers networking, compute, storage, serverless architectures, and building data lake solutions with real-world exercises.

Module 1: Internet and Networking Fundamentals

  • TCP/IP protocols and how the internet works
  • IP addressing, DNS, and routing
  • HTTP/HTTPS and web communication
  • Encryption: symmetric, asymmetric, and public key infrastructure
  • Digital signatures and certificates

Module 2: AWS Core Services

  • AWS regions, availability zones, and global infrastructure
  • EC2: virtual machines in the cloud
  • Storage: S3, EBS, and ephemeral storage
  • Route 53 and domain management
  • High availability and disaster recovery patterns

Module 3: Serverless Computing

  • AWS Lambda and serverless architecture
  • Using AWS programmatically (Boto3)
  • Web scraping and data extraction
  • Building serverless data processing pipelines
  • Managed AI services (Image Recognition, Text Recognition, Transcription, and others)

Module 4: Data Lake and Analytics

  • Data lake concepts and architecture
  • Amazon Athena: querying S3 with SQL
  • Data formats: CSV, JSON, Parquet
  • Cost-effective analytics at scale

Building Agents and MCP with the OpenAI Agents SDK

1 Day

Build AI agents from scratch using the OpenAI Agents SDK, integrate them with external tools via the Model Context Protocol (MCP), and orchestrate multi-agent systems for production use cases.

Part 1: AI Agents Fundamentals

  • What are AI Agents and why they matter
  • The OpenAI Agents SDK & Agent Builder
  • Building your first agent
  • Tool use and function calling

Part 2: Model Context Protocol (MCP)

  • Understanding MCP architecture
  • Building MCP servers and clients
  • Integrating MCP with agents
  • Real-world MCP patterns

Part 3: Multi-Agent Systems

  • Agent orchestration patterns
  • Handoffs between agents
  • Guardrails and safety
  • Production deployment considerations

Based on the best-selling Udemy courses: AI Agents Crash Course with OpenAI & Complete MCP Bootcamp. Adapted for instructor-led delivery.

Trusted By

Selected consulting, architecture, and training clients.

dbt Labs, Databricks, Generali, Aegon, T-Mobile, Alteo, Craft, Tipico

Detailed profile and references available upon request.

What Clients Say

Select testimonials from recent engagements.

Educational Consulting
★★★★★

“Zoltan provided educational services to dbt learners. During the project, he was very professional, making it very easy to work with him, as I was able to provide feedback to his plans which he incorporated very well, and he delivered high quality work. I am looking forward to working with Zoltan on our next project together!”

Sean McIntyre
Scaling Analytics with dbt

Educational Consulting
★★★★★

“I worked with Zoltan to deliver a tailored dbt course at my company — I found him easy to work with and collaborate with and very knowledgeable about industry standards. I would recommend Zoltan for any training engagements.”

Jake J. Dalli
Data Platform Lead

Cloud Application Development
★★★★★

“I really enjoyed collaborating with Zoltan to create a data stack from the ground up. He was committed to finding the best solution that aligned with the project's needs and the company's workflow. The execution not only met but exceeded our expectations, and the handover process was smooth and comprehensive.”

Andrea Sipos
Product Management Advisor & Researcher

IT Consulting
★★★★★

“Working with Zoltan on our AI project was a fantastic experience. He expertly handled the LLMOps architecture, guiding us from requirements analysis through PoC to full implementation. Zoltan stands out not only for his technical skills but for his focus on business impact, consistently suggesting ways to create more value. His unique blend of AI expertise and strategic thinking makes him an exceptional asset to any team.”

Linda Sallai
Head of Brand & Communications, BrokerChooser

IT Consulting
★★★★★

“Zoltan helped us professionalize our data engineering landscape. He did so not only by contributing excellent code, but by actively teaching our teams and enabling them to achieve the same quality themselves. In addition, he also had the bigger picture in mind, helping us also with topics not related to the initial engagement description.”

Florian Thürk
Senior Solutions Architect, Cloudera (formerly T-Mobile)

IT Consulting
★★★★★

“Zoltan helped us successfully pilot and deploy dbt at Alfa VIG Hungary that has been an integral part of our transform pipelines ever since. His expertise and experience in solving complex data engineering challenges was pivotal in the project just as much as his dedication to fully understand the client’s needs and fit the solution to the agreed requirements. I highly and honestly recommend Zoltan for any data and AI engineering projects independent of industry and company size.”

János Poór
Data Management and Analytics Expert, Alfa Vienna Insurance Group

Cloud Application Development
★★★★★

“I had the pleasure of collaborating with Zoltan on the creation of a cutting-edge Data Platform at Schneider Electric. His expertise on Databricks and on data and AI projects was invaluable in structuring the project during its early stages and providing meaningful insights on the technology. I highly recommend Zoltan for his exceptional skills and dedication.”

Giusy Agueci
AI Data Governance Leader, Schneider Electric

Cloud Application Development
★★★★★

“Zoltan helped us build our cloud infrastructure and automation for ML models. After discussing our plan, he suggested improvements, explained tradeoffs, and recommended optimal solutions. He built the system skeleton, focusing on complex parts, and documented everything thoroughly. His work gave us a flexible, high-performing system. He is fast, precise, and a true expert in ML, cloud, and system design.”

Gyozo Csóka
AI Research Engineer, Rényi Institute of Mathematics

IT Consulting
★★★★★

“Zoltan was so helpful when we asked for his support on our Data & AI project at Schneider-Electric. His expertise in data engineering and data science in a cloud context helped us accelerate and set up sustainable solutions. His strong soft skills in communication and his ability to transfer knowledge allowed our team to fully take over the evolutions and operations of what he built. Would love to have him as a permanent member of our organization.”

Dimitri Yanculovici
AI Architect, Schneider Electric

For a complete list of reviews, visit my LinkedIn Services Page.