
LLMs and Reinforcement Learning Certificate for Paul A. Walsh

Certificate ID: 762903
Authentication Code: cd6d8
Certified Person Name: Paul A. Walsh
Certified Person Email: paul.a.walsh@accenture.com
Trainer Name: Patrick Fodor
Duration Days: 2
Duration Hours: 14
Course Name: LLMs and Reinforcement Learning
Course Date: 2024-05-08 09:30 to 2024-05-09 16:30
Course Outline: 

Baseline Training: Generative AI with Large Language Models (LLMs)

Transformer Architecture and LLMs

● What is a transformer and how does it work?

o Historical Background
o Applications in AI Today
o Architecture Overview

● Main components and features of a transformer
o Embedding and Positional Encoding
o Multi-Head Attention
o Feed Forward Neural Network
o Normalization and Residual Connections
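A minimal PyTorch sketch of how the components just listed fit together (a toy encoder-style block with made-up dimensions, for illustration only, not a reference implementation from the course):

import math
import torch
import torch.nn as nn

class MiniTransformerBlock(nn.Module):
    """Toy encoder block: multi-head attention + feed-forward network,
    each wrapped with a residual connection and layer normalization."""
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)      # self-attention over the sequence
        x = self.norm1(x + attn_out)          # residual connection + normalization
        x = self.norm2(x + self.ff(x))        # feed-forward + residual + normalization
        return x

def sinusoidal_positional_encoding(seq_len, d_model):
    """Classic fixed positional encoding added to the token embeddings."""
    pos = torch.arange(seq_len).unsqueeze(1).float()
    i = torch.arange(0, d_model, 2).float()
    angle = pos / torch.pow(10000, i / d_model)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(angle)
    pe[:, 1::2] = torch.cos(angle)
    return pe

# Usage: embed 10 token ids, add positions, run one block.
emb = nn.Embedding(1000, 64)
tokens = torch.randint(0, 1000, (1, 10))
x = emb(tokens) + sinusoidal_positional_encoding(10, 64)
print(MiniTransformerBlock()(x).shape)  # torch.Size([1, 10, 64])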

● Transformer Models
o Self-Attention Model
o Multi-Head Attention
o Encoder-Decoder Architecture
o Positional Embeddings
o Popular Models:
▪ BERT (Bidirectional Encoder Representations from Transformers)
▪ GPT (Generative Pre-trained Transformer)

● Performance optimization and Pitfalls of transformers
o What is context length?
o Mamba and State-Space models
o Flash attention: memory efficient, fast attention
o Sparse transformers
o Thinking like Transformers: understanding the inner workings of the transformer architecture.
o Vision Transformers
o Why we need Quantization
o Hands-on tutorial: Quantize your model (GGUF)
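To give a feel for why quantization is needed, here is a toy symmetric int8 quantization sketch in NumPy. The GGUF formats used in the hands-on tutorial follow the same store-integers-plus-scales idea applied block by block; the code below is a simplified illustration, not the GGUF algorithm itself:

import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: keep int8 values plus one float scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)   # one fp32 weight matrix
q, scale = quantize_int8(w)

print("fp32 size:", w.nbytes / 1e6, "MB")             # ~67 MB
print("int8 size:", q.nbytes / 1e6, "MB")             # ~17 MB, 4x smaller
print("max abs error:", np.abs(w - dequantize(q, scale)).max())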

● Improving transformers
o Retrieval augmented text generation
o Mixture of models
o Tree of thoughts
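A minimal sketch of the retrieval step behind retrieval-augmented text generation. The bag-of-words "embedding" is a toy assumption used only to keep the example self-contained; a real pipeline would use a trained embedding model and pass the retrieved context to an LLM for generation:

import re
import numpy as np

docs = [
    "The transformer architecture relies on self-attention.",
    "LoRA adapts large models with low-rank weight updates.",
    "Reinforcement learning optimizes a policy from rewards.",
]

def tokens(text):
    return re.findall(r"[a-z\-]+", text.lower())

vocab = sorted({w for d in docs for w in tokens(d)})

def embed(text):
    """Toy bag-of-words vector; a real system would use a trained embedding model."""
    words = tokens(text)
    return np.array([words.count(w) for w in vocab], dtype=float)

doc_vecs = np.array([embed(d) for d in docs])

def retrieve(query, k=1):
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embed(query)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q) + 1e-9)
    return [docs[i] for i in np.argsort(-sims)[:k]]

question = "How does self-attention work?"
context = retrieve(question)[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
print(prompt)   # an LLM would then complete this augmented prompt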

● Fine tuning
o Theory of Low-Rank Adaptation
o Hands-on tutorial: Fine-tuning with QLoRA
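A minimal sketch of the low-rank adaptation idea behind (Q)LoRA: freeze the pretrained weight and train only a small rank-r update. The hand-rolled layer below is for illustration; the hands-on tutorial relies on library tooling and a quantized base model:

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                   # freeze pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(1024, 1024))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable params: {trainable} / {total}")  # ~16k of ~1.07M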

Scaling Laws and Optimization

● What are scaling laws and why are they important for LLMs?
o Overview of Scaling Models
o Data and Model Size Scaling
o Computational scaling
o Parameter Efficiency Scaling

● How do scaling laws relate to the model size, data size, compute budget, and inference requirements?
● How can scaling laws help optimize the performance and efficiency of LLMs?
o Overview of Scaling Models
o Data and Model Size Scaling
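As a back-of-the-envelope illustration of how model size, data size, and compute budget trade off, a widely used rule of thumb puts training compute at roughly C ≈ 6 · N · D FLOPs for N parameters and D training tokens. The concrete numbers below are illustrative assumptions, not course prescriptions:

# Back-of-the-envelope scaling arithmetic (rule of thumb: C ≈ 6 * N * D FLOPs).
def training_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

N = 7e9        # 7B-parameter model
D = 1.4e12     # 1.4T training tokens (~20 tokens per parameter, Chinchilla-style ratio)
C = training_flops(N, D)
print(f"approx. training compute: {C:.2e} FLOPs")    # ~5.9e22 FLOPs

# With a fixed compute budget, the same C can instead buy a bigger model on less data:
N_big = 14e9
D_big = C / (6 * N_big)
print(f"same budget: {N_big:.0e} params with {D_big:.2e} tokens")
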
Training and Fine-Tuning LLMs

● Main steps and challenges of training LLMs from scratch
o Data Acquisition and Maintenance
o Large scale of required data, CPU, and memory
o Optimization challenges growing with the number of parameters

● Benefits and drawbacks of fine-tuning LLMs for specific tasks
● Best practices and tools for training and fine-tuning LLMs

● Landscape of open-source LLMs

 

Baseline Training: Fundamentals of Reinforcement Learning

Introduction

● Learning through positive reinforcement
o Definition and Core Concepts
o Markov Decision Process - Dynamic Programming - Monte Carlo
o Temporal Difference Learning
o Deep Q Networks (DQN)
o Proximal Policy Optimization (PPO)
o Hands-On Implementation
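A minimal tabular Q-learning sketch on a toy chain environment, showing the temporal-difference update that deep variants such as DQN build on. The environment and hyperparameters are illustrative assumptions, not the course's hands-on exercise:

import random

# Toy chain: states 0..4, actions 0 (left) / 1 (right); reward 1 only for reaching state 4.
N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.1, 0.9, 0.3

def step(s, a):
    s2 = max(0, min(GOAL, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

for _ in range(500):                       # episodes
    s, done = 0, False
    while not done:
        a = random.randrange(2) if random.random() < epsilon else Q[s].index(max(Q[s]))
        s2, r, done = step(s, a)
        # Temporal-difference (Q-learning) update toward the bootstrapped target:
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# Learned state values grow as states get closer to the goal (terminal state stays 0).
print([round(max(q), 2) for q in Q])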

Elements of Reinforcement Learning

Important Terms (Actions, States, Rewards, Policy, Value, Q-Value, etc.)

How reinforcement learning is used in LLMs & Reinforcement Learning from Human Feedback (RLHF)

Alternatives to RLHF: Direct Preference Optimization
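A minimal sketch of the DPO objective, which replaces the RLHF reward-model-plus-PPO loop with a single loss over preference pairs. The log-probabilities below are stand-in tensors; in practice they come from the policy and a frozen reference model scoring the chosen and rejected responses:

import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO: -log sigmoid(beta * (policy-vs-reference margin on chosen minus rejected))."""
    chosen_ratio = logp_chosen - ref_logp_chosen
    rejected_ratio = logp_rejected - ref_logp_rejected
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Stand-in sequence log-probabilities for a batch of 3 preference pairs.
logp_c = torch.tensor([-12.0, -15.0, -9.0], requires_grad=True)
logp_r = torch.tensor([-14.0, -13.0, -11.0], requires_grad=True)
ref_c = torch.tensor([-13.0, -14.0, -10.0])
ref_r = torch.tensor([-13.5, -13.5, -10.5])

loss = dpo_loss(logp_c, logp_r, ref_c, ref_r)
loss.backward()   # gradients push chosen log-probs up and rejected log-probs down
print(loss.item())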

Certificate Sent: Certificate Sent