Study Notes — Certification Prep

Generative AI 101
Study Guide

Comprehensive overview of Generative AI definitions, workflows, prompt engineering, risks, and glossary.

Updated: April 2026
Version: 1.0
Category: Foundation
Reading Time: ~9 min
Author: Michaël Bettan
01

Definitions & Scope

What is GenAI?

Generative AI (GenAI) encompasses machine learning models designed to generate novel content (text, images, audio, video, code) by predicting patterns from vast training datasets.

LLMs (Large Language Models)
The foundation of text-based GenAI. They use the Transformer architecture and Attention mechanisms to weigh the relevance of words across long contexts.
Multimodal models
Built from the ground up to process, analyze, and generate across multiple modalities (text, images, audio) simultaneously. Examples: Google's Gemini and OpenAI's GPT-4o.
Synthetic media
The umbrella term for AI-generated text, images, video, and data (replacing the narrower term "deepfake").
02

Standard Workflow

1. Define Objective
What is the goal, audience, tone, and format of your output?

2. Model Selection
Choose the right tool for the task (e.g., text vs. multimodal vs. specialized SLM).

3. Prompt Engineering
Craft clear, specific, contextual instructions.

4. Generation & Iteration
Run the prompt, use feedback loops (thumbs up/down), and refine.

5. Troubleshooting
Address lazy models, freezes, and confusing outputs by tweaking parameters.

6. Evaluation & Fact-checking
Cross-reference outputs. GenAI is not infallible; human oversight is required.

7. Advanced Customization
Use RAG, AI chaining, or fine-tuning for specialized topics.

8. Finalizing & Editing
Harmonize tone, inject the human element, and verify originality.

9. Deployment & Ethics
Add digital watermarks, cite sources, and monitor for model drift.
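The generate-evaluate-refine core of this workflow (steps 3 through 6) can be sketched as a simple loop. This is a minimal illustration: `generate()` and `evaluate()` are hypothetical stand-ins for a real model call and a real fact-check, not any specific API.

```python
# Minimal sketch of the generate-evaluate-refine loop (workflow steps 3-6).
# generate() and evaluate() are hypothetical stubs, not a real model API.

def generate(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned draft."""
    return f"Draft answer for: {prompt}"

def evaluate(output: str) -> bool:
    """Stand-in fact-check: here, just require a non-empty output."""
    return len(output.strip()) > 0

def run_workflow(objective: str, max_iterations: int = 3) -> str:
    # Step 3: craft a clear, specific, contextual prompt
    prompt = f"Goal: {objective}. Be clear and specific, and cite sources."
    output = ""
    for _ in range(max_iterations):
        output = generate(prompt)                  # step 4: generation
        if evaluate(output):                       # step 6: evaluation
            break
        prompt += " Please be more specific."      # step 5: troubleshooting tweak
    return output
```

In practice the evaluation step is where human oversight enters: the loop exits only when a reviewer (or an automated check) accepts the draft.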

03

Ecosystem & Key Players

The GenAI landscape goes far beyond a single chatbot. Selecting the right model depends on the task, budget, and privacy needs:

Google (Gemini)
Highly capable multimodal foundation models seamlessly integrated into ecosystems (Workspace, Search). DeepMind pioneers AI research (e.g., AlphaGo, Project Astra).
Anthropic (Claude)
Focuses on safety, large context windows, and "Constitutional AI" (aligning outputs to human rights/rules rather than just human feedback).
Meta (Llama)
The champion of open-weight models, making powerful LLMs accessible to developers worldwide for local deployment and fine-tuning.
DeepSeek
Chinese startup proving that smaller, highly efficient architectures (using Mixture-of-Experts, MoE) can rival massive proprietary models in reasoning and coding.
OpenAI (ChatGPT/GPT-4)
Early pioneer of the commercial LLM chatbot and reasoning models (o1).
04

Prompt Engineering & Frameworks

Prompting Frameworks

Reusable, structured templates designed to build consistent, high-quality prompts instead of starting from scratch. Key terms and techniques include:

Prompt
A query or command entered into the UI. Prompts guide the output, but results are probabilistic. Natural human language is now a computer language, but you must still "think like a machine" to problem-solve.
Iterative Prompting
A flexible cycle of prompting, evaluating, and re-prompting until output is satisfactory.
Prompt Chaining
A fixed sequence of connected prompts where each step builds on the previous one to avoid confusing the model.
Meta-Prompts
Prompts about prompts (e.g., "Review my prompt and suggest improvements").
Few-Shot Learning
Providing examples within the prompt to illustrate the desired format or style.
System Prompt
Meta-instructions that define the model's persona, rules, and boundaries (e.g., "You are an expert editor. Never use passive voice.").
Chain-of-Thought (CoT)
Asking the model to "think step-by-step" before answering, which drastically improves its logic and math capabilities.
Subtractive method
Best for content creators. Start with a simple, general prompt and issue follow-up prompts to subtract or refine elements (whittling away).
Additive method
Best for artists. Break the creative process into steps. Start with a background or base prompt, and iteratively add details, layers, and textures.
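Several of the techniques above (system prompts, few-shot examples, chain-of-thought) can be combined mechanically when assembling a prompt string. A hypothetical helper, sketched here with an illustrative layout rather than any model's required format:

```python
# Sketch of prompt assembly: system prompt + few-shot examples + CoT nudge.
# The layout is illustrative; real APIs usually take these as separate fields.

def build_prompt(system: str, examples: list[tuple[str, str]], query: str) -> str:
    parts = [f"System: {system}"]                  # persona, rules, boundaries
    for question, answer in examples:              # few-shot: show desired format
        parts.append(f"Q: {question}\nA: {answer}")
    # Chain-of-Thought nudge appended to the final query
    parts.append(f"Q: {query}\nA: Let's think step by step.")
    return "\n\n".join(parts)

prompt = build_prompt(
    system="You are an expert editor. Never use passive voice.",
    examples=[("Fix: 'The report was written by me.'", "I wrote the report.")],
    query="Fix: 'Mistakes were made by the team.'",
)
```

Keeping the pieces as separate arguments makes the template reusable: swap the examples or the query without touching the system prompt.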
05

How Models are Trained

Training Phases

Pre-training
The model learns general language patterns from massive, largely uncurated datasets.
Fine-tuning
The pre-trained model is further trained on smaller, domain-specific datasets to master niche tasks.

06

Advanced Tactics & Troubleshooting

AI Aggregation
Combining outputs of multiple independent GenAI models into a unified final product (e.g., using Claude for text, ChatGPT for images, Gemini for video).
Output stitching
Running multiple AI models in parallel on the same prompt, then manually "stitching" the best segments of each output together.
AI chaining
Using the output of one GenAI tool as part or all of the input/prompt for another GenAI tool.
Multi-persona prompting
Instructing a single model to respond from multiple distinct perspectives (e.g., "Respond as a CEO, a designer, and a consumer").
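AI chaining, in particular, reduces to function composition: the output of one tool becomes the input of the next. A toy sketch with two hypothetical model stubs (not real APIs):

```python
# AI chaining sketch: output of model A feeds model B.
# summarizer() and image_prompter() are hypothetical stand-ins for real tools.

def summarizer(text: str) -> str:
    """Stand-in for a text model that condenses input."""
    return f"Summary: {text[:40]}"

def image_prompter(summary: str) -> str:
    """Stand-in for a model that turns a summary into an image prompt."""
    return f"Illustration of: {summary}"

def chain(text: str) -> str:
    # The chain: tool A's output is tool B's input
    return image_prompter(summarizer(text))
```

Output stitching differs in shape: the same prompt fans out to several models in parallel, and a human merges the best segments, rather than composing them in sequence.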

Troubleshooting

If the model stalls, freezes, or produces confusing output, re-run with tweaked parameters (e.g., a lower temperature), a rephrased prompt, or a different model.

07

Agents & RAG

Models are moving from passive text-generators to Autonomous Agents capable of pursuing goals flexibly.

AI Agents
Systems that perceive their environment, create plans, and execute actions (e.g., booking a calendar event, writing/running code in an IDE like Cursor).
MCP (Model Context Protocol)
An open standard allowing secure, two-way connections between AI tools and external data sources (e.g., securely connecting Claude or Gemini to your company's internal Slack or Google Drive).
A2A (Agent-to-Agent) / Multi-Agent Systems
Modern approach to complex orchestration. Instead of relying on one monolithic model, workflows are divided among a "swarm" of specialized micro-agents (e.g., a Research Agent, a Coding Agent, and a QA Agent). How it works: Agents communicate autonomously, share memory/state, delegate sub-tasks, and review each other's work to execute highly complex, multi-step enterprise workflows.
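The delegation pattern described above can be sketched as an orchestrator routing sub-tasks to specialized stub agents and having a QA agent review the results. Agent names and routing logic here are purely illustrative, not any real A2A protocol:

```python
# Toy multi-agent sketch: an orchestrator delegates to specialized agents
# and a QA agent reviews their combined output. All agents are stubs.

def research_agent(task: str) -> str:
    return f"[research] facts about {task}"

def coding_agent(task: str) -> str:
    return f"[code] draft implementation for {task}"

def qa_agent(results: list[str]) -> str:
    # Agents reviewing each other's work, reduced to a count-and-approve stub
    return f"[qa] reviewed {len(results)} artifacts: OK"

def orchestrate(goal: str) -> str:
    results = [research_agent(goal), coding_agent(goal)]  # delegate sub-tasks
    return qa_agent(results)                              # cross-review step
```

Real multi-agent systems add shared memory/state and autonomous back-and-forth between agents; the fixed two-step pipeline here only shows the delegation shape.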

RAG & Context Augmentation

RAG (Retrieval-Augmented Generation): Bypasses the model's knowledge cutoff and drastically reduces hallucinations by grounding the AI in external, verified data.

User Query → Vector Search → Retrieve Facts → LLM Synthesizes
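This pipeline can be sketched end to end. Real RAG systems use embedding-based vector search; in this minimal sketch, word-overlap scoring stands in for the vector search, and a stub `synthesize()` stands in for the LLM call:

```python
# Minimal RAG sketch: retrieve the best-matching document, then ground the
# answer in it. Word overlap stands in for real embedding-based vector search.

CORPUS = [
    "Gemini is a multimodal model from Google.",
    "RAG grounds a model in external, verified data.",
    "Temperature controls output randomness.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    q = set(query.lower().split())
    # Rank documents by shared words with the query (toy relevance score)
    scored = sorted(CORPUS, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def synthesize(query: str, facts: list[str]) -> str:
    """Stand-in for the LLM call that answers using retrieved facts."""
    return f"Answer to '{query}' using: " + " | ".join(facts)

def rag(query: str) -> str:
    return synthesize(query, retrieve(query))  # query -> retrieve -> synthesize
```

Because the answer is assembled from retrieved facts rather than the model's parametric memory alone, stale knowledge and hallucinations are both reduced.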
08

Risks, Ethics & Vulnerabilities

Risks & Vulnerabilities

Hallucinations
Confident but false outputs; mitigated by grounding (RAG) and fact-checking.
Prompt Injection
Hidden instructions embedded in content that hijack the AI into executing malicious commands.
Model drift
Gradual degradation of output quality after deployment; requires ongoing monitoring.

Ethics, Quality & Human Oversight

GenAI is not infallible. Cross-reference outputs, add digital watermarks, cite sources, verify originality, and keep a human in the loop for final editing.

09

Glossary (misc)

Tokens
Fragments of words, spaces, or characters that models process (~4 characters ≈ 1 token).
Context window
The maximum memory limit of a model per interaction (Input prompt + generated response). If a model has a 128k context window, it will "forget" information pushed past that token limit.
Parameters
Numerical values used to assign weight and define connections in the neural network. Adjusting parameters changes how a model interprets data.
Temperature
A setting that controls output randomness. Low temperature (e.g., 0-0.3) = deterministic/focused. High temperature (e.g., 0.8-1.5) = highly creative/random.
Fine-Tuning
Upgrading a pre-trained model with domain-specific training data to master niche topics (e.g., medical or legal terminology).
SLMs (Small Language Models)
Scaled-down, efficient models trained on highly focused datasets. Run on less powerful hardware.
Custom GPTs / GEMs / AI Assistants
Specialized apps built on top of an LLM tailored for distinct use cases with custom instructions and specific knowledge bases.
Digital Twins
Virtual, dynamic computer models of real-world objects or systems used for simulation and testing.
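The temperature setting from the glossary has a concrete mechanism: the model's raw scores (logits) are divided by the temperature before the softmax, so a low temperature sharpens the distribution toward the top token and a high temperature flattens it. A small sketch with made-up logit values:

```python
# How temperature reshapes a next-token distribution: logits are divided
# by T before the softmax. Low T -> near-deterministic; high T -> flatter.
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    scaled = [l / temperature for l in logits]
    m = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                      # illustrative raw scores
cold = softmax_with_temperature(logits, 0.2)  # focused: top token dominates
hot = softmax_with_temperature(logits, 1.5)   # creative: probability spreads out
```

At temperature 0.2 the top token takes nearly all the probability mass, matching the glossary's "deterministic/focused" description; at 1.5 the lower-ranked tokens become genuinely likely to be sampled.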

Self-Assessment Questions

Q1. What is the core difference between pre-training and fine-tuning?

Pre-training happens on massive uncurated data to learn general language patterns, while fine-tuning uses smaller, domain-specific data to master niche tasks.

Q2. What is "Prompt Injection"?

An attack where hidden instructions in content hijack the AI to execute malicious commands.

Q3. What does "Temperature" control in an LLM?

The randomness of the output. Low temperature is focused and deterministic; high temperature is creative and random.