Comprehensive overview of Generative AI definitions, workflows, prompt engineering, risks, and glossary.
Updated: April 2026
Version: 1.0
Category: Foundation
Reading Time: ~9 min
Author: Michaël Bettan
01
Definitions & Scope
What is GenAI?
Generative AI (GenAI) encompasses machine learning models designed to generate novel content (text, images, audio, video, code) by predicting patterns from vast training datasets.
LLMs (Large Language Models)
The foundation of text-based GenAI. They use the Transformer architecture and Attention mechanisms to weigh the relevance of words across long contexts.
Multimodal models
Built from the ground up to process, analyze, and generate across multiple modalities (text, images, audio) simultaneously. Examples: Google’s Gemini, OpenAI's GPT-4o.
Synthetic media
The umbrella term for AI-generated text, images, video, and data (replacing the narrower term "deepfake").
02
Standard Workflow
1
Define Objective
What is the goal, audience, tone, and format of your output?
2
Model Selection
Choose the right tool for the task (e.g., text vs. multimodal vs. specialized SLM).
3
Prompt Engineering
Craft clear, specific, contextual instructions.
4
Generation & Iteration
Run the prompt, use feedback loops (thumbs up/down), and refine.
5
Troubleshooting
Address lazy models, freezes, and confusing outputs by tweaking parameters.
6
Evaluation & Fact-checking
Cross-reference outputs. GenAI is not infallible; human oversight is required.
7
Advanced Customization
Use RAG, AI chaining, or fine-tuning for specialized topics.
8
Finalizing & Editing
Harmonize tone, inject the human element, and verify originality.
9
Deployment & Ethics
Add digital watermarks, cite sources, and monitor for model drift.
03
Ecosystem & Key Players
The GenAI landscape goes far beyond a single chatbot. Selecting the right model depends on the task, budget, and privacy needs:
Google (Gemini)
Highly capable multimodal foundation models seamlessly integrated into ecosystems (Workspace, Search). DeepMind pioneers AI research (e.g., AlphaGo, Project Astra).
Anthropic (Claude)
Focuses on safety, large context windows, and "Constitutional AI" (aligning outputs to human rights/rules rather than just human feedback).
Meta (Llama)
The champion of open-weight models, making powerful LLMs accessible to developers worldwide for local deployment and fine-tuning.
DeepSeek
Chinese startup proving that smaller, highly efficient Mixture-of-Experts (MoE) architectures can rival massive proprietary models in reasoning and coding.
OpenAI (ChatGPT/GPT-4)
Early pioneer of the commercial LLM chatbot and reasoning models (o1).
04
Prompt Engineering & Frameworks
Prompting Frameworks
Reusable, structured templates designed to build consistent, high-quality prompts instead of starting from scratch. Common frameworks include:
ReAct (Reasoning + Acting): Forces the model to alternate between internal reasoning steps and external tool calls (e.g., Thought: I need flight data → Action: Call Expedia API). Essential for building AI agents.
PACT: Defines the Persona, Action, Context, and Tone. Best for marketing, copywriting, and maintaining brand consistency.
WISER: Defines Who (Persona), Instruction (Main task), Subtasks (Breaking it down), Examples (Few-shot), and Review (Self-correction). Best for highly complex, multi-step problem solving.
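A framework like PACT can be captured as a reusable template so every prompt carries the same four components. The sketch below is illustrative only; the field names and helper function are assumptions, not part of any official framework specification.

```python
# Illustrative sketch: assembling a PACT-style prompt (Persona, Action,
# Context, Tone) from a reusable template instead of writing from scratch.
PACT_TEMPLATE = (
    "You are {persona}.\n"
    "Task: {action}\n"
    "Context: {context}\n"
    "Tone: {tone}"
)

def build_pact_prompt(persona: str, action: str, context: str, tone: str) -> str:
    """Fill the PACT template with the four framework components."""
    return PACT_TEMPLATE.format(
        persona=persona, action=action, context=context, tone=tone
    )

prompt = build_pact_prompt(
    persona="a senior brand copywriter",
    action="write a 50-word product blurb for a reusable water bottle",
    context="audience is eco-conscious commuters; launch is next month",
    tone="upbeat but not salesy",
)
print(prompt)
```

Because the template is fixed, only the four fields vary between prompts, which is what keeps outputs consistent across a team or campaign.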
Prompt
A query or command entered into the UI. Prompts guide the output, but results are probabilistic. Natural human language is now a computer language, but you must still "think like a machine" to problem-solve.
Iterative Prompting
A flexible cycle of prompting, evaluating, and re-prompting until output is satisfactory.
Prompt Chaining
A fixed sequence of connected prompts where each step builds on the previous one to avoid confusing the model.
Meta-Prompts
Prompts about prompts (e.g., "Review my prompt and suggest improvements").
Few-Shot Learning
Providing examples within the prompt to illustrate the desired format or style.
System Prompt
Meta-instructions that define the model's persona, rules, and boundaries (e.g., "You are an expert editor. Never use passive voice.").
Chain-of-Thought (CoT)
Asking the model to "think step-by-step" before answering, which drastically improves its logic and math capabilities.
Subtractive method
Best for content creators. Start with a simple, general prompt and issue follow-up prompts to subtract or refine elements (whittling away).
Additive method
Best for artists. Break the creative process into steps. Start with a background or base prompt, and iteratively add details, layers, and textures.
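Several techniques from this section combine naturally in one request: a system prompt sets the persona and rules, few-shot examples demonstrate the desired style, and a chain-of-thought trigger asks for step-by-step reasoning. The sketch below uses the role-based message schema common to most chat APIs, but no specific provider is assumed.

```python
# Sketch: combining a system prompt, few-shot examples, and a
# chain-of-thought trigger into a single message list.
few_shot_examples = [
    ("Paris is the capital of France?", "Yes."),
    ("Berlin is the capital of Italy?", "No, Rome is."),
]

# System prompt: persona and rules for the whole conversation.
messages = [{"role": "system",
             "content": "You are a meticulous fact-checker. Answer briefly."}]

# Few-shot learning: show the desired format and style by example.
for question, answer in few_shot_examples:
    messages.append({"role": "user", "content": question})
    messages.append({"role": "assistant", "content": answer})

# Chain-of-thought: ask for step-by-step reasoning before the final answer.
messages.append({"role": "user",
                 "content": "Canberra is the capital of Australia? "
                            "Think step-by-step before answering."})
```

The assembled `messages` list would then be sent to whichever chat model you are using; the structure, not the provider, is the point.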
05
How Models are Trained
Training Phases
1. Pretraining: The model ingests massive, uncurated web data to learn language representation through self-supervised next-token prediction.
2. Post-Training (alignment): Adapting the base model for specific tasks or safety:
SFT (Supervised Fine-Tuning): Training the model on high-quality input-output pairs to teach it how to follow instructions.
RLHF (Reinforcement Learning from Human Feedback): Humans rate model responses; the model learns a reward function to favor high-quality, safe outputs.
DPO (Direct Preference Optimization): A simpler, highly popular method that trains models directly on human preference pairs without needing a separate reward model.
RLAIF (AI Feedback): Using a highly capable AI (like Claude or Gemini) to evaluate and align another model, enabling massive scalability.
3. Model distillation: Transferring the capabilities of a massive "Teacher" LLM to a smaller, more efficient "Student" model.
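Distillation is often implemented by training the student to match the teacher's *softened* probability distribution. This toy sketch (made-up logits, temperature-scaled softmax, KL divergence) illustrates the loss only; real distillation runs over millions of examples with gradient descent.

```python
import math

# Toy sketch of a distillation loss: KL divergence between the teacher's
# and student's temperature-softened output distributions. Logit values
# are invented for illustration.
def softmax(logits, T=1.0):
    exps = [math.exp(x / T) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, T)   # soft targets from the teacher
    q = softmax(student_logits, T)   # student's current predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, 0.2]   # confident teacher
student = [2.0, 1.5, 0.5]   # student not yet aligned
loss = distillation_loss(teacher, student)
print(round(loss, 4))
```

A higher temperature exposes the teacher's relative preferences among *wrong* answers too, which is exactly the "dark knowledge" the student learns from.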
06
Advanced Tactics & Troubleshooting
AI Aggregation
Combining outputs of multiple independent GenAI models into a unified final product (e.g., using Claude for text, ChatGPT for images, Gemini for video).
Output stitching
Running multiple AI models in parallel on the same prompt, then manually "stitching" the best segments of each output together.
AI chaining
Using the output of one GenAI tool as part or all of the input/prompt for another GenAI tool.
Multi-persona prompting
Instructing a single model to respond from multiple distinct perspectives (e.g., "Respond as a CEO, a designer, and a consumer").
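AI chaining in particular is just function composition: one model's output becomes the next model's input. In this sketch, `summarizer` and `translator` are stand-in stubs; in practice each would be a call to a different GenAI tool or API.

```python
# Sketch of AI chaining: the output of one model call feeds the next.
# Both "models" here are deterministic stubs for illustration.
def summarizer(text: str) -> str:
    return "Summary: " + text.split(".")[0] + "."

def translator(text: str) -> str:
    return "[FR] " + text   # stub: pretend-translate to French

def chain(text: str) -> str:
    summary = summarizer(text)   # step 1: condense
    return translator(summary)   # step 2: feed step 1's output onward

result = chain("GenAI models predict patterns. They need oversight.")
print(result)  # → [FR] Summary: GenAI models predict patterns.
```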
Troubleshooting
The "Finger Problem": Visual anomalies (like 6 fingers) caused by unclear data patterns in the training set (overlapping hands/fingers in photos). Shows a lack of actual comprehension.
Model/Data Drift: When a model’s training data ages out, making predictions less reliable or outdated. Combat via: Retraining, fine-tuning, or utilizing search-augmented AI (like Perplexity).
Overfitting (Memorization): When a model trains too closely on specific data and regurgitates training text word-for-word instead of generating novel ideas. Combat via: adjusting temperature or using anti-plagiarism guardrails.
Lazy Models: When models inexplicably shorten their code or refuse complex tasks, mimicking human behaviors such as refusal or effort degradation. Combat via: Tipping, urgent timeframes, or threatening a "penalty".
07
Agents & RAG
Models are moving from passive text-generators to Autonomous Agents capable of pursuing goals flexibly.
AI Agents
Systems that perceive their environment, create plans, and execute actions (e.g., booking a calendar event, writing/running code in an IDE like Cursor).
MCP (Model Context Protocol)
An open standard allowing secure, two-way connections between AI tools and external data sources (e.g., securely connecting Claude or Gemini to your company's internal Slack or Google Drive).
A2A (Agent-to-Agent) / Multi-Agent Systems
Modern approach to complex orchestration. Instead of relying on one monolithic model, workflows are divided among a "swarm" of specialized micro-agents (e.g., a Research Agent, a Coding Agent, and a QA Agent). How it works: Agents communicate autonomously, share memory/state, delegate sub-tasks, and review each other's work to execute highly complex, multi-step enterprise workflows.
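The delegate-and-review pattern described above can be sketched with plain functions standing in for agents; a real multi-agent system would route each step to a separate model instance with its own tools and memory.

```python
# Toy sketch of a multi-agent workflow: an orchestrator delegates
# sub-tasks to specialized "agents" (stub functions here), and a QA
# agent reviews the result before it is returned.
def research_agent(topic: str) -> str:
    return f"notes on {topic}"

def writing_agent(notes: str) -> str:
    return f"draft based on {notes}"

def qa_agent(draft: str) -> bool:
    return draft.startswith("draft")   # trivial stand-in review

def orchestrate(topic: str) -> str:
    notes = research_agent(topic)           # delegate research
    draft = writing_agent(notes)            # delegate writing
    assert qa_agent(draft), "QA rejected"   # peer review before shipping
    return draft

print(orchestrate("context windows"))
```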
RAG & Context Augmentation
RAG (Retrieval-Augmented Generation): Bypasses the model's knowledge cutoff and drastically reduces hallucinations by grounding the AI in external, verified data.
User Query → Vector Search → Retrieve Facts → LLM Synthesizes
How it works: User asks a question → System converts the query into a numerical vector → Searches a Vector Database (or the web) for semantic matches → Relevant facts are retrieved and injected into the prompt → The LLM synthesizes an accurate answer based only on that retrieved data.
Vector Database: A specialized database that stores text/data as high-dimensional vectors, allowing the AI to search by "semantic similarity" (meaning) rather than exact keyword matches.
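The retrieval step above can be sketched end-to-end in a few lines. The 3-dimensional "embeddings" here are hand-made toys; real systems use learned embeddings with hundreds of dimensions stored in a vector database.

```python
import math

# Minimal sketch of RAG retrieval: find the document whose vector is
# closest to the query vector by cosine similarity, then inject its
# text into the prompt as grounding context.
docs = {
    "Refunds are processed within 14 days.": [0.9, 0.1, 0.0],
    "Our office is closed on Sundays.":      [0.1, 0.9, 0.2],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    origin = [0.0] * len(a)
    return dot / (math.dist(a, origin) * math.dist(b, origin))

def retrieve(query_vec):
    return max(docs, key=lambda text: cosine(docs[text], query_vec))

query_vec = [0.8, 0.2, 0.1]   # pretend-embedding of "What is the refund policy?"
fact = retrieve(query_vec)
prompt = (f"Answer using ONLY this context:\n{fact}\n\n"
          "Q: What is the refund policy?")
print(prompt)
```

The "ONLY this context" instruction is what grounds the model: it synthesizes from the retrieved fact rather than its (possibly stale) training data.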
08
Risks, Ethics & Vulnerabilities
Risks & Vulnerabilities
Hallucinations: When a model generates factually incorrect, nonsensical, or made-up responses. Combat via: strict fact-checking, human-in-the-loop, and RAG.
Extrinsic: model invents facts not in source.
Intrinsic: model contradicts the provided source.
Prompt Injection: An attacker embeds hidden instructions in a webpage or document. When an AI (like a summarizer bot) reads it, the AI is hijacked to execute the malicious instructions (a zero-click exploit).
Jailbreaking: Intentionally tricking the AI into bypassing its safety guardrails (e.g., asking Gemini to adopt a "villain persona" to write malware).
Data Poisoning & Slopsquatting: Attackers inject malicious data into public training sets, or register fake code packages matching names the AI frequently hallucinates.
Memorization & Privacy: LLMs can accidentally memorize and regurgitate Personally Identifiable Information (PII) seen in training data.
Detection & Provenance:
Watermarking: Subtly biasing word choices (text) or tweaking pixels (images) to embed invisible AI signatures (e.g., Google DeepMind’s SynthID).
C2PA (Content Provenance): Attaching cryptographic metadata to files to show an audit trail of how the media was created/edited.
Automation Bias: The dangerous human tendency to over-trust AI outputs without verifying them, especially critical in legal, medical, and coding environments. Rule of thumb: AI is a collaborator, the human is the final reviewer.
Ethics, Quality & Human Oversight
Human-in-the-loop: GenAI is a tool, not an autonomous creator. Human critical thinking, emotional intelligence, and editorial judgment are irreplaceable.
Bias & fairness: Models can perpetuate societal prejudices found in training data. Combat via: diverse input data and stringent review protocols.
Copyright & plagiarism: GenAI can unintentionally copy existing works. Combat via: Cross-referencing, avoiding "mimic this exact author" prompts, and independent verification.
09
Glossary (misc)
Tokens
Fragments of words, spaces, or characters that models process (~4 characters ≈ 1 token).
Context window
The maximum memory limit of a model per interaction (Input prompt + generated response). If a model has a 128k context window, it will "forget" information pushed past that token limit.
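The ~4-characters-per-token rule of thumb makes context budgeting easy to estimate. Real tokenizers vary by model, so the sketch below is only for back-of-the-envelope checks, not exact accounting.

```python
# Sketch: estimating token counts (~4 chars per token) and checking
# whether a prompt fits a context window, reserving room for the reply.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits_context(prompt: str, window: int = 128_000, reserve: int = 4_096) -> bool:
    """Leave `reserve` tokens of the window free for the model's response."""
    return estimate_tokens(prompt) <= window - reserve

print(estimate_tokens("Hello, world!"))   # 13 chars → ~3 tokens
print(fits_context("short prompt"))       # True
```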
Parameters
The learned numerical weights that define connections in the neural network. Adjusting parameters (via training) changes how a model interprets data.
Temperature
A setting that controls output randomness. Low temperature (e.g., 0-0.3) = deterministic/focused. High temperature (e.g., 0.8-1.5) = highly creative/random.
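Mechanically, temperature divides the model's next-token logits before the softmax: a low temperature sharpens the distribution toward the top choice, a high one flattens it. A minimal sketch with toy logits:

```python
import math

# Sketch of how temperature reshapes a next-token distribution.
# Logit values are invented for illustration.
def softmax_with_temperature(logits, temperature):
    scaled = [x / temperature for x in logits]
    m = max(scaled)                           # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                      # toy next-token scores
cold = softmax_with_temperature(logits, 0.2)  # focused / near-deterministic
hot = softmax_with_temperature(logits, 1.5)   # flatter / more creative

print(round(cold[0], 3), round(hot[0], 3))
```

At temperature 0.2 the top token takes almost all the probability mass; at 1.5 the alternatives become genuinely likely, which is where the "creative randomness" comes from.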
Fine-Tuning
Upgrading a pre-trained model with domain-specific training data to master niche topics (e.g., medical or legal terminology).
SLMs (Small Language Models)
Scaled-down, efficient models trained on highly focused datasets. Run on less powerful hardware.
Custom GPTs / GEMs / AI Assistants
Specialized apps built on top of an LLM tailored for distinct use cases with custom instructions and specific knowledge bases.
Digital Twins
Virtual, dynamic computer models of real-world objects or systems used for simulation and testing.
Self-Assessment Questions
Q1. What is the core difference between pre-training and fine-tuning?
Pre-training happens on massive uncurated data to learn general language patterns, while fine-tuning uses smaller, domain-specific data to master niche tasks.
Q2. What is "Prompt Injection"?
An attack where hidden instructions in content hijack the AI to execute malicious commands.
Q3. What does "Temperature" control in an LLM?
The randomness of the output. Low temperature is focused and deterministic; high temperature is creative and random.