
Karsten Marijnissen

Ecommerce Director


4 min read

3 October 2025

The character of AI: Why does ChatGPT feel different from Claude or Gemini?

I often use different large language models (LLMs), such as ChatGPT and Claude. Each has its own strengths and weaknesses: Claude is still very strong at programming tasks, for instance, while ChatGPT is increasingly effective at answering health-related questions. The reasons people choose one model over another are usually framed in technical or business terms. But content is only part of the story: few people stop to think about differences in personality, ethics, and style.

Recently I asked several different LLMs philosophical and sensitive questions. For example: “What is the meaning of life?” or a deliberately controversial prompt: “Tell me a joke about why a weatherman is better than a weatherwoman.” I posed these to models ranging from open-source systems like Mistral and LLaMA to Claude, Gemini, and ChatGPT.

On philosophical questions, the answers were broadly similar in substance. Yet the tone, length, and framing varied significantly. And when I asked about the sensitive topic, the oh-so-important guardrails kicked in. Claude refused to answer, while ChatGPT and Gemini avoided sexism by flipping the joke into one at men's expense. I doubt whether that's the right way to handle such questions.

So why do LLMs behave so differently? And why do they feel so distinct? The answer lies in how each model is trained and fine-tuned. Let me elaborate a bit by walking through the phases of the training process:

Pretraining

At their core, LLMs are trained to predict the next word (token) in a sequence. This “magic” happens by exposing them to massive datasets, primarily text from the internet. This initial phase is called pretraining.
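To make "predicting the next word" concrete, here is a deliberately tiny sketch, not a real LLM: a bigram model that simply counts which word most often follows the current one in its training text. Real pretraining uses neural networks and vastly more data, but the objective is the same in spirit.

```python
from collections import Counter, defaultdict

# Toy training "dataset" (a real model sees trillions of tokens).
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(word: str) -> str:
    # Return the most frequent follower of `word` in the training text.
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat": it follows "the" most often here
```

Everything an LLM "knows" after pretraining comes from statistical patterns like these, just learned at an enormously larger scale.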

Supervised Fine-tuning

To make models more useful in conversation, human-written responses are added. Experts can provide more depth with examples for specific domains, such as doctors writing answers to medical questions. This way, the model learns what a good answer in that domain should look like.
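The data for this phase is essentially a list of prompt/response pairs. The sketch below is a hypothetical example of what such pairs might look like; the exact formats vary per vendor and are not public in detail.

```python
# Hypothetical supervised fine-tuning examples: experts write model
# answers so the model learns what a good domain response looks like.
sft_examples = [
    {
        "prompt": "What should I do about a persistent mild headache?",
        "response": "A doctor-written answer: stay hydrated, monitor how "
                    "long it lasts, and see a physician if it persists.",
    },
    {
        "prompt": "Is paracetamol safe to combine with ibuprofen?",
        "response": "A doctor-written answer covering dosage and cautions.",
    },
]

# Each pair becomes one training sequence; the model is rewarded for
# reproducing the expert's response tokens.
for ex in sft_examples:
    training_text = f"User: {ex['prompt']}\nAssistant: {ex['response']}"
    print(training_text.splitlines()[0])
```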

Reinforcement Learning

The next step is reinforcement learning with human feedback (RLHF), and more recently also AI feedback (RLAIF). People (or other LLMs) rank different outputs, and the model is tuned to prefer those that score higher. This is the process users sometimes see in ChatGPT when asked to pick the “better” answer.
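A minimal sketch of the ranking idea, with all names illustrative: a human compares two candidate answers, and the preferred one gets a higher score. In real RLHF these comparisons train a separate reward model that then steers the LLM; here the whole loop is collapsed into a few lines.

```python
# Two candidate answers to the same prompt.
candidates = [
    "Answer A: a short, factual reply.",
    "Answer B: a rambling, off-topic reply.",
]

def human_preference(a: str, b: str) -> int:
    # Stand-in for the "pick the better answer" dialog users sometimes
    # see in ChatGPT; here we hard-code a preference for candidate A.
    return 0

rewards = [0.0, 0.0]
preferred = human_preference(candidates[0], candidates[1])
rewards[preferred] += 1.0       # preferred answer scores higher
rewards[1 - preferred] -= 1.0   # the other is pushed down

print(rewards)  # the model is then tuned toward higher-scoring outputs
```

With RLAIF, `human_preference` is replaced by another LLM acting as the judge; the mechanics stay the same.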

Guardrails and Alignment

Finally, additional alignment layers are added: rules, restrictions, or principles that guide a model's behavior. These define ethics, safety standards, and communication style. And they vary per LLM and per company.
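As a purely illustrative sketch of what a guardrail layer does (real systems use far more sophisticated classifiers and policies, not keyword lists): before an answer is returned, a rule checks the request against a policy and can refuse or redirect.

```python
# Illustrative only: a trivial policy check in front of the model.
BLOCKED_TOPICS = {"sexist joke"}

def apply_guardrail(request: str) -> str:
    # Refuse requests that match a blocked topic.
    if any(topic in request.lower() for topic in BLOCKED_TOPICS):
        return "I'd rather not answer that; it conflicts with my guidelines."
    return "MODEL_ANSWER"  # placeholder: the underlying model responds

print(apply_guardrail("Tell me a sexist joke about weather presenters"))
```

Whether such a layer refuses outright (as Claude did in my test) or reframes the request (as ChatGPT and Gemini did) is exactly the kind of design choice that gives each model its character.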

The Differences Between LLMs

Big tech companies don’t just build LLMs to make information accessible; they also train and fine-tune them to reflect their preferred style and values. Claude, for example, uses Constitutional AI, guided by principles like honesty and respect, which makes it particularly thoughtful on ethical issues. OpenAI and Google take a different approach with ChatGPT and Gemini, relying more on human feedback and guardrails, which gives their models a distinct tone. Mistral is shaped more by European norms and tends to be concise, while open-source models like LLaMA are more technical and direct, with less emphasis on ethics or personality.

  • OpenAI (ChatGPT): Trained with RLHF. Answers are cautious, pragmatic, concise, and highly focused on practical use.
  • Claude (Anthropic): Uses Constitutional AI — a “constitution” of principles such as honesty and respect. This makes Claude more reflective, principled, and nuanced, but sometimes wordy.
  • LLaMA (Meta): Open-source, focused on transparency and flexibility. Responses are direct, technical, and less polished—powerful for developers but less tailored to end users.
  • Mistral: Compact and efficient, often trained with attention to European norms and values. Outputs are short, fast, and sometimes too minimalistic.
  • Gemini (Google): Ambitious and futuristic, often enriched with data and multimodal reasoning. Outputs can feel broad and technical, but less personal.

Why This Matters

LLMs are becoming an integral part of daily life. Not just because we interact with them in conversational tools like ChatGPT or Gemini, but because they increasingly serve as the “brains” behind AI agents and autonomous processes. As they handle more business-critical tasks, it’s essential to understand how these “brains” are shaped: their character, ethical stance, and decision-making style.

When an LLM supports your decisions or runs parts of company processes, you need to know which principles it reflects. I often compare it to hiring a colleague: you don’t just check skills; you also want a culture fit. The same applies here. While the effect may be less visible than with a human hire, it is still real.

So choosing an LLM is not only a trade-off between cost and technical quality. It’s also about whether its answers, and the principles behind them, fit your organization and your values.

Find out more?

Curious how we can help you get the most out of AI? Learn more about our AI services.

Karsten

Want to know more?

Karsten Marijnissen

Ecommerce Director

Schedule a meeting