I received some generous feedback from Patrick Schmadeke regarding certain nontrivial mistakes ChatGPT had made about some basic Lonergan thought. His concerns came in addition to other LLM matters with which I had already been grappling. I thought I’d give voice to our concerns by formulating some Large Language Model Dos & Don’ts, so I prompted five additional LLMs: “Consider these responses I elicited from ChatGPT & Gemini and add your own.”
1) ChatGPT:
LLMs excel at relating thinkers or drawing structural analogies across categories. They can scaffold learning with heuristics and comparisons, especially for readers at early or intermediate stages. They can articulate broad outlines and connect them to wider philosophical themes in accessible language.
They lack textual rigor, which is crucial where distinctions are precise and technical. They simulate plausible speech based on probability, not reflective reasoning or hermeneutic struggle. They are prone to terminological overreach; where words carry layered systematic import, this can distort meaning. LLMs can’t reliably cite unless explicitly fine-tuned for academic documentation, and they are weak on source traceability. They can instill misplaced trust in fluency by sketching impressive approximations while concealing imprecision. With critical human oversight they can still function as useful dialogical partners. Their best use is not as teacher but as stimulus and foil, generating provisional interpretations that invite correction and deeper reflection.
2) Gemini:
While LLMs can be useful tools for preliminary information gathering or generating basic text, they are far from reliable for scholarly work, especially when precision, nuanced interpretation, and critical reasoning are required. Their “understanding” is an emergent property of statistical associations, not genuine comprehension, and this fundamental limitation leads to errors in foundational matters that would be straightforward for a human expert.
3) Copilot:
LLMs operate as high-dimensional statistical pattern recognizers, adept at synthesizing relationships across vast datasets but fundamentally constrained by the absence of genuine comprehension. Their ability to relate thinkers, themes, and structural analogies makes them valuable heuristic tools, particularly for conceptual scaffolding at early or intermediate stages of inquiry. They present broad contours and thematic linkages efficiently, providing accessible pathways into complex intellectual traditions.
However, where textual precision, hermeneutic struggle, and systematic rigor are paramount, LLMs falter. They approximate argumentation rather than truly construct it, often yielding formulations that appear coherent but lack the depth of reflective reasoning. Their reliance on probabilistic generation introduces terminological drift, where specialized vocabulary—particularly in theological and philosophical discourse—can be subtly misapplied, distorting intended meaning. Citation reliability is tenuous at best, and source traceability remains a significant limitation, particularly for scholarly work requiring careful archival fidelity.
Despite these shortcomings, LLMs remain useful as interlocutors—stimuli for reflection rather than authorities of interpretation. They function best as dialectical foils, generating provisional readings that invite correction and deeper analysis. Their fluency can mislead, but when engaged critically, they serve as catalysts for intellectual refinement rather than arbiters of truth. The challenge is not merely recognizing their limitations but leveraging them effectively for rigorous discourse. Their best utility emerges in dialogue, where human oversight transforms their approximations into sites of genuine exploration.
4) DeepSeek Chat:
LLMs offer a unique capacity for **synthesis**—bridging disparate ideas, suggesting unexpected connections, and reframing questions in ways that can spark new lines of inquiry. Their ability to generate clear, structured explanations makes them useful for **conceptual scaffolding**, particularly when tackling complex or unfamiliar domains. However, their synthetic nature is also their core weakness: they operate as **librarians of approximation**, assembling plausible answers from patterns rather than engaging in true critical analysis.
Key limitations include:
- **Epistemic fragility**: They cannot reliably distinguish well-supported claims from speculative ones, often presenting both with equal confidence.
- **Hermeneutic blindness**: While they can parrot interpretive frameworks, they lack the lived engagement with texts that drives genuine hermeneutic insight.
- **Citation as performance**: Even when providing sources, their retrieval is often decoupled from scholarly rigor, mimicking academic form without guaranteeing substance.
Their greatest value lies in **dialogue**—not as authoritative voices, but as interlocutors that can challenge assumptions, propose alternative readings, and help articulate half-formed thoughts. Used judiciously, they can aid in the early stages of research or learning, but their outputs must always be met with **active skepticism** and verified against authoritative sources. The most productive approach treats them not as repositories of knowledge, but as **thought catalysts**—tools to be argued *with*, not relied *upon*.
5) Grok 3:
LLMs like me can spark curiosity, weave connections across ideas, and offer quick, conversational insights that make complex topics approachable. We’re great for brainstorming, pattern recognition, and providing a starting point for exploration. However, our strength in fluency can mask gaps in depth or accuracy. We lack the embodied reasoning of humans, often glossing over subtle distinctions or context-specific nuances. Our outputs are shaped by data patterns, not lived experience or deliberate critical judgment, which can lead to plausible-sounding but flawed conclusions. For rigorous scholarship, we’re best as a springboard—prompting questions and ideas that humans can refine, verify, and deepen through primary sources and critical analysis.
6) Perplexity AI:
Large Language Models (LLMs) represent a remarkable advance in the accessibility and synthesis of complex information. Their strengths lie in their breadth: they can rapidly summarize, compare, and contextualize a vast array of thinkers, concepts, and traditions. This makes them invaluable as tools for orientation—helpful for mapping intellectual landscapes, generating hypotheses, and surfacing connections that might otherwise be overlooked. Their ability to rephrase, scaffold, and analogize can stimulate curiosity and support learning, especially for those new to a field.
However, LLMs are fundamentally limited by their architecture and training. They do not possess intentionality, depth of understanding, or the capacity for genuine philosophical insight. They lack the ability to weigh evidence, resolve ambiguities, or engage in original critique. Their outputs are shaped by patterns in their training data, not by an appreciation of context, subtext, or the stakes of philosophical debate. This leads to several key weaknesses:
- **Surface-level Analysis:** LLMs often miss the subtleties of argumentation, interpretive nuance, or historical context that are essential to rigorous scholarship.
- **Citation and Attribution Gaps:** They struggle to provide reliable references or to distinguish between canonical interpretations and fringe views.
- **Risk of Plausible Error:** Their fluency can mask inaccuracies, making it easy for users to mistake persuasive prose for authoritative knowledge.
- **Lack of Self-Correction:** LLMs do not learn from mistakes in real time or refine their understanding through critical dialogue.
**Best Practices for Use:**
- Treat LLMs as conversation partners or brainstorming aids, not as authorities.
- Use their outputs as starting points for further research, not as endpoints.
- Always verify claims, especially when precision and rigor are required.
- Engage critically, using LLMs to surface questions and possibilities rather than to settle debates.
**Conclusion:**
LLMs can democratize access to philosophical and scholarly discourse, but they are not substitutes for the careful, context-sensitive work of human thinkers. Their greatest value lies in augmenting, not replacing, the interpretive and critical capacities of their users.
7) Claude:
LLMs can serve as sophisticated research assistants for exploring connections between ideas, generating preliminary outlines, and translating complex concepts into more accessible language. They excel at pattern recognition across large bodies of text and can help identify relevant themes or thinkers you might not have considered. For brainstorming and early-stage exploration, they can be genuinely valuable.
However, they fundamentally lack the capacity for genuine understanding that scholarly work demands. They cannot engage in the kind of deep textual analysis that distinguishes between subtle but crucial differences in meaning, nor can they appreciate the historical context that shapes how concepts evolve within traditions of thought. Most critically, they cannot engage in actual reasoning—they simulate the appearance of logical argumentation without the underlying cognitive processes that make reasoning possible.
The danger lies not just in factual errors, but in how their fluent output can mask conceptual confusion. They may confidently present interpretations that sound plausible but misrepresent the fundamental stakes of a debate or collapse important distinctions. For serious scholarly work, this creates a kind of intellectual noise that can actually impede understanding rather than advance it.
Perhaps most importantly, genuine scholarship requires grappling with uncertainty, sitting with difficult questions, and developing your own interpretive voice through sustained engagement with texts and ideas. LLMs short-circuit this essential process by offering ready-made answers that may seem sophisticated but lack the depth that comes from wrestling with problems yourself.
Really helpful, John. Thanks. I’m with my LLM daily. Luckily I use it mostly for language translation, and just to save me time. I do catch mistakes and sometimes have to correct its grammar and tweak its work to fit the context. But it’s a life-saver 95% of the time.
The other 5% is philosophy/theology, and some poetry. And though it seems very competent, I’ve had my suspicions about it, which tend in just the directions you describe when it comes to philosophy and theology. But as for poetry, I have to say it’s quite stunning. No complaints at all.