Welcome back to the NicFab Podcast dedicated to legal prompting.
I am Nicola Fabiano and this is the second episode.
Last time we talked about privacy notices, how to check them,
simplify them and adapt them to different contexts.
At the end of that episode, I introduced a topic that shifts our perspective.
We are no longer talking about how to write a prompt,
but about how the model finds the information it works with.
We are talking about RAG, Retrieval-Augmented Generation.
What is RAG?
RAG is an architecture. It works like this.
Instead of relying only on what the model learned during training,
the system retrieves documents from an external knowledge base
and inserts them into the context of your request.
The model then generates its response based on those documents.
In practice, you load a collection of materials: regulatory decisions,
contracts, legal opinions, legislation, and the system indexes them.
When you ask a question, the retrieval engine selects the most relevant fragments
and passes them to the model.
The model responds based on those fragments.
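The flow just described can be sketched in a few lines of Python. Everything here is a hypothetical simplification: the mini knowledge base, the bag-of-words similarity standing in for vector embeddings, and the prompt template are all illustrative, not how any specific product works.

```python
from collections import Counter
import math

# Hypothetical mini knowledge base; a real system would index full
# decisions, contracts, and legislation as vector embeddings.
documents = [
    "Legitimate interest requires a balancing test under Article 6(1)(f) GDPR.",
    "The contract clause on liability caps damages at the annual fee.",
    "The supervisory authority fined the controller for a missing legal basis.",
]

def bag_of_words(text: str) -> Counter:
    """Crude stand-in for an embedding: word counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question: str, k: int = 2) -> list[str]:
    """Select the k fragments most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)),
                    reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    """Insert the retrieved fragments into the model's context."""
    context = "\n".join(f"- {f}" for f in retrieve(question))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"

print(build_prompt("What does legitimate interest require?"))
```

The model never sees the whole knowledge base; it sees only what the retrieval step selected, which is why the quality of that step matters so much.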
The idea is powerful.
You no longer depend on the model's memory, which can be inaccurate or outdated.
You work with your own documents, your own sources.
For a legal professional, this seems like the ideal solution.
It seems like it, but there are risks.
The risks are the following.
The first risk is retrieval quality.
The system does not search the way a lawyer would.
It uses semantic similarity.
It selects fragments that linguistically resemble the question, not legally.
If you ask about legitimate interest,
the system might return a fragment that mentions it
but in a completely different context.
The result is an answer that looks well-founded
but rests on an irrelevant fragment.
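This failure mode is easy to reproduce with a toy similarity function. The two fragments below are invented for illustration: one merely mentions "legitimate interest" in an unrelated context, the other is legally on point but uses the statute's vocabulary.

```python
from collections import Counter
import math

def sim(a: str, b: str) -> float:
    """Crude word-overlap cosine, standing in for embedding similarity."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

question = "Does legitimate interest cover our newsletter?"

# Mentions the phrase, but in an unrelated employment/CCTV context.
off_topic = ("The court dismissed the legitimate interest argument "
             "in an employment dispute about CCTV.")
# Legally on point, but phrased in the statute's vocabulary instead.
on_point = ("Direct marketing by email may be lawful under "
            "Article 6(1)(f) GDPR after a balancing test.")

# The linguistically similar fragment wins; the legally relevant
# fragment shares no surface words with the question and scores zero.
print(sim(question, off_topic) > sim(question, on_point))  # True
```

Real embedding models are far better than word overlap, but the underlying risk is the same: similarity of language is not relevance in law.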
The second risk is fragmentation.
Documents are split into blocks, called chunks in technical terms, for indexing.
A supervisory authority decision is not a sequence of independent blocks.
It has an argumentative structure.
The premise shapes the conclusion.
If the system extracts only the conclusion without the premise,
the model works with an incomplete piece.
And it generates answers that lose the reasoning.
The third risk is outdated sources.
If your document base contains a repealed regulation or a superseded clause,
the system does not know.
It does not check validity.
It does not compare dates.
It retrieves the fragment most similar to the question,
even if it is no longer in force.
In the regulatory field, where amendments are frequent
and the differences between versions can be decisive,
this is particularly dangerous.
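One mitigation is to attach validity metadata to each fragment and filter before generation. The field names below are a hypothetical convention, not a feature of any particular platform; plain similarity retrieval ignores such metadata entirely.

```python
from datetime import date

# Hypothetical fragments tagged with validity metadata; plain RAG
# retrieves by similarity alone and never consults these fields.
fragments = [
    {"text": "Old clause: consent may be implied.",
     "in_force_until": date(2018, 5, 24)},
    {"text": "Current rule: consent must be unambiguous.",
     "in_force_until": None},
]

def in_force(fragment: dict, on: date) -> bool:
    """True if the fragment's source is still valid on the given date."""
    until = fragment["in_force_until"]
    return until is None or until >= on

# Filtering before generation removes the repealed text.
valid = [f for f in fragments if in_force(f, date.today())]
print([f["text"] for f in valid])
```

The filter is trivial; the hard part is maintaining the metadata, which is exactly the kind of curation a legal professional cannot delegate to the system.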
The fourth risk is opacity.
When the model responds using RAG,
it is not always clear which fragments it used.
Some systems show the sources, others do not.
But even when sources are shown,
you do not know how the model combined them,
which ones it prioritized, which ones it ignored.
This lack of transparency is a real problem.
A lawyer must be able to reconstruct the reasoning;
with RAG, that is often not possible.
The fifth risk concerns professional secrecy.
If you load confidential materials into a RAG platform,
where does that data go?
Who processes it? Where is it stored?
The infrastructure behind RAG is not just a technical issue.
It is a professional ethics issue.
And in many cases, it is a matter of GDPR and AI Act compliance.
The safeguards.
This does not mean RAG is useless.
Far from it. It means it must be used with awareness.
First safeguard.
Always verify the sources.
Do not trust the answer.
Read the fragments the system retrieved.
Check that they are relevant, complete and current.
Second safeguard.
Check the segmentation.
How are documents being split?
Do the chunks respect the logical structure of the text,
or do they cut it arbitrarily?
Poor segmentation produces poor retrieval.
Third safeguard.
Choose the infrastructure carefully.
Where does the data reside?
Who has access?
Are there contractual guarantees?
RAG, more than any other use of AI in law,
requires a serious infrastructure assessment.
Fourth safeguard.
Document everything.
If you use a RAG system to prepare an opinion or analyze a contract,
record the sources retrieved, the question asked and the answer received.
This is a matter of professional responsibility.
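That record-keeping can be as simple as an append-only log. The field set below is a hypothetical minimum, not a professional standard; adapt it to your own documentation duties.

```python
import json
from datetime import datetime, timezone

def log_rag_interaction(path: str, question: str,
                        fragments: list[str], answer: str) -> dict:
    """Append one RAG interaction (question, retrieved sources, answer)
    to a JSON-lines audit file, timestamped in UTC."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "retrieved_fragments": fragments,
        "answer": answer,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record

# Hypothetical usage after each query to the RAG system:
rec = log_rag_interaction(
    "rag_audit.jsonl",
    "Is the liability cap enforceable?",
    ["Clause 12.3: liability is capped at the annual fee."],
    "Under clause 12.3 the cap applies, subject to mandatory law.",
)
```

An append-only, timestamped file makes it possible to reconstruct later which sources actually shaped a given answer.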
My closing remarks.
RAG promises to solve the root cause of the problem.
In part, it does.
But it introduces new risks.
Imprecise retrieval, fragmentation, outdated sources,
opacity and exposure of confidential data.
Legal prompting applied to RAG is not just about writing the right prompt.
It is about understanding what happens before the model generates its response
and taking professional responsibility for the result.
Next time, we will talk about advanced prompting techniques,
chain of thought and few-shot applied to legal reasoning.
Subscribe to the newsletter at nicfab.eu.
Thank you for listening.
Until the next episode.