I run a startup that does legal contract generation (contracts written by lawyers turned into templates), and have done some work on GPT analysis of contracts so laypersons can interact with and ask questions about the contract they are getting.
In terms of contract review, what I've found is that GPT is better at analyzing the document than generating the document, which is what this paper supports. However, I have used several startups' AI document review offerings, and they all fall apart with any sort of prodding for specific answers. This paper looks like it just had to locate the relevant section, not necessarily have the back-and-forth conversation about the contract that a lawyer and client would have.
There is also no legal liability for GPT for giving the wrong answer. So it works well for someone smart who is doing their own research. Just like, if you are smart, you could use Google before to do your own research.
My feeling on contract generation is that, for the majority of cases, people would be better served if there were simply better boilerplate contracts available. Lawyers hoard their contracts, and it was very difficult in our journey to find lawyers who would be willing to write contracts we would turn into templates, because they are essentially putting themselves and their professional community out of income streams in the future. But people don't need a unique contract generated on the fly from GPT every time, when a template of a well-written and well-reviewed contract does just fine. It cost hundreds of millions to train GPT-4. If $10M were instead spent building a repository of well-reviewed contracts, it would be more useful than spending the equivalent money training a GPT to generate them.
People ask a pretty wide range of questions about what they want to do with their documents, and GPT didn't do a great job with them, so for the near future it looks like lawyers still have a job.
> Lawyers hoard their contracts and it was very difficult in our journey to find lawyers who would be willing to write contracts we would turn into templates because they are essentially putting themselves and their professional community out of income streams in the future.
I notice the same thing in other professions, especially those that require a huge upfront investment in education.
For example (at least where I live), there was a time about 20 years ago when architects also didn't want to produce designs that would then be sold to multiple people for cheap. The thinking was that this reduces the market for architectural output. But of course it is easy to see that most people do not really need a unique design.
So the problem solved itself, because the market does not really care: the moment somebody is able to compile a small library of usable designs and a usable business model, as an architect you can either cooperate to salvage what you can, or lose.
I believe the same is coming for lawyers. Lawyers will live through some harsh times while their easiest and most lucrative work gets automated; the market for their services is going to shrink, and whatever work is left for them will be of the more complex kind that the automation can't handle.
I think you greatly underestimate this group's ability to retain their position as a monopoly. A huge chunk of politicians are lawyers, and most legal jurisdictions have hard requirements around what work you must have a lawyer perform. These tools may make their practices more efficient internally, but that doesn't mean the value is being passed on to the consumer of the service in any way. They're a cartel, and one with very close relationships with country leadership. I don't see this golden goose dying any time soon.
I think this overlooks a big part of how the legal market works. Our easiest work is only lucrative because we use it to train new lawyers, who bill at a lower rate. To the extent the easy stuff gets automated, 1) it’s going to be impossible to find work as a junior associate and 2) senior attorneys will do the same stuff they did last year. If there’s a decrease in prices for a while, great, but a generation from now it’s going to be a lot harder to find someone knowledgeable because the training pathway will have been destroyed.
> I notice the same thing in other professions, especially those that require a huge upfront investment in education.
Doctors, for instance. You hear no end of stories about how incredibly high pressure medicine is with insane hours and stress, but will they increase university placements so they can actually hire enough trained staff to meet the workload? Absolutely fkn not, that would impact salaries.
So, you will get the template for free. And then a lawyer has to put it on their letterhead and they charge you the exact same as they do right now for that because that will be made a requirement.
I have no problem with your logic so long as it is applied uniformly to all of society. By that I mean society must abolish copyright, patents and all forms of intellectual property. Otherwise I'll be forced to defend lawyers in this case. Why can people hoard "IP" but lawyers can't hoard contracts?
I recently used Willful to create a will and was pretty disappointed with the result. The template was extremely rigid on matters that I thought should have been no-brainers to express (if X has happened, do Y, otherwise Z), and didn't allow for any kind of property division other than percentages of the total. It was also very focused on several matters that I don't feel strongly about, like the fate of my pets.
I was still able to rewrite the result into something that more suited me, but for a service with a $150 price tag I kind of hoped it would do more.
Our philosophy at GetDynasty is that the contract (in our case estate planning documents) itself is a commodity which is why we give it away for free. Charging $150 for a template doesn't make sense.
Our solution, as you point out, is more rigid than having a lawyer write it, but for the majority of people having something accessible and free is worth it, and having services layer on top makes the most sense. It is easier to have a well-written contract whose features or sections you can "turn on and off" than to try to have GPT write a custom contract for you.
> didn't allow for any kind of property division other than percentages of the total.
Knowing someone who works in Trusts & Estates, that is terrible. I've often heard complaints about drafting by percentages of anything but straight financial assets with an easily determined value, because anything else requires one or more appraisals. Yes, there are mechanisms to work it out in the end, but it is definitely better to be able to say $X to Alice, $Y to Bob, and the remainder to Claire.
You have to think of not only what you want, but how the executors will need to handle it. We all love complex formulae, but we should use our ability to handle complexity to simplify things for the heirs - it's a real gift in a bad time.
Which is mostly what I feel also happens with LLMs producing code. Useful to start with, but not more than that. We programmers have still got a job. For the moment.
Producing code is like producing syntactically correct algebra. It has very little value on its own.
I’ve been trying to pair system design with ChatGPT and it feels just like talking with a person who’s confident and regurgitates trivia, but doesn’t really understand. No sense of self-contradiction, doubt, curiosity.
I’m very, very impressed with the language abilities and the regurgitation can be handy, but is there a single novel discovery by LLMs? Even a (semantic) simplification of a complicated theory would be valuable.
You said: "However, I have used several startups options of AI document review and they all fall apart with any sort of prodding for specific answers. "
I think you will find that this is because they "outsource" the AI contract document review "final check" to real lawyers based in Utah... so it's actually a person, not really a wholly-AI solution (which is what the company I am thinking of suggests in their marketing material).
> I think you will find that this is because they "outsource" the AI contract document review "final check" to real lawyers based in Utah ... so, it's actually a person, not really a wholly-AI based solution (which is what the company I am thinking of suggests in their marketing material)
Which company is that? I don't see any point in obfuscating the name on a forum like this.
> There is also no legal liability for GPT for giving the wrong answer
It was my understanding that there is also no legal liability for a lawyer for giving the wrong answer. In extreme cases there might be ethical issues that result in sanctions by the bar, but in most cases the only consequences would be reputational.
Are there circumstances where you can hold an attorney legally liable for a badly written contract?
I believe all practicing attorneys carry malpractice insurance as well as E&O (errors and omissions) insurance. I think one of those would "cover" the attorney in your example, but obviously insurance doesn't prevent poor Google reviews, nor would it protect the attorney from anything done in bad-faith (ethical violations), or anything else that could land an attorney before the state bar association for a disciplinary hearing.
> It was my understanding that there is also no legal liability for a lawyer for giving the wrong answer.
There is plenty of legal, ethical, and professional liability for a lawyer giving the wrong answer. We don't often see the outcome of these things because, like everything in the courts, they take a long time to get resolved, and also some answers are not wrong, just "less right" or "not really that wrong."
I mean, sure, if the attorney is operating below the usual standards of care -- it's exceptionally uncommon in the corporate world, but not unheard of. In the case of AI assistance, you run into situations where a company offering AI legal advice direct to end-users is either operating as an attorney without licensing, or, if an attorney is on the nameplate, they're violating basic legal professional responsibilities by not reviewing the output of the AI (if you do legal process outsourcing -- LPO -- there's a US-based attorney somewhere in the loop who's taking responsibility for the output).
About the only case where this works in practice is someone going pro se and using their own toolset to gin up a legal AI model. There's arguably a case for acting as an accelerator for attorneys, but the problem is that if you've got an AI doing, say, doc review, you still need lawyers to review not just the output for correctness, but also go through the source docs to make sure nothing was missed, so you're not saving much in the way of bodies or time.
We're building exactly this for contract analysis: upload a contract, review the common "levers to pull", make sure there's nothing unique/exceptional, and escalate to a real lawyer if you have complex questions you don't trust with an LLM.
In our research, we found out that most everyone has the same questions: (1) "what does my contract say?", (2) "is that standard?", and (3) "is there anything I can/should negotiate here?"
Most people don't want an intense, detailed negotiation over a lease, or a SaaS agreement, or an employment contract... they just want a normal contract that says normal things, and maybe it would be nice if 1 or 2 of the common levers were pulled in their direction.
Between the structure of the document and the overlap in language between iterations of the same document (i.e. literal copy/pasting for 99% of the document), contracts are almost an ideal use-case for LLMs! (The exception is directionality - LLMs are great at learning correlations like "company, employee, paid biweekly," but bad at discerning that it's super weird if the _employee_ is paying the _company_)
That makes sense, how are you guys approaching breaking down what should be present and what is expected in contracts? I've seen a lot of chatbot-based apps that just don't cut it for my use case.
> Just like if you are smart you could use google before to do your own research.
Unfortunately people stop at step #1, they use Google and that is their research. I don't think ChatGPT is going to be treated any different. It will be used as an oracle, whether that's wise or not doesn't matter. That's the price of marketing something as artificial intelligence: the general public believes it.
I'm in the energy sector and have been thinking of fine tuning a local llm on energy-specific legal documents, court cases, and other industry documents. Would this solve some of the problems you mention about producing specific answers? Have you tried something like that?
Law in general is interpretation. The most "lawyerese" answer you can expect is "It depends". Technically in the US everything is legal unless it is restricted and then there are interpretations about what those restrictions are.
If you ask a lawyer if you can do something novel, chances are they will give a risk assessment as opposed to a yes or no answer. Their answer typically depends on how well they think they can defend it in the court of law.
I have received answers from lawyers before that were essentially "Well, it's a gray area. However, if you get sued, we have high confidence that we will prevail in court."
So outside of the more obvious cases, the actual function of law is less binary and more a gradient of defensibility, weighted by the confidence of the individual lawyer.
How do you think organizations can best use the contractual interpretations provided by LLMs? To expand on that, good lawyers don't just provide contractual interpretations, they provide advice on actions to take, putting the legal interpretation into the context of their client's business objectives and risk profile. Do you see LLMs / tools based on LLMs evolving to "contextualize" and "operationalize" legal advice?
Do you have any views on whether context window limits the ability of LLMs to provide sound contractual interpretations of longer contracts that have interdependent sections that are far apart in the document?
Has your level of optimism for the capabilities of LLMs in the legal space changed at all over the past year?
You mentioned that lawyers hoard templates. Most organizations you would have as clients (law firms or businesses) have a ton of contracts that could be used to fine-tune LLMs. There are also a ton of freely available contracts on the SEC's website. There are also companies like PLC, Matthew Bender, etc., that create form contracts and license access to them as a business. Presumably some sort of commercial arrangement could be worked out with them. I assume you are aware of all of these potential training sources, and am curious why they were unsatisfactory.
Not OP, but someone who currently runs an AI contract review tool.
To answer some of your questions:
- contract review works very well for high-volume, low-risk contract types. Think SLAs, SaaS agreements… these are contracts commercial legal teams need to review for compliance reasons but hate reviewing.
- it’s less good for custom contracts
- what law firms would benefit from is just natural language search on their own contracts.
- it also works well for due diligence. Normally lawyers can’t review all contracts a company has. With a contract review tool they can extract all the key data/risks
- the LLM doesn’t need to provide advice. The LLM can just identify whether x or y is in the contract, which improves the review process.
- context windows keep increasing, but you don’t need to send the whole contract to the LLM. You can just identify the right paragraphs and send those.
- things changed a lot in the past year. It used to cost us $2 to review a contract; now it’s $0.20. Responses are more accurate and faster.
- I don’t do contract generation but have explored it. I think the biggest benefit isn’t generating the whole contract but helping the lawyer rewrite a clause for a specific need. Standard CLMs already have contract templates that can be easily filled in. However, after the template is filled, the lawyer needs to add one or two clauses. Having a model trained on the company’s documents would be enough.
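The "identify the right paragraphs and send those" point above can be sketched with a tiny retrieval step. This is purely illustrative: keyword overlap stands in for the embedding similarity a production system would use, and the contract text is made up.

```python
# Toy sketch: pick the contract paragraphs most relevant to a question,
# so only those (not the whole contract) are sent to the LLM.
# Word overlap stands in for real embedding-based similarity.

def split_paragraphs(contract: str) -> list[str]:
    """Split a contract into paragraphs on blank lines."""
    return [p.strip() for p in contract.split("\n\n") if p.strip()]

def score(paragraph: str, question: str) -> int:
    """Count content words shared by the paragraph and the question."""
    stop = {"the", "a", "an", "of", "is", "to", "in"}
    words = lambda text: {w.strip(".,?!").lower() for w in text.split()} - stop
    return len(words(paragraph) & words(question))

def top_paragraphs(contract: str, question: str, k: int = 2) -> list[str]:
    """Return the k paragraphs most relevant to the question."""
    paras = split_paragraphs(contract)
    return sorted(paras, key=lambda p: score(p, question), reverse=True)[:k]

contract = """Payment. The Client shall pay fees within 30 days of invoice.

Termination. Either party may terminate with 60 days written notice.

Liability. Liability is capped at fees paid in the prior 12 months."""

hits = top_paragraphs(contract, "How many days notice to terminate?", k=1)
print(hits[0])  # the Termination paragraph
```

Only `hits` would then go into the prompt, which keeps token cost flat regardless of contract length.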
As I've said before, one of my biggest concerns with LLMs is that they somehow manage to concentrate their errors in precisely the places we are least likely to notice: https://news.ycombinator.com/item?id=39178183
If this is dangerous with normal English, how much more so with legal text.
At least if a lawyer drafts the text, there is at least one human with some sort of intentionality and some idea of what they're trying to say when they draft the text. With LLMs there isn't.
(And as I say in the linked post, I don't think that is fundamental to AI. It is only fundamental to LLMs, which despite the frenzy, are not the sum totality of AI. I expect "LLMs can generate legal documents on their own!" to be one of those things the future looks back on our era and finds simply laughable.)
I'm working on Hotseat - a legal Q&A service where we put regulations in a hot seat and let people ask sophisticated questions. My experience aligns with your comment that vanilla GPT often performs poorly when answering questions about documents. However, if you combine focused effort on squeezing GPT's performance with product design, you can go pretty far.
I wonder if you have written about specific failure modes you've seen in answering qs from documents? I'd love to check whether Hotseat is handling them well.
Specific failure modes can be something as simple as extracting beneficiary information from a Trust document. Sometimes it works, but a lot of times it doesn't, even with startups whose AI products are specifically for extracting information from documents. For example, it will produce an incomplete list of beneficiaries, or, if there are contingent beneficiaries, it won't know what to do. Not even on a hard question about the contingency: just making a simple list, with percentages, of the distribution if no one dies.
Further, trying to get an AI to describe the contingency is a crapshoot.
While I expect these options to get better and better, I have fun trying them out and seeing what basic thing will break. :)
Your post is very interesting. Thanks for sharing.
If your focus is narrow enough, vanilla GPT can still provide good results. We narrow down the scope for GPT and ask it to answer binary questions. With that we get good results.
Your approach is better for supporting broader questions. We support that as well and there the results aren’t as good.
"So It works well for someone smart who is doing their own research."
That's a concise explanation that also applies to GPTs and software engineering. GPT4 boosts my productivity as a software engineer because it helps me travel the path quicker. Most generated code snippets need a lot of work because I'm prodding it for specific use cases and it fails. It's perfect as an assistant though.
>There is also no legal liability for GPT for giving the wrong answer. So It works well for someone smart who is doing their own research. Just like if you are smart you could use google before to do your own research.
How is that good for the end user? Malpractice claims are often all that is left for a client after the attorney messes up their case. If you use a GPT, you wouldn't have that option.
I launched a contract review tool about year ago[1].
The legal liability is an issue in several countries but contract generation can also be. If you are providing whatever is defined as legal services and are not a law firm, you will have issues.
> If you are providing whatever is defined as legal services and are not a law firm, you will have issues.
that is a big reason why we haven't integrated AI tools into our product yet. Currently our business essentially works as a free product that is the equivalent of a "stationery store": you are filling out a blank template, and what happens is your responsibility. This has a long history of precedent, since for decades people could buy these templates off the shelf and fill them out themselves.
Giving our users a tool to answer legal questions opens a can of worms, like you say. We decided that the stationery-store templates are a commodity and should be free (even though our competitors charge hundreds for them), so we make money providing services on top.
"There is also no legal liability for GPT for giving the wrong answer."
I mean, I get your point, but let's be real: I cannot count the number of times I sat in a meeting, looked back at a contract, and wished some element had a different structure. In law there are a lot of "wrong answers" someone could foolishly provide, but far more often it's a question of how "wrong" the answer is, rather than a binary bad/good piece of advice.
I personally feel the ability to have more discussion about a clause is extremely helpful, versus getting a hopefully "right answer" from a lawyer while counting the clock / $ as you try to wrap your head around the advice you're being given. If you have deep pockets, you invite your lawyer to a lot of meetings, they have context, and away you go... but for a lot of people, you're just involving the lawyer briefly and trying to avoid billable hours. That's been me at the early stage of everything, and it's a very tricky balance.
If you're a startup trying to use GPT, I say do it, but also use a lawyer. Augmenting the lawyer with GPT to save billable hours, so you can turn up to a meeting with your lawyer and extract the most value in the shortest time, seems like the best play to me.
You can read my other reply which agrees with you that law is a spectrum rather than a binary.
> I cannot count the number of times I sat in a meeting and looked back at a contract and wished some element had a different structure to it.
The only way to have something "bulletproof" is to have experience with the ways in which something can go wrong. It's just like writing a program: the "happy path" is rather obvious, but then you have to come up with all the different attack vectors and use cases in which the program can fail.
The same is true of lawyers. Lawyers at big firms have the experience of the firm to guide them on what to do and what they should include in a contract. A small-town family lawyer might have no experience in what you ask them to do.
Which is why I advocate for more standardized agreements, as opposed to one-off generated agreements (whether by GPT or a lawyer). Think of the Y Combinator SAFE: it made a huge impact on financing because it was standardized and there were really no terms to negotiate, compared to the world before, in which the terms of convertible notes were complex and had to be scrutinized and negotiated.
> Augmenting the lawyer with GPT to save billable hours so you can turn up to a meeting with your lawyer and extract the most value in the shortest time period seems like the best play to me.
The issue is that a lot of lawyers have a conflict of interest and a "not invented here" way of doing business. If you have a Trust, for instance, written by one lawyer and bring it to another lawyer, the majority of lawyers we talked to would prefer to throw out the document and use their own. This method works well if you are a smart, savvy person, but the population at large has some crazy and weird ideas about how the law works and needs to be talked out of what they want into something more sane.
Another common lawyer response, besides "It depends," is "Well, you can, but why would you want to?" So many people have a skewed view of what they want, and part of a lawyer's job is interpreting what they really want and guiding them toward it.
So the hybrid method really only works if you find a lawyer who accepts whatever crazy terms you came up with and is willing to work with what you generated.
That was more directed towards people who are trying to train AIs to be a competitor to Nolo. Lots of document repositories exist, but they won't work with you if you want to sell legal contracts yourself. I have seen a lot of startups raise money to try to build an AI solution to this, but the results haven't been great so far.
> It works well for someone smart who is doing their own research. Just like if you are smart you could use google before to do your own research.
That's a trap: if you don't have prior expertise, you can't distinguish plausible-sounding fiction from fact. And if you think you are "smart," research suggests you are easier to fool, because you are more likely to believe you already know.
Google finds lots of mis/disinformation. GPTs are automated traps: they generate the most statistically likely text from ... the Internet! Not exactly encouraging. (Of course, it really depends on the training data.)
I'd like to know your company, and talk to you about GPT as it applies to legal - this is the most disruptive space available to GPTs, and it's being poorly addressed.
> Lawyers hoard their contracts and it was very difficult in our journey to find lawyers who would be willing to write contracts we would turn into templates because they are essentially putting themselves and their professional community out of income streams in the future.
This is why Lawyers must die.
If "letter of law" is a thing - then "opinions" in legal parlance, should be BS.
We should have every SC decision eviscerated by GPTs
And any single person saying "You need a lawyer to figure this out" is a grifter and a charlatan.
--
Anyway - I'd like to know more about your startup. I'd like to know what you do for legal DMS, a la Hummingbird for biotech. There are so many real applications of GPT to legal troves, such as auto-indexing, summaries, parties, contacts, dates, and so on, that GPTs make legal searching all the more powerful, and any legal person in any position telling you anything bad about computers keeping them in check is a fraud.
The legal industry is the most low hanging fruit for GPTs.
This is one of the domains I'm very, very excited about for LLMs to help me with. In 5-10 years (even though this research paper makes me feel it's already here), I would feel very confident chatting for a few hours with a "lawyer" LLM that has access to all my relevant tax/medical/insurance/marriage documents and would be able to give me specialized advice and information without billing me $500 an hour.
A wave of (better) legally informed common people is coming, and I couldn't be more excited!
I wouldn't blindly trust what the LLM says, but I take it that it would be mostly right, and that would give me at the very least explorable vocabulary that I can expand on my own, or keep grilling it about.
I've already used some LLMs to ask questions about licenses and legal consequences for software-related matters, and they gave me a base without my having to involve a very expensive professional for what are mostly questions about hobby things I'm doing.
If there were a significant amount of money involved in the decision, though, I would of course use the services of a professional. These are the kinds of topics where you can't be just "mostly right."
I don't understand how everyone keeps making this mistake over and over.
They explicitly just said "in 5-10 years".
So many people continually use arguments that revolve around "I used it once and it wasn't the best and/or it messed things up," and imply that this will always be the case.
There are many solutions already for knowledge editing, there are many solutions for improving performance, and there will very likely continue to be many improvements across the board for this.
It took ~5 years from when people in the NLP literature noticed BERT and knew the powerful applications that were coming, until the public at large was aware of the developments via ChatGPT.
It may take another 5 before the public sees the developments happening now in the literature hit something in a company's web UI.
I like the term "explorable vocabulary." I can see using LLMs to get an idea of what the relevant issues are before I approach a professional, without assuming that any particular claim in the model's responses is correct.
Our core business is legal document generation (rule based logic, no AI). Since we already have the users' legal documents available to us as a result of our core business, we are perfectly positioned to build supplementary AI chat features related to legal documents.
We recently deployed a product recommendation AI to prod (partially rule based, but personalized recommendation texts generated by GPT-4). We are currently building AI chat features to help users understand different legal documents and our services. We're intending to replace the first level of customer support with this AI chat (and before you get upset, know that the first level of customer support is currently a very bad rule-based AI).
Main website in Finnish: https://aatos.app (also some services for SE and DK, plus we recently opened the UK with just an e-sign service)
So, let’s say that the chat will some day work as well as a real lawyer.
If the current pricing is $500 an hour for a real lawyer, and at some point your costs are just keeping the service up and running, how big a cut will you take? Because being only a little cheaper than the real lawyer is enough to win customers.
There is an upcoming monopoly problem if users get the best information from the service only after they submit all their documents. And soon the normal lawyer might be competitive again. I fear that the future is in the parent commenter’s open platform with open models, and that businesses should extract money from other use cases; for a while, you get money based on the typical "I am first, I have the user base" situation. It is interesting to see what will happen to lawyers.
Here's an example of what our product recommendations look like:
Given your ownership in a company and real estate, a lasting power of attorney is a prudent step. This allows you to appoint PARTNER_NAME or another trusted individual to manage your business and property affairs in the event of incapacitation. Additionally, it can also provide tax benefits by allowing tax-free gifts to your children, helping to avoid unnecessary inheritance taxes and secure the financial future of your large family.
> Since we already have the users' legal documents available to us as a result of our core business, we are perfectly positioned to build supplementary AI chat features related to legal documents.
Or an LLM that helps you spend less. Imagine an LLM that goes over all your spending, knows all the current laws, benefits, organizations, and promotional campaigns, and suggests (or even executes) things like changing electricity provider, changing insurance provider, or buying in bulk from a different shop the stuff you pay 4x for at your local store.
Yup. Because the doctor doesn't have time and doesn't give a fuck.
LLMs don't have to compete against the cutting edge of human professional knowledge. They only have to compete against the disinterested, arrogant, greedy, and overworked professionals that are actually available to people in practice. No wonder they're winning.
This is a really interesting use case for me. I've been envisioning a specially trained LLM that can give useful advice or insights that your average PCP might gloss over or not have the time to investigate.
Did you do anything special to achieve this? What were the results like?
I think a lot of startups are working on exactly what you are describing, and honestly, I wouldn't hold my breath. Everyone is still bound by token limits and the two best approaches for getting around them are RAG and Knowledge-Graphs, both of which could get you close to what you describe but not close enough to be useful (IMO).
This does not make sense to me. ChatGPT is completely nerfed to the point where it's either been conditioned or trained to provide absolutely zero concrete responses to anything. All it does is provide the most baseline, generic possible response followed by some throwaway recommendation to seek the advice of an actual expert.
The way to get around this is to have it "quote", or at least try to quote, from input documents. Which is why RAG became so popular. Sure, it won't write you a contract, but it will read one back to you if you've provided one in your prompt.
In my experience, this does not get you close to what the top-level comment is describing. But it gets around the "nerfing" you describe.
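A minimal sketch of that quote-back pattern, with a post-hoc check that a "quote" really appears verbatim in the supplied contract (the prompt wording and function names here are illustrative, not from any particular product):

```python
def build_quote_prompt(contract_text: str, question: str) -> str:
    """Ask the model to answer only by quoting the supplied contract."""
    return (
        "Answer the question using ONLY verbatim quotes from the contract below.\n"
        "If the contract does not address it, reply exactly: NOT FOUND.\n\n"
        f"CONTRACT:\n{contract_text}\n\nQUESTION: {question}"
    )

def quote_is_verbatim(answer: str, contract_text: str) -> bool:
    """Reject answers whose quoted span does not occur in the source text.

    Whitespace is normalized so line wrapping doesn't cause false negatives.
    """
    norm = lambda s: " ".join(s.split())
    return answer.strip() == "NOT FOUND" or norm(answer) in norm(contract_text)

contract = "The Tenant shall pay rent of $1,200\non the first day of each month."
assert quote_is_verbatim("The Tenant shall pay rent of $1,200 on the first day", contract)
assert not quote_is_verbatim("Rent is due on the fifteenth.", contract)
```

The verbatim check is the important half: without it, a "quote" can be a hallucination that merely sounds like the document.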
It's very easy to get ChatGPT to provide legal advice based on information fed in the prompt. OpenAI is not censoring legal advice anywhere near as hard as they are censoring politics or naughty talk.
I will reserve judgment on the possibilities of LLMs as applied to the legal field until they are tested on something other than document/contract review. Contract review, in the large business-law case, is often done by outsourcing to hundreds of recent graduates and acts more like proofreading, with minimal application of actual lawyering skills, to increase a corporation’s bottom line.
The more common, for individual purchasers of legal services, lawyering is going to be family law matters, criminal law matters, and small claims court matters. I can not see a time in the near future where an LLM can handle the fact specific and circumstantial analysis required to handle felony criminal litigation, and I see nothing that would imply LLMs can even approach the individualized, case specific and convoluted family dynamics required for custody cases or contested divorces.
I’m not unwilling to accept LLMs as a tool an attorney can use, but outside of more rote legal proof reading I don’t think the technology is at all ready for adoption in actual practice.
"and I see nothing that would imply LLMs can even approach the individualized, case specific and convoluted family dynamics required for custody cases or contested divorces."
Humans are pretty bad at this. Based on the results, it seems the judges' personal views and emotions are a large part of these cases. I'm not sure what they would look like without emotion, personal views, and the case law built off of those.
The worse judges are at being perfectly removed arbiters of justice, the more room for lawyers to exploit things like emotions and humans connections with those judges, and thus the worse LLMs will be at doing that part of the job. A charismatic lawyer backed by an LLM will be much better than an LLM.
At least until the LLMs surpass humans at being charismatic, but that would seem to be its own nightmare scenario.
> judges' personal views and emotions are a large part of these cases
That's a completely separate question. We're talking about automating lawyers, not judges. (to be a good lawyer in such a situation, you would need to model the judge's emotions and use them to your advantage. Probably AIs can do this eventually but it's not easy or likely to happen soon)
I lead data and AI at a tech-enabled commercial insurance brokerage. We have been leveraging GPT-4 to build a deep contract analysis tool specifically for insurance requirements. My teams at Google also built several LLM solutions to support Google's legal team, from patent classification to discovery support.
Language models are great at digesting legalese and analyzing what's going on. However, many legal applications revolve around pretty important decisions that you don't want to get wrong ("am I contractually covered with my current insurance program?"). Because of that, we've built LLM products in the legal space with the following principles in mind:
- Human-in-the-loop tooling -- The product should be built around an expert using it whenever possible, so decision support as opposed to automation. You still see massive time savings with that in place
- Transparency / citations -- With a human-in-the-loop tool, you need mechanisms to build trust. Whether that's highlighting clauses in the document that the LLM referred to or explaining why a part of the analysis wasn't provided, citing your work is important
- Tuned for precision instead of recall -- False positives (and hallucinations) are especially bad in many of these legal use cases, so tuning the model or prompts for precision helps with mitigation.
Do you have any pointers on how to get GPT-4 to do citations? Is it a prompt like “quote back the passage you are citing”, so you can locate and highlight the original?
When you say “tuned for precision” is this your prompt engineering or are you actually fine-tuning GPT-4?
For RAG applications, starting simply with a reference to the most relevant chunk(s) is helpful in building transparency. A lot of our contract review task is a data extraction one (e.g. extract this type of insurance language and compare against your policy). As such, it's much easier to pinpoint the exact text from the source doc for citation. In our applications, we are currently doing citations as a postprocessing task as opposed to as part of the prompt itself. We're finding that feeding too many instructions to the LLM results in worse responses.
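One way to do citations as a postprocessing step, as described above, is to locate the extracted text in the source document and return character offsets for highlighting. A sketch using stdlib `difflib` (an illustration of the idea, not their actual pipeline):

```python
import difflib

def locate_citation(extracted: str, source: str):
    """Return (start, end) character offsets of the best match for
    `extracted` inside `source`, or None if nothing similar is found."""
    matcher = difflib.SequenceMatcher(None, source, extracted, autojunk=False)
    match = matcher.find_longest_match(0, len(source), 0, len(extracted))
    # Require most of the extracted text to be present before citing it.
    if match.size < 0.8 * len(extracted):
        return None
    return (match.a, match.a + match.size)

policy = ("Contractor shall maintain commercial general liability "
          "insurance of $1,000,000 per occurrence.")
span = locate_citation("general liability insurance of $1,000,000", policy)
assert span is not None and policy[span[0]:span[1]].startswith("general liability")
```

The 0.8 threshold is an arbitrary knob: it lets minor paraphrases through while refusing to "cite" text the model invented.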
We're not fine-tuning today. Instead, "tuning for precision" is done through prompt chains. A simple example would be returning "I don't know" early on if the document isn't a contract or doesn't have clear insurance requirements in it. We've had success with various guardrail prompts.
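A guardrail chain of the shape described above can be as simple as a cheap yes/no classification prompt that runs before the expensive analysis prompt, failing closed to "I don't know". The model call is stubbed out here, since the exact prompts and routing are the tuned part:

```python
def guarded_review(document: str, ask_llm) -> str:
    """Prompt chain: cheap guardrail first, full analysis only if it passes.
    `ask_llm` is any callable prompt -> response (stubbed below)."""
    verdict = ask_llm(
        "Does the following document contain insurance requirements? "
        "Answer YES or NO only.\n\n" + document
    )
    if verdict.strip().upper() != "YES":
        # Fail closed: better "I don't know" than a hallucinated analysis.
        return "I don't know - this does not appear to contain insurance requirements."
    return ask_llm("Extract the insurance requirements from:\n\n" + document)

# Stub standing in for a real LLM call.
def fake_llm(prompt: str) -> str:
    if prompt.startswith("Does the following"):
        doc = prompt.split("\n\n", 1)[1]
        return "YES" if "insurance" in doc.lower() else "NO"
    return "Requires $1M general liability coverage."

assert guarded_review("Vendor shall carry insurance...", fake_llm).startswith("Requires")
assert guarded_review("A poem about autumn leaves", fake_llm).startswith("I don't know")
```

Returning "I don't know" early is exactly the precision-over-recall trade-off: you give up some answers to avoid confidently wrong ones.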
The citation work we did at Google used model internals to highlight text (path integrated gradients). It's also easier to finetune for precision when you have control over the model itself.
While this paper is clearly not without merits, it reads more like an excuse to make bombastic statements about a whole profession or "industry" (perhaps to raise their visibility and try to sell something later on?).
The worst part is that they have actually referenced a single preprint document as "previous art" - and that document itself is not related to contract review, but to legal reasoning in general. (A part of LegalBench is of course "interpretation", and that is built on existing contract review benchmarks, but they could've found more relevant papers).
Automating legal document review has been a very active field in NLP for twenty years or so (including in QA tasks) and became a lot more active since 2017. At least e.g. Kira (and Luminance etc., none of which is LLM-based) are already used quite widely in legal departments/firms around the world. So lawyers do have practical experience in their limitations...
But Kira & co. are not measuring the performance of the latest and greatest models and they do not use transparent benchmarks etc. So the benchmark results in this paper are indeed a welcome addition in terms of using LLMs.
But also, considering its limited scope of reviewing 10 (!) documents based on a single review playbook, they should not have written about "implications for the industry". It is very pretentious, and it says more about the authors' lack of knowledge of that very industry than about the future of the legal services industry.
The third one (CUAD) is a single paper, not blog posts like the others. I think that paper is still the best in terms of being done by NLP experts, understanding the possibilities, and not being just smoke and mirrors. But there are so many papers published in this area nowadays that I might not even notice a new one.
The CUAD paper was still based on BERT, so pretraining was needed - that needs a bit more expertise than just prompting GPT-4-32k like in this paper or feeding prompts back to GPT-4 for another round of refining, or doing RAG.
For honest research purposes, "contract review" is not really a good area of approach: the subject field is not standardised, there is no good benchmark yet, and your paper can easily fall into the bad company of snake-oil sellers cashing in on the visceral hatred average people have for all professions.
I think it is going to go the other way: being a lawyer will now be a lot more expensive, requiring servers doing AI inference, developers, and third-party services.
Lawyers control government, at least in the U.S. Expect laws banning or severely restricting the use of AI in the legal field soon. I expect arguments will range from the dangers of ineffective counsel to "but think of the children" - whatever helps them protect their monopoly.
Ah yes, the story of bad people not wanting their livelihoods taken from them by good tech giants. Seriously, is there no room for empathy in all of this ? If you went through law school and likely got yourself in debt in the process then you're not protecting any monopoly but your means to exist. There are people like that out there you know.
Given that the recent legal cases where lawyers used ChatGPT to do research and help write their briefs did not go very well, I'm not sold on all the optimism that's here in the comments.
Those were rookie-level mistakes though. Not checking that a *case* exists? Building a small generation->validation pipeline isn't trivial, but it isn't impossible. The cases you describe seem to me like very lazy associates matched with a very poor understanding of what LLMs do.
Theoretically.
The language in laws is structured similarly to code. It has some logical structure, and thus should be more easily adapted to LLMs than other 'natural language'.
So despite the early news about lawyers submitting 'fake' cases, it is only a matter of time before the legal profession is up-ended. There are a ton of paralegals etc. doing 'grunt' work in firms that an LLM can do. These jobs are considered white collar, and they will be gone.
It will progress in a similar fashion to coding.
It will be like having a junior partner that you have to double check, or that can do some boiler plate for you.
You can't trust it completely, but you don't completely trust your junior devs either, do you? And it gets you 80% of the way there.
> It will progress in a similar fashion to coding.
I kind of agree with this, but this is why I am confused that I only ever see people (at least on HN) talk about AI up-ending the legal profession and putting droves of lawyers out of work--I never see the same talk about the coding industry being transformed in this way.
In terms of contract review, what I've found is that GPT is better at analyzing a document than generating one, which is what this paper supports. However, I have used several startups' AI document review offerings, and they all fall apart with any sort of prodding for specific answers. This paper looks like it just had to locate the section, not have the back-and-forth conversation about the contract that a lawyer and client would have.
There is also no legal liability for GPT for giving the wrong answer. So it works well for someone smart who is doing their own research, just as a smart person could use Google before to do their own research.
My feeling on contract generation is that for the majority of cases, people would be better served if there were simply better boilerplate contracts available. Lawyers hoard their contracts, and it was very difficult in our journey to find lawyers who would be willing to write contracts we would turn into templates, because they are essentially putting themselves and their professional community out of income streams in the future. But people don't need a unique contract generated on the fly from GPT every time when a template of a well-written and well-reviewed contract does just fine. It cost hundreds of millions to train GPT-4. If $10M were spent just building a repository of well-reviewed contracts, it would be more useful than spending the equivalent money training a GPT to generate them.
People ask a pretty wide range of questions about what they want to do with their documents, and GPT didn't do a great job with them, so for the near future it looks like lawyers still have a job.
I notice same things in other professions, especially where it requires a huge upfront investment in education.
For example (at least where I live), there was a time about 20 years ago where architects also didn't want to produce designs that would be then sold to multiple people for cheap. The thinking was that this reduces market for architecture output. But of course it is easy to see that most people do not really need a unique design.
So the problem solved itself, because the market does not really care: the moment somebody is able to compile a small library of usable designs and a workable business model, as an architect you can either cooperate to salvage what you can, or lose.
I believe the same is coming for lawyers. Lawyers will live through some harsh times while their easiest and most lucrative work gets automated; the market for their services is going to shrink, and whatever work is left for them will be of the more complex kind that the automation can't handle.
I think this overlooks a big part of how the legal market works. Our easiest work is only lucrative because we use it to train new lawyers, who bill at a lower rate. To the extent the easy stuff gets automated, 1) it’s going to be impossible to find work as a junior associate and 2) senior attorneys will do the same stuff they did last year. If there’s a decrease in prices for a while, great, but a generation from now it’s going to be a lot harder to find someone knowledgeable because the training pathway will have been destroyed.
Doctors, for instance. You hear no end of stories about how incredibly high pressure medicine is with insane hours and stress, but will they increase university placements so they can actually hire enough trained staff to meet the workload? Absolutely fkn not, that would impact salaries.
I was still able to rewrite the result into something that more suited me, but for a service with a $150 price tag I kind of hoped it would do more.
Our solution, like you point out, is more rigid than having a lawyer write it, but for the majority of people having something accessible and free is worth it, and then layering services on top makes the most sense. It is easier to have a well-written contract where you can "turn on and off" features or sections than to try to have GPT write a custom contract for you.
Knowing someone who works in Trusts & Estates, that is terrible. I've often heard complaints about drafting by percentages of anything but straight financial assets which have an easily determined value, because that requires an appraisal(s). Yes, there are mechanisms to work it out in the end, but it is definitely better to be able to say $X to Alice, $Y to Bob and the remainder to Claire.
You have to think of not only what you want, but how the executors will need to handle it. We all love complex formulae, but we should use our ability to handle complexity to simplify things for the heirs - it's a real gift in a bad time.
Pet trusts [1]! My lawyer literally used their existence, which I find adorable, to motivate me to read my paperwork.
[1] https://www.aspca.org/pet-care/pet-planning/pet-trust-primer
I’ve been trying to pair system design with ChatGPT and it feels just like talking with a person who’s confident and regurgitates trivia, but doesn’t really understand. No sense of self-contradiction, doubt, curiosity.
I’m very, very impressed with the language abilities and the regurgitation can be handy, but is there a single novel discovery by LLMs? Even a (semantic) simplification of a complicated theory would be valuable.
I think you will find that this is because they "outsource" the AI contract document review "final check" to real lawyers based in Utah ... so it's actually a person, not really a wholly AI-based solution (which is what the company I am thinking of suggests in their marketing material).
Which company is that? I don't see any point in obfuscating the name on a forum like this.
It was my understanding that there is also no legal liability for a lawyer for giving the wrong answer. In extreme cases there might be ethical issues that result in sanctions by the bar, but in most cases the only consequences would be reputational.
Are there circumstances where you can hold an attorney legally liable for a badly written contract?
There is plenty of legal, ethical, and professional liability for a lawyer giving the wrong answer. We don't often see the outcome of these things because, like everything in the courts, they take a long time to get resolved, and also some answers are not wrong, just "less right" or "not really that wrong."
Wrong Word in Contract Leads to $2M Malpractice Suit[1].
[1]https://lawyersinsurer.com/legal-malpractice/legal-malpracti...
About the only case where this works in practice is someone going pro se and using their own toolset to gin up a legal AI model. There's arguably a case for acting as an accelerator for attorneys, but the problem is that if you've got an AI doing, say, doc review, you still need lawyers to review not just the output for correctness, but also go through the source docs to make sure nothing was missed, so you're not saving much in the way of bodies or time.
In our research, we found out that most everyone has the same questions: (1) "what does my contract say?", (2) "is that standard?", and (3) "is there anything I can/should negotiate here?"
Most people don't want an intense, detailed negotiation over a lease, or a SaaS agreement, or an employment contract... they just want a normal contract that says normal things, and maybe it would be nice if 1 or 2 of the common levers were pulled in their direction.
Between the structure of the document and the overlap in language between iterations of the same document (i.e. literal copy/pasting for 99% of the document), contracts are almost an ideal use-case for LLMs! (The exception is directionality - LLMs are great at learning correlations like "company, employee, paid biweekly," but bad at discerning that it's super weird if the _employee_ is paying the _company_)
Unfortunately people stop at step #1, they use Google and that is their research. I don't think ChatGPT is going to be treated any different. It will be used as an oracle, whether that's wise or not doesn't matter. That's the price of marketing something as artificial intelligence: the general public believes it.
Law in general is interpretation. The most "lawyerese" answer you can expect is "It depends". Technically in the US everything is legal unless it is restricted and then there are interpretations about what those restrictions are.
If you ask a lawyer if you can do something novel, chances are they will give a risk assessment as opposed to a yes or no answer. Their answer typically depends on how well they think they can defend it in the court of law.
I have received answers from lawyers before that were essentially "Well, its a gray area. However if you get sued we have high confidence that we will prevail in court".
So outside of the more obvious cases, the actual function of law is less binary but more a function of a gradient of defensibility and the confidence of the individual lawyer.
Do you have any views on whether context window limits the ability of LLMs to provide sound contractual interpretations of longer contracts that have interdependent sections that are far apart in the document?
Has your level of optimism for the capabilities of LLMs in the legal space changed at all over the past year?
You mentioned that lawyers hoard templates. Most organizations you would have as clients (law firms or businesses) have a ton of contracts that could be used to fine tune LLMs. There are also a ton of freely available contracts on the SEC's website. There are also companies like PLC, Matthew Boender, etc., that create form contracts and license access to them as a business. Presumably some sort of commercial arrangement could be worked out with them. I assume you are aware of all of these potential training sources, and am curious why they were unsatisfactory.
Thanks for any response you can offer.
To answer some of your questions:
- Contract review works very well for high-volume, low-risk contract types. Think SLAs, SaaS… these are contracts commercial legal teams need to review for compliance reasons but hate doing.
- It's less good for custom contracts.
- What law firms would benefit from is just natural-language search over their own contracts.
- It also works well for due diligence. Normally lawyers can't review all the contracts a company has. With a contract review tool they can extract all the key data/risks.
- The LLM doesn't need to provide advice. The LLM can just identify whether x or y is in the contract, which improves the review process.
- Context windows keep increasing, but you don't need to send the whole contract to the LLM. You can just identify the right paragraphs and send those.
- Things changed a lot in the past year. It used to cost us $2 to review a contract; now it's $0.20. Responses are more accurate and faster.
- I don't do contract generation but have explored it. I think the biggest benefit isn't generating the whole contract but helping the lawyer rewrite a clause for a specific need. The standard CLMs already have contract templates that can be easily filled in. However, after the template is filled, the lawyer needs to add one or two clauses. Having a model trained on the company's documents would be enough for that.
Hope this helps
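The "identify the right paragraphs and send those" point above is plain retrieval. Production systems typically rank by embedding similarity; a lexical-overlap stand-in is enough to show the shape (illustrative only):

```python
import re

def top_paragraphs(contract: str, query: str, k: int = 3) -> list:
    """Rank paragraphs by word overlap with the query and keep the top k,
    so only a fraction of the contract enters the LLM context window."""
    words = lambda s: set(re.findall(r"[a-z0-9]+", s.lower()))
    query_words = words(query)
    paragraphs = [p for p in contract.split("\n\n") if p.strip()]
    return sorted(
        paragraphs,
        key=lambda p: len(query_words & words(p)),
        reverse=True,
    )[:k]

contract = (
    "1. Term. This agreement lasts twelve months.\n\n"
    "2. Insurance. Vendor shall maintain liability insurance of $1,000,000.\n\n"
    "3. Notices. Notices shall be sent by certified mail."
)
best = top_paragraphs(contract, "what insurance must the vendor maintain?", k=1)
assert "Insurance" in best[0]
```

Only the selected paragraphs then go into the prompt, which is how review stays cheap even as contracts run to dozens of pages.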
If this is dangerous with normal English, how much more so with legal text.
At least if a lawyer drafts the text, there is at least one human with some sort of intentionality and some idea of what they're trying to say when they draft the text. With LLMs there isn't.
(And as I say in the linked post, I don't think that is fundamental to AI. It is only fundamental to LLMs, which despite the frenzy, are not the sum totality of AI. I expect "LLMs can generate legal documents on their own!" to be one of those things the future looks back on our era and finds simply laughable.)
I'm working on Hotseat - a legal Q&A service where we put regulations in a hot seat and let people ask sophisticated questions. My experience aligns with your comment that vanilla GPT often performs poorly when answering questions about documents. However, if you combine focused effort on squeezing GPT's performance with product design, you can go pretty far.
I wonder if you have written about specific failure modes you've seen in answering qs from documents? I'd love to check whether Hotseat is handling them well.
If you're curious, I've written about some of the design choices we've made on our way to creating a compelling product experience: https://gkk.dev/posts/the-anatomy-of-hotseats-ai/
Specific failure modes can be something as simple as extraction of beneficiary information from a Trust document. Sometimes it works, but a lot of times it doesn't even with startups with AI products specific to extracting information from documents. For example it will have an incomplete list of beneficiaries, or if there are contingent beneficiaries, it won't know what to do. Not even a hard question about the contingency. Just making a simple list with percentages of if no-one dies what is the distribution.
Further, trying to get an AI to describe the contingency is a crapshoot.
While I expect these options to get better and better, I have fun trying them out and seeing what basic thing will break. :)
If your focus is narrow enough, vanilla GPT can still provide good enough results. We narrow down the scope for GPT and ask it to answer binary questions. With that we get good results.
Your approach is better for supporting broader questions. We support those as well, and there the results aren't as good.
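Narrowing the scope to binary questions, as described, amounts to constraining both the prompt and the accepted answers. A hypothetical sketch of that shape, with a strict parser that routes anything non-binary back to a human:

```python
def binary_question_prompt(clause: str, question: str) -> str:
    """Scope the model to one clause and one yes/no question."""
    return (
        f"Clause:\n{clause}\n\n"
        f"Question: {question}\n"
        "Answer with exactly one word: YES or NO."
    )

def parse_binary(response: str):
    """Accept only a clean YES/NO; anything else is treated as no answer,
    which a human reviewer then handles."""
    word = response.strip().rstrip(".").upper()
    return {"YES": True, "NO": False}.get(word)

assert parse_binary("YES") is True
assert parse_binary("no.") is False
assert parse_binary("It depends on jurisdiction") is None
```

Collapsing the answer space this way trades expressiveness for reliability, which matches the "good results on binary questions, weaker on broad ones" observation above.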
That's a concise explanation that also applies to GPTs and software engineering. GPT4 boosts my productivity as a software engineer because it helps me travel the path quicker. Most generated code snippets need a lot of work because I'm prodding it for specific use cases and it fails. It's perfect as an assistant though.
How is that good for the end user? Malpractice claims are often all that is left for a client after the attorney messes up their case. If you use a GPT, you wouldn't have that option.
The legal liability is an issue in several countries, but contract generation can be as well. If you are providing whatever is defined as legal services and are not a law firm, you will have issues.
[1] legalreview.ai
> If you are providing whatever is defined as legal services and are not a law firm, you will have issues.
That is a big reason why we haven't integrated AI tools into our product yet. Currently our business essentially works as a free product that is the equivalent of a "stationery store": you are filling out a blank template, and what happens is your responsibility. This has a long history of precedent, since for decades people could buy these templates off the shelf and fill them out themselves.
Giving our users a tool to answer legal questions opens a can of worms, like you say. We decided that the stationery-store templates are a commodity and should be free (even though our competitors charge hundreds for them), so we make money providing services on top of them.
I mean, I get your point, but let's be real: I cannot count the number of times I sat in a meeting, looked back at a contract, and wished some element had a different structure. In law there are a lot of "wrong answers" someone could foolishly provide, but much more often it's a question of how "wrong" the answer is, rather than a binary good/bad piece of advice.
I personally feel the ability to have more discussion about a clause is extremely helpful, versus getting a hopefully "right answer" from a lawyer while counting the clock / $ as you try to wrap your head around the advice you're being given. If you have deep pockets, you invite your lawyer to a lot of meetings, they have context, and away you go... but for a lot of people, you're just involving the lawyer briefly and trying to avoid billable hours. That's been me at the early stage of everything, and it's a very tricky balance.
If you're a startup trying to use GPT, i say do it, but also use a lawyer. Augmenting the lawyer with GPT to save billable hours so you can turn up to a meeting with your lawyer and extract the most value in the shortest time period seems like the best play to me.
> I cannot count the number of times I sat in a meeting and looked back at a contract and wished some element had a different structure to it.
The only way to have something "bullet proof" is to have experience of the ways in which something can go wrong. It's just like writing a program: the "happy path" is rather obvious, but then you have to come up with all the different attack vectors and use cases in which the program can fail.
The same is with lawyers. Lawyers at big firms have the experience of the firm to guide them on what to do and what they should include in a contract. A small town family lawyer might have no experience in what you ask them to do.
Which is why I advocate for more standardized agreements as opposed to one-off generated agreements (whether by GPT or a lawyer). Think of the Y Combinator SAFE: it made a huge impact on financing because it was standardized and there were really no terms to negotiate, compared with the world before, in which the terms of convertible notes were complex and had to be scrutinized and negotiated.
> Augmenting the lawyer with GPT to save billable hours so you can turn up to a meeting with your lawyer and extract the most value in the shortest time period seems like the best play to me.
The issue is that a lot of lawyers have a conflict of interest and a "not invented here" way of doing business. If you have a Trust, for instance, written by one lawyer and bring it to another lawyer, the majority of lawyers we talked to would actually prefer to throw out the document and use their own. This method works well if you are a smart, savvy person, but the population at large has some crazy and weird ideas about how the law works and needs to be talked out of what they want into something more sane.
Another common lawyer response besides "It depends" is "Well you can, but why would you want to?" So many people have a skewed view of what they want, and part of a lawyer's job is interpreting what they really want and guiding them along that path.
So the hybrid method really only works if you find a lawyer that accepts whatever crazy terms you came up with and are willing to work with what you generated.
They’re going to be great for a lot of stuff. But when it comes to things like the law the other 20% is not optional.
What's your objection to Nolo Press? They seem to have already done that.
Deleted Comment
That's a trap: if you don't have prior expertise, then you can't distinguish plausible-sounding fact from fiction. If you think you are "smart", then afaik research shows you are easier to fool, because you are more likely to think you already know.
Google finds lots of mis/disinformation. GPTs are automated traps: they generate the most statistically likely text from ... the Internet! Not exactly encouraging. (Of course, it really depends on the training data.)
> "Lawyers hoard their contracts and it was very difficult in our journey to find lawyers who would be willing to write contracts we would turn into templates because they are essentially putting themselves and their professional community out of income streams in the future."
This is why Lawyers must die.
If "letter of law" is a thing - then "opinions" in legal parlance, should be BS.
We should have every SC decision eviscerated by GPTs
Any single person saying "You need a lawyer to figure this out" is a grifter and a charlatan.
--
Anyway - I'd like to know more about your startup. I'd like to know what you do for legal DMS, ala Hummingbird for biotech. There are so many real applications of GPT to legal troves, such as auto-generating indexes/summaries/parties/contacts/dates and so on, that GPTs make legal searching all the more powerful, and ANY legal person in any position telling you anything bad about computers keeping them in check is a fraud.
The legal industry is the most low hanging fruit for GPTs.
Dead Comment
Dead Comment
A wave of (better) legally informed common-person is coming, and I couldn't be more excited!
I've already used some LLMs to ask questions about licenses and legal consequences for software related matters, and it gave me a base, without having to involve a very expensive professional into it for what are mostly questions for hobby things I'm doing.
If there was a significant amount of money involved in the decision, though, I will of course use the services of a professional. These are the kinds of topics you can't be "mostly right".
So many people continually use arguments that revolve around "I used it once and it wasn't the best and/or messed things up", and imply that this will always be the case.
There are many solutions already for knowledge editing, there are many solutions for improving performance, and there will very likely continue to be many improvements across the board for this.
It took ~5 years from when people in the NLP literature noticed BERT and knew the powerful applications that were coming until the public at large became aware of the developments via ChatGPT. It may take another 5 before the public sees the developments happening now in the literature hit something in a company's web UI.
Our core business is legal document generation (rule based logic, no AI). Since we already have the users' legal documents available to us as a result of our core business, we are perfectly positioned to build supplementary AI chat features related to legal documents.
We recently deployed a product recommendation AI to prod (partially rule based, but personalized recommendation texts generated by GPT-4). We are currently building AI chat features to help users understand different legal documents and our services. We're intending to replace the first level of customer support with this AI chat (and before you get upset, know that the first level of customer support is currently a very bad rule-based AI).
Main website in Finnish: https://aatos.app (also some services for SE and DK, plus we recently opened in the UK with just an e-sign service)
If the current price is $500 an hour for a real lawyer, and at some point your costs are just keeping services up and running, how big a cut will you take? Because being only a little cheaper than a real lawyer is enough to win customers.
There is an upcoming monopoly problem: users get the best information from the service only after they submit all their documents. And soon the normal lawyer might be competitive enough again. I fear that the future is in the parent commenter's open platform with open models, and that businesses should extract money from other use cases; for a while, though, you get money based on the typical "I am first, I have the user base" situation. It will be interesting to see what happens to lawyers.
> Given your ownership in a company and real estate, a lasting power of attorney is a prudent step. This allows you to appoint PARTNER_NAME or another trusted individual to manage your business and property affairs in the event of incapacitation. Additionally, it can also provide tax benefits by allowing tax-free gifts to your children, helping to avoid unnecessary inheritance taxes and secure the financial future of your large family.
Uhh... What are the privacy implications here?!
I feel LLMs are great at suggestions that you follow up on yourself (if only for sanity checking, but that's nothing you wouldn't do with a human too).
I uploaded all of my bloodwork tests and my 23andme data to ChatGPT and it was better at analyzing them than my doctor was.
LLMs don't have to compete against the cutting edge of human professional knowledge. They only have to compete against the disinterested, arrogant, greedy, and overworked professionals that are actually available to people in practice. No wonder they're winning.
Did you do anything special to achieve this? What were the results like?
In my experience, this does not get you close to what the top-level comment is describing. But it gets around the "nerfing" you describe.
The more common lawyering, for individual purchasers of legal services, is going to be family law matters, criminal law matters, and small claims court matters. I cannot see a time in the near future where an LLM can handle the fact-specific and circumstantial analysis required for felony criminal litigation, and I see nothing that would imply LLMs can even approach the individualized, case-specific, and convoluted family dynamics required for custody cases or contested divorces.
I'm not unwilling to accept LLMs as a tool an attorney can use, but outside of more rote legal proofreading I don't think the technology is at all ready for adoption in actual practice.
Humans are pretty bad at this. Based on the results, it seems the judges' personal views and emotions are a large part of these cases. I'm not sure what they would look like without emotion, personal views, and the case law built off of those.
At least until the LLMs surpass humans at being charismatic, but that would seem to be its own nightmare scenario.
That's a completely separate question. We're talking about automating lawyers, not judges. (to be a good lawyer in such a situation, you would need to model the judge's emotions and use them to your advantage. Probably AIs can do this eventually but it's not easy or likely to happen soon)
Language models are great at digesting legalese and analyzing what's going on. However, many legal applications revolve around pretty important decisions that you don't want to get wrong ("am I contractually covered by my current insurance program?"). Because of that, we've built LLM products in the legal space with the following principles in mind:
- Human-in-the-loop tooling -- The product should be built around an expert using it whenever possible, so decision support as opposed to automation. You still see massive time savings with that in place
- Transparency / citations -- With a human-in-the-loop tool, you need mechanisms to build trust. Whether that's highlighting clauses in the document that the LLM referred to or explaining why a part of the analysis wasn't provided, citing your work is important
- Tuned for precision instead of recall -- False positives (and hallucinations) are especially bad in many of these legal use cases, so tuning the model or prompts for precision helps with mitigation.
When you say “tuned for precision” is this your prompt engineering or are you actually fine-tuning GPT-4?
Appreciate the insights.
We're not fine-tuning today. Instead, "tuning for precision" is done through prompt chains. A simple example would be returning "I don't know" early on if the document isn't a contract or doesn't have clear insurance requirements in it. We've had success with various guardrail prompts.
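A guardrail chain like that could be sketched as follows. This is purely illustrative: the prompts, the `answer` function, and the `fake_llm` stand-in are invented for the example and are not the commenter's actual system. The key idea is that cheap yes/no classification prompts run first, and the chain fails closed with "I don't know" before the expensive analysis prompt ever runs.

```python
# Sketch of a guardrail prompt chain (illustrative only).
# `llm` is any callable taking a prompt string and returning a completion.

GUARDRAILS = [
    "Is the following document a contract? Answer YES or NO.\n\n{doc}",
    "Does the following contract contain explicit insurance "
    "requirements? Answer YES or NO.\n\n{doc}",
]

ANALYSIS_PROMPT = (
    "Quote the insurance requirements in the contract below and "
    "cite the clause numbers.\n\n{doc}"
)

def answer(doc, llm):
    """Run guardrail prompts first; only run the analysis if all pass."""
    for guard in GUARDRAILS:
        if "YES" not in llm(guard.format(doc=doc)).upper():
            return "I don't know"  # fail closed: precision over recall
    return llm(ANALYSIS_PROMPT.format(doc=doc))

def fake_llm(prompt):
    """Keyword stand-in for a real model call, so the sketch runs."""
    if "YES or NO" in prompt:
        if "a contract" in prompt:  # first guardrail
            return "YES" if "Agreement" in prompt else "NO"
        return "YES" if "shall maintain insurance" in prompt else "NO"
    return "Clause 7.2: the contractor shall maintain general liability insurance."

print(answer("Grocery list: eggs, milk", fake_llm))  # → I don't know
```

With a real model behind `llm`, each guardrail is one extra (cheap, short) completion, and the order of the chain determines which failure mode you refuse on first.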
The citation work we did at Google used model internals to highlight text (path integrated gradients). It's also easier to finetune for precision when you have control over the model itself.
If you're interested in the capabilities and limitations, I suggest these informative but still light reads as well: https://kirasystems.com/science/ https://zuva.ai/blog/ https://www.atticusprojectai.org/cuad
Ah yes, the story of bad people not wanting their livelihoods taken from them by good tech giants. Seriously, is there no room for empathy in all of this? If you went through law school and likely got yourself into debt in the process, then you're not protecting any monopoly but your means to exist. There are people like that out there, you know.
I kind of assumed they were in the same space as government documents.
So despite the early news about lawyers submitting 'fake' cases, it is only a matter of time before the legal profession is up-ended. There are a ton of paralegals etc. doing 'grunt' work in firms that an LLM can do. These jobs are considered white collar, and they will be gone.
It will progress in a similar fashion to coding.
It will be like having a junior partner whose work you have to double-check, or who can do some boilerplate for you.
You can't trust it completely, but you don't fully trust your junior devs either, do you? It gets you 80% of the way there.
I kind of agree with this, but that's why I am confused that I only ever see people (at least on HN) talk about AI up-ending the legal profession and putting droves of lawyers out of work; I never see the same talk about the coding industry being transformed in this way.
Maybe HN is full of coders that still think themselves 'special' and can't be replaced.
Or maybe, the law profession has a lot more boilerplate than the coding profession?
So the legal profession has more that can be replaced?
Coders will be replaced, but maybe not at same rate as paralegals.