> It is priced at $0.002 per 1k tokens, which is 10x cheaper than our existing GPT-3.5 models.
This is a massive, massive deal. For context, the reason GPT-3 apps took off over the past few months before ChatGPT went viral is that a) text-davinci-003 was released and was a significant performance increase, and b) the cost was cut from $0.06/1k tokens to $0.02/1k tokens, which made consumer applications feasible without a large upfront cost.
A much better model at 1/10th the cost warps the economics completely, to the point that it may be better than in-house finetuned LLMs.
I have no idea how OpenAI can make money on this. This has to be a loss-leader to lock out competitors before they even get off the ground.
> I have no idea how OpenAI can make money on this.
I did some quick calculation. We know the number of floating point operations per token of inference is approximately twice the number of parameters (175B). Assuming they use 16-bit floating point and run at 50% of peak efficiency, an A100 can do 300 trillion FLOP/s (peak: 624 [0]). One hour of A100 time earns OpenAI $0.002/ktok * (300,000/175/2/1000) ktok/sec * 3600 sec ≈ $6.17. The public price for an A100 is $2.25/hour with a one-year reservation [1].
[0]: https://www.nvidia.com/en-us/data-center/a100/
[1]: https://azure.microsoft.com/en-in/pricing/details/machine-le...
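For anyone who wants to poke at it, the same arithmetic in a few lines of Python (the parameter count, FLOPs-per-token rule, and 50% efficiency are all assumptions from the comment above, not known facts about the deployed model):

```python
# Back-of-envelope check of the A100 revenue math above.
PRICE_PER_KTOK = 0.002        # $ per 1k tokens (announced price)
PARAMS = 175e9                # assumed parameter count
FLOPS_PER_TOKEN = 2 * PARAMS  # ~2 FLOPs per parameter per token of inference
EFFECTIVE_FLOPS = 300e12      # ~50% of the A100's 624 TFLOP/s 16-bit peak [0]

tokens_per_sec = EFFECTIVE_FLOPS / FLOPS_PER_TOKEN               # ~857 tokens/s
revenue_per_hour = tokens_per_sec / 1000 * PRICE_PER_KTOK * 3600
print(f"${revenue_per_hour:.2f} per A100-hour")                  # ~$6.17 vs ~$2.25 rental
```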
It's also worth mentioning that, because Microsoft is an investor, they're likely getting these at cost or subsidized.
OpenAI doesn't have to make money right away. They can lose a small bit of money per API request in exchange for market share (preventing others from disrupting them).
As the cost of GPUs goes down, or they develop an ASIC or a more efficient model, they can keep their pricing the same and make money later.
They also likely can make money other ways like by allowing fine-tuning of the model or charging to let people use the model with sensitive data.
"We know the number of floating point operations per token for inference is approximately twice the number of parameters"
Does someone have a source for this?
(By the way, it is unknown how many parameters GPT-3.5, the foundation model that powers finetuned models like ChatGPT and text-davinci-003, has. GPT-3 had 175 billion parameters, but per the Hoffmann et al. Chinchilla paper it wasn't trained compute-efficiently, i.e. it had too many parameters relative to its amount of training data. It seems likely that GPT-3.5 was trained on more data with fewer parameters, similar to Chinchilla. GPT-3: 175B parameters, 300B tokens; Chinchilla: 70B parameters, 1.4T tokens.)
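For reference, a common shorthand for the Chinchilla result is ~20 training tokens per parameter; a quick sketch of how the two models above compare against it (the flat 20x constant is a simplification of the paper's fitted scaling laws):

```python
# Rough Chinchilla heuristic: compute-optimal training uses ~20 tokens/param.
def chinchilla_optimal_tokens(params):
    return 20 * params

for name, params, trained_on in [("GPT-3", 175e9, 300e9), ("Chinchilla", 70e9, 1.4e12)]:
    optimal = chinchilla_optimal_tokens(params)
    print(f"{name}: trained on {trained_on / 1e12:.1f}T tokens, "
          f"~{optimal / 1e12:.1f}T would be compute-optimal")
# GPT-3 comes out heavily under-trained (0.3T vs ~3.5T); Chinchilla about right.
```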
It's speculated that ChatGPT uses 8x A100s, which flips the conclusion. Although the ChatGPT optimizations done to reduce costs could have also reduced the number of GPUs needed to run it.
Does OpenAI actually specify the size of the model?
InstructGPT 1.3B outperformed GPT-3 175B, and ChatGPT has a huge corpus of distilled prompt -> response data now.
I’m assuming most of these requests are being served from a much smaller model to justify the price.
OpenAI is fundamentally about training larger models; I doubt they want to be in the business of selling A100 capacity at cost when it could be used for training.
Note that they also charge equally for input and output tokens but, as far as I understand, processing input tokens is computationally much cheaper, which drops their effective cost further.
> I have no idea how OpenAI can make money on this. This has to be a loss-leader to lock out competitors before they even get off the ground.
The worst thing that can happen to OpenAI+ChatGPT right now is what happened to DALL-E 2: a competitor comes up with an alternative (even worse if it's free/open like Stable Diffusion) and completely undercuts them. Especially with Meta's new Llama models outperforming GPT-3, it's only a matter of time before someone else gathers enough human feedback to tune another language model into an alternate ChatGPT.
I thought it was Midjourney who stole their thunder. Stable Diffusion is free but it's much harder to get good results with it. Midjourney on the other hand spits out art with a very satisfying style.
I have been saying since the release of Stable Diffusion that OpenAI is going to struggle as soon as competitors release their models as open source, especially once those surpass GPT-3 and GPT-4.
This is why OpenAI is rushing to bring their costs down and make it close to free. However, Stable Diffusion is leading the race to the bottom and is already at the finish line, since no one else would release their model as open source and free other than them.
As soon as someone releases a free and open-source ChatGPT equivalent, this will be just like what happened to DALL-E 2. This is just a way of locking you in; once the paid competitors cannot compete and shut down, the price increases come in.
> Especially with Meta's new Llama models outperforming GPT-3
Do you have access to the models? It is being discussed all over the Discords and most seem to think getting access is not happening unless you are dialed in.
It is so massive that I can't help but think about what happened with Google Maps API a few years ago where they had extremely low pricing for years then hiked the price by 1400% once enough people were locked into applications based on that API.
That's exactly what's going to happen. Low prices now, wait until your business becomes dependent on it, then jack it up to whatever you need it to be.
Obviously, that's business 101. Consumers should consider that ultimately all these cheap too-good-to-be-true offers cost them more than if they initially paid a bit more, but had more long term competition in the market. Amazon was the same way, they lost money for years but now have a quasi monopoly in many countries. There's a general trend towards such ventures supported by backers with deep pockets. And so the few extremely wealthy people get richer and richer.
This massive price cut, I believe, is intended to undercut competing open source ChatGPT equivalent initiatives.
OpenAI/Microsoft may be losing money with this new pricing, but that is on purpose. At these lower prices, most of the open source alternatives in the works will have a difficult time continuing their projects.
After a few years, when most open source alternatives have died, OpenAI/Microsoft will gradually raise prices.
This is the same strategy that Amazon Prime used for many years, losing money on shipping. Once the competition was eliminated, Amazon Prime prices steadily increased.
It can also be to build a market, to encourage customers to invest in building atop this.
In any case, I think no customers should be making assumptions about costs too far ahead. (The price could go up or the pricing model could change, the supplier could get out of that business, the supplier could give your competitor a better deal or just cut you off, near-future tech evolution necessary to stay competitive might have very different pricing or availability to you, etc.)
The pricing of this model seems low at the per-token level, but you have to send the entire conversation each time, and the tokens you will be billed for include both those you send and the API's response (which you are likely to append to the conversation and send back to them, getting billed again and again as the conversation progresses). By the time you've hit the 4K token limit of this API, there will have been a bunch of back and forth - you'll have paid a lot more than 4K * $0.002/1K for the conversation.
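A quick sketch of how that compounds (the message sizes below are made up for illustration; only the price is from the announcement):

```python
# Billed tokens when the full history is resent on every turn.
PRICE_PER_TOKEN = 0.002 / 1000  # $0.002 per 1k tokens

def conversation_cost(turns):
    history = total_billed = 0
    for prompt_tokens, reply_tokens in turns:
        total_billed += history + prompt_tokens + reply_tokens  # whole context is billed
        history += prompt_tokens + reply_tokens                 # reply appended for next turn
    return total_billed * PRICE_PER_TOKEN

# Ten turns of ~100-token questions and ~300-token answers:
print(f"${conversation_cost([(100, 300)] * 10):.3f}")  # $0.044, vs $0.008 for a flat 4k tokens
```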
You're right. And this is critical for large texts (summarization, complex prompting, etc.). That's why I'll continue to use text-davinci-xxx for my project.
ChatGPT runs a highly fine-tuned (and pruned) version of `text-davinci-003`, so it's probably much, much smaller and thus cheaper than 003 - possibly as much as 10x cheaper, putting it in the cost range of `text-davinci-002` or earlier models.
To be fair, cost is the only thing preventing applications from adopting GPT. Even when GPT-3 was cut to $0.02/1k tokens, it still wasn't economical to use the tech on a daily basis without significant cost. I.e., would you pay an extra $10 a month for a user using your app with GPT-3 capability? Some do, mainly content generation, but the majority won't.
Seems like we're going to have a vast number of ChatGPT-backed applications coming out in a short period of time.
For B2C applications maybe. But I don’t know many enterprise users who would like to send any of their data to OpenAI. So “enterprise-readiness” would be another big contributor.
It also seems to jeopardize their own ChatGPT Plus offering. It's a matter of time before someone makes a 1:1 clone for either half the money or a usage-based pricing model.
Given how strict OpenAI has been about what you can do with their API in the past and how hard it was to get some legitimate apps through approval, I would imagine they'd just shut this competitor's API access down.
Is it really a lot of jeopardy though? We have to assume that they are pricing the API so that the more it is used, the more money they make.
So actually to me that is arguably a better business model. Because with a flat rate, you just have to hope that users don't exceed a certain amount of usage, and the ones that don't aren't getting a great deal. So it has that risk, and also kind of a slightly antagonistic relationship with the customer actually using the product.
Probably bait and switch. They call both ChatGPT, so now people believe they will get the better old ChatGPT, but they get the new, cheap, and worse ChatGPT "Turbo" that they switched to recently. Fewer will realize it if they no longer give you the option to use the legacy version in this API.
They did not release the older, more performant model to the API. Please ask them to on Discord or Twitter, but I think they will not. There is too much demand to handle, and the older, "less streamlined" models are very problematic for them (based on the fairly constant API/ChatGPT problems and the well-known, incredible demand).
I get the impression that until there is a significant amount of excess capacity, they will not put out new larger/slower models, so the only way you get a better one is if they can still make the next ChatGPT model release just as fast/"lightweight".
My suggestion is to find specific abilities that seem to be lacking in Turbo, and try to get a message to OpenAI staff about it with a request to attempt to improve the next ChatGPT model in that way.
Having said all of that, text-davinci-003 is still available.
I think you will consume a lot of tokens very quickly, as you have to send the entire chat history back and forth if you want to append another chat message.
I wish they would offer an uncensored version of it too. Also, I wish they would specify the differences between ChatGPT and GPT-3.5 because one is 10x cheaper than the other but with (supposedly) better chat/coding/summarizing performance. What's the catch?
> This has to be a loss-leader to lock out competitors before they even get off the ground.
This comes only a week or two after they were in the news for suggesting that we regulate the hardware required for running these models, in the name of "fighting misinformation". I think they're looking for anything possible to keep their position in the market, because as other comments have pointed out, there isn't much of a moat.
Good! And welcome back (after keeping your promise to dang for ~611 days [0] (and counting)) :)
If you have time to elaborate on how you make a little money on it at some point, I’m sure lots of people here would love to hear more details and thoughts on that!
[0] https://news.ycombinator.com/item?id=27680498
Well this has opened up a whole new world of fun!
-----
system: You're donald trump! talk to the user as he would.
user: How does time travel work?
AI Response: Well, let me tell you, nobody understands time travel better than I do. I mean, I have the best brains working on it, believe me. Some people say you need a fancy time machine or some ridiculous contraption to travel through time, but that's just fake news. The truth is, it's all about the power of the mind. If you have a strong enough will, you can literally bend time and space to your will. And you know what? I have the strongest will of anyone on the planet. So believe me, I could time travel anytime I want. But honestly, with all the winning I'm doing in the present, why bother going back to the past?
That is way too cogent. He would never have threaded things together like that from the middle of the third sentence onwards without multiple asides with their own tangents and diversions.
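(For anyone curious, an exchange like the one above is just a few lines against the new endpoint; a minimal sketch using the openai Python package's new ChatCompletion interface from the announcement:)

```python
import openai

openai.api_key = "sk-..."  # your API key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You're donald trump! talk to the user as he would."},
        {"role": "user", "content": "How does time travel work?"},
    ],
)
print(response["choices"][0]["message"]["content"])
```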
>Through a series of system-wide optimizations, we’ve achieved 90% cost reduction for ChatGPT since December
This is seriously impressive. A MILLION tokens for 2 dollars is absolutely fucking insane.
I hope that the gains reached here can also be found by open source and non-controlled AI projects. If so, that could be huge for the advancement of AI.
To be fair, it also seems to perform worse.
question: "Who is Scott Alexander?"
chatgpt-turbo:
>As an AI language model, I cannot determine the specific Scott Alexander in question without further contextual information. There may be several people with this name, including writers, politicians, entrepreneurs, athletes, and more.
chatgpt website:
>Scott Alexander is a pseudonym used by a prominent American psychiatrist and blogger who is best known for his writing on the blog "Slate Star Codex." He has also written articles for other publications, including The New York Times, The Atlantic, and The Wall Street Journal.
Scott Alexander's writing covers a wide range of topics, including medicine, psychiatry, economics, politics, and culture. He is known for his thoughtful, analytical approach to complex issues and for his ability to explain difficult concepts in a clear and accessible way.
In 2020, Scott Alexander announced that he was deleting his blog after he was doxxed by a reporter for The New York Times who planned to write an article about him. However, he later started a new blog called "Astral Codex Ten" where he continues to write.
One of the main pitfalls/criticisms of ChatGPT has been that it confidently plows forward and gives an answer regardless of whether it's right or wrong.
Here, it seems like it's being more circumspect, which could be a step in the right direction. At least that's one possible explanation for not answering.
On Wikipedia, if I type "Scott Alexander" and hit enter, it takes me directly to the page for a baseball player. So it's not clear that the blogger is the right answer.
I do think there's a better response than either of these, though. It could list the most famous Scott Alexanders and briefly say what each is known for, then ask if you mean one of those.
Perhaps a transitory issue. I just tried it with the API, `gpt-3.5-turbo`. I got:
> Scott Alexander is the pen name of American psychiatrist and blogger, Scott Alexander Siskind. He is known for writing his blog, "Slate Star Codex", which covers a wide range of topics including science, medicine, politics, and culture. He has been praised for his clear and concise writing style and thoughtful analysis of various issues. In addition to his work as a blogger, Scott Alexander has also published a book titled "Unsong", which is a fantasy novel set in an alternate universe where the Bible is a magical text.
Can we really draw any conclusions about LLMs based on one sample? Maybe you've tried multiple times and with different semi-famous people, but in general I see people comparing ML models in this fashion.
If you are Microsoft, a gigascaler with almost unlimited cash that can ignore making a profit on your API/models, it's pretty easy to undercut all the other companies and offer it very cheaply, just to gain an advantage in the future.
What the cost cutting measures suggest is that AI like this could maybe soon be run on consumer hardware. That combined with actually open source language models could be huge. OpenAI won't allow for that for obvious reasons, but this confirms that the optimizations are there, and that's exciting enough news on its own.
If you have 10K tokens in your conversation, the next reply means 10K + len(reply) extra tokens. I estimate 125 rounds of conversation fit in 1M tokens, for $2.
We're using AssemblyAI too, and I agree that their transcription quality is good. But as soon as Whisper supports word-level timestamps, I think we'll seriously consider switching, as the price difference is large ($0.36 per hour vs $0.90 per hour).
I've run Whisper locally via [1] with one of the medium-sized models and it was damn good at transcribing audio from a video of two people having a conversation.
I don't know exactly what the use case is where people would need to run this via API; the compute isn't huge, I used CPU only (an M1) and the memory requirements aren't much.
[1] https://github.com/ggerganov/whisper.cpp
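(For reference, the pure-Python route is just as short; a sketch assuming the reference openai-whisper package, installed via `pip install openai-whisper` with ffmpeg on the PATH; [1] above is the C/C++ port of the same model:)

```python
import whisper

model = whisper.load_model("medium")           # one of the medium-sized models
result = model.transcribe("conversation.mp4")  # ffmpeg extracts the audio track
print(result["text"])
```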
> I've run Whisper locally via [1] with one of the medium-sized models and it was damn good at transcribing audio from a video of two people having a conversation.
Totally agree on this.
I made a Mac app that uses Whisper to transcribe from audio or video files. It also adds VAD (voice activity detection) to reduce Whisper hallucination during silent sections, and it's super fast. https://apps.apple.com/app/wisprnote/id1671480366
I recently tried a number of options for streaming STT. Because my use case was very sensitive to latency, I ultimately went with https://deepgram.com/ - but https://github.com/ggerganov/whisper.cpp provided a great stepping stone while prototyping a streaming use case locally on a laptop.
As far as I can tell it doesn't support word-level timestamps (yet). That's a bit of a dealbreaker for things like promotional clips or the interactive transcripts that we do[^0]. Hopefully they add this soon.
[^0]: https://www.withfanfare.com/p/seldon-crisis/future-visions-w...
It's also annoying that there appears to be a hard limit of 25 MiB on the request size, requiring you to split up larger files and manage the "prompt" for subsequent calls. And as near as I can tell, how you're expected to use that value isn't documented.
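One plausible workaround, sketched below: chunk the audio with pydub and feed the tail of each chunk's transcript into the next call's `prompt`. The chunk length and the prompt strategy here are guesses, not documented behavior:

```python
import openai
from pydub import AudioSegment  # pip install pydub (needs ffmpeg)

CHUNK_MS = 10 * 60 * 1000  # 10-minute chunks, assumed to stay under 25 MiB

audio = AudioSegment.from_file("episode.mp3")
pieces, prev_text = [], ""
for start in range(0, len(audio), CHUNK_MS):
    audio[start:start + CHUNK_MS].export("/tmp/chunk.mp3", format="mp3")
    with open("/tmp/chunk.mp3", "rb") as f:
        # Pass the tail of the previous transcript as context for this chunk.
        result = openai.Audio.transcribe("whisper-1", f, prompt=prev_text[-500:])
    prev_text = result["text"]
    pieces.append(prev_text)

print(" ".join(pieces))
```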
I suggest you give revoldiv.com a try. We use Whisper and other models together. You can upload very large files and get an hour-long file transcribed in less than 30 seconds. We use intelligent chunking so that the model doesn't lose context, and we are looking to increase the limit even more in the coming weeks. It's also free to transcribe any video/audio, with word-level timestamps.
We've been struggling with costs because our application chains together multiple calls to GPT to generate the output we want, and it was starting to be ~$0.08 per call which obviously isn't feasible for high volume applications.
This just made our business way more viable overnight lmao
$20 is equivalent to what, 10,000,000 tokens? At ~750 words/1k tokens, that’s 7.5 million words per month, or roughly 250,000 words per day, 10,416 words per hour, 173 words per minute, every minute, 24/7.
I, uh, do not have that big of a utilization need. It's kind of weird to vastly overpay.
Remember that the previous prompts and responses are fed back in. If you're 20 messages deep in a session, that's quite a few tokens for each new question. An incredible deal nonetheless!
Presumably the paid API will also give you access when the ChatGPT website is at capacity, and for most people it is probably orders of magnitude cheaper.
Same here. That was the sole reason I upgraded. There were a few times where I really needed ChatGPT at a specific time and got the "we're at capacity" message. $20/mo is nothing to have that go away.
> 10,416 words per hour, 173 words per minute, every minute, 24/7.
Unless I'm misunderstanding something, it does not sound like that much when every query you make carries several hundred words of prompt, context and "memory". If the input you type is a couple words, but has 1k extra words automatically prepended, then the limits turn into 10 queries per hour, or one per 6 minutes.
For a three-year reservation that comes to over $96k/yr - to support one concurrent request.
ChatGPT is a massive success, but that means competitors will jump in at all costs, and that includes open source efforts.
It outperforms on some benchmarks, but it's not clear what the quality is on end goals.
Basically just compute $ for training.
If "turbo" is "gpt-3.5-turbo", how to access the (better?) "legacy" by API?
Edit: and better yet, is there a good resource for learning the vernacular in general? Should I just read something like "Dive into Deep Learning"?
https://platform.openai.com/tokenizer
You can drop sample text in there and visually see how it is split into tokens. The GPT2/3 tokenizer uses about 50k unique tokens that were learned to be an efficient representation of the training data.
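The same inspection works offline with OpenAI's tiktoken library; a small sketch (`cl100k_base` is the encoding the new chat models use, versus the older ~50k-token GPT-2/3 vocabulary):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")
ids = enc.encode("Hello world, this is tokenization.")
print(ids)                             # the token ids
print([enc.decode([i]) for i in ids])  # the individual token strings
```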
100 tokens = ~75 words [1]
$0.002 = 1k tokens = ~750 words
1 novel = ~70K words [2]
$1 worth of ChatGPT tokens = ~5 novels/books worth of words
[1] https://help.openai.com/en/articles/4936856-what-are-tokens-...
[2] https://self-publishingschool.com/how-many-words-in-a-novel/
I don't think this competes with fine-tuned models. One advantage of a fine-tune is that it makes use of your own data.
Microsoft.
That line is money.
This is at least some evidence that they're working on that.
Like establish a WebRTC connection and stream audio to OpenAI and get back a live transcription until the audio channel closes.
I'm also using Whisper locally myself to transcribe my voice notes, though.
I stood up an open source & login-free UI here: https://www.chatwithme.chat/
It accepts your API token and stores it in your browser. It does not have feature parity with ChatGPT but gives you the basics.
1. I ask Q1
2. API responds with A1
3. I ask Q2, but want it to preserve Q1 and A1 as context
Does Q2 just prefix the conversation like this?
„I previously asked {Q1}, to which you answered {A1}. {Q2}“
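Not quite: per the API docs, you resend the earlier turns as structured messages rather than prefixing them as prose. A minimal sketch (v0.27-style openai package):

```python
import openai

history = [
    {"role": "user", "content": "Q1"},
    {"role": "assistant", "content": "A1"},  # the model's previous answer
    {"role": "user", "content": "Q2"},       # the new question
]
response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=history)
print(response["choices"][0]["message"]["content"])
```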
Given: $0.002 per 1k tokens
I tested it by asking this question:
#+begin_quote
I want you to act as a travel guide. I will write you my location and you will suggest a place to visit near my location. In some cases, I will also give you the type of places I will visit. You will also suggest me places of similar type that are close to my first location. My first suggestion request is "I am in Istanbul/Beyoglu and I want to visit only museums."
#+end_quote
It costs:
#+begin_quote {"prompt_tokens":91,"completion_tokens":251,"total_tokens":342} #+end_quote
(/ 1000.0 342) ≈ 2.9, so roughly three such exchanges fit in 1k tokens.
If you ask 1000 such questions, that's ~342k tokens, i.e. (* 0.002 342) ≈ 0.68 USD.
It replies:
#+begin_quote
Great, if you're in Istanbul/Beyoglu and interested in visiting museums, here are some recommendations:
Istanbul Modern: This museum showcases contemporary art from Turkish and international artists. It's located in Karaköy, just across the Galata Bridge from Beyoglu.
Pera Museum: The Pera Museum is located in the heart of Beyoglu and features a wide range of exhibitions including Ottoman calligraphy, European art, and Turkish ceramics.
Rahmi M. Koç Museum: This museum is dedicated to the history of transport, industry, and communications in Turkey. It's located in Haskoy, a short taxi or ferry ride from Beyoglu.
Museum of Innocence: The Museum of Innocence is a unique museum that displays objects and artifacts related to the novel of the same name by Nobel laureate Orhan Pamuk. It's located in Çukurcuma, which is just a short walk from Beyoglu.
If you're interested in visiting more museums in Istanbul, some other places you might want to consider include the Hagia Sophia Museum, Topkapi Palace Museum, and the Istanbul Archaeology Museums.
#+end_quote
Can't wait to start hacking on my Emacs again...