If Tamer is reading this, I know of opportunities in NYC for sharp ML people. Feel free to drop me a line at b7r6@b7r6.net and I’ll be more than happy to make an introduction or two.
NYC clearly doesn’t have the level of activity in this area that the Bay does, but there’s a scene. LeCun and the NYU crowd and the big FAIR footprint create a certain gravity. There’s stuff going on :)
I'd hire him if the conversation for the last question I asked went something like this.
Me: On your first day of work, what would you say to me if I asked you to do the samething in your writeup but in production and at scale using the same third party services (chatgpt etc) you used in your writeup.
T: "You're an idiot. I quit."
Me: You're hired!
Clearly it's a fun exploratory excercise, kind of like using kafta instead of a db for the main store for a crud app just to see how it works. But if you asked a senior engineer who follows the sector he would probably guess all the answers correctly blind. Tamer himself says it was his most expensive sunday hobby night. Now scale that uselessness to enterprise level. And you're not even sure if some of the results are hallucinated.
I sympathize and you’re maybe being facetious. But a good engineer doesn’t just say “you’re an idiot”, they are able to clearly document and communicate the expenses to the business folks, and they are able to brainstorm ways to change the cost dynamics with trade off options for business.
Cool analysis with GPT-4o! I was doing some messing around with the same dataset recently around the "Who is Hiring" and "Who wants to be hired". Although I was just using pandas and spacy. (I was job supply and demand with the US FED interest rates here: https://raw.githubusercontent.com/bobbywilson0/hn-whos-hirin...)
I can actually see how nice it would be for an llm to be able to disambiguate 'go' and 'rust'. However, it does seem a bit disappointing that it isn't consolidating node.js and nodejs or react-native and react native.
What the author could have done, and what I should have (but didn't) also, is add a bunch of possible values (enums) for each possible field value. This should solve it from coming up with variations e.g. node, nodejs
In zod/tooling it would look like this;
remote: z.enum(['none', 'hybrid', 'full']),
framework: z.enum(['nodejs', 'rails']),
But this just shifts the problem further down, which is now you need a good standard set of possible values. Which I am yet to find, but I'm sure it is out there.
On top of that, I am working on publishing a JobDescription.schema.json such that the next time the models train, they will internalize an already predefined schema which should make it a lot easier to get consistent values from job descriptions.
- Also I tend to forget to do it a lot recently in LLM days but there are plenty of good NER (Named Entity Recognition) tools out there these days, that you should run first before making robust prompts
This seems to have a similar problem in the Apple notes calculator where items that you set to as a variable in the new calculator mode can’t have spaces or any other delimiters.
The training data or some kind of enrichment of the data would have to make the systems understand node.js and nodejs are the same just like on the new notes calculator Apple-sauce = $2.50 * 8 makes the first statement a variable.
I'm more interested in technical side of this, but I'm not seeing any links to GitHub with the source code of this project.
Anyway, I have a tangential question, and this is the first time I see langchain, so may be a stupid one. The point is the vendor-API seems to be far less uniform than what I'd expect from a framework like this. I'm wondering, why cannot[0] this be done with Ollama? Isn't it ultimately just system prompt, user input and a few additional params like temperature all these APIs require as an input? I'm a bit lost in this chain of wrappers around other wrappers, especially when we are talking about services that host many models themselves (like together.xyz), and I don't even fully get the role langchain plays here. I mean, in the end, all that any of these models does is just repeatedly guessing the next token, isn't it? So there may be a difference on the very low-level, there my be some difference on a high level (considering different ways these models have been trained? I have no idea), but on some "mid-level" isn't all of this utlimately just the same thing? Why are these wrappers so diverse and so complicated then?
Is there some more novice-friendly tutorial explaining these concepts?
You really don't have to use langchain. I usually don't except on a few occasions I used some document parsing submodule.
The APIs between different providers are actually pretty similar, largely close to the OpenAI API.
The reason to use a paid service is because the models are superior to the open source ones and definitely a lot better than what you can run locally.
It depends on the task though. For this task, I think a really good small model like phi-3 could handle 90-95% of the entries well through ollama. It's just that the 5-10% of extra screw ups are usually not worth the privilege of using your own hardware or infrastructure.
For this particular task, I would definitely skip langchain (but I always skip it). You could use any of the top performing open or closed models, with ollama locally, together.ai, and multiple closed models.
It should be much less than 50 lines of code. Definitely under 100.
You can just use string interpolation for the prompts and request JSON output with the API calls. You don't need to get a PhD in langchain for that.
Well, I mean, it appears langchain just technically doesn't support structured response for Ollama (according to the link above). But, as I've said, I have absolutely no idea what all this middle-layer stuff actually does and what may be the reason why different vendors have different integration capabilities in this regard.
I'm totally new (and maybe somewhat late) to the domain, literally just tried right now to automate a fairly simple task (extracting/guessing book author + title in nice uniform format from a badly abbreviated/transliterated and incomplete filename) using plain ollama HTTP-API (with llama3 as a model), but didn't have much success with that (it tries to chat with me in its responses, instead of strictly following my instructions). I think, my prompts must be the problem, and I hoped to try the langchain, since it somehow seems to abstract the problem, but saw that it isn't supported for a workflow the OP used. But since this is a field where I'm really totally new, I suppose I also may be making some more general mistake, like using a model that cannot be used for this task at all. How would I know, they all look the same to me…
Ollama project itself is fairly stingy with explanations. Doubtfully there are many people out there trying to automate an answer to the "Why is the sky blue?" question.
So, I wonder, maybe somebody knows a more digestible tutorial somewhere, explaining this stuff from the ground up?
Thanks. Indeed, it doesn't directly answer my question. For one, the author seems to be failing the "Chesterton's fence" test here: it doesn't even try to answer what langchain is supposed to be good at, but ends up being bad. It just plainly says it's bad, that's all.
And, as stated, I also don't know the answer to that question, so this is kinda one of the primary concerns here. I mean, one possible answer seems pretty obvious to me: it would be better to keep your app vendor-agnostic (to be able to switch from OpenAI to Anthropic with 1 ENV var) if at all possible. Neither of articles and doc-pages I've read in the past few hours tries to answer to what extent this is possible and why, and if it even is supposed to be a real selling-point of langchain. TBH, I still have no idea what the main selling point even is.
Honestly, langchain solves no problems besides being an advertisement for langchain itself
It gets picked by people with more of a top-down approach maybe, who feel like adding abstraction layers (that don't abstract pretty much anything) is better. It isn't
Yeah langchain is not necessary for this. The author appear not to have shared his code yet (too bad, the visualizations are nice!), but as a poor replacement I can share mine from over a year ago:
Only using the plain OpenAI api. This was on GPT-3.5, but it should be easy to move to 4o and make use of the json mode. I might try a quick update this weekend
This seems like a great blend of LLM and classic analysis. I've recently started thinking that LLMs (or other ML models) would be fantastic at being the interface between humans and computers. LLMs get human nuance/satire/idioms in a way that other NLP approaches have really struggled with. This piece highlights how ML is great at extracting information in context (being able to tell whether it's go the language or go the word).
LLMs aren't reliable for actual number crunching though (at least from what I've seen). Which is why I really appreciate seeing this blend!
Yeah, I guess it is pretty mainstream :) Though another view I keep hearing is that LLMs are going to replace all jobs in 5 years, which quite frankly is disconnected from reality. Even assuming that we could create a ML model that could replace a human (which I think the current paradigm is insufficient after this[0] discussion), there's still the matter of building data centers, manufacturing chips, and it would need to be cheaper than paying a human.
I personally would like AI to help humans be better humans, not try to replace humans. Instead of using AI to create less understanding with black box processes, I'd rather it help us with introspection and search (I think embeddings[1] is still a pretty killer feature that I haven't heard much noise about yet).
This is very neat! Thanks for using your time and literal dollars to work through this!
As an added detail regarding the "remote" v "in-person", another interesting statistic, to me, is to know how many of those in-person job-seeking companies are repeats! It could absolutely mean they're growing rapidly, OR it could mean they're having trouble finding candidates. Equally, missing remotes could mean either they're getting who they need OR they're going out of business.
Interesting data, but I think the percentage of remote listings is misleading. Many “remote” jobs now require you to live within commuting distance of a particular city, usually SF or NY.
Another thing to improve this, is to ask posters to add GLOBAL_REMOTE, COUNTRY_REMOTE or something that indicates is not local remote only (within the same country).
I wonder how this would compare against a random sample of jobs on, say, Indeed or LinkedIn. My experience of Hacker News is that it’s a very biased group (in a good way) to the general industry.
I've had few interactions with HN crowd as I've posted my availability for consulting/freelancing and I feel like I don't like the bias.
People needing freelancers for few weeks/months to complete projects where the requirements are glueing the usual APIs and solving the usual Safari bugs asking me Leetcode questions are out of their mind.
I am not applying for a full time position, I'm not a cofounder that is going to make or break your startup and no sorry I am not sharing your vision/mission whatever.
Discord is by far better for finding work in your domain and related to technologies you like, and you can ask for much more money because people already know you're experienced on the topics you share on that discord.
>Discord is by far better for finding work in your domain and related to technologies you like
I've tried to interact on a variety of gamedev discords, and the results are about as dry as LinkedIn. But I suppose probing based on the 2020's market won't garner typical results. (still, open to checking out any suggestions. Far from a census here).
Money, though... ha. Less money and usually very few VC's so you're taking a hit compared to trying to grind interviews with EA/Activision. But that's games for you.
I've heard that many of the jobs posted on general jobs boards like that are never intended to result in a hire. They are posted when the company already knows who they are going to hire, but are legally obligated to post the position, or when the company wants to manufacture evidence that "no qualified candidates" could be found locally.
There's usually tells that it's a compliance post.
Used to be very specific instructions about mailing a resume to an address with a reference number. And advertised only in the newspaper. But Immigration said they can't do that one anymore; has to be the same submission methods (email, webform, whatever) as an actually open position and advertised/listed in the same places too.
But they'll still have the other tells, which is very specific experience and education requirements which happen to line up exactly with their preferred candidate. Sorry, we did our best, but we can't find any local candidates with a 4 year BS degree, a minor in Clown Studies, and 3 years experience with very specific software that isn't used many places (experience most likely obtained at the hiring company during internship or while on OPT; or while on H1-B if this is in support of a green card, rather than in support of H1-B).
I would say that's more prevalent in HN. A lot of the "Who's Hiring" posts are veiled show-and-tells. Some of those companies clearly have no intention of hiring. Even got an automatic rejection email from one of those (within a minute of applying). To be fair, it does work - I've discovered some interesting startups and market niches from the Who's Hiring threads.
Yes, but I doubt they use the HN jobs board for that. In fact, it will hurt their chances. They can simply post on a very generic job board (e.g. Monster) and say no qualified candidate applied.
The HN job board is much more likely to produce a qualified candidate.
definitely a skew here, yes. You'll get about the same web dev role demand, but it feels like there's more of some deeper domains here (embedded, compilers, etc) and less of other domains (games, IT).
NYC clearly doesn’t have the level of activity in this area that the Bay does, but there’s a scene. LeCun and the NYU crowd and the big FAIR footprint create a certain gravity. There’s stuff going on :)
Me: On your first day of work, what would you say to me if I asked you to do the samething in your writeup but in production and at scale using the same third party services (chatgpt etc) you used in your writeup.
T: "You're an idiot. I quit."
Me: You're hired!
Clearly it's a fun exploratory excercise, kind of like using kafta instead of a db for the main store for a crud app just to see how it works. But if you asked a senior engineer who follows the sector he would probably guess all the answers correctly blind. Tamer himself says it was his most expensive sunday hobby night. Now scale that uselessness to enterprise level. And you're not even sure if some of the results are hallucinated.
I can actually see how nice it would be for an llm to be able to disambiguate 'go' and 'rust'. However, it does seem a bit disappointing that it isn't consolidating node.js and nodejs or react-native and react native.
I'm curious on the need to do use selenium script to google to iterate, here's my script: https://gist.github.com/bobbywilson0/49e4728e539c726e921c79f.... Just uses the api directly and a regex for matching the title.
Thanks for sharing!
What the author could have done, and what I should have (but didn't) also, is add a bunch of possible values (enums) for each possible field value. This should solve it from coming up with variations e.g. node, nodejs
In zod/tooling it would look like this; remote: z.enum(['none', 'hybrid', 'full']), framework: z.enum(['nodejs', 'rails']),
But this just shifts the problem further down, which is now you need a good standard set of possible values. Which I am yet to find, but I'm sure it is out there.
On top of that, I am working on publishing a JobDescription.schema.json such that the next time the models train, they will internalize an already predefined schema which should make it a lot easier to get consistent values from job descriptions.
- Also I tend to forget to do it a lot recently in LLM days but there are plenty of good NER (Named Entity Recognition) tools out there these days, that you should run first before making robust prompts
The training data or some kind of enrichment of the data would have to make the systems understand node.js and nodejs are the same just like on the new notes calculator Apple-sauce = $2.50 * 8 makes the first statement a variable.
Anyway, I have a tangential question, and this is the first time I see langchain, so may be a stupid one. The point is the vendor-API seems to be far less uniform than what I'd expect from a framework like this. I'm wondering, why cannot[0] this be done with Ollama? Isn't it ultimately just system prompt, user input and a few additional params like temperature all these APIs require as an input? I'm a bit lost in this chain of wrappers around other wrappers, especially when we are talking about services that host many models themselves (like together.xyz), and I don't even fully get the role langchain plays here. I mean, in the end, all that any of these models does is just repeatedly guessing the next token, isn't it? So there may be a difference on the very low-level, there my be some difference on a high level (considering different ways these models have been trained? I have no idea), but on some "mid-level" isn't all of this utlimately just the same thing? Why are these wrappers so diverse and so complicated then?
Is there some more novice-friendly tutorial explaining these concepts?
[0] https://python.langchain.com/v0.2/docs/integrations/chat/
The APIs between different providers are actually pretty similar, largely close to the OpenAI API.
The reason to use a paid service is because the models are superior to the open source ones and definitely a lot better than what you can run locally.
It depends on the task though. For this task, I think a really good small model like phi-3 could handle 90-95% of the entries well through ollama. It's just that the 5-10% of extra screw ups are usually not worth the privilege of using your own hardware or infrastructure.
For this particular task, I would definitely skip langchain (but I always skip it). You could use any of the top performing open or closed models, with ollama locally, together.ai, and multiple closed models.
It should be much less than 50 lines of code. Definitely under 100.
You can just use string interpolation for the prompts and request JSON output with the API calls. You don't need to get a PhD in langchain for that.
I'm totally new (and maybe somewhat late) to the domain, literally just tried right now to automate a fairly simple task (extracting/guessing book author + title in nice uniform format from a badly abbreviated/transliterated and incomplete filename) using plain ollama HTTP-API (with llama3 as a model), but didn't have much success with that (it tries to chat with me in its responses, instead of strictly following my instructions). I think, my prompts must be the problem, and I hoped to try the langchain, since it somehow seems to abstract the problem, but saw that it isn't supported for a workflow the OP used. But since this is a field where I'm really totally new, I suppose I also may be making some more general mistake, like using a model that cannot be used for this task at all. How would I know, they all look the same to me…
Ollama project itself is fairly stingy with explanations. Doubtfully there are many people out there trying to automate an answer to the "Why is the sky blue?" question.
So, I wonder, maybe somebody knows a more digestible tutorial somewhere, explaining this stuff from the ground up?
https://www.octomind.dev/blog/why-we-no-longer-use-langchain...
And, as stated, I also don't know the answer to that question, so this is kinda one of the primary concerns here. I mean, one possible answer seems pretty obvious to me: it would be better to keep your app vendor-agnostic (to be able to switch from OpenAI to Anthropic with 1 ENV var) if at all possible. Neither of articles and doc-pages I've read in the past few hours tries to answer to what extent this is possible and why, and if it even is supposed to be a real selling-point of langchain. TBH, I still have no idea what the main selling point even is.
https://news.ycombinator.com/item?id=40739982 (14 days ago, 300 comments)
It gets picked by people with more of a top-down approach maybe, who feel like adding abstraction layers (that don't abstract pretty much anything) is better. It isn't
https://github.com/m3at/hn_jobs_gpt_etl
Only using the plain OpenAI api. This was on GPT-3.5, but it should be easy to move to 4o and make use of the json mode. I might try a quick update this weekend
LLMs aren't reliable for actual number crunching though (at least from what I've seen). Which is why I really appreciate seeing this blend!
Join the club :) I'd say that's a pretty mainstream idea right now.
Exciting nonetheless
I personally would like AI to help humans be better humans, not try to replace humans. Instead of using AI to create less understanding with black box processes, I'd rather it help us with introspection and search (I think embeddings[1] is still a pretty killer feature that I haven't heard much noise about yet).
[0] https://news.ycombinator.com/item?id=21786547 [1] https://terminusdb.com/blog/vector-database-and-vector-embed...
As an added detail regarding the "remote" v "in-person", another interesting statistic, to me, is to know how many of those in-person job-seeking companies are repeats! It could absolutely mean they're growing rapidly, OR it could mean they're having trouble finding candidates. Equally, missing remotes could mean either they're getting who they need OR they're going out of business.
All interesting plots on the graph!
That is, until December last year. Then it was like a flood gate had opened up. Suddenly we had tons of candidates to try to select from.
I knew things were bad over in bay area and such, but I didn't expect it to hit the market over the pond here so quickly and abruptly.
People needing freelancers for few weeks/months to complete projects where the requirements are glueing the usual APIs and solving the usual Safari bugs asking me Leetcode questions are out of their mind.
I am not applying for a full time position, I'm not a cofounder that is going to make or break your startup and no sorry I am not sharing your vision/mission whatever.
Discord is by far better for finding work in your domain and related to technologies you like, and you can ask for much more money because people already know you're experienced on the topics you share on that discord.
I've tried to interact on a variety of gamedev discords, and the results are about as dry as LinkedIn. But I suppose probing based on the 2020's market won't garner typical results. (still, open to checking out any suggestions. Far from a census here).
Money, though... ha. Less money and usually very few VC's so you're taking a hit compared to trying to grind interviews with EA/Activision. But that's games for you.
Used to be very specific instructions about mailing a resume to an address with a reference number. And advertised only in the newspaper. But Immigration said they can't do that one anymore; has to be the same submission methods (email, webform, whatever) as an actually open position and advertised/listed in the same places too.
But they'll still have the other tells, which is very specific experience and education requirements which happen to line up exactly with their preferred candidate. Sorry, we did our best, but we can't find any local candidates with a 4 year BS degree, a minor in Clown Studies, and 3 years experience with very specific software that isn't used many places (experience most likely obtained at the hiring company during internship or while on OPT; or while on H1-B if this is in support of a green card, rather than in support of H1-B).
The HN job board is much more likely to produce a qualified candidate.
Or is it just chatter from the grapevine?