Readit News
Lapha commented on Meta AI releases Code Llama 70B   twitter.com/AIatMeta/stat... · Posted by u/albert_e
pandominium · 2 years ago
Everyone mentions using a 4090 and a smaller model, but I rarely see an analysis that factors in energy consumption.

I think Copilot is already highly subsidized by Microsoft.

Let's say you use Copilot around 30% of your daily work hours. How many kWh does an open-source 7B or 13B model then use in a month on one 4090?

EDIT:

I think for a 13B at 30% use per day it comes to around $30/mo on the energy bill.

So an even smaller but still capable model could probably beat the Copilot monthly subscription.

Lapha · 2 years ago
Running models locally using GPU inference shouldn't be too bad as the biggest impact in terms of performance is ram/vram bandwidth rather than compute. Some rough power figures for a dual AMD GPU setup (24gb vram total) on a 5950x (base power usage of around 100w) using llama.cpp (i.e., a ChatGPT style interface, not Copilot):

46b Mixtral q4 (26.5 gb required) with around 75% in vram: 15 tokens/s - 300w at the wall, nvtop reporting GPU power usage of 70w/30w, 0.37kWh

46b Mixtral q2 (16.1 gb required) with 100% in vram: 30 tokens/s - 350w, nvtop 150w/50w, 0.21kWh.

Same test with 0% in vram: 7 tokens/s - 250w, 0.65kWh

7b Mistral q8 (7.2gb required) with 100% in vram: 45 tokens/s - 300w, nvtop 170w, 0.12kWh

The kWh figures are an estimate for generating 64k tokens (around 35 minutes at 30 tokens/s). It's not an ideal estimate, as it only accounts for generation and ignores the overhead of prompt processing or of having longer contexts in general.
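The estimate is just wall power multiplied by generation time. A minimal sketch of that arithmetic (the function and its name are mine for illustration; the inputs are the measurements above):

```python
# Energy to generate a fixed number of tokens: wall watts times generation
# time. generation_kwh is a made-up helper name; inputs are the figures
# reported above.

def generation_kwh(tokens: int, tokens_per_s: float, wall_watts: float) -> float:
    hours = tokens / tokens_per_s / 3600
    return wall_watts * hours / 1000

# 64k tokens at the measured speeds and wall wattages:
print(round(generation_kwh(64_000, 15, 300), 2))  # 0.36 (Mixtral q4, 75% in vram)
print(round(generation_kwh(64_000, 30, 350), 2))  # 0.21 (Mixtral q2, 100% in vram)
print(round(generation_kwh(64_000, 45, 300), 2))  # 0.12 (Mistral 7b q8)
```

The slower the generation, the longer the machine sits at full draw, which is why the partially offloaded runs cost more energy per token despite drawing fewer watts.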

The power usage essentially mirrors token generation speed, which shouldn't be too surprising. The more of the model you can load into fast vram the faster tokens will generate and the less power you'll use for the same amount of tokens generated. Also note that I'm using mid and low tier AMD cards, with the mid tier card being used for the 7b test. If you have an Nvidia card with fast memory bandwidth (i.e., a 3090/4090), or an Apple ARM Ultra, you're going to see in the region of 60 tokens/s for the 7b model. With a mid range Nvidia card (any of the 4070s), or an Apple ARM Max, you can probably expect similar performance on 7b models (45 t/s or so). Apple ARM probably wins purely on total power usage, but you're also going to be paying an arm and a leg for a 64gb model which is the minimum you'd want to run medium/large sized models with reasonable quants (46b Mixtral at q6/8, or 70b at q6), but with the rate models are advancing you may be able to get away with 32gb (Mixtral at q4/6, 34b at q6, 70b at q3).

I'm not sure how many tokens a Copilot style interface is going to churn through, but it's probably in the same ballpark. A reasonable figure for either interface at the high end is probably a kWh a day, and even in expensive regions like Europe it's probably no more than $15/mo. The actual cost comparison then becomes a little complicated: spending $1500 on 2 3090s for 48gb of fast vram isn't going to make sense for most people, and similarly, making do with whatever cards you can get your hands on, so long as they have a reasonable amount of vram, probably isn't going to pay off in the long run. It also depends on the size of the model you want to use and what amount of quantisation you're willing to put up with. Current 34b models or Mixtral at reasonable quants (q4 at least) should be comparable to ChatGPT 3.5; future local models may end up getting better performance (either in terms of generation speed or how smart they are), but ChatGPT 5 may blow everything we have now out of the water. It seems far too early to make purchasing decisions based on what may happen, but most people should be able to run 7b/13b and maybe up to 34/46b models with what they have, and not break the bank when it comes time to pay the power bill.
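The monthly figure follows directly from those two rough assumptions (a kWh a day of heavy use, a high European electricity rate; both are the estimates above, not measurements):

```python
# Back-of-envelope monthly cost of heavy local inference. Both inputs are
# the rough assumptions from the text, not measured values.
kwh_per_day = 1.0
eur_per_kwh = 0.40  # an expensive European rate

monthly_cost = kwh_per_day * 30 * eur_per_kwh
print(f"~{monthly_cost:.0f} EUR/mo")  # ~12 EUR/mo, under a Copilot subscription
```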

Lapha commented on Anki – Powerful, intelligent flash cards   apps.ankiweb.net/... · Posted by u/bcg361
alabhyajindal · 2 years ago
He should have used Vim instead of memorizing its keybindings. That makes no sense at all.
Lapha · 2 years ago
Well, he absolutely should have been using Vim; instead he spent the week learning vi.
Lapha commented on How many CPU cores can you use in parallel?   pythonspeed.com/articles/... · Posted by u/itamarst
kristjansson · 2 years ago
> (you can also use perfplot, but note it’s GPL-licensed)

Surely using a GPL-licensed development tool cannot affect the licensing of the project it's being used to develop?

Lapha · 2 years ago
If it were just a standalone tool it wouldn't affect the licencing of the project, but it's not a tool, it's a software library, so projects using the library need to follow the terms of the GPL.

The tricky part here is that it's a library that produces graphs, and the graphs themselves aren't covered by the terms of the GPL. The intent seems to be that you use it during development to produce graphs for benchmarking or research, where the tool is internal but the graphs aren't necessarily internal. In that case the licence of the library is mostly irrelevant, as the GPL is mostly concerned with distributing software to other users of said software, not the output of the software. But if you later choose to release that tool, it really ought to follow the terms of the GPL and be under a GPL-compatible licence, so the note about it being a GPL-licenced library is completely valid. You should use a more permissively licenced (LGPL/MIT/BSD/whatever) library like benchit if you don't want to have to think about this sort of stuff.

Lapha commented on Copyright claim against Tolkien estate backfires on LOTR fanfiction author   theguardian.com/books/202... · Posted by u/pseudolus
seanhunter · 2 years ago
Certainly in the UK it seems you have to have permission from the original rights holder to create the derivative work in the first place. If they didn't have this permission they weren't entitled to create the work and therefore their claim of copyright in the original parts of that work would be moot. I'd be surprised if that were not the case elsewhere https://copyrightservice.co.uk/copyright/p22_derivative_work...
Lapha · 2 years ago
Note that the author of the unauthorised derivative work 'B', based on the 'original' work 'A', still retains copyright over their own (unauthorised) work B. Even if a court compels the author of B to stop distributing their infringing work B this fact doesn't change: B doesn't go out of copyright, and the author of A (or anybody else for that matter) doesn't gain any additional rights to use the work B.

In this particular case the Guardian article doesn't answer the question, but it's almost certainly the case that the courts found that Tolkien's estate didn't infringe upon Polychron's unauthorised work. In the UK (and much of the world), the bar for creating an original work is incredibly low; currently it follows the EU approach, where the author needs to express (creative) originality through the choice, selection, or arrangement of a work. Previously the UK had the 'sweat of the brow' rule where the bar was even lower, only requiring 'labour, skill or judgement' in producing a work. The UK also explicitly protects derivative works from being considered infringement if they're found to be a parody, caricature, or pastiche, but this obviously isn't the case here.

Also note that 'copyright service' is a private company that sells copyright protection as a service through registering a work. The UK, unlike some countries, has no register of copyrighted works so you don't need to register a work for it to receive copyright protection in the UK.

Lapha commented on Kyutai AI research lab with a $330M budget that will make everything open source   techcrunch.com/2023/11/17... · Posted by u/vasco
yieldcrv · 2 years ago
the “and” in FOSS is actually doing a lot of heavy lifting, as open source really does only mean source available

yes, I want FOSS too

Lapha · 2 years ago
>as open source really does only mean source available

The definition and history of the term as a licence are unambiguous in that the only restriction on redistribution is that it contains the source code under the same licence. There are senior engineers alive today who weren't even born when this was the commonly understood meaning of the term; it's not a new concept.

The term and usage is being co-opted these days but that's bound to happen when it's not a legally protected definition. Give it another 10-20 years and I'm sure we'll be having the same argument over whatever term ends up replacing 'open source'.

Lapha commented on Kyutai AI research lab with a $330M budget that will make everything open source   techcrunch.com/2023/11/17... · Posted by u/vasco
ekianjo · 2 years ago
except in the name
Lapha · 2 years ago
'Free' software has problems with its name too. The ones muddying the waters are people and companies releasing source code with a proprietary licence while trying to latch onto the open source branding.
Lapha commented on MP3 vs. AAC vs. FLAC vs. CD (2008)   stereophile.com/features/... · Posted by u/thefilmore
jbverschoor · 2 years ago
I don’t know where it came from.. it was already there in the CRT times.

A simple google on 60 fps will still show these “scientists” who claim that we can't perceive anything higher than 30-60 fps.

“Science” does NOT equal truth.

Lapha · 2 years ago
The topic of human vision and perception is complex enough that I very much doubt it's scientists who are making the claim that we can't perceive anything higher than 30-60fps. There's various other effects like fluidity of motion, the flicker fusion threshold, persistence of vision, and strobing effects (think phantom array/wagon wheel effects), etc, which all have different answers. For example, the flicker fusion threshold can be as high as 500hz[0], similarly strobing effects like dragging your mouse across the screen are still perceivable on 144hz+ and supposedly 480hz monitors.

As far as perceiving images goes, there's a study at [1] which shows people can reliably identify images shown on screen for 13ms (75hz, the refresh rate of the monitor they were using). That is, subjects were shown a sequence of 6-12 distinct images 13ms apart and were still reliably able to identify a particular image in that sequence. What's noteworthy is this study is commonly cited for the claims that humans can only perceive 30-60fps, despite the study addressing a completely separate issue to perception of framerates, and is a massive improvement over previous studies which show figures as high as 80-100ms, which seems like a believable figure if they were using a similar or worse methodology. I can easily see this and similar studies being the source of the claims that people can only process 10-13 images a second, or perceive 30-60 fps, if science 'journalists' are lazily plugging something like 1000/80 into a calculator without having read the study.
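That lazy calculator arithmetic, naively inverting a per-image presentation time into a "frames per second" figure, looks like this (naive_fps is my name for it):

```python
# Naive inversion of per-image presentation time into "fps". This is the
# suspected source of the popular claims, not a real perception limit.

def naive_fps(ms_per_image: float) -> float:
    return 1000 / ms_per_image

print(round(naive_fps(13), 1))   # 76.9 "fps" from the 13 ms study
print(round(naive_fps(80), 1))   # 12.5, lining up with "10-13 images a second"
print(round(naive_fps(100), 1))  # 10.0
```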

There's also the old claim [2] from at least 2001 that the USAF studied fighter pilots and found that they can identify planes shown on screen for 4.5ms, 1/220th of a second, 1/225th of a second, or various other figures, but I can't find the source for this and I'm sure it's more of an urban legend that circulated gaming forums in the early 2000s than anything. If it was an actual study, I'm almost certain persistence of vision played a role in this, something the study at [1] avoids entirely.

[0] 'Humans perceive flicker artifacts at 500 Hz' https://pubmed.ncbi.nlm.nih.gov/25644611/

[1] 'Detecting meaning in RSVP at 13 ms per picture' https://dspace.mit.edu/bitstream/handle/1721.1/107157/13414_...

[2] http://amo.net/nt/02-21-01fps.html

Lapha commented on Using ChatGPT to fix annoying Safari UI issue on macOS   chat.openai.com/share/af5... · Posted by u/amichail
AndrewKemendo · 2 years ago
Unfortunately the data it’s trained on (internet text - commoncrawl) has an antisocial data bias, and there’s no alternative “pro-social” data corpuses that exist.[1]

Every one of these has to be “taught” not to be toxic - usually by some version of exploited labor paid pennies to look at the worst stuff on the planet [2]

It’s true of every larger model trained on internet corpus - Microsoft notoriously had this issue with Taybot

[1] https://hci.stanford.edu/publications/2022/Park_ContentModAu...

[2] https://time.com/6247678/openai-chatgpt-kenya-workers/

Lapha · 2 years ago
While I'm sure the publicly available and already moderated Reddit comments OpenAI pays workers to classify are truly disturbing (nobody should be forced to read Reddit, after all), I think the moderators at Facebook, Google, porn websites, etc., who have to deal with graphic videos of extreme violence, torture, gore, abuse, sexual abuse, and child sexual abuse, are the ones who have to look at the worst stuff on the planet.
Lapha commented on CT scans of coffee-making equipment   scanofthemonth.com/scans/... · Posted by u/eucalyptuseye
cjs_ac · 3 years ago
UK plug sockets are rated at 13 A, giving a maximum power rating of 2,990 W. Kettles are consequently amongst the highest-drawing household appliances in the UK.

Back when there were only three television channels, the National Grid planners used to pore over the Radio Times, looking for popular programmes like the Morecambe and Wise Christmas Special (21 to 28 million viewers in 1977), so they could prepare for the demand surge of the entire nation putting the kettle on at the end of the programme.

Lapha · 3 years ago
>UK plug sockets are rated at 13 A, giving a maximum power rating of 2,990 W.

Nominally, anyway. When the EU standardised mains voltages they mostly did so on paper by fudging the tolerances: the UK's 240v +/-3% became 230v -6/+10%. 28 years later and I still see the mains voltage in the 240-250v range more often than not.
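The paper fudge is easy to check: the old UK tolerance band sits entirely inside the new nominal one, so nothing physical had to change. A quick sketch using only the figures above:

```python
# Old UK spec vs harmonised EU spec, as (min, max) voltage ranges.
uk_old = (240 * 0.97, 240 * 1.03)  # 240 V +/-3%   -> 232.8-247.2 V
eu_new = (230 * 0.94, 230 * 1.10)  # 230 V -6/+10% -> 216.2-253.0 V

# The old range fits entirely inside the new one.
assert eu_new[0] <= uk_old[0] and uk_old[1] <= eu_new[1]
print(f"old UK range: {uk_old[0]:.1f}-{uk_old[1]:.1f} V")
print(f"new EU range: {eu_new[0]:.1f}-{eu_new[1]:.1f} V")

# And the socket rating quoted upthread: 13 A at nominal 230 V.
print(13 * 230)  # 2990 W
```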

Lapha commented on Electricity Maps   app.electricitymaps.com... · Posted by u/fulafel
404mm · 3 years ago
Can you please elaborate more on this? I take it this includes CO2 released during manufacturing. Is it averaged over the expected life span? What about for a nuclear plant? Whole I can see how we can estimate CO2 for a panel, I don’t think we can have good idea of a whole plant.
Lapha · 3 years ago
Ideally it's lifecycle carbon footprint averaged over expected power production during that time, but the numbers aren't always directly comparable and the error bars are huge to boot.

With solar, most of the carbon footprint comes from the massive energy requirements needed to produce the panels, and countries that use coal to produce panels, like China, have a significantly higher carbon footprint than, say, the EU. Then there's the matter of where you install them: panels in most of Europe will have a CO2/kWh figure similar to panels installed in north-west Canada, while panels installed in the US will have a similar figure to panels installed in southern Europe and northern Africa. Newer panels generally have a longer lifespan and are more efficient, so will have a lower number even if the total footprint stays the same. Should the albedo effect be considered with solar? It matters if you plan to cover a light desert with dark panels. How often it rains can also ironically make a difference, as rain helps clean the panels, keeping them running efficiently.

With nuclear, there's the mostly fixed costs of constructing and decommissioning the plants, the ongoing cost of running and maintaining the plants and disposing of spent fuel, with most of the remaining footprint coming from mining the fuel. To me nuclear seems a little more straightforward to calculate, but there's still variability that can come from the availability and difficulty of mining the ore. There's also political issues to consider. If your country decides to shut down your nuclear plants prematurely, like Germany did, the huge upfront cost of building the plants can't be recouped by running them for an additional 10, 20, 30+ years until it becomes necessary to decommission them.

Regardless of how renewables are compared to nuclear, coal and gas are the elephants in the room when it comes to CO2 produced per kWh.
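The lifecycle calculation itself is simple: total lifecycle emissions divided by total expected generation. A sketch with entirely made-up round numbers (none of these figures are real data for any plant; they only illustrate why shutting a plant down early inflates its gCO2/kWh):

```python
# Lifecycle carbon intensity: total emissions over total lifetime generation.
# All inputs below are illustrative assumptions, not real plant data.

def lifecycle_g_per_kwh(total_tonnes_co2: float,
                        capacity_mw: float,
                        capacity_factor: float,
                        lifetime_years: float) -> float:
    lifetime_kwh = capacity_mw * 1000 * capacity_factor * lifetime_years * 8760
    return total_tonnes_co2 * 1_000_000 / lifetime_kwh

# Same hypothetical plant with the same fixed build/decommission footprint,
# shut down after 20 years vs run for 50:
early = lifecycle_g_per_kwh(5_000_000, 1000, 0.9, 20)
full = lifecycle_g_per_kwh(5_000_000, 1000, 0.9, 50)
print(round(early, 1), round(full, 1))  # 31.7 12.7
```

The fixed footprint doesn't change, but it gets spread over 2.5x less generation, so the early-shutdown figure is 2.5x worse per kWh.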
