That was going to be my suggestion for how to get around the anti-robot responses.
This solution uses a generous 87s delay between Amazon page retrievals. There are 328 films listed as "great movies" on rogerebert.com. As such, the script, named "1.sh", needs about 8h to complete, e.g., the time while you are at work or sleeping. No cookies, no state, no problems.
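A quick sanity check on that estimate, assuming one request every 87 seconds for all 328 titles:

```shell
# 328 titles x 87s between requests: total runtime in hours and minutes
total=$((328 * 87))                        # 28536 seconds
echo "$((total / 3600))h $((total % 3600 / 60))m"   # -> 7h 55m
```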
Usage: sh 1.sh > 1.html
Open 1.html in a browser and it shows whether each "great movie" is available as Prime Video or whether it is only available in some other format, such as Blu-ray, DVD, Multi-format or Hardcover. A link to the item on Amazon is provided.

#!/bin/sh
# Pull all 16 "great movies" index pages, extract the review slugs,
# then query Amazon for each title with an 87s delay between requests,
# and finally rewrite the output as minimal HTML.
curl -HUser-Agent: -H'Accept: application/json' --compressed 'https://www.rogerebert.com/great-movies/page/[1-16]?utf8=%E2%9C%93&filters%5Btitle%5D=&sort%5Border%5D=newest&filters%5Byears%5D%5B%5D=1914&filters%5Byears%5D%5B%5D=2020&filters%5Bstar_rating%5D%5B%5D=0.0&filters%5Bstar_rating%5D%5B%5D=4.0&filters%5Bno_stars%5D=1' |
grep -o "/reviews/great-movie-[^\\]*" |
sed 's/.reviews.great-movie-//' | sort | uniq |
while read x; do
  y=$(echo $x | sed 's/-/+/g')
  echo $x
  curl -s --compressed -HUser-Agent: https://www.amazon.com/s/?k=$y 2>/dev/null |
  grep -m1 -C4 a-link-normal.a-text-bold
  sleep 87
done |
sed '/^[^< ]/s/.*/@&,/;1s|.*|<base href=https://www.amazon.com />&|;s/ *//;/^$/d;/^[@<]/!s|$|</a>|;1s/@//;s/@/<br>/'
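For anyone reading the pipeline, the slug-to-query step is the `sed 's/-/+/g'` inside the loop. Spelled out on a hypothetical slug:

```shell
# What the slug-to-query step in the script does, on a hypothetical slug:
x="citizen-kane"                       # a slug as extracted from the review URL
y=$(echo "$x" | sed 's/-/+/g')         # dashes become plus signs for the query
echo "https://www.amazon.com/s/?k=$y"  # -> https://www.amazon.com/s/?k=citizen+kane
```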
> And now, we the architects of the modern web — web designers, UX designers, developers, creative directors, social media managers, data scientists, product managers, start-up people, strategists — are destroying it.
The interests of tech companies, investors and web professionals have not always aligned with the best interests of end users, and so there has been a gradual erosion of the freedoms embedded in the foundations of the web itself.
My favourite Star Trek moment is Captain Pike's statement "We are always in a fight for the future". Given the current state of the web, this feels truer than ever. Unlike the author, however, I don't think the answer is better web pages. Any answer that gives us a chance of winning the fight for user freedoms must be bigger and bolder than that.
There has been an entire generation of entrepreneurs and investors who have thought and planned strategically how to shape the web to work in their best interests. A meaningful counter has to be equally intentional and coordinated to stand a chance at shaping the course technology takes. We are in a fight for the future and we need to think bigger to stand a chance of winning that fight.
Absolute favourite episode, hands down: The Cage. According to Shatner's autobiography, NBC called the pilot "too cerebral" and "too intellectual".
"There has always been a place for commerce and marketing on the web."
Not really true as I remember it. The web opened up to the public in 1993. There was no commerce and marketing in the beginning. Even by 1996, while commerce and marketing may have existed (e.g., Amazon, founded in 1995), their place was in the background.

As I remember the early web, the foreground, the "starting point" or "portal", was something like Yahoo! You had to pick a topic (direction) that you wanted to go in. For example, if you were after music, you might end up browsing the Internet Underground Music Archive. The "front page" of the portal was predominantly non-commercial, mostly generic headings for topics. If you wanted to search out something commercial, no doubt you could, but the initial starting point was intellectual curiosity.

This is IMO what has been lost over time with regard to web use: intellectual curiosity and the ability to actually satisfy it. (A fun tangent here is the collections of inane queries that people type into Google. These are simultaneously hilarious and disturbing.)
As an experiment have a look at the Yahoo! page today. It is full of low quality mainstream "news". There is zero attention to intellectual curiosity. Nothing to see here, folks, but here is the latest news. For part 2 of the experiment, run a Google search for the term "music". The results are dominated by YouTube. Every result is directly or indirectly commercial (either selling something or conducting surveillance and serving ads), except one: Wikipedia. The chances of someone new to the web not following a link to YouTube or some other Google-controlled domain would seem almost nil.
The "onboarding" process for new web users is very different today than it was in the early 1990s. Perhaps it is still possible to approach the web with a sense of awe and wonder, pondering "What is out there?" However, a new web user today is unlikely to end up on a non-commercial website besides Wikipedia. What is out there? Surveillance, ads and an endless supply of soon-to-be-obsolete JavaScript du jour.
This article doesn’t mention a really, really straightforward factor for why AI hasn’t invaded these domains despite billions of dollars being dumped into them.
An automated process only has to be wrong once to compel human operators to double or triple check every other result it gives. This immediately destroys the upside as now you’re 1) doing the process manually anyway and 2) fighting the automated system in order to do so.
99% isn’t good enough for truly critical applications, especially when you don’t know for sure that it’s actually 99%; there’s no way to detect which 1% might be wrong; there’s no real path to 100%; and critically: there’s no one to hold responsible for getting it wrong.
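As a back-of-envelope illustration of why 99% isn't good enough: at a 99% per-item success rate, the chance of at least one error across 100 independent items (an illustrative count, not a figure from the article) is already well over half:

```shell
# P(at least one error in 100 items) = 1 - 0.99^100
awk 'BEGIN { printf "%.3f\n", 1 - 0.99^100 }'   # prints 0.634
```

And since you cannot tell which items fall in that 1%, every one of the 100 must be re-checked by hand.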
"... and critically: there's no one to hold responsible for getting it wrong."
Could this be part of "AI"'s appeal? A dream of absolving businesses and individuals from accountability.[2]
1. "What's more, artificial research teams lack an awareness of the specific business processes and tasks that could be automated in the first place. Researchers would need to develop an intuition of the business processes involved. We haven't seen this happen in too many areas."
2. Including the ones who designed the "AI" system.
I think the level of control programmers have over their domain naturally gives rise to that sort of overconfidence. You need to remember that computer systems are built on human made abstractions to human standards and follow human defined logic. DNA is not code, it's just a molecule that reacts with stuff, as are all the other molecules. They exist as they are and are their own system that needs to be understood, we did not create that system. Chemistry and probability and time did.
This sentence exemplifies the perspective to which I referred.
#!/bin/sh
test -s max-PMID||echo 32446294 > max-PMID;read x < max-PMID;x=$((x-1));h=pubmed.ncbi.nlm.nih.gov;
test ${#x} -eq 8||exec echo weird max-PMID;sed -i "/test/s/echo [0-9]\{8\} /echo $x /" $0;
case $1 in update) mkfifo 1.fifo 2>/dev/null;test -p 1.fifo||exec echo need 1.fifo;
(grep "<title>PMID .* is not available" < 1.fifo|sed 1q|sed 's/<title>PMID //;s/ *//;s/ .*//;' >max-PMID)&
y=$((x+10000));seq $x $y|sed '$!s|.*|GET /&/ HTTP/1.1\r\nHost: '"$h"'\r\nConnection: keep-alive\r\n\r\n|;
$s|.*|GET /&/ HTTP/1.1\r\nHost: '"$h"'\r\nConnection: close\r\n\r\n|'|socat - ssl:$h:443 >1.fifo 2>/dev/null;
;;"")awk -v min=1 -v max=$x 'BEGIN{srand();printf "https://'$h'/" int(min+rand()*(max-min+1)) "/\n"}';esac
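For anyone puzzling over the seq|sed stage in the update branch: it emits raw pipelined HTTP/1.1 requests, keep-alive on every ID except the last, which closes the connection. The same framing, spelled out with printf for three hypothetical IDs:

```shell
# Shape of the pipelined requests fed to socat: keep-alive on all but the
# last ID, then Connection: close so the server ends the session.
h=pubmed.ncbi.nlm.nih.gov
for id in 100 101; do
  printf 'GET /%s/ HTTP/1.1\r\nHost: %s\r\nConnection: keep-alive\r\n\r\n' "$id" "$h"
done
printf 'GET /102/ HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$h"
```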
#!/bin/sh
test -s max-PMID||echo 32449615 > max-PMID;read x < max-PMID;h=pubmed.ncbi.nlm.nih.gov;
test ${#x} -eq 8||rm max-PMID;sed -i "s/[0-9]\{8\}/$x/" $0;
case $1 in update) mkfifo 1.fifo 2>/dev/null;test -p 1.fifo||exec echo need 1.fifo;
(grep "<title>PMID .* is not available" < 1.fifo|sed 1q|sed -n 's/<title>PMID //;s/ *//;s/ .*//;wmax-PMID')&
y=$((x+10000));seq $x $y|sed '$!s|.*|GET /&/ HTTP/1.1\r\nHost: '"$h"'\r\nConnection: keep-alive\r\n\r\n|;
$s|.*|GET /&/ HTTP/1.1\r\nHost: '"$h"'\r\nConnection: close\r\n\r\n|'|socat - ssl:$h:443 2>/dev/null|grep -o '<title>[^<]*' >1.fifo;
;;"")awk -v min=1 -v max=$((x-1)) 'BEGIN{srand();printf "https://'$h'/" int(min+rand()*(max-min+1)) "/\n"}';esac
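The non-update branch in isolation, with a small fixed upper bound so it is easy to test (the real script uses the stored max PMID as max):

```shell
# Pick a random ID between min and max and print its PubMed URL.
# srand() seeds from the clock, so each run gives a different ID.
h=pubmed.ncbi.nlm.nih.gov
awk -v min=1 -v max=1000 'BEGIN{srand();printf "https://'"$h"'/%d/\n", int(min+rand()*(max-min+1))}'
```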
If you want to really grok genetics and be able to understand and interpret news and discussion about the field, it is worth the effort, especially considering how important the field is in our day-to-day lives, both with the virus and with biotech/medicine in general.
You mean like this: https://ds9a.nl/amazing-dna
Having worked in both industries, I prefer working with wet science people. For some reason they generally have a much healthier perspective on life. Their work is humbling because it is, and will forever be, full of unsolved mysteries, not simply because it is challenging. The other folks, whether they call themselves "scientists" or "engineers" or "developers" or "coders" or whatever, are working with something that as far as I can see has no inherent connection to the natural world, other than being a product of the human mind. Perhaps that affects the perspective many of them have on life. For example, how common among them is the belief that all things, not simply computers, can be thoroughly understood and mastered? Note this is pure opinion, not fact, and I am generalising; there are exceptions to every generalisation.
https://letterboxd.com/dvideostor/list/roger-eberts-great-mo...
You can look at each movie to see what streaming service it's on one at a time for free.
If you have a pro paid account, you can even do:
https://letterboxd.com/dvideostor/list/roger-eberts-great-mo...
Which shows that there are 39 movies in Amazon Prime US from Ebert's "Great Movies," not 21 like this guy's spreadsheet says.
To be fair, the exercise was to scrape the reference sources... so it might just need some refinement.
I would need to double-check whether both lists are correct, though; I have only confirmed the number totals.
Full disclosure: That letterboxd list is not mine, I just found it
I could be wrong, I am not a Prime Video user, but the result I got was that there are 217 movies in Prime Video from Ebert's great movies.
Instructions on how to generate 1.html are here: https://news.ycombinator.com/item?id=23508182