Readit News
tildef · 3 years ago
Cool to see a new general purpose search engine with its own crawler! I tried a couple of queries and it worked fine for me. Searching for inetdex (https://www.tuxdex.com/?q=inetdex) returns 1,442 documents that apparently just reflect the crawler's user agent though. Might need some tuning. Can you (assuming OP is affiliated with the site) say something about the team and business model, if any? Would also be interesting to hear about what infrastructure you're running on and if there are any plans to open source the crawler.

I really like the minimal/90s look and feel too--keep up the good work, assuming it's not a honeypot! :^)

NoMAD76 · 3 years ago
Yup, a plus from me about the clean 90's look, hope it will stay that way.

Searched for a term, got 400k+ results, went up to page 80 and it was still going. Another big plus from me since I just love to search low-ranked results; sometimes you find true gems in obscure pages.

tux37 · 3 years ago
It already returns 2,816 documents. No, that is not a honeypot!
charlieyu1 · 3 years ago
I searched "how to center a div" and seem to have found a bug.

It looks like the HTML tags from one of the search results messed up the page.

tux37 · 3 years ago
I can't see any error for the search query "how to center a div", maybe it has been fixed. https://tuxdex.com (Anonymous Search Engine)
sigmoid10 · 3 years ago
It's so anonymous, it's not even filtering html tags.
tux37 · 3 years ago
Of course HTML tags are filtered; it is possible that this function is not yet perfect.
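For reference, a minimal sketch of the kind of output escaping being discussed here. It is not TUXDEX's actual code; `render_result` and its parameters are hypothetical names, and the only library call used is Python's standard `html.escape`. The idea is that every string taken from a crawled page is escaped before it is placed into the results page, so a title like "how to center a <div>" shows up as text instead of markup.

```python
import html


def render_result(title: str, snippet: str, url: str) -> str:
    """Build one result entry with every crawled string escaped.

    Hypothetical helper for illustration only.
    """
    # html.escape replaces &, <, > and (with quote=True) both quote characters.
    safe_title = html.escape(title, quote=True)
    safe_snippet = html.escape(snippet, quote=True)
    safe_url = html.escape(url, quote=True)
    return f'<li><a href="{safe_url}">{safe_title}</a><p>{safe_snippet}</p></li>'


if __name__ == "__main__":
    print(render_result(
        "How to center a <div>",
        "Use <div style='margin:auto'>",
        "https://example.com/?a=1&b=2",
    ))
```

Escaping alone does not validate the URL scheme, so a real results page would presumably also restrict links to http and https.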
marginalia_nu · 3 years ago
Pretty impressive.

The search results come very fast, but there's some low hanging fruit QOL optimizations as far as I can see, mostly to do with search result ranking.

It doesn't seem to weigh the terms properly; there is some pull toward relevant terms but not as much as I'd expect from BM25. I also think it doesn't promote results with adjacent terms enough. As a result the relevance of the search results sort of undulates up and down as you go through the results.
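For readers unfamiliar with BM25, here is a minimal sketch of the standard Okapi BM25 scoring function, to show what "pull toward relevant terms" would normally look like. It says nothing about how TUXDEX actually ranks. The function name `bm25_scores`, the pre-tokenized inputs, and the defaults `k1=1.2` and `b=0.75` are assumptions chosen for illustration, and term adjacency is not modeled at all.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Assumes `docs` is a non-empty list of non-empty token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates, and long documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        ["center", "a", "div"],
        ["div", "div", "div", "layout"],
        ["cooking", "recipes"],
    ]
    print(bm25_scores(["center", "div"], docs))
```

In this toy corpus the first document matches both query terms and scores highest, the second matches only the more common term, and the third scores zero.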

The de-duplication also seems non-existent. You can easily get a page full of duplicates of the same two links. You can get pretty far if you just keep a simple hash of the HTML of each website or whatever.
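A minimal sketch of the "simple hash of the HTML" idea, under the assumption that the crawler hands over (url, html) pairs. `dedupe_by_content` is a hypothetical name, and the whitespace normalization is an added assumption so that trivially different copies collapse to the same hash. Exact hashing like this only catches identical pages, not near-duplicates.

```python
import hashlib


def dedupe_by_content(pages):
    """Keep the first (url, html) pair seen for each distinct HTML body."""
    seen = set()
    unique = []
    for url, html_body in pages:
        # Collapse runs of whitespace so formatting-only differences match.
        normalized = " ".join(html_body.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # same content already indexed under another URL
        seen.add(digest)
        unique.append((url, html_body))
    return unique


if __name__ == "__main__":
    pages = [
        ("https://example.org/a", "<p>hello</p>"),
        ("https://example.org/b", "<p>hello</p>\n"),
        ("https://example.org/c", "<p>other</p>"),
    ]
    print([url for url, _ in dedupe_by_content(pages)])
```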

This is made worse by what I think is a bug where the crawler follows redirects, and reports them as the original URL and not the redirected URL. I actually had the same bug in my crawler when I started out. Nasty anti-synergy with Wikipedia, which has a lot of redirects.
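A minimal sketch of the fix described here: index a page under the URL it finally resolved to, not the one that was requested. To keep the example self-contained, the redirects are modeled as a plain dict from source URL to target URL; `resolve_final_url`, the dict, and the example.org URLs are all hypothetical. In a real crawler the final URL would normally come from the HTTP client, for example `geturl()` on the response returned by `urllib.request.urlopen`.

```python
def resolve_final_url(url, redirects, max_hops=10):
    """Follow `redirects` (source URL -> target URL) and return the final URL.

    Stops after `max_hops` hops or when a redirect loop is detected.
    """
    seen = {url}
    for _ in range(max_hops):
        target = redirects.get(url)
        if target is None:
            return url  # no further redirect: this is the canonical URL
        if target in seen:
            return url  # loop detected, stop at the current URL
        seen.add(target)
        url = target
    return url


if __name__ == "__main__":
    redirects = {
        "https://example.org/wiki/UK": "https://example.org/wiki/United_Kingdom",
    }
    # Both requests resolve to the same URL, so they index as one document.
    print(resolve_final_url("https://example.org/wiki/UK", redirects))
    print(resolve_final_url("https://example.org/wiki/United_Kingdom", redirects))
```

Keying the index on the resolved URL means many redirecting aliases collapse into a single entry instead of appearing as separate duplicate results.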

These are relatively minor issues that shouldn't be hard to fix.

jwilk · 3 years ago
What's QOL?
w4ffl35 · 3 years ago
quality of life
tux37 · 3 years ago
Thank you for the many posts and for visiting TUXDEX.COM.

TUXDEX has received an update of both the software and the indexes. Thanks to bacon-and-stars, we have been able to fix the error; TUXDEX is safe from XSS and other attacks.

About the results: We are working on our algorithm and other methods that will improve the results. Please keep in mind that the well-known search engines sometimes employ thousands of people and have been on the market for years.

http://tuxdex.com (Anonymous Search Engine)

greenail · 3 years ago
Results for the query: how+does+tuxdeo+make+money
Instead of: how does tuxdex make money
Result page: 1
0 Documents found in 0.27 seconds
octoberfranklin · 3 years ago
This. I'm glad these guys are running their own crawler, but their story doesn't add up.

Most likely this is an outfit that does a lot of webscraping and wants to minimize the amount of manual banning they incur. So they set up a plausible indie search engine as a front. Of course the search engine has to actually work when the scrapees notice and try it out.

memorable · 3 years ago
TBH, the comments here are way more positive than I expected.

And the search engine is actually pretty decent (from my test).

encryptluks2 · 3 years ago
Is this open source, and if not, how do they prove anonymity?
smabie · 3 years ago
Even if it was open source, how would they prove anonymity?
techsin101 · 3 years ago
I don't remember exactly, but it's possible to create a completely anon search engine. The user encrypts the query using their key and sends it to the search engine, which now doesn't know the query. The search engine then encrypts its entire database using the user's public key and checks whether the hashes match for keywords?... something like that
onlyrealcuzzo · 3 years ago
Also, how are you going to block bots without some sort of tracking and without being anti-user?

You could easily get 100 bot requests for every 1 user request.

tux37 · 3 years ago
Our search server receives the request from the web server and returns the results. The web server, which could store the user IP, does not know the search query, since the POST method is used for transmission. The search server that processes the request does not know the user IP, only the web server.

The logs will be deleted after the visitor numbers have been determined. Your search query remains anonymous at https://tuxdex.com.

octoberfranklin · 3 years ago
Easy: just don't block bots.

Problem solved.