ukutaht (u/ukutaht) - Readit News

ukutaht commented on Hetzner continues its growth in the US with a new location hetzner.com/news/12-22-cl... · Posted by u/matteocontrini

zelphirkalt · 3 years ago

I hope they created an offspring company with no access to the Hetzner main infrastructure. It would be very inconvenient to lose the "hosted in Germany" - a.k.a. "not accessible data for the US" - aspect in terms of GDPR.

ukutaht · 3 years ago

Yes, very important to understand the corporate structure and legal implications wrt GDPR and the Schrems II decision.

ukutaht commented on Ask HN: Who is hiring? (June 2022) · Posted by u/whoishiring

ukutaht · 4 years ago

Plausible Analytics | https://plausible.io | Product Engineer | Remote Worldwide | Full-time

Plausible Analytics is an open source alternative to Google Analytics. Our mission is to reduce corporate surveillance by providing an alternative web analytics tool that doesn’t come from the AdTech world. To learn more, you can check out the live demo of our product and read more about us.

We are looking for a senior product developer who can confidently ship new features and evolve our system architecture at this growth stage. Our team is small (3 devs) and your impact will be big.

We use Elixir/Phoenix, React, PostgreSQL, ClickhouseDB, Terraform/Ansible

For more details, see: https://plausible.io/jobs/product-engineer

ukutaht commented on Use of Google Analytics declared illegal by French data protection authority cnil.fr/en/use-google-ana... · Posted by u/guillem_lefait

nickjj · 4 years ago

What type of server specs (memory, CPU, disk size, etc.) do you use to self host it?

Based on an open issue[0], it's suggested to run a server with 32GB+ of memory to handle hosting Clickhouse but that would mean self hosting Plausible would end up being $160 / month on DigitalOcean which would make it 10x more expensive than hosting my custom app that I want to see analytics for.

I know you can use less memory but it sounds like using less can result in an unpredictable environment where everything can stop working at any given moment depending on what Clickhouse wants to do. This happened to someone who replied in that issue. Their production set up stopped working because it ran out of memory.

Someone else wrote about it using close to 8GB of disk space to track ~8k page views at https://cyberhost.uk/plausible-3-month-review/. That was only written back in March 2021 too. They said they are going to look for an alternative solution because the storage costs are too high.

[0]: https://github.com/plausible/docs/issues/67

ukutaht · 4 years ago

Clickhouse has got a lot better in limited memory environments. They now recommend 4GB minimum.

The production environment that crashed due to Clickhouse OOM was our hosted product a while ago :) After that, we haven't had any downtime on our Clickhouse DB for over a year.

The issue with disk space stems from a bad default configuration. Clickhouse used to have EXTREMELY noisy debug level logging enabled by default with no rotation. This has been fixed in our hosting repo[1] so you get sensible defaults.

If you don't want to worry about downtime, planning disk space or compute capacity, then that's exactly what we offer at https://plausible.io. We process and keep the visitor data on our Hetzner servers in Germany.

1. https://github.com/plausible/hosting

ukutaht commented on Use of Google Analytics declared illegal by French data protection authority cnil.fr/en/use-google-ana... · Posted by u/guillem_lefait

NLMichel · 4 years ago

The powerful thing about GA is the link with Google Ads, does that work nice for Plausible as well?

ukutaht · 4 years ago

Plausible founder here. There's nothing automatic, but you can track your campaigns manually by tagging your ad links with utm_campaign.

Google has made sure that analytics for Google Ads works best within their own walled garden. Same with Facebook and Twitter with their Pixel products.

Instead of using the Referer header or utm parameters as intended, these large corps send obtuse random IDs (gclid, t.co/<id> links) which only they can correlate to an ad, search query or tweet using their internal database.

So until there is anti-trust action in this space towards more openness and competition, you're stuck with the ad provider if you want tight integration between ads and analytics.
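To make the UTM vs. click ID point concrete, here is a minimal Python sketch. It is not Plausible's code: the function name, the example.com URLs and the placeholder gclid value are all made up for illustration. It only shows that UTM tags are readable by whoever receives the request, while a gclid is an opaque token that means nothing without the ad network's own database.

```python
from urllib.parse import urlparse, parse_qs

# The five standard UTM tags; any analytics tool can read these directly.
UTM_KEYS = ("utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content")

# Click IDs that only the issuing ad network can resolve (gclid is from the comment above).
OPAQUE_KEYS = ("gclid",)


def describe_landing_url(url):
    """Split a landing URL's query string into readable UTM tags and opaque click IDs."""
    params = parse_qs(urlparse(url).query)
    utm = {key: params[key][0] for key in UTM_KEYS if key in params}
    opaque = {key: params[key][0] for key in OPAQUE_KEYS if key in params}
    return utm, opaque


# Hypothetical landing URLs: one manually tagged, one auto-tagged by the ad network.
tagged = "https://example.com/pricing?utm_source=newsletter&utm_medium=email&utm_campaign=spring_launch"
auto_tagged = "https://example.com/pricing?gclid=EXAMPLE_OPAQUE_ID"

print(describe_landing_url(tagged))
print(describe_landing_url(auto_tagged))
```

The first URL tells you the source, medium and campaign on its own. The second only gives you a token to hand back to the ad provider.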

ukutaht commented on Ask HN: Who is hiring? (February 2022) · Posted by u/whoishiring

ukutaht · 4 years ago

Plausible Analytics | Site Reliability Engineer | Full-time | Remote (global) | €60-100k

Hi HN! I'm the technical founder of Plausible Analytics [1], I've been the only full-time developer on the project since the beta in 2018. We are growing steadily and need to start upgrading our infrastructure to keep up with the demand. We're looking for help with:

  - Deploying and running our production workloads
  - Creating and testing our disaster recovery protocols. From single node failures all the way to whole datacenter failover scenarios
  - Defining our monitoring, alerting and incident response practices
  - Enabling horizontal scale-out of our application services and database systems
  - Automating operational tasks
  - etc

It's the most fun project I've ever worked on and I'm not just saying that because I started it. Our product is open source, we have great customers with household names and it feels great to be doing something about the abysmal state of online privacy. We have many self-hosters and get a lot of love and good vibes from the community.

The company is fully bootstrapped and independent from any investors. This means we can grow at a sustainable pace without having to satisfy external growth targets and timelines. We can take time to get things right and ship stuff that we're really proud of.

We are currently just 3 people, which means there's little of the bureaucracy and politics found in larger companies. Things go smoothly, you won't be sitting in meetings all day. It's a pure engineering role with ample time and space to actually focus on doing good work. There are pros and cons to how we work but we really enjoy working in a small remote team with tons of autonomy. This is how we plan to keep it.

If you're interested, do check out the full job description here: https://plausible.io/jobs/infrastructure-engineer

[1] https://plausible.io/

ukutaht commented on Is Google Analytics illegal in your country? isgoogleanalyticsillegal.... · Posted by u/james_impliu

sccxy · 4 years ago

Just replaced my Google Analytics with Plausible.

Self hosted docker and it is not blocked by adblocker (nginx custom js filename).

Only thing I miss at the moment is extensive GA history which is gone now, but Plausible is so much faster and simpler and maybe they implement history import someday.

ukutaht · 4 years ago

We're working on GA import at the moment[1], it will be the next big feature we land.

As with everything we will integrate and test with our large customer base on the hosted version and then release it for self-hosted as well. Next release is planned in Q2.

1: https://github.com/plausible/analytics/pull/1466

ukutaht commented on Who Contributed to PostgreSQL Development in 2020 and 2021? rhaas.blogspot.com/2022/0... · Posted by u/craigkerstiens

ukutaht · 4 years ago

We decided to create a donation fund to support open-source projects that we depend on last year[1]. We rely heavily on PostgreSQL and so just today I was trying to find the best way to contribute to the development of Postgres.

I didn't find much beyond sponsoring official community events[2]. The impression I got was that there are no paid core developers for PostgreSQL, is that correct? If so, what's the best way to support the project financially?

1. https://plausible.io/giving-back

2. https://www.postgresql.org/about/donate/

ukutaht commented on Ask HN: Good open source alternatives to Google Analytics? · Posted by u/TekMol

juriansluiman · 4 years ago

As stated by others already, there's Plausible (plausible.io) and Matomo (matomo.org).

I have used both and stuck with Plausible. A few reasons (subjective):

1. Plausible is GDPR compliant by default, it has an effective way to measure analytics throughout the day without cookies

2. It is simple and that's key. I don't need to know much, Plausible just gives me that

3. It's fairly lightweight. Matomo is quite heavy and as my VPS'es are pretty much scaled down, less is just more

4. The Plausible self-hosting doc is centered around Docker, which is the architecture I use myself and is set up in literally a few minutes

ukutaht · 4 years ago

Disclaimer: Plausible Analytics founder here

I think Matomo is quite similar to Google Analytics which many people feel is bloated and confusing from the user's perspective. The idea with Plausible is to simplify web analytics and make it more understandable compared to what GA/Matomo offer.

Granted, Matomo does have more depth and features in some areas. It can be the better choice if you want to go very deep into analytics and need some power features that Plausible might not support.

We wrote a little (clearly biased) comparison with Matomo[1]. I hope we're not too harsh on it because Matomo is a great project and still a good fit for many people. But obviously we feel like a modern and simplified take on web analytics fits better for the majority of website owners.

1. https://plausible.io/vs-matomo

ukutaht commented on Ask HN: Good open source alternatives to Google Analytics? · Posted by u/TekMol

ggoo · 4 years ago

I use plausible for my very low traffic side project, mostly because it's easy to host yourself and free if you do so.

https://github.com/plausible/analytics

ukutaht · 4 years ago

Thank you!

I'm the maintainer of the project and it's so heartwarming to see it being recommended on this forum.

All the projects mentioned here are great. What I think sets Plausible apart is that we've managed to create a profitable business around a 100% AGPL-licensed codebase (i.e. no dual-license for enterprise version). This means we can keep investing into the product and adding new features without being in the 'thankless OSS maintainer' role that so often ends in burnout.

We're currently working on importing historical data from Google Analytics into Plausible[1] which should make switching even easier for many folks. Stay tuned.

1. https://github.com/plausible/analytics/pull/1466

ukutaht commented on Taking on Google inthegood.co/taking-on-go... · Posted by u/rapnie

cyberlab · 5 years ago

Plausible is great, and I see the need for it, but I've always enjoyed using AWStats instead, as there is no need to add third party code to my site. It all happens in the background and it paints a much better picture of your stats since users can't block the gathering of stats with an AD-Blocker.

ukutaht · 5 years ago

Plausible developer here.

Interesting you say that. There's no reason Plausible could not be used like AWStats. Parsing logs is just a different ingestion mechanism and we already provide self-hosting via Docker. In principle it wouldn't be too difficult to drain your logs into a Plausible instance or just run it on the same host alongside your web server.

We ran a test last summer and found the stats from our JS-based tracker much much much more usable: https://plausible.io/blog/server-log-analysis

So this is why we haven't put too much effort in log analysis. The stats we got from AWStats were mostly bot traffic with no good way to get rid of them.
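For a rough idea of what log-based ingestion involves, here is a minimal Python sketch. It is not something Plausible ships. It assumes the common "combined" access log format, the sample lines are invented, and the user-agent hints are a naive guess that only catches bots which identify themselves, which is part of why log stats are so hard to clean up.

```python
import re
from collections import Counter

# Assumed input: the "combined" access log format used by default nginx/Apache setups.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Naive heuristic: substrings that self-identifying bots tend to put in their user agent.
BOT_HINTS = ("bot", "crawler", "spider", "curl", "python-requests")


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match the format."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def looks_like_bot(agent):
    """True if the user agent contains one of the hint substrings (case-insensitive)."""
    agent = agent.lower()
    return any(hint in agent for hint in BOT_HINTS)


def summarize(lines):
    """Count hits per category: 'bot', 'other' and 'unparsed'."""
    counts = Counter()
    for line in lines:
        hit = parse_line(line)
        if hit is None:
            counts["unparsed"] += 1
        elif looks_like_bot(hit["agent"]):
            counts["bot"] += 1
        else:
            counts["other"] += 1
    return counts


# Invented sample lines for illustration only.
SAMPLE = [
    '203.0.113.5 - - [10/Jun/2021:13:55:36 +0000] "GET /blog HTTP/1.1" 200 5120 '
    '"https://news.ycombinator.com/" "Mozilla/5.0 (X11; Linux x86_64) Firefox/89.0"',
    '198.51.100.7 - - [10/Jun/2021:13:55:37 +0000] "GET /robots.txt HTTP/1.1" 200 64 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    "not a log line",
]

print(summarize(SAMPLE))
```

Anything that fakes a browser user agent lands in "other", so the real bot share in a log is likely higher than what a filter like this reports.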

Have you run AWStats and Plausible side-by-side? Do you not have ~90% bots in your logs?