Show HN: Counter – Simple and free web analytics (counter.dev)
169 points by ihucos on March 13, 2023 | 81 comments


One of the saddest trends in web dev over the past decade or so is the shift away from hosting on a server you have access to and over to something opaque like Netlify or Vercel. Pretty much all of the data you can get from a JS-plugin-based third party like Counter is already sitting in your Apache/nginx/IIS log files if you host your site on a server you control, without the need to stuff yet another piece of client-side code that has no benefit to the user into the HTML.

It's not likely to change so I'm just yelling at a cloud right now, but website stats is one of the biggest losses in modern web dev so it bugs me. Maybe more than it should. For what it's worth, Counter looks to be a much better option than Google-"here's 200KB of glacially slow JS because screw users"-Tag-Manager.


> but website stats is one of the biggest losses in modern web dev

That seems a bit overly dramatic. If you look at your log files these days you'll see that a high percentage of the traffic is just scrapers, bots, scripts trying to access wp-admin, etc.

Collecting this information somewhere else (instrumented backend, client-side script) makes a lot more sense now, where you can filter out noise more easily. There are also very light client-side scripts like https://plausible.io which are a nice trade-off between privacy and useful information while not being too heavy.


Request logs are a source of truth for certain metrics. A tracker service may get you more metadata than the request line in the logs, but something like 25% of internet users use ad block, which often blocks trackers as well (I block plausible). Not seeing metrics for bots, scrapers, and x% of users can really mess up certain metrics.


25% seems way too high if you are not talking about some tech-audience-focused website. Especially on mobile, not many people install an ad blocker; I would be surprised if it's more than 1% globally.


I'm not sure about the number, but I was quoting https://www.statista.com/statistics/804008/ad-blocking-reach...


Plausible is not lightweight. Its codebase is large for the problem it's solving. JavaScript is not the way for analytics.


I actually agree. And with more and more users (personal impression) using blockers like uBlock Origin, the meaningfulness of data collected with JS gets eroded. But if there is no alternative, then what some web analytics providers offer (counter.dev doesn't) are means to try to circumvent the blockers (for example, suggesting proxies: https://plausible.io/docs/proxy/introduction). This leads to what I would call an unhealthy ecosystem.

That being said, log files are not the ultimate solution either, as non-techies would have more difficulty handling them.

So I actually agree the current situation is suboptimal. Spinning this thought further, I see "honest" analytics players being pressured to circumvent the blockers to stay relevant, but ultimately the blockers have the upper hand and it's a game I don't want to play, so yeahhh...

A consensus between blockers and web analytics providers that tracking really just the simplest metrics is important for, e.g., a yoga studio with its website might actually be difficult to reach because... users might just choose the most aggressive blockers, since more is always better.

I am eager to see how it will play out. But generally it all goes slow.


"So I actually agree the current situation is suboptimal. Spinning this thought more I see "Honest" analytics players being pressured to circumvent the blockers to stay relevant but ultimately the blockers have the upper hand and its a game I don't want to play so yeahhh...."

This isn't entirely true. Blockers only work if you use a JavaScript trigger which isn't necessary for a self-hosted solution.


Web analytics solutions for self-hosting without using JavaScript are not plug-and-play and not as easy to integrate. So if you have log files, maybe it's easier. If not, you need a middleware for your specific web framework, or some kind of proxy to route your entire traffic through, which is not good for performance. I don't know, how would you do it?


"I don't know, how would you do it?"

Capture and process HTTP headers on-the-fly. Either at the server level or the (web) framework level. If done efficiently, it outperforms JavaScript.
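At the framework level it can be as small as a single middleware. A minimal sketch, assuming Express on Node (recordHit() is a hypothetical stand-in for whatever storage you use):

  // Minimal sketch of framework-level capture, assuming Express.
  const express = require("express");
  const app = express();

  // Hypothetical storage layer: append hits wherever you like.
  function recordHit(hit) {
    console.log(JSON.stringify(hit));
  }

  app.use((req, res, next) => {
    // Everything a client-side tracker would send is already here.
    recordHit({
      path: req.path,
      referrer: req.get("referer") || "",
      userAgent: req.get("user-agent") || "",
      language: req.get("accept-language") || "",
      time: Date.now(),
    });
    next(); // analytics never blocks the actual request
  });

  app.listen(3000);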


Yep, I believe technically you can get better results with such an approach. It's just much harder to do, and the tooling for this approach is not as mature as client-side tracking.

The main hurdle is that every code base would need a different kind of integration. Or with a proxy you lose overall performance. And complexity would also be increased. But as said, more accurate tracking could be possible. It has its pros and cons :-)


I've built this solution from scratch and am currently pairing it with a custom/prototype OODB.

These are the tradeoffs as I see them and based on my experience building:

In a self-hosted configuration, advanced analytics can be captured without JavaScript. It's unblockable, transparent, and incredibly fast. This is a stack specific solution.

In a cloud configuration, it requires a JavaScript trigger. With JavaScript, each capture (on a dev machine) takes 70ms including storage (cloud db). Performance is good but blocking is now possible. This is a stack agnostic solution.

One of the above suits me as a user; the other as an entrepreneur.


I think mostly we are actually on the same page. One thing I would add is that starting the whole web analytics endeavour from scratch needs a lot of work, testing and iterations. I am actually thinking that it would make sense for counter.dev to offer an API so that you can use it as a backend middleware or in some other way.

> One of the above suits me as a user; [...]

That might be true. But roughly speaking, let's say that for over 99% of people, just reading the term "tracking script" already switches off their brain. If you start with "code-base-specific middleware addition" or "deploy your own tracking HTTP proxy" or something like this, you only reach knowledgeable techies, which, to be fair, could be fine for a more specialised product.

> [...] the other as an entrepreneur.

With that thinking, for me it wouldn't make much sense to have offered and maintained the service basically for free for some years already. At least I can cover the hosting costs. So yes, but also a little bit no :-)


"I think mostly we are actually on the same page."

Agreed.

I think the problem facing analytics is that the ideal solution and ideal business require fundamentally different products.


Wasn’t part of the drive to client-side analytics an effort to improve data quality, in particular to differentiate bots from humans, and to measure actual human analytics without getting caught by caches along the way?

If you use something like Cloudflare you can also get some of that serverside logging back.

And netlify and Vercel both have first class analytics features.


> Wasn’t part of the drive to client side analytics an effort to improve data quality, [...]

Interesting, I did not know that narrative. But what I can tell from subjective experience is that bots aren't so much of a problem with client-side analytics. counter.dev filters them out by not logging very short page views, by the way. For me the bigger challenge with client-side analytics is not being able to track clients which are using a tracking blocker. Which I guess is the end user's right to use (I even use uBlock Origin myself). But if you start missing roughly 50% of page visits, it starts becoming an issue for website owners. The data does not need to be detailed, just accurate enough.

Web analytics on hosters... yeah, if they fit your use case then great, but for me that is vendor lock-in and I would avoid it if possible. And web analytics is more or less a topic of its own that I'd prefer to leave to a specialised solution. But obviously I am biased, haha.


Since then, most bots have abandoned libcurl and moved on to using something like headless Chrome to get around bot-mitigation techniques, so the playing field has evened significantly.


And ubiquitous HTTPS has dramatically cut down on caches that sit in the middle, so you only really have to worry about the impact of the browser cache on your analytics.


I've thought about using Cloudflare Workers to build a proxy that would do user tracking. Not sure if that is something people would want to do, but it effectively gives you a JavaScript-free way to track your visitors while still using whatever CDN host, like Netlify, they want. The challenge would be getting users to change their website's DNS records to a visitor tracking service.
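Roughly, such a Worker could look like this sketch (the Worker API calls are real; the analytics endpoint is made up):

  // Sketch of a tracking proxy as a Cloudflare Worker: pass the request
  // through to the origin, log it out-of-band so the visitor never waits.
  export default {
    async fetch(request, env, ctx) {
      const response = await fetch(request); // pass through to the origin

      ctx.waitUntil(
        fetch("https://analytics.example.com/hit", { // hypothetical endpoint
          method: "POST",
          body: JSON.stringify({
            url: request.url,
            referrer: request.headers.get("referer") || "",
            userAgent: request.headers.get("user-agent") || "",
            country: request.cf && request.cf.country, // set by Cloudflare
          }),
        })
      );

      return response;
    },
  };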


I guess at some point most of us will have to yell at clouds. Whether that changes anything or not remains debatable.

I think the biggest challenge we have today is that building on SaaS and huge dependency trees is very attractive to somebody eager to get an idea off the ground. It has a typically low barrier to entry (as long as you have a PayPal account) and you don't have to deal with "low level stuff". Unfortunately it typically also comes with vendor lock-in issues, but by the time you realise that, you are already way too far down the road.

The fun thing is, anybody with a little bit of git and Docker know-how can have a better developer experience hosting their web projects on a VM, so maybe this is an issue of fundamentals?


For those still hosting websites themselves, is there any modern web stats analyzer you could recommend? AWStats or Webalizer look so dated.


I've been using GoAccess because of this exact line of thinking (logs over a JS pixel tracker). GoAccess comes with a really nice TUI, a built-in web server, and can export to CSV and other formats. It's pretty robust. You just pipe logs right into it and it starts crunching.
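For reference, a typical invocation (assuming a combined-format nginx log at the usual path):

  # one-off HTML report
  goaccess /var/log/nginx/access.log --log-format=COMBINED -o report.html

  # or pipe a live log straight into the TUI
  tail -f /var/log/nginx/access.log | goaccess --log-format=COMBINED -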


I had a manager ask to include Google Analytics on a government-funded site a couple of weeks ago.

Instead I sent nice HTML reports generated by GoAccess from access logs.

Highly recommend.


Somebody mentioned GoAccess. I haven't tried it but it looks good imo.


What’s the difference between this and my Grandad’s nostalgia for 1950s tractors, or his Grandad’s nostalgia for 1880s ploughs?


None whatsoever. In fact that's a great analogy. Modern tractors are incredibly unfriendly to consumers - they're closed source, not open to repairs, expensive, and massively over-engineered. They offer some superficial UX improvements but nothing you can't live without. John Deere's shenanigans get posted to HN regularly. As a company they're one of the reasons the US government is considering laws to protect consumers' access to things they've bought. I don't think anyone would find it controversial that someone might want a simpler, easier, repairable, accessible tractor in light of that.

A little nostalgia for a time when things were actually better for the end user is exactly where my post is coming from. Going back to analysing traffic using server logs without stuffing more JS into every website seems reasonable to me.


I am maintaining counter.dev - If you have any questions, don't hesitate to ask :-)


Why not offer static content hosting like GitLab Pages and give us the logs on that?

Cut out the JS beacon middleman. ...I mean, I myself block your beacons and didn't even know it before looking just now.

-- someone who just looked at recent VPS prices for a small personal static site and went with GitLab "hosting".


That is a completely valid technical solution if you can go through the effort to do it. I'd recommend GoAccess; it looks good, but I have not used it myself. It's just that it can be cumbersome, and a lot of people have websites but are not techies, for example.


You can find a more complete list of European-based web analytics and other tools here: https://european-alternatives.eu/category/web-analytics-serv...

(not my website)


Ha, thanks, I did not know that site. I will submit counter.dev to it. counter.dev is hosted in Frankfurt, Germany, and tries to comply with German laws.


Thanks a lot :)), I didn't know this website!


In case anybody cares, you'll also find mine there (pirsch.io).


I've been using Counter for a few of my small sites for a number of months now and I find it is non-invasive and does a good job of providing basic analytics. Thanks for making this tool available for free!


<3


Looks cool. Any chance you would provide a docker-compose.yml file for self hosting? Right now it's not really simple to do so.



Yes, I would like to do it. But to be quite honest there are other priorities on top of that. Of course we're happy to accept pull requests though!


I think self-hosting is the future of analytics for the reasons described in many of the other comments: ad blockers, browser privacy settings and content filters are here to stay and will only become more powerful. If we can’t self-host, it’s going out with the tide.


Related:

Show HN: Counter – Simple and Free Web Analytics - https://news.ycombinator.com/item?id=26379569 - March 2021 (77 comments)

Edit: reposts are fine after a year or so: https://news.ycombinator.com/newsfaq.html. Lists of "related" links are just for curiosity.


All planned :-)


Oof, I can’t load the page because it’s blocked by my Pi-Hole.


I noticed the other day that blocklists really don’t like the word “counter”. I had a bunch of simple demos, one of which was a counter, and it was the only one that didn’t work. Turned out uBlock was preventing it from loading.


This is a never-ending problem with web applications that use so-called "routing" (where the client is made to make connections to a variety of URLs, which gets blocked if the URL contains "ad" or similar).


That's not called "routing"


Same. The whole domain shows up in the list. I have too many lists right now to know which is blocking it. Does Pi-Hole have an easy way to see which list a domain comes from?

edit: it's in the https://v.firebog.net/hosts/AdguardDNS.txt adlist.


Under „Tools“ > „Search Adlists“.


Hidden in plain sight :>

counter.dev is in the https://v.firebog.net/hosts/AdguardDNS.txt adlist.

It seems someone else's mention of adlists not liking the word 'counter' is true. There are almost 300 domains with 'counter' in them just in that list.



I would prefer to selfhost all my analytics. Any plans for that?



Yes, but not very soon. The thing is that counter.dev is optimised to handle many users for free. For example, Redis is used as the primary database. But if you are actually paying for the server resources, SQL is just fine. The service is optimised for efficiency, and my concern would be that having some kind of "white label" self-hosting one-click solution would bring in too much complexity, which would go against the aim of saving server resources in order to offer the service for free. Maybe I am wrong, but that is my current thinking.

That being said, you can technically self-host, and I am happy to answer questions at hey@counter.dev . It's just that this solution is not designed to be self-hosted, and with another web analytics solution the self-hosting experience could be more streamlined.

One thing that would actually make sense to me is having a conventional backend that plugs and plays with the awesome counter.dev UI - then the "only" codebase change would be to whitelabel the UI. But yeah, that might not happen anytime soon.


It would be nice to see comparisons to other options in this space—e.g. I’m currently self hosting umami for analytics, and it does what I need it to (I think anyway—I’m a software engineer and not a marketer/SEO!)

How does Counter compare? I couldn’t grok that from the website or GitHub repo


Looks interesting. I've been absolutely loving GoAccess. Never going back to Google Analytics. How does this compare?


I would say comparing GoAccess and counter.dev is a little bit like comparing apples and oranges. GoAccess analyses log files, which is the more robust approach, I would say. I wonder how GoAccess determines unique visitors; I guess that is mapped directly to the IP address, which should be fairly accurate. An advantage with GoAccess as a website owner is that you can see traffic coming in from users of things like uBlock Origin, which is pretty widespread actually.

With counter.dev you have some metrics that you do not have with GoAccess, like for example device type, the country of the users, and some other minor things. But the biggest advantage is that it is easier to integrate. There are a lot of people with something like a WordPress site for their restaurant who email me asking how to add counter.dev's tracking script to their site. Many people don't have access to their log files, and I think it would be impractical for them to literally compile their analytics tool.

So if you are happy with GoAccess and have it set up you don't need to change.

But if you want to try counter.dev, you get your tracking script right after registering with a username and a password. Then put the tracking script in your site, and that's it.


Looks good, but it needs a WordPress plugin for real proliferation.


How the hell are you getting unique visitors without cookies or IP addresses? I call bullshit.


From the GitHub page:

> Counting unique users is achieved with a combination of relying on sessionStorage facilities, the browser's cache mechanism and inspecting the referrer. Using this technique considerably reduces the complexity and load on the server while improving data privacy at the cost of knowing less about users. We can't and don't want to be able to connect single page views to an user identity.


This is not directly related to Counter, but here is a nice explanation of how GoatCounter[1] has implemented session tracking: https://github.com/arp242/goatcounter/blob/master/docs/sessi...

[1]: https://www.goatcounter.com


Also a very interesting approach. The difference is that counter.dev does not need any hashes or any unique IDs at all, which saves complexity and avoids needing to explain why our specific way of using hashes or IDs is privacy-friendly. As we have none.


It can be done without cookies. I built a simple visitor-stats module for my website package, based on the solution of Simple Analytics. They explained that they distinguish unique from non-unique visitors by checking if the referrer is your own site. If so, it's not a unique visitor. Of course there are edge cases, but for my case it works more than well enough. I thought it was a brilliant solution from Simple Analytics and I figured I could build something like that myself. It was also fun and I learned a lot.


Guid stored in local storage?


Hello I am the counter.dev maintainer.

No! To see the complete tracking script in all its glory, check it out here: https://github.com/ihucos/counter.dev/blob/master/docs/scrip...

The relevant part is here:

  sessionStorage.setItem("_swa", "1");

With that, the script marks that the visit was tracked, and no further HTTP requests to track the unique view are sent.


I appreciate that the whole tracking script fits on my smartphone screen.


Tomato, tomato. An httpOnly cookie (as a tracking cookie should be) is just as visible as localStorage. So sure, technically correct, not a cookie, but does the name matter when the result is identical?

"Hey, I got an idea how we get rid of those pesky cookie banners, let's drop the cookies, no.. wait.. hear me out. We store the id... in localstorage! Brilliant!" POW Confetti.

It's cookies.


It's not cookies. It can't be used to track you across sites the way that a cookie can, where all the embedding site needs to do is ping the site the cookie is for, and it's done. How do multiple sites access the same localStorage?


> How do multiple sites access the same localStorage?

With an iframe. Although Safari already has protections in place against this scenario: https://webkit.org/tracking-prevention/#partitioned-third-pa...
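As a sketch with hypothetical domains: every embedding site loads the same third-party page, which reads its own localStorage and posts the shared id back up to the parent.

  // On the embedding site (tracker.example.com is hypothetical):
  const frame = document.createElement("iframe");
  frame.src = "https://tracker.example.com/sync.html";
  frame.style.display = "none";
  document.body.appendChild(frame);

  window.addEventListener("message", (event) => {
    if (event.origin !== "https://tracker.example.com") return;
    console.log("shared cross-site id:", event.data.id);
  });

  // Inside sync.html on tracker.example.com:
  //   let id = localStorage.getItem("id") || crypto.randomUUID();
  //   localStorage.setItem("id", id);
  //   parent.postMessage({ id: id }, "*");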


I support this behavior, but I wish it allowed users to opt-out per domain or something. The current behavior breaks localStorage on HTML5 games on itch.io (for iPad users, mostly) and there’s no real workaround (other than asking users to disable the setting entirely—something I don’t want them to do). It’s not even possible to detect this case and warn users, so they just think the games are broken…


Cookies can't really track you across sites reliably these days either. And in just the next couple of years, third-party cookies will be almost entirely disabled by default in all major browsers.


Is that really how httpOnly cookies work, though? That and cross-origin policy are supposed to solve that, but I might not be aware of the workaround you're referring to. I did a search that turned up nothing; do you have a source?

I can't try myself right now. But last time I checked it's not at all that easy. What am I missing?


Well, no, but you wouldn't set httpOnly if you wanted to do that


Hello worksonmine,

please see my response above for a more technical elaboration on how it works. It is not the same as cookies.

Cheers


It's not really a more technical explanation. I get how localStorage and cookies are different. But it's a technicality the user doesn't care about. The problem is third-party cookies.

Self-hosted solutions where only the site I visit tracks me are not a problem, even for me as a privacy extremist. But calling it "no cookies" while putting the id in localStorage is misleading, or misunderstands the problem people have with cookies in the first place.

Don't get me wrong, I like the effort and was looking for something similar before just implementing my own solution with regular nginx logs. But this is a space where trust is important.


counter.dev does not use any id at all for tracking.


No? How do you identify a specific browser? What's in these values?

  var id = document.currentScript.getAttribute("data-id");
  new URLSearchParams({
    id: id,
  })
You send it to the server (so I assume it can be used to identify a specific device) and call it an id in the code. Can't wait to see how you are going to try to wrap it to tell me it's technically not an id but...


Looking at the code, that seems to be the id of the website, not of the user. You might have multiple websites on one instance of Counter, and you just need to let the server know on which one to record the view.
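For illustration, the embed looks roughly like this (the exact src is an assumption), so every visitor of a given site sends the same data-id value:

  <!-- shape of the embed, src assumed: data-id names the registered
       site, so every visitor sends the same value -->
  <script src="https://cdn.counter.dev/script.js" data-id="your-site-id"></script>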

IIRC, GDPR doesn't really care about cookies, but about storing information on the user's machine. sessionStorage would still fall under this classification. So it doesn't really matter if they are different or not.

That said, it does seem like Counter does not use user ids.


I don't see how unique visitors and no identifier are both possible. One of them is not true.


From what I can tell, it works like this:

- the script is loaded

- check if sessionStorage has the '_swa' item

- if no, then hit the /track endpoint (this I assume is to count unique visitors). Then set the '_swa' item.

- hit the /trackpage endpoint (this one just increases every time)

- On next page load, the _swa item will be set in the sessionStorage, so /track will not be hit again.

That's about it from what I can see.

I think the main point is that the server doesn't do any tracking itself; it just keeps a number that increases when you hit /track or /trackpage.
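In code, the flow above boils down to something like this sketch (not the actual script; whether it uses sendBeacon, fetch or an image request is an implementation detail I haven't checked):

  // Simplified sketch of the described flow (not the actual script).
  const base = "https://counter.dev"; // base URL assumed for the sketch

  if (!sessionStorage.getItem("_swa")) {
    // first page load this session: count one unique visitor
    navigator.sendBeacon(base + "/track");
    sessionStorage.setItem("_swa", "1");
  }
  // every page load: count one page view
  navigator.sendBeacon(base + "/trackpage");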


Right, and being sessionStorage it's cleared on browser close, and the next time I visit I will be counted as another daily unique visitor right? Trivial to work around using localStorage (or why not just a cookie) and a timer. So sure, not an id but a visited/not visited flag.

I personally would rather have the pages I visit use a self-hosted solution gather everything I do, instead of a third-party getting little data from many sites I use. If this script is used across many sites it can be checked server-side against my IP to get my usage. I can never verify what logs they keep and for how long.

That's the problem people tend to care about, not the cookies themselves.


> Right, and being sessionStorage it's cleared on browser close, and the next time I visit I will be counted as another daily unique visitor right?

No! There are a few rudimentary mechanisms layered on top of each other in case one of them fails as you described. The /track endpoint sets up HTTP caching, so if sessionStorage fails, you still have that. Then there is also inspecting document.referrer: if it is the page you are already on, then it's definitely not a unique visit.
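Roughly, the referrer fallback amounts to this (a sketch, not the actual code):

  // A visit coming from the same site is never counted as a new unique
  // visitor, even if sessionStorage was lost.
  let sameSiteReferrer = false;
  try {
    sameSiteReferrer =
      document.referrer !== "" &&
      new URL(document.referrer).hostname === location.hostname;
  } catch (e) {
    // malformed referrer: treat it as external
  }
  const isUniqueVisit = !sameSiteReferrer && !sessionStorage.getItem("_swa");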

I get regular user feedback on all kinds of stuff, and unique visits being counted double has not been reported even once so far.

> (or why not just a cookie)

Because cookies are considered "bad". But technically, just saving a boolean value in a cookie would not be worse from a privacy perspective than using sessionStorage for a boolean value.

> I personally would rather have the pages I visit use a self-hosted solution gather everything I do, instead of a third-party getting little data from many sites I use. If this script is used across many sites it can be checked server-side against my IP to get my usage. I can never verify what logs they keep and for how long.

That is a general problem with externally hosted services. You can audit the source code (https://github.com/ihucos/counter.dev), but there is no way to verify that my deployment is as stated. I once heard on a podcast that web hosters could guarantee that a deployment is in a specific state and contains a specific code base revision. But such solutions unfortunately do not exist. If you really want to be sure, self-hosting is the way to go (but somewhat cumbersome).


Thanks, I sometimes have trouble explaining how it works, but this is it! That is basically how it works (there are some hidden extras though).



