30 comments

  • vextea2 days ago
    There seem to be some mentions of selling licenses (and pricing) in the source. What are the plans around that?<p><a href="https:&#x2F;&#x2F;github.com&#x2F;vinceanalytics&#x2F;vince&#x2F;blob&#x2F;f0c2c3cc38cbd8c6263993163e102dd339d2f44d&#x2F;internal&#x2F;web&#x2F;templates&#x2F;stats&#x2F;site_locked.html#L19">https:&#x2F;&#x2F;github.com&#x2F;vinceanalytics&#x2F;vince&#x2F;blob&#x2F;f0c2c3cc38cbd8c...</a>
    • gernest2 days ago
      When I started working on vince about 3 years ago, I thought I could bootstrap a sustainable business.<p>My dream for a business is practically dead now. That snippet is a relic of the early days of vince and I will remove it.<p>I am currently looking for work, and will be maintaining vince as usual (I do a lot of open source stuff) since I also use it with my hobby projects.<p>I&#x27;m struggling to find remote roles now, since remote now means Remote US or Remote EU and I&#x27;m stuck here in Tanzania.<p>So, don&#x27;t worry, I also use vince so I will keep hacking on it.
      • vextea2 days ago
        Makes sense, wish you the best of luck!
  • zoidb3 days ago
    My go-to self hosted GA alternative is goatcounter <a href="https:&#x2F;&#x2F;www.goatcounter.com" rel="nofollow">https:&#x2F;&#x2F;www.goatcounter.com</a>. It would be interesting to know what advantages vince has over it.
    • huhtenberg2 days ago
      Does it allow filtering the visited page list by a specific referrer and vice versa?
      • zoidb2 days ago
        Yes, it does if I understand what you mean. You can see the traffic distribution (what paths were accessed) broken down by referrer.
    • james-bcn3 days ago
      Oh I like that main dashboard. Very simple.
      • TravisPeacock3 days ago
        If you like that there is <a href="https:&#x2F;&#x2F;www.piratepx.com&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.piratepx.com&#x2F;</a> which is even more minimal (though less data), I also built something even MORE minimal (only API calls) <a href="https:&#x2F;&#x2F;github.com&#x2F;teamcoltra&#x2F;ninjapx">https:&#x2F;&#x2F;github.com&#x2F;teamcoltra&#x2F;ninjapx</a> but I&#x27;m certainly not recommending it. It is super simplistic (also the readme is embarrassing)
    • sandeepthroat2 days ago
      [dead]
  • written-beyond2 days ago
    Code quality is pristine, really great job! I see that you&#x27;ve used protocol buffers, can you expand on why? I am aware of the benefits it offers but I think it adds a bit of mental overhead initially due to it being an additional type system you have to understand.<p>Also why are you using pebble exactly? I was interested in seeing how you&#x27;re managing your geo databases because that&#x27;s usually the most mind numbing part of handling analytics if your cloud provider doesn&#x27;t add that information into the request header already. However, I can&#x27;t understand why you&#x27;d use pebble over something like sqlite.
    • gernest2 days ago
      Thanks,<p>&gt; Why protocol buffers?<p>They are very good for defining API boundaries. In vince we only use them for configuration and admin structures. We use Roaring Bitmap based storage, so the fundamental units persisted are bitmap containers.<p>&gt; Also why are you using pebble exactly?<p>Well, vince is write heavy, and any LSM based key-value store would have been nice. It happens that pebble is the best option for us.<p>Also, we don&#x27;t use transactions (we batch writes and use snapshots for reads), and we rely on pebble&#x27;s batch Merge API.<p>The Merge API allows us to do efficient updates. Since we only store bitmap containers, an update is just a container union of the observed values of a key.<p>Bitmap unions are pretty fast and efficient.<p>I hope I covered all your questions.
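A rough illustration of the merge-by-union idea described above. This is not vince's or Pebble's actual code: a plain dict stands in for the LSM store, Python ints stand in for Roaring Bitmap containers, and every name here (`MergeStore`, `merge_batch`, `bitmap_of`, `cardinality`) is hypothetical.

```python
from collections import defaultdict


def bitmap_of(ids):
    """Build a bitmap (stored as a Python int) with one bit set per id."""
    bm = 0
    for i in ids:
        bm |= 1 << i
    return bm


def cardinality(bm):
    """Number of set bits, i.e. number of distinct ids in the bitmap."""
    return bin(bm).count("1")


class MergeStore:
    """Toy key-value store whose only write operation is a merge (union)."""

    def __init__(self):
        # key -> bitmap; a missing key behaves like an empty bitmap
        self._data = defaultdict(int)

    def merge_batch(self, batch):
        # Each write is a union with whatever is already stored for the key,
        # so the writer never has to read the old value first.
        for key, bitmap in batch:
            self._data[key] |= bitmap

    def get(self, key):
        return self._data.get(key, 0)


store = MergeStore()
store.merge_batch([
    ("page:/home", bitmap_of([1, 2])),
    ("page:/home", bitmap_of([2, 5])),
])
print(cardinality(store.get("page:/home")))
```

Because union is commutative and idempotent, batched writes can be applied in any order and replayed safely, which is presumably part of why a merge-style API suits a write-heavy workload.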
      • written-beyond2 days ago
        It answered them alright, but really opened a few hundred more. I appreciate your time!
  • just-tom3 days ago
    The screenshot on your homepage looks very similar to plausible&#x27;s <a href="https:&#x2F;&#x2F;plausible.io&#x2F;" rel="nofollow">https:&#x2F;&#x2F;plausible.io&#x2F;</a> which is also open-source analytics software. Is it based on it? What are the differences?<p>Edit: Just noticed the feature comparison in the readme.
    • dewey3 days ago
      Also Plausible is almost stock TailwindUI elements + including the default color, so many sites look like that.
  • rgbrgb3 days ago
    &gt; Full dashboard demo hosted on 6$ vultr instance <a href="https:&#x2F;&#x2F;demo.vinceanalytics.com&#x2F;share&#x2F;vinceanalytics.com&#x2F;v1&#x2F;" rel="nofollow">https:&#x2F;&#x2F;demo.vinceanalytics.com&#x2F;share&#x2F;vinceanalytics.com&#x2F;v1&#x2F;</a>...<p>404 page not found
    • thangngoc893 days ago
      I found a link from github <a href="https:&#x2F;&#x2F;demo.vinceanalytics.com&#x2F;v1&#x2F;share&#x2F;vinceanalytics.com?auth=Ls9tV4pzqOn7BJ7-&amp;demo=true" rel="nofollow">https:&#x2F;&#x2F;demo.vinceanalytics.com&#x2F;v1&#x2F;share&#x2F;vinceanalytics.com?...</a>
  • colesantiago3 days ago
    Great project, keep it up! It&#x27;s good to see competition in this space.<p>Plausible gets crazy expensive on their hosted option and it is complex to set up (needs elixir + high memory requirements).<p>If Vince gets 1:1 parity with plausible and has the option to use clickhouse, I&#x27;ll consider moving a few servers and people I know over.<p>Love that Vince is also a single binary.
  • pdyc3 days ago
    Looks exactly like plausible, maybe change the UI a bit to avoid legal issues.
    • carlosjobim3 days ago
      I was going to say that it looks exactly like BeamAnalytics, and now I&#x27;m confused as to who&#x27;s copying who...
      • serial_dev2 days ago
        I&#x27;m wondering when copying becomes just following industry best practices...<p>Twitter, Threads, Mastodon, Bluesky all look the same. Project management apps all reuse the same UI patterns. The &quot;AI&quot; logo looked pretty much the same for all companies for a while. Video sharing websites all use YouTube&#x27;s layout. Forums like Reddit and HN share quite a lot in their looks.<p>If you want to display website analytics, you will want to show the most important metrics at a glance, you&#x27;ll need graphs showing visitors over time, top sources and pages... There is only so much you can do to display those and have users understand what&#x27;s going on on your website.
      • dewey3 days ago
        Because everyone is using: <a href="https:&#x2F;&#x2F;tailwindui.com&#x2F;components#product-marketing" rel="nofollow">https:&#x2F;&#x2F;tailwindui.com&#x2F;components#product-marketing</a>
        • huhtenberg2 days ago
          It&#x27;s not just the looks that are the same. The UX &#x2F; mechanics are way too similar too, e.g. how you can apply filters (by URL, by referrer, by browser, etc.) to narrow down the stats view.
          • rkuodys2 days ago
            I would say the idea is pretty much as follows: &quot;Let&#x27;s do it so the user would know how to use it before we are big&quot;, and once you&#x27;re big enough, you can set the trend. But at the beginning it&#x27;s just not worth it and highly risky.
    • NelsonMinar2 days ago
      What legal issues are you imagining?
  • kukkeliskuu2 days ago
    This is great. For me the commercial Plausible is just not plausible. I have a site with 2M page views, with most of the pages cached, which keeps the server costs minimal, I pay around 50 USD per month. I don&#x27;t get much revenue from the site. I want to show visit counts on the site. For 2M page views, Plausible (with the stats API) would cost 189 USD per month, quadrupling my costs.
    • gernest2 days ago
      This is one of the reasons I created vince.<p>For reference, the demo is hosted on a 6$ vultr instance; over the last 3 days it handled about 11.9K pageviews with 4.3K unique visitors.<p>I have just checked the vultr dashboard.<p>Bandwidth = 3.37 GB, vCPU usage = 1% (yep, one percent), current charges = 1.06$.<p>The majority of the bandwidth is for outgoing data serving the dashboard.<p>I carefully designed vince to be extremely efficient for web analytics workloads.<p>Please give vince a try.
    • openplatypus1 day ago
      Hi, just FYI, Wide Angle Analytics (my product) will cost you between 30 and 90 EUR per month for 1M and 10M respectively.<p>There are many web analytics providers with surprisingly high prices.<p>We are cheaper and are even planning to create a free tier by making smart use of resources and avoiding overpriced cloud providers.
    • maeil1 day ago
      2M page views and not much revenue does sound like a choice. I have no affiliation to Plausible but 2M pageviews per month has such high revenue potential that if you monetized it (which frankly is the logical assumption they&#x27;d operate on), $189 a month would be a trivial expense.
      • kukkeliskuu3 hours ago
        You are partly correct, although it really depends. My site is in Finnish, which makes Google AdSense the only really viable option, unless I want to spend lots of time finding affiliate marketing revenue. That pays approximately 1.3 euros per 1000 page views, and does not work well with mobile page views on my site. I get 2M page views on high season, now it is off-season and visit counts are lower. I really get only around 20 euros per day on ad revenue, which makes around 600 euros per month. 200 euros per month cost is not &quot;trivial&quot;. I have some other revenue, but that is small as well. Header bidding companies are interested to work with you if you have 5M+ page views per month. On a longer term, I think there is potential, but sure, I have made the decision to make the site foremost a public service, and revenue is secondary.
  • gonafr2 days ago
    How does this compare to umami (<a href="https:&#x2F;&#x2F;umami.is&#x2F;" rel="nofollow">https:&#x2F;&#x2F;umami.is&#x2F;</a>)?
    • arcastroe2 days ago
      I&#x27;m also interested in this. They seem to have very similar UI
  • brokegrammer3 days ago
    This is amazing! I self host Plausible but don&#x27;t like depending on Clickhouse and Postgres because they&#x27;re annoying to upgrade.<p>What kind of database is this using though? I don&#x27;t know enough Go to figure it out from the source.
    • tricked3 days ago
      I checked the go.mod and it seems to be importing a module named pebble by cockroachdb. I assume that&#x27;s where everything is stored.<p><a href="https:&#x2F;&#x2F;github.com&#x2F;cockroachdb&#x2F;pebble">https:&#x2F;&#x2F;github.com&#x2F;cockroachdb&#x2F;pebble</a>
    • akshayshah3 days ago
      It uses Pebble, the key-value store that backs CockroachDB.
      • colesantiago3 days ago
        Just saw this notice:<p>&gt; WARNING: Pebble may silently corrupt data or behave incorrectly if used with a RocksDB database that uses a feature Pebble doesn&#x27;t support. Caveat emptor!<p>Slightly worrying for running this in prod right now if there is a risk of silent data corruption, but hopefully in a few years Vince will have drivers for Postgres &#x2F; Clickhouse.
        • rickette3 days ago
          This just warns about using Pebble with an existing RocksDB which isn&#x27;t the case here. Pebble powers CockroachDB which is a Serious Database.
          • kamikazechaser6 hours ago
            And Ethereum&#x27;s state store. Which is an even more serious &quot;database&quot;.
        • dangoodmanUT3 days ago
          Reread the sentence, it says if you mix it with RocksDB (another database that has compatible file formats)
  • slyall2 days ago
    Going through the docs, I find you don&#x27;t actually have a bit about how to make your website use it. I mean, I can work it out, and it&#x27;ll be obvious to proper front end developers, but at no point do you say:<p>&quot;Add the following line to your page source to send data to Vince&quot;
  • lovegrenoble3 days ago
    Is this a Plausible clone? <a href="https:&#x2F;&#x2F;plausible.io" rel="nofollow">https:&#x2F;&#x2F;plausible.io</a>
    • __jonas3 days ago
      From the Readme:<p>&gt; vince started as a Go port of plausible with a focus on self hosting.
  • lomkju2 days ago
    Nice work! Very easy to install and use.<p>I deployed this on our cloud (excloud.in) in less than 2 mins.<p>Anyone can use the k8s manifest below to deploy it to their k8s cluster. Just change the admin password before doing so.<p><a href="https:&#x2F;&#x2F;gist.github.com&#x2F;lomkju&#x2F;90fe7500d8cf854bf3b7c2f26aa580e0" rel="nofollow">https:&#x2F;&#x2F;gist.github.com&#x2F;lomkju&#x2F;90fe7500d8cf854bf3b7c2f26aa58...</a>
    • gernest1 day ago
      Thanks, that is a very nice setup.<p>Does it always pull the latest vince image?<p>Just FYI, we also have simple helm charts, and the repository is hosted on <a href="https:&#x2F;&#x2F;vinceanalytics.com&#x2F;charts" rel="nofollow">https:&#x2F;&#x2F;vinceanalytics.com&#x2F;charts</a>
      • lomkju1 day ago
        &gt; Just FYI, we also have simple helm charts, and the repository is hosted on <a href="https:&#x2F;&#x2F;vinceanalytics.com&#x2F;charts" rel="nofollow">https:&#x2F;&#x2F;vinceanalytics.com&#x2F;charts</a><p>Oh cool, didn&#x27;t see that in the docs.<p>&gt; Does it always pull the latest vince image?<p>Yes, I haven&#x27;t specified any tag, so it should default to latest.
  • cebert3 days ago
    If you haven’t checked it out yet, Serverless Website Analytics, is a great solution for this too. It’s easy to deploy and very inexpensive to run. I’ve been using it and am quite happy with it. <a href="https:&#x2F;&#x2F;github.com&#x2F;rehanvdm&#x2F;serverless-website-analytics">https:&#x2F;&#x2F;github.com&#x2F;rehanvdm&#x2F;serverless-website-analytics</a>
    • gernest3 days ago
      Interesting, I just checked the readme. Very similar, but it looks like it only works with AWS and has a lot of moving pieces.<p>How do you deal with location data? Do you purchase a maxmind db license or use their free versions?<p>Both maxmind and db-ip free versions of city data miss city geo id values, rendering city data useless for many cases.<p>With vince, I had to index and embed the whole city data from the geonames database to work around this.
      • reincoder2 days ago
        &gt; How do you deal with location data, do you purchase maxmind db license or use their free versions.<p>&gt; Both maxmind and db-ip free versions of city data miss city geo id values, rendering city data useless for many cases.<p>I work for IPinfo.<p>I think you might find my conversation with Goatcounter&#x27;s dev interesting: <a href="https:&#x2F;&#x2F;github.com&#x2F;arp242&#x2F;goatcounter&#x2F;issues&#x2F;765">https:&#x2F;&#x2F;github.com&#x2F;arp242&#x2F;goatcounter&#x2F;issues&#x2F;765</a><p>I pitched him to use our free country database because of MaxMind&#x27;s EULA issues. MaxMind does not permit distribution of the database and requires end users to use their own token. Moreover, they actually charge thousands of dollars when you distribute the &quot;free&quot; database with a commercial intent.<p>Now, we have a free IP to Country database that we offer under a straight CC-BY-SA 4.0 license without an EULA. It is free, comes with daily updates, has full accuracy, and you can even commercially redistribute the database (via providing us an attribution).<p>I understand we do not have a free city database to offer, nor is our database lightweight because we have full accuracy. But you can check it out if you are interested. We do have a version with ASN (ISP) information as well.
      • alam20003 days ago
        [dead]
  • paradite2 days ago
    Not sure why I would use this over Plausible CE on docker. Does it consume less memory&#x2F;CPU?<p>Also I am pretty sure Plausible CE doesn&#x27;t limit number of sites &#x2F; events, unlike what&#x27;s listed in &quot;Comparison with Plausible Analytics&quot;.
  • aaronbrethorst3 days ago
    Looks interesting. What sort of memory requirements does it have and how does it persist data?
    • gernest1 day ago
      The demo, which survived the HN hug of death, is running on a 6$ vultr instance.<p>RAM: 1 GB<p>STORAGE: 25 GB<p>So far the bandwidth used is 3.6 GB.<p>So, you can successfully deploy vince on low spec servers, depending on your expected traffic.
  • sira042 days ago
    Looks great!<p>I found a small bug, if you click Expand in the Top Pages section, the Time on Page column has NaNs.<p>Dark mode for the dashboard and showing realtime current visitors in the &lt;title&gt; would be great.
  • manishsharan3 days ago
    I think the reason some of us continue using Google Analytics is its demographic data. That information is not available elsewhere as far as I know, which I admit is not a lot.
  • t0mas883 days ago
    It says GDPR compliant and no cookies on the project page. How are unique visitors calculated? And I&#x27;m assuming it can&#x27;t link conversions to campaigns without some cookie-alternative?
    • withinboredom3 days ago
      No idea, but generally, a bloom filter would get you there without any identifying information being stored. The counts would merely be estimates at that point, not exact values.
    • awongh1 day ago
      As a side consideration, according to the varying opinions in response to this question it’s not really clear what constitutes PII (personally identifying information).<p>When I researched this topic it was strange to me that no one seems to agree. Is it just arm-chair internet answers? Or is it actually that the letter of the law is actually ambiguous? What are the real world consequences of using this when it’s possible it violates GDPR? Or, what are the chances there would be consequences?
    • beeb3 days ago
      At least for Plausible, they state this (<a href="https:&#x2F;&#x2F;plausible.io&#x2F;blog&#x2F;google-analytics-cookies" rel="nofollow">https:&#x2F;&#x2F;plausible.io&#x2F;blog&#x2F;google-analytics-cookies</a>):<p>&gt; Instead of tagging users with cookies, we count the number of unique IP addresses that accessed your website. Counting IP addresses is an old-school method that was used before the modern age of JavaScript snippets and tracking cookies.<p>Since IP addresses are considered personal data under GDPR, we anonymize them using a one-way cryptographic hash function. This generates a random string of letters and numbers that is used to calculate unique visitor numbers for the day. Old salts are deleted to avoid the possibility of linking visitor information from one day to the next. We never store IP addresses in our database or logs.
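A minimal sketch of the daily-salt scheme the quoted passage describes, under stated assumptions: it is not Plausible's implementation, it hashes only the IP address (a real system may mix in other fields), and it assumes both the salt and the day's hashes are discarded at rotation. `DailyUniqueCounter`, `record`, and `rotate` are hypothetical names.

```python
import hashlib
import secrets


class DailyUniqueCounter:
    """Counts unique visitors per day from salted hashes of IP addresses."""

    def __init__(self):
        self._salt = secrets.token_bytes(16)  # random salt for the current day
        self._seen = set()                    # salted hashes seen today

    def record(self, ip: str) -> None:
        # Only the salted digest is kept; the raw IP is never stored.
        digest = hashlib.sha256(self._salt + ip.encode()).hexdigest()
        self._seen.add(digest)

    def rotate(self) -> int:
        """End of day: return the unique count, then drop salt and hashes."""
        count = len(self._seen)
        self._salt = secrets.token_bytes(16)
        self._seen = set()
        return count


counter = DailyUniqueCounter()
for ip in ["203.0.113.7", "203.0.113.7", "198.51.100.2"]:
    counter.record(ip)
print(counter.rotate())
```

Only the per-day total survives rotation, so a visitor cannot be linked from one day to the next once the old salt is gone.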
      • chrismorgan3 days ago
        &gt; <i>Since IP addresses are considered personal data under GDPR, we anonymize them using a one-way cryptographic hash function.</i><p>Um... hashing IPv4 addresses, even with salt, does <i>literally nothing</i> to anonymise (assuming the output space is at least ~32 bits, which I think is safe to assume): they’ll still be PII. IPv6 addresses I’m not so confident about; <i>maybe</i> it would be sufficient for some parts, but it’s definitely inadequate for some concerns.<p>(For IPv4, enumerating all four billion inputs is so completely practical that “one-way” is nonsense.)<p>I’m almost certain this is legal theatre.
        • alkonaut7 hours ago
          Couldn&#x27;t this be done with a Bloom filter in such a way that (in exchange for a small error rate) you&#x27;d not keep any individual hashes?
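A sketch of what the Bloom filter idea could look like, assuming nothing about how any of the tools in this thread work. The class name, the default sizes, and the use of SHA-256 to derive bit positions are all illustrative choices. No per-visitor hash is kept, only a shared bit array, at the cost of occasionally undercounting when a new visitor collides with bits that are already set.

```python
import hashlib


class BloomFilter:
    """Approximate set membership with a fixed-size bit array."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)
        self.approx_count = 0  # estimate of distinct items added

    def _positions(self, item: str):
        # Derive k bit positions by hashing the item with k different prefixes.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item: str) -> bool:
        """Insert item; return True if it was (probably) not seen before."""
        new = False
        for pos in self._positions(item):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                new = True
                self.bits[byte] |= 1 << bit
        if new:
            self.approx_count += 1
        return new


visitors = BloomFilter()
for ip in ["203.0.113.7", "203.0.113.7", "198.51.100.2"]:
    visitors.add(ip)
print(visitors.approx_count)
```

Whether a structure like this is lossy enough to stop the data being personal data is a legal question the sketch does not settle; an unsalted filter can still be probed with candidate addresses.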
        • Semaphor3 days ago
          One way if you have a salt? Enumerating won’t help, you need to know the salt, which gets deleted.<p>That said, the whole IP thing is weird to me. Not only are we allowed to log IPs directly for security reasons, we even *have* to log IPs in certain cases (newsletter subscriptions).
          • kadoban3 days ago
            &gt; That said, the whole IP thing is weird to me. Not only are we allowed to log IPs directly for security reasons, we even <i>have</i> to log IPs in certain cases (newsletter subscriptions).<p>The point of designating something as PII isn&#x27;t that we then _never_ store or use it, it&#x27;s to carefully consider if we actually need it or not (and what protections we can add for the values we do need to store&#x2F;use).<p>We&#x27;re meant to stop the practice of just collecting and storing all data, without consideration for the harms that causes.
        • kadoban3 days ago
          If what they&#x27;re doing is using a secure salt and then throwing the salt away once a day that _might_ be doing something.
          • chrismorgan3 days ago
            What I understand they’re doing is storing the salt in one place, a set of hashed IP addresses in another place, then daily trashing the lot after counting the number of elements in the set and storing that.<p>Information-theory-wise, this is no different to just storing the actual IP addresses (and deleting them daily after tallying, as before). It <i>does</i> mean that you need to obtain <i>two</i> things instead of just one, but if you get access to it all, it’s straightforward to reverse the lot (though computationally a little expensive), and easy to check a single value for a match.<p>The technique may be considered reasonable effort at protecting against casual abuse, but it’s not technically effective of itself, and it doesn’t stop the data from being PII. The important aspect is that the PII is deleted within 24 hours. My personal opinion is that the hashing part should probably be considered snake oil and whitewash, at least for what they’re claiming—I don’t say it’s useless, but it definitely doesn’t do what they’re touting it for.<p>Unless they’re actually keeping the hashed values for some reason after one day, and associating them with other records? In which case, disregard <i>part</i> of what I say, it’s obviously better than persisting IP addresses long-term! But also it’s extremely dubious to call that anonymisation as they do, because you can so often tie things together, behavioural patterns and such, to deanonymise. It’s frighteningly effective.
            • tingletech3 days ago
              If you throw away the daily random salt (but keep the obscured IP address), how can you check a single value for a match the next day?
              • chrismorgan2 days ago
                Refer to my understanding in the first paragraph—I don’t <i>think</i> they’re retaining the hashed values after a day either? If they are, sure, apply my last paragraph, you can’t do a single match any more. (But the whole thing would still <i>definitely</i> be susceptible to deanonymisation.) But at the very least, it’s easily reversible for up to 24 hours.
        • jszymborski3 days ago
          What matomo does is mask parts of the IP address (you choose how much).
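For illustration, masking can be as simple as zeroing the host bits of the address. This is a sketch using Python's standard `ipaddress` module, not Matomo's code; `mask_ip` and its default prefix lengths are assumptions made for the example.

```python
import ipaddress


def mask_ip(ip: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Keep only the leading prefix bits of an address and zero the rest."""
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


print(mask_ip("203.0.113.77"))                # 203.0.113.0
print(mask_ip("203.0.113.77", v4_prefix=16))  # 203.0.0.0
print(mask_ip("2001:db8:abcd:1234::1"))       # 2001:db8:abcd::
```

Unlike hashing, this is lossy: every address in the same block maps to the same value, so the original cannot be recovered, but unique-visitor counts get coarser as more bits are dropped.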
        • gizzlon3 days ago
          hm.. are you saying they need scrypt or something similar?
          • chrismorgan3 days ago
            The “PII” label is taint that is probably impossible to dispel completely&#x2F;perfectly, and difficult to dispel sufficiently (and deanonymising is an arms race).<p>Lossless techniques do <i>nothing</i> to dilute that taint.<p>Lossy techniques are necessary to get <i>anywhere</i>, such as disregarding certain bits of the address, or Bloom filters.
          • kadoban3 days ago
            The problem, in general with hashing IP addresses (especially ipv4) is that there&#x27;s not that many of them.<p>If I tell you the value is either 1 or 2, but I hashed it with sha256 to make it secure, that&#x27;s bullshit, right? You can just hash both and see which it is.<p>Same concept applies regardless of the hash algo, and still applies if you have more than 2 possible values, 4 billion or so possible ipv4 addresses is _not_ that many values to a computer.<p>Other common places this problem occurs is with any other restricted set of values, eg phone numbers and email addresses (most are at like 5 domains and are easy to guess&#x2F;know).
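The enumeration argument above can be made concrete with a small sketch. It assumes the attacker has both the hash and the salt (for example, before the salt is deleted), and it scans a single /24 so the example runs instantly; covering all of IPv4 is the same loop over roughly four billion candidates. `salted_hash` and `recover_ip` are hypothetical helper names.

```python
import hashlib
import ipaddress


def salted_hash(salt: bytes, ip: str) -> str:
    """Salted SHA-256 of an IP address, as a hex string."""
    return hashlib.sha256(salt + ip.encode()).hexdigest()


def recover_ip(target: str, salt: bytes, network: str):
    """Brute-force every address in `network` until one hashes to `target`."""
    for addr in ipaddress.ip_network(network):
        if salted_hash(salt, str(addr)) == target:
            return str(addr)
    return None


salt = b"known-salt"                       # assumption: the salt has leaked
leaked = salted_hash(salt, "192.0.2.200")  # the "anonymised" value
print(recover_ip(leaked, salt, "192.0.2.0/24"))
```

A slower hash only raises the cost per guess; it does not enlarge the input space, which is the actual weakness.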
    • pdyc3 days ago
      Most likely through one-way IP hashing bounded by a time duration. If you have UTMs in your URL then it can track, otherwise probably not.
  • skeptrune2 days ago
    Cool that there are so many of these now. Currently self hosting plausible and it does seem quite barebones. Will have to give this a shot!
  • QuasarLogic2 days ago
    Can we compare it with Shynet? <a href="https:&#x2F;&#x2F;github.com&#x2F;milesmcc&#x2F;shynet">https:&#x2F;&#x2F;github.com&#x2F;milesmcc&#x2F;shynet</a><p>Shynet is similarly self hostable, and has a tiny footprint.
  • cchance2 days ago
    &quot;see live dashboard&quot; button on main page just... goes to the top of the page lol
  • samdung3 days ago
    This is great. I&#x27;m def going to use it.<p>Minor bug: the &quot;See Live Demo Dashboard&quot; URL points to the wrong place.
  • cpursley3 days ago
    How would y’all go about building analytics into a professional marketplace type of app where you can provide the professional with their own profile page stats (in a reliable way)?
  • rasso3 days ago
    Does this work on your average 10,-&#x2F;month shared hosting server? If so, it might really be „for everyone“. Otherwise, we are stuck with matomo.
    • diggan3 days ago
      &gt; Does this work on your average 10,-&#x2F;month shared hosting server?<p>Since they usually offer software via cPanel and the like, it seems unlikely unless you give the project lots of time, first to get popular enough to get on the &quot;admin panels&quot; radar, and second for them to integrate it.<p>Besides, do people really pay 10 USD&#x2F;month for shared hosting? Sounds really expensive when you can grab VPSes for half that price and run whatever software you want, not just what they&#x27;ve packaged for you. I guess ongoing maintenance is included in that price, but it still sounds kind of expensive for what you get.
      • rasso3 days ago
        I don‘t know… around here (Germany), that‘s pretty common. No need to manage anything, no usage-based cost, … my favourite is <a href="https:&#x2F;&#x2F;all-inkl.com" rel="nofollow">https:&#x2F;&#x2F;all-inkl.com</a>. OG no-bs hosting for boring tech.
  • notRobot3 days ago
    The dashboard demo isn&#x27;t working :(
  • 8ig83 days ago
    Matomo is another one…<p><a href="https:&#x2F;&#x2F;matomo.org&#x2F;" rel="nofollow">https:&#x2F;&#x2F;matomo.org&#x2F;</a>
  • Oras3 days ago
    If you don&#x27;t have plans to offer saas, what are you trying to achieve from it?<p>I mean, it is quite nice to have binary installation hosted on a single VPS, but will you support it?
  • drchaim3 days ago
    this is great, congrats!