-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Is it possible to track #requests per user agent? #6
Comments
This is something we have in mind. We do not plan. We do when we have time :-) However, feel free to submit a PR, or discuss how you see it. |
I would like to tackle this in a PR. The most straightforward way seems to be to add a counter for every specific user agent property that could be of interest and increment those. I would therefore propose to add the following columns:
Alternatively I could see an appeal to use more branded column names, but this would mean we would have to do database migrations every time a operating system of interest is launched. Something along the lines of:
For browser I do very much see the appeal of having branded names, but this will be hard to get right. I wouldn't want to add a new column for every chromium fork. I would therefore propose to just mention the browser engine:
Alternatively I would love to use a PostgreSQL |
Does it really need to be column specific? Why not track user agent as an
indexed column, or columns if you want to also track OS, browser, etc.. and
then group by/count by those columns when you report traffic? This way
you can add just a couple of columns to the schema and manage counting each
different type in the business logic without the need to have a catch all
specific column to aggregate things that you don't have columns for.
…On Tue, Feb 27, 2024 at 1:55 AM Marcus Riemer ***@***.***> wrote:
I would like to tackle this in a PR. The most straightforward way seems to
be to add a counter for every specific user agent property that could be of
interest and increment those. I would therefore propose to add the
following columns:
- total_desktop
- total_tablet
- total_phone
- total_bot
Alternatively I could see an appeal to use more branded column names, but
this would mean we would have to do database migrations every time a
operating system of interest is launched. Something along the lines of:
- total_linux
- total_macos
- total_windows
- total_ios
- total_android
- total_bot
For browser I do very much see the appeal of having branded names, but
this will be hard to get right. I wouldn't want to add a new column for
every chromium fork. I would therefore propose to just mention the browser
engine:
- total_gecko
- total_webkit
- total_blink
Alternatively I would love to use a PostgreSQL jsonb or hstore field
(which can be properly indexed). But I have no idea how support for this
looks outside my beautiful PostgreSQL bubble and I'm not really eager to
find out how MySQL or SQLite deals with this.
—
Reply to this email directly, view it on GitHub
<#6 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AXK665X34NPEOD75RIZ4CWTYVWURZAVCNFSM6AAAAAASLPUSWCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNRWGE3TSMRWGQ>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
--
*Robison Rodrigues dos Santos *| *CTO, BoxedUp*
https://tryboxedup.com
*206-519-8760*
|
The current schema works with a single table with a single row per counted day and navigation target. Most of the aggregation is done immediately when tracking, not afterwards when evaluating. This does have the benefit of being comparatively fast, not requiring any regular maintenance and an overall very simple code base. Introducing more columns that pile up more rows per day seemed ... against this spirit ... to me. But the more I think about it the more I lean into the direction of "there are already many rows with very small totals for a single target due to navigation, maybe this doesn't really matter after all". So in this case I would propose the following additional columns:
Technically the device type could almost be computed from the operating system, but this gets muddy for the phone and tablet distinction. This approach could also allow developers to provide their own function to express the desired level of granularity for these fields. That way people who would like to also track things like versions could do so. |
Thanks for your suggestions. Is it important for you to know the browsers, devices ans OS per page ? In my case, knowing there are X% of Firefox version Y today across the whole site is enough. I'm asking because I can't see a case where that would be important. Maybe for a marketing person of a big company this detail matters. But ActiveAnalytics does not target people who might consider all the fancy features of GA for example. This is the schema I have in my mind by adding 3 more tables :
That should cover all cases you listed. Let me know if I miss one. Regarding the bots, I try to skip them from ActiveAnalytics because I am interested by natural trafic only. But that is really up to you. Indeed it could make sens to have a device named bot, or search engine X.
Since January 2024 SQLite supports JSONB and JSON for much longer. I prefer to avoid the non standard types to keep the code very simple. Thanks again to both of you. I am happy to see people willing to bring new features! |
I personally don't need versions at all.
It does indeed cover all use cases, but it also means suddenly have to do up to 4 upserts per visit. Although I do imagine these could be fired simultaneously instead of serially. And when reading the data it requires 3 joins unless a person wouldn't be interested in seeing this data by default. Not sure whether keeping the single table would be premature optimization on my part. But I would still prefer to add 3 columns instead of 3 tables.
Ultimately we can document this and leave it to the user with (imho) negligible impact on complexity.
Not sure how to read this: Would you be open to using
Works the other way around as well: I was happy to find your project that almost does exactly what I needed :-) One more thing from my part: I would pull in the browser gem to do the actual user agent detection. Is pulling in a very mature gem okay with you or do you prefer to keep this practically dependency free? |
Ahem, I added a MR of my initial attempt and only at the end I properly understood why you asked whether the user agent stuff should be trackable per page. I somehow jumped directly to the version question ... I stilll wouldn't see 3 additional tables, but maybe one table with the three new columns and a date to group by would've also done the trick. |
The browser version is very important to check on https://caniuse.com the supported CSS and JS features. For example, if too many people use the X browser before version Y, I might not use all the new fancy JS and CSS stuff. I still don't know how to store the versions. Indeed strings cannot handle all cases : Yes, more tables means more writes. But writes are delayed and aggregated with By adding 3 columns the table views_per_days will grow exponentially. Currently, in the worst case I also have in mind to add geolocation which will add more combinations. By adding 2 more columns such as country and city the worst case becomes even more exponential By adding more tables, (devices_per_days, os_per_days, browsers_per_day) the worst case is
In this cas no.
I try to avoid adding dependencies, but in this case it makes sens. I don't want to write a user agent parser, and all the code won't be binded to it. After writing this, I just thought the models DevicesPerDay and OperatingSystemPerDay could be merged without increasing too much the numbers of rows. Indeed, if the OS is iOS this is a either a phone or a tablet and same for Android.
I am very motivated to work on it but I would like to finish a big improvement on ActiveHashcash first. I don't know if you finally agree with my idea after explaining in details. |
I like the idea. I think it works fine. The lack of all that data made me
migrate out of Active analytics and I am now using Ahoy, but I find AA
simpler and would love to have a reason to go back to it in the future.
Having a breakdown by devices, os, etc makes the tool even better.
…On Thu, Feb 29, 2024, 8:01 AM Alexis Bernard ***@***.***> wrote:
The browser version is very important to check on https://caniuse.com the
supported CSS and JS features. For example, if too many people use the X
browser before version Y, I might not use all the new fancy JS and CSS
stuff. I still don't know how to store the versions. Indeed strings cannot
handle all cases : "1.2.3" < "1.2.10" # => false.
Yes, more tables means more writes. But writes are delayed and aggregated
with ActiveAnalytics.queue_request. That means currently there is 1 write
per page instead of pages * views. Thus the cost of ActiveAnalytics is
very small even for sites with a large trafic. With a large trafic it's not
reasonable to perform a write for each view.
By adding 3 columns the table views_per_days will grow exponentially.
Currently, in the worst case ViewsPerDay.count == days * pages which is
reasonable. By adding more columns the worst case becomes ViewsPerDay.count
== days * pages * browser_name * browser_version * device * os_name *
os_version. It's scary, because we are close to have one row per view or
only a few.
I also have in mind to add geolocation which will add more combinations.
By adding 2 more columns such as country and city the worst case becomes
even more exponential ViewsPerDay.count == days * pages * browser_name *
browser_version * device * os_name * os_version * cities.
By adding more tables, (devices_per_days, os_per_days, browsers_per_day)
the worst case is ViewsPerDay.count + DevicesPerDay.count +
OsPerDay.count + BrowserPerDay.count == days * (pages + browser_name *
browser_version + device + os_name * os_version). That will grow slower
and should produce less writes.
Not sure how to read this: Would you be open to using JSONB or not?
In this cas no.
One more thing from my part: I would pull in the browser gem
<https://github.com/fnando/browser> to do the actual user agent
detection. Is pulling in a very mature gem okay with you or do you prefer
to keep this practically dependency free?
I try to avoid adding dependencies, but in this case it makes sens. I
don't want to write a user agent parser, and all the code won't be binded
to it.
After writing this, I just thought the models DevicesPerDay and
OperatingSystemPerDay could be merged without increasing too much the
numbers of rows. Indeed, if the OS is iOS this is a either a phone or a
tablet and same for Android.
BrowsersPerDay(name, version)
PlatformsPerDay(name, version, device)
I am very motivated to work on it but I would like to finish a big
improvement on ActiveHashcash first. I don't know if you finally agree with
my idea after explaining in details.
—
Reply to this email directly, view it on GitHub
<#6 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AXK665VUZXZH4R363KK7JYDYV5IGZAVCNFSM6AAAAAASLPUSWCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZRGQ2TEOJYGQ>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
My personal use is to distinguish form factors of devices, I don't care about anything else. But I figured I could throw in the other metrics for other people.
Yeah, I have wandered off too far down the "but I don't want to overly emphasize people who are visiting exactly once" road. That might have worked if I had only pulled in the form factor, but hindsight is always 20/20.
So just to make this explicit: You are proposing to count every user agent characteristic once per day?
I do agree now. My MR could be salvageable with not too many issues, but I'm currently struck down by rather annoying health issues and am not sure when I have time to revisit that MR. |
Yes, as views are counted one per day and per page, browsers will be counted one per day, per version and per device. I'm sorry to hear about your health issue. Take care! |
One metric I'd like to track is the user agent accessing the application, so I can divide my users into cohorts per device (desktop vs mobile). Is there any way to do this already? Is that in the roadmap?
The text was updated successfully, but these errors were encountered: