Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[refactor]! Implement Vue.js client #103

Closed
wants to merge 28 commits into from

Conversation

leon-wbr
Copy link
Contributor

@leon-wbr leon-wbr commented Dec 29, 2024

As discussed in #91, I believe that a Vue.js client will be immensely useful for future development:

  • Easier for developers as it allows for hot module replacements, better code and file structure
  • No need to reinvent the wheel as many functions are already implemented (routing, components, reactivity, etc.)
  • Scales easily and seamlessly integrates with APIs for dynamic data fetching
  • Saves time in order to focus on features and usability

I'd like to transition to a minimal API that looks like this:

[x] GET /api/v1/points
[x] GET /api/v1/points/:id
[x] POST /api/v1/points
[x] POST /api/v1/points/:id/review
[x] POST /api/v1/points/:id/duplicate

Additionally, fulfill a database migration that normalizes the data and keeps such an API efficient:

  • Refactor table Points (Latitude, Longitude)
  • Create table Reviews (PointId, Rating, Duration, Name, Comment, Signal, RideAt, etc.)
  • Create table Destinations (PointId, ReviewId, Latitude, Longitude, Distance)
  • Refactor Duplicates (FromPointId, ToPointId)
  • Refactor Hitchwiki (PointId, Link, etc.)

The store is managed by Pinia now – if you get an error message, here is a fix.

My goal is also to implement true offline-first functionality in the next PR:

  • Sync the complete dataset to your local device
  • Edit data without an internet connection and sync as soon as you reconnect

(I've never done this in Vue before. Perhaps localForage or absurd-sql with some custom sync functionality?)

I've also chosen to implement tailwindcss to reduce the amount of duplicate css and improve consistent styling. I'll try to make a replica of the styling as it is on live, but for now I've tried to focus on functionality.

There are still a lot of things to be done as this is essentially a complete rewrite. I've thought of making this a milestone, something like Hitchmap 2.0, as there are a lot of necessary changes to all parts. A few styling changes could really pull this together and improve the whole experience for the users as well.

Feel free to discuss this, add additional tasks, or help me work on this.

@bopjesvla
Copy link
Owner

  • I'm cool with using a full version of Vue, but I'd prefer to use the no-build-step version.
  • Old links have to keep working. Given that this is the case, I'm not sure we should support an alternative list of routes. We can continue to do our own routing in Vue, the code snippet is in the Move to petite-vue #91.
  • Related: routers with positional arguments (like vue-router) don't lend themselves well to map applications which often have tons of optional named parameters. Really don't want to get locked into something unsuitable
  • Again in mapping applications, you really don't want the front-end to think about anything more than it has to. This does involve rolling your own stuff and doing as many of the calculations on the backend as possible.
  • The current application is really fast for something that displays 30k points on a web map and it's been battle-tested on a wide variety of devices. It's very easy to build something that's not as fast. Because of this, I'll probably say no to anything that wants to get rid of embedding the dataset in the HTML page. Also because of this, I'll probably say no to using too many external JS libraries.
  • Leaflet is a rendering engine that doesn't interop with Vue, another rendering engine. If possible, I'd just have the map interactions change the route, and have Vue listen to those changes.
  • I really like the generated static-site aspect of the current application. We can add a bit of magic so people can see their own changes right away, but this has made many things so much easier, because you don't have to worry about performance on the backend that much.
  • A relational data model for geodata is always a bit of a mess. You could normalize the points to another table, but that's kinda silly: the only reason those reviews are logged on the same location is because of our interface enforcing this, and the only reason you're probably thinking about that collection of reviews as belonging to one spot is because we currently group them as such in the front-end. In reality, reviews are rarely for exactly the same location, and I want the application to accommodate for this as per Cluster nearby points on the front-end #46. In the future, we'll probably display reviews on the same road or in the same gas station as one spot, but I certainly wouldn't want this reflected in our database (unless it's for caching purposes).

@bopjesvla
Copy link
Owner

Sync the complete dataset to your local device

This happens in the current version: the HTML page with everything in it is cached by the sw

Edit data without an internet connection and sync as soon as you reconnect

Not the highest priority, definitely not something to implement while refactoring

@leon-wbr
Copy link
Contributor Author

leon-wbr commented Jan 6, 2025

I will consider all of your points, but:

  1. I do not see why we would go for the no-build-step version. This is an application that includes a build step by default and the build for Vue only needs to happen rarely, whenever we have to deploy. A lot of benefits to it, too.
  2. The cache is useless if the whole dataset has to be loaded again whenever a change occurs. It is a massive waste of bandwidth and does not cope well with bad internet or mobile data. Syncing would be much more user-friendly. Yes, I will implement this in another step, not during this refactor.
  3. I cannot imagine a worse database schema than the current one. The redundancy is unbelievable and most of it is being restructured in the Python code out of necessity. The relational data has nothing to do with Geodata. A review has a location, a spot has a location, those can be identical or different, but why would we not structure that as a relation?

@tillwenke tillwenke linked an issue Jan 7, 2025 that may be closed by this pull request
@leon-wbr
Copy link
Contributor Author

leon-wbr commented Jan 7, 2025

I've created an example of how the database could be refactored in the last commit. That can be used as a basis for further discussion.

If you need the old schema, you can simply do an INNER JOIN on Reviews with all the relations.

But for now, this makes it entirely possible for me to easily provide all the data needed through a REST API. It seems that the most expensive operations are the duplicates calculation; which is now obsolete and distance calculation, which should be saved in the database when a new destination is added.

@tillwenke
Copy link
Collaborator

tillwenke commented Jan 7, 2025

I cannot really contribute to the discussion on this topic. I d like to see a working draft of the refactor - then I d be able to judge it. But I would not recommend investing too much energy into it as it is possible that it gets (partially) rejected.

If possible I can recommend tearing the discussed topics apart so we can discuss them separately - some might make more sense than others. E.g. I would support the more relational db approach of 1 spot having N reviews as we I would like to encourage reviewing existing spots instead of creating new spots each time. Thus this should also be reflected in our db.

I see that we want to optimize certain parameters. Lets define and agree on those parameters and measure or estimate them. E.g. bandwidth usage before and after the refactor.
As I am not using the application on the road a lot so this would not be a major pain point for me anyway.

What is a major pain point for me is:

Easier for developers as it allows for hot module replacements, better code and file structure
No need to reinvent the wheel as many functions are already implemented (routing, components, reactivity, etc.)

This is important to attract new devs and keep the app running (we didnt manage to get a successful PR by an external person since the app exists).
To discuss this seriously please open a new issue and lets exchange examples of similar apps there. What do we want to copy? Where do we want to reinvent the wheel? E.g. https://www.dumpstermap.org/page/about also uses leaflet so I d shy away from not using it.

Happy coding :)

@leon-wbr
Copy link
Contributor Author

leon-wbr commented Jan 8, 2025

Thanks for chiming in, Till.

A little note in advance: I didn't suggest replacing Leaflet (just yet), but I did mention Maplibre to you in a discussion recently. I do think that Maplibre is far superior.

If you'd like to see the current state of the refactor – clone my branch, apply the migration on a current dump of the database, run server.py and in the client directory, run yarn install + yarn dev. If you haven't used it before, install it.

As for the database structure, I will simplify it again. It makes no sense to nest a Comment, better to keep it flat.

@bopjesvla
Copy link
Owner

I do not see why we would go for the no-build-step version. This is an application that includes a build step by default and the build for Vue only needs to happen rarely, whenever we have to deploy. A lot of benefits to it, too.

I've worked a lot with JS frameworks that use a build step, and while I used to be a fan, nowadays, if it's not necessary I like to avoid them. I realize this is not the popular opinion right now, but things I'm involved in that don't transpile JS tend to be more stable and easier to debug.

The cache is useless if the whole dataset has to be loaded again whenever a change occurs. It is a massive waste of bandwidth and does not cope well with bad internet or mobile data. Syncing would be much more user-friendly. Yes, I will implement this in another step, not during this refactor.

Syncing is very complex.

I cannot imagine a worse database schema than the current one. The redundancy is unbelievable and most of it is being restructured in the Python code out of necessity. The relational data has nothing to do with Geodata. A review has a location, a spot has a location, those can be identical or different, but why would we not structure that as a relation?

I don't have much to say about this other than I've never seen it done your way and I've often seen it done my way. Have you worked before in PostGIS etc? In geographical systems locations are almost always modeled as a single Point column, not as a relation to a separate table unless there's a good reason to. Sqlite doesn't support GIS columns by default, so I put the point in two columns.

@bopjesvla
Copy link
Owner

Also, I realize map.js and the current directory structure are hard to understand, I definitely want to change that. On the other hand, I don't want a two-step build process where both the front-end and the display data have to be constructed. This is really going to be hard to debug if you're making a change that hits both, and it's probably going to require some kind of CI/CD and all of the unfun stuff.

Overall, I support the effort, but I think your first PR using petite-vue was pretty close to being right. At the time I was in the mindset of not changing too much to Hitchmap, but I've looked at updating it to the current version before. And that's not to say I prefer petite-vue over Vue, just the approach taken there.

@bopjesvla
Copy link
Owner

Also, if you're including sqlalchemy so the writes don't have to be done using pandas, that's cool, but that should go in a separate PR.

@bopjesvla
Copy link
Owner

My goal is also to implement true offline-first functionality:


The cache is useless if the whole dataset has to be loaded again whenever a change occurs. It is a massive waste of bandwidth and does not cope well with bad internet or mobile data. Syncing would be much more user-friendly. Yes, I will implement this in another step, not during this refactor.

It's not useless when you're, I don't know, offline

@leon-wbr
Copy link
Contributor Author

leon-wbr commented Jan 8, 2025

I don't want a two-step build process where both the front-end and the display data have to be constructed.

A proper database structure could eliminate the need for the Python build step, so we will be back to one build step. If you want to have a no-build-step or one-build-step option, I'd recommend Nuxt and moving the backend into Prisma. But I have a feeling that'll not make things any better from your perspective.

CI/CD is a goal to work towards. It is a useful addition to a repository with multiple contributors. Same goes for testing.

if you're including sqlalchemy [...], that's cool, but that should go in a separate PR.

I agree that my PR is bigger than it needs to, but it would have been a pain to do the same in pandas. It is only used in the routes that I've added so far. I do not intend to do any work on show.py except adapting it to use the REST API later on (so that nothing breaks and the static version can still be served as a backup).

The request to the REST API is also cachable with the ServiceWorker, so it will work out to the same and eliminate the build step as mentioned before.

It's not useless when you're, I don't know, offline

On the road, the mobile data will get drained and waste a lot of bandwidth. Every ten minutes, the cache gets invalidated as long as there changes (right?). It will reload if there is an internet connection, requiring you to download all the data again.

Syncing will be easy to implement and it'll be my next step after this PR.

I'll take a break from this for now and would encourage you to check my code thoroughly. It is very much a WIP, the styling is not right yet, but a lot of these changes are worth it. My app is running extremely smoothly. Yes, the API is resource intensive, but so is the build step that was managed through a crontab.

I'd really like to find a consensus in this.

@leon-wbr
Copy link
Contributor Author

leon-wbr commented Jan 8, 2025

There is a bug in this PR that I haven't fully figured out yet:

When you do a POST Request to /api/v1/posts or /api/v1/posts/<post_id>/review, it occasionally returns an empty response. My hunch is that it has to do with CORS, but my experience with Flask is limited so I'm not sure what is going on here. If anyone has an idea, would be much appreciated to get a hint.

@bopjesvla
Copy link
Owner

Syncing will be easy to implement and it'll be my next step after this PR.

I'm not worried about the initial implementation, but about the maintenance. Having state on the client side is always more complex and frankly doesn't suit a small volunteer-run project.

@bopjesvla
Copy link
Owner

A proper database structure could eliminate the need for the Python build step

I feel like you missed this:

In reality, reviews are rarely for exactly the same location, and I want the application to accommodate for this as per #46. In the future, we'll probably display reviews on the same road or in the same gas station as one spot, but I certainly wouldn't want this reflected in our database (unless it's for caching purposes).

We're obviously not going to compute what road reviews were made for on the client-side.

This was not rhetorical, I'm curious about your experience:

Have you worked before with PostGIS etc?

@bopjesvla
Copy link
Owner

I'd really like to find a consensus in this.

I wouldn't mind taking all of the stuff I think is definitely good and putting it in the application when I have the time, then you can check if you find the result acceptable, and then we can discuss what other changes are necessary. Does that sound ok to you?

@leon-wbr
Copy link
Contributor Author

leon-wbr commented Jan 8, 2025

We will work on syncing later, let us not discuss this here now. It is indeed, as you mentioned, another pull request when the time has come. But this PR gets us closer to it by a large margin.

Have you worked before with PostGIS etc?

Only once about ten years ago, but if I am not mistaken, PostGIS only extends PostgreSQL. In that way, it should not differ regarding relational data and my approach would also be suitable for it. It might be worth exploring whether to implement it for us (also separate – of course!), SQLite is generally not a great long-term solution.

There is clearly some misunderstanding, not only between you and me, but also with @tillwenke: I'm not sure that any two of us share the same idea of how points should be treated in the application. If we can sort this out, I am sure that a lot of the discussions can be solved.

My interpretation is that every review is clearly linked to a point and if it is in reality supposed to be for another one, it can be marked as a duplicate to merge these points. This is what my database schema reflects, same as the existing one.

We're obviously not going to compute what road reviews were made for on the client-side.

I feel like there are some future plans that I am unaware of. Either way, you can simply join the points or comments on the reviews table and you'll have the exact same data scheme as before. Cache it, serve it as an endpoint, and it's all good.

I wouldn't mind taking all of the stuff I think is definitely good and putting it in the application when I have the time [...]

I'd like to get this PR mergeable, preferably through code review. I do believe in this implementation.

Would you mind creating comments on the code in this PR so I can go adjust my course?

@leon-wbr
Copy link
Contributor Author

leon-wbr commented Jan 9, 2025

I just checked out #46 and from the description, it should be fully compatible with this PR as well.

@bopjesvla
Copy link
Owner

My interpretation is that every review is clearly linked to a point and if it is in reality supposed to be for another one, it can be marked as a duplicate to merge these points. This is what my database schema reflects, same as the existing one.

Users are linked to first names, some users have the same first name, but that doesn't mean you should store first names in a separate table from users.

Either way, you can simply join the points or comments on the reviews table and you'll have the exact same data scheme as before.

This argument also holds for users and first names. You can store first names in a separate table and join them to the user table, but that doesn't mean they should be stored in separate tables. Only if there's a meaningful connection between users with the same first name.

In our case, there should be a meaningful connection between all reviews at the same location. Right now, a spot is indeed all reviews at the exact same location, but this will soon be different. So this refactor doesn't make sense.

The duplicates table is weird, and I agree it shouldn't reference lat/lons as if they were spots. But the goal is to make this table obsolete with #46.

I feel like there are some future plans that I am unaware of.

This is why I suggest slowing down a bit. It's always good to first write an issue explaining what you're intending to do, get some feedback, before doing it.

Would you mind creating comments on the code in this PR so I can go adjust my course?

Before doing a code review, I think we have to agree on the fundamentals. There's still quite a lot in my first comment that we didn't discuss. Like this one:

Related: routers with positional arguments (like vue-router) don't lend themselves well to map applications which often have tons of optional named parameters. Really don't want to get locked into something unsuitable

vue-router is for normal apps that have hierarchical pages, Hitchmap, like many map apps, can have both have an active route and a spot open, and both should (optionally) be reflected in the URL. At the same time, we might want the zoom level and location in the URL. It's a bad fit.

@bopjesvla
Copy link
Owner

bopjesvla commented Jan 10, 2025

Only sending the spot's locations during the initial load and sending comments etc when they're opened is worth considering. Definitely more standard, but it means that comments won't be available once the user goes offline. Another downside from a development perspective is that it's nice to assume the user already has everything. It also makes filtering on the front-end harder, which we were planning.

Your solution is more future-proof though. 15 mb is doable, but we definitely can't serve 100 mb during the initial load. But maybe we have much faster internet by then.

@leon-wbr
Copy link
Contributor Author

Relationships

Points have a clear one-to-many relationship with reviews, you would absolutely store that as a relationship as per best practice, as well as a many-to-many relationship with duplicates. The hitchwiki relationship is one-to-one, but still totally alright. There are virtually no downsides to such relationships. There is a meaningful connection.

If I want to access a point's reviews, I can access them as Point.Reviews through an ORM now. I do not want a join. I want clearly defined relationships to loop through. (The join is an option for you, if you for some reason require the opposite.)

Same goes for #46, which I believe will change nothing, no matter the implementation. I bet that there needs to be a relationship between multiple points to cluster them. It is just that you will likely attempt a programmatic solution such as in show.py that solves the same issue, or in the UI. But it is little different to storing it in the database as, for example, Point.RelatedPoints: ID[] (or, probably possible in PostGIS, dynamically query by distance). Neither does it change the fact that a single point can still have multiple reviews attached ot it.

What I'm trying to say is that the current implementation heavily relies on relationships, which could be reflected in a relational database – made for this purpose – instead of an absurd recalculation.

Vue-Router

Vue-Router allows anything. It allows hash based states, it allows dynamic regex matching for routes. There are a million more robust and flexible ways to solve what we need, and it also allows for the original solution.

Data Management

Syncing or what data to fetch is still another topic and PR. I am sending all data at once. It is dynamically fetched and cached (on the server) for five minutes, eliminating the build step on the backend. There is a loading state implemented which allows for an immediate page load.

@leon-wbr
Copy link
Contributor Author

I've added an example of hash-based states in the last commit by implementing #38, just to prove that it is not an issue at all. If it needs to be persisted, extending parseHash and setHash functions would be quickly done.

@bopjesvla
Copy link
Owner

bopjesvla commented Jan 11, 2025

Points have a clear one-to-many relationship with reviews

Because we enforce this in the front-end, which is a bad thing, not something we should structure our app around. In reality, people don't stand on the exact same location down to the tenth decimal. You're still thinking of a review's location as a spot identifier. The goal is to make a review's location the review's location, and nothing more. If two reviews are ten meters away from each other on the same road, we're still going to display it as one top-level marker.

There are virtually no downsides to such relationships.

So you wouldn't mind if someone on your team proposed to put users and first names in separate tables?

There's obvious complexity involved in this. If someone changes their review's location you have to check if the new location is already used, then point to that location. If the new location is not used you have to first add a new location record, then point to that. Also, you have to check if the old location record is no longer used so you can delete it. The only reason you're currently not feeling that complexity is because your implementation matches the current way we're displaying reviews (with lat/lon as spot identifiers) and it's currently not possible to alter a review's location. In any other world, it doesn't make sense.

I genuinely hope this clears it up. If not, this is still the last I'll say about it.

@bopjesvla
Copy link
Owner

Same goes for #46, which I believe will change nothing, no matter the implementation. I bet that there needs to be a relationship between multiple points to cluster them. It is just that you will likely attempt a programmatic solution such as in show.py that solves the same issue, or in the UI. But it is little different to storing it in the database as, for example, Point.RelatedPoints: ID[] (or, probably possible in PostGIS, dynamically query by distance). Neither does it change the fact that a single point can still have multiple reviews attached ot it.

For what it's worth, I think this could be right. This was kinda what I was getting at with this:

In the future, we'll probably display reviews on the same road or in the same gas station as one spot, but I certainly wouldn't want this reflected in our database (unless it's for caching purposes).

I wouldn't mind caching the result of the clustering in a table. After that's been done, we can query the clustering table much in the same way you're querying the points table in your implementation: as a source of spot marker locations.

The reason I don't like your implementation, is because you're conflating the review location and marker location. The marker's location and a review's location should be able to be different things. Also, the spot marker locations should be generated, not user data.

@bopjesvla
Copy link
Owner

bopjesvla commented Jan 11, 2025

Vue-Router allows anything. It allows hash based states, it allows dynamic regex matching for routes. There are a million more robust and flexible ways to solve what we need, and it also allows for the original solution.

Yeah, but you might as well implement routing yourself if you're going to roll your own regexes

Left this comment on Till's PR:

I think the syntax for a user page should be #filter,user:bob

In the future, I want to support combined routes using slashes like #filter,user:bob/51.4045468489593,5.53727030754089 meaning there's a filter active and a spot open. Multiple filters would be #filter,user:bob/filter,datetime:2024. Not sure slash is right since the order doesn't matter.

vue-router is not built for these query-parameter type of routes. It's easier to implement without vue-router than with, I know from experience. This is also the last I'll say on this.

@leon-wbr leon-wbr closed this Jan 16, 2025
@leon-wbr leon-wbr deleted the refactor-vuejs branch February 1, 2025 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move to petite-vue
3 participants