[WIP] Restructure #167

Closed
wants to merge 7 commits into from

kiliankoe
Member

@kiliankoe kiliankoe commented Aug 10, 2018

This is still very much a work in progress without a lot of progress. I spent some time in the last few days rethinking the basic db (and app) structure. Not sure how far I'll get in the next few days, so I'm dumping this here for now.

Basic ideas:

  • The old database schema is a product of my laziness and uses PostgreSQL like a NoSQL database, which is super weird and causes lots of issues. This PR amends that.
  • The old scraper and server are much too intertwined and should be split apart, maybe even throwing away the entire server in the process and replacing it with something more sensible.
  • The API responses are weirdly structured and should be optimized. The old format will obviously still have to be supported indefinitely for the apps.
  • The city data and modules should leave this repository (@jklmnn already did that) and the data should be updated regularly. The geojson files are already out of date and contain wildly inconsistent data.
  • Deployment should be optimized, ideally through the use of a Docker image/compose file, see Docker image #103.

This PR will likely stay open for a while, so please unsubscribe if you don't want to get GitHub notifications for every update 🙈 I apologize in advance! 😅 I'm opening it instead of just leaving a branch so that changes can be discussed here.

@kiliankoe
Member Author

kiliankoe commented Aug 10, 2018

The biggest change for now is the proposed db structure. It can be found here:

CREATE TABLE sources (
    id serial PRIMARY KEY,
    name text NOT NULL UNIQUE,
    attribution_contributor text,
    attribution_license text,
    attribution_url text,
    url text,
    source_url text NOT NULL,
    latitude double precision,
    longitude double precision,
    has_active_support boolean NOT NULL,
    created_at timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at timestamp with time zone NOT NULL -- TODO: automatically update these via triggers
);

CREATE TYPE lot_type AS ENUM ('underground', 'lot', 'carpark');

CREATE TABLE lots (
    id serial PRIMARY KEY,
    name text NOT NULL,
    address text,
    region text,
    city text,
    country text,
    -- coordinates geography, -- is this an option via postgis?
    latitude double precision,
    longitude double precision,
    type lot_type,
    has_forecast boolean NOT NULL,
    detail_url text,
    total_spaces integer,
    source_id integer NOT NULL,
    FOREIGN KEY (source_id) REFERENCES sources(id),
    pricing text,
    opening_hours text,
    additional_info text,
    created_at timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at timestamp with time zone NOT NULL -- TODO: automatically update these via triggers
);

CREATE TYPE lot_state AS ENUM ('open', 'closed', 'no_data');

CREATE TABLE data (
    id serial PRIMARY KEY,
    lot_id integer NOT NULL,
    FOREIGN KEY (lot_id) REFERENCES lots(id),
    free_count integer NOT NULL,
    total_count integer,
    state lot_state,
    timestamp_downloaded timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
    timestamp_data_age timestamp with time zone
);

CREATE TABLE pools (
    id serial PRIMARY KEY,
    name text NOT NULL UNIQUE,
    type text,
    created_at timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at timestamp with time zone NOT NULL -- TODO: automatically update these via triggers
);

CREATE TABLE pools_lots (
    pool_id integer,
    lot_id integer,
    FOREIGN KEY (pool_id) REFERENCES pools(id),
    FOREIGN KEY (lot_id) REFERENCES lots(id),
    PRIMARY KEY (pool_id, lot_id)
);

-- Materialized Data Views, create one for each city/data-source
CREATE MATERIALIZED VIEW data_dresden AS
SELECT DISTINCT ON (lots.id)
    lots.id, lots.name, data.timestamp_downloaded, data.timestamp_data_age,
    data.state, data.free_count, data.total_count
FROM
    lots
    JOIN data ON lots.id = data.lot_id
WHERE lots.city = 'Dresden'
ORDER BY
    lots.id, data.timestamp_downloaded DESC;

CREATE UNIQUE INDEX lot_id ON data_dresden (id); -- create unique indices so that refreshing can be done concurrently
-- REFRESH MATERIALIZED VIEW CONCURRENTLY data_dresden;
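
As a rough sketch of how the materialized view above might be used: a scraper run appends a row to data, the view is refreshed, and the API reads the latest value per lot from the view. The lot id and the counts below are made-up illustration values, and the sketch assumes a lot with id 1 already exists in lots with city = 'Dresden'.

```sql
-- Hypothetical scraper write: lot 1 reports 42 of 120 spaces free.
-- (The id and the numbers are illustration values, not real data.)
INSERT INTO data (lot_id, free_count, total_count, state, timestamp_data_age)
VALUES (1, 42, 120, 'open', CURRENT_TIMESTAMP);

-- Rebuild the view without blocking readers. CONCURRENTLY relies on the
-- unique index created on data_dresden (id).
REFRESH MATERIALIZED VIEW CONCURRENTLY data_dresden;

-- Read the most recent data point per lot, as an API handler might.
SELECT id, name, state, free_count, total_count, timestamp_downloaded
FROM data_dresden
ORDER BY name;
```

Readers only see the new row after the refresh, so the refresh would presumably run at the end of every scraper pass.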

There are (at least) two todos still open there: updating the created_at and updated_at fields with a trigger on the respective tables, and automatically creating pools (see #130) for every source.
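
One possible way to handle the timestamp todo, as a minimal sketch: a shared PL/pgSQL trigger function that stamps updated_at on every insert and update. The function name set_updated_at and the trigger names are assumptions made for this example and are not part of the PR; created_at is already covered by its DEFAULT.

```sql
-- Hypothetical helper (name is an assumption): stamps updated_at on each row.
CREATE OR REPLACE FUNCTION set_updated_at() RETURNS trigger AS $$
BEGIN
    NEW.updated_at := CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- BEFORE row triggers fire before the NOT NULL check, so an INSERT that
-- omits updated_at still succeeds.
CREATE TRIGGER sources_set_updated_at
    BEFORE INSERT OR UPDATE ON sources
    FOR EACH ROW EXECUTE PROCEDURE set_updated_at();

CREATE TRIGGER lots_set_updated_at
    BEFORE INSERT OR UPDATE ON lots
    FOR EACH ROW EXECUTE PROCEDURE set_updated_at();

CREATE TRIGGER pools_set_updated_at
    BEFORE INSERT OR UPDATE ON pools
    FOR EACH ROW EXECUTE PROCEDURE set_updated_at();
```

EXECUTE PROCEDURE is used because it is accepted by both older and newer PostgreSQL releases; newer ones also accept EXECUTE FUNCTION.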

I'd also love to use PostGIS for the coordinate fields, but don't have a clue how to query against that 🙈
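
For what it's worth, here is a hedged sketch of what querying against PostGIS could look like, assuming the extension can be installed and a geography column named coordinates is added to lots (mirroring the commented-out line in the schema). The index name and the search point, roughly central Dresden, are illustration values.

```sql
CREATE EXTENSION IF NOT EXISTS postgis;

-- Assumed column, following the commented-out "coordinates geography" idea.
ALTER TABLE lots ADD COLUMN coordinates geography(Point, 4326);

-- Backfill from the existing columns. Note the argument order: longitude first.
UPDATE lots
SET coordinates = ST_SetSRID(ST_MakePoint(longitude, latitude), 4326)::geography
WHERE latitude IS NOT NULL AND longitude IS NOT NULL;

-- A GiST index keeps radius searches fast (index name is an assumption).
CREATE INDEX lots_coordinates_idx ON lots USING GIST (coordinates);

-- All lots within 1000 m of an example point, nearest first.
-- For geography values, ST_DWithin and ST_Distance work in meters.
SELECT id, name,
       ST_Distance(coordinates,
                   ST_SetSRID(ST_MakePoint(13.7373, 51.0504), 4326)::geography) AS distance_m
FROM lots
WHERE ST_DWithin(coordinates,
                 ST_SetSRID(ST_MakePoint(13.7373, 51.0504), 4326)::geography,
                 1000)
ORDER BY distance_m;
```

The latitude and longitude columns could stay alongside the new column for a while so the old API format does not have to change at the same time.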

@jklmnn
Member

jklmnn commented Aug 12, 2018

I just want to state that I have not yet removed the modules from the main repo completely. There are still unresolved dependencies. You can use the modules from https://github.com/offenesdresden/ParkApi_Modules with my ParkAPI fork https://github.com/offenesdresden/ParkAPI/tree/external_modules, but you cannot use the modules on their own, as they still require dependencies that are only available in the ParkAPI repo. This is surely something that needs to be fixed.

@kiliankoe
Member Author

kiliankoe commented Apr 24, 2019

[Image: relationships real large-2]

Dumping this here so that I don't lose it.

@jklmnn jklmnn mentioned this pull request Apr 24, 2019
2 tasks
@kiliankoe
Member Author

Closing in favor of a new approach.

@kiliankoe kiliankoe closed this Aug 15, 2019