Row-level security / automatic filters #12

Open
mividtim opened this issue Feb 15, 2016 · 8 comments

Comments

@mividtim

It would be great if a client could do something like r.table('turtles').filter({name: 'Ralph'}), and the server would automatically inject the herdId filter (from your example) in between the .table and the .filter provided by the client, so that from a client's perspective, the only data in the DB is what is available to them. This could ease the burden not only on user-level security, but help to support multi-tenancy (e.g. filtering every table on tenantId).
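
A minimal sketch of the injection described above, assuming a hypothetical, simplified query model (a flat chain of `{op, args}` steps). This is not the ReQL wire format or any existing API of this project; `injectRowFilter` and the session shape are illustrative names only.

```javascript
// Hypothetical, simplified query model: a flat chain of {op, args} steps.
// This is NOT the ReQL wire format, only an illustration of the idea.
function injectRowFilter(chain, rowFilter) {
  const out = [];
  for (const step of chain) {
    out.push(step);
    if (step.op === 'table') {
      // Place the server-side filter directly after the table reference,
      // ahead of anything the client asked for.
      out.push({ op: 'filter', args: [rowFilter] });
    }
  }
  return out;
}

// Client sends the equivalent of r.table('turtles').filter({name: 'Ralph'}).
const clientQuery = [
  { op: 'table', args: ['turtles'] },
  { op: 'filter', args: [{ name: 'Ralph' }] },
];

const session = { herdId: 7 }; // assumed session shape
const secured = injectRowFilter(clientQuery, { herdId: session.herdId });
console.log(JSON.stringify(secured, null, 2));
```

The client's own filter is left untouched and simply runs on the already restricted rows.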

@mikemintz
Owner

@mividtim that would be really cool to have. Although I have a hard time picturing how to implement it in a way that generalizes to different schemas. For example, I might have an online bank app, and I'd generally want to filter Account, Transaction, Statement, and Message with .filter({userId: session.userId}). But maybe I'd like to display r.table('transaction').filter({date: ...}).count() on the front page for everyone to see the total volume, and I wouldn't want the automatic filter for that query. And it could also be tricky for things like ChatMessage which has two user ids.

Do you have ideas on what you think the API could look like?

@khoerling
Contributor

There's got to be, at least, an additional flag as part of the query, e.g. {validateAuthToken: true}, to designate which QueryRequests to filter, yes?

@mividtim
Author

@mikemintz I guess I was thinking something along the lines of an object with table names as keys and filters as values. Perhaps, to support the multi-tenancy use-case, accept '*' as a key for a catch-all filter across all tables. Perhaps also, the keys could support a comma-separated list of table names, so one filter can be applied across a group of tables, but not all tables.
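
A rough sketch of how such a configuration object might be resolved for one table, assuming plain objects as filter values. `filtersForTable` is a hypothetical helper, not something this library provides.

```javascript
// Hypothetical helper: collect every configured filter that applies to a table.
// Keys may be '*', a single table name, or a comma-separated list of names.
function filtersForTable(config, tableName) {
  const matches = [];
  for (const [key, filter] of Object.entries(config)) {
    const names = key.split(',').map((name) => name.trim());
    if (names.includes('*') || names.includes(tableName)) {
      matches.push(filter);
    }
  }
  return matches;
}

// Assumed example configuration (values are illustrative).
const config = {
  '*': { tenantId: 42 },                    // catch-all, e.g. multi-tenancy
  'account,transaction': { userId: 'u1' },  // one filter shared by a group
};

console.log(filtersForTable(config, 'account')); // both filters apply
console.log(filtersForTable(config, 'chat'));    // only the catch-all applies
```

Returning a list rather than a single filter lets the catch-all and a table-specific filter stack.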

@mividtim
Author

@khoerling I'm not sure I understand this question. There is an existing piece of functionality for authenticating a session, and for adding user-specific information to the session, which is made available to the whitelists. Does this not cover "validateAuthToken" in your question?

@mividtim
Author

@mikemintz I think what would make this complicated to implement (default filters for tables) would be joins. I haven't dug into the internals of Rethink enough (yet) to understand this. But what I'm trying to accomplish on this thread is to see whether or not we can describe a system of row- and column-level security in the middle layer, such that not every filter would need to be whitelisted. Throttling is another use case that could be implemented to support a really robust system. My goal here is to minimize the amount of duplicated work in describing queries. Anything short of that wouldn't quite transcend the status quo of implementing detailed queries at the server-side-app layer, and providing an API to the client to wrap those queries. Having to duplicate the query on the client in full actually adds code to the whole, rather than cutting it down. With a good way to describe column- and row-level filters at the middle layer, this system would allow less code overall, and thus higher productivity.

@mividtim
Author

@mikemintz To use your example, and answer your question more directly, the filter description would boil down to this object:
{
'account,transaction,statement,message': RP.filter({userId: session.userId})
}
Your use case of providing a summary across all users is interesting, and perhaps this is what the whitelist specifically describes - a way to break out of the bounds of the automatic filters.

Perhaps we need yet another layer involved here, that declares roles in the system, and which filters apply to which roles. For instance, a system administrator would have no filters (query anything you like), a tenant admin would have only the tenancy filter ('*': RP.filter({tenantId: session.tenantId})), but end-users would have filters on their own user ID. Other roles could exist between tenant admin and end-user, as well, like branch managers. These would have to be described somehow in configuration, or by the server-app author providing a closure to this library that is passed a session object and returns a list of filters that apply to that role.
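
One possible shape for that closure, as a sketch only. The role names, the `session.role` field, and the choice to keep end-users inside their tenant as well are all assumptions; filter values are shown as plain objects rather than RP.filter(...) calls so the example runs on its own.

```javascript
// Hypothetical closure supplied by the server-app author:
// takes a session, returns the filter map that applies to its role.
function filtersForSession(session) {
  switch (session.role) {
    case 'sysadmin':
      // No automatic filters: query anything you like.
      return {};
    case 'tenantAdmin':
      // Only the tenancy filter, applied to every table.
      return { '*': { tenantId: session.tenantId } };
    default:
      // End-users: filtered on their own user ID.
      // Assumption: they are also confined to their tenant.
      return {
        '*': { tenantId: session.tenantId },
        'account,transaction,statement,message': { userId: session.userId },
      };
  }
}

console.log(filtersForSession({ role: 'sysadmin' }));
console.log(filtersForSession({ role: 'tenantAdmin', tenantId: 3 }));
console.log(filtersForSession({ role: 'user', tenantId: 3, userId: 'u1' }));
```

Intermediate roles such as branch managers would be further `case` branches returning their own filter maps.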

@mikemintz
Owner

@mividtim I agree this would be complicated to implement with joins. Are you thinking maybe we'd only do this on queries that follow a formulaic syntax like r.table(...).filter(...).orderBy(...).limit(...) and require any other query to be in the whitelist? I'd also be concerned there that I didn't fully understand what reql allows, and someone malicious maybe embeds something like r.table('users').filter({name: r.table('transactions').insert(...)})
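
To illustrate the embedded-write concern, here is a sketch of a recursive check over a hypothetical nested query tree, where each term is `{op, args}` and the first argument is the term being chained on. This is an assumed model, not the real ReQL representation, and the list of write operations is illustrative rather than exhaustive.

```javascript
// Assumed, non-exhaustive set of operations that modify data.
const WRITE_OPS = new Set(['insert', 'update', 'replace', 'delete']);

// A "term" in this hypothetical model is an object with a string op and an args array.
function isTerm(value) {
  return value !== null && typeof value === 'object' &&
    typeof value.op === 'string' && Array.isArray(value.args);
}

// Returns true if a write operation appears anywhere in the tree,
// including inside object values and arrays used as arguments.
function containsWrite(value) {
  if (isTerm(value)) {
    return WRITE_OPS.has(value.op) || value.args.some(containsWrite);
  }
  if (Array.isArray(value)) return value.some(containsWrite);
  if (value !== null && typeof value === 'object') {
    return Object.values(value).some(containsWrite);
  }
  return false;
}

// Equivalent of r.table('users').filter({name: r.table('transactions').insert(...)})
const malicious = {
  op: 'filter',
  args: [
    { op: 'table', args: ['users'] },
    { name: { op: 'insert', args: [{ op: 'table', args: ['transactions'] }, { amount: 1 }] } },
  ],
};
const benign = {
  op: 'filter',
  args: [{ op: 'table', args: ['users'] }, { name: 'Ralph' }],
};

console.log(containsWrite(malicious), containsWrite(benign));
```

A check like this only helps if every place a subquery can hide is walked, which is exactly the part that depends on fully understanding what ReQL allows.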

Either way, doing something like this will first require support for modifying client queries before sending them off to the server, which would be nice to support in general. But that can have unintended consequences that we'll have to figure out. For example, an error in ReQL might send the replaced query back to the browser in the error message, potentially with sensitive information they weren't supposed to have.

@mividtim
Author

I think I'd like to study a bit more about ReQL. If we're going to go down the road of altering the query, then I believe it would make sense to go all the way down the rabbit hole, parsing and understanding the query fully, and introducing automated filters as needed at any point in the query (at any table reference), and also whitelisting subqueries. If we fully parse the entire query (much as RethinkDB itself must do) on the way through, I believe we could accomplish this. It would not, certainly, be trivial.
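
As a sketch of what "at any table reference" could mean, the following walks a hypothetical nested query tree and wraps every table term in a filter, including tables that appear inside joins or subqueries. The `{op, args}` tree, `secureQuery`, and `filterFor` are assumptions made for illustration; a real implementation would have to work on the actual ReQL term format.

```javascript
// A "term" in this hypothetical model is an object with a string op and an args array.
function isTerm(value) {
  return value !== null && typeof value === 'object' &&
    typeof value.op === 'string' && Array.isArray(value.args);
}

// Rebuild the tree, wrapping each table reference in its automatic filter.
// filterFor(tableName) returns a filter object, or null for unfiltered tables.
function secureQuery(value, filterFor) {
  if (isTerm(value)) {
    const rewritten = {
      op: value.op,
      args: value.args.map((arg) => secureQuery(arg, filterFor)),
    };
    if (value.op === 'table') {
      const rowFilter = filterFor(value.args[0]);
      if (rowFilter) return { op: 'filter', args: [rewritten, rowFilter] };
    }
    return rewritten;
  }
  if (Array.isArray(value)) return value.map((v) => secureQuery(v, filterFor));
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, secureQuery(v, filterFor)])
    );
  }
  return value;
}

// Equivalent of r.table('account').eqJoin('id', r.table('transaction'))
const join = {
  op: 'eqJoin',
  args: [
    { op: 'table', args: ['account'] },
    'id',
    { op: 'table', args: ['transaction'] },
  ],
};
const securedJoin = secureQuery(join, () => ({ userId: 'u1' }));
console.log(JSON.stringify(securedJoin, null, 2));
```

Both sides of the join end up filtered, which is the behavior a chain-only rewrite would miss.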

