Dynamic Variable Names #46

Open
hrishikeshrt opened this issue May 17, 2021 · 10 comments

@hrishikeshrt

Thank you for this package.
I wanted to know if the nodes can be specified by strings

For example, I want to write a query

MATCH (n) WHERE n.root = "Something"

Now, I can do this using
p.Match.node('n').Where.n.property('root') == "Something"
However, I have to do this for many variables for a longer pattern in MATCH, and the name n is generated dynamically,
for example it could be n1, n2, n3 etc depending on which iteration of loop it is in. Is it possible to achieve that?

I tried giving Where.raw('n') and Where._('n') but both insert a STATEMENT word in the query.

(Apologies if this is a stupid question)

@emehrkay
Owner

emehrkay commented May 17, 2021

Howdy,

I'm not sure that I fully understand. The node name, the n in your example, could be dynamic and named different things? If that is the case, you should be able to pass that in, like this:

some_var = 'n23'
p.match.node(some_var).where.property('root') == some_var


Also, the raw calls should not produce the word STATEMENT. I believe that was a bug from a few versions ago (maybe this one: https://github.com/emehrkay/Pypher/blob/master/CHANGELOG.md#0171----11262019). Try updating to the latest version.


Let me know if this fixes your issue or if I missed what you were asking

@hrishikeshrt
Author

hrishikeshrt commented May 17, 2021

Thank you for the prompt response.
A sample query is:

MATCH (n5)-[r1:`REL1`]->(n1), (n3)-[r2:`REL2`]->(n2), (n4)-[r3:`REL4`]->(n3), (n5)-[r4:`REL2`]->(n4) WHERE n1.`name` = "NAME1" AND n3.`name` = "NAME2" AND n4.`name` = "NAME4" RETURN *

Here, the number of node variables n1, n2, ... varies: for some queries it's n1, n2, n3; for others it's n1, n2, n3, n4, n5.
However, the same function has to generate the query.

My code is

p = pypher.Pypher()

# Example
# EDGES = [(1, 'REL1', 2), (1, 'REL2', 3), (2, 'REL3', 3)]

node_prefix = 'n'
relation_prefix = 'r'
edges = []
_rid = 0

for _src_id, relation, _dst_id in EDGES:
    _rid += 1
    src_var = f'{node_prefix}{_src_id}'
    dst_var = f'{node_prefix}{_dst_id}'
    rel_var = f'{relation_prefix}{_rid}'
    edge = pypher.__.node(src_var).rel_out(rel_var, relation).node(dst_var)
    edges.append(edge)

p.Match(*edges) 

Now, after this I want to add conditions. Conditions are of the form n1.name == 'NAME1' etc. This information comes from another dictionary, say NODES, so I want to do something like for node_var, node_name in NODES.items() and add multiple conditions joined by AND.

Earlier I was appending these to a list by extending anon class, and was getting STATEMENT
Now, I'm doing as follows,

first_condition = True
conditions = pypher.Pypher()

for _var, _name in NAMES.items():
    if first_condition:
        conditions.where(_var).property('name') == _name  # Even this seems to work, not sure if this is better or the next line
        # conditions.where._(_var).property('name') == _name
        first_condition = False
    else:
        conditions.And(_var).property('name') == _name

And after this, I call

p.Match(*edges)
p.append(conditions)
p.Return('*')

which completes the generation.

This almost works, except that it now produces queries like
AND "n3".name = "NAME3" AND "n4".name = "NAME4" (extra " around the node variables).

@emehrkay
Owner

emehrkay commented May 17, 2021

I was able to get STATEMENT to print out, I think that is an error with the Anon object. I will check into it.

I did manage to get something going based on what you want using the ConditionalAND (https://github.com/emehrkay/Pypher#conditionals) object

from pypher import Pypher, __

p = Pypher()

EDGES = [(1, 'REL1', 2), (1, 'REL2', 3), (2, 'REL3', 3)]
 
node_prefix = 'n'
relation_prefix = 'r'
edges = []
_rid = 0
 
for _src_id, relation, _dst_id in EDGES:
    _rid += 1
    src_var = f'{node_prefix}{_src_id}'
    dst_var = f'{node_prefix}{_dst_id}'
    rel_var = f'{relation_prefix}{_rid}'
    edge = __.node(src_var).rel_out(rel_var, relation).node(dst_var)
    edges.append(edge)
 
 
conditions = []
 
NAMES = {'first': 'mark', 'you': 'hrshikeshrt'}
_root = 'some root value'
 
 
for _var, _name in NAMES.items():
    conditions.append(__().raw(_var).property('name') == _root)
 
p.Match(*edges)
p.WHERE(__.ConditionalAND(*conditions))
p.Return('*')


In this example I used __(), which ends up creating a new Pypher object (https://github.com/emehrkay/Pypher/blob/master/pypher/builder.py#L1264) -- this is the root of the STATEMENT bug.

@hrishikeshrt
Author

hrishikeshrt commented May 17, 2021

Thank you again!
ConditionalAND works for now. That raises the question though, what if there are some ANDs and some ORs?

I'm wondering why we needed to do __(). instead of just __. ?

And on an unrelated note, is there a method that would give the final query with the bound parameters substituted in? (I was also wondering why they are separated out in the first place, but I'm sure there are use cases for that.)

@emehrkay
Owner

The bound params are separated out for a few reasons:

  1. It is safer. Typically the params are arguments provided by users. Think of POST/GET data from a browser. They could pass anything in, and without the argument being bound, that would allow Cypher injection (I'm not sure if I've seen that term before; it is the same idea as SQL injection).
  2. The neo4j server (or any db server) would create a template of your query and simply swap out the placeholder variables. If you feel that your query is going to be used a lot, I would recommend manually naming your variables so that the neo4j server has one copy of MATCH(n) where n.name = $name_var instead of having thousands of MATCH(n) where n.name = $NEO_random_string_xy123 etc.
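
As a rough illustration of both points, here is a plain-Python sketch that does not use Pypher at all. The function names, the $name placeholder and the malicious input are made up for the example; it only compares the query text that each approach would send.

```python
def unsafe_query(name):
    # The value is pasted straight into the query text, so whoever
    # supplies `name` also controls part of the Cypher statement.
    return 'MATCH (n) WHERE n.name = "{}" RETURN n'.format(name)


def bound_query(name):
    # The query text is fixed; the value travels separately as a parameter.
    return 'MATCH (n) WHERE n.name = $name RETURN n', {'name': name}


# Hypothetical hostile input, e.g. taken from a web form.
malicious = 'x" OR 1=1 WITH n DETACH DELETE n //'

unsafe_text = unsafe_query(malicious)
safe_text, safe_params = bound_query(malicious)
other_text, other_params = bound_query('Mark')

print(unsafe_text)   # the injected clause is now part of the statement
print(safe_text)     # unchanged, whatever the input was
print(safe_params)   # the hostile string is only ever a value
```

Because the bound version always produces the same text, the server also sees one query shape rather than a new one for every value, which is the caching benefit described in point 2.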

Yes, there is a ConditionalOR object. You could chain them together like this: p.Where.ConditionalAND(*cond).ConditionalOR(*ors), or nest them: p.Where(__().ConditionalAND(__.ConditionalOR(*ors), some_other_and_condition, *cond)), and so on. Play around with it in the tester.

I just looked into the __() STATEMENT bug. Anon.__getattr__ creates a new Pypher instance and calls getattr on it, even if that value is a function on the Pypher instance. That adds an unnamed Statement object to the linked list. When the Statement is turned into a string and it doesn't find a name, it simply upper-cases the class name, in this case Statement, which gives STATEMENT. Calling __() returns a Pypher instance, and when you then add .raw('values') to it, it calls the actual Pypher.raw() function, which adds a Raw object to the linked list. I'm still thinking about how to fix this.
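
To make the naming fallback concrete, here is a toy model of it. The Statement and Raw classes below are hypothetical stand-ins written for this illustration, not Pypher's actual classes; they only mimic the behaviour described above, where a link without a name renders as its upper-cased class name.

```python
class Statement:
    """Hypothetical stand-in for one link in a query chain."""

    def __init__(self, *args, name=None):
        self.args = args
        self.name = name

    def __str__(self):
        # With no explicit name, fall back to the upper-cased class name.
        label = self.name or type(self).__name__.upper()
        return " ".join([label, *map(str, self.args)])


class Raw(Statement):
    """Hypothetical raw link: renders its arguments verbatim, no keyword."""

    def __str__(self):
        return " ".join(map(str, self.args))


unnamed = str(Statement("n"))              # generic link created without a name
named = str(Statement("n", name="WHERE"))  # generic link with a name
raw = str(Raw("n"))                        # dedicated raw link

print(unnamed)
print(named)
print(raw)
```

In this model the stray word only appears when a generic, unnamed link is created where a dedicated Raw link was intended, which matches the difference between the two call styles discussed in this thread.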

@emehrkay
Owner

I have a testing tool that will substitute the variables. I use it for logging.

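from string import Template


def _query_debug(query, params):
    """Return the query with its bound params substituted in (logging only)."""
    if not params:
        return query

    fixed = {}

    for k, v in params.items():
        if isinstance(v, str):
            # quote strings so the logged query reads like literal Cypher
            v = "'{}'".format(v)
        elif v is None:
            v = "''"

        # keep other values (including 0 and False) as they are
        fixed[k] = v

    try:
        return Template(query).substitute(**fixed)
    except Exception:
        # substitution failed (e.g. a missing placeholder): log both pieces
        return '{} -- {}'.format(query, params)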
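
The helper relies on Cypher's $name placeholders having the same shape as the placeholders used by Python's string.Template. A small standalone check of that, with a made-up query and made-up parameter names:

```python
from string import Template

# Hypothetical query and params, shaped like a query string plus bound values.
query = "MATCH (n) WHERE n.name = $name AND n.age > $min_age RETURN n"
params = {"name": "'Mark'", "min_age": 30}

# substitute() fills every placeholder and raises KeyError if one is missing.
rendered = Template(query).substitute(params)

# safe_substitute() leaves unknown placeholders in place instead of raising.
partial = Template(query).safe_substitute({"name": "'Mark'"})

print(rendered)
print(partial)
```

This is only suitable for logging and debugging; the rendered string should not be sent to the database in place of the parameterised query.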

@hrishikeshrt
Author

Thank you again for your very quick and very helpful responses.
I'm currently replacing the bound parameters myself after calling str on the pypher object.

Does point 2 in your comment mean that there's a way of passing the query generated with str(p) directly to neo4j and letting it perform the replacements?

@emehrkay
Owner

Which library are you using to run queries in Python? It should have a way to pass in parameters.

And to do named params, you could do something like

name = Param(name='var_name', value='Unique Name')
p.Where(__.__name__ == name)

str(p) # where name == $var_name

@hrishikeshrt
Author

I plan to use py2neo.
I'll check it out. Thank you.

@emehrkay
Owner

py2neo is great. Here is the base query method that accepts params https://py2neo.org/2021.1/workflow.html#py2neo.Graph.run
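
A minimal sketch of how that could look, assuming `graph` is a py2neo Graph (or any object with a compatible run(cypher, parameters) method). The helper name find_by_name and the $name parameter are invented for this example.

```python
def find_by_name(graph, name):
    """Run a parameterised query and let the server bind the value.

    `graph` is assumed to expose run(cypher, parameters), as py2neo's
    Graph does; nothing is substituted into the query text here.
    """
    query = "MATCH (n) WHERE n.name = $name RETURN n"
    return graph.run(query, {"name": name})
```

With Pypher, the two arguments would presumably be str(p) and dict(p.bound_params) rather than a hand-written query and dictionary.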
