MyNQL is a minimalistic graph database based on the Python library NetworkX. Instead of replacing your relational database, it lets you add a network that references the data you already have.

Nodes have the format table.id

Connections have only one property: a distance
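To make the node format concrete, here is a minimal sketch of how a `table.id` string can be split into its two parts. The helper names `parse_node` and `format_node` are hypothetical and used only for illustration; they are not part of the MyNQL API.

```python
def parse_node(node):
    """Split a 'table.id' string into a (table, id) tuple."""
    # Split on the first dot only, so ids may themselves contain dots.
    table, sep, ident = node.partition(".")
    if not sep or not table or not ident:
        raise ValueError("expected 'table.id', got %r" % (node,))
    return table, ident


def format_node(table, ident):
    """Join a table name and an id back into 'table.id'."""
    return "%s.%s" % (table, ident)


print(parse_node("customer.103"))  # ('customer', '103')
```

The table name acts as the category of the node, and the id identifies the entry within that table.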

You may already have tables like customers, merchants, products, places, areas, promotions, and interests. Such tables usually have an id that, together with the table name, identifies each entry.

Once you have taught the MyNQL network the direct relations between entries (table1.id1 <-> table2.id2), you can also ask it about any indirect relations you want to know. A simple connect and select is all you need.

This is very simple, but also very powerful. You define a starting point and search for the closest matches in a desired table. As you add more connections, your queries stay the same; only the results improve. If you would like to see a real-life example, there is a small code sample for a computer store. The network can be serialized through peewee and stored in MySQL, PostgreSQL or SQLite.

Install

MyNQL's source code is hosted on GitHub.

    git clone https://github.com/livinter/MyNQL.git
    cd MyNQL
    python setup.py install

or just

    pip install MyNQL

Teach the Network

For example, if a customer makes a purchase of a product, you assume a relation between customer.id and product.id, so you connect them. Optionally, you can specify a distance between nodes to represent how closely they are related.

- connect: connect two nodes
- delete: delete a connection

Nodes are created automatically when you connect them, and removed when they no longer have any connections, so you do not need to manage them.
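The node lifecycle described above can be sketched with a plain dictionary. `TinyNetwork` is a hypothetical stand-in written for illustration only; it is not the MyNQL class, and it assumes symmetric distances.

```python
class TinyNetwork:
    """Hypothetical stand-in for illustration; not the MyNQL class."""

    def __init__(self):
        self.edges = {}  # node -> {neighbour: distance}

    def connect(self, a, b, distance=1.0):
        # Nodes appear as a side effect of connecting them.
        self.edges.setdefault(a, {})[b] = distance
        self.edges.setdefault(b, {})[a] = distance

    def delete(self, a, b):
        # Remove the connection in both directions, if present.
        self.edges.get(a, {}).pop(b, None)
        self.edges.get(b, {}).pop(a, None)
        # Drop nodes that have no connections left.
        for node in (a, b):
            if node in self.edges and not self.edges[node]:
                del self.edges[node]


net = TinyNetwork()
net.connect("customer.103", "product.12")
net.connect("customer.103", "product.7", distance=2.0)
net.delete("customer.103", "product.12")
print(sorted(net.edges))  # ['customer.103', 'product.7']
```

After the delete, product.12 has no neighbours left and disappears, while customer.103 survives because it is still connected to product.7.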

Ask the Network

Now you can query all kinds of relations, not only the ones you taught. With select you specify a starting point, like customer.id, and the category in which you want to find its closest relations.

- select: gives you the best related nodes from a specified category

The search takes into account all the different routes up to a radius you specify.

Example

Let's imagine we already have a table customer

    Id   Name   …
    101  jose   …
    102  maria  …
    103  juan   …

and you want to teach the network about recent purchases.

    from MyNQL import MyNQL

    mynql = MyNQL('store')
    mynql.connect('customer.juan', 'product.jeans')
    mynql.connect('customer.juan', 'product.socks')
    mynql.connect('customer.maria', 'product.socks')

If the column Name is unique, you can use it as a key; otherwise you would need the column Id, and your code would look like this:

    mynql.connect('customer.103', 'product.12')

Now you can ask questions from other points of view. You always specify a starting point and the category in which you want to know the best matches:

    >>> mynql.select('customer.maria', 'product')
    ['socks', 'jeans']

Maria is more closely connected to socks, as she has a direct connection, but also somewhat to jeans, as there is an indirect connection through Juan.

    >>> mynql.select('product.jeans', 'product')
    ['socks']

Any combination is valid. For example, you can ask how one product is related to others.
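To see why these results come out in this order, the following sketch reruns the example with a plain shortest-path search (Dijkstra) limited to a radius. It is a simplification under stated assumptions, not MyNQL's actual scoring: every connection has distance 1.0 in both directions, and only the single shortest route to each node counts. The standalone `select` and `connect` functions here are hypothetical helpers.

```python
import heapq


def select(edges, start, category, radius=3.0):
    """Return ids in `category` within `radius` of `start`, nearest first.

    Assumption: closeness is the shortest-path distance only.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, step in edges.get(node, {}).items():
            new_dist = dist + step
            if new_dist <= radius and new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    hits = [(d, n) for n, d in best.items()
            if n != start and n.split(".", 1)[0] == category]
    return [n.split(".", 1)[1] for d, n in sorted(hits)]


edges = {}


def connect(a, b, distance=1.0):
    # Assumption: the same distance applies in both directions.
    edges.setdefault(a, {})[b] = distance
    edges.setdefault(b, {})[a] = distance


connect("customer.juan", "product.jeans")
connect("customer.juan", "product.socks")
connect("customer.maria", "product.socks")

print(select(edges, "customer.maria", "product"))  # ['socks', 'jeans']
print(select(edges, "product.jeans", "product"))   # ['socks']
```

In this sketch Maria reaches socks at distance 1.0 and jeans at distance 3.0 (socks, then Juan, then jeans), so jeans only just falls inside the default radius of 3.0.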

Backend

Storage is done in memory, but if you want to use MySQL, SQLite or PostgreSQL as a backend, take a look at test/pee_example.py. This will keep a copy of all updates in your database.
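The general idea of keeping a copy of the network in a database can be sketched with the standard library alone. This is not the contents of test/pee_example.py and does not use peewee; the table layout (one row per node, with its neighbours stored as JSON) and the helper names `save_nodes` and `load_nodes` are assumptions made for illustration.

```python
import json
import sqlite3


def save_nodes(conn, edges):
    """Store each node and its neighbours as one JSON row (assumed layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node (key TEXT PRIMARY KEY, data TEXT)")
    for key, neighbours in edges.items():
        conn.execute(
            "INSERT OR REPLACE INTO node (key, data) VALUES (?, ?)",
            (key, json.dumps(neighbours)))
    conn.commit()


def load_nodes(conn):
    """Rebuild the in-memory dictionary from the stored rows."""
    rows = conn.execute("SELECT key, data FROM node")
    return {key: json.loads(data) for key, data in rows}


conn = sqlite3.connect(":memory:")
edges = {
    "customer.103": {"product.12": 1.0},
    "product.12": {"customer.103": 1.0},
}
save_nodes(conn, edges)
restored = load_nodes(conn)
print(restored == edges)  # True
```

Storing one row per node keeps each update small: when a connection changes, only the rows of the two affected nodes need to be rewritten.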

The “MyNQL” module

class MyNQL.MyNQL(db_name, serializer=None, log_file=None, log_level=40, backward_factor=0.5)

The MyNQL class.

log_level and log_file can be used to send debugging information to the screen or to a log file. For details regarding logging, refer to the Python logging library (the default level, 40, corresponds to logging.ERROR). The serializer allows you to save the data into a database. See the pee_example for reference.

connect(nodes1, nodes2, distance=1.0, distance_backward=None, rewrite=False)

Connect two nodes. If the relation already exists, its distance is reduced, so the nodes become more closely related. Nodes are created if they do not exist.

    >>> x = MyNQL("x").connect("table1.1", "table2.3")
    >>> x.G[("table1", "1")][("table2", "3")]
    {'distance': 1.0}
    >>> _ = x.connect("table1.1", "table2.3")
    >>> x.G[("table1", "1")][("table2", "3")]
    {'distance': 0.5}
    >>> _ = x.connect("table1.1", "table2.3", rewrite=True)
    >>> x.G[("table1", "1")][("table2", "3")]
    {'distance': 1.0}

Parameters:
- nodes1 (basestring) – this is a node as a tuple composed like (name/id, category)
- nodes2 (basestring) – this is a node as a tuple composed like (name/id, category)
- distance (float) – the smaller the distance, the more closely both nodes are related
- distance_backward (float) – distance from node 2 to node 1

Returns: None

delete(nodes1, nodes2)

Delete a connection. If a node no longer has any neighbours, it is deleted as well.

    >>> nql = MyNQL("x").connect("person.juan", "promo.promo1")
    >>> nql = nql.delete("person.juan", "promo.promo1")
    >>> nx.number_of_nodes(nql.G)
    0

Parameters:
- node1 – node 1
- node2 – node 2

Returns: None

get_categories()

All the categories that have been used so far.

    >>> MyNQL("x").connect("person.juan", "promo.promo1").get_categories()
    ['person', 'promo']

Returns: list of categories

get_distance(node1, node2, radius=3.0)

Get the relation between two nodes as a total distance.

Parameters:
- node1 – node 1
- node2 – node 2

Returns: total distance as float

load(typ='gexf', path='')

Load the complete network.

Parameters:
- typ – one of gmi, gexf, gpickle, graphml, yaml, node_link_data
- path – location of network file

Returns: None

load_serialized_node(key, json_node_data)

Used to load the network from a database.

Parameters:
- key –
- json_node_data –

Returns: None

plot()

Draw the graph using matplotlib.

Returns: None

save(typ='gexf', path='')

Save the network to disk.

Parameters:
- typ – one of gmi, gexf, gpickle, graphml, yaml, node_link_data
- path – location to save file

Returns:

select(nodes_1, category, radius=3.0, in_order=True, limit=None, value_only=True)

Select the best-matching nodes of a specific category, ordered by closeness to nodes_1. If value_only is True, only the IDs are returned; otherwise each result is a tuple that includes the closeness score: [(closeness, (node, id)), ...]. If no nodes are found, an empty list is returned.

Parameters:
- nodes_1 (str) – the starting node for calculating closeness
- category (str) – the result is reduced to only elements from a specific category
- radius (float) – limit the search to this radius
- in_order (bool) – sort the output so that the best relation comes first
- limit (int) – limit the number of results
- value_only – only return the id, without score

Returns: best matching nodes
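The connect doctest shows the distance going from 1.0 to 0.5 when the same connection is taught twice, and back to 1.0 with rewrite=True. The sketch below reproduces that behaviour with one combining rule that fits those numbers: adding distances like parallel resistors. This rule is an assumption chosen for illustration; the documentation does not state which formula MyNQL actually uses. The `combine` and `connect` functions and the flat `graph` dictionary are hypothetical.

```python
def combine(existing, new):
    """Combine two positive distances like parallel resistors (assumed rule)."""
    return 1.0 / (1.0 / existing + 1.0 / new)


def connect(graph, a, b, distance=1.0, rewrite=False):
    """Teach a connection; repeating it shortens the stored distance."""
    key = (a, b)
    if rewrite or key not in graph:
        # First contact, or an explicit overwrite: store the distance as given.
        graph[key] = distance
    else:
        # Repeated contact: the nodes move closer together.
        graph[key] = combine(graph[key], distance)
    return graph[key]


graph = {}
print(connect(graph, "table1.1", "table2.3"))                # 1.0
print(connect(graph, "table1.1", "table2.3"))                # 0.5
print(connect(graph, "table1.1", "table2.3", rewrite=True))  # 1.0
```

With a rule of this kind, every additional purchase by the same customer of the same product strengthens the relation without the distance ever reaching zero.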