RELATE
statementThe RELATE
statement can be used to generate graph edges between two records in the database.
This allows you to traverse related records efficiently without needing to pull data from multiple tables and merging that data together using SQL JOINs.
Edges created using the RELATE statement are nearly identical to tables created using other statements, and can contain data.
The key differences are that:
in
and out
, which specify the directions of the relationships. These cannot be modified in schema declarations except to specify that they must be of a certain record type or to add assertions.Otherwise, edge tables behave like normal tables in terms of updating, defining a schema or indexes.
Record links are the alternative option when connecting data, and simply consist of a field with record IDs that serve as uni-directional links. The key differences are that graph relations
SurrealQL SyntaxRELATE [ ONLY ] @from_record -> @table -> @to_record [ CONTENT @value | SET @field = @value ... ] [ RETURN NONE | RETURN BEFORE | RETURN AFTER | RETURN DIFF | RETURN @statement_param, ... ] [ TIMEOUT @duration ] [ PARALLEL ] ;
Note
RELATE
will create a relation regardless of whether the records to relate to exist or not. As such, it is advisable to create the records you want to relate to before usingRELATE
, or to at least ensure that they exist before making a query on the relation. You can do this using the CREATE statement. If the records to relate to don’t exist, a query on the relation will still work without an error but will return an empty array. To override this default behaviour and return an error if no records exist to relate, you can add theENFORCED
keyword to aDEFINE TABLE
statement.
The following query shows the basic structure of the RELATE
statement, which creates a relationship between a record in the person table and a record in the article table.
RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm;
Response[ { "id": "wrote:ctwsll49k37a7rmqz9rr", "in": "person:l19zjikkw1p1h9o6ixrg", "out": "article:8nkk6uj4yprt49z7y3zm" } ]
There is no relationship information stored in either the person or article table.
Instead, an edge table (in this case a table called wrote
) stores the relationship information. The structure in -> id -> out
mirrors the record IDs from the RELATE
statement, with the addition of the automatically generated ID for the wrote
edge table.
By default, the edge table gets created as a schemaless table when you execute the RELATE
statement. You can however make the table schemafull by either defining a schema before or after executing the RELATE statement.
A common use case is to make sure only unique relationships get created. You can do that by defining an index.
DEFINE INDEX unique_relationships ON TABLE wrote COLUMNS in, out UNIQUE;
Edge tables are bi-directional by default. To enforce unidirectional relationships, you can restrict the type definition using field definition.
DEFINE FIELD in ON TABLE wrote TYPE record<person>; DEFINE FIELD out ON TABLE wrote TYPE record<article>;
A RELATE
statement that involves an array instead of a single id will not fail. Instead, it will create a graph edge for each record in the array:
INSERT INTO cat (id) VALUES ("mr_meow"), ("mrs_meow"), ("kitten"); RELATE [cat:mr_meow, cat:mrs_meow]->parent_of->cat:kitten;
Response[ { "id": "parent_of:7ypx27tgoa9lrsa4s6fm", "in": "cat:mr_meow", "out": "cat:kitten" }, { "id": "parent_of:mw08ri5e1jgh78wp1c1p", "in": "cat:mrs_meow", "out": "cat:kitten" } ]
Similarly, a RELATE
statement that involves two arrays will return a number of graph edges equal to their product (in this case 2 * 2):
CREATE cat:kitten2; RELATE [cat:mr_meow, cat:mrs_meow]->parent_of->[cat:kitten, cat:kitten2];
Response[ { "id": "parent_of:kvgu19yulq347b8xksxw", "in": "cat:mr_meow", "out": "cat:kitten" }, { "id": "parent_of:xxvj5pkxci3k9bqm6br9", "in": "cat:mr_meow", "out": "cat:kitten2" }, { "id": "parent_of:e4xwvecwwm6sxq1xe0yr", "in": "cat:mrs_meow", "out": "cat:kitten" }, { "id": "parent_of:ub9t2mm948jfuup6r6c4", "in": "cat:mrs_meow", "out": "cat:kitten2" } ]
SET
and CONTENT
Graph edges are standalone tables that can hold other fields besides the default in
, out
, and id
. These can be added during a RELATE
statement or during an UPDATE
in the same manner as any other SurrealDB table.
Let’s look at the two ways you can add record data in the RELATE
statement. Both of these queries will produce the same result.
RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now(); RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm CONTENT { time: { written: time::now() } };
Here is an example of a graph edge being updated in the same way as any other SurrealDB record:
-- Add a small synopsis composed of the table name and article ID UPDATE wrote SET description = meta::tb(out) + ' written by ' + <string>in;
Response{ "id": "wrote:s51jjewpw953ahysbhak", "in": "person:l19zjikkw1p1h9o6ixrg", "out": "article:8nkk6uj4yprt49z7y3zm", "description": "article written by person:l19zjikkw1p1h9o6ixrg" }
CONTENT
and SET
Available since: v1.5.0
You can also pass variables in the CONTENT
block. This is useful when you want to pass a variable that is not a record ID.
LET $time = time::now(); RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm CONTENT { time: { written: $time } };
Response{ "id": "wrote:ctwsll49k37a7rmqz9rr", "in": "person:l19zjikkw1p1h9o6ixrg", "out": "article:8nkk6uj4yprt49z7y3zm", "time": { "written": "2021-09-29T14:00:00Z" } }
Below is an example of how you can pass a variable in the SET
block:
LET $time = time::now(); RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = $time;
Response{ "id": "wrote:ctwsll49k37a7rmqz9rr", "in": "person:l19zjikkw1p1h9o6ixrg", "out": "article:8nkk6uj4yprt49z7y3zm", "time": { "written": "2021-09-29T14:00:00Z" } }
ONLY
keywordUsing the ONLY keyword, just an object for the relation in question will be returned. This, instead of an array with a single object.
RELATE ONLY person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm;
Let’s look at how the above gets queried, for more examples see the SELECT docs.
For the questions below, each of the queries will give you the same answer. Note that whether ->
and <-
are parsed as in
or out
depends on their direction in relation to the graph edge wrote
. An arrow pointing towards wrote
corresponds to in
, and vice versa.
The following examples show how to make similar queries in a number of different ways, in the context of person
records that wrote
one or more article
s.
Who wrote the articles?
SELECT in FROM wrote; SELECT ->wrote.in FROM person; SELECT ->wrote<-person FROM person; SELECT <-wrote.in FROM article; SELECT <-wrote<-person FROM article;
Which articles did the person write?
SELECT out FROM wrote; SELECT ->wrote.out FROM person; SELECT ->wrote->article FROM person; SELECT <-wrote.out FROM article; SELECT <-wrote->article FROM article;
When was the article written?
SELECT time.written as time_written FROM wrote; SELECT ->wrote.time.written as time_written FROM person; SELECT <-wrote.time.written as time_written FROM article;
For a more complicated query like the one below you can use a simple rule of thumb: Place the subject in front of the graph selection, then read it backward.
-- This query SELECT ->purchased->product<-purchased<-person->purchased->product FROM person:tobie -- Then becomes person:tobie->purchased->product<-purchased<-person->purchased->product SELECT
Reading this backwards then makes more sense:
Select every product that was purchased by a person who purchased a product that was also purchased by person Tobie.
Alternatively, you can break it down into steps.
-- Starting with Tobie person:tobie -- and his purchased products ->purchased->product -- that were also purchased by persons <-purchased<-person -- what are all of those persons' purchased products? ->purchased->product
Putting it all together it would be: based on all the products Tobie purchased, which person also purchased those products and what did they purchase?
You can use expressions to filter relations based on specific conditions using the WHERE
clause.
For example, suppose we want to limit the query to only take recent purchases into account. We can filter purchased
graph edge to only include purchases made in last 3 weeks:
-- Select products purchased by people in the last 3 weeks who have purchased the same products that tobie purchased SELECT ->purchased->product<-purchased<-person->(purchased WHERE created_at > time::now() - 3w)->product FROM person:tobie;
Sometimes a relation is such that it is impossible to determine which record is located at the in
part of a graph table, and which is located at the out
part. This is the case when a relationship is truly bidirectional and equal, such as a friendship, marriage, or sister cities:
CREATE city:calgary, city:daejeon; RELATE city:calgary->sister_of->city:daejeon; SELECT id, ->sister_of->city AS sister_cities FROM city;
In such a case, a query on the relationship makes it appear as if one city has a twin city but the other does not.
[ { id: city:calgary, sister_cities: [ city:daejeon ] }, { id: city:daejeon, sister_cities: [] } ]
To solve this, we can use the <->
operator instead of ->
. Using <->
will access both the in
and out
fields, instead of just one.
SELECT id, <->sister_of<->city AS sister_cities FROM city;
This brings up another issue in which a city now appears to be a sister city of itself.
[ { id: city:calgary, sister_cities: [ city:calgary, city:daejeon ] }, { id: city:daejeon, sister_cities: [ city:calgary, city:daejeon ] } ]
Here we can use the array::complement
function to return only items from one array that are not present in another array.
SELECT id, array::complement(<->sister_of<->city, [id]) AS sister_cities FROM city;
Response[ { id: city:calgary, sister_cities: [ city:daejeon ] }, { id: city:daejeon, sister_cities: [ city:calgary ] } ]
When using RELATE
for this sort of relationship, you may want to define a unique key based on the ordered record IDs involved.
DEFINE FIELD key ON TABLE sister_of VALUE <string>array::sort([$this.in, $this.out]); DEFINE INDEX only_one_sister_city ON TABLE sister_of FIELDS key UNIQUE;
With the index in place, a relation set from one record to the other now cannot be created a second time.
RELATE city:calgary->sister_of->city:daejeon; -- OK RELATE city:daejeon->sister_of->city:calgary; -- "Database index `only_one_sister_city` already contains '[city:calgary, city:daejeon]', with record `sister_of:npab0uoxogmrvpwsvfoa`"
in
and out
fields of a relationAs mentioned above, the in
and out
fields of a graph table are mandatory but can be modified to specify their record type or make assertions.
Thus, the following field declarations will work:
DEFINE FIELD in ON TABLE wrote TYPE record<author>; DEFINE FIELD out ON TABLE wrote TYPE record<book>;
But this will be ignored:
DEFINE FIELD in ON TABLE wrote TYPE string; DEFINE FIELD out ON TABLE wrote TYPE int;
An example of an assertion on one of the fields of a record table for a library which is not yet ready to handle non-English books:
DEFINE FIELD out ON TABLE wrote TYPE record<book> ASSERT $value.language = "English"; CREATE book:demian SET title = "Demian. Die Geschichte von Emil Sinclairs Jugend", language = "German"; CREATE author:hesse SET name = "Hermann Hesse"; RELATE author:hesse->wrote->book:demian;
Output"Found book:demian for field `out`, with record `wrote:le3ntnyv2lb943km9gxk`, but field must conform to: $value.language = 'English'"
Using an alias is a common practice in both regular and relation queries in SurrealDB to make output more readable and collapse nested structures. You can create an alias using the AS
clause.
CREATE cat:one, cat:two, cat:three; RELATE cat:one->friends_with->cat:two; RELATE cat:two->friends_with->cat:three; SELECT ->friends_with->cat->friends_with->cat FROM cat:one; -- create an alias for the result using the `AS` clause. SELECT ->friends_with->cat->friends_with->cat AS friends_of_friends FROM cat:one;
// Output without alias { "->friends_with": { "->cat": { "->friends_with": { "->cat": [ cat:three ] } } } } // Output with alias { friends_of_friends: [ cat:three ] }
However, an alias might not be preferred in a case where you have multiple graph queries that resolve to the fields of a large nested structure. Take the following data for example:
CREATE country:usa SET name = "USA"; CREATE state:pennsylvania SET population = 12970000; CREATE state:michigan SET population = 10030000; CREATE city:philadelphia, city:pittsburgh, city:detroit, city:grand_rapids; RELATE country:usa->contains->[state:pennsylvania, state:michigan]; RELATE state:pennsylvania->contains->[city:philadelphia, city:pittsburgh]; RELATE state:michigan->contains->[city:detroit, city:grand_rapids];
A query on the states and cities of these records using aliases would return the data in a structure remade to fit the aliases declared in the query.
SELECT name, ->contains->state AS states, ->contains->state->contains->city AS cities FROM country:usa;
Output[ { cities: [ city:philadelphia, city:pittsburgh, city:grand_rapids, city:detroit ], name: 'USA', states: [ state:pennsylvania, state:michigan ] } ]
However, opting to not use an alias will return the original graph structure which makes the levels of depth of the query clearer. In addition, the population
field is clearly the population for the states.
SELECT id, ->contains->state.id, ->contains->state.population, ->contains->state->contains->city.id FROM country:usa;
Output[ { "->contains": { "->state": { "->contains": { "->city": { id: [ city:philadelphia, city:pittsburgh, city:grand_rapids, city:detroit ] } }, id: [ state:pennsylvania, state:michigan ], population: [ 12970000, 10030000 ] } }, id: country:usa } ]
As the query that uses aliases does not maintain the original graph structure, adding population
would require clever renaming such as ->contains->state.population AS state_populations
to make it clear that the numbers represent state and not city populations.
LET
parameters in RELATE statementsYou can also use parameters to specify the record IDs.
-- These two statements store the result of the subquery in a parameter -- The subquery returns an array of IDs LET $person = (SELECT VALUE id FROM person); LET $article = (SELECT VALUE id FROM article); -- This statement creates a relationship record for every combination of Record IDs -- Such that if we have 10 records each in the person and article table -- We get 100 records in the wrote edge table (10*10 = 100) -- In this case it would mean that each article would have 10 authors RELATE $person->wrote->$article SET time.written = time::now();
RETURN
clauseBy default, the relate statement returns the record value once the changes have been made. To change the return value of each record, specify a RETURN clause, specifying either NONE
, BEFORE
, AFTER
, DIFF
, or a comma-separated list of specific fields to return.
-- Don't return any result RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN NONE; -- Return the changeset diff RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN DIFF; -- Return the record before changes were applied RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN BEFORE; -- Return the record after changes were applied (the default) RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN AFTER; -- Return a specific field only from the updated records RELATE person:l19zjikkw1p1h9o6ixrg->wrote->article:8nkk6uj4yprt49z7y3zm SET time.written = time::now() RETURN time;
TIMEOUT
clauseWhen processing a large result set with many interconnected records, it is possible to use the TIMEOUT
keyword to specify a timeout duration for the statement. If the statement continues beyond this duration, then the transaction will fail, and the statement will return an error.
-- Cancel this conditional filtering based on graph edge properties -- if it's not finished within 5 seconds SELECT * FROM person WHERE ->knows->person->(knows WHERE influencer = true) TIMEOUT 5s;
PARALLEL
clauseWhen processing a large result set with many interconnected records, it is possible to use the PARALLEL
keyword to signify that edges and remote records should be fetched and processed in parallel. This can be useful when you want to process a large result set in a short amount of time.
-- Fetch and process the person, purchased and product targets in parallel -- Select every product that was purchased by a person that purchased a product that person tobie also purchased SELECT ->purchased->product<-purchased<-person->purchased->product FROM person:tobie PARALLEL;
You can also delete graph edges between two records in the database by using the DELETE statement.
For example the graph edge below:
RELATE person:tobie->bought->product:iphone;
Response[ { "id": "bought:ctwsll49k37a7rmqz9rr", "in": "person:tobie", "out": "product:iphone" } ]
Can be deleted by:
DELETE person:tobie->bought WHERE out=product:iphone;
As mentioned above, a graph edge will also automatically be deleted if it is no longer connected to a record at both in
and out
.
-- Create three people CREATE person:one, person:two, person:three; -- And a love triangle involving them all RELATE person:one->likes->person:two; RELATE person:two->likes->person:three; RELATE person:three->likes->person:one; -- Person two moves to Venus permanently, so delete DELETE person:two; -- Only one `likes` relationship is left SELECT * FROM likes;
Output[ { id: likes:55szjin5yfqwl4sbmy1f, in: person:three, out: person:one } ]
Imagine a database that holds detailed information on the relations between NPCs in a game that are made to be as realistic as possible. Two of the characters have a rocky past but finally a happy conclusion in which they end up married. During this period, we might have tracked their relationship by adding and removing graph edges between the two of them as they move from a stage of being friends, to dating, to hating each other, to finally ending up married.
-- These three relations would end up deleted RELATE person:one->friends_with->person:two; RELATE person:one->dating->person:two; RELATE person:one->hates->person:two; -- Finally this would be the graph edge connecting the two RELATE person:one->married->person:two;
This works well to track the current state of the relationship, but creating a more general table such as knows
along with a number of fields can be a better method to track the changing relationship over time. The following shows the relationship between the two person
records, along with a third record called person:three
who went to the same school and once dated person:one
.
RELATE person:one->knows->person:two SET has_been_friends = true, has_dated = true, has_hated = true, married_to = true; RELATE person:one->knows->person:three SET same_high_school = true, has_dated = true;
With these fields in place, it is possible to use a WHERE
clause to do refined searches on relationships of a certain type.
SELECT id, ->knows->person FROM person:one; SELECT id, ->knows[WHERE has_dated]->person FROM person:one; SELECT id, ->knows[WHERE same_high_school AND has_dated]->person FROM person:one;
Because the WHERE
clause simply checks for truthiness (whether a value is present and not empty), these fields do not necessarily need to be booleans and can even be complex objects.
RELATE person:one->knows->person:two SET same_high_school = false, has_been_friends = true, has_dated = { from: d'2020-12-25', to: d'2023-12-25' }, has_hated = { from: d'2023-12-25', to: d'2024-03-01' }, married_to = { since: d'2024-03-01' }; RELATE person:one->knows->person:three SET same_high_school = true, has_dated = { from: d'2019-09-10', to: d'2020-12-31' };
With these objects, a jealous person:two
could do a check on person:one
to see how many relationships with has_dated
have an end time that overlaps with the has_dated
period of person:one
and person:two
.
SELECT id, ->knows[WHERE same_high_school AND has_dated.to > d'2020-12-25']->person FROM person:one;