What is cURL?

Ever heard of cURL? It’s a famous command line tool for sending/receiving data using URL syntax, says wikipedia here. Let’s start with an example related to Couchbase. The N1QL query service is available through a REST API. If you want to execute a N1QL query with cURL, and supposing your query service is enabled on localhost, it will look like this:

curl http://localhost:8093/query/service -d ‘statement=SELECT * FROM default LIMIT 1’

or if you want to run a fulltext query you would do something like

curl -XPOST -H “Content-Type: application/json” http://localhost:8094/api/index/beer/query -d ‘{“explain”: true,”fields”: [“*”],”highlight”: {},”query”: {“query”: “pale ale”}}’

Right now there is nothing specific about N1QL, it’s just an exemple of REST API call with cURL. So why this title? Well that’s because you can call cURL from a N1QL query.

cURL() in N1QL

curl() is a new N1QL function that allow you to access external JSON data over a remote(usually HTTP(s)) endpoint. Similar to other N1QL functions, this function can be used in various N1QL expressions and various clauses of the SELECT/DML queries (projection, where, from etc). When used in from clause the curl() invocation should result in a set/collection of JSON documents.

It has already been covered quite extensively on DZone, I strongly invite you to read it, as well as many other DZone posts about N1QL and written by the N1QL team.

We can start as anyone will do when trying a new language with a very simple SELECT:

SELECT CURL(“GET”,”https://maps.googleapis.com/maps/api/geocode/json”, {“data”:”address=santa+cruz&components=country:ES&key=YOUR_API_KEY”});

This will get you the following JSON result:

[ { "results": [ { "address_components": [ { "long_name": "Santa Cruz de Tenerife", "short_name": "Santa Cruz de Tenerife", "types": [ "locality", "political" ] }, { "long_name": "Santa Cruz de Tenerife", "short_name": "TF", "types": [ "administrative_area_level_2", "political" ] }, { "long_name": "Canary Islands", "short_name": "CN", "types": [ "administrative_area_level_1", "political" ] }, { "long_name": "Spain", "short_name": "ES", "types": [ "country", "political" ] } ], "formatted_address": "Santa Cruz de Tenerife, Spain", "geometry": { "bounds": { "northeast": { "lat": 28.487616, "lng": -16.2356646 }, "southwest": { "lat": 28.4280248, "lng": -16.3370045 } }, "location": { "lat": 28.4636296, "lng": -16.2518467 }, "location_type": "APPROXIMATE", "viewport": { "northeast": { "lat": 28.487616, "lng": -16.2356646 }, "southwest": { "lat": 28.4280248, "lng": -16.3370045 } } }, "place_id": "ChIJcUElzOzMQQwRLuV30nMUEUM", "types": [ "locality", "political" ] } ] } ] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 [ { "results" : [ { "address_components" : [ { "long_name" : "Santa Cruz de Tenerife" , "short_name" : "Santa Cruz de Tenerife" , "types" : [ "locality" , "political" ] } , { "long_name" : "Santa Cruz de Tenerife" , "short_name" : "TF" , "types" : [ "administrative_area_level_2" , "political" ] } , { "long_name" : "Canary Islands" , "short_name" : "CN" , "types" : [ "administrative_area_level_1" , "political" ] } , { "long_name" : "Spain" , "short_name" : "ES" , "types" : [ "country" , "political" ] } ] , "formatted_address" : "Santa Cruz de Tenerife, Spain" , "geometry" : { "bounds" : { "northeast" : { "lat" : 28.487616 , "lng" : - 16.2356646 } , "southwest" : { "lat" : 28.4280248 , "lng" : - 16.3370045 } } , "location" : { "lat" : 28.4636296 , "lng" : - 16.2518467 } , "location_type" : "APPROXIMATE" , "viewport" : { "northeast" : { "lat" : 28.487616 , "lng" : - 16.2356646 } , "southwest" : { "lat" : 28.4280248 , "lng" : - 16.3370045 } } } , "place_id" : "ChIJcUElzOzMQQwRLuV30nMUEUM" , "types" : [ "locality" , "political" ] } ] } ]

There is a lot of information here and maybe you are only interested in getting the coordinates of the identified address. It’s easy to do. You just treat the result of the cURL function like any other JSON object:

SELECT CURL(“GET”,”https://maps.googleapis.com/maps/api/geocode/json”, {“data”:”address=santa+cruz&components=country:ES&key=YOUR_API_KEY”}).results[0].geometry ;

And this will return just what you wanted:

[ { "geometry": { "bounds": { "northeast": { "lat": 28.487616, "lng": -16.2356646 }, "southwest": { "lat": 28.4280248, "lng": -16.3370045 } }, "location": { "lat": 28.4636296, "lng": -16.2518467 }, "location_type": "APPROXIMATE", "viewport": { "northeast": { "lat": 28.487616, "lng": -16.2356646 }, "southwest": { "lat": 28.4280248, "lng": -16.3370045 } } } } ] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 [ { "geometry" : { "bounds" : { "northeast" : { "lat" : 28.487616 , "lng" : - 16.2356646 } , "southwest" : { "lat" : 28.4280248 , "lng" : - 16.3370045 } } , "location" : { "lat" : 28.4636296 , "lng" : - 16.2518467 } , "location_type" : "APPROXIMATE" , "viewport" : { "northeast" : { "lat" : 28.487616 , "lng" : - 16.2356646 } , "southwest" : { "lat" : 28.4280248 , "lng" : - 16.3370045 } } } } ]

While this is quite nice, there is currently no link with your data. Let’s pretend you have a document with an address but no geo coordinates. You can now add those coordinates with the following N1QL query:

UPDATEtravel-sampleUSE KEYS “myDocumentWithoutCoordinates” SET geo = CURL(“GET”,”https://maps.googleapis.com/maps/api/geocode/json”, {“data”:”address=santa+cruz&components=country:ES&key=YOUR_API_KEY”}).results[0].geometry returning *

And now you just modified an existing document based on an external service with a N1QL query 🙂

Here’s another example. Same FTS query than before but as the FROM clause of a N1QL query:

SELECT result.total_hits, array_length(result.hits) FROM curl(“POST”,”http://localhost:8094/api/index/beer/query”,{ “header”:”Content-Type: application/json”, “data”:{“explain”:true,”fields”: [“*”],”highlight”: {},”query”: {“query”: “pale ale”}} }) result;

And this will give you:

[ { “$1”: 10, “total_hits”: 3815 } ]

Now you know that you can run a Fulltext search (or any call to any URL that returns JSON) and use all the goodness that is N1QL to get a nice projection on that JSON data. In a way this allow us to integrate FTS with N1QL. Since you can use cURL in a FROM clause than you can for instance JOIN the result to a bucket.

Here’s another example in that spirit, using our traditional beer-sample containing beers and brewery document. First we transform the previous query to get the list of IDs for the documents containing “pale ale”:

SELECT hit.id FROM curl("POST","http://localhost:8094/api/index/beer/query", {"header":"Content-Type: application/json", "data":{"explain":true,"fields": ["*"],"highlight": {},"query": {"query": "pale ale"}} }) result UNNEST result.hits AS hit [ { "id": "stone_brewing_co-stone_pale_ale" }, { "id": "sierra_nevada_brewing_co-sierra_nevada_pale_ale" }, { "id": "yards_brewing-yards_philadelphia_pale_ale" }, { "id": "cooper_s_cave_ale_company-cooper_s_cave_pale_ale" }, { "id": "tommyknocker_brewery_and_pub-pick_axe_pale_ale" }, { "id": "bell_s_brewery_inc-pale_ale" }, { "id": "flying_dog_brewery-classic_pale_ale" }, { "id": "appalachian_brewing_company-hoppy_trails_india_pale_ale" }, { "id": "mogollon_brewing_company-superstition_pale_ale" }, { "id": "the_church_brew_works-pipe_organ_pale_ale" } ] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 SELECT hit . id FROM curl ( "POST" , "http://localhost:8094/api/index/beer/query" , { "header" : "Content-Type: application/json" , "data" : { "explain" : true , "fields" : [ "*" ] , "highlight" : { } , "query" : { "query" : "pale ale" } } } ) result UNNEST result . hits AS hit [ { "id" : "stone_brewing_co-stone_pale_ale" } , { "id" : "sierra_nevada_brewing_co-sierra_nevada_pale_ale" } , { "id" : "yards_brewing-yards_philadelphia_pale_ale" } , { "id" : "cooper_s_cave_ale_company-cooper_s_cave_pale_ale" } , { "id" : "tommyknocker_brewery_and_pub-pick_axe_pale_ale" } , { "id" : "bell_s_brewery_inc-pale_ale" } , { "id" : "flying_dog_brewery-classic_pale_ale" } , { "id" : "appalachian_brewing_company-hoppy_trails_india_pale_ale" } , { "id" : "mogollon_brewing_company-superstition_pale_ale" } , { "id" : "the_church_brew_works-pipe_organ_pale_ale" } ]

To learn more about this FTS query and what you can generally do with FTS REST API, please refer to the current documentation.

Now we have a list of document IDs from FTS we can join on easily:

SELECT beers.* FROM curl(“POST”,”http://localhost:8094/api/index/beer/query”, {“header”:”Content-Type: application/json”, “data”:{“explain”:true,”fields”: [“*”],”highlight”: {},”query”: {“query”: “pale ale”}} }) result UNNEST result.hits AS hit JOIN beer-sample beers ON KEYS hit.id

How cool is that? But then you don’t have to limit yourself to query Couchbase, you can use external sources. Here’s another example. You will find lots of interesting JSON datasets on catalog.data.gouv. You can for instance look at the Most Popular Baby Names by Sex and Mother’s Ethnic Group in New York City. It’s a big JSON file, lots of rows, 13962 rows as you can learn running:

SELECT count(row) FROM CURL(“GET”,”https://data.cityofnewyork.us/api/views/25th-nujf/rows.json?accessType=DOWNLOAD”) result UNNEST result.data AS row

And if you want a top 10 of the given name, you can run the following query:

SELECT row FROM CURL(“GET”,”https://data.cityofnewyork.us/api/views/25th-nujf/rows.json?accessType=DOWNLOAD”) result UNNEST result.data AS row ORDER BY TONUMBER(row[13]) ASC, TONUMBER(row[12]) DESC LIMIT 10

It seems that Jayden and Isabella are very popular name in New York these days 🙂

From those examples you can see that the query Language N1QL is an innovative way to figure out what’s inside a big JSON file. You can basically map a JSON endpoint to a Database 🙂 And of course there are other possibilities we have not touched yet like using JOINs across a Couchbase bucket and a JSON endpoint.

Documentation

Now we don’t support every cURL features. It should however cover the basics and be usable right now. Here’s the list of options we support so far. The curl function takes up to three parameters:

1 – The first parameter is the HTTP verb used. So far we support “GET” and “POST”.

2 – The second parameter is the endpoint URL

3 – The third parameter represents different cURL options.

Security Options

Option Description value user Server user and password USERNAME[:PASSWORD] basic Use HTTP Basic Authentication — insecure Allow connections to SSL sites without certs (H) — anyauth curl to figure out authentication method by itself, and use the most secure one (In our case this will be basic only for now. Eventually we will support digest) —

Other Transfer-Related Options

Option Description value connect-timeout Maximum time allowed for connection SECONDS max-time Maximum time allowed for the transfer SECONDS data HTTP POST data (H) DATA header Pass custom header LINE to server (H) LINE show-error Show error. With -s, make CURL() show errors when they occur silent Silent mode (don’t output anything) — max-redirs Maximum number of redirects allowed (H) NUM keepalive-time Wait SECONDS between keepalive probes SECONDS

We would love to hear what you think about this feature. How would you use this? Is there anything missing? We are waiting for your comments below.