DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Open-Source SPL That Can Execute SQL Without RDB
  • Develop a Scalable Event Listener for Blockchain
  • MongoDB to Couchbase: An Introduction to Developers and Experts
  • Anatomy of a PostgreSQL Query Plan

Trending

  • How to Format Articles for DZone
  • Blue Skies Ahead: An AI Case Study on LLM Use for a Graph Theory Related Application
  • How Clojure Shapes Teams and Products
  • Java's Quiet Revolution: Thriving in the Serverless Kubernetes Era
  1. DZone
  2. Data Engineering
  3. Data
  4. CURL Comes to N1QL: Querying External JSON Data

CURL Comes to N1QL: Querying External JSON Data

JSON is a helpful way to transmit data. Now, N1QL can query it thanks to a CURL update in Couchbase 5.0. See how to connect to various endpoints to do just that.

By 
Isha Kandaswamy user avatar
Isha Kandaswamy
·
Updated Aug. 28, 17 · Tutorial
Likes (13)
Comment
Save
Tweet
Share
16.7K Views

Join the DZone community and get the full member experience.

Join For Free

N1QL has many functions that allow you to perform a specific operation. One such function that has been added into the new Couchbase 5.0 DP is CURL.

CURL allows you to use N1QL to interact with external JSON endpoints; namely, Rest API’s that return results and data in JSON format. This function will allow N1QL to have a conservative set of curl functionality built into the language. Interaction primarily consists of data transfer to and from a server using the http and https protocols. In short, the CURL function in N1QL provides you, the user, a subset of standard curl functionality (https://curl.haxx.se/docs/manpage.html) within a query language.

In order to retrieve data from different servers (such as Google Maps, Yahoo Finance etc), we can issue either GET or HTTP POST requests using the CURL function. This is seen in the diagram below.

Function Definition

CURL (URL, [options])

The first argument is the URL, which represents any URL that points to a JSON endpoint. Only URLs with the http:// or the https:// protocol are supported. Redirection is disabled. The input arguments to the CURL() function can be both static values and N1QL expressions that can be evaluated. 

Later in the article we shall see examples that query from the Google Geocode API, the Yahoo Finance API, Couchbase full text search and the Github API. The second argument is a list of options. This is a JSON object that contains a list of curl options and their corresponding values.

We support a variety of options that allow you to interact with any endpoint effectively. In general these can be categorized into security related options and general options. A table of the supported options is given at the end of the article.


Security features/enhancements for CURL

With the addition of the CURL function, to avoid security vulnerabilities and control and minimize the risks associated with it, multiple security measures have been implemented.

Using CA certificates with N1QL's CURL function

Certificates used by the N1QL CURL function should be stored on every query node within the cluster in the n1qlcerts directory. The location where this directory needs to be created by the user depends upon the location of the couchbase installation. (It is OS specific). The following assume default installation location and show where the n1qlcerts directory has been created. 

Linux                       

 /opt/couchbase/var/lib/couchbase/n1qlcerts                                                         

Mac OSX

/Users/couchbase/Library/Application\ Support/Couchbase/var/lib/couchbase/n1qlcerts

Windows

C:\Program Files\Couchbase\Server\var\lib\couchbase\n1qlcerts

For non-default installation locations, the relative path - "../var/lib/couchbase/n1qlcerts directory" from the bin directory where cbq-engine is run from needs to be created.

This directory must be created for every query node.

Once this directory has been created, add the certificate for CURL to use in here. In order to use this certificate, we use the option cacert and pass in the name of the certificate. 

For example if n1qlcerts/user1.pem is the name of the certificate, use the cacert option -

"cacert":"user1.pem"

Only names are valid, paths are invalid and passing one will cause an error. CURL() throws an error in the case of invalid and expired certificates.

NOTE : The n1qlcerts directory and its contents need to be replicated for each query node within the cluster.

Custom headers and user-agent

CURL() runs on the query node within a cluster. This enables the function to gain access to all REST endpoints that are accessible through the Query service (since that is where the function is executed). In order to avoid access to such insecure endpoints, a custom header, that cannot be changed by the user, is added to all requests sent using the N1QL curl function. This is of the format "X-N1QL-User-Agent: couchbase/n1ql/1.7.0-N1QL". 

A user-agent is also always set by default. This can be reset using the -user-agent option. The value set by default is "couchbase/n1ql/1.7.0-N1QL".

Both these values are designed to allow both internal and external endpoints to check for the header/user-agent and disallow access in their code and deny access to the function. The one caveat when using this however, is that we still cannot protect against any software that doesnt check for this header and the existing versions of locally installed software (both Couchbase and non-Couchbase software). For such cases, the curl whitelist feature has been added (see below).


Creating a Whitelist to restrict CURL access.

A whitelist is a JSON document, that lists out the permitted REST endpoints and URLs for the CURL() function to access. The URL's themselves, need to be a prefix match. The whitelist document is created within the n1qlcerts directory (for the location see above), and is named curl_whitelist.json (this name is fixed and cannot be changed by the user). The file (curl_whitelist.json) needs to be created by the administrator (or a user with access to the machine where couchbase is installed). 

If the whitelist is not setup (..../n1qlcerts/curl_whitelist.json doesnt exist) or if it exists but is empty then the CURL function cannot be used. 

Any whitelist needs to have the following fields - 

Field

Type

Description

Default value

all_access

boolean

This will decide whether the user has access to all urls or only the urls specified in the allowed_urls array.

false

allowed_urls

array

List of prefixes for urls that we wish to allow.

empty

disallowed_urls

array

List of prefixes for urls that will be restricted no matter what

empty


If the all_access field is false, then the usage for the CURL function has been fully restricted. In order to be able to use CURL() with any endpoint in N1QL, the administrator needs to set allowed_urls and disallowed_urls accordingly. In order to allow access to all urls, we can set all_access to true. This essentially gives us full CURL access.

Curl_whitelist.json 
{
  "all_access":false,
  "allowed_urls":["https://maps.googleapis.com"]
}

Any url in CURL() prefixed by https://maps.googleapis.com/ will be allowed. 

NOTE : The whitelist needs to be replicated for each query node within the cluster.

Role based access to the CURL function

An important thing worth mentioning here is that CURL is designed so that it cannot be arbitrarily invoked. In order to avoid injection of data from an external source using the UPDATE statement, a new role QUERY_EXTERNAL_ACCESS has been added. Only a user assigned this role has access to the CURL function. By default this role membership is empty. The CURL function can only be accessed by a FULL_ADMIN or any user that has been granted the QUERY_EXTERNAL_ACCESS role by the FULL_ADMIN. For previous versions of couchbase that dont support role-based access control, a password protected bucket can be used. Also for the CURL() functionality, internally a specific set of SSL ciphers (MEDIUM or HIGH) are supported. This is dependent on the COUCHBASE_SSL_CIPHER_LIST.


Restricting the result size for CURL()

An important concern with using the CURL() function is when a user crafts a really long file -  greater than 64 MB, and tries to read from it. Since the data is loaded into memory, if the result size is not capped the query service could crash. Due to this possibility, the maximum result size for data that can be returned by CURL() is 64MB (67 108 864 bytes). The user can restrict the amount of memory for CURL results by using the result-cap option.The minimum (default)value for the result-cap option is 20MB ( 20 971 520 bytes).

Interaction With Different Endpoints

Let us see how to query different endpoints using the CURL function in N1QL.

Google Maps Geocoding API

The Geocoding API from Google Maps allows you to convert static addresses into coordinates and vice versa using HTTP request. (For more information refer to https://developers.google.com/maps/documentation/geocoding/intro )

Say you want to search for Santa Cruz in Spain using your Google Dev API key. In order to do this, you can use the following query:

Curl Request

curl https://maps.googleapis.com/maps/api/geocode/json?address=santa+cruz&components=country:ES&key=<Your Developer API key>

Corresponding Query

SELECT CURL("https://maps.googleapis.com/maps/api/geocode/json", {"get":true,"data":"address=santa+cruz&components=country:ES&key=<Your Developer API key>"}) GEO;

"results": [
        {
            "GEO": {
                "results": [
                    {
                        "address_components": [
                            {
                                "long_name": "Santa Cruz de Tenerife",
                                "short_name": "Santa Cruz de Tenerife",
                                "types": [
                                    "locality",
                                    "political"
                                ]
                            },
                            {
                                "long_name": "Santa Cruz de Tenerife",
                                "short_name": "TF",
                                "types": [
                                    "administrative_area_level_2",
                                    "political"
                                ]
                            },
                            {
                                "long_name": "Canary Islands",
                                "short_name": "CN",
                                "types": [
                                    "administrative_area_level_1",
                                    "political"
                                ]
                            },
                            {
                                "long_name": "Spain",
                                "short_name": "ES",
                                "types": [
                                    "country",
                                    "political"
                                ]
                            }
                        ],
                        "formatted_address": "Santa Cruz de Tenerife, Spain",
                        "geometry": {
                            "bounds": {
                                "northeast": {
                                    "lat": 28.487616,
                                    "lng": -16.2356646
                                },
                                "southwest": {
                                    "lat": 28.4280248,
                                    "lng": -16.3370045
                                }
                            },
                            "location": {
                                "lat": 28.4636296,
                                "lng": -16.2518467
                            },
                            "location_type": "APPROXIMATE",
                            "viewport": {
                                "northeast": {
                                    "lat": 28.487616,
                                    "lng": -16.2356646
                                },
                                "southwest": {
                                    "lat": 28.4280248,
                                    "lng": -16.3370045
                                }
                            }
                        },
                        "place_id": "ChIJcUElzOzMQQwRLuV30nMUEUM",
                        "types": [
                            "locality",
                            "political"
                        ]
                    }
                ],
                "status": "OK"
            }
        }
    ]


This query retrieves the address and geographic location bounds of the address, Santa Cruz, ES. We use the address, components and key parameters from the Google Maps Geocoding REST API. The "data" option represents the curl data option that represents HTTP POST data.  However, because this is a get request we set the "get" option to true. You can provide values to all the REST parameters within the data option. 

Now lets search for Half Moon Bay in CA.

Query with curl request

SELECT CURL("https://maps.googleapis.com/maps/api/geocode/json", {"data":"address=Half+Moon+Bay", "get":true}) GEO;

"results": [
        {
            "GEO": {
                "results": [
                    {
                        "address_components": [
                            {
                                "long_name": "Half Moon Bay",
                                "short_name": "Half Moon Bay",
                                "types": [
                                    "locality",
                                    "political"
                                ]
                            },
                            {
                                "long_name": "San Mateo County",
                                "short_name": "San Mateo County",
                                "types": [
                                    "administrative_area_level_2",
                                    "political"
                                ]
                            },
                            {
                                "long_name": "California",
                                "short_name": "CA",
                                "types": [
                                    "administrative_area_level_1",
                                    "political"
                                ]
                            },
                            {
                                "long_name": "United States",
                                "short_name": "US",
                                "types": [
                                    "country",
                                    "political"
                                ]
                            }
                        ],
                        "formatted_address": "Half Moon Bay, CA, USA",
                        "geometry": {
                            "bounds": {
                                "northeast": {
                                    "lat": 37.5226389,
                                    "lng": -122.4165183
                                },
                                "southwest": {
                                    "lat": 37.4249286,
                                    "lng": -122.4778879
                                }
                            },
                            "location": {
                                "lat": 37.4635519,
                                "lng": -122.4285862
                            },
                            "location_type": "APPROXIMATE",
                            "viewport": {
                                "northeast": {
                                    "lat": 37.5226389,
                                    "lng": -122.4165183
                                },
                                "southwest": {
                                    "lat": 37.4249286,
                                    "lng": -122.4778879
                                }
                            }
                        },
                        "place_id": "ChIJC8sZCqULj4ARVJvnNcic_V4",
                        "types": [
                            "locality",
                            "political"
                        ]
                    }
                ],
                "status": "OK"
            }
        }
    ]

Yahoo Finance API

The Yahoo Finance API allows you to use the Yahoo Query Language (YQL) to fetch stock quotes (as seen in http://meumobi.github.io/stocks%20apis/2016/03/13/get-realtime-stock-quotes-yahoo-finance-api.html ). Below is the YQL SELECT statement to access the stock of Hortonworks Inc (HDP).

select * from yahoo.finance.quotes where symbol in ("HDP")


In order to get the results in JSON you can use the following URL:

https://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22HDP%22)&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=

Curl Request

curl https://query.yahooapis.com/v1/public/yql --data 'q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22HDP%22)&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback='


Corresponding Query

SELECT temp.query.results from 
CURL("https://query.yahooapis.com/v1/public/yql",
     {"data":"q=select%20*%20from%20yahoo.finance.quotes%20where%20symbol%20in%20(%22HDP%22)&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback="})temp;
     
"results": [
        {
            "results": {
                "quote": {
                    "AfterHoursChangeRealtime": null,
                    "AnnualizedGain": null,
                    "Ask": "16.950",
                    "AskRealtime": null,
                    "AverageDailyVolume": "952135",
                    "Bid": "16.940",
                    "BidRealtime": null,
                    "BookValue": "-0.654",
                    "Change": "+0.075",
                    "ChangeFromFiftydayMovingAverage": "0.377",
                    "ChangeFromTwoHundreddayMovingAverage": "3.625",
                    "ChangeFromYearHigh": "-0.755",
                    "ChangeFromYearLow": "10.525",
                    "ChangePercentRealtime": null,
                    "ChangeRealtime": null,
                    "Change_PercentChange": "+0.075 - +0.445%",
                    "ChangeinPercent": "+0.445%",
                    "Commission": null,
                    "Currency": "USD",
                    "DaysHigh": "17.010",
                    "DaysLow": "16.780",
                    "DaysRange": "16.780 - 17.010",
                    "DaysRangeRealtime": null,
                    "DaysValueChange": null,
                    "DaysValueChangeRealtime": null,
                    "DividendPayDate": null,
                    "DividendShare": null,
                    "DividendYield": null,
                    "EBITDA": "-223.00M",
                    "EPSEstimateCurrentYear": "-1.720",
                    "EPSEstimateNextQuarter": "-0.380",
                    "EPSEstimateNextYear": "-1.190",
                    "EarningsShare": "-3.737",
                    "ErrorIndicationreturnedforsymbolchangedinvalid": null,
                    "ExDividendDate": null,
                    "FiftydayMovingAverage": "16.568",
                    "HighLimit": null,
                    "HoldingsGain": null,
                    "HoldingsGainPercent": null,
                    "HoldingsGainPercentRealtime": null,
                    "HoldingsGainRealtime": null,
                    "HoldingsValue": null,
                    "HoldingsValueRealtime": null,
                    "LastTradeDate": "10/5/2017",
                    "LastTradePriceOnly": "16.945",
                    "LastTradeRealtimeWithTime": null,
                    "LastTradeTime": "12:50pm",
                    "LastTradeWithTime": "12:50pm - <b>16.945</b>",
                    "LowLimit": null,
                    "MarketCapRealtime": null,
                    "MarketCapitalization": "700.96M",
                    "MoreInfo": null,
                    "Name": "Hortonworks, Inc.",
                    "Notes": null,
                    "OneyrTargetPrice": "18.930",
                    "Open": "17.010",
                    "OrderBookRealtime": null,
                    "PEGRatio": "-0.400",
                    "PERatio": null,
                    "PERatioRealtime": null,
                    "PercebtChangeFromYearHigh": "-4.266%",
                    "PercentChange": "+0.445%",
                    "PercentChangeFromFiftydayMovingAverage": "+2.275%",
                    "PercentChangeFromTwoHundreddayMovingAverage": "+27.214%",
                    "PercentChangeFromYearLow": "+163.941%",
                    "PreviousClose": "16.870",
                    "PriceBook": null,
                    "PriceEPSEstimateCurrentYear": null,
                    "PriceEPSEstimateNextYear": null,
                    "PricePaid": null,
                    "PriceSales": "3.212",
                    "SharesOwned": null,
                    "ShortRatio": "3.640",
                    "StockExchange": "NMS",
                    "Symbol": "HDP",
                    "TickerTrend": null,
                    "TradeDate": null,
                    "TwoHundreddayMovingAverage": "13.320",
                    "Volume": "217430",
                    "YearHigh": "17.700",
                    "YearLow": "6.420",
                    "YearRange": "6.420 - 17.700",
                    "symbol": "HDP"
                }
            }
        }
    ]

For this query, the value of the data option contains the Yahoo REST parameters, q (for the YQL query), format (to return data in JSON) and some other parameters.

Couchbase Full Text Search

Couchbase’s Full Text Search allows you to apply fuzzy search to data stored in Couchbase. For more information see https://blog.couchbase.com/2016/february/couchbase-4.5-developer-preview-couchbase-fts .

Suppose you create a FTS index called beers on the beer-sample bucket in Couchbase. You can now search for beer type pale ale using this index, using the CURL function in N1QL. It is important to note that FTS currently accepts HTTP POST instead of GET. To explicitely specify the POST request method, use the request option. 

Curl Request

curl –u beer-sample:pass -XPOST -H "Content-Type: application/json" http://127.0.0.1:8094/api/index/beers/query -d '{ "explain": true, "fields": ["*"],"highlight": {},"query": {"query": "pale ale"}}'


Corresponding Query

SELECT result.total_hits, array_length(result.hits) hits_per_page 
FROM CURL("http://localhost:8094/api/index/beers/query", 
          {"request":"POST","header":"Content-Type: application/json",
          "data":'{"explain":false,"fields": ["*"],"highlight": {},"query": {"query": "pale ale"}}',
          "user":"Administrator:password"}) result;
          
"results": [
        {
            "hits_per_page": 10,
            "total_hits": 3815
        }
  ]


We give multiple options in this query. The header option allows you to pass a custom header to server. Content-Type : application/json tells the server that the data is provided in JSON format. If we have a password protected bucket in Couchbase, then we need to pass its credentials with the query. The user option can be used to pass in a colon-separated username and password. The request option specifies that POST request method is used.


If you want to retrieve only those documents from beer-sample that are returned by the search above, you can write a N1QL JOIN query as follows.

SELECT ARRAY x.id for x in b1.result.hits END as hits, 
       b1.result.total_hits as total, 
       array_length(b1.result.hits), 
       b 
FROM (SELECT CURL("http://localhost:8094/api/index/beers/query", 
          {"request":"POST","header":"Content-Type: application/json",
          "data":'{"explain":false,"fields": ["*"],"highlight": {},"query": {"query": "pale ale"}}',
          "user":"Administrator:password"}) result) b1 
INNER JOIN `beer-sample` b ON KEYS b1.result.hits[*].id LIMIT 1;

"results": [
        {
            "$1": 10,
            "b": {
                "abv": 5.4,
                "brewery_id": "stone_brewing_co",
                "category": "North American Ale",
                "description": "Our flagship ale, Stone Pale Ale is our Southern California interpretation of the classic British pale ale style. Deep amber in color, Stone Pale Ale is robust and full flavored. A delicate hop aroma is complemented by a rich maltiness. This is an ale for those who have learned to appreciate distinctive flavor. Stone Pale Ale is great by itself, or with food that requires a beer of character.",
                "ibu": 0,
                "name": "Stone Pale Ale",
                "srm": 0,
                "style": "American-Style Pale Ale",
                "type": "beer",
                "upc": 0,
                "updated": "2010-07-22 20:00:20"
            },
            "hits": [
                "stone_brewing_co-stone_pale_ale",
                "flying_dog_brewery-classic_pale_ale",
                "yards_brewing-yards_philadelphia_pale_ale",
                "bell_s_brewery_inc-pale_ale",
                "sierra_nevada_brewing_co-sierra_nevada_pale_ale",
                "cooper_s_cave_ale_company-cooper_s_cave_pale_ale",
                "appalachian_brewing_company-hoppy_trails_india_pale_ale",
                "cooperstown_brewing_company-backyard_india_pale_ale",
                "mogollon_brewing_company-superstition_pale_ale",
                "troegs_brewing-troegs_pale_ale"
            ],
            "total": 3815
        }
    ]


This will retrieve the ids of the documents returned by the FTS query that searches for pale ale, along with the total hits and all the details from the corresponding document in beer-sample.

Github API 

Github’s API is a bit different from the previous API’s, in that it returns multiple results in the form of a JSON array of result values. This can be treated as multiple documents.Refer to the Github API docs in https://developer.github.com/v3/ for more details on what can be queried.

Say you want to find out all the repositories linked to a Github account. The following query does this

Curl request

curl -H "User-Agent: ikandaswamy" https://api.github.com/users/ikandaswamy/repos

Corresponding query

SELECT RAW list FROM CURL("https://api.github.com/users/ikandaswamy/repos")list LIMIT 1;

"results": [
        {
            "archive_url": "https://api.github.com/repos/ikandaswamy/ds-algo/{archive_format}{/ref}",
            "assignees_url": "https://api.github.com/repos/ikandaswamy/ds-algo/assignees{/user}",
            "blobs_url": "https://api.github.com/repos/ikandaswamy/ds-algo/git/blobs{/sha}",
            "branches_url": "https://api.github.com/repos/ikandaswamy/ds-algo/branches{/branch}",
            "clone_url": "https://github.com/ikandaswamy/ds-algo.git",
            "collaborators_url": "https://api.github.com/repos/ikandaswamy/ds-algo/collaborators{/collaborator}",
            "comments_url": "https://api.github.com/repos/ikandaswamy/ds-algo/comments{/number}",
            "commits_url": "https://api.github.com/repos/ikandaswamy/ds-algo/commits{/sha}",
            "compare_url": "https://api.github.com/repos/ikandaswamy/ds-algo/compare/{base}...{head}",
            "contents_url": "https://api.github.com/repos/ikandaswamy/ds-algo/contents/{+path}",
            "contributors_url": "https://api.github.com/repos/ikandaswamy/ds-algo/contributors",
            "created_at": "2017-09-07T22:42:03Z",
            "default_branch": "master",
            "deployments_url": "https://api.github.com/repos/ikandaswamy/ds-algo/deployments",
            "description": "Use this to implement various fun problems while relearning Data Structures and Algorithms",
            "downloads_url": "https://api.github.com/repos/ikandaswamy/ds-algo/downloads",
            "events_url": "https://api.github.com/repos/ikandaswamy/ds-algo/events",
            "fork": false,
            "forks": 0,
            "forks_count": 0,
            "forks_url": "https://api.github.com/repos/ikandaswamy/ds-algo/forks",
            "full_name": "ikandaswamy/ds-algo",
            "git_commits_url": "https://api.github.com/repos/ikandaswamy/ds-algo/git/commits{/sha}",
            "git_refs_url": "https://api.github.com/repos/ikandaswamy/ds-algo/git/refs{/sha}",
            "git_tags_url": "https://api.github.com/repos/ikandaswamy/ds-algo/git/tags{/sha}",
            "git_url": "git://github.com/ikandaswamy/ds-algo.git",
            "has_downloads": true,
            "has_issues": true,
            "has_pages": false,
            "has_projects": true,
            "has_wiki": true,
            "homepage": null,
            "hooks_url": "https://api.github.com/repos/ikandaswamy/ds-algo/hooks",
            "html_url": "https://github.com/ikandaswamy/ds-algo",
            "id": 102792479,
            "issue_comment_url": "https://api.github.com/repos/ikandaswamy/ds-algo/issues/comments{/number}",
            "issue_events_url": "https://api.github.com/repos/ikandaswamy/ds-algo/issues/events{/number}",
            "issues_url": "https://api.github.com/repos/ikandaswamy/ds-algo/issues{/number}",
            "keys_url": "https://api.github.com/repos/ikandaswamy/ds-algo/keys{/key_id}",
            "labels_url": "https://api.github.com/repos/ikandaswamy/ds-algo/labels{/name}",
            "language": null,
            "languages_url": "https://api.github.com/repos/ikandaswamy/ds-algo/languages",
            "merges_url": "https://api.github.com/repos/ikandaswamy/ds-algo/merges",
            "milestones_url": "https://api.github.com/repos/ikandaswamy/ds-algo/milestones{/number}",
            "mirror_url": null,
            "name": "ds-algo",
            "notifications_url": "https://api.github.com/repos/ikandaswamy/ds-algo/notifications{?since,all,participating}",
            "open_issues": 0,
            "open_issues_count": 0,
            "owner": {
                "avatar_url": "https://avatars1.githubusercontent.com/u/9203396?v=4",
                "events_url": "https://api.github.com/users/ikandaswamy/events{/privacy}",
                "followers_url": "https://api.github.com/users/ikandaswamy/followers",
                "following_url": "https://api.github.com/users/ikandaswamy/following{/other_user}",
                "gists_url": "https://api.github.com/users/ikandaswamy/gists{/gist_id}",
                "gravatar_id": "",
                "html_url": "https://github.com/ikandaswamy",
                "id": 9203396,
                "login": "ikandaswamy",
                "organizations_url": "https://api.github.com/users/ikandaswamy/orgs",
                "received_events_url": "https://api.github.com/users/ikandaswamy/received_events",
                "repos_url": "https://api.github.com/users/ikandaswamy/repos",
                "site_admin": false,
                "starred_url": "https://api.github.com/users/ikandaswamy/starred{/owner}{/repo}",
                "subscriptions_url": "https://api.github.com/users/ikandaswamy/subscriptions",
                "type": "User",
                "url": "https://api.github.com/users/ikandaswamy"
            },
            "private": false,
            "pulls_url": "https://api.github.com/repos/ikandaswamy/ds-algo/pulls{/number}",
            "pushed_at": "2017-09-07T22:42:04Z",
            "releases_url": "https://api.github.com/repos/ikandaswamy/ds-algo/releases{/id}",
            "size": 0,
            "ssh_url": "git@github.com:ikandaswamy/ds-algo.git",
            "stargazers_count": 0,
            "stargazers_url": "https://api.github.com/repos/ikandaswamy/ds-algo/stargazers",
            "statuses_url": "https://api.github.com/repos/ikandaswamy/ds-algo/statuses/{sha}",
            "subscribers_url": "https://api.github.com/repos/ikandaswamy/ds-algo/subscribers",
            "subscription_url": "https://api.github.com/repos/ikandaswamy/ds-algo/subscription",
            "svn_url": "https://github.com/ikandaswamy/ds-algo",
            "tags_url": "https://api.github.com/repos/ikandaswamy/ds-algo/tags",
            "teams_url": "https://api.github.com/repos/ikandaswamy/ds-algo/teams",
            "trees_url": "https://api.github.com/repos/ikandaswamy/ds-algo/git/trees{/sha}",
            "updated_at": "2017-09-07T22:42:03Z",
            "url": "https://api.github.com/repos/ikandaswamy/ds-algo",
            "watchers": 0,
            "watchers_count": 0
        }
    ]


If the account has three repositories, the query gives three results (here I have added limit 1). The RAW keyword is used  to return the array of documents that the query returns, without a wrapper object. One point you will notice is that the header option contains the User-Agent with a github username. This is now mandatory for all Github API requests.

Now from this list, say you would like to know, what is the clone url for each of these repos. The following query accomplishes this

SELECT clone_url FROM 
(SELECT RAW list 
 FROM CURL("https://api.github.com/users/ikandaswamy/repos", 
           {"header":"User-Agent: ikandaswamy"}) list) s;
           
"results": [
        {
            "clone_url": "https://github.com/ikandaswamy/ds-algo.git"
        },
        {
            "clone_url": "https://github.com/ikandaswamy/github-cheat-sheet.git"
        },
        {
            "clone_url": "https://github.com/ikandaswamy/jsapp.git"
        }
    ]


Summary

As you can see with the above examples, using the CURL function, N1QL users can now interact with any external API’s that return results in JSON format. This opens up many possibilities. For example, if Couchbase contains data corresponding to different hotels, then you can use the Google Maps API to find nearby locations to each of the corresponding hotels.

In order to have a secure environment with the addition of CURL() multiple security enhancements have been added. The following is a short list

  • CURL runs on the query node within a cluster.
  • CURL function is disabled by default.

  • CURL supports HTTP and HTTPS only. All other protocols are disabled.

  • Redirection of URLs is not allowed. 

  • Custom header for N1QL CURL is "X-N1QL-User-Agent: couchbase/n1ql/1.7.0-N1QL".

  • User-Agent is  "couchbase/n1ql/1.7.0-N1QL".

  • Restrict amount of memory for CURL results using result-cap. Min result cap will be 20MB, max result-cap is 64MB.

  • FULL_ADMIN role will allow access to CURL. QUERY_EXTERNAL_ACCESS role can be assigned to a user by the FULL ADMIN. This will enable the user to use the CURL functionality. 

  • Certificates should be stored on the local machine - each query node within a cluster. Use …./Couchbase/var/lib/couchbase/n1qlcerts to store certificates.Use cacert to pass in the “name” of the certificate to use. Only names are valid, paths are invalid. (passing in a path will cause an error.)

  • CURL throws an error in case of invalid/expired certificates.

  • User has the ability to “Whitelist” endpoints.


The N1QL implementation of CURL uses the golang libcurl API  - https://github.com/andelf/go-curl

List of Available Options

Security Options

Option

Description

value

user

Server user and password

When password is empty it is treated as an empty password string.

USERNAME[:PASSWORD]

basic

Use HTTP Basic Authentication

BOOLEAN (TRUE/ FALSE)

insecure  

Allow connections to SSL sites without certs (H)

BOOLEAN (TRUE/ FALSE)

anyauth

curl to figure out authentication method by itself, and use the most secure one

BOOLEAN (TRUE/ FALSE)

cacert





Specify CA signed certificate filename

Certificates should be stored on the local machine - each query node within a cluster.

/Couchbase/var/lib/couchbase/n1qlcerts to store certificates. This is not visible to the user.

The Filename cannot contain a path. If it is not a match to the existing contents of n1qlcerts directory, the function throws an error.

For expired and invalid certificates throw an error.

FILENAME (This is the certificate, pem file for aws for example)


result-cap

Set capacity for buffer that stores result of CURL operation

Number of MB. (Minimum is 20MB)


Other Transfer-Related Options

Option

Description

Value

get, G

Get request for CURL

BOOLEAN (true/false)

X, request

Set the request method. This only accepts GET or POST and is case sensitive.

For all other cases it errors out.

{“request”:”POST”}

connect-timeout


Maximum time allowed for connection. It should contain the maximum time in seconds that you allow the connection phase to the server to take. This only limits the connection phase, it has no impact once it has connected. Set to zero to switch to the default built-in connection timeout - 300 seconds.

If float value, we truncate it to the integer value.

For all other types (not a number) throw error.

SECONDS

max-time

Maximum time allowed for the transfer operation.

Default timeout is 0 (zero) which means it never times out during transfer.

If float value, we truncate it to the integer value.

For all other types (not a number) throw error.

SECONDS

data     

HTTP POST data (H)

Allows us to set all the rest api parameters for the given endpoint.

STRING

OR

[...string,string….]

header   

Pass custom header string to server (H)

STRING

OR

[...string,string….]

show-error

Show error.

When true show errors when they occur.

When false suppress the errors

BOOLEAN (TRUE/ FALSE)

silent

Silent mode (don't output anything)

BOOLEAN (TRUE/ FALSE)

keepalive-time

Wait SECONDS between keepalive probes for low level TCP connectivity. (Does not affect HTTP level keep-alive)

SECONDS

user-agent

Value for the User-Agent to send to the server.

STRING

data-urlencode

Encode the data, and send to server.

This is a test => this%20is%20a%20test  

STRING

OR

[...string,string….]


CURL Database file IO Data (computing)

Opinions expressed by DZone contributors are their own.

Related

  • Open-Source SPL That Can Execute SQL Without RDB
  • Develop a Scalable Event Listener for Blockchain
  • MongoDB to Couchbase: An Introduction to Developers and Experts
  • Anatomy of a PostgreSQL Query Plan

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!