Quick Start

Overview

This example-driven tutorial presents 5 steps to get started with Blue Brain Nexus to build and query a simple knowledge graph. The goal is to go over some capabilities of Blue Brain Nexus enabling:

  • The creation of a project as a protected data space to work with
  • An easy ingestion of a dataset
  • Querying a dataset to retrieve various information
  • Sharing a dataset by making it public

For that we will work with the small version of the Global Research Identifier Database (GRID) dataset containing a set of:

  • institutes (institutes.csv)
  • their acronyms (acronyms.csv)
  • their addresses (addresses.csv)
  • their urls (links.csv)
  • and their relationships (relationships.csv)

An overview of this dataset can be found here.

Note
  • We will be using the Blue Brain Nexus CLI, a Python client, throughout this quick start tutorial.
  • This tutorial assumes you’ve installed and configured the CLI. If not, please follow the setup instructions.

Let’s get started.

Create a project

Projects in Blue Brain Nexus are spaces where data can be:

  • managed: created, updated, deprecated, validated, secured;
  • accessed: directly by ids or through various search interfaces;
  • shared: through fine-grained Access Control Lists.

A project is always created within an organization, just as a Git repository is created in a GitHub organization. Organizations can be understood as accounts hosting multiple projects.

Select an organization

Note

A public organization named demo is already created for the purpose of this tutorial. All projects will be created under this organization.

The following command should list the organizations you have access to. The demo organization should be listed and tagged as non-deprecated in the output.

Command
nexus orgs list
Output
+------+---------------+-------------------------------------------+------------+
| Name | Description   | Id                                        | Deprecated |
+------+---------------+-------------------------------------------+------------+
| demo | Nexus sandbox | https://sandbox.bluebrainnexus.io/v1/demo | False      |
+------+---------------+-------------------------------------------+------------+

Let’s select the demo organization.

Command
nexus orgs select demo
Output
demo organization selected.

Create a project

A project is created with a label and within an organization. The label may contain alphanumeric characters, hyphens and underscores, and its length should be between 3 and 32 characters (it should match the regex: [a-zA-Z0-9-_]{3,32}).
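
A label can be checked locally before calling the CLI. The sketch below applies the regex quoted above with Python's re module. The helper name is_valid_label is hypothetical and not part of the Nexus CLI, and the hyphen is moved to the end of the character class, which matches the same set of characters.

```python
import re

# Same character set and length bounds as the regex quoted above;
# the hyphen is placed last so it is unambiguously a literal.
LABEL_PATTERN = re.compile(r"[a-zA-Z0-9_-]{3,32}")


def is_valid_label(label: str) -> bool:
    """Return True if the whole label matches the project label regex."""
    return LABEL_PATTERN.fullmatch(label) is not None


if __name__ == "__main__":
    # A few candidate labels: one valid, one too short,
    # one with a forbidden character, one too long.
    for candidate in ["my-project_01", "ab", "has space", "x" * 33]:
        print(repr(candidate), is_valid_label(candidate))
```

Using fullmatch rather than search matters here: the whole label has to conform, not just a substring of it.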

Pick a label (hereafter referred to as $PROJECTLABEL) and create a project using the following command. It is recommended to use your GitHub username to avoid collisions between project labels within an organization.

Command
nexus projects create $PROJECTLABEL && nexus projects list
Output
Project created (id: https://sandbox.bluebrainnexus.io/v1/projects/demo/$PROJECTLABEL)
+---------------+-------------+------------------------------------------------------------------------+------------+
| Label         | Description | Id                                                                     | Deprecated |
+---------------+-------------+------------------------------------------------------------------------+------------+
| $PROJECTLABEL |             | https://sandbox.bluebrainnexus.io/v1/projects/demo/$PROJECTLABEL | False      |
+---------------+-------------+------------------------------------------------------------------------+------------+

By default, created projects are private, meaning that only the project creator (you) has read and write access to them. We’ll see below how to make a project public.

The output of the previous command shows the list of projects you have read access to. The project you just created should be the only one listed at this point. Let’s select it.

Command
nexus projects select $PROJECTLABEL && nexus projects list
Output
$PROJECTLABEL project selected
+---------------+-------------+------------------------------------------------------------------------+------------+
| Label         | Description | Id                                                                     | Deprecated |
+---------------+-------------+------------------------------------------------------------------------+------------+
| $PROJECTLABEL |             | https://sandbox.bluebrainnexus.io/v1/projects/demo/$PROJECTLABEL | False      |
+---------------+-------------+------------------------------------------------------------------------+------------+

We are all set to bring some data into the project we just created.

Ingest data

Load the dataset

Let’s first list the files that make up the small version of the GRID dataset.

Command
cd getting-started/dataset/grid-small && ls
Output
acronyms.csv  addresses.csv  institutes.csv  links.csv  relationships.csv

The data to be ingested come in 5 CSV files (see the output of the above command), each containing a partial description of the organizations. A single command loads the organizations from institutes.csv and merges them with the contents of all the other CSV files.

nexus resources create --file institutes.csv --type Organization --format csv \
 --idcolumn grid_id --idnamespace http://www.grid.ac/institutes/ \
 --mergewith links.csv --mergewith addresses.csv --mergewith relationships.csv --mergewith acronyms.csv \
 --mergeon grid_id \
 --max-connections 4
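
Conceptually, the merge joins the rows of each file on the shared grid_id column. The sketch below illustrates that idea with Python's csv module on small in-memory samples. It is not the CLI's implementation: the function merge_on is hypothetical, the assignment of columns to files is an assumption, and the sample values are those shown for EPFL in the fetch example of this tutorial. How the CLI treats an institute that has several rows in one file is not described here; this sketch simply lets later values overwrite earlier ones.

```python
import csv
import io

# Tiny in-memory stand-ins for three of the CSV files. Which column lives in
# which file is an assumption; the values come from the EPFL example.
INSTITUTES_CSV = (
    "grid_id,name,established\n"
    "grid.5333.6,École Polytechnique Fédérale de Lausanne,1853\n"
)
ACRONYMS_CSV = "grid_id,acronym\ngrid.5333.6,EPFL\n"
LINKS_CSV = "grid_id,link\ngrid.5333.6,http://www.epfl.ch/index.en.html\n"


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on(base_rows, other_tables, key):
    """Join each table in other_tables onto base_rows using the shared key column."""
    merged = {row[key]: dict(row) for row in base_rows}
    for table in other_tables:
        for row in table:
            target = merged.get(row[key])
            if target is not None:  # rows without a matching base row are skipped
                target.update(row)  # later values overwrite earlier ones
    return list(merged.values())


if __name__ == "__main__":
    organizations = merge_on(
        read_rows(INSTITUTES_CSV),
        [read_rows(ACRONYMS_CSV), read_rows(LINKS_CSV)],
        key="grid_id",
    )
    print(organizations[0])
```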

Access data

View data in Nexus Web

Nexus is deployed with a developer-oriented web application that lets you browse the organizations, projects, data and schemas you have access to. You can go to the address https://sandbox.bluebrainnexus.io/web/demo and browse the data you just loaded.

List data

The simplest way to access data within Nexus is to list them. The following command lists 5 resources:

Command
nexus resources list --size 5

The full payload of the resources is not retrieved when listing them: only the identifier, the type and the metadata added by Nexus are. However, the result list can be scrolled and each resource fetched by its identifier.

Let’s fetch the EPFL organization identified by http://www.grid.ac/institutes/grid.5333.6.

Command
nexus resources fetch http://www.grid.ac/institutes/grid.5333.6
Output
{
  "@context": [
    {
      "@base": "https://sandbox.bluebrainnexus.io/v1/resources/demo/$PROJECTLABEL/_/",
      "@vocab": "https://sandbox.bluebrainnexus.io/v1/vocabs/demo/$PROJECTLABEL/"
    },
    "https://bluebrain.github.io/nexus/contexts/resource.json"
  ],
  "@id": "http://www.grid.ac/institutes/grid.5333.6",
  "@type": "Organization",
  "acronym": "EPFL",
  "city": "Lausanne",
  "country": "Switzerland",
  "country_code": "CH",
  "email_address": "",
  "established": 1853,
  "geonames_city_id": 2659994,
  "grid_id": "grid.5333.6",
  "lat": 46.519082,
  "line_1": "",
  "line_2": "",
  "line_3": "",
  "link": "http://www.epfl.ch/index.en.html",
  "lng": 6.566747,
  "name": "\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne",
  "postcode": "",
  "primary": false,
  "related_grid_id": "grid.482253.a",
  "relationship_type": "Related",
  "state": "",
  "state_code": "",
  "wikipedia_url": "http://en.wikipedia.org/wiki/%C3%89cole_Polytechnique_F%C3%A9d%C3%A9rale_de_Lausanne",
  "_self": "https://sandbox.bluebrainnexus.io/v1/resources/demo/testdemo/_/http%3A%2F%2Fwww.grid.ac%2Finstitutes%2Fgrid.5333.6",
  "_constrainedBy": "https://bluebrain.github.io/nexus/schemas/unconstrained.json",
  "_project": "https://sandbox.bluebrainnexus.io/v1/projects/demo/testdemo",
  "_rev": 1,
  "_deprecated": false,
  "_createdAt": "2019-06-04T08:42:26.433Z",
  "_createdBy": "https://sandbox.bluebrainnexus.io/v1/realms/github/users/mfsy",
  "_updatedAt": "2019-06-04T08:42:26.433Z",
  "_updatedBy": "https://sandbox.bluebrainnexus.io/v1/realms/github/users/mfsy"
}

Whenever a resource is created, Nexus injects some useful metadata. The table below details some of them:

Metadata   | Description | Value Type
@id        | Generated resource identifier. The user can provide their own identifier. | URI
@type      | The type of the resource if provided by the user. | URI
_self      | The resource address within Nexus. It contains the resource management details such as the organization, the project and the schema. | URI
_createdAt | The resource creation date. | DateTime
_createdBy | The resource creator. | URI
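
The _self address in the fetch output above embeds the resource identifier in percent-encoded form. The sketch below rebuilds such an address with Python's urllib.parse.quote. The helper self_address is hypothetical, and the path layout (organization, project, then _ in the schema position) is inferred from the example output rather than taken from an API reference.

```python
from urllib.parse import quote

BASE = "https://sandbox.bluebrainnexus.io/v1"


def self_address(org, project, resource_id, schema="_"):
    """Build a resource address shaped like the _self value in the fetch output."""
    # safe="" forces ':' and '/' in the identifier to be percent-encoded too.
    encoded_id = quote(resource_id, safe="")
    return f"{BASE}/resources/{org}/{project}/{schema}/{encoded_id}"


if __name__ == "__main__":
    print(self_address("demo", "testdemo", "http://www.grid.ac/institutes/grid.5333.6"))
```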

Note that Nexus uses JSON-LD as its data exchange format.
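
One consequence of JSON-LD is that short names such as Organization stand for full IRIs. The sketch below mimics, in plain Python and for this one case only, how a term without a colon is expanded against the @vocab entry of the context shown in the fetch output. It is a simplification and not a JSON-LD processor; expand_term is a hypothetical helper and myproject is a placeholder project label.

```python
def expand_term(term, context):
    """Expand a bare term against @vocab; leave anything that looks like an IRI alone."""
    if ":" in term:
        return term  # already an absolute IRI or a prefixed name
    vocab = context.get("@vocab")
    return vocab + term if vocab else term


if __name__ == "__main__":
    # "myproject" stands in for $PROJECTLABEL.
    context = {
        "@base": "https://sandbox.bluebrainnexus.io/v1/resources/demo/myproject/_/",
        "@vocab": "https://sandbox.bluebrainnexus.io/v1/vocabs/demo/myproject/",
    }
    print(expand_term("Organization", context))
```

This is why the type appears as a full address under /v1/vocabs/ when resources are listed or queried, even though the fetched payload shows only "Organization".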

Filters are available to list specific resources. For example a list of resources of type Organization can be retrieved by running the following command:

Command
nexus resources list --type Organization --size 5
Output
+------------------------------------------------------------------------------------+----------------------------------------------------------------------------+----------+------------+
| Id                                                                                 | Type                                                                       | Revision | Deprecated |
+------------------------------------------------------------------------------------+----------------------------------------------------------------------------+----------+------------+
| https://sandbox.bluebrainnexus.io/v1/resources/demo/$PROJECTLABEL/_/Rating_1  | https://sandbox.bluebrainnexus.io/v1/vocabs/demo/$PROJECTLABEL/Rating | 1        | False      |
| https://sandbox.bluebrainnexus.io/v1/resources/demo/$PROJECTLABEL/_/Rating_9  | https://sandbox.bluebrainnexus.io/v1/vocabs/demo/$PROJECTLABEL/Rating | 1        | False      |
| https://sandbox.bluebrainnexus.io/v1/resources/demo/$PROJECTLABEL/_/Rating_12 | https://sandbox.bluebrainnexus.io/v1/vocabs/demo/$PROJECTLABEL/Rating | 1        | False      |
| https://sandbox.bluebrainnexus.io/v1/resources/demo/$PROJECTLABEL/_/Rating_7  | https://sandbox.bluebrainnexus.io/v1/vocabs/demo/$PROJECTLABEL/Rating | 1        | False      |
| https://sandbox.bluebrainnexus.io/v1/resources/demo/$PROJECTLABEL/_/Rating_8  | https://sandbox.bluebrainnexus.io/v1/vocabs/demo/$PROJECTLABEL/Rating | 1        | False      |
+------------------------------------------------------------------------------------+----------------------------------------------------------------------------+----------+------------+

Query data

Listing is usually not enough to select a specific subset of data. Data ingested within each project can be searched through two complementary search interfaces called views.

View              | Description
ElasticSearchView | Exposes data in ElasticSearch, a document-oriented search engine, and provides access to it using the ElasticSearch query language.
SparqlView        | Exposes data as a graph and lets you navigate and explore the data using the W3C SPARQL query language.

Query data using the ElasticSearchView

The ElasticSearchView URL is available at the address https://sandbox.bluebrainnexus.io/v1/views/demo/$PROJECTLABEL/documents/_search.

The query below selects 5 organizations sorted by creation date in descending order.

Select queries
nexus views query-es --data \
'{
     "size":5,
     "sort" : [
       {
        "_createdAt" : {"order" : "desc"}
       }
     ],
     "query": {
       "terms" : {"@type":["https://sandbox.bluebrainnexus.io/v1/vocabs/demo/$PROJECTLABEL/Organization"]}
     }
 }'
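
Writing JSON by hand inside shell quotes is error-prone, and if PROJECTLABEL is set as a shell variable it is not expanded inside single quotes. As a minimal sketch, the same query body can be built in Python and serialized with json.dumps. The function name organization_query and the example label myproject are placeholders; the printed string is what would be passed to --data.

```python
import json

BASE = "https://sandbox.bluebrainnexus.io/v1"


def organization_query(project_label, org="demo", size=5):
    """Build the ElasticSearch query body shown above for a given project label."""
    type_iri = f"{BASE}/vocabs/{org}/{project_label}/Organization"
    return {
        "size": size,
        # Newest resources first.
        "sort": [{"_createdAt": {"order": "desc"}}],
        # Keep only resources whose @type is the project's Organization type.
        "query": {"terms": {"@type": [type_iri]}},
    }


if __name__ == "__main__":
    print(json.dumps(organization_query("myproject"), indent=2))
```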

Query data using the SparqlView

The SparqlView is available at the address https://sandbox.bluebrainnexus.io/v1/views/demo/$PROJECTLABEL/graph/sparql.

The query below selects 5 organizations sorted by creation date in descending order.

Select queries
nexus views query-sparql --data \
'
PREFIX vocab: <https://sandbox.bluebrainnexus.io/v1/vocabs/demo/$PROJECTLABEL/>
PREFIX nxv: <https://bluebrain.github.io/nexus/vocabulary/>
Select ?org ?name ?createdAt
 WHERE  {

    ?org a vocab:Organization.
    ?org vocab:name  ?name.
    ?org nxv:createdAt ?createdAt
}
ORDER BY DESC (?createdAt)
LIMIT 5'
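
The $PROJECTLABEL placeholder in the query text can also be filled in programmatically. The sketch below uses Python's string.Template, whose $NAME syntax happens to match the placeholder used in this tutorial. The function render_query and the label myproject are illustrative, and the rendered text is meant to be passed to --data.

```python
from string import Template

# The query from above, with $PROJECTLABEL left as a template placeholder.
SPARQL_TEMPLATE = Template("""\
PREFIX vocab: <https://sandbox.bluebrainnexus.io/v1/vocabs/demo/$PROJECTLABEL/>
PREFIX nxv: <https://bluebrain.github.io/nexus/vocabulary/>
SELECT ?org ?name ?createdAt
WHERE {
    ?org a vocab:Organization .
    ?org vocab:name ?name .
    ?org nxv:createdAt ?createdAt
}
ORDER BY DESC(?createdAt)
LIMIT 5
""")


def render_query(project_label):
    """Return the SPARQL query text for the given project label."""
    # substitute() raises KeyError if a placeholder is left unfilled.
    return SPARQL_TEMPLATE.substitute(PROJECTLABEL=project_label)


if __name__ == "__main__":
    print(render_query("myproject"))
```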

Share data

Making a dataset public means granting read permissions to the “anonymous” user.

$ nexus acls make-public

To check that the dataset is now public:

  • Ask the person next to you to list resources in your project.
  • Or create and select another profile named public-tutorial (following the instructions in the Set up). You should see that the public-tutorial profile is selected and that its corresponding Token column is None.
Output
Selected profile: tutorial
+-----------------+----------+--------------------------------------+-----------------+
| Profile         | Selected | URL                                  | Token           |
+-----------------+----------+--------------------------------------+-----------------+
| tutorial        |          | https://sandbox.bluebrainnexus.io/v1 | Expiry: 2019... |
| public-tutorial | Yes      | https://sandbox.bluebrainnexus.io/v1 | None            |
+-----------------+----------+--------------------------------------+-----------------+
  • Resources in your project should be listed with the command even though you are not authenticated.
Command
nexus resources list --size 5 -o demo -p $PROJECTLABEL
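
The same check can be sketched outside the CLI by preparing an HTTP request that carries no token. The example below only builds the request with Python's urllib and does not send it. The listing path /resources/{org}/{project} and the size parameter are assumptions modelled on the addresses shown in this tutorial, and build_list_request and myproject are hypothetical names.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://sandbox.bluebrainnexus.io/v1"


def build_list_request(org, project, size=5):
    """Build (but do not send) an anonymous request listing resources in a project."""
    query = urlencode({"size": size})
    url = f"{BASE}/resources/{org}/{project}?{query}"
    # No Authorization header is set, so the server sees an anonymous caller.
    return Request(url, headers={"Accept": "application/json"})


if __name__ == "__main__":
    request = build_list_request("demo", "myproject")
    print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would then be expected to succeed only if the project has been made public.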