Ingesting metadata
This document showcases Compass usage, from producing data to using the APIs for specific use cases. The guide is geared towards administrators more than users. It introduces the available configuration options and explains how they affect the behaviour of the application.
Prerequisites
This guide assumes that you have a local instance of Compass running and listening on localhost:8080. See the Installation guide for information on how to run Compass.
Adding Data
Let's say that you have a hypothetical tool called Piccolo and several deployments of it on your platform. Before you can push data for the Piccolo deployments to Compass, you need to decide which asset type Piccolo is: table, topic, dashboard, or job. Metadata is ingested into Compass with the Upsert Patch API. The API contract is available here.
If the asset already exists, the Upsert Patch API compares each field in the request with the existing asset. This makes it possible to send only the fields you want to change, as long as the urn, type, and service match the existing asset. If any field has changed, a new version of the asset is created. If the asset does not exist, the Upsert Patch API creates it. Apart from the asset details, the request body can also carry the upstreams and downstreams lineage edges of the asset.
Let's say the Piccolo tool is a kind of table, so we can start pushing data for it. Let's add three assets for it, using the service name picollo as in the requests that follow.
- Picollo 1
- Picollo 2
- Picollo 3
$ curl --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Compass-User-UUID:gotocompany@email.com' \
--data-raw '{
"asset": {
"urn": "picollo:deployment-01",
"type": "table",
"name": "deployment-01",
"service": "picollo",
"description": "this is the one",
"data": {},
"owners": [
{
"email": "john.doe@email.com"
}
]
}
}'
$ curl --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Compass-User-UUID:gotocompany@email.com' \
--data-raw '{
"asset": {
"urn": "picollo:deployment-02",
"type": "table",
"name": "deployment-02",
"service": "picollo",
"description": "this came second",
"data": {},
"owners": [
{
"email": "kami@email.com"
}
]
}
}'
$ curl --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Compass-User-UUID:gotocompany@email.com' \
--data-raw '{
"asset": {
"urn": "picollo:deployment-03",
"type": "table",
"name": "deployment-03",
"service": "picollo",
"description": "the last one",
"data": {},
"owners": [
{
"email": "kami@email.com"
}
]
}
}'
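The partial-update behaviour described earlier can be sketched with a small helper that sends only the identifying fields plus the one field to change. This is a minimal sketch, not part of Compass: build_description_patch and patch_description are hypothetical names, and whether the server accepts a body without name or data is an assumption that should be checked against the API contract.

```shell
# Build a partial Upsert Patch body that changes only the description.
# urn, type, and service identify the existing asset; all other fields are omitted.
# Values are interpolated as-is, so they must not contain double quotes or backslashes.
build_description_patch() {
  # $1 = urn, $2 = type, $3 = service, $4 = new description
  printf '{"asset":{"urn":"%s","type":"%s","service":"%s","description":"%s"}}' \
    "$1" "$2" "$3" "$4"
}

# Send the body to the local instance assumed by this guide (hypothetical wrapper).
patch_description() {
  curl --request PATCH 'http://localhost:8080/v1beta1/assets' \
    --header 'Compass-User-UUID:gotocompany@email.com' \
    --data-raw "$(build_description_patch "$@")"
}
```

For example, patch_description picollo:deployment-01 table picollo 'first deployment' would send a body that contains a new description and nothing else besides the three identifying fields.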
Searching
We can search for required text in the following ways:
- Using the compass search <text> CLI command
- Calling the GET /v1beta1/search API with text, the text to be searched, as a query parameter
Now we're ready to start searching. Let's run a search for the term 'one' from the assets we ingested earlier.
- CLI
- HTTP
$ compass search one
$ curl 'http://localhost:8080/v1beta1/search?text=one' \
--header 'Compass-User-UUID:gotocompany@email.com' | jq
The output is the following:
{
"data": [
{
"urn": "picollo:deployment-01",
"type": "table",
"name": "deployment-01",
"service": "picollo",
"description": "this is the one",
"data": {},
"owners": [
{
"email": "john.doe@email.com"
}
],
"labels": {}
},
{
"urn": "picollo:deployment-03",
"type": "table",
"name": "deployment-03",
"service": "picollo",
"description": "the last one",
"data": {},
"owners": [
{
"email": "kami@email.com"
}
],
"labels": {}
}
]
}
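If the search text contains spaces or other reserved characters, it has to be URL-encoded before it goes into the query string. One way to do that, sketched here under the same localhost assumptions as the rest of this guide, is to let curl do the encoding with --get and --data-urlencode. The function name search_text is hypothetical.

```shell
# Search for free text that may contain spaces or other reserved characters.
# --get makes curl append the --data-urlencode pair to the URL as a query string
# instead of sending it as a request body.
search_text() {
  # $1 = text to search for
  curl --silent --get 'http://localhost:8080/v1beta1/search' \
    --header 'Compass-User-UUID:gotocompany@email.com' \
    --data-urlencode "text=$1"
}
```

A call such as search_text 'last one' | jq then prints the response in the same shape as the output above.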
The search is run against all fields of the records. It can be restricted further with two kinds of criteria: filter for an exact match and query for a fuzzy match. For instance, you can use query to restrict the search to picollo deployments whose owners fuzzily match kami.
We can search with custom queries in the following ways:
- Using the compass search <text> --query=field_key1:val1 CLI command
- Calling the GET /v1beta1/search API with text and query[field_key1]=val1 as query parameters
- CLI
- HTTP
$ compass search one --query=owners:kami
$ curl --globoff 'http://localhost:8080/v1beta1/search?text=one&query[owners]=kami' \
--header 'Compass-User-UUID:gotocompany@email.com' | jq
The output is the following:
{
"data": [
{
"urn": "picollo:deployment-03",
"type": "table",
"name": "deployment-03",
"service": "picollo",
"description": "the last one",
"data": {},
"owners": [
{
"email": "kami@email.com"
}
],
"labels": {}
}
]
}
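Square brackets in a URL are special to curl, which by default tries to interpret them as a glob range. A minimal sketch of one way around this is to percent-encode the brackets when building the URL. The helper name search_url is hypothetical, and the sketch assumes the server decodes %5B and %5D back to brackets, as HTTP servers normally do for query strings.

```shell
# Build a search URL with one fuzzy "query" restriction.
# '[' and ']' are written as %5B and %5D so curl does not treat them as a glob range.
search_url() {
  # $1 = search text, $2 = field key, $3 = value; all assumed to be URL-safe already
  printf 'http://localhost:8080/v1beta1/search?text=%s&query%%5B%s%%5D=%s' \
    "$1" "$2" "$3"
}
```

The resulting URL can be passed straight to curl, for example curl "$(search_url one owners kami)" --header 'Compass-User-UUID:gotocompany@email.com' | jq.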
Lineage
Now that we have ingested the picollo assets and learnt how to use the search API to find them, let's configure lineage for them.
To begin with, let's ingest the picollo metadata again, this time with its lineage information, and then add assets for another service named sensu with type topic.
Adding picollo Metadata
- Picollo 1
- Picollo 2
- Picollo 3
$ curl --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Compass-User-UUID:gotocompany@email.com' \
--data-raw '{
"asset": {
"urn": "picollo:deployment-01",
"type": "table",
"name": "deployment-01",
"service": "picollo",
"description": "this is the one",
"data": {},
"owners": [
{
"email": "john.doe@email.com"
}
]
},
"upstreams": [
{
"urn": "sensu:deployment-01",
"type": "topic",
"service": "sensu"
}
],
"downstreams": [
{
"urn": "gohan:deployment-01",
"type": "table",
"service": "gohan"
}
]
}'
$ curl --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Compass-User-UUID:gotocompany@email.com' \
--data-raw '{
"asset": {
"urn": "picollo:deployment-02",
"type": "table",
"name": "deployment-02",
"service": "picollo",
"description": "this came second",
"data": {},
"owners": [
{
"email": "kami@email.com"
}
]
},
"upstreams": [
{
"urn": "sensu:deployment-02",
"type": "topic",
"service": "sensu"
}
],
"downstreams": [
{
"urn": "gohan:deployment-02",
"type": "table",
"service": "gohan"
}
]
}'
$ curl --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Compass-User-UUID:gotocompany@email.com' \
--data-raw '{
"asset": {
"urn": "picollo:deployment-03",
"type": "table",
"name": "deployment-03",
"service": "picollo",
"description": "the last one",
"data": {},
"owners": [
{
"email": "kami@email.com"
}
]
},
"upstreams": [
{
"urn": "sensu:deployment-03",
"type": "topic",
"service": "sensu"
}
],
"downstreams": [
{
"urn": "gohan:deployment-03",
"type": "table",
"service": "gohan"
}
]
}'
Adding sensu Metadata
sensu is the data store that picollo instances read from. To build the lineage, the sensu assets and their URNs need to exist in Compass.
For instance, if you look at the upstreams field in the requests that ingest the picollo metadata, you'll see that its entries are URNs of sensu instances (the downstreams entries point to gohan tables). This means we can define the relationship between picollo and sensu resources by declaring it in the picollo request.
- Sensu 1
- Sensu 2
- Sensu 3
$ curl --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Compass-User-UUID:gotocompany@email.com' \
--data-raw '{
"asset": {
"urn": "sensu:deployment-01",
"type": "topic",
"name": "deployment-01",
"service": "sensu",
"description": "primary sensu dataset",
"data": {}
},
"upstreams": [],
"downstreams": []
}'
$ curl --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Compass-User-UUID:gotocompany@email.com' \
--data-raw '{
"asset": {
"urn": "sensu:deployment-02",
"type": "topic",
"name": "deployment-02",
"service": "sensu",
"description": "secondary sensu dataset",
"data": {}
},
"upstreams": [],
"downstreams": []
}'
$ curl --request PATCH 'http://localhost:8080/v1beta1/assets' \
--header 'Compass-User-UUID:gotocompany@email.com' \
--data-raw '{
"asset": {
"urn": "sensu:deployment-03",
"type": "topic",
"name": "deployment-03",
"service": "sensu",
"description": "tertiary sensu dataset",
"data": {}
},
"upstreams": [],
"downstreams": []
}'
Note: it is sufficient (and preferred) for only one asset to declare its relationship to the other. Both need not do this.
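As an illustration of declaring the relationship from one side only, the sketch below builds a request body in which the reading asset names its upstream, so the upstream asset's own request does not need to mention the edge. The helper name build_asset_with_upstream is hypothetical, and the body is a reduced version of the picollo requests above (the downstreams key is left out on purpose, assuming the server treats an absent key as "no change").

```shell
# Build an Upsert Patch body for an asset that declares a single upstream.
# Values are interpolated as-is and must not contain double quotes or backslashes.
build_asset_with_upstream() {
  # $1 = urn, $2 = type, $3 = name, $4 = service
  # $5 = upstream urn, $6 = upstream type, $7 = upstream service
  printf '{"asset":{"urn":"%s","type":"%s","name":"%s","service":"%s","data":{}},' \
    "$1" "$2" "$3" "$4"
  printf '"upstreams":[{"urn":"%s","type":"%s","service":"%s"}]}' \
    "$5" "$6" "$7"
}
```

The output can be passed to curl with --data-raw in the same way as the requests above.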
Querying Lineage
We can search for lineage in the following ways:
- Using the compass lineage <urn> CLI command
- Calling the GET /v1beta1/lineage/:urn API with the urn to be searched as the path parameter
- CLI
- HTTP
$ compass lineage picollo:deployment-01
$ curl 'http://localhost:8080/v1beta1/lineage/picollo%3Adeployment-01' \
--header 'Compass-User-UUID:gotocompany@email.com'
The output is the following:
{
"data": [
{
"source": "sensu:deployment-01",
"target": "picollo:deployment-01",
"prop": {
"root": "picollo:deployment-01"
}
},
{
"source": "picollo:deployment-01",
"target": "gohan:deployment-01",
"prop": {
"root": "picollo:deployment-01"
}
}
],
"node_attrs": {}
}
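The response represents the lineage graph as a list of edges. Each entry in data connects a source asset to a target asset, and in this output prop.root is the URN that the lineage was requested for.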
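To turn the edge list into something easier to read, the direct neighbours of the requested asset can be picked out with jq, the same tool used to format the search responses earlier. This is a minimal sketch that assumes the response shape shown above. The function names direct_upstreams and direct_downstreams are hypothetical.

```shell
# Print the direct upstream URNs of an asset from a lineage response read on stdin.
direct_upstreams() {
  # $1 = root URN; an edge whose target is the root starts at an upstream asset
  jq -r --arg root "$1" '.data[] | select(.target == $root) | .source'
}

# Print the direct downstream URNs of an asset from a lineage response read on stdin.
direct_downstreams() {
  # $1 = root URN; an edge whose source is the root ends at a downstream asset
  jq -r --arg root "$1" '.data[] | select(.source == $root) | .target'
}
```

For example, piping the HTTP response for picollo:deployment-01 into direct_upstreams picollo:deployment-01 lists the assets it reads from, one URN per line.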