#
Export
The Export plugin allows data retrieved via the Rosetta data service to be exported to various export targets (e.g. a CSV file, an Elasticsearch index).
Exports are carried out asynchronously: export jobs are first requested and submitted to a blocking queue, and
an export job id is generated and included in the response (see the POST /export endpoint).
The status of the job can then be polled using the GET /export/job/{id} endpoint.
Once the job status is COMPLETED, the export output may be retrieved from the target.
The Export plugin also provides a number of export-specific Views that allow the job submission/status endpoints to be bound to specific paths, as well as simplified/customized using request and response transforms.
#
REST Endpoints
Export jobs can be managed entirely with the REST endpoints built into this plugin, as described in this section.
However, since export requests may contain sensitive information such as local file paths or user credentials,
it is highly recommended that appropriate Views are configured for external use
(i.e. at least one export/request and one export/get View), such that only carefully controlled parameters (if any)
and information are exposed externally, and that the built-in endpoints are locked down using an API
gateway such as Kong, Apache, or nginx.
Each endpoint below is listed in the form: {HTTP request method} {HTTP request path template}, followed by a description of the endpoint.
#
POST /export
Submit an export request to the export service.
#
Request
Body
Schema:
The request body is a JSON object with a String property, type, whose value determines
the schema for the remaining properties.
See the Export Targets section for the individual type values and corresponding request schemas.
#
Response
Body
Schema:
A JSON object that describes the outcome of the export request, including the job id (if accepted), status and a string message.
Code
200 if the export request was accepted onto the queue; 403 if the request was refused (e.g. due to the queue being full);
or 500 if an error occurred.
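A client might interpret these outcomes along the lines of the sketch below. The function name and the exception type are illustrative and not part of the plugin; the field names (job_id, status, message, error) follow the ExportResponse schema documented under Data Structures.

```python
class ExportRefused(Exception):
    """Raised when a request is not accepted (hypothetical helper, not a plugin class)."""


def job_id_from_submit_response(http_status: int, body: dict) -> str:
    """Return the job id for an accepted request, or raise with the service's message."""
    if http_status == 200 and body.get("status") == "accepted":
        return body["job_id"]
    # 403 means the request was refused (e.g. queue full); 500 means an error occurred.
    error = body.get("error") or {}
    detail = body.get("message") or error.get("message") or "no detail given"
    raise ExportRefused(f"HTTP {http_status} ({body.get('status')}): {detail}")
```

The returned job id is the value to keep for later GET /export/job/{id} requests.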
#
POST /export/pipeline
Execute a named export pipeline. The pipelines must be defined in the
#
Request
Parameters
#
Response
Body
Schema:
An array of ExportResponse objects, each describing the outcome of one ExportRequest in the pipeline.
#
GET /export/status
Reports the overall status of the export plugin.
#
Response
Body
Schema:
An object containing information about the status of the export service, including the total number of jobs in the queue.
#
GET /export/job
Lists the currently running and completed export jobs.
#
Response
Body
Schema:
An array of ExportJob objects, including those in the queue and a history of completed jobs.
#
GET /export/job/{id}
Retrieves a single export job by id.
#
Request
Parameters
#
Response
Body
Schema:
The ExportJob object whose id is given by the path parameter, id.
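Since exports run asynchronously, a client typically polls this endpoint until the job finishes. The following is a minimal sketch of such a loop. The fetch_job callable is an assumption: it stands in for whatever HTTP client code issues GET /export/job/{id} and returns the parsed ExportJob. The polling interval and poll limit are arbitrary choices.

```python
import time
from typing import Callable


def wait_for_job(fetch_job: Callable[[str], dict], job_id: str,
                 interval_s: float = 2.0, max_polls: int = 100) -> dict:
    """Poll until the job reaches COMPLETED or FAILED, then return the final ExportJob."""
    for _ in range(max_polls):
        job = fetch_job(job_id)  # e.g. a wrapper around GET /export/job/{id}
        if job.get("status") in ("COMPLETED", "FAILED"):
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing the HTTP call in as a function keeps the loop independent of any particular client library.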
#
GET /export/explain
Provides information about the job that would be submitted for a given request body (without actually submitting it).
#
Request
Body
Schema:
The request body is a JSON object with a String property, type, whose value determines
the schema for the remaining properties.
#
Response
Body
Schema:
A JSON object describing details of the submitted ExportRequest, including the corresponding Java class,
the resolved type, the full export request (including any default values for each field) and
any errors that may have occurred.
#
Processes
For a number of export targets, the means by which results are retrieved from the Rosetta data service for export is defined in terms of one or more processes.
In this context, a process is composed of the following steps:
- Execute a starting data service request against the data service to retrieve the first page of results to export, where the starting request body conforms to the CustomPagedRequest schema (i.e. has the from and size parameters).
- Export the current page of results.
- Form a new data service request by incrementing the from parameter, where the increment amount is influenced by various request fields but, by default, is equal to the size parameter of the starting request.
- Execute the incremented request against the data service to retrieve the next page of results.
- Repeat steps 2-4 until at least one of the configured exit conditions is met.
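The steps above can be sketched as a simple paging loop. This is an illustration under stated assumptions, not the plugin's implementation: execute is a hypothetical callable standing in for the data service, the results field name in the response is assumed, and only a simplified version of the size exit condition is checked (the documented defaults are not_found, size_no_errors and total).

```python
from typing import Callable, Iterator


def run_process(execute: Callable[[dict], dict], starting_request: dict) -> Iterator[list]:
    """Yield one page of results at a time, advancing 'from' by 'size' each step."""
    request = dict(starting_request)
    request.setdefault("from", 0)
    request.setdefault("size", 100)
    while True:
        response = execute(request)            # steps 1 and 4: run the request
        results = response.get("results", [])  # 'results' is an assumed field name
        yield results                          # step 2: hand the page to the exporter
        # Simplified exit check, loosely modelled on the 'size' exit condition.
        if len(results) < request["size"]:
            return
        # Step 3: increment 'from' (the default increment equals 'size').
        request = {**request, "from": request["from"] + request["size"]}
```

With 250 available results and a size of 100, this loop yields pages of 100, 100 and 50 results before exiting.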
#
Export Targets
The export target is determined by the type field in the export request body. The following subsections describe each
type of export target and are titled for their respective type value.
#
csv
Export to a CSV file on the local file system, in an S3 bucket, or in a LocalStack S3 bucket.
Each result is exported as a separate row where the keys are interpreted as the column headers of the output CSV file. If the results are composed of complex data with nested maps and arrays, then the column headers will be the full definite JSONPaths (excluding the "$." prefix) to each leaf node in each result.
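To illustrate how nested results turn into path-based column headers, here is a small flattening sketch. The exact header syntax the plugin produces is not specified here; the dot-and-bracket notation below (e.g. born.year, aliases[0]) is an assumption based on common JSONPath notation with the "$." prefix removed.

```python
def flatten(value, path=""):
    """Map each leaf of a nested result to a path-like column header."""
    if isinstance(value, dict):
        row = {}
        for key, child in value.items():
            row.update(flatten(child, f"{path}.{key}" if path else key))
        return row
    if isinstance(value, list):
        row = {}
        for index, child in enumerate(value):
            row.update(flatten(child, f"{path}[{index}]"))
        return row
    return {path: value}  # leaf node: the accumulated path becomes the header
```

For example, flatten({"name": "John", "born": {"year": 1900}}) gives a row with the headers name and born.year.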
Because nested results produce path-based column headers, it is recommended that the results are converted to a CSV-appropriate JSON format
(e.g. a flat map of key-value pairs) before being processed by the export service. This can be done by creating a CSV-specific
Profile that has a Glyph registered in the data phase, which converts to a format
amenable to CSV exports. Then, configure the export to use this Profile, either in the
#
Request Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"type": {
"const": "csv"
},
"processes": {
"type": "array",
"description": "An array of processes that are executed in sequence by the export service for the purposes of retrieving the results to be exported.",
"items": {
"type": "object",
"properties": {
"starting_request": {
"type": "object",
"description": "The initial data service request made to the Rosetta data service to retrieve the result set for export",
"properties": {
"profile": {
"type": "string",
"description": "The profile used to retrieve the result set"
},
"request": {
"type": "object",
"description": "The initial request",
"properties": {
"from": {
"type": "integer",
"description": "The zero-based result offset for the initial request."
},
"size": {
"type": "integer",
"description": "The maximum number of results per page. Functions as the export batch size when 'increment_type' is set to 'size'."
}
}
}
},
"default": {
"request": {
"size": 100
}
}
},
"increment_type": {
"description": "Determines the method of incrementing the 'from' parameter.",
"oneOf": [
{
"const": "size",
"description": "Increments the 'from' parameter by the 'size' field of the starting Rosetta request"
},
{
"const": "one",
"description": "Increments the 'from' parameter by 1"
},
{
"const": "custom",
"description": "Increments the 'from' parameter by a custom amount given by 'custom_batch_size', which may be different from the size parameter in the starting request"
}
],
"default": "size"
},
"custom_batch_size": {
"type": "integer",
"description": "Batch size when used with 'custom' increment type."
},
"to": {
"type": "integer",
"description": "Maximum value for 'from' (exclusive). Has no effect unless used with 'to' exit condition."
},
"exit_conditions": {
"type": "array",
"description": "List of exit conditions. Combined with Boolean OR.",
"items": {
"oneOf": [
{
"const": "not_found",
"description": "Exits if the data response field 'found' is false"
},
{
"const": "size",
"description": "Exits if the number of results is smaller than the request size"
},
{
"const": "size_no_errors",
"description": "Exits if the result set is smaller than the request size and there are no errors"
},
{
"const": "total",
"description": "Exits if the total number of records is smaller than the 'from' request parameter"
},
{
"const": "to",
"description": "Exits if the 'from' parameter exceeds the 'to' parameter in the export request"
}
]
},
"default": [
"not_found",
"size_no_errors",
"total"
]
}
}
},
"minItems": 1
},
"skip_total_count": {
"type": "boolean",
"description": "Determines whether to skip populating the total count before starting the export",
"default": false
},
"config": {
"type": "object",
"properties": {
"export_type": {
"description": "Determines the type of destination for the export",
"oneOf": [
{
"const": "local",
"description": "Saves to local file system"
},
{
"const": "s3",
"description": "Pushes to remote S3 bucket"
},
{
"const": "localstack",
"description": "Pushes to S3 bucket within a LocalStack deployment"
}
]
},
"file_path": {
"type": "string",
"description": "The path to a directory in which the output file is saved. The file path must be specified, even if 'export_type' is 's3' or 'localstack' as the output file is created here as a temporary local file before uploading to S3."
},
"file_name": {
"type": "string",
"description": "The name of the output file to be saved (including the file extension). If this is not specified, then the job id followed by the file extension will be used."
},
"s3_config": {
"type": "object",
"description": "Configuration for the S3 connection. Required if 'export_type' is set to 's3' or 'localstack'. Ignored otherwise.",
"properties": {
"bucket": {
"type": "string",
"description": "The name of the S3 bucket"
},
"region": {
"type": "string",
"description": "The AWS region",
"default": "eu-west-1"
},
"access_key_id": {
"type": "string",
"description": "The AWS access key id"
},
"secret_key": {
"type": "string",
"description": "The AWS secret access key"
},
"uri": {
"type": "string",
"description": "The URI of the endpoint override to use if 'export_type' is set to 'localstack'. Ignored otherwise."
},
"prefix": {
"type": "string",
"description": "The path prefix to use when uploading objects to S3"
}
},
"required": [
"bucket",
"access_key_id",
"secret_key"
]
},
"columns": {
"type": "array",
"description": "An ordered list of column headers to be included in the CSV output. If not specified, then the headers are inferred from the structure of the exported records. In this case, the order of the headers is not guaranteed to be consistent.",
"items": {
"type": "string"
}
},
"create_directories": {
"type": "boolean",
"description": "Creates any non-existent directories as necessary when saving the CSV files. If set to false (default), then the directory specified in 'file_path' will have to be created manually. Otherwise, an error will occur.",
"default": false
},
"deduplicate": {
"type": "boolean",
"description": "If true, rows will only be written to the CSV output if they are unique.",
"default": false
},
"deduplication_cache_size": {
"type": "integer",
"description": "The size of the deduplication cache. The cache is used to speed up checking whether a CSV row is a duplicate while limiting the maximum memory footprint.",
"default": 100
},
"sort_on": {
"type": "string",
"description": "Name of column on which to sort. If none is specified, sorting is skipped."
},
"add_bom": {
"type": "boolean",
"description": "Determines whether to add a byte order mark (BOM) to the output file",
"default": false
}
},
"required": [
"export_type",
"file_path"
]
}
},
"required": [
"type",
"processes"
]
}
#
Example
{
"type": "csv",
"processes": [
{
"starting_request": {
"profile": "search",
"request": {
"queries": [
"John Smith"
],
"from": 0,
"size": 100
}
},
"increment_type": "size",
"exit_conditions": [
"not_found",
"size_no_errors",
"total"
]
}
],
"skip_total_count": false,
"config": {
"export_type": "s3",
"file_path": "/rosetta/data/temp/s3_export",
"s3_config": {
"bucket": "my-bucket",
"region": "eu-west-1",
"access_key_id": "abc123",
"secret_key": "def456",
"prefix": "export-output/csv"
},
"columns": [
"id",
"name",
"description",
"birth_year",
"death_year"
],
"create_directories": true,
"deduplicate": true,
"sort_on": "name",
"add_bom": true
}
}
#
json
Export to a JSON file, either on the local file system or in an S3 bucket.
The output JSON file takes the form of a JSON array whose items are the results that have been exported.
#
Request Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"type": {
"const": "json"
},
"processes": {
"type": "array",
"description": "An array of processes that are executed in sequence by the export service for the purposes of retrieving the results to be exported.",
"items": {
"type": "object",
"properties": {
"starting_request": {
"type": "object",
"description": "The initial data service request made to the Rosetta data service to retrieve the result set for export",
"properties": {
"profile": {
"type": "string",
"description": "The profile used to retrieve the result set"
},
"request": {
"type": "object",
"description": "The initial request",
"properties": {
"from": {
"type": "integer",
"description": "The zero-based result offset for the initial request."
},
"size": {
"type": "integer",
"description": "The maximum number of results per page. Functions as the export batch size when 'increment_type' is set to 'size'."
}
}
}
},
"default": {
"request": {
"size": 100
}
}
},
"increment_type": {
"description": "Determines the method of incrementing the 'from' parameter.",
"oneOf": [
{
"const": "size",
"description": "Increments the 'from' parameter by the 'size' field of the starting Rosetta request"
},
{
"const": "one",
"description": "Increments the 'from' parameter by 1"
},
{
"const": "custom",
"description": "Increments the 'from' parameter by a custom amount given by 'custom_batch_size', which may be different from the size parameter in the starting request"
}
],
"default": "size"
},
"custom_batch_size": {
"type": "integer",
"description": "Batch size when used with 'custom' increment type."
},
"to": {
"type": "integer",
"description": "Maximum value for 'from' (exclusive). Has no effect unless used with 'to' exit condition."
},
"exit_conditions": {
"type": "array",
"description": "List of exit conditions. Combined with Boolean OR.",
"items": {
"oneOf": [
{
"const": "not_found",
"description": "Exits if the data response field 'found' is false"
},
{
"const": "size",
"description": "Exits if the number of results is smaller than the request size"
},
{
"const": "size_no_errors",
"description": "Exits if the result set is smaller than the request size and there are no errors"
},
{
"const": "total",
"description": "Exits if the total number of records is smaller than the 'from' request parameter"
},
{
"const": "to",
"description": "Exits if the 'from' parameter exceeds the 'to' parameter in the export request"
}
]
},
"default": [
"not_found",
"size_no_errors",
"total"
]
}
}
},
"minItems": 1
},
"skip_total_count": {
"type": "boolean",
"description": "Determines whether to skip populating the total count before starting the export",
"default": false
},
"config": {
"type": "object",
"properties": {
"export_type": {
"description": "Determines the type of destination for the export",
"oneOf": [
{
"const": "local",
"description": "Saves to local file system"
},
{
"const": "s3",
"description": "Pushes to remote S3 bucket"
}
]
},
"file_path": {
"type": "string",
"description": "The path to a directory in which the output file is saved. The file path must be specified, even if 'export_type' is 's3' as the output file is created here as a temporary local file before uploading to S3."
},
"s3_config": {
"type": "object",
"description": "Configuration for the S3 connection. Required if 'export_type' is set to 's3'. Ignored otherwise.",
"properties": {
"bucket": {
"type": "string",
"description": "The name of the S3 bucket"
},
"region": {
"type": "string",
"description": "The AWS region",
"default": "eu-west-1"
},
"access_key_id": {
"type": "string",
"description": "The AWS access key id"
},
"secret_key": {
"type": "string",
"description": "The AWS secret access key"
},
"prefix": {
"type": "string",
"description": "The path prefix to use when uploading objects to S3"
}
},
"required": [
"bucket",
"access_key_id",
"secret_key"
]
}
},
"required": [
"export_type",
"file_path"
]
}
},
"required": [
"type",
"processes"
]
}
#
Example
{
"type": "json",
"processes": [
{
"starting_request": {
"profile": "search",
"request": {
"queries": [
"John Smith"
],
"from": 0,
"size": 100
}
},
"increment_type": "size",
"exit_conditions": [
"not_found",
"size_no_errors",
"total"
]
}
],
"skip_total_count": false,
"config": {
"export_type": "s3",
"file_path": "/rosetta/data/temp/s3_export",
"s3_config": {
"bucket": "my-bucket",
"region": "eu-west-1",
"access_key_id": "abc123",
"secret_key": "def456",
"prefix": "export-output/json"
}
}
}
#
elasticsearch
Export to an Elasticsearch index. Each Metadata result is indexed such that the data field is used as the document
source and the id field is used as the document id, or a unique id is automatically generated if this field is not
populated.
Note that the id field of a Metadata result can only be populated in the domain phase of a Profile.
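As a rough illustration of this mapping, the sketch below builds an Elasticsearch bulk request body from Metadata-like dictionaries. It is not the plugin's code: the function name is hypothetical, and leaving out _id so that Elasticsearch generates the id is an assumption, since the text above does not say which component generates it.

```python
import json


def to_bulk_body(results: list, index: str) -> str:
    """Build an Elasticsearch bulk request body (NDJSON) from Metadata-like dicts."""
    lines = []
    for result in results:
        action = {"index": {"_index": index}}
        if result.get("id"):               # omit _id so Elasticsearch generates one
            action["index"]["_id"] = result["id"]
        lines.append(json.dumps(action))   # action line
        lines.append(json.dumps(result["data"]))  # document source line
    return "\n".join(lines) + "\n"         # the bulk API requires a trailing newline
```

Each result contributes two lines: an action line naming the index (and id, if any) followed by the document source taken from the data field.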
#
Request Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"type": {
"const": "elasticsearch"
},
"processes": {
"type": "array",
"description": "An array of processes that are executed in sequence by the export service for the purposes of retrieving the results to be exported.",
"items": {
"type": "object",
"properties": {
"starting_request": {
"type": "object",
"description": "The initial data service request made to the Rosetta data service to retrieve the result set for export",
"properties": {
"profile": {
"type": "string",
"description": "The profile used to retrieve the result set"
},
"request": {
"type": "object",
"description": "The initial request",
"properties": {
"from": {
"type": "integer",
"description": "The zero-based result offset for the initial request."
},
"size": {
"type": "integer",
"description": "The maximum number of results per page. Functions as the export batch size when 'increment_type' is set to 'size'."
}
}
}
},
"default": {
"request": {
"size": 100
}
}
},
"increment_type": {
"description": "Determines the method of incrementing the 'from' parameter.",
"oneOf": [
{
"const": "size",
"description": "Increments the 'from' parameter by the 'size' field of the starting Rosetta request"
},
{
"const": "one",
"description": "Increments the 'from' parameter by 1"
},
{
"const": "custom",
"description": "Increments the 'from' parameter by a custom amount given by 'custom_batch_size', which may be different from the size parameter in the starting request"
}
],
"default": "size"
},
"custom_batch_size": {
"type": "integer",
"description": "Batch size when used with 'custom' increment type."
},
"to": {
"type": "integer",
"description": "Maximum value for 'from' (exclusive). Has no effect unless used with 'to' exit condition."
},
"exit_conditions": {
"type": "array",
"description": "List of exit conditions. Combined with Boolean OR.",
"items": {
"oneOf": [
{
"const": "not_found",
"description": "Exits if the data response field 'found' is false"
},
{
"const": "size",
"description": "Exits if the number of results is smaller than the request size"
},
{
"const": "size_no_errors",
"description": "Exits if the result set is smaller than the request size and there are no errors"
},
{
"const": "total",
"description": "Exits if the total number of records is smaller than the 'from' request parameter"
},
{
"const": "to",
"description": "Exits if the 'from' parameter exceeds the 'to' parameter in the export request"
}
]
},
"default": [
"not_found",
"size_no_errors",
"total"
]
},
"deletion_query": {
"type": "object",
"description": "An optional Elasticsearch query that, if specified, is used to delete documents that match the query from the Elasticsearch index before proceeding with the export."
}
}
},
"minItems": 1
},
"skip_total_count": {
"type": "boolean",
"description": "Determines whether to skip populating the total count before starting the export",
"default": false
},
"batch_size": {
"type": "integer",
"description": "The maximum number of results to be included in each bulk index request before it is executed."
},
"config": {
"type": "object",
"properties": {
"client": {
"type": "object",
"properties": {
"url": {
"type": "string",
"description": "The URL for the Elasticsearch cluster to which the results are to be exported. Should either be of the form '{scheme}://{host}:{port}' or '{scheme}://{host}'."
},
"path_prefix": {
"type": "string",
"description": "The path prefix of the Elasticsearch cluster (i.e. the path to its root/info endpoint). Not required if the cluster is exposed directly. Used in cases where the cluster is mapped to a specific path behind a reverse-proxy, for instance."
},
"username": {
"type": "string",
"description": "The username to be used in Elasticsearch requests."
},
"password": {
"type": "string",
"description": "The password to be used in Elasticsearch requests."
}
},
"required": [
"url"
]
},
"index": {
"type": "string",
"description": "The Elasticsearch index to which the results are to be exported."
},
"mappings": {
"type": "object",
"description": "The Elasticsearch mappings to use when indexing the results. See https://www.elastic.co/guide/en/elasticsearch/reference/7.17/mapping.html."
},
"settings": {
"type": "object",
"description": "The Elasticsearch index settings to use when indexing the results. See https://www.elastic.co/guide/en/elasticsearch/reference/7.17/index-modules.html."
}
},
"required": [
"client",
"index"
]
}
},
"required": [
"type",
"processes"
]
}
#
Example
{
"type": "elasticsearch",
"processes": [
{
"starting_request": {
"profile": "search",
"request": {
"queries": [
"John Smith"
],
"from": 0,
"size": 100
}
},
"increment_type": "size",
"exit_conditions": [
"not_found",
"size_no_errors",
"total"
],
"deletion_query": {
"term": {
"data_type": "person"
}
}
}
],
"skip_total_count": false,
"batch_size": 50,
"config": {
"client": {
"url": "http://myfakedomain:1234",
"path_prefix": "/path/to/elasticsearch",
"username": "my-user",
"password": "my-pass"
},
"index": "target-index",
"mappings": {
"dynamic_templates": [
{
"all_strings": {
"match_mapping_type": "string",
"mapping": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
],
"date_detection": false
},
"settings": {
"index": {
"max_result_window": 100000
}
}
}
}
#
Views
#
export/request
Entity Kind: View
Type: export/request
An export/request View can be used to submit export requests to the export service and performs the following steps:
- Transforms the received HttpDataRequest into an ExportRequest using the Glyphs specified in the transforms.request configuration property.
- Submits the ExportRequest to the export service and receives an ExportResponse.
- Transforms the ExportResponse into the final response using the Glyphs specified in the transforms.response configuration property.
This View can simplify the submission of export jobs by letting the request transform populate the export request fields. Fields can be populated with static values if they are not expected to change between requests (such as output file paths or Elasticsearch client information), with values determined dynamically from query string parameters or other aspects of the HTTP request (e.g. fields within the starting request that select the results for export), or with a combination of the two.
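The split between static and dynamic fields can be pictured with the plain-Python sketch below. In a real deployment this logic would live in a Glyph; the function, the q and size query parameters, and the placeholder paths are all hypothetical and serve only to show the shape of the resulting ExportRequest.

```python
# Static fields that do not change between requests (values are placeholders).
STATIC_CONFIG = {"export_type": "local", "file_path": "/rosetta/data/exports"}


def build_export_request(query_params: dict) -> dict:
    """Combine static config with values taken from the HTTP query string."""
    return {
        "type": "csv",
        "processes": [{
            "starting_request": {
                "profile": "search",
                "request": {
                    "queries": [query_params["q"]],  # dynamic: chosen by the caller
                    "from": 0,
                    "size": int(query_params.get("size", 100)),
                },
            },
        }],
        "config": dict(STATIC_CONFIG),  # static: not settable by the caller
    }
```

With this arrangement, external callers control only the search query and page size, while file paths stay fixed on the server side.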
#
Properties
See
#
export/get
Entity Kind: View
Type: export/get
An export/get View can be used to retrieve read-only information about the export service (such as job status)
and performs the following steps:
- Transforms the received HttpDataRequest into an ExportGetRequest using the Glyphs specified in the transforms.request configuration property.
- Retrieves a response from one of three GET endpoints that the export service provides, depending on the type field of the ExportGetRequest.
- Transforms the response into the final response using the Glyphs specified in the transforms.response configuration property.
Endpoints that return export jobs (e.g. export/job/{id}) embed within the response the full original export
request used to create the job. This may contain sensitive information such as credentials or file paths, which should
be removed from the response using a response transform.
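The kind of clean-up such a response transform needs to perform can be sketched as follows. This is plain Python rather than a Glyph, and the set of keys treated as sensitive is an assumption drawn from the request schemas on this page; adjust it to the export targets actually in use.

```python
# Keys treated as sensitive in this sketch (taken from the csv, json and
# elasticsearch request schemas); this list is an assumption, not exhaustive.
SENSITIVE_KEYS = {"access_key_id", "secret_key", "username", "password", "file_path"}


def redact(value):
    """Return a copy of value with sensitive keys removed at any depth."""
    if isinstance(value, dict):
        return {k: redact(v) for k, v in value.items() if k not in SENSITIVE_KEYS}
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value  # scalars are returned unchanged
```

Applied to an ExportJob, this keeps fields such as id, status and progress while dropping credentials and local paths from the embedded request.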
#
Properties
See
#
Providers
#
export/job-history
Entity Kind: Provider
Type: export/job-history
Request Model: CustomPagedRequest
Result Model:
The export/job-history Provider retrieves a sublist of the full list of historical export jobs, determined by the
from and size parameters of the received CustomPagedRequest.
#
Properties
This Provider has no configuration properties.
#
Example
name: my-provider
type: export/job-history
#
Configuration Properties
#
Data Structures
#
ExportGetRequest
The target model for the request phase of an export/get View.
#
Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"type": {
"oneOf": [
{
"const": "status",
"description": "Retrieve the status of the export service from the '/export/status' endpoint"
},
{
"const": "job",
"description": "Retrieve a single export job from the '/export/job/{id}' endpoint"
},
{
"const": "jobs",
"description": "Retrieve the full list of export jobs from the '/export/job' endpoint"
}
]
},
"id": {
"type": "string",
"description": "The export job id used in the '/export/job/{id}' request. Required if 'type' is set to 'job'. Ignored otherwise."
}
},
"required": [
"type"
]
}
#
Example
{
"type": "job",
"id": "abc123"
}
#
ExportRequest
An export request contains all the information necessary to execute an export job and is used as the request
body for the POST /export endpoint.
#
Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"type": {
"type": "string",
"description": "The type of export. Determines the schema for the other properties of the export request."
}
},
"additionalProperties": {
"description": "For the full schema of an export request, including additional properties, see the documentation for a specific export target."
}
}
#
Example
See
#
ExportResponse
The response body for the POST /export endpoint.
#
Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"job_id": {
"type": "string",
"description": "An automatically generated unique id for the submitted export job. Store this value in order to poll the status of this job in 'GET /export/job/{id}' requests."
},
"status": {
"description": "A status string describing the outcome of export request submission.",
"oneOf": [
{
"const": "accepted",
"description": "The job was accepted by the export service and has been queued for execution."
},
{
"const": "refused",
"description": "The job was refused. Possible reasons include the queue being full or that the request was invalid."
},
{
"const": "error",
"description": "An internal server error occurred when attempting to submit the export request."
}
]
},
"message": {
"type": "string",
"description": "A message describing the outcome of the export request submission attempt. Is absent if the request was accepted."
},
"error": {
"type": "object",
"description": "Describes the error if one occurred while fulfilling the request.",
"properties": {
"message": {
"type": "string",
"description": "A message describing the error."
},
"cause": {
"type": "string",
"description": "A message describing the root cause of the error."
}
}
}
}
}
#
Example
{
"job_id": "d41d8cd9-8f00-3204-a980-0998ecf8427e",
"status": "accepted"
}
#
ExportJob
Describes an export job that has been submitted to the export service. Used in the response bodies for the
GET /export/job/{id} and GET /export/job endpoints.
#
Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$defs": {
"instant": {
"type": "string",
"description": "A date-time string conforming to the ISO 8601 standard."
}
},
"type": "object",
"properties": {
"sequence": {
"type": "integer",
"description": "The sequence number of this job, where each job is allocated a sequence number representing the order in which the job was created since the Rosetta service was started (starting at zero)."
},
"id": {
"type": "string",
"description": "An automatically generated unique id for the export job."
},
"request": {
"description": "The export request that caused the job to be created. See the schema for ExportRequest for more details."
},
"status": {
"oneOf": [
{
"const": "QUEUED",
"description": "The export job is queued but not yet started."
},
{
"const": "RUNNING",
"description": "The export job is currently being executed."
},
{
"const": "COMPLETED",
"description": "The export job has finished execution."
},
{
"const": "FAILED",
"description": "The export job encountered an error during execution."
}
]
},
"error": {
"type": "object",
"description": "Describes the error if one occurred while executing the export job.",
"properties": {
"message": {
"type": "string",
"description": "A message describing the error."
},
"cause": {
"type": "string",
"description": "A message describing the root cause of the error."
}
}
},
"created": {
"$ref": "#/$defs/instant",
"description": "The instant that the export job was created."
},
"started": {
"$ref": "#/$defs/instant",
"description": "The instant that the export job started being executed."
},
"finished": {
"$ref": "#/$defs/instant",
"description": "The instant that the export job finished executing (whether due to completing successfully or encountering an error)."
},
"duration": {
"type": "string",
"description": "The duration between the job starting and finishing, or between the job starting and the current time if the job is still running. Represented as an ISO 8601 duration."
},
"progress": {
"type": "integer",
"description": "The number of results that have so far been exported successfully."
},
"total": {
"type": "integer",
"description": "The total number of results to be exported."
},
"percentage": {
"type": "integer",
"description": "The floored percentage of the total number of results that have so far been exported."
}
}
}
#
Example
{
"id": "d41d8cd9-8f00-3204-a980-0998ecf8427e",
"sequence": 7,
"request": {
"type": "csv",
"processes": [
{
"starting_request": {
"profile": "search",
"request": {
"queries": [
"John Smith"
],
"from": 0,
"size": 100
}
},
"increment_type": "size",
"exit_conditions": [
"not_found",
"size_no_errors",
"total"
]
}
],
"skip_total_count": false,
"config": {
"export_type": "s3",
"file_path": "/rosetta/data/temp/s3_export",
"s3_config": {
"bucket": "my-bucket",
"region": "eu-west-1",
"access_key_id": "abc123",
"secret_key": "def456",
"prefix": "export-output/csv"
},
"columns": [
"id",
"name",
"description",
"birth_year",
"death_year"
],
"create_directories": true,
"deduplicate": true,
"sort_on": "name",
"add_bom": true
}
},
"status": "RUNNING",
"created": "2025-05-07T16:21:26.358120372Z",
"started": "2025-05-07T16:21:26.358160291Z",
"duration": "PT27.172225887S",
"progress": 1200,
"total": 1500,
"percentage": 80
}
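The progress, total and percentage values in the example above are consistent with a floored integer percentage. A minimal sketch of that calculation is shown below; the handling of a zero or missing total is an assumption, as the plugin's behaviour in that case is not documented here.

```python
def floored_percentage(progress: int, total: int) -> int:
    """Integer percentage of results exported so far, rounded down."""
    if total <= 0:
        return 0  # assumption: behaviour for an unknown or zero total is not documented
    return progress * 100 // total
```

For the example job, floored_percentage(1200, 1500) evaluates to 80.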
#
ExportServiceStatus
The response body for the GET /export/status endpoint.
#
Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"job_count": {
"type": "integer",
"description": "The number of active jobs (those currently running or queued)."
}
}
}
#
Example
{
"job_count": 10
}
#
ExportView
The configuration schema for both export/request and export/get Views.
#
Properties
As BaseView but with the following additional properties.
#
Example
name: my-export-view
paths:
  - /my-export
methods:
  - POST
type: export/request
transforms:
  request:
    policy: list
    names:
      - http-to-export-request
  response:
    policy: list
    names:
      - format-export-response
media_type: text/plain
Note that the above example requires two Glyphs, http-to-export-request and format-export-response, to be declared,
e.g. using the property rosetta.transform.glyphs.
#
ExplainResponse
The response body for the GET /export/explain endpoint.
#
Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"class": {
"type": "string",
"description": "The resolved fully-qualified Java class of the export request."
},
"type": {
"type": "string",
"description": "The type of the export request."
},
"request": {
"type": "string",
"description": "The fully resolved export request including any default values determined by the export request class."
},
"error": {
"type": "object",
"description": "Describes the error if one occurred while fulfilling the request.",
"properties": {
"message": {
"type": "string",
"description": "A message describing the error."
},
"cause": {
"type": "string",
"description": "A message describing the root cause of the error."
}
}
}
}
}