CREATE REPOSITORY
Register a new repository used to store, manage and restore snapshots.
Synopsis
CREATE REPOSITORY repository_name TYPE type
[ WITH (repository_parameter [= value] [, ...]) ]
Description
CREATE REPOSITORY registers a new repository in the cluster.
Note
If the repository configuration points to a location with existing snapshots, these are made available to the cluster.
Repositories are declared using a repository_name and a type.
Further configuration parameters are given in the WITH clause.
Parameters
- repository_name
The name of the repository as an identifier.
- type
The type of the repository, see Types.
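As a minimal sketch of the synopsis, the statement below registers a repository of type fs. The repository name my_backups and the path are hypothetical, and the path is assumed to lie under an entry of the path.repo setting.

```sql
-- Hypothetical name and path; the location must be covered by path.repo.
CREATE REPOSITORY my_backups TYPE fs
WITH (location = '/mnt/shared/crate_backups');
```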
Clauses
WITH
[ WITH (repository_parameter [= value] [, ...]) ]
The following configuration parameters apply to repositories of all types.
For further configuration options, see the documentation of the repository type (e.g. type fs).
- max_restore_bytes_per_sec
The maximum rate at which snapshots are restored on a single node from this repository.
Default: 40mb per second.
- max_snapshot_bytes_per_sec
The maximum rate at which snapshots are created on a single node to this repository.
Default: 40mb per second.
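The two rate limits can be combined with the parameters of any repository type. The following sketch lowers the snapshot rate and raises the restore rate for an fs repository; the repository name, path, and chosen rates are illustrative assumptions.

```sql
-- Hypothetical name, path, and rates; both limits apply per node.
CREATE REPOSITORY throttled_backups TYPE fs
WITH (
  location = '/mnt/shared/crate_backups_throttled',
  max_snapshot_bytes_per_sec = '20mb',  -- slower than the 40mb default
  max_restore_bytes_per_sec = '80mb'    -- faster than the 40mb default
);
```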
Types
A type determines how and where a repository stores its snapshots.
The following types are supported. Additional types are available via plugins.
fs
A repository that stores its snapshots on a shared filesystem, which must be accessible by all master and data nodes in the cluster.
Note
To create repositories of this type, the possible locations for repositories must be configured in the crate.yml file under path.repo as a list of strings.
Parameters
- location
Type: string
Required
An absolute or relative path to the directory where snapshots are stored. If the path is relative, it is appended to the first entry in the path.repo setting.
Windows UNC paths are allowed if the server name and shares are specified and backslashes are properly escaped.
Only paths starting with an entry from path.repo are allowed.
- compress
Type: boolean
Default: false
Whether the metadata part of the snapshot should be compressed.
The actual table data is not compressed.
- chunk_size
Type: long or string
Default: null
Defines the maximum size of a single file created during snapshot creation. If set to null, big files are not split into smaller chunks. The chunk size can be specified either in bytes or using size value notation (e.g. 1g, 5m, or 9k).
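The fs parameters above could be combined as in the following sketch. The repository name and the relative location are hypothetical; the relative path is expected to resolve against the first path.repo entry.

```sql
-- Hypothetical name and location.
CREATE REPOSITORY nightly TYPE fs
WITH (
  location = 'nightly',  -- relative: appended to the first path.repo entry
  compress = true,       -- compresses snapshot metadata only, not table data
  chunk_size = '1g'      -- split files larger than 1g into chunks
);
```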
hdfs
A repository that stores its snapshots in an HDFS filesystem.
Parameters
- uri
Type: string
Default: default filesystem URI for the given Hadoop HDFS configuration
HDFS URI of the form hdfs://<host>:<port>/.
- security.principal
Type: string
A qualified Kerberos principal used to authenticate against HDFS.
- path
Type: string
HDFS filesystem path where the data is stored.
- load_defaults
Type: boolean
Default: true
Whether to load the default Hadoop configuration.
- conf.<key>
Type: various
Dynamic configuration values added to the Hadoop configuration.
- concurrent_streams
Type: integer
Default: 5
The number of concurrent streams to use for backup and restore.
- compress
Type: boolean
Default: true
Whether the metadata part of the snapshot should be compressed.
The actual table data is not compressed.
- chunk_size
Type: long or string
Default: null
Defines the maximum size of a single file created during snapshot creation. If set to null, big files are not split into smaller chunks. The chunk size can be specified either in bytes or using size value notation (e.g. 1g, 5m, or 9k).
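A repository of type hdfs might be registered as sketched below. The repository name, host, port, and path are placeholders, and the example assumes the HDFS repository plugin is available in the cluster.

```sql
-- Hypothetical name, NameNode address, and HDFS path.
CREATE REPOSITORY hdfs_backups TYPE hdfs
WITH (
  uri = 'hdfs://namenode.example.com:8020/',
  path = '/crate/snapshots',
  concurrent_streams = 5,  -- the documented default, stated explicitly
  chunk_size = '512m'
);
```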
s3
A repository that stores its snapshots on the Amazon S3 service.
Parameters
- bucket
Type: string
Name of the S3 bucket used for storing snapshots. If the bucket does not yet exist, a new bucket will be created on S3 (assuming the required permissions are set).
- endpoint
Type: string
Default: Default AWS API endpoint
Endpoint to the S3 API. If a specific region is desired, specify it by using this setting.
- protocol
Type: string
Values: http, https
Default: https
Protocol to be used.
- base_path
Type: string
Path within the bucket to the repository.
- access_key
Type: string
Default: Value defined through the s3.client.default.access_key setting.
Access key used for authentication against AWS.
Warning
If the access key is set explicitly (not via configuration setting), it will be visible in plain text when querying the sys.repositories table.
- secret_key
Type: string
Default: Value defined through the s3.client.default.secret_key setting.
Secret key used for authentication against AWS.
Warning
If the secret key is set explicitly (not via configuration setting), it will be visible in plain text when querying the sys.repositories table.
- chunk_size
Type: long or string
Default: null
Defines the maximum size of a single file created during snapshot creation. If set to null, big files are not split into smaller chunks. The chunk size can be specified either in bytes or using size value notation (e.g. 1g, 5m, or 9k).
- compress
Type: boolean
Default: true
Whether the metadata part of the snapshot should be compressed.
The actual table data is not compressed.
- server_side_encryption
Type: boolean
Default: false
If set to true, files are encrypted on the server side using the AES256 algorithm.
- buffer_size
Type: string
Default: 5mb
Minimum: 5mb
Threshold below which a chunk is uploaded with a single request. If a chunk exceeds the threshold, it is split into multiple parts of buffer_size length, and each part is uploaded separately.
- max_retries
Type: integer
Default: 3
Number of retries in case of errors.
- use_throttle_retries
Type: boolean
Default: true
Whether retries should be throttled (i.e. use backoff).
- read_only
Type: boolean
Default: false
If set to true, the repository is made read-only.
- canned_acl
Type: string
Values: private, public-read, public-read-write, authenticated-read, log-delivery-write, bucket-owner-read, or bucket-owner-full-control
Default: private
When the repository creates buckets and objects, the specified canned ACL is added.
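The following sketch registers an s3 repository without passing credentials in the statement, so the keys fall back to the s3.client.default.* settings and do not appear in plain text in sys.repositories. The repository name, bucket, and base path are hypothetical, and the column names in the SELECT are an assumption about the sys.repositories table.

```sql
-- Hypothetical name, bucket, and base path; credentials are taken from
-- the s3.client.default.access_key / secret_key settings.
CREATE REPOSITORY s3_backups TYPE s3
WITH (
  bucket = 'my-snapshot-bucket',
  base_path = 'crate/prod',
  protocol = 'https',
  server_side_encryption = true,
  canned_acl = 'private',
  max_retries = 3
);

-- Inspect what is stored for the repository (column names assumed).
SELECT name, type, settings
FROM sys.repositories
WHERE name = 's3_backups';
```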
url
A read-only repository that points to the location of an fs repository via http, https, ftp, file, and jar URLs. It only allows RESTORE SNAPSHOT operations.
Parameters
- url
Type: string
This URL must point to the root of the shared fs repository.
For security reasons, only whitelisted URLs can be used. URLs can be whitelisted in the crate.yml configuration file. See Repositories.
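A read-only repository over an existing fs repository might look like the sketch below. It assumes the parameter carrying the location is named url, that the given URL has been whitelisted in crate.yml, and that it points to the root of the fs repository; the repository name and path are hypothetical.

```sql
-- Hypothetical name and URL; the parameter name "url" is an assumption,
-- and the URL must be whitelisted in crate.yml.
CREATE REPOSITORY readonly_backups TYPE url
WITH (url = 'file:///mnt/shared/crate_backups');
```

Only RESTORE SNAPSHOT operations are expected to work against such a repository.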