Database families
The concept of database families originated from the need to correctly distribute the data of partitioned tables within Qserv. This allows Qserv to accurately process queries that JOIN tables from different databases. A database family is a group of databases where all tables within the family share the same partitioning parameters:
the number of stripes
the number of sub-stripes
the overlap radius
This ensures that all chunk tables with the same chunk number have:
the same spatial dimensions in the coordinate system adopted by Qserv
the same number and sizes of sub-chunks within each chunk.
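A minimal sketch of why shared parameters imply matching chunk geometry. It assumes that stripes are equal-height latitude bands covering 180 degrees and that each stripe is split evenly into sub-stripes. The class `PartitioningParams`, its method names, and the parameter values are hypothetical and chosen only for illustration; they are not part of Qserv.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitioningParams:
    """Hypothetical container for the three family-defining parameters."""
    num_stripes: int
    num_sub_stripes: int
    overlap_deg: float

    def stripe_height_deg(self) -> float:
        # Assumption: stripes evenly divide the 180-degree latitude range.
        return 180.0 / self.num_stripes

    def sub_stripe_height_deg(self) -> float:
        # Assumption: each stripe is evenly divided into sub-stripes.
        return self.stripe_height_deg() / self.num_sub_stripes


# Example values only; real deployments choose their own parameters.
a = PartitioningParams(num_stripes=340, num_sub_stripes=3, overlap_deg=0.01667)
b = PartitioningParams(num_stripes=340, num_sub_stripes=3, overlap_deg=0.01667)
c = PartitioningParams(num_stripes=85, num_sub_stripes=12, overlap_deg=0.01667)

same_geometry_ab = (a == b)  # identical parameters, identical band heights
same_geometry_ac = (a == c)  # different parameters, different band heights
```

Under these assumptions, databases described by `a` and `b` produce identical band boundaries and could belong to one family, while `c` could not.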
The families are defined by the Replication/Ingest system and are not visible to Qserv users. Each family has a unique identifier (name). The system uses family names to correctly distribute the data of partitioned tables among Qserv worker nodes, ensuring that the data of tables joined in queries are co-located on the same worker node.
Families must be defined before the databases and tables are registered in Qserv. The current implementation of the API automatically creates a new family when the first database with a unique combination of partitioning parameters is registered in the system using:
Registering databases (REST)
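The assignment rule can be modelled as a lookup keyed by the combination of partitioning parameters. The sketch below is an in-memory illustration only: `FamilyRegistry`, `register_database`, the family naming scheme, and the database names are all hypothetical and do not reflect the actual REST API or its internals.

```python
from typing import Dict, List, Tuple

# (stripes, sub-stripes, overlap) identifies a family in this sketch.
Params = Tuple[int, int, float]


class FamilyRegistry:
    """Hypothetical in-memory model of family assignment; not the Qserv API."""

    def __init__(self) -> None:
        self._family_of_params: Dict[Params, str] = {}
        self.databases: Dict[str, List[str]] = {}

    def register_database(self, database: str, params: Params) -> str:
        family = self._family_of_params.get(params)
        if family is None:
            # First database with this combination: create a new family.
            # The naming scheme below is made up for illustration.
            family = f"family_{len(self._family_of_params) + 1}"
            self._family_of_params[params] = family
            self.databases[family] = []
        # Otherwise the database joins the family that already exists.
        self.databases[family].append(database)
        return family


registry = FamilyRegistry()
f1 = registry.register_database("db_a", (340, 3, 0.01667))
f2 = registry.register_database("db_b", (340, 3, 0.01667))
f3 = registry.register_database("db_c", (85, 12, 0.01667))
```

Here `db_a` and `db_b` end up in the same family because their parameters match, while `db_c` triggers the creation of a second family.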
If a family with the same partitioning parameters already exists in the system, the new database will be added to the existing family. Existing databases and families can be found using the following service:
The API allows users to delete existing database families from the Configuration of the Replication/Ingest system, but this should only be done for empty families. Deleting a family also deletes all databases and tables within it. The operation does not affect the functioning of Qserv. However, the Replication/Ingest system will exclude the deleted databases and tables from its replica management and monitoring operations. Further details on this service can be found in the following document:
Deleting database families (REST)
For instructions on partitioning the tables with the desired set of parameters, refer to the following document:
Data preparation (DATA)