Usage
Cubes can be managed using any of the supported clients, such as hetu-cli located under the bin
directory in the installation.
CREATE CUBE
Synopsis
CREATE CUBE [ IF NOT EXISTS ]
cube_name ON table_name WITH (
AGGREGATIONS = ( expression [, ...] ),
GROUP = ( column_name [, ...])
[, FILTER = (expression)]
[, ( property_name = expression [, ...] ) ]
)
[WHERE predicate]
Description
Create a new, empty Cube with the specified group and aggregations. Use INSERT INTO CUBE (see below)
to insert into data.
The optional IF NOT EXISTS
clause causes the error to be suppressed if the Cube already exists.
The optional property_name
section can be used to set properties on the newly created Cube.
To list all available table properties, run the following query:
SELECT * FROM system.metadata.table_properties
Note: These properties are limited to the Connector which the Cube is being created for.
Examples
Create a new Cube orders_cube
on orders
:
CREATE CUBE orders_cube ON orders WITH (
AGGREGATIONS = ( SUM(totalprice), AVG(totalprice) ),
GROUP = ( orderstatus, orderdate ),
format = 'ORC'
)
Create a new partitioned Cube orders_cube
:
CREATE CUBE orders_cube ON orders WITH (
AGGREGATIONS = ( SUM(totalprice), AVG(totalprice) ),
GROUP = ( orderstatus, orderdate ),
format = 'ORC',
partitioned_by = ARRAY['orderdate']
)
Create a new Cube orders_cube
with some source data filter:
CREATE CUBE orders_cube ON orders WITH (
AGGREGATIONS = ( SUM(totalprice), COUNT DISTINCT(orderid) ),
GROUP = ( orderstatus ),
FILTER = (orderdate BETWEEN 2512450 AND 2512460)
)
Create a new Cube orders_cube
with some additional predicate on Cube columns:
CREATE CUBE orders_cube ON orders WITH (
AGGREGATIONS = ( SUM(totalprice), COUNT DISTINCT(orderid) ),
GROUP = ( orderstatus ),
FILTER = (orderdate BETWEEN 2512450 AND 2512460)
) WHERE orderstatus = 'PENDING';
This is same as following:
CREATE CUBE orders_cube ON orders WITH (
AGGREGATIONS = ( SUM(totalprice), COUNT DISTINCT(orderid) ),
GROUP = ( orderstatus ),
FILTER = (orderdate BETWEEN 2512450 AND 2512460)
);
INSERT INTO CUBE orders_cube WHERE orderstatus = 'PENDING';
The FILTER
property can be used to filter out data from the source table while building the Cube. Cube is built on the data after
applying the orderdate BETWEEN 2512450 AND 2512460
predicate on the source table. The columns used in the filter predicate must not be part the Cube.
Limitations
- Cubes can be created with only following aggregation functions.
In other words, Queries using the following functions can only be optimized using Cubes. COUNT, COUNT DISTINCT, MIN, MAX, SUM, AVG - Different connector might support different data type, and different table/column properties.
INSERT INTO CUBE
Synopsis
INSERT INTO CUBE cube_name [WHERE condition]
Description
CREATE CUBE
statement creates Cube without any data. To insert data into Cube, use INSERT INTO CUBE
SQL.
The WHERE
clause is optional. If predicate is provided, only data matching the given predicate are processed from the source table and inserted into the Cube.
Otherwise, entire data from the source table is processed and inserted into Cube.
Examples
Insert data into the orders_cube
Cube:
INSERT INTO CUBE orders_cube WHERE orderdate > date '1999-01-01';
INSERT INTO CUBE order_all_cube;
Limitations
- Subsequent inserts to the same Cube need to use same set of columns
CREATE CUBE orders_cube ON orders WITH (AGGREGATIONS = (count(*)), GROUP = (orderdate));
INSERT INTO CUBE orders_cube WHERE orderdate BETWEEN date '1999-01-01' AND date '1999-01-05';
-- This statement would fail because its possible the Cube already contain rows matching the given predicate.
INSERT INTO CUBE orders_cube WHERE location = 'Canada';
Note: This means that columns used in the first insert must be used in every insert predicate following the first to avoid inserting duplicate data.
INSERT OVERWRITE CUBE
Synopsis
INSERT OVERWRITE CUBE cube_name [WHERE condition]
Description
Similar to INSERT INTO CUBE
statement but with this statement the existing data is overwritten. Predicates
are optional.INSERT OVERWRITE CUBE
is not supported on partitioned cubes. Cubes are essentially stored as tables and so INSERT OVERWRITE
only
replaces the matching partitions and does not overwrite the entire table. So this operation is blocked on partitioned cube.
Drop and recreate cube if needed.
Examples
Insert data based on condition into the orders_cube
Cube:
INSERT OVERWRITE CUBE orders_cube WHERE orderdate > date '1999-01-01';
INSERT OVERWRITE CUBE orders_cube;
SHOW CUBES
Synopsis
SHOW CUBES [ FOR table_name ];
Description
SHOW CUBES
lists all Cubes. Adding the optional table_name
lists only the Cubes for that table.
Examples
Show all Cubes:
SHOW CUBES;
Show Cubes for orders
table:
SHOW CUBES FOR orders;
RELOAD CUBE
Synopsis
RELOAD CUBE cube_name
Description
Reloads the Cube if the source table has been updated.
Examples
If the source table orders
of the cube orders_cube
gets updated then the status of the cube orders_cube
gets EXPIRED. Use the command RELOAD CUBE cube_name
to overcome this issue as follows:
RELOAD CUBE orders_cube
DROP CUBE
Synopsis
DROP CUBE [ IF EXISTS ] cube_name
Description
Drop an existing Cube.
The optional IF EXISTS
clause causes the error to be suppressed if the Cube does not exist.
Examples
Drop the Cube orders_cube
:
DROP CUBE orders_cube
Drop the Cube orders_cube
if it exists:
DROP CUBE IF EXISTS orders_cube