System Tables

Universal Storage Engine Data Model

This document provides a reference for the USE Core data model, including detailed descriptions and schemas for all tables, views, and metadata structures.

The USE Core system organizes data into several categories of tables that provide insights into catalogs, namespaces, tables, and their associated metadata.

System Tables

The following sections detail the core system tables that provide metadata and monitoring capabilities.

CatalogsSummary

Table Name

<use_catalog>.system_lake_tables.catalogs_summary

Description

Contains summary information about all catalogs in the system, including their namespaces, tables, views, and functions.

Schema

root
|-- catalog_name: string
|-- default_namespace: array
|    |-- element: string
|-- namespaces: array
|    |-- element: array 
|    |    |-- element: string
|-- tables: array
|    |-- element: struct
|    |    |-- catalog_name: string
|    |    |-- namespace_name: array
|    |    |    |-- element: string
|    |    |-- name: string
|-- views: array
|    |-- element: struct
|    |    |-- catalog_name: string
|    |    |-- namespace_name: array
|    |    |    |-- element: string
|    |    |-- name: string 
|-- functions: array
|    |-- element: struct
|    |    |-- catalog_name: string
|    |    |-- namespace_name: array
|    |    |    |-- element: string
|    |    |-- name: string 
|-- table_catalog_capabilities: array
|    |-- element: string
|-- description: string
|-- catalog_type: string

NamespacesSummary

Table Name

<use_catalog>.system_lake_tables.namespaces_summary

Description

Provides summary information about namespaces within catalogs, including their properties and contained tables.

Schema

root
|-- namespace_name: array
|    |-- element: string
|-- catalog_name: string
|-- description: string
|-- location: string
|-- owner: string
|-- is_default: boolean
|-- tables: array
|    |-- element: struct
|    |    |-- catalog_name: string
|    |    |-- namespace_name: array
|    |    |    |-- element: string
|    |    |-- name: string

TablesSummary

Table Name

<use_catalog>.system_lake_tables.tables_summary

Description

Contains basic metadata for all tables in the system, including creation time, access patterns, and storage information.

Schema

root
|-- catalog_name: string 
|-- namespace_name: array 
|    |-- element: string
|-- name: string 
|-- created_time: string 
|-- last_access: string 
|-- provider: string 
|-- table_type: string 
|-- created_by: string 
|-- location: string 

MetastoreSummary

Table Name

<use_catalog>.system_lake_tables.metastore_summary

Description

Comprehensive view combining catalog, namespace, and table information in a single denormalized structure for efficient querying.

Schema

root
 |-- catalog_name: string 
 |-- catalog_default_namespace: array
 |    |-- element: string
 |-- catalog_namespaces: array 
 |    |-- element: array
 |    |    |-- element: string
 |-- catalog_tables: array 
 |    |-- element: struct
 |    |    |-- catalog_name: string 
 |    |    |-- namespace_name: array 
 |    |    |    |-- element: string
 |    |    |-- name: string 
 |-- catalog_views: array 
 |    |-- element: struct
 |    |    |-- catalog_name: string 
 |    |    |-- namespace_name: array 
 |    |    |    |-- element: string
 |    |    |-- name: string 
 |-- catalog_functions: array 
 |    |-- element: struct
 |    |    |-- catalog_name: string 
 |    |    |-- namespace_name: array 
 |    |    |    |-- element: string
 |    |    |-- name: string 
 |-- table_catalog_capabilities: array 
 |    |-- element: string
 |-- catalog_description: string 
 |-- catalog_type: string 
 |-- namespace_name: array 
 |    |-- element: string 
 |-- namespace_description: string 
 |-- namespace_location: string 
 |-- namespace_owner: string 
 |-- namespace_tables: array 
 |    |-- element: struct 
 |    |    |-- catalog_name: string
 |    |    |-- namespace_name: array
 |    |    |    |-- element: string
 |    |    |-- name: string 
 |-- namespace_is_default: boolean
 |-- table_name: string 
 |-- table_owner: string
 |-- table_created_time: string 
 |-- table_last_access: string 
 |-- table_provider: string 
 |-- table_type: string 
 |-- table_created_by: string 
 |-- table_location: string 
 |-- table_columns: array 
 |    |-- element: struct
 |    |    |-- name: string
 |    |    |-- description: string 
 |    |    |-- data_type: string 

Table Monitoring Tables

These tables provide detailed monitoring and tracking information for individual tables.

TableHistory

Table Name

<use_catalog>.system_lake_tables.history

Description

Tracks the history of operations performed on tables, including data changes and metadata modifications.

Schema

root
|-- catalog_name: string 
|-- namespace_name: array 
|    |-- element: string
|-- name: string 
|-- version: long 
|-- starting_time: timestamp 
|-- ending_time: timestamp 
|-- operation_type: string 
|-- format_specific_operation_type: string 
|-- added_rows: long 
|-- deleted_rows: long 
|-- updated_rows: long 
|-- estimated_net_byte_change: long 
|-- created_files: long 
|-- created_bytes: long 
|-- purged_files: long 
|-- purged_bytes: long 

TableFileStats

Table Name

<use_catalog>.system_lake_tables.files AND <use_catalog>.system_lake_tables.all_files

Description

Provides detailed statistics about files in tables. The files table shows current files, while all_files includes deleted and purged files.

Schema

root
|-- catalog_name: string 
|-- namespace_name: array 
|    |-- element: string (containsNull = true)
|-- name: string 
|-- path: string 
|-- schema: string 
|-- addition_timestamp: long 
|-- modification_timestamp: long 
|-- deletion_timestamp: long 
|-- creation_timestamp: long 
|-- purge_timestamp: long 
|-- deleted: boolean 
|-- purged: boolean 
|-- file_bytes: long 
|-- file_rows: long 
|-- metadata: map 
|    |-- key: string
|    |-- value: string
|-- column_metrics: string 

TableIngestion

Table Name

<use_catalog>.<catalog_name>.system_lake_tables.ingestions

Description

Contains information about the ingestion process of tables. To enable ingestion tracking, configure the ingestion source path in table properties.

The ingestion path should be configured when creating a table or using the SET TBLPROPERTIES command:

-- CREATE TABLE
CREATE TABLE t TBLPROPERTIES(use.ingestion.source.file.path = '/ingestion/path')
 
-- ALTER TABLE
ALTER TABLE t SET TBLPROPERTIES(use.ingestion.source.file.path = '/ingestion/path');

Schema

root
 |-- write_id: string (nullable = true)
 |-- started_at: long (nullable = false)
 |-- finished_at: long (nullable = false)
 |-- destination_table: string (nullable = true)
 |-- number_of_input_files: long (nullable = false)
 |-- byte_size_input_files: long (nullable = false)
 |-- input_files: list(string) (nullable = true)
 |-- process_time_ms: long (nullable = true)
 |-- batch_max_timestamp: long (nullable = false)
 |-- batch_min_timestamp: long (nullable = false)
 |-- txn_app_id: string (nullable = true)
 |-- txn_version: string (nullable = true)

TableConsumption

Table Name

<use_catalog>.system_lake_tables.consumptions

Description

Contains information about the consumption process of tables. To enable consumption tracking, configure the consumption settings in table properties.

The consumption setup should be configured when creating a table or using the SET TBLPROPERTIES command:

-- CREATE TABLE
CREATE TABLE t TBLPROPERTIES(use.consumption.enabled='true')
 
-- ALTER TABLE
ALTER TABLE t SET TBLPROPERTIES(use.consumption.enabled = 'true');

Schema

root
 |-- event_name: string (nullable = true)
 |-- event_source: string (nullable = true)
 |-- event_id: string (nullable = true)
 |-- region: string (nullable = true)
 |-- user_id: string (nullable = true)
 |-- event_time: timestamp (nullable = true)
 |-- bytes_in: long (nullable = false)
 |-- bytes_out: long (nullable = false)
 |-- object_path: string (nullable = true)

TableOptimization

Table Name

<use_catalog>.system_lake_tables.optimizations

Description

Contains information about optimization processes for tables. Optimization includes revision optimization and delta vacuum operations that help maintain table performance.

Note: Currently, optimization must be triggered manually from CLI or spark-shell. Please refer to the Optimization Guide for more information about this feature.

Schema

root
 |-- optimization_name: string (nullable = true)
 |-- write_id: string (nullable = true)
 |-- table_id: string (nullable = true)
 |-- revision_id: long (nullable = true)
 |-- started_at: long (nullable = true)
 |-- finished_at: long (nullable = true)
 |-- input_files_stats: struct (nullable = true)
 |    |-- bytes: long (nullable = true)
 |    |-- cubes: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |    |-- elementCount: long (nullable = true)
 |    |-- files: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |    |-- numCubes: long (nullable = true)
 |    |-- numFiles: long (nullable = true)
 |-- output_files_stats: struct (nullable = true)
 |    |-- bytes: long (nullable = true)
 |    |-- cubes: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |    |-- elementCount: long (nullable = true)
 |    |-- files: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |    |-- numCubes: long (nullable = true)
 |    |-- numFiles: long (nullable = true)
 |-- process_time_ms: long (nullable = true)
 |-- cube_defragmentation_stats: struct (nullable = true)
 |    |-- average_cube_count_per_file_after: double (nullable = true)
 |    |-- average_cube_count_per_file_before: double (nullable = true)
 |    |-- average_cube_count_per_file_reduction: double (nullable = true)
 |    |-- average_block_element_count_per_file_after: double (nullable = true)
 |    |-- average_block_element_count_per_file_before: double (nullable = true)
 |    |-- average_block_element_count_per_file_increase: double (nullable = true)
 |-- sampling_overhead_reduction_stats: struct (nullable = true)
 |    |-- fraction: double (nullable = true)
 |    |-- table_bytes_before: long (nullable = true)
 |    |-- table_bytes_after: long (nullable = true)
 |    |-- sampled_bytes_before: long (nullable = true)
 |    |-- sampled_bytes_after: long (nullable = true)
 |    |-- sampled_fraction_before: double (nullable = true)
 |    |-- sampled_fraction_after: double (nullable = true)
 |    |-- bytes_to_optimize: long (nullable = true)
 |    |-- optimized_bytes: long (nullable = true)

Querying Single Tables

If you want to load file or history information for a single table, the ReadOnly Catalog allows you to query based on Table Identifier.

For example, if you have a table named "students", you can retrieve the metadata by executing:

-- Current added and deleted files
SELECT * FROM <use_catalog>.spark_catalog.default.students.files 
 
-- All files (including deleted and purged)
SELECT * FROM <use_catalog>.spark_catalog.default.students.all_files 
 
-- Table history
SELECT * FROM <use_catalog>.spark_catalog.default.students.history
 
-- Ingestion information
SELECT * FROM <use_catalog>.spark_catalog.default.students.ingestions
 
-- Consumption information
SELECT * FROM <use_catalog>.spark_catalog.default.students.consumptions
 
-- Cube information
SELECT * FROM <use_catalog>.spark_catalog.default.students.cubes

There's also a set of metadata tables that contain information from single tables aggregated by hour:

-- Hourly file statistics (current added and deleted files)
SELECT * FROM <use_catalog>.spark_catalog.default.students.files_hourly 
 
-- Hourly ingestion statistics
SELECT * FROM <use_catalog>.spark_catalog.default.students.ingestions_hourly
 
-- Hourly consumption statistics
SELECT * FROM <use_catalog>.spark_catalog.default.students.consumptions_hourly

The general syntax for accessing table metadata is:

SELECT * FROM <use_catalog>.<table_catalog_name>.<table_namespace_name>.<table_name>.<metadata_table_name>

Listing Available Tables

All metadata tables are listed when you run the SHOW TABLES command:

SHOW TABLES IN <use_catalog>