Metadata
There are two types of metadata: file metadata and page header metadata.
In the diagram below, file metadata is described by the FileMetaData
structure. This file metadata provides offset and size information useful
when navigating the Parquet file. Page header metadata (PageHeader and
children in the diagram) is stored in-line with the page data, and is
used in the reading and decoding of said data.
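As a rough illustration of how a reader finds the file metadata, the sketch below assumes the standard unencrypted Parquet layout: the serialized FileMetaData sits at the end of the file, followed by a 4-byte little-endian length and the 4-byte magic `PAR1`. The function name `locate_file_metadata` is hypothetical and is not part of any Parquet library.

```python
import struct
from typing import Tuple

MAGIC = b"PAR1"  # magic bytes at the start and end of an unencrypted Parquet file


def locate_file_metadata(data: bytes) -> Tuple[int, int]:
    """Return (offset, length) of the serialized FileMetaData within `data`.

    Hypothetical helper: `data` is assumed to hold a whole Parquet file in memory.
    """
    # Smallest possible file: leading magic + footer length + trailing magic.
    if len(data) < 12 or data[:4] != MAGIC or data[-4:] != MAGIC:
        raise ValueError("not an unencrypted Parquet file")

    # The 4 bytes before the trailing magic hold the metadata length (little-endian).
    (length,) = struct.unpack("<I", data[-8:-4])

    # The metadata ends where the length field begins.
    offset = len(data) - 8 - length
    if offset < 4:
        raise ValueError("footer length exceeds file size")
    return offset, length
```

The returned byte range still has to be decoded as a Thrift structure before its offsets and sizes can be used; this sketch only locates it.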
All Thrift structures are serialized using the TCompactProtocol. The full definition of these structures is given in the Parquet Thrift definition.
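To give a feel for the compact encoding, here is a minimal sketch of the two integer primitives that TCompactProtocol relies on: variable-length integers (7 payload bits per byte, high bit set when more bytes follow) and zigzag mapping for signed values. The helper names `read_varint` and `zigzag_decode` are illustrative only. This is not a full Thrift decoder and does not handle field headers, structs, or lists.

```python
from typing import Tuple


def read_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one unsigned varint from `buf` starting at `pos`.

    Returns (value, position just past the varint).
    """
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry payload
        if not byte & 0x80:               # high bit clear: last byte
            return result, pos
        shift += 7


def zigzag_decode(n: int) -> int:
    """Map an unsigned zigzag value back to a signed integer (0, -1, 1, -2, ...)."""
    return (n >> 1) ^ -(n & 1)


# A signed integer field is stored as a zigzag value inside a varint.
raw, next_pos = read_varint(b"\xac\x02")
value = zigzag_decode(raw)
```

Small values therefore take a single byte, which helps keep page headers and file metadata compact.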