Data Methodology
Hatched Analytics specialises in identifying and collecting proprietary sequential ID (order ID) data: unique identifiers that reflect the volume of orders or transactions processed through a company’s technology stack.
When an online purchase is made, a numeric or alphanumeric identifier is often generated. These identifiers typically form part of a sequence. By collecting sufficient volumes of these signals, Hatched can infer transaction/order volumes processed by the platform.
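As a rough illustration of this inference, the sketch below estimates weekly order counts from a handful of observed IDs. It assumes purely numeric IDs that increase by exactly one per order, which real platforms may not guarantee. The function name `weekly_order_counts` and the sample observations are hypothetical, and this is not a description of Hatched's production method.

```python
from collections import defaultdict
from datetime import date


def weekly_order_counts(observations):
    """Estimate orders per ISO week from (date, order_id) observations.

    Assumption: IDs are numeric and increase by one per order.
    """
    # Track the highest ID observed in each ISO (year, week).
    max_id_by_week = defaultdict(int)
    for day, order_id in observations:
        iso = day.isocalendar()
        key = (iso[0], iso[1])
        max_id_by_week[key] = max(max_id_by_week[key], order_id)

    weeks = sorted(max_id_by_week)
    counts = {}
    # The gap between consecutive weekly maxima approximates the number
    # of orders placed between those two observations. If a week has no
    # observations, the gap spans more than one week.
    for prev, curr in zip(weeks, weeks[1:]):
        counts[curr] = max_id_by_week[curr] - max_id_by_week[prev]
    return counts


# Hypothetical observations: (date the ID was seen, numeric order ID).
observations = [
    (date(2024, 1, 1), 100_250),
    (date(2024, 1, 5), 100_900),
    (date(2024, 1, 9), 101_700),
    (date(2024, 1, 14), 102_400),
    (date(2024, 1, 17), 103_100),
]
print(weekly_order_counts(observations))
```

The more observations collected per period, the closer each weekly maximum sits to the true last ID of that week, which is why sufficient signal volume matters.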
Sequential IDs may be identified across multiple locations, including:
- Email receipts
- URLs or clickstreams
- JSON page elements
- Delivered packages
Data Source
To scale this methodology, Hatched operates a cashback and rewards application. This provides access to a community of more than 20,000 gig workers who share transactional data in exchange for rewards, with full transparency on how the data is used.
This community is not a panel. Instead, these users generate the IDs that allow us to observe real order counts flowing through company platforms on a weekly, monthly, and quarterly basis. Importantly, no sampling is applied. The indices are built directly from the collected signals.
Report Structure
This section provides an overview of the core dataset components available in Snowflake and S3 for backtesting our transaction indices. It covers the LIVE and PIT structures, core files per ticker, and how to interpret columns. Files are structured the same in Snowflake and S3.
You’ll see three top-level components:
- LIVE – Latest issued report for each of our data feeds.
- PIT (Point-in-Time) – Dated report snapshots for reproducible backtests.
- SUPPLEMENTAL – Reference tables (index/KPI descriptions, mappings, quality scores).
LIVE Files
These are the most recent reports for each data feed in your subscription. Each delivery includes three LIVE files per ticker, which are replaced with new versions at the next delivery. LIVE file structure is the same across all reports. Those files are:
- TS_INDEX – Unmodelled Data
- TS_RESULTS – Modelled KPI Estimates & Outcomes
- MODEL_DETAILS – Model Spec & Diagnostics
TS_INDEX
The TS Index is the foundational dataset in Snowflake/S3. It contains the raw, unmodelled values generated from sequential IDs.
The TS_INDEX tables use a consistent format across all reports. Each data feed contains one or more indices, depending on how the IDs are generated:
- Order/Transaction Index – aggregate order volumes; the baseline index for most tickers. Where only a global sequential ID is available, this may be the sole index.
- Geo Index – geographic breakdowns, if IDs are generated at that level.
- Brand Index – brand-level breakdowns, if IDs are generated at that level.
- F(x)-Weighted Index – weighted order/transaction indices based on f(x) fluctuations, where applicable.

TS_RESULTS
The TS Results contain the modelled outputs, which use one or more indices to estimate the company’s KPIs, together with the reported outcomes and error measurements for backtesting.
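To make the idea of error measurement concrete, here is a minimal sketch that compares modelled estimates with reported outcomes. The source does not specify which error metric is used, so the signed percentage error and mean absolute percentage error (MAPE) below are assumptions, and the sample figures are invented for illustration.

```python
def percentage_error(estimate, reported):
    """Signed error of the estimate relative to the reported value, in percent."""
    return 100.0 * (estimate - reported) / reported


def mean_absolute_percentage_error(pairs):
    """Average of absolute percentage errors over (estimate, reported) pairs."""
    errors = [abs(percentage_error(est, rep)) for est, rep in pairs]
    return sum(errors) / len(errors)


# Hypothetical (modelled estimate, reported outcome) pairs, one per period.
pairs = [(105.0, 100.0), (190.0, 200.0), (300.0, 300.0)]

for est, rep in pairs:
    print(f"estimate={est} reported={rep} error={percentage_error(est, rep):+.1f}%")
print(f"MAPE: {mean_absolute_percentage_error(pairs):.2f}%")
```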

MODEL_DETAILS
The MODEL_DETAILS tables document how each index is mapped to its KPI and include the model parameters. Their format is consistent across all reports.

PIT Files
The PIT files provide dated archives of the LIVE files for reproducible, point-in-time backtests.
- Each PIT file name includes a date that corresponds to the last value contained in that snapshot.
- Use PIT when you need to guarantee that your backtest only sees the data state as of that specific date.
- Best practice: If you’re measuring “what did we know then?”, pull from PIT (or filter LIVE by RELEASEDATE ≤ your test cut-off).
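The RELEASEDATE filter mentioned above can be sketched as follows. Only the `RELEASEDATE` column name comes from this article. The `PERIOD` and `VALUE` fields, the sample rows, and the helper `as_of` are hypothetical, and the rows are plain dictionaries rather than an actual Snowflake or S3 read.

```python
from datetime import date


def as_of(rows, cutoff):
    """Keep only rows released on or before the cut-off date."""
    return [row for row in rows if row["RELEASEDATE"] <= cutoff]


# Hypothetical LIVE rows; only RELEASEDATE is a documented column name.
live_rows = [
    {"PERIOD": "2023Q4", "VALUE": 98.4, "RELEASEDATE": date(2024, 1, 8)},
    {"PERIOD": "2024Q1", "VALUE": 101.2, "RELEASEDATE": date(2024, 4, 8)},
    {"PERIOD": "2024Q2", "VALUE": 104.9, "RELEASEDATE": date(2024, 7, 8)},
]

# A backtest standing at 30 April 2024 should not see the July release.
visible = as_of(live_rows, date(2024, 4, 30))
print([row["PERIOD"] for row in visible])
```

Filtering on release date rather than on the period the value describes is what prevents look-ahead: a value describing an earlier period but released after the cut-off is still excluded.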
SUPPLEMENTAL Files
These files help you interpret data feeds and align them to external systems.
- DESCRIPTIONS_INDEX – Per-index relevancy & completeness scores (0–5) with narrative context.
- DESCRIPTIONS_KPI – KPI alignment score (0–5) with narrative context.
- MAPPINGS – Cross-refs to external identifiers (e.g., Bloomberg, Visible Alpha) and KPI naming to enable downstream publishing.