Four systems. None of them agree.

"What was our total visitor revenue last month?"

Ticketing: $1.82M. Paid entries + add-ons; ticket-level transactions.
F&B POS: $0.41M. Outlet sales; excludes bundled meals.
Partner Channels: $0.68M. Klook / GYG / Resellers; gross before commission.
Finance / GL: $2.14M. Authoritative total; net revenue, all sources.

Naive sum: $2.91M vs GL: $2.14M. Ticketing + POS + Partners double-counts combos. Partners are gross, GL is net. An AI picking from these doesn't know the difference.

Cross-domain validations AI discovers

cross-domain

Ticketing vs F&B attachment rate

12,000 Luge rides sold Tuesday. F&B outlet at Luge station: 1,800 transactions. Historical ratio: 18-22% of riders buy F&B. Today: 15%.

Flag: F&B POS downtime or visitor behaviour shift?
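This check reduces to a simple ratio-against-band test. A minimal sketch in Python, using the figures above (the function name and the exact 18-22% band encoding are illustrative):

```python
def attachment_rate_check(rides, fnb_txns, band=(0.18, 0.22)):
    """Flag when the F&B attachment rate falls outside its historical band."""
    rate = fnb_txns / rides
    low, high = band
    return rate, not (low <= rate <= high)

rate, flagged = attachment_rate_check(12_000, 1_800)
# rate = 0.15, flagged -> investigate POS downtime vs behaviour shift
```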

cross-domain

Partner redemption vs turnstile

Klook reports 900 combo packages redeemed. Cable car turnstile: 820 scans. 80 people bought combos but didn't ride.

Flag: No-show pattern or scanner malfunction?

cross-domain

Events vs F&B catering

3 corporate events booked this month. One event shows zero F&B catering revenue. BYO arrangement (different pricing) or invoice not yet raised?

Flag: Missing revenue or different billing terms?

cross-domain

Partner revenue vs GL

Partner channels report $680K gross. GL shows $520K net partner revenue. Commission rate implied: 23.5%. Contracted rate: 20%.

Flag: Promo commission tier or accounting error?
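The implied rate falls straight out of the gross/net pair. A sketch with the numbers above (the 1% tolerance threshold is an assumption, not a contractual figure):

```python
def implied_commission_rate(gross, net):
    """Back out the commission rate implied by gross vs net partner revenue."""
    return (gross - net) / gross

rate = implied_commission_rate(680_000, 520_000)   # ~0.235 implied
flag = abs(rate - 0.20) > 0.01                     # contracted rate: 20%
```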

Intra-domain checks + AI-discovered patterns

intra-domain

Ticketing revenue integrity

Daily revenue should equal SUM(ticket price x quantity) across all ticket types. Delta today: -$2,400. Unreconciled discounts or comps not categorised.
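The integrity rule is a reported-vs-expected delta. A sketch, with a hypothetical single-ticket-type day chosen so the delta matches the -$2,400 above:

```python
def revenue_delta(reported_total, ticket_lines):
    """Reported daily revenue minus SUM(price x quantity); should be ~0."""
    expected = sum(price * qty for price, qty in ticket_lines)
    return reported_total - expected

# Hypothetical: one ticket type, $50 x 2,000 sold, $97,600 reported
delta = revenue_delta(97_600, [(50, 2_000)])   # -2,400
```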

intra-domain

Cable car capacity check

Max capacity: 8 pax x 67 cabins x 18 trips/hr x 12 hrs = ~115K/day. Ticketing shows 118K pax on Saturday. Physically impossible — double-scan or system error.
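The physical bound makes this a one-line sanity check. A sketch using the capacity parameters above:

```python
def cable_car_capacity_check(daily_pax, pax_per_cabin=8, cabins=67,
                             trips_per_hr=18, hours=12):
    """Flag passenger counts that exceed the physical daily maximum."""
    max_pax = pax_per_cabin * cabins * trips_per_hr * hours   # 115,776
    return max_pax, daily_pax > max_pax

max_pax, impossible = cable_car_capacity_check(118_000)
# impossible -> double-scan or system error, not real ridership
```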

AI-discovered

Seasonal pattern shift

Weekday Luge ridership is normally 40% of weekend. Held for 11 months. December school holidays: ratio breaks to 75%. AI learns the seasonal exception — stops false-flagging every December.

AI-discovered

Gradual channel drift

Partner channel share of total ticketing creeping up 0.5% per month for 8 months. Not an error — but a trend affecting blended margin since commission rates differ by channel.

Business metrics defined in the semantic layer

Revenue per visitor

Which revenue? Ticket only? Ticket + F&B + retail? Which visitor count — ticketed or total footfall? Defined precisely.

Yield per ride

Net revenue per Luge ride after partner commission and GST. Not ticket price — actual yield to the business.

Attachment rate

% of visitors who buy F&B, retail, or add-on experiences. By entry channel: walk-in vs Klook vs corporate.

Partner channel margin

Revenue from Klook/GYG after commission vs direct sales. True margin comparison, not gross.

Cable car load factor

Passengers per cabin per trip vs capacity. By time slot. Drives pricing and scheduling optimisation.

Visitor acquisition cost

Marketing spend / attributed visitors, per channel. Requires marketing + ticketing cross-domain join.

Event space utilisation

Not just booked vs available. Revenue per sqm per day, including F&B uplift from events.

Dwell time proxy

Entry scan to last transaction timestamp. Correlates to spend per visitor. Longer dwell = higher yield.

Blended ticket yield

Weighted average net revenue per ticket across direct, partner, corporate, and comp channels. Accounts for different commission and pricing tiers.
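Blended yield is a count-weighted average of per-channel net yields. A sketch (the per-channel figures below are made up for illustration):

```python
def blended_ticket_yield(channels):
    """channels: {name: (net_yield_per_ticket, tickets_sold)}."""
    total_rev = sum(y * n for y, n in channels.values())
    total_tix = sum(n for _, n in channels.values())
    return total_rev / total_tix

blended_ticket_yield({
    "direct":  (45.0, 6_000),   # no commission
    "partner": (36.0, 3_000),   # after 20% commission
    "comp":    (0.0,    500),   # drags the blend down
})
```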

Roll-up rules that aren't obvious

Combo package allocation

Visitor buys $60 combo (cable car + Luge + meal). How to split across business units?

Not 1/3 each. Management-defined split: $25 cable car, $20 Luge, $15 F&B. Without this rule, each unit either claims the full $60 or gets nothing.

Semantic layer encodes the allocation. AI uses it automatically.
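Encoded as data, the rule looks roughly like this (variable and unit names are illustrative; the dollar split is the management-defined one above):

```python
# Management-defined split for the $60 combo (cable car + Luge + meal)
COMBO_SPLIT = {"cable_car": 25, "luge": 20, "fnb": 15}

def allocate_combo(price, split=COMBO_SPLIT):
    """Allocate a combo sale across business units in the defined ratio."""
    total = sum(split.values())
    return {unit: price * share / total for unit, share in split.items()}

allocate_combo(60)   # {'cable_car': 25.0, 'luge': 20.0, 'fnb': 15.0}
```

The ratio form means a discounted $54 combo still splits consistently instead of breaking the rule.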

GST treatment

Some items GST-inclusive (9%), some exempt. Rolling up "total revenue" means stripping GST from applicable items before summing.

Different treatment for tickets vs F&B vs event space rental. Get it wrong and total revenue is off by up to 9%.

Rules encoded per product type. Consistent across every query.
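A minimal sketch of the per-product-type rule (the inclusive/exempt mapping below is illustrative; the real one comes from the semantic layer):

```python
GST_RATE = 0.09
# Illustrative flags only; actual treatment is defined per product type
GST_INCLUSIVE = {"ticket": True, "fnb": True, "event_space": False}

def net_of_gst(amount, product_type):
    """Strip GST from GST-inclusive items before rolling up revenue."""
    if GST_INCLUSIVE[product_type]:
        return amount / (1 + GST_RATE)
    return amount

net_of_gst(109.0, "ticket")       # ~100.0, 9% stripped
net_of_gst(109.0, "event_space")  # 109.0 unchanged, exempt
```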

Partner commission netting

Report partner revenue gross (what customer paid) or net (what you received)?

Different stakeholders want different views. Marketing wants gross (market size). Finance wants net (actual revenue). Same data, different rules.

Both views defined. AI knows which to use based on who's asking.
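One way to sketch the dual definition (audience labels and function name are assumptions, using the contracted 20% rate):

```python
def partner_revenue_view(gross, commission_rate, audience):
    """Gross for marketing (market size), net for finance (actual revenue)."""
    if audience == "marketing":
        return gross
    if audience == "finance":
        return gross * (1 - commission_rate)
    raise ValueError(f"no revenue view defined for {audience!r}")

partner_revenue_view(680_000, 0.20, "marketing")  # $680K gross
partner_revenue_view(680_000, 0.20, "finance")    # ~$544K net
```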

Multi-visit pass revenue recognition

Annual pass sold for $200. Ticketing records one transaction in January. Finance recognises $16.67/month across 12 months.

Both are "correct" for their domain. Without the semantic layer, January revenue is either overstated or understated depending on which source the AI picks.

Reconciliation rule: ticketing cash vs finance accrual, with stored variance.
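The accrual side is straight-line recognition; the variance is the stored difference. A sketch with the $200 annual pass:

```python
def monthly_recognition(pass_price, months=12):
    """Straight-line accrual: finance recognises pass revenue over its life."""
    return round(pass_price / months, 2)

def cash_vs_accrual_variance(cash_received, recognised):
    """Stored as a known variance, not flagged as an error."""
    return round(cash_received - recognised, 2)

monthly = monthly_recognition(200)            # 16.67 per month
jan = cash_vs_accrual_variance(200, monthly)  # 183.33 deferred in January
```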

Why this is prerequisite for forecasting

Without reconciliation

Double-counted combos inflate base data

Forecast trains on $2.91M instead of $2.14M

36% overstated from day one

Gross vs net inconsistency

Partner revenue is gross some months, net others

Trend line is meaningless

Seasonal patterns wrong

School holidays not baselined — model misses every December

Forecast says dip when it's actually peak

No trust

CFO compares forecast to their GL report. Numbers don't match.

Credibility gone. Project fails.

With semantic layer first

Combos properly allocated

Each unit's revenue is clean and additive

Forecast trains on real $2.14M

Net revenue consistently defined

Every month uses the same netting rule

Trends are real trends

Seasonal baselines learned

School holidays, CNY, National Day — encoded as known patterns

Forecast accounts for seasonal lift

Numbers match GL

Forecast is anchored to the same authoritative source

CFO trusts it because it reconciles

Issues get smarter over time

first occurrence

Sentosa cable car revenue spikes 20% above GL transport line. Investigation: partner batch settlement crosses month boundary. Classified: timing — partner settlement lag.

same pattern, different partner

Next month, GYG settlement batch shows same pattern. System recognises the classification. Auto-resolved. No human touch.

structural variance learned

Ticketing F&B revenue ≠ POS F&B revenue. Reason: ticketing captures bundled meal packages at package price, POS captures individual item sales. Stored as permanent variance rule.

analyst awareness

Decision maker asks "what was total visitor revenue?" AI responds with reconciled number AND notes: "Partner channel data for Klook is 5 days delayed this period. Current figure may increase by ~$15K based on historical settlement patterns."

Traditional vendor: every issue is a new ticket. Our system: every issue makes it smarter.

The AI knows what it doesn't know yet

Data freshness by source — the AI factors this into every answer

Cable Car: real-time scans. Current as of now.
F&B POS: nightly batch. Up to last night; today pending.
Partner Channels: T+5 settlement. Settled up to 5 days ago; recent days pending settlement.
Finance / GL: monthly close. Last closed month; current month open.

AI says: "Revenue as of last closed month is $2.14M. Current month estimate: $1.95M — but partner data is 5 days behind and GL hasn't closed."

Every number traces back to source

"Where did the $2.14M come from?" Provenance trail:

1. Source: Finance GL — table: revenue_summary, period: 2024-11
2. Composition: Ticketing $1.42M + F&B $0.39M + Events $0.18M + Other $0.15M
3. Reconciliation: Ticketing system total ($1.82M) less partner commission ($0.14M) less combo allocation to F&B ($0.26M) = $1.42M ✓
4. Quality: All sources current. Cross-domain variance within 0.3% tolerance. No flags.
5. Freshness: GL closed 2024-11-30. Partner settlements finalised 2024-12-05. Data complete.

Why data projects stall

Bronze: ingest raw data. Every vendor can do this.
Silver: clean, type, deduplicate. Most competent teams get here.
Gold: business-ready models. This is where they stall. Requires business knowledge, not just engineering.

Silver → Gold isn't a technical problem. It's a business knowledge problem. That knowledge lives in your team's heads — no vendor extracts it in a discovery workshop.

We don't just build pipelines. We encode your business rules into the data — and validate them continuously.

How we're different

Reusable patterns, not custom builds

We've built this for a hospitality group with similar data challenges. Bronze/silver/gold pipeline patterns transfer directly. What took months the first time takes weeks now.

AI does the grunt work

Column mapping, schema inference, data cleaning, transformation code — AI writes 80% of it. Our engineers review and refine. 3x faster than traditional delivery.

Small team, low overhead

No project manager layer, no 5-person squad. Engineers who understand both data and business. Direct communication, fast iteration.

The AI is the interface

No weeks spent on dashboard formatting. Ask questions, get answers with provenance. If you want dashboards later, the gold data is ready for any BI tool.

Start with financial data. See results. Then expand to ticketing, F&B, events, partners.

Questions you might ask

Where does our data live?

Dedicated cloud environment. Your data never touches other clients. AWS or Azure — your preference.

How do you get our data?

API connectors if available. SFTP / file drops for legacy systems. Database replication if you allow it. We adapt to what you have.

How often does it refresh?

Configurable per source. Real-time if the source supports it, daily batch otherwise. The system tracks freshness per source.

What if our data format changes?

Schema drift detection. If ticketing adds a column or changes a field name, the system flags it before it breaks downstream.
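At its core, drift detection is a set comparison between the registered schema and what actually arrived. A minimal sketch (column names are illustrative):

```python
def detect_schema_drift(registered_cols, incoming_cols):
    """Compare an incoming feed's columns against the registered schema."""
    registered, incoming = set(registered_cols), set(incoming_cols)
    return {"added": sorted(incoming - registered),
            "removed": sorted(registered - incoming)}

drift = detect_schema_drift(["ticket_id", "price", "qty"],
                            ["ticket_id", "price", "qty", "promo_code"])
# {'added': ['promo_code'], 'removed': []} -> flag before downstream breaks
```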

Who owns the data?

You do. We process it in your dedicated environment. Export everything at any time. No lock-in.

Can we connect our own BI tools?

Yes. Gold tables are standard SQL-queryable. Power BI, Tableau, Looker — whatever you use can connect directly.

Under the hood — production-grade data stack

Delta Lake format

ACID transactions, time travel, schema evolution. Roll back any table to any point in time. No data loss from bad writes.

Data validation

Automated checks at every layer. Schema validation, null checks, range checks, cross-table reconciliation. Issues caught before they reach gold.

Incremental processing

Only process what changed. Daily runs take minutes, not hours. Cost-efficient on cloud compute.

Lineage tracking

Every gold table traces back to its bronze source. Know exactly which raw file produced which metric.

Idempotent pipelines

Re-run any pipeline safely. Same input always produces same output. No duplicate records from retries.
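The property reduces to merge-by-key rather than append. A toy sketch of the idea (in production this is a Delta Lake merge, not a Python dict):

```python
def upsert(table, rows, key="id"):
    """Merge rows into table by key: re-running the same batch changes nothing."""
    merged = {r[key]: dict(r) for r in table}
    for r in rows:
        merged[r[key]] = dict(r)
    return sorted(merged.values(), key=lambda r: r[key])

batch = [{"id": 1, "revenue": 500}, {"id": 2, "revenue": 750}]
once = upsert([], batch)
twice = upsert(once, batch)   # retry-safe: once == twice, no duplicates
```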

AI-optimised data model

Gold tables structured for AI query generation. Pre-joined, pre-aggregated at the right granularity. Fast, accurate answers.