Data Ingestion Flow

Antei ingests structured data from integrations and CSV uploads to power tax compliance workflows including reconciliation, filings, taxability analysis, and risk monitoring. Every ingestion event is scoped per organization, logged, and transformed into normalized formats via our internal registry and mapping logic.

Sources of Data

We support data ingestion from the following sources:

Third-party Integrations
Examples include Stripe, Razorpay, Chargebee, QuickBooks, Xero, BambooHR, Gmail, Outlook, and Slack
CSV Uploads
Users can upload structured data manually, including:
- Transactions
- Invoices
- Products
- Contacts (Customers, Vendors, Authorities)

Supported Ingestion Modes

Method	Description	Use Cases
OAuth Integrations	API-based ingestion via scoped tokens	Ongoing syncs from billing, HRIS, and communication tools
CSV Uploads	File-based UI ingestion	Bulk data uploads, one-time imports
Webhooks	Real-time, event-based triggers	Stripe payments, refunds, Slack messages

Ingestion Pipeline

1. Authorization or File Upload

OAuth 2.0 Authorization for integrated platforms
CSV Upload via secure UI workflow
All inputs scoped by organization and source

2. Sync or Upload Trigger

Triggered by webhooks, scheduled cron jobs, or manual syncs
Metadata recorded: source, sync_time, trigger_type, auth_token

3. Parsing & Normalization

API payloads and CSV rows parsed to match internal schema
Supported entities: transaction, invoice, contact, product, transaction_op
Mapping handled via integration-specific configs referencing the Antei registry
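
As a rough illustration of config-driven mapping, the sketch below renames fields from a source payload into an internal record. The `EXAMPLE_MAPPING` dictionary, the internal field names, and the `normalize` helper are all hypothetical; they are not taken from the Antei registry.

```python
from typing import Any

# Hypothetical mapping config: source field name -> internal field name.
EXAMPLE_MAPPING = {
    "id": "external_id",
    "amount": "amount_minor",
    "currency": "currency",
    "customer": "contact_ref",
}


def normalize(payload: dict[str, Any], mapping: dict[str, str], entity: str) -> dict[str, Any]:
    """Rename mapped fields and tag the record with its entity type.

    Fields that are absent from the mapping are dropped.
    """
    record: dict[str, Any] = {"entity": entity}
    for source_key, target_key in mapping.items():
        if source_key in payload:
            record[target_key] = payload[source_key]
    return record
```

Keeping the mapping as data rather than code means a new integration would only need a new dictionary, which is one plausible reason to drive this step from per-integration configs.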

4. Breakdown & Componentization

Records decomposed into normalized atomic components
Nested fields (e.g. line items, tax summaries) flattened
Data enriched with inferred fields such as jurisdiction or currency
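
To make the decomposition step concrete, here is a minimal sketch that splits an invoice with nested line items into flat components. The record shape (`id`, `line_items`) and the `decompose_invoice` helper are assumptions for illustration only.

```python
from typing import Any


def decompose_invoice(invoice: dict[str, Any]) -> list[dict[str, Any]]:
    """Split an invoice into one header component plus one component per line item."""
    # The header keeps every top-level field except the nested line items.
    header = {key: value for key, value in invoice.items() if key != "line_items"}
    components: list[dict[str, Any]] = [{"type": "invoice", **header}]

    # Each line item becomes its own record, linked back to the parent invoice.
    for position, item in enumerate(invoice.get("line_items", [])):
        components.append(
            {"type": "line_item", "invoice_id": invoice["id"], "position": position, **item}
        )
    return components
```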

5. Deduplication & Matching

Antei applies weighted entity matching combining:
- Fuzzy scoring for names, emails, addresses
- Structured scoring for tax ID, currency, contact type, jurisdiction
Match outcome:
- Score ≥ 90%: Treated as a likely duplicate and marked as unprocessed so a user can review or override the match
- Score < 90%: Created as new entity
This process runs at:
- Component level (e.g., contact, product)
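
The weighted matching described above can be sketched as follows. This is an illustration under stated assumptions, not Antei's scoring: the weights are invented, exact equality stands in for structured scoring, and Python's standard-library `difflib.SequenceMatcher` stands in for whatever fuzzy scorer is actually used. Only the 90% threshold and the field groups come from the text.

```python
from difflib import SequenceMatcher

# Hypothetical weights; they sum to 1.0 so the score reads as a percentage.
FUZZY_WEIGHTS = {"name": 0.25, "email": 0.25, "address": 0.10}
STRUCTURED_WEIGHTS = {"tax_id": 0.25, "currency": 0.05, "contact_type": 0.05, "jurisdiction": 0.05}
THRESHOLD = 0.90  # from the match outcome rule above


def fuzzy(a: str, b: str) -> float:
    """Similarity in [0, 1]; a missing value never counts as a match."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def match_score(candidate: dict, existing: dict) -> float:
    """Combine fuzzy and structured field scores into one weighted score."""
    score = 0.0
    for field, weight in FUZZY_WEIGHTS.items():
        score += weight * fuzzy(candidate.get(field, ""), existing.get(field, ""))
    for field, weight in STRUCTURED_WEIGHTS.items():
        value = candidate.get(field)
        if value is not None and value == existing.get(field):
            score += weight
    return score


def outcome(score: float) -> str:
    """Apply the threshold: high scores are held for review, low scores become new entities."""
    return "unprocessed" if score >= THRESHOLD else "new_entity"
```

In this sketch a candidate would be scored against each existing entity of the same type, and the best score decides the outcome.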

6. Temporary Storage

Parsed records stored in region-aware staging tables
Includes sync metadata, classification tags, source, and extracted ID
No data pushed to primary tables without validation

7. Validation & Classification

Schema-level checks: required fields, correct formats, conditional rules
Entity linkage checks: references must exist or be resolvable
Classification tags (e.g., financial, sensitive, jurisdictional) applied automatically
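
A minimal sketch of the three kinds of checks listed above (required fields, formats, a conditional rule, and entity linkage). The field names, the refund rule, and the reason codes are hypothetical examples rather than Antei's actual schema.

```python
import re
from datetime import date

REQUIRED_FIELDS = ("external_id", "amount", "currency", "issued_on")
CURRENCY_RE = re.compile(r"^[A-Z]{3}$")  # three uppercase letters, ISO 4217 style


def validate(record: dict, known_contact_ids: set[str]) -> list[str]:
    """Return a list of reason codes; an empty list means the record passed."""
    reasons: list[str] = []

    # Required fields must be present and non-empty (0 is a valid amount).
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            reasons.append(f"missing_{field}")

    # Format checks run only on fields that are present.
    currency = record.get("currency")
    if currency and not CURRENCY_RE.match(currency):
        reasons.append("invalid_currency_format")
    issued_on = record.get("issued_on")
    if issued_on:
        try:
            date.fromisoformat(issued_on)
        except (TypeError, ValueError):
            reasons.append("invalid_issued_on")

    # Conditional rule (assumed): a refund must point at its original transaction.
    if record.get("kind") == "refund" and not record.get("original_id"):
        reasons.append("refund_missing_original_id")

    # Linkage check: a referenced contact must already exist.
    contact_ref = record.get("contact_ref")
    if contact_ref is not None and contact_ref not in known_contact_ids:
        reasons.append("unresolved_contact_ref")

    return reasons
```

Returning reason codes instead of raising on the first failure lets a staging record carry every problem found in one pass.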

8. Revised Deduplication & Matching

Antei applies weighted entity matching combining:
- Fuzzy scoring for names, emails, addresses
- Structured scoring for tax ID, currency, contact type, jurisdiction
Match outcome:
- Score ≥ 90%: Treated as a likely duplicate and marked as unprocessed so a user can review or override the match
- Score < 90%: Created as new entity
This process runs at:
- Transaction level (to dedupe invoices and refunds)

9. Push to Org Database

Validated, deduplicated records inserted into organization-scoped production tables
Records become available for reconciliation, tax logic, invoicing, and filing
Logs persist linkage to original ingestion source and timestamp

Sync Frequencies

Frequency	Trigger Type	Examples
Real-time	Webhooks	Stripe charges, refunds
Hourly	Cron jobs	QuickBooks, Xero, Chargebee
Daily	Scheduled pull	BambooHR, email inboxes
Manual Upload	User-triggered	Offline data, corrections, imports

Observability & Logs

Each ingestion event is logged with metadata and trigger source
Staging and unprocessed records include reason codes
Sync logs accessible under Org Settings → Sync Logs
A retry queue automatically handles transient failures and slow APIs
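
One common way to handle transient failures is to retry with exponential backoff. The sketch below shows that pattern under stated assumptions; the text does not specify Antei's retry policy, and `TransientError`, `with_retries`, and the delay values are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Raised by a fetch function for failures that are worth retrying."""


def with_retries(
    fetch: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch, doubling the wait after each transient failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except TransientError:
            if attempt == max_attempts:
                raise  # retries exhausted; surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the policy can be exercised in tests without real waiting.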

Access Control

All ingestion is scoped by organization and environment
OAuth tokens and upload sessions are validated per request
CSV files are processed in-memory and discarded post-validation
Region-aware routing ensures PII is stored in appropriate jurisdictions
- EU data → EU
- India data → IND
- US/Rest of World → US
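
The routing rule above can be sketched as a simple lookup. This assumes the region is chosen from an ISO 3166-1 alpha-2 country code and that "EU data" means the 27 EU member states; how Antei actually determines a record's origin is not stated, and `storage_region` is a hypothetical name.

```python
# ISO 3166-1 alpha-2 codes of the 27 EU member states.
EU_COUNTRIES = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR", "HU", "IE",
    "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE",
})


def storage_region(country_code: str) -> str:
    """Map a country code to a storage region, defaulting to US for the rest of the world."""
    code = country_code.strip().upper()
    if code in EU_COUNTRIES:
        return "EU"
    if code == "IN":
        return "IND"
    return "US"
```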

Need Help?

For help with ingestion structure, mapping schemas, or staging errors:
support@antei.com
