Data Flow & Processing
Data Ingestion Flow
How Antei collects and structures data from third-party integrations and file uploads to power tax workflows.
Antei ingests structured data from integrations and CSV uploads to power compliance workflows such as reconciliation, return filing, tax calculation, and risk analysis. Every ingestion event is securely scoped, logged, and transformed into normalized internal formats.
Sources of Data
We ingest data from the following sources:
- Third-party Integrations
  Platforms such as Stripe, Razorpay, Chargebee, QuickBooks, Xero, BambooHR, Gmail, Outlook, and Slack
- CSV Uploads
  Users can manually upload structured data for entities including:
  - Transactions
  - Invoices
  - Products
  - Contacts (Customers, Vendors, Authorities)
Supported Ingestion Modes
| Method | Description | Use Cases |
| --- | --- | --- |
| OAuth Integrations | Secure pull via scoped tokens | Continuous sync with billing and HRIS systems |
| CSV Uploads | File-based manual ingestion via UI | One-time data loads or offline correction flows |
| Webhooks | Real-time, event-driven ingestion | Stripe payments, refunds, Slack messages |
Ingestion Pipeline
1. Authorization or File Upload
- OAuth 2.0 authorization is completed for integrated platforms
- CSV files are uploaded securely through the Antei UI
- Each request is scoped to a specific organization
2. Sync or Upload Trigger
- Triggers can be webhook-based, time-based (cron), or manual
- CSV uploads are explicitly initiated by users
- Metadata such as `source`, `sync_time`, and `trigger_type` are recorded
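The metadata recorded per trigger can be pictured as a small structured record. A minimal sketch, assuming Python and illustrative field names beyond the three named above (`org_id` is an assumption based on the per-organization scoping described in step 1):

```python
# Hypothetical shape of the per-sync metadata record. Only source, sync_time,
# and trigger_type are documented above; org_id is an assumed scoping field.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class SyncMetadata:
    source: str        # e.g. "stripe" or "csv_upload"
    sync_time: str     # ISO-8601 timestamp of the trigger
    trigger_type: str  # "webhook" | "cron" | "manual"
    org_id: str        # every request is scoped to one organization

meta = SyncMetadata(
    source="stripe",
    sync_time=datetime.now(timezone.utc).isoformat(),
    trigger_type="webhook",
    org_id="org_123",
)
print(asdict(meta)["trigger_type"])  # -> webhook
```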
3. Parsing & Normalization
- Incoming data is parsed and transformed into Antei’s internal registry structure
- Supported entities include `transaction`, `invoice`, `contact`, `product`, and `transaction_op`
- Payloads are matched against the internal registry using a config-driven mapping layer
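A config-driven mapping layer of this kind can be sketched as a per-source field map applied to each payload. The config keys and field names below are illustrative assumptions, not Antei's actual registry schema:

```python
# Illustrative per-source mapping config: provider field names on the left,
# internal registry field names on the right. Names are assumptions.
STRIPE_INVOICE_MAP = {
    "id": "external_id",
    "amount_due": "total_amount",
    "customer_email": "contact_email",
    "currency": "currency",
}

def normalize(payload: dict, field_map: dict) -> dict:
    """Translate a provider payload into the internal registry structure,
    dropping provider-specific fields that have no mapping."""
    return {internal: payload[external]
            for external, internal in field_map.items()
            if external in payload}

raw = {"id": "in_001", "amount_due": 4200, "currency": "usd",
       "customer_email": "a@b.com", "livemode": True}
print(normalize(raw, STRIPE_INVOICE_MAP)["total_amount"])  # -> 4200
```

Keeping the mapping in config rather than code means a new integration mostly needs a new field map, not a new parser.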
4. Breakdown & Componentization
- Each structured record is broken down into its atomic components
- Line items, taxes, and references are extracted and linked to core entities
- Nested data is flattened and aligned with the registry schema
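The breakdown step can be sketched as splitting one nested record into atomic components that link back to the parent by ID. The record shapes here are assumptions for illustration:

```python
# Hedged sketch of componentization: a nested invoice is flattened into
# line-item and tax components linked to the parent entity. Shapes assumed.
def componentize(invoice: dict) -> list:
    parent_id = invoice["external_id"]
    components = []
    for seq, item in enumerate(invoice.get("line_items", [])):
        components.append({"type": "line_item", "parent": parent_id,
                           "seq": seq, **item})
    for tax in invoice.get("taxes", []):
        components.append({"type": "tax", "parent": parent_id, **tax})
    return components

inv = {"external_id": "in_001",
       "line_items": [{"sku": "PRO", "amount": 4000}],
       "taxes": [{"kind": "VAT", "amount": 200}]}
print(len(componentize(inv)))  # -> 2
```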
5. Deduplication & Matching
- For each entity, we perform weighted matching using a hybrid method:
  - Fuzzy scoring for names, emails, references
  - Structured scoring using contact type, jurisdiction, currency, etc.
- Results are evaluated:
  - Score ≥ 90%: Entity is marked as `unprocessed` for manual review and override
  - Score < 90%: A new object is created with generated IDs
- This applies both at component level (e.g., product, contact) and at transaction level
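The hybrid scoring above can be sketched as a weighted blend of a fuzzy string ratio and exact structured-field agreement. The 90% threshold comes from this section; the 60/40 weights, field list, and use of `difflib` are illustrative assumptions:

```python
# Sketch of weighted hybrid matching. Weights and fields are assumptions;
# only the 90% threshold is documented above.
from difflib import SequenceMatcher

def match_score(incoming: dict, candidate: dict) -> float:
    # Fuzzy component: similarity of display names
    fuzzy = SequenceMatcher(None, incoming["name"].lower(),
                            candidate["name"].lower()).ratio()
    # Structured component: fraction of exact-match structured fields
    fields = ("contact_type", "jurisdiction", "currency")
    structured = sum(incoming.get(f) == candidate.get(f) for f in fields) / len(fields)
    return 0.6 * fuzzy + 0.4 * structured

incoming = {"name": "Acme Corp.", "contact_type": "customer",
            "jurisdiction": "US", "currency": "USD"}
existing = {"name": "Acme Corp", "contact_type": "customer",
            "jurisdiction": "US", "currency": "USD"}
# Score >= 0.90 -> mark unprocessed for manual review; else create new object
print(match_score(incoming, existing) >= 0.90)  # -> True
```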
6. Temporary Storage
- Parsed and mapped data is stored in secure staging tables
- Metadata includes sync source, change status, timestamps, and classification
- These records are held pending validation
7. Validation & Classification
- Validations ensure schema compliance and reference linkage
- Missing or mismatched data is flagged for user resolution
- Auto-tagging applies classification such as taxability, reverse charge, etc.
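A minimal sketch of the validation pass, assuming an illustrative required-field set and a contact-reference check (the rule names here are hypothetical, not Antei's actual validation config):

```python
# Hedged sketch of validation & classification: schema compliance plus
# reference linkage, returning issues to surface for user resolution.
REQUIRED = {"external_id", "total_amount", "currency", "contact_id"}

def validate(record: dict, known_contacts: set) -> list:
    issues = [f"missing field: {f}" for f in sorted(REQUIRED - record.keys())]
    if record.get("contact_id") and record["contact_id"] not in known_contacts:
        issues.append("unlinked reference: contact_id")
    return issues  # empty list -> record can be pushed to the org database

rec = {"external_id": "in_001", "total_amount": 4200,
       "currency": "USD", "contact_id": "c_9"}
print(validate(rec, known_contacts={"c_1"}))  # -> ['unlinked reference: contact_id']
```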
8. Push to Org Database
- Fully validated records are saved to the organization’s production tables
- Available for reconciliation, filings, reporting, and audits
- All actions are traceable via logs and metadata
Sync Frequencies
| Frequency | Trigger Type | Examples |
| --- | --- | --- |
| Real-time | Webhooks | Stripe charges, Slack alerts |
| Hourly | Background Cron | Xero, Chargebee, QuickBooks |
| Manual Upload | User-initiated | Invoices, legacy transactions |
| Daily | Scheduled Pull | Employee and product sync |
Observability & Logs
- Every sync or upload is logged with timestamp, method, and source
- All records in staging and unprocessed buckets include error context
- Retry queues automatically process temporary failures
- Logs are viewable in Org Settings → Sync Logs
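A retry queue for temporary failures can be sketched as retrying a sync task with exponential backoff. The backoff policy, attempt limit, and treatment of `TimeoutError` as "temporary" are illustrative assumptions:

```python
# Hedged sketch of retrying temporary sync failures with exponential backoff.
import time

def with_retries(task, max_attempts=3, base_delay=0.01):
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except TimeoutError:  # assumed marker of a temporary failure
            if attempt == max_attempts:
                raise         # permanent failure: surface with error context
            time.sleep(base_delay * 2 ** (attempt - 1))

calls = {"n": 0}
def flaky_sync():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient upstream error")
    return "synced"

print(with_retries(flaky_sync))  # -> synced
```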
Access Control
- All ingestion is scoped per organization
- API integrations use short-lived tokens with limited scope
- CSV files are processed in memory and discarded post-validation
- Field-level data sensitivity is respected and retained in metadata
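In-memory CSV processing of the kind described above can be sketched with the standard library: the upload is parsed from a buffer rather than written to disk, and the buffer is simply discarded after validation. The column names are illustrative assumptions:

```python
# Sketch of in-memory CSV parsing: no file ever touches disk; the buffer is
# garbage-collected once validation completes. Column names are assumed.
import csv
import io

def parse_csv_upload(raw_bytes: bytes) -> list:
    buf = io.StringIO(raw_bytes.decode("utf-8"))
    return list(csv.DictReader(buf))

rows = parse_csv_upload(b"external_id,amount\nin_001,4200\n")
print(rows[0]["amount"])  # -> 4200
```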
Need Help?
For questions on ingestion architecture, field mapping, or preparing import files:
support@antei.com