Version: 1.0.0

S3 Files

Replicate your data in minutes, through our UI or programmatically with our APIs.

Replicate your S3 files to your data warehouses and lakes in minutes

Files are often replicated from S3. This source aims to support an expanding range of file formats (CSV, JSON, HTML, Excel, Feather, Parquet, Orc, Pickle…).

The Files source supports Full Refresh syncs. That is, every time a sync runs, Katonic copies all rows of the file, limited to the columns you set up for replication, into a new table in the destination.
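
The Full Refresh behaviour described above can be sketched as follows. This is only an illustration, not Katonic's implementation: the function name `full_refresh`, the in-memory SQLite destination, the sample data, and the choice to drop and recreate the table on every sync are all assumptions made for the example.

```python
import csv
import io
import sqlite3


def full_refresh(csv_text, columns, conn, table):
    """Hypothetical helper: reload `table` with every row of the file,
    keeping only the columns selected for replication."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [tuple(row[c] for c in columns) for row in reader]

    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    placeholders = ", ".join("?" for _ in columns)

    # Assumption: a full refresh replaces whatever an earlier sync wrote.
    with conn:  # commit on success, roll back on error
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
        conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
        conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', rows
        )
    return len(rows)


# Made-up file contents; only "id" and "name" are selected for replication.
data = "id,name,city\n1,Ada,London\n2,Linus,Helsinki\n"
conn = sqlite3.connect(":memory:")

full_refresh(data, ["id", "name"], conn, "users")
full_refresh(data, ["id", "name"], conn, "users")  # a second sync run

count = conn.execute('SELECT COUNT(*) FROM "users"').fetchone()[0]
print(count)
```

Running the sync twice leaves two rows in the table rather than four, which is the property that distinguishes a full refresh from an append.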

Check our detailed documentation on how to start syncing your S3 files.

File formats

Format / Supported

  • CSV / Yes
  • JSON / Experimental
  • HTML / Untested
  • Excel / Untested
  • Feather / Untested
  • Parquet / Untested
  • Orc / Untested
  • Pickle / Untested

Features of the Connector

Feature / Supported

  • Full Refresh Sync / Yes
  • Incremental Sync / Coming soon
  • Replicate Incremental Deletes / Coming soon
  • Replicate Folders (multiple files) / Coming soon
  • Replicate Glob Patterns (multiple files) / Yes
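
To give a feel for what a glob pattern selects, here is a minimal sketch that filters a list of object keys. The exact pattern syntax the connector accepts is not specified on this page, so Python's `fnmatch` stands in for it; the helper `select_keys` and the keys themselves are hypothetical.

```python
from fnmatch import fnmatchcase


def select_keys(keys, pattern):
    """Hypothetical helper: keep the object keys that match a glob pattern.

    Note that in fnmatch, "*" also matches "/" characters, which may
    differ from the connector's own pattern rules.
    """
    return [key for key in keys if fnmatchcase(key, pattern)]


# Made-up object keys for illustration.
keys = [
    "exports/2024/orders.csv",
    "exports/2024/customers.csv",
    "exports/2024/orders.json",
    "logs/app.csv",
]

matched = select_keys(keys, "exports/*.csv")
print(matched)
```

With these keys, the pattern keeps the two CSV files under `exports/` and skips the JSON file and the key under `logs/`.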

Resulting schema

At this time, this source produces a single stream for the target file, because it replicates only one file at a time. In future iterations, we will consider capturing more files by globbing folders or matching patterns, as well as supporting more file formats and storage providers.
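
As a rough picture of "one file, one stream", the sketch below builds a catalog with exactly one stream from a CSV header. The catalog layout, the function `discover_single_stream`, the stream name, and the use of a single "string" type for every column are assumptions made for illustration, not the connector's actual schema format.

```python
import csv
import io


def discover_single_stream(stream_name, csv_text):
    """Hypothetical helper: describe one file as a catalog with one stream."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return {
        "streams": [
            {
                "name": stream_name,
                # Assumption: every column is reported as a string.
                "columns": {column: "string" for column in header},
            }
        ]
    }


# Made-up file contents for illustration.
catalog = discover_single_stream("orders", "order_id,amount\n1001,25.50\n")
print(catalog)
```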