Import

Overview

Sift lets you import certain file types directly through the web interface. This is the fastest way to bring data into Sift for review, visualization, or analysis without any setup or code. The following file types can be imported through the UI:

File type	Details
`.csv`	Learn more.
`.tdms`
`.parquet`	Learn more.

General configuration

After you upload a supported file, Sift requires some basic configuration before the data can be used. Which of the following general settings apply depends on the file type being imported.

Setting	Description	Applies to
Asset	This setting defines the system that generated the data. You can select an existing Asset or create a new one.	CSV, Parquet
Run	This setting specifies a data collection session. It defaults to the file name but can be edited.	CSV, Parquet
First data row	This setting identifies the row number where time-series data begins, allowing extra header rows to be skipped.	CSV
Timestamp column	This setting designates the column containing timestamps. It is auto-detected but can be edited.	CSV, Parquet
Timestamp format	This setting determines the format of the timestamp column. It must be one of the supported timestamp formats.	CSV
Complex types import mode	This setting controls how complex types such as lists and maps are imported. They can be ingested as bytes, JSON strings, both formats, or skipped entirely. Learn more.	Parquet

Timestamp formats

The following table lists the timestamp formats supported by Sift. The format you select must match the values in the timestamp column of the imported file.

Format	Description
rfc3339	2023-01-02T15:04:05Z
datetime	2023-01-02 15:04:05
unix	Seconds since epoch
unix_millis	Milliseconds since epoch
unix_micros	Microseconds since epoch
unix_nanos	Nanoseconds since epoch
nanoseconds	Relative time in nanoseconds
microseconds	Relative time in microseconds
milliseconds	Relative time in milliseconds
seconds	Relative time in seconds
minutes	Relative time in minutes
hours	Relative time in hours
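To make the absolute formats concrete, the following minimal sketch renders one instant in each of them, for example when preparing a timestamp column before import. The function name `format_timestamp` is hypothetical and not part of Sift; the format names are taken from the table above and matched case-insensitively. The relative formats (`seconds`, `minutes`, and so on) are plain offsets and are not covered here.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_timestamp(dt, fmt):
    """Render a timezone-aware datetime as text in one absolute format.

    Hypothetical helper for illustration; not part of Sift.
    """
    utc = dt.astimezone(timezone.utc)
    delta = utc - EPOCH
    # Whole microseconds since the epoch, computed with integer arithmetic.
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    fmt = fmt.lower()
    if fmt == "rfc3339":
        return utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    if fmt == "datetime":
        return utc.strftime("%Y-%m-%d %H:%M:%S")
    if fmt == "unix":
        return str(micros // 1_000_000)
    if fmt == "unix_millis":
        return str(micros // 1_000)
    if fmt == "unix_micros":
        return str(micros)
    if fmt == "unix_nanos":
        return str(micros * 1_000)
    raise ValueError(f"unsupported absolute format: {fmt}")


moment = datetime(2023, 1, 2, 15, 4, 5, tzinfo=timezone.utc)
rendered = {
    name: format_timestamp(moment, name)
    for name in ("rfc3339", "datetime", "unix", "unix_millis")
}
```

Every value in a single timestamp column should use the same format, since one format is selected per import.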

Channel configuration

During import, each non-timestamp column is treated as a Channel representing a stream of time-series data. Sift displays a configuration table where each Channel can be reviewed and customized before the data is imported.

Column	Description
Checkbox	Determines whether the Channel is included in the import.
Name	Autodetected Channel name from the source file. Editable if incorrect.
Data type	Autodetected supported data type. Editable if incorrect.
Units	Optional. Unit of the Channel (for example, `°C`).
Description	Optional. Description of the Channel.

Data type

When you import a supported file type in the UI, each Channel must be assigned a data type. The following data types are supported:

Data type	Description
double	64-bit floating point number
float	32-bit floating point number
int32	32-bit signed integer
uint32	32-bit unsigned integer
int64	64-bit signed integer
uint64	64-bit unsigned integer
bytes	Sequence of raw bytes
bool	Boolean value (true or false)
string	Text or alphanumeric values
enum	Categorical value with a fixed set of possible strings
bit field	Integer where each bit represents a distinct flag or condition
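As an illustration of the bit field type, the sketch below unpacks an integer into named flags. It assumes the least significant bit maps to the first flag, and the flag names are hypothetical; how Sift itself lays out bit field elements is not described here.

```python
def decode_bit_field(value, flag_names):
    """Map each bit of `value` to a named flag, least significant bit first.

    Assumption: bit 0 corresponds to flag_names[0]. Names are illustrative.
    """
    return {
        name: bool((value >> position) & 1)
        for position, name in enumerate(flag_names)
    }


# 0b101 has bits 0 and 2 set, so the first and third flags are true.
status = decode_bit_field(0b101, ["power_on", "fault", "armed"])
```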

CSV

Format

A CSV file imported into Sift must include a header row, a timestamp column, and one or more telemetry Channels, each formatted according to the supported conventions.

Requirement	Description
First row	It must contain column headers.
Timestamp column	One column must contain timestamps, and the recommended name for this column is `timestamp`.
Data columns	All other columns are treated as telemetry Channels.
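The requirements above can be sketched with Python's standard `csv` module. The example builds a small file with a header row, a `timestamp` column in `rfc3339` format, and two data columns. The Channel names `voltage` and `temperature` and the helper `build_csv` are hypothetical.

```python
import csv
import io

# Hypothetical sample rows; every column other than "timestamp" becomes a Channel.
rows = [
    {"timestamp": "2023-01-02T15:04:05Z", "voltage": 11.9, "temperature": 24.5},
    {"timestamp": "2023-01-02T15:04:06Z", "voltage": 12.1, "temperature": 24.7},
]


def build_csv(records):
    """Serialize records to CSV text with the column headers in the first row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = build_csv(rows)
```

With this layout, the first data row is row 2, directly after the header.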

Parquet

Format

Sift supports Parquet files with a flat schema, where each telemetry Channel is represented as an individual column.

Requirement	Description
Timestamp column	One column must contain timestamps.
Channel columns	Each additional column is treated as a telemetry Channel.

Nested columns

When importing a Parquet file with nested columns (such as hierarchical structures or columns containing objects), Sift automatically flattens these columns into a flat schema. Nested fields are converted to individual columns using dot notation to represent their hierarchy.

For example, a nested column `location` with two subfields, `lat` and `lon`, will appear as two separate Channels named `location.lat` and `location.lon`. This flattening ensures that all data is accessible and can be analyzed as standard time-series Channels, regardless of the original file's complexity.
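The dot-notation naming can be illustrated with a short sketch that flattens a nested Python dictionary. This only mirrors the naming scheme described above; it is not Sift's implementation, and the `flatten` helper and sample values are hypothetical.

```python
def flatten(record, prefix=""):
    """Flatten nested dictionaries into dot-separated column names."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested fields, carrying the parent name as a prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


row = {"timestamp": 0, "location": {"lat": 37.77, "lon": -122.42}}
columns = flatten(row)
```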

Complex types

When importing a Parquet file, you can select one of the supported modes to determine how lists or maps (complex types) are handled. Supported modes:

Mode	Description
Both (default)	Complex types are imported as both Arrow bytes and JSON strings.
Bytes	Complex types are imported as Arrow bytes only.
String	Complex types are imported as JSON strings only.
Ignore	Complex types are skipped and not ingested.

Working with strings

When complex types are imported as strings, the Channel name will automatically have .json appended to it to avoid naming conflicts.
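As a minimal sketch, assuming a value read from a `.json` Channel arrives as a JSON string, Python's standard `json` module can turn it back into a list or map. The sample strings and the helper name are hypothetical.

```python
import json


def parse_json_channel_value(raw):
    """Decode one JSON-string data point into a Python list or dict."""
    return json.loads(raw)


# Hypothetical data points: one list value and one map value.
list_value = parse_json_channel_value("[1, 2, 3]")
map_value = parse_json_channel_value('{"lat": 37.77, "lon": -122.42}')
```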

Working with bytes

When querying complex types using Sift's API, you can use any available Arrow library to read the raw bytes and recover the original values. Each Sift "data point" for a given timestamp is an Arrow record with one column and one row. Here's a Python example for converting that into a usable value:

import io
import pyarrow as pa

def arrow_bytes_to_list(data):
    """Converts an Arrow-formatted byte stream into structured data."""
    reader = pa.ipc.open_stream(io.BytesIO(data))
    batch = reader.read_next_batch()  # one record: one column, one row
    return batch.column(0).to_pylist()  # column values as Python objects
