
Import

Overview

In the UI, Sift allows you to import certain files for analysis.

File types

Sift supports importing the following file types directly through the web interface. This is the fastest way to bring data into Sift for review, visualization, or analysis without any setup or code:

| File type | Details |
| --- | --- |
| .csv | Learn more. |
| .tdms | |
| .parquet | Learn more. |

General configuration

After uploading a supported file type, Sift requires basic configuration before the data can be used. The following general configuration settings may apply depending on the file type being imported.

| Setting | Description | Applies to |
| --- | --- | --- |
| Asset | Defines the system that generated the data. You can select an existing Asset or create a new one. | CSV, Parquet |
| Run | Specifies a data collection session. Defaults to the file name but can be edited. | CSV, Parquet |
| First data row | Identifies the row number where time-series data begins, allowing extra header rows to be skipped. | CSV |
| Timestamp column | Designates the column containing timestamps. Auto-detected but can be edited. | CSV, Parquet |
| Timestamp format | Determines the format of the timestamp column. Must be one of the supported types. | CSV |
| Complex types import mode | Controls how complex types such as lists and maps are imported. They can be ingested as bytes, JSON strings, both formats, or skipped entirely. Learn more. | Parquet |

Timestamp formats

The following table lists the timestamp formats supported by Sift. These formats must match the structure of the selected timestamp column during a supported file type import.

| Format | Description |
| --- | --- |
| rfc3339 | 2023-01-02T15:04:05Z |
| datetime | 2023-01-02 15:04:05 |
| unix | Seconds since epoch |
| unix_millis | Milliseconds since epoch |
| unix_micros | Microseconds since epoch |
| unix_nanos | Nanoseconds since epoch |
| nanoseconds | Relative time in nanoseconds |
| microseconds | Relative time in microseconds |
| milliseconds | Relative time in milliseconds |
| seconds | Relative time in seconds |
| minutes | Relative time in minutes |
| hours | Relative time in hours |
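
For illustration, here is a short Python sketch (using only the standard library) that produces values in a few of these formats when preparing a file for import; the variable names are hypothetical and not part of Sift.

# Illustrative only: build timestamp values matching a few supported formats.
from datetime import datetime, timezone

now = datetime(2023, 1, 2, 15, 4, 5, tzinfo=timezone.utc)

rfc3339 = now.strftime("%Y-%m-%dT%H:%M:%SZ")        # rfc3339: 2023-01-02T15:04:05Z
datetime_fmt = now.strftime("%Y-%m-%d %H:%M:%S")    # datetime: 2023-01-02 15:04:05
unix_seconds = int(now.timestamp())                 # unix: seconds since epoch
unix_nanos = int(now.timestamp() * 1_000_000_000)   # unix_nanos: nanoseconds since epoch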

Channel configuration

During a supported file type import, each non-timestamp column is treated as a Channel representing a stream of time-series data. Sift displays a configuration table where each Channel can be reviewed and customized before import.

| Column | Description |
| --- | --- |
| Checkbox | Determines whether the Channel is included in the import. |
| Name | Autodetected Channel name from the source file. Editable if incorrect. |
| Data type | Autodetected supported data type. Editable if incorrect. |
| Units | Optional. Unit of the Channel (for example, °C). |
| Description | Optional. Description of the Channel. |

Data type

During a supported file type import in the UI, each Channel must be assigned a data type. The following data types are supported:

| Data type | Description |
| --- | --- |
| double | 64-bit floating point number |
| float | 32-bit floating point number |
| int32 | 32-bit signed integer |
| uint32 | 32-bit unsigned integer |
| int64 | 64-bit signed integer |
| uint64 | 64-bit unsigned integer |
| bytes | Sequence of raw bytes |
| bool | Boolean value (true or false) |
| string | Text or alphanumeric values |
| enum | Categorical value with a fixed set of possible strings |
| bit field | Integer where each bit represents a distinct flag or condition |

CSV

Format

A CSV to be imported to Sift must include a header row, a timestamp column, and one or more telemetry Channels, each formatted according to supported conventions.

| Requirement | Description |
| --- | --- |
| First row | Must contain column headers. |
| Timestamp column | One column must contain timestamps. The recommended name for this column is timestamp. |
| Data columns | All other columns are treated as telemetry Channels. |
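
As a sketch, the following Python snippet (using pandas; the channel names and values are hypothetical) writes a CSV that satisfies these requirements: a header row, a timestamp column named timestamp, and two telemetry columns.

# Sketch: write a CSV with a header row, a timestamp column, and two Channels.
import pandas as pd

df = pd.DataFrame({
    "timestamp": ["2023-01-02T15:04:05Z", "2023-01-02T15:04:06Z"],  # rfc3339 timestamps
    "voltage": [3.29, 3.31],        # telemetry Channel
    "temperature": [21.4, 21.6],    # telemetry Channel
})
df.to_csv("telemetry.csv", index=False)  # index=False avoids writing an extra index column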

Parquet

Format

Sift supports Parquet files with a flat schema, where each telemetry Channel is represented as an individual column.

| Requirement | Description |
| --- | --- |
| Timestamp column | One column must contain timestamps. |
| Channel columns | Each additional column is treated as a telemetry Channel. |
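
As a sketch, a flat-schema Parquet file could be produced with pyarrow as shown below; the channel names and values are hypothetical.

# Sketch: write a Parquet file with a flat schema -- a timestamp column plus
# one column per telemetry Channel.
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({
    "timestamp": pa.array([1672671845000000000, 1672671846000000000], type=pa.int64()),  # unix_nanos
    "voltage": pa.array([3.29, 3.31], type=pa.float64()),
    "temperature": pa.array([21.4, 21.6], type=pa.float64()),
})
pq.write_table(table, "telemetry.parquet")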

Nested columns

When importing a Parquet file with nested columns (such as hierarchical structures or columns containing objects), Sift automatically flattens these columns into a flat schema. Nested fields are converted to individual columns using dot notation to represent their hierarchy.

For example, a nested column like location with two subfields, lat and lon, will appear as separate Channels named location.lat and location.lon. This flattening ensures that all data is accessible and can be analyzed as standard time-series Channels, regardless of the original file's complexity.
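
As a sketch (using pyarrow; the names and values are hypothetical), the file below contains a nested location column, which Sift would flatten into the Channels location.lat and location.lon on import.

# Sketch: a Parquet file with a nested struct column. On import, Sift flattens
# "location" into the Channels "location.lat" and "location.lon".
import pyarrow as pa
import pyarrow.parquet as pq

location = pa.array(
    [{"lat": 34.05, "lon": -118.24}, {"lat": 34.06, "lon": -118.25}],
    type=pa.struct([("lat", pa.float64()), ("lon", pa.float64())]),
)
table = pa.table({
    "timestamp": pa.array([1672671845, 1672671846], type=pa.int64()),  # unix seconds
    "location": location,
})
pq.write_table(table, "nested.parquet")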

Complex types

When importing a Parquet file, you can select one of the following modes to determine how lists and maps (complex types) are handled:

| Mode | Description |
| --- | --- |
| Both (default) | Complex types are imported as both Arrow bytes and JSON strings. |
| Bytes | Complex types are imported as Arrow bytes only. |
| String | Complex types are imported as JSON strings only. |
| Ignore | Complex types are skipped and not ingested. |
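
For instance, a Parquet file containing a list column, as sketched below with pyarrow (names and values are hypothetical), would be handled according to the selected mode.

# Sketch: a Parquet file with a list column, which Sift treats as a complex type.
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({
    "timestamp": pa.array([1672671845, 1672671846], type=pa.int64()),  # unix seconds
    "readings": pa.array([[1.0, 2.0, 3.0], [4.0, 5.0]], type=pa.list_(pa.float64())),
})
pq.write_table(table, "complex.parquet")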

Working with strings

When complex types are imported as strings, the Channel name will automatically have .json appended to it to avoid naming conflicts.
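
For example, if a list Channel named readings were imported as a JSON string, the resulting Channel would be named readings.json, and its values can be decoded with any JSON parser. A minimal sketch with a hypothetical value:

# Sketch: decode a complex-type value that was ingested as a JSON string.
import json

value = "[1.0, 2.0, 3.0]"     # hypothetical value read from the "readings.json" Channel
readings = json.loads(value)  # -> [1.0, 2.0, 3.0]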

Working with bytes

When querying complex types using Sift's API, you can use any available Arrow library to read the raw bytes and recover the original values. Each Sift "data point" for a given timestamp is an Arrow record with one column and one row. Here's a Python example for converting that into a usable value:

import io

import pyarrow as pa

def arrow_bytes_to_list(data):
    """Converts an Arrow-formatted byte stream into structured data."""
    reader = pa.ipc.open_stream(io.BytesIO(data))  # wrap the raw bytes in an Arrow IPC stream reader
    batch = reader.read_next_batch()               # one record batch: one column, one row
    return batch.column(0).to_pylist()             # convert the column to native Python values
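
As a usage sketch, the snippet below builds a single-column, single-row record batch, serializes it to Arrow IPC bytes, and round-trips it through the helper above. The serialization step only simulates the bytes a Sift query would return; it is not part of Sift's API.

# Sketch: simulate the bytes returned for one data point and decode them.
batch = pa.RecordBatch.from_pydict({"readings": [[1.0, 2.0, 3.0]]})  # one column, one row

sink = io.BytesIO()
with pa.ipc.new_stream(sink, batch.schema) as writer:
    writer.write_batch(batch)

print(arrow_bytes_to_list(sink.getvalue()))  # -> [[1.0, 2.0, 3.0]]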
