Import
Overview
In the UI, Sift allows you to import certain files for analysis.
File types
You can import certain file types directly through the web interface. This is the fastest way to bring data into Sift for review, visualization, or analysis without any setup or code. The following file types can be imported through the UI:
File type | Details |
---|---|
.csv | Learn more. |
.tdms | |
.parquet | Learn more. |
General configuration
After uploading a supported file type, Sift requires basic configuration before the data can be used. The following general configuration settings may apply depending on the file type being imported.
Setting | Description | Applies to |
---|---|---|
Asset | This setting defines the system that generated the data. You can select an existing Asset or create a new one. | CSV, Parquet |
Run | This setting specifies a data collection session. It defaults to the file name but can be edited. | CSV, Parquet |
First data row | This setting identifies the row number where time-series data begins, allowing extra header rows to be skipped. | CSV |
Timestamp column | This setting designates the column containing timestamps. It is auto-detected but can be edited. | CSV, Parquet |
Timestamp format | This setting determines the format of the timestamp column. It must be one of the supported timestamp formats. | CSV |
Complex types import mode | This setting controls how complex types such as lists and maps are imported. They can be ingested as bytes, JSON strings, both formats, or skipped entirely. Learn more. | Parquet |
Timestamp formats
The following table lists the timestamp formats supported by Sift. During a supported file type import, the selected format must match the structure of the values in the timestamp column.
Format | Description |
---|---|
rfc3339 | 2023-01-02T15:04:05Z |
datetime | 2023-01-02 15:04:05 |
unix | Seconds since epoch |
unix_millis | Milliseconds since epoch |
unix_micros | Microseconds since epoch |
unix_nanos | Nanoseconds since epoch |
nanoseconds | Relative time in nanoseconds |
microseconds | Relative time in microseconds |
milliseconds | Relative time in milliseconds |
seconds | Relative time in seconds |
minutes | Relative time in minutes |
hours | Relative time in hours |
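To make the difference between the formats concrete, the following is a minimal sketch that normalizes a value in any of the formats above to integer nanoseconds. It is not Sift's implementation. The function name `to_nanoseconds` is hypothetical, and treating `datetime` values without a time zone as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Nanoseconds per unit for the numeric formats. Epoch-based formats count
# from 1970-01-01 UTC; relative formats count from an arbitrary start.
UNIT_NS = {
    "unix": 1_000_000_000,
    "unix_millis": 1_000_000,
    "unix_micros": 1_000,
    "unix_nanos": 1,
    "nanoseconds": 1,
    "microseconds": 1_000,
    "milliseconds": 1_000_000,
    "seconds": 1_000_000_000,
    "minutes": 60 * 1_000_000_000,
    "hours": 3_600 * 1_000_000_000,
}


def to_nanoseconds(value, fmt):
    """Convert a timestamp string in the given format to integer nanoseconds."""
    fmt = fmt.lower()
    if fmt in ("rfc3339", "datetime"):
        if fmt == "rfc3339":
            # Older Python versions reject a trailing "Z" in fromisoformat().
            dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
        else:
            dt = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        if dt.tzinfo is None:
            # Assumption: values without a time zone are UTC.
            dt = dt.replace(tzinfo=timezone.utc)
        return (dt - EPOCH) // timedelta(microseconds=1) * 1000
    if fmt in UNIT_NS:
        # Decimal avoids floating-point rounding for fractional values.
        return int(Decimal(value) * UNIT_NS[fmt])
    raise ValueError(f"unsupported timestamp format: {fmt}")


absolute_ns = to_nanoseconds("2023-01-02T15:04:05Z", "rfc3339")
relative_ns = to_nanoseconds("1.5", "seconds")
```

In this sketch, the two absolute examples from the table (`2023-01-02T15:04:05Z` and `2023-01-02 15:04:05`) map to the same instant, while relative values such as `1.5` seconds are only offsets and carry no calendar date.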
Channel configuration
During a supported file type import, each non-timestamp column is treated as a Channel representing a stream of time-series data. Sift displays a configuration table where each Channel can be reviewed and customized before import.
Column | Description |
---|---|
Checkbox | Determines whether the Channel is included in the import. |
Name | Autodetected Channel name from the source file. Editable if incorrect. |
Data type | Autodetected supported data type. Editable if incorrect. |
Units | Optional. Unit of the Channel (for example, °C ). |
Description | Optional. Description of the Channel. |
Data type
During a supported file type import (in the UI), each Channel must be assigned a data type. The following data types are supported:
Data type | Description |
---|---|
double | 64-bit floating point number |
float | 32-bit floating point number |
int32 | 32-bit signed integer |
uint32 | 32-bit unsigned integer |
int64 | 64-bit signed integer |
uint64 | 64-bit unsigned integer |
bytes | Sequence of raw bytes |
bool | Boolean value (true or false) |
string | Text or alphanumeric values |
enum | Categorical value with a fixed set of possible strings |
bit field | Integer where each bit represents a distinct flag or condition |
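As an illustration of the bit field type, the sketch below decodes an integer into the names of the flags that are set. The flag layout and the function name `decode_bit_field` are hypothetical and are not defined by Sift.

```python
# Hypothetical flag layout: bit position -> flag name.
FLAGS = {0: "armed", 1: "fault", 2: "low_battery"}


def decode_bit_field(value, flags):
    """Return the names of the flags whose bit is set in value."""
    return [name for bit, name in sorted(flags.items()) if (value >> bit) & 1]


active = decode_bit_field(0b101, FLAGS)  # bits 0 and 2 are set
```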
CSV
Format
A CSV file imported into Sift must include a header row, a timestamp column, and one or more telemetry Channels, each formatted according to supported conventions.
Requirement | Description |
---|---|
First row | It must contain column headers. |
Timestamp column | One column must contain timestamps, and the recommended name for this column is timestamp . |
Data columns | All other columns are treated as telemetry Channels. |
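The requirements above can be sketched with Python's standard `csv` module. This is an illustrative example only: the helper `build_csv` and the Channel names `temperature` and `voltage` are hypothetical, and the timestamps use the rfc3339 format.

```python
import csv
import io


def build_csv(rows, channels):
    """Return CSV text with a header row, a timestamp column, and one column per Channel."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # First row: column headers, with the recommended name for the timestamp column.
    writer.writerow(["timestamp", *channels])
    for ts, values in rows:
        writer.writerow([ts, *values])
    return buf.getvalue()


text = build_csv(
    [
        ("2023-01-02T15:04:05Z", (21.5, 3.30)),
        ("2023-01-02T15:04:06Z", (21.7, 3.29)),
    ],
    ["temperature", "voltage"],
)
```

Writing `text` to a `.csv` file gives a file whose first data row is row 2, so no extra header rows need to be skipped.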
Parquet
Format
Sift supports Parquet files with a flat schema, where each telemetry Channel is represented as an individual column.
Requirement | Description |
---|---|
Timestamp column | One column must contain timestamps. |
Channel columns | Each additional column is treated as a telemetry Channel. |
Nested columns
When importing a Parquet file with nested columns (such as hierarchical structures or columns containing objects), Sift automatically flattens these columns into a flat schema. Nested fields are converted to individual columns using dot notation to represent their hierarchy.
For example, a nested column like location with two subfields, lat and lon, will appear as separate Channels named location.lat and location.lon.
This flattening ensures that all data is accessible and can be analyzed as standard time-series Channels, regardless of the original file's complexity.
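The dot-notation naming can be sketched as follows. This shows only the naming convention described above, not Sift's actual flattening code, and the function name `flatten` and the sample values are hypothetical.

```python
def flatten(record, prefix=""):
    """Flatten nested dictionaries into a single level using dot-separated names."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested fields, carrying the parent name as a prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


row = {
    "timestamp": "2023-01-02T15:04:05Z",
    "location": {"lat": 37.77, "lon": -122.42},
}
flat_row = flatten(row)
```

Here `flat_row` has the keys `timestamp`, `location.lat`, and `location.lon`.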
Complex types
When importing a Parquet file, you can select one of the supported modes to determine how lists or maps (complex types) are handled. Supported modes:
Mode | Description |
---|---|
Both (default) | Complex types are imported as both Arrow bytes and JSON strings. |
Bytes | Complex types are imported as Arrow bytes only. |
String | Complex types are imported as JSON strings only. |
Ignore | Complex types are skipped and not ingested. |
Working with strings
When complex types are imported as strings, the Channel name will automatically have .json appended to it to avoid naming conflicts.
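A value read from such a Channel is a JSON string, so it can be decoded with any JSON parser. The following is a minimal sketch using Python's standard `json` module. The helper `decode_json_point`, the Channel name `waypoints.json`, and the sample value are hypothetical.

```python
import json

JSON_SUFFIX = ".json"


def decode_json_point(channel_name, raw):
    """Return (original column name, decoded value) for a string-mode data point."""
    if not channel_name.endswith(JSON_SUFFIX):
        raise ValueError(f"{channel_name!r} is not a .json Channel")
    # Strip the appended suffix to recover the source column name.
    column = channel_name[: -len(JSON_SUFFIX)]
    return column, json.loads(raw)


column, value = decode_json_point("waypoints.json", "[1, 2, 3]")
```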
Working with bytes
When querying complex types using Sift's API, you can use any available Arrow library to read the raw bytes and recover the original values. Each Sift "data point" for a given timestamp is an Arrow record with one column and one row. Here's a Python example for converting that into a usable value: