Sending Data

Currently the protocol buffers involved in protobuf ingestion are not available in the public repository. To retrieve the protobufs required for protobuf ingestion, see the next section.

This toolset enables sending protobuf messages to Sift, so that data from each field is ingested into a corresponding channel.

Channels are created based on the field paths defined in the protobuf message.

Example

Given this protobuf message:

enum VehicleState {
    Started = 0;
    Stopping = 1;
    Stopped = 2;
}
message Vehicle {
    float velocity = 3;
    repeated float direction = 5;
    VehicleState vehicle_state = 7;
    PropulsionSubsystem propulsion = 12;
    map<string, BatterySystem> batteries = 13;
}
// Field numbers in the two messages below are illustrative.
message PropulsionSubsystem {
    float fuel_level = 1;
}
message BatterySystem {
    float voltage = 1;
    float temp = 2;
}

The listing of channel names generated would be the following:

  • Vehicle.velocity
  • Vehicle.direction[0]
  • Vehicle.direction[1] (a channel will be created for each index of an array)
  • Vehicle.vehicle_state
  • Vehicle.propulsion.fuel_level
  • Vehicle.batteries[cpu].voltage (a channel will be created for each key of a map)
  • Vehicle.batteries[cpu].temp
  • Vehicle.batteries[propulsion].voltage
  • Vehicle.batteries[propulsion].temp
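The naming rules above can be sketched in a few lines of Python. This is only an illustration of the path convention (dots for nested messages, brackets for array indexes and map keys), not Sift's implementation. The `ProtoMap` marker class, the `channel_names` helper, and the sample values are hypothetical.

```python
from typing import Any, Iterator


class ProtoMap(dict):
    """Hypothetical marker: a dict that stands for a protobuf map field."""


def channel_names(prefix: str, value: Any) -> Iterator[str]:
    """Yield one channel name per leaf value under `prefix`."""
    if isinstance(value, ProtoMap):
        # Map fields: one bracketed segment per key.
        for key, item in value.items():
            yield from channel_names(f"{prefix}[{key}]", item)
    elif isinstance(value, dict):
        # Nested messages: dot-separated field names.
        for name, item in value.items():
            yield from channel_names(f"{prefix}.{name}", item)
    elif isinstance(value, list):
        # Repeated fields: one bracketed segment per index.
        for index, item in enumerate(value):
            yield from channel_names(f"{prefix}[{index}]", item)
    else:
        yield prefix


# Sample values standing in for a decoded Vehicle message.
vehicle = {
    "velocity": 12.5,
    "direction": [0.0, 1.0],
    "vehicle_state": "Started",
    "propulsion": {"fuel_level": 0.8},
    "batteries": ProtoMap(
        {
            "cpu": {"voltage": 3.3, "temp": 40.0},
            "propulsion": {"voltage": 48.0, "temp": 55.0},
        }
    ),
}

names = list(channel_names("Vehicle", vehicle))
for name in names:
    print(name)
```

With these sample values, the printed names match the listing above.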

Ingesting protobuf messages takes two steps: send the compiled protobuf descriptor to the schema registration endpoint, then start streaming protobuf messages. Optionally, you can add custom Sift options to the protobuf messages to attach metadata to the channels.

Schema Registration

The gRPC service ProtobufDescriptorService includes an RPC called AddProtobufDescriptor, which registers a protobuf message for ingestion.

Registration Process

  • Registration is performed by message type and namespace, allowing for schema separation between different systems.
  • The namespace is provided both when saving descriptors and when ingesting messages. A major use case for namespaces is separating environments: if a developer wants to iterate on the protobuf definitions without affecting live telemetry, a new namespace can segregate those protobuf descriptors and ingestion requests.

Request Structure

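  • The request takes a file_descriptor_set field, which should be generated by compiling the protobuf. A simple way to get this is to compile with protoc, passing the .proto file to compile as the final argument (your_message.proto below is a placeholder):
    protoc --include_imports --descriptor_set_out=descriptor.output -I /path/to/protofile /path/to/protofile/your_message.proto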

Versioning

  • Adding multiple ProtobufDescriptors with the same message type and namespace stores a new version of the descriptor each time. This is useful for managing multiple versions of the same message, especially when newer versions reserve fields that were used by older versions.
  • When a message is ingested, all stored protobuf descriptor sets for that message type and namespace will be used to generate channel values. The unique set of channels generated will then be ingested.
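To make the namespace and versioning behavior concrete, here is a minimal in-memory sketch. It assumes each descriptor version can be reduced to a mapping from field number to field name, and each incoming message to a mapping from field number to value. The `DescriptorRegistrySketch` class and the field names `heading`, `altitude`, and `experimental_gain` are hypothetical, and the real service may resolve channels differently.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

FieldsByNumber = Dict[int, str]


class DescriptorRegistrySketch:
    """Toy registry keyed by (namespace, message type), one entry per version."""

    def __init__(self) -> None:
        self._versions: Dict[Tuple[str, str], List[FieldsByNumber]] = defaultdict(list)

    def add(self, namespace: str, message_type: str, fields: FieldsByNumber) -> None:
        # Each add stores a new version instead of overwriting the old one.
        self._versions[(namespace, message_type)].append(dict(fields))

    def channels(
        self, namespace: str, message_type: str, wire_fields: Dict[int, Any]
    ) -> Dict[str, Any]:
        # Decode against every stored version and keep the unique set of channels.
        channels: Dict[str, Any] = {}
        for fields in self._versions.get((namespace, message_type), []):
            for number, value in wire_fields.items():
                name = fields.get(number)
                if name is not None:  # unknown or reserved in this version
                    channels[f"{message_type}.{name}"] = value
        return channels


registry = DescriptorRegistrySketch()
registry.add("prod", "Vehicle", {1: "velocity", 2: "heading"})   # version 1
registry.add("prod", "Vehicle", {1: "velocity", 3: "altitude"})  # version 2 reserves field 2
registry.add("dev", "Vehicle", {1: "velocity", 4: "experimental_gain"})

old_producer = registry.channels("prod", "Vehicle", {1: 10.0, 2: 90.0})
new_producer = registry.channels("prod", "Vehicle", {1: 12.0, 3: 500.0})
dev_only = registry.channels("dev", "Vehicle", {1: 5.0, 4: 0.2})
```

In this sketch, a producer still sending field 2 keeps its `Vehicle.heading` channel because version 1 is still stored, while the `dev` namespace never sees the `prod` descriptors.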

Deletion

  • DeleteProtobufDescriptors will remove all descriptors for the message type and namespace. This might be useful for:
    • Cleaning up test protobuf messages that are no longer in use.
    • Simplifying ingestion when a new version of the protobuf message is fully backward compatible. Deleting existing descriptors and adding the latest descriptor can streamline the ingestion process.
  • New protobuf descriptors must be backward compatible with existing protobuf descriptors that have the same message_type_full_name and are in the same namespace.
  • A protobuf descriptor violates backward compatibility when the name of an existing field number changes, for example when field velocity with number 1 is renamed to speed.

When a new protobuf descriptor is not backward compatible, you will receive an error message such as:

incompatible protobuf descriptors found. please delete the following protobuf descriptors to 
successfully add the new descriptor protobuf descriptor ID: 38cd9974-3352-4d5d-afee-05bae1419074, 
message type full name: vehicle, most recent field name: speed, previous field name: velocity, 
field number: 1

To resolve this error and successfully add the new protobuf descriptor, you must either make the protobuf backward compatible by adding a new field (for example, field speed with number 2) and reserving the old field, or delete the old protobuf descriptor with the ID provided in the error message.
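The rename rule can be illustrated with a small check. This sketch covers only the rule stated in this section (a field number whose name changes); the actual service may apply further checks. The `find_renamed_fields` helper is hypothetical, and descriptors are again reduced to a mapping from field number to field name.

```python
from typing import Dict, List, Tuple


def find_renamed_fields(
    existing: Dict[int, str], new: Dict[int, str]
) -> List[Tuple[int, str, str]]:
    """Return (field number, previous name, most recent name) for each rename."""
    conflicts = []
    for number, previous_name in existing.items():
        latest_name = new.get(number)
        # A number missing from `new` is treated as reserved, which is allowed.
        if latest_name is not None and latest_name != previous_name:
            conflicts.append((number, previous_name, latest_name))
    return conflicts


existing = {1: "velocity"}

# Renaming field 1 in place is reported as a conflict.
renamed = find_renamed_fields(existing, {1: "speed"})

# Reserving field 1 and adding speed as field 2 is not.
reserved = find_renamed_fields(existing, {2: "speed"})
```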

AddProtobufDescriptor

Field                   Type    Description
message_type_full_name  string  The fully qualified message type name, including the package
namespace               string  The namespace for this message
file_descriptor_set     bytes   The compiled bytes of the file descriptor set for the protobuf message
proto_file_name         string  The filename containing the protobuf message

DeleteProtobufDescriptors

Field                   Type    Description
message_type_full_name  string  The fully qualified message type name, including the package
namespace               string  The namespace for this message

Data Ingestion

Once the schema has been registered, the protobuf message can be sent to the IngestProtobuf endpoint.

The gRPC service DataIngestionService has an RPC called IngestProtobuf, which takes a stream of ingestion request messages.

IngestProtobufRequest

Field                      Type                       Description
message_type_identifier    string                     The fully qualified message type name, including the package
namespace                  string                     The namespace of the protobuf message
message_type_display_name  string                     (Optional) This will replace the message name in the channel name. It can be useful if the same message is reused by different sources
asset_name                 string                     The name of the asset that is the source of this telemetry
timestamp                  google.protobuf.Timestamp  The timestamp of the telemetry points
value                      bytes                      The serialized bytes of the protobuf message
run_id                     string                     (Optional) The ID of the run that is associated with this telemetry point
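As a rough sketch of how a request stream might be assembled, the following uses a plain dataclass as a stand-in for the generated IngestProtobufRequest class, which is not shown in this document. The `IngestRequestSketch` and `request_stream` names, the asset name, and the payload bytes are all hypothetical. In real code, `value` would hold the serialized bytes of your message and `timestamp` would be a google.protobuf.Timestamp.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Iterator, Optional, Tuple


@dataclass
class IngestRequestSketch:
    """Stand-in mirroring the fields in the table above."""

    message_type_identifier: str
    namespace: str
    asset_name: str
    timestamp: datetime  # the real field is a google.protobuf.Timestamp
    value: bytes         # serialized protobuf message
    message_type_display_name: Optional[str] = None
    run_id: Optional[str] = None


def request_stream(
    samples: Iterable[Tuple[datetime, bytes]],
    *,
    message_type: str,
    namespace: str,
    asset_name: str,
    run_id: Optional[str] = None,
) -> Iterator[IngestRequestSketch]:
    """Yield one request per (timestamp, serialized message) sample."""
    for timestamp, payload in samples:
        yield IngestRequestSketch(
            message_type_identifier=message_type,
            namespace=namespace,
            asset_name=asset_name,
            timestamp=timestamp,
            value=payload,
            run_id=run_id,
        )


# Placeholder payloads; these are not real serialized Vehicle messages.
samples = [
    (datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc), b"payload-1"),
    (datetime(2024, 1, 1, 0, 0, 1, tzinfo=timezone.utc), b"payload-2"),
]

requests = list(
    request_stream(
        samples, message_type="Vehicle", namespace="dev", asset_name="test-vehicle"
    )
)
```

A generator fits a streaming RPC well because requests are produced lazily as samples arrive. The namespace and message type must match the ones used at registration.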

Sift Protobuf Options

To allow for more descriptive channels, additional options can be added to the protobuf definition. The proto file channel_parsing_options.proto contains the custom message and field options that add this metadata to the message's generated channels.

  • Units & Description - These options can be added to primitive fields so that the units and description are available when the channel is displayed
  • Bytes Decoding - This option can be used to interpret a bytes typed field as other types like utf-8
  • Tags - This can be helpful when a protobuf field path needs additional context from the data to be unique
  • Map Key Display Overrides - These options are similar to tags except that they will replace the display value of a map key. This can be useful when the keys are not human-readable or have transient values

Tagging example

Given the following proto messages:

message TestProto {
    option (azimuth.message_is_tag_target) = true;
    string name = 1;
    TestChild primary_child = 2 [(azimuth.tag_target).allowed_tag_source=DESCENDANT_AND_SIBLING_SOURCES];
    repeated TestChild array_of_children = 3;
    int32 type_id = 4 [(azimuth.tag_source).allowed_tag_target=ANCESTOR_TARGETS];
}
message TestChild {
    string child_name = 1 [(azimuth.tag_source).allowed_tag_target=ANCESTOR_AND_SIBLING_TARGETS,
    (azimuth.tag_source).tag_name="kid_name"];
    map<int32, NestedChild> map_int_to_message = 4;
}
message NestedChild {
    string sub_child_name = 1 [(azimuth.tag_target).allowed_tag_source=SIBLING_SOURCES];
    int32 id = 2  [(azimuth.tag_source).allowed_tag_target=SIBLING_TARGETS];
}

We would expect the channels WITHOUT tags to be (assuming some basic values for the map keys and array length):

  • TestProto.name
  • TestProto.primary_child.child_name
  • TestProto.primary_child.map_int_to_message[1].sub_child_name
  • TestProto.primary_child.map_int_to_message[1].id
  • TestProto.array_of_children[0].child_name
  • TestProto.array_of_children[0].map_int_to_message[1].sub_child_name
  • TestProto.array_of_children[0].map_int_to_message[1].id
  • TestProto.type_id

We would expect the channels WITH tags to be (assuming the same values for map keys and array length and some basic values for tag source fields):

  • TestProto(type_id:3).name
  • TestProto(type_id:3).primary_child(kid_name:childname).child_name
  • TestProto(type_id:3).primary_child(kid_name:childname).map_int_to_message[1].sub_child_name(id:35)
  • TestProto(type_id:3).primary_child(kid_name:childname).map_int_to_message[1].id
  • TestProto(type_id:3).array_of_children[0].child_name
  • TestProto(type_id:3).array_of_children[0].map_int_to_message[1].sub_child_name(id:35)
  • TestProto(type_id:3).array_of_children[0].map_int_to_message[1].id
  • TestProto(type_id:3).type_id

Multiple tag values will be displayed as field_name(field_1:value_1)(field_2:value_2).
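The display format for tags can be sketched as a small string helper. The `with_tags` function is hypothetical and only reproduces the notation shown in this section. It does not model how tag sources are matched to tag targets.

```python
from typing import Any, Iterable, Tuple


def with_tags(field_name: str, tags: Iterable[Tuple[str, Any]]) -> str:
    """Append one (name:value) group per tag, in the order given."""
    return field_name + "".join(f"({name}:{value})" for name, value in tags)


single = with_tags("TestProto", [("type_id", 3)])
multiple = with_tags("field_name", [("field_1", "value_1"), ("field_2", "value_2")])
untagged = with_tags("name", [])

print(single)
print(multiple)
```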

Refer to the Channel Parsing Options Documentation for more detailed descriptions of how these options are applied.

Map Key & Array Index Override example

Given the following proto messages:

enum TestEnum {
  NONE = 0;
  SINGLE = 1;
  DOUBLE = 2;
}
 
message TestProto {
    string name = 1;
 
    map<int32, MapKeyTester> map_key_test = 2[(azimuth.map_key_override_type)=MAP_KEY_OVERRIDE_TARGET];
    map<int32, string> map_key_removal_test = 3[(azimuth.map_key_override_type)=MAP_KEY_OVERRIDE_REMOVE_KEY];
    map<int32, string> map_key_enum_test = 4[(azimuth.map_key_override_type)=MAP_KEY_OVERRIDE_ENUM, (azimuth.display_override_enum)="TestEnum"];
 
    repeated ArrayIndexTester array_index_override_test = 5[(azimuth.array_index_override_type)=ARRAY_INDEX_OVERRIDE_TARGET];
    repeated string array_index_override_remove_index = 6[(azimuth.array_index_override_type)=ARRAY_INDEX_OVERRIDE_REMOVE_INDEX];
    repeated string array_index_enum_test = 7[(azimuth.array_index_override_type)=ARRAY_INDEX_OVERRIDE_ENUM, (azimuth.display_override_enum)="TestEnum"];
}
 
message MapKeyTester {
  string new_key = 1[(azimuth.map_key_override_type)=MAP_KEY_OVERRIDE_SOURCE];
  float some_value = 2;
}
 
message ArrayIndexTester {
  string new_index = 1[(azimuth.array_index_override_type)=ARRAY_INDEX_OVERRIDE_SOURCE];
  float other_value = 2;
}

We would expect the channels WITHOUT map key or array index overrides to be (assuming 0 is the only map key and the only array index):

  • TestProto.name
  • TestProto.map_key_test[0].new_key
  • TestProto.map_key_test[0].some_value
  • TestProto.map_key_removal_test[0]
  • TestProto.map_key_enum_test[0]
  • TestProto.array_index_override_test[0].new_index
  • TestProto.array_index_override_test[0].other_value
  • TestProto.array_index_override_remove_index[0]
  • TestProto.array_index_enum_test[0]

We would expect the channels WITH the map key or array index overrides to be (assuming the value of new_key is my-new-key and the value of new_index is my-new-index):

  • TestProto.name
  • TestProto.map_key_test[my-new-key].new_key
  • TestProto.map_key_test[my-new-key].some_value
  • TestProto.map_key_removal_test
  • TestProto.map_key_enum_test[NONE]
  • TestProto.array_index_override_test[my-new-index].new_index
  • TestProto.array_index_override_test[my-new-index].other_value
  • TestProto.array_index_override_remove_index
  • TestProto.array_index_enum_test[NONE]

If multiple override sources apply to the target, the latest one is used and an error is logged.
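The override behavior in this example can be sketched as follows. The `display_key` helper and its parameters are hypothetical, and "latest" is assumed here to mean the last override source encountered. The enum mapping mirrors TestEnum from the example above.

```python
import logging
from typing import Dict, Optional, Sequence, Union

logger = logging.getLogger("override_sketch")

TEST_ENUM = {0: "NONE", 1: "SINGLE", 2: "DOUBLE"}


def display_key(
    raw_key: Union[int, str],
    override_sources: Sequence[str] = (),
    remove: bool = False,
    enum_names: Optional[Dict[int, str]] = None,
) -> str:
    """Return the bracketed suffix shown for a map key or array index."""
    if remove:
        # REMOVE_KEY / REMOVE_INDEX: drop the bracketed segment entirely.
        return ""
    if override_sources:
        if len(override_sources) > 1:
            # Assumption: the last source wins, and the conflict is logged.
            logger.error(
                "multiple override sources %r; using the latest", list(override_sources)
            )
        return f"[{override_sources[-1]}]"
    if enum_names is not None:
        # ENUM override: show the enum value name instead of the number.
        return f"[{enum_names[raw_key]}]"
    return f"[{raw_key}]"


target = "TestProto.map_key_test" + display_key(0, ["my-new-key"]) + ".some_value"
removed = "TestProto.map_key_removal_test" + display_key(0, remove=True)
as_enum = "TestProto.array_index_enum_test" + display_key(0, enum_names=TEST_ENUM)
conflict = display_key(0, ["first", "second"])  # logs an error, keeps "second"
```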

Refer to the Channel Parsing Options Documentation for more detailed descriptions of how these options are applied.
