Google Pub/Sub Producer
This operator receives a message from the input port and publishes it to a Google Pub/Sub topic.
- The provided Topic is the default topic that received messages will be published to. Alternatively, if a gcp.pubsub.topicName attribute exists in the message, the value of this attribute is used as topic name and the message is published to this topic rather than Topic. This attribute will not be removed by the producer upon publish. If Topic is empty and the producer cannot get the topic name from the message attributes, the producer fails regardless of the value of Fail on error.
- Create topic if it does not exist applies only to Topic, not topic names mentioned in the message attributes.
- If Fail on error is false, upon error, the producer only outputs the error message on Error output port and does not fail the graph. The message to be published is discarded.
- Subscription name to create applies only to Topic and not topic names mentioned in the message attributes. If specified, the producer makes sure the subscription to Topic exists (if not, it creates the subscription) on Google Pub/Sub before processing input messages. Creating a subscription before publishing can ensure publications published to Topic are stored on Google Pub/Sub, even if at the time of publishing, no subscription to Topic exists. Google Pub/Sub does not guarantee storing undelivered publications when there are on matching subscriptions.
- If the number of publications received by the producer, but not yet published to the Google Pub/Sub service is more than Maximum outstanding publications, the producer does not process messages from messageToPublish input until one or more messages are finished publishing. Publishing a message is finished when the producer outputs its message ID on the publishedMessage output port, or an error occurs and the error is written to the error output port.
- Upon publish timeout, an error message is written to the Error output port and if Fail on error is true, the graph fails.
- The encoding of the received input message is stored in an attribute named message.encoding in the message published to the Google Pub/Sub service. The Google Pub/Sub consumer uses this attribute to reconstruct the same DataHub message.
- As Google Pub/Sub supports only string as key/value type of attributes, any attribute included in the input message to this operator must have strings as key and value.
- In order to improve the throughput of the operator, publications are batched. The value of Publication batch size must be smaller than Maximum outstanding publications. This is checked by the the producer upon initialization.
Configuration Parameters
Parameter | Type | Description |
---|---|---|
Connection | object | Mandatory. A Google Pub/Sub connection consisting of a GCP project ID and a JSON key file to access the Pub/Sub service. |
Topic | string | Google Pub/Sub topic name. |
Create topic if it does not exist | boolean | Whether to create Topic if it does not
exist on the given Google Pub/Sub project.
Default: true |
Fail on error | boolean | Whether to fail the whole graph if the producer encounters an
error at runtime.
Default: false |
Subscription name to create | string | Ensure that the given subscription name exists for Topic. |
Maximum outstanding publications | integer | Maximum number of publications that are submitted to the
operator but are not finished being published.
Default: 1000 |
Publish timeout (seconds) | integer | Maximum timeout in seconds for publishing a
publication.
Default: 60 |
Publication batch size | integer | Minimum number of publications that the producer needs to
receive before sending a batch of publications to the Google
Pub/Sub service.
Default: 100 |
Delay between retrying failed publications (milliseconds) | integer | Time to wait before retrying a failed publication.
Default: 1000 |
Number of retry attempts for failed publications | integer | Maximum number of retries for a failed publication. Upon
reaching this limit, an error is generated and handled according
to the value of Fail on error.
Default: 5 |
Input
Input | Type | Description |
---|---|---|
messageToPublish | message | The stream of DataHub messages to publish. |
Output
Output | Type | Description |
---|---|---|
publishedMessage | message | Upon successful publication, Google Pub/Sub assigns a unique ID to each publication. The producer outputs these IDs as a steam of messages on this port, with the message ID as value of the attribute msg.MessageID. |
error | message | Error messages occurring during runtime are output on this port as a stream of message (one for each error). The error message has the attribute msg.KeyError with the value true and for errors relating to publishing a message, the original message can be retrieved using the attribute originalMessage. The body of the message is the error message. |