Livy Spark Submit
The Livy Spark Submitter operator submits jobs to a cluster using the Livy REST API. It has two modes: jar and snippet. Once a mode is chosen, the configuration tab shows only the parameters relevant to that mode.
In jar mode, you submit an application in much the same way as with spark-submit. The operator succeeds only if the underlying job finishes successfully. For instance, if a jar file is submitted to YARN, the operator status matches the application status in YARN. Note that the jar file must be accessible to Livy.
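As a rough illustration of what a jar-mode submission corresponds to on the Livy side, the sketch below builds a JSON body for Livy's `POST /batches` endpoint from the jar-mode parameters. The mapping from operator parameters to Livy fields (`jarPath` to `file`, `batchName` to `name`, and so on) is an assumption made for illustration, and `build_batch_payload` is a hypothetical helper, not part of the operator.

```python
import json


def build_batch_payload(jar_path, class_name, args="", jars="", conf="{}",
                        proxy_user="hdfs", batch_name=None):
    """Build a request body for Livy's POST /batches endpoint.

    Hypothetical helper: the mapping from operator parameters to Livy
    fields is an assumption, not the operator's documented behavior.
    """
    payload = {
        "file": jar_path,         # jar to run; must be readable by Livy
        "className": class_name,  # main class inside the jar
        "proxyUser": proxy_user,  # user to impersonate
    }
    if batch_name:
        payload["name"] = batch_name
    # Comma-separated strings become JSON lists; empty entries are dropped.
    arg_list = [a.strip() for a in args.split(",") if a.strip()]
    jar_list = [j.strip() for j in jars.split(",") if j.strip()]
    if arg_list:
        payload["args"] = arg_list
    if jar_list:
        payload["jars"] = jar_list
    # The conf parameter is a JSON object given as a string.
    spark_conf = json.loads(conf)
    if spark_conf:
        payload["conf"] = spark_conf
    return payload


if __name__ == "__main__":
    body = build_batch_payload(
        jar_path="hdfs://path-to-jar",
        class_name="org.com.smth.className",
        args="arg1,arg2",
        conf='{"spark.executor.memory": "2g"}',
    )
    # This body would be sent as JSON to <livyEndpoint>/batches.
    print(json.dumps(body, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to inspect what would be sent before anything reaches the cluster.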
In snippet mode, code snippets are sent to a Livy session and the results are returned on the output port. This approach is very similar to using the Spark shell. Note that adding jars to sessions is subject to some limitations due to LIVY-327.
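For orientation, snippet mode corresponds roughly to two Livy REST calls: creating a session (`POST /sessions`) and then posting code to it (`POST /sessions/{id}/statements`). The sketch below only builds the request bodies. The helper names are hypothetical, and the way the operator maps `snippetType`, `sessionName` and `proxyUser` onto Livy fields is assumed rather than documented here.

```python
# Session kinds listed for the snippetType parameter.
VALID_KINDS = ("spark", "pyspark", "pyspark3", "sparkr")


def build_session_payload(snippet_type="spark", session_name=None,
                          proxy_user="hdfs"):
    """Body for POST /sessions (hypothetical helper, assumed mapping)."""
    if snippet_type not in VALID_KINDS:
        raise ValueError("unsupported snippetType: {!r}".format(snippet_type))
    payload = {"kind": snippet_type, "proxyUser": proxy_user}
    if session_name:
        payload["name"] = session_name
    return payload


def build_statement_request(session_id, snippet):
    """Return the path and body for submitting a snippet to a session."""
    path = "/sessions/{}/statements".format(session_id)
    return path, {"code": snippet}


if __name__ == "__main__":
    print(build_session_payload("pyspark", session_name="demo"))
    print(build_statement_request(0, "print(1 + 1)"))
```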
In strict mode (see strictSnippetExecutionMode), the operator checks whether the output of the snippet execution contains any errors. If it does, the operator fails.
In non-strict mode, the operator ignores errors that occur during snippet execution. Users therefore have to inspect the result themselves to determine whether the execution succeeded. This can be done by examining the job execution output, which is sent to the output port of the operator.
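The difference between the two modes can be sketched as follows. Livy reports the result of a statement in an `output` object whose `status` is `"ok"` or `"error"` (with `ename` and `evalue` describing the error). How the operator actually inspects the output is not specified here, so the check below is an assumption, and `SnippetExecutionError` and `handle_statement` are hypothetical names.

```python
class SnippetExecutionError(Exception):
    """Raised in strict mode when the snippet output reports an error."""


def handle_statement(statement, strict):
    """Return the statement output; in strict mode, fail on errors.

    `statement` is a dict shaped like a Livy statement response.
    """
    output = statement.get("output") or {}
    if strict and output.get("status") == "error":
        raise SnippetExecutionError(
            "{}: {}".format(output.get("ename"), output.get("evalue"))
        )
    # Non-strict mode (or no error): pass the output on unchanged and
    # leave it to the consumer to inspect it.
    return output


if __name__ == "__main__":
    failed = {"output": {"status": "error", "ename": "NameError",
                         "evalue": "name 'x' is not defined"}}
    print(handle_statement(failed, strict=False))
    try:
        handle_statement(failed, strict=True)
    except SnippetExecutionError as exc:
        print("strict mode failed:", exc)
```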
Configuration Parameters
Parameter | Type | Description |
---|---|---|
livyEndpoint | string | Mandatory. Defines the Livy endpoint to use, including the port number. If the Livy service cannot be reached, the operator fails during the initialization phase. Default: "http://livy-api-endpoint.com:8998" |
sourceType | string | Mandatory. Defines the type of job being submitted: "jar" for a jar file, "snippet" for a snippet of code. Default: "jar" |
errorHandlingMode | string | Mandatory. Defines the error handling mode. Default: "default" |
securityContext | string | Defines the security context to use. This parameter must be set to communicate with a secure Livy endpoint. |
proxyUser | string | User to impersonate when starting a session or running a job. It is also used as the value of the "X-Requested-By" header in HTTP requests to avoid being blocked by CSRF protection. If this value is empty, "X-Requested-By" is set to "hdfs". Default: "hdfs" |
jars | string | Jars to be used in this session or batch. Default: "jar1,jar2,..." |
conf | string | The value for the "--conf" argument of spark-submit. The configuration must be valid JSON wrapped in curly braces. Default: "{"key1":"value1","key2":"value2"}" |
accessToken | string | OAuth access token. Default: "" |
tlsRootCACert | string | The root certificate of the CA. This parameter is useful if you are using a proprietary CA to sign the server certificate for Livy. Default: "" |
tlsSkipVerify | bool | If set to true, certificate validation is disabled. Default: "false" |
batchName (For jar mode only) | string | The name of the batch. Default: "default batch name" |
jarPath (For jar mode only) | string | Mandatory. Path to the jar to be submitted. Default: "hdfs://path-to-jar" |
className (For jar mode only) | string | Mandatory. Name of the class to be executed in the jar. Default: "org.com.smth.className" |
args (For jar mode only) | string | Command line arguments for the application. Default: "arg1,arg2,..." |
sessionName (For snippet mode only) | string | The name of this session. Default: "default session name" |
snippet (For snippet mode only) | string | Mandatory. Snippet of code to be submitted. Default: "snippet of code" |
snippetType (For snippet mode only) | string | Mandatory. Defines the kind of session to create for snippet execution (the language of the snippet). Possible values: spark, pyspark, pyspark3 or sparkr. Default: "spark" |
strictSnippetExecutionMode (For snippet mode only) | bool | Mandatory. Defines whether the operator tolerates errors in the snippet execution output, i.e. switches between strict and non-strict snippet submission. Default: "false" |
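Two of the parameters above have formatting rules that are easy to get wrong: `conf` must be a JSON object wrapped in curly braces, and an empty `proxyUser` falls back to "hdfs" for the "X-Requested-By" header. The sketch below shows one way to check both before configuring the operator. `parse_conf` and `request_headers` are hypothetical helpers, and the `Content-Type` header is an assumption about how the request is sent.

```python
import json


def parse_conf(conf):
    """Validate the conf parameter and return it as a dict."""
    conf = conf.strip()
    if not (conf.startswith("{") and conf.endswith("}")):
        raise ValueError("conf must be a JSON object wrapped in curly braces")
    # json.loads raises a ValueError subclass on malformed JSON.
    return json.loads(conf)


def request_headers(proxy_user=""):
    """Headers for a Livy request; X-Requested-By defaults to "hdfs"."""
    return {
        "Content-Type": "application/json",  # assumed, not from the source
        "X-Requested-By": proxy_user or "hdfs",
    }


if __name__ == "__main__":
    print(parse_conf('{"key1":"value1","key2":"value2"}'))
    print(request_headers(""))
```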
Input
Input | Type | Description |
---|---|---|
inport | string | Accepts the path to the jar or the snippet of code to be submitted. An input signal triggers job submission. |
Output
Output | Type | Description |
---|---|---|
out | string | The following information will be sent to this port if the submitted job finishes successfully: |
error | string | If the submitted job fails and the Livy operator is using pipeline error handling mode, the error will be routed to this port. |