Customizing the error handling

In the previous section, you handled the errors in the simplest way possible. PDI offers several options that you can configure to customize the error handling.

On the one hand, PDI allows you to add new fields to your dataset describing the errors, namely:

  • Number of errors
  • Description of the errors
  • Name of the field(s) that caused the errors
  • Error code

You only configure the name of the field(s) that will contain these values. The values themselves are calculated and set by the tool. You don't define error descriptions and codes; they are internal to PDI.
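To make the idea concrete, the following Python sketch mimics how a row could be extended with error fields. It is only an illustration of the behavior described here, not PDI code: the function add_error_fields, the dictionary keys, and the placeholder error values are all hypothetical, and in PDI the actual values are produced by the tool.

```python
def add_error_fields(row, error, field_names):
    """Return a copy of row extended with the error values that were given a name."""
    out = dict(row)
    for kind, name in field_names.items():
        if name:  # error values without a configured field name are not added
            out[name] = error[kind]
    return out


row = {"project_name": "Project C", "start_date": "2017-01-15", "end_date": "???"}

# Hypothetical error values; in PDI these are calculated by the tool itself.
error = {
    "count": 1,
    "description": "couldn't convert string [???] to a date",
    "fields": "end_date",
    "code": "PLACEHOLDER_CODE",  # not a real PDI error code
}

# You only supply names; here a name is given just for the description.
field_names = {"count": None, "description": "error_desc", "fields": None, "code": None}

result = add_error_fields(row, error, field_names)
print(sorted(result))
```

In this sketch, the resulting row keeps its three original fields and gains a single new one, error_desc, because that is the only error value that was given a name.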

On the other hand, you can control the number of errors to capture, which by default is unlimited. PDI allows you to set:

  • The maximum number of errors allowed. If the number of errors exceeds this value, the Transformation aborts. If this value is absent, any number of errors is allowed.
  • The maximum percentage of errors allowed. This works like the previous setting, but the threshold for aborting is a percentage of the rows instead of an absolute number. The percentage is not evaluated from the very first row: along with this setting, you have to specify the minimum number of rows to read before the percentage is evaluated. If no maximum percentage is set, there is no percentage control. As an example, suppose you set a maximum percentage of 20% and a minimum number of rows to read before the percentage evaluation of 100. When the number of rows with errors exceeds 20 percent of the total, PDI stops capturing errors and aborts. However, this check is made only after 100 rows have been processed.
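The two limits can be expressed as a small Python sketch. This is a model of the rules as described in this section, not PDI's actual implementation: the function should_abort and its parameter names are hypothetical, and treating "after having processed 100 rows" as "once at least 100 rows have been read" is an assumption.

```python
from typing import Optional


def should_abort(rows_read: int, error_count: int,
                 max_errors: Optional[int] = None,
                 max_pct: Optional[float] = None,
                 min_rows_for_pct: int = 0) -> bool:
    """Return True when the limits described above would abort the run."""
    # Absolute limit: abort once the error count exceeds the maximum.
    # When max_errors is None, any number of errors is allowed.
    if max_errors is not None and error_count > max_errors:
        return True
    # Percentage limit: checked only once enough rows have been read
    # (assumption: "enough" means at least min_rows_for_pct rows).
    if max_pct is not None and rows_read > 0 and rows_read >= min_rows_for_pct:
        if error_count * 100 > max_pct * rows_read:
            return True
    return False


# The example from the text: 20% maximum, evaluated only after 100 rows.
print(should_abort(50, 30, max_pct=20, min_rows_for_pct=100))   # False
print(should_abort(100, 20, max_pct=20, min_rows_for_pct=100))  # False
print(should_abort(100, 21, max_pct=20, min_rows_for_pct=100))  # True
```

Note the first call: 30 errors in 50 rows is well above 20 percent, yet nothing aborts, because the percentage is not evaluated until the minimum number of rows has been read.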

Having said that, let's modify the previous Transformation so you can see how and where to configure all these settings. In this case, you will add a field for the description of the error:

  1. Open the Transformation from the previous section.
  2. Right-click on the Select values step and select Define Error handling.... The following dialog window will appear, allowing you to set all the settings described previously:

Error Handling settings

  3. In the Error descriptions fieldname textbox, type error_desc and click on OK.
  4. Double-click on the Write to log step and, after the last row, type or select error_desc.
  5. Save the Transformation and run a preview on the Write to log step. You will see a new field named error_desc with the description of the error.
  6. Run the Transformation. In the Execution Results window, you will see the following:
- There was an error changing the metadata of a field
- Write to log.0 -
- Write to log.0 - project_name = Project C
- Write to log.0 - start_date = 2017-01-15
- Write to log.0 - end_date = ???
- Write to log.0 - error_desc =
- Write to log.0 -
- Write to log.0 - end_date String<binary-string> : couldn't convert string [???] to a date using format [yyyy-MM-dd] on offset location 0
- Write to log.0 - ???
- Write to log.0 - ====================

Some final notes about the error handling settings:

  • You might have noticed that the window also has an option named Target step. This option holds the name of the step that will receive the rows with errors. It was set automatically when you created the hop to handle the error, but you can also set it by hand.
  • All the settings for describing the errors and controlling the number of errors are optional. In the case of the fields, only those for which you provide a name are added to the dataset. You could see this in the Transformation when you previewed the Write to log step. You can also verify it by inspecting the input metadata of the Write to log step. Remember that you can do so by clicking on Input Fields... in the contextual menu of the step.