Select tool
The Select tool allows you to include, exclude, or reorder the columns of data that pass through a workflow. You can also use this tool to modify the type and size of data, rename a column, or add a description.
To include a column in the data, select the check box to the left of the column name. Deselect the check box to exclude the column. Notice that an Unknown field is also listed. This field is a placeholder for new fields added to your data and is present in many of the tools. For this workflow, select Order ID, Order Date, Ship Date, Customer ID, Product ID, Sales, Quantity, and Shipping Cost. Once this is done, click the Run button.
In the results, only Order ID, Order Date, Ship Date, Customer ID, Product ID, Sales, Quantity, and Shipping Cost appear.
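For readers who think in code, the effect of the Select tool can be approximated outside the workflow. The following is a minimal Python sketch, not how the tool itself is implemented. The column names come from the steps above; the sample values and the `select_columns` helper are made up for illustration.

```python
# Hypothetical sample row: column names follow the tutorial, values are invented.
rows = [
    {"Order ID": "A-1", "Order Date": "2024-01-05", "Ship Date": "2024-01-08",
     "Customer ID": "C-10", "Product ID": "P-7", "Sales": 120.0,
     "Quantity": 2, "Shipping Cost": 9.5, "Country": "United States"},
]

# Columns whose check boxes are selected, in the order they should appear.
selected = ["Order ID", "Order Date", "Ship Date", "Customer ID",
            "Product ID", "Sales", "Quantity", "Shipping Cost"]


def select_columns(records, columns):
    """Keep only the named columns, in the given order (illustrative helper)."""
    return [{name: record[name] for name in columns} for record in records]


result = select_columns(rows, selected)
print(list(result[0].keys()))
```

Any column left out of `selected`, such as Country in this sample, is dropped from the output, which mirrors deselecting its check box.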
Filter tool
The Filter tool queries records by using an expression and splits the data into two outputs: True, where the data meets the specified criteria, and False, where it does not. Use this tool to identify the records in your data that meet those criteria.
Select the type of filter to use
Basic filter: Use the basic filter to quickly build a simple query on a single column of data.
Connect the Filter tool to the Data Cleansing tool.
Click the Filter tool and select Basic filter, then select Country as the column to filter. Choose Equals from the menu of operators, then type United States in the box for the value.
Click the Run button. Click the T anchor to see the data that meets the criteria. Click the F anchor to see the data that does not.
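The split into T and F outputs can be pictured as a single pass that routes each record to one of two lists. Here is a rough Python sketch under stated assumptions: the sample orders and the `filter_rows` helper are hypothetical, and the comparison is an exact, case-sensitive match, which may not be how the tool itself compares text.

```python
def filter_rows(records, column, value):
    """Split records into (true_rows, false_rows) using an equality test."""
    true_rows, false_rows = [], []
    for record in records:
        if record.get(column) == value:
            true_rows.append(record)   # would leave through the T anchor
        else:
            false_rows.append(record)  # would leave through the F anchor
    return true_rows, false_rows


# Hypothetical sample data; only the Country column name comes from the tutorial.
orders = [
    {"Order ID": "A-1", "Country": "United States"},
    {"Order ID": "A-2", "Country": "Canada"},
    {"Order ID": "A-3", "Country": "United States"},
]

t_anchor, f_anchor = filter_rows(orders, "Country", "United States")
print(len(t_anchor), len(f_anchor))
```

Every record ends up in exactly one of the two outputs, so the two lists together always contain the full input.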
Custom filter: Use the custom filter to build a more complex expression or to query from multiple fields in the data stream.
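As an illustration of a condition that spans more than one field, the sketch below combines a test on Country with a test on Sales. It is written in Python rather than in the tool's own expression language, and the threshold of 100, the sample rows, and the helper names are all assumptions made for the example.

```python
def custom_filter(records, predicate):
    """Split records into (true_rows, false_rows) using any boolean test."""
    true_rows = [r for r in records if predicate(r)]
    false_rows = [r for r in records if not predicate(r)]
    return true_rows, false_rows


def is_large_us_order(record):
    # Hypothetical rule: orders from the United States with Sales above 100.
    return record["Country"] == "United States" and record["Sales"] > 100


# Invented sample data for the illustration.
orders = [
    {"Order ID": "A-1", "Country": "United States", "Sales": 120.0},
    {"Order ID": "A-2", "Country": "Canada", "Sales": 300.0},
    {"Order ID": "A-3", "Country": "United States", "Sales": 40.0},
]

matches, others = custom_filter(orders, is_large_us_order)
print([r["Order ID"] for r in matches])
```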
Formula tool
The Formula tool creates a new column, or updates a column by using one or more expressions to perform a variety of calculations and operations.
Configure the tool
Using the Input Data tool, bring the datafile into the workspace. Drag the Formula tool onto the workspace and connect it to the Input Data tool.
In the Configuration window, select an Output Column of data in Select Column; choose an existing column or add a new column.
Once the workflow has been run, the Data Preview box displays the first row of data from the specified column with the expression applied.
To get started, select +Add Column. For the column name, type in Total.
For the expression, type in [Sales]+[Shipping Cost]. As you type the [, a dialog box pops up and lets you select the data columns that will be used to calculate the new column.
Once the expression is entered, click the Run button.
Once you have run your workflow, the Results pane should show the first rows of your data. Notice that a new column, Total, is now the last column in the results.
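The calculation behind the Total column is a row-by-row sum of two existing columns. The following Python sketch shows the same arithmetic on made-up sample values; the `add_total` helper is hypothetical and only illustrates the idea.

```python
def add_total(records):
    """Return copies of the records with Total = Sales + Shipping Cost appended."""
    output = []
    for record in records:
        updated = dict(record)  # copy so the input rows are left unchanged
        updated["Total"] = record["Sales"] + record["Shipping Cost"]
        output.append(updated)
    return output


# Invented sample values; the column names come from the tutorial.
orders = [
    {"Order ID": "A-1", "Sales": 120.0, "Shipping Cost": 9.5},
    {"Order ID": "A-2", "Sales": 40.0, "Shipping Cost": 2.5},
]

with_total = add_total(orders)
print(with_total[0])
```

Because the new key is added last, Total appears as the final column of each row, matching what the Results pane shows.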
Imputation tool
The Imputation tool allows you to replace a specified value within one or more numeric data fields with another specified value. This is particularly helpful for replacing NULL values or empty spaces in a dataset.
Configure the tool
Download the RepInformation datafile to your desktop, then use the Input Data tool to bring the SalesTrip spreadsheet into your workspace. This spreadsheet covers the sales trips made by company reps throughout the year. Since there are blanks (NULL values) in the data, you can use the Imputation tool to change these values.
Drag the Imputation tool onto the workspace and connect it to the Input Data tool.
This connection will bring up the Configuration box which lets you decide which fields to correct.
In the Configuration box, you can decide which fields to impute, which values to replace, and what value to replace them with. Replace all the NULL values with a zero (0). Make the appropriate selections for this and click Run.
Once the workflow has run, you can see the results: all the NULL values have been replaced with zero.
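The replacement step can be sketched in Python as follows. This is an illustration only: the tutorial does not list the columns of the SalesTrip spreadsheet, so the field names (Rep, Trips, Miles), the sample values, and the `impute_nulls` helper are all assumptions, with `None` standing in for NULL.

```python
def impute_nulls(records, fields, replacement=0):
    """Return copies of the records with None replaced in the chosen fields."""
    output = []
    for record in records:
        updated = dict(record)  # copy so the input rows are left unchanged
        for field in fields:
            if field in updated and updated[field] is None:
                updated[field] = replacement
        output.append(updated)
    return output


# Hypothetical rows; the real spreadsheet's columns are not given in the tutorial.
trips = [
    {"Rep": "Ana", "Trips": 4, "Miles": None},
    {"Rep": "Ben", "Trips": None, "Miles": 310},
]

cleaned = impute_nulls(trips, ["Trips", "Miles"])
print(cleaned)
```

Only the fields named in the second argument are touched, which corresponds to choosing the fields to impute in the Configuration box.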
Copyright © Baylor® University. All rights reserved.