Use lenses for publishing
You use lenses to create publishing points from steps in your Data Prep project. When you publish from a lens, the resulting AnswerSet is a snapshot of your dataset at that particular step in the project. By default, the AnswerSet is saved to your data library.
Work with the New lens tool
To access the New lens tool, click new lens in the project Tools bar.
The following is an overview of the elements you work with when creating lenses in your project:
| Element | Description |
|---|---|
| New lens tool | Click steps in the Tools bar and select a step. Next, click new lens to access the Lens pane. |
| Lens pane | Lets you create a lens and publish it as an AnswerSet. |
| Enter a lens name field | Enter the name of the lens and click Save. |
| Publish | After you set your lens, click Publish to save the state of your data to an AnswerSet. |
Add a lens
To add a lens:
1. Click steps in the Tools bar and click the step where you want to add the lens.
2. Click new lens in the Tools bar.
3. In the Lens pane, enter a unique lens name and click Save.
4. Optionally, click Publish to save the data at that step as an AnswerSet.
Tips for using lenses
- You can add a lens to any step or sub-step in your project, for example, to the Import step of an Append.
- You can drag an existing lens to any step or add an existing lens multiple times.
- All lenses persist as part of the project steps and are public to anyone who shares your project.
- A lens name must be unique because it is used to name the resulting AnswerSet.
- If there is an error in a formula created using the Compute tool, an error icon displays in the Steps tool. In this case, you can create and save a lens, but you cannot publish it to an AnswerSet.
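The naming and publishing rules in these tips can be pictured with a small model. The following is only a conceptual sketch in Python, not Data Prep code: the `Project` class, its fields, and the `can_publish` method are hypothetical names invented for illustration, and the rule that an error in any step up to the lens blocks publishing is an assumption about scope that the tips do not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical toy model of a project: ordered steps plus named lenses."""
    steps: list
    # Indexes of steps that contain a formula error (illustrative only).
    steps_with_errors: set = field(default_factory=set)
    # Maps each lens name to the index of the step it is attached to.
    lenses: dict = field(default_factory=dict)

    def add_lens(self, name, step_index):
        # A lens name must be unique because it names the resulting AnswerSet.
        if name in self.lenses:
            raise ValueError(f"lens name already in use: {name!r}")
        if not 0 <= step_index < len(self.steps):
            raise IndexError("no such step")
        # A lens can be saved even if the project has a formula error.
        self.lenses[name] = step_index

    def can_publish(self, name):
        # Assumption: an error at or before the lens step blocks publishing.
        lens_step = self.lenses[name]
        return not any(i <= lens_step for i in self.steps_with_errors)


project = Project(steps=["Import", "Compute", "Shape"], steps_with_errors={1})
project.add_lens("raw_import", 0)
project.add_lens("after_compute", 1)
print(project.can_publish("raw_import"))     # True
print(project.can_publish("after_compute"))  # False
```

In this sketch, both lenses are saved, but only the one placed before the failing Compute step is publishable. Adding a second lens named `raw_import` would raise an error.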
The lenses you create are retained in project versions, and you can publish AnswerSets from lenses in older versions of your project.
Lenses are also essential for project automation because they define the publishing points to use for automated jobs. When you set up a project for automation, you select lenses and configure a corresponding schedule to automatically publish AnswerSets to your data library. Therefore, to automate a project, you need at least one lens in the project. For more help on automating a project, see Automation and operationalization.
The following are examples of when to use lenses.
Isolate rows in your dataset
You can use a lens to isolate rows in your dataset that need further investigation. To do so, add a lens on a step and filter for the rows that you want to isolate from your current dataset. Name the lens and click Publish. The resulting AnswerSet is published to the data library and includes only the isolated rows, which you can investigate later. You can then create a new step to remove those rows from your current dataset.
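As a rough illustration of this pattern outside the product, the sketch below uses plain Python lists. The sample rows, the `needs_review` rule, and the variable names are all made up for the example. In Data Prep itself, you do this with the filter and lens tools rather than code.

```python
# Hypothetical sample data: two rows are missing an amount.
rows = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": 75.5},
    {"order_id": 4, "amount": None},
]


def needs_review(row):
    """Illustrative rule for rows that need further investigation."""
    return row["amount"] is None


# Lens step: keep only the suspect rows and snapshot them, which plays the
# role of publishing the isolated rows as an AnswerSet.
isolated_answerset = [dict(row) for row in rows if needs_review(row)]

# Follow-up step: remove the same rows from the working dataset.
cleaned = [row for row in rows if not needs_review(row)]

print([row["order_id"] for row in isolated_answerset])  # [2, 4]
print([row["order_id"] for row in cleaned])             # [1, 3]
```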
View before and after aggregation
To view your data before and after aggregation, you can add a lens to publish the current dataset prior to shaping your data. Name the lens and click Publish. The resulting AnswerSet of pre-aggregated data is published to the data library. Create a Shape step, then add a lens to publish the resulting dataset. You now have two AnswerSets that reflect your data before and after the aggregation.
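The same idea can be sketched in plain Python to show why two publishing points yield two independent snapshots. This is a conceptual illustration only: `publish`, the `published` dictionary (standing in for the data library), the lens names, and the sample sales rows are hypothetical and are not part of Data Prep.

```python
import copy
from collections import defaultdict

# Hypothetical sample dataset before shaping.
sales = [
    {"region": "East", "amount": 100},
    {"region": "West", "amount": 40},
    {"region": "East", "amount": 25},
]

published = {}  # stand-in for the data library, keyed by lens name


def publish(lens_name, dataset):
    """Store an independent snapshot of the dataset under the lens name."""
    published[lens_name] = copy.deepcopy(dataset)


# First lens: snapshot the data prior to aggregation.
publish("sales_pre_aggregation", sales)

# Shape step (illustrative): total the amount per region.
totals = defaultdict(int)
for row in sales:
    totals[row["region"]] += row["amount"]
aggregated = [{"region": region, "total": total} for region, total in totals.items()]

# Second lens: snapshot the aggregated result.
publish("sales_post_aggregation", aggregated)

print(len(published["sales_pre_aggregation"]))  # 3
print(published["sales_post_aggregation"])
# [{'region': 'East', 'total': 125}, {'region': 'West', 'total': 40}]
```

Because each snapshot is a deep copy in this sketch, later changes to the working data do not alter what was already published, which mirrors the description of an AnswerSet as a snapshot at a particular step.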
Schedule a project for automation
To schedule a project for automation, add a lens at every step in your project where you want to create a publishing point. Give each lens a unique name that describes the output generated from that publishing point. Then set up automation to use the lenses for publishing AnswerSets to the data library based on the schedule you configure. See Automation and operationalization for details.