Once you’ve trimmed unnecessary fields and values from the data set, you may still want to change the amount of data in the sample or how the sample is generated.

These settings are available on the Data Sample tab in the Input step:

Amount of Data: This option determines how much data is brought into the flow.

Default Sample Amount: The amount of data included in the default sample configuration. This isn’t a fixed number of rows; how many records are returned depends on the characteristics of your data.

Fixed amount: Alternatively, you can enter an exact number of records to include in the sample, either more or fewer than the default.

Use all data: If you don’t want the data to be sampled, you can select this option to force Tableau Prep to retrieve all rows in your data.

Sampling Method: This option determines how the records are chosen from the data source.

Quick select: By default, the database returns the number of rows requested as quickly as possible. This might be the first rows based on how the data is sorted, or the rows that the database had cached in memory from a previous query. While this is almost always a faster result than random sampling, it may return a biased sample (such as data for only one year rather than all years present in the data, if the records are sorted chronologically).

Random sample: The database looks at every row in the data set and randomly returns records until it reaches the number of rows requested, making the sample more representative. However, this will impact performance when the data is first retrieved because the entire data set must be scanned (rather than just the first N results, as with Quick select). This can be useful if the Quick select sample doesn’t contain the data that you need, if you are performing a wildcard union and want records from each file, or if joining two sampled tables returns few records.



(How random sampling can help if your data is ordered by time.)
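
The difference between the two sampling methods can be illustrated outside Tableau Prep with a small Python sketch. This is not how Tableau Prep or any particular database implements sampling; the data set, the function names (`quick_select`, `random_sample`, `years_covered`), and the fixed seed are all hypothetical and exist only to show why taking the first rows of chronologically sorted data can miss entire years.

```python
import random

# Hypothetical data set: 1,000 records per year, stored in chronological order.
YEARS = [2019, 2020, 2021, 2022]
records = [{"year": year, "id": i} for year in YEARS for i in range(1000)]


def quick_select(rows, n):
    """Mimic a quick select: return the first n rows in stored order."""
    return rows[:n]


def random_sample(rows, n, seed=0):
    """Mimic a random sample: pick n rows uniformly from the whole data set."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    return rng.sample(rows, min(n, len(rows)))


def years_covered(sample):
    """List the distinct years that appear in a sample."""
    return sorted({row["year"] for row in sample})


quick = quick_select(records, 500)
rand = random_sample(records, 500)

print(years_covered(quick))  # [2019]
print(years_covered(rand))   # [2019, 2020, 2021, 2022]
```

In this sketch the quick selection never gets past the first year because it stops after the first 500 stored rows, while the random selection draws from all 4,000 rows and so covers every year. The trade-off matches the description above: the random version has to consider the entire data set before it can return anything.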