Good question. The Dataprep documentation says: “Prepare datasets of any size, megabytes to terabytes, with equal ease.”

It depends on how you define “efficiently”. Cleaning and transforming the sample dataset happens in real time. The job on the entire dataset runs only after you submit it, and its running time varies with the instance type, the recipe, the amount of data, and so on.

Data Scientist: Keep it simple.
