Preparing big data can often be slow and painful, or restricted to a single type of toolset. WPS Analytics solves this problem by providing both easy-to-use visual data prep tools and the ability to write code in a mixture of languages.
Workflow and coding
Use our Workbench GUI/IDE to prepare data by writing executable programs or by using drag-and-drop blocks to build workflows. Augment workflows by inserting programmable code blocks.
Workflow data prep blocks provide an iterative process to remove data irregularities by replacing, modifying or deleting values as appropriate. Flag major issues like missing values and outliers using the point-and-click data profiler to speed up the process and limit the impact that unclean data can have on model accuracy.
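As an illustration of the kind of check a data profiler performs, here is a minimal Python sketch that flags missing values and outliers in a single numeric column. It is not the WPS Analytics profiler or its API. The function name profile_column, the sample data and the 1.5 x IQR outlier rule are assumptions chosen for the example.

```python
from statistics import quantiles

def profile_column(values):
    """Return the positions of missing values and IQR outliers (hypothetical helper)."""
    # Missing values are represented as None in this sketch.
    missing = [i for i, v in enumerate(values) if v is None]
    present = [v for v in values if v is not None]

    # Quartiles of the non-missing values, then the common 1.5 * IQR fences.
    q1, _, q3 = quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    outliers = [
        i for i, v in enumerate(values)
        if v is not None and not (low <= v <= high)
    ]
    return {"missing": missing, "outliers": outliers}

# Made-up sample column: one missing age and one implausible age.
ages = [34, 29, None, 41, 38, 250, 33]
report = profile_column(ages)
print(report)
```

Flagged positions can then be replaced, modified or deleted, depending on what is appropriate for the dataset.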
Reshape the structure of data from wide to long format using a range of drag-and-drop techniques. For example, transpose a dataset to extract information from it.
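To make the wide-to-long reshape concrete, the following plain Python sketch turns one row per customer into one row per customer and quarter. It only illustrates the idea and does not show how the drag-and-drop blocks are implemented. The function wide_to_long and the column names are hypothetical.

```python
def wide_to_long(rows, id_col, value_cols, var_name="variable", value_name="value"):
    """Unpivot the value columns of each row into separate rows (hypothetical helper)."""
    long_rows = []
    for row in rows:
        for col in value_cols:
            long_rows.append({
                id_col: row[id_col],     # identifier repeated on every output row
                var_name: col,           # former column name becomes a value
                value_name: row[col],    # former cell value
            })
    return long_rows

# Made-up wide data: one column per quarter.
wide = [
    {"customer": "A", "q1": 100, "q2": 120},
    {"customer": "B", "q1": 90, "q2": 95},
]

long_format = wide_to_long(wide, "customer", ["q1", "q2"],
                           var_name="quarter", value_name="sales")
for record in long_format:
    print(record)
```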
Extract features to transform raw data. Reveal underlying behaviour and trends to improve model performance. Use techniques such as binning, optimal binning, standardisation, scaling, clustering and factor analysis.
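Two of the techniques listed above, standardisation and binning, can be sketched in a few lines of Python. This is a generic illustration using equal-width bins and z-scores, not the algorithm used by any WPS Analytics block. The function names and sample values are assumptions.

```python
from statistics import mean, pstdev

def standardise(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    # Assumes the column is not constant; sigma would be 0 otherwise.
    return [(v - mu) / sigma for v in values]

def equal_width_bins(values, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1 using equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Made-up sample feature.
incomes = [20, 25, 30, 45, 80]
z_scores = standardise(incomes)
bins = equal_width_bins(incomes, 3)
print(z_scores)
print(bins)
```

Equal-width binning is the simplest variant. Optimal binning chooses the bin edges with reference to a target variable and is not shown here.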
Use a wide range of interactive workflow blocks such as join, aggregate, filter and partition for data preparation. Re-use workflows to make repeatable processes fast and efficient. Use them together with the point-and-click data profiler to confirm datasets are clean and ready for modelling.
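The join, filter and aggregate steps named above chain together naturally. The Python sketch below shows that sequence on two small in-memory tables. It is a conceptual illustration only, and the table contents, the helper inner_join and the threshold of 10.0 are all made up for the example.

```python
from collections import defaultdict

# Made-up input tables.
customers = [
    {"id": 1, "region": "North"},
    {"id": 2, "region": "South"},
]
orders = [
    {"customer_id": 1, "amount": 50.0},
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 2, "amount": 5.0},
    {"customer_id": 3, "amount": 70.0},  # no matching customer
]

def inner_join(left, right, left_key, right_key):
    """Combine rows from both tables where the key values match (hypothetical helper)."""
    index = defaultdict(list)
    for r in right:
        index[r[right_key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[left_key], [])]

# Join: attach the customer's region to each order.
joined = inner_join(orders, customers, "customer_id", "id")

# Filter: keep orders of at least 10.0.
filtered = [row for row in joined if row["amount"] >= 10.0]

# Aggregate: total order amount per region.
totals = defaultdict(float)
for row in filtered:
    totals[row["region"]] += row["amount"]

print(dict(totals))
```

Keeping each step as a separate, named stage mirrors how a workflow is built from individual blocks, which is what makes the process easy to re-run on new data.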
A built-in SAS language capability allows you to create and run data prep tasks written in the SAS language without the need to install other third-party SAS language products.
Create SAS language programs to execute or augment any workflow by inserting SAS language code blocks.
Open source languages
Both workflows and SAS language programs support embedding and mixing code blocks written in Python, R or SQL.
Access disparate data sources
Access a wide range of data sources, including cloud data, Hadoop environments, data warehouses, databases, spreadsheets and many other file-based data formats.
See a list of the data sources and access types for use in coding and workflow here.