Whether you’re combining data sets, cleaning a nasty spreadsheet, or constructing a sophisticated predictive model, Alteryx has great tools to help you out. But with so many options, new or inexperienced users can easily feel overwhelmed. To help, this article outlines the five most helpful predictive analytics tools in Alteryx.
It is one thing to know that you need to build a predictive model. But it is a completely different problem to know which predictive model to build.
Are you trying to predict future performance based on past behavior? Do you need to estimate what a value would be based on values of related variables? Or do you need to determine the optimal solution bounded by some set of constraints?
In this article, come with me as I list five of the most helpful advanced and predictive tools that I have used for personal and professional projects. I believe that these tools cover the majority of users’ needs, and knowing how to use them will greatly improve your skill set within Alteryx.
1. Linear Regression
Anybody who has ever taken a high school or college elementary statistics course should be familiar with linear regression. A linear regression model provides estimates for a variable given values of other, related variables.
In my experience, linear regression is by far the most frequently used predictive model in Alteryx. Given its simplicity, and the fact that most users will already be familiar with it, it easily tops my list of the most helpful advanced tools in Alteryx.
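To make the idea concrete, here is a minimal sketch in plain Python of what a one-predictor linear regression does: it fits a least-squares line and uses it to estimate a new value. This is not how the Alteryx tool is implemented, and the function name and the advertising/sales numbers are made up for illustration.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: advertising spend (x) and sales (y), invented for this example.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_simple_linear_regression(ad_spend, sales)

# Estimate sales for a spend level that is not in the data.
predicted_sales = intercept + slope * 6.0
```

With these invented numbers the points lie exactly on a line, so the fit recovers it. Real data will be noisier, and a real model will usually have more than one predictor.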
2. Time Series: ARIMA and ETS
In second place I have the two Alteryx time series forecasting tools: ARIMA and ETS. Both of these tools do the same thing: create a forecast model of a single-variable time series based on historic performance.
Producing sales forecasts is a very common task for analysts. Other tools, like the linear regression from above, are a poor fit for this type of analysis because they treat observations as independent, while consecutive points in a time series depend on one another. Things like sales often follow somewhat predictable patterns over time, which is exactly what these time series tools are built for.
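As a rough illustration of the idea behind exponential smoothing (the family of models that ETS draws on), here is a sketch of its simplest member: simple exponential smoothing with a fixed smoothing weight. The Alteryx ETS tool does considerably more than this (it can also handle trend and seasonality), so treat the function name, the `alpha` value, and the sales figures below as assumptions for illustration only.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing the whole series.

    alpha close to 1 weights recent observations heavily;
    alpha close to 0 weights the long-run history heavily.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # start the smoothed level at the first observation
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical monthly sales, invented for this example.
monthly_sales = [100.0, 110.0, 105.0, 115.0]
forecast = simple_exponential_smoothing(monthly_sales, alpha=0.5)
```

Notice that each new forecast depends on the previous level, which is how this kind of model captures the dependence between consecutive observations.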
3. Logistic Regression
Slightly less well-known than its more popular cousin, the logistic regression tool is ideal when you need to predict the probability that a variable will take on one of two possible states. If you think of linear regression as a general-use model, logistic regression is much more focused.
Say, for example, that you wanted to predict the probability of a passenger surviving the sinking of the Titanic (this is a very popular machine learning data set; check it out!). You would use variables such as a person’s age, gender, class, and so on to predict the likelihood of two outcomes: surviving or perishing.
Although this tool isn’t as widely known or used as others, it is perfect when you need to predict a binary variable. Plus, it makes a cool-looking graph, which never hurts.
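Here is a small sketch of what a logistic regression does under the hood, written in plain Python with batch gradient descent. It is not the algorithm the Alteryx tool uses, and the eight passengers below are invented for illustration; they are not rows from the real Titanic data set.

```python
import math


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        gradient = [0.0] * (n_features + 1)
        for row, label in zip(rows, labels):
            features = [1.0] + list(row)  # leading 1.0 is the bias term
            error = sigmoid(sum(w * f for w, f in zip(weights, features))) - label
            for j, f in enumerate(features):
                gradient[j] += error * f
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, gradient)]
    return weights


def predict_probability(weights, row):
    features = [1.0] + list(row)
    return sigmoid(sum(w * f for w, f in zip(weights, features)))


# Made-up passengers: (is_female, passenger_class) and survived (1) or not (0).
passengers = [(1, 1), (1, 2), (1, 3), (0, 1), (0, 2), (0, 3), (1, 3), (0, 1)]
survived = [1, 1, 0, 1, 0, 0, 1, 0]

weights = fit_logistic_regression(passengers, survived)
p_female_first_class = predict_probability(weights, (1, 1))
p_male_third_class = predict_probability(weights, (0, 3))
```

The output is a probability for each passenger rather than a yes/no answer, which is what makes the model useful for ranking or for choosing your own cutoff.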
4. Optimization
If you could poll Alteryx users about the most challenging and intimidating tools, the optimization tool would undoubtedly be near the top. With four input anchors, three outputs, and a finicky way in which the data must be structured, this is no beginner’s tool.
But once you master how to structure your data sets, this tool becomes a hidden gem. When you need to solve a linear or quadratic programming problem, this tool will give you an answer in seconds. Try to build a workflow that can solve the same problem without it. Go on, I’ll wait…
Optimization problems are generally posed as: “maximize or minimize some expression subject to one or more conditions.” This could take the form of creating an investment portfolio that maximizes returns (or minimizes risk, for the more conservative investor), maximizing profit, or minimizing travel distance over a network.
If I asked you to find the shortest route between Miami and San Francisco on the map above, it might not be immediately obvious which route is best. Convert the network into a matrix of nodes and edges (the hard part!), and the optimization tool will give you an answer in seconds. This tool can be your best friend or worst enemy.
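To show what "a network of nodes and edges" looks like as data, here is a small sketch that stores a route network as an edge list and finds the shortest route. Note that it uses Dijkstra's algorithm rather than the linear programming formulation the optimization tool would solve, and that the cities, connections, and distances are invented for illustration, not taken from any real map.

```python
import heapq


def shortest_route(edges, start, goal):
    """Return (total_distance, path) for the shortest route, or None if unreachable."""
    # Build an adjacency list from the undirected edge list.
    graph = {}
    for a, b, dist in edges:
        graph.setdefault(a, []).append((b, dist))
        graph.setdefault(b, []).append((a, dist))

    queue = [(0, start, [start])]  # (distance so far, current node, path taken)
    visited = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == goal:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, dist in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (total + dist, neighbor, path + [neighbor]))
    return None


# Hypothetical network: each edge is (city, city, distance). Distances are made up.
edges = [
    ("Miami", "Atlanta", 660),
    ("Miami", "Dallas", 1300),
    ("Atlanta", "Dallas", 780),
    ("Atlanta", "Chicago", 720),
    ("Dallas", "Denver", 800),
    ("Chicago", "Denver", 1000),
    ("Denver", "San Francisco", 1250),
    ("Dallas", "San Francisco", 1750),
]

total_distance, route = shortest_route(edges, "Miami", "San Francisco")
```

The edge list is the part that carries over to Alteryx: once each connection is a row with a start node, an end node, and a cost, you are most of the way to the structure the optimization tool expects.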
5. Python and R
For all of its wonderful, helpful, and powerful tools, sometimes Alteryx just doesn’t have quite what you need. In this case, if you have some coding skills, the R and Python tools can swoop in to save the day.
With their ability to ingest data from within a workflow and to then output data back into the workflow, Python and R expand the world of the possible into outer space. Of course, that is only if you can code in R or Python… everything has a downside.
On a previous client project, I determined that the popular XGBoost model was the best choice, better than any of the built-in tools in Alteryx. Unfortunately, Alteryx does not have an XGBoost tool, so I had to pull my data into R to build the model. Without these tools, I would have been forced to use an inferior model.
These five tools offer only a tiny glimpse at the amazing tools available to you with Alteryx. An in-depth article would read like a novel, with sections on decision trees, gamma regression, neural networks, spline models, and more.
Do you have more questions about Alteryx? Talk to our expert consultants today and have all your questions answered!