I don't have a crystal ball, but I can still predict the future.

eight ball.jpg

In certain circumstances, a running sum is a great way to predict the future.  For example, if I take all the products I have on-hand, and add all the products I plan to build in the next few weeks, and subtract out the demand I plan to sell in those weeks, I should end up with my projected on-hand.  

fungible goods.jpg

In most businesses, keeping track of inventory is a really good idea, because it means your business' capital isn't just sitting idle and depreciating in value.  We ain't sellin' fine art 'round these parts!

missing-step-ladder.png

The first order problem is, there are often a lot of gaps in the data.  For a given part number, perhaps we don't plan to sell or build that item in week 2.  If there's no Wk 2 record for a given part, then a running sum tool will simply move on from the Wk1 record to the Wk3 record.  If I then display the example results by week, one can see, there's no record for Wk 2 when, in fact, we would have had 100 units laying around in that week.

no week2.png

The way to fix this is to separately summarize to every possible product, and every possible time at the lowest granularity, then append one to the other so you have a complete, empty, combination of product and time.  Then, join the empty data set on the data set that has values by products and times in the same join and make sure you keep all the empty time records on the side of that join.  

I wish I could take credit for this one, but this is all thanks to Mike Mixon, @mix_pix. You can check his twitter feed below.

Keeping things fresh: Adding [Latest Snapshot Date]

One of the key additions I make to our team's staging tables is a Time attribute called [Latest Snapshot Date], which I append to each subject matter dataset in the process of processing it from a .xlsx file that's originated out of Business Objects into a .yxdb file that's going to be used as an input for our more involved data prep workflows.

In most situations, analysts are keen to cite their sources, all the way down to the snapshot timestamp used.  This helps immensely when questions of freshness come up.

Personally, I like to provide a small table that lists each dataset and a timestamp.  This is trivial to produce in a presentation layer if each dataset is equipped with something like a [Latest Snapshot Date] attribute. I resort to two different ways to derive this attribute:

1) Transactional / Live Data

With a transactional or pseudotransactional dataset, where snapshots aren't provided, I resort to capturing the write time on the xlsx file written by Business Objects and append it as the [Latest Snapshot Date].

 

2) Snapshot Data

It's a bit different when a dataset has a series of snapshots.  I'll  use the Summarize to to find the maximum value within the snapshot field of the dataset and append that max snapshot as the [Latest Snapshot Date].

In some cases, the timestamp data is provided in a string and in such a situation I'll resort to a regex tool to convert it from a string into the time format preferred by Alteryx.  Regular expressions are a topic worthy of their own series of articles.  More on them in a future post!

 

Questionable fashion sense? Find your company's brand guide.

After all, my blog is called "questionabledecisionmaking"... So I'll confess that for almost a year I wore pants cut two inches too high because I tend to make questionable decisions, particularly in the fashion realm.

However, I do know enough about color theory that I probably shouldn't leave the walk-in closet without a consult from my better half.  To that end, your corporate overlords probably have sunk a vast sum of money in a Brand Guide and you should absolutely leverage that investment.   

As an example, Seagate's Brand Guide is a publicly available .pdf and it is a fantastically useful grimoire for those of us at Seagate do don't secretly harbor dreams of being a fashion buyer.

 

The most immediate value to me is in the color palette and hex codes.  

By referencing this graphic from the brand guide, I make any viz I've made look right when embedded in the inevitable PPT prezo or frame.  Further, it distinguishes my work from many other analysts in the company because most people simply don't take the time to recolor their work from the default palettes or perhaps aren't aware of this resource.