Face-melting concert photo

Phish Statistics Visualized: Summer Tour 2009

Last night Phish played the final show of their 2009 Summer Tour, a two-leg romp around the country that included 27 shows. To help make sense of it all, I used data from ZZYZX and Phish.com to create some statistical visualizations using tiny little graphs called sparklines.

There are two exhibits so far, covering over 1,100 data points. The first exhibit explores everyone's favorite topic, repeats, with a look at some of the most played songs from the tour. The second exhibit explores the shortest and longest sets, based on number of songs played.

Both exhibits demonstrate the beauty of sparklines: they are data-intense but design-simple, so even an incompetent statistician like me can quickly spot trends and outliers.

If you have ideas for other stats you'd like visualized, post a comment below.


Exhibit 1: Most Played Songs

Just like every other Phish tour I can remember, fans were quite vocal about the number of repeats throughout. Here's a closer look at the 20 most played songs, each of which appeared in at least 25% of the shows.

How to read it: Each song has a graph depicting all 27 shows from the early and late Summer Tours (Fenway through SPAC). A line going up means the song was played in the 1st set. A line going down means the song was played in the 2nd set. A longer, mustard-colored line going down means the song was played in the encore.

For example: Possum was played in the 1st set of Jones Beach 1 (show #2), then in the 2nd set at Great Woods (show #5), and again in the 2nd set at Asheville (show #7).
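As a rough sketch of the encoding described above, the snippet below turns a song's appearances into one value per show: +1 for a 1st-set play, -1 for a 2nd-set play, -2 for an encore, and 0 when the song was not played. The function names, the numeric values, and the text rendering are assumptions for illustration, not the method actually used to build the graphs, and only the three Possum appearances named in the example are filled in.

```python
# Hypothetical encoding of one song's sparkline, one value per show.
TOTAL_SHOWS = 27  # Fenway through SPAC

# Assumed bar values: up for 1st set, down for 2nd set, longer down for encore.
BAR_VALUE = {"set1": 1, "set2": -1, "encore": -2}


def encode_song(appearances, total_shows=TOTAL_SHOWS):
    """Map {show_number: slot} to a list with one bar value per show."""
    series = [0] * total_shows
    for show_number, slot in appearances.items():
        series[show_number - 1] = BAR_VALUE[slot]  # show numbers are 1-based
    return series


def render(series):
    """Crude text stand-in for the graph: ^ up, v down, V encore, . not played."""
    glyph = {1: "^", -1: "v", -2: "V", 0: "."}
    return "".join(glyph[value] for value in series)


# Only the three Possum plays named in the example; the other plays are omitted.
possum_partial = {2: "set1", 5: "set2", 7: "set2"}
series = encode_song(possum_partial)
print(render(series[:8]))  # first eight shows: .^..v.v.
```

A series like this is the kind of input a tristate-style sparkline expects: the sign carries the set and the magnitude separates encores from regular 2nd-set plays.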

Song Plays
[Early and Late columns: sparkline graphs, not reproduced]
Possum 10 (37%)
BDTNL 9 (33%)
Chalk Dust Torture 9 (33%)
Character Zero 9 (33%)
Down With Disease 9 (33%)
Harry Hood 9 (33%)
Kill Devil Falls 9 (33%)
Ocelot 9 (33%)
Run Like an Antelope 8 (29%)
Time Turns Elastic 8 (29%)
Wolfman's Brother 8 (29%)
You Enjoy Myself 8 (29%)
David Bowie 7 (25%)
Divided Sky 7 (25%)
Golgi Apparatus 7 (25%)
Sample in a Jar 7 (25%)
Stash 7 (25%)
Stealing Time 7 (25%)
Suzy Greenberg 7 (25%)
Tweezer 7 (25%)

Notes
  • Divided Sky, Ocelot, Stash and Stealing Time were only played in the 1st set
  • Chalkdust, Golgi, KDF and Sample made all but one of their appearances in the 1st set
  • Harry Hood and YEM were always in the 2nd set except at Bonnaroo 1, which was a single set affair
  • Suzy was an encore 3 times and made all but one of its 7 appearances in the 2nd set
  • Time Turns Elastic started out as a 2nd set song but was moved to the 1st set after its 3rd appearance
  • ZZYZX counts YEM as being played 9 times due to the YEM > Wilson > YEM at Bonnaroo
  • Tweezer Reprise, not shown, was also played 7 times
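The percentages in the table look like each play count divided by the 27 shows and then truncated, not rounded, to a whole number (8 of 27 is about 29.6%, listed as 29%). Here is a minimal sketch of that arithmetic. The truncation is an inference from the listed figures, not something stated in the post, and the function name is made up for illustration.

```python
TOTAL_SHOWS = 27


def play_share(plays, total_shows=TOTAL_SHOWS):
    """Whole-number percentage of shows a song appeared in, truncated (assumed)."""
    return plays * 100 // total_shows


for plays in (10, 9, 8, 7):
    print(f"{plays} plays -> {play_share(plays)}%")
```

Rounding to the nearest percent would give 30% and 26% for the 8-play and 7-play songs, which also fits the "at least 25% of the shows" cutoff.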

Exhibit 2: Songs Per Show & Song Length

This tour, Phish played a total of 571 songs. Not counting the single set Bonnaroo 1, they played an average of 21.1 songs per show: 10.8 songs in the 1st set, 8.3 songs in the 2nd set, and a 2 song encore. Generally speaking, fewer songs means longer jams, so in this case more isn't always better.

Update, August 17 at 11:00pm: Following a great suggestion from Doug at HelpingFriendlyBook.com, he and I made a major update to this exhibit. It now indicates song length so you can see the “shape” of the show. A few factual errors were also corrected.

How to read it: Each show is followed by four numbers indicating songs in the 1st set, 2nd set, encore, and total. There is also a graph depicting each song played at that show. The height of each bar indicates the length of the song, at 1 pixel per minute.

For example: At Fenway there were 23 total songs played, including 12 in the 1st set, 8 in the 2nd set, and 3 in the encore. The longest song of the show was the 2nd set closer (YEM, 22:25).
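To make the one-pixel-per-minute rule concrete, here is a small sketch that converts a track time written as mm:ss into a bar height. The helper name and the rounding to the nearest whole pixel are assumptions; the post does not say how fractions of a minute were handled.

```python
def bar_height_px(track_time, px_per_minute=1):
    """Convert an 'mm:ss' track time to a bar height in whole pixels.

    Rounding to the nearest pixel is an assumption made for this sketch.
    """
    minutes, seconds = (int(part) for part in track_time.split(":"))
    return round((minutes + seconds / 60) * px_per_minute)


print(bar_height_px("22:25"))  # YEM at Fenway -> 22
```

With this rounding, anything under 30 seconds comes out as a 0-pixel bar, so a real chart would probably want a 1-pixel minimum.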

Show S1 S2 E T
[Breakdown column: bar graph of song lengths, not reproduced]
Fenway (5/31) 12 8 3 23
Jones Beach 1 (6/2) 10 8 1 19
Jones Beach 2 (6/4) 9 7 1 17
Jones Beach 3 (6/5) 10 8 1 19
Great Woods (6/6) 11 7 2 20
Camden (6/7) 11 8 4 23
Asheville (6/9) 11 9 1 21
Knoxville (6/10) 11 10 1 22
Bonnaroo 1 (6/12) 19 - 1 20
Bonnaroo 2 (6/14) 11 8 2 21
St. Louis (6/16) 11 10 3 24
Burgettstown (6/18) 11 8 6 25
Deer Creek (6/19) 12 8 2 22
Alpine Valley 1 (6/20) 10 11 1 22
Alpine Valley 2 (6/21) 14 7 2 23
Red Rocks 1 (7/30) 10 9 1 20
Red Rocks 2 (7/31) 8 8 2 18
Red Rocks 3 (8/1) 9 6 2 17
Red Rocks 4 (8/2) 12 8 3 23
Shoreline (8/5) 10 9 2 21
Gorge 1 (8/7) 9 7 1 17
Gorge 2 (8/8) 11 8 2 21
Toyota Park (8/11) 10 9 1 20
Darien Lake (8/13) 12 9 2 23
Hartford Meadows (8/14) 11 10 1 22
Merriweather Post (8/15) 13 7 2 22
Saratoga (8/16) 12 9 3 24

Notes
  • Burgettstown had the most total songs (25) while Jones Beach 2, Red Rocks 3, and Gorge 1 had the fewest (17)
  • Alpine Valley 2 had the most songs in the 1st set (14) except for the single set Bonnaroo 1 (19)
  • Alpine Valley 1 had the most songs in the 2nd set (11) and was the only show with more songs in the 2nd than the 1st
  • Red Rocks 3 had the fewest songs in the 2nd set (6 songs)
  • Burgettstown had the longest encore (6 songs) which was Grind, Hello My Baby, HYHU > Bike > HYHU, Loving Cup
  • Gorge 2 is perfectly average in terms of number of songs played
  • Toyota Park had a below average 1st set and slightly above average 2nd set (shocker)
  • The longest song was Rock and Roll at Gorge 2 (23:12), while the shortest was Hello My Baby at Burgettstown (0:27)
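The averages quoted at the start of this exhibit can be re-derived from the per-show counts in the table. The sketch below is one way to do it. The variable names are illustrative, `None` stands in for the missing 2nd set at Bonnaroo 1, and that show is left out of the averages as the text describes.

```python
# (show, 1st set, 2nd set, encore) copied from the table; None = no 2nd set.
SHOWS = [
    ("Fenway", 12, 8, 3),
    ("Jones Beach 1", 10, 8, 1),
    ("Jones Beach 2", 9, 7, 1),
    ("Jones Beach 3", 10, 8, 1),
    ("Great Woods", 11, 7, 2),
    ("Camden", 11, 8, 4),
    ("Asheville", 11, 9, 1),
    ("Knoxville", 11, 10, 1),
    ("Bonnaroo 1", 19, None, 1),
    ("Bonnaroo 2", 11, 8, 2),
    ("St. Louis", 11, 10, 3),
    ("Burgettstown", 11, 8, 6),
    ("Deer Creek", 12, 8, 2),
    ("Alpine Valley 1", 10, 11, 1),
    ("Alpine Valley 2", 14, 7, 2),
    ("Red Rocks 1", 10, 9, 1),
    ("Red Rocks 2", 8, 8, 2),
    ("Red Rocks 3", 9, 6, 2),
    ("Red Rocks 4", 12, 8, 3),
    ("Shoreline", 10, 9, 2),
    ("Gorge 1", 9, 7, 1),
    ("Gorge 2", 11, 8, 2),
    ("Toyota Park", 10, 9, 1),
    ("Darien Lake", 12, 9, 2),
    ("Hartford Meadows", 11, 10, 1),
    ("Merriweather Post", 13, 7, 2),
    ("Saratoga", 12, 9, 3),
]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Leave out the single-set show when averaging, as the text does.
two_set = [row for row in SHOWS if row[2] is not None]

avg_set1 = mean(s1 for _, s1, _, _ in two_set)
avg_set2 = mean(s2 for _, _, s2, _ in two_set)
avg_encore = mean(e for _, _, _, e in two_set)
avg_total = mean(s1 + s2 + e for _, s1, s2, e in two_set)

# Sum over every row, counting the missing 2nd set as zero songs.
total_songs = sum(s1 + (s2 or 0) + e for _, s1, s2, e in SHOWS)

print(f"{avg_total:.1f} per show = {avg_set1:.1f} + {avg_set2:.1f} + {avg_encore:.1f}")
print("songs listed in table:", total_songs)
```

The averages come out at 21.1, 10.8, 8.3 and 2.0, matching the text. The rows as listed add up to 569 songs, not 571, so the headline total may have been counted slightly differently.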

That's all for now. Hope at least a few of you enjoyed these visualizations. Don't forget to post a comment below if you have ideas for more stats to visualize. And if you want to relive the music, head on over to PhishTwit where you can stream the entire tour.


20 Comments so far   

OMG, I love PHISH and love what you did with the stats.  Color would be nice?

Todd -  these graphs are fantastic.  Would be great to add additional colors to exhibit 1 for set openers and closers. 

Looking forward to the next round of stats after fall tour!

Well done Todd.  This is most likely the greatest statistical analysis I’ve ever read on the internet!!

Great information! I’ve been looking for something like this about the summer 09 tour. Does anyone know any stats about the attendance for this tour?  I’d love to find out just how many people went out to these shows this summer

@Todd Levy:  I’ve been trying to get in touch with David “ZZYZX” Steinberg to see if I could work out a deal with him, which would allow me access to the raw data (ie in a database, excel file, or any type of flat file). 

I already got approval from my company to host the analysis on our server, which would allow David to embed it in his website for all his visitors to use. He has not replied to my email yet so I’m not sure if he is interested, but it would be pretty cool.

If anyone knows how to get in touch with him, please forward this message along with my email address. I think he feels like I am trying to sell him something, which is definitely NOT the case.

Take care man.

@Jimmy Glitch: Funny stuff man. I was taking a bath on Llama but finally got my order filled at SPAC. Now, what to do about Weigh?

@Steve: Thanks so much

@Matthew Phoenix: All the stats came from ZZYZX and Phish.com. Would love to see what you can do with them.

Does someone out there have the raw data of these stats? I have access to the most amazing visual analytics tool on the market and would love the opportunity to put an application together with these stats.

You can check out the tool I’d be using at http://spotfire.tibco.com/.

Also, if anyone knows where I can get historical stats, I can include another analysis that shows how long it has been since they played some of their rare songs. This could definitely be used to predict setlists.

If this turns out to be a cool app that people like we could embed it into the Jamtopia blog.

Feel free to email me to discuss.

I love that you’re using sparklines.  This is a perfect application!  I too am a huge fan of E. Tufte.  Keep up the great work.

Thanks, Todd. Based on this information I’ve decided to go long on Harpua and Tela (of course), sell short on all the new songs and the flagship Hoist tunes. May hold a long position on Hood and Possum. They are cheap to pick up right now I feel they’ll be less played over the next tours and, therefore, rise in value.

hi todd,

nice work! visualization is an integral way for many of us to learn, and it helps with literacy on the internets, too. kudos to you.

that said, i’d love to see a marriage of your data and flash: make the sparklines interactive so that each bar reveals on mouseover which song is which length, give duration, show color for energy, etc. this could take care of the legibility issue as well (each bar zooms in or enlarges on mouseover, too).

if you’d like some input on how this might look, lemme know. i’m actually not as busy as i’d like to be with the web design at the moment :)

now if we can only produce a visual graph of trey’s ability to shred on a song by song basis….

best,
kevin

Love Phish, love stats, that is cool. The only thing: if you could make the graphs easier to read, maybe more colors. Thanks for your hard work and the stats, keep it going. I will be checking back all the time for updates. Peace and Love. Keep on PHISHIN’.

Ah, the dreaded “show your work” problem.

I used the statistic you set that there are on average 10 songs per first set, and 8 per second set. = 54 songs for 3 days x 2 sets. There will be a third set for two days, and we’ll assume like 2nd sets, they have 8 songs too. So another 16 songs = 60 total. There will be 3 encores, at approximately 2.5 songs per encore = 8 more songs. My guess is 68 songs (possible) at Festival 8. Maybe a few more, maybe a few less. The current 3.0 Phish seems to play shorter more intense Chalkdusts etc. No more rambling 40 minute “46 Days” (hopefully). So I think that number is reasonable.

Although looking at the three shows of Hampton, they played 28 songs the first night (holy crap!), 27 songs the second day, and 29 real songs on the last day, playing a huge majority of the catalog that they’ve been playing since about 1997. That total comes to an amazing 84 songs.

At $200/68 songs = $2.94 per song.  The other math is just kinda stupid stuff anyway.

I’m going to make a Magic 8 Ball App for the iPhone that randomly picks the next song based on the most likely 84 songs.  Wish me luck.

Mike

@Mike: Not sure I’m really tracking all of your math, but I can tell you precisely how many songs I’ll see at Indio… none. :(

@Sam: Thanks for the kind words. I basically used brute force to manipulate data from ZZYZX and Phish.com into alternate numeric representations that could be used with sparklines. It wasn’t particularly fast, fun or accurate, but I think the ends justify the means.

@Everyone: Hoping to update exhibit 1 tonight as it now feels kind of lame given last night’s update to exhibit 2. Also got a couple other tweaks that have been suggested on various message boards and the like so look for those too.

i love your stats and the way that you did it.  did you use a macro or just a code? 
i actually have a phish fact site called Phishapedia, and i wish i had something like this on there.
great job!!!

http://phishapedia.blogspot.com/

Based on your statistics, in the 8 sets of Phish at Festival 8, we will hear 8 sets.  Or 30 first set, 24 second set and 16 third set, and approximately 8 encores.  For a total of ~68 songs. 

At $200/ticket, we are paying approximately $3 per song.  At 75 minutes per set average, this means ~680 minutes of music including encores (8 minutes per encore).  Again, this is approximately .30 per minute of music.

We will hear all 20 of those songs at Festival 8. 

The question is, what do the BINGO cards look like?

@linerama, you can get those stats here… http://www.ihoz.com/consec.html

Just choose 5/31/09 as the start date and 8/16/09 as the end date.

There were 60 songs only played once, the most surprising to me being Meatstick and Mule. Also wouldn’t have minded an extra Pebbles & Marbles or two.

The longest span was a bit more cumbersome to figure out, so I’ll make it easy. Here’s a list of songs that were played Summer 2009 but weren’t played at all in ‘03-‘04.

They are the biggest bustouts if you will…

Mustang Sally was not seen between 6/21/88 and 6/14/09 [1143 shows]
How High The Moon was not seen between 3/8/93 and 8/13/09 [706 shows]
The Ballad of Curtis Loew was not seen between 8/2/93 and 5/31/09 [622 shows]
Drums Jam was not seen between 10/15/94 and 8/2/09 [537 shows]
Highway To Hell was not seen between 2/26/97 and 6/12/09 [322 shows]
Psycho Killer was not seen between 12/7/97 and 8/14/09 [279 shows]
Lengthwise was not seen between 7/28/98 and 6/9/09 [230 shows]
Oh! Sweet Nuthin’ was not seen between 10/31/98 and 8/5/09 [223 shows]
Paul and Silas was not seen between 11/29/98 and 8/11/09 [207 shows]
Icculus was not seen between 7/18/99 and 8/14/09 [191 shows]
Hello My Baby was not seen between 12/5/99 and 6/10/09 [136 shows]
The Man Who Stepped Into Yesterday Reprise was not seen between 7/4/00 and 6/21/09 [110 shows]
If I Could was not seen between 6/28/00 and 6/2/09 [102 shows]
Colonel Forbin’s Ascent was not seen between 9/30/00 and 8/14/09 [96 shows]
The Famous Mockingbird was not seen between 9/30/00 and 8/14/09 [96 shows]
Bike was not seen between 9/12/00 and 6/18/09 [95 shows]
Esther was not seen between 9/30/00 and 8/1/09 [89 shows]
Walk Away was not seen between 10/5/00 and 6/18/09 [80 shows]
Bold As Love was not seen between 10/6/00 and 6/9/09 [74 shows]

love the viz and stats.  Ideas:  How about stats on what tunes were only played once (i.e. sand) and stats on tunes with the longest span of time since last played?

@Harry, my pleasure. Glad you’re liking it.

@Doug, will certainly try to infuse your recommendations which sound awesome—but time consuming :)

One quick fix I’m going to roll out tonight is to split the graph in exhibit 1 into two chunks, one for each leg.

Would also love to add total show duration and avg song duration to exhibit 2 but I don’t have total show time data and it’ll take forever to add up the individual track times that are posted on LivePhish. So if anyone has total durations for the show (David????) please drop a line.

TL

Hey, I love the post. I’m a bit of a data junkie myself (big fan of Tufte) so it’s right up my alley. It would be really cool if the sparklines could account for the length of the song as well, represented by the length of the bar. Or the shape of the show, if the length of the bars represented the length of each song in the second group too.

Three of my favorite things in the world combined on one webpage: Phish, analytics and visualization. Thank you kind sir, thank you.

Never would I have thought that Possum would be most played song this tour. I would have guessed BDTNL or Chalkdust.

