What’s in your iTunes?

For as long as I can recall, I have been using iTunes as my primary media player and library. I still remember downloading iTunes to my old-school Acer laptop and trying to sync it with my 1st gen iPod. Yes, I am referring to that little white/silver brick from long before the days of touch screens. Since then, the same library has followed me through different computers and devices.

Over the years, I have acquired a lot of music, and I mean it. According to my status bar, I have approximately 38.81GB of music, videos, and podcasts, which is equivalent to 18.2 days of playtime. Before music streaming became the norm, I mostly acquired my music through iTunes Store purchases or by importing CDs. So honestly, I did not keep track of what I had collected. For all I know, you can probably still find me rocking out to I Try from 1999 (Macy Gray).

So I decided to pull together a short Python program to do some descriptive analysis on my library. While I mostly just stream music from Apple Music or Spotify now, as a self-proclaimed music connoisseur, I thought I should know what music I have on my computer. In addition, this will be a good place to start when I try to free up space on my computer down the road.

So here are a few things that I found out:

  • I have a total of 6,851 songs, podcasts, and movies on my computer


  • There are a total of 1,492 artists in my collection, and here are the top 5:
    1. NPR 298
    2. Various Artists 153
    3. De Cosmo 98
    4. Barbra Streisand 90
    5. Nujabes 84

I had no idea where those 298 tracks by NPR came from at first. So I did a little bit of investigative work, and I found out that I had subscribed to the Tiny Desk Concerts podcast series a long time ago. It has been automatically downloading all of its content to my computer ever since. Also, I didn’t know I was such a Streisand fan.
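
For anyone curious how a count like this can be produced, here is a minimal sketch. It is not the script linked at the end of this post. It assumes an XML copy of the iTunes library, which is a property list with a "Tracks" dictionary whose entries carry an "Artist" field. The function names `load_tracks` and `top_artists` and the sample data are made up for illustration.

```python
import plistlib
from collections import Counter


def load_tracks(xml_path):
    """Return the track dictionaries from an iTunes library XML file.

    Assumes the usual layout: a top-level "Tracks" dict keyed by track ID.
    """
    with open(xml_path, "rb") as fp:
        library = plistlib.load(fp)
    return list(library.get("Tracks", {}).values())


def top_artists(tracks, n=5):
    """Return (number of distinct artists, the n most common artists)."""
    # Tracks without an "Artist" field are lumped under "Unknown" (an assumption).
    counts = Counter(track.get("Artist", "Unknown") for track in tracks)
    return len(counts), counts.most_common(n)


# Made-up sample, only to show the shape of the data.
sample_tracks = [
    {"Name": "Concert 1", "Artist": "NPR"},
    {"Name": "Concert 2", "Artist": "NPR"},
    {"Name": "Concert 3", "Artist": "NPR"},
    {"Name": "Track A", "Artist": "Nujabes"},
    {"Name": "Track B", "Artist": "Nujabes"},
    {"Name": "Track C"},
]

distinct, ranking = top_artists(sample_tracks, n=2)
print(distinct, ranking)
```

With a real library, `top_artists(load_tracks(path_to_xml))` would give the artist total and the top 5 in one call.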


  •  There are a total of 100 genres in my collection, and here are the top 5:
    1. Soundtrack (21.02%)
    2. Classical (14.70%)
    3. Pop (12.17%)
    4. Jazz (10.99%)
    5. Podcast (4.35%)

Note that I combined everything from 5 onward into a single "Others" category for the pie chart. The genre results were pretty close to what I had predicted. I do listen to a lot of soundtracks and instrumental music.
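
The percentage breakdown and the "Others" grouping can be sketched roughly like this. This is an illustration under assumptions, not the exact code behind the chart: the helper name `genre_shares` is hypothetical, the sample genres are made up, and here the top `top_n` genres are kept separate while everything after them is lumped together.

```python
from collections import Counter


def genre_shares(genres, top_n=5):
    """Return (genre, percent) pairs for the top_n genres plus an "Others" row."""
    counts = Counter(genres)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    shares = [(genre, 100.0 * count / total) for genre, count in top]
    # Whatever is not in the top_n goes into a single bucket.
    rest = total - sum(count for _, count in top)
    if rest:
        shares.append(("Others", 100.0 * rest / total))
    return shares


# Made-up sample: 10 tracks across 4 genres.
sample_genres = ["Soundtrack"] * 4 + ["Classical"] * 3 + ["Pop"] * 2 + ["Jazz"]

for genre, percent in genre_shares(sample_genres, top_n=2):
    print(f"{genre} ({percent:.2f}%)")
```

The resulting labels and percentages could then be handed to a plotting library to draw the pie chart.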


  •  The average and median durations were 4:15 and 3:35, respectively. Durations ranged from as short as 5 seconds to as long as 103:17.

When I reviewed the 20 longest tracks on my list, I realized that most of them were lectures or podcasts that I had downloaded, such as The Fall of the Berlin Wall 18 years later: Lessons from East Central Europe. Most of my songs are in the 3 to 5 minute range, but I do have quite a few symphonic/classical tracks where each movement lasts about 30 minutes. I ended up focusing on media that were less than 20 minutes long, which accounted for about 99% of the files. Here’s the distribution graph.

[Figure: count of tracks by duration, for tracks under 20 minutes]
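
Here is a rough sketch of how the duration numbers above could be computed. It assumes each track's length is available in milliseconds (the iTunes library XML stores it that way under "Total Time"). The helper names `format_duration` and `duration_summary` and the sample values are hypothetical.

```python
from statistics import mean, median


def format_duration(milliseconds):
    """Format a length in milliseconds as m:ss, dropping fractions of a second."""
    total_seconds = int(milliseconds // 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


def duration_summary(durations_ms, cutoff_minutes=20):
    """Summarize track lengths and the share of tracks under the cutoff."""
    cutoff_ms = cutoff_minutes * 60 * 1000
    under_cutoff = [d for d in durations_ms if d < cutoff_ms]
    return {
        "mean": format_duration(mean(durations_ms)),
        "median": format_duration(median(durations_ms)),
        "shortest": format_duration(min(durations_ms)),
        "longest": format_duration(max(durations_ms)),
        "share_under_cutoff": len(under_cutoff) / len(durations_ms),
    }


# Made-up sample: a 5-second clip, two songs, and one long lecture.
sample_ms = [5_000, 215_000, 255_000, 6_197_000]
print(duration_summary(sample_ms))
```

The `under_cutoff` list is also what a histogram of tracks shorter than 20 minutes would be drawn from.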

What have I learned so far? A couple of things:

  1. I have acquired a lot of random stuff over the years, and I need to go back and decide whether I still need it. For example, all the tracks from De Cosmo came with a textbook on jazz improvisation. Without the actual textbook, the tracks are pretty much useless. Do I need to keep them?
  2. Be mindful of what your iTunes is automatically downloading for you. For example, my large collection of NPR Tiny Desk Concerts. Don’t get me wrong, I am listening to Son Little’s concert from Dec 18th, 2015, as I type this blog post. They are great, but I can probably just stream them online.
  3. I do have a lot of world music in my collection, and that did not go so well with Python. I will have to figure out how to deal with track titles, artists, and composers in foreign languages.
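
On the last point, one possible approach (a sketch, not something I have run against the full library) is to work in Python 3, where strings are Unicode by default, to normalize names before counting them, and to write any output as UTF-8. The helper names `normalize_text` and `write_tracks_csv` and the sample track are made up for illustration.

```python
import csv
import io
import unicodedata


def normalize_text(value):
    """Normalize to NFC so visually identical names compare equal, and trim spaces."""
    return unicodedata.normalize("NFC", value).strip()


def write_tracks_csv(tracks, fp):
    """Write name, artist, and composer for each track to an open text file."""
    fields = ("Name", "Artist", "Composer")
    writer = csv.writer(fp)
    writer.writerow(fields)
    for track in tracks:
        # Missing fields are written as empty strings (an assumption).
        writer.writerow([normalize_text(track.get(field, "")) for field in fields])


# Made-up sample with non-ASCII text in every field.
sample_tracks = [
    {"Name": "夜の歌", "Artist": "Cafe\u0301 Trio ", "Composer": "José"},
]

buffer = io.StringIO()
write_tracks_csv(sample_tracks, buffer)
print(buffer.getvalue())
```

When writing to a real file instead of the in-memory buffer, opening it with `open(path, "w", encoding="utf-8", newline="")` keeps the non-English titles intact.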

Give it a try! You can find the code I used HERE (p.s. this is the first time I have written a long piece of code, so be nice!). Feel free to let me know if you have any suggestions, or if there is any other analysis or data you would like to see. Enjoy!
