Episode 182

Published on: 21st Oct 2025

182: This Data Analyst Has Analyzed 1M+ Songs (here’s everything he knows)

Help us become the #1 Data Podcast by leaving a rating & review! We are 67 reviews away!

Data meets music 🎶: Avery sits down with Chris Dalla Riva, a data analyst who’s studied over 1 million songs, to reveal what the numbers say about how hits are made. From uncovering Billboard chart fraud to exploring how TikTok reshaped music, this episode breaks down the art and science behind every beat.

💌 Join 10k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://www.datacareerjumpstart.com/newsletter

🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://www.datacareerjumpstart.com/training

👩‍💻 Want to land a data job in less than 90 days? 👉 https://www.datacareerjumpstart.com/daa

👔 Ace The Interview with Confidence 👉 https://www.datacareerjumpstart.com//interviewsimulator

⌚ TIMESTAMPS

00:00 - Intro: How Chris analyzed 1M+ songs using data

01:10 - What data reveals about hit songs and music trends

03:30 - Combining qualitative and quantitative analysis

07:00 - The 1970s Billboard chart fraud explained

10:45 - Why key changes disappeared from modern pop

13:30 - How hip-hop changed song structure and sound

14:10 - TikTok’s influence on the music industry

16:10 - Inside Chris’s open-source music dataset

22:10 - Best tools for music data analysis (SQL, Python, Datawrapper)

27:45 - Advice for aspiring music data analysts

🔗 CONNECT WITH CHRIS

📕 Order Chris's Book: https://www.bloomsbury.com/us/uncharted-territory-9798765149911

📊 Check out Chris's Music Dataset: https://docs.google.com/spreadsheets/d/1j1AUgtMnjpFTz54UdXgCKZ1i4bNxFjf01ImJ-BqBEt0/edit?gid=1974823090#gid=1974823090

💌 Subscribe to Chris's Newsletter: https://www.cantgetmuchhigher.com

📲 Follow Chris on TikTok: https://www.tiktok.com/@cdallarivamusic

🔗 CONNECT WITH AVERY

🎥 YouTube Channel

🤝 LinkedIn

📸 Instagram

🎵 TikTok

💻 Website

Mentioned in this episode:

✨ Try Julius!

This episode is brought to you by Julius – your AI data analyst companion. Connect to your database and/or business tools, pull insights in minutes–no coding required. Thanks, Julius, for sponsoring this episode. Try Julius at https://landadatajob.com/Julius-DCP

Transcript
Have you ever wondered how to do data analytics with music? Well, today you'll learn how. My guest today is Chris Dalla Riva, who is a music analytics savant and genius, and he'll share everything that he knows. He's a data analyst for an audio streamer, and he's literally writing the book on music analytics. In this episode he'll tell us how to get into music analytics: how to get music data, how to analyze it, and what he's learned from years of analyzing music. Let's go ahead and get into it.

Chris, you have analyzed data on over a hundred thousand, if not a million, different songs over your career. And specifically for your new book, which is coming out soon, Uncharted Territory (I will link it in the show notes down below for everyone to check out), you analyzed over a thousand number one hit songs. Basically every number one hit song since [...]. Which is crazy. That's a lot of songs to listen to and to look at the data for. In your study, what do you feel like you learned looking at those number one hit songs?

I feel like this will be a little bit of a cheesy takeaway, and it's not even specifically related to analytics, but it's that you really should keep an open mind when you're listening to music. With things that seem strange to you on the surface, often when you dig a little bit deeper, there are rich musical communities that are making all different styles of music. So if you claim not to like a single song in a specific genre, you probably just haven't listened to enough of it yet. So I like to tell people that listening to all those songs, analyzing all this music, has taught me to keep an open mind about how music works and what it means to be a good song. But also, you want to keep an open mind when you're analyzing music or analyzing data, because if you go in with preconceived notions, you're liable to miss out on some of the most interesting conclusions.

That's super interesting. It's interesting as well because, for this study, we should mention that you were a data analyst. You're now a senior product manager of data and personalization at Audiomack. So you know data, you know data analytics, and you did this study, in my opinion, from two different perspectives. One, you have more of a qualitative perspective, where you're actually listening to all 1,000 number one hit songs and thinking about them, like, "Oh, that's interesting how they did this here," or, "I didn't really expect that here." But you're also doing it from a quantitative standpoint, where you actually have the data. I think I counted over a hundred columns, 105 columns, worth of data on each one of these songs. And you also have that database online. We'll have a link to it in the show notes down below. So you're doing a qualitative study while doing a quantitative study, where you're looking for what your ears are hearing, what your brain's thinking, and what your heart's feeling, but you're also asking, "Hey, what numbers can I actually attribute back to those physical sensations I'm having?" Can you talk about how you did this study through those two different methods?

Ultimately, there's a quote about music that I think I put at the beginning of the book: "Writing about music is like dancing about architecture." It's sort of a funny quote. Music is experienced by listening to it. You could write as many words about it as you want, but ultimately music is a very human, subjective thing that we all experience. So data can make you dispassionate about certain things and give you a better view of what's actually going on. But ultimately we're talking about popular songs here, so we need to experience them with our ears. Typically what would happen is I would be going along listening and I'd be like, "Oh, I feel like I'm noticing some strange trend. I feel like in the 1970s there are a lot of songs that have the words dance and shake and boogie in the title." Now, if you know about music of that era, you'll be like, "Oh, that's not surprising. That's when disco and dance music became popular." But the nice thing about data is I can have this intuition and I can go check. I could just scan all the song titles, put together a chart, and be like, "Oh, there was actually a sharp rise in music using a specific type of language in this era."
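The title check described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Chris's actual method: the `hits` rows are a handful of hand-typed examples rather than the real dataset, and the `DANCE_WORDS` pattern and the `dance_share_by_decade` helper are hypothetical names chosen for this sketch.

```python
import re
from collections import defaultdict

# Hand-typed example rows (title, year at number one); NOT the real dataset.
hits = [
    ("Shake Your Booty", 1976),
    ("Boogie Fever", 1976),
    ("You Should Be Dancing", 1976),
    ("Hotel California", 1977),
    ("Hey Jude", 1968),
    ("Billie Jean", 1983),
]

# Assumed word list: matches dance/dancing, shake/shaking, boogie at a word start.
DANCE_WORDS = re.compile(r"\b(danc|shak|boogie)", re.IGNORECASE)

def dance_share_by_decade(rows):
    """Return {decade: share of titles containing a dance-related word}."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for title, year in rows:
        decade = (year // 10) * 10
        totals[decade] += 1
        if DANCE_WORDS.search(title):
            matches[decade] += 1
    return {decade: matches[decade] / totals[decade] for decade in sorted(totals)}

shares = dance_share_by_decade(hits)
for decade, share in shares.items():
    print(f"{decade}s: {share:.0%} of titles mention dancing")
```

Plotting the per-decade shares (in a spreadsheet or any charting tool) is what turns the gut feeling into something checkable.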

And that's what I do throughout the entire book. As I'm listening to these songs, my gut is telling me something, or there is something I've heard about before, and I'm like, "All right, let's go check if this is actually the case." So the subjective and the objective, the personal and the quantitative, the qualitative and the quantitative are really wrapped together, I think, when you're talking about data and art specifically. But I think that's true of data and anything. You need to have a feel for something. You can't just analyze your way directly out of it without any feel for what the data actually represents.

I think some people don't necessarily know that, and they think that everything's data-driven. And I guess it's kind of good if everything is data-driven. But from my experience working in industry, data is usually used as a guide and as a suggestion more than as absolute fact. For instance, when I worked for ExxonMobil, one of the things that I did was predict gas demand at every ExxonMobil gas station in America, or I would run a bunch of simulations to try to figure out the best crude oil that we should buy based on prices right now. Is it Russian, or is it from Saudi Arabia? So on and so forth. And the numbers that I had were never the numbers that actually happened in real life. Even if they were optimal, or that was the best prediction, it would always go to someone a little bit closer to the business than I am, who actually understands, I don't know, gas demand more than me as a data scientist. I'm a chemical engineer, so I understood gas and I understood some of that, but these people have been in the industry for so many years, and those are the people who are actually making the decisions. They're using my numbers as a guide, but they're still going with a lot of their gained experience and their heart. So it's interesting to hear that from your perspective as well, because you grew up loving music. You've loved music for a long time, right? You played music as well.

Yeah. That's my first musical love. Of course, at first I was listening to music as a kid and was always really into bands and learning about who played what instrument, X, Y, and Z. But since I was in middle school, I've played instruments. I've always enjoyed writing and recording songs and playing in a variety of bands. So my interest in writing about music first comes from just being a fan of making music and listening to music, and it was years later that I discovered that you could apply quantitative skills to something like this. But I totally agree with you. You want to have a feel for what your data is actually being used for. Because ultimately, someone told me once, a model is a map. It's like a map of reality. We're never gonna be able to use data to perfectly model everything in the world. Well, maybe one day, I don't know, but the world's really complicated. So it's always good to have a feel for what's going on too, to trust your gut a little bit, because sometimes the data might be telling you something and you're just like, "That just cannot be right." And lo and behold, often you go and look a little bit more closely and you made a mistake somewhere or you misunderstood something.

Okay. That's really interesting. I want to chime in on one specific gut feeling that you may have had, that you found out about in the book. One of the things that you did in the book was you basically identified a period where there was fraud going on in the Billboard top 100. Now, I'm not familiar with the music industry, especially when this occurred, which is like the 1970s, right? I was not alive then, and I'm not a music buff. So is it well known that this fraud occurred in the [...]? And for those who don't know about this fraud, can you explain what happened and then how you were able to use data to figure it all out and back it all up?

Yeah, I mean, historically the music industry was known as a place of shady characters. People involved with organized crime ran labels. In the late fifties, all these radio DJs were hauled in front of Congress for what became known as the payola scandal, where basically they were getting monetary kickbacks to play certain songs on the radio. Congress was like, "You can't do that. If you're gonna be paid, you have to announce that you are being paid to play certain songs on the radio." This stuff didn't go away. It just morphed into different behaviors. So there was always talk about how in the seventies and the eighties there were still some shady behaviors going on, especially in the world of radio. There's a great book called Hit Men by Fredric Dannen where he outlines some of this stuff. So I was aware of some of this in the back of my head. But I was looking at this Billboard number one hit data, and I noticed something strange: during the [...] your average number one hit stays at number one for about three weeks, and then in the middle of the seventies it just takes a complete nosedive and gets very, very close to one week, which it can't go lower than. It can't be at number one for zero weeks if it's a number one hit. And then by the eighties it sort of climbs back up to around the two- or three-week mark. And I was like, "Oh, that's sort of a strange decline." And again, that felt odd.
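The "average weeks at number one" trend is a simple group-and-average. Below is a minimal sketch, assuming each row carries a year and a weeks-at-number-one count; the rows are made-up values for illustration only, and `avg_weeks_at_top` is a hypothetical helper name, not something from the dataset or the book.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows (song id, year, weeks at number one), for illustration only.
rows = [
    ("a", 1965, 3), ("b", 1965, 2),
    ("c", 1975, 1), ("d", 1975, 1), ("e", 1975, 2),
    ("f", 1985, 2), ("g", 1985, 3),
]

def avg_weeks_at_top(rows):
    """Return {year: mean weeks at number one for that year's hits}."""
    by_year = defaultdict(list)
    for _song, year, weeks in rows:
        by_year[year].append(weeks)
    return {year: mean(weeks) for year, weeks in sorted(by_year.items())}

averages = avg_weeks_at_top(rows)
for year, avg in averages.items():
    print(year, round(avg, 2))
```

Because a number one hit spends at least one week at the top, the average has a floor of 1, which is why a value hovering just above 1 stands out.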

So I started looking into some other data, and I noticed that when songs would lose their number one slot on the charts during that same era, they would fall, say, five or six positions, whereas in all other eras they would only fall one or two positions. So instead of going from number one to number three, now they're going from number one to number six when they were losing their top slot.
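The size of the fall after losing the top slot can be measured from a song's week-by-week chart positions. This sketch assumes such weekly positions are available (for example from scraped Hot 100 history); `drop_after_losing_top` is a hypothetical helper, and measuring only the first time a song leaves number one is a design choice made here, not necessarily the one used in the book.

```python
def drop_after_losing_top(positions):
    """Given weekly chart ranks in chronological order, return how many
    places the song fell the first time it left number one, or None if it
    never held number one or the data ends while it is still there."""
    for this_week, next_week in zip(positions, positions[1:]):
        if this_week == 1 and next_week != 1:
            return next_week - 1
    return None

# Made-up chart runs, for illustration only.
print(drop_after_losing_top([5, 2, 1, 1, 3, 8]))  # fell from 1 to 3
print(drop_after_losing_top([4, 1, 6, 12]))       # fell from 1 to 6
```

Averaging this value per year would show whether mid-seventies hits really dropped further than hits in other eras.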

So this is weird. I mean, we're seeing anomalous data. Then I need a smoking gun to write this story. I can't just say something weird is going on, because sometimes there are just oddities in your data. A lot of things change. And I find this character named Bill Wardlow, who was involved with the Billboard charts at the time. He was actually the director of the Billboard charts, and people over the years have said that if you were in his good graces, you could just be like, "Yeah, I need this record to be number one this week." And in exchange for something unspecified (most people say there was no exchange of money), he would get your record to where it had to be on the charts. During that era in the seventies, there are a lot of chart oddities. You have certain songs that people claim should have gone to number one but didn't, because it was not in this guy's best interest to make it happen. But again, figuring this out was a combination of, "Oh, I'm looking at the data. I see something weird. I'm aware of some sketchy behavior in this department before." And then you're like, "All right, let's see if there's actually any truth to this." And I was able to find that there was.

Super interesting. You're solving crimes, or maybe not crimes, but you're solving shadiness with data in the music industry. One of the other things that you found in the book was that the percentage of number one hits that had a key change has basically gone to zero in the last 10 years or so. Tell us what that's all about. First off, maybe explain what a key chain is. Or not key chain; a key chain is where your car keys go. A key change in a song: what that is, for those of us who maybe aren't as musical, and then explain why that's significant.

Very odd. I've written about this before, and it always gets quite the reaction online when you say there are fewer key changes. I'm of the impression that this is sort of a niche music theory topic, but it seems to get people really going, so they seem to have an intuitive grasp of it. A musical key is like a set of notes that a song is based around, and if you change the key, it means another part of the song is now based around a different set of notes. That sounds sort of imprecise and complicated, but when you hear some examples of it, it's obvious. In the seventies and eighties, it's most notable at the end of a song: you'll hear what seems like the song going up higher for the last chorus. Something like "Livin' on a Prayer" by Bon Jovi, or "I Wanna Dance with Somebody" by Whitney Houston, or, if you're familiar with the song "Love On Top" by Beyoncé, which is newer, from the 2010s, she ratchets the key up like four or five different times at the end of the song. From the sixties to, I don't know, around [...], something like 20% of songs have a key change. Not all of them are using that specific key change, which is sometimes called the gear shift key change, because it feels like you're shifting a car into a higher gear. And then it plummets down to zero, and it's still pretty close to zero these days. Key changes are not that common in popular songs.
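To make the idea of a key change concrete, here is a small sketch that flags one from a list of per-section keys. Everything in it is an assumption for illustration: the section keys are invented, only the tonic note is compared (major versus minor is ignored), and "gear shift" is defined here as the final section sitting one or two semitones above the section before it, which is a common informal definition rather than the one used in the dataset.

```python
# Map note names to pitch classes (semitones above C).
NOTE_TO_PITCH = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10,
    "B": 11,
}

def has_key_change(section_keys):
    """True if the song's sections use more than one tonic."""
    return len({NOTE_TO_PITCH[key] for key in section_keys}) > 1

def is_gear_shift(section_keys):
    """True if the last section is 1 or 2 semitones above the previous one."""
    if len(section_keys) < 2:
        return False
    previous = NOTE_TO_PITCH[section_keys[-2]]
    last = NOTE_TO_PITCH[section_keys[-1]]
    return (last - previous) % 12 in (1, 2)

# Invented example: verse and chorus in C, final chorus lifted to D.
song = ["C", "C", "C", "D"]
print(has_key_change(song), is_gear_shift(song))
```

With a true/false key-change flag per song, the share of number one hits with a key change per decade is the same kind of grouped average as any other column.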

I think the reason people find this observation interesting (and again, this was an observation at first; I just felt like this was happening, and then I went and measured it and it was the case) is that people associate key changes with some sort of musical complexity. So sometimes people try to extrapolate this to say, "Oh, popular music is becoming less complicated," and people get upset about this to some degree, thinking that there's no expertise or craft in our popular songs as there once was, when the Beatles were topping the charts or whatever. I don't think this is exactly the case. When you see this decline in key changes, it's mostly when hip hop becomes much more popular, and hip hop is a genre that's more based around rhythm and lyricism in general than it is melody and harmony. In hip hop songs, complexity is not really built around harmonic changes or key changes in the same way that it is in earlier forms of pop and rock music. So that's how I interpret it: it's really a change in what genres are popular more than in how skilled we are at crafting songs. But I know people online have interpreted this in other ways.

Super interesting. We should mention that you're on TikTok; you make TikTok videos. What effect do you feel like platforms like TikTok have had on the music industry? Has it changed how artists emerge? Has it changed what the top 100 charts look like?

A hundred percent. I mean, in a certain sense, TikTok is a continuation of a longer history connected first to social media. In the mid-2000s, MySpace was huge for making and breaking musical careers. But even if you go back further, to the eighties, MTV was huge in making and breaking musical careers. TikTok is sort of part of that trend, but it's different in terms of how things go viral on TikTok. I'm sure most people listening to this have used TikTok to some degree. It's short-form video, and music is heavily integrated into the platform. Often the way songs go viral on TikTok is that a song becomes associated with a particular trend. In many cases that means people are dancing to a song, but trends are associated with songs in many, many different ways. And finding a song in the 2020s that got to the top of the charts and was not popular on TikTok is basically impossible, in the same way that you were almost never gonna find a number one hit in the [...] that did not have a popular music video. So TikTok is where hits really pop off these days. It's not the only place, but it's fundamentally changed how artists interact with their fans and how people interact with music.

One other interesting point about this is, again, I like this comparison to the 1980s, because the music video is something else that changed how things became popular that wasn't completely musical. When you went and watched Madonna's video for "Like a Virgin," for example, you were watching something that (a) had Madonna in it, and (b) Madonna was clearly involved in making. On TikTok it's different. You could upload a song, and then suddenly some random kid in Ohio dances to it in their basement, and suddenly that song becomes popular, and it doesn't really have anything to do with you. We've seen this a bunch of times throughout the [...]: artists almost have less control over their work, in a sense, because anyone could upload stuff on the internet. Anyone could make posts. It's much more interactive for fans than it was decades ago, which is very distinct from earlier eras.

Very interesting. Yeah, TikTok has played a role in what music I listen to as well. I want to get a little bit into the weeds here. I'm actually gonna share my screen for those of you who are watching on YouTube, because I wanna talk about this data set. This is the data set that you used for the analysis in your book, and you have it online for anyone to use. So I encourage anyone who's interested in music data to take a look at it, because one of the things that I think is a little bit difficult is that when you think of music, you don't necessarily think of numbers. You basically have this list of the song, the artist, the date. That all makes sense. But then the other categories that you have here, or the other columns, I guess we should say, are ratings, weeks at number one, how many weeks in a row it was at number one. I don't know this word. What's this word? Divisiveness?

Divisiveness. A scale.

So, I mean, some of this was calculated by me. How this project started was a friend and I would listen to every song and we would just rate them out of 10. I've anonymized our ratings there, but there were three people who would rate songs. For the overall rating, I just take the average of the three, and the divisiveness was basically me trying to figure out a way to measure which songs had the biggest spread in their ratings. So if I said a song was a one out of 10 and my buddy said it was an eight out of 10, I would be like, "Oh, this song is divisive, because we couldn't agree on how good it was."
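The transcript does not say exactly how divisiveness was computed, so the sketch below shows two common ways to measure spread among three raters: the range (highest minus lowest) and the population standard deviation. The ratings are invented, and `summarize_ratings` is a hypothetical helper name.

```python
from statistics import mean, pstdev

def summarize_ratings(ratings):
    """Return the overall rating plus two candidate 'divisiveness' measures."""
    return {
        "overall": mean(ratings),               # average of the raters
        "range": max(ratings) - min(ratings),   # widest disagreement
        "stdev": pstdev(ratings),               # spread around the average
    }

# Invented ratings out of 10 from three raters.
divisive = summarize_ratings([1, 8, 5])
agreed = summarize_ratings([6, 6, 6])
print(divisive)
print(agreed)
```

The range reacts only to the two most extreme raters, while the standard deviation uses all three; with so few raters either is a reasonable choice.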

That's how this actually started, funny enough, and then it expanded. So that's very subjective data. But most of the other stuff is objective measures, or factual pieces of information that I tacked on.

Interesting. So you have what label they're with, their parent label, the different genres and styles, whether it features an artist or multiple artists, which I think is really interesting. Place of origin, age, male, white; you have race and age in there. The songwriters, and I know in your book you do some analysis on what effect having one songwriter versus four or five songwriters would have, and whether it's male or female. So there's lots of really good data that you have. Like I said, 105 different columns. Some of it is like what we talked about earlier: what key it's in, a simplified key, some of the energy and these different vibes around the song. I don't know.

Yeah, those come from Spotify.

Okay. The Spotify API, right?

Yep. Yeah, Spotify keeps track of those.

So they're looking at some of the actual data from the sound wave of the song, basically. Whether there's bongos or the banjo in there, that's really cool. So you could do some pretty cool analysis, which you obviously have in the book, but for anyone listening, I think there's almost unlimited analysis that you could do. Like, hey, how many black female artists had a number one hit with the flute slash the piccolo? That's an interesting question that you could answer.

You could answer that.
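A question like that one is a plain row filter once the sheet is loaded. The sketch below uses invented rows and hypothetical column names (`artist_race`, `artist_gender`, `flute_piccolo`); the real column names are listed in the dataset's data dictionary tab and will differ.

```python
# Invented rows with hypothetical column names; check the data dictionary
# tab of the real spreadsheet for the actual ones.
songs = [
    {"song": "Song A", "artist_race": "Black", "artist_gender": "Female", "flute_piccolo": True},
    {"song": "Song B", "artist_race": "Black", "artist_gender": "Female", "flute_piccolo": False},
    {"song": "Song C", "artist_race": "White", "artist_gender": "Male", "flute_piccolo": True},
    {"song": "Song D", "artist_race": "Black", "artist_gender": "Female", "flute_piccolo": True},
]

def filter_songs(rows, **conditions):
    """Keep rows where every named column equals the requested value."""
    return [
        row for row in rows
        if all(row.get(column) == value for column, value in conditions.items())
    ]

matches = filter_songs(
    songs, artist_race="Black", artist_gender="Female", flute_piccolo=True
)
print(len(matches), [row["song"] for row in matches])
```

The same filter could be written as a SQL `WHERE` clause or a spreadsheet filter; the point is that each extra column is one more condition.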

And that was sort of my goal. What I've always appreciated about the data community that I've at least interacted with online is that everyone's pretty open to sharing data sources or making things open source. So I knew when I had this huge data set that I wanted to make it available, and hopefully someone else could use it, so that I didn't just waste years and years building this just for myself. You'll see in one of the other tabs I have a data dictionary, so I try to describe what all the columns are. But yeah, I'm hoping people use it. That's sort of the goal there.

Yeah, I could see some really interesting things, like talent show contestant. There was a time when American Idol mattered. And now, at least the latest pop star who was on American Idol (I don't know if he fits your criteria of pop star) dropped out extremely early: Benson Boone. I think the other thing that's really interesting, that isn't even necessarily tabular, that people could use if they really wanted to, is that you have the lyrics. And the lyrics are like a whole other data set, because they're super unstructured. So you could do some really cool NLP stuff, where you're actually analyzing the words in each song at a really high level.
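As a first step toward that kind of lyric analysis, a word-frequency count is about as simple as NLP gets. This sketch uses an invented lyric line (not a real song), a tiny hand-picked stop-word list, and a hypothetical `top_words` helper; real work would use a fuller stop-word list and proper tokenization.

```python
import re
from collections import Counter

# Tiny, hand-picked stop-word list for illustration only.
STOP_WORDS = {"the", "a", "and", "to", "of", "in", "on", "i", "you", "me", "my", "it"}

def top_words(lyrics, n=3):
    """Return the n most common non-stop-words in a lyric string."""
    words = re.findall(r"[a-z']+", lyrics.lower())
    return Counter(word for word in words if word not in STOP_WORDS).most_common(n)

# Invented lyric line, not from any real song.
sample_lyrics = "Dance with me tonight, dance all night, dance"
print(top_words(sample_lyrics, n=1))
```

Running the same count per decade would be one way to compare vocabulary across eras, in the spirit of the song-title check earlier in the conversation.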

So this is an awesome data set, and thank you for making it open source for everyone. I'm curious because, once again, you work for Audiomack, which is a music streaming platform. You're writing this book about the history of music and using data to analyze it. Do you ever get sick of your job and your hobby being data plus music, or is it something that you could do all the time?

I haven't burnt out yet. I know I've had a couple friends that are worried that I will flame out with this, because I've been so invested in it for so long. But I turned 30 a couple months ago, and I still love it. I still love all the musical stuff so far. I try to do some other things occasionally to give myself a little bit of a musical reprieve, but right now I'm still having fun with it. I think it's easier, to a degree, when you're deep in the weeds of a data set, if you're at least interested in it.

We were talking before we hit record about how the process of writing a book is just extremely difficult, which I have heard; I have not done it, but I've heard. And so I think it makes sense that you'd want to write a book, or do anything hard, about something you're passionate about, because when those hard moments do come, you're like, "At least it's kind of fun to do all of this." So I'm glad you're still enjoying it. We talked about the data set just now. I'm curious: we mentioned the Spotify API, or that Spotify has some data available. If you need to go find a music data set, what are some of the resources or methods you use to find that data?

439

:

Yeah, I, I mean there are, I know certain

people who write stuff with data is,

440

:

they'll basically start with a data set.

441

:

And be like, oh, I found

this great data resource.

442

:

I'm gonna write something about it.

443

:

And I've done that before.

444

:

What I typically do is I usually have

a question and then I'm like, all

445

:

right, I have to go find data for this.

446

:

Which has its, I mean, there's

been some stuff I can't write about

447

:

'cause I just don't, I'm not able

to locate data, even if I think the

448

:

question is interesting, some of them.

449

:

Valuable musical resources that are out

there that are generally easy to use.

450

:

The Spotify API is great.

451

:

Uh, they've locked some

of it down recently.

452

:

I think they're trying to prevent

other people from using it to

453

:

train LLMs, but, uh, you can still

access a lot of great data there.

454

:

Um, Wikipedia is a great data source.

455

:

Uh, Wikipedia has an API that it's

a little bit janky, but there's

456

:

a lot of great stuff on there.

457

:

And you can access basically

any Wikipedia page and Wikipedia

458

:

has lots of great lists.

459

:

So they have, like, a list of every rock artist

or list of every:

460

:

You can pull all those down and access all the

pages and see certain things about those.

461

:

For example, something I did with that

was I tried to do something about nepotism,

462

:

because if your parent is popular, usually

they'll be listed on your Wikipedia page,

463

:

linking back to their own page.

464

:

So I was like, all right, let's look

at every pop star, defined however you

465

:

want, and see how many of them have

parents who are also famous or famous

466

:

enough to have a Wikipedia page.

467

:

So there's some fun stuff

you could do with Wikipedia.
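
The Wikipedia idea described here can be sketched roughly as follows. This is only an illustration, not the code Chris actually used. It assumes the public MediaWiki Action API at https://en.wikipedia.org/w/api.php with its `action=query` and `prop=links` parameters. The helper names (`build_links_url`, `extract_link_titles`, `linked_artists`) and the sample response are made up for the example, and checking whether a page links to another known artist is only a crude stand-in for "has a famous parent."

```python
from urllib.parse import urlencode

# Public MediaWiki Action API endpoint for English Wikipedia.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_links_url(title):
    """Build a URL that asks for the article links found on one page."""
    params = {
        "action": "query",
        "prop": "links",
        "titles": title,
        "plnamespace": 0,   # only links that point at regular articles
        "pllimit": "max",   # as many links as one request allows
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def extract_link_titles(response):
    """Collect linked page titles from a parsed JSON response."""
    pages = response.get("query", {}).get("pages", {})
    titles = set()
    for page in pages.values():
        for link in page.get("links", []):
            titles.add(link["title"])
    return titles


def linked_artists(response, known_artists):
    """Return the known artists that this page links to (hypothetical helper)."""
    return extract_link_titles(response) & set(known_artists)


# Made-up response in the default JSON shape, used here instead of a live call.
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "123": {
                "pageid": 123,
                "ns": 0,
                "title": "Example Singer",
                "links": [
                    {"ns": 0, "title": "Example Parent"},
                    {"ns": 0, "title": "Example City"},
                ],
            }
        }
    }
}

if __name__ == "__main__":
    print(build_links_url("Example Singer"))
    print(linked_artists(SAMPLE_RESPONSE, ["Example Parent", "Someone Else"]))
```

In practice you would fetch the URL, parse the JSON, and keep requesting while the response includes a `continue` block, since one request may not return every link on a long page. The sketch leaves that part out.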

468

:

MusicBrainz is another big one.

469

:

This is a huge, huge open source

music project that has more data than

470

:

you could ever possibly dream of.

471

:

Billboard

472

:

is not open source, but people scrape

the Billboard charts, and

473

:

if you just search "Billboard

chart data," you can find the entire

474

:

history of the Hot 100. And Kaggle

has a lot of data sets on there.

475

:

Occasionally there are some music

data sets that are useful.

476

:

All that

stuff I'm talking about is free.

477

:

There are some paid resources

478

:

you could use. Chartmetric is a big

one for the music industry, which is

479

:

pretty cheap, relatively speaking.

480

:

They scrape, and they have data on basically

every song from basically every platform.

481

:

You know, you could see how many

radio stations in Kenya have

482

:

added a certain song or something.

483

:

It's pretty overwhelming.

484

:

Great resource.

485

:

You'd have to pay for it.

486

:

And then Luminate,

487

:

which is much more expensive, to the

point where if you're a single person,

488

:

you're probably not paying for it.

489

:

Luminate is the company that

powers the Billboard charts.

490

:

So there's lots

of music data out there.

491

:

And if you're looking for

something and you think someone might

492

:

have it, don't be afraid to email 'em.

493

:

I've

494

:

come upon a couple data sets in

my life just because I sent a cold

495

:

email to somebody and they were like,

oh, yeah, I know how to get that.

496

:

I'll let you know.

497

:

Gimme a call.

498

:

So you gotta poke around.

499

:

But it's out there.

The power of networking.

500

:

That's super cool to hear.

501

:

Obviously, you

work with data at work.

502

:

For this book,

503

:

you crunched a bunch of

numbers. For your newsletter,

504

:

you crunched a bunch of

numbers. For the book,

505

:

you created a decent amount

of charts and graphs.

506

:

I'm curious,

507

:

what are your go-to data tools if you're

going to be analyzing data and what did

508

:

you use to make the charts in the book?

509

:

If I'm at work, I'm mostly in the

SQL, Python, pandas land, and Excel.

510

:

I don't know, you

can't really avoid Excel.

511

:

It does a lot of great stuff.

512

:

So I still rely on Excel, but

513

:

during the day,

during my workday, I am writing a

514

:

lot of SQL queries, and we use

an open source visualization

515

:

tool called Superset, which I think

was developed by people at Airbnb,

516

:

and they open-sourced

it. For my personal projects,

517

:

the book and the newsletter,

518

:

I'm mostly using pandas and

Excel. For my newsletter,

519

:

I do all the visualization through

Datawrapper, which has great

520

:

free data visualization tools.

521

:

Occasionally I'll use Canva for stuff,

but not so much. The book, though:

522

:

I worked with a graphic designer;

her name is Kaylee Nerney.

523

:

She's one of my good friends.

524

:

I basically gave her all the data.

525

:

We talked through it, and she

built all of those charts using,

526

:

I believe, Adobe

Illustrator. The ones in the

527

:

book are much more custom built.

528

:

And I think if you wanna do stuff

like that, you probably have to

529

:

work with someone who is not as

graphically challenged as I am.

530

:

But for simple stuff in the

newsletter, you know,

531

:

Datawrapper's a great tool.

532

:

Even the

533

:

visualization tools in

Excel are good enough

534

:

if you're just sending out a newsletter

or posting something online.

535

:

Very cool.

536

:

I honestly have never heard of

Datawrapper before, but I really do like

537

:

the charts on your newsletter,

so I'll have to check that out.

538

:

And the other one you

mentioned was Superset, right?

539

:

Is that right?

540

:

Yeah, Superset.

541

:

We use that at work.

542

:

So it plugs into

our data warehouse, and we can

543

:

set up visualizations, charts,

and graphs for people who aren't as

544

:

data savvy in our organization,

though I think your development

545

:

skills might have to be a little more

sophisticated to get that all set up.

546

:

I didn't do the setup

there, but it was a tool my

547

:

coworker was aware of and we've

had good success with it so far.

548

:

Very cool.

549

:

What advice would you give

someone that maybe is listening to

550

:

this and really enjoys music and

wants to, you know, become a data

551

:

analyst in the music industry?

552

:

Yeah.

553

:

I mean, the cool thing about music and

entertainment generally in

554

:

the internet age is that every one of

these companies is hiring data analysts.

555

:

Whether you're working in live

entertainment, music streaming,

556

:

publishing, even the big

labels, I mean, they're hiring

557

:

people to crunch numbers because

there's so much data

558

:

out there around music these days.

559

:

So it's a good industry to work in.

560

:

I always say

the most employable skill that

561

:

I've learned in my day-to-day is

562

:

how to write SQL queries, because

every role that I've come across

563

:

always involves SQL in some way.

564

:

I mean, there's

565

:

tons of other data programs

that you can use, but a lot of

566

:

things seem to fall back onto SQL.

567

:

But if you're interested

in music and music data,

568

:

most of the things that I've been

able to do so far in my career have

569

:

just been because I am constantly

shouting into the void on the internet.

570

:

And occasionally someone is

listening and reaches out to me,

571

:

or I send a cold email to someone.

572

:

And that's how most opportunities

that I have had came about. Especially

573

:

writing this book: it would've

never happened if I didn't start

574

:

a newsletter or post on TikTok.

575

:

I don't think, you know, people

always say good enough is what is it?

576

:

Uh.

577

:

I'm gonna get it wrong.

578

:

Even if something's not perfect, you

should not be scared to put it out online.

579

:

Like you don't have to have

the perfect data visualization.

580

:

When I started the newsletter, it was

all just, it was just Excel graphics.

581

:

If what you're

saying is compelling, people will follow.

582

:

But it's nice to have pretty visuals too.

583

:

So pretty visuals help

make things go viral sometimes.

584

:

But good ideas and good

concepts can go a long way.

585

:

And I love what you said, that most of

the time, like 90% of the time, you were just

586

:

talking into the void and no

one really cared what you

587

:

were saying, until they did.

588

:

And then all of a sudden

that makes the difference.

589

:

And it can lead to career opportunities,

it can lead to, you know, book

590

:

opportunities in, in your case.

591

:

And I agree, there's, there's so much

good that can come from posting on

592

:

social media and just talking about

what you're working on, talking

593

:

about what you are interested in.

594

:

That's awesome.

595

:

Okay, Chris, super excited

for your book to come out.

596

:

It's called Uncharted Territory.

597

:

Uh, when does it come out?

598

:

It comes out November 13th,

:

599

:

You can find it basically

anywhere it's available.

600

:

It'll be available online through every

major bookseller, Amazon, Barnes and

601

:

Noble, Walmart, all that good stuff.

602

:

I think it's cheapest

through the publisher.

603

:

So that's what I've been

linking to for people.

604

:

But you should be able to get it anywhere.

605

:

And if you can't, if you reach out

to me, I will make sure I get

606

:

a copy of that book in your hands.

607

:

Sweet.

608

:

That's awesome.

609

:

So we'll have a link to it

in the show notes down below.

610

:

If you're listening to

this beforehand, then you can

611

:

pre-order it, or if you're

listening to it after it comes out,

612

:

you'll be able to check it out.

613

:

I haven't read every single page,

but I've read a good chunk of it.

614

:

And I found it pretty interesting.

615

:

So Chris, thanks so much

for coming on the podcast and

616

:

talking about music and data.

617

:

Yeah, thanks for having me.

618

:

I'm always down to chop

it up about music and data.


About the Podcast

Data Career Podcast: Helping You Land a Data Analyst Job FAST
The Data Career Podcast: helping you break into data analytics, build your data career, and develop a personal brand

About your host


Avery Smith

Avery Smith is the host of The Data Career Podcast & founder of Data Career Jumpstart, an online platform dedicated to helping individuals transition into and advance within the data analytics field. After studying chemical engineering in college, Avery pivoted his career into data, and later earned a Master's in Data Analytics from Georgia Tech. He's worked as a data analyst, data engineer, and data scientist for companies like Vaporsens, ExxonMobil, Harley-Davidson, MIT, and the Utah Jazz. Avery lives in the mountains of Utah where he enjoys running, skiing, & hiking with his wife, dog, and newborn baby.