# [Question] What math do I need for data analysis?

I always had fairly good mathematical thinking (I think) and loved learning about beautiful concepts in math, but I didn’t learn much at all in school (because I had the choice). You could say I was “utilitarian” about learning math: I didn’t do it if I didn’t see how it could enrich my life.

So my knowledge of math is quite disorganized: I know more about Bayes’ theorem than about many much simpler concepts (I know, it really shouldn’t be that way).

Now I want to be able to analyze data, but I don’t want to learn math that I won’t use for it, if possible.

So here’s my question: what basic stuff do I need to learn in order to be able to calculate probabilities and statistics, do Bayesian math, and overall do things within data analysis that I may not yet be aware of?

If you also have suggestions for how to learn those things after I learn the basics, they will be much appreciated.

Thank you :)

• I personally found the Udacity course helpful, but I see that someone has done a comparison of all the online data science courses they could find here. Hopefully one of those might be what you’re looking for.

• Perhaps check out dataquest.io, which teaches the data scientist’s basic skillset.

• This is really cool! I might go for it. Though looking at the subjects there, I understand that I meant something much more basic by “data analysis” than what it actually means :)

• Check out John Hopcroft’s Foundations of Data Science

• Thanks. But already in the introduction (2.1) I got lost; it’s beyond the mathematical basis I have now. What do I need to learn in order to learn that? Or even just probability and statistics, as a start. It seems I didn’t know what I was asking for when I said “data analysis”.

• You want basic undergraduate probability and linear algebra, with some calculus on the side, and you should be able to get by with those. Some practice with reading academic texts also helps, so that you can extract useful meaning from them without understanding every part. You also need some general familiarity with how academic math papers are written. The concepts in 2.1 aren’t complex (high-dimensional space makes random points stick together in clumps less), but the way the book writes it is going to be unfamiliar if you haven’t been exposed to academic math writing much before.

Not sure what’s a good place to get that other than “go to university, minor in math”. Khan Academy?
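To make the point about 2.1 a bit more concrete, here is a minimal Python sketch (not taken from the book). It assumes points drawn uniformly from the unit cube, and the function name `distance_spread` and the sample sizes are made up for illustration. It compares the smallest and largest pairwise distance to the average one as the dimension grows; if random points clump less in high dimensions, both ratios should drift toward 1.

```python
import math
import random


def distance_spread(dim, n_points=100, seed=0):
    """Return (min, mean, max) of all pairwise distances between
    n_points random points in the dim-dimensional unit cube.

    Uniform sampling from the unit cube is an assumption of this sketch.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    # Distance between every unordered pair of distinct points.
    dists = [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]
    return min(dists), sum(dists) / len(dists), max(dists)


for dim in (2, 10, 100, 1000):
    lo, mean, hi = distance_spread(dim)
    # Ratios close to 1 mean all pairs are roughly equally far apart.
    print(f"dim={dim:4d}  min/mean={lo / mean:.2f}  max/mean={hi / mean:.2f}")
```

In low dimensions some pairs land almost on top of each other while others are far apart; in high dimensions nearly every pair ends up at about the same distance, so tight clumps become rare.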

• I think I was in your shoes last year. I *thought* I wanted to learn “data analysis”, took an online course, got in way over my head, and also realized that I probably didn’t really know what “data analysis” meant.

It sounds like, at the minimum, an intro to statistics course might be useful. I don’t think there’s much math in one, but learning ways of thinking about things, like what “probability” means, was really helpful for me as a foundation for learning other related stuff.

• Yup, definitely the shoes I’m in. Glad to find out beforehand that I might not want to take them for a walk ;)

Though I actually would like to learn the math (and the math needed for it before that), not just the thought process. Do you have any suggestions? Or even just know the prerequisites?

• I would say that logic is actually more important than math, though my knowledge of “data analysis” is very limited. Again, basic statistical knowledge and math are useful... things like what standard deviations, correlation, regression, etc. are and how to calculate them.

I’ve taken this class, and while it’s specific to Google Sheets, looking at the syllabus might give you some clues about what to study: https://courses.benlcollins.com/p/data-analysis-with-google-sheets

Also, non-math-related concepts like how to clean and organize data are very important, though I never even thought about it until I started learning about data analysis. After all, garbage in, garbage out.
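As a rough sketch of the two ideas in this comment, cleaning first and then computing the basic statistics, here is a small Python example. The data, the `clean` and `summarize` helpers, and the rule for dropping rows are all assumptions made up for illustration; real cleaning decisions depend on the data set.

```python
import math
import statistics

# Hypothetical raw records (hours studied, exam score), invented for this
# example. Some fields are blank or not numbers, as often happens in exports.
raw = [("2", "55"), ("4", "62"), ("", "70"), ("6", "71"), ("8", "n/a"), ("10", "88")]


def clean(rows):
    """Keep only rows where both fields parse as numbers.

    Dropping bad rows is one possible rule, assumed here for simplicity.
    """
    pairs = []
    for x, y in rows:
        try:
            pairs.append((float(x), float(y)))
        except ValueError:
            continue  # skip rows with a missing or non-numeric field
    return pairs


def summarize(pairs):
    """Sample standard deviations, Pearson correlation, and the
    least-squares line y = intercept + slope * x."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    return {
        "stdev_x": statistics.stdev(xs),
        "stdev_y": statistics.stdev(ys),
        "correlation": sxy / math.sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


pairs = clean(raw)
print(f"kept {len(pairs)} of {len(raw)} rows")
for name, value in summarize(pairs).items():
    print(f"{name}: {value:.3f}")
```

Feeding `raw` straight into the arithmetic would fail on the blank and "n/a" fields, which is the "garbage in, garbage out" problem in miniature.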

• I began going through some basics on Khan Academy, and plan to then learn statistics and probability there.

I think I’ll wait with learning data analysis at least until after that.

“I would say that logic is actually more important than math”

can you elaborate? :)

• “I would say that logic is actually more important than math”
“can you elaborate? :)”

I kinda sorta thought learning data analysis would give me “magical powers” to glean insight from data... like I could just throw a bunch of data on a spreadsheet, run some formulas and functions, and voila... enlightenment. But there’s a LOT that goes into deciding things like what kind of data to use, what to exclude, *how* to process the data, how to *interpret* the data *and* the results, etc. The formulas and statistics are just a small part of the toolbox used in data analysis.

There’s a lot of planning, pre-planning, figuring out what you want to find out and how to get there from what you have... you have to use a lot of logic, critical thinking skills, things like that before you even start doing the math and statistics, and certainly *after* you do the math. Does that make sense?

• Yup, thanks :)

Basically you’re saying that pretty much all rationality skills (logic, knowledge of biases and heuristics, etc.) are needed or beneficial for knowing how not to make mistakes while handling the data, right?

• Yep. I think the best lessons I’ve learned revolve around actually *trying* to second-guess myself. I’d crunch some numbers, feeling confident that I did everything right, only to realize that my assumptions or logic or something *other* than the mechanics of number crunching was off or wrong.