Help, I can’t code and I’m doing a statistics degree. What do I do?
At Warwick, you will take at least one coding module as part of your statistics degree. It’s core, and it’s in first year. The language you’ll be taught is R, a free statistical programming language that has been in use since 1993. It has huge community support, and there are thousands of packages that let you do almost anything you want with data science, statistics, and graphs. In fact, some of the R development team are professors here at Warwick.
The first thing to understand about programming is how computers “think”. A computer will only do what you explicitly tell it to. Sometimes it will be painfully obvious to you that the “5” you just typed is a number and that you want to add it to a certain variable, but because the “5” is, for whatever reason, being treated as a character rather than a number, the computer says no. This sort of problem is incredibly frustrating for new programmers. “Why doesn’t this work?” is asked so often that it can make people ragequit programming altogether. However, if you understand exactly what the computer is doing with every line of code, you will find it much easier to debug.
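To make the “5” problem concrete, here is a minimal sketch in R. The variable names (x, total, x_num) are made up for illustration; the point is only that a quoted "5" is text until you convert it.

```r
# A "5" in quotes is text (class "character"), not a number.
x <- "5"
total <- 10

class(x)      # "character"
class(total)  # "numeric"

# total + x stops with an error, because R will not add text to a number.
# tryCatch() lets us capture the error message instead of halting the script.
result <- tryCatch(
  total + x,
  error = function(e) conditionMessage(e)
)
print(result)

# Converting the text to a number first makes the addition work.
x_num <- as.numeric(x)
total + x_num  # 15
```

Checking class() on anything that misbehaves is usually the quickest way to find this kind of mismatch.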
A basic understanding of computer science goes a long way here, and elsewhere too. Knowing basic syntax and how to write readable code can save you a lot of trouble. If you need to ask someone for help and your code is written in a completely incomprehensible way, you have little hope of getting the help you need. White space is important: indent nested code consistently and put spaces before and after operators. Brush up on coding standards and you’ll save your teachers a lot of time.
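As a rough illustration of why layout matters, the two functions below do exactly the same thing. Both are invented for this post (f and sum_positive are not from any package); the second one just uses spacing, indentation, and descriptive names.

```r
# Hard to read: no spaces, no indentation, cryptic names.
f<-function(x){s<-0;for(i in x){if(i>0){s<-s+i}};s}

# Same logic, laid out so that someone else can follow it.
sum_positive <- function(values) {
  total <- 0
  for (value in values) {
    if (value > 0) {
      total <- total + value
    }
  }
  total
}

f(c(-1, 2, 3))             # 5
sum_positive(c(-1, 2, 3))  # 5
```

If you had to email one of these to a tutor for help, the second is the one that will get you a useful answer.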
Get good at Googling stuff. In R, if you want to know what a function does, you can type the function name with a “?” before it, and R will show you its help page. For example, if I wrote ?count (with a package that provides a count function, such as dplyr, loaded), it would tell me what can go into the count function, what I can tell count to do, and what will come out of it. The most important part is “what will come out of it”: as I mentioned above, R will sometimes return things that are not in the class you want. You might end up with a character string or a list when you really need a number, so you should be aware of functions like “as.numeric”, which tries to convert whatever you give it to a number and returns NA (with a warning) for anything it can’t convert. In some cases, though, the information you get from R won’t be enough. If so, almost every function you can think of will have a tutorial on YouTube, or a question already asked about it on Stack Overflow (a website for asking programming questions).
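Here is a small sketch of both ideas: looking up help and converting text to numbers. It uses mean from base R as the help example, since that is always available, and the values vector is made up for illustration.

```r
# Help pages only make sense in an interactive session (e.g. the R console
# or RStudio), so this is skipped when the code is run as a script.
if (interactive()) {
  help("mean")  # same as typing ?mean
}

# Made-up input: numbers stored as text, plus one entry that is not a number.
values <- c("3.5", "7", "abc")
class(values)     # "character"

# as.numeric() converts what it can; "abc" becomes NA.
# suppressWarnings() hides the warning R would normally print about that.
converted <- suppressWarnings(as.numeric(values))
class(converted)  # "numeric"
converted         # 3.5 7.0  NA

# is.na() shows which entries could not be converted.
is.na(converted)  # FALSE FALSE  TRUE
```

In real work you would usually want to see that warning rather than suppress it, since an unexpected NA tends to mean something is wrong with the data.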
One thing to remember about R is that it’s not the be-all and end-all of statistical software. Plenty of older statistical packages, like SAS and SPSS, are still used, but they are generally falling out of use. The other free language that is often used is Python. If you feel like going into data science when you finish your degree, you’ll need to learn this one too. It’s not too hard, but it’s a bit harder than R. If you’re really looking to stay ahead of the curve, you can try learning Julia, a scientific programming language that is being picked up more for data science because of its speed.