Published on April 22nd, 2019
Programming languages infosec professionals should learn
Code is an essential skill of the infosec professional, but there are so many languages to choose from. What language should you learn? As a heavy coder, I thought I'd answer that question, or at least give some perspective.
The tl;dr is JavaScript. Whatever other language you learn, you'll also need to learn JavaScript. It's the language of browsers, Word macros, JSON, NodeJS server side, scripting on the command-line, and Electron apps. You'll also need a bit of bash and/or PowerShell scripting skill, plus SQL for queries. Other languages are important as well; Python is very popular, for example. Actively avoid C++ and PHP, as they are obsolete.
Also tl;dr: whatever language you decide to learn, also learn how to use an IDE with visual debugging, rather than just a text editor. That probably means Visual Studio Code from Microsoft.
Let's talk in general terms. Here are some types of languages.
- Unavoidable. As mentioned above, familiarity with JavaScript, bash/PowerShell, and SQL is unavoidable. If you are avoiding them, you are doing something wrong.
- Small scripts. You need to learn at least one language for writing quick-and-dirty command-line scripts to automate tasks or process data. You are a tool-using animal, and this is your basic tool. You are a monkey; this is the stick you use to knock down the banana. Good choices are JavaScript, Python, and Ruby. Some domain-specific languages can also work, like PHP and Lua. Those skilled in bash/PowerShell can do a surprising amount of "programming" tasks in those languages. Old timers use things like Perl or Tcl. Sometimes the choice of which language to learn depends upon the vast libraries that come with the languages, especially Python and JavaScript libraries.
- Development languages. Those scripting languages have grown up into real programming languages, but for the most part, "software development" means languages designed for that task like C, C++, Java, C#, Rust, Go, or Swift.
- Domain-specific languages. The language Lua is built into nmap, snort, Wireshark, and many games. Ruby is the language of Metasploit. Further afield, you may end up learning languages like R or Matlab. PHP is incredibly important for web development. Mobile apps may need Java, C#, Kotlin, Swift, or Objective-C.
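To make the "small scripts" category concrete, here is a minimal sketch in Python of the kind of quick-and-dirty data processing meant above: counting failed logins per source address. The log lines, the regular expression, and the function name are all made up for illustration; real sshd logs vary, so treat this as a starting point rather than a finished tool.

```python
import re
from collections import Counter

# Hypothetical sample data standing in for a real log file.
SAMPLE_LOG = """\
Apr 22 10:01:02 host sshd[101]: Failed password for root from 203.0.113.5 port 4222 ssh2
Apr 22 10:01:05 host sshd[102]: Failed password for admin from 203.0.113.5 port 4223 ssh2
Apr 22 10:02:11 host sshd[103]: Accepted password for alice from 198.51.100.7 port 5000 ssh2
Apr 22 10:03:40 host sshd[104]: Failed password for root from 192.0.2.9 port 6001 ssh2
"""

# Assumed line format; adjust the pattern to whatever your logs look like.
FAILED_LOGIN = re.compile(r"Failed password for \S+ from (\d+\.\d+\.\d+\.\d+)")


def count_failed_logins(lines):
    """Count failed-login lines per source IP address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Print the noisiest addresses first.
    for ip, n in count_failed_logins(SAMPLE_LOG.splitlines()).most_common():
        print(ip, n)
```

The same twenty lines could just as easily be written in JavaScript or Ruby; the point is having one such language at your fingertips.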
As an experienced developer, here are my comments on the various languages, sorted in alphabetic order.
bash (and other Unix shells)
You have to learn some bash for dealing with the command-line. But it's also a fairly complete programming language. Peruse the scripts in an average Linux distribution, especially some of the older ones, and you'll find that bash makes up a substantial amount of what we think of as the Linux operating system. Actually, it's called bash/Linux.
In the Unix world, there are lots of other related shells that aren't bash, which have slightly different syntax. A good example is BusyBox, which has "ash". I mention this because my bash skills are rather poor, partly because I originally learned "csh" and get my syntax variants confused.
As a hard-core developer, I end up just programming in JavaScript or even C rather than trying to create complex bash scripts. But you shouldn't look down on complex bash scripts, because they can do great things. In particular, if you are a pentester, the shell is often the only language you'll get when hacking into a system, so good bash language skills are a must.
C
This is the development language I use the most, simply because I'm an old-time "systems" developer. What "systems programming" means is simply that you have manual control over memory, which gives you about 4x performance and better "scalability" (performance doesn't degrade as much as problems get bigger). It's the language of the operating system kernel, as well as many libraries within an operating system.
But if you don't want manual control over memory, then you don't want to use it. Its lack of memory protection, which leads to security problems, makes it almost obsolete.
C++
None of the benefits of modern languages like Rust, Java, and C#, but all of the problems of C. It's an obsolete, legacy language to be avoided.
C#
This is Microsoft's personal variant of Java designed to be better than Java. It's an excellent development language, for command-line utilities, back-end services, applications on the desktop (even Linux), and mobile apps. If you are working in a Windows environment at all, it's an excellent choice. If you can at all use C# instead of C++, do so. Also, in the Microsoft world, there is still a lot of Visual Basic. OMG avoid that like the plague that it is, burn in a fire burn burn burn, and use C# instead.
Go
Once a corporation reaches a certain size, it develops its own programming language. For Google, their most important language is Go.
Go is a fine language in general, but its main purpose is scalable network programs using goroutines. These do asynchronous user-mode programming in a way that's most convenient for the programmer. Since Google is all about scalable network services, Go is a perfect fit for them.
I do a lot of scalable network stuff in C, because I'm an oldtimer. If that's something you're interested in, you should probably choose Go over C.
Java
This gets a bad reputation because it was once designed for browsers, but has so many security flaws that it can't be used in browsers. You still find in-browser apps that use Java, even in infosec products (like consoles), but it's horrible for that. If you do this, you are bad and should feel bad.
But browsers aside, it's a great development language for command-line utilities, back-end services, apps on desktops, and apps on phones. If you want to write an app that runs on macOS, Windows, and on a Raspberry Pi running Linux, then this is an excellent choice.
JavaScript
As mentioned above, you don't have a choice but to learn this language. One of your basic skills is learning how to open Chrome developer tools and manipulate JavaScript on a web page.
So the question is whether you learn just enough of the language to hack around with it, or whether you spend the effort to really learn the language to do development or write scripts. I suggest the latter. For one thing, you'll often encounter weird usages of JavaScript that will be unfamiliar unless you seriously learn the language, such as jQuery-style constructions that look nothing like the language you might've originally learned.
JavaScript has actually become a serious app development language with NodeJS and frameworks like Electron. If there is one language in the world that can do everything, from back-end services (NodeJS), desktop applications (Electron), mobile apps (numerous frameworks), and quick-and-dirty scripts (NodeJS again) to browser apps, it's JavaScript. It's the lingua franca of the world.
In addition, remember that your choice of scripting language will often be based on the underlying libraries available. For example, if writing TensorFlow machine-learning programs, you need those libraries available to the language. That's why JavaScript is popular in the machine-learning field, because there are so many libraries available for it.
BTW, "JSON" is also a language, or at least a data format, in its own right. So you have to learn that, too.
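JSON came out of JavaScript, but nearly every language reads and writes it. As a rough illustration, here is a minimal sketch using Python's standard `json` module (Python only to keep the examples in this post in one language). The record is invented, loosely shaped like output from a scanning tool, and the field names are assumptions.

```python
import json

# Hypothetical record; the field names are made up for illustration.
raw = '{"host": "192.0.2.10", "ports": [22, 80, 443], "os": null}'

record = json.loads(raw)        # JSON object -> Python dict
open_ports = record["ports"]    # JSON array  -> Python list
record["port_count"] = len(open_ports)

# JSON null becomes Python None, and back again on output.
print(json.dumps(record, indent=2, sort_keys=True))
```

Objects, arrays, strings, numbers, booleans, and null are the whole format, which is why it takes an afternoon to learn and shows up everywhere.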
Lua
Lua is a language similar to JavaScript in many respects, with the big difference that arrays start at 1 instead of 0. The reason it exists is that it's extremely easy to embed in other programs as their scripting language, is lightweight in terms of memory/CPU, and is ultra-portable almost everywhere.
Thus, you find it embedded in security tools like nmap, snort, and Wireshark. You also see it as the scripting language in popular games. Like Go, it has extremely efficient coroutines, so you see it in the nginx-based web server "OpenResty" for backend scripting of applications.
PHP
Surprisingly, PHP is a complete programming language. You can use it on the command-line to write scripts just like Python or JavaScript. You may have to learn it, because it's still the most popular language for creating webapps, but learning it well means being able to write backend scripts in it as well.
However, for writing web apps, it's obsolete. There are so many unavoidable security problems that you should avoid using it to create new apps. Also, scalability is still difficult. Use NodeJS, OpenResty/Lua, or Ruby instead.
PowerShell
The same comments above that apply to bash also apply to PowerShell, except that PowerShell is Windows.
Windows has two command-lines, the older CMD/BAT command-line, and the newer PowerShell. Anything complex uses PowerShell these days. For pentesting, there are lots of fairly complete tools for doing interesting things from the command-line written in the PowerShell programming language.
Thus, if Windows is in your field, and it almost certainly is, then PowerShell needs to be part of your toolkit.
Python
This has become one of the most popular languages, driven by universities which use it heavily as the teaching language for programming concepts. Anything academic, like machine learning, will have great libraries for Python.
A lot of hacker command-line tools are written in Python. Since such tools are often buggy and poorly documented, you'll end up having to read the code a lot to figure out what is going wrong. Learning to program in Python means being able to contribute to those tools.
I personally hate the language because of the schism between v2/v3, and having to constantly struggle with that. Every language has a problem with evolution and backwards compatibility, but this v2 vs v3 issue with Python seems particularly troublesome.
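For anyone who hasn't been bitten by it, here is a small sketch of the kind of difference involved. It runs under Python 3, and the comments describe what Python 2 did with the same expressions. The function names are made up for illustration, and this covers only a few of the well-known differences.

```python
# Runs under Python 3; comments note what Python 2 did differently.
# (Python 2's statement form, print "hi", is a syntax error in Python 3.)


def halve(n):
    # Python 3: "/" is true division, so halve(7) is 3.5.
    # Python 2: "/" on two ints floored the result, giving 3.
    return n / 2


def floor_halve(n):
    # "//" floors in both versions, so this is the portable spelling.
    return n // 2


def same_text(data, text):
    # Python 3 keeps bytes and str apart, so bytes must be decoded
    # before comparing. Python 2 treated plain str as bytes.
    return data.decode("utf-8") == text


print(halve(7), floor_halve(7), same_text(b"abc", "abc"))
```

A tool written for v2 that divides integers or reads bytes off a socket can silently change behavior under v3, which is where much of the struggle comes from.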
Also, Python is slow. That shouldn't matter in this age of JITs everywhere and things like WebAssembly, but somehow whenever you have an annoyingly slow tool, it's Python that's at fault.
Note that whenever I read reviews of programming languages, I see praise for Python's syntax. This is nonsense. After a short while, the syntax of all programming languages becomes quirky and weird. Most languages these days are multi-paradigm, a combination of imperative, object-oriented, and functional. Almost all are JITted. "Syntax" is the least reason to choose a language. Instead, it's the choice of support/libraries (which are great for Python), or specific features like tight "systems" memory control (like Rust) or scalable coroutines (like Go). Seriously, stop praising the "elegant" and "simple" syntax of languages.
Ruby
Ruby is a great language for writing web apps that makes security easier than with PHP, though like all web apps it still has some issues.
In infosec, the major reason to learn Ruby is Metasploit.
Like Python and JavaScript, it's also a great command-line scripting language with lots of libraries available. You'll find it often used in this role.
Rust
Rust is Mozilla's replacement language for C and especially C++. It supports tight control over memory structures for "systems" programming, but is memory safe so doesn't have all those vulnerabilities. One of these days I'll stop programming in C and use Rust instead.
The problem with Rust is that it doesn't have quite the support that other languages have, like Java or C# for apps, and isn't as tightly focused on network apps as Go. But as a language, it's wonderful. In a perfect world, we'd all use JavaScript for scripting tasks and Rust for the backend work. But in the real world, other languages have better support.
SQL
SQL, "Structured Query Language", isn't a programming language as such, but it's still a language of some sort. It's something that you unavoidably have to learn.
One of the reasons to learn a programming language is to process data. You can do that within a programming language, but an alternative is to shove the data into a database then write queries off that database. I have a server at home just for that purpose, with large disks and multicore processors. Instead of storing things as files, and writing scripts to process those files, I stick it in tables, and write SQL queries off those tables.
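You don't need a dedicated server to try this workflow. Here is a minimal sketch using SQLite through Python's standard `sqlite3` module as a stand-in for a real database server. The table name, columns, and rows are all invented for illustration.

```python
import sqlite3

# Hypothetical rows: (source IP, destination port, bytes transferred).
rows = [
    ("203.0.113.5", 22, 1200),
    ("203.0.113.5", 80, 560),
    ("198.51.100.7", 443, 9800),
    ("192.0.2.9", 22, 300),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE flows (src TEXT, dport INTEGER, nbytes INTEGER)")
conn.executemany("INSERT INTO flows VALUES (?, ?, ?)", rows)

# Instead of looping over files in a script, ask the question as a query:
# how many flows, and how many bytes, per source address?
query = """
    SELECT src, COUNT(*) AS n, SUM(nbytes) AS total
    FROM flows
    GROUP BY src
    ORDER BY total DESC
"""
summary = conn.execute(query).fetchall()
for src, n, total in summary:
    print(src, n, total)
```

Once the data is in a table, each new question is another query rather than another script.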
Swift
Back in the day, when computers were new, before C++ became the "object-oriented" language standard, there was a competing object-oriented version of C known as "Objective-C". Because, as everyone knew, object-oriented was the future, NeXT adopted this as their application programming language. Apple bought NeXT, and thus it became Apple's programming language.
But Objective-C lost the object-oriented war to C++ and became an orphaned language. Also, it was really stupid, essentially two separate language syntaxes fighting for control of your code.
Therefore, a few years ago, Apple created a replacement called Swift, which draws on ideas from Rust, among other languages. Like Rust, it's an excellent "systems" programming language that has more manual control over memory allocation, but without all the buffer-overflows and memory leaks you see in C.
It's an excellent language, and great when programming in an Apple environment. However, when choosing a "language" that's not particularly Apple focused, just choose Rust instead.
Conclusion
As I mentioned above, familiarity with JavaScript, bash/PowerShell, and SQL is unavoidable. So start with those. JavaScript in particular has become a lingua franca, able to do, and do well, almost anything you need a language to do these days, so it's worth getting into the finer details of JavaScript.
However, there's no One Language to Rule them all. There are good reasons to learn most languages in this list. For some tasks, the support for a certain language is so good it's just best to learn that language to solve that task. With the academic focus on Python, you'll find well-written libraries that solve important tasks for you. If you want to work with a language that other people know, that you can ask questions about, then Python is a great choice.
The exceptions to this are C++ and PHP. They are so obsolete that you should avoid learning them, unless you plan on dealing with legacy.
*** This is a Security Bloggers Network syndicated blog from Errata Security authored by Robert Graham. Read the original post at: https://blog.erratasec.com/2019/04/programming-languages-infosec.html