Why You Should Learn Julia (Programming Language)

Introduction

Julia is a general-purpose programming language with a particular affinity for numerical computation, which makes it a useful language for data science and machine learning. The original motivation for creating Julia was to shorten the development cycle for data scientists. Data scientists often do rapid development and exploratory analysis in a higher-level language such as Python and, once a method is proven to work, hand it off to be optimized in a lower-level language such as C. Julia is meant to offer the ease of use of Python with speeds much closer to those of C. In the "Julia in a Nutshell" portion of its homepage, JuliaLang describes Julia as fast, dynamic, optionally typed, general, easy to use, and open source.

I recently took an advanced matrix computations course that showed me many of the benefits of using Julia, so I will share my thoughts on it and why I think it is a good language to learn, especially for those who work in data science and numerical analysis. To start off, you can download Julia here: julialang homepage.

Low Learning Curve

One of the big deterrents to learning a new language is the thought of learning loads of new syntax and quirks. Anybody who is comfortable with a language such as Python should have no problem becoming familiar with Julia. Rather than curly braces or a colon plus indentation, Julia delimits a block by pairing the keyword that opens it (such as function or for) with the keyword end. This makes for very readable code:

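# Return the smallest edit distance that is at least minThresh, plus the matching word.
# LevenshteinDistance(a, b) is assumed to be defined elsewhere and to return an Int.
function closestWord(word::String, toCompare::Array{String}, minThresh::Int = 0)
    dists = map(x -> LevenshteinDistance(word, x), toCompare)
    ind = argmax(dists)  # start from the largest distance so any smaller qualifying one replaces it
    for i in eachindex(dists)
        if dists[i] < dists[ind] && dists[i] >= minThresh
            ind = i
        end
    end
    return dists[ind], toCompare[ind]
end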
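
The function above calls a LevenshteinDistance helper that is not shown in this post. As a rough idea of what such a helper could look like, here is a minimal sketch based on the Wagner-Fischer dynamic program. It is an illustration rather than the code linked at the end of the post, and the name LevenshteinDistance is reused only so that it lines up with the call above.

```julia
# Sketch of the Wagner-Fischer dynamic program for Levenshtein (edit) distance.
# This is an illustrative helper, not the implementation linked in the post.
function LevenshteinDistance(s::AbstractString, t::AbstractString)
    a, b = collect(s), collect(t)   # Vector{Char}, so indexing is safe for Unicode
    m, n = length(a), length(b)

    # d[i + 1, j + 1] is the distance between the first i characters of s and
    # the first j characters of t (the +1 is there because Julia is 1-indexed).
    d = zeros(Int, m + 1, n + 1)
    d[:, 1] .= 0:m   # turning a prefix of s into "" takes i deletions
    d[1, :] .= 0:n   # turning "" into a prefix of t takes j insertions

    for j in 1:n, i in 1:m
        cost = a[i] == b[j] ? 0 : 1
        d[i + 1, j + 1] = min(d[i, j + 1] + 1,   # deletion
                              d[i + 1, j] + 1,   # insertion
                              d[i, j] + cost)    # substitution (or match)
    end
    return d[m + 1, n + 1]
end
```

For example, LevenshteinDistance("kitten", "sitting") returns 3. The table-based version uses memory proportional to the product of the two lengths, which is fine for short words but could be reduced to two rows if that ever matters.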

As you can see from the example above, Julia does not use a semicolon for line termination; it uses newlines à la Python. For those who are used to NumPy in Python, there are some noticeable differences. For example, Julia is 1-indexed, whereas NumPy is 0-indexed. There are many other small differences that can bite you if you are not aware of them, so if you are interested in learning Julia, I suggest taking a look at a nice resource provided by JuliaLang on the major differences between Julia and similar languages: Julia Language Differences. The main point I want to get across is that these differences are fairly minor, and I think the best way to learn is to look at examples of common operations, such as this Julia by Example documentation. One additional security blanket offered by Julia is its ability to "directly call and fully interoperate with Python from the Julia Language" using a package called PyCall (PyCall Github). So if you have some important Python functions that you are not sure how to convert to Julia, you can rest assured that you can still use them from within your Julia code.
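
To make the indexing difference mentioned above concrete, here is a small sketch. The vector and the variable names are made up purely for illustration.

```julia
# Illustrative values only; any vector behaves the same way.
v = [10, 20, 30, 40]

first_elem = v[1]      # 10 (NumPy would write v[0])
last_elem  = v[end]    # 40 (`end` plays the role of NumPy's v[-1])
middle     = v[2:3]    # [20, 30]; Julia ranges include both endpoints
```

The inclusive range is the one that tends to trip up NumPy users the most, since v[1:3] in NumPy stops before index 3, while 2:3 in Julia includes both the second and third elements.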

Open Source Community

As I mentioned above, Julia is an open-source language with a blooming open-source community behind it (Julia Github). There are currently 25.2k stars on the official Julia GitHub repository, and the core developers are very active. Many of the packages I used most during my matrix computations class are external libraries not maintained by the core development team, such as Julia Plots, Polynomial Roots, and Block Arrays. These libraries are well maintained and are constantly being extended. This also provides great opportunities to contribute to an open-source effort, especially considering that Julia only reached version 1.0 in August of 2018, so it is still a fairly young language with a need for open-source contributors.

Good REPL

One of the often overlooked parts of Julia that I personally find very nice for quality of life is its REPL. Julia's REPL, or Read-Evaluate-Print Loop, is installed along with Julia and can be launched from a normal terminal session by running julia, provided Julia has been added to your PATH variable. The REPL is quite straightforward, and great for when you are still learning Julia, as it has documentation and the package manager built in. To bring up the documentation, you simply type ? at the julia> prompt and then search for any topic. To bring up the package manager, you type ] and then any package manager command.

   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.1.1 (2019-05-16)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

help?> eof
search: eof EOFError typeof sizeof nameof fieldoffset versioninfo searchsortedfirst SegmentationFault

  eof(stream) -> Bool

  Test whether an I/O stream is at end-of-file. If the stream is not yet exhausted, this function will block to wait
  for more data if necessary, and then return false. Therefore it is always safe to read one byte after seeing eof
  return false. eof will return false as long as buffered data is still available, even if the remote end of a
  connection is closed.

julia> using SparseArrays

(v1.1) pkg> add "LinearAlgebra"
  Updating registry at `C:\Users\Andres\.julia\registries\General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
 Resolving package versions...
 Installed PlotThemes ─────────── v1.0.1
 Installed FFMPEG ─────────────── v0.2.4
 Installed IndirectArrays ─────── v0.5.1
 Installed ImageDistances ─────── v0.2.7
 Installed ImageTransformations ─ v0.8.1
 Installed ImageAxes ──────────── v0.6.2
 Installed ImageFiltering ─────── v0.6.9
 Installed GR ─────────────────── v0.44.0
 Installed Requires ───────────── v1.0.0
 Installed OffsetArrays ───────── v0.11.4
 Installed VersionParsing ─────── v1.2.0
 Installed Measures ───────────── v0.3.1
 Installed Arpack ─────────────── v0.3.2

Additionally, the REPL provides tab completion and search, so if you are trying to remember a function name, you can type the first few letters and then press Tab, and the REPL will show you any names that begin with those letters.

julia> a
a             acos           acscd          ans            argmin         asinh          atreplinit
abs           acosd          acsch          any            ascii          asyncmap       axes
abs2          acosh          adjoint        any!           asec           asyncmap!
abspath       acot           all            append!        asecd          atan
abstract type acotd          all!           applicable     asech          atand
accumulate    acoth          allunique      apropos        asin           atanh
accumulate!   acsc           angle          argmax         asind          atexit

Lastly, Julia supports Unicode characters, which is quite nice when dealing with operations that are often adapted from mathematical literature. So if you are referring to something as the "delta," you can actually use the Greek symbol for delta. You just type a backslash followed by the symbol's name, such as \delta or \Sigma, and then press Tab, and it will be converted to the Unicode character. Unicode is available outside of the REPL as well, so if you include Unicode in your .jl file, it will work.

julia> \sigma → σ | \delta → δ | \sum → ∑
u = [1 / sqrt(5) -2 / sqrt(5) 0; 2 / sqrt(5) 1 / sqrt(5) 0; 0 0 1]
vt = [-1 / sqrt(2) 1 / sqrt(2) ; -1 / sqrt(2) -1 / sqrt(2)]
Σ = [1 / ((1 / sqrt(5)) * (1 / sqrt(2))) 0 ; 0 0; 0 0]
aa = u * Σ * vt

Performance

Now to get to the nitty-gritty of what makes Julia such an alluring alternative to languages such as Python for numerical computation. I will not go into too much detail of my own on benchmarking and the technicalities of what makes Julia so fast; instead I will point out some of the main achievements and refer you to outside resources where this performance has already been thoroughly tested. Some of the main points are that "Julia features optional typing, multiple dispatch, and good performance, achieved using type inference and just-in-time (JIT) compilation, implemented using LLVM" (Julia Introduction). JuliaLang describes multiple dispatch as "Using all of a function's arguments to choose which method should be invoked, rather than just the first." Julia is dynamically typed but allows optional type annotations. Annotating function arguments mainly controls dispatch, since the compiler already specializes each method on the argument types it is called with, whereas giving struct fields and containers concrete types and keeping your code type-stable can speed it up dramatically. The main thing to know about Julia's performance is that it is a fast language, especially for a dynamically typed one, and that it is this way by design. The developers of Julia place a high priority on speed, and will continue to do so.
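
To give a feel for what multiple dispatch means in practice, here is a minimal sketch. The Shape types and the overlap_rule function are hypothetical names invented for this illustration; they do not come from any package.

```julia
# Hypothetical types used only to illustrate multiple dispatch.
abstract type Shape end

struct Circle <: Shape
    r::Float64
end

struct Square <: Shape
    side::Float64
end

# One generic function with several methods. Julia picks the method
# from the types of BOTH arguments, not just the first one.
overlap_rule(a::Circle, b::Circle) = "circle-circle"
overlap_rule(a::Circle, b::Square) = "circle-square"
overlap_rule(a::Square, b::Circle) = "square-circle"
overlap_rule(a::Shape,  b::Shape)  = "generic fallback"

r1 = overlap_rule(Circle(1.0), Square(2.0))   # "circle-square"
r2 = overlap_rule(Square(1.0), Circle(2.0))   # "square-circle"
r3 = overlap_rule(Square(1.0), Square(2.0))   # "generic fallback"
```

Swapping the argument order selects a different method, and the pair with no specific method falls back to the one defined on the abstract Shape type. In a single-dispatch language, the same behavior usually needs explicit type checks on the second argument.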

If you are curious about how Julia performs compared to other common languages, you can check out the Julia micro-Benchmarks.

Conclusion

Although Julia is a general-purpose programming language, it still mostly caters to numerical computing and data science. If you are mostly familiar with MATLAB or R and are looking to get into general-purpose programming, Julia could be the perfect bridge for you. Another great feature of Julia is that it is integrated with Jupyter Notebooks through IJulia. If you are currently using Python for your numerical computing, I urge you to at least try out Julia. I personally really enjoy using it, and have come to prefer it over NumPy for most matrix-heavy use cases. Julia is still a relatively young language, so expect some growing pains as well. If you're interested in learning Julia, I would suggest adapting some existing Python code you have into Julia and seeing how the two compare for you. As an example, I adapted some pseudocode from the Wikipedia entry for the Wagner-Fischer edit distance algorithm. The code for that can be found here: edit distance. Let me know if this post helped you in any way, and happy coding!
