Getting started with Rust

the not-that-bumpy (!) first steps of my journey

I found out about the Rust programming language a few years ago. I took a quick look at it back then, but decided it wasn’t for me: I neither needed nor wanted to learn a new low level language, or so I thought.

Still, I ran through a short tutorial when I had some time, just to see what it was about. And I firmly dismissed it again, because it looked like I would need to restructure the way I think while programming if I ever went with Rust, and why, oh why, would I want to do that?

(Spoiler alert: I was wrong. There are very good reasons to learn Rust, even if you don’t want to start developing tools with it right away.)

I revisited this strange-to-me programming language later, however, after seeing some Stack Overflow surveys last year. And then this year again. Plus, I remembered reading somewhere that Linus Torvalds had said he could accept giving Rust a chance for certain Linux modules (drivers first, at least): oh my!

Stack Overflow Developer Survey 2021

The United States and India continue to provide the highest volume of survey responses, followed by Germany and UKI (UK…

insights.stackoverflow.com

Linus Torvalds on where Rust will fit into Linux

Linux is the poster-child for the C language. But times change. The Rust language has been slowly gathering support for…

www.zdnet.com

So I knew, even before understanding things at any deeper level, that the people who declare they love this language so much must be aware of something I hadn’t grasped yet.

So I tried to pursue the truth further: what is so lovable about Rust? Can learning more about it help me improve my programming at all, regardless of whether I’d eventually decide to use it in a project?

But before going on, let’s clarify a bit: low level programming, such as using C or assembly languages*, is difficult. But it’s very, very important for every developer to focus on such tasks, at least from time to time, even if only to remain able to understand in detail how systems work under the hood.

* Yes, there is more than one assembly language, because they are not portable from one platform to another. C (and similarly Rust), on the other hand, tries to resolve portability issues. And while C introduces many other issues instead (or just keeps them there), especially around memory management, Rust tries to add as few new problems as possible, even if this requires new concepts for its users to learn and... accept.

Without low level programming knowledge we cannot optimize things beyond what our beloved high level platform allows. Moreover, we probably cannot even optimize things sufficiently within that platform of choice, be it based on Java, .NET, Objective-C, or even JavaScript. That is also because we all tend to forget how programs really run internally once they are started: how memory is really managed and how instructions are queued for execution, to name just two important aspects.

In short, too much abstraction (C++, Java, C#, Swift, or anything like those) is great for solving many problems, but it otherwise makes us lose track of the details so fast that we won’t even see the derailment until it has actually happened.

And of course, yada, yada, yada, low level programming is so useful for building high performance 3D apps and games, and even for writing general algorithms like those in the standard library of any platform, after all. To get the desired power you must use a language like C or Rust, or otherwise you’re doomed.

See this nice chart from Wikipedia to see just how fast Rust is. And all this while remaining very “safe by default” at a level that C and C++ just cannot dream about themselves!

Comparison of programming languages - Wikipedia

Programming languages are used for controlling the behavior of a machine (often a computer). Like natural languages…

en.wikipedia.org

That being said, let me tell you how I eventually started pursuing the “gold in Rust”, a process I began just a few weeks ago, in fact…

You may have guessed my first step if you have read other articles of mine: I ordered a Rust programming book to go through (and maybe even read 😀) as soon as possible. I know, I’m old, I learn faster from textbooks (and I’m talking about a printed one, this time) than from YouTube.

Regardless, until the book arrives (yes, it is taking longer than expected), I decided to also create a small command line tool, so that I would see things that no tutorial would otherwise show: real life programming thingies regarding the language, be they good or bad.

To find something practical to do, I looked through my notes and there it was: I had always wanted a small tool to tell me which project (a .csproj or .vbproj, for example) is the starting point in a Visual Studio solution. (Disappointingly, this info is stored only on the developers’ local machines, so it’s not shared among team members through source control, and whenever I get into a new Microsoft based codebase I need to ask my peers about it: which item from the solution should I set up as the Start project?)

Yes, I wanted to use Rust to build this tool for myself. And I eventually published its code online as well, so that others can join me in using it:

GitHub - SDolha/getstartproj

Gets start (root) projects in Visual Studio solutions…

github.com

To define the high level requirements: the end user just points this tool at a path (either a .sln file or a folder containing some) and is presented with the root projects found there (in fact they are leaf projects, i.e. projects that aren’t dependencies of any other project in the same solution, from Visual Studio’s point of view), and that would be all:

getstartproj path/to/solution
Solution.sln:
    RootProject1.csproj
    RootProject2.csproj
    (2 start projects)

Well: easy to say, haha. But let’s try implementing it!

I started by downloading and setting up Rust on Windows, because — obviously — while I could have developed everything on a Linux box (or even on the Mac), on my Windows computer (in fact, running as a virtual machine) I already have a lot of Visual Studio solutions to test with.

Once I had access to rustc, the Rust compiler, the first thing I needed to know was how to write a “hello world”-ish program in the newly chosen language. Yes, I’m cheating here, I knew that already from the first tutorial I had read. But I need to put it here as well. The initial structure of a .rs file should be something like this:
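A minimal sketch of that starting point, an empty entry point saved as getstartproj.rs (the file name passed to rustc further below):

```rust
// The smallest valid Rust program: an entry point that does nothing yet.
fn main() {}
```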

Short as it is, that’s all! fn introduces a function, and naming it main makes it the entry point of the program. It has no arguments (hence the empty parentheses) and no statements for now (hence the empty curly braces as well).

Next: how can we accept an argument from the environment? It isn’t passed as an argument to the main function; instead we need to read it from the command line, using a function from the standard library, env::args():
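A sketch of how that might look. The user_args helper is a hypothetical name, introduced here only to keep the logic easy to test; the same two calls could just as well sit directly inside main:

```rust
use std::env;

// Hypothetical helper: drops the first item (the executable name)
// and collects the remaining command line arguments.
fn user_args(all_args: impl Iterator<Item = String>) -> Vec<String> {
    all_args.skip(1).collect()
}

fn main() {
    let args = user_args(env::args());
    println!("Arguments: {:?}", args);
}
```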

I needed to skip the first argument because it actually indicates the name of the executable (getstartproj).

This is good, but I still didn’t like it. 🙃

I wanted to exit with an error code when the user hasn’t passed the correct number of arguments (1, besides the executable name) in the command line.

After a bit more searching in the Rust standard library documentation (which is very good, by the way) I found out I could use process::exit for that (instead of return):
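A minimal sketch of that check. The require_single_path helper, the usage message, and the exit code value 1 are assumptions made for illustration:

```rust
use std::env;
use std::process;

// Hypothetical helper: returns the single expected argument,
// or None when the argument count is wrong.
fn require_single_path(args: &[String]) -> Option<&str> {
    if args.len() == 1 {
        Some(args[0].as_str())
    } else {
        None
    }
}

fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    match require_single_path(&args) {
        Some(path) => println!("Path: {}", path),
        None => {
            println!("Usage: getstartproj <path>");
            // Unlike return, process::exit lets us choose the exit status
            // (1 is an assumed value here).
            process::exit(1);
        }
    }
}
```

Keep in mind that process::exit ends the program immediately without running destructors, which is harmless at this point because nothing needs cleaning up yet.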

Update: I switched to exit status values starting at 79 (and theoretically up to a maximum of 113) so that they don’t collide with any of the “standard” error codes of BSD, bash, or the C language.

With this structure as a starting point, the last thing left was to determine whether the path the user passed (as an argument) actually exists in the file system. If it doesn’t, we can simply exit with a different error code, showing an error message as well so that the end user knows what happened. (Sure, we should send all error messages to stderr rather than stdout, but I’ll leave this exercise for another time; hint: eprintln.)

A Path object helps us out here, but we need to bring it in from std::path (with use) for it to be available in our code:
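A sketch of the existence check. The path_exists helper, the message, and the exit code 2 are hypothetical, and "." stands in for the argument the user would pass:

```rust
use std::path::Path;
use std::process;

// Hypothetical helper: true when the given text names an existing
// file or directory.
fn path_exists(text: &str) -> bool {
    Path::new(text).exists()
}

fn main() {
    // "." (the current directory) stands in for the user's argument here.
    let arg = ".";
    if path_exists(arg) {
        println!("Found: {}", arg);
    } else {
        println!("Path not found: {}", arg);
        // Assumed exit code, distinct from the one used for a usage error.
        process::exit(2);
    }
}
```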

We’re done with the boilerplate, I guess. The next thing is to distinguish between the user having passed us a .sln file directly and a directory that may itself contain such files, and in the latter case to browse it recursively and find the solution files inside to process. Here is the entire program up to this point:
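A sketch of how the pieces described so far might fit together. The is_solution helper, the messages, and the exit codes 1 and 2 are assumptions; process, original_path, and get_start_project follow the names used in the notes that come next:

```rust
use std::env;
use std::ffi::OsStr;
use std::fs;
use std::path::Path;
use std::process;

fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    if args.len() != 1 {
        println!("Usage: getstartproj <path>");
        process::exit(1); // assumed exit code
    }
    let original_path = Path::new(&args[0]);
    if !original_path.exists() {
        println!("Path not found: {}", original_path.display());
        process::exit(2); // assumed exit code
    }
    process(original_path, original_path);
}

// Hypothetical helper: true when the extension is exactly "sln"
// (this comparison is case sensitive).
fn is_solution(path: &Path) -> bool {
    path.extension() == Some(OsStr::new("sln"))
}

// Walks directories recursively and handles every solution file found.
fn process(path: &Path, original_path: &Path) {
    if path.is_dir() {
        // Unreadable directories and entries are silently skipped.
        if let Ok(entries) = fs::read_dir(path) {
            for entry in entries {
                if let Ok(entry) = entry {
                    process(&entry.path(), original_path);
                }
            }
        }
    } else if is_solution(path) {
        // Show the path relative to what the user passed; it is empty
        // when the user pointed at the .sln file directly.
        if let Ok(partial) = path.strip_prefix(original_path) {
            if !partial.as_os_str().is_empty() {
                println!("{}:", partial.display());
            }
        }
        get_start_project(path);
    }
}

// Stub to be filled in after the first testing round.
fn get_start_project(_sln_path: &Path) {
    println!("    (root projects will be presented here)");
}
```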

To note:

When we browse a directory (in the process function above) we need to use if let matching to skip any errors (such as a folder that cannot be read) that we might encounter. For now we’ll just assume that the path is readable, or otherwise that it’s fine to not find projects there.
We display each solution file that we find with only its partial path, relative to the original_path passed by the end user. In case the end user has specified a .sln file directly (so the partial path would be an empty string), we skip that step completely.
We need to borrow some memory there, as otherwise the compiler would complain (notice all those &Path references in the code), but for now it’s all easy and everything looks clean.
We have to use OsStr as well, since that is the type returned by the path.extension() function and we need to check file extensions appropriately.
We’ve left some placeholder code in (the get_start_project function) as a stub where we will continue after the initial testing round we’re just about to run now:

rustc getstartproj.rs -o getstartproj.exe

getstartproj C:\Code\MySolutionFolder
MySolution.sln:
    (root projects will be presented here)

getstartproj C:\Code\MySolutionFolder\MySolution.sln
    (root projects will be presented here)

Now it’s finally time to read the contents of the solution file and determine the projects and their inter-dependencies. We need Microsoft’s documentation to understand the file formats, but there’s nothing special about this process from Rust’s point of view: we need to read the contents of the appropriate files and parse them to get the information we want.

Note: we won’t use regular expressions or XML parsing for now, just plain string functions to extract the .csproj and .vbproj references from the .sln definition, and the ProjectReference values from inside those project files too.

(Yes, I am aware that regular expressions would be more performant and, if done correctly, might be even more readable, that XML parsing would have helped somewhat, and also that other types of Visual Studio projects exist. But for now I’ll consciously go with this simpler approach to avoid crates, and limit support to C# and Visual Basic project types too. That’s it.)
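A sketch of what such string-based extraction could look like. It assumes the usual .sln line layout, where the project path is the quoted value ending in .csproj or .vbproj; the extract_project_paths helper is a hypothetical split-out that keeps the parsing testable without touching the disk:

```rust
use std::fs;
use std::path::Path;

// Markers to look for: a project extension followed by the closing quote.
const PROJECT_MARKERS: [&str; 2] = [".csproj\"", ".vbproj\""];

// Hypothetical helper: pulls quoted project paths out of solution text.
fn extract_project_paths(content: &str) -> Vec<String> {
    let mut project_paths = Vec::new();
    for line in content.lines() {
        for marker in PROJECT_MARKERS.iter() {
            if let Some(marker_pos) = line.find(*marker) {
                // The path sits between the nearest preceding quote
                // and the quote that closes the marker.
                if let Some(start) = line[..marker_pos].rfind('"') {
                    let end = marker_pos + marker.len() - 1;
                    project_paths.push(line[start + 1..end].to_string());
                }
            }
        }
    }
    project_paths
}

// Reads the whole .sln file at once; an unreadable file yields no projects.
fn get_project_paths(sln_path: &Path) -> Vec<String> {
    match fs::read_to_string(sln_path) {
        Ok(content) => extract_project_paths(&content),
        Err(_) => Vec::new(),
    }
}

fn main() {
    let project_paths = get_project_paths(Path::new("MySolution.sln"));
    println!("    (temporary count of projects: {})", project_paths.len());
}
```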

The get_project_paths function above looks for all .csproj and .vbproj indicators inside the solution file and extracts the strings, putting them into a project_paths collection that will be processed later. (Note how the mut keyword is used when the collection is initialized, to allow us to change its content later.)

To read the actual lines from the .sln file we use fs::read_to_string (again, not optimal, but it’s OK) and the string’s handy lines iterator, followed by some string functions and ranges, but everything is pretty straightforward in the end.

Regardless, for now, to test this out, we just show the count of projects found inside the solution:

getstartproj C:\Code\MySolutionFolder\MySolution.sln
    (temporary count of projects: 2)

Note that the get_project_paths function returns a Vec<String> (a collection of strings) rather than a collection of Path objects, as you might expect at first. This is because Path itself is a “slice” (just like str) and doesn’t have a size known at compile time, while both return types and Vec elements need one.

We could have tried returning a Vec<&Path> instead, i.e. returning references to path slices, but that wouldn’t work either, because we cannot return to the caller references to values that are created temporarily inside a function. (Rust’s safety net is at work!)

What we can do, however, since returning strings still doesn’t look like the best choice, is to return Vec<PathBuf> instead:
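A small sketch of that conversion; to_path_bufs is a hypothetical helper name, and the sample path is made up:

```rust
use std::path::PathBuf;

// Hypothetical helper: turns extracted strings into owned path buffers.
fn to_path_bufs(strings: Vec<String>) -> Vec<PathBuf> {
    strings.into_iter().map(PathBuf::from).collect()
}

fn main() {
    let paths = to_path_bufs(vec![String::from("App/App.csproj")]);
    for path in &paths {
        // Path methods such as extension() can be called on a PathBuf.
        println!("{} ({:?})", path.display(), path.extension());
    }
}
```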

We use the PathBuf::from function to create path “buffers” out of strings there. And whenever we want a Path reference out of a PathBuf, we can just call as_path() on it. Or simply use the PathBuf directly because, according to the documentation, “it also implements Deref to Path, meaning that all methods on Path slices are available on PathBuf values as well.”

(Rust seems to show its teeth a bit. Or at least a bit of ugliness. But we can manage things, all right.)

By this time I had realized that project paths in solution files and project reference paths in project files are going to be relative most of the time, each one relative to its own containing folder. That meant I had to use some canonical form for paths, so I decided to use canonicalize() calls everywhere (being platform specific is all right):
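A sketch of such a helper; the get_canonical_path name and its signature are assumptions:

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper: joins a relative path onto a root folder and
// canonicalizes the result; None when the target does not exist.
fn get_canonical_path(root: &Path, relative: &str) -> Option<PathBuf> {
    root.join(relative).canonicalize().ok()
}

fn main() {
    match get_canonical_path(Path::new("."), "src") {
        Some(path) => println!("{}", path.display()),
        None => println!("No such path"),
    }
}
```

Two things to keep in mind: canonicalize() fails for paths that don’t exist, and on Windows it returns the extended length form (prefixed with \\?\), so canonical paths are best compared only with other canonical paths.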

This way I was able to join a relative path onto any specified root path and get back a canonical path without much computation.

I first updated the get_project_paths function to limit the projects it returns to those inside the solution folder; we just don’t care about the rest. Note that because we use sln_path twice there, I simply used a reference (&) to ensure we don’t move the variable into either of the calls.

Next, I’ve filled in some (straightforward) high level logic into the real “main” get_start_projects function:

… and continued with defining the remaining functions — get_start_project_paths and print_paths, accordingly.

Let’s present the latter now as it’s the simplest:

Here we use another helper function, get_canonical_dir, which simply returns the canonical path of the parent of the specified path. And if a project path starts with that same prefix, we display only the local difference.

For reference purposes only, get_canonical_dir would go like this:
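A sketch of both helpers together, under a few assumptions: the signatures are guesses, local_display is a hypothetical helper that isolates the prefix stripping, and the summary line mirrors the sample output shown near the top of this article:

```rust
use std::path::{Path, PathBuf};

// Canonical form of the folder that contains the given path.
fn get_canonical_dir(path: &Path) -> Option<PathBuf> {
    let parent = path.parent()?;
    // A bare file name has an empty parent: treat it as the current folder.
    let parent = if parent.as_os_str().is_empty() {
        Path::new(".")
    } else {
        parent
    };
    parent.canonicalize().ok()
}

// Hypothetical helper: the part of `path` below `dir`,
// or the full path when `path` is not inside `dir`.
fn local_display(path: &Path, dir: &Path) -> String {
    match path.strip_prefix(dir) {
        Ok(local) => local.display().to_string(),
        Err(_) => path.display().to_string(),
    }
}

// Prints each project relative to the solution folder, then a summary line.
fn print_paths(paths: &[PathBuf], sln_path: &Path) {
    if let Some(sln_dir) = get_canonical_dir(sln_path) {
        for path in paths {
            println!("    {}", local_display(path, &sln_dir));
        }
    }
    println!("    ({} start projects)", paths.len());
}

fn main() {
    print_paths(&[], Path::new("Solution.sln"));
}
```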

The definitions of get_start_project_paths and its sibling, get_project_dependency_paths, go as follows:
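A sketch of that core logic, under stated assumptions: the signatures are guesses, extract_project_references is a hypothetical helper, and the marker expects the Include attribute to come right after the element name on the same line:

```rust
use std::fs;
use std::path::{Path, PathBuf};

// Hypothetical helper: collects the Include values of ProjectReference
// elements found in the text of a project file.
fn extract_project_references(content: &str) -> Vec<String> {
    let marker = "<ProjectReference Include=\"";
    let mut references = Vec::new();
    for line in content.lines() {
        if let Some(pos) = line.find(marker) {
            let rest = &line[pos + marker.len()..];
            if let Some(end) = rest.find('"') {
                references.push(rest[..end].to_string());
            }
        }
    }
    references
}

// Canonical paths of the projects that the given project references.
// References that cannot be resolved on disk are skipped.
fn get_project_dependency_paths(project_path: &Path) -> Vec<PathBuf> {
    let content = match fs::read_to_string(project_path) {
        Ok(content) => content,
        Err(_) => return Vec::new(),
    };
    let project_dir = project_path.parent().unwrap_or_else(|| Path::new("."));
    extract_project_references(&content)
        .iter()
        .filter_map(|reference| project_dir.join(reference).canonicalize().ok())
        .collect()
}

// Keeps only the projects that no other project in the list depends on.
// The input paths are expected to be canonical already.
fn get_start_project_paths(mut project_paths: Vec<PathBuf>) -> Vec<PathBuf> {
    let dependencies: Vec<PathBuf> = project_paths
        .iter()
        .flat_map(|path| get_project_dependency_paths(path))
        .collect();
    project_paths.retain(|path| !dependencies.contains(path));
    project_paths
}

fn main() {
    let sample = "<ProjectReference Include=\"..\\Lib\\Lib.csproj\" />";
    println!("{:?}", extract_project_references(sample));
    println!("{:?}", get_start_project_paths(Vec::new()));
}
```

Project files usually write these references with backslashes, so resolving them this way is only expected to work on Windows, in line with the platform specific choice made earlier.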

These functions represent the core of my tool, in fact. We start with the entire list of projects within the solution folder, check which of them are “root” (i.e. no other project references them), and remove the others from our mutable vector.

To get the dependencies of each project we look inside the .csproj or .vbproj file (which is formatted as XML inside), by searching for ProjectReference elements and their Include attributes.

We use the vector’s retain function (with an interestingly written closure as the filter) to keep only those paths that aren’t referenced as dependencies by any other project; the remove function would have needed an index instead.

And that’s all, the tool is now working like a charm! Here it is as a zip release.

Update (a few months later): I’ve updated the code to use eprintln to send error messages to stderr, and I’ve also moved everything into a crate and uploaded it to crates.io, too. You can now install its binaries, if you wish, directly with this command, woo-hoo:

cargo install getstartproj

Conclusions

It’s not as hard to start using Rust, at least for creating small tools, as it may sound when you read your first tutorial. Memory borrowing is not as painful as it may appear from reading the theoretical rules. Compiler errors are really helpful, and the documentation is written for humans.

And it’s especially easy if you already have general programming knowledge, for example if you have ever used C, C++, and/or a modern language like Swift.

It’s stunning how similar Rust appears both to low level C in some ways, e.g. types and references (plus the damn semicolons, haha), and to very high level Swift in others, e.g. fully checked syntax and the functional programming style (plus ifs without parentheses and range-based fors)!

Don’t get me wrong, however: in this first-steps journey I’ve only touched the tip of the rusting iceberg, having done no OOP, no real functional programming, and no large project either. On the other hand, we programmers are used to always learning new things, so all of this is simply normal.

I therefore encourage you all to give Rust a try: you may be surprised to find it a really nice way to go both towards low level performance and towards Swift-ish syntax safety. I think I’m starting to understand why people love it so much.

(Speaking of Swift, do note, however, that Apple didn’t get listed as a member of the Rust Foundation. Possibly because they are betting everything on Swift. And, in a way, that’s understandable: it is a great language too, and although it isn’t as strict as Rust’s borrowing system, it is still pretty safe to use and still very, very fast compared to other languages, thanks to using reference counting instead of garbage collection; that is absolutely great, until you create reference cycles, of course.) 🙃