RUST was made for fast, safe, low-level concurrent programs. It has fast automatic memory cleanup (worked out at compile time, so there's no run-time garbage collector) and gives error messages about dangling references and possible race conditions. So that's nice. Of course, making the RUST compiler happy takes more work than in just about any other programming language. It's also a "hot" language, which I'm 99% sure means people are suffering through it for no reason in places where it provides no advantages.
The pain is mostly two intrusive rules. Its memory cleanup (and dangling-ref checker) works because each heap object has one designated owner. Passing and not passing that ownership is a pain, esp. remembering to use "non-owner" references. The other painful rule is that you can't have more than one usable pointer to an object if any of them can change it; so doubly-linked lists, parent pointers in trees, and pretty much any other interesting data structure can't be built from plain references. Not even a temporary short-cut pointer. The idea is that if you need a tree, you either find a built-in tree, wrap nodes in the library's shared-pointer types (Rc, Weak, RefCell), or use unsafe mode to write your own.
RUST seems like a standard post-2000 C-inspired language. It has easy (a,c,b) tuples, and the declaration style is name : type, which allows for implicitly typed declarations. It has the usual overly complicated enums and a fixation with mutable/non-mutable data.
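Snake-case identifiers (var_name_here) are preferred so much that varNameHere is a warning! Semicolons are mostly terminators, with a twist -- you don't need one on the last line of a block, and you can't have one there if the block is supposed to give back that line's value (the semicolon would throw the value away). Everywhere else, and always after a let, you do need them.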
    // I'm a RUST comment
    let cat_count = 6;  // "snake-case" identifier, implicit type
    if x<0 { x=0; y=0 }  // no final semi-colon inside {}'s
As is common, a single underscore stands for "unused". RUST goes further: an identifier starting with an underscore is still a normal variable, but the compiler won't complain if it's never used (for a plain x it warns that x is unused and suggests changing it to _x).
Declarations start with the keyword let and end with : type. Ex: let w : String;. As usual in this style, the type is optional if it can be deduced: let n = 6; is legal and fine.
There isn't a general int or float type -- those families include a size: i16 i32 u16 u32 f32 f64, and it's a low-level language, so I agree with that. It has a character type, char, with the usual single-quotes: 'a'. Then bool with true/false. Strings use "abc" (but see the funny casting rules). You can multi-declare, but only if you also assign through fake tuples and let RUST deduce the types: let (n1, n2) = (6, 25);.
RUST strings come in two types: String is the normal string class, which owns its characters, while &str is a borrowed view of some run of characters stored elsewhere -- basically a char-array pointer plus a length. For example, let w2 : String = "abc".to_string(); creates a new String as a copy of "abc", while let w : &str = "abc"; causes w to point to "abc" in the code (a literal is already a &str, so no extra & is needed). And then let w3 : &str = &w2; is a &str pointing to the chars in our String.
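To see the two string types side by side, here's a small sketch. The helper names first_word and shout are made up for this example; everything they call is from the standard library.

```rust
// Returns the first word of any string data, borrowed from the input.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

// Builds a brand-new owned String.
fn shout(text: &str) -> String {
    let mut out = text.to_uppercase();
    out.push('!');
    out
}

fn main() {
    let literal: &str = "abc def";           // points at text stored in the program
    let owned: String = literal.to_string(); // heap copy that we own
    let view: &str = &owned;                 // borrowed view into the String
    let tail: &str = &owned[4..];            // view of just part of it

    println!("{} {} {}", first_word(literal), first_word(view), tail);
    println!("{}", shout(tail));
}
```

Taking &str as the parameter type is the flexible choice: both literals and borrowed Strings can be passed in without copying.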
RUST has standard tuples: (2, 7.0) is one, with formal type (i32, f64) unless you declare it otherwise (plain number literals default to i32 and f64). Accessing fields of a tuple uses a dot followed by a zero-based-index: t1.0, t1.1 and so on. We get multi-assign with pretend temporary tuples as well.
    // declare (and assign) a tuple:
    let t1 : (i32, f32) = (5, 14.0);
    // play with fields using indexes:
    if t1.0<10 || t1.1<10.0 ...
    // fake tuple assign to n1 and n2:
    (n1,n2) = (3,7);
RUST goes in for mutable / non-mutable types. So much so that a basic declare, such as let n=7;, is a constant. Keyword mut, as in let mut x = 6;, creates a normal "mutable" variable.
    let y : f32 = 5.6;
    y += 1.0;  // error
    let mut y : f32 = 5.6;  // this one can be changed
This is where we'd normally talk about casting mutable to non-mutable and such, but RUST has a more complicated system, later.
RUST arrays are barely meant to be used -- it wants us to use the nicer list class. And that's fine -- lots of languages do that. Arrays are value types and the declaration includes the size! Crazy. Other facts: [] is used to make literals, formal array types are written [item_type; size] and dot-len() gives you that size. Ex:
    // declare and assign size-3 array:
    let nums: [f32; 3] = [4.0, 3.1, 7.0];
    let n2 = [2,7,18,12,9];  // or declare w/implicit type [i32; 5]
    let all_fours : [i16; 10] = [4; 10];  // shortcut for [4,4,4,4,4...]
    let sz = nums.len();  // size 3
    let x = nums[0]+nums[2];  // standard [] indexing
My trick is remembering that the size is always last: [i16; 6] is a size-6 type and [2; 6] has 6 twos.
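A compilable sketch of the array facts above, with the size written as part of the type. The function name sum3 and the variable names are invented for the example.

```rust
// Sums a fixed-size array; the size is part of the parameter's type,
// so passing a 4-item array here would not compile.
fn sum3(nums: [f32; 3]) -> f32 {
    nums[0] + nums[1] + nums[2]
}

fn main() {
    let nums: [f32; 3] = [4.0, 3.5, 7.0];
    let all_fours: [i16; 10] = [4; 10]; // ten copies of 4
    let mut copy = nums;                // value type: this is a full copy
    copy[0] = 0.0;                      // nums is not affected

    println!("{} {}", nums.len(), all_fours.len());
    println!("{} {}", sum3(nums), sum3(copy));
}
```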
RUST's nicer lists use the older name "vector", written Vec. The type Vec<f32> is a list of floats. They use push, pop, len, remove and so on:
    // declare a Vec and also create the (empty) object:
    let mut nums : Vec<f32> = Vec::new();
    nums.push(4.0); nums.push(7.0); nums.push(1.0);  // note: the point-zeroes are required
    nums[0]+=2.0;  // standard [] for indexing
    nums.remove(1);  // standard remove at index
    let n2 : f32 = nums.pop().unwrap();  // pop() returns a value OR None -- see "Option" types below
Typeless Vec::new() is an example of how aggressive RUST's type-guesser is. Not only is let v : Vec<i32>=Vec::new(); legal, but let mut v = Vec::new(); v.push(5); is also allowed -- RUST looks down to that push to see that you wanted an i32. For fun, you can supply a type with Vec::<f32>::new();
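Here's a short sketch of those Vec operations working together, including a typeless Vec::new() whose item type is only settled by a later push. The function take_big is a made-up example, not a library call.

```rust
// Removes and returns everything above `limit`, leaving the rest in place.
fn take_big(nums: &mut Vec<f32>, limit: f32) -> Vec<f32> {
    let mut big = Vec::new(); // item type is inferred from the push below
    let mut i = 0;
    while i < nums.len() {
        if nums[i] > limit {
            big.push(nums.remove(i)); // remove shifts later items down, so don't advance i
        } else {
            i += 1;
        }
    }
    big
}

fn main() {
    let mut nums: Vec<f32> = Vec::new();
    nums.push(4.0);
    nums.push(7.0);
    nums.push(1.0);
    nums[0] += 2.0;                 // list is now 6.0, 7.0, 1.0
    let last = nums.pop().unwrap(); // takes the 1.0 off the end
    let big = take_big(&mut nums, 6.5);
    println!("{} {:?} {:?}", last, big, nums);
}
```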
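RUST doesn't quite use the standard Reference/Value-type system. Everything is a value type: ints, floats, bools, arrays and even structs live directly in their variable, on the stack unless you ask otherwise. What acts like a reference type is anything that owns heap data -- a String or Vec is a small stack value holding a pointer to its heap buffer. The practical difference shows up in assignment: simple types (ints, floats, bools, arrays of those) are copied, while everything else is moved (see ownership, later).
A confusing bit is that structs are created on the stack. RUST never pulls an object into the heap on its own; you ask for that explicitly with Box::new(...) or by putting it in a container like Vec. Structs still feel like heap objects because assigning one moves it instead of copying it.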
Another confusing bit is that RUST saves the term "reference" for part of the ownership system (later). Below, the first is a plain variable which owns its Cat, while the second is what RUST calls a reference -- a non-owner pointer:
    let mut c : Cat = Cat{name:"Fred".to_string(), age:3};  // c owns a Cat
    // a non-ownership pointer to c,
    // this is what RUST calls a real reference:
    let c2 : &mut Cat = &mut c;
You might think that this is like call-by-reference or alias-style C++ reference, because of the use of &. It's absolutely not. It's a new RUST non-owner pointer thing. And it's also often called a "borrow".
There are no implicit casts, not even from int to float, which is super annoying: let n : f32 = 4; is a freaking error (you need to write 4.0).
RUST casts look like n as i32. For example: let f : f32 = i as f32;. There's no need for a C#-style dynamic downcast -- animal as cat -- since there's no inheritance.
let w : String = "abc"; is an error because RUST won't implicitly copy the "abc" from the code. You need to do it yourself using either String::from("abc") or "abc".to_string().
Explicit casting from strings (pulling a float out of "12.5") uses what looks like a magic rule: w.parse().unwrap() handles all cases, and RUST checks where the result is going to decide which type to convert into. This isn't overloading (RUST doesn't have that) -- parse is a generic function and the compiler infers its type from the receiving variable, the same as it does for any generic. Ex: w="34"; let n:i32 = w.parse().unwrap(). We'll see more on unwrap in the Nullables section.
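A small sketch of both kinds of conversion: as for numbers, and parse for strings. The helper parse_or is invented here to show a non-crashing alternative to unwrap.

```rust
// Parses text into an i32, falling back to a default instead of crashing.
fn parse_or(text: &str, fallback: i32) -> i32 {
    text.trim().parse::<i32>().unwrap_or(fallback)
}

fn main() {
    let i: i32 = 7;
    let f: f32 = i as f32;           // explicit int -> float
    let back: i32 = 12.9_f32 as i32; // float -> int drops the fraction

    let n: i32 = "34".parse().unwrap();     // target type picked from the declaration
    let x = "12.5".parse::<f64>().unwrap(); // or spelled out on the call itself

    println!("{} {} {} {}", f, back, n, x);
    println!("{}", parse_or("oops", -1));
}
```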
RUST has what other languages call nullable types -- for example, an int type which can also be null -- but does it using a generic Option type. The int? in many languages becomes Option<i32> in RUST.
unwrap() either retrieves the value or crashes the program (a "panic"). is_none() and is_some() check for null. Ex's:
    // since the list may be empty, this is an Option<i32>
    let nn = cows.pop();
    // did we get something?:
    if nn.is_none() { val=-1 }
    if nn.is_some() { val=nn.unwrap() }
    val=nn.unwrap();  // crashes if no value
    val=nn.expect("help! nn was empty");  // same, but crashes with our message
    val=nn.unwrap_or(-1);  // no crash: gives -1 if None
    // crazy value-returning match on the enum:
    let n3 : i32 = match nn { None => { -1 } Some(x) => { x } };
Literals for an Option-type are None, which works for anything (let x : Option<String> = None;), and Some() -- Some(6.0) creates an Option<f64>, or an Option<f32> if that's what it's being assigned to.
There's no formal null in RUST. You can't have let c : Cat = null;. The equivalent is let c : Option<Cat> = None;.
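To pull the Option pieces together, here's a small sketch of a function that may or may not have an answer, plus two ways of unpacking it. The names first_negative and describe are made up for the example.

```rust
// Returns the index of the first negative number, if there is one.
fn first_negative(nums: &[i32]) -> Option<usize> {
    for (i, n) in nums.iter().enumerate() {
        if *n < 0 {
            return Some(i);
        }
    }
    None
}

// Turns the Option into a printable message.
fn describe(found: Option<usize>) -> String {
    match found {
        Some(i) => format!("negative at index {}", i),
        None => "no negatives".to_string(),
    }
}

fn main() {
    let a = [3, 8, -2, 5];
    println!("{}", describe(first_negative(&a)));
    println!("{}", describe(first_negative(&[1, 2])));

    // "if let" runs its body only when there is a value inside:
    if let Some(i) = first_negative(&a) {
        println!("a[{}] = {}", i, a[i]);
    }
}
```

The match has to cover both None and Some, so forgetting the "null" case is a compile error instead of a run-time crash.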
There are no ++ or -- operators. But we get +=, *= and so on. Modulo works on floats.
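String-concat with + is lopsided: w+"abc" works only with a String on the left and a &str on the right, and it uses up w (ownership moves into the result). To add on in place we have w.push_str("add me");.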
Assignment statements have no useful return value: n1=n2=0; isn't allowed. Yes, this is the new trend, but RUST needs to do this since assignment is also used to transfer "ownership" of heap data.
RUST if's have no parens in the condition, but the curly-braces are required. They can be value-returning (so they double as the ?: operator). Exs:
    if a>b { max=a } else { max=b }
    max = if a>b {a} else {b};  // value-returning, {}'s are required
    if (a<0 || b<0) && a!=b { ...  // bool operators
Compares are the usual ==, !=, >=, ... and logic uses the standard && and || with the usual precedence.
As with if's, the condition has no ()'s and the body requires {}'s. We get the basic while n<10 { }. We also get a standard iterator-for: for item in List { }. We do not get an old-style for(;;) loop, but we can use the Python-style iterate over a range: for n in 1..10 { } (which creates range 1 to 9). We get break and continue.
There's an explicit infinite loop: loop { ... } (which seems silly since the standard while true { ... } works, though the compiler warns about it and tells you to use loop). Exs:
    while S.len() !=0 { print!("{} ", S.pop().unwrap()) }
    // infinite loop with break:
    loop { if S.len()==0 { break } print!("{} ", S.pop().unwrap()) }
    // basic list loop (the & borrows S; a bare S would be used up by the loop):
    for n in &S { print!("{} ", n) }
    // Python-style 0 to size-1 index loop:
    for i in 0..S.len() { print!("S[{}]={} ", i, S[i]) }
    // NOTE: S.len() is correct -- range excludes final value
RUST uses the "new" function style with a keyword and the return type at the end: fn func_name(x: i32) -> i32 { ... }. Value-returning functions can omit the return on the last line (and remember: no final semicolon).
There's no void in RUST. Instead, an empty tuple is used: fn proc(...)->() { ... }. Or this can be implicit by leaving it out: fn proc() { ... }. Technically these functions auto-return the empty tuple, ().
    // standard function:
    fn max(a:f32, b:f32)->f32 { if a>b {a} else {b} }
    // implicit void function:
    fn print_dashes() { print!("----------") }
As mentioned, RUST has no function overloading. It also lacks default parameters. Operator overloading exists, but only by implementing the matching library trait (Add for +, and so on). It doesn't have cool variadic parameters, but...
Functions ending in an exclamation point, !, are actually macros. Those are generally anything taking a variable number of arguments. print! and println! are macros.
RUST has function pointers. They're declared how you'd expect. Here's one: let fp : fn(i32, i32)->i32;. Assign and call the usual way: assign fp=max_int; and call fp(5,2). As input to a function it's also normal:
    // assigning a function pointer to a function:
    fn add_one(n:i32)->i32 { n+1 }
    let fp : fn(i32)->i32;
    fp=add_one;
    let x = fp(8);  // x is 9
    // taking a function as input:
    fn count_if(S:Vec<i32>, is_good:fn(n:i32)->bool)->i32 {
      let mut count=0;
      for n in S { if is_good(n) { count+=1 } }
      count
    }
    fn is_two_digit(n:i32)->bool { n>=10 && n<=99 }
    println!("count is {}", count_if(S2,is_two_digit));
Anonymous functions, on the other hand, are crazy. They need a pair of |'s around the input (usually just identifiers -- the types are inferred, though you're allowed to write them) then a body, which needs {}'s only if it's more than one expression. Here's an anonymous max function: fp = |a,b| if a>b {a} else {b};.
But once you get past that, using them is fine:
    // sending an anonymous function (to count odd numbers):
    let c=count_if(S2, |n|n%2!=0);  // !=0 so negative odds count too
But then RUST is funny with closures -- a function which captures an external value -- for example |x|y+x capturing y. You need a special extra syntax to use closures. Here's a standard "curried" add (you give it N and it returns a function "+N" which adds N to the input):
    // this returns a closure:
    fn adder(x:i32)->impl Fn(i32)->i32 { move |y|y+x }
That needed 3 special things: impl leading the return type, capital F in Fn (normally it's lowercase fn), and move before the anonymous function. The type of a closure can't be written down, which means we can declare let add8 = adder(8) but can't write let add8 : impl Fn(i32)->i32 = adder(8).
Closures start off being used normally: let add8 = adder(8); let num = add8(5); (which is 13). But a function which takes closures as input is once again weird. It needs another special syntax with template style and a where:
    // takes an array (for fun, instead of a Vec) and a closure:
    fn count_if<F>(nums : [i32;5], is_good:F) -> i32
        where F: Fn(i32)->bool {
      let mut count=0;
      for n in nums { if is_good(n) { count+=1 } }
      count
    }
Anything which can take closures can also take normal functions. So I guess this is how all RUST function-taking functions should be written?
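Here's a runnable sketch of that claim: one generic count_if accepting a plain function, a capturing closure, and a closure returned from another function. It takes a slice instead of a fixed-size array, which is my own tweak so any length works.

```rust
// Generic over any callable taking an i32 and returning bool;
// works with closures (capturing or not) and with plain functions.
fn count_if<F>(nums: &[i32], is_good: F) -> i32
where
    F: Fn(i32) -> bool,
{
    let mut count = 0;
    for &n in nums {
        if is_good(n) {
            count += 1;
        }
    }
    count
}

fn is_two_digit(n: i32) -> bool {
    n >= 10 && n <= 99
}

// Returns a closure that has captured x.
fn adder(x: i32) -> impl Fn(i32) -> i32 {
    move |y| y + x
}

fn main() {
    let nums = [4, 15, 99, 120, 7];
    let limit = 10; // captured by the closure below
    println!("{}", count_if(&nums, is_two_digit));   // plain function
    println!("{}", count_if(&nums, |n| n < limit));  // capturing closure
    println!("{}", adder(8)(5));
}
```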
RUST uses mod to create namespaces. It uses :: for the scope resolution operator. It's got new-style visibility rules where everything is completely public within the same module, otherwise you decide (private unless prefaced with pub):
    // mod(ule) = a RUST namespace
    mod cats {
      // a public struct w/public vars:
      pub struct Meow { pub loudness:f32 }
      // same but w/a private var:
      pub struct Tail { color:String }
      // color in Tail is public in same scope:
      pub fn show_tail(tt:Tail) { print!("{}", tt.color) }
    }
    // dig into the module using :: :
    let mut t1 : cats::Tail;
    t1.color = "red".to_string();  // ERROR -- color is private outside module
Dumping a module's contents into the current scope is done with use. A wildcard grabs everything: use cats::*;. Or do it 1-at-a-time like use cats::Meow;, or choose multiples: use cats::{Meow, Tail};.
RUST classes use the keyword struct. For no good reason, fields are separated by commas. Instead of constructors they have a special syntax with ClassName{fieldName:value, ...}:
    struct Point {  // a class
      x : f32,  // comma, ugh
      y : f32
    }
    let p1:Point = Point{x:1.0, y:5.2};  // RUST's constructor
    let xx = p1.x;  // field accessed with standard dot
Recall public/private is module based. Defined as they are -- in our scope -- x and y are public to us. If Point was defined in a module they'd have needed pub in front or else been private to us.
Getting weirder, member functions are written outside the class, in an impl block, where the first parm is &self:
    // member functions in own area:
    impl Point {
      // non-modifying:
      fn quadrant(&self)->i16 {
        let x=self.x; let y=self.y;
        if x>=0.0 { if y>=0.0 {1} else {4}}
        else { if y>=0.0 {2} else {3}}
      }
      // self-modifying needs mut:
      fn set(&mut self, _x:f32, _y:f32) { self.x=_x; self.y=_y }
      // static (since no 'self'):
      fn new()->Point { Point{x:0.0, y:0.0} }
    }
If the member function modifies itself we need &mut self. Leaving out 'self' is allowed and creates a static, in this case, Point::new(). Using the name "new" for the explicit creation function is merely a convention -- new isn't a keyword.
And note the self.x in the member functions. Yeah ... we're doing that again.
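Putting the struct and its impl block together into something that compiles on its own, here's a minimal sketch. The shift function and the two-argument new are my variations on the Point above.

```rust
struct Point {
    x: f32,
    y: f32,
}

impl Point {
    // "static" creation function; the name new is only a convention
    fn new(x: f32, y: f32) -> Point {
        Point { x, y } // shorthand for Point { x: x, y: y }
    }

    // read-only member function
    fn quadrant(&self) -> i16 {
        if self.x >= 0.0 {
            if self.y >= 0.0 { 1 } else { 4 }
        } else {
            if self.y >= 0.0 { 2 } else { 3 }
        }
    }

    // modifying member function
    fn shift(&mut self, dx: f32, dy: f32) {
        self.x += dx;
        self.y += dy;
    }
}

fn main() {
    let mut p = Point::new(1.0, 5.0);
    println!("{}", p.quadrant());
    p.shift(-3.0, 0.0); // p must be declared mut for this call
    println!("{}", p.quadrant());
}
```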
RUST has no inheritance. Instead it goes with abstract interfaces, which it calls traits. Writing them is about the same as usual:
    // an interface with two functions:
    trait has_name {
      fn first(&self)->&str;  // no inputs, the default name
      fn get_one(&self, index:i32)->&str;  // returns one of several name parts
    }
These could have returned String, but returning a &str works just as well here.
As with member functions, a struct implements the interface explicitly through another impl block. The header is new language for "this implements interface X", but the rest is regular member functions:
    struct Cat { name : String, age : i32 }
    impl has_name for Cat {
      fn first(&self)->&str { &self.name }
      fn get_one(&self, i:i32)->&str {
        if i==0 { return &"Ms." }
        else if i==1 { return &self.name }
        else if i==2 { return &"the cat" }
        else { return &"" }
      }
    }
Here the &str return value could come from a literal (&"Ms.") or a created String (&self.name). If we needed to build a return string -- for example format!("Ms. {}", self.name) -- we'd need to return a String, otherwise we'd be referencing something going out of scope.
Polymorphic functions can work in two ways. A true one using dynamic dispatch -- one copy which can run anything -- uses &dyn:
    // takes anything implementing has_name:
    fn print_whole_name(val : &dyn has_name) {
      let mut i = 0;
      let mut w=val.get_one(i);
      while !w.is_empty() {
        print!("{} ",w);
        i+=1;
        w=val.get_one(i);
      }
    }
    // call in usual way:
    print_whole_name(&my_cat);  // or just (my_cat) if you're already a borrow
Or you can use the old trick of having RUST create a copy of your function for each type, by using impl:
fn print_whole_name(val : impl has_name) {
This is fun, showing RUST really is concerned with low-level stuff. There's no difference in what the program does. It's a choice between speed (impl skips the run-time lookup) vs. program size (dyn keeps a single copy).
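Here's a compact sketch with both styles next to each other. The trait is renamed HasName (the compiler's preferred capitalization), and the Robot type is invented so there are two implementers to dispatch between.

```rust
trait HasName {
    fn first(&self) -> &str;
}

struct Cat { name: String }
struct Robot { id: u32 } // hypothetical second type, just for the demo

impl HasName for Cat {
    fn first(&self) -> &str { &self.name }
}
impl HasName for Robot {
    fn first(&self) -> &str {
        if self.id == 0 { "prototype" } else { "unit" }
    }
}

// Dynamic dispatch: one compiled copy, the right first() is looked up at run time.
fn greet_dyn(val: &dyn HasName) -> String {
    format!("Hello, {}", val.first())
}

// Static dispatch: the compiler makes a separate copy per concrete type.
fn greet_impl(val: &impl HasName) -> String {
    format!("Hello, {}", val.first())
}

fn main() {
    let c = Cat { name: "Fred".to_string() };
    let r = Robot { id: 7 };

    // Only the dyn form lets different types share one list:
    let mut all: Vec<&dyn HasName> = Vec::new();
    all.push(&c);
    all.push(&r);
    for item in all {
        println!("{}", greet_dyn(item));
    }
    println!("{}", greet_impl(&c));
}
```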
This one is just cutesy -- a tiny limited-use shortcut that has no effect on anything. You're allowed to redeclare variables in the same scope (RUST calls it shadowing). The main use is to eliminate very temporary variables. Ex:
    // converting W[0] into an integer:
    let num : String = W[0].clone();  // "341", for example
    let num : i32 = num.parse().unwrap();  // redeclaring num
Without this trick we'd have had to make up a name like numAsString for that temp. So this rule is nicer, but it seems like not nice enough to justify the added confusion.
Sometimes this seems as if it can fake changing a non-mutable, for example let x = 4; ... let x = 7;. But it's really the same cutesy trick -- we're done with the first x, never using it again, and are allowed to reuse the name for a new variable.
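A short sketch of redeclaring in action, including the detail that a redeclare inside an inner block only hides the outer variable until the block ends. The function double_count is made up for the example.

```rust
// Reads a count from text, reusing the name "count" at each stage.
fn double_count(count: &str) -> i32 {
    let count = count.trim();                    // still a &str, minus the spaces
    let count: i32 = count.parse().unwrap_or(0); // now an i32; the &str is hidden
    let count = count * 2;                       // a third variable, same name
    count
}

fn main() {
    let x = 4;
    {
        let x = "inner"; // hides the outer x only inside this block
        println!("{}", x);
    }
    println!("{}", x); // the outer x was never changed
    println!("{}", double_count(" 21 "));
}
```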
RUST's version of garbage collection is original and kind of slick. As you code, you're required to establish a single "owning" variable for each bit of heap data. When that variable leaves scope or otherwise stops owning it, the object is freed (RUST says "dropped"). And since that can be determined by the compiler there's no run-time garbage collector -- just a single invisibly inserted "free this item" call. Dangling references are taken care of as well -- RUST gives you fat error messages if you leave any.
Obviously, the initial assignment of an object becomes the owner, but it can be transferred. In fact, it's always transferred unless you say otherwise. This is a big pain. Non-owner pointers are called borrows and use an &:
    let w : String = "abc".to_string();  // owner
    let w2 = w;  // now w2 is the owner
    let wa : &String = &w2;  // wa is a non-owner reference
    let wb : &String = wa;  // another one
Notice how we're working hard. We need to explicitly say we're a borrow and explicitly convert into one, using &. So wa and wb are &String's and we use wa=&w2 on the third line (but assigning wb=wa is fine since wa is already a borrow).
The same dance is required with parameters. They need to be declared as borrows and owners need to be turned into borrows when called:
    fn show(a:&String) { print!("{}",a) }  // this takes a borrow
    show(&w2);  // need to convert w2 into a borrow
    show(wa);  // but no need here since wa already a borrow
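Strangely, extra &'s seem to be ignored, so an accidental show(&wa) is fine, or even &&wa. (The compiler is quietly peeling off the extra layers for us; it does that whenever a reference-to-a-reference is handed to something wanting a plain reference.)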
The easy transfer of ownership is to our advantage in creation functions. Here, with no extra work, p transfers ownership (to cat_location) before going out of scope:
    fn make_point(x:f32, y:f32)->Point {
      let p:Point = Point{x:x, y:y};
      return p;
    }
    let cat_location=make_point(4.0, 16.2);
An example of what you can't do, this function forgot to require a borrow, which means it takes ownership, which means it destroys its input:
    // this broken function always destroys the string it gets:
    fn fff(a:String) { print!("{}",a) }
    fff(w);  // w passes ownership
    print!("{}",w);  // compiler error -- w gave away its string
Once you pass ownership, the variable is un-usable: it's still in scope, but the compiler treats it as having no value. But it's only unusable until you give it something else to own:
    let mut p1 = Point{x:1.1, y:2.2};
    let mut p2 = p1;  // transfer ownership to p2
    print!("{}",p1.x);  // error, p1 gave it away and is dead
    p1=p2;  // p1 gets ownership back
    print!("{}",p1.x);  // p1 is usable again
    println!("{}",p2.y);  // error -- now p2 gave away and is dead
    p2 = Point{x:8.0, y:-3.0};  // p2 is usable again!
And here's a swap also showing ownership changes (note how temp needed to be an owner. As a borrow it would be a dangling reference to unowned "abc"):
    // w and w2 swap (values and ownership):
    let mut w : String = "abc".to_string();
    let mut w2 : String = "def".to_string();
    let temp=w;  // transfers ownership
    w=w2;
    w2=temp;
    print!("{} {}",w, w2);  // def abc (swapped owners)
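For a version of these ownership moves that compiles as-is, here's a sketch of the three ways a function can receive a String: taking it, borrowing it, or taking it and handing it back. The function names are invented; std::mem::swap is the library's ready-made swap, which does the temp dance for us.

```rust
// Takes ownership: the caller's variable is unusable afterwards.
fn consume(s: String) -> usize {
    s.len()
} // s is freed here

// Borrows: the caller keeps ownership.
fn peek(s: &String) -> usize {
    s.len()
}

// Takes ownership and hands it back, modified.
fn add_bang(mut s: String) -> String {
    s.push('!');
    s
}

fn main() {
    let mut w = "abc".to_string();
    let mut w2 = "def".to_string();

    std::mem::swap(&mut w, &mut w2); // contents trade places; no temp needed
    println!("{} {}", w, w2);

    println!("{}", peek(&w));   // w is still usable after a borrow
    let w = add_bang(w);        // ownership goes in and comes back out
    let copy = w.clone();       // a real second copy, separately owned
    println!("{}", consume(w)); // w is gone after this line
    println!("{}", copy);       // the clone is unaffected
}
```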
For mutable variables the owner/borrow rule is the same, except both the type and the conversion use &mut:
    // "borrow" of a mutable:
    let mut p1 = Point{x:1.3, y:6.0};
    let p2 : &mut Point = &mut p1;
    // function taking (and being passed) a mutable "borrow":
    fn add_p(p:&mut Point, xx:f32, yy:f32) { p.x+=xx; p.y+=yy }
    add_p(&mut p1, 1.0, 5.0);
This borrow system allows RUST to aggressively track object lifetimes and throw compile-time errors when there's a problem. For example, here I've worked to make w refer to the name of a dead out-of-scope cat. RUST sees this and gives a nice compile-time error:
    let mut w : &str = &"";
    {
      let c = Cat{name:"Fred".to_string(), age:3};
      w = cat_name(&c);  // cat_name (written out further down) returns a &str into c
    }  // c falls out of scope
    println!("{}",w);  // compile error -- w points to dead cat
Finally we come to the rule which makes RUST nearly useless for normal programming. Immutables can have any number of borrows. That's fine. But mutable variables can have only one live writable reference at a time. You're allowed one mutable borrow, which disables the owner until the borrow is finished with. Or as many non-mutable borrows as you like, which turn the owner into a non-mutable until they go away.
    let mut p1 = Point{x:0.0,y:0.0};
    let p2 = &mut p1;  // now only p2 can be used to write, not p1
    p1.y+=0.1;  // ERROR -- ref p2 obstructing p1
    p2.x=0.0;  // last use of ref p2
    p1.y=3.4;  // p1 is legal again
That last line, where p1 is magically legal again, shows a special rule: RUST aggressively destroys references after the last time they're used.
It turns out, and makes sense, RUST locks an entire container if you reference any part of it:
    // a sample list:
    let mut P : Vec<Point> = Vec::new();
    P.push(Point{x:1.0,y:3.0});
    P.push(Point{x:0.1,y:0.5});
    let pp = &mut P[0];  // reference to 1 item in list
    // the entire list is locked:
    P[1].x=0.4;  // ERROR -- ref to [0] locks whole thing
    P.push(p1);  // ERROR -- ref to [0] locks whole thing
    pp.x+=0.1;  // need this here to establish lifetime of pp
Likewise, if a reference could be to several objects, all count as used:
    // p3 could be p1 or p2, both are locked:
    let p3 : &mut Point;
    if p1.x<p2.x { p3 = &mut p1; } else { p3=&mut p2; }
    p1.x=1.0;  // ERROR -- p1 could be obstructed by p3
    p2.x=6.5;  // ERROR -- p2 could also be obstructed by p3
    p3.y=9.0;  // establish p3 as active
As usual, p1 and p2 are unlocked after the last use of p3.
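Here's a sketch of two ways of living with the locking, under my own assumptions about what you'd want: finish with the borrow before touching the list again, or ask the library to split the list into non-overlapping pieces. split_at_mut is a standard slice function; add_first_to_rest is a made-up example.

```rust
#[derive(Debug, PartialEq)]
struct Point { x: f32, y: f32 }

// Adds the first point's x to every later point, and tallies in the
// first point's y how many were changed. Needs two writable borrows
// into the same list at once, which split_at_mut makes legal.
fn add_first_to_rest(pts: &mut [Point]) {
    if pts.is_empty() {
        return; // split_at_mut(1) would crash on an empty list
    }
    let (head, tail) = pts.split_at_mut(1);
    let first = &mut head[0];
    for p in tail.iter_mut() {
        p.x += first.x;
        first.y += 1.0;
    }
}

fn main() {
    let mut pts = vec![Point { x: 1.0, y: 3.0 }, Point { x: 0.5, y: 0.5 }];

    {
        let pp = &mut pts[0]; // the whole Vec is locked...
        pp.x += 1.0;
    } // ...until pp is done with
    pts.push(Point { x: 0.0, y: 0.0 }); // legal again

    add_first_to_rest(&mut pts);
    println!("{:?}", pts);
}
```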
Attempting to return strings is a nice example of how RUST worries about underlying storage. This next function only returns existing character strings, so can use &str to save space:
    // using &str to avoid allocating more space:
    fn cat_name(c:&Cat)->&str {
      if c.name.is_empty() { return "none" }
      else { return &c.name }
    }
Or we can have it return a real String. We're required to create it (ownership transfers to the caller automatically):
    // using String instead to get a more usable output:
    fn cat_name2(c:&Cat)->String {
      if c.name.is_empty() { return "none".to_string() }
      else { return c.name.clone() }
    }
If we need to build a value, we can only return a String, since a &str will have nothing permanent to refer to:
    // forced to use String because of created "Cat is"+c.name:
    fn cat_name3(c:&Cat)->String {
      if c.name.is_empty() { "none".to_string() }
      else {
        let mut w = "Cat is ".to_string();
        w.push_str(&c.name);
        w
      }
    }
Surprisingly, this next function gives errors about a missing "lifetime" on the return type (and it won't if the input is a &str):
fn ff(n:i32)->&str { if n==0 { return "none" } else { return "not empty" } }
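It turns out every returned reference has a "lifetime" saying what it's allowed to point into. RUST guesses it when there's a reference input to copy it from, which covers most cases. With no reference inputs there's nothing to guess from, and we have to write it ourselves: ->&'static str says the result points at text that lives as long as the program.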
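Here's a small sketch of the three situations for a returned reference: no reference inputs, two of them, and exactly one. All three function names are made up for illustration.

```rust
// No reference inputs, so the lifetime on the output must be spelled out.
// 'static means "lives for the whole program", which is true of string literals.
fn size_word(n: i32) -> &'static str {
    if n == 0 { "none" } else { "not empty" }
}

// Two reference inputs: the compiler can't tell which one the result
// borrows from, so a named lifetime 'a ties all three together.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// One reference input: the output lifetime is filled in automatically.
fn first_char(s: &str) -> &str {
    match s.chars().next() {
        Some(c) => &s[..c.len_utf8()],
        None => "",
    }
}

fn main() {
    println!("{}", size_word(0));
    println!("{}", longer("cat", "horse"));
    println!("{}", first_char("abc"));
}
```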
RUST is supposed to be good for concurrency safety. It doesn't have any new rules -- use either plain-old message-passing, or mutexes. But the mutexes are a little different -- they're not just locks that go around critical sections -- a RUST mutex goes over one variable and serves as access protection for just that. So you can't share a changeable variable between threads -- you share RUST-mutexes which encapsulate that variable. To use the variable you're forced to get a handle, which also waits for a lock. So RUST absolutely enforces mutual exclusion on shared variables. And that only works since RUST aggressively ensures there are no other pointers which could get around that mutex.
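A minimal sketch of a mutex wrapped around one variable and shared by several threads. The text above doesn't mention it, but handing the same mutex to more than one thread normally goes through the library's Arc wrapper (a thread-safe shared owner); that part is my addition, as is the function name parallel_count.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Starts `workers` threads that each add 1 to a shared counter `times` times.
fn parallel_count(workers: usize, times: usize) -> usize {
    // Mutex wraps the variable; Arc lets several threads co-own the Mutex.
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..workers {
        let shared = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..times {
                let mut guard = shared.lock().unwrap(); // waits for the lock
                *guard += 1; // the only way to reach the number is through the handle
            } // guard leaves scope on each pass, releasing the lock
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```

There's no unlock call anywhere: the lock is released when the handle goes out of scope, which is the ownership system doing the bookkeeping.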
Every language does slices a little differently, so let's have fun and see how RUST does theirs.
Slices are gotten with mostly the usual notation: &aa[2..5] grabs 3 items from aa (at 2 through 4, exclusive of ending value). Shortcut &aa[2..] goes until the end and &aa[..3] goes from the start to just before 3.
Slices are always references/borrows, which is why they're taken with &. The type of a slice is something like just &[i32] -- an array type with no size. And of course, slices also count as borrows for the locking rules above.
Slices can be used to assign (if they're created as mutables):
    let mut aa=[1,2,3,4,5,6];
    let as1 : &mut[i32] = &mut aa[2..];
    as1[0]=99;  // aa is [1,2,99,4,5,6]
They work for vectors the same as arrays:
    let mut vv : Vec<i32> = Vec::new();
    for i in 1..6 { vv.push(i) }
    let v_slice = &mut vv[2..];  // vector, array, same thing
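The practical payoff of slices is that one function can accept arrays, Vecs, or pieces of either. A small sketch, with total and zero_out as made-up helper names:

```rust
// Works on any run of i32s: a whole array, a whole Vec, or a piece of either.
fn total(nums: &[i32]) -> i32 {
    let mut sum = 0;
    for n in nums {
        sum += *n;
    }
    sum
}

// Writes through a mutable slice, changing the underlying storage.
fn zero_out(nums: &mut [i32]) {
    for n in nums.iter_mut() {
        *n = 0;
    }
}

fn main() {
    let mut aa = [1, 2, 3, 4, 5, 6];
    let vv: Vec<i32> = vec![10, 20, 30];

    println!("{}", total(&aa[2..5])); // items at 2, 3 and 4
    println!("{}", total(&vv));       // a borrowed Vec converts to a slice
    println!("{}", total(&vv[..2]));

    zero_out(&mut aa[4..]);           // only the last two items change
    println!("{:?}", aa);
}
```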