Scala for Programmers

Scala runs using Java bytecode, but it's not terribly Java-like, except for using Value/Reference types. It's very into implicit types, and silly shortcuts. It even has Lisp-style singly-linked lists. This isn't in order of weirdness. If you see something relatively normal, that doesn't mean we're almost done.

Fluff, misc

Ending semi-colons are optional. They're meant to separate same line statements:
n=1; x=2 // no semicolon, but could have one.
It's actually more complex. A newline, with no semi-colon, can end a statement, or the statement could continue onto the next - the compiler will try to guess:
```
n= // no semi-colon continues to next line
1  // but no semi-colon here understands the statement is done
x=2
```
Has // and /* */ comments. Allows nested /* /* */ */ comments (which is nice, but the bug it prevents can't happen with modern syntax highlighting).
If you make an array or list by putting 1 item on each line, the last can end with a comma, like "cats",.
Allows triple double-quote """cats""" multiline strings.
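0-input member functions can be declared without the ()'s, and are then called without them: W.size, never W.size(). Ones declared with empty ()'s (including Java ones like String's length()) may be called with or without: W.length or W.length(). Scala 3 is stricter, and only lets Java functions drop the ()'s. You can think of these as C#-style "properties", but they're written like normal functions.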
1-input member functions may be called without the symbols: W.contains("cow") is also W contains "cow". In other words, every 1-input function also works like a binary operator. The intent is to make "x contains y" feel like a built-in part of the language.
Function calls with one input may use round or curly braces: func1(3) or func1{3}. The curly braces are never required.
The () operator can be overloaded, but it's written as a function named "apply" (this is semi-common in other languages).
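
A minimal sketch of the calling shortcuts above. The Doubler class and its plus function are made up for illustration; only contains and length are real library calls:

```scala
object CallSyntaxDemo {
  // Hypothetical class, only here to show apply and operator-style calls
  class Doubler {
    def apply(n: Int): Int = n * 2   // lets us write d(5)
    def plus(n: Int): Int = n + 2    // any 1-input function works as an operator
  }

  def main(args: Array[String]): Unit = {
    val d = new Doubler
    println(d(5))                        // 10, same as d.apply(5)
    println(d plus 3)                    // 5, same as d.plus(3)
    println("catfish" contains "fish")   // true, same as "catfish".contains("fish")
    println("catfish".length)            // 7, Java's length() without the ()'s
  }
}
```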

Reference/Value types

Like Java, classes are only reference types. But it goes further. Value-types are also references, to constant data. Conceptually, n=6 finds or creates an object with 6 and points n to it (the same way Python does it). In practice, the compiler quietly uses JVM primitives where it can.

Operators for pointer-compare and deep-compare are flipped. eq and ne compare references. == and != compare contents (they run your equals function).
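
A small sketch of the two kinds of compare, using two Lists built separately so they have equal contents but are different objects:

```scala
object EqualityDemo {
  val a = List(1, 2, 3)
  val b = List(1, 2, 3)   // same contents, separate object
  val c = a               // second reference to the same object

  def main(args: Array[String]): Unit = {
    println(a == b)  // true: compares contents, by running equals
    println(a eq b)  // false: two different objects
    println(a eq c)  // true: the same object
    println(a ne b)  // true: ne is the opposite of eq
  }
}
```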

Math

C-style Char, Int, Float (2.4f) and Double. Automatic up casting: 'a'+4+6.8 is 101+6.8, or 107.8
5/3 is int division. Also has float modulo: 7.5%3 is 1.5
No ++ or --, but has n+=3 and so on.
Standard bit operations ~, |, &, <<, >> and ^.
w*3 repeats a string 3 times.
Cast using dot-toType: n.toInt + 2.3f.toDouble + 99.toChar. This works since everything is an object.
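
The math rules above, as a quick sketch you could paste into a file. The val names are just for illustration:

```scala
object MathDemo {
  val promoted: Double = 'a' + 4 + 6.8   // Char to Int to Double: about 107.8
  val intDiv: Int = 5 / 3                // 1, int division
  val floatMod: Double = 7.5 % 3         // 1.5, modulo works on floats
  val repeated: String = "ab" * 3        // "ababab"
  val asChar: Char = 99.toChar           // 'c'
  val chopped: Int = 2.9.toInt           // 2, toInt drops the fraction

  def main(args: Array[String]): Unit = {
    println(promoted)
    println(s"$intDiv $floatMod $repeated $asChar $chopped")
  }
}
```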

Declarations

Syntax is val or var, the name, then an optional colon with the type. val is a constant (or a constant pointer to non-const data). It prefers to guess the types whenever it can:

var n : Int = 0  // variable. Initial value is required
var n=7 // using inferred type shortcut
val N : Array[Int] = Array(2,4,6,8) // val is constant (to non-const data, if a pointer)
    N = Array(1,2,3,4,5) // error - attempting to change const pointer
    N(2)=7 // fine. N is const, the array object isn't

var ch : Char = 'e' // C-style char syntax
var w : String = "catfish" // nothing special. ": String" was optional

Declared vars must be given a starting value. For member vars (not locals), underscore is the standard for "the default value of this type": var n:Int = _.

Template types exist, and use square brackets around the types:

var L1 : List[String] = null // standard list of strings
var D1 : Tuple3[Int, Int, String] =  (3,5,"cow") // D1 is a 3-part tuple
// the type wasn't needed, since it would have been guessed from (3,5,"cow")

// template function. This takes a list of any type:
def findIn[T] (L : List[T]) : Int =  ...

String construction shortcuts

Simple +, plus some older shortcuts:

w = "We saw " + num + " cats"  // implicit cast to string
  
// script style interpolation if you add "s" before the string. Use $var or ${expression}:
w = s"We saw $num cats with ${num*4} total paws"
w = s"Cats cost $$$catCost" //  Cats cost $6 - $$ is an "escaped" $

A leading f is like an s, but adds printf-style string formatting: f"cat temp is $t1%3.1f" displays t1 with 1 decimal place, padded to at least 3 characters. Can ignore escapes by using raw"ascii art: \/\/". Can use the placeholder and values-come-later method: "Have %d cats and %d dogs".format(c1, d1).
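
Here's a sketch running each string-building style side by side. The values (3 cats, a cost of 6, a temperature of 38.46) are invented for the example:

```scala
object InterpolationDemo {
  val num = 3
  val catCost = 6
  val t1 = 38.46

  val plain = "We saw " + num + " cats"                              // plain +
  val viaS = s"We saw $num cats with ${num * 4} total paws"          // s: $var and ${expr}
  val dollar = s"Cats cost $$$catCost"                               // $$ is one real $
  val viaF = f"cat temp is $t1%5.1f"                                 // 1 decimal, padded to 5 characters
  val viaFormat = "Have %d cats and %d dogs".format(2, 5)            // values come later

  def main(args: Array[String]): Unit = {
    println(plain)      // We saw 3 cats
    println(viaS)       // We saw 3 cats with 12 total paws
    println(dollar)     // Cats cost $6
    println(viaF)       // the decimal separator follows the default locale
    println(viaFormat)  // Have 2 cats and 5 dogs
  }
}
```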

Tuples

Naked parens are tuples. They're immutable. Use underscore-index to look inside. Oddly, the first item is 1, not 0:

var t2 = ("cats",7) // infers type is Tuple2[String,Int]
var t2b : Tuple2[String,Int] = t2 // using the formal type
var animal = t2._1 // _1 is the first item
t2._1 = "dogs" // ERROR - tuples are immutable (as usual)

Scala has the multiple assign shortcut, with underscore as an ignore:

var (aType, aAmt) = (5,9) // declare both and assign from tuple
var (x1, _, x2) = (4,8,9) // x1=4, x2=9
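
A short runnable version of the tuple rules. The copy call at the end isn't mentioned above; it's the usual way to get a changed tuple, since the original can't be edited:

```scala
object TupleDemo {
  val t2: (String, Int) = ("cats", 7)   // (String, Int) is the short way to write Tuple2[String, Int]
  val animal: String = t2._1            // "cats": _1 is the first item
  val count: Int = t2._2                // 7
  val (x1, _, x2) = (4, 8, 9)           // x1 = 4, x2 = 9, the 8 is ignored
  val t3 = t2.copy(_1 = "dogs")         // a new tuple ("dogs", 7); t2 is unchanged

  def main(args: Array[String]): Unit = {
    println(s"$animal $count $x1 $x2 $t3")
  }
}
```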

If's

Standard C/Java style except they're value-returning. In other words, a Scala if/else is also a ?:. Scala also adds extra compare operators:

var w = if(cats>0) "mieow" else "no cats" is legal and considered fine.
Besides standard && || <= ! and so on, has b1^b2 for an exclusive or.
Single & and | are non-short-circuit versions of && and ||. if(checkA() | checkB()) will run checkB even if checkA is true (which wouldn't happen with ||).

You're encouraged to use match, which is like a switch that returns a value. It has many special rules for how to match. If no case matches, it's a run-time error (a MatchError):

// running a match on "numOfCats". n1 is a string
 val n1 = numOfCats match { 
       case 3 => "herd" // technically 3 is a pattern, matching number 3
       case 5 | 8 => "frog" // regular-expression style
       case _ => "" // underscore is default.
       // Actually, any identifier is a default, but _ is a good "don't care"
    }

The pattern can't do range-checking, but you can add if's afterwards:

 val n2 = testScore match { 
       case n if n<60 => "F" // works like a function: n is placeholder for the input
       case n if n<70 => "D" // can use same n, or different
       case n => "" // default (using _ here is more common)
 }
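
The two matches above, wrapped into functions so they can be run. The function names grade and herdName are made up, and herdName leaves out the default on purpose to show the run-time error:

```scala
object MatchDemo {
  def grade(testScore: Int): String = testScore match {
    case n if n < 60 => "F"            // n stands for the input, the if is the real test
    case n if n < 70 => "D"
    case 100         => "perfect"      // a plain value pattern
    case _           => "C or better"  // default
  }

  def herdName(numOfCats: Int): String = numOfCats match {
    case 3     => "herd"
    case 5 | 8 => "frog"
  } // no default: any other input throws scala.MatchError

  def main(args: Array[String]): Unit = {
    println(grade(65))     // D
    println(herdName(8))   // frog
  }
}
```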

Loops

Scala has standard C/Java while and do-while loops. But the for loops are only python-style, over a range you create, or a list. The <- is part of the syntax:

for (i <- 1 to 10) { do stuff with i } // i counts 1,2,3 ... 10
for (i <- 1 to 10 by 2) // 1,3,5,7,9
for (i <- 0 until n) // 0 to n-1: "until" leaves off the end value

for(w <- listOfAnimalNames) // standard foreach list loop

1 to 10 is using the calling shortcut. It's actually 1.to(10), which is a built-in int function returning a range. Likewise the second is actually 1.to(10).by(2), which could also be 1.to(10,2). Fun fact, ranges aren't arrays, but act like them: var rng1 : Range = 8.to(12); println(rng1.length + " "+ rng1(1)) // 5 9

Underscore can be used as a dummy loop variable: for(_ <- 1 to 5) n*=2.

Scala has a very silly nested loop shortcut. You write both loops in a single for, separated by a semi-colon. The second one is the inner loop:

for(x <- 1 to 10; y <- 1 to 8) //  Runs 80 times, x=1, y is 1 to 8 ...
for(a <- Animals; n <- 1 to 3) // handles each animal with 1,2,3

There's also a shortcut to put an if inside the loop, which acts as a continue command. This examines every animal except dogs:

for(x <- Animals if(x!="dog")) { // every item in Animals, except dogs
  val cuteness = x match {
    case "cat" => 10
     ...
  }
}

We can combine these. The inner if's are allowed to use the outer variables. This examines some odd diagonal slice of a grid:

for( x <- 0 until 10 if(x!=3); // skipping 3's just to show we can
       y <- 0 until 10 if(x+y>3 && x<y && y<12))
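
A runnable sketch of the loop shortcuts, assuming a smaller 4-by-4 grid for the last one so the results are easy to check by hand. Each function collects what its loop visited into an ArrayBuffer; the function names are invented:

```scala
import scala.collection.mutable.ArrayBuffer

object ForDemo {
  // 1 to 10 by 2 visits 1, 3, 5, 7, 9
  def oddsTo10(): List[Int] = {
    val seen = ArrayBuffer[Int]()
    for (i <- 1 to 10 by 2) seen += i
    seen.toList
  }

  // two ranges in one for: the second is the inner loop
  def nestedRuns(): Int = {
    var runs = 0
    for (x <- 1 to 10; y <- 1 to 8) runs += 1
    runs
  }

  // an if after the range skips items, like a continue
  def withoutDogs(animals: List[String]): List[String] = {
    val kept = ArrayBuffer[String]()
    for (a <- animals if a != "dog") kept += a
    kept.toList
  }

  // the inner if may use the outer loop variable
  def slice(): List[(Int, Int)] = {
    val hits = ArrayBuffer[(Int, Int)]()
    for (x <- 0 until 4 if x != 3; y <- 0 until 4 if x + y > 3 && x < y) hits += ((x, y))
    hits.toList
  }

  def main(args: Array[String]): Unit = {
    println(oddsTo10())                              // List(1, 3, 5, 7, 9)
    println(nestedRuns())                            // 80
    println(withoutDogs(List("cat", "dog", "cow")))  // List(cat, cow)
    println(slice())                                 // List((1,3), (2,3))
  }
}
```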

Blocks

Scala allows, and encourages, value-returning blocks. The last processed item is the value. Of course, anything declared inside of these blocks is local:

val someMath = { val n=z1*z2; n*2 - z3n }
// uses n as a local aux var, then returns the final equation

w = { val petSum=cats+dogs; val w2=if(petSum>2) "lots" else "some"; s"I have $w2" }
// petSum and w2 are local. Returns "I have lots/some"

Collection types

Scala encourages immutable types. The single-linked immutable Lisp-style List works pretty well for this. Scala prefers not to use doubly-linked lists. Most types have a creation shortcut: TypeName(v1,v2,v3). All use A.length (with optional ()'s). Indexing uses normal parens, instead of square brackets.

Arrays are technically not resizable. The operations which change the size are a bit slow:

var A1 : Array[Int] = Array(2,4,6,8)
A1 = A1 :+ 5 // add to end
A3 = A1 ++ A2 // ++ concatenates two arrays

As usual, you're supposed to create an ArrayBuffer (or ListBuffer and StringBuilder) and convert the final result into an Array.

List is a preferred type. An immutable, singly linked list. You make new lists by playing with the front. Here L1, L2 and L3 safely share most of their elements:

L1 = List(2,4,6,8,10)
L2 = L1.tail // all but 1st
L3 = 0 :: L1 // adding 0 to front (best place to add)
     L2
       \
L1 -> 2 4 6 8 10
     /
L3->0

Notice the new syntax. :: is an item add. ++ or triple ::: concat two Lists.
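
A sketch checking the sharing claim. It uses eq (the pointer-compare) to confirm the three lists really reuse the same nodes instead of copying them:

```scala
object ListSharingDemo {
  val l1 = List(2, 4, 6, 8, 10)
  val l2 = l1.tail                   // List(4, 6, 8, 10)
  val l3 = 0 :: l1                   // List(0, 2, 4, 6, 8, 10)
  val joined = l1 ::: List(12, 14)   // same result as l1 ++ List(12, 14)

  val sharesTail = l3.tail eq l1     // true: l3 is one new node in front of l1
  val sharesRest = l2 eq l1.tail     // true: l2 is just l1 minus its first node

  def main(args: Array[String]): Unit = {
    println(l2)
    println(l3)
    println(joined)
    println(s"$sharesTail $sharesRest")
  }
}
```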

Other fun fact: Scala's Queue class isn't implemented as a doubly-linked list, as you'd think. It uses 2 singly-linked lists. It adds to the front of the "end" list, and flips it when the "front" list runs out of elements.

Scala has a pile of list-traversing operations:

map transforms each element, possibly into a new type:
L2 = L1.map(_*2+"cat") // convert (1,2,3) into (2cat, 4cat, 6cat)
The underscore is an anonymous function trick "the single input goes here".
flatMap transforms each item into several, possibly of a different type. "flat" means that while you're technically replacing each item with a list, it's understood we want the new items spliced in one after another, not a list of lists. This turns each item into 1.5 and 2.5 times itself:
L2 = L1.flatMap(n => List(n*1.5, n*2.5) ) // convert (1,2,3) into (1.5, 2.5, 3, 5, 4.5, 7.5)
reduce and fold combine every element into a single value. Reduce is the same type, while fold may change it. Because of this, fold needs a starting value (of the correct type). Adding right or left forces it to combine going that way, which only matters for things like a-b. Adding neither allows it to work in parallel:
```
L1.reduce(_+_) // adds everything together, in arbitrary order
L1.foldLeft("")(_+", "+_) // convert int list to a single string with commas (including a leading ", ")
```
_+_ is the anon function trick again: reduce and fold expect a 2-input function, telling them how to combine items. The underscores represent the 1st and 2nd input (clearly, this trick only works if you use each once, in order).
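
The traversals above in one runnable sketch, on the list (1,2,3). The subtraction lines are there to show where left versus right matters. mkString isn't covered above; it's the library's ready-made way to get the comma-joined string:

```scala
object TraverseDemo {
  val l1 = List(1, 2, 3)

  val doubled = l1.map(_ * 2)                            // List(2, 4, 6)
  val labeled = l1.map(n => s"${n * 2}cat")              // List("2cat", "4cat", "6cat")
  val spread = l1.flatMap(n => List(n * 1.5, n * 2.5))   // List(1.5, 2.5, 3.0, 5.0, 4.5, 7.5)
  val sum = l1.reduce(_ + _)                             // 6
  val diffLeft = l1.reduceLeft(_ - _)                    // (1-2)-3 = -4
  val diffRight = l1.reduceRight(_ - _)                  // 1-(2-3) = 2
  // fold changes the type: Ints in, one String out. "" is the starting value
  val csv = l1.foldLeft("")((acc, n) => if (acc.isEmpty) n.toString else acc + ", " + n)
  val csv2 = l1.mkString(", ")                           // "1, 2, 3"

  def main(args: Array[String]): Unit = {
    println(spread)
    println(s"$sum $diffLeft $diffRight")
    println(csv)
  }
}
```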

Functions

A basic declare looks like this. Note the return type, then the = just before the body:

def addTwo(x : Int, y : Int) : Int = {
  return x+y
}

The void type is replaced by Unit. Unit counts as a real type, and even as a value. Parameters are always constant (in other words, they're implicitly declared as "val"). Only the primary-constructor parameters of a class can be marked "var".

There's a pile of shortcuts:

Omit the {} if the body is one statement:

  def add(x:Int, y:Int) : Int = x+y // omitting the return and {}'s
  def wordInParens(w:String) : String = "("+w+")"

No need for a return at end. Not having one is preferred. Like blocks, the final seen value is auto-returned:

def addOne(x:Int):Int = { x+1 } // shortcut for "return x+1"
def max(x:Int, y:Int) : Int = { if(x>y) x else y }
def max(x:Int, y:Int) : Int = if(x>y) x else y // using both shortcuts

Zero parm functions can omit the ()'s in the function declare, and then the call must omit them too:

def sayA = println("a") // paren-less version of: def sayA(): Unit = { println("a") }

sayA // the function call, no ()'s
sayA() // error - declared without ()'s. The sayA() version could be called either way

Can infer the return type. This only works if you don't use the "return" keyword:

def addOne(x:Int) = x+1 // shortcut for: def addOne(x:Int):Int = { x+1 }
def addOne(x:Int) = return x+1 // error - either add the return type, or lose the "return"

Can call using parens or curly braces: w.substring{3} is fine.

You're allowed to declare extra parms in their own set of parens, which the caller must echo. This isn't quite a curried function, since these are almost always called with all at once:

// Needlessly complicated function to concat a string n times
// NOTE: w*n does the same thing. This function isn't needed:
def repeatTimes(w:String)(times:Int) = {
  var w2 : StringBuilder = new StringBuilder // this is way too fancy
  for(_ <- 1 to times) w2 ++= w // += is for Char, ++= adds a list
  w2.toString // return value
}

repeatTimes("cow")(4) // cowcowcowcow

This is mostly a style thing. You have to use all parms -- it's not currying. But, oddly, it can be made to be:

var cowYeller = repeatTimes("cow") // surprisingly, error (Scala 3 accepts it)

// both of these correctly compute Int=>String functions:
var bearYeller = repeatTimes("bear")(_) // Use (_) for missing parm...
var catYeller : Int=>String = repeatTimes("cat") // ...or declare the type
bearYeller(3) + catYeller(2) // bearbearbearcatcat
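
A compact runnable version of the same idea. To keep it short, repeatTimes here is rewritten with the w*n shortcut instead of the StringBuilder loop:

```scala
object ParamListDemo {
  // two parm lists: the caller writes repeatTimes("cow")(4)
  def repeatTimes(w: String)(times: Int): String = w * times

  val bearYeller = repeatTimes("bear")(_)             // (_) stands in for the missing parm
  val catYeller: Int => String = repeatTimes("cat")   // or declare the function type

  def main(args: Array[String]): Unit = {
    println(repeatTimes("cow")(4))             // cowcowcowcow
    println(bearYeller(3) + catYeller(2))      // bearbearbearcatcat
  }
}
```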

Partial functions

Instead of having pre-conditions, or returning -1 for bad inputs, Scala has a formal way where functions tell you where they can't run. They call it a partial function. You can write an extra bool function which lets you know whether it's runnable on that input. Below, ff won't run for cat, and gives double the length for any other string:

val ff = new PartialFunction[String,Int] {
  def isDefinedAt(w:String) = w!="cat" // works for anything but cat
  def apply(w:String) = w.length*2  // the real function
}

if(ff.isDefinedAt("cow")) n+=ff("cow");

The advantage of having a formal isDefinedAt is we can use it to chain them. orElse knows to call isDefinedAt:

(ff2 orElse ff3)(7) // runs ff2 on 7, unless ff2.isDefinedAt(7) is false

A shortcut method for writing partial functions has the system guess from your cases (scala people love cases):

val ff2 : PartialFunction[Int,Int] = { // why is this declared differently? Beats me
  case n if(n>5) => n*2
}

isDefinedAt is built from if(n>5) (really, from combining every case). apply is built from the case results. Here, ff2.isDefinedAt(3) is false, and ff2(7) is 14.
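
A sketch of chaining with orElse. The two partial functions, big and small, are invented: big handles numbers over 5, small handles anything non-negative, and nothing handles negatives:

```scala
object PartialDemo {
  val big: PartialFunction[Int, Int] = { case n if n > 5 => n * 2 }
  val small: PartialFunction[Int, Int] = { case n if n >= 0 => n + 1 }

  // tries big first; falls through to small when big.isDefinedAt is false
  val either: PartialFunction[Int, Int] = big orElse small

  def main(args: Array[String]): Unit = {
    println(big.isDefinedAt(3))      // false
    println(either(7))               // 14, from big
    println(either(3))               // 4, from small
    println(either.isDefinedAt(-1))  // false: neither one takes it
  }
}
```

Calling either(-1) anyway throws a MatchError, which is why you check isDefinedAt first.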

First class functions

var f : Int => Int defines f as a function variable. We can also use C# style Function1[Int,String]. The number is how many parms, with the last being the return type.

Anonymous functions use Python-style syntax:

f = (x:Int) => if(x>3) 10 else 5 // using value-returning if and no-return shortcut

If we can infer the input types, underscore stands for each, in order: (_ * _) is short for (x:Int, y:Int) => x*y.
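
Function variables in a runnable sketch. applyTwice is a made-up helper, there only to show a function being passed as an input:

```scala
object FunctionValueDemo {
  val f: Int => Int = (x: Int) => if (x > 3) 10 else 5
  val g: Function1[Int, String] = n => "cat" * n   // the long way to write Int => String
  val times: (Int, Int) => Int = _ * _             // short for (x, y) => x * y

  // takes a function as its first input
  def applyTwice(fn: Int => Int, x: Int): Int = fn(fn(x))

  def main(args: Array[String]): Unit = {
    println(f(1))              // 5
    println(g(2))              // catcat
    println(times(3, 4))       // 12
    println(applyTwice(f, 1))  // 10: f(1) is 5, then f(5) is 10
  }
}
```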

Duck-type functions (Structural types)

We're allowed to replace the type of an input class with the operations it supports - formal support for Duck typing. You can also specify member variables (since they are really paren-less functions):

This defines an "I have an Int x member var" interface, then uses it in duck-type function, duck1:

type HasXint = { var x : Int } // defines a Duck-type "anything with: var x : Int"

// this takes any class with a dot-x:
def duck1(cc : HasXint) = cc.x+5 // using a Duck type to specify input type

 duck1(point1) // point1.x plus 5
 duck1(dog1) // compile error, unless dog1 also has a var x : Int

This one specifies any class with a yell function, taking an Int and returning nothing:

// canYell is any class with yell(int)
type canYell = { def yell(times:Int):Unit } // defines a Duck type

def yell7(yy : canYell) = yy.yell(7) // runs their yell function with a 7

You're also allowed to write an anonymous duck-type inside the template. Here, local type HasX is specified as any superset of an anonymous "Int x"-having class:

def duck1[HasX <: { var x : Int }] (cc : HasX) = { cc.x += 5 }
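
A runnable sketch of a duck-type function, assuming Scala 2.13 (where the reflectiveCalls import quiets a feature warning). Dog and Car are invented classes with nothing in common except a yell function. Unlike the canYell above, yell returns a String here so there's something to print:

```scala
import scala.language.reflectiveCalls

object DuckDemo {
  // any class with a yell(Int) returning String
  type CanYell = { def yell(times: Int): String }

  // two unrelated, hypothetical classes. Neither mentions CanYell
  class Dog { def yell(times: Int): String = "woof" * times }
  class Car { def yell(times: Int): String = "honk" * times }

  def yell2(yy: CanYell): String = yy.yell(2)

  def main(args: Array[String]): Unit = {
    println(yell2(new Dog))   // woofwoof
    println(yell2(new Car))   // honkhonk
    // yell2("cat")           // would not compile: String has no yell(Int)
  }
}
```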

Imports, namespaces

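As an example, import scala.collection._ grabs everything from there (Scala 3 spells the wildcard as *). A plain import scala.collection only lets you write collection.Whatever.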

Classes

Class definitions are mostly Java-like. Except the default accessibility is public. It also uses the trick where assignments to member vars are moved into all constructors. Member vars must have a starting value:

class A extends B { // "extends" to inherit
  var x : Int = 0 // assign-in-constructor shortcut
  var y = 7
}

There's a shortcut for creating a constructor and member vars - known as the Primary Constructor. You're encouraged to always use it. Add assignments to variables in parens after the class name. "Naked" code lines in your class are part of this constructor. Here, the heading gives Point fields x and y, and gives Point(2,4) as a constructor. Then down below it gets a z-field and Point(6) to set z (NOTE: this is a terrible class; it's for demo purposes only):

class Point(var x:Int=0, var y:Int=0) { // <-declares x and y, AND is a constructor
  println("constr") // <- also part of primary constructor
  
  var z : Int = 0 // point has x and y (above) and also z
  
  def this(zz:Int) = {  // <- a normal constructor
    this(6,10); // <- required to call primary one first, if it exists
    z=zz
   }
}

All variables must be initialized, either in the primary constructor, or the field declaration. Underscore is a shortcut for the default value:

  class Point { // <- no primary constructor, which is fine
    var x:Int=0
    var y:Int=_ // same as 0
    var z = 0 // inferred type Int
  }

Style-wise, underscore means "will initialize in a line below".

Fun fact: a var declared with no initial value is actually declaring an abstract interface, making you an abstract class. Sub-classes must either declare it, or create it using a get/set pair.

get/sets use the fake variable for the get, and the name followed by an underscore for the set:

class OneNum {
  private var xx:Int=0 // backing var
  
  // getter/setter pair for x:
  
  def x = xx // the getter, using several shortcuts
  
  def x_=(newx : Int) : Unit = { // the setter
      var xTemp=newx; // NOTE: copying since inputs to setters are constant
      if(xTemp>10) xTemp=10;
      xx=xTemp
   }
}
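
Here's a sketch of how the caller sees it: x looks like a plain member var, but assignments are routed through the setter. The class is the same idea as OneNum above, with the clamp written as a one-line if:

```scala
object SetterDemo {
  class OneNum {
    private var xx: Int = 0                                            // backing var
    def x: Int = xx                                                    // the getter
    def x_=(newx: Int): Unit = { xx = if (newx > 10) 10 else newx }    // the setter, caps at 10
  }

  def main(args: Array[String]): Unit = {
    val n = new OneNum
    n.x = 25        // really calls n.x_=(25)
    println(n.x)    // 10
    n.x = 4
    println(n.x)    // 4
  }
}
```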

Scala actually makes every field into a getter/setter if you don't make one. That way class interfaces are clean, with only functions. It's how, I think, interfaces can specify "must-have" member variables.

Interfaces

Scala's name for a virtual base class or a Java-style interface is a trait. As with Java, you can inherit from 1 real class and as many "trait"'s as you want.

A new, crazy thing in scala is having virtual variables. Declaring a variable with no starting value magically makes it virtual. A concrete subclass must either declare it, or declare a getter/setter:

trait Mood {
  var howMuch : Int // abstract field, must be created in subclass
  def isGood() : Boolean // abstract virtual function
}

class DogMood(var name : String = "Bowser") extends Mood {
  var howMuch : Int = 0 // a normal declaration, fulfilling our obligation
  def isGood() : Boolean = howMuch>10
}

Scala also allows "traits" to write bodies for the functions - you're allowed to use the virtual variables:

trait Mood {
  var howMuch : Int // still virtual, but will be an int
  def isGood() : Boolean = howMuch>=10 // implemented
}

class DogMood extends Mood { var howMuch : Int = 0 } // all we need for a concrete subclass
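
A runnable sketch of one class picking up two traits. Mood is the trait from above; Named is a second, made-up trait, and "with" is the keyword for each trait after the first:

```scala
object TraitDemo {
  trait Mood {
    var howMuch: Int                           // abstract: the subclass must supply it
    def isGood(): Boolean = howMuch >= 10      // implemented, using the abstract var
  }

  // hypothetical second trait
  trait Named {
    def name: String                           // abstract
    def greet: String = s"I am $name"
  }

  class DogMood(val name: String = "Bowser") extends Mood with Named {
    var howMuch: Int = 0                       // fulfills Mood's obligation
  }

  def main(args: Array[String]): Unit = {
    val d = new DogMood()
    println(d.isGood())   // false
    d.howMuch = 12
    println(d.isGood())   // true
    println(d.greet)      // I am Bowser
  }
}
```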

Misc weirdness

Type projection is nothing special. But it points to other weirdness. Cat_t#Claw_t is the boring way to find the Claw_t nested class of Cat_t. Nothing special, and in many languages the pound-sign would be silly -- just use a dot.

But in scala, like Java, instances of objects of a nested class are associated with the parent that made them. If we have two cats, then c1.Claw_t and c2.Claw_t are different classes. They have different concrete parents. If a member function of c1 takes a Claw_t as input, it expects a claw which it made - implicitly a c1.Claw_t. In scala (but not Java), giving it a c2.Claw_t is a type mismatch. That system is called Path Dependent Types.

Type projection, Cat_t#Claw_t, is for the rarer cases where you don't want to specify any particular parent instance. It matches any CatInstance.Claw_t. That's an appropriately awkward syntax, for something you wouldn't want to do that often.
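
A sketch of both ideas, using shorter invented names Cat and Claw in place of Cat_t and Claw_t. The commented-out line is the type mismatch:

```scala
object PathDemo {
  class Cat {
    class Claw
    def newClaw(): Claw = new Claw
    def sharpen(c: Claw): String = "sharpened"     // only takes claws made by this cat
  }

  def inspect(c: Cat#Claw): String = "inspected"   // type projection: any cat's claw

  def main(args: Array[String]): Unit = {
    val c1 = new Cat
    val c2 = new Cat
    println(c1.sharpen(c1.newClaw()))   // fine: it's a c1.Claw
    // c1.sharpen(c2.newClaw())         // won't compile: a c2.Claw isn't a c1.Claw
    println(inspect(c2.newClaw()))      // fine: Cat#Claw matches either
  }
}
```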

Covariance / Contravariance: These are rules about when a template class can accept subtypes of its template argument, and in which direction. For example, a function taking a List[Duck] can probably also take a List[Mallard], but not a List[Animal]. Many languages have automatic rules allowing this -- but in Scala you have to explicitly define the behavior. [+T] means normal subclasses are allowed, just [T] means it has to be that exact class, and [-T] means only superclasses are allowed. For example, scala lists are defined using List[+T], which means they work the way we expect when we write list-using functions.

This can't make things legal that weren't. The compiler checks each + or - against how T is actually used inside the class, and refuses a marking that would be unsafe. If scala had List[-T] then function doStuff(D : List[Duck]) would accept a List[WaterFowl] as input. But then would fail when you tried to use Duck-specific member functions.

Within a template type with multiple types, often a nested one, [S <: T] says the second type, which will be named S, can be T or any subtype. If a function takes List[T] and List[S <: T], the second list can be the same type as the first, or any subtype, like f(DuckList, MallardList). [S >: T] is the opposite -- T or any supertype.

This all makes sense, and has reasons you'd use any of those three. But you can also either ignore it, or use the [+T] option. The standard classes have everything set the correct way.
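
A sketch of all three markings. Every class here is invented for the example: Box is a read-only holder (so +T is safe), Feeder only takes T as input (so -T is safe), and firstName uses an upper bound:

```scala
object VarianceDemo {
  class Animal { def name: String = "animal" }
  class Duck extends Animal { override def name: String = "duck" }
  class Mallard extends Duck { override def name: String = "mallard" }

  class Box[+T](val item: T)                    // +T: a Box[Mallard] counts as a Box[Duck]
  trait Feeder[-T] { def feed(x: T): String }   // -T: a Feeder[Animal] counts as a Feeder[Duck]

  def describe(b: Box[Duck]): String = b.item.name
  def feedDuck(f: Feeder[Duck], d: Duck): String = f.feed(d)
  def firstName[T <: Animal](items: List[T]): String = items.head.name   // T is Animal or any subtype

  val animalFeeder: Feeder[Animal] = new Feeder[Animal] {
    def feed(x: Animal): String = "fed " + x.name
  }

  def main(args: Array[String]): Unit = {
    val mallardBox: Box[Mallard] = new Box(new Mallard)
    println(describe(mallardBox))                   // mallard
    println(feedDuck(animalFeeder, new Duck))       // fed duck
    println(firstName(List(new Duck, new Mallard))) // duck
  }
}
```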

Mixins: these are nothing. It's scala's word (and in a few other languages as well) for when we inherit from a class and also from some interfaces (which, recall, scala calls traits). I think the inherited interface is the mix-in. Scala also allows interfaces to inherit from abstract classes, which is fun, but also nothing special.

Higher-order functions: also nothing. It's scala's term for passing or returning a function, which every other language can do without needing a special name. But it's cool how scala uses the real math term for it.