Software, Technology, Investing: October 2013

Sunday, October 20, 2013

Building the scala-redis client for scala 2.10.3

scala-redis is a client for the popular key-value database Redis. The client is available on GitHub here
and comes with a jar file, 'redisclient-1.0.1.jar', built for Scala version 2.8.1.

In a few steps I'll show how to build this client for Scala version 2.10.3 using sbt. If you haven't done so already, clone the repository from GitHub.

1. Change two lines in the build.properties:
build.scala.versions=2.8.1 -> build.scala.versions=2.10.3
sbt.version=0.7.4 -> sbt.version=0.13.0

The build now targets Scala 2.10.3, and the sbt version should match what you have installed. If sbt is not installed you can get it here

2. Open a command line and change directory to the root of the scala-redis project

3. Type 'sbt'+Enter and wait for dependencies to be resolved
C:\path> sbt

4. Optional: At the SBT prompt type 'compile'+Enter to compile. Wait for success
> compile

There should be a folder called scala-2.10 in the target folder with compiled java classes.

5. To package the java classes to a jar-file, type 'package'+Enter at the prompt
> package

There should be a newly created jar file in scala-2.10. My file is named scala-redis_2.10-0.1-SNAPSHOT.jar

Now this jar can be consumed by other projects!

Saturday, October 12, 2013

Skalahhhh (The Scalable Language)

As my experiments in F# are going quite well and I'm really starting to enjoy the terse syntax, lack of curly braces, type inference, and experimental programming in the F# interactive console, I decided to take up another challenge by getting some basic familiarity with Scala. For those who don't know, Scala is a pure object-oriented language that runs on the Java Virtual Machine. It's completely separate from F# and .NET, so the only association between the two will be in this first paragraph of this post.

Two of the first things I read about Scala are that "Every value is an object" and "Every function is a value". With this knowledge one can easily conclude that all functions are objects. I also read that Scala is a so-called pure object-oriented language where everything is an object. While this is easy to remember, it's harder to understand the implications of such a simple statement. As an application developer I don't think it's crucial to understand the true meaning of it, just that it is an important fact in the design of the language.

At the time of writing the current version of the language is 2.10. The first version appeared in 2003 and its creator is Martin Odersky. He is still engaged in Scala and teaches an online course in Scala on Coursera (link later in this post). Scala is under active development and new releases are made available from time to time. The Scala Improvement Process (SIP) provides insight into pending and completed changes to the Scala language.

One of the challenges in learning something new is finding good material to learn from. When learning to program the resources are often plentiful, and this only makes it harder. Just by doing a few searches on Scala I was able to find tutorials, books, videos and who knows what else. For me it's a fine balance between collecting too much and too little reference material, and it's important not to end up at either extreme. If I have too much material, the task of digesting it and extracting what is useful is overwhelming and off-putting. If I have too few references, there's a risk that I will soon be searching for more information again and be overloaded that time instead. To get started I've filtered the resources available online and ended up with this list.

Official Scala Language Specification (PDF) - Useful to look in from time to time. I won't read it cover to cover.
A cheat sheet with small nuggets of Scala code
The well-known site Stack Overflow has a compiled list of questions tagged with Scala
A selection of video recordings of Scala topics is on Nescala.org. Taught by Martin Odersky
Scala School tutorial with 14 sections
Coursera online course

When learning Scala, a compiler and runtime are needed. These and any text editor are enough to get started with typing your own code. Using an IDE may be an option if the text editor doesn't suit your style. From what I know there are three well-known IDEs used for Scala (and Java) development. They are all free to download.

Eclipse
IntelliJ IDEA
NetBeans

A third option, which I will take initially, is to use a web-based editor and compiler. By writing, compiling and running Scala code from a web page I don't need to worry about installing and configuring Scala on my computer. I've found three sites that allow me to experiment with Scala code without installing anything on my computer (except a browser of course)

Up until now I've used simplyscala because it lets me type code in a console, without defining classes, in an experimental manner. This works well for trying out a few lines of code and seeing the immediate result, but I will most likely try the other two later. At some point I plan to start using an IDE, probably when I start doing more than just simple experiments and want to save the code I've written. I would like to get comfortable writing Scala and have a few ideas for projects that I could do. However, whether I learn to master it or not, and how fast, is not crucial. I'm happy just enjoying learning something new while working with languages that I know much better for my serious work. So far I've spent some time reading and experimenting with the Scala School tutorial. The tutorial contains a lot of examples and is good to work through.

Below I've tried to explain some of the central concepts in Scala that a beginner should have no problem understanding. That will be the end of my first blog post on Scala.

Higher-order functions are the opposite of first-order functions and come in three forms:

A function with one or more functions as parameters, and that returns a value
A function that takes one or more values as parameters, and that returns a function
Both of the above, i.e. a function that returns another function and has one or more functions as input parameters

In calculus, good examples are the limit function, which given a function returns a value, and the derivative function, which given a function returns a different function.
An example in Scala looks like this:

def apply(f: Int => String, v: Int) = f(v)
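To make the three forms a bit more concrete, here is a small sketch with one function per form. The names applyTo, adder and twice are made up for this illustration and don't come from any library.

```scala
// Form 1: takes a function as a parameter and returns a value.
def applyTo(f: Int => String, v: Int): String = f(v)

// Form 2: takes a value as a parameter and returns a function.
def adder(n: Int): Int => Int = (x: Int) => x + n

// Form 3: takes a function as a parameter and returns a function.
def twice(f: Int => Int): Int => Int = (x: Int) => f(f(x))

val label: String = applyTo((i: Int) => "value " + i, 7)
val addThree: Int => Int = adder(3)
val addSix: Int => Int = twice(addThree)
```

Here label is built by the function passed in, addThree is a function produced from a plain value, and addSix is a function produced from another function.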

Currying is a process that transforms a function that has more than one parameter into a series of nested functions, each of which has a single parameter. In other words, when a curried function is called with fewer arguments than its signature prescribes, a new function is returned that expects the missing arguments as parameters.

def route (m:Message) = {

(e: Endpoint) => e.send(m)

}
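Since Message and Endpoint are not defined in the snippet above, here is a self-contained sketch of the same idea using plain integers. The names add, addCurried, addTen and curriedAdd are only assumptions for this example.

```scala
// Uncurried: both parameters in a single parameter list.
def add(a: Int, b: Int): Int = a + b

// Curried by hand, using one parameter list per argument.
def addCurried(a: Int)(b: Int): Int = a + b

// Supplying only the first argument gives a function waiting for the second.
val addTen: Int => Int = addCurried(10)

// Function values also have a built-in curried method (Function2.curried).
val addFn: (Int, Int) => Int = add
val curriedAdd: Int => Int => Int = addFn.curried
```

Calling addTen(5) and curriedAdd(10)(5) then both do the same work as add(10, 5), just one argument at a time.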

Case classes are class definitions whose immutable members are defined by their constructor arguments. A class defined as a case class gets value-based equality comparison for free and can be used in pattern matching. (The closest counterpart in F# is the discriminated union.)

case class Demo( title : String, author : String )
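A small sketch of what the compiler gives you with a case class. The class Post and the values used here are invented for the example.

```scala
case class Post(title: String, author: String)

val postA = Post("First Script in Scala", "me")
val postB = Post("First Script in Scala", "me")

// Equality compares the constructor arguments, not object identity.
val equalByValue: Boolean = postA == postB

// copy builds a modified instance and leaves the original untouched.
val renamed: Post = postA.copy(title = "F# Collections")

// Pattern matching takes the constructor arguments apart again.
def describe(p: Post): String = p match {
  case Post(title, "me")  => "my post: " + title
  case Post(title, other) => title + " by " + other
}
```

Note that no equals, copy or extractor code had to be written by hand; they all follow from the single case class line.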

Sequence comprehension, in Scala also called for-comprehension (generally called list comprehension in other languages), is the process of creating a sequence based on an existing sequence. If it sounds like a for-loop, that is because it is very similar. Sequence comprehension has its roots in mathematics and is generally composed of an output variable (in the example, i), an input domain or generator (in the example, List.range), a guard (in the example, the if-clause), and an output function (in this case also the variable i).

for (i <- List.range(from, to) if i % 2 == 0) yield i

An in-depth explanation of comprehensions can be found here.
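As a rough sketch of what goes on behind the comprehension: the compiler rewrites it into calls to withFilter and map. The bounds lo and hi are assumed values that stand in for from and to in the example above.

```scala
val lo = 0
val hi = 10

// The comprehension: generator, guard and output function.
val evens: List[Int] = for (i <- List.range(lo, hi) if i % 2 == 0) yield i

// Roughly the method calls the comprehension is translated into.
val evensDesugared: List[Int] =
  List.range(lo, hi).withFilter(i => i % 2 == 0).map(i => i)
```

Both values hold the even numbers from lo up to, but not including, hi.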

Closures are functions whose return value depends on the value of one or more variables declared outside the function. Apart from that, a closure is an ordinary function. In the example the function on row three is the closure. If you look at that row in isolation you'll notice that the value first is not declared as a parameter of the function; it is a so-called free variable. What actually happens is that the compiler generates a function object that binds (captures) the free variable, which is the defining characteristic of a closure.

val largerThanFirst = ( xs : List[Int] ) => {

val first = xs(0)

val isLarger = ( y : Int ) => y > first

for ( x <- xs; if isLarger(x) ) yield x

}
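Two more small sketches of closures, with made-up names (largerThan, threshold, closureSeesUpdates). The first captures a parameter of the enclosing function. The second shows that a closure captures the variable itself, so it sees later changes to it.

```scala
// threshold is a free variable inside the returned function.
def largerThan(threshold: Int): Int => Boolean = (y: Int) => y > threshold

val largerThanFive: Int => Boolean = largerThan(5)
val kept: List[Int] = List(3, 8, 5, 13).filter(largerThanFive)

// The closure reads factor each time it is called, not when it is created.
def closureSeesUpdates(): (Int, Int) = {
  var factor = 2
  val scale = (x: Int) => x * factor
  val before = scale(10)
  factor = 3
  (before, scale(10))
}
```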

Sunday, October 6, 2013

My guide to securing digital currencies

This is my work-in-progress guide and self-adopted steps to backing up and keeping my digital currencies secure. The guide applies to bitcoins, litecoins, and ripples (or XRP). It’s widely accepted that Windows, because of its popularity, is a more frequent target of attacks and malicious software. Therefore I use Linux as much as possible when handling digital currencies. I use numerous wallet providers, both web-based and installed applications. A key factor in keeping your assets secure is that you are the only one with access to your wallet files, seed phrases and passwords. Therefore I prefer to use an installed wallet application, at least for accounts with larger balances.

The services I use are Electrum, Litecoin-Qt and the Ripple online client, and I will therefore focus on preventing unwanted access, backing up files and making sure that I can restore and gain access to my assets held by these services should I need to.

Electrum

The electrum wallet can be recovered from a secret seed so it’s imperative to keep backups of this seed and prevent anyone from seeing it. I keep a copy of the seed in three locations:

Paper copy kept at a safe location
In my Wuala account. Before uploading I self-encrypted the file
File saved on a USB-stick, also self-encrypted

Litecoin

My Litecoin-Qt wallet is encrypted and I keep the passphrase on paper at a safe location. A self-encrypted file with the passphrase is saved in my Wuala account and on a USB-stick. The wallet.dat is backed up whenever I see it as necessary. By default the wallet file contains 100 pre-generated unused addresses, so after a period with many new transactions I make a new backup and replace the wallet.dat file in my Wuala account and on the USB-stick. This file is also self-encrypted.

Ripple

For my Ripple wallet I back up two files. One file with my secret key is kept in an encrypted file on Wuala and on a USB-stick. This backup only needs to be performed once. My wallet file is also self-encrypted and stored in my Wuala account. The wallet file contains stored contacts so I replace it when I see it as necessary.

The Final Touch

There are some passwords that I keep in my head and also written down for easy access:

Electrum password
Litecoin-qt passphrase
Ripple client wallet name and passphrase
Wuala account name and password
7-zip password for my self-encrypted files

Take note that for assets of larger value, savings for example, this is not a recommended approach, as the wallets are in contact with the internet and therefore may be compromised or stolen if someone unwanted gains access to the seed or password. I make transactions with these wallets, even though not very often, and therefore I require them to be online. Cold storage, i.e. creating a wallet which is never in contact with the internet, is another topic and requires a different set of actions.

Saturday, October 5, 2013

First Script in Scala

This is a quick summary of how to build and run a very simple application in Scala. The application is a typical "Hello World" example so there is no Scala to learn here; instead I hope it will make it frictionless to download, install and configure the essentials for creating the very first Scala application. In this example the OS is Fedora 19 and I will use IntelliJ IDEA as the IDE (Integrated Development Environment).

Begin by downloading the Java JDK; you'll need it to start the IDE and to build the application. Get the latest RPM package here (x64 or x86 depending on the OS)

http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html

Install from a terminal (replace path to the download and the JDK version)

rpm -Uvh /path/jdk-7u40-linux-x64.rpm

Download IntelliJ IDEA for Linux here:

http://www.jetbrains.com/idea/download/

Extract the file, i.e: tar -xzf ideaIC-xx.xx.xx.tar.gz

Before the IDE can be started, an environment variable pointing to the JDK needs to be set. Any one of IDEA_JDK, JDK_HOME or JAVA_HOME will work. This is one way of doing that:

export IDEA_JDK=/usr/java/jdk1.7.0_40 (make sure it's the correct path to the root of the JDK installation)

For the environment variable to persist it needs to be added to a config file. There are plenty of tutorials with instructions for this!

Start the IDE from the terminal window: ./idea.sh

In the start window navigate to Configure -> Plugins, then click Install JetBrains Plugin and search for Scala

Right click and Download and Install!

Go back to the Start page and create a New Project. Choose Scala Module. Enable 'Set Scala Home' and click 'Download Scala'. Choose a version to download. After the download, set the path to the download folder. Make sure Compiler library and Standard library are filled in. In case there is a warning triangle as in the picture below, ignore it for now. Click Finish!

Navigate to File -> Project Structure -> Libraries, and fix the errors if any (There is a small icon to click for fixing the error). Dependencies to scala-compiler and scala-library must be added as in the picture below.

Add a new Scala script to the project, name it FirstScript.scala

The script file must be added to the src folder for the project to run!

Type some program code in a main method, something like this:

object FirstScript {

def main(args: Array[String]) {

println("Hello, world!")

}

}

Build! (Alt+F9)

Run! (Shift+F10)

Good Luck!!

Friday, October 4, 2013

F# Collections - Part 1

F# has three different collection types to hold values of the same type

Array - A fixed-size, zero-based, mutable collection
List - An ordered, immutable linked list
Sequence - A logical series of elements

They have similarities but also many differences. Very few can probably remember all the functions that can be applied to a given type (Intellisense will help!) but a solid grasp of many of the functions is necessary. For my reference I wrote this sheet to list the functions; the first and second columns are the important ones. In the Available column ‘a’ is array, ‘l’ is list, and ‘s’ is sequence. Some descriptions are missing but I will add them later as I experiment (for full details go here).

Skip this list and go to my experiments below.

Function	Description	Available
append	Add elements and return new collection	als
average	Calculate average	als
averageBy	Calculate average with a function applied to each element	als
blit	Copy section	a
cache	Compute and store elements	s
cast	Convert to type	s
choose	Apply function and return Some	als
collect	Apply function, concatenate and return	als
compareWith	Compare with a function	s
concat	Combines	als
countBy	Apply key-generating function to each element and return	s
copy	Copy collection	as
create	Create array	a
delay		s
distinct	Return collection with no duplicates	s
distinctBy	Return collection with no duplicates according to equality function	s
empty	Create empty collection	als
exists	Test if any element satisfies condition	als
exists2	Test if pair of elements satisfies condition	as
fill	Set a range of elements to a given value	a
filter	Filter and return new collection	als
find	Return first found element	als
findIndex	Return index of first found element	als
fold	Apply function to each element, threading an accumulator argument	als
fold2	Apply function to each element in two collection, threading an accumulator argument	al
foldBack		al
foldBack2		al
forall	Test if all elements satisfy predicate	als
forall2	Test if all elements satisfy predicate pairwise	als
get/nth	Get element	als
head	Get first element	ls
init	Create collection given dimension and function	als
initInfinite	Generate sequence	s
isEmpty	Test if empty	als
iter	Apply function to each element	als
iteri		als
iteri2	Apply function to pair of elements	al
iter2		al
length	Return length	als
map	Build a collection by applying a function	als
map2	Build a collection by applying a function to two collections	l
map3		l
mapi	Build array	als
mapi2	Build collection	als
max	Return greatest element	als
maxBy	Return greatest element compared with function	als
min	Return smallest element	als
minBy	Return smallest element compared with function	als
ofArray, ofList, ofSeq	Create a new collection of a different type
pairwise		s
partition	Split collection into two collections	al
permute	Return array with all elements permuted	al
pick	Apply function to successive elements	als
readonly		s
reduce		als
reduceBack		l
replicate	Create list of specified length with elements set to given value	l
rev	Reverse collection	al
scan		als
scanBack		al
singleton		s
set	Set element to specified value	a
skip	Skip elements and return new collection	s
skipWhile		s
sort	Sort using compare	als
sortBy		als
sortInPlace		a
sortInPlaceBy		a
sortInPlaceWith		a
sortWith	Sort using comparison function	al
sub	Create array from subrange	a
sum	Return sum of elements	als
sumBy	Return sum of element with function applied	als
tail	Return list without first element	l
take	Return elements up to specified count	s
takeWhile		s
toArray, toList, toSeq	Create a new collection of a different type
truncate	Return a sequence with no more than N elements	s
tryFind		als
tryFindIndex		als
tryPick		als
unfold		s
unzip	Split list of pairs into two lists	al
unzip3	Split list of triples into three lists	als
windowed		s
zip	Combines the two collections into a list of pairs	als
zip3	Combines the three collections into a list of triples	als

There are several ways to create collections but the most basic is:

let a = [| 0 .. 10 .. 100 |]

let l = [ 0 .. 10 .. 100 ]

let s = { 0 .. 10 .. 100 }

The collections above have small footprints so let’s create larger collections and make a discovery. I add a stopwatch to determine the time for each binding.

let N = int(10e6)

open System.Diagnostics

let stopWatch = Stopwatch.StartNew()

let aXL = Array.init N (fun i -> i)

let aXLt = stopWatch.Elapsed.TotalMilliseconds

let lXL = List.init N (fun i -> i)

let lXLt= stopWatch.Elapsed.TotalMilliseconds - aXLt

let sXL = Seq.init N (fun i -> i)

let sXLt = stopWatch.Elapsed.TotalMilliseconds - aXLt - lXLt

stopWatch.Stop()

printf "%f %f %f" aXLt lXLt sXLt

The output I get is: 35.113200 1038.648800 72.062900

Initializing the list takes considerably longer, which can be explained by the fact that F# lists are linked lists: every element is allocated as a separate node.

Now let’s try to update the element in the middle (index 5). For the array you can do this:

a.SetValue(0, 5)

But for both the list and the sequence you will discover that there is no straightforward way of doing the same. Only array elements are mutable, as I wrote in the first paragraph.

To get a single element and bind to a new value:

let a5 = Array.get a 5

let l5 = List.nth l 5

let s5 = Seq.nth 5 s

For arrays and lists it can also be written:

let a5 = a.GetValue(5)

let l5 = l.Item(5)

The same collections can also be created as

let a = Array.init 11 (fun n -> n * 10)

let l = List.init 11 (fun n -> n * 10)

let s = Seq.init 11 (fun n -> n * 10)

By now you have probably seen that ‘s’ has a different signature

val s : seq<int>

This is because the sequence is not actually evaluated when created. The sequence is represented as System.Collections.Generic.IEnumerable<int> (seq<int> is an alias for it), which means that it is evaluated lazily. With lazy evaluation it is possible to create infinite collections, which otherwise would consume an infinite amount of memory

let sInfinite = Seq.initInfinite (fun n -> n * 10)

Seq.nth 10 sInfinite

Seq.nth 2147483647 sInfinite

The third row will take a while to evaluate, as it’s the largest possible value for a 32-bit int, and might not evaluate to the number you expect because n * 10 overflows.

The average of an integer collection can easily be calculated

Array.averageBy (fun i -> float i) a

List.averageBy (fun i -> float i) l

Seq.averageBy (fun i -> float i) s

Let’s try this on a larger collection and measure the performance

open System.Diagnostics

let stopWatch = Stopwatch.StartNew()

let aa = Array.averageBy (fun i -> float i) aXL

let aat = stopWatch.Elapsed.TotalMilliseconds

let la = List.averageBy (fun i -> float i) lXL

let lat= stopWatch.Elapsed.TotalMilliseconds - aat

let sa = Seq.averageBy (fun i -> float i) sXL

let sat = stopWatch.Elapsed.TotalMilliseconds - aat - lat

stopWatch.Stop()

printf "%f %f %f " aat lat sat

You’ll find that the sequence is the slowest.

Since these collections are similar but have differences, there will be situations where one collection needs to be converted to another type. There are two functions to use for conversion, e.g. for an array there are both toList and ofList. The former converts the array to a list and the latter converts a list to an array. This means that there are two ways of converting an array to a list, i.e. Array.ToList and List.OfArray. The obvious question is: what’s the difference? To find out I decompiled the Array module, which holds the functions.

public static FSharpList<T> ToList<T>(T[] array)

{

if ((object) array == null)

throw new ArgumentNullException("array");

else

return List.ofArray<T>(array);

}

public static T[] OfList<T>(FSharpList<T> list)

{

if ((object) list == null)

throw new ArgumentNullException("list");

else

return List.toArray<T>(list);

}

The function makes a call to a function in the List module.

public static FSharpList<T> OfArray<T>(T[] array)

{

return List.ofArray<T>(array);

}

public static T[] ToArray<T>(FSharpList<T> list)

{

return List.toArray<T>(list);

}

Examining this we find that calling ToList() in the Array module and calling OfArray() in the List module are in fact the same thing, except for the null check. Note the difference between the public static methods, which start with a capital letter, and the internal methods they call, which start with a lowercase letter. All conversion is handled by internal methods of List.

internal static T[] toArray<T>(FSharpList<T> l)

{

T[] res = new T[l.Length];

List.loop<T>(res, 0, l);

return res;

}

internal static FSharpList<T> ofArray<T>(T[] arr)

{

int length = arr.Length;

FSharpList<T> tail = FSharpList<T>.get_Empty();

int index = length - 1;

int num = 0;

if (num <= index)

{

do

{

tail = FSharpList<T>.Cons(arr[index], tail);

--index;

}

while (index != num - 1);

}

return tail;

}

It’s worth looking at this for a minute to understand how it works. Both conversions iterate over the source: converting to an array is straightforward, but the other direction is a bit more complex, with a downward-counting loop that prepends each element to the list with Cons.

The same symmetry can be expected for Array.ToSeq and Seq.OfArray and I will show that it is the case here.

Decompiled Array module:

public static IEnumerable<T> ToSeq<T>(T[] array)

{

if ((object) array == null)

throw new ArgumentNullException("array");

else

return SeqModule.OfArray<T>(array);

}

Decompiled Seq Module:

public static IEnumerable<T> OfArray<T>(T[] source)

{

if ((object) source == null)

throw new ArgumentNullException("source");

else

return new SeqModule.OfArray <T>(source);

}

To finish this post I’ll sum up some important traits for each collection and when and how one would choose one over another.

Seq and Array are better than List for parallelism
Expose Seq to public API, use List and Array only internally
Use Array if data is rarely added (or added in larger groups), or if size is known at initialization
Since a (linked) List is easy to add to or remove from, use List if many adds/inserts will be made
Use List in recursive processing with the head::tail pattern
Use Seq or Array for large collections since List consumes much more memory
Use List if you are holding immutable data and any of the other conditions suggest List
Use Seq for large collections when you don’t expect to use all of the elements, or don’t want all elements in memory at the same time
Seq is an abstract type and List and Array are automatically sequences. Seq supports lazy evaluation. Therefore Seq can be used by default when the concrete type doesn't matter.