Sunday, October 20, 2013

Building the scala-redis client for Scala 2.10.3

scala-redis is a client for the popular key-value database Redis. The client is available on GitHub here and comes with a jar file, 'rediscient-1.0.1.jar', for Scala version 2.8.1.

In a few steps I'll show how to build this client for Scala version 2.10.3 using sbt. If you haven't done so already, clone the repository from GitHub.

1. Change two lines in the build.properties:
    build.scala.versions=2.8.1  -> build.scala.versions=2.10.3
    sbt.version=0.7.4 -> sbt.version=0.13.0

The build is for Scala 2.10.3, and the SBT version should match what you have installed. If SBT is not installed you can get it here

2. Open a command line and change directory to the root of the scala-redis project

3. Type 'sbt'+Enter and wait for dependencies to be resolved
    C:\path> sbt

4. Optional: At the SBT prompt type 'compile'+Enter to compile. Wait for success
    > compile

There should be a folder called scala-2.10 in the target folder with compiled java classes.

5. To package the java classes to a jar-file, type 'package'+Enter at the prompt
    > package

There should be a newly created jar file in scala-2.10. My file is named scala-redis_2.10-0.1-SNAPSHOT.jar

Now this jar can be consumed by other projects!





Saturday, October 12, 2013

Skalahhhh (The Scalable Language)

My experiments in F# are going quite well, and I'm really starting to enjoy the terse syntax, the lack of curly braces, type inference, and experimental programming in the F# Interactive console, so I decided to take up another challenge: getting some basic familiarity with Scala. For those who don't know, Scala is a pure object-oriented language that runs on the Java Virtual Machine. It's completely separate from F# and .NET, so the only association between the two will be in this first paragraph of the post.

Two of the first things I read about Scala are that "Every value is an object" and "Every function is a value". From this one can easily conclude that all functions are objects. I also read that Scala is a so-called pure object-oriented language where everything is an object. While this is easy to remember, it's harder to understand the implications of such a simple statement. As an application developer I don't think it's crucial to understand its full meaning, just that it is an important fact in the design of the language.


At the time of writing the current version of the language is 2.10. The first version appeared in 2003 and its creator is Martin Odersky. He is still engaged in Scala and teaches an online course in Scala on Coursera (link later in this post). Scala is under active development and new releases are made available from time to time. The Scala Improvement Process (SIP) provides insight into pending and completed changes to the Scala language.


One of the challenges in learning something new is finding good material to learn from. When learning to program the resources are often plentiful, which only makes it harder. Just by doing a few searches on Scala I was able to find tutorials, books, videos and who knows what else. For me it's a fine balance between collecting too much and too little reference material, and it's important not to end up at either extreme. If I have too much material, the task of digesting it and extracting what is useful is overwhelming and off-putting. If I have too few references, there's a risk that I will soon be searching for more information again and end up overloaded anyway. To get started I've filtered the resources available online and ended up with this list.

When learning Scala, a compiler and a runtime are needed. These and any text editor are enough to get started with typing your own code. Using an IDE may be an option if the text editor doesn't suit your style. From what I know there are three well-known IDEs used for Scala (and Java) development. They are all free to download.
  • Eclipse
  • IntelliJ IDEA
  • NetBeans
A third option, and the one I will take initially, is to use a web-based editor and compiler. By writing, compiling and running Scala code from a web page I don't need to worry about installing and configuring Scala on my computer. I've found three sites that allow me to experiment with Scala code without installing anything on my computer (except a browser, of course).
Up until now I've used simplyscala because it lets me type code in a console, without defining classes, in an experimental manner. This works well for trying out a few lines of code and seeing the immediate result, but I will most likely try the other two later. At some point I plan to start using an IDE, probably when I start doing more than just simple experiments and want to save the code I've written. I would like to get comfortable writing Scala and have a few ideas for projects that I could do. However, whether I learn to master it or not, and how fast, is not crucial. I'm happy just enjoying learning something new while working with languages that I know much better for my serious work. So far I've spent some time reading and experimenting with the Scala School tutorial. The tutorial contains a lot of examples and is good to work through.

Below I've tried to explain some of the central concepts in Scala that a beginner should have no problem understanding, and that will be the end of my first blog post on Scala.

Higher-order functions are the opposite of first-order functions and come in three forms:

  1. A function with one or more functions as parameters, and that returns a value
  2. A function that takes one or more values as parameters, and that returns a function
  3. Both of the above, i.e. a function that returns another function and has one or more functions as input parameters

In calculus, good examples are the limit, which given a function returns a value, and the derivative, which given a function returns a different function.
An example in Scala looks like this:

def apply(f: Int => String, v: Int) = f(v)
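The three forms listed above can be sketched side by side. This is only a minimal illustration using the standard library; the names HigherOrderForms, applyTo, adder and compose are made up for this example.

```scala
object HigherOrderForms {
  // Form 1: takes a function as a parameter and returns a plain value.
  def applyTo(f: Int => String, v: Int): String = f(v)

  // Form 2: takes a plain value and returns a function.
  def adder(n: Int): Int => Int = (x: Int) => x + n

  // Form 3: takes functions as parameters and returns a function
  // that runs f first and then g on the result.
  def compose(f: Int => Int, g: Int => Int): Int => Int =
    (x: Int) => g(f(x))
}
```

For instance, HigherOrderForms.adder(2) gives back a function, and calling that function with 5 adds the two numbers.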

Currying is a process that transforms a function that has more than one parameter into a series of nested functions, each of which has a single parameter. In other words, when a curried function is called with fewer arguments than its full signature prescribes, a new function is returned that expects the missing arguments as parameters.

           def route (m:Message) = {
                  (e: Endpoint) => e.send(m)
           }
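Since Message and Endpoint are not defined in this post, here is a self-contained sketch of the same idea with plain integers. The names CurryingSketch, add, addCurried, addFive and curriedAdd are hypothetical; the only library call is the standard curried method on two-argument function values.

```scala
object CurryingSketch {
  // Ordinary function with two parameters in one parameter list.
  def add(a: Int, b: Int): Int = a + b

  // The same function written with two parameter lists (curried form).
  def addCurried(a: Int)(b: Int): Int = a + b

  // Supplying only the first argument gives a function of the second.
  val addFive: Int => Int = addCurried(5)

  // An existing two-argument function value can be curried after the fact.
  val addFn: (Int, Int) => Int = add
  val curriedAdd: Int => Int => Int = addFn.curried
}
```

The type Int => Int => Int reads as "a function from Int to a function from Int to Int", which is the series of single-parameter functions described above.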

Case classes are class definitions with immutable members that are given by their constructor arguments. The compiler generates structural equality and an extractor for a case class, so instances can be compared by value and used in pattern matching. (The closest F# counterparts are records and the cases of a discriminated union.)

                 case class Demo( title : String, author : String )
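To see what the case class gives you for free, here is a small sketch built around the same Demo class. The wrapper object CaseClassSketch, the describe function and the "Anonymous" placeholder are assumptions made for the example.

```scala
object CaseClassSketch {
  case class Demo(title: String, author: String)

  // Pattern matching takes the instance apart by its constructor arguments.
  def describe(d: Demo): String = d match {
    case Demo(title, "Anonymous") => title + " (author unknown)"
    case Demo(title, author)      => title + " by " + author
  }
}
```

Two Demo values with the same title and author compare as equal with ==, even though they are separate instances, and copy creates a modified copy without mutating the original.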

Sequence comprehension, in Scala also called for-comprehension (generally called list comprehension in other languages), is the process of creating a sequence based on an existing sequence. If it sounds like a for-loop, that is because it is very similar. Sequence comprehension has its roots in mathematics and is generally composed of an output variable (in the example, i), an input domain or generator (List.range), a guard (the if condition), and an output function (in this case also the variable i).

          for (i <- List.range(from, to) if i % 2 == 0) yield i

An in-depth explanation of comprehensions can be found here.
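As a rough sketch of what goes on behind the syntax, the comprehension can be wrapped in a function and compared with the method calls it is translated to. The names ComprehensionSketch, evens and evensDesugared are made up for this example, and the second version is my approximation of the rewrite rather than actual compiler output.

```scala
object ComprehensionSketch {
  // Even numbers in the range [from, to), written as a for-comprehension.
  def evens(from: Int, to: Int): List[Int] =
    for (i <- List.range(from, to) if i % 2 == 0) yield i

  // Approximately the same thing with explicit method calls:
  // the guard becomes withFilter and the yield becomes map.
  def evensDesugared(from: Int, to: Int): List[Int] =
    List.range(from, to).withFilter(i => i % 2 == 0).map(i => i)
}
```

Both functions return the same list for the same arguments, which is a handy way to convince yourself that a for-comprehension is an expression producing a value and not a loop with side effects.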

Closures are functions whose return value depends on the value of one or more variables declared outside the function. In the example, the function on row three is the closure. If you look at that row in isolation you'll notice that the value first is not declared as an argument to the function; it is a so-called free variable. What actually happens is that the function value created on that row captures the binding of the free variable from its enclosing scope, which is the defining characteristic of a closure.

          val largerThanFirst = (xs: List[Int]) => {
              val first = xs(0)
              val isLarger = (y: Int) => y > first
              for (x <- xs if isLarger(x)) yield x
          }
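A closure captures the variable itself and not a snapshot of its value, which is easiest to see with a mutable variable. The sketch below is only an illustration; ClosureSketch, makeCounter and largerThan are hypothetical names.

```scala
object ClosureSketch {
  // Returns a function that closes over the local variable `count`.
  // Each call to makeCounter creates a fresh `count`.
  def makeCounter(): () => Int = {
    var count = 0
    () => { count += 1; count }
  }

  // `threshold` is a free variable inside the returned function.
  def largerThan(threshold: Int): Int => Boolean =
    (y: Int) => y > threshold
}
```

Calling the function returned by makeCounter repeatedly yields increasing numbers, because the captured variable lives on after makeCounter has returned.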

Sunday, October 6, 2013

My guide to securing digital currencies

This is my work-in-progress guide to, and my own adopted steps for, backing up my digital currencies and keeping them secure. The guide applies to bitcoins, litecoins, and ripples (XRP). It’s widely accepted that Windows, because of its popularity, is more exposed to attacks and malicious software, so I use Linux as much as possible when handling digital currencies. I use numerous wallet providers, both web-based and installed applications. A key factor in keeping your assets secure is that you are the only one with access to your wallet files, seed phrases and passwords. Therefore I prefer to use an installed wallet application, at least for accounts with larger balances.

The services I use are Electrum, Litecoin-Qt and the Ripple online client, and I will therefore focus on preventing unwanted access, backing up files and making sure that I can restore and gain access to my assets held by these services should I need to.

Electrum

The electrum wallet can be recovered from a secret seed so it’s imperative to keep backups of this seed and prevent anyone from seeing it. I keep a copy of the seed in three locations:
  1. Paper copy kept at a safe location
  2. In my Wuala account. Before uploading I self-encrypted the file
  3. File saved on a USB-stick, also self-encrypted

Litecoin

My Litecoin-Qt wallet is encrypted and I keep the passphrase on paper at a safe location. A self-encrypted file with the passphrase is saved in my Wuala account and on a USB-stick. The wallet.dat is backed up when I see it as necessary. By default the wallet file contains 100 pre-generated unused addresses, so after a period with many new transactions I make a new backup and replace the wallet.dat file in my Wuala account and on the USB-stick. This file is also self-encrypted.

Ripple

For my Ripple wallet I back up two files. One file with my secret key is kept in an encrypted file on Wuala and on a USB-stick. This backup only needs to be performed once. My wallet file is also self-encrypted and stored in my Wuala account. The wallet file contains stored contacts, so I replace it when I see it as necessary.

The Final Touch

There are some passwords that I keep in my head and written down for easy access:
  • Electrum password
  • Litecoin-qt passphrase
  • Ripple client wallet name and passphrase
  • Wuala account name and password
  • 7-zip password for my self-encrypted files

Take note that for assets of larger value, savings for example, this is not a recommended approach, as the wallets are in contact with the internet and may therefore be compromised or stolen if someone gains access to the seed or password. I make transactions with these wallets, even if not very often, and therefore need them to be online. Cold storage, i.e. creating a wallet which is never in contact with the internet, is another topic and requires a different set of actions.

Saturday, October 5, 2013

First Script in Scala

This is a quick summary of how to build and run a very simple application in Scala. The application is a typical "Hello World" example, so there is no Scala to learn here; instead I hope it will make it frictionless to download, install and configure the essentials for creating the very first Scala application. In this example the OS is Fedora 19 and I will use IntelliJ IDEA as the IDE (Integrated Development Environment).

Begin by downloading the Java JDK; you'll need it to start the IDE and to build the application. Get the latest RPM package here (x64 or x86 depending on the OS)


Install from a terminal (replace path to the download and the JDK version)

rpm -Uvh /path/jdk-7u40-linux-x64.rpm

Download IntelliJ IDEA for Linux here:


Extract the file, e.g.:  tar -xzf ideaIC-xx.xx.xx.tar.gz

Before the IDE can be started, an environment variable pointing to the JDK needs to be set. Any one of IDEA_JDK, JDK_HOME or JAVA_HOME will work. This is one way of doing that:


export IDEA_JDK=/usr/java/jdk1.7.0_40 (make sure it's the correct path to the root of the JDK installation)

For the environment variable to persist it needs to be added to a config file. There are plenty of tutorials with instructions for this!

Start the IDE from the terminal window:  ./idea.sh

In the start window navigate to Configure -> Plugins, then click Install JetBrains Plugin and search for Scala



Right click and Download and Install!

Go back to the Start page and create a New Project. Choose Scala Module. Enable 'Set Scala Home' and click 'Download Scala'. Choose a version to download. After the download, set the path to the download folder. Make sure Compiler library and Standard library are filled out. In case there is a warning triangle as in the picture below, ignore it for now. Click Finish!


 
Navigate to File -> Project Structure -> Libraries, and fix the errors, if any (there is a small icon to click for fixing each error). Dependencies on scala-compiler and scala-library must be added as in the picture below.



Add a new Scala script to the project, name it FirstScript.scala
The script file must be added to the src folder for the project to run!

Type some program code in a main method, something like this:

          object FirstScript {
               // Entry point; the explicit ": Unit =" form also compiles on newer Scala versions
               def main(args: Array[String]): Unit = {
                  println("Hello, world!")
               }
          }
 

Build! (Alt+F9)
Run! (Shift+F10)

Good Luck!!





Friday, October 4, 2013

F# Collections - Part 1

F# has three different collection types to hold values of the same type:

  1. Array - A fixed-size, zero-based, mutable collection
  2. List - An ordered, immutable linked list
  3. Sequence -  A logical series of elements

They have similarities but also many differences. Very few people can probably remember all the functions that can be applied to a given type (IntelliSense will help!), but a solid grasp of many of them is necessary. For my own reference I wrote this sheet listing the functions; the first and second columns are the important ones. In the Available column ‘a’ is array, ‘l’ is list, and ‘s’ is sequence. Some descriptions are missing, but I will add them later as I experiment (for full details go here).

Skip this list and go to my experiments below.

Function               | Description | Available
append                 | Add elements and return new collection | als
average                | Calculate average | als
averageBy              | Calculate average with a function applied to each element | als
blit                   | Copy section | a
cache                  | Compute and store elements | s
cast                   | Convert to type | s
choose                 | Apply function and return Some | als
collect                | Apply function, concatenate and return | als
compareWith            | Compare with a function | s
concat                 | Combine | als
countBy                | Apply key-generating function to each element and return | s
copy                   | Copy collection | as
create                 | Create array | a
delay                  | - | s
distinct               | Return collection with no duplicates | s
distinctBy             | Return collection with no duplicates according to equality function | s
empty                  | Create empty collection | als
exists                 | Test if any element satisfies condition | als
exists2                | Test if pair of elements satisfies condition | as
fill                   | Set a range of elements to a given value | a
filter                 | Filter and return new collection | als
find                   | Return first found element | als
findIndex              | Return index of first found element | als
fold                   | Apply function to each element, threading an accumulator argument | als
fold2                  | Apply function to each element in two collections, threading an accumulator argument | al
foldBack               | - | al
foldBack2              | - | al
forall                 | Test if all elements satisfy predicate | als
forall2                | Test if all elements satisfy predicate pairwise | als
get/nth                | Get element | als
head                   | Get first element | ls
init                   | Create collection given dimension and function | als
initInfinite           | Generate sequence | s
isEmpty                | Test if empty | als
iter                   | Apply function to each element | als
iteri                  | - | als
iteri2                 | Apply function to pair of elements | al
iter2                  | - | al
length                 | Return length | als
map                    | Build a collection by applying a function | als
map2                   | Build a collection by applying a function to two collections | l
map3                   | - | l
mapi                   | Build array | als
mapi2                  | Build collection | als
max                    | Return greatest element | als
maxBy                  | Return greatest element compared with function | als
min                    | Return smallest element | als
minBy                  | Return smallest element compared with function | als
ofArray, ofList, ofSeq | Create a new collection of a different type | -
pairwise               | - | s
partition              | Split collection into two collections | al
permute                | Return array with all elements permuted | al
pick                   | Apply function to successive elements | als
readonly               | - | s
reduce                 | - | als
reduceBack             | - | l
replicate              | Create list of specified length with elements set to given value | l
rev                    | Reverse collection | al
scan                   | - | als
scanBack               | - | al
singleton              | - | s
set                    | Set element to specified value | a
skip                   | Skip elements and return new collection | s
skipWhile              | - | s
sort                   | Sort using compare | als
sortBy                 | - | als
sortInPlace            | - | a
sortInPlaceBy          | - | a
sortInPlaceWith        | - | a
sortWith               | Sort using comparison function | al
sub                    | Create array from subrange | a
sum                    | Return sum of elements | als
sumBy                  | Return sum of elements with function applied | als
tail                   | Return list without first element | l
take                   | Return elements up to specified count | s
takeWhile              | - | s
toArray, toList, toSeq | Create a new collection of a different type | -
truncate               | Return a sequence with no more than N elements | s
tryFind                | - | als
tryFindIndex           | - | als
tryPick                | - | als
unfold                 | - | s
unzip                  | Split list of pairs into two lists | al
unzip3                 | Split list of triples into three lists | als
windowed               | - | s
zip                    | Combines the two collections into a list of pairs | als
zip3                   | Combines the three collections into a list of triples | als


There are several ways to create collections but the most basic is:

let a = [| 0 .. 10 .. 100 |]
let l = [ 0 .. 10 .. 100 ]
let s = seq { 0 .. 10 .. 100 }

The collections above have small footprints so let’s create larger collections and make a discovery. I add a stopwatch to determine the time for each binding.

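let N = int(10e6)
open System.Diagnostics
let stopWatch = Stopwatch.StartNew()
let aXL = Array.init N (fun i -> i)
let aXLt = stopWatch.Elapsed.TotalMilliseconds
let lXL = List.init N (fun i -> i)
let lXLt = stopWatch.Elapsed.TotalMilliseconds - aXLt
let sXL = Seq.init N (fun i -> i)
// Elapsed is cumulative, so subtract both earlier timings
let sXLt = stopWatch.Elapsed.TotalMilliseconds - aXLt - lXLt
stopWatch.Stop()

printf "%f %f %f" aXLt lXLt sXLt

The output I got was: 35.113200 1038.648800 72.062900
The first two figures are the array and the list. The third was measured with a second Array.init and without subtracting the array time, so it says nothing about the sequence; Seq.init returns almost immediately because nothing is evaluated yet. Initializing the list takes considerably longer, which can be explained by the fact that lists are in fact linked lists.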

Now let’s try to update the element in the middle (index 5). For the array you can do this:

a.SetValue(0, 5)

But for both the list and the sequence you will discover that there is no straightforward way of doing the same. Only array elements are mutable, as I wrote in the first paragraph.
To get a single element and bind it to a new value:

       let a5 = Array.get a 5
       let l5 = List.nth l 5
       let s5 = Seq.nth 5 s

For arrays and lists it can also be written:

       let a5 = a.GetValue(5)
       let l5 = l.Item(5)

The same collections can also be created as

let a = Array.init 11 (fun n -> n * 10)
let l = List.init 11 (fun n -> n * 10)

let s = Seq.init 11 (fun n -> n * 10)

By now you have probably seen that ‘s’ has a different signature

                val s : seq<int>

It is because the sequence is not actually evaluated when created. The sequence is represented as System.Collections.Generic.IEnumerable<T>, which means that it is lazily evaluated. With lazy evaluation it is possible to create infinite collections, which otherwise would consume an infinite amount of memory.

       let sInfinite = Seq.initInfinite (fun n -> n * 10)
       Seq.nth 10 sInfinite
       Seq.nth 2147483647 sInfinite

The third row will take a while to evaluate, as the index is the largest possible value for a 32-bit int, and it might not evaluate to the number you expect, because n * 10 overflows the 32-bit range for an index that large.

The average of an integer collection can easily be calculated

       Array.averageBy (fun i -> float i) a
       List.averageBy (fun i -> float i) l
       Seq.averageBy (fun i -> float i) s

Let’s try this on the larger collections and measure the performance

open System.Diagnostics
let stopWatch = Stopwatch.StartNew()
let aa = Array.averageBy (fun i -> float i) aXL
let aat = stopWatch.Elapsed.TotalMilliseconds
let la = List.averageBy (fun i -> float i) lXL
let lat = stopWatch.Elapsed.TotalMilliseconds - aat
let sa = Seq.averageBy (fun i -> float i) sXL
// Elapsed is cumulative, so subtract both earlier timings
let sat = stopWatch.Elapsed.TotalMilliseconds - aat - lat
stopWatch.Stop()

printf "%f %f %f " aat lat sat

You’ll find that the sequence is the slowest.

Since these collections are similar but have differences, there will be situations where one collection needs to be converted to another type. There are two functions to use for conversion, e.g. for an array there are both toList and ofList. The former converts the array to a list and the latter converts a list to an array. This means that there are two ways of converting an array to a list, i.e. Array.toList and List.ofArray (compiled as ArrayModule.ToList and ListModule.OfArray). The obvious question is: what’s the difference? To find out I decompiled the ArrayModule class, which holds the Array functions.

    public static FSharpList<T> ToList<T>(T[] array)
    {
      if ((object) array == null)
        throw new ArgumentNullException("array");
      else
        return List.ofArray<T>(array);
    }

    public static T[] OfList<T>(FSharpList<T> list)
    {
      if ((object) list == null)
        throw new ArgumentNullException("list");
      else
        return List.toArray<T>(list);
    }

Both functions delegate to a function on the internal List class. The corresponding public functions in the List module do the same:

    public static FSharpList<T> OfArray<T>(T[] array)
    {
      return List.ofArray<T>(array);
    }

    public static T[] ToArray<T>(FSharpList<T> list)
    {
        return List.toArray<T>(list);

    }

Examining this, we find that ToList() in the Array module and OfArray() in the List module are in fact the same thing, except for the null check. Note the difference between the public static methods, which start with a capital letter, and the internal methods they call, which start with a lowercase letter. All conversion is handled by internal methods of List.

    internal static T[] toArray<T>(FSharpList<T> l)
    {
      T[] res = new T[l.Length];
      List.loop<T>(res, 0, l);
      return res;
    }

    internal static FSharpList<T> ofArray<T>(T[] arr)
    {
      int length = arr.Length;
      FSharpList<T> tail = FSharpList<T>.get_Empty();
      int index = length - 1;
      int num = 0;
      if (num <= index)
      {
        do
        {
          tail = FSharpList<T>.Cons(arr[index], tail);
          --index;
        }
        while (index != num - 1);
      }
      return tail;
    }

It’s worth looking at this for a minute to understand how it works. Both conversions use iteration. Converting to an array is straightforward, but the other conversion is a bit more complex, with a downward-counting loop that prepends (cons) each element to the list.

The same symmetry can be expected for Array.ToSeq and Seq.OfArray and I will show that it is the case here.

Decompiled Array module:

    public static IEnumerable<T> ToSeq<T>(T[] array)
    {
      if ((object) array == null)
        throw new ArgumentNullException("array");
      else
        return SeqModule.OfArray<T>(array);
    }

Decompiled Seq Module:

    public static IEnumerable<T> OfArray<T>(T[] source)
    {
      if ((object) source == null)
        throw new ArgumentNullException("source");
      else
        return new SeqModule.OfArray <T>(source);
    }

To finish this post I’ll sum up some important traits for each collection and when and how one would choose one over another.

  • Seq and Array are better than List for parallelism
  • Expose Seq in public APIs; use List and Array only internally
  • Use Array if data is rarely added (or added in larger groups), or if the size is known at initialization
  • Since a (linked) List is easy to add to and remove from, use List if many adds/inserts will be made
  • Use List in recursive processing with the head::tail pattern
  • Use Seq or Array for large collections since List consumes much more memory
  • Use List if you are holding immutable data and any of the other conditions suggest List
  • Use Seq for large collections when you don’t expect to use all of the elements, or don’t want all elements in memory at the same time
  • Seq is an abstract type, and List and Array are automatically sequences. Seq supports lazy evaluation. Therefore Seq can be used by default when the concrete type doesn't matter.