Friday, August 19, 2011

Scala stock charts, part 2 - Getting the quotes

I am Swedish, so for the moment I am only interested in the Swedish stock market. I have found a site that publishes the quotes in text format, and I intend to read this information and save it in other files, one file for each instrument (I call a single stock an instrument, since it is a form of financial instrument).

The format of the daily file is shown below. (There is an example file to work with at: http://sites.google.com/site/ironicprogrammer/files/20110817k.txt)

Slutkurser från OMX 2011-08-17 Slutkurser från Burgundy 2011-08-17
Ticker Aktie +/- +/-% Köp Sälj Senast Högst Lägst Oms (antal) Oms (SEK) Senast Högst Lägst Oms (antal) Oms (SEK)
OMX Stockholm Large Cap
ABB ABB Ltd -2,30 -1,66 136,40 136,50 136,40 139,40 136,40 3299500 453921900 136,6 139,4 136,6 183000 25240000

It is hard to see in that little box, but the first row is the title information of the file:
Slutkurser från OMX 2011-08-17[6 tabs]Slutkurser från Burgundy 2011-08-17

That row is not really necessary and can be skipped, as can the next line, but the next line is worth a look because it shows the structure of the quotes:

Ticker Aktie +/- +/-% Köp Sälj Senast Högst Lägst Oms (antal) Oms (SEK)[6 tabs]Senast Högst Lägst Oms (antal) Oms (SEK)

When reading the file I am only interested in the quotes from OMX. I do not want the Burgundy quotes, so I will not consider them when saving the quotes to different files; only the blue-marked fields (the OMX ones) will be used.
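
As a rough sketch of how the OMX fields could be separated from the Burgundy ones, the snippet below splits a row on tabs and keeps the leading columns. The object name `OmxColumns`, the helper names, and the assumption that a data row has exactly one tab between columns are all made up for illustration; the count of 11 comes from the header row above (Ticker through the first Oms (SEK)).

```scala
object OmxColumns {
  // Assumption: the 11 OMX columns (Ticker .. Oms (SEK)) come first in each row.
  val OmxFieldCount = 11

  // The ABB row from the example, rebuilt with one tab between columns (an assumption).
  val sampleRow: String = List(
    "ABB", "ABB Ltd", "-2,30", "-1,66", "136,40", "136,50", "136,40", "139,40",
    "136,40", "3299500", "453921900", "136,6", "139,4", "136,6", "183000", "25240000"
  ).mkString("\t")

  // Keep only the OMX part of a tab-separated row; the Burgundy columns are dropped.
  def omxFields(row: String): List[String] =
    row.split("\t").toList.take(OmxFieldCount)

  // Swedish quotes use a decimal comma, e.g. "136,40".
  def toDecimal(text: String): BigDecimal =
    BigDecimal(text.trim.replace(',', '.'))
}
```

Calling `OmxColumns.omxFields(OmxColumns.sampleRow)` gives the eleven values from "ABB" to "453921900", and `OmxColumns.toDecimal` turns a value such as "136,40" into a number.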

The third row

OMX Stockholm Large Cap

is the name of the first list of quotes. The file continues like this:
Listname
Ticker Aktie ...
...
Listname ...

Ticker Aktie ...
...

I am only interested in the first three lists. The fourth is named "Externa listan", and I will use this name to stop reading.
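
One way to express "read until the fourth list" is `takeWhile` on the line iterator. This is only a sketch: the names `ListReader` and `wantedLines` are hypothetical, and it assumes the list name stands at the start of its own line.

```scala
object ListReader {
  // Name of the fourth list; reading stops when this line shows up.
  val StopList = "Externa listan"

  // Drop the title row and the header row, then keep lines until the stop list starts.
  def wantedLines(lines: Iterator[String]): Iterator[String] =
    lines.drop(2).takeWhile(line => !line.trim.startsWith(StopList))
}
```

Because iterators are lazy, nothing after the "Externa listan" line is looked at.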

So let's start writing some code (finally!!)
I use the REPL for a bit of trial before I put it in classes/objects.

To test reading the file:
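
A test along these lines could look like the sketch below, which can be pasted into the REPL. It is not the original session: the names `ReadTest`, `readLines` and `readExample` are hypothetical, and the ISO-8859-1 encoding is an assumption (the file contains å, ä and ö, so the encoding matters).

```scala
import scala.io.Source

object ReadTest {
  // Assumption: the file is Latin-1 encoded; change this if the characters come out wrong.
  val Encoding = "ISO-8859-1"

  // Read all lines from any Source and close it afterwards.
  def readLines(source: Source): List[String] =
    try source.getLines().toList
    finally source.close()

  // Fetch the example file mentioned in the post; this needs network access.
  def readExample(): List[String] =
    readLines(Source.fromURL(
      "http://sites.google.com/site/ironicprogrammer/files/20110817k.txt", Encoding))
}
```

In the REPL, `ReadTest.readExample().take(3).foreach(println)` would then print the title row, the header row and the first list name.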
OK, everything seems to work, so this needs to be taken care of in some other way.

Let's organize it a bit more:

I put a file named QuoteLoad.scala in src/main/scala/ironic with the following content:
QuoteLoad.scala

OK, there may be some features to explain here.
First of all, I make use of a singleton object here so that I can call it from a terminal window. It has a main method, but I have also extracted the functionality so that a future client can use the same methods to load quotes.

There is a trait so that I get an Iterator[QuoteItem] instead of an Iterator[Any], which lets me handle only the case class objects when I later save the quotes to different files.

QuoteList: holds the name of a list (there are different lists of quotes by category, which more or less reflects the amount of trading and the size of the companies)
Quote: holds the quote of the day for a specific share
InvalidQuote: holds (and logs, for error checking) an invalid row from the file.
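
The three case classes and the trait could be laid out roughly as below. The names `QuoteItem`, `QuoteList`, `Quote` and `InvalidQuote` come from the text; the fields, the `sealed` modifier, the wrapping object `QuoteModel` and the `describe` helper are assumptions made for this sketch.

```scala
object QuoteModel {
  // Common parent, so a collection of rows is Iterator[QuoteItem] and not Iterator[Any].
  sealed trait QuoteItem

  // Name of a list, e.g. "OMX Stockholm Large Cap".
  case class QuoteList(name: String) extends QuoteItem

  // One share's quote for the day (the field selection is an assumption).
  case class Quote(ticker: String, name: String, last: BigDecimal,
                   high: BigDecimal, low: BigDecimal, volume: Long) extends QuoteItem

  // A row that could not be understood, kept as-is for error checking.
  case class InvalidQuote(row: String) extends QuoteItem

  // With a sealed trait the compiler can warn if a case is forgotten here.
  def describe(item: QuoteItem): String = item match {
    case QuoteList(name)            => s"list: $name"
    case Quote(ticker, _, _, _, _, _) => s"quote: $ticker"
    case InvalidQuote(row)          => s"invalid: $row"
  }
}
```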

There are two regexps that are used later in pattern matching to extract the lists and quotes from the file. One looks for the structure of a valid quote in the file; the simpler one extracts the list name, which is just a single piece of text.
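
To illustrate the idea, here is a sketch of two regexps used as extractors in a pattern match. The patterns are guesses based on the file format shown earlier, not the ones from QuoteLoad.scala, and the object name `QuotePatterns`, the reduced `Quote` fields and the `parse` helper are hypothetical. It assumes tab-separated columns and decimal commas.

```scala
import scala.util.Try

object QuotePatterns {
  sealed trait QuoteItem
  case class QuoteList(name: String) extends QuoteItem
  case class Quote(ticker: String, name: String, last: BigDecimal, volume: Long) extends QuoteItem
  case class InvalidQuote(row: String) extends QuoteItem

  // One numeric column: optional minus sign, digits and decimal comma (may be empty).
  private val Num = """(-?[\d,]*)"""

  // Ticker, name and the nine numeric OMX columns; whatever follows (Burgundy) is ignored.
  val QuoteRow = ("""([^\t]+)\t([^\t]+)\t""" + List.fill(9)(Num).mkString("""\t""") + ".*").r

  // A list name is a single piece of text with no tab inside it.
  val ListName = """([^\t]+)\t*""".r

  // Convert a decimal-comma number such as "136,40".
  private def dec(s: String): BigDecimal = BigDecimal(s.replace(',', '.'))

  // Regex extractors must match the whole line, so the order of the cases is safe.
  def parse(line: String): QuoteItem = line match {
    case QuoteRow(ticker, name, _, _, _, _, last, _, _, volume, _) =>
      // Empty or malformed numbers end up as InvalidQuote instead of throwing.
      Try(Quote(ticker, name, dec(last), volume.toLong)).getOrElse(InvalidQuote(line))
    case ListName(name) => QuoteList(name)
    case _              => InvalidQuote(line)
  }
}
```

The header row ("Ticker", "Aktie", "+/-", ...) matches neither pattern, since "+/-" is not a number and the line contains tabs, so it would come out as an `InvalidQuote`.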

def loadQuotes(url: String, date: String): handles the load from the URL and calls a helper method to extract the valid quotes and lists.
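
The wiring between the singleton object, `main`, `loadQuotes` and its helper might look like the sketch below. Only the signature `loadQuotes(url: String, date: String)` is taken from the text. Everything else is an assumption: the object name `QuoteLoadSketch`, the helper names, the idea that the file address is the base URL followed by the date and "k.txt" (guessed from the example file name 20110817k.txt), the encoding, and the simple tab-count classification that stands in for the regexp matching.

```scala
import scala.io.Source

object QuoteLoadSketch {
  sealed trait QuoteItem
  case class QuoteList(name: String) extends QuoteItem
  case class Quote(ticker: String, fields: List[String]) extends QuoteItem
  case class InvalidQuote(row: String) extends QuoteItem

  // Assumption: base URL + date (yyyyMMdd) + "k.txt" gives the address of the daily file.
  def quoteUrl(url: String, date: String): String = url + date + "k.txt"

  // Helper kept free of I/O so it can be tried out on plain strings.
  def extractQuotes(lines: Iterator[String]): Iterator[QuoteItem] =
    lines
      .drop(2)                                         // title row and header row
      .takeWhile(!_.trim.startsWith("Externa listan")) // stop at the fourth list
      .filter(_.trim.nonEmpty)
      .map { line =>
        line.split("\t").toList match {
          case name :: Nil                       => QuoteList(name.trim)
          case ticker :: rest if rest.size >= 10 => Quote(ticker, rest.take(10))
          case _                                 => InvalidQuote(line)
        }
      }

  // Opens the URL and hands the lines to the helper (the source is left open while iterating).
  def loadQuotes(url: String, date: String): Iterator[QuoteItem] =
    extractQuotes(Source.fromURL(quoteUrl(url, date), "ISO-8859-1").getLines())

  // Entry point for use from a terminal window.
  def main(args: Array[String]): Unit = args match {
    case Array(url, date) => loadQuotes(url, date).foreach(println)
    case _                => println("usage: QuoteLoadSketch <base-url> <yyyyMMdd>")
  }
}
```

Keeping `extractQuotes` separate from the download is what lets a future client, or a quick REPL test, reuse the parsing without going through `main`.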

This will have to do for a first version; later I will save these quotes to different files.

2 comments:

  1. Preface: I'm a C++ guy with past Java experience, currently taking a Scala class (tough to wrap my head around).

    Question: Are you sure regex is the way to go for parsing? Might a CSV parse be faster? (Parse on tabs to load up a 2D grid of strings, then convert to the proper field format in memory?)

  2. Hi, John,
    sorry for the late reply.
    Maybe you are right. The purpose of doing this was not performance but to use (and learn) pattern matching.
    I did this project to learn Scala.
