<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6748907194223514435</id><updated>2011-11-27T19:18:58.313-05:00</updated><category term='Data Interchange'/><category term='Quant'/><category term='DSL'/><category term='Groovy'/><category term='Serialization'/><category term='Object Notation'/><title type='text'>Jonathan's Technology &amp; Finance Blog</title><subtitle type='html'>Blog on technology and quantitative finance focusing on Dynamic and Functional Languages such as Groovy, Scala, F# and R for building financial and quantitative domain specific languages (DSL) across data and computational grids using tools like GridGain and Terracotta.  This description is now search engine and tag word compliant.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://nycfintech.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://nycfintech.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Jonathan</name><uri>http://www.blogger.com/profile/13596731520846039248</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://4.bp.blogspot.com/_J9I66wTIjAE/SYEpkhd-mtI/AAAAAAAAAAM/8m2xMBPpmbY/S220/jonathan2.jpg'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>5</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6748907194223514435.post-8181797934982305917</id><published>2010-12-04T17:43:00.002-05:00</published><updated>2010-12-09T10:14:04.034-05:00</updated><title type='text'>Giant Groovy Scripts and Enterprise Software Forensics: Tricks and Treats</title><content type='html'>&lt;span style="font-size: x-large;"&gt;&lt;b&gt;The Vision&lt;/b&gt;&lt;/span&gt; &lt;br /&gt;&lt;br /&gt;How often would it be helpful to record everything a complex system did at a critical moment, to log the information intelligently, to play it back at some point in the future, and to have a record to fall back on that could re-create the state of the system at a given point in time?&lt;br /&gt;&lt;br /&gt;I find myself investing a lot of time studying the log files -- sometimes poorly constructed or lacking critical information -- of real-time systems, building systems to simulate typical and exceptional moments in the financial markets, and instrumenting existing systems to better combine the insights of forensic analysis with simulation.&lt;br /&gt;&lt;br /&gt;The questions I started with are, admittedly, a setup for a discussion of proxies that write their operations to a script while executing production code, thus creating a human-readable and machine-executable record of exactly what took place during operations.&amp;nbsp; In Java, this is easily implemented by combining multiple implementations of a given interface (one that records and one that executes) with a delegation class that receives the original instructions and provides handlers for exceptions and so on.&lt;br /&gt;&lt;br /&gt;In Groovy, it is easy to accomplish the same thing using Groovy's Meta-Programming features (for which there are already many good blogs and 
presentations).&lt;br /&gt;&lt;br /&gt;However you choose to implement the code-generation part of a component that records a given execution path and its inputs, you are likely to hit certain snags.&amp;nbsp; That said, if you started down this path there is probably a reason, and my conclusion is that it is worth the pain.&amp;nbsp; The forensic value of having a human-readable / editable "log file" that can also be evaluated to reproduce the exact state of a system at a given point in time cannot be overstated.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;Practical considerations for very sizable playback scripts&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-size: small;"&gt;There are a number of problems with this approach as it relates to context-specific business logic, external state management, exceptions and errors.&amp;nbsp; In my situation we have a point-in-time database, so I can send any query to the database with an additional parameter that specifies an effective date and the query will return the information that was available to the system at that point in time.&amp;nbsp; This solves most of the state management issues not already serialized to my scripts.&amp;nbsp; For the purposes of this post, I will focus on the mechanics of dealing with the gigantic scripts you have produced.&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-size: small;"&gt;In my case, the generated scripts have the feel of database operations.&amp;nbsp; For example,&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre class="brush:groovy"&gt;api.beginTrx()&lt;br /&gt;  inst.defineInstrument(&lt;br /&gt;    [ name : 1.23, type : 'sample', avg : 3.1415927], &lt;br /&gt;    [ 5 : '', 7 : '', 9 : ''])&lt;br /&gt;&lt;br /&gt;  inst.defineInstrument(&lt;br /&gt;    [ name : 1.23, type : 'sample', avg : 3.1415927], &lt;br /&gt;    [ 
5 : '', 7 : '', 9 : ''])&lt;br /&gt;&lt;br /&gt;  inst.defineInstrument(&lt;br /&gt;    [ name : 1.23, type : 'sample', avg : 3.1415927], &lt;br /&gt;    [ 5 : '', 7 : '', 9 : ''])&lt;br /&gt;&lt;br /&gt;api.commitTrx()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The first round of difficulties came from compilation errors with either the code or the data serialization.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-size: small;"&gt;When considering these, it is important to realize a few things about how Groovy works with the JVM. &amp;nbsp; First, although many interpreted and dynamic languages evaluate their code one block at a time, Groovy's runtime compiles the entire script at once into a class derived from the abstract class groovy.lang.Script.&amp;nbsp; The script can be compiled in a variety of ways, for example through GroovyShell or &lt;/span&gt;&lt;/span&gt;GroovyClassLoader.&amp;nbsp; Additionally, &lt;span style="font-size: large;"&gt;&lt;span style="font-size: small;"&gt;the GroovyScriptEngine can both compile and evaluate a script with a given binding for external variables&lt;/span&gt;&lt;/span&gt; and keep class files in sync when the source code changes.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;The consequence of this difference between Groovy and a language that executes a script one statement at a time (a Unix shell, for example) is that your code generation component must produce syntactically valid Groovy for every possible combination of execution paths and state migrations.&amp;nbsp; Obviously, this was already a requirement because the goal of this approach was to be able to replay exactly what happened in a large and complex system.&amp;nbsp; My point here is that a small compile error in the 
generated code will have much more unforgiving consequences in Groovy than in some other dynamic languages because the entire script is compiled at once.&amp;nbsp; That said, if your goal is to have defect-free code in this feature, then consider the compilation process a blessing.&lt;br /&gt;&lt;br /&gt;If you are using Groovy's meta-programming features, this requirement is fairly easy to meet with respect to method invocation and such.&amp;nbsp; Typically the bigger problem, in my experience, is how best to serialize the data such that it will replicate state accurately, be human-readable, and be machine-evaluable.&amp;nbsp; In most cases there are ways to construct a string that, when evaluated, will reflexively re-create an object with the same state.&amp;nbsp; A slightly trickier problem is when to evaluate and when to serialize GStrings.&amp;nbsp; Here thorough unit testing and integration testing will be key to your success. &amp;nbsp; You can see some of my early thoughts on these problems in the first two posts I made on Groovy Object Interchange data files.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Although the dynamic Groovy support classes provide powerful features, once the script is compiled the run method defined in the new .class file can be called on any instance directly from Java.&amp;nbsp; So the point to keep in mind here is that all the code from your script will get folded into a method that will be invoked through&amp;nbsp; Groovy's run methods.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;The JVM 64k Problem For Large Groovy Scripts&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Additionally, the JVM class file specification limits the bytecode of a single method to 64KB.&amp;nbsp; Typically when a script is compiled, you should assume that the totality of the code generated by your script will reside in a single method.&amp;nbsp; Therefore, it won't be long before you find yourself on friendly terms with&amp;nbsp;&lt;span 
class="postbody"&gt; &lt;/span&gt;&lt;span class="postbody"&gt;&lt;a class="api" href="http://download.oracle.com/javase/6/docs/api/java/lang/ClassFormatError.html" target="_new" title="Java API"&gt;java.lang.ClassFormatError&lt;/a&gt;.&lt;/span&gt;&amp;nbsp; The database-like example above will quickly result in this.&amp;nbsp; In that example, each line of code contributed about 10 bytes of Java bytecode to the primary method and, predictably, the limit was soon reached.&lt;br /&gt;&lt;br /&gt;If you read the various threads on this error, the most popular suggestions revolve around refactoring your code into a series of smaller method invocations and, occasionally, around how to automatically check the implied method size prior to compilation and safely divide the script into a number of methods that could be evaluated individually.&lt;br /&gt;&lt;br /&gt;To this I have two responses.&amp;nbsp; First, the stated goal here is to accurately re-create and document an execution path, so it would be undesirable to add structure that did not exist during real-time evaluation.&amp;nbsp; Second, automatic decomposition of the primary method is not as trivial as it sounds (and if it is, it should be part of the parser / compiler).&lt;br /&gt;&lt;br /&gt;That said, Groovy does not constrain you to a world of small, reasonably sized scripts.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;Closures and Builders are Your Friends&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Consider the simple alternative to the above code:&lt;br /&gt;&lt;br /&gt;&lt;pre class="brush:groovy"&gt;api.transaction {&lt;br /&gt;  inst.defineInstrument(&lt;br /&gt;     [ name : 1.23, type : 'sample', avg : 3.1415927], &lt;br /&gt;     [ 5 : '', 7 : '', 9 : ''])&lt;br /&gt;&lt;br /&gt;  inst.defineInstrument(&lt;br /&gt;     [ name : 1.23, type : 'sample', avg : 3.1415927], &lt;br /&gt;     [ 5 : '', 7 : '', 9 : ''])&lt;br /&gt;&lt;br /&gt;  
inst.defineInstrument(&lt;br /&gt;     [ name : 1.23, type : 'sample', avg : 3.1415927], &lt;br /&gt;     [ 5 : '', 7 : '', 9 : ''])&lt;br /&gt;&lt;br /&gt;  inst.defineInstrument(&lt;br /&gt;     [ name : 1.23, type : 'sample', avg : 3.1415927], &lt;br /&gt;     [ 5 : '', 7 : '', 9 : ''])&lt;br /&gt;&lt;br /&gt;  inst.defineInstrument(&lt;br /&gt;     [ name : 1.23, type : 'sample', avg : 3.1415927], &lt;br /&gt;     [ 5 : '', 7 : '', 9 : ''])&lt;br /&gt;&lt;br /&gt;  inst.defineInstrument(&lt;br /&gt;     [ name : 1.23, type : 'sample', avg : 3.1415927], &lt;br /&gt;     [ 5 : '', 7 : '', 9 : ''])&lt;br /&gt;}&amp;nbsp;&lt;/pre&gt;&lt;br /&gt;Here the body of the transaction is contained in an anonymous closure which could be passed to a method such as:&lt;br /&gt;&lt;br /&gt;&lt;pre class="brush:groovy"&gt;def transaction ( Closure body ) {&lt;br /&gt;    beginTrx()&lt;br /&gt;    try {&lt;br /&gt;      body.call()&lt;br /&gt;      commitTrx()&lt;br /&gt;    } catch (Exception e) {&lt;br /&gt;      rollbackTrx()&lt;br /&gt;      throw e   // re-raise so a failed transaction is not silently ignored&lt;br /&gt;    }&lt;br /&gt;}&amp;nbsp;&lt;/pre&gt;&lt;br /&gt;Here the body of each short transaction is wrapped not in begin / commit method invocations but in an anonymous closure.&amp;nbsp; Because Groovy compiles each closure into its own class, the practical limits change considerably.&amp;nbsp; Still, each top-level method invocation adds bytecode to the principal method, and the example above now scales to 2,000 transactions or 25,000 lines of code before failing.&amp;nbsp; Larger transactions would have let the file grow still larger, and nested transactions would have allowed yet more.&lt;br /&gt;&lt;br /&gt;Moreover, packaging up units of work into larger baskets and submitting the baskets to a smart transaction manager has some appeal on its own terms.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;The Most Powerful Tools Come At A Price&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: 
large;"&gt;&lt;span style="font-size: small;"&gt;The last thought I would like to share is that for extract-transform-load data flow scenarios, this approach proved incredibly powerful.&amp;nbsp; In the end, I had a small code base that allowed me to examine, without binary structures or obscure references, all of the data flowing into and out of a complex system, to re-examine the system at a given snapshot, and to re-generate the system from its first execution to the present day using the replay tool.&amp;nbsp; But these features came at a price.&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-size: small;"&gt;I spent a lot of time tweaking the script generation tools and script execution runtime.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-size: small;"&gt;I spent a lot of time making sure that everything beyond simple collections and primitives serialized correctly.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-size: small;"&gt;I spent more time than I should have studying the internals of the VM and Groovy bytecode generation&lt;/span&gt;&lt;/span&gt;, but admittedly that is partly because I like that stuff.&lt;/li&gt;&lt;li&gt;I had to redesign the core APIs of the 'live' system to properly support simulation and playback, although I believe I ended up with a better API for the 'live' system as well.&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-size: large;"&gt;&lt;b&gt;Steps Required To Fully Implement This Vision&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: small;"&gt;Scripts would have to know which version of the codebase they ran against.&amp;nbsp; I would probably narrow the specificity here to major.minor version releases, allowing a patch release to re-write 
history by fixing a bug in a minor release.&amp;nbsp; The script runner would build a classpath appropriate to the system executing at that time.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: small;"&gt;Database versioning would have to be explicitly synchronized with codebase versioning, and simulations run against the "old" schema would have to execute "correctly".&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: small;"&gt;State would have to be explicit, with few or no side effects, such that the serialization mechanism for the code-producing agent would be able to exactly capture both the initial state of the system and all flows of data that changed that state.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;Inspect() / Eval.me()&lt;/span&gt;&lt;/b&gt; &lt;br /&gt;&lt;br /&gt;I have made this point previously, so I won't dwell on it: much of what I built is remarkably similar to Groovy's inspect() / Eval mechanism.&amp;nbsp; For that mechanism to work properly, however, it would need support for common Java datatypes -- Date classes in particular -- and a more generalized contract for POJO / POGO objects to inspect() themselves (which people could override when advanced state management or systems integration logic was required).&amp;nbsp; This functionality is clearly both within the syntactical aspirations of the language and within its approach to dynamism.&amp;nbsp; It is also well within the abilities of the current team.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;span style="font-size: large;"&gt;Groovy and The Beauty of The Half Baked Idea&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As I have said before, one of my favorite aspects of the Groovy eco-system is the speed and ease with which it allows someone to build out a half-baked idea and begin to get serious about it.&amp;nbsp; Most of the fully baked ideas in the world have already found their way into great software.&amp;nbsp; Most of the great ideas of 
the future are half-baked ideas today.&amp;nbsp; Not only does Groovy let you get the half-baked ideas into some kind of working structure that allows for refinement and rigorous clarification, but it also allows an idea to fail quickly.&amp;nbsp; This matters because most half-baked ideas are bad ideas, and the quicker you can move on to the next one the better.&lt;br /&gt;&lt;br /&gt;As strong an advertisement as this is for the language, the eco-system, and, really, the Groovy community, it is also evident in the eco-system itself.&amp;nbsp; Inspect() and Eval() feel, to me, like half-baked ideas that got half finished before someone got pulled into something bigger and more important.&amp;nbsp; There are a fair number of points in the codebase where there are diamonds in the rough waiting to be attended to.&amp;nbsp; There are also a fair number of ideas that should have died young but are now stuck in the codebase.&amp;nbsp; And there are others (like the very attractive Grails-UI Rich Internet plugin) that were great at one time but now need work and have been abandoned by their authors.&amp;nbsp; This is common enough in strong and active open source communities -- think Apache, JBoss and so on -- but Groovy is hitting a critical mass right now in terms of its support all around the world and is also being challenged as the Java language adopts more and more of its most popular features.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6748907194223514435-8181797934982305917?l=nycfintech.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nycfintech.blogspot.com/feeds/8181797934982305917/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://nycfintech.blogspot.com/2010/12/giant-groovy-scripts-and-enterprise.html#comment-form' title='1 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default/8181797934982305917'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default/8181797934982305917'/><link rel='alternate' type='text/html' href='http://nycfintech.blogspot.com/2010/12/giant-groovy-scripts-and-enterprise.html' title='Giant Groovy Scripts and Enterprise Software Forensics: Tricks and Treats'/><author><name>Jonathan</name><uri>http://www.blogger.com/profile/13596731520846039248</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://4.bp.blogspot.com/_J9I66wTIjAE/SYEpkhd-mtI/AAAAAAAAAAM/8m2xMBPpmbY/S220/jonathan2.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6748907194223514435.post-684594777250842131</id><published>2010-09-17T17:19:00.000-04:00</published><updated>2010-09-18T15:40:23.419-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Serialization'/><category scheme='http://www.blogger.com/atom/ns#' term='DSL'/><category scheme='http://www.blogger.com/atom/ns#' term='Data Interchange'/><category scheme='http://www.blogger.com/atom/ns#' term='Object Notation'/><category scheme='http://www.blogger.com/atom/ns#' term='Groovy'/><title type='text'>Groovy Data Interchange Format</title><content type='html'>&lt;span style="font-size:180%;"&gt;Groovy Object Notation ? 
GrON?&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;I Guess it had to happen...&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Currently I spend a lot of time building DSLs in Groovy that will get used by other tools, so writing code that generates syntactically correct Groovy -- and in particular being able to produce machine-executable / human-readable text files that serialize data acquired somewhere else -- is really important.&lt;br /&gt;&lt;br /&gt;Therein we find one of the problems with Groovy (as opposed to some other functional and dynamic languages).  There is no simple and consistent way to write data to a file that can later be passed to &lt;span style="font-family: courier new;"&gt;GroovyShell&lt;/span&gt; or &lt;span style="font-family: courier new;"&gt;GroovyClassLoader&lt;/span&gt; for parsing, to &lt;span style="font-family: courier new;"&gt;GroovyScriptEngine&lt;/span&gt;, or to &lt;span style="font-family: courier new;"&gt;Eval.me&lt;/span&gt;(...) 
and return an equivalent structure:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Simple map-to-string semantics use double quotes for Strings without escaping dollar sign characters ($), which then get caught up in GString evaluation.&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: courier new;"&gt;ConfigObject&lt;/span&gt; and &lt;span style="font-family: courier new;"&gt;ConfigSlurper&lt;/span&gt; come very close but&lt;br /&gt;&lt;br /&gt;a) have different syntax&lt;br /&gt;b) do not handle &lt;span style="font-family: courier new;"&gt;Date&lt;/span&gt; and some other objects correctly&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;In some cases you can fix problems with serialization of complex types with overrides on &lt;span style="font-family: courier new;"&gt;toString&lt;/span&gt; in the &lt;span style="font-family: courier new;"&gt;metaClass&lt;/span&gt;, but some classes do not allow overrides on &lt;span style="font-family: courier new;"&gt;toString&lt;/span&gt; (e.g. an override such as &lt;span style="font-family: courier new;"&gt;Date.metaClass.toString={-&gt;return new Date(...) }&lt;/span&gt; doesn't really work either).&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;So, even though there is probably some class deep in the API that does this, I dug up some old code that wrote out data structures as Groovy code.  Looking at this as a first-class service for DSL development, I would propose a set of requirements rather than my sample code.   
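&lt;br /&gt;&lt;br /&gt;To make the dollar-sign pitfall from the first bullet concrete, here is a minimal sketch that uses only &lt;span style="font-family: courier new;"&gt;Eval.me&lt;/span&gt;; the map contents and variable names are invented for illustration:&lt;br /&gt;&lt;pre class="brush:groovy"&gt;// Double quotes: Groovy parses "cost in $usd" as a GString and looks up a variable named usd.&lt;br /&gt;def doubleQuoted = '["label" : "cost in $usd"]'&lt;br /&gt;try {&lt;br /&gt;    Eval.me(doubleQuoted)&lt;br /&gt;    assert false : 'expected evaluation to fail'&lt;br /&gt;} catch (MissingPropertyException e) {&lt;br /&gt;    println "double quotes failed: ${e.message}"&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;// Single quotes: the text is a plain String, so the value survives the round trip.&lt;br /&gt;def singleQuoted = "['label' : 'cost in \$usd']"&lt;br /&gt;assert Eval.me(singleQuoted) == [label : 'cost in $usd']&lt;br /&gt;&lt;/pre&gt;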
The requirements would be:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Be able to handle primitives, basic collections, strings, booleans, and Dates&lt;/li&gt;&lt;li&gt;Get the time-zone handling correct on the Date implementation&lt;/li&gt;&lt;li&gt;The serialization tool should generate human-readable / machine-executable code (Groovy syntax correctness is a given)&lt;/li&gt;&lt;li&gt;Eval.me(script) should, for all practical purposes, be the inverse function for &lt;span style="font-family: courier new;"&gt;writeObj(obj) -&gt; String&lt;/span&gt;.  Strict Java equality won't apply for collections, but for the sake of these discussions we will treat two collections of the same type with functionally equal values as equal collections&lt;/li&gt;&lt;li&gt;Complex types should be able to handle their own serialization / de-serialization if they choose to.&lt;/li&gt;&lt;/ul&gt;So, this is what I coded up for a first cut.  If anyone knows where this functionality already exists in the GDK, please let me know.  
Any enhancements or functional improvements would be welcome also.&lt;br /&gt;&lt;br /&gt;&lt;pre class="brush:groovy"&gt; &lt;br /&gt;class GroovyObjectNotation {&lt;br /&gt;    // Pattern shared by the writer and by the generated Date.parse(...) calls&lt;br /&gt;    private String dateFormatText = "yyyyMMdd-HH:mm:ss.SSSSS Z"&lt;br /&gt;    // Fully qualified because java.text is not among Groovy's default imports&lt;br /&gt;    private java.text.SimpleDateFormat dtFrmt = new java.text.SimpleDateFormat(dateFormatText)&lt;br /&gt;&lt;br /&gt;    public writeList(list) { &lt;br /&gt;      return "[ " + list.collect{ t -&gt; writeObj(t) }.join(", ") + " ]" &lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public writeSet(Set set) {  &lt;br /&gt;      return ("new LinkedHashSet( ${writeList(set)} )").toString(); &lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public writeMap( Map map) { &lt;br /&gt;      // An empty map needs its own form: new LinkedHashMap( [] ) would pass a List&lt;br /&gt;      if (!map) return 'new LinkedHashMap()'&lt;br /&gt;      return ('new LinkedHashMap( [' + &lt;br /&gt;         map.collect { k, v -&gt; &lt;br /&gt;            "${writeObj(k)} : ${writeObj(v)}" &lt;br /&gt;         }.join(", ") + '])').toString() &lt;br /&gt;     }&lt;br /&gt;&lt;br /&gt;    // Append Groovy's numeric type suffixes so each value is re-read as the same type&lt;br /&gt;    public writeNumber(Number obj) {&lt;br /&gt;        if (obj instanceof Double) return "${obj}d";&lt;br /&gt;        if (obj instanceof Float) return "${obj}f";&lt;br /&gt;        if (obj instanceof Integer) return "${obj}i";&lt;br /&gt;        if (obj instanceof Long) return "${obj}L";&lt;br /&gt;        if (obj instanceof BigInteger) return "${obj}g";&lt;br /&gt;        if (obj instanceof BigDecimal) return "${obj}g";&lt;br /&gt;&lt;br /&gt;        if (obj instanceof Short) return "${obj} as short";&lt;br /&gt;        if (obj instanceof Byte) return "${obj} as byte";&lt;br /&gt;        return obj;&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // The java.sql subclasses are checked before java.util.Date, which they extend&lt;br /&gt;    public String writeDate( obj ) {&lt;br /&gt;        if (obj instanceof java.sql.Timestamp) &lt;br /&gt;          return 'new java.sql.Timestamp(Date.parse(\'' + dateFormatText +&lt;br /&gt;              '\',\''+dtFrmt.format(obj)+'\').getTime())'&lt;br /&gt;        else if (obj instanceof java.sql.Time) &lt;br /&gt;          return 'new java.sql.Time(Date.parse(\'' + dateFormatText + &lt;br 
/&gt;              '\',\''+dtFrmt.format(obj)+'\').getTime())'&lt;br /&gt;        else if (obj instanceof java.sql.Date) &lt;br /&gt;          return 'new java.sql.Date(Date.parse(\'' + dateFormatText + &lt;br /&gt;              '\',\''+dtFrmt.format(obj)+'\').getTime())'&lt;br /&gt;        else if (obj instanceof java.util.Date) &lt;br /&gt;          return 'Date.parse(\'' + dateFormatText + &lt;br /&gt;               '\',\''+dtFrmt.format(obj)+'\')'&lt;br /&gt;        else if (obj instanceof java.util.Calendar) &lt;br /&gt;          // SimpleDateFormat cannot format a Calendar directly, so format its Date;&lt;br /&gt;          // the generated code converts back with the GDK method Date.toCalendar()&lt;br /&gt;          return 'Date.parse(\'' + dateFormatText +&lt;br /&gt;               '\',\''+dtFrmt.format(obj.time)+'\').toCalendar()'&lt;br /&gt;        else return obj.toString()&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public writeObj(obj) {&lt;br /&gt;        if (obj == null) return 'null'&lt;br /&gt;        // Default hook: classes that define their own toGroovyCode() control their serialization&lt;br /&gt;        Object.metaClass.toGroovyCode={-&gt; delegate.toString() }&lt;br /&gt;        if (obj instanceof Date || obj instanceof Calendar) return writeDate(obj)&lt;br /&gt;        else if (obj instanceof Number) return writeNumber(obj)&lt;br /&gt;        else if (obj instanceof Set) return writeSet(obj)&lt;br /&gt;        else if (obj instanceof List) return writeList(obj)&lt;br /&gt;        else if (obj instanceof Map) return writeMap(obj)&lt;br /&gt;        else if (obj instanceof Boolean) return obj ? 
true : false&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;        def retVal = obj.toGroovyCode() == obj.toString() ?&lt;br /&gt;            "'" + obj.toString().replaceAll('([^\\\\])\'','$1\\\\\'') + "'" :&lt;br /&gt;            obj.toGroovyCode()&lt;br /&gt;&lt;br /&gt;        return retVal&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6748907194223514435-684594777250842131?l=nycfintech.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nycfintech.blogspot.com/feeds/684594777250842131/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://nycfintech.blogspot.com/2010/09/groovy-data-interchange-format.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default/684594777250842131'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default/684594777250842131'/><link rel='alternate' type='text/html' href='http://nycfintech.blogspot.com/2010/09/groovy-data-interchange-format.html' title='Groovy Data Interchange Format'/><author><name>Jonathan</name><uri>http://www.blogger.com/profile/13596731520846039248</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://4.bp.blogspot.com/_J9I66wTIjAE/SYEpkhd-mtI/AAAAAAAAAAM/8m2xMBPpmbY/S220/jonathan2.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6748907194223514435.post-3276476720095220252</id><published>2009-05-31T17:58:00.000-04:00</published><updated>2009-06-01T15:08:39.974-04:00</updated><title type='text'>Groovy Math II: The Monte Carlo Engine</title><content type='html'>Code update can be downloaded &lt;a 
href="http://jonathanfelch.github.com/Groovy-Numerics/"&gt;here&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In the last post, I introduced a Groovy numeric collections class that supported basic algebraic operators and would execute those operations in parallel.  It is just one class (plus a Java helper class that maps the reduced data into tasks).  Why am I excited?&lt;br /&gt;&lt;br /&gt;It could be, I envisioned, a powerful component for a Monte Carlo engine.   This would have a certain simplicity for researchers and strategists who wanted a platform around which they could articulate financial math.  It would allow for arbitrary path generation and for stochastic processes that might or might not conform to the log-normal distribution that dominates much of financial mathematics.  It would be able to price not just one option but an entire options chain for a given expiry.  It would also be able to provide an arbitrary convergence measure.&lt;br /&gt;&lt;br /&gt;So today I uploaded to GitHub a Monte Carlo pricing engine along with a number of financial business logic classes for option payouts, path generation, yield curves, and so on.  The pricing engine takes the stochastic variable generators, the asset path generator, and the convergence test as closures, so the financial professional can use simple closures to do the job.  The contract:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Stochastic Process:&lt;/span&gt;  The implicit contract is that the random number generator provides a NumericGrid of random numbers large enough to price the options correctly.  Monte Carlo simulations are good for high-dimension problems, and it is easy to envision a requirement that the random number generator produce several grids (Asset Price Movement, Stochastic Volatility Values, Stochastic Interest Rates, Poisson Processes, etc.).  In this model, one of two contracts must apply.  
Either the random number generator must generate all of the variables required for the algorithm or the algorithm must produce the additional stochastic terms.  Because both processes are closures, the author can pretty much do what they wish.  The only requirement is that the closure for the path generator be able to accept the output from the random number generator.  Finally, because the grid is by definition a complete distribution, there is no requirement that it conform to a named mathematical distribution, so the author could add additional kurtosis or extreme event values to their heart's content.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Path Generation:&lt;/span&gt; This must accept the output of the random number generator and return a NumericGrid of paths.  This will feed the options chain.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Options Collection:&lt;/span&gt; Must implement a method called payout that accepts path input.  It accepts one or more paths as inputs and returns an object of the same class as the path input.&lt;/li&gt;&lt;/ol&gt;One of the virtues of setting up the core components as closures is that they can be overridden to accommodate special circumstances.  The default implementation of the convergence test:&lt;br /&gt;&lt;pre class="brush:groovy"&gt;  &lt;br /&gt;def convergenceTest = { size, standardDeviation -&gt;&lt;br /&gt;  standardDeviation / Math.sqrt(size)&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;This default applies the Central Limit Theorem to estimate the price accuracy of the model as a function of the population size and the standard deviation estimate for the sample.  This has the unfortunate but inescapable effect of requiring the engine to increase the number of paths by a factor of 100 to realize an improvement in pricing accuracy by a factor of ten.  
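&lt;br /&gt;&lt;br /&gt;As a quick arithmetic check of that scaling, here is a small sketch that reuses the default closure with an assumed sample standard deviation of 2.0 (an illustrative value, not a result from the engine):&lt;br /&gt;&lt;pre class="brush:groovy"&gt;def convergenceTest = { size, standardDeviation -&gt;&lt;br /&gt;  standardDeviation / Math.sqrt(size)&lt;br /&gt;}&lt;br /&gt;def sigma = 2.0d                              // assumed value, for illustration only&lt;br /&gt;def few  = convergenceTest(10000, sigma)      // 2.0 / 100&lt;br /&gt;def many = convergenceTest(1000000, sigma)    // 2.0 / 1000&lt;br /&gt;// 100 times as many paths shrinks the error estimate by a factor of ten&lt;br /&gt;assert Math.round(few / many) == 10&lt;br /&gt;&lt;/pre&gt;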
This is an appropriate assumption for pseudo-random, normally distributed paths.&lt;br /&gt;&lt;br /&gt;However, the NumericGrid class has a handy method that produces a low-discrepancy sequence appropriate for quasi-random variables and, in this case, guaranteed to be a complete discretization of the distribution over a space of that size.  This will allow the engine to converge more quickly, so that our initialization code can do the following:&lt;br /&gt;&lt;pre class="brush:groovy"&gt;&lt;br /&gt;MonteCarloEngine engine = new MonteCarloEngine()&lt;br /&gt;engine.convergenceTest = { size, standardDeviation -&gt;&lt;br /&gt;  standardDeviation / size&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;I will explore some more of the class library later on, but a simple view into the Monte Carlo engine would be the following script:&lt;br /&gt;&lt;pre name="code" class="brush:groovy"&gt;&lt;br /&gt; MonteCarloEngine engine = new MonteCarloEngine()&lt;br /&gt; engine.convergenceTest = { size, standardDeviation -&gt;&lt;br /&gt;    standardDeviation / size&lt;br /&gt; }&lt;br /&gt; engine.timeHorizen = 1.0&lt;br /&gt;&lt;br /&gt; def strikes = [80, 90, 100, 110, 120]&lt;br /&gt; strikes.each { strike -&gt;&lt;br /&gt;   engine.options &lt;&lt; new SimpleOption( strike : strike, &lt;br /&gt;      payoutType : PayoutType.Put, name : "${strike} Strike 1 Yr Put" )&lt;br /&gt;   engine.options &lt;&lt; new SimpleOption( strike : strike, &lt;br /&gt;      payoutType : PayoutType.Call, name : "${strike} Strike 1 Yr Call" )&lt;br /&gt; }&lt;br /&gt;&lt;br /&gt; def brownianMotion = new LogNormalPathHelper(spot : 100, &lt;br /&gt;      vol : 0.15, rate : 0.05, time : 1.0)&lt;br /&gt; def closure = { random -&gt;&lt;br /&gt;   brownianMotion.generatePaths(random)&lt;br /&gt; }&lt;br /&gt; engine.pathGenerator = closure&lt;br /&gt;&lt;br /&gt; def options = engine.runSimulation()&lt;br /&gt; options.each { option, price -&gt;&lt;br /&gt;   println "Contract: ${option.name} was priced 
@ ${price} over ${engine.size} iterations"&lt;br /&gt; }&lt;br /&gt;&lt;br /&gt; ParallelMathHelper.shutdownPool()&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6748907194223514435-3276476720095220252?l=nycfintech.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nycfintech.blogspot.com/feeds/3276476720095220252/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://nycfintech.blogspot.com/2009/05/groovy-math-ii-monte-carlo-engine.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default/3276476720095220252'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default/3276476720095220252'/><link rel='alternate' type='text/html' href='http://nycfintech.blogspot.com/2009/05/groovy-math-ii-monte-carlo-engine.html' title='Groovy Math II: The Monte Carlo Engine'/><author><name>Jonathan</name><uri>http://www.blogger.com/profile/13596731520846039248</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://4.bp.blogspot.com/_J9I66wTIjAE/SYEpkhd-mtI/AAAAAAAAAAM/8m2xMBPpmbY/S220/jonathan2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6748907194223514435.post-3083368915619385411</id><published>2009-05-29T10:22:00.000-04:00</published><updated>2010-09-18T16:08:21.543-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DSL'/><category scheme='http://www.blogger.com/atom/ns#' term='Groovy'/><category scheme='http://www.blogger.com/atom/ns#' term='Quant'/><title type='text'>Groovy Math I:  Parallel Calculation</title><content type='html'>&lt;span style="font-size: 180%;"&gt;Functional Programming and 
Numeric Computing&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;(Project page at GitHub is &lt;a href="http://jonathanfelch.github.com/Groovy-Numerics/" style="color: black;"&gt;Groovy Numerics Repository&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;What if we could write Monte Carlo simulations like this?&lt;/span&gt;&lt;br /&gt;&lt;table style="height: 114px; width: 682px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td width="20"&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;pre class="brush:groovy"&gt;def normalRandom = ... // get stochastic terms from somewhere&lt;br /&gt;def operation = { random -&amp;gt;&lt;br /&gt;return 100.0 * Math.exp((0.05 - 0.15 * 0.15 / 2.0) + 0.15 * random)&lt;br /&gt;}&lt;br /&gt;def paths = operation(normalRandom)&lt;br /&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;This entry will be the first in a series of articles following up on my presentation at the GR8 Conference on Groovy, Grails, and Griffon.  In the first set, I hope to discuss high-performance mathematical operations, beginning with Numeric Collections, continuing with a linear algebra library / DSL, and finishing with root solvers.  In the second set, I will discuss distributed calculation in a grid computing environment, and in the final set I will explore issues related to distributed caching.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;The ultimate destination for the journey will be two assets that work in partnership.  Together they will be called DASEL, a DSL for financial engineering.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;The first asset will be a number of small building blocks for our DSL.  
The main elements of this language will be presented in the next few entries.&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="color: black;"&gt;The second asset will be a distributed cache that could accommodate a comprehensive model for the global financial eco-system.  It would allow dynamic enhancement of the first-class objects and data model.  It would allow dynamic enhancement of the mathematical models.  It would allow the developer / analyst / trader to find first-class objects through simple queries and traverse the data with simple tree-like logic.  It would be able to populate itself and recover lost reference and time series data.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;In this entry, I will explore the simplest use case:  A simple mathematical expression that gets evaluated many times with some (or all) of the inputs changing with each evaluation.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;This kind of use case might not strike most people as a likely candidate for Groovy -- or any other scripting language for that matter.  If performance is key, shouldn't this sort of thing be coded in C++ or at least Java?  On this question, I will concede the main point for the most performance-hungry applications in the financial universe:  automated market makers and high-frequency arbitrage.  These applications often require sub-millisecond responses and co-location at the exchanges, and their profitability is driven as much by the fill ratio as it is by identifying the correct trading signals.   However, if we relax the requirements just slightly, we find this dynamic language (running on top of a second interpreter, the Java VM) outperforms naive Java and C++ implementations even on a single dual-core CPU.   
Even against a more rigorous benchmark, I hope to show both that Java, and indeed a Groovy-based DSL for dynamic expression evaluation, can come close to the high-performance benchmarks set in those languages, and that high-performance dynamic programming is possible and -- with a little help -- straightforward.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;If this aspiration is achievable, then the benefits of using a dynamic language for idea generation, analysis, and research become a no-brainer.  Researchers can explore an idea using purely mathematical notation, the expression will enter the eco-system as a closure, and the back-end components will worry about the performance side of this -- presumably through a combination of optimizations, intelligent dependency management, and parallelism.&lt;/span&gt;  Complex multivariate time series analysis and problems of dimensionality reduction for multi-factor models become a problem domain that does not require weeks of C++ coding.  Rather, the scripting platform becomes the white board of half-baked ideas that are quickly sent to the trash can or qualified for more rigorous study.&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;The first artifact presented will be a numeric collections class.  It is similar in metaphor and structure to a matrix class.  Internally it is a two dimensional array of values.  The key difference will be that it will follow the rules used for conventional algebra rather than linear algebra, but will do these operations in bulk rather than serially.  
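&lt;br /&gt;&lt;br /&gt;As a rough illustration of "conventional algebra in bulk", here is a minimal sketch built around a hypothetical TinyGrid class.  TinyGrid is an assumption made purely for illustration and is not the numeric collections class from the repository; it only shows how Groovy's multiply operator can be overloaded so that grid * scalar scales every cell and grid * grid multiplies matching cells instead of performing a matrix product.  It runs serially and leaves out the thread pool entirely.&lt;br /&gt;&lt;pre class="brush:groovy"&gt;// TinyGrid is a hypothetical stand-in for illustration, not the repository's class&lt;br /&gt;class TinyGrid {&lt;br /&gt;    List rows   // a list of rows, each row a list of numbers&lt;br /&gt;&lt;br /&gt;    // grid * scalar: scale every cell&lt;br /&gt;    TinyGrid multiply(Number k) {&lt;br /&gt;        new TinyGrid(rows: rows.collect { row -&amp;gt; row.collect { it * k } })&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // grid * grid: cell (i,j) of this grid times cell (i,j) of the other grid&lt;br /&gt;    TinyGrid multiply(TinyGrid other) {&lt;br /&gt;        def paired = [rows, other.rows].transpose()   // pairs of matching rows&lt;br /&gt;        new TinyGrid(rows: paired.collect { pair -&amp;gt;&lt;br /&gt;            pair.transpose().collect { it[0] * it[1] }&lt;br /&gt;        })&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;def a = new TinyGrid(rows: [[1, 2], [3, 4]])&lt;br /&gt;def b = new TinyGrid(rows: [[5, 6], [7, 8]])&lt;br /&gt;println((a * b).rows)   // [[5, 12], [21, 32]], not the matrix product&lt;br /&gt;println((a * 2).rows)   // [[2, 4], [6, 8]]&lt;br /&gt;&lt;/pre&gt;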
For example, consider the log normal random walk model commonly used for the stock market:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="color: black; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_J9I66wTIjAE/SiAol9eeeEI/AAAAAAAAABo/lMz7aD5w4F8/s1600-h/bs-stock-price-model.png"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5341313790684067906" src="http://4.bp.blogspot.com/_J9I66wTIjAE/SiAol9eeeEI/AAAAAAAAABo/lMz7aD5w4F8/s320/bs-stock-price-model.png" style="cursor: pointer; display: block; height: 18px; margin: 0px auto 10px; text-align: center; width: 194px;" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span style="color: black;"&gt;We can use the Euler method to create an analytical solution for S(t) as follows:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="color: black; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_J9I66wTIjAE/SiAtbCSSAtI/AAAAAAAAABw/Q4R3qWWfbaU/s1600-h/euler.png"&gt;&lt;img alt="" border="0" id="BLOGGER_PHOTO_ID_5341319100554674898" src="http://1.bp.blogspot.com/_J9I66wTIjAE/SiAtbCSSAtI/AAAAAAAAABw/Q4R3qWWfbaU/s320/euler.png" style="cursor: pointer; display: block; height: 51px; margin: 0px auto 10px; text-align: center; width: 290px;" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span style="color: black;"&gt;Now the question revolves around where we get our random numbers, how we iterate through them, and how many iterations we need to converge on a reasonably good price.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;This is where Groovy kicks in.    Traditionally this is done by looping through a simple expression whereby each iteration is seeded with a different random number.  
Our Groovy class will treat the entire collection as a number, evaluate each term in the expression following the normal rules of operator precedence, and return a new collection (of the same size and shape as the original) that has the final result of the expression using the initial value as the seed.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;For those used to thinking about these kinds of expressions using linear algebra (which will be the subject of the next post), this will seem both familiar and counterintuitive. The operations that combine a scalar and a collection will be the same as the matrix arithmetic they are familiar with.  The collections multiplication -- I use the term grid to refer to a 2 dimensional collections class that supports conventional algebra and matrix for a collection that implements linear algebra -- is the difference.  In the grid class, the resulting collection is the same size as the original grid.  The resulting component (i,j) is the product of the (i,j) component in grid A and the (i,j) component in grid B.  This is accomplished using Groovy's operator overloading capabilities.  Under the hood, since each row vector can be processed independently, a thread pool will execute the operation over each row vector as a discrete task and a helper class written in Java will create the Callable for each row vector.  This will have several advantages over the traditional 'for loop.'&lt;/span&gt;&lt;br /&gt;&lt;ol style="color: black;"&gt;&lt;li&gt;If the expression combines scalar values and collections, the scalar portions will only be calculated once and the result will be applied to the collection.  Normal operator precedence and a little bit of smart grouping will replace the task of decomposing the expression programmatically or building out a dependency graph.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The JIT and CPU Cache are good at optimizing similar operations executed in sequence.  Asking Mr. 
CPU to calculate 2 ^ x over 1,000,000 iterations might result in loading 2 into register A and x into register B and then executing the power function.  If we can keep 2 in A and load 1,000,000 values for x into the cache, the result will come faster.  If we are talking about 2 ^ x + y, we are still better off calculating 2 ^ x over 1,000,000 inputs and then adding 1,000,000 y-s to the 1,000,000 results than doing 1,000,000 separate evaluations of 2 ^ x + y.&lt;/li&gt;&lt;li&gt;The expression can be entered as a closure and the author does not need to know what the strongly typed characteristics of the variables will be so long as they support the familiar operations.&lt;/li&gt;&lt;li&gt;The operations are done in parallel by default without additional code.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="color: black;"&gt;So the first step of our Math DSL has some obvious requirements:&lt;/span&gt;&lt;br /&gt;&lt;ol style="color: black;"&gt;&lt;li&gt;It should store a collection of values.&lt;/li&gt;&lt;li&gt;It will need to implement plus, minus, multiply, divide, power, etc. for use cases involving both scalar values and other (same size, same shape) collections.&lt;/li&gt;&lt;li&gt;It will need to enhance the Number.metaClass to allow scalar values to support operations with Numeric Collections as operands.&lt;/li&gt;&lt;li&gt;It would be nice to provide collection-supporting overrides for max, min, exp, power, and so on from the static Math.* methods.&lt;/li&gt;&lt;li&gt;It will need a helper class that manages a shared thread pool for computational tasks.&lt;/li&gt;&lt;li&gt;The helper class will provide simple routines to reduce the two dimensional array of values to a collection of Callable tasks that can be submitted to the thread pool.&lt;/li&gt;&lt;li&gt;The collections class will support additional helper methods (statistical methods, factory methods, etc.)  
to the extent they support a clear API.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;Now, let's return to the Monte Carlo simulation.  Using the Euler method described above, we should be able to code something like:&lt;/span&gt;&lt;br /&gt;&lt;table style="height: 240px; width: 696px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td width="20"&gt;&lt;br /&gt;&lt;/td&gt;&lt;td style="font-family: courier new;"&gt;&lt;pre class="brush:groovy"&gt;import com.dasel.math.NumericGrid&lt;br /&gt;&lt;br /&gt;// 1000 x 1000 grid of quasi-random Gaussian draws&lt;br /&gt;def gaussian = NumericGrid.createQuasiGaussian(1000, 1000)&lt;br /&gt;&lt;br /&gt;def eulerMethod = { price, time, rate, vol, rand -&amp;gt;&lt;br /&gt;def rets = (rate - 0.5 * vol * vol) * time + Math.sqrt(time) * vol * rand&lt;br /&gt;price * Math.exp(rets)&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;// bind price, time, rate, and vol; only rand remains free&lt;br /&gt;def model = eulerMethod.curry(100.0, 1.0, 0.05, 0.15)&lt;br /&gt;def paths = model(gaussian)&lt;br /&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;span style="color: black;"&gt;Here, with a few lines of code, we have evaluated the "model" expression 1,000,000 times.  &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black; font-size: 180%;"&gt;Benchmarks:  Naive Java versus Smart Groovy&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black; font-size: 100%;"&gt;It is always unfair to present a naive implementation of a straw man design and a sophisticated implementation of your 'favored' solution.  The reason I will defend this sleight of hand is that performance is not my end goal.  Performance is a necessary condition, but flexibility and speed of development are the key drivers.  
A sophisticated C++ or Java developer would be able to beat my benchmark, maybe by a factor of 2 in Java&lt;/span&gt;&lt;span style="color: black;"&gt;, maybe a factor of 3 or more in C++ (I think that this would take an exceptional developer in C++ and they would need to work much harder than I have to accomplish it, but take this as a challenge and prove me wrong).  Gauntlet thrown, let's see how I did on my Vista laptop with 1 dual core CPU.  &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: black;"&gt;On the Java side, I went for simplicity.  No generalized support to evaluate an arbitrary expression.  Single threaded.  Constants over variables.  The Java solution did not store the results anywhere; the values were simply discarded.  The only output was the time required to execute the loop.   Also, the drift component for the Euler model is explicitly calculated before the loop begins, reducing the number of operations roughly by half.  The benchmark code went as follows:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;table style="width: 710px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td width="20"&gt;&lt;br /&gt;&lt;/td&gt;&lt;td style="font-family: courier new;"&gt;&lt;pre class="brush:groovy"&gt;public static Double test(List&amp;lt;Double&amp;gt; values) {&lt;br /&gt;long start = System.currentTimeMillis();&lt;br /&gt;// drift is constant, so it is computed once outside the loop&lt;br /&gt;double drift = 0.05 - 0.5 * 0.15 * 0.15;&lt;br /&gt;for (Double value : values) {&lt;br /&gt;double diffusion = 0.15 * value;&lt;br /&gt;// the result is deliberately discarded&lt;br /&gt;double result = 100.0 * Math.exp(drift + diffusion);&lt;br /&gt;}&lt;br /&gt;double time = (System.currentTimeMillis() - start) / 1000.0;&lt;br /&gt;System.out.println("Time = " + time + " seconds");&lt;br /&gt;return time;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;public static void main(String[] args) {&lt;br /&gt;List&amp;lt;Double&amp;gt; values = createrDistribution(1000000);&lt;br /&gt;long start = System.currentTimeMillis();&lt;br /&gt;List&amp;lt;Double&amp;gt; benchmarks = new ArrayList&amp;lt;Double&amp;gt;();&lt;br /&gt;for (int i = 0; i &amp;lt; 100; i++) {  &lt;br 
/&gt;benchmarks.add(test(values)); &lt;br /&gt;} &lt;br /&gt;double avg = (System.currentTimeMillis() - start) / 1000.0 / 100.0; &lt;br /&gt;double min = Collections.min(benchmarks); &lt;br /&gt;double max = Collections.max(benchmarks); &lt;br /&gt;System.out.println("Min: " + min + " Max: " + max + " Avg: " + avg); &lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;span style="color: black;"&gt;So, in the end, the Groovy version simulated an arbitrary stock market issue over 1,000,000 different one-year paths with constant volatility and a constant interest rate, 100 times.  The first execution was dramatically slower than the subsequent runs, taking roughly 1.1 - 1.3 seconds to complete.  The average execution time, however, quickly converged to roughly 110-130 milliseconds with a best performance in the 100-120 milliseconds range for a sample size of 1,000,000 draws. &lt;/span&gt;  &lt;span style="color: black;"&gt;The Java sample had a best performance of 150-160 milliseconds and an average that converged very quickly to roughly 180-190 milliseconds.  It, too, showed additional sluggishness in the first iteration, although it was far less significant than in the early Groovy iterations and faded more quickly.  &lt;/span&gt;  &lt;span style="color: black;"&gt;I doubt very much that a dual core CPU would ever realize a performance improvement of a factor of two (linear scaling) simply by adding an additional thread, and I cheated a little bit in Java's favor by discarding the results and pre-calculating the drift component and thus reducing the total number of operations.  
All in all, I think that this shows two things:&lt;/span&gt;  &lt;br /&gt;&lt;ol style="color: black;"&gt;&lt;li&gt;Performance in numerical computing between Java and Groovy can be a wash.&lt;/li&gt;&lt;li&gt;There are some aspects of the operator overloading capabilities that, when combined with a collections class, lend themselves to serious performance gains from the JIT optimizers and multi-core CPUs.  This obviously could be done with an API (perhaps an internal DSL) for a collections library in Java, but the appeal of this solution was that the researcher could simply write a closure and the DSL would take care of the rest.  Since C++ also supports operator overloading, the CPU and L2 Cache optimizations could also be exploited easily in that language, although most of the published work in that space still involves iterating through the contents of a collection and evaluating the entire expression repeatedly, rather than allowing an expression to combine scalar values with numeric collections.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="color: black;"&gt;The code is available at GitHub under the &lt;/span&gt;&lt;a href="http://jonathanfelch.github.com/Groovy-Numerics/" style="color: black;"&gt;Groovy Numerics Repository&lt;/a&gt;  &lt;span style="color: black;"&gt;    &lt;/span&gt;&lt;a href="http://github.com/JonathanFelch/Groovy-Numerics/zipball/master" style="color: black;"&gt;ZIP&lt;/a&gt;&lt;span style="color: black;"&gt;: &lt;/span&gt;&lt;a href="http://github.com/JonathanFelch/Groovy-Numerics/tarball/master" style="color: black;"&gt;TAR&lt;/a&gt;&lt;span style="color: black;"&gt;:&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6748907194223514435-3083368915619385411?l=nycfintech.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nycfintech.blogspot.com/feeds/3083368915619385411/comments/default' title='Post 
Comments'/><link rel='replies' type='text/html' href='http://nycfintech.blogspot.com/2009/05/groovy-math-i-parallel-calculation.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default/3083368915619385411'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default/3083368915619385411'/><link rel='alternate' type='text/html' href='http://nycfintech.blogspot.com/2009/05/groovy-math-i-parallel-calculation.html' title='Groovy Math I:  Parallel Calculation'/><author><name>Jonathan</name><uri>http://www.blogger.com/profile/13596731520846039248</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://4.bp.blogspot.com/_J9I66wTIjAE/SYEpkhd-mtI/AAAAAAAAAAM/8m2xMBPpmbY/S220/jonathan2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_J9I66wTIjAE/SiAol9eeeEI/AAAAAAAAABo/lMz7aD5w4F8/s72-c/bs-stock-price-model.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6748907194223514435.post-788993538157127979</id><published>2009-05-27T16:01:00.000-04:00</published><updated>2009-06-01T07:22:27.123-04:00</updated><title type='text'>Groovy Finance</title><content type='html'>Presentation from GR8 Conference in Copenhagen, Denmark:&lt;br /&gt;&lt;br /&gt;Overview of Performance and Scalability Issues for Groovy-based DSL technology for Quantitative Finance Applications:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://docs.google.com/gb?export=download&amp;amp;id=F.2a24df09-6c97-4b9c-b35c-66cf8fc68ee4"&gt;Presentation at Google Docs&lt;/a&gt;&lt;br /&gt;&lt;a href="http://www.slideshare.net/jonathan.felch/groovy-finance?type=powerpoint"&gt;&lt;br /&gt;Groovy Finance at SlideShare&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Flattering summary of the talk here: &lt;a 
href="http://gettinggroovy.wordpress.com/2009/05/20/the-gr8-conference-grid-computing-and-computational-finance/"&gt;Getting Groovy Review&lt;/a&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6748907194223514435-788993538157127979?l=nycfintech.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://nycfintech.blogspot.com/feeds/788993538157127979/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://nycfintech.blogspot.com/2009/05/groovy-finance.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default/788993538157127979'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6748907194223514435/posts/default/788993538157127979'/><link rel='alternate' type='text/html' href='http://nycfintech.blogspot.com/2009/05/groovy-finance.html' title='Groovy Finance'/><author><name>Jonathan</name><uri>http://www.blogger.com/profile/13596731520846039248</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://4.bp.blogspot.com/_J9I66wTIjAE/SYEpkhd-mtI/AAAAAAAAAAM/8m2xMBPpmbY/S220/jonathan2.jpg'/></author><thr:total>2</thr:total></entry></feed>
