Mike Schaeffer's Blog

Articles with tag: lisp
January 3, 2006

Despite outward appearances, there is (really!) another release of vCalc in the works. I'm not going to be so silly as to speak to a timeline (probably 2006Q2), but here's a brief list of planned features for the next release or two:

  • Constant Library - A library of a few hundred constants.
  • Interface improvements - The current UI is functional but rather plain, both in appearance and in the interactivity it supports. The next version of vCalc will dress up the UI a bit and start the process of making it more interactive.
  • Macro Recorder - To aid programming, there's a macro recorder that records sequences of commands as programs written in the user programming language (see User Programmability in this list).
  • New Data Types - There are more first-class data types, including complex numbers, lists, tagged numbers, and programs.
  • User Programmability - There's a user programming language including conditional branches, loops, and higher order functions. This language looks a lot like a lexically scoped variant of RPL, the language used by HP in its more modern calculators.
  • Better interoperability with other data sources - This means import/export of CSV, through both the clipboard and by file.
  • Financial Math - This is mainly planned to be Time Value of Money. There's actually an interface in vCalc 1.0 to support this functionality, but I released vCalc before getting it to work reliably and disabled the code that implements it. This is going to be an ongoing area for development.
  • Infix notation - There needs to be a way to enter an expression like 'sin(x)'. This is both a programmability feature and the core of things like symbolic algebra and calculus.
  • Graphics - Function plotting.

In a more general sense, there are a few other issues that are important, but have a slightly lower priority level. These are general issues that are too big to be 'fixed' in one release, but nonetheless are important areas for work. The first of these is performance and the second is openness.

Performance is the easier of the two issues to describe: I want vCalc to be usable to interactively perform simple (mean, max, min, linear regression, histogram, etc.) analysis of datasets with 100K-1,000K observations of 10-20 variables each. The worst case scenario means that vCalc needs to be able to manage an in-memory image around 500-600MB in size and be able to compute 20-30M floating point operations within 5-10 seconds. That's a stretch for vCalc, but I think it's doable within a year or two. Right now, development copies of vCalc can reasonably manage 100K observations of 50 variables each. The biggest weakness is the CSV file importer, which is glacially slow: it reads CSV files at around 30K/second. I'll speak to these issues in more detail later on, but the fix for this will be a staged rewrite of the Lisp engine and garbage collector at the core of vCalc.
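
As a rough check on those worst-case numbers, here is a back-of-envelope sketch in Python. It assumes each value is stored as an 8-byte double and takes the upper end of both ranges; the names are made up for illustration. The gap between the raw size it computes and the 500-600MB target would then be per-value overhead, which is an inference from the arithmetic and not something measured in vCalc.

```python
# Back-of-envelope sizing for the worst case: 1,000K observations of 20 variables.
observations = 1_000_000   # upper end of the 100K-1,000K range
variables = 20             # upper end of the 10-20 range
bytes_per_double = 8       # assumption: one IEEE 754 double per value

values = observations * variables
raw_mb = values * bytes_per_double / 1e6


def implied_bytes_per_value(image_mb: float) -> float:
    """Bytes per stored value implied by a given in-memory image size."""
    return image_mb * 1e6 / values


print(values)                        # 20000000
print(raw_mb)                        # 160.0
print(implied_bytes_per_value(500))  # 25.0
print(implied_bytes_per_value(600))  # 30.0
```

Twenty million values also lines up with the 20-30M floating point operations figure: that is roughly one operation per value for a single pass such as a mean or a max.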

The other issue that will have to be fixed over time is the issue of openness. One of the things I'd like this blog to be is a way to communicate with the audience of vCalc users. That means example code, demonstrations of how to use vCalc to solve specific problems, and descriptions of the guts of vCalc, at the very least. For that to be useful, there needs to be an audience, and for there to be an audience the vCalc bits need to be available for people to use and good enough for them to care about using them. There's a lot to be done between here and there, but I've come to believe that open sourcing parts of vCalc and releasing more frequent development builds of the closed source parts will end up being key. We'll see.

November 4, 2005

Michael Sperber has written an SRFI that documents "Octet-Addressed Binary Blocks". Basically these things are like BLOBs in SQL: blocks of memory, opaque to the data model of the language, that can be used to store arbitrary binary data. I can think of a bunch of applications for this:

  • An internal representation for compiled byte code functions.
  • A way to interoperate with C code that expects binary data formats (like the Win32 API, for example).
  • A way to represent binary data longer than a byte that's written to and read from binary ports.
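
To make the last application concrete, here is a minimal sketch in Python. A bytearray stands in for the binary block and the standard struct module does the multi-byte reads and writes. This is only an analogue of the idea and does not use the SRFI's own procedure names; the offsets and values are made up for illustration.

```python
import struct

# A bytearray stands in for an octet-addressed binary block (8 octets, zeroed).
block = bytearray(8)

# Store a 32-bit unsigned integer at octet offset 0 (big-endian) and a
# 16-bit unsigned integer at octet offset 4 (little-endian).
struct.pack_into(">I", block, 0, 0xDEADBEEF)
struct.pack_into("<H", block, 4, 0x1234)

# The individual octets remain addressable by index.
first_octets = list(block[:6])

# Read the multi-byte values back from the same offsets.
(word,) = struct.unpack_from(">I", block, 0)
(half,) = struct.unpack_from("<H", block, 4)

print(first_octets)          # [222, 173, 190, 239, 52, 18]
print(hex(word), hex(half))  # 0xdeadbeef 0x1234
```

The point is that the block itself knows nothing about integers or byte order; both are chosen by the code doing the access.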

September 7, 2005

I've found a couple interesting websites related to computer history. The first is Dusty Decks, a blog related to some efforts to reconstruct Lisp and FORTRAN history. A highlight of this is a discussion on the Birth of the FORTRAN subroutine. Also via Dusty Decks is a website on the early history of the Lisp Programming Language.

That leads me to a couple books I've been reading lately. The first is Lisp in Small Pieces, by Christian Queinnec. I'm only a couple chapters in (stuck on continuations right now), but it's already been pretty profound. So far, the most useful aspect of the book is that it walks through several core design choices Lisp implementors have to make (Lisp-1 vs. Lisp-2, lexical scope vs. dynamic scope, types of continuations to support) and goes into depth regarding the implications and history of each. I think I'm finally starting to understand more of the significance of funcall and function in Common Lisp, not to mention throw/catch and block/return-from.

Book two is The First Computers–History and Architectures, edited by Raul Rojas. This book is a collection of papers discussing the architecture of significant early computers from the late 30's and 40's. The thing that's so unique about the book is that it focuses on the architectural issues surrounding these machines: the kinds of hardware they were built with, how they processed information, and how they were programmed. Just as an example, it has a detailed description of many of ENIAC's functional units, even going into descriptions of how problems were set up on the machine. Another highlight of the book for me (so far) has been a description of Konrad Zuse's relay-based Z3, down to the level of a system architectural diagram, schematics of a few key circuits, and coverage of its microprogramming (!).

August 24, 2005

I literally dreamed about this last night. It would be wonderful if Excel supported formulas like this:

=LET(value=MATCH(item,range,0), IF(ISERROR(value), 0, value))

If you're into Lisp-y languages, it'd look like this:

(let ((value (match item range 0)))
  (if (is-error? value) 0 value))

The function call =LET(name=binding, expression) would create a local range name named name, bound (equal) to the value returned by binding, to be used during the evaluation of expression. In the example above, during the evaluation of IF(ISERROR(value), 0, value), value would be bound to the value returned by MATCH(item, range, 0).

It's worth pointing out that this is slightly different from how normal Excel range names work. Range names in Excel work through textual substitution. With textual substitution, the initial expression would be logically equivalent to this:

=IF(ISERROR(MATCH(item, range, 0)), 0, MATCH(item, range, 0))

In other words, Excel would treat every instance of value as if MATCH(item, range, 0) was explicitly spelled out. This means there are two calls to MATCH and two potential searches through the range. While it's possible that Excel optimizes the second search away, I'm not sure that anybody outside of Microsoft can know for sure how this is handled.
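
The difference is easy to see by counting calls. The sketch below is Python, not Excel: match is a hypothetical stand-in for MATCH(item, range, 0) that returns None where Excel would return an error, and the two functions model textual substitution and a LET-style binding. It says nothing about what Excel actually does internally.

```python
def make_match():
    """Return (match, counter). match is a hypothetical stand-in for
    MATCH(item, range, 0): 1-based position, or None in place of an error."""
    counter = {"calls": 0}

    def match(item, rng):
        counter["calls"] += 1
        return rng.index(item) + 1 if item in rng else None

    return match, counter


def with_substitution(match, item, rng):
    # Textual substitution: the MATCH call is spelled out twice.
    return 0 if match(item, rng) is None else match(item, rng)


def with_let(match, item, rng):
    # LET-style binding: MATCH is evaluated once and the result is reused.
    value = match(item, rng)
    return 0 if value is None else value


rng = ["a", "b", "c"]
m1, c1 = make_match()
m2, c2 = make_match()
print(with_substitution(m1, "c", rng), c1["calls"])  # 3 2
print(with_let(m2, "c", rng), c2["calls"])           # 3 1
```

When the item is found, the substituted form searches the range twice and the bound form searches it once, which is the whole argument for LET.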

Microsoft's current recommendation for handling the specific ISERROR scenario in the first expression is this VBA function:

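Function IfError(formula As Variant, show As String)

    On Error GoTo ErrorHandler

    ' Return the fallback text if the formula evaluated to an error,
    ' otherwise pass the formula's value through unchanged.
    If IsError(formula) Then
        IfError = show
    Else
        IfError = formula
    End If

    Exit Function

ErrorHandler:
    Resume Next

End Function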

This isn't bad, but it requires that spreadsheet authors and readers understand VBA. It also imposes significant performance costs: calling into VBA from a worksheet takes time.
