Mike Schaeffer's Blog

September 18, 2019

Git Stuff

Amazingly enough, git is now 14 years old. What started out as Linus Torvalds' 'three day' replacement for BitKeeper is now dominant enough in its domain that even the Windows kernel is hosted on git. (If you really are amazed by the age of git, that last bit might be even more amazing.) In any event, I also use git and have done so for close to ten years. Along with a compiler and an editor, I'd consider it one of the three essential development tools. That experience has left me with a set of preconceived notions about how git should be used, and some tips and tricks on how to use it better. I've been meaning to get it all into a single place for a while, and this is the attempt.

This isn't really the place to start learning git (that would be a tutorial). This is for people who have used git for a while, understand the basic mechanics, and want to look for ways to elevate their game and streamline their workflow.

The Underlying Data Model

git is built on a distinct data structure, and the implications of this structure permeate the user experience.

Understanding the underlying data model is important, and not that complicated from a computer science perspective.

  • Every revision of a source tree managed by git can be considered a complete snapshot of every source file. This is called a commit.
  • Every commit has a name (or address), which is a hash of the entire contents of the commit. These names are not user friendly (They look like d674bf514fc5e8301740534efa42a28ca4466afd), but they're essentially guaranteed to be unique.
  • If two commits have different contents, they also have different hashes. A hash is enough to completely identify a state of a source tree.
  • Because hashes are a pain to work with, git also has refs. Refs are user friendly symbolic names (master, fix-bug-branch) that can each point to a commit by hash.
  • Commits can't be mutated, because any change to their contents would change their name/hash. Refs are where git allows mutations to occur.
  • If you think of a ref as a variable that contains a hash and points to a commit, you're not far off.
  • Commits can themselves refer to other commits - each commit can contain references to zero or more predecessors. These backlinks are what allow git to construct a history of commits (and therefore a history of a source code tree).
  • The 'first commit' has zero predecessors, an ordinary commit has exactly one, and a merge commit has two or more.

The result of all this is that the core data structure is a directed acyclic graph, covered nicely in this post by Tommi Virtanen.
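To make that concrete, git's own plumbing commands let you see these objects directly. (This is just an illustration - the tree and parent hashes, author, and commit message below are made up, but the commands work in any repository.)

$ git rev-parse master            # a ref is just a friendly name for a hash
d674bf514fc5e8301740534efa42a28ca4466afd

$ git cat-file -p master          # a commit: a tree snapshot plus links to its predecessors
tree 2c8ac527f521704ebd5b963db08e0c28a6183fd2
parent 8177bdbad85a0067734b33a09a6d671925d89aa3
author Mike Schaeffer <mike@example.com> 1568833200 -0400
committer Mike Schaeffer <mike@example.com> 1568833200 -0400

An example commit message

$ cat .git/refs/heads/master      # the ref is literally a file containing that hash (unless it's been packed)
d674bf514fc5e8301740534efa42a28ca4466afd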

Tags: git, tech

January 24, 2019

Friend Authorization Checks and Compojure Routing

Despite several good online resources, it's not necessarily obvious how friend's wrap-authorize interacts with Compojure routing.

This set of routes handles /4 incorrectly:

(defroutes app-routes
  (GET "/1" [] (site-page 1))
  (GET "/2" [] (site-page 2))
  ;; the authorization check runs before the inner route even tries to match,
  ;; so an unauthorized request to /4 throws here and never reaches /4 below
  (friend/wrap-authorize (GET "/3" [] (site-page 3)) #{::user})
  (GET "/4" [] (site-page 4)))

Any attempt to route to /4 by a user who doesn't have the ::user role will fail with the same error you would expect to (and do) get from an unauthorized attempt to route to /3. The reason this happens is that Compojure considers the four routes in the order in which they are listed, and wrap-authorize works by throwing when the authorization check fails (which aborts the routing entirely).

So, even though the code looks like the authorization check is associated with /3, it's really associated with the point in evaluation after /2 is considered, but before /3 or /4. For an unauthorized user, Compojure never considers either the /3 or the /4 route: /4 (and anything that might follow it) is hidden behind the same security as /3.

This is what's meant when the documentation says to do the authorization check after the routing and not before. Let the route decide if the authorization check gets run and then your other routes won't be impacted by authorization checks that don't apply.

What that looks like in code is this (with the friend/authorize check inside the body of the route):

(defroutes app-routes
  (GET "/1" [] (site-page 1))
  (GET "/2" [] (site-page 2))
  ;; the authorization check now runs only after the /3 route has matched
  (GET "/3" [] (friend/authorize #{::user} (site-page 3)))
  (GET "/4" [] (site-page 4)))

The documentation does mention the use of context to help solve this problem. Where that plays a role is when a whole set of routes needs to be hidden behind the same authorization check, as in the sketch below. But the essential point is the same: check and enforce authorization only after routing has determined that the check actually applies.
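Roughly, that looks like this (admin-routes and the /admin prefix are made up for the example - the point is just that wrap-authorize sits behind a context match):

(defroutes admin-routes
  (GET "/3" [] (site-page 3))
  (GET "/5" [] (site-page 5)))

(defroutes app-routes
  (GET "/1" [] (site-page 1))
  (GET "/2" [] (site-page 2))
  ;; the authorization check only runs for requests that match /admin
  (context "/admin" [] (friend/wrap-authorize admin-routes #{::user}))
  (GET "/4" [] (site-page 4)))

Requests to /4 never touch the authorization check, while everything under /admin (here, /admin/3 and /admin/5) is protected by a single wrap-authorize.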

December 21, 2018

Small Computing History Resources

I've lately run across several interesting small computer history sites. If you have any interest in small computing's emergence from 1980 to 1990 or so, these are worth a look.

In no particular order:

  • OS/2 Museum - Covers OS/2, but also gets into detail around PC architecture. Among other interesting bits, this is just one of several articles on A20 gate handling, and here's something on the IBM 8514/A.
  • DTACK Grounded - A newsletter written to promote Hal Hardenbergh's side business selling attached Motorola 68000 processor boards. Mostly interesting for his commentary on then-current events leading up to the emergence and use of 32-bit microprocessors. Notably, this was written at the time of Intel's pivot from the iAPX 432 to the 80386. The commentary on the relative unreliability of DRAM is amusing too.
  • CRPG Addict - Not sure how he has the time, but the author of this blog has set himself the challenge of playing through and documenting every early CRPG game from the late 70's and well into the 90's.
  • The Digital Antiquarian - Critical commentary on early small computer gaming. Lots of details about how games came to be made and their content.
  • Retrocomputing Stack Exchange site - This is currently more like Netflix than anything else: coverage is spotty, but that doesn't mean you can't find something interesting to read.

August 3, 2018

Rhinowiki

It's been a long time coming, but I've finally replaced blosxom with a custom CMS I've been writing called Rhinowiki. More than a serious attempt at a CMS, this is mainly a fun little side project to write some Clojure, experiment a bit with JGit, and hopefully make it easier to implement a few of my longer term plans that might have been tricky to do in straight Perl.

Full source in the link above, a high level summary here:

  • Everything is in Clojure.
  • Backend format is Markdown as interpreted by markdown-clj. (There's a small rendering sketch after this list.)
  • Source code is highlighted using highlight.js.
  • Markdown rendering is done entirely on the server, with syntax highlighting on the client. (I'm looking into Nashorn to run highlight.js server side too, but don't know if that's possible within my time constraints.)
  • Back end storage is kept in a git repository and retrieved via JGit.
  • All requests are served out of memory.
  • There's a hand rolled (and conformant) Atom feed.
  • Also RSS 2.0.
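As an aside, the Markdown piece of this is small - something along these lines with markdown-clj. (The namespace and function name are made up for the example; this isn't Rhinowiki's actual code.)

(ns example.render
  (:require [markdown.core :as md]))

(defn render-post
  "Render a post's Markdown source to an HTML fragment. Code blocks are left
  for highlight.js to colorize on the client."
  [markdown-text]
  (md/md-to-html-string markdown-text))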