Mike Schaeffer's Blog

Articles with tag: programming
September 7, 2005

One of the first functions I like to write when creating a new data structure is a human-readable dumper. This is a simple function that takes the data you're working with and dumps it to an output stream in a readable way. I've found that these things can save huge amounts of debugging time: rather than paging through debugger watch windows, you can assess your program's state by calling a function and reading it out by eye.
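As a concrete sketch of what I mean (the Node type and field names here are invented purely for illustration), a dumper for a singly-linked list in C++ might look something like this:

    #include <iostream>
    #include <string>

    // A hypothetical structure to be dumped; substitute your own.
    struct Node
    {
        std::string name;
        int value;
        Node *next;
    };

    // Write a human-readable view of the list to an output stream.
    // Taking the stream as a parameter means the same code works for
    // std::cout, a log file, or a string buffer in a test harness.
    void dump_list(std::ostream &out, const Node *head)
    {
        out << "list@" << static_cast<const void *>(head) << " (";

        for (const Node *n = head; n != NULL; n = n->next)
        {
            out << "[" << n->name << "=" << n->value << "]";
            if (n->next != NULL)
                out << ", ";
        }

        out << ")" << std::endl;
    }

A call like dump_list(std::cout, my_list) at an interesting point in the program (or from the debugger's expression evaluator) then gives you a readable snapshot of the structure.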

A few tips for dump functions:

  • The more use this kind of scaffolding code gets, the more cost effective it is to have written it. Every bit of development done before a dumper is in place reduces the use it can get, and therefore its payoff. Implement them early, if you can.
  • Look for cheap alternatives already in your toolkit: Lisp can already print most of its structures, and .Net includes object serialization to XML. The standard solution might not be perfect, but it is 'free'.
  • Make sure your dumpers are correct from the outset. The whole point of this is to save debugging time later on; if you can't trust your view into your data structures during debugging, it will cost you time.
  • Dump into standard formats. If you can, dump into something like CSV, XML, S-expressions, or Dotty. If you have a considerable amount of data to analyze, this'll make it easier to use other tools to do some of the work.
  • Maintain your dumpers. Your software isn't going to go away, and neither are your data structures. If it's useful during initial development, it's likely to be useful during maintenance.
  • For structures that might be shared, or exist on the system heap, printing object addresses and reference counts can be very useful.
  • For big structures, it can be useful to elide specific content. For example: a list of 1000 items can be printed as (item0, item1, item2, ..., item999). There's a sketch of this after the list.
  • This stuff works for disk files too. For binary save formats, specific tooling to examine files can save time compared to an on-disk hex-editor/viewer. (Since you have code to read your disk format into a data structure in memory, if you also have code to dump your in-memory structure, this does not have to be much more work. Sharing code between the dump utility and the actual application also makes it more likely the dumper will show you the same view your application will see.)
  • Reading dumped structures back in can also be useful.
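To illustrate the elision point from the list above, here's one way it might look for a std::vector; the cutoff of eight items is arbitrary:

    #include <iostream>
    #include <vector>

    // Print at most max_items elements, eliding the middle of large vectors.
    void dump_vector(std::ostream &out, const std::vector<int> &v,
                     size_t max_items = 8)
    {
        out << "vector[" << v.size() << "] (";

        if (v.size() <= max_items)
        {
            for (size_t i = 0; i < v.size(); i++)
                out << (i ? ", " : "") << v[i];
        }
        else
        {
            // Show the first few elements and the last, eliding the rest.
            for (size_t i = 0; i + 1 < max_items; i++)
                out << v[i] << ", ";
            out << "..., " << v[v.size() - 1];
        }

        out << ")" << std::endl;
    }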
April 12, 2005

There's been some 'controversy' in the blog world about a petition that's circulating to ask Microsoft to continue supporting "Classic" Visual BASIC in addition to the replacement VB.Net. A month ago, I had a pretty long post dedicated to the topic, but due to technical problems I wasn't able to get it online. Therefore, I'll keep this sweet and to the point.

The core problem VB6 developers are facing is that they sank lots of development money into a closed, one-vendor language. Choosing VB6 basically amounted to a gamble that Microsoft would continue to support and develop the language for the duration of a project's active life. That gamble hasn't paid off for some developers, and companies with sizable investments in VB6 code now need to figure out how to make the most of that investment while still evolving their software.

With standardized languages like C, which have multiple tool vendors, the risk is significantly lower. If one vendor drops its version of a language, switching to another implementation is going to be a lot easier than porting to an entirely different platform (particularly if you've avoided or isolated vendor-specific features).

So... what's the moral of this story? Before you base your business on a particular language or tool, make sure you know what happens if that platform ever loses support. Pick something standardized, with multiple viable vendors, or pick something open source, where you can take over platform development yourself if you absolutely need to. Whatever you do, don't pick a one-vendor tool and then complain when the vendor decides to drop it. Commercial vendors, in particular, have no legal obligation to their customers.

April 12, 2005

Global variables tend to get a bad rap, kind of like goto and pointers. Personally, I think they can be pretty useful if you are careful. Here are the guidelines I use to determine whether a global variable is an appropriate solution:

  • Will there ever be a need for more than one instance of the variable?
  • How much complexity does passing the variable to all its accessors entail?
  • Does the variable represent global state? (A heap free list, configuration information, a pool of threads, a global mutex, etc.)
  • Can the data be more effectively modeled as a static variable in a function or private member variable in a singleton object? (Both of these are other forms of global storage, but they wrap the variable accesses in accessor functions.)
  • Can you support the lifecycle you need for the variable any other way? Global variables exist for the duration of your program's run-time. Local variables exist for the duration of a function. If you don't have heap allocated variables, or if your heap allocator sucks, then a global variable might be the best way to get to storage that lasts longer than any one function invocation.
  • Do you need to use environment features that are specific to globals? In MSVC++, this can mean things like specifying the segment in which a global is stored or declaring a variable as thread-local.

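To make that last point concrete, here's roughly what those MSVC++-specific declarations look like; the variable and segment names are invented for the example:

    // Each thread gets its own copy of this counter.
    __declspec(thread) int g_tls_call_count = 0;

    // Place the following initialized globals into a named data segment.
    #pragma data_seg(".mydata")
    int g_shared_counter = 0;
    #pragma data_seg()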
If all that leads you to the decision that a global variable is the best choice, you can then take steps to mitigate some of the risks involved. The first thing I'd do is prefix global variable names with a unique qualifier, maybe something like g_. This lowers the risk of namespace collisions and clearly denotes which variables are global when you have to read or alter your code. If you have multiple global variables, I'd also be tempted to wrap them all up in a structure, for some of the same reasons.
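A minimal sketch of both ideas together, with made-up configuration fields:

    // Related globals gathered into one structure; the g_ prefix flags the
    // variable as global anywhere it shows up in the code.
    struct GlobalSettings
    {
        int log_level;
        bool verbose_dumps;
    };

    GlobalSettings g_settings = { 1, false };

    void example()
    {
        if (g_settings.verbose_dumps)   // the prefix makes the global access obvious
            g_settings.log_level = 2;
    }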

March 4, 2005

I'm writing some content for a future post on the offshoring of jobs overseas, but I want to clear something up before it gets posted: Outsourcing and offshoring are two different and orthogonal concepts. This seems to be something that gets misunderstood a great deal, but simply put, outsourcing is the movement of jobs to a different company and offshoring is the movement of jobs to a different country. Either one can be done without the other.

The scenarios that people tend to get upset about (at least in the United States) are the scenarios involving offshoring, the movement of work overseas. Outsourcing, however, does not necessarily imply that the work gets moved to a different country: it's very common for work to be outsourced to another American business employing American workers. An example of this is hiring a Madison Avenue firm to put together an ad campaign. Sure, it'd be possible to develop the talent in house to do this yourself, but there are many advantages in outsourcing the work to a more specialized vendor.
