I’m releasing dropt 1.1.0 today.
dropt is a C library for parsing command-line options. Yes, there are a lot of existing ones already, but I wasn’t satisfied with those that I had come across:
A few weeks ago, one of my coworkers complained about doing maintenance on a project that he had moved away from. I told him that authoring code is like having a child: you can’t say you’re tired of it and abandon it. If you brought it into this world, you should take some responsibility for it. If you’re not prepared to do that, don’t have that baby.
I was joking, of course, but perhaps it’s not a completely ridiculous comparison (although I suspect that my friends who are actual parents might disagree).
Today marks my eight-year anniversary at VMware. For the past eight years, I’ve spent 40 hours per week (well, probably more) developing VMware Workstation, watching it grow and trying to imbue it with whatever knowledge I have. A number of people tell me that I’ve been at VMware for too long and should move on, but I’m not ready to let go yet.
Cygwin, a port of various Unix utilities to provide a Unix-like environment on Windows, has been around for a long while. It’s well-known; sites such as Lifehacker give tips about using it. My tip is: avoid Cygwin unless absolutely necessary.
Cygwin-based tools depend on cygwin1.dll, and cygwin1.dll is obnoxious because:
So what are people to do?
I should note that Cygwin is still a necessary evil for stable versions of bash and sshd. I don’t know of any good alternative implementations of those.
I saw some code:
std::string s = foo();
bar(s.c_str());
I tried changing it to:
bar(foo().c_str());
and things broke. It turns out that bar() was a macro that expanded to:
#define bar(s) \
    do { \
        struct some_struct st; /* placeholder type name */ \
        st.someString = (s); \
        baz(&st); \
    } while (0)
(If you’re wondering about the do ... while (0), consult the comp.lang.c FAQ.)
This is fine for C code, but in the C++ world, this is dangerous. In this case, foo() returns an anonymous std::string object by value. That anonymous object is destroyed at the end of the assignment statement, after its internal pointer is stored in st.someString but before baz() gets to use it, causing baz() to be called with garbage.
Moral #1: Macros that don’t have perfect function-like semantics shouldn’t look like functions. For example, macros should be clearly indicated by naming them in all uppercase.
Moral #2: Use inline functions when possible. (In this case, however, the macro was provided by a C library.)
A couple of weeks ago I read about a scam anti-virus program sold by some no-name software company. The software reported false positives to trick hapless people into thinking that they were infected with something and into buying its useless product. A few days ago, Mark Russinovich of Sysinternals wrote about bogus spyware removers.
I’m so disgusted that I wonder if there should be a programming ethics board that allows programmers to become certified or licensed voluntarily. Shouldn’t people writing so-called anti-virus software take some form of Hippocratic Oath? Such a system wouldn’t be too different from the driver signing that Microsoft does, except it’d be a general system for individual developers, not for particular binaries. Hobbyists still would be able to create, distribute, and sell unlicensed programs, but anyone wanting to establish trust could advertise that they’re licensed. A signing authority could verify that licenses are active and authentic. Obtaining a license could require verification of developers’ personal information, allowing them to be identified and accountable if they break the code (pun intended). Qualification exams even could test for recognition of buffer overflows and other unsafe practices.
On the other hand, what would the punishment be? If the licensing fee is too low, it might be worthwhile for dishonest developers to obtain licenses just to break them. If the licensing fee is too high, no one would participate. And, of course, it’s unclear how to distinguish between intentionally malicious code and simply negligent code.
Recently in a programming forum I frequent, someone posted some sample code he had written and asked for a critique. He would be providing this code to prospective employers.
If you’re trying to get a job programming, providing sample code is good. However, it’s a little surprising what some people consider to be good sample code.
Your goal should be not only to demonstrate that you can write code, but that you can write maintainable code. Write-once, read-never code is worse than useless; it’s fragile and wastes the time of anyone else who ever tries to modify it.
Quick and easy things you can do to improve your code samples:
(Of course, only do the above if you also intend to continue doing them in practice. Misrepresenting yourself is dishonest, okay?)
I bought a Treo 650 this week, and it’s awesome. It’s even inspiring me to do some programming for Palm OS again. Unfortunately, getting back into that groove is really hard.
I wrote a lot of great code while I was at Sony, but of course all that code is Sony-owned and outside of my grasp. To do any Palm OS development work again, I’d need to rewrite everything from scratch, which is demotivating because I’d be redoing work that I had done already and—since I’m now rusty at this—work that I had done better. It makes me feel like my life is progressing backwards.
I am amazed that programming languages (well, the typical ones, at least) don’t make it easier to manipulate files.
A common way files are read in C is to create a struct that matches the file format and to call fread to read the file into it. Isn’t that easy enough?
Not really. This approach is fine in isolation, but it’s non-portable:
The typical way to solve these problems is to read a file a byte at a time, copying each byte into the appropriate location within the struct. This is tedious.
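As a rough sketch of that byte-at-a-time approach, the C code below decodes a hypothetical 12-byte header: one version byte, three reserved bytes, then two big-endian 32-bit fields. The struct, the field names, and the helper functions are all made up for illustration; they are not from any real file format or library.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical in-memory form of a 12-byte file header. */
struct file_header
{
    uint8_t version;
    uint32_t num_elements;
    uint32_t data_offset;
};

/* Assembles a big-endian 32-bit value from four bytes. */
static uint32_t read_u32_be(const unsigned char* p)
{
    return ((uint32_t) p[0] << 24)
           | ((uint32_t) p[1] << 16)
           | ((uint32_t) p[2] << 8)
           | (uint32_t) p[3];
}

/* Copies each field out of the raw bytes, one field at a time. */
static void decode_header(const unsigned char raw[12],
                          struct file_header* header)
{
    header->version = raw[0];
    /* raw[1] through raw[3] are reserved and ignored. */
    header->num_elements = read_u32_be(raw + 4);
    header->data_offset = read_u32_be(raw + 8);
}

/* Returns 1 on success, 0 if a full header could not be read. */
int read_header(FILE* fp, struct file_header* header)
{
    unsigned char raw[12];
    if (fread(raw, 1, sizeof raw, fp) != sizeof raw)
    {
        return 0;
    }
    decode_header(raw, header);
    return 1;
}
```

Because the fields are assembled with shifts rather than copied wholesale, the result does not depend on the host's byte order or on how the compiler pads the struct. The cost is that every field of every format needs this kind of hand-written code.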
Programming languages should provide a mechanism for programmers to declare a struct that must conform to some external format requirement. Programmers should be able to attribute the struct, prohibiting implicit padding bytes and specifying what the size and endian requirements are for each field. For example:
file_struct myFileFormat
{
    uint8 version;
    uint8[3]; // Reserved.
    uint32BE numElements;
    uint32BE dataOffset;
};
When retrieving fields from such a struct, the compiler should generate code that automatically performs the necessary byte swaps and internal type promotions.
Everyone knows that the gets function in the C standard library is dangerous to use because it offers no protection against buffer overflows.
What should people use instead?
The typical answer is to use fgets. Unfortunately, although safe, fgets in non-trivial cases is much harder to use properly:
How do you determine what a good maximum buffer size is? The reason gets is dangerous in the first place is that you don’t know how big the input can be.
Unlike gets, fgets includes the trailing newline character, but only if the entire line fits within the buffer. This can be remedied easily by replacing the newline character with NUL, but it’s a nuisance.
If the input line exceeded the buffer size, fgets leaves the excess in the input stream. Now your input stream is in a bad state, and you either need to discard the excess (and possibly throw away the incomplete line you just read), or you need to grow your buffer and try again.
Discarding the excess usually involves calling fscanf, and I don’t know anyone who uses fscanf without disdain, because fscanf is hard to use properly too. Furthermore, discarding the line by itself won’t necessarily make you any better off, because you’ve accepted incomplete input and still have to walk the road to recovery.
Growing the buffer is also a hassle. You either need to grow the buffer exponentially as you fill the buffer to capacity, or you need to read ahead, find out how many more bytes you need, grow the buffer, backtrack, and read the excess bytes again. (The latter isn’t even possible with stdin.) And, of course, this means you also need to handle allocation failure.
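To make the grow-the-buffer option concrete, here is a minimal sketch of the exponential approach. The function name read_line, the starting capacity of 16, and the policy of returning NULL on any failure are my own assumptions for illustration; this is not the ggets implementation or any standard function.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads one line of any length from fp, without its trailing '\n'.
 * Returns a malloc'd string that the caller must free, or NULL on
 * end-of-file, read error, or allocation failure.
 */
char* read_line(FILE* fp)
{
    size_t capacity = 16; /* Arbitrary starting size. */
    size_t length = 0;
    char* line = malloc(capacity);
    if (line == NULL)
    {
        return NULL;
    }

    for (;;)
    {
        /* fgets takes an int, so clamp the remaining room. */
        size_t room = capacity - length;
        int chunk = (room > INT_MAX) ? INT_MAX : (int) room;
        if (fgets(line + length, chunk, fp) == NULL)
        {
            break;
        }

        length += strlen(line + length);
        if (length > 0 && line[length - 1] == '\n')
        {
            line[length - 1] = '\0'; /* Drop the newline. */
            return line;
        }

        if (length + 1 == capacity)
        {
            /* The buffer is full: double it and keep reading. */
            char* bigger;
            if (capacity > SIZE_MAX / 2
                || (bigger = realloc(line, capacity * 2)) == NULL)
            {
                free(line);
                return NULL;
            }
            line = bigger;
            capacity *= 2;
        }
    }

    /* fgets failed: keep a final line that lacks a newline, if any. */
    if (length > 0 && !ferror(fp))
    {
        return line;
    }
    free(line);
    return NULL;
}
```

Even this small sketch needs an overflow check, an allocation-failure path, and special handling for a last line with no newline, which is the kind of bookkeeping that makes fgets tiresome.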
In all, this means that to use fgets properly, you need to make several more library calls than you want. For example, here’s an implementation that discards line excess:
#include <stdio.h>
#include <string.h>

/** Returns the number of characters read or (size_t) -1 if the
  * line exceeded the buffer size.
  */
size_t fgetline(char* line, size_t max, FILE* fp)
{
    size_t length;

    if (fgets(line, (int) max, fp) == NULL)
    {
        return 0;
    }

    /* Remove a trailing '\n' if necessary. */
    length = strlen(line);
    if (length > 0 && line[length - 1] == '\n')
    {
        line[--length] = '\0';
        return length;
    }
    else if (length + 1 < max)
    {
        /* The buffer wasn't filled: the input ended without a '\n'. */
        return length;
    }
    else
    {
        /* Swallow any unread characters in the line. */
        fscanf(fp, "%*[^\n]");
        fgetc(fp); /* Swallow the trailing '\n'. */
        return (size_t) -1;
    }
}
Ugh!
I personally prefer using C.B. Falconer’s public-domain ggets/fggets functions, which have gets-like semantics and dynamically allocate buffers large enough to accommodate the entire line.
Additional reading: Getting Interactive Input in C
What’s wrong with this C++ code?
#include <iostream>
#include <string>

class Foo
{
public:
    Foo() { }

    operator bool() { return true; }

    std::string operator[](const std::string& s) { return s; }
};

int main(void)
{
    Foo foo;
    std::cout << foo["hello world!"] << std::endl;
    return 0;
}
Answer: