Smokes your problems, coughs fresh air.

Tag: C

Finding the right UUID generation algorithm for

Early in the development of the PostgreSQL backend for our amazing managed MQTT hosting service, I decided to use UUIDs rather than auto-incrementing integers wherever I needed or wanted surrogate keys. I am one of those people who prefer natural keys where their use is … natural, but I certainly have nothing _against_ surrogate keys, only against their overuse, which usually results from an over-reliance on ORMs (which I do have something against). There are a couple of advantages to using UUIDs over auto-incrementing integers (available in PostgreSQL via sequences):


Rapidly firing myself in the foot with C pointers

Now that I am dedicated to becoming a somewhat decent C programmer, I need to master pointers. Last week, I leveled up in my pointer usage by debugging a particularly nasty segfault. It took the help of gdb (the GNU Project Debugger) for me to notice that my segfault was accompanied by very weird values for some increment counters, while the pointer involved was a char* pointer, not a pointer to an int.


First, some notes on the GNU Project Debugger: it’s excellent! And … it’s easy to use. I have no idea why, long ago, when I tried it as a budding programmer, I had so much trouble that it stuck in my memory as a very tough tool to use. Well, this is not the same brain anymore, so it’s time to get rid of all these printf() statements everywhere (which I wouldn’t dare use in a programming language that I do have some fluency in, mind you!) [lines of shame: L45, L100, L101, L119].

With the help of gdb xjot-to-xml (and then, from within GDB, run < my-test-file.xjot), I noticed that some of the ints I used to track byte, line and column position had ridiculously high values for the input line, especially since I was pretty sure that my program already crashed on the first character.

In GDB, such things are easy to find out: you can very simply set a breakpoint anywhere:

(gdb) break 109
(gdb) run < tests/element-with-inline-content.xjot
Starting program: /home/bigsmoke/git/xjot/xjot-to-xml < tests/element-with-inline-content.xml

Breakpoint 1, _xjot_to_xml_with_buf (in_fd=537542260, out_fd=1852140901, buf=0x6c652d746f6f723c, buf_size=1024)
    at xjot_to_xml.c:109
109                 byte = ((char*)buf)[0];

From there, after hitting the breakpoint, I can check the content of the variable n that holds the number of bytes read by read() into buf.

(gdb) print n
$1 = 130

So, the read() function had read 130 bytes into the buffer. Which makes sense, because element-with-inline-content.xjot was 128 characters, and the buffer, at 1024 bytes, is more than sufficient to hold it all.

But then there were the line_no and col_no variables:

(gdb) print line_no
$2 = 1702129263
(gdb) print col_no
$4 = 1092645999

It took me a while to realize that this must have been due to a buffer overrun. Finally, I noticed that I was feeding the address of the buf pointer to read() instead of the value of the pointer.

(I only just managed to fix it before Wiebe, curious about my C learning project, glanced at my code and immediately spotted the bug.)

The value of pointers

C is a typed language, but that doesn't mean that you cannot still very easily shoot yourself in the foot with types. And, this being C, the easiest way to do so is with the difference between pointers and non-pointers.

I initialized my buffer as a plain char array of size XJOT_TO_XML_BUFFER_SIZE_IN_BYTES. Then, the address of that array is passed to the _xjot_to_xml_with_buf() function. This function expects a buf parameter of type void*. (void* pointers can point to addresses of any type; I picked this “type”, because read() wants its buffer argument to be of that type.)

What went wrong is that I then took the address of void* buf, which is already a pointer. That is to say: the value of buf is the address of buffer which I passed to _xjot_to_xml_with_buf() from xjot_to_xml().

When I then took the address of the void* buf variable itself, and passed it to read(), read() started overwriting the memory on the stack starting at that address, thus garbling the values of line_no and col_no in the process.

The take-home message is: pointers are extremely useful, once you develop an intuition of what they're pointing at. Until that time, you must keep shooting yourself in the foot, because errors are, as Huberman says, the thing that triggers neuroplasticity.

WW challenge 1: learning better C by working on XJot

Since the beginning of this month (October 2021), I have been officially jobless, after 6 years at YTEC. That’s not so much of a problem for a software developer in 2021, especially one in the Dutch IT industry, where there has been an enormous shortage of skilled developers for years. However… I don’t (yet) want a new job as a software developer, because, in the programming food pyramid, I’m a mere scripter. That is, the languages in which I’m most proficient are all very high-level languages: Python, PHP, XSLT, Bash, JavaScript, Ruby, C# (in order of decreasing (recency of) experience). I have never mastered a so-called systems language: C, C++, Rust.

Now, because of the circumstances in which YTEC and I decided to terminate the contract, I am entitled to government support until I find a new job. During my last job, I’ve learned a lot, but almost all I learned made me even more high-level and business-leaning than I already was. I’ve turned into some sort of automation & integration consultant / business analyst / project manager. And, yes, I’ve sharpened some of my code organization and automated testing skills as well. Plus I now know how to do proper monitoring. All very nice and dandy. But, what I’m still missing are ① hardcore, more low-level technical skills, as well as ② front-end, more UX-oriented skills. Only with some of those skills under my belt will I be able to do the jobs I really want to do: jobs that involve ① code that has to actually work efficiently on the hardware level, i.e., code that doesn’t eat up all the hardware resources and suck your battery (or your “green” power net) empty; and ② making (web) applications actually pleasant (and accessible!) to use.

If I start applying for something resembling my previous job, I will surely find something soon enough. However, first I need to acquire these new skills, if I also wish to continue to grow my skill level, my self-respect, as well as my earning (and thus tax-paying) potential. Hence, the WW challenge. WW is the abbreviation for Wet Werkeloosheid, a type of welfare that Dutchies such as myself get when they are temporarily without a job, provided that they were either ⓐ fired or ⓑ went away in “mutual understanding” (as I did). If ⓒ they simply quit, they don’t have the right to WW (but there are other fallbacks in the welfare state of the NL).

Anyways, every month, I have to provide the UWV—the governmental organization overseeing people in the WW—with evidence of my job-seeking activities. Since I decided that I want to deepen my knowledge instead of jumping right back into the pool of IT minions, I will set myself challenges that require the new skills I desire.

My first goal is to become more comfortable with the C programming language. I have some experience with C, but my skill level is rudimentary at best. My most recent attempt to become more fluent in C was participating in the 2020 Advent of Code challenge. I didn’t finish that attempt, because, really, I’m a bread programmer. Meaning: before or after an 8+-hour day at the office, I have very little desire to spend my free time doing even more programming. To stay sane (and steer clear of burnout), that time is much-needed for non-digital activities in social space, in nature, and by inhabiting my physical body.

So now I do have time and energy to learn some new skills that really require sustained focus over longer periods of time to acquire. On the systems programming language level, besides C, I’m also interested in familiarizing myself with C++ and Rust. And it is a well-known fact that C knowledge transfers very well to these two languages.

Okay, instead of doing silly Christmas puzzles, this time I’ve resumed my practice by writing an actual program in C that is useful to me: xjot-to-xml. It will be a utility to convert XJot documents to regular XML. XJot is an abbreviated notation for XML, to make it more pleasant to author XML documents—especially articles and books.

Using ROT13 to circumvent Gmail’s .exe filter

Wiebe reminded me by mail to send him a Windows executable. I use Gmail. Gmail doesn’t allow me to send archives containing .exe files. Something to do with viruses on Windows. Wiebe doesn’t use Windows. He uses Wine. I just want to send the damn executable.

Last time I wanted to do this, I emerged net-mail/email. But that just wasn’t very cool. A while ago someone also suggested using a password-protected zip file, but I find that even less cool because I hate zip files. (I usually prefer bzipped tarballs.)

ROT13 is a variation of ROT3, the Caesar cipher, which is as old as it is insecure. For computer use ROT13 is even cooler, because you decrypt exactly as you encrypt: the two operations are the same. Let’s grab the shortest C implementation we can find, rot13-shortest.c:


Compile it and use it as a filter to encode a file:

rot13-shortest < suspicious.tar.bz2 > suspicious.tar.bz2.r13

To decode (to “decrypt” is too grand a verb), just swap the input and output files:

rot13-shortest < suspicious.tar.bz2.r13 > suspicious.tar.bz2

Now, it’s time for me to hand in my Geek Card. Firstly, I confused ROT13 with ROT3 (even spending some time trying to find a ROT3 implementation). Secondly, I had never used ROT13 before, not even to bypass a forum filter or to make a joke which is funnier if you wear a pen-protector. EBG13 wbxrf whfg nera’g shaal.

© 2024 BigSmoke
