frn.sh

TIL - Today I Learned

Some small things I’ve learned:

xxd(1) is a neat tool

Found out about a cool tool today: xxd(1). Basically, it converts a binary file into a hex dump and, with -r, converts a hex dump back into binary. For example:

➜   head -c 100 /dev/urandom > out.bin
➜   cat out.bin
5O0!3<[N
                R4DT'{SC95#Rd5b62i^5u(OcD"/MrJBUZ%

Now let’s hexdump it:

➜  hexdump -C out.bin > out.hex
➜  cat out.hex
00000000  35 e2 4f 30 21 b4 33 e2  c8 af b9 3c 5b ba 4e 0b  |5.O0!.3....<[.N.|
00000010  09 f6 52 16 34 11 44 54  85 96 b6 00 27 a2 7b 04  |..R.4.DT....'.{.|
...

And then use xxd(1) to convert it back:

➜   xxd -r out.hex out2.bin
➜   cat out2.bin
5O0!3   R4DTSd5b65u(Մ/MZ%

When we ask the OS what those files are, we get:

➜   file out.bin out.hex out2.bin
out.bin:  data
out.hex:  ASCII text
out2.bin: data
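
Comparing cat output by eye is unreliable for binary data, so here is a minimal sketch of the same round trip checked with cmp. Two assumptions here are mine and not from the session above: it uses xxd's own dump format instead of hexdump -C, and it works inside a throwaway mktemp directory.

```shell
tmp=$(mktemp -d)

# 100 random bytes, same as before.
head -c 100 /dev/urandom > "$tmp/out.bin"

# Dump to hex with xxd, then reverse the dump with xxd -r.
xxd "$tmp/out.bin" > "$tmp/out.hex"
xxd -r "$tmp/out.hex" > "$tmp/out2.bin"

# cmp -s is silent and exits 0 only when both files are byte-identical.
if cmp -s "$tmp/out.bin" "$tmp/out2.bin"; then
  roundtrip=identical
else
  roundtrip=different
fi
echo "round trip: $roundtrip"

rm -r "$tmp"
```

If cmp reports a difference, the dump format and the reverse step probably don't agree with each other.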

postgres vacuum and netbsd dir size

Running vacuum full in Postgres can require as much free disk space as the table being rewritten currently occupies, since it rebuilds the entire table by copying all the live rows to a new file and only drops the old file once it finishes. Plain vacuum, on the other hand, only marks the space used by dead tuples as reusable; the table file usually doesn’t shrink. Something similar happens with directories in NetBSD. If you create lots of files with 255-character names in a directory, the size of the directory will increase. If you then delete all the files, the size won’t decrease.

Suppose you run touch $( yes a | head -255 | tr -d '\n' ) three times, changing only the letter each time. You would end up with a directory that looks something like this:

aaa… bbb… ccc…

Then you delete “b” and create another file named with the letter “d”. Since both names have the same length, the OS can place the new entry in the slot between “a” and “c” (there’s a gap there now, since “b” was deleted):

aaa… ddd… ccc…
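
A rough sketch of that experiment, for anyone who wants to try it. The scratch directory from mktemp and the make_name helper are made up for illustration. What ls -ld reports depends on the OS and filesystem, so no expected sizes are shown.

```shell
dir=$(mktemp -d)

# Build a 255-character file name made of a single repeated letter.
make_name() {
  yes "$1" | head -255 | tr -d '\n'
}

# Create the "a", "b" and "c" files.
for letter in a b c; do
  touch "$dir/$(make_name "$letter")"
done
ls -ld "$dir"                  # directory size with three long entries

rm "$dir/$(make_name b)"       # leaves a gap where "b" was
touch "$dir/$(make_name d)"    # same name length, so it may reuse the gap
ls -ld "$dir"                  # compare with the size printed above

count=$(ls "$dir" | wc -l)
echo "entries: $count"

rm -r "$dir"
```

To see the directory grow, the loop needs many more than three files; three is just enough to mirror the picture above.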


writing to disk with O_SYNC

write(2) doesn’t actually write to disk immediately. Instead, it writes to the page cache, and the OS periodically flushes the dirty pages to disk. With O_SYNC, though, write(2) returns only after the data has been written to the underlying storage.

Linux exposes the interval of that periodic writeback, in hundredths of a second (so 500 means every 5 seconds):

➜  ~ cat /proc/sys/vm/dirty_writeback_centisecs 
500
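
To feel the difference between buffered and O_SYNC writes, one option is dd, since GNU dd can open its output with O_SYNC via oflag=sync. This is only a sketch: it assumes GNU coreutils, the 1 MiB size is arbitrary, and the timings dd prints on stderr will vary a lot between machines and filesystems.

```shell
f=$(mktemp)

# Buffered: each write(2) returns once the data is in the page cache.
dd if=/dev/zero of="$f" bs=4k count=256

# Synchronous: the output file is opened with O_SYNC, so each write(2)
# returns only after the data has reached the storage device.
dd if=/dev/zero of="$f" bs=4k count=256 oflag=sync

size=$(wc -c < "$f")
echo "file size: $size bytes"

rm "$f"
```

If mktemp puts the file on a tmpfs there is no disk behind it, so both runs will likely look the same; pointing the file at a real disk gives a fairer comparison.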

source and export diff

Non-interactive shells don’t load initialization files, so bash -c 'declare -f' doesn’t output anything. But we can source the file ourselves: bash -c 'source ~/.bashrc; hello' (assuming ~/.bashrc defines a hello function). Or even define the function inline: bash -c 'hello() { echo "hi"; }; declare -f'.

It comes down to what each one shares, and with whom:

  • source runs a file inside the current shell, so its changes only affect that shell’s memory.

  • export marks variables to be passed to child processes.

A subtle difference that can save us lots of debugging time.
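
A small sketch of both behaviours. The variable names and the temporary script are made up for illustration, and it uses ".", the POSIX spelling of source, so it also runs under plain sh.

```shell
unset greeting farewell from_file

greeting="hi"            # plain variable: lives only in this shell
export farewell="bye"    # exported: copied into the environment of children

# The child shell sees only the exported variable.
child=$(sh -c 'echo "greeting=[$greeting] farewell=[$farewell]"')
echo "$child"            # greeting=[] farewell=[bye]

# A script that only sets a variable.
script=$(mktemp)
echo 'from_file="set by script"' > "$script"

sh "$script"                          # runs in a child process, the variable is lost
echo "after sh: [${from_file:-}]"     # after sh: []

. "$script"                           # runs in the current shell, the variable stays
echo "after source: [$from_file]"     # after source: [set by script]

rm "$script"
```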


cool trick: the kernel stack of a process

I found a cool trick to see what’s happening to a blocked (sleeping) process: cat /proc/<pid>/stack. Yep, you can peek at the kernel call stack of a process!

➜  pexpl git:(main) ✗ ps aux | grep p.py
frns       23703  0.0  0.0  13888  7948 pts/3    Sl+  02:48   0:00 nvim p.py

➜  pexpl git:(main) ✗ sudo cat /proc/23703/stack
[<0>] do_epoll_wait+0x698/0x7d0
[<0>] do_compat_epoll_pwait.part.0+0xb/0x70
[<0>] __x64_sys_epoll_pwait+0x91/0x140
[<0>] do_syscall_64+0x55/0xb0
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Reading this trace from the bottom up, the process entered the kernel through the epoll_pwait syscall and is now blocked inside do_epoll_wait. So it’s doing I/O multiplexing and is asleep waiting for an event. Combined with other tools, like strace, /proc/<pid>/stack can give a specific perspective on what’s wrong with a process. How cool is that?
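
Here is a minimal way to try this without hunting for a blocked process first. It is a Linux-only sketch: the sleep 30 target is just a stand-in for a real process, and the fallback message is mine.

```shell
# Start a process that will block in the kernel.
sleep 30 &
pid=$!
sleep 1    # give it a moment to settle into its sleeping state

# Process state as the kernel reports it (S means interruptible sleep).
state=$(awk '/^State:/ { print $2 }' "/proc/$pid/status")
echo "pid $pid state: $state"

# The kernel stack itself is normally readable only by root.
cat "/proc/$pid/stack" 2>/dev/null || echo "could not read /proc/$pid/stack (usually needs root)"

kill "$pid"
```

Run as root, the last cat should print a trace in the same format as the one above, with different function names since sleep blocks in a different syscall than nvim.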

about procfs

proc (procfs) is a pseudo-filesystem; it dynamically generates directories for processes. The files within /proc don’t live on disk, similar to the ones in the /dev directory. Wikipedia lists the history of procfs implementations, which goes back to 1984.
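
One way to see that these files are generated on the fly is to compare the size the filesystem reports with what a read actually returns. A quick sketch, assuming Linux and GNU stat (the -c %s format flag); /proc/version is just a convenient file to poke at.

```shell
reported=$(stat -c %s /proc/version)   # size according to the file's metadata
actual=$(wc -c < /proc/version)        # bytes actually produced when reading

echo "stat reports $reported bytes, a read returns $actual bytes"

# The current shell gets its own directory, created by the kernel on demand.
ls -d "/proc/$$"
```

The reported size is typically 0 because there is nothing stored anywhere; the kernel builds the content at read time.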