Sazid

November 03, 2020

Recently, I wrote a python script that allow you to collect and extract data from deeply nested structures without the need for writing boilerplate loops and list/dictionary access code. The idea is similar to regular expressions where you specify a pattern to match, except in this case, you specify a pattern to collect nested data.

Let’s go through a simple example to understand how this works. Given the following data (a random json data):

[
    {
        "error1": {
            "type": "Runtime Error",
            "occurrence": [
                {"line": 10, "message": "fail"},
                {"line": 20, "message": "block"},
            ],
        },
        "error2": {
            "type": "Compiler Error",
            "occurrence": [
                {"line": 50, "message": "fail"},
                {"line": 64, "message": "xyz"},
                {"line": 70, "message": "pqr"},
            ],
        },
        "1error": {
            "type": "Runtime Error",
            "occurrence": [
                {"line": 100, "message": "fail"},
                {"line": 200, "message": "block"},
            ],
        },
        "2error": {
            "type": "Compiler Error",
            "occurrence": [
                {"line": 500, "message": "fail"},
                {"line": 640, "message": "xyz"},
                {"line": 700, "message": "pqr"},
            ],
        },
    },
    {
        "error": {
            "type": "Brain malfunctioned",
            "occurrence": [
                {"line": 150, "message": "abort!"},
                {"line": 23, "message": "shutdown"},
            ],
        },
        "error": {
            "type": "Computer crashed",
            "occurrence": [
                {"line": 341, "message": "blocked"},
                {"line": 4, "message": "blocked"},
                {"line": 74, "message": "math error"},
            ],
        },
    }
]

Let’s say we only want to collect line numbers. You have to specify the following pattern and it’ll give you all the line numbers in a list.

Pattern:

# Pattern
pattern = "_all_, _all_, occurrence, _all_, line"

# Output
[10, 20, 50, 64, 70, 500, 640, 700, 100, 200, 341, 4, 74]

The first thing you notice is that the pattern is a string with some comma-separated items. And it contains some mysterious _all_ in it.

What’s _all_? By putting the _all_ property you’re essentially asking the collector to loop through any items (be it a list or a dictionary). The first _all_ loops through the first level of items (two dictionaries in this case) inside the root list. The second _all_ loops through the values of the dictionaries (ignoring the dictionary keys). Next, we’re asking for the occurrence items which are present inside each of the dictionaries at the third level. Notice that, occurrence itself is yet another list. So, we loop through _all_ the items inside it. And lastly, we collect line. Done!

Let’s see a few more patterns and their outputs:

pattern_1 = "_all_, _all_, type"
# Output of pattern 1
['Runtime Error', 'Compiler Error', 'Compiler Error', 'Runtime Error', 'Computer crashed']

pattern_2 = "_all_, *error, type"
# Output of pattern 2
['Compiler Error', 'Runtime Error', 'Computer crashed']

pattern_3 = "_all_, error*, occurrence, _all_, line"
# Output of pattern_3
[10, 20, 50, 64, 70, 341, 4, 74]

Notice that you can specify partial names for dictionary keys using * before or after a name. In the case of pattern 2, we wanted all errors ending with the word “error”. So the pattern is *error. In case of pattern 3 we wanted all errors ending with the word “error”, so the pattern is error*.

September 02, 2020

Rust env_logger Variable

Self note post.

The env_logger crate in Rust which provides the underlying logging facilities for the log crate requires an environment variable to be set which configures the log levels (and subsequently which logs to show in the console) - RUST_LOG.

Log levels

error
warn
info
debug
trace

Commands for various shells

Powershell
```
$Env:RUST_LOG="debug"
```
CMD
```
set RUST_LOG=debug
```
Bash/ZSH - Linux/Mac
```
export RUST_LOG="debug"
```

More information.

PS. I’m liking how simple the logging facility is with the env_logger and log crates!

August 30, 2020

Load Balancing with Nginx

Today, I carried out a simple experiment to see how load balancing works and how I can apply it in an existing server setup.

The stack

A wsgi server running on port 8000. Started by a cron job. We use bjoern at the company I’m working on right now. Its probably one of the fastest wsgi servers for python.
A reverse proxy server interfacing the web, listens for requests on port 80 (HTTP) and 443 (HTTPS) requests. We use nginx for this. nginx is also responsible for serving static assets separately, that is, static assets should not hit the wsgi server.

Although I’m saying “we”, but it’s just me who’s managing all the server setups and I came up with this stack instead of the old one.

The old stack

The old stack basically ran on the python based waitress server which used to listen on port 80, served all static assets through the django server, ran as the root user and did not support HTTPS. This was a very bad choice for a few obvious reasons:

No HTTPS support.
Ran as the root user. THIS MUST NEVER BE DONE UNLESS YOU KNOW WHAT YOU”RE GETTING INTO.
Served static assets by the Django application itself, which is without doubt slower than having the assets serve via some dedicated reverse proxy.

Enough of the old stuffs! In our case, nginx just routed all the requests to the wsgi server running on port 8000. We do not use sockets for this.

Side note: I tried a configuration with systemd once, to start the wsgi server to create a socket, but its just too much of a hassle and I didn’t see any additional benefits in our case (pardon me, I’m not really that knowledgable in these areas… specially with systemd services and how each service depend on each other, plus socket stuffs). Besides, starting a server on a port just seemed more portable to me, for example, if we have the wsgi server running on another machine.

It was very simple to modify the nginx configuration file we have, to enable load balancing. First, I created an upstream server_pool branch which would serve as the pool of wsgi servers. Next, I changed the proxy_pass parameter to route requests to the server_pool instead of the localhost:8000, which was essentially the wsgi server running on the same machine. Here’s a simplified view of the config.

upstream server_pool {
    server localhost:8000;
    server localhost:8001;
    server localhost:8002;
    server localhost:8003;
}

server {
    server_name server.com;

    # ...

    location / {
        proxy_pass http://server_pool;
    }

    # ...
}

Then I made sure the changes were okay, by running,

$ sudo nginx -t

After this, all I needed to do was increase the number of servers that was spun up after each reboot on port 8000, 8001, 8002 and 8003. This was just a simple cron job running a python script:

@reboot ... python bjoern_run.py 8000 ...
@reboot ... python bjoern_run.py 8001 ...
@reboot ... python bjoern_run.py 8002 ...
@reboot ... python bjoern_run.py 8003 ...

Well, all of this can be simplified to a single bash script, but I didn’t bother. I was just eagerly waiting to see if the whole setup worked or not. So, sudo reboot and wait for a few minuntes for the machine to come up online. And… it worked! The server was now running a load balancing system. Forgot to mention, but this was being tested on a server with 4 cores (hence the choice to start four wsgi servers) and 16 gigs of RAM. For our use case, it was more than enough.

This configuration can be easily extended so that the wsgi servers run on different machines, the database on another machine and the static files on yet another machine. In that case, changing the localhost:8000 to an IP address should do, I have yet to try this setup though. Maybe, I’ll try it when I get some time next.

End.

August 29, 2020

Upgrading Django from 1.8.2 to 3.1

At Automation Solutionz, we’ve been using Django 1.8.2 for almost the past five years! Without upgrading. While deploying to a new server instance, I found out that, due to a built-in Django bug, file uploads through forms do not work when using Python >= 3.7. So, I quickly updated the Django version in that server to 1.8.19 and everything was fine again. That’s when I thought, why don’t we just upgrade to Django 3.x? Surprisingly, I was able to upgrade the whole server to Django 3.1 within a single day, obviously working almost non-stop past the usual working hours, except for prayer and lunch break.

On a side note, our server does not rely on Django specific code that much (except for routing and views). Models are very rarely used, almost all the queries are either hand-written or generated by our own defined functions. Needless to say, since we’re not utilizing models, we also do not use Django forms. Even our user management system is custom built. Do not ask why… that’s the state I found the project in, when I joined. Well… even the whole stack was in a really bad shape, which I had to improve. I’ll go over that in another post maybe.

Next morning, I started working on upgrading the whole system. A few points I took before upgrading:

Do not update existing libraries unless absolutely required (does not work with newer django version). There was an exception though, which you’ll see.
Do not change behavior of existing code in the middle of the upgrade process.
Start with the next lowest django version and work my way upto the latest version one-by-one, unless there are no changes that’ll break our code in between. Do not consider patch releases in between, only the final patched version. For example, if there are three versions: 2.0.1, 2.0.2, and 2.0.3, try upgrading to 2.0.3.
Reorganize files that are independent of the server code (scripts for database operations, image compression, etc.)

At each stage of the Django upgrade (from one version to another) process, I ran a series of tests to make sure all the core components are working correctly. Specially, the API end points, file upload/downloads, browser caching and a few others.

The first thing I did was, reorganize the independent scripts that were scattered around the root of the server directory. Then, I went ahead and replaced simplejson with Python 3’s built-in json module. This is the only exception that I mentioned in the first point. However, I quickly found out that json was used as a variable name in many many places. I used Visual Studio Code’s Find and Replace to replace all those instances with json_result. This was a quick change.

Next, I read through the changes in between 2.0 to 2.2 to see if I can jump straight to 2.2. And yes, I could jump directly since the changes that are required by 2.0 is the same as required if I upgraded to 2.2 (I’m obviously talking about our own codebase only. Make no mistake, there were quite a lot). So, I went ahead and upgraded the version to 2.2. The few changes I had to keep an eye on were:

Change import of reverse.

from django.core.urlresolvers import reverse

from django.urls import reverse

Upgrade Djagno Rest Framework to latest version (3.11.0).
MIDDLEWARE_CLASSES are deprecated in favor of MIDDLEWARE in settings.py file.

It seems that Django 2.x deprecated the django.shortcuts.render_to_response function in favor of the django.shortcuts.render function. So, I went ahead and started replacing all instances of render_to_response in our codebase. This was the most time consuming task in the whole processes. Since, hundreds of view functions we created used it. Thankfully, I knew a bit of regex and used the excellent regex support for Find and Replace in VS code. Using capture groups, I was able to relocate the parameters correctly for the render function. Some of the regex I had to use are:

render_to_response\(['"](.*)['"],\s(.*)\{\}\)

render_to_response\(['"](.*)['"],\s(.*).*\)

render_to_response\([\s\n]*['"](.*)['"],[\s\n]*(.*),[\s\n]*.*\)

render_to_response\(['"](.*\..*)['"]\)

render_to_response\(['"](.*)['"],\s(.*)\)

Each of the capture groups (...) could be used in the Replace box like: render(request, $1, $2) where $1 and $2 are the first and second capture groups respectively, and they’ll be substituted for the values captured.

This concluded the upgrade from 1.8.2 to 2.2.15. Then I went ahead and upgraded to 3.0, only to realize that there’s not much change in between 3.0 and 3.1 …and I upgraded the Django version to 3.1. I had to make two changes.

Change the database engine to django.db.backends.postgresql from django.db.backends.postgresql_psycopg2.
Since we were on 1.8.2, we used to use the regular expression based django.conf.urls.url which was removed completely in 3.x. But, the same functionality was available through django.urls.re_path. This warranted a replace of all the url(...) function calls with re_path(...) function calls.

And, that’s it! The whole server was upgraded from 1.8.2 to 3.1 in a single day! After this, the changes were pushed to our QA server to test for any other issues. I’ll probably go over the stack later on.

August 28, 2020

Favorite Font - JetBrains Mono

I’ve finally found a font that I really like. It’s the JetBrains Mono font family. Very easy to discern characters even when the font size is small. Previously I used to swap between Consolas, Monaco and Roboto Mono. In fact, if the fonts have loaded properly, you’re most probably reading this paragraph with JetBrains Mono enabled.

Thank you so much to the JetBrains team for this wonderful font and for making it free and open source!

My usual font size is set to 14 in VS Code. Comfortable to view for me on both my desktop and my Thinkpad x280 (ubuntu can’t seem to figure out a perfect screen scaling parameter, thus making everything look so tiny at 100% scaling, 125% is a bit bigger for my taste, specially on such a small screen).

July 01, 2020

Switch to Vim and VS Code

Well… I tried to use emacs, but I found myself ending up wasting time to configure everything from scratch and then retraining my muscle memory to match the same speed as in vim (which is really slow compared to other Vim experts :|). There were other emacs distributions like doom emacs and spacemacs, but I didn’t feel like using them. Its better to just use plain Vim with a very minimal config and Visual Studio code which now has Settings Sync support. The only thing that kept me from using VS Code previously was the large file support which is obviously needed for a file with more than 10k lines. Here’s the config I use for vim:

" Place in ~/.vimrc

set nocompatible

" Syntax highlighting
if has("syntax")
  syntax on
endif

set showcmd  " Show partial command in status line
set ignorecase " Ignore case while searching
set showmatch  " Show matching bracket
set smartcase  " Do smartcase matching
set incsearch  " Incremental search
set autowrite  " Automatically save before executing make, etc.
set mouse=a  " Enable mouse interaction
set smartindent  " Smart indenting
set autoindent  " Automatically indent
set number  " Show line numbers
set relativenumber  " Show relative numbers
set encoding=UTF-8  " UTF-8 encoding for international characters
set scrolloff=5  " Minimum number of lines to show above and below the cursor
set wildmenu  " Auto-complete menu
set laststatus=2  " Always show status line
" set cursorline

" Tabs
set smarttab
set tabstop=4
set shiftwidth=4
set expandtab  " Expand tabs to spaces

" To use `ALT+{h,j,k,l}` to navigate windows from any mode:
:tnoremap <A-h> <C-\><C-N><C-w>h
:tnoremap <A-j> <C-\><C-N><C-w>j
:tnoremap <A-k> <C-\><C-N><C-w>k
:tnoremap <A-l> <C-\><C-N><C-w>l
:inoremap <A-h> <C-\><C-N><C-w>h
:inoremap <A-j> <C-\><C-N><C-w>j
:inoremap <A-k> <C-\><C-N><C-w>k
:inoremap <A-l> <C-\><C-N><C-w>l
:nnoremap <A-h> <C-w>h
:nnoremap <A-j> <C-w>j
:nnoremap <A-k> <C-w>k
:nnoremap <A-l> <C-w>l

" Specify a directory for plugins
" - For Neovim: stdpath('data') . '/plugged'
" - Avoid using standard Vim directory names like 'plugin'
call plug#begin('~/.vim/plugged')

" On-demand loading
Plug 'scrooloose/nerdtree', { 'on':  'NERDTreeToggle' }

" rust.vim
Plug 'rust-lang/rust.vim'

" ctrlp.vim - fuzzy file search
Plug 'ctrlpvim/ctrlp.vim'

" Tagbar
Plug 'majutsushi/tagbar'

" Initialize plugin system
call plug#end()

" Use Ctrl-P to start fuzzy file search
let g:ctrlp_map = '<c-p>'
let g:ctrlp_cmd = 'CtrlP'

May 12, 2020

Using Emacs as Default Editor

I’ve learnt the basics of vim a few years ago and I’ve been able to just use it whenever necessary. I felt quite comfortable in editing small config files with vim. However, I’ve tried to use it for large codebases and everytime I tried it, for some reason, I just couldn’t work with it properly.

Recently, I’ve started using linux as my daily driver again (took a break when I started playing Apex Legends on Windows :p). This time, linux seems so much more usable to me. Gnome 3 has become quite efficient and I think it just stands out on its own as a separate kind of desktop environment similar to OSX and Windows. I used to dislike how slow and inefficient Gnome 3 was, that’s not the case anymore!. Well… I still don’t like the really big title bars in gui applications (take note from Windows & OSX!). However, I love the overview screen when you press the Super key (Windows key).

So, as part of the ongoing endeavour I decided to give emacs a go again (learnt the basics once but forgot everything…). This time I’ve decided to dedicate more time to know about the application properly. Anyway, now that I’ve decided to dedicate my time to emacs, I’ll need to stop using vim for basic editing for sometime, so I needed to configure the terminal default editor. In order to do that, just put the following in the ~/.profile.

# Set emacs as the default EDITOR
VISUAL="/usr/bin/emacs -nw"
export VISUAL
EDITOR="/usr/bin/emacs -nw"
export EDITOR

That should be it! Oh btw, if you need to use the Menu bar available in emacs terminal at the top of the window, press F10.

May 12, 2020

Run Levels in Linux

Self-documenting… going to try out dwm tiling window manager. Seems pretty cool.

Run levels are states or modes that are defined by the services listed in the /etc/rc.d/rc<x>.d, where <x> is the number of the runlevel. More information: Fedora guides: Runlevels.

The following runlevels exist:

0 - Halt
1 - Single-user mode
2 - Not used (user-definable)
3 - Full multi-user mode
4 - Not used (user-definable)
5 - Full multiuser mode (with an X-based login screen)
6 - Reboot

When logging in, if you need to login via a text-only screen then it means you’re running in runlevel 3. If a graphical login screen is present, you’re running in runlevel 5.

The default runlevel can be changed by modifying the /etc/inittab file, which should contain a line similar to the following

id:5:initdefault:

Changing the number in this line to the desired runlevel. Need a reboot to take effect.

UPDATE: I just found out that the linux system I’m using right now (Fedora 32 Workstation) does not use runlevels like this anymore. It uses systemd for that purpose. Quoting the /etc/inittab file in my system:

# inittab is no longer used.
#
# ADDING CONFIGURATION HERE WILL HAVE NO EFFECT ON YOUR SYSTEM.
#
# Ctrl-Alt-Delete is handled by /usr/lib/systemd/system/ctrl-alt-del.target
#
# systemd uses 'targets' instead of runlevels. By default, there are two main targets:
#
# multi-user.target: analogous to runlevel 3
# graphical.target: analogous to runlevel 5
#
# To view current default target, run:
# systemctl get-default
#
# To set a default target, run:
# systemctl set-default TARGET.target

May 11, 2020

Cat

So peacefully sleeping… :D

![]($site.asset(‘cat.jpg’).alt(‘My cat!’))

May 10, 2020

Linux Filesystem

Linux file names are case-sensitive.

Folders are referred to as Directories (if you’re coming from Windows). There are no Local Disks in Linux, everything is stored in the root directory. However, you can mount different directories to different partitions or storage devices if you want.

The linux file system layout is defined in the FHS (Filesystem Hierarchy Standard). However, some linux distributions don’t really follow it exactly. Several directory structuring styles have also changed over the years.

/ - Root directory contains everything that is needed to run the system. If the linux kernel is the brain, the root directory is akin to the heart of the system.
/bin - Short for binary. Contains the most basic binaries necessary for the system, are present here such as ls, cat, etc.
/sbin - System binaries. These are binaries that a system administrator would use. A standard user won’t have access to this directory without proper permissions.

Both the /bin and /sbin directories contain the binaries that are necessary for running the system in single user mode.

/boot - Contains everything an os needs to boot. Better not touch anything here.
/cdrom - legacy mounting point for CD ROMS. Might not be present in all distros.
/dev - Devices live here. Hardware devices such as keyboard, mouse, hard disks, etc are present. Disks are referred to by the files sdx.
/etc - etcetera. This directory contains system wide configuration files.
/lib, /lib32, /lib64 - libraries are stored here. Required for different binaries.
/media, /mnt - Contains mounted drives such as hard disks, usb drives, hard drives, etc. The /media directory is used in recent distros to manage disk drives by the system. When manually mounting something, use the /mnt and let the /media directory to be managed by the os.
/opt - optional directory. Usually contains vendor provided packages. You can place the applications that you created here.
/proc - contains information about system processes and resources. You can also find information about cpu such as cat /proc/cpuinfo and find out the uptmie of the system cat /proc/uptime.
/root - Home directory for the root user. Unlike a typical user, it does not contain all the directories found inside a user’s home directory. Note that, this resides outside the /home directory. You need root permission to access files in this directory. Why is the root directory present in /root and not in /home/root? The answer is, so that the root user can access his/her home directory even if the /home directory is mounted on another partition or disk which may not be available.
/run - fairly new, found in recent distros. Its a tempfs (temporary file system), which means everything present here resides in the RAM. So everything is gone, when you shut down/reboot the system.
/snap - [not standard] contains the Snap package management related files and directories. This is found in Ubuntu and any system that utilizes the snap packaging system.
/srv - Service directory. Service data is stored here for example when you run a web server or a ftp server, you would want to place your files or directories that you want to serve, here for the users to access.
/sys - System directory. Present around for a long time. Its a way to interact with the kernel. Similar to the /run directory in that, this directory does not actually physically exist, it is created evertime the system boots up.
/tmp - Temporary directory - Used by applications to store temporary files here such as files in word processors that you are currently editing, etc. Should be safe to delete stuffs from this directory.
/usr - User application space. Applications used by the users are stored here as opposed to /bin and /sbin which contains applications for the system. Also known as “Unix System Resource”. Applications present here are considered non-essential for the os to function properly. Most program installed from source code will usually end up in /usr/local directory. Larger programs might install into the /usr/share directory. Installed source code will go into /usr/src.
/var - Variable directory. Contains files and directories that are expected to grow in size. For example /var/log contains the logs for different applications, /var/crash contains crash logs, /var/cache to store different cache items.
/home - Home directory for users. Each user has his/her own home directory in the format /home/username. Contains all user related data such as user specific configuration files, cache, etc. /home/username/.themes and /home/username/.icons contain themes and icons that are available to the current user. If you want to save all your settings, this is the directory that you want to backup. After upgrading or updating to a new system, restoring the /home directory means that all your settings will stay as is even if you reinstall applications.