April 3, 2014 / Karthikeyan Natarajan

Find scripts folder in bash

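We execute bash scripts either directly with ./ or with source. In both cases BASH_SOURCE[0] holds the path of the script file, whereas $0 does not when the script is sourced.

Sometimes we need to create a folder in the project root or only in the scripts folder. Here is a simple one-line command for finding the script's path: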

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}")" && pwd )"
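
As a minimal sketch of how the one-liner might be used, the snippet below resolves the scripts folder and derives a project root from it. The variable names SCRIPT_DIR and PROJECT_ROOT, and the layout in which the scripts folder sits one level below the project root, are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Absolute path of the folder that contains this script.
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

# Assumption: the scripts folder lives directly under the project root.
PROJECT_ROOT="$( cd "$SCRIPT_DIR/.." && pwd )"

echo "scripts folder: $SCRIPT_DIR"
echo "project root:   $PROJECT_ROOT"

# Folders can then be created relative to either location, for example:
#   mkdir -p "$SCRIPT_DIR/logs" "$PROJECT_ROOT/build"
```

Because both paths are absolute, the script behaves the same no matter which directory it is called from.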


March 31, 2014 / Karthikeyan Natarajan

Programming Atmega8 using Arduino Uno (Rev3)

I started playing with the Arduino Uno recently. I had bought an Atmega8L to make a USBasp, but earlier I could not burn it using the Arduino as ISP.

When uploading the bootloader, I got the following error:

avrdude: stk500_paged_write(): (a) protocol error, expect=0x14, resp=0x11
avrdude: Send: V [56] @ [40] . [00] . [00]   [20]   [20] 
avrdude: Recv: . [12] 
avrdude: stk500_cmd(): programmer is out of sync

To avoid such errors, don’t use the sketch from Examples > ArduinoISP. Use this file’s content instead.

This is because Arduino ISP on the Arduino Uno works only at a 9600 baud rate.

Today, I started by connecting the Atmega8 to the Arduino Uno according to the following diagram.


After connection, I chose

Board > Arduino UNO
Programmer > USBasp

Open the downloaded ArduinoISP.ino and click Upload.

After this, I created the file /home/karthikeyan/sketchbook/hardware/breadboard/boards.txt for the Optiboot8 board.



Close and open the Arduino IDE. Now choose

Board > Arduino Optiboot8
Programmer > Arduino as ISP

Now you can choose ‘Burn Bootloader’, or open any example and upload it.
This has been tested on Ubuntu. It should work on Windows too.

If you want to test with example programs (e.g. Blink), open one and choose ‘Upload Using Programmer’.

Advanced understanding:

Note that the upload protocol is stk500v1 and the speed is 9600.
‘arduino:optiboot’ refers to the optiboot folder in /usr/share/arduino/hardware/arduino (the Arduino installation folder).
‘arduino:standard’ refers to the standard variant in the Arduino installation folder.
If you want to use locally installed variants, bootloaders, or cores, remove ‘arduino:’ in boards.txt and create a directory structure similar to the Arduino installation folder at /home/karthikeyan/sketchbook/hardware/breadboard

If you get errors, enable verbose output for upload in Preferences and check the error.
If you want to burn your own hex file (such as the USBasp hex file), copy the command used for burning from the verbose output and just change the file name/location.


August 2, 2013 / Karthikeyan Natarajan

Passing arguments (options) to linker in nvcc compiler

Sometimes we need to pass some parameters to the linker. With gcc, we use -Wl. This does not work with nvcc, which gives the following error:

$  nvcc -L/usr/local/cuda/lib -lcurand -Wl,-rpath,/usr/local/cuda/lib

nvcc fatal   : Unknown option ‘Wl,-rpath,/usr/local/cuda/lib’

There are alternatives to solve this. Instead of -Wl, use --linker-options= or -Xlinker=

Both will work well.

Eg: $  nvcc -lcurand --linker-options=-rpath,/usr/local/cuda/lib

      $  nvcc -lcurand -Xlinker=-rpath,/usr/local/cuda/lib


April 4, 2013 / Karthikeyan Natarajan

tmux rules

I was using screen to run scripts over ssh and detach, so that I could close the ssh connection while the scripts kept running. I also wanted to work with multiple source files in the vi editor without having to quit vi and reopen all the files after each compilation and execution. This is where tmux comes in handy, with the powers of screen plus multiple windows and panes.

tmux is extremely convenient once you play around with the ~/.tmux.conf file.

First of all, the default prefix for tmux is Ctrl+b. The ‘B’ key is a bit far from both Control keys, so I wanted to switch to Ctrl+a. Add the following lines to ~/.tmux.conf:

# remap prefix to Control + a
set -g prefix C-a
unbind C-b
bind C-a send-prefix

Next, how to split panes. Instead of % I used
Ctrl+a \ New vertical pane
Ctrl+a - New horizontal pane
which most vi editor lovers would like. I also wanted something more like a browser interface:
Ctrl+n New vertical pane
Ctrl+h New horizontal pane
Ctrl+w Close the current pane

Paste this code into the tmux config file (~/.tmux.conf):

unbind %
bind \ split-window -h
bind - split-window -v
bind-key -n C-n  split-window -h
bind-key -n C-h  split-window -v
bind-key -n C-w  kill-pane
set -g mouse-select-pane on

Also, to switch between panes, I wanted to use Ctrl+Up and Ctrl+Down.

bind-key -n C-Up    select-pane -U
bind-key -n C-Down  select-pane -D

Some of these customizations use the -n option, which binds a key so that it works without the prefix.
To load the modified tmux config file, run once

tmux source-file ~/.tmux.conf

Creating tmux sessions:
Sessions are numbered 0, 1, 2 … by default.
A named session can be started with
tmux new -s name
To detach from the current tmux session, press Ctrl+a d.
You can configure any other key combination for detaching using the -n option.

bind-key -n C-q detach

To resume (attach to) a named session:
tmux attach -t name
July 30, 2012 / Karthikeyan Natarajan

Calling CUDA program from C/C++ project.

To speed up execution in a big project, one might use CUDA functions. CUDA supports only C and a small subset of C++, so it is usually not practical to rewrite the whole project in CUDA. Instead, you can write a host wrapper function in the CUDA file which calls the kernel (device) function. Here is the procedure.

1) Include this line in the file where you need to call the CUDA function:

extern void wrapperfunction(void);

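Instead of void, you can pass and return any values you wish, because it is a plain C-style function. You can call this function from any part of your C/C++ code. Since nvcc compiles .cu files as C++, declare the wrapper extern "C" in the CUDA file if you call it from a C (not C++) source file.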

2) create another file named and include these lines.

#include <stdio.h>

__global__ void kernel(int *a, int *b, int *c) {
    int tid = threadIdx.x;
    // your code for kernel
}

void wrapperfunction(void) {
    int *a, *b, *c;  // device pointers, to be allocated in the initialization code
    // your code for initialization, copying data to device memory
    kernel<<<32, 32>>>(a, b, c);  // kernel call
    // your code for copying back the result to host memory & return
}

3) Now, compile this file using following command.

nvcc -c

4) Now, link the object file created above when linking your C/C++ project:

g++ -o program main.cpp wrapper.o -L/usr/local/cuda/lib64 -lcuda -lcudart

NOTE: You don’t need a PC with a GPU to compile this program; an installed CUDA environment is enough. To execute it, you need a PC with a GPU. Also, if you are using a 64-bit machine, don’t forget to link against the 64-bit library (lib64)!
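
Steps 3 and 4 could be wrapped in a small shell helper such as the sketch below. The function name build_program and the CUDA source file name wrapper.cu are assumptions made for illustration (only wrapper.o and main.cpp appear in the link command above), and the library path is the one used in this post.

```shell
#!/usr/bin/env bash
# Hypothetical file name for the CUDA source; adjust to your project.
CUDA_SRC="wrapper.cu"
CUDA_LIB_DIR="/usr/local/cuda/lib64"

build_program() {
    # Compiling needs the CUDA toolkit (nvcc), but not a GPU.
    if ! command -v nvcc >/dev/null 2>&1; then
        echo "nvcc not found: install the CUDA environment first" >&2
        return 1
    fi
    if [ ! -f "$CUDA_SRC" ]; then
        echo "missing CUDA source file: $CUDA_SRC" >&2
        return 1
    fi
    # Step 3: compile the CUDA file to an object file.
    # Step 4: link it into the C/C++ program.
    nvcc -c "$CUDA_SRC" -o wrapper.o &&
    g++ -o program main.cpp wrapper.o -L"$CUDA_LIB_DIR" -lcuda -lcudart
}
```

Calling build_program from the project folder runs both steps and stops with an error message if nvcc or the CUDA source file is missing.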


If you want your executable to run on PCs both with and without a GPU, follow this procedure. Create another host C function, configureCudaDevice(), in the CUDA file (with its prototype in the header) that queries and configures the GPU devices present and returns true if it succeeds, false if not. If you use many wrapper functions, you can create a header file which contains extern declarations of all of them.

Include this in “mycudaImplementations.h”:

extern void runCudaImplementation(void); // wrapper function
extern bool configureCudaDevice(); // own function to check whether a CUDA-enabled device is present

Include this code in the main C/C++ implementation:

// at app initialization
// store this variable somewhere you can access it later
bool deviceConfigured = configureCudaDevice();
// then later, at run time
if (deviceConfigured)
    runCudaImplementation(); // run wrapper function for CUDA code
else
    runCpuImplementation();  // run the original code

Here is my make file:

all: program
program: cudacode.o
	g++ -o program main.cpp wrapper.o -L/usr/local/cuda/lib64 -lcuda -lcudart
	nvcc -c
clean:
	rm -rf *.o program

July 30, 2012 / Karthikeyan Natarajan

Simple CUDA program

I have been learning CUDA for a while. Here I will describe how to create, compile, and run a simple CUDA program.

I assume that you have set up your system with the CUDA environment, a GPU, and its drivers. (I will describe those steps in another post.)

Here is the program> save as

#include <stdio.h>
__global__ void kernel( void ) {
        int a=threadIdx.x;
        int b=blockIdx.x;
}
int main( void ){
        kernel<<<2,5>>>();   // 2 blocks of 5 threads each
        printf("Hello, World!\n");
        return 0;
}

This is a very simple program which creates 2×5=10 threads on the GPU. Each thread runs the same code but with different indices (threadIdx.x and blockIdx.x).

There are some basic rules for CUDA kernels.

1) A CUDA kernel function does not return any value. Its return type must be void.
2) The memories of the GPU and the CPU are physically separate. Values can be transferred between them using the memory copy functions provided by CUDA.
3) Kernel threads are organized in two levels: blocks and grids. A grid consists of a number of blocks (the grid dimension). A block consists of a number of threads (the block dimension). So each thread in the kernel has a unique combination of threadIdx and blockIdx. The grid dimension can itself be 2D (x and y), and 3D on newer GPUs. The block dimension can be 3D (x, y, z).

This kernel function merely shows the existence of the unique variables threadIdx.x and blockIdx.x (since I used 1D for both the grid and block dimensions).
Here the grid dimension is 2, i.e. number of blocks = 2; the block dimension is 5, i.e. number of threads per block = 5.
Total 2 x 5 = 10 threads.

__global__ tells the compiler that kernel() is a kernel: a function that runs on the GPU (device) and is launched from CPU (host) code.

Compile: nvcc

Execute: ./a.out

This will invoke the simple kernel function on your GPU.

To learn CUDA, you can start with the book “CUDA by Example”. For more detailed study, you can look into the book “Programming Massively Parallel Processors”.

July 22, 2012 / Karthikeyan Natarajan

Problem Solved: Windows title bar does not appear in ubuntu

I was using Xfce on Ubuntu 12.04 because it gave me better battery life on my laptop. Sometimes I ran into this problem: the window title bar would not be visible for any window. No matter how many times I logged out and back in or restarted, the problem remained.

There is a quick fix for this problem. Open a terminal and type


and press Enter. The title bars of all windows will appear again and behave normally from then on.

Share the story if it helped you.