Cut down video size while maintaining quality

I believe the best way to do this is with ffmpeg (builds are available for Windows and for other platforms).

Copy your input video (ex: in.mp4) to the ffmpeg bin folder. Open the command line, navigate to the same bin folder, and enter this command. Here out.mp4 is the output video.

ffmpeg -i in.mp4 -crf 20 out.mp4

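It works extremely well for most videos. With the x264 encoder, the -crf value sets the quality/size trade-off: lower values give higher quality and larger files. This saved me when my videos were over the size limit for research submissions.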

Using the cross product to determine the orientation of edges and points in 2D

The cross product is an extremely valuable tool for geometric calculations. One of its many uses is to determine whether a point is to the left or to the right of an edge (also referred to as counter clockwise or clockwise).

[Figure: cross product diagram with edge OA, point B to its left and point C to its right]


Let’s consider the edge OA, and the point B as the edge and the point we would like to determine the orientation of.

To start, let’s connect B to O, to form OB, and then take the cross product OA with respect to OB, i.e. OA x OB. This will always be positive if B is on the left (or CCW) with respect to OA.


Now, if we use the right-hand corkscrew rule (which can be used to determine the direction of the vector in the Z dimension) and curl our fingers in the direction from OA to OB, i.e. the direction in which OA would be rotated around O onto OB, we see that our thumb points upwards. Based on the convention that upwards is positive, you can easily see why the cross product works. If we do the same from OA to OC, you will see that we have to twist our hand and the thumb points downwards. Thus, if the point is to the right of OA (ex: C), the cross product will always be negative.
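To make the left/right test concrete, here is a minimal C++ sketch. The names Point2D, cross_z and orientation are made up for this example, and it assumes a coordinate system where y points up (in image coordinates, where y points down, the signs flip).

```cpp
// Hypothetical 2D point type used only for this sketch.
struct Point2D {
    double x;
    double y;
};

// Z component of the cross product (a - o) x (b - o), i.e. OA x OB.
double cross_z(const Point2D& o, const Point2D& a, const Point2D& b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Returns +1 if b is to the left (CCW) of edge o->a,
// -1 if it is to the right (CW), and 0 if the three points are collinear.
int orientation(const Point2D& o, const Point2D& a, const Point2D& b) {
    const double c = cross_z(o, a, b);
    return (c > 0.0) - (c < 0.0);
}
```

With O at the origin and A at (1, 0), a point at (0, 1) gives +1 (left) and a point at (0, -1) gives -1 (right).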

One use of this is when you’re constructing a spatial graph (such as a road network) and want to order the edges around a vertex (or point) in CW or CCW order. Take the cross product of a chosen reference edge with each of the other edges: the sign tells you on which side of the reference edge each one lies. Be careful with the magnitude, though. It equals |OA||OB| sin(theta), so it depends on the edge lengths and peaks at 90 degrees rather than 180 degrees, which means the raw value alone does not give a full angular order. For a reliable CCW sort, pair the cross product with the dot product (for example through atan2(cross, dot)), or use the sign of the cross product between two edges as the sort comparator within each half-plane. Since both products are very cheap to compute, this stays fast enough for interactive applications.
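Here is a rough C++ sketch of sorting edges CCW around a vertex. Note that it does not sort on the raw cross product value: it combines the cross product with the dot product through atan2, because the cross product alone is proportional to sin(theta) and cannot tell 30 degrees from 150 degrees. The names Vec2, ccw_angle and sort_edges_ccw are made up for this example, and each edge is assumed to be given as a direction vector starting at the shared vertex, in a y-up coordinate system.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical edge direction vector, starting at the shared vertex.
struct Vec2 {
    double x;
    double y;
};

// CCW angle from ref to e, mapped into [0, 2*pi).
double ccw_angle(const Vec2& ref, const Vec2& e) {
    const double cross = ref.x * e.y - ref.y * e.x;  // proportional to sin(theta)
    const double dot = ref.x * e.x + ref.y * e.y;    // proportional to cos(theta)
    const double two_pi = 2.0 * std::acos(-1.0);
    double angle = std::atan2(cross, dot);           // in [-pi, pi]
    if (angle < 0.0) {
        angle += two_pi;
    }
    return angle;
}

// Sorts edges in place by increasing CCW angle from the reference edge.
void sort_edges_ccw(const Vec2& ref, std::vector<Vec2>& edges) {
    std::sort(edges.begin(), edges.end(),
              [&ref](const Vec2& a, const Vec2& b) {
                  return ccw_angle(ref, a) < ccw_angle(ref, b);
              });
}
```

atan2 is not scale sensitive, so the edges do not need to be normalized first. If the trigonometric call turns out to be too slow, a comparator built from the signs of the cross and dot products is a common alternative.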

Debugging CUDA – Tips and Tricks

CUDA is fast but painful to debug. It’s similar to working with OpenGL, which gives brilliant results when it works, but leaves you with no idea what’s going on when it doesn’t. I’m listing a number of ways that you can use to track down issues in your CUDA algorithms. Hopefully, it will ease the pain that I had to go through.

  1. Install Nsight and use CUDA Debugging

This step seems rather obvious, and Nsight gets installed when you install CUDA. But, surprisingly, it’s not obvious to a beginner how to use it and why you should. If you are using Visual Studio and are having problems with your CUDA algorithm, follow these steps to start debugging. Make sure the project is built in “Debug” mode. After building it (don’t run it), open the Nsight menu and click CUDA Debugging. Now you should be able to place breakpoints within your CUDA kernels and have them hit. Also, look at the Nsight messages in the output window, and watch out for error codes.

  2. CUDA Memory checking

Always check for memory access violations. Click on the Nsight menu, make sure “Enable CUDA Memory checker” is checked, and follow the steps under point 1 to debug your application. If there are memory access violations, stop right there! This is the first thing you should correct. Even if your algorithm runs and you are getting some results, there can be plenty of subtle bugs lying around when memory access violations happen. A common cause is threads accessing your arrays outside their bounds, which happens when the launch creates more threads than there are elements. So you need to stop a thread from proceeding if its index is out of range, by including a return statement after an index range check like below:

int x_index = blockDim.x * blockIdx.x + threadIdx.x;
int y_index = blockDim.y * blockIdx.y + threadIdx.y;
// Threads that fall outside the data must exit before touching memory.
if ((x_index >= cols)
	|| (y_index >= rows)) {
	return;
}

  3. Understand Nsight debugging output

Make yourself familiar with the CUDA runtime error codes. Nsight will sometimes give output with an error such as “Program hit error 9 on execution”. What you have to do is look up this error code in the CUDA runtime API documentation for the version you are using. Aha! Now we know what error 9 (cudaErrorInvalidConfiguration) means. It says “This indicates that a kernel launch is requesting resources that can never be satisfied by the current device. Requesting more shared memory per block than the device supports will trigger this error, as will requesting too many threads or blocks. See cudaDeviceProp for more device limitations.” We probably asked the kernel to use 100000 threads per block or something to that effect, which is beyond the number of threads the device can run per block. Now we know that we need to check the launch values we are passing and adjust them.

  4. Time your functions

This is something that I found extremely helpful. Here’s a simple C++ snippet I use:

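#include <chrono>
#include <iostream>
typedef std::chrono::high_resolution_clock Clock;
typedef std::chrono::milliseconds milliseconds;
// Kernel launches are asynchronous: CUDA_segment must synchronize (e.g. cudaDeviceSynchronize) before returning.
Clock::time_point t0 = Clock::now();
CUDA_segment(pre_segmentation_img, post_segmentation_img, vis_img);
Clock::time_point t1 = Clock::now();
milliseconds ms = std::chrono::duration_cast<milliseconds>(t1 - t0);
std::cout << "Time taken for segmentation: " << ms.count() << "ms\n";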

In addition to telling your execution time, which probably matters to you since you are trying to use CUDA, it also tells you if your CUDA execution failed. If you are getting a run time like 1ms for something that would usually take about 500ms, you need to hold your enthusiasm. Your algorithm didn’t suddenly become super fast. Your CUDA code probably ran into an error, and exited.
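If you time several functions, the snippet above can be wrapped into a small helper. This is only a sketch: time_ms and suspiciously_fast are names made up for this example, and the 10% threshold is an arbitrary assumption rather than a rule. When timing CUDA work, the callable you pass in should end with a synchronizing call such as cudaDeviceSynchronize(), since kernel launches return before the kernel has finished.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
template <typename Func>
double time_ms(Func&& work) {
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Flags a run that finished far faster than expected, which often means the
// kernel failed and exited early. The default fraction is an arbitrary choice.
bool suspiciously_fast(double elapsed_ms, double expected_ms, double fraction = 0.1) {
    return elapsed_ms < expected_ms * fraction;
}
```

For the example above, a run of 1ms against an expected 500ms would be flagged, while a run of 450ms would not.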

  5. Use a single thread and a single block and check sequential execution logic

If there is a problem with your algorithm and you need to understand why it’s failing, try simplifying your kernel execution to a single thread. This allows you to forget the complexity of parallel execution and debug it like a single-threaded application. Just use number of blocks = 1 and threads per block = 1. Also, make any additional modifications to your kernel code so that it follows the same path every time you debug, i.e. if you’re processing an image, make sure it operates on the same sequence of pixels by hard coding the x and y indices (x_index = 200, y_index = 200).

convert_2_closest_color <<<1, 1>>> (cuda_img, valid_colors_);

  6. Fast debugging – Use printf

After following step 3, I prefer to use a lot of printfs for debugging. This allows me to execute the code in “Release” mode, and see what exactly is going wrong at a fast execution speed.

NOTE: Make sure you disable all printfs through a macro when you want to use this code in production.
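One possible way to set up such a macro is sketched below. DEBUG_PRINTF and ENABLE_DEBUG_PRINTF are names made up for this example, and clamp_to_byte is just a stand-in function to show the call site. The same macro can be used inside a kernel, since CUDA supports printf in device code.

```cpp
#include <stdio.h>

// Define ENABLE_DEBUG_PRINTF (for example with -DENABLE_DEBUG_PRINTF) in
// debugging builds only. Without it, every DEBUG_PRINTF call compiles to
// nothing and its arguments are never evaluated.
#ifdef ENABLE_DEBUG_PRINTF
#define DEBUG_PRINTF(...) printf(__VA_ARGS__)
#else
#define DEBUG_PRINTF(...) ((void)0)
#endif

// Stand-in function showing how the macro is used at a call site.
int clamp_to_byte(int value) {
    DEBUG_PRINTF("clamp_to_byte: value = %d\n", value);
    if (value < 0) {
        return 0;
    }
    if (value > 255) {
        return 255;
    }
    return value;
}
```

Because the disabled form discards its arguments, avoid putting expressions with side effects inside a DEBUG_PRINTF call.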

  7. Write back your output to files and check your output

Even with debugging, the data structures you use are hard to check because of the massive parallelism that’s inherent in CUDA. Try to write out the results of the intermediate steps of your algorithm by doing a cudaMemcpy from device to host. I usually write the data into CSV files or image files and check the output for any issues that I can see. If you can visualize the data, you will notice a lot of issues that result from errors in your code.

I hope this helped to ease some of the pain of programming CUDA. Don’t get me wrong, I love CUDA, and I truly love the execution times it gives for my algorithms. But debugging is quite a process and takes some getting used to 🙂

Don’t use memset for initializing floats or doubles

I’ve been dabbling in a bit of C for some extremely optimized code, and it has served me well. I’ve also learnt some lessons along the way. Here is one I learnt regarding memset.

I had a large array of floats.

int great_array_size = 100000;

float * the_great_array = (float*) malloc (sizeof(float) * great_array_size);

Now, I wanted a fast way of initializing the array to a specific value. So, without thinking too much, I used memset. I didn’t know at the time that memset works byte by byte and is mainly used to initialize strings and other byte buffers.

float great_initial_value = 0.005f;
memset(the_great_array, great_initial_value, great_array_size);

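Instead of fast initialization, what I got was a world of hurt. memset takes its fill value as an int and converts it to an unsigned char, so the float 0.005f is first truncated to the integer 0 and every byte written is 0. On top of that, the size argument is a byte count, so passing great_array_size only touches the first quarter of the array.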

The correct way to initialize a float array is the obvious way.

for (size_t i = 0; i < great_array_size; ++i) {
    the_great_array[i] = great_initial_value;
}

Sigh. It seems easy now that I know what happens. Oh, well.
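If you are in C++ rather than plain C, the same contrast can be sketched with std::memset and std::fill. The function names below are made up for this example, and the memset version is given the full byte count so the whole array is written. The static_cast only makes explicit the conversion that happens implicitly when a float is passed to memset.

```cpp
#include <algorithm>
#include <cstring>
#include <vector>

// What memset really does with a float fill value: the value is converted to
// int (0.005f becomes 0) and then written into every byte.
void fill_with_memset(std::vector<float>& values, float fill_value) {
    std::memset(values.data(), static_cast<int>(fill_value),
                values.size() * sizeof(float));
}

// Element-wise fill: every float receives the requested value.
void fill_with_std_fill(std::vector<float>& values, float fill_value) {
    std::fill(values.begin(), values.end(), fill_value);
}
```

With a fill value of 0.005f, the memset version leaves every element at 0.0f on IEEE 754 platforms, while the std::fill version stores 0.005f in every element.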

The difference between malloc and new

The difference between malloc and new is subtle, but important if you are mixing C and C++. malloc will allocate the memory needed for your object. new will allocate your memory and call your constructor as well, executing any code in it.

The same difference applies to free and delete: free only releases the memory, while delete calls the destructor first and then releases it.

Here’s a code example.

#include <cstdlib>
#include <iostream>

struct MyClass {
    int property = 0;
    MyClass() {
        property = 10;
    }
    ~MyClass() {
        std::cout << "Object destructor called" << std::endl;
    }
};

int main(int argc, char** argv) {
    MyClass *my_class_malloc = (MyClass*) malloc(sizeof(MyClass)); // just allocated memory
    // No constructor ran, so this reads uninitialized memory and prints garbage.
    std::cout << "Property : " << my_class_malloc->property << std::endl;
    MyClass *my_class_new = new MyClass(); // calls constructor and sets to 10
    std::cout << "Property : " << my_class_new->property << std::endl;
    std::cout << "Calling free..." << std::endl;
    free(my_class_malloc); // releases memory only, no destructor call
    std::cout << "Calling delete..." << std::endl;
    delete my_class_new; // calls the destructor, then releases memory
    return 0;
}

List of must have plugins for coding C++ in Visual Studio

In this post I maintain a list of plugins I use with Visual Studio.

  1. Power Tools

If you are used to control+click navigation from Eclipse or IDEA, you will love this set of tools. It has tons of useful tweaks that make Visual Studio much easier to use.

Available at:

  2. VsVim

I’m a big vim fan for ease of navigation, fast editing and movement. If you are too, this will be a life saver for you, as you will never have to use the arrow keys or the mouse for navigation.

Available at:

  3. Refactoring (Only rename)

This feature is almost expected of IDEs and is primarily one of the reasons why we use them for complex projects. Unfortunately, this plugin only has renaming as the supported refactoring option. I really miss the extract method that was available in other IDEs after I switched to Visual C++ for my work. Anyway, it’s better than nothing.

Available at:

4. Image Watch (as mentioned by Chris May)

This is a really cool plugin if you are working with OpenCV. It helps you get rid of the std::cout statements you need to include to see the contents of matrices, and makes working with images a pleasure rather than a pain.

Available at:

Let me know if there are any other plugins that you find useful for coding C++ in Visual Studio.

List of Issues that can arise with 3D Reconstruction using Stereo Cameras


3D reconstruction is one of those tasks that depends on getting many things right. If you keep getting a bad or very weird looking reconstruction, chances are that you have run into one of the issues in this list. I strongly advise using OpenCV when doing 3D reconstruction, as it provides a robust set of functions for each step of the process. The issues in the list are still valid even if you don’t use OpenCV for reconstruction. Without further ado:

1. Calibration Errors

The calibration is the MOST important step. This cannot be said enough. It will serve you well to double or triple check it. If everything goes right, you will usually end up with an error of less than 0.5. Anything larger than 1.0 is usually a sign of things going bad. Make sure to use findChessboardCorners, then refine the detected corners further with cornerSubPix. Any slight calibration problem will blow up and make your reconstruction fall flat on its face.

2. Calibration Pad Orientation


This is one problem that I faced, and I found no solution for it except taking another image. If you use a square checkerboard pattern you may get this issue. I was informed that using a rectangular pad avoids it (as the detector checks for a set number of corners per side). Whenever you calibrate, visualize the detected checkerboard corners and double check that they are ordered the same way in every image. Usually the order runs from top left to bottom right. As seen in the image above, sometimes one checkerboard pattern is detected from top right to bottom left instead. When that happens, take new images.

3. Ordering of calibration points

Look into the 2D points and the 3D points and make sure they correspond in position the way you want them to. I prefer them increasing and decreasing in the same order, i.e. when x of the 2D points increases, x of the 3D points also increases, and when y of the 2D points increases, y of the 3D points increases. If you also prefer it this way, you usually have to flip the Y coordinate of the 2D points. There is no hard and fast rule about this, but make sure the 3D-2D pairs correspond the way you think they correspond.
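The Y flip mentioned above can be sketched in a few lines of C++. Pixel and flip_y are names made up for this example, and the sketch assumes pixel coordinates with the origin at the top-left corner and rows numbered 0 to image_height - 1.

```cpp
#include <vector>

// Hypothetical 2D image point used only for this sketch.
struct Pixel {
    float x;
    float y;
};

// Returns a copy of the points with Y mirrored about the image's horizontal
// centre line, so that Y increases upwards instead of downwards.
std::vector<Pixel> flip_y(const std::vector<Pixel>& points, int image_height) {
    std::vector<Pixel> flipped;
    flipped.reserve(points.size());
    for (const Pixel& p : points) {
        flipped.push_back(Pixel{p.x, static_cast<float>(image_height - 1) - p.y});
    }
    return flipped;
}
```

For a 480-row image, a point on the top row (y = 0) ends up at y = 479 and vice versa, while x is left untouched.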

4. Reprojection Error

After you get your camera matrices, make sure to check the re-projection error. Like the calibration error, this is a good indicator of something going wrong. It usually needs to be below 2.0 to get a decent reconstruction.
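One common way to express the re-projection error as a single number is the root-mean-square distance between the observed 2D points and the re-projected ones. The sketch below assumes you already have both point sets in the same order (for example from OpenCV’s projectPoints); ImagePoint and rms_reprojection_error are names made up for this example, and whether your own pipeline reports RMS or a plain mean is something to check.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2D image point used only for this sketch.
struct ImagePoint {
    double x;
    double y;
};

// RMS distance in pixels between observed and re-projected points.
// Returns -1.0 if the inputs are empty or their sizes differ.
double rms_reprojection_error(const std::vector<ImagePoint>& observed,
                              const std::vector<ImagePoint>& reprojected) {
    if (observed.empty() || observed.size() != reprojected.size()) {
        return -1.0;
    }
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double dx = observed[i].x - reprojected[i].x;
        const double dy = observed[i].y - reprojected[i].y;
        sum_sq += dx * dx + dy * dy;
    }
    return std::sqrt(sum_sq / static_cast<double>(observed.size()));
}
```

Printing this value per image, rather than only for the whole set, makes it easier to spot a single bad calibration image.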

5. Automatic Feature Detection


As cool as automatic feature detection with FLANN-based matching is, there’s a reason that active 3D reconstruction methods are still in business. These automatic detectors rarely give you the points you want with the accuracy you want. If you check them closely, you will notice errors in the range of a few pixels, so tread carefully.

6. Triangulation Error

After you triangulate points from your correspondence points, check the re-projection error again. The re-projected points have to land on your correspondence points. If they don’t, that is again a clear indication that something is amiss. Mapping the texture from these re-projected points is also a great way to find out whether something is wrong. The texture usually gets warped, so even small errors can easily be detected.

7. Image resolution

Another problem that I faced when reconstructing objects was low image resolution. When using an image of 640 x 480 for performance reasons, I noticed that my 3D reconstruction had noticeable errors. After double-checking everything, I realized that my resolution wasn’t fine enough to wipe out perturbations of the 3D points. Switching to a higher resolution such as 1600 x 1200 solved this problem. You can recognize this problem because, other than small but noticeable errors in the reconstruction, the shape and the textures are preserved well. This is a good indication (given that the correspondence is done well) that the image resolution used is not high enough.

Some more things that can go wrong were pointed out to me by Google+ users Kabir Mohammed and Oliver Hamilton. Here they are:

8. The need for a robust rig

I didn’t use a rig. I just used a hand held camera, as this was a temporary exercise. But if you are going to calibrate once and use the rig throughout, you need to make sure the rig stays rigid, and re-calibrate whenever it may have shifted. As pointed out by Kabir, the tiniest bit of warping or bending of the rig will screw up reconstruction.

9. Synchronizing frames between cameras

If you are using multiple images (or video), then you need to make sure that the cameras are synchronized. Otherwise, you may end up pairing images taken at different times! As Oliver pointed out, if you’re moving the cameras around and the frames aren’t captured at the same time, it gives the effect of the cameras not being rigidly linked. Usually USB cameras require an external clock/trigger signal to sync the start of image integration. Firewire cameras usually sync across the bus themselves.

I hope these pointers helped some of you to iron out any issues you had with 3D reconstruction using a stereo rig. Please let me know in the comments section of any other issues you faced when doing 3D reconstruction.