AoC: Day 03

The Problem

We’re still in the relatively easy phase. In this problem, we’re given a map with trees marked in, and on a straight-line path, we have to count how many trees we encounter. The only thing to account for is that the map repeats horizontally, we have to replicate it as many times as needed till we reach the bottom. We do the same in part 2, for different lines.

My Solution

I thought this was pretty straightforward; it took me about ten minutes for each part. However, when I looked at other solutions…

Algorithm

Not really an algorithm, but I was kicking myself about this. I physically (is that the right term?) replicated the map as many times as needed (even did a calculation to figure this out upfront). Turns out all I needed to do is use the mod operator!

Implementation

  • The last input in part 2 required going down two rows at a time. I wrapped myself up in knots doing this, but all that was needed was a [::2]
  • Lots of opportunities for (list) comprehensions, I need to start recognizing these.
  • Check this out: trees = sum(forest[i * sr][(i * sc) % cols] == '#' for i in range(rows // sr))
Follow by Email
Facebook
Twitter
LinkedIn

AoC: Day 02

The Problem

This was more of a string matching exercise followed by a bit of arithmetic on the extracted data. Part 2 involves a bit of logic.

My Solution

The first thing that came to mind was regular expressions with named fields. Its been a while, so I had to go back to the (excellent) docs, but this was pretty straightforward.

Turns out this was overkill (see below)! Also, I found out not one but two improvements in my if statements!

Other Approaches

There isn’t much to explore in terms of algorithm, since the task isn’t very complicated. So mostly looking at what I could learn from doing differently the

Implementation

  • My traditional C based thinking translated into python: if(num >= min_inst and num <= max_inst): A more pythonic way of doing the same: if(min_inst <= num <= max_inst):
  • I blindly translated the requirements of part 2 into two if statements, since we needed to satisfy either of two conditions, but not both …but that’s just what an xor does!
  • Another approach that I liked for part 2 was along the lines of if(condition_1 != condition_2):
Follow by Email
Facebook
Twitter
LinkedIn

AoC: Day 01

The Problem

An extremely simple problem to kick things off, and one that shows up on many beginners programming tests: find two numbers from a list that add up to a specified number. Part 2 asks for three numbers that add up to the specified number. The list is not sorted.

My Solution

My solution was pretty obvious: sort the list, and do a binary search. However, turns out that NumPy has a pretty nice function called searchsorted. We give it the sorted list and a list of keys to search for, and it returns a list of indices, each corresponding to where the key would be inserted while maintaining the order. So pretty cool, I learnt about a new function. And my overall runtime is $O(n\lg{n})$

For Part 2, the natural approach is to repeat part 1 for every element of the list, while suitably modifying the desired sum.

Other Approaches

Now the fun part! Let’s order these into three groups

Algorithm

  • The straightforward – a doubly nested loop,  $O(n^2)$
  • Many binary searches; see the next section for variations in the implementation
  • The one I liked best, and one that I’m kicking myself for not thinking about. Set a lo and hi index and add the corresponding values. If the sum is greater than the desired value, decrement hi, if less, increment lo, if equal, we’re done. The proof of why this is correct is left to the reader. Note that this only speeds up the search time to linear from  $O(n\lg{n})$, the sorting time is not changed. The asymptotic time for both approaches is the same, but this is more elegant!
  • Generating functions (see below)

Implementation

  • The straightforward approach can be expressed as a list comprehension
  • Use recursion to make it generic: if we want  $n$ numbers that add up to the desired value, express this in terms of $n-1$. And of course, don’t forget to terminate the recursion
  • Read the list of numbers into a set. Offload the search for the second number to the set. This can be done using a list as well, but the search time will be linear instead of logarithmic. Another approach used two dicts, for each input number and its complement. I think this was overkill, though I liked the approach
  • itertools.combinations I need to get a better handle on the itertools library. Turns out we can simply create all 2-combinations and a loop over these is all that is required. I haven’t looked up the time complexity, but this is likely  $O(n^2)$, so there is that…
  • itertools.product also can be used for generating the cartesian products of the input

Yowza!!

Express the input list as a polynomial. The values of the list are coefficients and the powers of the variable. Then in the square of this polynomial, the coefficient of the target numbered term is the product that we’re looking for since powers are added and coefficients are multiplied. Knuth would approve!! And of course, we can do the polynomial multiplication using FFTs.

Follow by Email
Facebook
Twitter
LinkedIn

Advent of Code

I recommend Advent of Code to everyone I meet. It is a wonderful way to develop your programming skills, since you’re completely on your own (other sites babysit you quite a bit, methinks). However, while doing the implementation is a great exercise, it is much more useful (especially if learning a new language) to look at what others have done (after solving it ourselves!).

I did about half of AoC in 2019, and figured I’d work on the 2020 edition when I had the time. Well, here we go!

Follow by Email
Facebook
Twitter
LinkedIn

Fractals

It is surprisingly easy to generate a fractal image. A schematic of the Mandelbrot Set is as follows. Start with the equation $z_{n+1} = z_n^2 + c$, where $c$ is a complex number. Your “image” is a 2d array whose rows and columns correspond to the real and imaginary parts of $c$. Set $z_0 = 0$ and repeatedly apply the equation. If the value converges in a set number of iterations, set the corresponding entry in your array to 1, else 0.

Input

Array/figure size in pixels

Output

Bitmapped image, running time

Variants

Color the pixel by the number of iterations it took to converge. Try variants of the equation (see Julia sets). And note that the computation of each pixel is independent of every other: we can parallelize this! We visit this in another post. There are many ways to optimize the sequential implementation — try these out!

Report

Show us what you learnt!

Follow by Email
Facebook
Twitter
LinkedIn

Searching

Input

Array size, data type, search algorithm, key. The key can specified as an index or as a non-existent value

Output

index, if key is in the array, -1 otherwise. Running time

Program

  1. Generate the array of the given data type and size. Populate with random numbers in sorted order (duh!)
  2. Implement the usual suspects:
    1. Linear search
    2. Binary search
    3. Exponential search
  3. Implement different optimizations for each of these
  4. Test that the search is correct

Run 

your implementation. There can be many variations in here, based on where the key is. Compare the performance of different searches for keys at the beginning, middle and end of the array. What do you think is the best-, average- and worst-case situation for each search? Do the performance results match your intuition? There are different ways of analyzing the performance, such as time v. array size for different key locations, or time v. key location for different array sizes, or …. What do you choose?

Report

your findings! Can you correlate these to the theoretical analysis and the architecture of your platform?

Follow by Email
Facebook
Twitter
LinkedIn

Sorting

This is so fundamental that you’re probably wondering why I even bother to put it here. Read on…

Input

Array size, data type, algorithm, sort order

Output

Running time of different parts of the code

Program

  1. Create an array of the given data type and size, and populate it with random values
  2. Implement multiple sorting routines:
    1. InsertionSort
    2. MergeSort
    3. QuickSort
    4. ShellSort
    5. CountingSort
  3. Implement different optimizations for these
  4. Test that the final array is sorted

Run

your implementation with different array sizes. Your program should report the time to generate the array, sort it and test that it has been sorted correctly.

Write a wrapper script that captures the above running times and generates a plot that shows the comparative performance of different sorting routines: the x-axis is the array size and the y-axis is the time. You should have multiple versions of the above sorts with different implementations.

What do you expect the graph to look like? Can you explain the shape of what you actually get? 

Should you combine different sorting approaches? What would you expect, and how would you do this?

What factors impact the performance that you observe?

Report

Compile your experiments and describe your findings. What conclusions do you draw from this? What do you hypothesize, and how did you test your hypotheses?

 

Follow by Email
Facebook
Twitter
LinkedIn

Command-Line and Options

Your programs should be parameterized and should take inputs at the command line (not prompt the user for inputs!)

The command line can be used to control the behavior of the program:

  • Specify the sort order
  • Specify the algorithm
  • Provide debug/progress information
  • Specify output formats and destination

To specify inputs:

  • From a file v. generate internally
  • Provide the search key if searching

This helps when you want to benchmark or generate performance data – easy to put into a script and push the button.

Figure out once how to process command-line options, and you can just use this for all your programs. Most languages have existing libraries or packages that do pretty sophisticated parsing and fancy stuff like how you specify flags and options. Experiment with these!

Follow by Email
Facebook
Twitter
LinkedIn

Focus on Performance

Colleges don’t talk about program performance; I find this strange since they focus so much on student performance!

Develop the habit of instrumenting your code by default — figure out the correct timer to use for your platform, and measure, measure, measure. Understand the interplay between the code you write, what the compiler does to it, and how it runs on the CPU. 

Investigate the tools available on your platform, and experiment with these. You will learn a lot from this than you can ever imagine!

Follow by Email
Facebook
Twitter
LinkedIn

Debug & Test

Learn how to debug your code. This is not just about the tools (research and get familiar with the appropriate tools for your language and platform) but also about writing code that is easy to debug and your mental model for how you approach debugging. 

Think about how you can test the code that you’ve written — both at the micro-level and for the entire application. Get into the habit of writing your own test cases, or rather, generating your test cases. Can you be comprehensive? (why not?) What are the corner cases? Each time you make a change / add a feature / fix a bug, ensure that your code passes all the tests you have defined. 

Learn how to talk about your implementation: the choices you made for the data structures and algorithms and why you chose this instead of that. And remember:

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

Brian Kernighan

Follow by Email
Facebook
Twitter
LinkedIn

Version Control

git is the flavor of the day; so learn how to git. Force yourself to use the command line, this will help you understand the philosophy of version control in general, and git in particular.

Create a github account and start using it. Use it to showcase your improving maturity as a programmer and developer, and use it to showcase your projects. 

Pro Git, Chacon and Straub is a fantastic tutorial and reference, and is available online

Follow by Email
Facebook
Twitter
LinkedIn

Editors

Learn how to use a good editor.

I’m not going to get into a religious war here, so will not make any recommendations. However, once you’re touch-typing, you should be making changes to your code in terms of your thoughts, not in terms of your editor’s commands. Choose an editor that makes sense, and choose one that is built for editing code. You will see your efficiency sky-rocket. 

Follow by Email
Facebook
Twitter
LinkedIn

Touch Typing

The one, simple thing you can do to increase your productivity more than 10-fold is to learn how to touch type. 

This isn’t just about improving your performance at the keyboard. It has to do with removing distractions and obstacles when coding. It’s about becoming one with the machine, translating the thoughts in your head into code, without having to hunt around for the right key and pressing it one at a time. Your brain works much faster than you can express your thoughts, and anything you can do to bridge that gap improves your performance. 

And this has a multiplier effect as you learn how to use a good editor.

 

Follow by Email
Facebook
Twitter
LinkedIn

The Developers Bookshelf

Algorithms & Data Structures

I’ve found multiple perspectives deepens my understanding of topics. The following books cover roughly the same ground, but have very different approaches, and I’ve gotten different insights from each of them

  • Introduction to Algorithms, Cormen, Leiserson, Rivest and Stein. The basics, written clearly and yet with rigor, this book should be the one to start with on most topics. Read it for self-study or as a reference.
  • Algorithm Design, Kleinberg and Tardos. A much more detailed look than CLRS, and the systematic exposition is a joy to read. There are a crazy number of exercises as well!
  • Algorithms, Dasgupta, Papadimitriou and Vazirani. Much less detailed than the first two, but a very different approach to many topics. 
  • Algorithms, 4th Ed., Sedgewick and Wayne. Many fewer topics covered, but in a lot more depth. Especially see how to analyze and test implementations. Great exercises and see the associated booksite as well.

And of course, no list of Algorithms books can be complete without

  • The Art of Computer Programming, vols. 1-n, Knuth. This is advanced, and not for the faint of heart. Also, the best categorization of sorting algorithms that I’ve seen. And demonstrates the limits of analysis (by which I mean it demonstrates how much analysis you can do!). 

Programming Praxis

More than syntax, programming is about the mindset. Here are a few books that help in this area:

  • Programming Pearls, Bentley. Yes, it’s old. Yes, it’s worth every bit.
  • Beautiful Code, Oram & Wilson. Bringing art into engineering
  • The Pragmatic Programmer: From Journeyman to Master, Hunt & Thomas

Others

  • Hackers and Painters: Big Ideas from the Computer Age, Paul Graham
  • Turing’s Cathedral: The Origins of the Digital Universe, George Dyson. A bit of history. 
  • Anything and everything by Simon Singh
  • “Surely You’re Joking, Mr. Feynman”, Richard Feynman

Language-Specific Books

C++

  • The C++ Programming Language, Stroustrup
  • Accelerated C++, Koenig & Moo
  • Modern C++ Design, Alexandrescu

Python

There are too many (very) good options, but I’m going to stick with just one:

  • Python Data Science Handbook, Jake VanderPlas

Design

  •  Design Patterns: Elements of Reusable Object-Oriented Software, Gamma, Helm, Johnson, Vlissides
  • Head First Design Patterns, Freeman & Robson

Fiction

  • The Hitchhiker’s Guide to the Galaxy, Douglas Adams. Everyone should know the answer to the ultimate question
  • The Discworld novels, Terry Pratchett
Follow by Email
Facebook
Twitter
LinkedIn

Programming for the Programmer

Motivation

Students aspiring to program come in all varieties. Some are good, some are not so good and this categorization is strictly within the same pool.

The frustrating part (for me) is that the good programmers have a long way to go to become competent developers. And the not-so-good programmers are not so good for no fault of their own. I find that the assignments that they do, either in college or elsewhere are unambitious and are too focused on specific concepts.

Many students seem to think sites such as leetcode are all that is required. I beg to differ – these are fine for preparing for interviews, but not for  becoming good developers.

Here’s a series of assignments that will take you beyond your current capabilities will teach you more about algorithms and data structures than you currently know, help you learn how to use the right tools, techniques, and skills, and get you started on building your portfolio of work.

The first group doesn’t even ask you to write any code:

The next group is focused on the basics:

  • Sort
  • Search
  • Fractals
  • Command line / options

And a few complicated ones:

  • Shell
  • Bellman-Ford
  • Matrix multiplication
  • Random walks

And the skills needed for tomorrow today:

  • Shared-memory multiprocessing
  • Distributed-memory multiprocessing
  • Map-reduce
Follow by Email
Facebook
Twitter
LinkedIn

Quantization

What you will learn:

  • Software: practice with loops and operations on arrays
  • Domain: approximations and error measurements, sampling

Input: Construct a (continuous) signal of your choice – say a simple sine
wave with a particular frequency and amplitude. Plot this. Note that since you
are representing the continous signal on a computer, so you are already
sampling it.

Discretization in space:

  • For the amplitude, choose a $n$ and divide the range into $2^n$ steps. For each index of your signal, (i) round up (ii) round down and (iii) round to the nearest step value.
  • Calculate the mean squared error due to this process.
  • Repeat for different values of $n$
  • Plot the error for different values of $n$. Does this match what you think it should look like?

Discretization in time:

  • What happens if you take fewer / more samples in time for representing your signal? How would you now determine the quality of your representation?
  • What is the lowest you can go? How does this relate to the frequency of the signal? (*cough* nyquist *cough*)

Discretization in both dimensions

  • Try it out!
Follow by Email
Facebook
Twitter
LinkedIn

Convolution

What you will learn

  • Software: relatively simple constructs, but a great exercise
    in constructing iterations. Visualization.
  • Domain: develop an intuition for how convolution works.
    And its time complexity

Input: construct the following signals of roughly the same height and width. Consider if each of these are odd or even signals.

  • Single rectangular pulse centered at zero
  • Single triangle centered at zero
  • One period of a square wave
  • Single sawtooth
  • Make your own!

Convolve each pair of inputs

  • Decide what the range of the summation should be
    ($-\infty$ to $\infty$?). i.e., how many steps do you need?
  • Start with a single step and plot the result. Repeat for each step. Try animating it!
  • How does the type of signal affect the output?
Follow by Email
Facebook
Twitter
LinkedIn

Modulation

What you will learn

  • Software: fairly simple constructs
  • Domain: Modulation, FSK, tradeoffs

Amplitude Modulation

  • Construct a carrier (high frequency) and a signal, represented as a list of values
  • Simply add the two
  • Visualize all the three signals
  • Take the modulated signal and subtract out the carrier to
    recover the original signal

Frequency Modulation – 1

  • Let’s start with a binary data signal.
  • Based on the frequency of your carrier, determine
    an appropriate $ \Delta_f $
  • Modify the carrier frequency for a range of cycles based on
    the input signal
  • Plot all three signals
  • Recover the input signal

Frequency Modulation – 2

  • Rather than a $f_c \pm \Delta_f$ for a $1$ and $0$, use a return-to-zero scheme. So the frequency changes only if there is a change
    from zero to one or vice-versa.
  • Plot all three signals
  • Recover the input signal

Frequency Modulation – 3

  • You could group your input data by two bits (so four levels) and use correspondingly more $\Delta_f$
  • Plot all three signals
  • Recover the input signal
Follow by Email
Facebook
Twitter
LinkedIn

The Rich Man’s Square Wave Generator

What you will learn

  • Software: python, matplotlib
  • Domain: a different way of looking at Fourier series

Do this in python, mainly for the visualization capabilities

Rich man’s square wave generator

  • Create a sine wave of a given frequency and amplitude. Plot it for one period
  • Create a second sine wave of twice the frequency and half the amplitude as the first. Add the two cycles to the one from the previous step.
  • Continue with a third, fourth, … nth sine wave of successively higher frequencies and smaller amplitudes.
  • What waveform do we get as we add higher frequency components?

Analysis

  • Plot the error between your square wave and a true square wave at each step. Can you bring this down to zero? what is the limit? Can you prove this limit?

Other combinations

  • Take odd multiples of frequency and divide the amplitudes by the multiple squared
  • Take odd multiples of frequency and divide the amplitudes by the multiple squared
Follow by Email
Facebook
Twitter
LinkedIn

The Poor Man’s Square Wave Generator

What you will learn

  • Software: encapsulating functionality, parameterizing functions
  • Domain: frequency and period of a square wave

The poor man’s square wave generator

  • Decide a frequency and amplitude of your desired square wave
  • Create a list with low and high values (these correspond to the amplitude) and specific indices (these correspond to the frequency)
  • Plot this!
  • Create additional signals of multiples of this frequency, and different amplitudes
  • Repeat using numpy arrays. How might you efficiently create these signals, using features of the numpy library?
Follow by Email
Facebook
Twitter
LinkedIn
Search

  • Search

  • Categories

  • Post Date