The [deceptive] power of visual explanation
July 22, 2019
Quite recently, I came across Jay Alammar’s rather beautiful blog post, “A Visual Intro to NumPy & Data Representation”.
Before reading this, whenever I had to think about an array:
In : import numpy as np
In : data = np.array([1, 2, 3])
In : data
Out: array([1, 2, 3])
I used to create a mental picture somewhat like this:
       ┌────┬────┬────┐
data = │  1 │  2 │  3 │
       └────┴────┴────┘
But Jay, on the other hand, uses a vertical stack for representing the same array.
At first glance, and owing to the beautiful graphics Jay has created, it makes perfect sense.
Now, if you had only seen this image and I asked you the dimensions of data, what would your answer be?
The mathematician inside you barks: (3, 1)!
But, to my surprise, this wasn’t the answer:
In : data.shape
Out: (3,)
(3,), eh? So what would a (3, 1) array look like?
In : data.reshape((3, 1))
Out:
array([[1],
       [2],
       [3]])
Hmm, this raises the question: what is the difference between an array of shape (R,) and one of shape (R, 1)? A little bit of research landed me at this answer on StackOverflow. Let’s see:
The best way to think about NumPy arrays is that they consist of two parts, a data buffer which is just a block of raw elements, and a view which describes how to interpret the data buffer.
For example, if we create an array of 12 integers:
>>> a = numpy.arange(12)
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
a consists of a data buffer, arranged something like this:
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
and a view which describes how to interpret the data:
>>> a.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a.dtype
dtype('int64')
>>> a.itemsize
8
>>> a.strides
(8,)
>>> a.shape
(12,)
Here the shape (12,) means the array is indexed by a single index which runs from 0 to 11. Conceptually, if we label this single index i, the array a looks like this:
i= 0    1    2    3    4    5    6    7    8    9   10   11
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
If we reshape an array, this doesn’t change the data buffer. Instead, it creates a new view that describes a different way to interpret the data. So after:
>>> b = a.reshape((3, 4))
b has the same data buffer as a, but now it is indexed by two indices which run from 0 to 2 and 0 to 3 respectively. If we label the two indices i and j, the array b looks like this:
i= 0    0    0    0    1    1    1    1    2    2    2    2
j= 0    1    2    3    0    1    2    3    0    1    2    3
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
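The buffer-plus-view idea from the quoted answer can be checked directly. What follows is only a small sketch: the variable names (shares_buffer, a_strides, and so on) are made up for illustration, and strides are divided by the item size so the numbers do not depend on the platform's integer width.

```python
import numpy as np

a = np.arange(12)
b = a.reshape((3, 4))

# Both arrays interpret the same underlying block of memory.
shares_buffer = np.shares_memory(a, b)

# b does not own its data; it is a view whose base is a.
b_is_view_of_a = b.base is a

# Strides measured in elements rather than bytes.
a_strides = tuple(s // a.itemsize for s in a.strides)
b_strides = tuple(s // b.itemsize for s in b.strides)

print(shares_buffer, b_is_view_of_a)  # True True
print(a_strides, b_strides)           # (1,) (4, 1)

# Writing through one view is visible through the other:
# element (i=1, j=2) of b sits at flat offset 1*4 + 2 = 6 in a.
b[1, 2] = 100
print(a[6])                           # 100
```

The strides (4, 1) say the same thing as the i/j labels in the diagram: moving one step along i skips four elements in the buffer, and one step along j skips one.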
So, if we were to actually have a (3, 1) matrix, we would have the exact same stack representation as a (3,) matrix, thus creating the confusion.
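Although the two shapes can be drawn the same way, NumPy treats them differently. Here is a minimal sketch of a few places where the difference shows up; the name column is just an illustrative label for the reshaped array.

```python
import numpy as np

data = np.array([1, 2, 3])        # shape (3,): one index
column = data.reshape((3, 1))     # shape (3, 1): two indices

print(data.ndim, column.ndim)     # 1 2

# Indexing with a single index gives a scalar in one case
# and a length-1 array in the other.
print(data[0])                    # 1
print(column[0])                  # [1]

# Transposing a 1-D array does nothing; transposing the column gives a row.
print(data.T.shape)               # (3,)
print(column.T.shape)             # (1, 3)

# Broadcasting a (3,) array against a (3, 1) array yields a (3, 3) result.
summed = data + column
print(summed.shape)               # (3, 3)
```

So a drawing cannot tell the two apart, but data.shape and data.ndim always can.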
So, what about the horizontal representation?
An argument can be made that the horizontal representation can be misinterpreted as a (1, 3) matrix, but our brains are so accustomed to seeing it as a 1-D array that this is almost never the case (at least for folks who have worked with Python before).
Of course, it all makes perfect sense now, but it did take me a while to figure out what exactly was going on under the hood here.
Visual Explanation of Fourier Series - Decomposition of a square wave into a sum of infinite sinusoids. From this answer on math.stackexchange.com
I also realized that while it is hugely helpful to visualize something when learning about it, one should always take the visual representation with a grain of salt. As we can see, such representations are not always entirely accurate.
For now, I’m sticking to my prior way of picturing a 1-D array as a horizontal list to avoid the confusion. I shall update the blog if I find anything otherwise.
My point is not that Jay’s drawings are flawed, but that we are susceptible to visual deception. In this case, it was relatively easy to figure out because it was code, which forces one to pay attention to every detail, however minor.
After all, the human brain, prone to so many biases and taking shortcuts for nearly every decision we make (thus leaving room for sanity), isn’t anywhere near as perfect as it thinks it is.
My Experience with OBM
July 19, 2019
If you want an overview of OBM, please read my post on the same.
I’ve participated in three sprints so far, in which I’ve completely failed myself, but I’m already experiencing drastic changes in my habits, which is good.
Here is what I’ve learned from this short, but significant experience:
Crafting my future with OBM
July 12, 2019
It’s been a couple of years, maybe, since I read ‘i want 2 do project tell me wat 2 do’ (which, by the way, you should too!) and first came across Operation Blue Moon (OBM), a project aimed at time management and getting things done. It’s run single-handedly by Shakthi Kannan (~mbuf), who is also the author of ‘i want 2 do project, tell me wat 2 do’.
Not only does it borrow its name, but also the kind of discipline practiced, from our military counterparts. The practices here build upon the years of experience Shakthi has in dealing with people trying, failing, and trying ~harder~ again in their conquest of these utterly useful traits.
Using Weechat with Glowing Bear for IRC
July 2, 2019
Last month, I had a new addition to my toolbox - Glowing Bear, which has been a really nice improvement, allowing me to access Weechat (hosted on a server) through my browser. Here’s how I set it up.
Announcing new blog series on Deep Learning
May 27, 2019
Silk Road, Revolutions and Systems
May 26, 2019
Today, I read the story of Silk Road: how the young idealist Ross Ulbricht, tired of chasing success the old-school way, found his way around the darkweb to create an online bazaar for the trading of illicit materials, mainly drugs, which he named Silk Road. As a part of the darkweb, it was operated as a Tor hidden service, which protected the personal privacy of users by concealing their details from anyone conducting network surveillance, from the Government to their ISP. Additionally, all payments were made using Bitcoin, a cryptocurrency which provides a certain degree of anonymity.
The aim behind writing this blog post is to think out loud and try to gain insight into the oversights made by some of the most prominent revolutionaries in history.
Freedom of Speech, Authoritarianism, Freedom of Press and Faiz
May 22, 2019
The right to free speech is essential for a democracy. This blog post aims to shed some light on the recent authoritarian attempts made by the Hindutva right wing to curb free speech, and on how we can fight back.
A glimpse into the darkness: the 'Brutish' rule in India
May 18, 2019
A second-generation freeborn attempts to understand the impact and aftermath of the colonization of India by the British. It turns out that even an educated Indian of today is still not aware of the atrocities and turmoil it caused the country.
Do we really need to cover coverage with Vulture?
August 18, 2018
The team behind Vulture (a tool used for detecting unused Python code) decided not to integrate it with coverage (a tool for measuring code coverage of Python programs). Read why!
Dynamic code analysis with Vulture
June 27, 2018
This is a follow-up to Why use coverage to find which parts of a python code were executed?, where we discussed how we stumbled on this plan of dynamic code analysis with Vulture. Here, we talk about the development process we (the Vulture team) went through to integrate Vulture with coverage.py in order to automatically generate a whitelist of functions which Vulture reports as unused but which are actually being used.
Google Summer of Code 2018 - Phase 1
June 14, 2018
Here’s my work progress with the first phase of Google Summer of Code 2018.
The story of Dead Code, Vulture and scavenging
May 30, 2018
It isn’t uncommon for software developers to come across code they wrote in the past and reflect on it; the most common reaction would probably be “It must be the most horrible thing I wrote”. But sometimes there’s that aha moment where you find something and are instantly gratified and proud of yourself: “Oh, this is so beautiful, no wonder it took so many sleepless nights”. However glamorous it may sound, it is indeed a difficult task to write and maintain such code, and this is where automatic tools come into the picture. Let’s discuss one such tool - Vulture, which helps discover unused stuff in Python code.
So, today we present to you the voodoo which throws out unused code.
Why use coverage to find which parts of a python code were executed?
May 19, 2018
In this post, I’ll walk you through the decision-making process the team behind Vulture went through to come up with a way to deal with false positives in its results.
A meeting with my GSoC'18 mentors
May 13, 2018
Tell me and I forget, teach me and I may remember, involve me and I learn. This blog post is a public memoir of an online meeting I had with my GSoC mentors. Kudos to me for having such awesome mentors! :P
May 10, 2018
“Good luck is a residue of preparation.” ― Jack Youngblood
Getting selected as a Google Summer of Code student with coala was a breakthrough for me. The coala community touched me on every aspect of open source software development, especially how to get along with peers (and troll them :-p). And it has happened again - I am a student with coala one more time, and I look forward to learning yet more from my dear mentors and the beloved coala community.
Statement of Chaos
March 30, 2018
Should I go for a job or an MS?
Organising the Mozilla visit
March 3, 2018
This blog post is about my experience with organising and attending a Mozilla session at my college.
How to get started with self driving cars
January 11, 2018
TMP Day 1: Introducing three months long backbreaking goals
August 24, 2017
Challenging my limits - Completing 4 ridiculously difficult programs in a year.
July 24, 2017
Phase 2 is coming to an end today (24th of July, 11:30 PM IST). It has been an intensive and healthy work period with a steep learning curve. Let me reflect on my journey throughout the month.
June 24, 2017
Phase 1 of the coding period ended on 26th June, 23:30 GMT+5:30. With this post, I would like to reflect upon the development progress so far and share some of the challenges I faced.
June 20, 2017
Trying to change my habits in a way it feels fun!
June 10, 2017
A meeting with my mentor, tweaking the VultureBear and my new laptop. Ahh, perfect!
coala - COde AnaLysis Application
June 1, 2017
How working with coala changed my life? :-)
GSoC Project Timeline
May 28, 2017
Here is a description of how I plan to manage my schedule during GSoC period.
May 20, 2017
The project I will be working on this (G) summer (oC)
Getting into GSoC
May 3, 2017
Hello, this post is a brief description of what GSoC is and how I wrote my project proposal for it.