Aparajithan Venkateswaran

Year 2024 in Review

2025-01-02T00:00:00+00:00

My annual review for 2024 is here (after skipping the 2023 version). This contains fewer stats and more personal reflections. And you will see why.

Looking back at 2024

Big milestones

I defended my dissertation and successfully graduated
I started my new job as a data scientist

Travel

Traveled to Florida for a workshop
Traveled to San Francisco for fun
Day trip to Bainbridge Island, Washington
Traveled to Portland, Oregon for a conference
Traveled to Yellowstone National Park
Traveled to Vancouver, Canada for a conference

By the stats

Books read: 18
Favorite fiction: Stormlight Archive: Wind and Truth (Brandon Sanderson), The Name of the Wind (Patrick Rothfuss)
Favorite non-fiction: A Promised Land
Favorite songs: Oh Raaya (AR Rahman), Different Lives (Fly By Midnight), Shotgun Rider (Patrick Droney), Love Somebody (Morgan Wallen), Moves (Suki Waterhouse)
Favorite movies: Dune Part 2, Meiyazhagan, Furiosa, Maharaja
Favorite video games: Hades, Uncharted: The Lost Legacy
Favorite hike: Lake Serene, WA
States visited: Wyoming, Colorado, Oregon, California, Washington
Countries visited: Canada

Looking ahead at 2025

Some goals

Run a half marathon (I’ve been setting a running distance goal for the past few years but I’m feeling positive about it this time)
Visit Europe or New Zealand
Learn 12 new songs on guitar

Some reflections

There was a period right after graduation, when I started my new job, during which I felt a little lost. For the better part of the past 27 years, my goal in life was well-defined. In school, it was to graduate from high school and get into a good college. In college, it was to graduate with a degree and either find a job or get into grad school. Obviously, I went to grad school. In a Ph.D. program, the goal was to perform “novel” research and defend your findings (and find a new job). Very loosely defined path, but the destination was defined. By August 2024, I had done all of that, with a job lined up to start in a month.

Now that I had started the new job, I was left with two holes. First, there was no goal. At least not in my professional life. Yes, there were projects I owned and had to keep progressing in the right direction. But they did not have the “levelling up” aspect of formal education. Once a project was done, you either moved on to the next project or you continued to maintain the product. Sure, you could level up in your job, i.e., get promoted. Again, that was not the same as the “levelling up” you get in formal education, where you start a grand new chapter. These were more “journey” than “destination.”

Add to that the fact that my social life was completely upended. Throughout college, I was surrounded by other students. There was a sense of camaraderie as you work through the same assignments, study for the same exams, and share large portions of your lives in the same pursuit. At work, you do not get that bonding with your colleagues through shared experiences.

I guess both problems were about separating “professional” from “personal.” In college and grad school, the boundaries are blurred. The research bleeds so much into your personal space (working late hours and on weekends) that it becomes a part of your identity. As a result, my social life was inherited from the professional life. The social inheritance can be a good thing when you are going on the same journey. But I think I took it too far in that the professional identity took over the personal identity. So in the process of achieving my professional goal, I lost my sense of personal identity. And that crisis manifested in those two ways.

Admittedly, I had (am still having?) a difficult time with the transition. I have some strategies that I have been trying out to fix the social life. For instance, I proactively stay in touch with my friends. For people in town, this means organizing dinners, board game nights, and other similar activities where I get to see familiar faces. For people far away, this means reaching out for phone calls, sending memes, and maybe even planning trips. And I force myself to say “yes” to social events where I get the opportunity to expand my circle. I’ve only been good at this for the past few weeks but I am feeling positive about it already. Another thing I want to try in 2025 is attending meetups. This is a bit nerve-racking but, in the worst case, I likely won’t meet anyone there again, so I have nothing to lose!

As for the first problem about “goals in life,” I am sad to say I don’t have definite answers. I have been trying to fill out my calendar with new activities. Last month, I signed up for dance lessons, which is an activity that old me was least likely to engage in. It was a little outside my comfort zone and I enjoyed it. I think it would be more fun if I got to practice more. So I am tabling any further investment of time in it for now. Next, I am looking at signing up for karate lessons. You get the idea of what I’m trying to do here.

However, these feel like temporary answers and do not address the underlying question. What is my next destination? Journey before destination, yes, but is there a point to the journey when there is no destination? Asked differently, how do you make the journey without knowing the destination?

As I continue the search for my answers, I wish you a happy new year!

Ph.D. Graduation Speech

2024-06-07T00:00:00+00:00

The Department of Statistics at the University of Washington held its commencement ceremony for the 2023-2024 academic year on June 7, 2024. I was nominated by the Ph.D. students to give a speech at the ceremony. You can find the speech here. Here is the transcript:

Thank you for the introduction, Abel. It’s truly an honor being here today. I want to take this opportunity to thank my advisors, Tyler and Ema; my fellow PhD students; and, of course, my family.

Before I go on, I should come clean and tell you that I haven’t yet defended my research. I think the only other place where “graduation” comes before “defense” is the dictionary, if you read it backwards. I promise that’s the last joke. Besides, our family and friends, almost surely, won’t get the statistics puns. And I don’t want to marginalize them.

As I was reflecting on my journey, I was reminded of an old couplet in Tamil, my native language:^[1]

உடையார்முன் இல்லார்போல் ஏக்கற்றுங் கற்றார்
கடையரே கல்லா தவர்

uṭaiyārmuṉ illārpōl ēkkaṟṟuṅ kaṟṟār
kaṭaiyarē kallā tavar

Humility is the only path to knowledge. Pride and vanity only lead to ignorance.

When I joined UW, I was starry-eyed. Against the better advice of the students, staff, and faculty, I decided to enroll in 3 classes on top of being a teaching assistant and trying to do research. Three months in, I realized that this was a mistake. You see, I had underestimated the difficulty of the classes and overestimated my abilities as a statistician and a researcher.

Besides the pandemic, this was the biggest shock during those months. My peers would agree with me that the classes and exams were the hardest milestones in our journey to get here today. If you’re in the PhD program, you were probably one of the smartest people in your undergraduate classes. A few weeks into a statistics graduate class, you realize that’s no longer the case and you question whether you’re in the right place.

Fortunately, you don’t have to do this alone. You learn to ask for help from your professors and your friends. Seriously, I couldn’t have gotten through the 570s and 580s without the countless office hours and study groups.

At the end of the day, we learned how to be ambitious without being competitive. We became comfortable with not knowing and embraced the possibility that we might be wrong.

We’ve all hit a wall at some point. An elusive proof. Simulations that don’t converge. Finding out that we got scooped by someone 50 years ago. At times like that, the fleeting feeling of disappointment is inevitable. But we learn to pick ourselves up and persevere in spite of the setbacks. And that is what matters in the end.

Because when you climb that mountain, you experience something that cannot be put in words. There is a sense of relief but there is also something magical when you finally figure it out. Like you’ve discovered a secret of the universe. And you’re dying to spill the beans. You teach others to see what you see. Very few appreciate it like you do.

For instance, Ema and I proved a fact that our distinguished alumni speaker, Prof. Ali, conjectured 15 years ago. To me, that is beautiful for reasons more than just coincidence that we’re both here today. It is okay that you don’t understand my reasons, because, in the end, beyond the objective truth of the facts, these secrets are a little personal.

Today, we get to lay them out and celebrate those moments. Looking back at how far we’ve come, I feel grateful for the opportunity to have gone, to be going, on this journey.

Looking to the future, and this is mostly advice to myself, remember, “Journey before destination.” Don’t measure out life in coffee spoons and live in quiet desperation. Appreciate the beauty. Seize the day. Dare to disturb the universe.^[2]

I will leave you with a quote from the author Brandon Sanderson:^[3]
The question is not whether you will love, hurt, dream, and die. It is what you will love, why you will hurt, when you will dream, and how you will die. This is your choice. You cannot pick the destination, only the path.

Notes

[1] These couplets come from a classical ancient text called Thirukkural (திருக்குறள்) written in the Sangam period. It consists of 1330 couplets each containing seven words. This couplet is taken from the chapter on Education (கல்வி). Scholars interpret this couplet in subtly different ways, which are lost in translation. I provided my own summary based on their explanations.

[2] Many of these metaphors are inspired by the author Brandon Sanderson (The Stormlight Archive), and the poets T.S. Eliot (The Love Song of J. Alfred Prufrock) and Henry David Thoreau (Civil Disobedience and Other Essays).

[3] This appears in Oathbringer as a quote from an in-world fictional book The Way of Kings (not to be confused with the book The Way of Kings written by Brandon Sanderson).

Looking behind at 2022 and looking ahead at 2023

2023-01-01T00:00:00+00:00

Since starting in 2016 over the winter break of my first year as an undergrad, this annual retrospective seems to be the only thing keeping this blog alive. I do occasionally write some guides, but nothing for the blog itself. I hope to add more to the blog but, right now, I don’t have a timeline for that. In the meantime, I intend to keep the tradition of these annual round-ups of the past year.

Looking behind at 2022

While a lot of things have considerably changed compared to 2021 and 2020, the pandemic still loomed in the background. But other global events such as inflation, recession, and war took the foreground. Here are some highlights from my personal life:

I was a teaching assistant for just one class, STAT 221 (Statistical Concepts and Methods for the Social Sciences).
I passed my Ph.D. preliminary examination. (I also learned that this is not enough to become a Ph.D. candidate.)
I spent the summer interning as a Data Scientist in the Xbox Player Services team at Microsoft. I got to apply some of my research in the project.
I upgraded my PS4 to a PS5.
I attended my first conference and presented my research at JSM.
Two international travels to Canada and India.
I became the Graduate Student Representative (GSR) for the Statistics Department at UW.
I took the Statistics Consulting class. It was a lot of work but also fun to learn new methods and actually see statistics applied in other fields.
I finished my first project (that will be a part of my dissertation) and am currently wrapping it up.
I signed an offer to return as a Data Scientist intern at Microsoft in the Xbox Player Services team. (The huge tech layoff in September did not present a promising outlook for internship search.)

Looking at the numbers

Number of goals that I set out to achieve: 5
Number of goals that I completed: 2 (playing more video games and reading more fantasy)
Emails sent: 416 (+42 from 2021)
Emails received: > 3998 (+203 from 2021)
Papers read: 116 (+58 (100%) from 2021)
Books read: 12 (+6 from 2021)
Books in progress: 1
Favorite fiction: Stormlight Archive: The Way of Kings
Favorite non-fiction: The Language of Food
Favorite music album: State of the Heart by Patrick Droney
Favorite song: Fall in Love by Bailey Zimmerman
Favorite movies: The Batman, Avatar: The Way of Water, Ponniyin Selvan - Part 1
Favorite TV show: House of the Dragon
Video games finished: 0 (-3 from 2021)
Video games in progress (that I intend to finish): 4
Favorite video game: God of War: Ragnarok
Concerts attended: 2 (+2 from 2021)
Theatrical performances attended: 0 (no change from 2021)
Miles run: 65.36 (-57.36 from 2021)
Miles biked: 7.32 (+7.32 from 2021)
Hikes: 7
Favorite hike: Mt. Fremont Lookout
14ers completed: 0
Total 14ers completed: 1
States visited: WA, CO, NM, DC (+2 new)
Total states visited: 16
Countries visited: Canada, India
Total countries visited: 4

Looking ahead at 2023

A few things I am looking forward to in 2023:

Taking my general exam. After this, I will be a Ph.D. candidate. Hopefully this happens in 2023!
Summer internship at Microsoft
Doing some traveling. Not sure where, but I intend to visit new places!

Last year, I set out to do more of less. I think I was more or less successful in that endeavor. I decided to do, at most, 5 things distributed between research, classes, teaching, and other obligations. I definitely brought it down to 4 towards the end of the year. I want to continue doing that in 2023. I expect it will fluctuate between 4 and 5 until the end of Spring quarter.

I have some exciting ideas for side projects that I want to pursue in 2023. Some of these have been simmering in my head for a long time and others are inspired by recent events. Some are technical and others are creative. I won’t fully describe them here but I will post updates when they are nearing completion.

Some other fun things I want to do this year:

Work on those side projects
Read more fantasy
Learn more about geology and archaeology
Play more music (and write music)
Play more video games
Run a half marathon (I’ve been wanting to do this for 3 years now)

Looking Behind and Looking Ahead: 2021 and 2022

2022-03-05T00:00:00+00:00

Here it is: the annual retrospection for 2021.

I started writing this on January 6, 2022. And this was sitting in my drafts for almost 2 months. Don’t ask why.

Looking behind at 2021

2021 was Year 2 into the pandemic and Year 0.5 to 1.5 of my Ph.D. Here are some highlights:

I was a teaching assistant for 3 classes: STAT 311 (Intro Stats), STAT 435 (Undergrad Machine Learning), and STAT 512 (Grad Stats Inference). All of those classes were a blast to teach!
I started working on two different projects with two different professors. Both of them are super exciting!
I started taking singing lessons.
I spent the summer working remotely for Microsoft in the Mixed Reality Object Understanding team. I developed an application to annotate object-poses.
I moved into an apartment of my own.
I attended my first Western wedding.
I bought a PS4 and picked up gaming again.
I signed an offer to return as a Data Scientist intern at Microsoft in the Xbox Product Services team.

Looking at the numbers

Number of goals that I set out to achieve: 5
Number of goals that I completed: 1
Emails sent: 374
Emails received: > 3795
Papers read: 58 (+26 from 2020)
Books read: 6 (-4 from 2020)
Books in progress: 1
Favorite fiction: The Eye of the World
Favorite non-fiction: How to Avoid a Climate Disaster
Favorite music album: Equals
Favorite song: Where You Are by Patrick Droney
Favorite movies: No Time to Die, Spider-Man: No Way Home
Favorite TV show: Lucifer
Video games finished: 3 (-1 from 2020)
Video games in progress: 3
Favorite video game: Uncharted: A Thief’s End
Concerts attended: 0 (-2 from 2020)
Theatrical performances attended: 0 (-5 from 2020)
Miles run: 122.72
Miles biked: 0
Hikes: 7
Favorite hike: Snow Lake
14ers completed: 0
Total 14ers completed: 1
States visited: WA, CO, IL, IN, KY, TN (+4 new)
Total states visited: 14
Countries visited: 0
Total countries visited: 3

Looking ahead at 2022

There are some exciting things happening in 2022 for me:

Prelims in June. Once I pass this, I will become a Ph.D. candidate.
Data science internship at Microsoft. I just learned that this will be in person.

Going forward, I intend to do more of less. In the past, I’ve generally been very distracted in terms of what I want to do. I had hoped that by doing more things, I would be able to figure out what I liked. Turns out I find a lot of things interesting and it is hard for me to say “No” when a new opportunity pops up. I end up having a lot of things on my plate and become stretched too thin.

To that end, I am going to impose a limit on how much I will do and stop testing the limits of how much I can do. The upper bound on the number of things I will be involved in at any point in time is 5. This 5 will be distributed between research projects, classes, teaching, and other obligations. Currently (March 2022) it is at 5. I anticipate it will stay at 5 until the end of this academic year. Ideally I would like to bring it down to 4.

Some other fun things I want to do this year:

Work on more (side) projects
Read more fantasy
Play more music (or even try writing music)
Play more video games
Run a half marathon (I’ve been wanting to do this for 2 years now)

Looking back on 2020

2021-03-21T00:00:00+00:00

2020 was an interesting year, to say the least. You’ve probably heard it phrased a dozen different ways from a gross number of people. So I won’t try to describe it more, except that I’ll use it as an excuse for being 3 months late. It’s also an excuse for why I won’t do the usual by-the-numbers format I did the last two years.

This time, I’ll just list the highlights (ups and downs) of the past year, and the things I’m looking forward to this year.

A timeline

Jan 2020

Tried nordic skiing.
Went on the final EHP retreat to Glen Eyrie.
Had interviews with various Ph.D. programs including Stanford, NYU, and UW Statistics.

Feb 2020

Faced a lot of Ph.D. program rejections including Stanford and NYU. I talk a little bit more about dealing with these external events here.
UW Statistics came through!
Pandemic becomes a thing.

Mar 2020

The final open mic night gets cancelled.
School becomes all remote.

Apr 2020

Accepted offer from UW.
Apparently, I did well enough in my undergrad that they gave me a few awards.
Successfully defended my thesis.

May 2020

Started biking for longer distances and more often.
Officially graduated and received my degrees.
Started internship at Microsoft with the Mixed Reality team.

June 2020

Ran a 10k!

July 2020

Went on some hikes in Boulder.
Signed an apartment lease in Seattle.

Aug 2020

Finished internship at Microsoft.

Sep 2020

Had a small birthday gathering.
Drove to Seattle from Superior.
Lots of furniture hunting and assembling.
Ph.D. program starts.

Oct 2020

Got my license in Washington.
Started to get stressed about classes and teaching. The “new guy in a new city” feeling starts to really hit.

Nov 2020

Got new license plates.
The Election happened.

Dec 2020

Successfully finished my first quarter.
Bought a new plant and welcomed Ciri to my household.
Course evaluations for my TA job come in. The numbers are skewed (sample size of 8), but the comments were great!
Started reading the Wheel of Time.
Started a YouTube channel with the goal of uploading one new guitar video every month.

Things I discovered and learned

Teaching can be frustrating at times, but always rewarding.
Study groups can actually be fun, especially when there are few other ways to socialize.
AirPods Pro, MacBook, and the complete Apple ecosystem are just seamless.
The Joy of $x$ Podcast by Steve Strogatz
The Wheel of Time by Robert Jordan (and, later, Brandon Sanderson)
The Witcher by Andrzej Sapkowski
Caste by Isabel Wilkerson

For a complete list of books I read in 2020, visit my Goodreads.

Some specific goals for 2021

Run a half marathon.
Upload 12 guitar videos on YouTube.
Write 10 short stories (at least 1000 words each).
Read 20 books.
Start actively working on a research project.

Some broad goals that may or may not ever happen

I re-discovered a lot of things I like. Here are some goals in that vein, though they are not essential to leading a healthy, productive, happy life.

Write a fantasy novel.
Busk.
Play music at some sort of event like a wedding. (The difference between this and busking is that, here, someone considers me a good enough guitarist to hire me)
Put on an art show with paintings.
Publish an album of photographs.

Despite being unable to do a lot of things, a lot of things did happen for me, to me, and around me this year. Whether or not we want it to, time continues moving forward. We can either choose to live in the moment, or pine for what could have been. We can either change our ways to adapt to the times, or let our old patterns drive us into extinction.

This is my biggest takeaway from the last 12 months. Hopefully someone out there is inspired by what they read here.

Identifying the Why

2021-02-24T00:00:00+00:00

The why is one of the most difficult introspective questions to answer. There are different why’s in life. Why is my current life this way? Why is my ideal life that way? Why is there a difference between the two? These are difficult questions to answer because they are broad, abstract, and often have no one right answer. After spending a good amount of time thinking, in this article, I try to develop a framework for approaching these questions.

The why questions I am referring to here pertain to those within our control. There are a plethora of why questions along the lines of “Why did this happen to me?” The answers to those questions are outside our control, and not the object of discussion here. In my opinion, dwelling on such questions is rarely useful.

Finally, take these with two grains of salt:

With new experiences and learnings, my thoughts and opinions are bound to change.
What makes sense to me need not make sense to you.

That said, I welcome criticisms that help me understand things that I currently don’t understand or misunderstand.

A framework

At the heart of these why questions lie values. Everything we do stems from some value or moral system that we consciously or subconsciously believe in. There are two different layers to our value system – the primary and the secondary.

The primary values are the atomic values. These include broad things like “honesty,” “strive to be good,” “do no harm” and so on. These are the building blocks of our moral framework. The secondary values are the ones that we build using the atomic values. These are our subjective interpretations of the atomic values. For instance, “should I inflict harm on someone who perpetuates harm to reduce the total harm in the world?” I think at the core, many of us share the same atomic values to some degree. It is in our interpretations, prioritization, and execution of these primary values, in other words the secondary values, where differences arise. And for most intents and purposes, when two people have different values, they refer to their secondary values.

This distinction is important because our secondary values are shaped by various factors. So, in effect, our actions and desires are shaped by the same multitude of factors. Now, bear with me as I use a mathematical metaphor for this problem.

A vector space

Think of the value system as the basis for a multidimensional vector space. Each person’s position in this space is determined by several factors. These include things such as expectations of other people, expectations of oneself, fear of death, love for other people, and passion for hobbies or interests.

Each of these factors represents a force field that pulls us in some direction within the value space. If life were simple, we would be pulled only in one direction and we would not have to actively decide anything – just go with the flow. However, life is not that simple. These force fields attempt to pull us in different directions.

When we find ourselves in a position that’s opposite to the force fields (that goes against our values), we do not necessarily enjoy the experience. Therefore, as good physicists, our goal is to find the equilibrium location – the state where the forces best align with our values. This is the state in which the energy due to these various forces is minimized. This equilibrium corresponds to a position that maximizes our contentedness and fulfillment.

A note on fulfillment: In my mind, fulfillment involves the right combination of satisfaction and dissatisfaction. It involves a healthy amount of pain and discomfort that allows (and motivates) us to grow without bringing us down. Everything is just enough and we are not yearning for more of any one single thing. This fulfillment does mean that we are happy, but not solely seeking happiness all the time. Life consists of all experiences, happy and sad, and seeking just happiness often backfires. After all, emotions are relative – we can’t appreciate the highs without the lows.

Back to why land

Our goal in answering the why questions is to find this sweet equilibrium. This is hard because it is difficult to even identify these forces. It is easier to just go with the flow of one or two factors and let our overall happiness slide. This is what I meant by “making choices of distractions,” or living a life that’s not your own.

Now, the steps to achieve equilibrium are “straightforward.”

Identify what the different factors are that influence our actions, decisions, and values.
Balance the tradeoff between these facets of life.

Of course, this is easier said than done. If identifying the different factors is difficult by itself, balancing them is probably impossible, even for a physicist. Perhaps it makes sense to go with the flow initially, then re-evaluate once in a while to understand how things are going and how we can alter the trajectory towards the equilibrium. This is akin to a statistician’s optimization procedure – start with a (likely to be bad) guess, and repeatedly make corrections to get closer to the optimum.
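To make the optimization analogy concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a model I actually use: each factor is reduced to a point it pulls towards (`pulls`) and a strength (`weights`), the "energy" is taken to be the weighted sum of squared distances to those points, and `find_equilibrium` is a made-up name. The procedure starts from a bad guess and repeatedly nudges the position along the net force.

```python
def find_equilibrium(pulls, weights, start, step=0.1, iters=200):
    """Start from a guess and repeatedly correct it along the net force.

    pulls   -- hypothetical points in value space, one per factor
    weights -- hypothetical strengths of each factor's pull
    start   -- the initial (likely bad) guess
    """
    position = list(start)
    for _ in range(iters):
        # Net force in each dimension: every factor pulls towards its own point.
        net_force = [
            sum(w * (p[d] - position[d]) for p, w in zip(pulls, weights))
            for d in range(len(position))
        ]
        # Small correction, then re-evaluate on the next pass.
        position = [x + step * f for x, f in zip(position, net_force)]
    return position


# Three made-up factors pulling in different directions in a 2-D value space.
pulls = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
weights = [2.0, 1.0, 1.0]
equilibrium = find_equilibrium(pulls, weights, start=(5.0, 5.0))
print(equilibrium)
```

Under these assumptions the equilibrium is just the weighted average of the pull points, so the loop is overkill. The point of the sketch is the shape of the procedure: guess, measure the net pull, correct a little, and repeat.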

The key idea here is the re-evaluation. If we do not continuously re-evaluate, we will get stuck in a rut. Perhaps, I will write more on what it means to perform this re-evaluation in the future. And maybe even come up with concrete ideas instead of abstract metaphors and analogies.

Why more time?

2021-02-12T00:00:00+00:00

Almost everyone at some point in their life has the thought, “I wish I had more time” or something along those lines. Often, we miss the larger question lurking behind this thought – “Why do I want more time?” This question aims to isolate the real life from the ideal life. The real life is the present – the one we are in right now. The ideal life is the life that we wish we were living.

Answering the question, “Why do I want more time?” forces us to think about what the ideal life would look like. The thought of wanting more time arises out of a dissonance between thoughts and actions. What would I be doing right now rather than writing more lines of code? Where would I be eating my lunch rather than at my desk trying to attend a meeting? What would my ideal (future) life look like, and why is my present different? Follow the answers down the rabbit hole until you finally arrive at the true answer to the question, “Why do I want more time?” This is the why behind your ideal life. Knowing the real why is important. Otherwise we will likely end up making “choices of distractions” or, worse, living a life that’s not our own.

Imagine a life where you did not have to worry about time. What would you be doing then? There is the Stoic aphorism, memento mori, translating to “remember death.” There are also a plethora of insipid quotes asking you to live as if today was your last day on Earth. Instead of these (useful virtues), ask yourself, “If I had an infinite amount of time on Earth, what would I be doing?” Approach life with an abundance mindset instead of a scarcity mindset.

Now we know the true reason why you want more time. We also know what the ideal life would look like. Clearly, we also know what reality looks like. Work backwards. Where does reality diverge from ideality? Where is the dissonance between thoughts and actions?

At this point, we should have enough answers and thoughts to find what the important things are in life and come up with a plan to “make more time” for them. This may involve giving up certain aspects and luxuries of life that we currently enjoy to make more room for the more important things. We may have to hit a hard reset. Sometimes, we may realize that the way out is to continue doing what we are doing because there is no obvious way to move towards the ideal life, or because we are already on that path.

I like to think that the goal of this meditation is not to come up with a life plan, but to evaluate the state of life itself. Understand what you currently don’t find convenient with your time schedule and envision an alternate life where you wouldn’t feel the same way. Compare and contrast to identify how to move towards the latter. And this is definitely not a one-time thing. Whenever the temporal scarcity mindset kicks in, it serves as a cue to re-meditate on the dissonance.

On Stoicism

2020-11-27T00:00:00+00:00

Stoicism refers to a philosophy of life, and a Stoic is someone who lives by this philosophy. This is not to be confused with stoicism (with a lowercase ‘s’), which means “suppressing happiness, pain, and emotions in general” in the modern sense. I like to think of myself as an aspiring Stoic. I try to embody this philosophy, but currently do so imperfectly. In this post, I share my understanding of Stoicism and thoughts about how it has been helpful to me.

The Dichotomy of Control

The core tenet of Stoicism is that there are things that we can control and things we cannot control, and we suffer when we confuse the two. It is only harmful when we worry about and attempt to control things that are not under our control. By paying attention to the difference, we can focus on the things that we have at least some control of. Here is a passage from The Enchiridion (emphasis added):

The things in our control are by nature free, unrestrained, unhindered; but those not in our control are weak, slavish, restrained, belonging to others. Remember, then, that if you suppose that things which are slavish by nature are also free, and that what belongs to others is your own, then you will be hindered. You will lament, you will be disturbed, and you will find fault both with gods and men. But if you suppose that only to be your own which is your own, and what belongs to others such as it really is, then no one will ever compel you or restrain you. Further, you will find fault with no one or accuse no one. You will do nothing against your will. No one will hurt you, you will have no enemies, and you will not be harmed.

So what is in our control? Our own actions. Anything other than our own actions, including our own body, wealth, knowledge, and reputation, is not under our control. This perspective is hard to buy the first time. I was opposed to some of these ideas initially.

How is my knowledge not under my control? The difference is that I can take actions to improve and increase my knowledge by reading books, listening to others, and debating questions. And these actions are under my control. Some of these actions may enable me to learn more effectively than others. Some of these actions have a higher probability of effecting change. But there is always an underlying uncertainty. My knowledge may or may not increase through these actions. The consequences are not under my control. Besides, there are various external factors. Age will diminish my brain power and fade away memories. That is not under my control either. It is easy to confuse our ability to take actions that improve our knowledge with our knowledge itself.

The same is true for many things in life. We often confuse our ability to act with our ability to effect. This confusion sets us up for disappointment when things don’t go the way we want them to.

The Stoic’s Decision Tree

I like to remind myself constantly about the Stoic’s Decision Tree.
Disclaimer: This is just my rendition. If you google “stoicism flowchart” you will come across a whole slew of other interpretations.

A second idea I introduce here in this flow chart is “virtue.” Stoicism is not about throwing your hands in the air. Your actions need to be guided by your morality. This is what sets apart Nihilism from Stoicism. A Nihilist believes that the world has no meaning, rejects all morality, and acts accordingly. A Stoic assigns morality and virtue to things under their control i.e., their actions. As someone who believes that there are things such as “right” and “wrong,” I find some solace in the Stoic’s perspective.

Examples

Often, I find abstract philosophical ideas to be intangible. So here are some examples of where Stoicism has been applicable to me in recent times.

Internet failure: This quarter (Autumn 2020), I teach two recitations (quiz sections) on Tuesdays and Thursdays. On my fourth day of teaching, my internet went out an hour before I had to log in to teach. Typically, I would have lost my mind blaming others for poor internet (we all love to blame our internet provider don’t we!). And for a minute, I did start to panic. But going back to the Stoic’s Decision Tree, internet connectivity is not under my control. So why should I stress myself worrying about it? Instead I should be thinking about what I could do. I asked my roommate who was responsible for setting up the router to fix it and used my phone as a hot spot to teach in the meantime.
Applying to grad schools: This was the most stressful part of my undergraduate career. Especially when admission season rolled around and the only emails I received were rejections. I had to remind myself that the outcome of the admission process is not under my control. I could write the perfect essay, craft the perfect CV, and my letter writers could nominate me for a Nobel prize. Nevertheless, the outcome is outside of my control.

Receiving rejections can also be very demoralizing. Especially when the same applications make you hyper-conscious of external factors. In addition to the Stoic’s Decision Tree, I had to remind myself that external validation (a.k.a. the decision of the admissions committee) does not define who I am. This is definitely no easy ask, and it is still a work-in-progress.
Bombing an exam: This is a textbook example. I prepare really hard for an exam. But I walk out knowing that things didn’t go the way I hoped they would. Once again, my preparation strategy is under my control. My performance on the exam is not fully under my control, however. It could just be that the exam was inherently very difficult. Or maybe the exam tested me on a different set of skills than I was anticipating. This is not to diminish the feeling of betrayal or despair. It is to simply know that there are factors outside my control that affect the final outcome.

Some Common Misconceptions

Here are some questions and misunderstandings I had over the years that have since been clarified.

Don’t emotions define human beings? Why would you choose to avoid experiencing emotions?
Stoicism is not the same as not feeling. Your emotions are not under your control. Happiness, sorrow, pain, love, loss are all natural. It is easy to confuse the passages in The Enchiridion with not feeling. How you react to these emotions is, however, under your control. Feeling anger and frustration is alright. Your reactions to these emotions are what you need to be aware of.
Clearly I have very little control over issues like climate change. Are you suggesting that I should not worry about them?
When you are actively concerned about something, you come up with a plan and tasks that you can act upon. Herein lies the distinction. You can act in a way that reflects your concern and is aligned with your values. The outcomes of the actions, however, are not always under your control. For instance, take climate change. You cannot singlehandedly reverse it. But you can take actions that reflect your beliefs and values. You can educate others. You can participate in demonstrations. You can perform scientific research and studies. These are within your control. Your friends may not listen to you. The demonstrations may not be successful. Your research’s funding may get pulled. These outcomes are hard to sit with, but remember they are not always under your control.
If the outcomes of my actions are not under my control, why should I care about them in the first place?
Remember virtue. A Stoic’s actions are guided by their morality. A Stoic cares about issues and acts on them because it is the right thing to do.

A Final Note

What is under my control? The definitions are blurry if we go beyond “our own actions.” And sometimes thinking in terms of our actions alone is not enough to inform our decisions. We need to be able to characterize the strength and nature of the power our actions have. This is somewhat difficult to do. I will refer the reader to an essay reflecting on power written by Ellen Considine.

There are other tenets of Stoicism that I do not discuss here. These include amor fati (love fate) and memento mori (remember death). These are ideas that I am still grappling with. So I will reserve those too for a future time when I can thoughtfully reflect upon them.

Readings

For those intrigued by my musings, here are some books that may be of interest:

The Enchiridion by Epictetus
The Daily Stoic by Ryan Holiday and Stephen Hanselman
Meditations by Marcus Aurelius

Reflecting on Fouriers at CU

2020-08-11T00:00:00+00:00

For the uninitiated, the title is a pun on Fourier. Jean-Baptiste Joseph Fourier was a French mathematician who, through Fourier Series, laid the foundation for Fourier Analysis, an important topic in physics, image compression, signal processing and so many other fields, and the title of a required course in the Applied Math department at CU.

I remember moving to Colorado from India in July 2016. It was all so fast that I did not get the chance to think and reflect on my first 18 years in India. And I was filled with excitement and nervousness about what was ahead. Now, my four years in Colorado, and specifically at the University of Colorado at Boulder, have come to an end. It was an unanticipated ending given the pandemic. I did not get a chance to see the classrooms one last time. I did not get a chance to thank my professors in person. I did not get a chance to meet my friends and mark the milestone. And going our own ways, I probably will not get to do all of that for a long time again. As I continue to look forward to starting my Ph.D. at the University of Washington, I found myself reflecting on the last four years. I was flipping through photos and conjuring up memories instead of reminiscing over conversations.

Some Overarching Memories

Here are some themes that I keep coming back to when reflecting upon my time at CU.

Bus Rides: During my freshman and senior years, I spent quite a bit of time commuting to and from campus on the RTD. I cannot remember the number of times I needed to run to catch my bus, the times they wouldn’t let me on the bus because there was no space for my bike, and the countless hours spent waiting for the bus. It was a grueling and unforgettable experience. The silver lining was that I could take naps on days I had to wake up early and when the days were long.

Campus: Despite the buildings which seem out-of-place, the campus is really beautiful with the mountains in the backyard. I will miss waking up to the Flatirons shrouded in clouds of mist. I will miss the Flatirons burning upon touching the first rays of sun. I will miss evening colors the mountains bring out as I stare out of the balcony from the study room on First East.

HackCU: HackCU was a big part of my undergraduate experience. I found a closely-knit community that I could be a part of. I met people from around the world. And through HackCU, I was able to become a GitHub Campus Expert.

Andrews Hall and EHP: I lived in Andrews my sophomore and junior years. I will dearly cherish my time here, the friends I made, and the many opportunities this place has given. They taught me to think critically about various aspects of my life. In many ways, they have made college a deeply treasured experience. I remember the day I moved in. I was in the music room playing ping-pong. I saw two people there that day who are now two of my closest friends. Incidentally, their conversation inspired me to reach out to professors for research. And here I am.

MCM: The Mathematical Contest in Modeling is definitely an important part of my college experience. Working on a single problem for 100 hours locked in with two other people definitely sounds fun (and stressful)!

Conversations at C4C: C4C, unfortunately, gets a bad rep. You just need to get creative with the food so the unlimited options are no longer repetitive. It is one of my most memorable places with all the conversations I have had with so many people here.

Concerts: I love music. I got the chance to go to many concerts, musicals, and plays over the last three years. Makes sense when one of your friends is also an actor.

Open Mic Nights: These were my favorite, yet anxiety-ridden, nights living in Andrews. I played guitar – mostly solo ballads. I also played classical duets with the clarinet. I am still bummed that what would have been my last Open Mic Night got cancelled.

Outdoor activities: I was fortunate enough to have friends who partake in outdoor activities. I am told this is a very Boulder thing, by the way. I started rock climbing, hiked a 14er, started biking, and learnt to ski.

Research: A reflection on my undergrad is incomplete without mentioning Prof. Dan Larremore and First Team All Science. I found joy in solving problems using math and this played a huge role in my decision to get a Ph.D.

…and that’s a wrap!

There are definitely a lot more of these memories. Perhaps I could fill an entire book with the experiences and learnings.

Here are some pictures that I took during my time at CU. Clockwise from top right: the hike to La Plata summit, a wintry morning from the Business field, view from the 8th floor of ECOT, bike ride to the top of NCAR, a gloomy fall morning behind Norlin, the first rays of the sun kiss the Flatirons, and HackCU shenanigans (center).

Looking forward, here is a photo of the Pacific ocean from Olympic Sculpture Park in Seattle, where I will spend the next few years of my life.

HackCU Over the Years

2020-05-19T00:00:00+00:00

HackCU is a student organization at the University of Colorado at Boulder. HackCU was started by two students who wanted to give other students the space to work on their side projects and ideas that they would otherwise not have the opportunity to pursue. This space was a hackathon. A hackathon can best be described as an invention marathon where people work, either alone or in teams, to create something over the course of 24 hours.

I joined HackCU as a freshman in September 2016. It was the third year for HackCU as an organization. And now as I leave HackCU in its sixth year, HackCU has grown to be more than just a group that organizes hackathons. HackCU now organizes more events such as workshops, startup career fairs (although that was still a thing back in 2016), and has overall become more involved in the community. The team now tries to experiment with other ideas. For instance, there was a hardware hackathon last year.

There has also been a large shift in the team culture and ethos. When I first joined HackCU, it was a pretty small team that was closely-knit. While there was structure to the team, it was rarely ever explicitly mentioned. It was simply something that everyone understood. But over the last four years, the scale of the events HackCU puts on has become increasingly larger, with the exception being this year. And that meant the team either had to put in more hours of work, or it had to increase in size. As all of us are students, the natural course of action was to increase the size of the team. And with that increased size of the team and scale of the events, also came a necessary shift in team culture and ethos.

The team size increased. And there emerged sub-teams each with their own focus. Now, there is a dedicated finance team that handles the budgets. There is a team that manages the servers and websites. And so on. Of course, these teams existed earlier, but were much less structured. A person on the finance team would also push commits to modify the website. Now, the structures are so rigid that one often gets the feeling of being pigeon-holed.

Such structures also had to be enforced across the teams. With such fragmentation, there needed to be some coordination between teams in order to ensure the overarching goal of organizing hackathons is still met. And these structures were formally established within the team. For the first time, a formal set of bylaws for the organization was written down. This is the biggest change I have seen at HackCU. It made me realize how much the team has grown over the years, to the point that it needed to enforce “rules” in order to keep the wheels spinning.

And as the scale of the events became larger, the team and the events organized often came under scrutiny from the participants and the university administration. HackCU has always had a somewhat rocky relationship with the college. I was told stories of how the first ever hackathon was almost shut down by the college. Over the years, the college has come to recognize HackCU as an event that benefits students. Yet, it has always been difficult for us to secure a venue that can comfortably host 600 people. A single mishap last year and we were blacklisted from hosting our events at various buildings! As a result, this year we were forced to scale down to 400 people to the dismay of everyone including the team and the participants. If scaling up is difficult, this year showed me that scaling down comes with its own set of problems. For instance, it was hard to gauge the number of expected participants which unfortunately led us to turn away many people.

Another problem that has plagued HackCU is the sleeping situation. A campus policy states that it is illegal to sleep in campus buildings (unless you are in a dorm, obviously). And HackCU is a 24-hour event with people traveling from out-of-state and many driving from Fort Collins. How do you ask these people to not sleep, or wake them up if they are falling asleep? Last year, we booked a separate venue in Downtown Boulder and bussed people who wanted to take a break for a few hours. It kind of worked, but made the logistics harder to handle. Dealing with this problem is slightly more difficult. With infinite budget, it is possible to book Airbnbs and hotels to host out-of-state participants. Another idea that has always been floating around, but never actually explored in any capacity, is matching out-of-state participants with students at CU or the HackCU team. Maybe some of these ideas will resurface and come to fruition in the future.

Despite all of these hurdles, I am surprised how much HackCU has grown in the last four years. One thing that struck me was how smooth the hackathon was from the organizers’ perspective. For the last three years, the night before the hackathon was always very stressful. This year, things seemed much more relaxed than before. That may also be due to the fact that I was slowly limiting my involvement, and I failed to notice the stress. The food this year was also of better quality. Usually we default to YellowBelly for one of the meals. This year the team decided to try something different, and it turned out to be really good. The judging process was far better planned and executed than in any of the previous years. The workshops were also varied in content and skill levels to cater to most people’s needs. And the overall quality itself has increased.

Reflecting on this reminded me of the Life Cycle of a Student Community talk that Joe Nash gives. The ethos of HackCU has evolved over the last four years. You could think of the “first generation” as anchors trying to make sure the team is strong enough before straying too far from the harbor. Now, the team is under a whole new leadership who have different visions for what HackCU could be. As an example, during one of my last meetings the team was deciding whether to let go of the startup career fairs to focus more on workshops and hackathons. And HackCU still has a lot of changes coming along the way, especially with the pandemic threatening to make everything remote. At times, these changes in thinking pattern irked me. Nevertheless, they were necessary in order for HackCU to continue being a successful student organization. It needs to be independent of the team members, in some sense, in order to evolve and adapt to the changing needs. And I am glad I was able to witness this metamorphosis.

This is the renewal part of the cycle. Everyone from the “first generation” has graduated, with the last of us graduating in May 2020. And I am excited to see where the “next generation” carries the torches.

Thoughts on research opportunities

2020-03-01T00:00:00+00:00

Recently, a friend asked me a few questions on finding research positions. My answers to her questions summarized my newfound philosophy on finding the “right” question to work on.

Disclaimer: These are just like my opinions man. Not anyone else’s. As with all things, you should take it with a grain of salt.

Me: Hi, X said that you were looking for research positions and wanted help.

Her: Hi! Yeah, I had a few questions about working in a lab. I’m trying to figure out if I can work at a lab this semester starting in March, take the summer off, and continue in the fall?

I’m also looking for a research opportunity that isn’t extremely coding heavy so if anything comes to mind let me know.

Me: It depends on the people you work with. There are organizations and research groups that employ you based on contracts. And the contract defines your length of stay, how many hours you will work, pay, etc. Generally, these contracts are “set in stone”, but the employers tend to understand you are a student and allow things to be less rigid.

Most of the time, if you reach out to a professor whose research you think is cool, your work schedule might be more flexible. For instance, I worked for free with Dan Larremore for the first 3 months. Then, I asked him for a summer job and was on his payroll for the next year. Then, I took off to do an internship at Microsoft the next summer but still was able to work with him and get paid. Finally, I am doing a thesis with him right now.

I am happy to talk more about my research journey.

What kind of research things are you interested in?

Her: Yeah I’ve been thinking about reaching out to professors but I’ve been worried it’s too late since it’s already March. I won’t be able to continue the research in the summer so I would need something flexible.

And I think I’m more interested in the labs housed under TAM as opposed to the ones housed under the engineering college.

What kind of work did you do with Dan Larremore? What did a day at “work” look like for you at the lab?

Me: I don’t think it’s too late. Research generally doesn’t follow semester schedules. Worst case, they tell you to start in Fall and that’s still better than nothing.

The work I do with Dan is more mathematical. When I started, I was working on automatically extracting data from faculty CVs to study the scientific ecosystem. Now, my thesis is focused on inferring hierarchy in networks based on interactions between different nodes and their characteristics (think ranking chess players).

When I worked full time, I would usually go to the lab space and spend most of my time reading, coding, and working out math on a whiteboard. Now, since it’s part-time, I generally work whenever/wherever I feel like. And he doesn’t mind if I work fewer hours as long as I have results to show every week.

But, this is a more CS/Math research group and the “typical day” is very different for different research groups even within the same field. For instance, in biology, you would spend more time doing wet lab stuff.

If you’re interested in TAM things, one thing I’d suggest is narrowing down exactly what you want to do and why you want to do that. Do you have a sense of that?

This question is hard to answer and generally trying multiple things can give a sense of what you like. In my case, I worked in an aerospace lab for a semester before I realized what I was doing was not interesting at all to me. And I met Dan after that.

Once you know that, you can move away from the TAM umbrella and find the same kind of things at various other places, giving you more options. As an example, someone I know (studying TAM) works as the scientific communicator/marketing and media person for a research group that analyzes data from Earth systems to better understand the environment.

Her: It’s good to know that it maybe isn’t too late, I guess after talking to you a bit more I’ll get back to emailing professors about their labs.

I think I’m having difficulty narrowing down exactly what it is I want to do. I’ve read a lot of summaries about different labs but I have a hard time getting a sense of what I would be doing in a lab, so I guess I would have to just reach out and talk to professors about it, there’s no other solution to that.

Right now it’s kind of hard for me to see where to even start given I don’t know what I want to do. So here’s a question I’m gonna throw at you: There are a lot of interesting labs here at CU, but not every lab will be suitable for me to work at. Is there any way to narrow down what it is I wanna do besides “picking what seems the coolest”?

And what made you not like the aerospace lab?

Me: The first question you asked is something almost everyone struggles with at the beginning, and I didn’t even know I was facing this. I think an easier question to answer is, “What do I care about? Why?” Write down the answer. Literally, write it down. This could be research ideas/topics, problems in the world you care about, the kind of people you want to work with. This thinking helps focus your goals and declutter your mind from the coolness aspect (you may still have cool things, but you remove the ones you don’t care about). And this will change over time as you experience new things.

Then, it will be easier to find projects and people whose goals align with what you care about. And when you reach out to professors, it will be much easier to just tell them what you want and ask if they can offer it, but also be open and humble to their perspectives when you talk. An added benefit is that they will appreciate you for having put thought into what you want.

Another thing to realize is that trial and error is not a bad thing. Volunteer to work for someone (and sincerely do it) for a few weeks and see if it’s a good fit. Commit if you like it.

Another important thing I should mention is that you should also take into account the kind of people you want to work with. In my opinion, someone who is fun to work with and genuinely cares about your success is a better fit than someone who has racked up awards and citations but is toxic to be around.

The way I look at the question is there are two orthogonal axes - my personal “joy” factor and does it impact the world in a way that is meaningful to me. Of course, this is just my answer and yours may be different.

I want to find something that hits both these categories. I want to enjoy my work and it should have a meaningful impact. For instance, curing cancer is an important issue and has a meaningful impact. However, working in a wet lab to address this issue is not something I would consider joyous (there are other ways I can attempt to solve this problem that bring me happiness at the same time). Another example is, I find string theory fascinating, but it does not impact the world in a way that is meaningful to me, at least in the foreseeable future.

And in the world of mathematics, it’s hard to find that sweet spot. There are only a handful of problems that do hit it. So recently, I’ve had to redefine what it means to have a meaningful impact. Now, I also care more about the ripple effect as opposed to just the immediate applications.

So, to answer your second question, why did I not like the aerospace lab, after a few weeks into it, I realized it had almost zero impact. The code I wrote will probably be thrown away as soon as I leave. I was designing an autonomous navigation system in deep space, which is basically science fiction at this point. And I was using deep neural networks. Working on this project, I realized I did not find deep learning interesting, because it is a highly incomprehensible black box and that made it slightly disturbing. Essentially, it scored negative on both my axes.

Also, reading this, you may wonder if I had it all figured out and worry that you don’t. All of my answers are based on retrospection and reflection. The truth is, I did not actively have any of these thoughts when this happened. Talking to other people, and writing essays for grad school applications forced me to think through all this, which is why I have concrete thoughts. And in hindsight, I wish I did all this thinking before and it would have helped me better.

Does that answer your questions?

Her: It really does answer my question, that was the most comprehensive response possible. I’ve read it over a few times to make sure I didn’t miss any details, so thank you for that!

In terms of finding the right opportunity, do you think I should just start reaching out to professors via email and see where it takes me? I would like to find a set up where I won’t be paid and the work would be voluntary and I want to start getting some experience as soon as possible.

Me: Yep! That’s basically what I did. I reached out to Aaron Clauset twice before he turned me down (twice!) and pointed me to Dan. Luckily I was familiar with Dan’s work because I attended one of his talks.

Some professors are too busy to respond to/take new people. So if you don’t hear back in a week, move on.

And that’s another thing: find talks and colloquiums to attend. You get familiar with other cool things happening around campus and the world.

This conversation took place on Slack, which is why it was possible to make a post here. The conversation continued and revolved around specific research groups.

This is unedited for the most part - the edits were primarily fixing typos and grammar corrections. I decided to leave it in the form of a conversation to follow the Socratic method.

I hope this was interesting to read!

Updated March 3, 2020: Corrected an error about the research done at Earth lab. Rephrased my choice of axes, and examples for uninteresting problems, to avoid misinterpretation.

Looking Behind and Looking Ahead: 2019 and 2020

2020-01-04T00:00:00+00:00

Here it is: the annual retrospection for 2019, along with the things I’m looking forward to in 2020.

Looking Behind at 2019

2019 was an interesting year. Here are some of the highlights:

I took part in the Mathematical Contest in Modeling again in January 2019. Our team modeled the opioid crisis in Appalachia and our paper was chosen as one of the three outstanding winners. Here is a news article from CU Boulder’s Applied Math department.
I spent the summer of 2019 in Seattle interning in Microsoft’s Edge Experimentation team.
I also broke my foot in Seattle.
I started working on my honors thesis. It involves modeling the effects of node covariates on the outcome of interactions between nodes in a complex network.
I led a recitation group for Critical Encounters, which was a class that reshaped my thinking. The recitation involved discussing personal philosophies with four freshmen for an hour every week.
I finally got my driver’s license.
I took a few interesting classes in 2019 – Chaotic Dynamics, Randomized Algorithms, Network Science.
I started playing video games again.
I applied to grad schools 🤞

Looking at the numbers

Number of goals that I set out to achieve: 3
Number of goals that I completed: 0 (this fell apart two weeks into classes)
Number of hackathons organized: 2
Other competitions: MCM
Number of scientific papers read: 32 (+2 from last year)
Number of books read: 10 (-1 from last year)
Number of books in progress: 1
Favorite fiction: The Once and Future King
Favorite non-fiction: Sapiens: A Brief History of Humankind
Favorite music album: Lover
Favorite movies: Knives Out, Avengers: Endgame
Favorite TV show: The Witcher
Number of concerts attended: 2
- Most memorable: Lewis Capaldi (Seattle)
Number of theatrical performances attended: 5
- Most memorable: Broadway Christmas Carol
Number of video games played: 4
- Most favorite: The Witcher 3: Wild Hunt (there’s a theme going on here…)
States visited: Washington
New outdoor activities picked up: Nordic skiing
Old outdoor activities continued: Hiking, climbing, mountain biking
Number of 14ers completed: 0 (-1 from last year)

Looking Ahead at 2020

Here are some things I am looking forward to in 2020:

Being more intentional about what I do
Having more music - I’m excited to learn piano!
Teaching - I will be a teaching assistant for Chaotic Dynamics
Attending more concerts and theatrical performances
Reading more
Exploring more of Seattle and Washington this year
Finishing my thesis!

And some goals I am setting for myself:

Read a scientific paper a week
Read a book a month
Complete 3 side projects
Learn one new song every month

That’s it for now! Happy new year!

“What do you mean? Do you wish me a happy year, or mean that it is a happy year whether I want it or not; or that you feel happy this year; or that it is a year to be happy on?”

“All of them at once!”

A lesson in speed and math abstraction

2019-11-27T00:00:00+00:00

When simulating a model, it is easier to take a teleological perspective. It is easier to approach the problem with the end in mind and work backwards, writing code how we would describe the model in words. This is definitely a good start. Sometimes though, as you may have guessed, this does not give the most efficient code.

I encountered this problem when I was implementing a network SIR model. My original implementation took upwards of 3 hours to complete a single iteration. This is not very ideal, especially when running the model for different parameters to compare results.

There were two bottlenecks in my implementation. To my surprise, when I tried to strip the details away from these bottlenecks, I was left with what resembled a textbook problem from an introductory probability course. And solving these problems in their raw and uncouth form, seeming to have no purpose without the application, I reduced the runtime of a single iteration to less than 2 seconds.

Bottleneck #01: Infecting people

In the traditional SIR model, there is an infection stage where already infected individuals try to infect the people they come in contact with. In our version of the network SIR model, $n$ infected people travel to a different state and try to infect the $m$ uninfected people at the destination with probability $p$.

My initial thought was to have each of the $n$ infected people try to infect every uninfected person. If at least one of these attempts is successful, then this person becomes infected. So, I naively wrote the following code:

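import numpy as np
# For each of the m uninfected people, count the successes out of n attempts
infected_possibility = np.random.binomial(n, p, m)
# Binarize: one or more successful attempts means the person is infected
infected_possibility[infected_possibility > 0] = 1
num_infected = np.sum(infected_possibility)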

Basically, every element in the infected_possibility vector tells the number of successful infections inflicted upon that person (after $n$ attempts, where each attempt is an independent Bernoulli trial). Then, I binarize the vector and sum it to get the total number of infected people. Clearly, this was super slow.

A binomial of a binomial

Taking a step back, all I care about is that there is at least one successful infection out of $n$ attempts. So, if $X_i$ is the total number of successful infections upon person $i$, then $X_i \sim Binomial(n, p)$. So,

\[ P(X_i > 0) = 1 - P(X_i = 0) = 1 - (1-p)^n. \]

This is the probability that there is at least one successful infection. Now, if I am treating each uninfected individual independently, then I essentially have another set of Bernoulli trials with probability $1 - (1-p)^n$. Together, this becomes another binomial. In the end, all I care about is the random variable $Y \sim Binomial(m, 1 - (1-p)^n)$.

This interesting turn of events leads to the following code:

prob = 1 - (1 - p)**n
num_infected = np.random.binomial(m, prob)

Needless to say, this is extremely fast. For $n = 10, p = 0.1, m = 5000$, the naive version took $795 \mu s$ while the mathematically intelligent version took only $9.69 \mu s$.
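To convince yourself that the two versions really sample the same quantity, a quick check like the one below can help. This is only a sketch: the seed and the number of repetitions (`trials`) are arbitrary choices of mine, and it uses NumPy's newer `default_rng` generator rather than the `np.random.binomial` calls above.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n, p, m = 10, 0.1, 5000         # the values quoted above
trials = 500                    # hypothetical number of repetitions

# Naive version: n infection attempts per person, then count anyone hit at least once.
naive = (rng.binomial(n, p, size=(trials, m)) > 0).sum(axis=1)

# Collapsed version: a single binomial draw with the "at least one success" probability.
prob = 1 - (1 - p) ** n
fast = rng.binomial(m, prob, size=trials)

# Both sample means should sit close to the theoretical mean m * prob.
expected = m * prob
print(naive.mean(), fast.mean(), expected)
```

If the algebra is right, the three printed numbers should agree up to sampling noise.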

Bottleneck #02: Recovering the infected

Another important step in the SIR model is the recovery step. There are different versions of this stage. In our version, and most other commonly used versions, each infected person can either recover, die, or stay infected with some probabilities $p_r, p_d, p_i$ (that sum to 1).

In my first implementation, I decided to use numpy.random.choice where my choices were 0 (recovered), 1 (stay infected), and 2 (dead). After randomly choosing from these options, I calculated their respective frequencies like:

x = np.random.choice([0, 1, 2], m, p=[p_r, p_i, p_d])
recovered = len(x[x == 0])
dead = len(x[x == 2])

While this doesn’t seem bad at first glance, numpy’s random choice generator can be slow. Besides, there is the masking operation after which I compute the size of the vectors. This made the code really slow. With $m = 5000$, it took $529 \mu s$.

Uniformly simulate the choices

My initial reaction to fix this problem was to manually simulate the choices with a $Uniform(0, 1)$ distribution like:

x = np.random.uniform(0, 1, m)
# Disjoint intervals: [0, p_r) means recovered, [1 - p_d, 1) means dead
recovered = len(np.asarray(x < p_r).nonzero()[0])
dead = len(np.asarray(x >= 1 - p_d).nonzero()[0])

This improved the performance to $250 \mu s$. But this was still the bottleneck, with the simulation taking close to 1 hour to complete a single iteration.

A multinomial?

Abstract away the details. Now, breathe. What am I trying to do?

I have $m$ objects. I want to assign each object to one of three groups. And I only care about the final counts in each group, not the assignment itself. This smells oddly familiar. Can this be… a multinomial?

Turns out it is as simple as a multinomial distribution, and I was just wasting my energy worrying about the details!

x = np.random.multinomial(m, [p_r, p_i, p_d])
recovered = x[0]
dead = x[2]

It took me an insane 3 hours to realize the connection, and the code takes only $8.77 \mu s$ to run.
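Here is a small self-contained sketch that puts the per-person draw and the multinomial draw side by side. The probabilities `p_r`, `p_i`, `p_d` and the seed are made-up values for illustration, not the ones from my simulation.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
m = 5000
p_r, p_i, p_d = 0.3, 0.6, 0.1   # hypothetical probabilities that sum to 1

# Per-person draw followed by counting, in the spirit of the first implementation.
x = rng.choice([0, 1, 2], size=m, p=[p_r, p_i, p_d])
counts_choice = np.bincount(x, minlength=3)

# Direct draw of the three group counts: [recovered, stay infected, dead].
counts_multi = rng.multinomial(m, [p_r, p_i, p_d])

print(counts_choice, counts_multi)
```

Both count vectors always sum to $m$, and each entry fluctuates around $m$ times its probability. The multinomial draw just skips the per-person bookkeeping.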

In the end, each iteration of this efficient simulation took less than 2 seconds to finish. With this, searching the parameter space and generating results should be very fast.

If you are interested in running the results yourself, the test notebook is available on GitHub. The notebook also has some interesting failed alternate versions not discussed here. One version worth noting is realizing that Bottleneck #01 can also be re-imagined as a geometric distribution 😉

The Lesson

The lesson here is somewhat of a case for pure mathematics to applied mathematicians. As computer scientists and applied mathematicians, we often focus more on the applications. It becomes so easy to get lost in the details of the system that we tend to miss the simplicity and beauty of the underlying mathematics. When we remove the details one by one and reduce the noise, the equations settle down, leaving us with a simple, and maybe cute, textbook problem. And if it isn’t as simple as that, then you’ve got your work cut out for you!

Looking Behind and Looking Ahead - 2018 and 2019

2018-12-31T00:00:00+00:00

Here it is, the end-of-the-year post. I must say, even though I don’t write here often, I am at least consistently completing this ritual. This time, I’ve decided to combine the retrospective and the look ahead into one single post.

Looking Behind at 2018

Some important things that happened in 2018…

Research

I started working with Professor Daniel Larremore in January of 2018, and it has been amazing working with him and the people there. The broader view of what my lab is interested in is the structure of academic networks. I am helping that endeavor by automating the process of collecting data from academic CVs. This topic falls in the intersection of probabilistic modeling, machine learning, natural language processing, and, interestingly, DNA sequencing. I worked on this project over the summer too and learned a lot. And I am looking forward to learning more!

Connecting the dots

Something that makes mathematics and science very interesting to me is that everything seems to be connected. After accumulating the basic knowledge, you start to see connections between things you earlier thought were unrelated. I had the pleasure of experiencing that. Here are some of those:

In Spring 2018, two courses I took were Algorithms and Probability. An important algorithm we learnt was Quick Sort. At a later point in the semester, we calculated the expected runtime of this algorithm in Probability using the techniques we learnt in that class.
In Fall 2017, I had taken Discrete Mathematics and in Spring 2018, I had taken Differential Equations. I noticed that solving linear recurrences and linear homogeneous differential equations followed the same pattern of finding the characteristic equation. Turns out that linear recurrences represent dynamical systems in a discrete space and differential equations represent dynamical systems in a continuous space.
In Spring 2018, I also took Operations Research. Once again, the same problems we solved in Algorithms including Max-Flow and shortest path cropped up in Operations Research and we were using different techniques to solve the problems. In Algorithms, we were solving the problem under some constraints that made it possible to solve them in polynomial time. In Operations Research, we solved the general problem which was usually in non-polynomial time.
In Summer 2018, I realized that logistic regression from machine learning borrows a key idea called entropy from information theory!
In Fall 2018, I took Fourier Analysis and Physics 3 (basic quantum physics). Being able to solve partial differential equations using the Fourier technique greatly helped me understand the Schrodinger equation (I think).

Philosophy of Education

At the end of 2017, I was asked to write my Philosophy of Education. This made me rethink how I wanted to approach education. And a key ingredient to my newly formed philosophy of education was suspension of disbelief.

For education to be complete, I think there are times when we need to temporarily give up logic and reason, and indulge in something completely preposterous for pure enjoyment and spontaneity.

At that time, I really didn’t know how. I am glad to say that I was lucky to find friends who taught me how to break the structures and truly be spontaneous.

Looking at the numbers

This was sort of inspired by the post written by my professor Aaron Clauset.

Number of goals that I set out to achieve: 3
Number of goals that I completed: 0.5 (our MCM paper will be published in 2019)
Number of side projects started: 4
Number of side projects completed: 0
Number of hackathons organized: 2
Number of hackathons attended: 1
Other competitions: MCM, Google Games
Number of scientific papers read: 30
Number of books read: 11
Number of books in progress: 2
Favorite fiction: To Kill a Mockingbird
Favorite non-fiction: Leonardo da Vinci
Favorite music album: Calling All Dawns
Favorite movies: 96 (Tamil), Outlaw King
Favorite TV show: Merlin
Number of concerts attended: 4
- Most memorable: Tommy Emmanuel
Number of theatrical performances attended: 3
- Most memorable: STOMP
States visited: California, Washington, Arizona
- Most memorable city: Seattle, Washington
Outdoor activities picked up: Hiking, climbing, mountain biking
Number of 14ers completed: 1 (La Plata)

Looking Ahead at 2019

Here are some things I am looking forward to in 2019:

Being more spontaneous
Doing more of the outdoor activities I picked up
Attending more concerts and theatrical performances
Reading more fiction (watching Merlin made me miss fantasy)
The summer I am going to spend in Seattle

And some goals I am setting for myself:

Read a scientific paper a week
Read a book a month
Write more often

That’s it for now! Happy new year!

“What do you mean? Do you wish me a happy year, or mean that it is a happy year whether I want it or not; or that you feel happy this year; or that it is a year to be happy on?”

“All of them at once!”

Estimating the Number of Free Bike Racks

2018-07-15T00:00:00+00:00

If you have ever carried your bike on an RTD bus in Colorado, or even if you have just travelled on one, you will know that each bus has two bike racks at the front. Placing and retrieving your bike from these racks is almost effortless. If these racks are full, then you will have to store your bike in the storage compartment. Now, this can be really messy, especially if someone else stores their bike after you (so your bike gets pushed back) and you get off before them (so you will need to take their bike out, take your bike out, and put their bike back in).

Luckily, only a small number of passengers bring their bike on the bus. So, unless you are riding with your bike during rush hour, there is usually space in the bike racks. One morning, I noticed that the bike racks were full. Naturally, I assumed that this was a rush hour and it would be difficult for me to find a seat in the bus. However, there were only 8 passengers! This intrigued me. What are the odds that out of 8 passengers, 2 of them brought their bikes?

I immediately got down to solving something that resembled a classic example taken from a probability textbook.

A Binomial Distribution

Let the probability that each passenger carries their bike on the bus be $p$. Now, suppose there are $N \geq 2$ passengers. What is the probability that at least two of them bring their bike?

This is simply a Binomial distribution. Let $X$ denote the number of bikes. Then $X \sim Binom(n, p)$. And we have:

\[Pr(X = k | N = n) = \binom{n}{k} p^{k} (1-p)^{n-k} \] \[Pr(X \geq 2 | N = n) = \sum_{k=2}^{n} \binom{n}{k} p^{k} (1-p)^{n-k} \]
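The tail probability above is easy to evaluate directly. The sketch below does it for the 8 passengers I counted, with a made-up value of $p = 0.05$ (I have no real estimate of $p$ at this point), and cross-checks the sum against the complement $1 - Pr(X = 0) - Pr(X = 1)$. The function names are my own.

```python
from math import comb

def prob_at_least_two(n: int, p: float) -> float:
    """Pr(X >= 2) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(2, n + 1))

def prob_at_least_two_complement(n: int, p: float) -> float:
    """Same probability computed as 1 - Pr(X = 0) - Pr(X = 1)."""
    return 1 - (1 - p) ** n - n * p * (1 - p) ** (n - 1)

n, p = 8, 0.05  # 8 passengers as in the story; p is a hypothetical value
print(prob_at_least_two(n, p), prob_at_least_two_complement(n, p))
```

With these numbers the probability comes out to roughly 0.057, so two bikes among 8 passengers would be unusual but not shocking.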

A Generative Process

If I know how many passengers are on the bus, I have a quantitative estimate of the number of free bike racks. However, while I am still waiting for the bus and cranking out probabilities, I do not have any prior knowledge about the number of passengers. This is where we have the liberty to make the problem interesting by coming up with a generative process for the number of passengers, $N$. Here are some basic facts to get started:

$N$ is a discrete variable.
There are different bus stops where passengers can get on (or get off). Think of these bus stops as discrete time intervals, and each passenger getting on at a bus stop as a single event.
The number of passengers getting on at each bus stop can be considered independent of the number of passengers getting on at the previous stop.

This almost looks to me like a Poisson process. The only hiccup is that more passengers may get on at a larger bus stop, i.e., the rate at which events occur is not constant (a constant rate is fundamental to a Poisson distribution). But we can still approximate $N$ using a Poisson distribution, hoping that the differences in the rates of events cancel each other out. So, $N \sim Poisson(\lambda)$ where $\lambda$ is the average number of passengers on the bus.

\[Pr(N = n) = e^{-\lambda} \frac{\lambda^n}{n!} \]

Putting everything together

The Law of Total Probability gives a way to directly estimate the likelihood of $X$:

\[Pr(X = x) = \sum_{n} Pr(X = x | N = n) Pr(N = n) \]

Since we want to know $Pr (X \geq 2)$, we have $n \geq 2$. Further, the seating capacity of a bus is $N_{max}$. So, $2 \leq n \leq N_{max}$. Putting everything together, we have:

\[Pr(X \geq 2) = \sum_{n=2}^{N_{max}} Pr(X \geq 2 | N = n) Pr(N = n) \] \[Pr(X \geq 2) = \sum_{n=2}^{N_{max}} \Big(\sum_{k=2}^{n} \binom{n}{k} p^{k} (1-p)^{n-k}\Big) e^{-\lambda} \frac{\lambda^n}{n!} \]
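The double sum can be evaluated numerically once $p$, $\lambda$, and $N_{max}$ are fixed. The sketch below uses placeholder values for all three (they are assumptions, not estimates from data). As a sanity check, it also compares the result with the case that has no capacity limit: a Poisson number of passengers, each bringing a bike independently with probability $p$, gives $X \sim Poisson(\lambda p)$, so $Pr(X \geq 2) = 1 - e^{-\lambda p}(1 + \lambda p)$.

```python
from math import comb, exp, factorial

def binom_tail(n: int, p: float) -> float:
    """Pr(X >= 2 | N = n) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(2, n + 1))

def poisson_pmf(n: int, lam: float) -> float:
    """Pr(N = n) for N ~ Poisson(lam)."""
    return exp(-lam) * lam**n / factorial(n)

def prob_racks_full(p: float, lam: float, n_max: int) -> float:
    """Law of total probability, summing over 2 <= n <= n_max."""
    return sum(binom_tail(n, p) * poisson_pmf(n, lam) for n in range(2, n_max + 1))

p, lam, n_max = 0.05, 15.0, 60  # all three values are hypothetical

approx = prob_racks_full(p, lam, n_max)
closed = 1 - exp(-lam * p) * (1 + lam * p)  # no capacity limit: X ~ Poisson(lam * p)
print(approx, closed)
```

When $N_{max}$ is well above $\lambda$, the truncated sum and the closed form are practically indistinguishable, which makes the closed form a convenient shortcut.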

More Data

There is some neat math going on here. But, how much of what has been proposed is actually valid? This is where data can help validate (or discard) this model.

We can get a crude estimate of $p$ relatively easily. Just set $p$ to be the fraction of people in Colorado who own a bike (which can be estimated with the number of bikes sold and the population). Fancier techniques can be used to polish this estimate, but this will suffice as a good starting point. Estimating $\lambda$ is the difficult part. We need data about how many people use the public bus. While I am sure this information is collected by the RTD, getting access to it is a different problem.

If you know where I can get access to such data, or have ideas to overcome this limitation, you should definitely contact me!

The Cost of Privacy

2018-03-16T00:00:00+00:00

UPDATE 4/24/2018 I am pleased to say that our paper was selected as a Meritorious Winner (one of the top 10%)!

Every year, the Consortium for Mathematics and its Applications (COMAP) hosts an international contest for high school students and college undergraduates where the participants get to work in teams of up to 3 to analyze, and propose solutions to, open-ended problems. COMAP releases 6 problems (3 of which are mathematical, and the other 3 incorporate interdisciplinary ideas) at the beginning of the contest. The contest itself takes place over 4 days, and at the end, the teams submit a 20 page report on their work.

Background

Our team chose to model the cost of privacy. This is a particularly interesting problem because private information (PI) can reveal a person’s personality, ideas, interests, and identity. And, social media networks like Facebook and Google are already using our PI to make profits. However, there is no system in place for the owner to receive financial compensation.

Modelling financial compensation for PI is no simple task. This is a sensitive measure and highly dependent on the risks and benefits associated with each person sharing their information. This not only varies from person to person, but also varies with what kind of information is being shared. We explored the value of PI and created a model that considers the trade of PI in a free market.

After considering the subjective nature of the task at hand, we are still left with addressing the political, cultural and ethical implications of the free trade of PI.

Problem Summary

The problem can be conquered by dividing it into the following sub-problems:

Develop a price point for PI that takes into account the risks and benefits involved in sharing data with an unknown third party
With the help of the price point, create a pricing structure for PI
Using this pricing structure, develop a pricing system that treats PI as a commodity that could be traded
The model we develop should also take into account that human data is highly correlated i.e., the model should effectively capture the network effects of data sharing
We also need to consider the political, cultural and ethical implications of PI being available for sale

Our Model

Without going into the details of the model, we created a model with the following characteristics:

To create a price point for PI, we took a weighted average approach. We accounted for characteristics (such as education, age, etc.) that are most relevant to each specific facet of PI (social media, finance, general ID, etc.) and factored in the risk associated with people sharing their PI depending on the characteristics.
Using this price point, we developed a pricing structure that depends on the actual value of each PI record (name, birthday, bank information, etc.). With this pricing structure we turned PI into a commodity and brought in forces of supply and demand for PI under the assumptions of a free market.
To effectively capture the network effects of data sharing, we used network ranking algorithms (PageRank) to determine how much influence a person has in their society. We factored this into our pricing structure while also keeping in mind how connected the network is. Further, we also discussed the use of community detection algorithms (SpringRank) to get a better measure of how connected a person is.

It turns out that our model works under the assumptions of a free market and obeys the laws of microeconomics. Therefore, our model can theoretically scale well to real markets with factors such as government regulation and international trade.

Full Report

The complete discussion of our model is beyond the scope of this blog post. For more details, such as the assumptions of our model, the mathematics of our model, its strengths and weaknesses, the sensitivity analysis, and a closer look at the ethical issues surrounding the trade of PI, do read the actual paper here.

Acknowledgements

Shout out to my awesome teammates, Johann and Brendan.
I also want to thank Anne Dougherty, the head of the Applied Math Department at CU Boulder.
And, of course, I also want to extend my thanks to the Engineering Honors Program for giving us the space and resources to work for 4 straight days on just math.

The Number Guessing Game

2018-01-08T00:00:00+00:00

Let’s play a game. I think of 5 numbers from 1 to 100. A friend, who has no idea what my 5 numbers are, then tells you that you can pick a number from 31 to 60. You win the game if the number you picked is one of the 5 numbers I thought of. Assume that I had no idea that you were going to be restricted to guessing only a number from 31 to 60 (otherwise it wouldn’t be fair!). What are the odds of you winning the game?

Why is this interesting?

Well, apart from the fact that a mathematician never shies away from a problem, this problem is interesting because there is a seemingly complicated twist to the original problem. It turns out that it actually isn’t that complicated at all.

But, the problem is most interesting because of the answer to the question. So, you will have to stick around until the end to know why this is interesting and worth thinking about. Now that we have established the conundrum of the Hermeneutic Circle, let’s dive into the solution. If you are very impatient, jump to the results to know the answer.

The Original Problem

The question I posed is a spin-off of a very simple problem in probability. Let’s solve that before introducing the intricacies and restrictions. The problem goes like this:

I think of 5 numbers from 1 to 100. What are the odds that you guess exactly one of those numbers in a single attempt?

The solution is straightforward. There are 5 right answers. And you have a pool of 100 numbers to guess from. \[Probability = \frac{5}{100} = 0.05 \] It will do well to remember the number 0.05. Now, let’s look at the problem at hand.

The Solution

For the sake of simplicity, let’s call the person who thinks of the number as Player 1 (or P1) and the person who guesses as Player 2 (or P2). And for completeness, let’s call the person who imposes restrictions, making life harder for P2, as Referee (or R).

Back to the original problem, let us study the situation before we answer the actual question. First off, notice that there are 6 different possibilities for how the 5 numbers can fall relative to the range.

None of the 5 numbers lie in the range
Exactly 1 of the 5 numbers lies in the range
Exactly 2 of the 5 numbers lie in the range
Exactly 3 of the 5 numbers lie in the range
Exactly 4 of the 5 numbers lie in the range
All of the 5 numbers lie in the range

For succinctness, let us call the event that exactly $i$ numbers lie in the range as $R_i$. So, the above mentioned possibilities are events $R_0$, $R_1$, $R_2$, $R_3$, $R_4$, and $R_5$. Notice that these events are mutually exclusive and exhaustive.

Let us call the event that P2 wins the game, i.e., guesses a correct number, $C$. It is actually easier for us to calculate the probability of $C$ occurring conditioned on the events $R_i$. It is also straightforward to calculate the probability of $R_i$. So, with the help of the law of total probability, we can answer the question posed as follows:

\[P(C) = \sum\limits_{i = 0}^5{P(C|R_i) * P(R_i)} \] \[P(C|R_i) = \frac{i}{30}\] \[P(R_i) = \frac{(^{30}C_i) * (^{70}C_{5-i})}{^{100}C_5} \] \[P(C) = \sum\limits_{i = 0}^5{\frac{i}{30}*\frac{(^{30}C_i) * (^{70}C_{5-i})}{^{100}C_5}} = 0.05 \]
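The sum above can be evaluated exactly with rational arithmetic, which avoids any doubt about floating point rounding. This is a minimal sketch and not the script I originally wrote; the function name `win_probability` is made up for this example.

```python
from fractions import Fraction
from math import comb

def win_probability(n: int, m: int, k: int) -> Fraction:
    """Sum of P(C | R_i) * P(R_i) for i = 0..k, kept as an exact fraction."""
    total = Fraction(0)
    for i in range(k + 1):
        # Ways to place i of the k numbers inside the range and k - i outside it.
        p_ri = Fraction(comb(m, i) * comb(n - m, k - i), comb(n, k))
        total += Fraction(i, m) * p_ri
    return total

print(win_probability(100, 30, 5))  # prints 1/20
```

The exact value is $1/20$, i.e., 0.05.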

Surprisingly, we get the same answer we got from the original problem, i.e., 0.05. Could this just be a coincidence?

Generalization

Let us generalize our formula for arbitrary values. Let $S$ be the set of all elements from which P1 can choose. And let $k$ be the number of elements that P1 thinks of. Now, R imposes a restriction on P2. Let $A$ be that restriction i.e., the set of all elements from which P2 can guess the answer. Let $|S| = n$, $|A| = m$, and $A \subseteq S$. Therefore $m \leq n$. Let the set of elements that P1 thinks of be $X$. Clearly $|X| = k$.

Our events are defined as before. $R_i$ is the event that exactly $i$ elements of $X$ lie in $A$ i.e., $|X \cap A| = i$ where $0 \leq i \leq k$. $C$ is the event that P2 wins the game. Again, applying the law of total probability, we have:

\[P(C) = \sum\limits_{i = 0}^k{P(C|R_i) * P(R_i)} \]

Consider the event $C$ conditioned on $R_i$. P2 can guess from a total of $m$ elements. But, only $i$ of them can make P2 win. Therefore, the probability of $C$ conditioned on $R_i$ can be written as:

\[P(C|R_i) = \frac{i}{m}\]

The number of ways event $R_i$ can occur is the number of ways we can choose $i$ elements from $A$ times the number of ways we can choose the rest i.e., $(k - i)$ elements from $S-A$. Note that because $A \subseteq S$, $|S - A| = n - m$. So, we can write the probability of $R_i$ as:

\[P(R_i) = \frac{(^{m}C_i) * (^{n-m}C_{k-i})}{^{n}C_k} \]

Now, we can answer the generalized question:

\[P(C) = \sum\limits_{i = 0}^k{\frac{i}{m}*\frac{(^{m}C_i) * (^{n-m}C_{k-i})}{^{n}C_k}} \]

Results

I have written a quick Python script to evaluate this for different values of $n$, $m$, and $k$. It turns out that if we set $n = 100$ and $k = 5$, then for all $m$ such that $0 < m \leq n$, $P(C) = 0.05$. This is far too interesting to be just a coincidence…

Well, in fact for arbitrary $n > 0$ and $0 < k \leq n$, as long as $0 < m \leq n$,

\[P(C) = \frac{k}{n} \]

This means that the restriction the referee R imposes on P2 has no effect on the odds that they will win the game.
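One way to see where $k/n$ comes from: $\sum_i i \, P(R_i)$ is the expected size of $X \cap A$. Each of the $m$ elements of $A$ belongs to $X$ with probability $k/n$, so by linearity of expectation that expected size is $mk/n$, and dividing by $m$ leaves $k/n$. The game can also be simulated directly. The sketch below is a rough Monte Carlo check with hypothetical ranges and an arbitrary number of rounds, and it assumes P2 guesses uniformly at random inside the allowed range.

```python
import random

def simulate(n: int, k: int, low: int, high: int, rounds: int, seed: int = 0) -> float:
    """Estimate P2's winning chance when restricted to guesses in [low, high]."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(rounds):
        chosen = set(rng.sample(range(1, n + 1), k))  # P1's k numbers
        guess = rng.randint(low, high)                # P2's guess inside the referee's range
        wins += guess in chosen
    return wins / rounds

# Three different restrictions, including no restriction and a single allowed number.
for low, high in [(31, 60), (1, 100), (50, 50)]:
    print((low, high), simulate(100, 5, low, high, rounds=100_000))
```

All three estimates should hover around $k/n = 0.05$, whatever range the referee picks.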

Discussion

Our intuition says that if R gives a smaller range for P2 to guess from, then it reduces the probability that P2 wins, thus increasing the probability of P1 winning. But we have just shown that our inherent human intuition is wrong here, just as it is in the Monty Hall Problem and many other cases.

But how can we understand the result we just derived intuitively? Keep in mind the way we solved the original problem. Now, if the numbers are truly random, then the odds will be the same irrespective of the restriction imposed on P2. This is due to three facts:

P1, the person thinking of the numbers, doesn’t know the restriction that will be imposed.
R, who sets the restriction, does not know what numbers P1 has thought of.
P2, the person guessing, also doesn’t know the numbers P1 has thought of.

Since we have eliminated bias in all three people playing the game, we need to account for all the different possibilities the situation creates. In doing so, the net effect of the restriction becomes nil. Thus, we end up with the original problem again. We went to great lengths trying to complicate a simple problem only to go back to square one!

Acknowledgements

Shout out to my professor Chris Ketelsen and my classmate Michael Dresser for encouraging, and helping, me to think about this problem. Another shout out to my friend Aravindh Shankar for proofreading my solution.

2018 Goals

2018-01-01T00:00:00+00:00

Last year (2017), I had set three long term goals. They focused entirely on computer science. They didn’t go entirely well. Perhaps that could be attributed to the naivety of the goals. This year, I will once again set three goals for the year. But, this time, I want to make sure that these are not focused only on computer science.

Goal 1

Write at least one technical blog post each month

Last year, I tried to commit code each day thinking that I will learn something in that process. But, it turned out that it actually stopped me from learning something. Hopefully, this task will achieve the same end goal. Why? This will force me to constantly work on something technical throughout the year. This could be anything from a cool math problem I solved or explaining a complex topic or a research project I am working on. This will also improve my scientific writing skills.

Goal 2

Read at least 20 books

I love reading books. But lately I haven’t been able to find the time to read all the books I want to. So, I put this goal out there to motivate me to read books. These could be anything from fiction to non-fiction, though I see myself reading fantasy and science fiction more than anything else.

Goal 3

Co-author a technical research paper or journal article

This is probably the most far-fetched goal I have ever set for myself. That is good. It constantly makes me improve and learn. I have started doing research (since the Fall of 2017), so hopefully this isn’t really as far-fetched as I think it is. This will keep me on my toes.

Those are the three goals I am setting for myself. At the end of this year, I will look back and evaluate how successful I was accomplishing them.

That’s it for now. I wish you a happy new year!

2017: A Retrospective

2017-12-30T00:00:00+00:00

2017 was a long year, good nonetheless. I want to take a moment to look back at the highlights of 2017. Some of the things I cover here are the goals I set out at the beginning; some of my favorite books; the good stuff – research; and other miscellaneous things.

Goals

I set out 2017 with some goals (found here). Here is how well that went:

1 commit a day challenge: It started off well. But halfway through, I realized that I was writing code and committing code not because I wanted to do that, but because I felt obligated. So, I decided to quit the challenge as it was ridiculous and taking away time that I could have spent learning something else.
Solve 100 problems on Project Euler: I solved 77 problems. Then, school and research started taking priority and I soon forgot about this. Maybe I will complete it in 2018.
Implementing Neural Algorithm for Artistic Style: I actually managed to complete this one. Here is the link to the source code.

No matter how bad (or good, depending on how you look at it) these went, I am still planning on writing a set of new goals for 2018. Hopefully these are more realistic and I actually manage to stick to them.

Books

These are my favorite books that I read in 2017 (though all of them are much older):

Mistborn Trilogy [Brandon Sanderson]: I went in not knowing what to expect and that is the best reading experience. The magic system is really well defined with limitations. It keeps you guessing what’s about to happen and what does happen is nothing you expected, but all the clues were right there in front of you! You can buy them here.
The Death of Ivan Ilyich [Leo Tolstoy]: This book made me think about how I want to live my life. It is a tad bit dark, but I recommend this book to everyone, no matter how old you are. You can buy it here.
The Great Mathematical Problems [Ian Stewart]: Discusses 10 of the most famous math problems and their history. If you love math, you should definitely read this book. If you don’t, reading this book will change your opinion. Slightly. You can buy it here.

Research

I started out doing research in the Fall of this year. In the Fall semester, I worked in the AVS laboratory in the Aerospace department on feature tracking algorithms for optical navigation in space. We (a postdoc and I) bootstrapped a deep neural network to identify craters in images with the TensorFlow Object Detection API. We then came up with a tracking algorithm to track the craters in videos. You can read the blog post I wrote on that here.

Starting 2018, I will be working on a new project developing mathematical models for feature extraction from texts.

Misc

I took Critical Encounters. This class had a real impact on me and made me rethink several aspects of my life. It made me think about what kind of a person I want to be and how I should live my life. This was where I read The Death of Ivan Ilyich by Leo Tolstoy. I also wrote my Philosophy of Education as an assignment for this class.
I travelled to New York in August for HackCon V - a conference for hackathon organizers. Here is the post I wrote.
I wrote two posts explaining how to use TensorFlow to build a linear regression model and a simple neural network.
My three favorite movies this year (in that order) are Logan, Dunkirk and The Greatest Showman. Note: I edited and replaced Wonder Woman with The Greatest Showman.

That’s it for now. See you again in 2018!

My Philosophy of Education

2017-12-19T00:00:00+00:00

If you had asked me why I want to go to university, and why I am studying what I am studying, a few months ago, I would have replied with superficial answers like, “Get a job” and “I love mathematics”. But after taking the class Critical Encounters, my answer is completely different. This class made me rethink several aspects of life, and question who I wanted to be. One of the final assignments for this class was to give my statement on the philosophy of education. Here is the prompt for the assignment:

What do you want from university education? How do you want to approach it? What is its purpose in your life?

And this was my response.

At the crux, education is a very simple notion. Those who are educated know that they know nothing. And I want my university education to be an embodiment of that idea. I want to be challenged every day, and every moment - I want to be constantly reminded that the only thing I can ever know for sure is that I know nothing. I want my education to equip me with the tools necessary to wrestle with the thought that I will forever remain this way. And through these challenges, I want to identify, and maybe invent, myself. Finally, I want my education to allow me to make my own choices - not those defined by the society - with the newfound perception of myself. As Merton says, “[…] to identify who it is that chooses”.

I want to approach my education from three different avenues - curiosity, skepticism, and suspension of disbelief. Though they are conflicting at a superficial level, I think they reflect the idea that “Knowing that I know nothing” is the essence of education. Firstly, I want to approach education with curiosity, a yearning to know more about everything. Only with this constant dissatisfaction with what I know right now can I ever truly learn that I know nothing. Next, I want to approach everything that my curious mind wants to know with a degree of skepticism. The fact that I know nothing should make me question whether the source of this new idea, be it a book or a lecturer, knows anything. Doubting and questioning everything is the key to understanding that we all know nothing. Finally, I want to approach education by suspending disbelief. Sometimes, perhaps to prevent the elusive case of Ivan Ilyich-ism, it is better to suspend your rationality, and believe something surreal for the sake of it. For education to be complete, I think there are times when we need to temporarily give up logic and reason, and indulge in something completely preposterous for pure enjoyment and spontaneity. Approaching education from these contrasting paths is a great challenge by itself. And I think that this is the way to identify myself.

The purpose of education in my life, on an abstract level, is to be able to make my own learned choices, and have my own conscious opinions. Armed with these, I want to spread the same to my society. I want to help others to make their own learned choices, and have their own conscious opinions. I want to use the gift that is my education to help others. Concretely, this could be something along the lines of, but not necessarily limited to, setting up a charity for those in need, bridging the scientific gap across the world, and coming up with technology that helps people with terminal illnesses like cancer. In short, with my education, I want to make a difference in others’ lives.

This made me really think about what it is I want from my university education. Now that I have discovered why I truly want to be educated, I think it is of utmost importance that I never lose sight of this. I also think that it is important for others to know what I value the most in education. For those reasons, and others, I have added a permalink to my statement on the philosophy of education to my homepage. You can find the link on the navigation pane. And here you go - Philosophy of Education

Feature Tracking and Optical Navigation

2017-11-30T00:00:00+00:00

This article is a simplified version of the research report that aims at identifying and tracking craters in images for optical navigation in space. We first survey existing image processing techniques. We then proceed to bootstrapping a deep neural network classifier with the help of the TensorFlow Object Detection API and images from NASA’s Detecting Crater Impact Challenge. We then implement a preliminary tracking algorithm that stores images and computes mean squared error to detect if the crater has already been seen before.

In this article, we additionally go into more detail on the tracking algorithm. For the code, check the object-detection branch of our GitHub repository.

Background

Ever since humans landed on the moon, it became clear that deep space travel is a possibility in the future. One of the biggest issues faced by satellites and probes that we have sent into space is that they are unable to react to the presence of other astronomical objects in real time. This means that they must rely on scientists back on Earth for navigation. Satellite images get sent back to Earth for scientists to study the situation. At the least, the time delay slows down the mission progress and causes overhead.

In this article, we attempt to provide a method to track craters on astronomical objects. We will first identify potential craters on the astronomical object. Then we will start tracking these potential craters and calculate how far these craters have been displaced since the last image was taken. We can then feed these inputs into a navigation filter for the actual navigation.

Our Workflow

This research area is very broad and involves three big topics – identification of craters, tracking of craters, and navigation. This article will mainly focus on the crater identification and will briefly touch upon tracking.

Our workflow, for tackling the identification of features, is to design two different models that achieve the same results simultaneously. The first will use pure image processing techniques to identify craters. The second will be a mixture of image processing and machine learning techniques. Then we will choose either one of the models, or a combination of both, based on their robustness.

For the second question, tracking craters, we discuss a preliminary algorithm we designed.

Image Processing

We used off-the-shelf feature detection algorithms to test their robustness. OpenCV offers many feature detectors like the Hough Circle Finding method and Harris Corner Detection. These feature detectors provided by OpenCV allow for decent crater findings, but need significant tuning. This leads to questions regarding robustness and automation, and this is where machine learning might lead to better results.

Here is a sample result we got after a lot of fine tuning. As you can see, the results are decent, but not very precise.

Deep Learning

Instead of training a deep convolutional neural network from scratch, we decided to bootstrap a neural network using the TensorFlow Object Detection API and images from NASA. The base model we trained on was originally trained on the COCO dataset with the architecture of Microsoft’s award-winning 152-layer residual neural network. We trained the model for nearly 10 hours and got these results at the end.

Tracking

Note: This goes into more detail than the report

Tracking craters presented a problem. We began by comparing the cropped images of craters against one another by computing the norm of the difference of pixels. But that does not make use of another important result we are calculating. The rate at which a crater moves across the camera depends on the speed of the satellite. And under normal conditions, the motion of the crater can be seen as continuous. So, if the distance between the centers of two craters across two consecutive frames is small, then it is likely that the two craters are the same.

Thus, we have two different parameters - one that tries to say that two craters are different (the norm of difference), and another that tries to say that two craters are the same (the Euclidean distance between the centers). With this, we can construct a new cost function as follows:

\[Cost(X, Y)= \alpha\frac{\left\lVert {X - Y} \right\rVert}{\left\lVert {X^{0} - Y} \right\rVert} + \beta(\left\lVert {X_c - Y_c} \right\rVert - 0.5)\]

Here $\alpha$ is a hyperparameter that we can tune. We also normalize the norm of the difference to keep the errors within a small range. $X^{0}$ is the first crater against which we compare $Y$. So, the first term becomes just $\alpha$ when we compare against the first image $X^{0}$. $\beta$ is a parameter that depends on the rate at which craters move across the camera i.e., the speed of the satellite. In fact, we can easily see that $\beta$ depends inversely on the speed. $X_c$ and $Y_c$ are the vectors representing the centers of the craters $X$ and $Y$ respectively.
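To make the cost function concrete, here is a minimal NumPy sketch of it. This is an illustration, not the code from our repository: the function name `crater_cost`, the default values of `alpha` and `beta`, and the small `eps` guard against division by zero are all assumptions added for the example, and the crops are assumed to have been resized to the same shape.

```python
import numpy as np

def crater_cost(X, Y, X0, Xc, Yc, alpha=1.0, beta=1.0, eps=1e-12):
    """Cost of matching the new crop Y against the stored crop X.

    X0 is the first stored crop that Y was compared against, and
    Xc, Yc are the crater centers. eps is an added safeguard that is
    not part of the formula above.
    """
    X, Y, X0 = (np.asarray(a, dtype=float) for a in (X, Y, X0))
    Xc, Yc = np.asarray(Xc, dtype=float), np.asarray(Yc, dtype=float)
    # Appearance term: norm of the pixel difference, normalized by the
    # difference against the first crop
    appearance = np.linalg.norm(X - Y) / (np.linalg.norm(X0 - Y) + eps)
    # Motion term: Euclidean distance between the two centers
    distance = np.linalg.norm(Xc - Yc)
    return alpha * appearance + beta * (distance - 0.5)

# Toy 4x4 crops: comparing against the first stored crop (X is X0)
X0 = np.zeros((4, 4))
Y = np.ones((4, 4))
cost = crater_cost(X0, Y, X0, Xc=(0, 0), Yc=(3, 4), alpha=2.0, beta=0.5)
```

When `X` is the first crop `X0`, the appearance term reduces to `alpha`, as described above, so the cost is driven entirely by how far the center has moved.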

Now, we can generalize and say that two craters are different if their cost is greater than a threshold we set, say $\lambda$. In that case, we assign the crater a new tag. Otherwise, the crater keeps the tag of the crater it matched. Here are some results we got using our algorithm.

Conclusion

On the whole, we can conclude that traditional image processing techniques are not consistent. They need lots of preprocessing and manual fine tuning to work, which is what we are trying to avoid. Neural networks yield much better results and are also efficient – they take about 0.4 s for a single 600x400 px image.

Our preliminary tracking is sometimes lacking in robustness. Sometimes, we observed that the same crater gets different tags, and different craters get the same tag. We need to implement a third feature that penalizes the algorithm (increases the cost) if the crater has not been seen for a while. We could also try incorporating a simple Kalman filter that predicts the position of craters to assist our algorithm.

Future Research

We need to be able to track craters consistently, reliably, and efficiently. We need to improve upon our preliminary algorithm and increase the accuracy of our algorithm.
We need to start modelling our algorithms under different lighting conditions and angles, which is more realistic.
We need to use our results as inputs to navigation filters such as the Kalman Filter.

Clearly, we have a long way to go before this can be put to practice. But, this is a step in the right direction.

Acknowledgements

Thibaud Teil, my mentor for this research project
Dr. Hanspeter Schaub, the director of AVS Laboratory
Dr. Beth Myers and You’re@CU Program

References

[1] Simonyan, Zisserman (2014) “Very Deep Convolutional Neural Networks for Large-Scale Image Recognition”

[2] Szegedy, Liu et al. (2015) “GoogLeNet”

[3] Girshick, Donahue et al. (2014) “Rich feature hierarchies for accurate object detection and semantic segmentation”

[4] Urbach, Stepinski (2009) “Automatic detection of sub-km craters in high resolution planetary images”

[5] Kalal, Mikolajczyk, Matas (2010) “Tracking-Learning-Detection”

[6] Dor, Tsiotras “Application of ORB-SLAM to Spacecraft Non-Cooperative Rendezvous”

[7] Mur-Artal, Tardos (2016) “ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras”

[8] Ross Girshick (2015) “Fast R-CNN”

Datasets

[9] NASA Detecting Crater Impact Challenge - https://www.nasa.gov/feature/detecting-crater-impact-challenge

[10] COCO Dataset - http://cocodataset.org

Tools

[11] TensorFlow Object Detection API - https://github.com/tensorflow/models/tree/master/research/object_detection

[12] OpenCV - https://opencv.org

Neural Networks on House Prices

2017-08-12T00:00:00+00:00

In the previous article, we used linear regression to predict the price of houses. Then, we saw that this model does not find any non-linear correlations. The most fascinating thing about neural networks is that they automatically model any non-linearities present in the phenomenon. In this article, we will use neural networks to overcome that shortcoming.

Note that this is a follow-up post. We already downloaded, and cleaned the Ames housing dataset in the previous article. If you haven’t done that already, you should probably go ahead and finish that first. In addition to that, we also split the dataset into 3 parts (training, validation, and testing). I will jump into the code assuming that’s already done. Or if you prefer, you can follow along by running the Jupyter Notebook.

All of the code used here is available in the form of a Jupyter Notebook which you can run on your machine.

What is a Neural Network?

As one might think, neural networks are systems that are modelled after the human nervous system. The human body has neurons that are connected together in a very complex network, with each neuron branching out to many other neurons and getting input signals from multiple neurons. Similarly, in AI, neural networks can be thought of as inputs going to different temporary outputs, and those going to other temporary outputs, and so on until we lead the final temporary outputs to the final output. Each of these sets of temporary outputs is called a hidden layer because it doesn’t really expose itself anywhere else.

The image (taken from Wikipedia) below will help you understand the flow of inputs to outputs.

Notice that we are leading our features to multiple values. Think of each of these values as a separate linear problem (like the one we solved earlier). This new vector of values inside the hidden layer will now serve as new features for our problem. In this manner, we can construct many such hidden layers with different numbers of features. Finally, when we are happy, we can direct these features to the actual output. Generally speaking, more hidden layers equals better performance. But you must watch out for overfitting.

Notice that we described each of these connections as linear problems. That means they must have weights and biases. We find these parameters using a process called backpropagation. It’s called backpropagation because we use the final output and proceed in the direction towards the input (back) to reconstruct the weights and biases. The mathematics is a bit more complex than the one for linear regression and is beyond the scope of this article. Finally, we use an optimizer such as Gradient Descent (which is what we will use in this tutorial) to minimize the cost function.

An important aspect of neural networks is feeding the hidden layers into the next layer. It so happens that sometimes the gradient (when performing backpropagation) can vanish or explode. To prevent that we have activation functions. The most commonly used activation function is the Rectified Linear Unit function, abbreviated as ReLU, which is defined as follows:

\[f(x) = max(0, x)\]

It basically sets all negative values for the input to $0$. This function also significantly speeds up our computation process.
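As a quick illustration of the definition above, here is ReLU written with NumPy. The helper name `relu` and the sample values are just for this example.

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): negative inputs become 0, the rest pass through
    return np.maximum(0, x)

values = np.array([-2.0, -0.5, 0.0, 1.5])
activated = relu(values)
print(activated)
```

The function involves nothing more than a comparison per element, which is part of why it is cheap to compute.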

Remember that a simple linear regression has two big drawbacks:

The number of parameters is small and fixed
They only model linear correlations

This is why neural nets (NN) have an edge over linear regression:

There is great flexibility over the number of parameters (and hence performance). You can control the number of hidden layers and the number of nodes in each hidden layer.
Since there are multiple layers each being activated by a ReLU, neural networks automagically model any non-linear correlations. The better your NN (not necessarily having more hidden layers), the more non-linear correlations it captures.

The Design of Our Neural Net

The NN we are going to create is a rather modest one. It has only one hidden layer. So, you can consider it more of a proof-of-concept that NNs are better than linear regression.

Our initial number of features is $38$. So this is the size of our input layer. We will map this onto our hidden layer. Our hidden layer will have a size of $16$. This hidden layer will undergo linear rectification with ReLUs. That will serve as features for our output.

The image below represents our NN

The neural network can be defined by these equations. Here, $X$ is the input matrix, $W_i$ and $b_i$ are weights and biases respectively. $X_2$ represents the hidden layer, and $y$ is the output.

\[x_2 = W_1X + b_1\] \[X_2 = ReLU(x_2)\] \[y = W_2X_2 + b_2\]
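To see how the shapes fit together, here is a rough NumPy sketch of this forward pass. It is not the TensorFlow code from the notebook. The random inputs, the initialization scale, and the helper name `forward` are assumptions for illustration. Note also that the sketch stores one example per row, so the products are written as `X @ W1` rather than $W_1X$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_examples, n_features, n_hidden = 5, 38, 16

# Made-up inputs standing in for the cleaned housing features
X = rng.normal(size=(n_examples, n_features))

# Input -> hidden layer parameters
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
# Hidden layer -> output parameters
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
b2 = np.zeros(1)

def forward(X):
    x2 = X @ W1 + b1        # linear map into the hidden layer
    X2 = np.maximum(0, x2)  # ReLU activation
    return X2 @ W2 + b2     # linear map to one predicted price per example

y = forward(X)
print(y.shape)
```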

Training the Neural Net

Continuing on after cleaning the data, we create some variables to store the size of the training data. Next, we define the number of activation units in our hidden layer as $16$. Now, we are ready to construct our graph.

As in the previous case, we define the datasets as tf.constant because we don’t want to modify them in the graph. Observe that we have two sets of weights and biases. weights_1 and biases_1 map our input variables to the hidden layer, and their matrix sizes are chosen accordingly. weights_2 and biases_2 map the hidden layer to the output. Then we have steps. We’ll discuss this more when we move on to the optimization.

Now, we define our model. This is simply a rendition of the mathematical equations we described earlier in TensorFlow style. We do this for code reuse and readability.

Now, we compute the cost. The cost function we are using here is the same we used in the previous post. So you can read that one to gain more insight.

Then we optimize and minimize the cost. This time, we are not using a fixed learning rate. Instead, we exponentially decay the learning_rate i.e., as we run more iterations, the learning_rate slowly becomes smaller and smaller. As we get closer to the minimum, we start moving slower towards it to ensure that we do not miss it. This is where the steps variable comes into play. It keeps track of the number of iterations. And finally, the optimizer we use will be gradient descent.
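To get a feel for exponential decay, here is a small plain-Python sketch of the usual formula, which to my understanding is the same one TensorFlow 1.x’s tf.train.exponential_decay applies when staircase mode is off. The specific numbers (an initial rate of 0.01, decaying by a factor of 0.9 every 100000 steps) are made up for illustration and are not necessarily the values used in the notebook.

```python
def decayed_learning_rate(initial_rate, step, decay_steps, decay_rate):
    # The rate shrinks by a factor of decay_rate every decay_steps iterations
    return initial_rate * decay_rate ** (step / decay_steps)

# Hypothetical schedule: print the rate at a few points during training
for step in (0, 100_000, 500_000, 1_000_000):
    rate = decayed_learning_rate(0.01, step, decay_steps=100_000, decay_rate=0.9)
    print(step, rate)
```

The rate starts at its initial value and keeps shrinking as the step counter grows, which is exactly why the graph needs to keep track of steps.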

Finally, we use our parameters and predict the output for the test and validation dataset.

Now, we are ready to train our model. We initiate a tf.Session with our graph and run the graph for 1000000 steps. If you do not have access to tensorflow-gpu, I recommend you reduce the number of iterations for faster results. After running, we save our weights and biases for later use. You may want to read the previous post for a line by line code description.

Results

We first reconstruct our graph by initiating a tf.Session and restoring variables from the checkpoint file. Then we predict the sale prices of the test data from these weights and biases. Remember to predict the output using the same model you used to train.

Here is a graph comparing the actual values (blue) and predicted values (orange).

This model has a score of 2.23802. This is a slight improvement over linear regression. And this should place you a few hundred ranks above your previous rank on the leaderboards.

Scope for Improvement

As you can see, there is still room for improvement. In fact, we started out saying that NNs are, generally speaking, better than linear regression and our NN was only slightly better than the linear regression. Here are some things you can do to make the NN better:

Better feature engineering - Here is a list of things you can do to have better features:
- Keep more features. We dropped lots of features. I bet there is some correlation between these features and the sale price.
- Creating bins instead of using actual features can prevent overfitting.
Better cost function - The cost function we used does not take the large range of sale prices into consideration. Think about it this way - we penalized the model for predicting a $5000 house to be $0 (i.e., a difference of $5000) by the same amount as if it predicted a $200,000 house to be $195,000 (also a difference of $5000). We know that this is wrong. Instead, you can define a new cost function that computes the squared difference of the $log$ of the prices.
Prevent overfitting - You can use regularization to prevent this. In fact, in NN, there is a more sophisticated method called Dropout. My guess is that this won’t work well because our training data is small. But you should definitely check it out.
Go deeper - Try experimenting with multiple hidden layers and vary the number of activation units in each layer. This is really just a shot in the dark, but you never know what’s going to turn up!
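To illustrate the cost function idea from the list above, here is a small sketch comparing the plain squared error with the squared error of the logarithms. The function names and the prices are made up for this example. Both pairs are off by the same $5000, but I use positive prices throughout because the logarithm of 0 is undefined.

```python
import math

def squared_error(actual, predicted):
    return (actual - predicted) ** 2

def squared_log_error(actual, predicted):
    # Compares prices on a log scale, so relative errors matter
    return (math.log(actual) - math.log(predicted)) ** 2

cheap = (10_000, 5_000)         # off by $5000, i.e. half the price
expensive = (200_000, 195_000)  # off by $5000, i.e. 2.5% of the price

print(squared_error(*cheap), squared_error(*expensive))
print(squared_log_error(*cheap), squared_log_error(*expensive))
```

The plain squared error treats both mistakes identically, while the log version penalizes the mistake on the cheap house far more heavily.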

Final Words

I think this is a really great hands-on experience to get your feet wet with machine learning and TensorFlow. If you have any questions, or see any factual inaccuracies, let me know in the discussion below or contact me. I plan on writing more tutorials, especially for the other two Getting Started Kaggle Competitions. If you think you’d want to read those, subscribe to the RSS feed and stay updated.

Hackcon V

2017-08-07T00:00:00+00:00

3 days beside a beautiful lake, under the summer sun. 400 avid hackers who care about the community. Thousands of ideas shared. That’s probably how I’d describe Hackcon V in three lines. But it is so much more than that. Hackcon is the annual conference that brings together some of the most passionate hackathon organizers around the world to share ideas and views on how to make the hackathon community a better place for everyone.

Here, I’d like to share some of the big things I learnt there.

The Themes

The three main themes at Hackcon this year were making the community more welcoming to beginners, making the community more inclusive, and engaging the community.

Hello, World!

Why do we need beginners at a hackathon? For sustainability – the same reason the society insists on educating the younger population. There needs to be a community after the current hackers graduate.

Hackathons can quickly get intimidating. Imagine being surrounded by 400 people who have an IQ of 200 for 24 hours. That’s how newcomers imagine themselves when they enter a hackathon. In reality, it is hardly the case. There is a dire need for every newcomer to realize this.

Having workshops aimed at beginners can definitely boost their confidence. Mock hackathons, like hack nights, can help them familiarize themselves with the hacking ambience.
Most beginners fail to complete their project because they refuse to ask for help. And that’s because they think their question is “stupid”. But the experienced hacker knows that there is no such thing as a “stupid question”. This problem can probably be fixed by pairing experienced hackers with novices or by having a mentor dedicated to helping each beginner team.
Another way to boost confidence and encourage more people to complete their project is to give prizes that are dedicated to beginners.

More Inclusive

The first thing that comes to mind when we talk about inclusivity is probably gender, race, religion, and nationality. But the term is much broader than that. For instance, the education level of participants, and the field the participants are studying are often overlooked in hackathons.

Hackathons are usually dominated by college students studying computer science. There is a difference between diversity and inclusivity. As mentioned in the keynote by Alex de Aranzeta,

“Diversity is about inviting everyone to the party. Inclusivity is about asking them to dance.”

If you are not being inclusive, then having a diverse population at your event doesn’t really count for much.

Engaging the Community

We need to engage the community and keep the momentum going even after the hackathon. Having lots of workshops, tech talks, coding nights, bar camps, etc. is probably a good way to do this. Involving enthusiastic professors and professionals from the community is another great idea. Finding other student clubs or meetups that have similar goals and interests, and helping each other is yet another great idea. This can also help expand the audience of both communities.

Final Words

I must still say that Hackcon was much more than that. Putting together everything that happened there would be nigh impossible. It is something that must be experienced. My favorite part was the final keynote given by Joe Nash from GitHub, “The Life of a Student Community”. This was my first Hackcon and I’m pretty sure that it won’t be the last. I will conclude by urging you to register for the next hackathon if you’ve never been to one before, and consider attending the next Hackcon if you’re already an avid hacker!

Regression on House Prices

2017-07-31T00:00:00+00:00

Linear regression is perhaps the heart of machine learning. At least, it is where it all started. And predicting the price of houses is the equivalent of the “Hello World” exercise in starting with linear regression. This article gives an overview of applying linear regression techniques (and neural networks) to predict house prices using the Ames housing dataset. This is a very simple (and perhaps naive) attempt at one of the beginner level Kaggle competitions. Nevertheless, it is highly effective and demonstrates the power of linear regression.

All of the code used here is available in the form of a Jupyter Notebook which you can run on your machine.

Pre-requisites

This article assumes the reader is fluent enough in Python to understand the code snippets. At the very least, a strong background in another programming language is necessary. We will build our models using Tensorflow. So basic knowledge of Tensorflow would be helpful, but is not a necessity. The tutorial also assumes the reader is familiar with how Kaggle competitions work.

The Raw Data

First off, we will need the data. The dataset we will be using is the Ames Housing dataset and can be downloaded from here. Opening up the train.csv, you will notice nearly 52 features of 1460 houses. What each of these features represents is described in data_description.txt. The file test.csv differs from train.csv in that there are fewer houses and the prices of the houses are not present. We will use the train.csv file to train and build our model. Then, using that model, we will predict the prices for each of the houses in test.csv.

You might want to spend some time studying this data by graphing charts, etc. to gain a better understanding of the data. This will definitely be helpful, but we will not do that here.

Cleaning Data

The cleaning of data refers to many operations. Here we will be performing feature engineering (creating new features), filling in missing values, feature scaling, and feature encoding.

52 features is a bit overwhelming. And if you have spent time studying what each of these features represents, you’d probably say that many of the features are redundant to some extent i.e., they play a very small role in the price of a house. So the first thing we will do is remove these features and make life simpler. The code snippet describes the features we want to get rid of. But, before we remove them forever, notice that the total porch area and total number of bathrooms are split into 2 columns. Again, to make life simpler, we will combine them into a single total porch area and a single total number of bathrooms. Now, we can go ahead and get rid of all these unwanted features.

The next thing we want to do is handle missing values. There are various ways to tackle this problem. An aggressive approach is to remove that entire training example. This can be bad if there are lots of missing values because you will lose too much data. But then, why would you train a model if you think you don’t have enough data? A simple and effective approach is to replace the missing value with the mode (the most frequent value taken by that feature). A more sophisticated (and maybe better) technique is to study the other features and determine the missing value using probability and statistics. You might have guessed it - we are going to deal with missing values by replacing them with the mode.
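As a rough illustration of mode filling (not necessarily how the notebook does it), here is a small pandas sketch on a toy dataframe. The column names are borrowed from the Ames dataset, but the values are made up.

```python
import pandas as pd

# Toy data with one missing value in each column
df = pd.DataFrame({
    "GarageType": ["Attchd", None, "Attchd", "Detchd"],
    "LotFrontage": [65.0, 80.0, None, 80.0],
})

# mode() returns the most frequent value(s) per column; the first row
# holds one mode for every column
modes = df.mode().iloc[0]
filled = df.fillna(modes)
print(filled)
```

Passing a Series to fillna fills each column with the value stored under that column’s name, so every feature gets its own mode.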

The next thing we want to do is scale down the features. The motivation behind this is that some of our features have a large range of values. And this makes it difficult for our optimizer to converge. But, more on that later. We will use the following method for rescaling.

\[ x'_i = \frac{x_i - min(X)}{max(X) - min(X)}\]

Here, $x_i$ is the $i^{th}$ example of the feature $X$ and, $min(X)$ and $max(X)$ refer to the minimum and maximum values the feature $X$ takes respectively. An important thing to note is that you do not want to scale the output i.e., the Sale Price. This can lead to large errors in output and leave you clueless for a long time.
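Here is a minimal NumPy sketch of this rescaling applied to a single feature. The helper name `min_max_scale`, the sample lot areas, and the handling of a constant column (which would otherwise divide by zero) are my own additions for illustration.

```python
import numpy as np

def min_max_scale(column):
    column = np.asarray(column, dtype=float)
    lo, hi = column.min(), column.max()
    if hi == lo:
        # A constant feature carries no information; map it to zeros
        return np.zeros_like(column)
    return (column - lo) / (hi - lo)

# Made-up lot areas in square feet
lot_area = np.array([8450.0, 9600.0, 11250.0, 14260.0])
scaled = min_max_scale(lot_area)
print(scaled)
```

After scaling, the smallest value of the feature maps to 0 and the largest to 1, with everything else in between.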

In machine learning, we almost always deal with numbers. But many of the features have letters for values where each letter (or sequence of letters) refers to a particular category. This is true for many datasets. And it also makes life difficult for us. And we do not like it when life becomes difficult. So, we will encode each of these features i.e., we will define a one-to-one mapping from each of these categories to a number. The code snippet demonstrates how we achieve this.
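One possible way to build such a mapping is pandas’ factorize function, sketched below on a made-up column of quality ratings. This is an illustration and may differ from the snippet in the notebook.

```python
import pandas as pd

# Made-up categorical feature
quality = pd.Series(["Gd", "TA", "Ex", "TA", "Gd"])

# Each distinct category gets an integer, in order of first appearance
codes, categories = pd.factorize(quality)
print(codes)
print(list(categories))
```

Keep in mind that the resulting numbers impose an arbitrary order on the categories, which a linear model will treat as meaningful.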

The data we have now is almost ready for training.

Splitting Dataset

A standard practice is to split the data into 3 parts - training, validation and test datasets. We will use the training dataset alone to actually train the model. Then we will use the errors the model gives on the validation dataset to tune our hyperparameters. But now, the model we trained has “seen” the validation dataset. This means that if we were to report the error the model produced using either the training or validation datasets, the reported error would be biased because this model has been exposed to these datasets and modified to minimize the error on them. This is where the test dataset comes into play. Its purpose is to serve as an unbiased judge and report the error on the model.

Usually, the dataset is divided as 60% training, 20% validation and 20% testing. And we will follow that fashion. We will also shuffle the dataset to make sure data is equally distributed across the 3 datasets.
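The shuffle and the 60/20/20 split can be sketched with NumPy as follows. The helper name `split_dataset` and the fixed seed are assumptions for this example, and the array of row indices simply stands in for the 1460 training examples.

```python
import numpy as np

def split_dataset(data, train_frac=0.6, valid_frac=0.2, seed=0):
    data = np.asarray(data)
    n = len(data)
    rng = np.random.default_rng(seed)
    # Shuffle the rows so each part gets a similar mix of examples
    shuffled = data[rng.permutation(n)]
    train_end = int(round(train_frac * n))
    valid_end = train_end + int(round(valid_frac * n))
    return shuffled[:train_end], shuffled[train_end:valid_end], shuffled[valid_end:]

# Stand-in for the 1460 houses: just their row indices
train, valid, test = split_dataset(np.arange(1460))
print(len(train), len(valid), len(test))
```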

So far we have been dealing with pandas dataframes. Alas! Tensorflow likes numpy arrays better. So, we will have to fix that by converting the dataframes into matrices. While doing so, we also need to separate the inputs, $X$, and outputs, $y$.

Linear Regression

The Algorithm

As I mentioned earlier, linear regression is perhaps the heart of machine learning. And the algorithm is the equivalent of the “Hello World” exercise. The algorithm is a very simple linear expression.

\[Y = WX + b\]

Here, $Y$ is the output values for $X$, the input values. $W$ is referred to as the weights and $b$ is referred to as the biases. Note that $Y$ and $b$ are vectors and $W$ and $X$ are matrices. This is, in many ways, analogous to the line equation in $2$ dimensions you might be familiar with.

\[y = mx + c\]

The only difference is that we are extending and generalizing this relation to $n$ dimensions. Just like being able to find a line equation between two points i.e., calculating $m$ and $c$, we are going to find the weights $W$ and biases $b$.

In this way, we are going to map a linear relation between the sale prices and the features. It is important to stress on the fact that this is only a linear relationship. In reality, very few events are linearly correlated.

Naturally the question we have is figuring out the weights and biases. To do this we will first randomly initialize the weights, and initialize the biases to $0$. Then we will calculate the right hand side of the equation and compare it with the left hand side. We will define the error between them as the $Cost$ or, the more commonly used term in neural networks, $loss$.

\[loss = \frac{1}{2}\sum\limits_{i = 0}^n{((Y) - (WX + b))^2}\]

Then, this becomes an optimization problem where we are trying to find $W$ and $b$ to minimize the loss. There are various methods to optimize this. As usual we will stick with the simpler one - Gradient Descent Optimizer. Understanding this optimizer is perhaps beyond the scope of this article. But imagine optimizing a function in one variable using derivatives and generalizing that method to a function of $n$ variables. That is the core of gradient descent.
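For intuition, here is a bare-bones NumPy sketch of gradient descent on the loss defined above, run on synthetic data where the true weights and bias are known. Everything here (the data, the learning rate of 0.005, the number of steps) is made up for illustration, and it is a stand-in for, not a copy of, what Tensorflow’s optimizer does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data generated from known weights and a known bias
X = rng.uniform(size=(100, 3))
true_W = np.array([[2.0], [-1.0], [0.5]])
Y = X @ true_W + 4.0

# Random weights and zero bias, as described above
W = rng.normal(scale=0.1, size=(3, 1))
b = 0.0
alpha = 0.005  # learning rate chosen for this toy problem

def loss_fn(W, b):
    return 0.5 * np.sum((Y - (X @ W + b)) ** 2)

initial_loss = loss_fn(W, b)
for _ in range(2000):
    error = (X @ W + b) - Y
    # Gradients of the loss with respect to W and b
    W -= alpha * (X.T @ error)
    b -= alpha * np.sum(error)
final_loss = loss_fn(W, b)

print(initial_loss, final_loss)
```

Each step nudges the parameters in the direction that reduces the loss, so the loss shrinks and the learned `W` and `b` approach the values used to generate the data.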

Now, let’s jump into the code.

The Implementation

In Tensorflow, we first define and implement the algorithm in a structure called graph. The graph contains our input, output, weights, biases, and the optimizer. We will also define the loss function here. Then, we run the graph in a session. During each iteration, the optimizer will update the weights and biases based on the loss function.

In our graph, we first define the train dataset values and labels (output), the validation and testing datasets. Note that we are defining them as tf.constant. This means that these “variables” will not and can not be modified when the graph is running. Next, we initialize the weights and biases. We treat these as tf.Variable. This means that these “variables” have the capacity to be updated and modified during the course of our session. Pay attention to the dimensions of these matrices. You will run into errors if you get them wrong.

Now, we predict the $Y$ values using the weights and biases using the tf.matmul() function. This is nothing but matrix multiplication. Then we add that to biases. But if you go back to the definition, biases is a single number while tf.matmul(tf_train_dataset, weights) is a vector. This might be confusing because you can only add a vector to another vector. But Tensorflow is quite clever. It understands that we mean to add the same scalar biases to each element of the vector. Think about this as converting the single number into a vector (or matrix) of same dimensions as the other vector, and then adding those together. This is called broadcasting.
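NumPy follows the same broadcasting idea, so here is a tiny sketch of it with made-up numbers. The names mirror the ones in the text, but this is plain NumPy rather than the Tensorflow graph.

```python
import numpy as np

features = np.array([[1.0, 2.0],
                     [3.0, 4.0],
                     [5.0, 6.0]])       # 3 examples, 2 features
weights = np.array([[0.5], [0.25]])     # 2 features -> 1 output
biases = 10.0                           # a single number

# The matrix product has shape (3, 1); the scalar bias is broadcast
# and added to every row
predictions = features @ weights + biases
print(predictions)
```

The scalar is never actually copied into a full matrix by hand. The library treats it as if it had the same shape as the other operand.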

Then we calculate the loss as we defined previously. We can safely ignore cost for now. Its only purpose is to report the error we get. When using the gradient descent optimizer, we need a parameter (one of the hyperparameters) called learning rate. The term is self explanatory - it refers to how fast we want to minimize the loss. If it’s too big, we will only keep increasing the loss. If it’s too small, the algorithm will converge very slowly. Here, we define alpha as the learning rate. After much experimentation, I’ve decided to use 0.01 as the learning rate. It might be beneficial to vary this value and test for yourself.

Next, we define the optimizer. As mentioned earlier, we are using gradient descent with a learning rate alpha and trying to minimize loss. This will update the tf.Variable elements involved in the calculation of loss.

After that, we are predicting the outputs on the validation and testing datasets using the new weights and biases. Finally, notice the saver. What this does is it saves the weights, biases, and all other tf.Variable into a checkpoint file. We can use these at a later stage to make our predictions.

That is how our graph is constructed. Now, we can run the graph in our session.

We start our session by initializing the global variables. This means initializing all tf.Variable. Then we use the .run() function to run the session for 100000 steps. Generally, the more steps you run, the better your results. But 100000 can seem like a large number and will take a long time if you can’t make use of a GPU. If that is your case, you can either install tensorflow-gpu or just reduce num_steps to 10000. After each run, we are storing the cost and train_predictions locally outside the graph. And after every 5000 steps, we are calculating the cost of our model on the validation dataset. At the end of the run, we save the session using the saver we created in the graph.

These are my results after 100000 iterations. The blue line is the actual value and the orange line is the predicted value. It’s quite impressive that such a simple idea can yield really good results. There is still lots of room for improvement though. I will touch upon some of those ideas at the end.

The Prediction

Finally, we are ready to predict the prices of houses whose features are described in test.csv. First, we initialize a new session. Then we restore the variables from the saver. And using these restored weights and biases, we predict the output on the new dataset. You can save that into a .csv file and make a submission. You should get a score of 2.5804. And you should be placed in the top 2000 ranks (as of 31 Jul 2017).

Improvements to Linear Regression

As I mentioned earlier (and as you might have guessed) there is certainly room for improving this naive model. Here are a few ideas to think about:

Regularization - This concept is very important to make sure your model doesn’t overfit the training data. It might lead to larger errors on the training set. But your model is likely to generalize better outside your training set. This means that your model is more likely to be applicable in the real world if you use regularization.
Creating bins - Remember how widely the numerical features (like area) vary? To prevent overfitting, you can create bins for these features. For instance, all houses with area between 1000 and 1500 sq. ft would be assigned a value of 1 (say). I have seen this idea work really well for classification problems.
More features - I dropped a lot of features, reasoning that they wouldn’t affect the house price. In reality, I have no basis for that “fact”. Actually, there is a good chance that there is at least a correlation (if not a causation) between them. And a correlation, even a small one, can help your model. So don’t drop them. Keep them around and test. You can even try your hand at engineering new features that you think might be helpful.
A new cost function - Did you notice the range of house prices? The cost function we used did not take this into consideration. Think about it this way - we penalized the model for predicting a $5000 house to be $0 (i.e., a difference of $5000) by the same amount as if it had predicted a $200,000 house to be $195,000 (also a difference of $5000). We know that this is wrong. Instead, you can define a new function that computes the squared difference of the $log$ of the prices. This will fix the problem of the large range of output values.
Non-linearities - Our assumption was that the output was linearly related to these features. This is rarely the case. One way to fix that is to randomly try creating new features $X’$ from $X$ where $X’ = X^n$ ($n$ is another random number) and testing them out. This is clearly infeasible to do by hand. One of the reasons why neural networks are amazing is that they automagically identify and map these non-linearities.
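To see what the new cost function idea buys you, here is a rough sketch comparing the plain squared error with a squared error on logs. The function names are made up, and using log1p (that is, log(1 + x)) instead of a plain log is my own assumption so that a prediction of 0 doesn’t blow up.

```python
import math


def squared_error(actual, predicted):
    return (actual - predicted) ** 2


def squared_log_error(actual, predicted):
    # log1p(x) = log(1 + x), so a price of 0 does not break the logarithm.
    return (math.log1p(actual) - math.log1p(predicted)) ** 2


cheap = (5_000, 0)               # off by 5000 on a cheap house
expensive = (200_000, 195_000)   # off by 5000 on an expensive house

# The plain squared error treats both mistakes the same.
print(squared_error(*cheap), squared_error(*expensive))
# The log version penalizes the mistake on the cheap house far more.
print(squared_log_error(*cheap), squared_log_error(*expensive))
```

Both mistakes are 5000 in absolute terms, but only the log version notices that missing a cheap house by its entire price is much worse than missing an expensive house by a few percent.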

Next Steps…

This post is already longer than I intended it to be. And at the same time, I feel that making this shorter would make it less adequate. So, the next article will continue our discussion of the Ames housing data. In it, we will be using neural networks and see why they can be a better approach. Meanwhile, the code for the neural network is already out there. So you are welcome to continue using the Jupyter Notebook to try out neural networks.

My Productivity Toolkit

2017-07-15T00:00:00+00:00

August is fast approaching. This means that summer is about to end and school is about to begin soon. For some of you, school might have already started. And it’s time to start studying and managing stress again. Learning tough concepts, remembering when assignments are due, juggling time between school and life, pulling in an all-nighter to finish that project, and what not. In short, it’s time to become more productive. This article will outline the tools (software) that I have used and am still using to stay on track.

The Apps

Todoist - Todo lists are notoriously effective in that you get a sense of satisfaction each time you strike off a task. And this app meets all my todo list needs. I can set recurring tasks such as weekly homework. I can create “Projects” to sort my tasks into different categories. What I do is create a project for each of my subjects, a project where I put my personal tasks (like writing this blog post), and a project for long-term goals. I used to have Wunderlist, but I switched over to Todoist because of its minimalistic interface.
Habitica - This is a habit tracker I use. This makes habits more interesting because of the RPG interface. Basically you build good habits and destroy bad ones to earn experience and gold that can be used to enhance your hero. You can party with friends and go on quests with your hero which is pretty cool and quite motivating if you like games. In fact, you can also make this your todo list. But, as a personal preference, I don’t do that because using this just as a todo list is overkill.
Pomodoro Timers - The Pomodoro technique is quite effective when it comes to channeling your focus and getting started on that assignment you’ve been putting off for far too long. There are a lot of different apps that help you do the same thing. On iOS, there is a cool one called Forest. But if you’re poor (like me), you’re better off using Tide, which also has a cool UI.

The Laptop

As a (college) student, the laptop can be your best investment. But to make it more effective with studying, you should start organizing and cleaning it.

My desktop has just the Recycle Bin and This PC icons. This way, whenever I open my laptop to do something productive, I don’t get lost in a plethora of different files. Occasionally, you might want to place a file right on your desktop to remind yourself that the first thing you do once you open your laptop is to open that file. Another thing to notice on that screenshot is that my taskbar has only the essential icons - File Explorer, Spotify (for some music), OneNote to take notes, Visual Studio (because I love C++), and Chrome. This, once again, keeps me focused on the task.

Also when your files are more organized, it is so much easier to go back and look for something you saved one year ago. Hence, next time when you’re done with your paper at 1 AM and you’re so tired that you just want to go to bed, take that extra minute to save that paper in its appropriate folder.

One last thing about laptops - always back up your files. Be it Dropbox, Google Drive or an external hard disk. I cannot stress enough how important this is. Also carry a few memory sticks in your bag with the most important files. This can save you when your laptop fails to work when you’re presenting something.

The Phone

The next important gadget in human lives today is a smartphone. It is a smart phone. So make sure you utilize its smartness. Once again, I cannot emphasize enough the importance of keeping your phone’s home screen clean. Don’t clutter it. Android and iOS both let you group apps together in a “folder”. Make use of this.

Here is a neat trick I use to make sure I don’t get distracted by my phone. I group all of my social media apps and games into one folder. Then, I place the most frequently used apps (Facebook, Instagram, etc.) in the last screen of that folder. This makes me crave less for social media than when the app is blatantly staring at my face telling me to open it. Another neat trick: Disable all notifications but the important ones (like phone and messaging). This prevents me from checking my apps every time the badge icon pops up.

The Browser

The internet has made the use of browsers mandatory. And the most popular choice is Google Chrome. Here are some Chrome extensions that I use to make my life better:

AdBlock - I can’t remember how happy I was when this free extension removed all of those annoying ads and pop-ups from those websites. The best part is that it can remove YouTube ads too! And did I mention that this was free?
MixMax - Email tracking. This extension lets you know when people have opened your email. It also lets you schedule sending emails and reminds you to go back to a conversation. Another cool thing this can do is create polls in emails and plan events with a mini calendar. Oh, remember typing out the same email to multiple people with minor changes? MixMax takes care of that by allowing you to create templates that you can reuse. This is not entirely free, but the free version is still totally worth it.
Momentum - Open up your browser to beautiful and serene scenery with an inspirational quote to motivate you throughout the day.

I know that this list is not completely exhaustive. There are many other tools and techniques that you can use to stay productive. I will perhaps outline them in another post in the future as my needs and technology improve.

Ethics in Machine Learning

2017-06-18T00:00:00+00:00

The ethics of how a Machine Learning (ML) or an Artificially Intelligent (AI) system is to function is a common thought that arises when we read about significant advancements in those fields. Will this sentience take over humanity? Or will it help us reach a Utopian era? It’s definitely not a binary question. But one of the less commonly asked questions (and perhaps rightly so) is “Was this built and founded with the right virtues?”. And this question is less about the motivation behind building an ML system than it seems.

Background

If you have no experience or no knowledge about what an ML system is, think of it as a black box. A black box which, when posed with a question, outputs an answer that has a high probability of being correct. In order to get this high probability, we need to set up the black box first.

In practice, we try to create a set of many black boxes and choose the one with the highest accuracy. To build these we need lots of data and an algorithm. Think of the data as a long list of questions with correct answers. The algorithm learns from this data. Each black box in a set has a slightly different version of the same algorithm. Finally, we pick the version that is most accurate (technically called tuning the hyperparameters).

The Problems

In ML, there are primarily three possible avenues for cheating. They are:

Data
Algorithm
Results

1. The Data

This is perhaps the biggest of the Three Problems. A good ML system needs lots of data. But where are we going to get this data? And if this data we are seeking doesn’t already exist, how are we going to mine it? Sometimes, the data sought exists already. It might be open sourced and free. It might be publicly available for a price. Or the data might be privately owned by a group of people. All is well if it’s free and open sourced. Perhaps it is good even if it’s available to buy. But is it alright if you steal someone’s private data? You might lean towards “No”. But what if that data, currently accessible only to a few, can help millions around the world with your brand new ML model? Would it then be considered right? Suddenly the question does not seem so black and white.

The answer becomes more ambiguous when we talk about tracking people anonymously, without their consent, to collect data. This data could perhaps be used to detect unusual activities. We already know our web searches are being tracked. If your ML system can help prevent the next terrorist attack by tracking the common people, does it become right to track them? Or does the action at least become justified?

2. The Algorithm

It’s a good thing that many of the important and useful algorithms are open sourced. This means that everybody has access to them, and some even allow us to modify them and make a profit. This is great! Now, once again, imagine the same scenario as with the data. If a group of people own a patented algorithm, the laws make it illegal to use the same algorithm. But what if that algorithm, in the right hands, can help millions? Can one’s own sense of right and wrong be used to reverse engineer the algorithm to benefit others? This deals with theft of intellectual property, but is nonetheless a concern of ML.

A problem with developing a new algorithm is closely tied to the datasets. If you don’t have a complete dataset (i.e., you have a dataset that doesn’t accurately cover a good number of all possible cases), it might just happen that your resulting ML system becomes biased and starts discriminating. For example, an AI that helps a bank determine whether to invest in a particular business could deny loans to everybody with a poor credit history even though their business has great potential (something a human would have noticed and made an exception for). This is an example of automating human tasks going wrong in a way that could go unnoticed.

3. The Results

The first two problems are concerned with the larger picture. This one is more isolated to ML. In ML, to report the accuracy of a model, we compare the results the model produced to the actual answers. The closer they are, the higher the accuracy. There are different ways to report this score.

The most common way people cheat here is to train their model on a dataset and report the error they get on the same dataset. This is a common mistake beginners make because they don’t understand that it is wrong. And it’s also a mistake that is sometimes made intentionally to be able to report a greater accuracy.

Why is this wrong? Imagine you are preparing for an exam and you are given a list of questions and answers to prepare for it. If you get the same questions in the exam, is your score on the exam a good measure of how much you learnt? Or is it a measure of how much you were able to memorize? The same is true for a computer. If you test the model on the same dataset you trained it on, your model will yield a high accuracy because it has now memorized the dataset and knows all the correct answers. But if I ask it a new question, there is a good chance that the answer is way off. This problem is called overfitting the dataset. Thankfully, the fix to this problem is very simple, but is out of the scope of this article.
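If you want to see the exam analogy in code, here is a deliberately silly sketch. The “model” just memorizes its training questions. The task (labelling numbers as even or odd) and all the names are made up for illustration.

```python
def train_memorizer(examples):
    """'Train' by storing every question together with its answer."""
    return dict(examples)


def accuracy(model, examples, default=0):
    """Fraction answered correctly. Unseen questions get the `default` answer."""
    correct = sum(model.get(x, default) == y for x, y in examples)
    return correct / len(examples)


# Hypothetical task: label a number 1 if it is even, else 0.
train = [(n, int(n % 2 == 0)) for n in range(0, 20)]
test = [(n, int(n % 2 == 0)) for n in range(20, 40)]

model = train_memorizer(train)
print(accuracy(model, train))  # 1.0
print(accuracy(model, test))   # 0.5
```

The score on the training questions is perfect, but on new questions the memorizer is no better than always guessing 0. Reporting only the first number would be exactly the kind of cheating described above.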

Another way to cheat is creating a synthetic dataset on which the model performs extremely well and using that to report the accuracy.

If you’re wondering if people even do this, take a look at the leaderboards of some Kaggle competitions. On the public leaderboard (scored on a visible portion of the test data), there are many people with high accuracies. But when looking at the private leaderboard (scored on a hidden portion of the test data), only a few who had high scores on the earlier leaderboard got similar results. The others had models that heavily overfit the data. Such a model, if put into practice, is only detrimental to society.

This might not seem like a big ethical issue. Alas! It still concerns and questions the integrity of an ML engineer.

A moderator?

Many of the questions I posed above are subjective. What might seem right to one person will seem wrong to another. But these problems make us think about what we are willing to do to bring about a good change. I think what we need is a set of bylaws, a code of conduct if you will, that an engineer should adhere to while designing an ML system. A violation of this code should entail the consequence of the ML system never being put to use.

And why does all of this matter? It matters because there is such a thing as right and wrong and we must ensure that we always pick the right path to improve the world.

HackCU III

2017-04-28T00:00:00+00:00

Last week, the third edition of Colorado’s largest student hackathon, HackCU III, took place in Boulder. With nearly 400 hackers from all over the US, this 24 hour hackathon was the largest one yet. And being a part of the organizing team this year was an amazing experience.

Along with meeting new people, the learning, and having fun, the best thing about a hackathon is simply being in an atmosphere filled with passionate students skipping school, sleep, and what-not to travel a long way just to do what they love - creating something cool. Even if you’re not a tech person, if you’ve ever been to a place that is so full of energy and enthusiasm you’d definitely agree that there’s no other place you’d rather be!

My role

I was mostly involved in the web team and helped build the website. The website was nearly finished early in January, so I started helping put together another event - Startups2Students. As the main event drew closer, we started creating a live page that displays updated schedules, a countdown timer, the APIs and hardware available, etc. We used Google Sheets to fetch the information to display on the website rather than editing the source. This was done to make sure that it is easy for any admin to edit the schedule on the fly (as opposed to someone cloning, fetching commits, and all that mess) and to bypass the caching process (we don’t want any hacker to have an outdated schedule simply because they didn’t clear their browser cache).

Other cool things we used

We had an SMS notification system through which we could send text messages reminding hackers about upcoming tech talks, workshops, deadlines, etc. This was a really sweet piece of software. However, we never tested it on a large set of phone numbers. So, unfortunately, during the first run, the server timed out and killed the program. This was because Twilio took a long time to validate a single request, and running it on an entire list timed out the process. And during the event, we didn’t have enough time to find a legit solution (like a separate worker/thread). So, the impromptu hack (it is a hackathon) was overriding the worker timeout.

UPDATE 05/23/2017: I was able to fix it by moving the process to a background worker and making AJAX calls to check for completion. View this Pull request
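For the curious, here is a rough sketch of the general idea of a background worker plus status polling. This is not the code from the pull request. Every name in it (jobs, send_sms, start_job, job_status) is hypothetical, and send_sms is an empty placeholder instead of a real Twilio call.

```python
import threading

jobs = {}  # job_id -> {"sent": int, "total": int, "done": bool}


def send_sms(number, message):
    """Placeholder for the real SMS API call (hypothetical, does nothing here)."""


def start_job(job_id, numbers, message):
    """Start sending in a background thread and return immediately."""
    jobs[job_id] = {"sent": 0, "total": len(numbers), "done": False}

    def worker():
        for number in numbers:
            send_sms(number, message)
            jobs[job_id]["sent"] += 1
        jobs[job_id]["done"] = True

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


def job_status(job_id):
    """What a status endpoint polled by the browser could return."""
    return dict(jobs[job_id])


thread = start_job("reminder-1", ["+15550000001", "+15550000002"], "Talk at 3 PM")
thread.join()  # only for this demo; a web request would return without waiting
print(job_status("reminder-1"))
```

The request that starts the job returns right away, so nothing times out, and the page keeps asking job_status until done is true.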

This year, we also used HelpQ, created by the HackMIT team, for mentoring hackers. Earlier, Slack was used. But with 400 hackers, Slack is very inefficient and requests for help can get buried in messages. So we adopted HelpQ. It is a very effective tool that uses tickets hackers create to tell mentors what issues they have with their code. The mentors, on the other side, can view all of these tickets and choose the one they want to help with. Despite my initial skepticism, quite a few hackers and mentors used this and I think we will definitely use this moving forward (unless we find a better alternative). You can find some stats we collected from that app here.

During the hackathon

The event was 24 hours and I was there during the entire event. I took a 90 minute (power?) nap at 1:30 AM. At other times, you’d have probably met me at check-in at MATH 100 or at the MLH Hardware Lab helping you folks check out the right hardware. Or you might have seen me moving tables around or caught me taking out the trash or refilling RedBull (they ran out fast!).

I haven’t been to a lot of hackathons. But I felt that HackCU proceeded smoothly overall except for two small hiccups. At the beginning, the lunch order was messed up by the vendor (which we corrected soon to get more food!). And towards the end, there was a lot of confusion and panic among hackers about when they had to submit their projects to Devpost. This was due to the clock on out-of-state hackers’ computers not set to MST (Mountain Standard Time). So the countdown on the live page and the Devpost both told the hackers to submit their hacks an hour earlier! Luckily we found what was going wrong soon and notified all hackers to correct their clocks.

The aftermath

After the closing ceremonies, and after all the hackers had bid farewell, came the most tedious job - cleaning up the rooms. The building we had rented was a new building and the officials wanted the rooms to be super clean after the event. So we manually picked out all the trash and soda cans that hackers left behind, wiped all the tables clean with a solvent, cleared the boards that had been used, vacuumed the carpets, rearranged the tables to how they were before, etc. It was very tiring work - especially vacuuming the carpets. The coffee spills are another story.

10 people cleaning up after 400 hackers is quite a tall task. Since we’re planning to expand to 600 hackers next year, we’re also thinking about hiring a professional cleaning service next time.

Final remarks and takeaways

The venue we had (Wolf Law Building) was not best suited for a hackathon. Firstly, there was no classroom that could house all the hackers for opening and closing ceremonies. There was a court room, but that was out of bounds and it couldn’t serve as an auditorium. This brings us to the next issue - for the ceremonies, we rented a classroom that was 15 minutes away from the hacking space. This was quite disheartening and confusing to hackers. Finally, as mentioned earlier, the officials wanted the building to be spotless after the event (can’t blame them). And the entire place was carpeted. This made cleaning [vacuuming] a tedious job. And, spills are inevitable and spills on carpets are always harder to clean.

If there were so many issues with this venue, why did you people rent it in the first place?
This was the only building on campus that could house 400 hackers and allowed overnight events. Other event spaces were either too expensive or did not allow overnight events (which meant hackers couldn’t sleep at the venue). So we had to make the best out of what we had.

With that said, here are some takeaways. As a hacker (or any sensible human being) you must really take the following seriously when you travel to hackathons (or any other event):

Always clean up after your mess. If you spilt coffee on the table, go get a paper towel and wipe it clean. It is much easier to clean a coffee spill when the coffee still hasn’t dried up.
If it is a mess you can’t clean up on your own (such as a radioactive leak), inform one of the admins or other staff.
If you emptied a soda can, throw it in the trash. Don’t leave it lying around or wait for someone else to do it.
When you travel to another state, make sure to update your computer and phone clocks to the local time (just like your watches during daylight savings). This can prevent mass false panic attacks at a later time.

The future

Now that this edition has come to a successful end, our team has taken a small break and started buckling up for the final exams. Next year, it’s going to be a lot bigger and better with more cool prizes! So be sure to keep an eye out for us next year and return to make more awesome stuff! Until then, keep hacking hard!

Math-Functions and Computer Science

2017-03-19T00:00:00+00:00

Over the past few weeks, I’ve been compiling some of the recurring procedures I had used to solve the first 50 problems of Project Euler. While solving these math problems, I needed to find the most efficient method to get the solution. The underlying idea behind most of these is the same and it is pretty simple. But as you proceed, the simple method you had used earlier will take a really long time to produce an answer. So you need to improve upon these methods to make them run faster. Sometimes, you’ll have pushed the simple idea to the extreme and it still won’t work. In that case, you need to come up with a better algorithm or implementation.

In this post, I’ll try to work my way through one of the most commonly used procedures - finding a prime number, checking whether a number is prime, or counting the primes - and how I improved it over the course of solving the first 50 problems. During the discussion, I’ll also attempt to give my most efficient implementation (so far).

Primality test

Perhaps the simplest way to check whether a number is prime is the trial division taught in elementary school. And it is still very effective. In fact, it is guaranteed to give the correct result (a consequence of the definition of prime numbers). It’s true that there are other probabilistic and heuristic tests, but they are not guaranteed to be correct for every input, even though they work for numbers as large as $10^{10}$.

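#include <cstdint>

// Trial division: n is prime if no i in [2, sqrt(n)] divides it.
bool isPrime(int64_t n) {
  if (n < 2) return false;  // 0 and 1 are not prime
  // i * i <= n includes sqrt(n) itself, so squares like 9 are caught.
  for (int64_t i = 2; i * i <= n; ++i) {
    if (n % i == 0) {
      return false;
    }
  }
  return true;
}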

This is what I’ve implemented in my library. But we can clearly do better than this brute force. So I will also bring to light a probabilistic method to solve this. This test is based on Fermat’s little theorem: if $n$ is prime, then $2^{n-1} \equiv 1 \pmod{n}$. This works for most cases. In base $2$, for numbers up to $2.5 \times 10^{10}$, only $21853$ composite numbers pass the test. So one can easily store these values in a hash table, and if the test passes, searching the table for the number will reveal whether it is a prime or not.

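// Fermat test, base 2: n is a probable prime if 2^(n-1) mod n == 1.
// Uses modular exponentiation; intermediate products fit for n < 2^32.
bool probablePrime(uint64_t n) {
  if (n < 3 || n % 2 == 0) return n == 2;
  uint64_t result = 1, base = 2;
  for (uint64_t e = n - 1; e > 0; e >>= 1) {
    if (e & 1) result = result * base % n;
    base = base * base % n;
  }
  return result == 1;
}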
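Here is a quick sketch of the “Fermat test plus lookup table” idea in Python, where the three-argument pow does the modular exponentiation for you. The names fermat_base2, KNOWN_PSEUDOPRIMES and is_prime are made up for this example, and the table only holds the first three base-2 pseudoprimes, so the combined check is only trustworthy below 1105 (the next one). A real table would hold all of them for the range you care about.

```python
def fermat_base2(n):
    """True if n passes the base-2 Fermat test: 2^(n-1) mod n == 1."""
    if n < 3:
        return n == 2
    return pow(2, n - 1, n) == 1  # three-argument pow is modular exponentiation


# Composites that pass the base-2 test. Only the first three are listed here,
# so is_prime below is only reliable for n < 1105.
KNOWN_PSEUDOPRIMES = {341, 561, 645}


def is_prime(n):
    return fermat_base2(n) and n not in KNOWN_PSEUDOPRIMES


print(fermat_base2(341))  # True, although 341 = 11 * 31
print(is_prime(341))      # False
print(is_prime(97))       # True
```

The test itself is a single modular exponentiation, and the set lookup only costs anything for the rare numbers that pass.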

Storing primes

Other common problems were finding the n^th prime and creating the sieve.

Finding the n^th prime is a very rote approach where I check whether each number is a prime or not. An improvement to this is to check only the odd numbers. An even better improvement would be to cache all the prime numbers found earlier. Then check the next number only against these prime numbers. This is the final implementation I chose. Another approach would be to use the sieve. But we need to create a sieve of size larger than the value of the n^th prime. While there are asymptotic functions that produce such upper bounds, they are not accurate for smaller sizes and this worsens the memory usage.
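The cached-primes idea can be sketched like this. It is a Python illustration rather than my library code, and the name nth_prime is made up. Stopping once p * p exceeds the candidate is an extra shortcut I’ve added here on top of what is described above.

```python
def nth_prime(n):
    """Return the n-th prime (1-indexed) by dividing only by cached primes."""
    primes = [2]
    candidate = 3
    while len(primes) < n:
        is_prime = True
        for p in primes:
            if p * p > candidate:   # no prime divisor up to sqrt(candidate)
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 2              # skip the even numbers
    return primes[n - 1]


print(nth_prime(6))      # 13
print(nth_prime(10001))  # 104743
```

Unlike the sieve, this needs no upper bound in advance. The list simply grows until it holds n primes.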

Now, moving onto the sieve, the problems I came across were relatively of smaller range and a simple Sieve of Eratosthenes served well. However I had to refine it and improve the implementation to hit a decent runtime.

Here is my final implementation of it:

bool* prime = new bool[size + 1];
memset(prime, true, size + 1);
prime[0] = false;
prime[1] = false;

for (int64_t p = 2; p*p <= size; p++) {
  if (prime[p] == true) {
    for (int64_t i = p * 2; i <= size; i += p) {
      prime[i] = false;
    }
  }
}
return prime;

There are a couple of things that I’d like to draw attention to here. The first thing to notice is that I abandoned the use of vector. This is simply because vector is a secondary data structure and it increased the runtime of the program. With a bool array, the program ran in under a minute. The second modification I made was abandoning the for loop. Earlier, I had used a for loop to initialize the values of the array to true. I did away with this using memset. With memset, the compiler can assign values in any order (fastest order). However, in a for loop, you are forcing the compiler to go in one direction.

Here is a chart comparing the runtimes. The code can be found here

Thus, I finalized on this procedure, and it works really well so far. The only caveat is that you do need to remember to deallocate any memory to prevent memory leaks.

Final remarks

There are two takeaways from solving these problems:

Grab a piece of paper and work out the problem by drawing graphs or writing equations. In most cases, you will realize something that you didn’t catch and find that the problem isn’t really that hard. Then, instead of starting by coding a brute force solution, you already have an algorithm to work with. This can save so much time when optimizing brute force.
Sometimes, writing it out won’t work. You won’t see patterns. Recursion won’t help. In fact, it will only get worse. Then you should quickly implement a dirty brute force. Work your way from there and see how you can optimize it. Skip through loops. Cache items. And if there is an alternative to recursion, almost always choose the alternative, because recursion on large inputs can cause the stack to overflow (which you don’t want!).

What's new?

2017-02-19T00:00:00+00:00

It’s already two months into the new year. I guess that no longer makes it a new year. Nonetheless, there are some new things going on that I thought I should update. You might have read about the goals I had set for myself.

The first challenge is going really well. I have yet to miss a day of committing code. Hopefully, I’ll be able to continue to do the same throughout the year. This was mainly possible because this tied in with my second challenge.

The second challenge was solving 100 problems on Project Euler. I have now solved the first 50 problems using C++ with each program running less than 500 ms. You can look at the code on my GitHub. Review it, star it, fork it, and share your views on it.

I am really a long way from completing the third goal. For starters, I decided to work through the classic machine learning course taught by Andrew Ng. It is available on Coursera. If you haven’t heard of it, you should totally check it out. In my opinion, the course is taught really well, though it might have a learning curve if you are unfamiliar with calculus and linear algebra. I have finished 8 weeks, as of now, out of the total 11. I’ve been uploading my solutions to the assignments on GitHub.
DO NOT CHEAT!

But what is new?

True, all of this has just been updates on what I’ve been doing the past month. There are two new things that I plan to work on in the coming days (or weeks, depending on my school work).

#1 - A New Library

While solving through the first 50 problems on Project Euler, I realized that I was reusing most of my code. And the code I had written earlier was just not fast enough to complete the problem in under a second. So I had to look at alternatives and optimize the code. And there were many times, when I found a really fast algorithm for some problem I had encountered. So, I thought to myself, “What if I could just put together a simple library that encompasses all of these functions?”. Thus, I decided to work on putting together this library in C++ with all of these functions in their most efficient form. The list of all these functions isn’t very big. So hopefully, this will not take more than a few days to finish writing.

And I’ll write a detailed post explaining the routines I used and compare their runtimes with other possible routines once I complete it.

Fun Fact:

I’ve found using a boolean array to be much faster than using a vector of the same size. In fact, with a vector, the code took more than a minute to run (I didn’t time it and terminated the program. Perhaps I should provide stats next time…). But with a boolean array, it ran in under 300ms. I’ve also concluded that the memset function, introduced in C, is faster than using a loop to initialize values in an array.

#2 - Course Planner

If you follow me on GitHub (which you should), you’d have seen that I created an app using JavaScript. This app will help you plan courses helping you choose them in the right order. The code performs well on relatively simple input. But on more complex input, it fails to give sound advice, even though it’s logically correct. That’s the reason I’ve not yet made a post on how awesome it is (or I am). The fix I’ve been thinking of involves adding a co-requisite course along with the pre-requisite. In the coming weeks, I hope to work on it and come up with a better algorithm to sort courses.
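To give a flavor of the ordering problem, here is a small sketch that sorts courses so that every course comes after its prerequisites (this is a standard topological sort, Kahn’s algorithm). It is not the algorithm my app uses. The function name course_order and the catalog are hypothetical, and it assumes every prerequisite also appears as a key in the catalog.

```python
from collections import deque


def course_order(prerequisites):
    """Order courses so each appears after its prerequisites (Kahn's algorithm)."""
    remaining = {course: len(needs) for course, needs in prerequisites.items()}
    unlocks = {course: [] for course in prerequisites}
    for course, needs in prerequisites.items():
        for need in needs:
            unlocks[need].append(course)

    # Start with the courses that have no prerequisites at all.
    ready = deque(sorted(c for c, count in remaining.items() if count == 0))
    order = []
    while ready:
        course = ready.popleft()
        order.append(course)
        for nxt in unlocks[course]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:  # all prerequisites are now scheduled
                ready.append(nxt)

    if len(order) != len(prerequisites):
        raise ValueError("prerequisites contain a cycle")
    return order


# Hypothetical catalog: course -> list of prerequisite courses.
catalog = {
    "Intro": [],
    "Data Structures": ["Intro"],
    "Discrete Math": [],
    "Algorithms": ["Data Structures", "Discrete Math"],
}
print(course_order(catalog))
```

An order like this is logically correct, but it says nothing about workload per semester or co-requisites, which is where the advice starts to fall apart on complex input.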

Those are the two new things I’ve proposed to work on. And as usual, since this goes public, I need to keep my word and work on them.

Because you’ve been really nice and read the entire post, here is a bonus. I joined the HackCU team last September. What we do is mainly organize hackathons. There are two, Local Hack Day and HackCU III. Local Hack Day is already over. HackCU III is coming up and you should totally register for it. Apart from those, we organize Startups2Students. This event is aimed at bridging the gap between startups in Colorado and the students. Once again, you should register for it because it’s free, we provide pizza, and it’s a great opportunity to meet new people! Feel free to hit me up if you have any questions!

2017 Goals

2017-01-11T00:00:00+00:00

It is an unspoken custom for everyone to start something new at the beginning of a new year. There are people who want to begin a new habit that would improve their life. There are people who decide to give up a bad habit. And there are people who set out on some challenges, trying to get them done before the year ends.

What all of them are essentially doing is setting goals for themselves and embarking on an adventure pushing themselves out of their comfort zone. By the end, they would have made a difference, at least to themselves and the people around them (if not bringing fiction to life), and that is all that matters.

I too have decided to do something along those lines. I have set three goals for the year 2017. Hopefully they will be interesting and give me a fresh experience.

1 commit a day challenge: Commit at least once a day to an open source repository. This could be anything from personal projects to school work to something else that might pop up. But that one commit should be an insightful one. While I do not mean coming up with a new algorithm every day (that could maybe be a challenge for 2018), this commit should not be petty like editing a Readme file. You can track my progress at GitHub.
Solve 100 problems on Project Euler: Project Euler is famous for its amalgamation of mathematics and computer science. Although I had already solved 12 problems before the start of this challenge, I will stick to solving the first 100 problems. You can track my progress in the GitHub repo.
Implementing A Neural Algorithm of Artistic Style: This paper proposes a deep learning network for creating artistic images that combine various styles. Various implementations of this algorithm keep popping up in my feed, so I decided to implement my own version. Since I have little background in machine learning, I will need to work a lot to accomplish this task. My plan is to implement it in Python using the TensorFlow library.

Those are the three challenges. Now that I’ve put this online, and you have seen this, I need to keep my word and give it my best shot. And in December you will hear back from me regarding my progress on these tasks.

A happy new year to you!

To all Tolkien fans out there who ask me:

“What do you mean? Do you wish me a happy year, or mean that it is a happy year whether I want it or not; or that you feel happy this year; or that it is a year to be happy on?”

I say unto you, “All of them at once!”

How I built an app from scratch

2016-12-26T00:00:00+00:00

Popularity on Twitter was never intended to be what it is right now (an app hosted on Heroku). It started out as a weekend project to help me learn Python and APIs. I previously had little knowledge of Python and knew nothing about APIs. Over Thanksgiving break, I decided to learn them using the Twitter API.

The results of the simple get_status function seemed magical. And I decided to take it a bit further. By following a tutorial, I implemented a functionality to analyze tweets and find the most common words amongst them (ignoring stopwords like ‘the’, ‘I’, ‘there’, etc.) and plot a time frequency chart to see the tweet trends with time. But that wasn’t enough.
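A minimal sketch of the word-counting step, using only the Python standard library. The function name `most_common_words`, the tiny stopword list, and the sample tweets are assumptions for illustration, not the tutorial's code.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real run would use a much fuller one.
STOPWORDS = {"the", "i", "there", "a", "is", "to", "and", "of"}

def most_common_words(tweets, n=3):
    """Return the n most frequent non-stopword tokens across all tweets."""
    counts = Counter()
    for text in tweets:
        # Lowercase, then keep runs of letters, digits, '#', '@' and apostrophes.
        tokens = re.findall(r"[a-z0-9#@']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)

sample = [
    "I love Python and the Twitter API",
    "There is a Python tutorial on the API",
    "Python is fun",
]
print(most_common_words(sample, 2))  # [('python', 3), ('api', 2)]
```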

Adding my small feature

Nothing’s ever enough. I decided to add a small feature of my own that would track live tweets containing the requested search query and calculate a score to determine how popular the query is at that instant. This was where I started facing a lot of problems and thus learnt a lot.

The biggest issue was that the streaming API would not stop until I terminated the script manually. So I modified the API wrapper’s implementation of the stream listener, adding a timer that stopped the stream once the time limit was exceeded. Then I realized that this method failed when streaming low-volume queries. After scouring Stack Overflow to no avail, I came up with a novel idea. I used the original implementation, but ran it on a separate thread. I started a timer in the main thread and, when it expired, disconnected the stream from the main thread. Check out the gist.
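The idea can be sketched with the standard library alone. This is not the gist: `FakeStream` is a hypothetical stand-in for the streaming client, and `stream_for` is an illustrative name. The point is that the time limit is enforced from the main thread, so it works even when few or no tweets arrive.

```python
import threading
import time

class FakeStream:
    """Hypothetical stand-in for a streaming client; not the real Twitter wrapper."""

    def __init__(self, interval=0.01):
        self.interval = interval          # seconds between simulated tweets
        self.received = []
        self._stop = threading.Event()

    def run(self):
        # Blocks until disconnect() is called, like a stream that never ends on its own.
        n = 0
        while not self._stop.is_set():
            self.received.append(f"tweet {n}")
            n += 1
            self._stop.wait(self.interval)

    def disconnect(self):
        self._stop.set()

def stream_for(stream, seconds):
    """Run stream.run() on a worker thread and disconnect it after `seconds`."""
    worker = threading.Thread(target=stream.run, daemon=True)
    worker.start()
    time.sleep(seconds)       # the timer lives on the main thread
    stream.disconnect()       # stop the stream from the main thread
    worker.join(timeout=1.0)
    return stream.received

tweets = stream_for(FakeStream(), 0.1)
print(len(tweets) > 0)
```

A timer checked inside the listener only runs when a tweet arrives, which would explain the failure on low-volume queries; a timer on the main thread does not depend on traffic.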

Then there is the calculation itself. As you might know, my formula isn’t perfect. But it does a good job of giving qualitative results when comparing two or more scores. The formula isn’t perfect because it does not give you an absolute score. Unlike looking at your math grade and feeling satisfied that you got a 95, you cannot look at the score of a single query and determine whether it is actually popular. This is not possible (correct me if I’m wrong) because the Streaming API does not allow you to get all tweets (and you would not want to, unless you are trying to run out of memory). You can only track tweets by applying a filter, and there is no empty filter to download all of them.

The algorithm

First, I had to find all the factors that determine popularity. The total number of tweets gathered in the time interval is the most obvious. The number of followers the tweeter has should also play a role, because with more followers, the tweet ends up on more users’ feeds. Then there is the retweet count. This makes sense because if a tweet is being retweeted more, then it is clearly reaching more people and getting more attention. The number of likes is similar to the retweet count.

Hence, I counted the total number of tweets ( $T$ ). Then I summed the retweet counts of all tweets ( $T_R$ ) and divided by $T$ to get the retweet index ( $R$ ). Then I averaged the number of followers each user had ( $f_i$ ) across the entire set. For the likes, I divided the likes each tweet had ( $l_i$ ) by the number of followers the user had, because liked tweets show up less often on someone else’s feed. I averaged this likes index ( $L_i$ ) across the entire set. Finally, I added these three terms to $T$ to get $p$, and divided $p$ by the amount of time ( $t$ ) over which the tweets were collected to get the score $P$.

\[ L_i = \frac{l_i}{f_i} \] \[ R = \frac{T_R}{T} \] \[p = \frac{\sum_{i}L_i}{T} + \frac{\sum_{i}f_i}{T} + R + T\] \[ P = \frac{p}{t} \]
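As a worked example, the formulas translate into a few lines of Python. This is a sketch under assumptions: the dictionary keys `likes`, `followers` and `retweets` are made-up names, and skipping users with zero followers and returning 0 for an empty set are my own choices to avoid dividing by zero.

```python
def popularity_score(tweets, seconds):
    """Compute P from the formulas above.

    `tweets` is a list of dicts with hypothetical keys 'likes', 'followers'
    and 'retweets'; `seconds` is the collection time t.
    """
    T = len(tweets)
    if T == 0 or seconds <= 0:
        return 0.0  # assumption: no data or no time window means no score

    # L_i = l_i / f_i, skipping users with no followers (assumption).
    likes_index = sum(
        tw["likes"] / tw["followers"] for tw in tweets if tw["followers"] > 0
    )
    followers = sum(tw["followers"] for tw in tweets)
    R = sum(tw["retweets"] for tw in tweets) / T      # R = T_R / T

    p = likes_index / T + followers / T + R + T
    return p / seconds                                # P = p / t

sample = [
    {"likes": 10, "followers": 100, "retweets": 4},
    {"likes": 5, "followers": 50, "retweets": 2},
]
print(round(popularity_score(sample, 10), 2))  # 8.01
```

With these two tweets, the average likes index is 0.1, the average follower count is 75, $R = 3$ and $T = 2$, so $p = 80.1$ and $P = 8.01$. The follower term dominates everything else, which is one reason weighting the factors would help.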

Clearly there are some flaws here. For instance, I should probably factor in the number of followers for the retweets, as I do for the likes. I could also assign a weight to each of these factors, which would help a lot by scaling the score down to a fixed range. There is obviously scope for improvement here, and I tweak the formula whenever I get new ideas.

Thus I created something that works alright locally, on the console.

The next level

During the winter break, I decided to take it to the next level by running it on a website. I knew how JavaScript works in browsers but not much about Python on the web, even though I had frequently come across web apps built with Python.

I used Flask to set up a local server and added some (not so) fancy front-end stuff to deliver data the client provides to my Python program. Then I wanted to host it on some web service to show it to the world.

Hosting on Heroku

This was the next biggest hurdle. I had to learn how this service worked and modify a lot of my existing code to comply with it. This was harder than I expected because I’m using Windows as my development environment, and let’s just say that Windows has its own way of dealing with things that isn’t quite developer-friendly. And migrating the entire project to my VM wasn’t an option at that point.

I learnt to live with Windows and finally managed to deploy it on Heroku. Then came the next shock. If the client wanted to stream for more than 30 seconds, the request would time out and lead to an error page. So I had to move the streaming and calculation to a background worker on a separate thread and send the client to a loading page, which would periodically check whether the calculation had completed.
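The pattern looks roughly like the sketch below, written without Flask so that it runs on its own. The names `start_job`, `job_status` and `slow_score` are hypothetical. In a web app, the first would be called by the request that starts the stream and the second by the endpoint the loading page polls.

```python
import threading
import time
import uuid

jobs = {}                       # job id -> {"done": bool, "result": ...}
jobs_lock = threading.Lock()

def start_job(work, *args):
    """Start work(*args) on a background thread and return a job id immediately."""
    job_id = uuid.uuid4().hex
    with jobs_lock:
        jobs[job_id] = {"done": False, "result": None}

    def runner():
        result = work(*args)
        with jobs_lock:
            jobs[job_id] = {"done": True, "result": result}

    threading.Thread(target=runner, daemon=True).start()
    return job_id

def job_status(job_id):
    """Return a copy of the job's state; this is what the loading page would poll."""
    with jobs_lock:
        return dict(jobs[job_id])

def slow_score(seconds):
    time.sleep(seconds)         # stands in for streaming plus calculation
    return 42

job = start_job(slow_score, 0.1)
while not job_status(job)["done"]:
    time.sleep(0.02)            # the page would poll on a timer instead
print(job_status(job)["result"])  # 42
```

Because the request that starts the job returns right away, no single request comes close to the 30-second limit. Note that an in-memory dictionary like `jobs` only works while everything runs in one process.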

Finally, I had to make sure I notified the client if the app began to hit the rate limits set by the streaming API. This was necessary to prevent erroneous results from being delivered to the client and most importantly prevent Twitter from banning my API credentials for making frequent requests.

What did I learn?

A great deal more about Python: file I/O, and turning a console app into a web app using Flask. I can also confidently say that I will be able to deploy another app on the Heroku infrastructure, which is pretty straightforward and intuitive now that I know how it works. Finally, I learnt a lot about multi-threading and feel safe about using threads, which is something I’d been dodging for a while because it sounded very dangerous.

Overall, it’s been an amazing learning experience.

7 superheroes I'd like to be

2016-12-17T00:00:00+00:00

Why 7 heroes and not a solid number like 5 or 10? Well, 7 is a magical number possessing the power of wibbly-wobbly, timey-wimey… It’s 7 because I had 7 perfect heroes in mind. I neither wanted to cut it short by removing some nor extend it by adding unnecessary people. That said, here you go:

Doctor Strange
A stupefying costume, a red cloak that lets you fly, access to the mystic forces, proficiency in martial arts, and, not to mention, the skills of a neurosurgeon. These alone make it impossible to not want to be the ‘mightiest magician in the cosmos’. And back in the good ol’ days, he was unstoppable. The writers had to nerf him to make the comics more interesting. I guess being the Sorcerer Supreme means being the strongest entity in the universe.
Magneto
If you have read the comics, you would know that this mutant has power over the entire electromagnetic spectrum (not just metals and magnetic elements as portrayed in the movies). This implies he can control everything from light to anything that has a magnetic field associated with it. All atoms have a small electromagnetic field due to electrons and thus Magneto can wield nearly everything. He even lifted Mjolnir by manipulating the magnetic field around it! Wait, what?
Wolverine
Who doesn’t like this absolute badass? Accelerated healing, regeneration, claws, and an entire skeleton laced with Adamantium. This guy is immortal (until they decided to kill him off in the Death of Wolverine arc). He’s the best there is at what he does. Quick-tempered, he is someone you do not want to get on the bad side of. Actually, it is not possible for you to get on his good side either. So you better just stay out of his way.
Avatar
The Avatar, master of the elements of water, earth, fire, and air, is probably not your conventional comic book hero (even though there are comics). He/she is still a hero who tries to bring peace and balance to the world. With the power of the elements, comes the power to do crazy things (like controlling lava). And then, there is the Avatar state that gives you the combined strength of all your past lives making you ever more powerful. Enough said!
Wonder Woman
While most of the other heroes on this list are all about fighting and destruction, Diana is a symbol of truth, justice and peace. Oh, she can put up a good fight if that’s what you want, but she is more of a defender of peace and equality. With the Lasso of Truth, the Indestructible Bracelets and occasionally the sword, this demigod is the perfect balance of diplomacy and deadliness.
Professor Xavier
Another mutant, but a mutant like no other, Professor X is perhaps the greatest telepath in the Marvel universe. Magneto had to alter the earth’s magnetic field to reduce Charles’ telepathic range. He even has the power to learn new things by tapping into the learning center of someone else’s brain. Eidetic memory, manipulating someone, projecting himself into someone’s mind are a few of the perks that come with telepathy. He is so powerful he can project himself into the astral plane! Even if the others don’t, at least the very basic mind reading should count for something.
Batman
A man with nothing but an (extremely) strong will can do anything. Batman is an example of that. Strip him of all his gadgets, money and martial skills, and he will still come out alive. He actually did survive when Darkseid threw him back in time to the Stone Age. He has defeated Superman and survived, and he had a successful plan to take out the entire Justice League. Even Captain America recognized him as a formidable opponent! I think this man, with no super powers, has done some extraordinary things that serve as an inspiration to everyone that ‘Anybody can be a hero’.