Capstone Beginnings

One month into my summer vacation, I have been spending time doing research on ecological modelling, wrapping up work at Code Gakko, and doing a side project on scraping and representing MOE data. With two more months left, I have decided to start a series on my capstone topic, so that I can document the research I'm doing and make it more robust. From time to time, I will also put up my thoughts on Code Gakko and my side project as and when I make progress.

So, *drumrolls*

THE CAPSTONE

The most important question out there, that Maurice asked me:

What do you want to learn?

After all, the capstone is a 10MC subject.

Choose one approach, master it.

After much thought, I want to learn 2 things:

  1. How agent-based models (ABMs) work, what their limitations are, and whether there is a scientific way of going about working with them
  2. How to create a webapp for this model, for educational and research purposes


So far, this is what I have: a little plan that outlines the knowledge I need for my capstone. One piece of advice I received from my advisor, Professor Gastner, is that many students don't do their literature review. What is a literature review anyway? It is the process of looking at the related work that has been done out there and understanding how your own work is situated in the greater body of academic research.

In my context, I am building an ecological model that simulates tropical forests over chronosequences of more than 100 years. I am investigating how the biodiversity of a landscape is affected by different parameters, such as dispersal limitation and the functional traits of species.

Thus, the relevant questions and content I need to look out for are as follows:

  1. Why is biodiversity important?
  2. What are current measures of biodiversity?
  3. What are the theoretical factors that affect biodiversity?
  4. Why is there a need for a model in the first place?
  5. What assumptions am I making in my model?
  6. How do I choose the parameters for my model?
  7. Is there a mathematical basis for the model?

I have sought out a few resources to kickstart this research process and they are as follows:

Concepts of Biodiversity / Community Ecology:

  1. Colwell, R. K. (2009). Biodiversity: concepts, patterns, and measurement. The Princeton Guide to Ecology, 257–263.
  2. Cernansky, R. (2017). Biodiversity moves beyond counting species. Nature News, 546(7656), 22. https://doi.org/10.1038/546022a
  3. Verhoef, H. A., & Morin, P. J. (Eds.). (2010). Community ecology: processes, models, and applications. Oxford: Oxford University Press.

Concepts of Ecological Modeling / Statistics:

  1. Kéry, M., & Schaub, M. (2012). Brief Introduction to Bayesian Statistical Modeling. In Bayesian Population Analysis using WinBUGS (pp. 23–45). Elsevier. Retrieved from http://linkinghub.elsevier.com/retrieve/pii/B978012387020900002X
  2. Semeniuk, C. A., Musiani, M., & Marceau, D. J. (2011). Integrating spatial behavioral ecology in agent-based models for species conservation. In A. Sofo (Ed.), Biodiversity (Chapter 1).
  3. Grant, W. E., & Swannack, T. M. (2008). Ecological modeling: a common-sense approach to theory and practice. Malden, MA ; Oxford: Blackwell Pub.
  4. Gimblett, H. R. (Ed.). (2002). Integrating geographic information systems and agent-based modeling techniques for simulating social and ecological processes. Oxford ; New York: Oxford University Press.
  5. Botkin, D. B. (1993). Forest dynamics: an ecological model. Oxford ; New York: Oxford University Press.
  6. Jørgensen, S. E., & Bendoricchio, G. (2001). Fundamentals of ecological modelling (3rd ed). Amsterdam ; New York: Elsevier.

More thoughts to come!

 

 


An Introduction to Model Checking and Verification

This year, I'm doing research with my professor on model checking and verification. More specifically, I'm analyzing an open problem around Thomas Schelling's model: to come up with a more formal analysis and precise formulation of certain criteria that the Schelling model tries to fulfill. In simple terms, the Schelling model is about racial segregation, and it asks whether people of different types will end up evenly mixed or segregated when each person has even a slight preference for neighbours of their own type.
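To make the model's core rule concrete, here is a tiny Python sketch (names and grid are my own invention, not from the actual research code): an agent is "unhappy" when the fraction of like-typed neighbours falls below its tolerance threshold.

```python
# Minimal sketch of the Schelling rule: for each agent, look at its
# Moore neighbourhood and flag it as unhappy if too few neighbours
# share its type. In the full model, unhappy agents relocate until
# the grid stabilizes.
def unhappy_agents(grid, threshold=0.5):
    rows, cols = len(grid), len(grid[0])
    unhappy = []
    for y in range(rows):
        for x in range(cols):
            same = total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (dy, dx) != (0, 0) and 0 <= ny < rows and 0 <= nx < cols:
                        total += 1
                        same += grid[ny][nx] == grid[y][x]
            if total and same / total < threshold:
                unhappy.append((y, x))
    return unhappy

grid = [list("AAB"),
        list("ABB"),
        list("ABB")]
print(unhappy_agents(grid))   # [(0, 1), (2, 0)]
```

Even with a mild threshold like 0.5, repeatedly relocating the unhappy agents tends to produce heavily clustered, segregated patterns, which is exactly the surprising behaviour the formal analysis tries to pin down.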

But first, we need to think more broadly. From a broader perspective, having a precise formulation really matters: many systems are probabilistic, and being able to describe them precisely helps the stakeholders involved make more informed decisions. For example, software glitches in the Toyota Prius caused 185,000 cars to be recalled; good model checking could have helped make sure the system didn't mess up.

The best thing I've read was the comparison between model checking and testing. Testing can show the presence of errors, but never their absence. More precisely, when we test, there are errors we may not be able to catch, because testing requires prior knowledge of what could go wrong. Model checking, in contrast, first considers all possible states the system could end up in, and then formally ensures that the system fulfills the specification of its task to the degree we want it to. This is really hard to do by hand, so we rely on mathematics (and tools) to rigorously verify the models we use.

So what does model checking entail? First, coming up with a way to model the problem. There are many techniques for this, but the one I'm trying out is transition systems: describing how a system progresses from one point to another, considering all the possible states it could move into. Next is to come up with the properties I want the model to fulfill. There is a whole terminology of properties; one basic example is reachability: what is the probability of an algorithm terminating successfully, or of an error occurring during execution? Another is long-run behavior: does the system oscillate between two states in the long run, or do the changes made to the system settle down to a limit?

And there are a few ways to go about computing these. Reachability properties can be computed as a sum of probabilities if the system is finite; if it is infinite, you can't enumerate everything (the computer would just go on forever). Instead, you derive a linear equation system that describes the model and solve for all the states simultaneously. There is also the method of expressing reachability probabilities as a least fixed point, approximated using the power method. For long-run behavior, computing the solutions to the corresponding linear system tells you whether a limit exists, whether it depends on the initial distribution or state, and other properties of the system in the long run. I can see its usefulness in understanding a model more precisely.
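As a toy illustration of the linear-equation approach (my own example, not one from the research): take a four-state chain where states 0 and 3 are absorbing and the two middle states move left or right with probability 1/2. The probability of reaching the "goal" state 3 from each transient state falls out of a single linear solve.

```python
import numpy as np

# Transition matrix of a gambler's-ruin-style chain: states 0..3,
# with 0 and 3 absorbing; from 1 and 2 we move left/right with prob 1/2.
P = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
])

transient = [1, 2]   # states whose reachability value is unknown
goal = 3             # value 1 at the goal, 0 at the other absorbing state

# Each unknown satisfies x_s = sum_t P(s,t) * x_t, i.e.
# (I - P_TT) x = P_Tg, where T = transient states, g = goal.
A = np.eye(len(transient)) - P[np.ix_(transient, transient)]
b = P[transient, goal]
x = np.linalg.solve(A, b)

print(x)   # approximately [1/3, 2/3]
```

The same fixed-point equation is what the power method iterates on when the direct solve is too large; here the system is tiny, so one `np.linalg.solve` suffices.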

To do all of this, there is a software that I’m learning to work with, called PRISM. It’s a model checker software that you have to code in order to describe the states of the system as well as the behavior of its evolution. But before I even get into the workings of PRISM, my job this coming week is to dive into the habit of modeling phenomena and being able to formulate them precisely. Also, to come up with a few properties I want to check regarding the models. As I continue doing my research, I’ll keep track of my progress here, so stay tuned!

Internship at Metalworks – Week 11 [17 August to 20 August 2015]

Instead of talking about technical things for my last post on Metalworks, I’m going to take this time to pen down the things I’m really grateful for.

The people

I came in knowing almost nothing about software development. Previously, I had knowledge of only algorithms, data structures, and basic computer science, but nothing about how to create good, functioning software that other people can use. Three months later, with the patient guidance of my fellow staff members, I learnt more than I could have asked for. Whenever I had a question, nobody grumbled or had anything nasty to say, and they took the time to make sure I understood everything.

That's what I really appreciated at Metalworks. No office politics. No backstabbing, no gossip. Just really fun conversations about anything from literature to cool technology to the latest happenings around town. We have a channel in Slack called #random where staff post really interesting things. Lunches are really nice too: we all buy lunch to take away and sit at a wooden table near the pantry. It's a relaxed affair; very rarely do we keep talking about work, and it is a great time to recharge.

A huge shoutout to the full-time software developers Rollen and Jayden, who were such great mentors. They know a lot about software and hardware, and are really smart and perceptive, but have no airs about them. They always explained their code really eloquently and let me ask as many questions as I wanted about it. I owe them a lot, and they really made my experience at Metalworks.

Then there are the fellow interns, who do a fantastic job and inspire me to work smarter and harder, and the PR manager Daylon, whose unwavering focus and zen-like attitude inspires me to get my act together.

And finally, the heads Tom, Mark and Nico who are really knowledgeable and hands-on about the tech they manage. After asking around considerably, I understand the amount of work they have to do. I see them stay back on weekends to work on the huge amounts of projects that come in, and they keep our job fulfilling by giving us really interesting tasks to handle.

The work

Because of the number of clients we handle, we are almost always guaranteed an interesting spread of technology to use. From UV cameras to VR smell components, I've certainly had to put a lot of quick thinking to the test. Rapid prototyping is really hard, because it requires a lot of research. Research usually means figuring out whether a certain technology can be used: experimenting with the APIs (if any), actually testing them out, and then writing a little report to explain to people how things will work. It's not easy, as you can run into a roadblock quite quickly.

So, that’s it from me! Next week, I will start work on my computer science research proper and I will be coming up with a proper framework soon 🙂 Stay tuned!

Internship at Metalworks – Week 10 [17 August to 20 August 2015]

My penultimate week at Metalworks has ended, and I have many thoughts about my internship which I will share next week. Meanwhile, I'm leaving for Japan in two weeks and have just started research with my computer science professor on model analysis and verification. I'll speak more about that in another post, but this week at Metalworks was about touching up my image-processing work and doing some debugging.

Monday – Thursday: Debugging / touching up code

Image Processing

This was the structure of my code:

for x in range(len(p)):
    for y in range(len(p[0])):
        <insert task on pixel p[x][y]>

Having two nested for loops is disastrous for run times on a Raspberry Pi. Even on a Pi 2, it took a good 15 seconds to process, not good enough for the project I was working on. So we threw that out of the window and eventually settled for a much, much simpler option: using the image tool already available on the Pi. GraphicsMagick did the trick in less than 2 seconds on the Pi 2, which really makes me wonder how it does it so quickly. The command was really just this: gm convert -modulate <brightness, saturation> <input filename> <output filename>. Jay and I thought it must be multithreading, using multiple cores of the CPU to do the job. Or perhaps it uses C++ and pointers to loop through a bitmap instead of an array; I have no idea as of this time. I tried looping through the image bitwise, but that proved far too slow compared to GraphicsMagick. Incredibly, it reduced the code to just one line: the power of libraries.
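For reference, here is how that one-liner might be wrapped from Python (a sketch; the helper names are mine, and it assumes the gm binary from GraphicsMagick is on the PATH):

```python
import subprocess

def modulate_cmd(src, dst, brightness=100, saturation=100):
    """Build the GraphicsMagick command line.

    Brightness and saturation are percentages: 100 leaves the
    channel unchanged, 120 boosts it by 20%, and so on.
    """
    return ["gm", "convert", "-modulate",
            f"{brightness},{saturation}", src, dst]

def gm_modulate(src, dst, brightness=100, saturation=100):
    # Shell out to gm; raises CalledProcessError if conversion fails.
    subprocess.run(modulate_cmd(src, dst, brightness, saturation),
                   check=True)

print(modulate_cmd("in.jpg", "out.jpg", brightness=120))
# ['gm', 'convert', '-modulate', '120,100', 'in.jpg', 'out.jpg']
```

Keeping the command builder separate from the subprocess call makes the wrapper easy to test without actually touching any image files.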

But that also left me with less flexibility to do what I really wanted with the image, which was to enhance contrast given certain criteria. While we're at the prototyping phase, though, sometimes we can settle for a little less.

Multithreading

One of the features we tried to implement was this: while one function is being run by the CPU, could we run another process simultaneously and interrupt the foreground process when necessary?

That brought us to multithreading, which was a whole new world for me altogether. In my algorithms and data structures class, we briefly touched on parallel programming, but I had no idea how to approach the topic. Well, I'll leave that for another time, but if you want to know a little more about it, a quick Google search brought me here (Quora), here (Stack Overflow) and here (of course, Wikipedia). I still don't understand it fully, but I will eventually get there once I've learnt enough in class.

Eventually, we used another method to create the same effect we wanted, which was just killing processes (less elegant, but it does the job; sounds like a prototype, right?)
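For what it's worth, a cooperative version of "break the background work when necessary" can be sketched with Python's threading module and an Event flag (a sketch of the general technique, not the code we actually shipped):

```python
import threading
import time

def background_task(stop_event):
    # Loop until the foreground signals us to stop.
    while not stop_event.is_set():
        time.sleep(0.01)   # stand-in for a unit of real work

stop = threading.Event()
worker = threading.Thread(target=background_task, args=(stop,))
worker.start()

time.sleep(0.05)   # the foreground does its own thing here
stop.set()         # politely ask the background thread to finish
worker.join()
print("worker stopped:", not worker.is_alive())
```

Compared to killing a process outright, the Event approach lets the background task finish its current step cleanly before exiting.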

Raspberry Pi / Command Line

I admit it: I do not have enough experience with command-line work, and I often look very amateurish when it comes to doing things efficiently. It is really the little things, like “cd ~/<insert filename in root folder>” or “sftp pi@<insert IP address>”, that make the difference between a seasoned user and someone who just flops around on Google searching for command-line shortcuts.

And I guess, that concludes the reflections for this week! There really wasn’t much as the rest was touching up on the projects and doing rounds of testing and debugging and pushing code. Watch out for my post early next week on the beginnings of my research.

That’s it from me!

Internship at Metalworks – Week 9 [11 August to 14 August 2015]

I'm back! I spent my two-week hiatus in New York and Boston to attend my brother's wedding ceremony and, I guess, to take a little break from work. The moment I got back, it was straight to work. I handled quite a difficult task (for me): image processing, which I had never done before. The only little thing I knew was that images are made of pixels and that pixels can be manipulated. So this post will be all about image processing, with a little introduction to computer vision.

Monday: Public Holiday!
Tuesday: Briefing
Wednesday: Project
Thursday – Sunday: Image Processing

What is computer vision?

There is the Wikipedia definition, but here's how I see it: computer vision is a field concerned with how computers can view things the way humans do, and how they can manipulate what they see. The most basic example is facial recognition. A computer looks through a set of images and runs through the pixels. There are certain criteria for what counts as a face, often based on color. By examining the RGB (red, green, blue) values of an image, the computer can determine, whether via a probability model or straight-up color matching, if there is a face there.

A little more in depth

Usually, an image is represented as a 2D array: one array for the y-axis, with another array for the x-axis nested inside it. Each value in the inner array contains the RGB values for one pixel, represented as a tuple, e.g. (255, 255, 255), usually with 8 bits per channel. If the image is converted to grayscale, then each pixel is just a single value representing its lightness or darkness.
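A minimal illustration of that representation (using NumPy, which we had on the Pi anyway; the sample pixel values are made up):

```python
import numpy as np

# A tiny 2-by-2 "image": rows are the y-axis, columns the x-axis,
# and each inner entry is an 8-bit (R, G, B) triple.
img = np.array([
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
], dtype=np.uint8)

r, g, b = img[0, 1]          # pixel at y=0, x=1: pure green
print(r, g, b)               # 0 255 0

# Luminosity-weighted grayscale: one value per pixel instead of three.
gray = (0.299 * img[..., 0] + 0.587 * img[..., 1]
        + 0.114 * img[..., 2]).astype(np.uint8)
print(gray.shape)            # (2, 2)
```

The weights 0.299/0.587/0.114 are the classic luminosity coefficients; green dominates because our eyes are most sensitive to it.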

Finding an efficient way to access these pixels is important, because looping through the arrays means lots of iterations. It might be fine for an 800 x 800 image, but a 5 MB image at 5634 x 3687 pixels means visiting over 20 million pixels in a single pass. And that's just looping through it, let alone making modifications. To achieve a reasonable runtime, the algorithms have to be efficient: for that 5634 x 3687 image, adding one extra step inside the loop means performing over 20 million additional operations. That's by no means trivial.
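To give a feel for why this matters, here is the same per-pixel operation written twice: as a nested Python loop and as a single vectorised NumPy expression (a toy array shape, not the real image):

```python
import numpy as np

# A stand-in grayscale image; small enough to loop over quickly here,
# but the comparison scales to the 20-million-pixel case.
gray = np.random.randint(0, 256, size=(200, 300)).astype(np.int64)

# Loop version: brighten every pixel by 10, clamped at 255.
looped = gray.copy()
for y in range(looped.shape[0]):
    for x in range(looped.shape[1]):
        looped[y, x] = min(looped[y, x] + 10, 255)

# Vectorised version: one line, no Python-level loop; NumPy pushes
# the same work into compiled code.
vectorised = np.minimum(gray + 10, 255)

print(np.array_equal(looped, vectorised))   # True
```

On a full-size image, the vectorised form is typically orders of magnitude faster, because the per-pixel work happens in C rather than in the Python interpreter.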

And then there’s manipulation

There are a few basic ways to make changes to images, and it's easiest when they are binary, because that makes detection so much simpler. But what if things are in gradients? My project this week required me to detect and change colors over a spectrum, not just one pixel value. I searched for algorithms that could help me achieve this, but they all dealt with binary images. For example, detecting black dots on a colored background is done by converting the image to an HSV color space and then isolating the black from the rest. Other approaches used adaptive thresholding, where a threshold is supplied (or computed locally) and the other pixels are compared against it. The problem is that these are all binary operations: the algorithm sets each pixel to either the target color or 0 (black). That makes for really unnatural effects, which was not in the scope of my project.
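As a toy version of the HSV-isolation idea (my own example, built with the standard-library colorsys module rather than a computer-vision library):

```python
import colorsys
import numpy as np

# Flag the near-black pixels of an RGB image by thresholding the
# V (value) channel after converting each pixel to HSV.
img = np.array([
    [[250, 20, 20], [10, 10, 10]],
    [[5, 5, 5], [30, 200, 30]],
], dtype=np.uint8)

def dark_mask(rgb_img, v_threshold=0.1):
    h, w, _ = rgb_img.shape
    mask = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            r, g, b = rgb_img[y, x] / 255.0          # scale to [0, 1]
            _, _, v = colorsys.rgb_to_hsv(r, g, b)   # keep only value
            mask[y, x] = v < v_threshold
    return mask

print(dark_mask(img).tolist())   # [[False, True], [True, False]]
```

A binary algorithm would now set the masked pixels to one color and everything else to another, which is exactly the all-or-nothing behaviour that didn't suit my gradient-based project.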

The solution?

Write my own algorithm. It’s by no means an easy task, and there’s certainly a lot more sophistication that could go into it. But in layman terms, here’s what it tries to do:

1. Isolate the portion that I want to detect.
2. Crop it out for the computer to work on.
3. Use percentiles to check the relative brightness of the pixels.
4. Examine the color of each pixel in relation to the percentiles I determined.
5. Make the necessary changes if a pixel satisfies the criteria I set.
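Steps 3 to 5 above might look something like this in code (a simplified sketch with invented names, assuming the region of interest has already been cropped; the real version used my own criteria):

```python
import numpy as np

def brighten_dim_pixels(gray, pct=25, boost=1.5):
    """Boost every pixel dimmer than the pct-th brightness percentile.

    gray: 2-D array of 0-255 grayscale values. Returns a new uint8
    array; pixels at or above the percentile are left untouched.
    """
    cutoff = np.percentile(gray, pct)    # step 3: relative brightness
    out = gray.astype(float)
    dim = out < cutoff                   # step 4: compare each pixel
    out[dim] = np.clip(out[dim] * boost, 0, 255)   # step 5: modify
    return out.astype(np.uint8)

gray = np.array([[10, 200],
                 [30, 220]], dtype=np.uint8)
result = brighten_dim_pixels(gray, pct=25, boost=2.0)
print(result.tolist())   # [[20, 200], [30, 220]]
```

Because the cutoff is relative to the image's own brightness distribution rather than a fixed threshold, the adjustment degrades gracefully over gradients instead of producing a hard binary edge.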

There are cleverer ways I could implement this, like using block sizes to filter out parts of the image I don't want detected, or using recursion so that previously detected pixel values feed into the criteria for the next changes. But I had a time limit, and currently my skills with Python aren't advanced enough to do it. At the moment, I've been trained so much in OCaml that thinking in object-oriented terms is more difficult for me.

On Runtime:

In computer science class, I learnt that anything that runs in O(n^2) is really bad; insertion sort is O(n^2), and you could very well do with better sorting algorithms. My algorithm made two passes over the image: one loop to gather the relative brightness, and another to make the necessary changes. Strictly speaking, two sequential passes are still linear in the number of pixels; the expensive part is that each pass is itself a nested loop over width x height. And because the second loop depended on the results of the first, I had little choice but to run them one after the other.

Another thing: most programs run fast on a regular computer and do the job pretty well. But when they're transferred onto a microcomputer, things can go pretty awry. For our project, we had to run everything on a Raspberry Pi 2, and before that we had to install a few libraries, mainly NumPy. NumPy took over an hour to compile on the Pi, and we rather regretted not compiling it first. Also, whereas on a computer we can usually get past “Permission denied” errors with a single sudo, on the Pi you have to make sure permissions are handled consistently.

My algorithm ran in 5–7 seconds on a computer, but took 20 seconds on a Pi. For our project, that was quite disastrous. I'm now trying out other algorithms, but they're not doing any better at all. I'll post an update later this week.

When it comes to real-life work, runtime matters a great deal.

A little musing on OOP (Object Oriented Programming)

I've dealt with a few object-oriented languages, namely Java, JavaScript, Swift, and most lately Python (plus some C), and coming from a functional background, there are a few commonalities in my experience and frustrations with them:

1. Little error catching before run time

In the dynamically typed ones, Python and JavaScript especially, a function with a mistake in it sails through untouched until the offending line actually runs. Since I don't have the habit of printing debug statements throughout my code, I'm often unsure where the source of an error is. The system may throw an error saying there was a wrong type somewhere, but that still means reviewing lines of code all over again to identify what's actually causing it. It takes a lot of time, and getting into the habit of testing your code after small snippets really helps. And when code silently does nothing, there's no message at all, so sometimes I'm left wondering what went wrong.

That being said, I have enough experience now to know how to set up my own code to find bugs and test properly.

2. Passing functions around is clumsier

Technically, languages like Python and JavaScript do treat functions as first-class objects, but compared to a functional language, passing functions into other functions feels far less natural, and writing code this way can get quite tedious after a while. I really appreciate functional programming languages for making it so effortless.

Final Thoughts

This week was a good introduction to computer vision, even though I probably only scratched the surface. More importantly, I'm learning how to write good code on my own, and how to find errors in it in a new context.

Internship at Metalworks – Week 8 [13 July to 16 July 2015]

Notice: I'll be taking a hiatus from this series from the 20th of July till the 10th of August, as I'm going to Boston for two weeks! I will still be updating this blog, but in a different capacity: documenting explorations not related to my internship.

This week was really all about the one project I was tasked with. As it makes sense for me not to reveal the inner workings of the project, I'll talk about some things I explored during the week, and something I did for MakeDay!

Monday: Project / Make Day
Tuesday: Make Day / Project
Wednesday: Project
Thursday: Project
Friday: Hari Raya!

MakeDay

At Metalworks, we get one day a week to do anything we want, preferably non-work-related and creative. We call it MakeDay. The possibilities go anywhere, really, from remote 3D printers to wireless chargers. Some choose to learn a new framework like Ruby on Rails, or learn to work an Arduino. A previous intern did Arduino Pong, which was pretty cool.

I decided to do something a little different and try out computer graphics manipulated in real time. My initial inspiration was a video made by Adrien M and Claire B, where they created this IDE called eMotion. Their main focus was on particle manipulation, and you could come up with really cool interactive graphics from it. I also realised that they had Leap Motion integration, so I took the liberty of having fun with it. This was the result:

[Screenshot of my eMotion experiment]

This was just a basic example that explored the different motion brushes to see what kinds of effects you could pull off. An important thing I noted was how different effects had different dependencies: they don't work with all types of particles, and the parameters required to activate them vary a lot. Getting the right strength for an effect is one thing; using it for a good artistic purpose is another. I'd like to explore that a little further. Also, I don't think I've explored the full potential of the Leap integration just yet, though I suspect the Leap itself is quite limited. My goal would be to mimic the above video, but with different graphics.

I’ve always wondered if you could script certain effects specific to certain hand motions, for example if you swiped right, the particles would react to the speed at which you motioned at. And there is a way to do that with the scripting feature on eMotion. I haven’t gotten a chance at doing it, but hopefully I will get my chance at it when I’m back from Boston 🙂

Project

I've been working on a retail project for the past few days, and it has entailed tons of iterations. If anything, I've begun to question a bit more than I used to. I kept quiet most of the time because I wanted to observe how things ran; now that I've done so, it's time to start asking questions. I went to clarify the position of the company, what our main value offering is, and what kinds of opportunities we are looking out for.

After 8 weeks here, I think I've got a good idea. Credit to Rollen: we do 3 main things, Prototyping, Production and Pitching. Prototyping is providing a proof of concept to our clients. Production is making a product worthy of going public. Pitching is the initial phase of putting the idea out there for the client. This week was about prototyping. But how do we add value with prototyping? We find new ways to use technology and show how it can be done. We do the necessary research to check feasibility and hack away at a very rough design in 1–2 weeks. If the prototype we're building already happens to be implemented elsewhere, then we push it further by adding a few features our clients have not thought about.

Modularity

While building the prototype, modularity is really important for troubleshooting. It helps you identify the problem, isolate the parts that aren't working, and fix them without affecting the other parts. It is a basic principle in software development, but equally applicable in electronics. It saved me countless hours of resoldering and of testing connections in places I shouldn't have been checking. And the idea of plug and play makes replacing broken components, or even just putting the prototype together, much easier. It's always stressed in my computer science classes at Yale-NUS, but it's only when you encounter it in real life that you really appreciate the countless reminders the prof drills in.

But yeah, that was this week, and Boston here I come! Going to play around with this Haskell eBook that Rollen passed to me, and it’s time to do a review in functional programming. More thoughts coming soon 🙂

Internship at Metalworks – Week 7 [6 July to 10 July 2015]

Work has gotten a lot busier this week. I've been tasked to handle a project individually, so I've been spending a lot of time on it. Reflecting on 7 weeks at Metalworks, I feel I've become a lot better at handling and reusing old code, coming up with ideas for projects, and researching how to turn them into reality. This week and the coming week will be the test of whether I can push a project through. It isn't a big project, but I'm counting on small steps to learn and improve.

Monday: Electronics – soldering, proto-boards
Tuesday: Wiring
Wednesday: Make circuitry more robust
Thursday: 3D Printing first iterations, Processing (to play videos)
Friday: Processing code editing, 3D Printing iterations

Tearing Down and Starting Again
[Photo, 6 July 2015]
Something I've got to get used to: redoing things even when they're going well. I have a tendency to get attached to what I've created and to build my own inertia against budging from my current path. But there's a limit to inertia. I built the circuit above and wired it pretty well to function how I wanted, but it didn't fit the specifications of the design I was required to build exactly, so I had to tear it down. Before I did, though, I ran small iterations with each part of the circuit to make sure I knew what I was doing, and I was quite happy with the results.
[Photo, 8 July 2015]
In one of the iterations, my supervisor thought of using USB ports to join the pieces together, and it was a great idea. I had originally used a different configuration to achieve the same effect, but thinking about it, the USB port would really do it. It took a while to realize that the name USB actually has a proper meaning: Universal Serial Bus. Serial, because it transmits serial data; bus, because it joins up electric circuits. In the mainstream we toss the word USB around freely, but it does so much more than shift folders from one device to the next. With a little hack, I managed to create these kinds of connections and get them to work. Being able to throw away old ideas and use new ones isn't a revolutionary notion, but it is important.

Iterations

It's important to iterate, but even more so to iterate properly.

[Photo, 10 July 2015]
This is what I tried at first: an ambitious attempt to go all out at once, which was obviously not going to work. Almost nothing fit. I then worked in smaller iterations and got the pieces to fit one by one, along the way discovering certain principles of 3D design, like how much clearance to give holes, and how to make each iteration less wasteful, like really cutting down on the amount of PLA I use. These were all really important, and I managed to get everything to fit in the end. Yay! \^o^/

[Photos: casings, iterations 1 and 2]

Reusing old code

This project had been done before, and it was important for Metalworks to do it again to build a good repository. Unfortunately, the old code wasn't well documented and the old hardware had been torn apart, which is why I've been working on it. So I had to reuse old code, and it was hard at first. The program was written in Processing, which is based on Java, and I hardly have any good experiences with Java, so I was a little hesitant. But my supervisor ran me through the code once, and that was really important. After hearing the explanation, I quickly dove in and broke the file into commented sections so that I could understand it easily, and hopefully others too. I wrote out explanations for the ambiguous parts I couldn't catch immediately, including some “under the hood” events. All this seemed important so that the code I've edited and written is more reusable for the next person.

I found Processing interesting, and in about a day I managed to write a class, make split windows, display text, play a video, and read serial inputs. I’m happy with my progress, and it allowed me to focus on the 3D prints for next week.

Wrapping Up

The act of creating really excites me. I get really motivated whenever I have the chance to make something I can call my own. I dove in, got sucked in, and now there's no turning back. I'm going to keep exploring, keep making, and make the most of my time here. Thinking about it, it's rare to have the chance to deal with hardware, and rare to have such great support from my teammates, so I will be pushing on from here. Yaaaa!