Tuesday, June 26, 2012

A Piece of the Puzzle

I'm taking an introductory AI class this semester at RPI. In today's class (class number three of the semester), we got to the point of talking about some of the different approaches to problem solving in AI. Now, this class is structured around the idea of learning to build a rational agent, so we focused today on developing a goal-based problem solver. A goal-based agent is designed around the concept of "states". The current position of the agent in its environment, all observable attributes of the environment, etc. comprise the agent's "current state". Every time the agent performs an action (or any other agents or environmental bodies perform an action in a non-static environment), a new state is essentially created. We work under the assumption (for now, anyway) that all possible states are known. In fact, we work under several assumptions when creating a simple problem solver. We assume that our environment is:
  • Static - unchanging except when the agent affects it
  • Observable - there are no unknown or unaccounted for variables
  • Deterministic - again, we either know all possible states or can at least calculate/predict with complete accuracy the state after a given set of actions
Now, with these assumptions in place, a goal-based agent needs five things to solve a problem:
  • An initial state
  • A set of possible states (or a way to calculate them)
  • A set of possible actions
  • A goal state
  • A metric by which to measure solution desirability, so as to be able to find the "best" solution
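A minimal sketch of how those five ingredients might fit together in code (the names and the breadth-first approach here are my own choices, not anything prescribed by the class):

```javascript
// A minimal goal-based problem solver: breadth-first search over states.
// All names here (solve, trip, roads, etc.) are illustrative, not a settled API.
function solve(problem) {
  // problem = { initial, actions(state), result(state, action), isGoal(state) }
  var frontier = [{ state: problem.initial, path: [] }];
  var explored = {};
  while (frontier.length > 0) {
    var node = frontier.shift();              // FIFO queue -> breadth-first
    if (problem.isGoal(node.state)) return node.path;
    if (explored[node.state]) continue;
    explored[node.state] = true;
    problem.actions(node.state).forEach(function (action) {
      var next = problem.result(node.state, action);
      frontier.push({ state: next, path: node.path.concat([action]) });
    });
  }
  return null; // no solution exists
}

// Example: the road-travel framing, with places as states and roads as actions.
var roads = {
  home: ["mainSt", "route9"],
  mainSt: ["home", "downtown"],
  route9: ["home", "downtown"],
  downtown: ["mainSt", "route9", "destination"],
  destination: ["downtown"]
};
var trip = {
  initial: "home",
  actions: function (s) { return roads[s]; },   // each action = "drive to X"
  result: function (s, a) { return a; },
  isGoal: function (s) { return s === "destination"; }
};
// solve(trip) -> ["mainSt", "downtown", "destination"]
```

Because the frontier is a FIFO queue, this returns a solution with the fewest actions, which matches the "fewest moves" desirability metric; swapping the queue for a priority queue ordered by path cost would turn it into a uniform-cost search for other metrics.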
This is a fascinating concept to me, mainly because the more I think about it, the more I realize that this is essentially what human brains do a lot of the time. Consider a board game such as chess, checkers, or Stratego. A state consists of the current arrangement of the pieces on the game board. The set of states is quite large, though not infinite, and consists of every possible arrangement of every piece. The sets of possible actions are different for each game and are defined by the various sets of rules, but in general consist of moving the various pieces in different directions, capturing pieces, etc. And of course the goal state consists of either eliminating the opponent from the board (checkers), leaving the opponent with no possible moves while the King piece is in imminent danger (chess), or capturing the opponent's flag (Stratego). The desirability metric is generally considered to be how few moves the solution takes, though this may be replaced by how enjoyable the game was, etc. This item is far more subjective than the others.

Another example is road travel. You want to get from your house to someone else's. The various states may be roadways, cities, buildings, etc., depending on how you frame it. Your actions may involve going straight, turning, slowing down, increasing speed, etc., depending on what you designate as states and how detailed you want to get. Your initial state is your house and your goal state is your destination. Your desirability metric can be time, distance, road quality, number of tolls, etc.

This type of model can be applied to a huge number of scenarios and applications. It is only one part of a much larger set of skills that a brain has, but it is certainly a good place to start. I think I am going to begin by writing this goal-based agent. This will probably extend to a utility-based agent, which I will probably write another post about down the line (study ahead here), but that change shouldn't be too hard to make later.
For now I need to devise a solid general purpose algorithm and a couple of extendable classes to go with it.

Monday, August 8, 2011

Steps along the way (or, the Human Brain is really complicated)

Here I am, 6 days into the project and I haven't done too much with it. This is largely because I have not had too much time, as I'm currently finishing up a summer internship and taking care of all that comes with that.

Now, my progress. Initially, I had hoped to code my own JavaScript Prolog Interpreter, but decided to first look at what already exists out there. I decided, after some quick searching, to base my code around jsProlog, a project developed by Jan (no last name given) at the University of Bristol (check it out at http://ioctl.org/logic/prolog-latest). I took her code, which is tailored to the layout of her web page, removed references to specific divs, etc., and wrapped it in an object that I subsequently named the PrologEngine object. Thus far I can read in a file containing Prolog statements on the fly and interpret them, storing them in a database contained within the PrologEngine object. Querying is still a bit buggy (read: it's not working at all yet), but I hope to have those bugs worked out by the end of the week.
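For a rough idea of the wrapper's shape (a sketch only; the real jsProlog parsing and storage are far more involved, and every name here besides PrologEngine is my own invention):

```javascript
// Sketch of the PrologEngine wrapper idea: an object owning its own clause
// database, with no references to any particular page's divs.
function PrologEngine() {
  this.database = {};   // clauses keyed by "functor/arity"
}

// Parse a single fact like "parent(tom, bob)." into { functor, args }.
// (Real Prolog parsing handles rules, nesting, operators, etc.)
PrologEngine.prototype.parseClause = function (text) {
  var m = /^\s*([a-z]\w*)\(([^)]*)\)\s*\.\s*$/.exec(text);
  if (!m) throw new Error("cannot parse: " + text);
  var args = m[2].split(",").map(function (a) { return a.trim(); });
  return { functor: m[1], args: args };
};

// Read a block of Prolog statements (one per line) into the database.
PrologEngine.prototype.consult = function (source) {
  var self = this;
  source.split("\n").forEach(function (line) {
    if (!line.trim()) return;
    var clause = self.parseClause(line);
    var key = clause.functor + "/" + clause.args.length;
    (self.database[key] = self.database[key] || []).push(clause);
  });
};

// var engine = new PrologEngine();
// engine.consult("parent(tom, bob).\nparent(bob, ann).");
```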

I have spent a good stretch of time thinking about this project. Here are some of those thoughts:

From the get-go, I want this project to ultimately serve as a simulation of the human brain. This means incorporating a few specific things that are difficult to code.

We're talking about things like:

Memories


How does the brain remember things? If you think about it, we store enormous chunks of data as single, consistent memories. We learn new things when we revisit our memories. For example, I once met a person and became well acquainted with him, only to realize a long time afterward that I had actually met him years before, in a completely different context. I realized this by reviewing my memories, essentially: just letting my mind wander over old thoughts and putting pieces and images together.

This is one of the absolutely incredible things about the human mind. We remember almost every single little detail about our experiences. This is a massive amount of data, and our mental processors aren't quite equipped to handle all of it at once. Therefore, we simply store it away and wait until a more convenient time. Whenever it isn't actively doing something, and often even when it is, our mind is constantly processing the data it has stored away. Much of the data we have stored is never actually processed at all. This brings us to the second point.

Learning


This is a two-fold issue. First, how does the brain know what connections to make? Second, how does it know what data is important enough to be immediately processed, and how does it know what to ignore for the moment and push to the back to be processed later, or perhaps not at all? We know the brain makes these distinctions. Here is an example: when we look at a painting, we tend to pick up on the major points first. Let's examine Henry Ossawa Tanner's The Banjo Lesson as a simple example.


Brief aside: I just want to say that I really love this painting. I first saw it in a fantastic post by David Byron over at the blog Baroque Potion. His stuff is fantastic; check it out if you have free time.

Now, I'm betting that when you first looked at this painting, you immediately noticed a few major things:
  • The man
  • The boy
  • The banjo

Whether because of an instinctual communal bond, a tendency to notice the most familiar things or the things that could most directly affect us first, or just because they're the focus of the painting, you noticed the people first. The banjo most likely came next, as an observation about what the humans are doing and as a connection to the title of the painting.

However, it takes some detailed observation and studying to notice some of the following things and connections:
  • The rucksack on the floor
  • The paintings on the wall
  • The ceramic pitcher and saucer on the counter
  • The metal pitcher and pot on the fireplace mantle
  • The physical parallelism between the two previous items
  • The metaphorical parallelism between these two pairs of utensils and the man-boy pair

The list could go on, but the point is apparent. We notice some things only after extended processing. How do we programmatically decide, though, what to consider first? For example, I would like to be able to supply a topic and be able to search the web within a minute or so and learn as much as possible about the topic. But how do you decide what information on the page is worth looking at?

Another point that cannot be lost here is this: Even once the right sentences have been chosen, how do we make connections? How does the computer record facts, rules, and observations? Suppose you were to tell me, "The American economy is in a very poor condition." I could easily convert this into a Prolog-style First-Order Logic (FOL) statement as follows:

poor-economy(america).

But how does the computer determine this? How does it select a consistent format? Perhaps a better choice would be:

bad(american-economy).

This is a little more basic and perhaps universally understandable and useful for the computer. However, it takes a little more natural language processing, understanding that "in poor condition" equates to "bad", etc. And there are plenty of other options for how to represent this statement. Our brain goes through a very complicated process when understanding statements like these.
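One possible way to force a consistent format, sketched below, is to store every statement as a predicate/arguments pair and push predicates through a synonym table before storing (the table entries and names here are invented examples, not a real natural language processing solution):

```javascript
// Normalize predicates through a synonym table so "in poor condition",
// "poor", and "terrible" all become the canonical predicate "bad".
// These entries are illustrative only.
var synonyms = {
  "in poor condition": "bad",
  "poor": "bad",
  "terrible": "bad",
  "excellent": "good",
  "great": "good"
};

function makeFact(predicate, args) {
  var p = synonyms[predicate] || predicate;   // canonical predicate name
  return { predicate: p, args: args };
}

// "The American economy is in a very poor condition."
var fact = makeFact("in poor condition", ["american-economy"]);
// fact is { predicate: "bad", args: ["american-economy"] }
```

The hard part, of course, is not the lookup table but deciding what goes in it; that is exactly the natural language processing problem described above.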

And now we get to perhaps the biggest obstacle of all:


Original Thought



For a long time this has been the thing that I considered unconquerable. A computer is programmed by a person to follow very specific procedural paths. We can squabble over imperative vs. declarative programming, Von Neumann vs. functional vs. object-oriented programming languages, etc., but ultimately it's all compiled or interpreted down to procedures, top-to-bottom operations written in binary machine language. It's all got to be put there by an intelligent human. How is it possible that a machine programmed by a human could ever end up with code that the human didn't put there? How could it ever perform an action or make an observation that it wasn't specifically told to?

This question is something that was pondered by some of the greats in Computer Science history, men like Alan Turing, over sixty years ago, when things like the internet were not yet even thought of. Incredibly, it is still considered an open question in Computer Science: we do not yet know whether the human brain is more powerful (computationally speaking) than the Turing Machine model.

After all, what is the brain but ultimately an enormous bundle of circuits, performing procedural tasks very, very quickly?

I would love to build a JavaScript "machine" that passes the Turing test. That would be amazing.

But there's a long way to go first.


One step at a time.


The next few steps along the way:
  • Finish PrologEngine.js
  • Begin determining how to represent data in a universally consistent way
  • Begin planning how to parse blocks of text for data

"You seek for knowledge and wisdom, as I once did; and I ardently hope that the gratification of your wishes may not be a serpent to sting you, as mine has been." - Mary Shelley, Frankenstein

Monday, August 1, 2011

If Frankenstein Wrote JavaScript... He Would've Been Crazier Than He Already Was.

JavaScript gets a really bad rap these days. But honestly, I love it. The mix of straightforward, familiar C-style syntax with a surprising number of functional programming elements (list operations, lambda-like function() expressions, higher-order functions, etc.) makes for, in my opinion, a versatile, fairly elegant (even if occasionally confusing) scripting language.
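A quick taste of that functional flavor:

```javascript
// List operations with an anonymous function, C-style syntax and all.
var squares = [1, 2, 3, 4].map(function (n) { return n * n; });  // [1, 4, 9, 16]

// A higher-order function: takes a function, returns a new one.
function twice(f) {
  return function (x) { return f(f(x)); };
}
var addTwo = twice(function (x) { return x + 1; });
// addTwo(5) === 7
```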

I will admit, however, that I am somewhat biased, as it was the first language I ever learned. I wrote my first "Hello World" script at the age of 12. Since then I have been fascinated with the incredible power and flexibility of computers, especially the capacity for "intelligence" of a sort found in things like machine learning, genetic algorithms, and all flavors of AI.

Today I decided to start a project with three goals in mind: first, to stretch my own abilities as a programmer; second, to have some fun with JavaScript, as I haven't had the opportunity to do so in quite a while; third, to see just what JavaScript can do, and how far I can take it as a computational language.

The project is this: I will be writing a brain in JavaScript. Now, that's a very vague and broad assertion, and a huge undertaking; I understand this. This is one of the reasons I have set up this blog: I want to make sure I have my thoughts organized before I begin. I plan to take this in baby steps, recording my progress along the way. I will be hosting my code on GitHub, and I always welcome any and all advice and suggestions.

So, the plan so far is fairly loose and open (largely because I just imagined this whole project about an hour ago), but will become more solidified (hopefully) as I move forward. I should state off the bat that I don't have overly extravagant goals in mind here. I plan to start where I can and take it as far as it will go, I suppose. But ultimately, I'd like to see it become a framework for simulating certain innate features of the human mind.

  • I would like to be able to run genetic algorithms, which simulate learning from mistakes.
  • I would like to run logical decision-making problems, which simulate learning from others' mistakes.
  • I would like to run information gathering queries, such as searching for a person online or reporting on the stock history of a company, which simulate human observation.

These are just a few examples.

My initial plan is as follows:

1) Set up a GitHub repository and make first commit.
2) Begin work on a Prolog-style first-order-logic interpreter.
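The heart of any Prolog-style interpreter is unification: matching two terms and binding variables along the way. Here is a minimal sketch, using a term representation I made up just for illustration (strings starting with an uppercase letter are variables, arrays are compound terms); the eventual interpreter will no doubt look different:

```javascript
// Minimal unification. Variables are uppercase-leading strings ("X");
// compound terms are arrays like ["parent", "tom", "X"].
function isVar(t) { return typeof t === "string" && /^[A-Z]/.test(t); }

// Follow a chain of variable bindings to the term at the end.
function resolve(t, bindings) {
  while (isVar(t) && bindings.hasOwnProperty(t)) t = bindings[t];
  return t;
}

// Returns the (extended) bindings on success, or null on mismatch.
function unify(a, b, bindings) {
  a = resolve(a, bindings);
  b = resolve(b, bindings);
  if (a === b) return bindings;
  if (isVar(a)) { bindings[a] = b; return bindings; }
  if (isVar(b)) { bindings[b] = a; return bindings; }
  if (Array.isArray(a) && Array.isArray(b) && a.length === b.length) {
    for (var i = 0; i < a.length; i++) {
      bindings = unify(a[i], b[i], bindings);
      if (bindings === null) return null;
    }
    return bindings;
  }
  return null; // constants or structures that don't match
}

// unify(["parent", "tom", "X"], ["parent", "tom", "bob"], {})
//   -> { X: "bob" }
```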

And that's all I've got so far. I'll see where that goes and come up with things as I move forward.


Any and all comments, suggestions, and ideas are more than welcome.


"So much has been done, exclaimed the soul of Frankenstein—more, far more, will I achieve; treading in the steps already marked, I will pioneer a new way, explore unknown powers, and unfold to the world the deepest mysteries of creation." - Mary Shelley, Frankenstein