Tuesday, June 24, 2008

4 Color Gift Cards

So my store is working on converting to new stored-value gift cards. Our software already supports the capability, so the only remaining problem is getting the cards themselves. We're looking at going with 4colorprint.com. They'll print 4-color gift cards for us with serialized barcodes as well. This works nicely because each gift card we sell carries a unique barcode, so customers can do balance inquiries and add or subtract value, and we can manage the accounting easily. Their prices are also excellent; I'd recommend them to anyone looking to do something similar.

Tuesday, May 6, 2008

Final Project progress

My final project is exploring an unsupervised neural net in the framework of stereotyping. Unsupervised neural nets simply search for patterns in the data they're given and sort it based on similarities among certain traits. No teaching signal is required, and a simple REALbasic construction I've built seems to do a decent job of sorting. I haven't yet been able to get my implementation to compile once the stereotyping component is added, but I'm still hopeful. I'm using a network based on Stephen Grossberg's work: the Adaptive Resonance Theory (ART) network. It uses a vigilance parameter to govern the formation of new nodes. I extend it by allowing these nodes to backpropagate weights through the entire network, which allows the original stereotypes to change.
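To make the vigilance idea concrete, here's a toy sketch in Java (much simpler than real ART, and not my actual REALbasic implementation): an input that overlaps a stored prototype strongly enough "resonates" with that node; otherwise the network recruits a new node.

```java
import java.util.ArrayList;
import java.util.List;

// Toy ART-style clusterer: vigilance decides whether an input joins an
// existing category node or forms a new one.
public class MiniART {
    private final List<double[]> prototypes = new ArrayList<double[]>();
    private final double vigilance; // 0..1: higher = stricter matching, more nodes

    MiniART(double vigilance) { this.vigilance = vigilance; }

    // Present a pattern; return the index of the node that claims it,
    // recruiting a new node when nothing passes the vigilance test.
    int present(double[] pattern) {
        for (int i = 0; i < prototypes.size(); i++) {
            double[] p = prototypes.get(i);
            double overlap = 0, size = 0;
            for (int j = 0; j < pattern.length; j++) {
                overlap += Math.min(p[j], pattern[j]); // fuzzy AND of pattern and prototype
                size += pattern[j];
            }
            if (size > 0 && overlap / size >= vigilance) {
                // Resonance: fold the new pattern into the winning prototype.
                for (int j = 0; j < p.length; j++) p[j] = Math.min(p[j], pattern[j]);
                return i;
            }
        }
        prototypes.add(pattern.clone()); // novelty: a new "stereotype" node forms
        return prototypes.size() - 1;
    }

    public static void main(String[] args) {
        MiniART art = new MiniART(0.6);
        System.out.println(art.present(new double[]{1, 1, 0, 0})); // 0: first node
        System.out.println(art.present(new double[]{1, 1, 1, 0})); // 0: resonates
        System.out.println(art.present(new double[]{0, 0, 1, 1})); // 1: new node
    }
}
```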

I've been reading a lot about a couple of different network models used in social cognition. First is the tensor product model, which binds representations together via the tensor product and has the interesting property of being able to add context to an event. I may try to use the basics of the tensor product model in implementing my new network. My network uses vectors (in the linear algebra sense) as the basis for its input patterns, so I can simply take the tensor product of my input vectors and look at the kinds of relationships it can build.
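To be concrete about what that buys us: for two vectors the tensor product is just the matrix of all pairwise products, and binding in a third "context" vector yields a rank-3 tensor:

$$(\mathbf{u} \otimes \mathbf{v})_{ij} = u_i v_j, \qquad (\mathbf{u} \otimes \mathbf{v} \otimes \mathbf{c})_{ijk} = u_i v_j c_k$$

The first form stores an association between two input patterns; the second tags that association with the context in which it occurred.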

What's interesting about the model I'm building is the way it clusters the pattern representations. Essentially, it sorts the vectors by checking whether they're "closest" to each other in n-dimensional space. This is easy to visualize when a pattern vector has three or fewer components, but beyond that we enter the fourth dimension and scenarios that are difficult to imagine.
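"Closest" here is ordinary Euclidean distance, which is computed the same way no matter how many dimensions the pattern vectors have; a quick sketch:

```java
// Euclidean distance between two pattern vectors in n-dimensional space.
public class Distance {
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d; // squared difference along each dimension
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // The same code handles 3 components or 300.
        System.out.println(distance(new double[]{1, 0, 0}, new double[]{0, 1, 0}));
    }
}
```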

Anyway, this model has been very interesting to explore and I hope it can yield some usable results.

Tuesday, April 29, 2008

SCIgen: An Experiment in Pluralistic Ignorance

I mentioned in class a randomly generated computer science paper that I "wrote" and showed to friends. I looked up the project again, and although at the time I just played around with it and didn't think of it in psychological terms, it's an interesting idea to consider. The basic premise is that, using a variety of technical terms and an algorithm which assembles them into mostly grammatical sentences, a computer science paper is generated. Now, anyone with much common sense reading through one of the randomly generated papers will realize pretty quickly that something is wrong. Yet a number of the papers were accepted to journals and conferences before the submitters told the journals it was a hoax. See http://pdos.csail.mit.edu/scigen/blog/ for more details.

I wonder if this is a pluralistic ignorance effect of some sort. If a journal editor doesn't understand a paper, might he be afraid to ask his superiors or question the author? The friends I showed it to either didn't read it, or read it, didn't understand it, and didn't want to admit that they had no idea what "I" was saying in the paper. It's an interesting idea, and I wonder how pervasive this effect is within academia.

Wednesday, April 16, 2008

Implicit Learning

I’m doing research for my final project, which will be exploring stereotypes. I’m trying to get a handle on implicit learning so I’ve been reading several books on that. Particularly, stereotypes seem to be automatic to a fairly high degree, so I want to understand implicit learning well enough to be able to conceptualize a model implementation. I’m reading a book conveniently titled Implicit Learning by Axel Cleeremans. Some interesting things I’ve learned about implicit learning:

Subjects given a task such as artificial grammar learning very often do quite well at learning the rules and recognizing patterns when simply learning implicitly. They also cannot verbalize many of the methods by which they came up with these rules. Interestingly, when told to "look for rules," subjects' performance decreases; they do much better on a sequential learning task like that when only learning implicitly. I've noticed this sort of thing when doing calculus. As one gets to the point of integrating more complex functions, it becomes much less of an "apply the rules you've learned" game and more of a subtle art. When given complex functions, one often has to just try several different methods until the one that works well becomes clear. I've noticed that as I practiced calculus more, I was able to get the complex functions done in fewer steps. Of course, I can't verbalize why it's easier or what about a function makes me decide that integration by parts is the way to go, but nonetheless I can do it.

Relating this to stereotypes, I wonder how much of stereotype formation is implicit. Going back to my convenience store stereotype described in previous posts, I can't tell you (for the most part) what about a customer makes me decide that they will ask for a bag once they get to the register. But nonetheless, I immediately make these sorts of assumptions. Some stereotypes (ethnic, gender, etc.) are quite salient, and it's easy to see how one applies a stereotype based on them. However, not all are immediately evident, and it isn't always evident why we apply the traits we attach to a stereotype. Again, though, some are. Statistically inaccurate stereotypes seem to be a case where implicit learning has failed. I don't believe we consciously decide "I am going to form a stereotype about Asians. I will describe them with these characteristics: ..." We might have learned them from our culture, our parents, our environment, society, etc. But it's unlikely that we purposefully formed our conceptions of people.

Anyway, I’m getting very excited about getting into the network creation for my final project. I will readily admit to being baffled by some of the math that some of the network books I’m reading have, but I think I’m finally starting to get some clarity and inspiration on how I can implement a stereotype framework into a neural network.

Monday, April 14, 2008

An Empirical Study of Stereotypes

The title is slightly misleading. I was called into work at my convenience store on Friday due to an IT emergency: our main cash register experienced database corruption, and aside from my duties as hangover-vomit-removal specialist, I also serve as the store's IT manager, so I came in to deal with it at double overtime pay. I wish more emergencies needed my immediate attention at double overtime.

While I was waiting for the database to rebuild, I decided to test my "customer stereotype accuracy." As each customer entered, I made note of my unfounded assumptions about them, then noted whether they were correct. Out of 197 customers, I got 7.1% completely correct and 17.3% partially correct. I only counted customers who bought something, I did not include any friends accompanying these customers unless they bought something as well, and I did not count any of the "regulars," whom I would already be familiar with. So, at best I had about one quarter (24.4%) of my assumptions correct. It seems there's quite a confirmation bias at work in that stereotype: I thought I had an accurate picture of our customer base, but in the end it turned out that most of that was just my noting only my correct assumptions. I would speculate that this is what fuels a lot of stereotypes; confirmation bias seems to be the main reason my customer stereotypes are maintained. I'm hoping to do my final project on stereotypes, so I wonder if there's any way I can build some sort of confirmation bias mechanism into my network.

Of course there are several problems with my "experiment":
1) No control.
2) I was quite stressed at the time, as Fridays are some of our biggest sales days, and a colossal database crash is quite a stressful event for me. Running that many customers through on our single backup terminal while simultaneously repairing our main system probably made me a bit terser with customers than I normally would be and thus might have affected their attitudes.
3) One particularly annoying customer, whom I ended up having to call the police on, put me in a very terse mood that, as above, might have affected my interactions with customers.
4) There's the "awareness" factor: as both the experimenter and the subject, I was probably not making the same assumptions I normally would, or at least had a different pattern of behavior.

In any case, it was an interesting study that I think will affect how I view customers and how I treat people who enter our store.

Friday, April 11, 2008

Stereotyping and Convenience Stores

So I worked at (and still come in to do occasional tech consulting for) a convenience store in Collegetown, Ithaca, home of Ivy League Cornell. I worked there regularly all through high school, four years, and over that time I discovered that as soon as someone walked into the store, I made a few assumptions about them:

1) What kind of things they would buy (Beer, frozen yogurt, cigarettes, grocery items?)
2) How they would pay (Cash, Credit, or City Bucks, the Cornell off-campus dining plan)
3) If they were buying beer or cigarettes, would they have their ID?
4) Would they ask for a bag?
5) Would they be polite, or would they give me a hard time?

It's amazing: based on a five-second analysis as someone walks in, I've unconsciously made a determination about a person's demeanor from their appearance and nothing else. I like to think that it's based on some sort of knowledge and experience, and to some degree it is. There is definitely a stereotype of the "fro-yo girl" (affectionately called "fro-yo hoes" by the store staff). She's always on her cell phone, pays with City Bucks, and buys frozen yogurt regardless of how cold it is. My store has gone as far as creating a "fro-yo hoe of the month" promotion, which has been surprisingly successful. The recipients (with their consent) are awarded free frozen yogurt for a month and a t-shirt, and get to choose one of the flavors, all because they fit a stereotype well. It's quite interesting to see how receptive most of the girls are to receiving this "honor."

Seeing how easily I formed my own stereotype about who tries to buy beer without an ID and who will ask for a bag, based only on appearance, really speaks to how readily stereotypes form. It should be interesting to see whether I can model this automatic behavior in my cognitive neural net; this is the topic I hope to study for my final project.

Homework 2: Recreating a memory effect

So I am finally writing up homework 2. After weeks of procrastination, fifteen hours of little progress and subsequent panic, and finally three hours of working results, I finished. I used JOONE to recreate the primacy and recency effects, though I only succeeded in creating the primacy effect. I used a new kind of layer: the delay layer. This layer let me give an input node a "delay" before it is processed through the network, so undelayed nodes get processed more than the delayed ones. A parameter called taps set the number of cycles by which processing would be delayed.
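To make the taps idea concrete, here's how I picture a tapped delay line in plain Java; this is my own illustration of the concept, not JOONE's actual DelayLayer code:

```java
// A tapped delay line: a buffer whose "taps" expose the last few cycles of input.
public class DelayLine {
    private final double[] buffer; // one slot per tap

    DelayLine(int taps) { buffer = new double[taps]; }

    // Push a new input; return the output of every tap, oldest last.
    double[] step(double input) {
        System.arraycopy(buffer, 0, buffer, 1, buffer.length - 1); // shift older values down
        buffer[0] = input;
        return buffer.clone();
    }

    public static void main(String[] args) {
        DelayLine line = new DelayLine(3);
        for (double x : new double[]{1, 2, 3, 4}) {
            System.out.println(java.util.Arrays.toString(line.step(x)));
        }
        // [1,0,0] [2,1,0] [3,2,1] [4,3,2]: later layers see delayed copies of earlier input.
    }
}
```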

My main difficulty was figuring out how to measure the results. The main question is "how well did it learn a set of data?" I gave it a set of values on the input node, had a teacher signal supply the result, then tested it with various levels of delay. I had a large set of data so as to keep the learning fairly randomized, so I used two hidden layers as well. I then ran it through just 100 epochs for each training set. This seems roughly consistent with a basic learning paradigm, where we would be exposed to words or some other stimulus and our neurons would fire many times even in the few seconds of exposure in a primacy/recency task. I was able to get a consistent result where the undelayed item (the first stimulus in a series) had the lowest RMSE, which I used as my baseline for measurement. My problem came when I tried to determine how to get a recency effect. Primacy and recency are probably the result of two different processes, so a different approach is needed for a recency result. In particular, one needs to recreate a long-term and a short-term memory system and have some way to distinguish recall between the two. Perhaps two input "networks" that have already been trained could be transferred to an output node via a learning "switch" that makes the short-term store more salient and accessible, but also less permanent. There would also need to be transfer between STM and LTM. A complicated proposal for sure!

Heuristics

We have been studying heuristics: "rules of thumb" that our memory system uses to make likelihood judgments. They're not perfect, but they give us a good idea of what's likely. Sure, we could be more accurate without them, but we'd also be less efficient. The availability heuristic is particularly interesting: we make a judgment or assumption based on whatever comes to mind most readily. The related anchoring heuristic is similar: we can be given a totally unrelated figure and "latch onto it," using it in some unrelated guess.

It's as I study issues like these in the memory system that I realize part of why we have the scientific method and, in math, the axiomatic method: we need to prevent errors in human judgment from affecting our ability to learn more about the world. If science were based on introspection and just figuring things out from the way things appear to be, we'd have many false assumptions about the world. Our memory system is incredible, but it can also lead us to very powerful mistakes.

Illusory Correlations

We're studying a bit about illusory correlations, and what especially interested me was the note the author began with: while many doctors and patients believe arthritis pain is correlated with weather changes, the scientific data to support this is lacking. This interests me, as my fiancée suffers from fibromyalgia, a condition similar to arthritis that causes chronic pain, and both of us seem to notice the pain is worse in winter. Yet I looked at studies and confirmed that there is also no established link between weather and fibromyalgia pain. Are we really that bad at seeing correlations? The ebb and flow of pain is constant, but we find a link because everyone else seems to find a link.

It appears that our system is quite susceptible to suggestions from others and "common sense." Mathematics, which is of course done by humans, had the quite serious problem that mathematicians were mostly using common sense to make claims about things. That works when proving 1+1=2, but it falls apart when the things we try to prove get more complicated. It was as mathematics grew more advanced and became used more and more in practical applications that the axiomatic proof method became widespread: mathematicians accepted nothing without a proof to go with it. I recall having to prove that any number times 0 is 0. Unfortunately, our memory system is not robust enough to axiomatically and logically prove everything. We must make assumptions and inferences or the world would overwhelm us. Problems like illusory correlations are outweighed by the benefit of being able to make inferences. But these problems are why, as scientists, we must not make claims about the world without data to back them up.
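That proof, by the way, takes only a few lines from the distributive law and the additive identity:

$$a \cdot 0 = a \cdot (0 + 0) = a \cdot 0 + a \cdot 0,$$

and subtracting $a \cdot 0$ from both sides leaves $a \cdot 0 = 0$.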

JOONE

So we began working with JOONE: the Java Object Oriented Neural Engine. Unfortunately, I had an emergency appendectomy during the week the class worked on it, so I didn't have an opportunity to explore it with my peers. But I have had a chance to look through it and I've noticed a few things:

1) It violates Apple's Human Interface Guidelines! Of course, it's meant to be a cross-platform program, but having gotten used to a certain way of programs working, it's a bit hard to get used to. There are, of course, worse things that a program could do.
2) It is open source, and the GUI editor we're using is just a front end to a very powerful Java programming environment. Unfortunately, most of my programming experience is in Objective-C, so it will be a bit of a learning curve for me to create a network just from the APIs JOONE provides (I've sketched a first attempt after this list), but I hope to give my Java chops a spin again.
3) The GUI is a bit hard to get used to, and the complete guide is mostly based on the actual programming environment rather than the GUI front end, but it was quite useful in understanding the underlying frameworks of the network environment. Especially interesting were the different layers: a variety of functions and synapses are available for different kinds of networks. I really hope I can spend a bit of time playing with some of the other kinds of layers and synapses to see what kinds of results they yield.
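Here's the kind of thing I have in mind, sketched from memory of the examples in the complete guide; I haven't compiled this yet, so treat the exact class names and signatures as my best recollection rather than gospel:

```java
import org.joone.engine.FullSynapse;
import org.joone.engine.SigmoidLayer;
import org.joone.net.NeuralNet;

// A tiny 4-3-1 feedforward net assembled from JOONE's API instead of the GUI editor.
public class TinyNet {
    public static void main(String[] args) {
        SigmoidLayer input = new SigmoidLayer();
        SigmoidLayer hidden = new SigmoidLayer();
        SigmoidLayer output = new SigmoidLayer();
        input.setRows(4);   // rows = number of nodes in the layer
        hidden.setRows(3);
        output.setRows(1);

        FullSynapse inToHid = new FullSynapse();  // full weight matrix between layers
        FullSynapse hidToOut = new FullSynapse();
        input.addOutputSynapse(inToHid);
        hidden.addInputSynapse(inToHid);
        hidden.addOutputSynapse(hidToOut);
        output.addInputSynapse(hidToOut);

        NeuralNet net = new NeuralNet();
        net.addLayer(input, NeuralNet.INPUT_LAYER);
        net.addLayer(hidden, NeuralNet.HIDDEN_LAYER);
        net.addLayer(output, NeuralNet.OUTPUT_LAYER);
        net.getMonitor().setLearningRate(0.8); // the Monitor holds training parameters
    }
}
```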

JOONE runs anywhere Java does, so I've installed it on my Linux server and have been playing around with the source code there, trying different tricks in the builds, though most of my attempts just made the program not compile and once even impressively caused a kernel panic on my machine. I'm keeping the stable, working version running on my Mac and using my Linux server as my development test bed.

Thursday, March 20, 2008

Sigmoid Function

So with my math background, I'm always interested in functions and why they do the things they do. Since we're using a sigmoid function for our feedforward neural networks, I thought I'd investigate this function a bit.

One of the most useful things about this function is its differentiability. That is to say, its first derivative is easily expressible: you can easily figure out how the function changes as its input changes.

The sigmoid function and its derivative are

$$f(x) = \frac{1}{1 + e^{-x}}, \qquad f'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = f(x)\,\bigl(1 - f(x)\bigr),$$

where x is the net input in our neural network. So, to get the derivative of our sigmoid function, we just multiply the function's output by one minus that output. The derivative tells us the rate of change, which becomes useful because the easy calculation of f(x)(1 − f(x)) can tell us exactly how our network is changing for any given f(x).
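In code, both the activation and its derivative are one-liners; note that the derivative is computed from the output itself:

```java
// Sigmoid activation and its derivative, as used when backpropagating error.
public class Sigmoid {
    static double f(double net) { return 1.0 / (1.0 + Math.exp(-net)); }

    // f'(net) expressed in terms of the output: f(net) * (1 - f(net)).
    static double fPrime(double out) { return out * (1.0 - out); }

    public static void main(String[] args) {
        double out = f(0.5);
        System.out.println("f(0.5) = " + out + ", f'(0.5) = " + fPrime(out));
    }
}
```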

The sigmoid function is also useful because it is bounded between 0 and 1. If an equation doesn't have an upper bound, real world numbers can push the function into some strange locations that can cause some damage to the usefulness of a network.

A study of sigmoid functions will eventually lead you to some of the other types of sigmoid functions. I'd be interested to see how useful a double sigmoid function is. It still squashes, but it can be bounded between -1 and 1, maintaining our original values. A double sigmoid function essentially bonds two sigmoid functions together, and it has the useful property of providing normalization. It has some problems: it has four inflection points rather than one, so the curvature changes sign several times. However, this relates only to the second derivative; the first derivative does not change sign. The only issue is that the values might not be perfect near some of these inflection points.

Below is the formula for a specific double sigmoid function and its result in Mac OS X's Grapher application:

[Image: the double sigmoid formula and its graph in Grapher]

Note the flat point at about (1, 0). This is the "bonding" location between the two sigmoid functions.
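A similar construction (not necessarily the exact formula I graphed, but one with the same properties) bonds two sigmoids with a steepness parameter k:

$$g(x) = \frac{1}{1 + e^{-kx}} + \frac{1}{1 + e^{-k(x - 2)}} - 1$$

This is bounded between -1 and 1, and g(1) = 0 exactly, since the two sigmoids sum to 1 midway between their centers; for large k the curve is nearly flat there.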

Tuesday, March 18, 2008

Counterfactual Reasoning and rationality

So during class the other day we got into a brief discussion about whether humans are truly "rational" beings. This question came out of the discussion of counterfactual reasoning: the idea that we rationalize events mainly in two categories, "it could have been worse" or "if only...". It seems that we often reason against the facts in order to fit the way we want the world to be. Is this necessarily "irrational," though? Our memory system is not perfect, but it wouldn't make evolutionary sense for it to be. A perfect memory system would be quite resource intensive, and we'd also likely run into problems with our retrieval system having difficulty finding relevant information. Our system does an excellent job of recalling important information: information important to survival and important from an evolutionary standpoint.

I wonder, however, if memory for social activities is selected for as strongly as memory for things necessary for survival. Although there was social interaction among our evolutionary ancestors, it seems to have become much more important for humans because of language development. We can interact in social situations in ways that other species cannot. But since language is a comparatively new development in the evolutionary sense, is our memory as well suited for language as it is for other things? Our memories can sometimes cause problems in social interactions. We place a high standard on others to remember things about us and about our lives, but our memory system isn't perfect: we might not remember things about someone's children or something else in his or her life. Although we do fairly well with social interaction, our imperfect memory system can cause problems.

Saturday, February 16, 2008

Concept Modeling

So we began working on building our own cognitive neural network models using a simple spreadsheet. We're modeling the ways different patterns and features are mapped and how a neural network model might adapt to learning situations as it gains more data and begins to understand the weights of the feature nodes.

While trying to better understand this, I stumbled upon the concept of semantic relatedness: algorithms and other means for determining how close two words are in meaning. One of the more powerful measures is Google distance, which captures how related two words are in terms of Google searches. Specifically, one can quantify the idea using the number of hits for each of two search terms and the overlap between the two.

Thus, an equation has been developed for this, the Normalized Google Distance:

$$\mathrm{NGD}(x, y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x, y)}{\log M - \min\{\log f(x), \log f(y)\}}$$

where M is the total number of pages Google indexes, f(x) and f(y) are the number of hits for the corresponding search terms, and f(x, y) is the number of pages containing both terms.

As I look at this kind of model, I wonder if our mind works similarly, in that we do a proverbial Google search and see how related two words or concepts are by the number of "results" (categories) they fall under and make a judgment based on the kind of overlap there is. I think I'll want to explore semantic similarity and related concepts more.
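Given the hit counts, the distance itself is easy to compute; a quick sketch (the hit counts in the example are made-up numbers, not real Google results):

```java
// Normalized Google Distance computed from raw hit counts.
public class NGD {
    static double ngd(double fx, double fy, double fxy, double M) {
        double logFx = Math.log(fx), logFy = Math.log(fy);
        return (Math.max(logFx, logFy) - Math.log(fxy))
                / (Math.log(M) - Math.min(logFx, logFy));
    }

    public static void main(String[] args) {
        // Hypothetical counts: f("horse"), f("rider"), f("horse rider"),
        // and M = 8 billion indexed pages.
        System.out.println(ngd(4.67e7, 1.22e7, 2.63e6, 8e9));
    }
}
```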

Concepts: Representing knowledge

The next section reviews the idea of concepts and representing knowledge. Although the book focuses on social knowledge, it's evident that other types of knowledge can be represented. In fact, it's easier for us as psychologists to conceptualize knowledge about things than knowledge about people: it's easier to think of the category "chair" than of something like "extrovert."

Concepts are the basis for cognition. Our cognitive processes would be highly inefficient if we had no way to make inferences about an object based on its categorization. When I see a pen, I don't have to spend time trying to ascertain what it does and how to use it. Even if I've never seen this particular pen before, my experiences with other pens tell me that it's likely to be useful for writing and I would be familiar with how to hold it.

This makes life a great deal simpler than if we didn't have a way to categorize and infer things about objects. It gets messier with people, however. Stereotypes are an excellent example. Even though most of us don't like to think of ourselves as constantly stereotyping people and making inferences based on initial appearances, I imagine that most of us do. Again, life would be harder if we didn't. We may talk to someone for a few minutes, make some generalizations about how they behave, and change the way we act around them accordingly. If I talk to someone for a while and they misuse grammar and seem to have a limited vocabulary, I'm not likely to strike up a conversation about math. However, a lower level of English ability certainly doesn't imply a low level of math ability. Yet I'm still likely to put such a person in a category of "unable to understand higher math" just because of a subject-verb disagreement!

We're not on the chapter about stereotypes yet, but I'm very interested to learn more about them and how we use them in both positive and negative ways.

Friday, February 15, 2008

Introduction: Social Cognition

The name of the primary text for this course is Social Cognition: Making Sense of People by Ziva Kunda. The first chapter gives an overview of the book and gives a bit of an overview to social cognition. It makes the point that the notion that motives can influence our beliefs lies at the core of social psychology theories. This point is very interesting to me, especially as someone who has studied pastoral theology and religious theology in general. Why do we believe what we believe? Do we believe what we believe because it conflicts the least with our preconceived ideas? Or is it that we want to believe what makes sense?

Part of what is interesting about social cognition is that it tries to make sense of social events. Why does one person react completely differently from another to the same set of given circumstances? How does our mood affect how we perceive and remember things?

I'll be very interested to learn about modeling in this course. How can the mind be understood in a mathematical, testable way? Can we set up an algorithm that can simulate the way we categorize and learn? I'm a double major: psychology and mathematics, and it's always interesting to see bridges between the two. I enjoy watching the show Numb3rs because it often shows fairly good mathematics probing into the human psyche. It's amazing that what we do is often understandable in mathematical patterns. What we see as random behavior can often be predicted when given enough data.