Since this is a journal you may find starting from earlier articles helpful. I have covered a bit about the science, the FoldIT user interface, GUI recipes, and Script recipes. If you give me hints where I could be helpful I will focus in that general direction at my discretion. Currently I am going through the basic concepts of LUA script recipes. Once I get past introductory LUA scripting I can start exploring the science of folding proteins by using LUA scripts.

Monday, November 15, 2010

It's been a while. I feel the need to post something.

My early inclination to believe in rigged wiggles was a bit off.  It turns out that when moving segments further apart their sidechains have less and less interaction so the score changes smoothly.  When one moves the segments back together the sidechains interact more strongly and the backbone has to overcome this interaction, so the scores jerk and may even go negative before going more positive.

I know this isn't much. 

I have published two scripts, rebuild worst and minima finder. 

Use rebuild worst when you are tempted to use a rebuild routine and want to focus on the worst scoring segments.  If you convert your protein's secondary structure to loops it will have greater freedom.  Wiggle worst leaves your secondary structure as it is.

Don't use minima finder without understanding what it does.  It first tries to rebuild a segment then, if that fails, it wiggles it and its neighbors.  It doesn't find the best solution, just the best it can find quickly.

I thought foldit used pass by value.  It passes tables by reference.  This means one needs to build one's own copy function for tables.  Rebuild Worst has the copy function in it.  Since Foldit doesn't implement foreach I can't create a generalized routine that works with string indexes.
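The sharing behavior and a simple copy can be sketched like this. It is plain LUA that has not been checked inside the FoldIt recipe editor, and the name copyTable and its explicit count argument are assumptions made for illustration; the copy function in Rebuild Worst may look different. It only walks integer indexes 1 through n, which matches the limitation described above, and it does not copy nested tables.

```lua
-- Assumption: plain Lua; copyTable is a hypothetical name, not the routine from Rebuild Worst.
-- Makes a shallow copy of the integer indexes 1..n of table t.
function copyTable(t, n)
  local result = {}
  for i = 1, n do
    result[i] = t[i]
  end
  return result
end

a = {10, 20, 30}
b = a                 -- b names the SAME table as a
b[1] = 99
print('a[1]=', a[1])  -- a[1] changed because a and b share one table

c = copyTable(a, 3)   -- c is a separate table
c[2] = -1
print('a[2]=', a[2])  -- a[2] is untouched by the change to c
```

If the rows of a table are themselves tables, each row would need to be copied the same way, otherwise the copy still shares its rows with the original.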

Rebuild Worst also has a bubble sort routine in it that will sort a table of tables on any column.  I read from the end towards the beginning to get ascending scores from worst to best.
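A bubble sort over a table of tables can be sketched as below. This is not the routine from Rebuild Worst; the name bubbleSort, its arguments, and the sample scores are made up for illustration, and this version sorts ascending directly rather than being read from the end.

```lua
-- Assumption: plain Lua; bubbleSort is a hypothetical name and signature.
-- Sorts rows[1..n] in place so that rows[i][col] is in ascending order.
function bubbleSort(rows, n, col)
  for pass = 1, n - 1 do
    -- after each pass the largest remaining value has bubbled to the end
    for i = 1, n - pass do
      if rows[i][col] > rows[i + 1][col] then
        rows[i], rows[i + 1] = rows[i + 1], rows[i]
      end
    end
  end
  return rows
end

-- each row is {segment index, score}; the numbers are invented sample data
scores = { {1, 12.5}, {2, -3.0}, {3, 40.1}, {4, 7.2} }
bubbleSort(scores, 4, 2)   -- sort on column 2, the score
for i = 1, 4 do
  print('segment', scores[i][1], 'score', scores[i][2])
end
```

Passing the column as a parameter is what lets the same routine sort on the segment index or on the score.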

Tuesday, October 19, 2010

FoldIt and frustration.

OK, I've taken nearly two weeks off.  At times I'm a bit disappointed with FoldIt.  The scripts depend upon smooth processing over accounted actions.  The FoldIt wiggle and shake actions use iterations rather than time to control their behavior and yet their behavior is strange.  For the last couple of weeks they have behaved as if unfair.  I've been concentrating on the display to see if I can account for their behavior based upon the specifics of the protein structure.  I wanted to concentrate on folding rather than verifying a fair shake.  The abruptness of the change in the behavior of shake and wiggle leaves a sour taste in my mouth.  Likewise, recipes that used to restore to recent best (the score upon starting the recipe) seem to restore to 30 points less than that even though higher scores have been achieved during their running.  FoldIt is supposed to be about the science while disguised as a game but if something else is going on I'm not so interested in the game and can just move on to other pure games where I don't feel the need to employ my programming skills.  I use them all day to earn a living.  I don't find programming as fun as I once did.  I certainly don't like defensive programming against antisocial behaviors.

There's a strange focus on security.  Many files are free text but digitally signed.  This includes one's saved password to the FoldIt account.  I'm not sure what's going on. 

I'm not sure if LUA or the FoldIt implementation is responsible for the lack of static variables.  The lack of static variables causes properties to be exposed outside the use of methods.  While I've been mentally using objects for decades now I haven't done much programming in object oriented languages.  It's going to be hard to implement a consistent style.

Since FoldIt doesn't implement file IO I can't gather statistics and save them.  Instead I have to use a hobbled output window that doesn't allow copy to display gathered information and hand type it into the scripts.

Early on I thought having objects passed by value would be a good thing but now I'm not so sure.  It seems to me that inherited methods might best be implemented as pass by reference.  I'm not looking forward to using something like this just because FoldIt hobbles LUA:

--test
classes={
  xclass={
    y='testing',

    tostring=function()
      -- the class tostring method
    end
  }
}
x={
  class='xclass',

  tostring=function()
    -- the local tostring method
  end
}
print(classes[x.class].y)
-- call the object's own method if it has one, otherwise look it up in its class
function method(obj,meth)
  if obj[meth] ~= nil then
    return obj[meth]()
  elseif obj.class == nil then
    return nil
  else
    return method(classes[obj.class],meth)
  end
end
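
Here is a small self-contained sketch of the same lookup idea with methods that return something, so the fallback from object to class can be seen. The names shapes, circle, and call are made up for this example, and it is plain LUA that has not been tried inside FoldIt.

```lua
-- Assumption: plain Lua; shapes, circle, and call are hypothetical names for this sketch.
shapes = {
  base = {
    describe = function() return 'described by the class' end,
    area = function() return 'area from the class' end
  }
}
circle = {
  class = 'base',
  area = function() return 'area from the object' end
}

-- look on the object first, then follow its class name into the shapes table
function call(obj, meth)
  if obj[meth] ~= nil then
    return obj[meth]()
  elseif obj.class == nil then
    return nil
  else
    return call(shapes[obj.class], meth)
  end
end

print(call(circle, 'area'))      -- the object's own method wins
print(call(circle, 'describe'))  -- falls back to the class
print(call(circle, 'missing'))   -- found nowhere, so nil
```

Note that the methods here take no arguments, so they cannot see the object they were called on; passing obj along as a parameter would be the next step.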

It won't be easy if I don't trust the game to be fair.

Thursday, October 7, 2010

FoldIt Script Collections or Tables.

LUA refers to "Tables" where I refer to Collections and have referred to Structures.  A LUA collection contains an indexed set of objects.  The index can be either an integer or a string.  The index can be specified or assumed during construction.  A collection is constructed by a paired set of braces ({ and }).  The collection can be constructed empty or with objects in it.  Here is an empty constructor:

  x={}

The index is identified by a paired set of brackets ([ and ]).  The index will be evaluated during assignment.  A bare name used as an index in a constructor is equivalent to that name quoted inside brackets.  When elements are indexed during construction the elements must be separated by either a comma (,) or semicolon (;).  Here is the construction of a collection with named indexes:

x={[1]='first',[2]='second',['third']='value indexed by "third"',forth='value indexed by "forth"'}

A constructor can span lines.

During construction elements whose index is not specified are assigned to sequential integer indexes starting from 1.

A string index can be referenced using "dot" notation or by using normal bracket notation.  These produce the same results:

print(x.forth)
print(x['forth'])

After construction elements can be added to the collection by placing the collection and index on the left side of the equal sign (=) in an assignment statement.  Remember that the index is evaluated prior to use.  Here is a small script you can try and modify to prove your understanding:

a=4;
x={
  [1]='lookout index 1 will be overlayed',
  'implied index of 1',
  'implied index of 2',
  x='"x" masquerading as a variable';
  ["a"]='indexed by "a" can be referenced using dot notation';
  [a]='indexed by the value of a, in this case 4'
};
print('x[1]=', x[1])
print('x[2]=', x[2])
print('x["x"]=', x["x"])
print('x.a=', x.a)
print('x[4]=', x[4])

I hope you are beginning to get an idea about how FoldIt builds scripts.  A script is a collection with some special syntax.  Running a script means to evaluate the integer indexes in order.

Next time I'll cover functions and start to move into writing maintainable programs.

Comments, Variables, and Constants.

Programs can be quite mysterious.  Programmers try to help each other by putting comments into their code.  LUA has two varieties of comments.  The first variety starts with two dashes (--) and ends when the line ends.  The second variety starts with two dashes and two open brackets (--[[) and ends with two close brackets (]]).  I write the ending as ]]-- but the trailing dashes just start an ordinary single line comment.  Here's a small script you can run to see the behavior of comments:

  print('before dashes')  -- print('after dashes') anything I put here is a comment
  print('new line after comment')
  --[[ print('line 1 of comment')
  print('line 2 of comment') ]]--
  print('after the second comment')
  --[[ the normal use of multiline comments is to temporarily disable code
         or make multiline comments when one doesn't want dashes at the beginning of the line.
  print('this is the normal use of multiline comments')
  ]]--
  print('after third comment')

Trying code helps me verify my understanding.  If you aren't sure you understand comments please create a new script, copy the text from above into it and run the script.  Then modify it to prove your understanding.  If you put something the computer doesn't understand outside a comment the recipe output will show an error.  I added this line to the end of the script above:

this throws an error

The recipe output window displayed:

ERROR:[string "print('before dashes') -- print('after dashes') anything I put..."]:20: '=' expected near 'throws'

The error message displays the first line of the script followed by a colon followed by the line number on which the error happened followed by a colon followed by the computer's best guess as to the problem.  The computer is seldom right about the cause of the error but is always right that there is an error in the program.  In this case the computer is assuming the programmer wanted to assign an object to the variable "this".

Last time I said Variables name objects.  LUA has several ways to assign a name to an object, the most basic of which is an assignment statement.  An assignment statement has the variable on the left of an equal sign (=) and the definition of the object on the right of the equal sign.  A variable can be redefined as often as one wishes.  Here are a few somewhat nonsense examples you can test:

  --exploring the assignment statement
  x='a string constant'
  x = 'the equal sign can have zero or more spaces on either side of it'
  x=3 -- numbers can be integers or reals.  3 is an integer constant
  x=.01 -- .01 is a real constant.
  y= x * 100 -- * is the LUA infix multiply function.  The definition is evaluated and the result assigned.
  print('x is ', x, ' and y is ', y)
  z=print -- you can assign your own name to functions
  z('yes this will print')
  aStructure = {} -- structures are complex enough to warrant their own article.

OK, I didn't get to data types this morning.  I'm not sure of my audience so I'm going into greater detail than I otherwise would.  As I said last time, one has to power through some basic concepts even if they are a bit confusing at first.  The best way to learn is to try things out and see what they do.  Try the code on your own and modify it based upon your understanding.  If you create a new script it will be unnamed.  I have a recipe named test I modify and run.  Remember to set info and save before running a modified recipe.

I've mentioned assignment to strings, integers, reals, and functions.  Next time I'll go a bit deeper into structures and reintroduce the concept of scope.

Have fun folding.

Tuesday, October 5, 2010

On LUA variables and the scope of variables.

OK, programming has some pretty abstract concepts.  Sometimes you just have to power through them.  They make sense once you get the hang of them. 

Variables are the names of objects.  In many computer languages some variables name the memory containing the values and some variables name the memory containing a pointer to the memory containing the values.  So far my tests indicate LUA variables name the memory containing the values and that no two variables will ever name the same memory.  The Computer Science people call this "pass by value".  When a language uses variables to name the memory containing a pointer to the memory containing the values and can make two variables point to the same memory locations the Computer Science people call this "pass by reference".  It doesn't matter if LUA actually names pointers to memory containing the values or names the memory containing the values.  What does matter is that one can't change the value by using one name and have it change the value of another variable.

While variables are the names of objects, multiple objects can have the same name as long as the computer can decide which is being named at any given time.  The contexts under which any particular variable can be identified by name are its "scope".  Identifying the scope of a variable can be daunting.  Further, using a variable name improperly can introduce bugs.  To limit the scope of a variable one identifies it as local.  This program:

  function y()
    x=2
    z()
    print ('in y, x=',x)
  end
  function z()
    x=3
  end
  y()

prints "in y, x=3".

This program:

function y()
  local x=2
  z()
  print ('in y, x=',x)
end
function z()
  x=3
end
y()

prints "in y, x=2".

The same will happen if I define x as local to function z.
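
That case can be sketched like this. It is the same program with the local moved into z; the only addition is that y returns x so the value can be kept in a variable named result, a name chosen just for this example.

```lua
function y()
  x = 2              -- this x is global
  z()
  return x           -- still the global x
end
function z()
  local x = 3        -- a different x that exists only inside z
end
result = y()
print('in y, x=', result)
```

The local x in z hides the global x for the rest of z, so the assignment of 3 never reaches the x that y sees.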

Next time I'll go into types of variables.

Monday, October 4, 2010

Beginner LUA script concepts.

Foldit calls user programs "recipes".  You find them in the cookbook.  Computer programs are like detailed recipes.  Most people know you add liquids to solids so you don't get lumps and that you add the liquids slowly and stir them in.  Some people wouldn't know that so the recipe would have to give the steps and ingredients in order with appropriate quantities.  Well, computers are like beginner cooks.  They need to have every step spelled out in full detail and with the right syntax.

One can start writing LUA scripts using the ingredients known as the built in FoldIt functions.  Here's a link to the FoldIt functions on the FoldIt Wiki.  Some computer languages are case sensitive and some aren't.  LUA is case sensitive, that is "X" and "x" don't reference the same thing.  Here's a simple script:

  print('shake and wiggle')
  do_shake(2)
  do_global_wiggle_all(2)
  print('done shaking and wiggling')

That can be done manually.  Scripts get much more complex.

In addition to the built in functions one can define one's own functions and use them just like the built in functions.  For instance:

  function ShakeAndWiggle(x)
    print('shake and wiggle for ', x, ' iterations.')
    do_shake(x)
    do_global_wiggle_all(x)
    print('done shaking and wiggling.')
  end
  ShakeAndWiggle(2)
  ShakeAndWiggle(4)

When I defined the function ShakeAndWiggle I told LUA that I would pass it a parameter.  Some languages require you to say what type of parameter is to be passed.  Sometimes one wants an integer, sometimes a real, and sometimes something else.  LUA doesn't require you to specify the type but if you pass the wrong type you're going to have problems when the parameter is used.  The functions do_shake and do_global_wiggle_all require integers so if you give them a string the program will blow up.
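
One way to guard against a wrong type is to check the parameter before using it. The sketch below assumes the standard LUA function type() is available; whether FoldIt's restricted LUA exposes it is not confirmed here. The name SafeShakeAndWiggle is made up, and do_shake and do_global_wiggle_all are replaced by do-nothing stand-ins so the script can run outside FoldIt.

```lua
-- Assumption: type() is standard Lua; it may or may not exist in FoldIt's restricted LUA.
-- The two functions below are stand-ins, NOT the real FoldIt functions.
-- Delete them before trying this inside FoldIt or they will replace the built in ones.
calls = 0
function do_shake(n) calls = calls + 1 end
function do_global_wiggle_all(n) calls = calls + 1 end

-- SafeShakeAndWiggle is a hypothetical name for this sketch
function SafeShakeAndWiggle(x)
  if type(x) ~= 'number' then
    print('expected a number of iterations but got a ', type(x))
    return false
  end
  do_shake(x)
  do_global_wiggle_all(x)
  return true
end

ok1 = SafeShakeAndWiggle(2)       -- a number, so both actions run
ok2 = SafeShakeAndWiggle('two')   -- a string, rejected before any action runs
```

Returning true or false lets the calling code decide what to do when the parameter was bad instead of letting the recipe stop with an error.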

As it turns out the parameter "x" in the prior example is a special case of a variable.  Variables let one use names of things rather than their values.  In LUA you use an equal sign to assign  a value to a variable.  Here are some examples:

iterations=4
mytext='Shake and Wiggle'
delta=.01

Try writing some scripts using some of the simple built in functions.  Remember to keep the capitalization you see in the documentation and include the right underscores in the right places.

Next time I'll go into a bit more detail about variables.

Saturday, October 2, 2010

Using LUA scripts in FoldIt.

Unlike GUI recipes Script recipes look like regular computer programs.  FoldIT doesn't implement the full LUA language.  If it implemented the full language the shared recipes could contain viruses that could infect your computer.  Nonetheless it's a pretty powerful language giving you the same controls as the GUI recipes plus some extra stuff that lets you do different things at different times.  Here is the link to the FoldIt functions.  You can get the complete list by using the function help.

The LUA editor isn't very good.  Most script writers do their work in the editor of their choice then copy and paste into the LUA editor window.  The only hot keys I know for the LUA editor are:
  • ctl-A -- Select All.
  • ctl-X -- Cut Selected.
  • ctl-C -- Copy Selected.
  • ctl-V -- Paste from Clipboard.
Windows uses the pair of control characters Carriage Return, Line Feed to separate rows.  This is a holdover from MS-DOS days.  Unix just uses the Line Feed.  When you copy from Notepad into the LUA editor the Carriage Return is displayed as a box with an x in it.  LUA ignores it but it looks ugly.  If you see the box with x in it at the end of the line you can copy from LUA directly into Notepad but if you don't see it you have to copy from LUA into Wordpad.  Wordpad is smart enough to recognize the Line Feed as a record delimiter and automatically inserts the Carriage Return.  I don't like programming in Wordpad so I copy the text from Wordpad into Notepad.  I do my programming in Notepad then use ctl-A ctl-X in the LUA editor to remove the old code then copy and paste the new code from Notepad into the LUA editor.  Before I can save the revised script in FoldIt I have to use Set Info, making changes to name or description or leaving it alone, then Save.

Next time I'll go into some actual LUA code.  You can always bring up the code behind existing scripts and review how others are doing things.  Learning from others is a good way to learn.

Have fun.  Let me know how I can be more helpful.

Getting started with your own recipes

Recipes are just a way to automate manual steps one uses in the game.  The FoldIT programmers have seen fit to add both a GUI (Graphical User Interface) and Script recipe language.  The GUI version is more limited but is a good place to start for those who aren't programmers.  I've programmed for years so won't go into much detail about the GUI recipes.  When you insert a step you will be given hints on how to configure it.  Once you've created your recipe you will want to add it to your cookbook.  Save As lets you name your script and give it a description.  Before you can use Save you have to Set Info.  If you want to modify your recipe you will need to load it then modify it then set info then save it.  It may seem strange but it works.  I'll go into script writing this afternoon.  Sorry for the delay.

Wednesday, September 29, 2010

The Basic FoldIt Controls and the Way Nature Folds Proteins

When playing FoldIt you have some controls you use all the time and some you use less frequently.  The Actions you use all the time are Shake Side chains, Wiggle All, Wiggle Backbone, Wiggle Side chains, Unfreeze Protein, Remove Bands, Disable Bands, and (unshown) Enable Bands.  In addition to these Actions, I have demonstrated a frozen segment and a band between a segment and space.  You can also band and freeze side chains.  I have also brought up the local action popup menu where you can Freeze or Tweak the structure (if the segment is part of a helix or sheet) or Rebuild, Shake, or Wiggle the protein as bounded by frozen segments.  The Behavior menu item brings up a slider for Clashing Importance.  Some puzzles let you Mutate Side chains.  Some puzzles have ghost guides of the native protein fold.  Some puzzles allow the protein to be threaded by aligning it against other proteins with known folds.  The Modes are Pull, Structure, Note, and Design.  I play most of the game in Pull mode.  Structure mode lets you indicate a segment as being part of a loop, helix, or sheet.  Note mode lets you add comments to segments so you can document what you want to do and where.  When a Puzzle lets you mutate the protein, Design mode lets you change the amino acid at a particular segment and insert or delete segments in puzzles that let you change the number of segments.

Since the idea of the game is to help science these controls need to be analogs of natural processes.  When you Google Protein Shape you find some interesting articles about why proteins take the shapes they do.  When I ran the query the first page returned was to The Rules of Protein Structure by J Kimble, the second was to The importance of protein folding by Joachim Pietzsch.  Later I found How Proteins Get In Shape, an unattributed article on the Pittsburgh Supercomputing Center Website.  A scan of the cited articles and others you find from the search should give you an idea about the shake and wiggle actions and even the clashing importance.  The freeze and band controls are a bit more mysterious.  For the most part they take on the role of all the natural processes too complex to individually model with today's technology.

You can help science by more than just playing the game.  You can record the techniques you use in "Recipes" and share them with others.  In a previous article I touched on recipes.  I will return to them in my next article.

Why Humans Fold Proteins.

OK, I guess I need to step back for a moment. 

Some wonder how we are helping science by playing a game.  It seems like we aren't doing anything a computer can't do faster.  Nature doesn't need us to fold proteins.  Empirical science looks at nature to inform theory.  Just how are we helping?

This is the solution to the first introductory puzzle.  All one has to do is move one of the side chains away from the other.  Once one gets past this first introductory puzzle the Actions menu has a button, "Shake Side chains", that will do it automatically.  It's a simple move.  Just how are we helping?

As it turns out, even simple proteins are quite complex.  Proteins are composed of a chain of amino acid residues.  There are twenty standard amino acids encoded directly by the genetic code (twenty two if one counts selenocysteine and pyrrolysine) and many non-standard amino acids.  Living cells use the process of protein synthesis to build the various proteins of life encoded in DNA.  Science hasn't learned how to predict the way any particular protein will fold and the process is computationally complex.

When humans play FoldIt they are using idealized versions of natural processes to fold models of proteins.  The game scores the folded protein based upon the energy left in it where the lower the energy the higher the score.  Intermediate positions are recorded and the sponsors of the game can use this to improve their model of how proteins fold.  We are solving a problem in what appears to be fewer steps than mathematics predicts it should take.  Being able to replicate that process even if it isn't understood will make designing proteins to fight diseases a reality.
 

Monday, September 27, 2010

Writing foldit scripts.

FoldIt scripts are simply a way to automate one's manual actions.

A simple script to do a global wiggle for 20 iterations is:

do_global_wiggle_all(20)

By adding other functions you can combine other steps.
I'll write more about this tomorrow.

Sunday, September 26, 2010

Playing FoldIt takes time

OK, I've been playing FoldIt for nearly two months now.  I'm doing pretty well even if I'm still learning.  It's hard separating game playing from the science.  I'm mostly using scripts to get points and letting the science go except as needed to gain points.

Beginners can get a good score on many puzzles by using:
Show Alignment
Select best.
Repeat:
  Shake Sidechains
  Short burst of Wiggle Backbone
Repeat:
  Wiggle sidechains
  Slightly longer burst of Wiggle Backbone
Run Tlaloc's script Hydrophobe
Run Tlaloc's script Cataclysm
Run Tlaloc's script Repeat Settle
Run Rav3n_pl's Walkin' Rebuild

Some of the science comes later.

Friday, September 24, 2010

Getting Started with FoldIt

I learned about FoldIt from the University of Washington UWeek online magazine.  The article talked about a new article published Aug. 5 in the journal Nature describing how game playing can help science.  I thought I'd give it a try.  It's not like other games but it's fun and there is a lively community of people.  I've been folding since then.  You can see my player entry and follow my progress as I learn the ropes and share them with you.  If I can fold proteins and help science while playing a game so can you.