Author Archives: david

How to use a Netgear Nighthawk R7000 with Vodafone NZ fibre

After Chorus connects your fibre optical router to the wifi modem that Vodafone provides, and ensures it’s working, you can switch to using your own, better Nighthawk modem this way:

  1. Detach the red plug from Vodafone’s modem and plug it into the Internet port on the Nighthawk
  2. Follow the instructions in your Nighthawk manual and connecting power, etc.
  3. In a browser, visit routerlogin.net (This is an address that the Nighthawk interprets as referring to itself; it’s not a machine on the Internet, so it will work if the modem is on but you don’t have an Internet connection yet.)
    1. If you can’t connect, follow the instructions about using an Ethernet cable to connect your computer to the Nighthawk
    2. The default login is admin with password password. But if you have house-sitters, you probably want to change this. I recommend using LastPass to keep track of passwords.
    3. If prompted whether to use the Netgear Genie software or a wizard to connect to the Internet, answer yes.
  4. In Genie:
    1. To connect to the Internet:
      1. Select: Advanced / VLAN-Bridge settings / Enable VLAN / Enable VLAN tag / VLAN ID = 10 / click Apply
      2. Note that if you ever use a paperclip to press the recessed reset button, you will need to do the step above to reconnect.
      3. I got this tip from https://community.vodafone.co.nz/t5/Broadband-services/Trying-to-connect-a-new-router-to-the-ONT/td-p/183755
    2. To change the wifi name or password:
      1. Select: Advanced / Setup / Wireless setup / 2.4GHz name ssid = NETWORK NAME YOU WANT / security = wpa2-psk / password = PASSWORD YOU WANT Then do the same for 5GHz but add something like -5G to the name. Click Apply to restart router.
      2. Note that this changes the name of your wifi as it appears on your computer and for any of your guests. It doesn’t change the name and password that you use to access the modem’s settings. If you’ll have a house-sitter that you want to prevent from changing modem settings, follow the steps below.
    3. To change the password for modem settings:
      1. Select: Advanced / Administration / Set Password
      2. Enter password as Old Password
      3. Use LastPass to generate and record a New Password. I recommend saving this to a folder in LastPass that you’ve shared with any other household members that might need to access settings.
      4. NetGear recently advised against using the Enable Password Recovery checkbox for security reasons. If you do lose your password, press the recessed Reset button and redo all of the steps on this page.
      5. Click Apply
      6. When prompted to login again, enter admin as the user and get the password out of LastPass
    4. To set the clock to NZ (which is necessary if you’re going to use a schedule below, but probably not important otherwise):
      1. Select: Advanced / Security / Schedule / Timezone = NZ GMT+12. Adjust for DST = on. Click Apply to restart router.
    5. To turn wifi on and off according to a schedule:
      1. Select: Advanced / Advanced setup / Wireless settings / Wireless advanced settings for 2.4GHz / Turn off by schedule = on / Add new period / start=11pm end=7am daily / click Apply. Then do the same for 5GHz. Click Apply to restart router.
      2. You might need to set both “Turn off by schedule = on” again after updating the schedules.
    6. Logout of Genie and close the browser tab it was in.
  5. Disconnect any Ethernet cable.

The 6 Evolutionary Stages of Chatbot AI

Haeckel_Actiniae

[I posted this originally over on medium.com. I’m archiving it here.]

Suddenly this year, there are ‘conversational interfaces’ and ‘chatbots’ on offer from companies big and small. Many claim to be breaking new ground, and that’s surely true in some cases. For example, Google’s recent release of SyntaxNet uses Deep Learning in a new way to achieve the greatest accuracy ever attained for syntactic parsers. The release of that tool allows companies whose chatbots have reached a certain stage of development to switch out their own syntactic parser and focus even more intently on the really interesting problems.. But how can we assess the smartness (i.e. potential usefulness) of these bots?

Many of the techniques used in these bots are new to industry but have been known to AI researchers for a decade or more. So an AI researcher can make pretty good guesses about how each bot works based on what features are promoted in its demos. What I’m going to do in this article is to give you some tools for making the same insights.

Let’s adopt a guiding metaphor that bot intelligence follows a natural evolutionary trajectory. Of course, bots are artifacts, not creatures, and are designed, not evolved. Yet within the universe of artifacts, bots have more in common with, say, industrial design than fashion, because functionality determines what survives. Like the major branches of Earth’s genetic tree, there are bundles of features and functionality that mark out a qualitative change along the trajectory of development. How many stages are there in the evolution of chatbot AI? Based on my years of research in NLP and discussion with other experts, my answer is six. Let’s look at each.

Stage 1, The Character Actor: Bots at this stage look for key phrases in what you type (or say, which they process with speech recognition) and give scripted responses. The primary effort that goes into these bots is making the scripted responses entertaining so you’ll want to continue the interaction. But because they only look for key phrases, they have a shallow understanding of what you’re saying and cannot go in-depth on any topic, so they tend to shift topics frequently to keep the interaction interesting. Bots at higher stages need to provide interesting interaction also, so it’s not a good idea to skip stage 1 entirely. There are no really good techniques at later stages to convey personality, emotion, or to be entertaining.

Stage 2, The Assistant: Bots at this stage can do narrow tasks they’ve been explicitly programmed to do, like take a pizza order or make a reservation.Each kind of task is represented with a form (i.e., a tree of labelled slots) that must be filled out so the bot can pass the form values to a web service that completes the order. These bots use rules similar to stage 1, but the rules have been split down the middle into two sets of rules. One set looks for key phrases in what you type, as before, and use parts of what you write to fill in the form. The other set checks which parts of the form still need filling and prompts for answers, or when the form is full, calls a web service.

This form-filling design has been part of VoiceXml, a technology that underpins automated phone services, since the early 2000s. And it’s been in AI research systems since the late 90s. Recent work in machine learning and NLP certainly makes this design work better than those old systems do.

Stage 3, The Talking Encyclopedia: Bots at this stage no longer require a predefined form but instead build-up a representation of what you mean word-by-word (aka ‘semantic parsing’). This is the stage that Google’s SyntaxNet is designed to help with, because these systems usually try to identify the ‘part of ‘speech’ of each word and how they relate to each other first, as a guide to extracting meaning. SyntaxNet can do the first step, leaving the hard, interesting work of extracting meaning that distinguishes bots at this level.

A common use for these bots is for looking up facts, such as the weather forecast two days from now, or the third-tallest building in the world, or a highly rated Chinese restaurant nearby. The way they make this work is that they build a database query as part of doing the semantic parse, leveraging the syntactic parse result. For example, nouns like ‘restaurant’ can be indexed to a database table of restaurants, and adjectives like ‘Chinese’ and ‘nearby’ add constraints for querying that database table.

Stage 4, The Historian: Bots in all the stages so far aren’t good at understanding how one of your sentences connects to another (unless both sentences are focused on a common task). Bots can’t even do a good job of knowing what you’re talking about when you say ‘he’ or other pronouns. These limitations are due to not having a usable record of how the exchanges have gone and what they’ve been about. What distinguishes a bot at stage 4 is that it maintains a ‘discourse history’.

In addition to those benefits, research suggests that this is the best stage to integrate gesture understanding with language understanding. The reason gesture understanding fits at this stage is that pointing gestures are a lot like pronouns, and more complex demonstrative gestures (like miming how to get a bottle cap off) build on pointing-like gestures similar to the way sentences build off pronouns.

Remember how last year’s buzz was about virtual and augmented reality? In the near future the buzz will be about giving chatbots a 3D embodiment which can understand you just from your body language. After all, conversational interfaces should be easier to use than others because you just talk, as one does in everyday life. If you reduce the amount of talk needed to just what people use with each other, that’s really smart/useful.

Stage 5, The Collaborator: The Assistant in stage 2 recognises requests that match the tasks it knows about, but it doesn’t think about what you’re ultimately trying to achieve and whether it could advise a better path. The best example of such a bot to be found in AI research is one that, when asked which platform to catch a certain train, answers that the train isn’t running today. It’s certainly possible to get bots at earlier stages to check for specific problems like this one, but only stage 5 bots have a general ability to help in this way.

Bots at this stage seem to fulfill the ideal of the helpful bot, but it’s still not what the public or researchers dream of. That dream bot is found in the next stage.

Stage 6, The Companion: This is a mythical stage that all bot creators aspire to but which only exists so far in fiction, like the movie Her. Ask yourself what is it about the computer character that people wish were real. I believe her empathy and humor are part of it, but that her core virtue is that she understands his situation, and what could help him through it, and she does it. Would a real bot require any technological advance beyond stage 5 to help in similar ways? It seems not. What bots do need to reach this stage is a model of everyday concerns that people have and how to provide counsel about them. That core skill, leavened with humor from stage 1, could provide something close to companionship.

Could a bot be a good counselor? It’s certainly a serious question, because the elders, teens, parents, and anxious wage-earners of today and coming years are outstripping available professional care. Conversational AI has potential to deliver real benefits beyond simplifying interfaces. Let’s work with professional counselors and make it happen.

I hope this article has provided a way for us all to identify the strengths in different chatbots and talk about them with workable shorthand terms. It would be great to see discussions about some particular chatbot being strong in stage 4 but boring because it neglected stage 1, for example. That would give everyone a chance to influence the directions of this very promising technological advance.

Workshop: Intention recognition in Human-Robot Interaction 2016

This workshop just ended, but information about it can be found at http://intentions.xyz/

Like many areas of AI, intention recognition has no established benchmarks for measuring progress. I presented a paper proposing eight design objectives that cognitive models of intention perception should try to fulfill, although these are meant to stimulate discussion toward some kind of benchmark rather than claiming to be complete solution.

An alternate approach would be to select a set of training and test data, following the example of some NLP conferences. But the amount of annotation effort needed alone, not to mention coming to agreement on it and documenting it, would be tremendous.

Whichever path toward quantifying progress we might take, I also proposed that we take-on the problem of helping to care for the huge number of elderly people in their homes, nursing facilities, and hospitals. There just aren’t enough human care-givers to deal with this imminent worldwide issue. It could be a great fit for the social robotics research community, and could bring funding and more researchers into the field while having one of the most socially-relevant impacts of any AI project.

How to setup Android tablets for offline Amazon video and browser whitelisted parental control

These instructions are for Android 4.4 on two Lenovo TAB 2 tablets bought in the US. I’m also using a 32 GB microSDHC card for each so we can download more.

Aims

  • Allow videos that were bought for watching on Amazon’s website to be watchable while offline (e.g. on a plane)
    • Caveat: There is no option in this app to password-protect all sections other than Downloads in order to keep children from watching trailers and such. There are parental controls that can be enabled, and we did, but this leaves the choice of what’s appropriate to Amazon’s editors.
  • Allow children to access websites that we choose and no others
    • Caveat: We installed an app named Kiosk Browser to do this, but it’s not perfect. Mostly, it’s designed to be the only app that non-admin users of the tablet will be able to use, while we want to allow access to Amazon Video also. Also, if you use the hardware power button while Kiosk Browser is running, which should be a normal thing to do, after startup Android will prompt which launcher app you want to use instead of remembering if you don’t want it to be Kiosk Browser (so that your kids can get to Amazon Video easily)
    • We just learned of Android Chrome’s support for Supervised Users, and this might be a better solution
  • Disable access to all installed apps, including Settings and Play Store, except for Amazon Video, a browser, and maybe Skype for calling relatives
    • Caveat: As of the end of 2015, neither Android Settings nor the Android Play Store offer a way to set a password for access to them specifically, and I could not find any app that I would trust to do this. So this aim isn’t solved yet.
  • Prevent anyone but us parents from changing any of these settings

Steps for offline Amazon Video

  1. Create a Google account to be used only for the children’s apps.
  2. Enable 2-factor auth on that Google account, which should help prevent your child from installing anything. To be sure you can get these codes while on a flight, install the Authy app on your phone and use it to photo the QR code shown when you enable two-factor auth.
  3. On each tablet, go into Google Play app and login.
  4. We have a tablet for each child, so to identify the devices, we bought color-coded cases. And went into Play Store using desktop browser > Gear button > Android Device Manager > Pencil button > (added case color to name)
  5. In the tablet Android Settings, disable location services so Google and perhaps others cannot track your child
  6. Disable loud sounds https://forums.lenovo.com/t5/Lenovo-Android-based-Tablets/Start-up-sound-idea-s6000/m-p/1298239#M13520
  7. Make sdhc the default storage medium so you can download more: Android Settings > Storage > SD Card.
  8. Uninstall or disable as many apps as you can except the Play Store and Android Settings (and maybe Skype for calling relatives)
  9. (Necessary only for non-Amazon Fire devices) Using your tablet’s browser, go to https://www.amazon.com/appstore_android_app which will download the Amazon App Store app for Android. Use it to install both Amazon For Tablets app and Amazon Video app (exactly as spelled here, not Instant Video).
  10. Amazon Video app shows “We’re unable to show this content. Please try again later.” I see this while outside the US, so I installed the Android solution of my vpn service https://play.google.com/store/apps/details?id=com.privateinternetaccess.android&hl=en then while connected through it, Android Settings > Apps > Amazon Video > Clear Data.
  11. Logout of Amazon For Tablets, then log back in. Visit ‘Your Orders’ and confirm that it shows actual orders you’ve made — before logout it may show Recently Viewed Items, which indicates it’s confused. You should still be on VPN.
  12. Start Amazon Video and verify your video library is accessible through the menu. Download all your purchases that you want while offline.
  13. After downloading, we could view all the downloads without the vpn and without any network connection at all.
  14. Set Amazon Video app’s parental controls http://www.amazon.com/gp/help/customer/display.html?nodeId=201423060

Setting up a browser that enforces a whitelist

  1. Using laptop browser logged into same Google account as the tablet’s Play Store, installed Pro version of https://play.google.com/store/apps/details?id=com.procoit.kioskbrowser because it’s a browser app that has a url whitelist, and is based in a country I trust (UK)
  2. Signup for free trial of the remote management service at https://www.kbremote.net/Home/Start Then in the app go to Settings > Remote Management > Login/Signup > Login and use the same userId and password. In a few minutes, logging into the site will show the device.
  3. Setup which urls should be allowed https://kioskbrowser.userecho.com/topic/908050-profiles/
  4. To make one device’s profile slightly different from others (e.g. different default url), use Profile Overrides https://kioskbrowser.userecho.com/topic/847081-profile-overrides/
    Sign up for a Kiosk Browser forum account at https://kioskbrowser.userecho.com/

Working all this out took me a day and half, so I hope sharing it here saves some other parents some time to prep for their travel!

Clearing Eclipse when it seems to hang-on to old project settings

When Eclipse seems to be holding-on to old project settings:

  1. Make sure you’re updating the correct configuration file. For example in a Spring project, you might have xml files with very similar content for dev use, production use, and integration test use. If there’s an exception, look closely if it mentions such a filepath; you might be editing the wrong one!
  2. If all else fails,
    1. Shutdown Eclipse
    2. Find .metadata/.plugins/.org.eclipse.wst.server.core/ under your workspace dir and delete all its contents
    3. Restart Eclipse

Developing speech applications

Contents

Personal background

The idea of controlling technology by telling it what to do has been compelling for a long time. In fact, when I was part of a “voice portal” startup in 1999-2001 (Quack.com, which rolled into AOLbyPhone 2001-2004 or so), there was a joking acknowledgement that the tech press announces “speech recognition is ready for wide use” about every ten years, like clockwork. Our startup launched around the third such crest of press optimism. And like movies on similar topics that release the same summer, there was a new crop of voice portal startups at the time (e.g., TellMe and BeVocal). Like the browser wars of a few years earlier between Netscape and IE, in which they’d pull pranks like sneaking their logo statue into the competitor’s office yard, we spiked TellMe car antennas with rubber ducks in their parking lot. Those were crazy, fun days when the corner of University and Bryant in Palo Alto seemed like the center of the universe, long hours in our little office felt like a voyage on a Generation Ship, and pet food was bought online. And a little more than a decade later, Apple has bought Siri to do something similar, and Google and Microsoft have followed.

The idea that led to our startup was wanting to help people compare prices with other stores while viewing products at a brick-and-mortar store. Mobile phones then had poor cameras and browsers, so the most feasible interaction method was to verbally select a product type, brand, and model by first calling an Interactive Voice Response (IVR) service. But a web startup needs more traffic than once-a-week shopping, so other services were added such as movie showtimes, sports scores, stock quotes, news headlines, email reading and composing (via audio recording), and even restaurant reviews. This was before VoiceXML reached v1.0 and we used a proprietary xml variant developed in-house alongside our Microsoft C++-based servers. We were the first voice portal to launch in the US, and that was with all services except the price comparison feature that was our initial motivation. It hasn’t reappeared on any voice portal since, that I know of.

As any developer knows, building on standards often provides many advantages. Once VXML 1.0 appeared, I wanted to see if we could migrate to it, so I bought a Macintosh G4 with OS X v1 when the Apple store first opened in Palo Alto, and used the Java “jar” wrappers for its speech recognition and generation features to prototype a vxml “browser”. When it supported 40% of the vxml spec, I shared it with our startup, recently bought by AOL, but they passed. I stopped work on it and released it as open-source through the Mozilla Foundation (see vbrowse.mozdev.org).

More than a decade later, markup-based solutions like vxml still seem like the most productive way of creating speech-driven applications (compared to, say, creating a Windows-only application using Dragon NaturallySpeaking).

Application design

State-of-the-art web applications tend to adopt the Model-View-Control design pattern, where the model is a JSON finite-state machine representation of all the states (e.g. ViewingInbox, ComposingMessage) supported, and JavaScript is used as controller to create DOM/HTML views, handle user actions, and manage data transfers with the server. This is also the pattern of newer W3C specs like SCXML that aim to support “multi-modal” interactions such as requesting a mapped location on one’s mobile phone by speaking the location (i.e., speech is one mode) and having the map appear in the browser (i.e., browser items are another mode). As “pervasive computing” develops and is able to escape the confines of mobile phones and laptops, additional modes needing support are likely to be, first, recognizing that the user is pointing to something and resolving what the referent is, and secondly, tracking the gaze of the user and recognizing what it’s fixated upon, as a kind of augmented reality hover gesture. Implementing and integrating such modes is part of my interest in the larger topic of intention perception; if you are interested in how these modes fit into a larger theoretical context, I highly recommend the entry on Theory of Meaning (and Reference) in the Stanford Encyclopdia of Philosophy, and Herb Clark’s book “Using Language“.

Vxml is up to v3.0 now, and it might support integration with these non-speech modes. But vxml 2.0 and 2.1 are supported more widely, and creating applications with them that follow the design pattern above requires a bit of thinking. The remainder of this article will share my thoughts and discoveries about how to do that with an excellent freemium platform, Voxeo.com

Tips on Using Vxml 2.1

Before attempting to create a vxml application, I strongly recommend getting a book on the topic or reading the specs online. But as a quick overview, think of conversations as pairs of turns in which one person already has in mind how the other person might respond to what he is about to say, he then says it, and usually allows the other person to interrupt, and as long as the other person says something and it’s intelligible, the speaker will respond with another turn. Under this description, the speaker’s turn gets most of the attention, but the respondent’s turn usually determines what happens next. Each such pair can be conceived of as a state in a finite-state machine, where all the speaker’s reactions to the respondent correspond to transitions out of those states.
To implement such a set of states in vxml2.0 or 2.1, one can create a single text document (aka “Single Page Application (SPA)“) with this as a start,

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE vxml SYSTEM "http://www.w3.org/TR/voicexml21/vxml.dtd">
<vxml version="2.1">
</vxml>

and then for each state, insert a variant of the following between the ‘vxml’ tags:

<form id="fYourNameForTheState">
</form>

To implement each state, add a variant of the following within each ‘form’ element,

<field name="ffYourNameForTheState">
  <grammar mode="voice" xml:lang="en-US" root="gYourNameForTheState" tag-format="semantics/1.0">
    <rule id="gYourNameForTheState">
      ...All the things the speaker might expect the respondent to say that are on-task...
    </rule>
  </grammar>
  <prompt timeout="10s">...The primary thing the speaker wants to tell the respondent in this state, perhaps a question...</prompt>
  <noinput>
    <prompt>...What to say if the prompt finishes and the respondent is silent all through the timeout duration...</prompt>
  </noinput>

  <nomatch>
    <prompt>...What to say as soon as any mismatch is detected between what the respondent is saying and what the speaker was expecting in the grammar; "I didn't get that" is a good choice...</prompt>
  </nomatch>

  <filled>
    <if cond="ffYourNameForTheState.result &amp;&amp; (ffYourNameForTheState.result == 'stop')">
      <goto next="#fWaitForInstruction"/>
    <elseif cond="ffYourNameForTheState.result &amp;&amp; (ffYourNameForTheState.result == 'shutdown')" />
      <goto next="#fGetConfirmationOfShutdown"/>
    <else />
      <assign name="_oInstructionHeard" expr="ffYourNameForTheState"/> <!-- Assumes _oInstructionHeard was declared outside this form in a 'var' or 'script' -->
      <goto next="#fGetConfirmationOfInstruction"/>
    </if>
  </filled>
</field>

We’ll discuss grammars in more depth below, and the rest of the template is largely self-explanatory. But a few minor points:

  1. If you need to recognize only something basic like yes-or-no or digits in a form, then you can remove the ‘grammar’ element and instead add one of these to the ‘field’ element:
    • type="boolean"
    • type="digits"
    • type="number"
  2. Grammars can appear outside ‘field’ as a child of ‘form’, but then they are active in all fields of the form. There are cases in which doing so is good design, but it’s not the usual case.
  3. The only element that “needs” a timeout for the respondent being silent is ‘noinput’; yet, the attribute is required to be part of ‘prompt’ instead.
  4. ‘goto’s can go to other fields in the same form, or different forms, but not to a specific field of another form.

I’ve made the ‘filled’ part less generalized than the other parts to illustrate a few points:

  1. The contents of the ‘filled’ element is where you define all of the logic about what to do in response to what the respondent has said.
  2. Although I’ve indented if-elseif-else to highlight their usual semantic relation to each other, you can see that actually ‘if’ contains the other two, and that ‘elseif’ and ‘else’ don’t actually contain their then-parts (which is somewhat contrary to the spirit of XML).
  3. The field name is treated as if it contains the result of speech recognition (because it does), and it does so as a JavaScript object variable that has named properties.
  4. The field variable is lexically scoped to the containing form, so if you want to access the results of speech recognition in another form (perhaps after following a ‘goto’), then you first must have a JavaScript variable whose scope is outside either of the forms, and assign it the object held by the field variable.
  5. A boolean AND in a condition must be written as &amp;&amp; to avoid confusing the XML parser. (You might want to try wrapping the condition as CDATA if this really bugs you.)
  6. Form id’s can be used like html anchors, so a local url for referencing a form starts with url fragment identifier ‘#’ followed by the form’s id.

Note that it’s not necessary to start form id’s with “f”, or fields with “ff”, or grammars with “g”, nor is it necessary to repeat names across them like I do here. But I find that simplifying this way helps keep the application from seeming over-complicated.

Creating grammars

To implement the grammar content indicated above by placeholder text, “…All the things the speaker might expect the respondent to say that are on-task…,” one provides a list ‘one-of’ and ‘item’ elements. ‘one-of’ is used to indicate that exactly one of its child items must be recognized. ‘item’ has a ‘repeats’ attribute that takes such values as “0-1” (i.e., can occur zero or one times), “0-” (i.e., can occur zero or more times), “1-” (i.e., can occur one or more times), “7-10” (i.e., can occur 7 to 10 times), and so on. ‘item’ takes one or more children which can be any permutation of ‘item’ and ‘one-of’ elements, which can have their own children, and so on. The children of a ‘rule’ or ‘item’ element are implicitly treated as an ordered sequence, so all the child elements must be recognized for the parent to be recognized. (This formalism might remind you of Backus-Naur Form (BNF) for describing a context-free grammar (CFG). If you need a grammar more expressive than a CFG, you’ll have to impose the additional constraints in post-processing that follows speech recognition.)

If the contents of the grammar rule take up more than about five lines, it’s good practice like in other coding languages to modularize that content into an external file. Each such grammar module is declared within an inline ‘item’ like this,

<grammar mode="voice" xml:lang="en-US" root="gGetCommand" tag-format="semantics/1.0">
  <rule id="gGetCommand">
    <one-of>
      <item>
        <ruleref uri="myCommandLanguage.srgs.xml#SingleCommand" type="application/grammar-xml"/>
      </item>
      <item>
        <ruleref uri="myCommandStop.srgs.xml#Stop" type="application/grammar-xml"/>
      </item>
    </one-of>
  </rule>
</grammar>

and the external grammar file should have this form:

<?xml version= "1.0" encoding="UTF-8"?>
<!DOCTYPE grammar PUBLIC "-//W3C//DTD GRAMMAR 1.0//EN"
                         "http://www.w3.org/TR/speech-grammar/grammar.dtd">
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" tag-format="semantics/1.0" root="SingleCommand" >
  <rule id="SingleCommand" scope="public">
    ...A sequence of 'one-of' and 'item' elements describing single commands you want to support...
  </rule>
  <rule id="SubgrammarOfSingleCommand" scope="public">
    ...Details about a particular command that would take too much space if placed inside the SingleCommand rule...
  </rule>
</grammar>

Defining the Recognition Result

Human languages usually allow any intended meaning to be phrased in several ways, so useful speech apps need to accommodate this by providing as many expected paraphrases as seem likely to be used. So, a grammar often has several ‘one-of’s to accommodate paraphrases. A naive approach for a speech app would be to provide such paraphrases in the grammar, and take recognition results in their default format of a single string, and then try to re-parse that string with JavaScript case-switch-logic similar to the ‘one-of’s in the markup — a duplication of work (ugh) with the attendant risk that the two will eventually fall out of sync (UGH!). What would be much preferred would be to retain the parse structure of what’s recognized and return that instead of a (flat) string; in fact, this is just what the “semantic interpretation” capability of vxml grammars offers. To make use of this capability, a few things are needed (these may be Voxeo-specific):

  1. The ‘grammar’ elements in both the vxml file and the external grammar file(s) must use attributes tag-format="semantics/1.0" plus root="yourGrammarsRootRuleId"
  2. ‘tag’ elements must be placed in the grammars (details on how below), and they must assume there is a JSON object variable named ‘out’ to which you must assign properties and property-values. If instead you assign a string to ‘out’ anywhere in your grammar, then recognition results will revert to flat-string format.
  3. If using Voxeo, ‘ruleref’ elements that refer to an external grammar must use attribute ‘type=”application/grammar-xml”‘, which doesn’t match the type suggested by the vxml2.0 spec, “application/srgs+xml”, http://www.w3.org/TR/speech-grammar/#S2.2.2

To use ‘tag’ elements for paraphrases, one can do this,

<rule id="Stop" scope="public">
  <one-of>
    <item>stop</item>
    <item>quit</item>
  </one-of>
  <tag>out.result = 'stop'</tag>
</rule>

in which the ‘result’ property was chosen by me, but could have been any legal JSON property name. The only real constraint on the choice of property name is that it make self-documenting sense to you when you refer to it elsewhere to retrieve its value.

‘tag’ elements can also be children of ‘item’s, which makes them a powerful tool for structuring the recognition result. For example, a grammar rule can be configured to create a JSON object:

<rule id="ParameterizedAction" scope="public">
  <one-of>
    <item>
      <one-of>
        <item>drill</item>
        <item>bore</item>
      </one-of>
      <ruleref uri="#DrillSpec"/>
      <tag>
        out.action = 'drill';
        out.measure = rules.latest().measure;
        out.units = rules.latest().units;
      </tag>
   </item>
   ...
</rule>

In this example, we rely on knowing that the “DrillSpec” rule returns a JSON object having “measure” and “units” properties, and we use those to create a JSON object that has those properties plus an “action” property.

‘tag’ elements can also be used to create a JSON array:

<rule id="ActionExpr" scope="public">
  <tag>
    out.steps = [];
    function addStep(satisfiedParameterizedActionGrammarRule) {
      var step = {};
      step.action = satisfiedParameterizedActionGrammarRule.action;
      step.measure = satisfiedParameterizedActionGrammarRule.measure;
      step.units = satisfiedParameterizedActionGrammarRule.units;
      out.steps.push(step);
    }
  </tag>
  <item>
    <ruleref uri="#ParameterizedAction"/>
    <!-- This use of rules.latest() should work according to http://www.w3.org/TR/semantic-interpretation/#SI5 -->
    <tag>addStep(rules.latest())</tag>
  </item>
  <item repeat="0-">
    <item>
      and
      <item repeat="0-1">then</item>
    </item>
    <ruleref uri="#ParameterizedAction"/>
    <tag>addStep(rules.latest())</tag>
  </item>
</rule>

These object- and array-construction techniques can be used in other rules that you reference as sub-grammars of these, allowing you to create a JSON object that captures the complete logical parse structure of what is recognized by the grammar.

By the way, if you want to use built-in support for recognizing yes-or-no, numbers, dates, etc as part of a custom grammar, then you’ll need to use a ‘ruleref’ like this,

<rule id="DepthSpec" scope="public">
  <item>
    <ruleref uri="#builtinNumber"/>
    <tag>out.measure = rules.builtinNumber</tag>
  </item>
</rule>
<rule id="builtinNumber">
  <item>
    <ruleref uri="builtin:grammar/number"/>
  </item>
</rule>

URI’s for other types can be inferred from the “grammar src” examples at http://help.voxeo.com/go/help/xml.vxml.grxmlgram.builtin (although these might be specific to the Voxeo vxml platform).

If you follow this grammar-writing approach, then you can access the JSON-structured parse result by reading property-value’s from the field variable containing the grammar (e.g., “ffYourNameForTheState” above), just as if it were the “out” variable of your root grammar rule that you’ve been assigning to. These values can be used in ‘filled’ elements either to guide if-then-else conditions, or be sent to a remote server as we’ll see in the next  major section, “Dynamic prompts and Web queries”.

Managing ambiguity

As a side note, if you’re an ambiguity nerd like me, you’ll probably be interested to know that Vxml 2.0 doesn’t specify how homophones or syntactic ambiguity must be handled. But Voxeo provides a way to get multiple candidate parses.

Dynamic prompts and Web queries

So far, we can simulate one side of a canned conversation via a network of expected conversational states. It’s similar to a Choose-Your-Own-Adventure book in that it allows variety in which branches are followed, but it’s “canned” because all the prompts are static. But often we need dynamic prompts, especially when answering a user question via a web query. JavaScript can be used to provide such dynamic content by placing a ‘value’ element as a child of a ‘prompt’ element, and placing the script as the value of ‘value’s ‘expr’ attribute, like this:

<assign name="firstNumberGiven" expr="100"/> <!-- Simulate getting a number spoken by the user -->
<assign name="secondNumberGiven" expr="2"/> <!-- Simulate getting a number spoken by the user -->
<prompt>The sum of <value expr="firstNumberGiven"/> and <value expr="secondNumberGiven"/> is <value expr="firstNumberGiven + secondNumberGiven" /> </prompt>

The script can access any variable or function in the lexical scope of the ‘value’ element; that is, any variable declared in a containing element (or its descendants that appear earlier). Also notice that, by default, adjacent digits from a ‘value’ element are read as a single number (e.g., “one hundred and two”) rather than as digits (e.g., “one zero two”). That’s convenient, because one can’t embed a ‘say-as’ element in the ‘expr’ result, although one can force pronunciation as digits by inserting a space between each digit (e.g., “1 0 2”) perhaps by writing a JavaScript function (see http://help.voxeo.com/go/help/xml.vxml.tutorials.java); otherwise, if the default were to pronounce as digits, then forcing pronunciation as a single number would require a much more complicated function.

I’ve said little to nothing about interaction design in speech applications, although it’s very important to get right, as anyone who’s become frustrated while using a speech- or touchtone-based interface knows well. But one principle of interaction design that I will emphasize is that user commands should usually be confirmed, especially if they will change the state of the world and might be difficult to undo. When grammars are configured to return flat-string results, prompting for confirmation is easy to configure like this:

<prompt>I think you said <value expr="recResult"/> Is that correct? </prompt>

But when a grammar is configured to return JSON-structured results, the ‘value’ element above might be read as just “object object” (the result of JavaScript’s default stringify method for JSON objects, at least in Voxeo’s vxml interpreter). I believe the best solution is to write a JavaScript function (in an external file referenced with a ‘script’ element near the top of the vxml file) that is tailored to construct a string meaningful to your users from your grammar’s JSON structure, then wrap the “recResult” variable (or whatever you name it) in a call to that function. If there is any need to nudge users toward using terms that are easier for your grammar to recognize, then this custom stringify function is an opportunity to paraphrase their commands back to them using your preferred terms.

Now we’re ready to talk about sending JSON-structured recognition results to remote servers, which is the most exciting feature of vxml 2.1 for me, because it’s half of what we need to make vxml documents able to leverage the same RESTful web APIs that dhtml documents can (the other half, being able to digest the server’s response, will be discussed soon; “dhtml” === “Dynamic HTML”, which is a combination of html and JavaScript fortunate enough to find itself in a browser that has JavaScript enabled). Like html forms, vxml provides a way for its forms to submit to a remote server. And also like html, the response must be formatted in the markup language that was used to make the submission, because the response will be used to replace the document containing the requesting form. Html developers realized that their apps could be more responsive if less content needed to travel to and from the remote server, and that if they instead requested just the gist of what they needed, and the response encoded that gist in a markup-agnostic format like XML or JSON, then JavaScript could be used in their browser-based client to manipulate the DOM of the current document and that might usually be faster than requesting an entirely new document (even if most of its resources could be externalized into JavaScript and CSS files that can be cached). Because these markup-agnostic APIs are becoming widely available, they present an opportunity for non-html client markup languages like vxml to leverage them. Vxml developers created a way to leverage these APIs by adding a ‘data’ element alternative to vxml form submission in the vxml 2.1 spec. Here’s an example:

<var name="sInstructionHeard" expr="JSON.stringify(_oInstructionHeard)"/>
<data method="post"
      srcexpr="_sDataElementDestinationUrl + '/executeInstructionList'"
      enctype="application/x-www-form-urlencoded"
      namelist="sInstructionHeard"
      fetchhint="safe"
      name="oRemoteResponse"
      ecmaxmltype="e4x" />

The ‘data’ element isn’t as user-friendly as it might be. For example, one can’t just put the JSON-structured recognition result in it and expect it to be transferred properly; instead, one must first JSON.stringify() it (this method is native to most dhtml browsers circa 2014 and to Voxeo’s vxml interpreter). And the ‘data’ element requires that even POST bodies be url-encoded, so the remote server must decode using something like this (assuming you’re using a NodeJs server):

sBody = decodeURIComponent(sBody.replace(/\+/g, ' '));
sBody = sBody.replace('sInstructionToEvaluate=',''); //Strip-off query parameter name to leave bare value
sBody = (sBody ? JSON.parse(sBody) : sBody);

What the remote server needs to do for its response is easier:

oResponse.writeHead(200, {'Content-Type': 'text/xml'});
oResponse.end('<result><summaryCode>stubbedSuccess</summaryCode><details>detailsAsString</details></result>');

If the server is reachable and generates a response like this, then the variable above that I named “oRemoteResponse” will be JSON-structured and have a ‘result’ property, which itself will have ‘summaryCode’ and ‘details’ properties whose values, in this case, are string-formatted. You have the freedom to use any valid XML element name — which is also a valid JSON property name — in place of my choice of ‘result’. The conversion from the remote server’s XML formatted response to this JSON structure is done implicitly by the vxml interpreter due to the ecmaxmltype="e4x" attribute. (The vxml 2.1 interpreter cannot process a JSON-formatted response as dhtml browsers can.) These JSON properties from the remote server can be used to control the flow of conversation among the ‘form’s in the same way we used JSON properties from “semantic” speech recognition earlier. Coolness!

A few final comments about ‘data’ elements:

  1. To validate the xml syntax of your app, you probably want to upload it to the W3C xml validator; however, the ecmaxmltype="e4x" attribute is apparently not part of the vxml 2.1 DTD, which the validator finds at the top of your file if you’re following my template above, and so you will get a validation error that you’ll have to assume is spurious and ignorable.
  2. My app uses a few ‘data’ elements to send different kinds of requests, so to keep the url of the remote server in-sync across those, I have a ‘var’ element before all my forms in which I define the _sDataElementDestinationUrl url value.
  3. fetchhint="safe" disables pre-fetching, which isn’t useful for dynamic content like the JSON responses we’re talking about
  4. If you want to enable caching, which doesn’t make sense for dynamic JSON content like we’ve been talking about but would be reasonable for static content, you’d do that via your remote server’s response headers.
  5. If the remote server isn’t reachable, the ‘data’ element will throw an ‘error.badfetch’ that can be caught with a ‘catch’ element to play a prompt or log error details, but unfortunately this error is required by the spec to be “fatal” which appears to mean the app must exit (in vxml terms, I believe it means the form-interpretation algorithm must exit). That’s a more severe reaction than in dhtml which allows DOM manipulation and further http requests to continue indefinitely. Requiring such errors to be fatal blocks such potential apps as a voice-driven html browser that reads html content, because it could not recover from the first request that fails. But maybe I’m interpreting “fatal” wrong; Voxeo’s vxml interpreter seems to allow interaction to continue indefinitely if this error is caught with a ‘catch’ element that precedes a ‘form’.
  6. If the remote server is reachable but must return a non-200 response code, the ‘data’ element will throw ‘error.badfetch.DDD’ where DDD is the response code. This error is also “fatal”.

At this point, we’ve covered all that I think is core to authoring a speech application using vxml 2.1. For more details, the vxml 2.0 spec and its 2.1 update are the authoritative references. Voxeo’s help pages are also quite useful.

Up next: Test-driven development of speech applications, and Hosting a speech app using Voxeo.

How to keep selected local Windows folders in sync

Some web platforms, like Apache Tomcat and Voxeo Prophecy, require their configuration files to be kept in specific locations under the platform’s installation folderpath. This creates a minor annoyance when developing for these platforms, because your IDE may not be able to add such a folderpath to its workspace. But even if the IDE can access such a folderpath, or even if you can create a symbolic link to it, the IDE might add project-related files that would confuse the web platform.

As a workaround, I’m using FreeFileSync to define rules about which items from the two folders should be overwritten by newer files from the other folder. However, FreeFileSync only syncs when prompted to via its GUI. To automatically detect new files and trigger sync, you can use the companion app RealtimeSync (which is included in the FreeFileSync installer). Yet, RealtimeSync is an app rather than a service, so you need to configure Windows to launch it whenever you start Windows or resume from sleep.

Installing FreeFileSync

  1. This app’s sourceforge page has a download button, but it will lead you through several hops before you get to a mirror at “FossHub”.
  2. The download links can be hard to find. Look in the light blue box near the top for a set of links such as “Download FreeFileSync for Windows”.
  3. During the installation, avoid Express install because it will install bloatware; use Custom Install instead.

Configuring FreeFileSync

These steps are adapted from another guide.

  1. If there are any files already in the web platform folder(s) that you want under the IDE’s control, import them to the corresponding project folder(s) in the IDE now
  2. In FreeFileSync, click the small green ‘+’ button above the left file listing pane to add as many pairs of folders to sync as you need.RealtimeSync_config1
  3. Set the left folder listing to the project folder, and the right folder listing to the web platform’s configuration folder
  4. Customize the default syncing across all folder pairs by clicking the green gear button in the upper-rightRealtimeSync_config2
  5. Cause new items in the left folder to be copied to the right folder. Cause edited items in the left folder to overwrite the corresponding file in the right folder. Note that only files introduced after RealtimeSync is first started will count as “new”, which is a good thing if you don’t want all the files in the left folder copied over.
  6. If you need to override the default sync behavior for any pair, use the small gear button that’s between the folderpaths
  7. Click OK to close, then go to Program | Save As Batch Job
  8. I recommend creating a top-level Dropbox folder called Settings where this batch-settings file can be kept.

Configuring RealtimeSync

  1. Launch RealtimeSync
  2. Go to Program | Open and select the batch-settings file you just created.
  3. Click Start.

Configuring ScheduleTasks (Windows) to launch RealtimeSync

  1. Use Windows+Q to open the start menu search box, enter “schedule”. Either “Schedule tasks” (in Windows 8) or “Task scheduler” (Windows 7 and before) should be offered. Select it.
  2. In the app that appears, select Create Basic Task.
  3. Under General | Name, enter something like “RealtimeSync after login”
  4. Set the trigger to be “On local connection to any user session”
  5. Set the action to be “Start a program” and browse to RealtimeSync’s executable (probably at C:\Program Files\FreeFileSync\RealtimeSync.exe). Under “Add arguments”, paste a complete path to the batch-settings file. (I recommend the Copy Path feature of FileMenuTools, a separate app, for this frequently-needed ability.)
  6. Under Conditions, make sure there are no dependencies on idleness, AC power, etc.

Verifying that it works

  1. Logout of your Windows session, then log back in.
  2. Edit one of the project files and save it. The FreeFileSync dialog should appear within a second or two, and disappear within a second.
  3. Verify that the corresponding file in the web platform folder has the edited content.

 

Testing custom REST APIs using a NodeJs http server

When you’re developing a REST API, and it’s not entirely clear how well your configuration of the client-side is working, it can be handy to get a server up quickly to log what it receives. In the Java servlet world, this might be done by introducing a filter class that inspects and logs all requests and responses. But if you don’t have the servlet itself setup yet, a faster way is to launch a NodeJs-based server locally.

If you’ve setup Eclipse IDE for debugging standalone JavaScript using NodeJs, then you can paste-in this as a new .js file and you’re almost home:

//Derived from http://stackoverflow.com/a/12007627

http = require('http');
//fs = require('fs');

port = 51100;
host = '127.0.0.1';

server = http.createServer( function(req, res) {
	
    console.log("------------------------------");

    var htmlPrefix = '<html> <body>',
    	htmlSuffix = '<form method="post" action="http://'+ host +':'+ port +'"> String value to POST (type exit to quit): <input name="formSent" type="text"/> <input type="submit" value="Submit"/> </form> </body> </html>',
    	html = null, //fs.readFileSync('index.html');
    	body = '';
    
    var generateResponse = function() {
        html = html || htmlPrefix +'Received '+ req.method + (body ? ': '+body : '') + htmlSuffix;
        res.writeHead(200, {'Content-Type': 'text/html'});
        res.end(html);
        
        if (body == "formSent=exit") {
                //console.log("Exiting process");
                process.exit(code=0);
        }
    }
    
    if (req.method == 'POST') {
        req.on('data', function (data) {
            body += data;
            console.log("Partial body: " + body);
        });
        req.on('end', function () {
            console.log("Body: " + body);
            generateResponse();
        });
    } else { //GET
        generateResponse();
    }

});

server.listen(port, host);
console.log('Listening at http://' + host + ':' + port);

Once you see the console entry about “Listening…”, just open a browser tab to http://127.0.0.1:51100/.

If you can’t kill the process by entering exit into the web form, then you can kill the node process from the commandline.

C++ debugging in Eclipse IDE

If you’re a Java developer and find yourself also doing some coding in another language, you don’t want to have to change your IDE just to do so. That would make the learning curve twice as steep. Fortunately, if you use the Eclipse IDE and you need to code in C++ (or JavaScript), you don’t have to switch.

CHUA Hock-Chuan has an excellent guide about how to set this up. Some caveats:

  1. I tested with MinimalGnuWindows; I don’t know if I’m missing out on something better in Cygwin (but everytime I’ve encountered Cygwin’s installer, I wonder why it’s so difficult to use)
  2. To get support for the 2011 version of C++ (aka “C++0x” or “C++11”), go to Project | Properties | C/C++ Build | Settings | GCC C++ Compiler | Miscellaneous | Other flags and append -std=c++11 to whatever might already be there. Then do the same for GCC C Compiler | Miscellaneous | Other flags.
  3. “Clean”ing a project may not work on your Windows (ref1) (ref2), because it uses “rm -rf”. If this happens, install GNU CoreUtils, make sure it’s in your PATH (i.e., append ;C:\Program Files (x86)\GnuWin32\bin), and restart Eclipse.
  4. When you get to the point of entering the C++ helloworld, std, cout, and endl will show errors about not being able to resolve them. These errors disappeared when I saved the file, which probably means there’s a fast kind of syntax-checking done for (almost) every key press, and a more expensive compilation done only when edits are saved.
  5. If you check the Includes settings under Project, all the paths will be entirely down-cased. My actual paths use some camel-casing and work fine, so this must just be a quirk of the dialog.
  6. Even if the project is set to Build Automatically, you must still select Project | Build Project for the binaries to be created.
    1. If ‘clean’ or ‘build project’ fail with error ‘make cannot be found in PATH’, then ensure you have the following setting: Project | Properties | C/C++ Build | Tool Chain Editor | Current toolchain = MinGW GCC. If you had to change this setting, try cleaning or building again.

Debugging standalone JavaScript in Eclipse IDE

There are some nice online JavaScript debuggers like jsbin.com and jsfiddle.net, but if you want to edit a .js file that’s part of a version-controlled project, you’d probably prefer to do so in an IDE.

The Eclipse IDE has the JSDT plugin that provides syntax highlighting. (And it’s baked into the “Eclipse for Java EE Developers” edition, but can be installed in other versions, too.) To evaluate the code and see the output in the IDE’s Console view, some extra steps are necessary. Z0ltan’s guide is very helpful. Some extra hints:

  1. Install nodeJs first. This is an executable that wraps Google’s V8 js engine, providing a commandline interpreter (and lots of useful libraries, I believe, such as allowing the coding of a web server using javascript). As of 2014 April 29, the Windows installer doesn’t broadcast through the OS that the Path has been updated, so as a workaround you may need to open the Environment Variables dialog and then close it with the OK button.
  2. If you copy any text from Z0ltan’s page, or this one, into your IDE, make sure to re-type all the quotes; otherwise, copy/paste tends to pick up “fancy” quote characters that cause weird errors.
  3. In between Z0ltan’s steps 1 and 2, you need to select “Program” in the left pane, and then hover over the buttons above it to select the one having hover text “New launch configuration”.
  4. There’s a comment at the end of Z0ltan’s guide advising that you add the /D flag like this:
    /C "cd /D ${container_loc} && node ${resource_name}"
    

    if your js source file might be kept somewhere other than the C drive. But Microsoft’s documentation of cmd.exe flags just says /D disables autorun, so I’m not sure how helpful that is.

The NodeJs API pages start here, and there’s some helpful guidance on error-handling on Joyent.

Unit-testing a Vxml app using JVoiceXml in Text mode

Installing and verifying JVoiceXml

The following steps were tested on this combination:

  • Windows 8.1 Pro 64-bit
  • JavaSE JDK 1.7.55 32-bit
  • Eclipse Juno SR2 32-bit
  1. Download the JVoiceXml 0.7.6.1GA (or higher) installer jar from http://sourceforge.net/projects/jvoicexml/
  2. Unpack that zip and double-click the jar installer
  3. Instead of installing to C:\Program Files (x86)\, install to some folder on your Desktop. (The default location runs into a permissions error for me; maybe it’s intended that the installer jar be run with Admin privilege.)
  4. When prompted which modules to install, choose “Text Implementation” (skipping jsapi through mrcp and the call manager). However, “VoiceXML Unit” through “Source Code” will probably be useful later on.
  5. Check the doc folder of your installation for the user guide, which provides further installation steps. The following steps are based on it.
  6. Open the following in a text editor: config/text-implementation.xml
    Verify that the classpath element values correspond to files that actually exist in your installation folder (even if the user guide gives different values).
  7. At a commandline, run [installation dir]\bin\JVoiceXML.exe. If you don’t see
    VoiceXML interpreter [version] started.
    

    then check the user guide for troubleshooting guidance. Leave this window undisturbed, and if you want to stop the server, make sure to use Shutdown.exe instead of closing the window or using Ctrl+C.

  8. Using Eclipse (Juno) IDE, go to File | New | “Java Project from Ant buildfile…”
  9. For “Ant buildfile” browse to JVoiceXml\demo\org.jvoicexml.demo.textdemo\build.xml
  10. “Project name” should now be auto-filled and the Finish button should be enabled.
  11. Select the project in the left pane, then select Run | Run Configurations…
  12. Select “Java Application” in the left pane, then click the button at top whose on-hover tip shows “New launch configuration”. Name the new launch config something meaningful to your memory like “JVoiceXmlTextDemo”.
  13. Select the Arguments tab, then paste
    -Djava.security.policy=${config}/jvoicexml.policy
    

    into the “Program arguments” textarea, replacing ${config} with the full path to the config folder within the text demo folder. For example,

    -Djava.security.policy=C:\Users\david\Desktop\Tools\JVoiceXml\demo\org.jvoicexml.demo.textdemo\config\jvoicexml.policy
    
  14. Select the Classpath tab. Select “User entries” then click the Advanced button. Select the “Add External Folder” radio button, then click OK. Browse to the config within the text demo folder. When you return to the Classpath tab, click Apply. If instead you select the config folder at the top-level, later you will get an error like this in the Eclipse console:
    Need to specify class name in environment or system property, or as an applet parameter, or in an application resource file: java.naming.factory.initial
    
    javax.naming.NoInitialContextException: Need to specify class name in environment or system property, or as an applet parameter, or in an application resource file: java.naming.factory.initial
    
  15. Select the Main tab, then click Search to select TextDemo. Make sure the package shown for TextDemo is from jvoicexml. Click Apply.
  16. In TextDemo.java, the imports of org.jvoicexml.client.text.TextListener and org.jvoicexml.client.text.TextServer might be shown as missing. If so, select the project in the left pane, then select Project | Properties | Java Build Path | Libraries | Add External Jar, and browse to JVoiceXml\lib\org.jvoicexml.client.text.jar
  17. WARNING: I still run into java security errors like this:
    Exception in thread "JVoiceXML text server" java.security.AccessControlException: access denied ("java.net.SocketPermission" "localhost:4242" "listen,resolve")
    

CFP: 5th Intl. Workshop on Human Behavior Understanding

I won’t be there, but I’m interested to see the paper titles after the event…

--------------------------------------------------------------------
Call for Papers: 5th Int. Workshop on Human Behavior Understanding
(HBU'2014) to be held in conjunction with ECCV'14, 12 September,
Zurich, Switzerland

"Focus Theme: Computer Vision for Complex Social Interactions"
http://www.cmpe.boun.edu.tr/hbu/2014/
--------------------------------------------------------------------
Short description:
The Fifth Workshop on Human Behavior Understanding, organized as a
satellite to ECCV'14, will gather researchers dealing with the problem
of modeling human behavior under its multiple facets (expression of
emotions, display of complex social and relational behaviors,
performance of individual or joint actions, etc.), with the focus
topic of computer vision for complex social interactions.
While different aspects of social interactions are tackled in several
venues, this workshop will solicit computer vision solutions that
clearly advance the field, and chart the future of computer analysis
of complex interactions. Topics of interest include, but are not
limited to:

-Human behavior capture technology and benchmark datasets
-Social activity detection, tracking, reconstruction, and recognition
-Social scene representation and understanding
-Social behavior modeling and prediction
-Multimodal social signal integration
-Causality and reciprocity of social interaction
-Applications of social intelligence

The HBU is organized as a full-day, single track event with invited
talks, oral presentations and poster presentations.

Submissions:
Submissions must represent original material. Papers are accepted for
review with the understanding that the same work has been neither
submitted to, nor published in, another journal or conference. All
manuscripts will undergo a rigorous review process by the members of
the program committee.
You can submit a paper now at:
https://www.easychair.org/account/signin.cgi?conf=hbu2014

Invited Speakers:
Shai Avidan, Tel-Aviv University
Marco Cristani, University of Verona
David Forsyth, University of Illinois at Urbana-Champaign
Daniel Gatica-Perez, Indiap Research Institute
Fei-Fei Li, Stanford University
James Rehg, Georgia Institute of Technology
Nicu Sebe, University of Trento
Alessandro Vinciarelli, University of Glascow

Important Dates:
13 June: Submission of full papers (23:59pm PST)
4 July: Notification of acceptance
11 July: Camera-ready paper submissions
12 September: HBU Workshop

Contact:
You may contact H.S. Park (hyunsoop@cs.cmu.edu) or A.A. Salah
(salah@boun.edu.tr) about questions regarding HBU.

Organizing Committee:
Hyun Soo Park, Carnegie Mellon University, USA
Albert Ali Salah, Bo?azi?i University, Turkey
Yong Jae Lee, University of California, Berkeley, USA
Louis-Philippe Morency, University of Southern California, USA
Yaser Sheikh, Carnegie Mellon University, USA
Rita Cucchiara, University of Modena and Reggio Emilia, Italy

(Tentative) Program Committee:
Hamid Aghajan, Stanford University, USA
Oya Aran, Idiap Research Institute, CH
Richard Bowden, University of Surrey, UK
Wongun Choi, NEC Laboratories America, USA
Peter Carr, Disney Research, USA
Marco Cristani, University of Verona, IT
Fernando de la Torre, Carnegie Mellon University, USA
Laurence Devillers, LIMSI, FR
Hamdi Dibeklioglu, Delft University of Technology, NL
P?nar Duygulu Sahin, Bilkent University, TR
Haz?m Ekenel, Istanbul Technical University, TR
Alireza Fathi, Stanford University, USA
Raquel Fernandez Rovira, University of Amsterdam, NL
David Forsyth, University of Illinois at Urbana Champaign, USA
Jordi Gonzalez, UAB-CVC Barcelona, ES
Hatice Gunes, Queen Mary University of London, UK
Alexander Hauptmann, Carnegie Mellon University, USA
Hayley Hung, Delft University of Technology, NL
Nazli Ikizler-Cinbis, Hacettepe University, TR
Quiang Ji, Ransellaer Polytechnic Institute, USA
Mohan Kankanhalli, National University of Singapore, SG
Cem Keskin, Microsoft Research, UK
Kris Kitani, Carnegie Mellon University, USA
Ivan Laptev, INRIA, FR
Patrick Lucey, Disney Research, USA
Simon Lucey, CSIRO, AU
Jean Marc Odobez, Idiap Research Institute, CH
Greg Mori, Simon Fraser University, CA
Vittorio Murino, Istituto Italiano di Tecnologia and University of Verona, IT
Massimo Piccardi, University of Technology, Sydney, AU
Shishir Shah, University of Houston, USA
Alan Smeaton, Dublin City University, IE
Leonid Sigal, Disney Research, USA
Khiet Truong, University of Twente, NL

--
Dr. Albert Ali Salah
Bogazici University, Computer Engineering Dept.
34342 Bebek  - Istanbul, Turkey
Phone: +90 212 359 (7774)
http://www.cmpe.boun.edu.tr/~salah/
Bogazici University, Cognitive Science MA Program
http://www.cogsci.boun.edu.tr
General co-chair, 16th ACM Int. Conf. on Multimodal Interaction
http://icmi.acm.org/2014

Call for papers: Special Issue on Mental Model Ascription by Intelligent Agents

2nd Call for Papers

Interaction Studies: Special Issue on Mental Model Ascription by Intelligent Agents

Mental model ascription, otherwise known as “mindreading”, involves inferring features of another human or artificial agent that cannot be directly observed, such as that agent’s beliefs, plans, goals, intentions, personality traits, mental and emotional states, and knowledge about the world. This capability is an essential functionality of intelligent agents if they are to engage in sophisticated collaborations with people. The computational modeling of mindreading offers an excellent opportunity to explore the interactions of cognitive capabilities, such as high-level perception (including language understanding and vision), theory of mind, decision-making, inferencing, reasoning under uncertainty, plan recognition and memory management. Contributions are sought that will advance our understanding of mindreading, with priority being given to carefully described, algorithmic or implemented approaches that address the practical necessity of computing prerequisite inputs. Formal evaluations are not required.

This volume was inspired by successful workshops at CogSci 2012 (Modeling the Perception of Intentions) and CogSci 2013 (Mental Model Ascription by Language-Enabled Intelligent Agents).

Since Interaction Studies targets a broad audience, authors are encouraged to provide sufficient context for their contributions and define specialist terminology.

The deadline for submissions is January 14, 2014. Submission requirements and instructions can be found at http://benjamins.com/#catalog/journals/is/submission. Please address questions to the special edition editor, Marge McShane, at mcsham2@rpi.edu.

Reflections on mirror neurons

There hasn’t been much research in neuroscience that’s directly relevant to intention perception, except for the finding of mirror neurons. These are very small bundles of neurons that are activated whenever performing certain actions or observing someone else performing the same actions. Because all models to date of intention processing make it appear to be a very computationally-intensive task, I’ve been skeptical that any small bundle of neurons could do it. And there’s reason to be skeptical, because any task that’s distributed across a region or regions of the brain would might show low activation, while any bottleneck in communicating the results of those computations might show as quite active.

A new review article in the journal Cell provides further evidence for skepticism.

I agree with Wired’s description: “These findings are significant because they show how mirror neurons are not merely activated by incoming sensory information, but also by formulations developed elsewhere in the brain about the meaning of what is being observed.”

Getting on Singapore’s Do Not Call registry for calls, sms’s, and faxes

“Consumers who receive telemarketing calls despite having listed their numbers on the registry can complain to a watchdog called the Personal Data Protection Commission (PDPC). They may register through the website at www.dnc.gov.sg, or by text message by sending “DNC” to 78772 to block calls, text messages and fax messages; “DNC” to 78773 to block calls only; “DNC” to 78774 to block text messages only. They may also register by phone at 1800-248-0772 to block calls, text messages and fax messages; 1800-248-0773 to block calls only; or 1800-248-0774 to block text messages only.”

Telemarketers seem to be allowed 60 days to comply for a particular recipient number.

Getting started with ResearchCyc

My main hobby currently is looking for an open-source community focused on commonsense reasoning, specifically, the encoding of commonsense used in social interactions in a form that allows for inferring explanations and predictions. If no such community exists, I’ll create one. So far, I’ve found biannual symposia and mailing lists (see CommonsenseReasoning.org) but no open-source community.

Predicate calculus is the only notation that will be expressive enough to capture the richness of the domain, I think. I’m aware of graph-based efforts like OpenMind, but they either don’t allow constraints across variables (e.g., the agent of the action must also have known he would be its primary beneficiary in order to infer that the action was self-serving) or their method of matching to support inferencing (which is likely to be some flavor of “maximal bipartite-graph matching“) will end up being functionally identical to unification for predicate calculus. So I’m placing my bet on predicate calculus-based approaches. The Stanford Encyclopedia of Philosophy (SEP) has an interesting overview.

One of the most promising of these efforts is by Andrew S. Gordon and Jerry Hobbs, which they are developing toward a book. Let’s see if we can make that part of the community once the book is published. Andrew has a video lecture about it and Jerry offers an extensive peek at the formulations.

The Cyc Project has a freely-available ontology (i.e., a high-level taxonomy plus frame-like schemas or predicate definitions), OpenCyc. It’s unclear whether they encourage or permit contributions, and anyway I’m looking for general rules and facts rather than just an ontology. That kind of knowledge base is what their ResearchCyc project is said to be. Although ResearchCyc is not open, maybe they are open to contributions. And it was conceived and is led by one of my AI heroes, Doug Lenat, and I’d like to help it if I can. Michael Witbrock has some interesting video lectures about Cyc. Here’s what I’ve done to have a look inside ResearchCyc…

Getting a copy of RCyc

  1. Write to rcyc@cyc.com with a few sentences describing your non-commercial research interest in the system, and asking for a (free) license. The response might take a few days.
  2. If your license request is approved, you’ll get an email with a download site url, a userid, and a password. Licenses seem to expire after a year. If I click the download link and enter the password, I get a page where I can click an “rcyc” folder link and then get a list of downloads; however, revisiting the same url and entering the userid as well as the password always fails for me.
  3. Download both the “o” tgz and sha1 files (the most up-to-date in early 2014). Getting the sha1 file is recommended because I encountered a corrupted download a few times, and you need a way to check for that. There is a free “MD5 & SHA Checksum” tool, and you can verify the download this way:
    1. Click ‘Browse’ and select researchcyc-4.0o.tgz
    2. Open the sha1 file using a text editor, then select and copy the SHA1 sum value (i.e., the part preceding “researchcyc-4.0o.tgz”, not including whitespace)
    3. Click ‘Paste’ in the application and then ‘Verify’. If verification fails, try downloading the large tgz file again. (I’ve had 5 downloads in a row fail.)
  4. Unpack the tgz.
  5. Avoid editing any files if you’re using Windows; otherwise, you might introduce Windows-specific line ending characters that will prevent the server from starting in Ubuntu.

Installation

I recommend installing on a publicly-accessible server, say, on an Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instance. Doing so will allow you access from any browser if your local machine doesn’t accept requests on port 80. For example, this is what it looks like in Chrome on a Samsung Note 2… rcyc viewed on samsung note 2

  1. Configuring the host machine…
    • If installing on EC2…
      1. Go to https://console.aws.amazon.com/ec2/ and create an account if you don’t have one
      2. Click Launch Instance
      3. Select Ubuntu Server 13.10 64-bit
      4. In the left pane, select General Purpose, then click any row with close to 4 GiB RAM (under the “Memory” column)
      5. On the “Add Storage” page, set the persistent storage capacity (the “Size (GiB)” column) to about 25.
      6. When you get a chance to edit the Security Group, permit access from your clients…
        1. Note the security group of the instance
        2. In the left pane, click Security Groups
        3. In the right pane, select the row of the instance’s security group
        4. In the bottom pane, select the Inbound tab. To allow requests from all IP addresses, there should be a row with Port=22 and Source=0.0.0.0/0. Be sure to add Port=3602 and Source=0.0.0.0/0 because rcyc’s webserver uses this port. For more guidance, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/authorizing-access-to-an-instance.html
      7. If prompted, click “Yes, I want to continue with this instance type” (It won’t be free)
      8. Click Launch
      9. Choose an existing key pair, or create a new one (and be sure to download the key pair file). Then click “Launch Instances”.
      10. Make sure you’ve installed Putty
      11. To convert the pem file to Putty’s ppk format, follow http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/putty.html#putty-private-key
      12. To upload rcyc, do something like this:

        C:\Windows\system32>C:\”Program Files (x86)”\PuTTY\pscp.exe -r -i C:\[your path]\[your private key file].ppk “C:\[another local path]\researchcyc-4.0o\*.*” ubuntu@[your ec2 public dns]:

        I would have preferred WinSCP’s graphical UI, but it kept showing connection errors while transferring large files. If pscp is interrupted, you can re-run the command above. If you decided to upload the gzip instead of unpacking first, or you didn’t set the persistent storage high enough, and then find that you don’t have enough free disk space to unpack (type df -h to check), there is a way to increase disk space without having to lose your disk content

      13. Configure Putty to make it easier to login to the server for future maintenance…
        1. In Windows, go to Start | Putty
        2. Session | HostName = ubuntu@[Public DNS shown in AWS EC2 Instances page for this instance]
        3. Session | ConnectionType = SSH
        4. Connection | SSH | Auth | PrivateKeyFile = [browse to ppk file you created from the pem file]
        5. Window | Columns = 120
        6. Window | LinesOfScrollback = 2000
        7. Session | SavedSessions = “rcyc”, then click Save. When you attempt an ssh connection with this EC2 instance in the future, you can select “rcyc” in the sessions list, click Load, and then edit the host address to match what the ec2 console shows.
        8. Before clicking Open to start the ssh session, you might need to change to a network that doesn’t block port 22.
        9. When prompted “The server’s host key is not cached in the registry…”, click Yes.
        10. If all goes well, you should be prompted with
          Using username "ubuntu".
          Authenticating with public key "imported-openssh-key"
          
          The programs included with the Ubuntu system are free software;
          the exact distribution terms for each program are described in the
          individual files in /usr/share/doc/*/copyright.
          
          Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
          applicable law.
          
          ubuntu@ip-NN-NN-NN-NN:~$

    • Otherwise, if you prefer installing locally…
      1. Let’s take the hard case of using Windows. The rcyc setup instructions indicate some features aren’t supported on Windows, so I setup Ubuntu 13 64-bit in VMWare Player 6 on Windows 7 Pro 64-bit. But before creating that virtual machine, go into your BIOS and ensure that “Intel Virtualization Technology” (i.e., “Intel VT-x”) is enabled; if there is another setting limiting use of VT-x to “trusted” applications, turn that off.
      2. Before powering-on the Ubuntu VM, make sure its Memory setting is at least 7680MB.
      3. To allow fast transfer of files with the Windows host, I created a “forUbuntu” folder on my Windows desktop, moved the unpacked download contents there, installed VMWare Tools in the vm, and set the vm option for Shared Folders to the new one on my host desktop. Now access the shared folder in the vm and copy the files to a local folder in the guest Ubuntu filesystem.
        1. sudo apt-get install build-essential
        2. sudo apt-get install linux-headers-`uname -r`
        3. Select Player | Manage | Install VMWare Tools…, which will mount a folder as though it were the cdrom drive
        4. cp /media/[your userid]/*.gz /tmp/
        5. cd /tmp
        6. tar xvzf VM*.gz
        7. cd vmware-tools-distrib/
        8. sudo ./vmware-install.pl and accept all defaults
        9. Verify that the mounting tool is working: lsmod | grep vmhgfs
        10. The shared folder should be listed after doing this: ls /mnt/hgfs. If it is, do this: cd /mnt/hgfs/yourSharedFolder
        11. Whether you’ve already unpacked the rcyc gzip or not, let’s move it to a non-shared folder to avoid any accidental edits (and attendant changes to line endinges) such as your home: mv *.tgz ~ and then let’s go there: cd ~/researchcyc-*.
        12. If you haven’t already unpacked the RCyc gzipped tar file, do so now, like this: tar -xvzf *.tgz

        The remainder of the instructions will assume your terminal’s working directory is this one.

      4. ResearchCyc’s scripting language is a Lisp variant called SubL that runs on top of Java. So, to install Java in your vm, open a terminal there and sudo apt-get install openjdk-7-jre-headless
  2. Make sure you have a terminal open to the Ubuntu instance (either a Putty connection to EC2 or a terminal running in your local vm instance)
  3. Edit researchcyc-4.0o/server/cyc/run/init/jrtl-release-init.lisp so that the license value you received in email is pasted in place of the XXXX in (csetq *master-license-key* "XXXX")
  4. In the same directory, I edited parameters.lisp by adding (csetq *cb-show-cure-link* t) just before (check-system-parameters) at the end. This is supposed to make a purple “CURE” button appear in the web interface, which I’m told is a knowledge entry tool that provides some guidance.
  5. Navigate to the main scripting directory cd researchcyc-4.0o/server/cyc/run/
  6. Do a test launch to verify everything was configured correctly; try ./bin/run-cyc.sh
    • If you get error bash no such file or directory executable then one possibility is that Windows line-endings made it into some of the text files; try:
      sudo apt-get install dos2unix
      researchcyc-4.0o$ find . *.* |xargs dos2unix
      chmod a+x ./bin/*.sh
      sudo apt-get install openjdk-7-jre-headless

      (My initial research into this problem suggested that there might be 32-bit components in rcyc that wouldn’t run in Ubuntu 13.10 without ia32-libs but that turned out to be a red herring.)

    • After about five minutes of startup, you should see something like:
      Start time: Thu Nov 21 17:08:08 SGT 2013
      Lisp implementation: Cycorp Java SubL Runtime Environment
      JVM: Oracle Corporation OpenJDK 64-Bit Server VM 1.7.0_25 (23.7-b01)
      Current KB: 7163
      Patch Level: 10.145914
      Working Directory: /home/david/Desktop/researchcyc-4.0o/server/cyc/run/.
      Running on: ubuntu
      OS: Linux 3.11.0-13-generic (amd64)

      and after another five minutes, you should see:

      HTTP server listening on port 3602.  Connect via URL http://ubuntu:3602/cgi-bin/cg?cb-start
      
      SPARQL server started on port 3615.
      Jetty server started on port 3603
      Ready for services.
      Total memory allocated to VM: 5791MB.
      Memory currently used: 1651MB.
      Memory currently available: 4140MB.
      CYC(1):
    • Once a Cyc: prompt appears, the webserver is ready. (But its public DNS might not be distributed yet, so if you’re impatient you might want to use the public IP number shown in the EC2 console.)
    • Verify that it’s accessible by using a local browser to visit http://[your ec2 instance’s public dns, or localhost if using a browser in the vm]:3602/cgi-bin/cg?cb-start
  7. If you’re going to edit any rcyc content, and you probably will eventually want to, you’ll need to create an account other than the default Guest account to do so.
    1. Check researchcyc-4.0o/server/cyc/run/init/release-specific-init.lisp to make sure the following is NOT present there.
      (noting-progress "Enabling password authentication"
         (csetq *image-requires-authentication?* T))

      If it’s present, delete it and restart the RCyc server. One way to stop the server is to enter (exit) (a SubL command) at the CYC: prompt; another way is to reboot the Ubuntu server.

    2. On the start page, there’s a textbox for entering a userid to change which user is logged in. Enter “CycAdministrator” and Submit.
    3. You should be on a new page (actually just a new frame in the lower pane) that offers a button to return to the (“now stale”) login page. Click it, enter the new userid you want, and Submit.
    4. You should be in a new frame that says “Unknown Cyclist…Do you want to create a new Cyc constant with this name?”. Click “Yes, Create Cyclist”.
          Note: On at least one occasion, the RCyc server quit after this command, returning no content to my browser and showing this in the terminal:
      CYC(1): ./bin/cyc-runner.sh: line 298:  2555 Killed                  java ${BIT_FLAG} ${SERVER_FLAG} -Xms${MIN_HEAP} -Xmx${MAX_HEAP} ${OLD_SORT_FLAG} ${CODE_CACHE_FLAG} ${PERM_SIZE_FLAG} ${EA_FLAG} ${CM_FLAG} ${PGC_FLAG} ${FAST_OPTS_FLAG} ${AGENT_LIB_FLAG} ${EXTRA_OPTIONS} ${CYC_JAVA_OPTIONS} ${LOG_FLAG} ${ASSERTS_FLAG} -cp "${CLASSPATH}" ${MAIN_CLASS} -f "${INIT_FORM}" "$@"
      Shutting down Derby which provides the SCG repository ....
      ./bin/cyc-runner.sh: line 304: /db/bin/stopNetworkServer: No such file or directory
      ... see  for log output.
  8. If the test allowed you to reach RCyc’s main webpage, then the easier way to launch the rcyc server in the future is:
    • If using EC2, the following will auto-launch rcyc whenever you Start the instance after having Stopped it (because stopping an instance when you don’t need it saves on AWS usage fees). Create /etc/init/run-cyc.conf with this content,
      start on runlevel [2345]
      stop on runlevel [!2345]
      respawn
      script
           cd /home/ubuntu/researchcyc-4.0o/server/cyc/run/
           exec ./bin/run-cyc.sh -b
      end script

      and then do initctl start run-cyc

    • Otherwise, when running in the local vm… Instead of, ./bin/run-cyc.sh do this: setsid ./bin/run-cyc.sh -b “setsid” runs rcyc in a different Linux session than your terminal, so you can exit your terminal and the rcyc server will still run. The “-b” flag is Cyc’s own flag for telling it to run in the background.
  9. You might now want to configure the rcyc server to require a password.
  10. Start exploring in the web UI. For example,
    1. Type “hear” or “perceive” into the search box.
    2. Select #$hearsThat or #$perceivesThat from the autocomplete dropdown
    3. Scroll down the left pane to select Consequent or Antecedent. The right pane will show rules in which the predicate you selected is used in either an antecedent or consequent.

Many thanks to the ResearchCyc team in helping me get this far!

Cycorp offers guidance about how to explore cyc through this web interface.

Hopefully, someday they’ll offer rcyc as an Amazon Machine Instance (AMI), which would make these instructions much shorter!

Exporting an svn repository to git format

To have good scientific results, it must be possible for other researchers to replicate what one’s found (or maybe it was a fluke). Along these lines, psychology articles have long included very detailed sections on methodology that should allow other researchers to recreate similar experimental settings. For projects that involve custom-made software tools, this should be even easier, because the researchers can share the exact code they used on a public repository. Yet this is almost never done, even for pure computer science research.

When I became Principal Investigator of the Computational Social Cognition program at A*STAR in 2008, we created and maintained all our custom software using an in-house Subversion (“svn”) repository. The policy was and still is that one can pursue open-sourcing of a project if it turns out not to have any commercial potential; it’s been my aim since 2008 to open-source as many of our projects as we can, and try to start a movement in cognitive science research to share in this way. Other researchers share this view.

My new site intentionperception.org is the next step. In addition to all the non-code resources it houses, I want it to link to source code from all researchers in this topic who are willing to share, and I want to put them all on a project-hosting site that will be around a long time. I considered SourceForge, GitHub, CodeHaus, and GoogleCode, and decided that Git’s local repos offer many advantages over svn, and that GitHub’s issue tracker and community are more mature and active than others.

So how to convert our svn repo project to git format? The svn2git tool is nearly ideal, but I had to make some adjustments:

  1. Although the readme indicates the following should be sufficient for installation,

    $ sudo apt-get install git-core git-svn ruby rubygems

    $ sudo gem install svn2git

    I found that some of these packages failed to install, and prevented those that followed them from installing, and did so silently. So I recommend doing sudo apt-get install on each item in the first line separately.

  2. The readme also indicates that one can indicate a particular project in the svn repo to convert without having to checkout and convert the entire repo. But I ran into errors such as:

    [...] is not a complete URL and a separate URL is not specified.

    and

    pathspec 'master' did not match any file(s) known to git

    So I did a complete checkout and conversion of all projects in the svn repo, but I will checkin only some of them to github. Deleting the .git folder in the current directory and then retrying helped. Our root is the trunk, and we have no branches; none of the tags is worth keeping, either. So the command that worked for me was:

    $ svn2git -v http://mydomain/svn/repoName/ --trunk / --nobranches --notags --username myusername

  3. But actually that wasn’t enough, either, because svn2git seems not to send KeepAlive’s to the server and I was timing out with this error:

    RA layer request failed: REPORT of '/svn/repoName/!svn/vcc/default': Could not read response body: connection timed out (http://mydomain) at /usr/share/perl5/Git/SVN/Ra.pm line 282

    One of the unexpected benefits of exporting from an in-house repo is that I could file a ticket requesting that the server timeout be temporarily increased. If I were converting a SourceForge repo, I might have to wait until svn2git is fixed instead.

  4. Now you should be ready to import.

btw, doing all this with Ubuntu 13 in VMWare Player 6 on Windows 7 64-bit works fine.

How to monitor packets to/from Amazon Beanstalk / EC2 instances

If you need to answer such questions as

  • Is my client correctly indicating that it can handle gzip-compressed responses?
  • Is my Beanstalk-hosted webapp providing gzip-compressed responses?

then you may have no choice but to monitor packet traffic on the Beanstalk side, especially if your client is running on a mobile OS.

How can one do this? Let’s take the most difficult case and assume the only machine you have for development work runs Windows:

  1. If you didn’t associate a keypair .pem file when launching your Beanstalk environment, then create a new environment that does have such an association.
  2. On your Windows client machine, install VMWare Player 6+ and download an ISO for Ubuntu 13+
  3. Using Player, create a virtual Ubuntu desktop. You’ll also want to install VMWare Tools, which can be tricky: http://www.howtogeek.com/howto/11287/how-to-run-ubuntu-in-windows-7-with-vmware-player/
  4. In Player, go to Player | Manage | Virtual Machine Settings | Options | Shared Folders | set to “Always enabled” and add the folder containing your keypair pem file
  5. In Ubuntu’s dock, select Ubuntu Software Manager, type Wireshark into the search box, and install it
  6. In Ubuntu, open a terminal (e.g. via Ctrl + Alt + T)
  7. You should see the folder containing your pem file if you do

    ls -l /mnt/hgfs

    Sometimes even after you’ve installed VMWareTools, it still fails to access the shared folder. In this case, I’ve tried reinstalling this way:

    sudo ./Desktop/vmware-tools-distrib/vmware-install.pl

    Let’s assume it worked and the full path is /mnt/hgfs/Projects/bs-david.pem

  8. Look in your EC2 console for the Public DNS of your instance. Let’s assume it’s ec2-99-99-99-99.ap-southeast-1.compute.amazonaws.com
    1. If you aren’t sure which ec2 instance your beanstalk instance is running on, go to console.aws.amazon.com | Service = ElasticBeanstalk | Application = your app | Configuration / Edit button | Instances / gear icon | Custom AMI ID. Note down this ID.
    2. Go to Service = EC2 | Running Instances, and find the ID under the AMI ID column. Click in the Name cell of this row.
    3. The Public DNS of this instance will be listed under the Description tab (and immediately above the tabs).
  9. First, verify that ssh can connect. In the Ubuntu terminal, enter

    ssh -t -i /mnt/hgfs/Projects/bs-david.pem ec2-user@ec2-99-99-99-99.ap-southeast-1.compute.amazonaws.com

    If that doesn’t work, it might be that your network blocks the port your SSH is trying to use. I have to switch from my office’s Ethernet to its wifi.

  10. If you succeeded, the prompt should now be [ec2-user@ip-99-99-99-99 ~]
  11. At the ec2 prompt, enter

    sudo tcpdump -i eth0 -s 65535 -w test.pcap

  12. If you succeeded, the terminal should show tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes and then suspend to capture the packet traffic
  13. Using your client, send some of the requests you want to investigate. When you’re done, do Ctrl + C in the terminal
  14. Type

    ls -l test.pcap

    to ensure the file exists and is nonempty

  15. Move the pcap file from EC2 to the Ubuntu desktop by logging out of the ec2 instance and then using scp to fetch the file

    exit
    scp -i /mnt/hgfs/Projects/bs-david.pem ec2-user@ec2-99-99-99-99.ap-southeast-1.compute.amazonaws.com:test.pcap .

    Take note of the dot at the end, indicating you want to save the remote file to the current local directory.

  16. If you succeeded, you should see running updates like this: test.pcap 100% 16MB 203.8KB/s 01:18
  17. Verify that the file was transferred fully by typing

    ls -l test.pcap

    The file size should be the same as that reported by ls when you ran it on the ec2 instance.

  18. Load the pcap file into wireshark by typing

    wireshark -r test.pcap &

  19. In wireshark near the right end of the toolbar, tap the “Edit/apply display filter…” button. In the dialog that pops up, scroll down in the Display Filter list, select “http”, and click OK.
  20. Back in the main wireshark window, the second pane from the top should now show a turnkey for Hypertext Transfer Protocol.
  21. In the top pane, select the row indicating the request or response you’re interested in. Then in the second pane, open the http turnkey.
  22. For the particular scenario of checking that gzip-compression is happening, the client GETs should include “gzip” without quotes among the values of Accept-Encoding, and the server responses must have http response code 200, the Content-Type must not be text/plain, and the body shown at the right end of the third pane should not be comprehensible.

StackOverflow describes an alternative method that works even when you don’t have access to the server, if you can direct the mobile client to use your workstation as a wifi access point.

How to make incoming calls ring longer for Singtel mobile plans

If you have a mobile plan with Singtel, the default ring time for incoming calls is 15 seconds. You can adjust that yourself up to 30 seconds, by doing this:

  1. Go to your mobile phone’s dialer pad
  2. Enter **61*1389**30# and then press the Call button. (Change ’30’ to whatever number of seconds you want the rings to last, up to 30.)
  3. No call will be made but you should see a text confirmation.

Some bloggers indicate that one can go up to 60 seconds if one disables voicemail. (If you are a business, please don’t disable your voicemail. Providing your hours of operation in your voice greeting is very helpful.)

Installing Windows7 on a 12″ Motion Computing M1400 TabletPC

My job involves reading and annotating a lot of PDFs, and doing so on a tablet seems much more comfortable than using a laptop. The tablet would have to support handwriting recognition, of course, because a laptop’s keyboard would be far superior to a virtual keyboard at an odd angle right at my pelvis. As of Fall 2011, there is no native handwriting rec ability in either Apple iOS nor Google Android, and apparently none in their app stores, either. Besides, their largest displays are 10 inches (diagonally), while an 8.5×11 inch page ideally needs a 14 inch display. Also, my favorite PDF annotation tool, Tracker Software’s PDF XChange PRO, isn’t available on those platforms. Fortunately, there are tablets dating from the early 2000s that have 12 inch displays, native handwriting rec support, and designed to run the OS that my PDF tool requires (i.e. Microsoft Windows). As an added bonus, those tablets can now be bought for around US$100 even though they sold for thousands when first released.

I’ll focus here on Motion Computing’s M1400 tablet. I found one for $99 on eBay with its original digitizer pen and Windows XP installed. One or two owners before me had experimented with it, because there was no apparent way to open the Input Panel to use handwriting rec, the device buttons didnt work (except the power button), and the virtual keyboard would open only if I booted with an external keyboard attached and entered Ctrl+U during boot to activate the Utility Manager and turn on the virtual keyboard. A further problem with the hardware itself was that it got quite hot while in use, too hot to keep in one’s lap.

Installing Windows7 (the lowly Home Premium variant is sufficient) solved all of the software problems, and the device no longer gets more than mildly warm.

One new problem is that the display cannot be rotated to portrait orientation, because Intel refuses to release an updated video driver for this CPU chip for Windows7. However, some people have had success running the video driver in XP Compatibility mode, which requires installing the Professional or Ultimate variants of Windows7. I recommend installing Comodo Time Machine after the OS install and before any changes to drivers or registry settings, so you don’t have to reinstall the OS in case of an error.

By the way, I had never heard of Motion Computing before buying this device, but I am very impressed by them. They provided free tech support for a product they hadn’t sold in 5 years, it was very quick, and very thorough. If they offer a 12+ inch Windows8 tablet, especially if it has a color eInk/LCD dual display, it’ll be the first one I consider buying.

If landscape-only orientation is okay with you (and possibly no audio), here’s what you need:

  • Motion Computing M1400 tablet
  • Windows7 Home Premium installer software
  • external usb-connected keyboard with arrow keys
  • CD/DVD external usb-connected drive (or 4GB usb drive + laptop/desktop with CD/DVD drive)
  • Ethernet cable to your router

Follow these steps:

  1. The following procedure will NOT leave your files on the tablet in place. Many of them will be moved to c:windows.old, but you should probably make a backup copy of any that you would hate to lose.
  2. If you use PDF XChange 4.0 (but not the portable version), and if you have a session that you want to remember (i.e. a set of open PDFs), then be sure to note down the filepath of each (in addition to backing up the file itself in step 1 above). Although this PDF viewer is great, they don’t yet support session backup (nor syncing).
  3. If you don’t have an external CD/DVD drive, you’ll need to make the usb drive bootable. Don’t forget to copy all the installer software to it.
  4. Connect the external keyboard and the external drive.
  5. Make sure that the tablet can boot from the external drive:
    1. Reboot the tablet and the white Motion Computing screen appears, hold the digipen tip to the screen. A context menu should appear; select Launch System Setup.
    2. The PhoenixBIOS Setup Utility should open. Tap the Boot tab along the top.
    3. If you’ll be using an external CD/DVD drive, then “CD-ROM Drive” should be listed higher than “+HDD”. If it’s not, then look at the bottom and tap the white down arrow to the left of Select Item until CD-ROM Drive is highlighted in white, then tap the + to the left of Change Values to move it up the list.
    4. If you’ll be using a bootable usb external drive, then tap the white down arrow until +HDD is highlighted in white. Tap the white Enter to the left of “Select > Sub-Menu”. A submenu should open below HDD, and the external usb drive should be listed above the hard drive; if it’s not, tap the white down arrow until the usb drive is highlighted in white, then tap the + to move the usb drive higher in the submenu.
    5. When finished, tap the white F10 at bottom right to save and exit. A yes/no confirmation dialog will appear. Click yes. A beep will sound and you should exit to the Windows bootup process on your external drive.
  6. At this point, the digipen was no longer recognized for me, but the arrow keys on the external keyboard did.
  7. When you are prompted whether to enable WindowsUpdate, I recommend not doing so. It’s a great feature, but when I did this procedure the first time and turned this feature on, the installation of Windows7 SP1 introduced a bug that made several applications give errors about not having the right permissions, or not having enough space on disk, and I could not uninstall some apps like Comodo Time Machine, either.
  8. During the install process your machine will reboot, and if you booted from a usb drive, you’re going to boot from it again (unless you’re quick and yank it). If you do boot back into the install dialog again, just move the power switch to off, yank the usb drive, and power on. Installation will pickup from where it should.
  9. Once Windows7 is installed, use the Ethernet cable to connect to your router so you can get drivers, especially a wifi driver. The external usb drive and keyboard can be disconnected.
  10. To get drivers, go to Start | ControlPanel | DeviceManager, hold down the pen button while clicking on Display Adapters (i.e., do a right-click), and select “Scan for hardware changes”. This should trigger a search on Microsoft/manufacturer sites for all the drivers you need, not just display adapters. If you turned off WindowsUpdate during install, you’ll get no drivers at this point and should click the Change Setting button, then the Yes radiobutton. You should get drivers for:
    • Intel PRO Wireless 2200BG Network
    • Motion Computing Tablet PC Buttons
    • AuthenTec AES2501 (fingerprint sensor)

    Even with this, I could not get drivers for:

    • Video controller
    • Multimedia Audio Controller
    • PCI Modem
  11. You should now have wifi connection ability, so the Ethernet cable can be removed.
  12. By default, if one holds the digipen to the screen for a short while, it will be interpreted as a right-click. This is a problem if one uses a multi-level dropdown menu and wants to select anything other than the first item in one of its submenus. To disable this, go to Start | ControlPanel | Pen And Touch | Pen Options, highlight “Press and hold” in the list, and click Settings. Turn off the “Enable press and hold for right-clicking” checkbox.

Editing ECLiPSe constraint logic code in the Eclipse IDE

One of our projects uses the ECLiPSe constraint logic programming language (a more powerful flavor of Prolog) in conjunction with Java (which is a good way to integrate graphics, unit testing, etc). One of the most popular developer tools for Java is the Eclipse IDE (yes, same name, totally different application), and we’d like to find a similarly-powerful editor for our pure-Prolog (.pro) and ECLiPSe (.ecl) files — one that provides indicators of (im)balanced parens, coloring of built-in predicates and syntax, and maybe code completion and compiler warnings.

ECLiPSe had a project called Saros on sourceforge that aimed to provide all this functionality, plus a debugger. Unfortunately, I wasn’t able to get any of these features working in its 1.0 version (after installing it as a plugin in the IDE). I’ve also heard that development on Saros has stopped.

However, I would be happy just getting paren-balancing, and I found a way to do that in the Eclipse IDE:

  1. Go to Window | Preferences
  2. Select General | Content Types in the left pane, and then on the right, select Text in the Content Types list.
  3. In the File Associations list below that on the right side, you should see “*.txt (locked)”. Click the Add button and add “.ecl” when prompted for the Content Type. Do this again for .pl files (or .pro files, as I prefer, to distinguish from Perl files).
  4. Back in the left pane, select General | Editors | File Associations.
  5. On the right side, add “.ecl” and “.pl” (or “.pro”) if they aren’t already present.
  6. Select “*.ecl” in the upper list on the right. Then in the lower listbox, if “Standalone Structured Source Editor (default)” isn’t shown, use the Add button to select “Standalone Structured Source Editor” and then click the Default button.
  7. Select “*.pl” (or “*.pro”) in the upper list, and make sure it has the same default editor.
  8. Click OK.

Now, if you open an .ecl or .pro file using either the Java or Debug perspective, and you place the text cursor after a ) or ] that is balanced, you will see a blue rectangle around the corresponding ( or [. There is no indicator if the ) or ] is not balanced.

Note that if you quit the IDE and launch it again, you will get a warning dialog about “Unsupported content type in editor.” I believe that’s because we selected “Text” in the Content Types preferences – there being no way to add a content type for ecl and pro files in particular – and that this resulted in our files being “locked” to the generic text editor in the File Associations preference. However, by marking the structured text editor as the default, we override that locking, and you’ll notice that if you dismiss the warning, paren-balancing still works. So, I selected the checkbox for “Do not show this message again” in the warning. Relaunching the IDE again gives me paren-balancing with no warning dialog.

UPDATE: Christian Wirth said on the eclipse-clp-users mailing list: “You have to install the Web Tools Project also.”

How to avoid unexpected backtracking in Prolog fail loops

A Prolog fail loop is a way of doing iteration in Prolog. For example, in this predicate (methods and functions are called predicates in Prolog because the method name is used in the predicate position of propositions used to write Prolog code), we iterate over all facts in the knowledge base that match the given ‘edge’ pattern:

buildEdges(MiddleNode, RightNode) :-
   LeftNode #< MiddleNode, %ECLiPSe feature: constrain LeftNode to be an int smaller than MiddleNode
   edge(LeftNode, MiddleNode),
   addEdge(LeftNode, RightNode),
   fail.

The reason that putting ‘fail’ at the end leads to iteration is that Prolog automatically searches for other ways of satisfying conditions, unless you tell it not to. So, if there is a fact matching the ‘edge’ pattern, and if ‘addEdge’ also succeeds, then when the interpreter reaches ‘fail’ it “backtracks” to the addEdge call to see if there were any unexplored ways of satisfying it (aka, it checks if there were any other “choicepoints”). If there were such unexplored options in addEdge, the interpreter tries the first one; if this succeeds, we return to ‘fail’; if the first one fails, it tries any others for addEdge. Once the options in addEdge are exhausted, the interpreter takes another step “upward” to see if there were any unexplored matches for the ‘edge’ pattern.

As you can see, forcing backtracking by putting ‘fail’ at the end of your conditions is one way of implementing iteration over a set of matching facts. But what you may not have expected, and what you probably don’t want, is for the interpreter to try calling ‘addEdge’ several times before looking for the next edge match. There are two standard ways of avoiding this:

  1. Put a condition at the start of all definitions of ‘addEdge’ that allow it to be entered only under the intended circumstances. (This isn’t a good match for the current example, because you can’t specify a condition that says “only call me once when iterating over edges”.)
  2. Put a cut (denoted with an exclamation point, !, in Prolog) at the end of all definitions of addEdge. A cut tells the interpreter to forget about any other choicepoints for this call of this predicate. Other tutorials about fail loops neglect to emphasize that the cut must be put at the end, because otherwise backtracking from the fail loop will explore any choicepoints remaining after the cut, even ones in your definitions of ‘addEdge’.

Side note: One can replace ‘fail’ with a condition to get do-while behavior instead of exhaustive iteration. And if one wanted to isolate a block of code for iteration (say, you want to avoid repeating the #< step), one could put “repeat,” at the start of the block, and then backtracking would never go above that point.

Why Palm’s webOS is the future of Android (and desktop computing)

Do you connect these dots in the same way I do?

  1. The current practice in OSs and browsers of asking the user at install time whether to proceed with the install, as a way of avoiding security threats, just doesn’t work. Users do not have the right kind of information at that time to decide.
  2. The threat of compromised systems and data loss is severe enough that consumer and enterprise OSs will have to be designed in a different way to manage installation risks. The widespread acceptance of smartphone apps indicates that smartphones will need such protection, too.
  3. Google’s NativeClient project is a good way of handling the risk because it provides a sandbox, and it’s better than alternatives like Java and Flash because it allows apps to run faster (because the apps are compiled natively rather than into bytecode).
  4. Palm’s webOS for its new smartphones has a very similar design to NativeClient (and since NativeClient is open source, could be built on top of it, for all I know). Specifically, webOS’ plugin development kit (PDK) will allow allow apps written in C and C++, two languages which by themselves allow altering memory contents almost anywhere in RAM and thus open to abuse by malicious app coders, but the PDK will sandbox apps, apparently in much the same way that NativeClient does. WebOS’ other interface, the Mojo SDK, allows apps written in Javascript to access data on the phone in much the same way that NativeClient’s browser plugin design would allow.
  5. Thus, webOS seems to provide a glimpse into what smartphone and desktop OSs will be like in coming years, if they deal with security threats in the inspired way detailed in the NativeClient design.

And there’s another force pushing Google’s Android smartphone OS in the same direction as webOS:

  1. Google always seems to prefer keeping its apps as platform-agnostic as it can by leveraging browsers when it can. The exceptions are Google Earth, GTalk, etc which must be installed either for performance reasons or to gain access to “hooks” in the OS that browsers can’t offer.
  2. Google’s apps for Android are Java-based (i.e., not browser-based) for apparently no strong reason. In fact, it seems that if Google had had Palm’s insights about how a web-oriented OS could be made back when Android was being designed, then Android would be very much like webOS so that Google wouldn’t have to split its app-building competence and resources across so many platforms (of course, the iPhone and Blackberry platforms would still make their own demands). Google’s efforts to build ChromeOS is another strong bit of evidence of its desire that there be fewer platforms and that they resemble browsers more.
  3. Eric Schmidt has said that Android and ChromeOS will eventually merge. I’m not sure if he came to this conclusion before or after learning about the design of Palm’s webOS, but webOS seems like a good hint of what such a merge would result in.

Am I pulling too hard on thin threads, or does this paint the same strong picture for you that Palm’s webOS really is a glimpse of the future? It sure is a fun way for me to stretch my thinking about what smartphones can do and be.

If this is an accurate prediction, then two consequences come to mind:

  1. The current unspoken practice of web engineers looking into the Javascript source of their competitors, learning new tricks, and helping the craft of web engineering to improve will suffer because companies will want to shift their presentation and business logic out of Javascript and into compiled native code for greater performance and out of a misguided attempt to protect their intellectual property.
  2. Having Google compete in the same idea space will help inspire both toward even better ideas. Of course, Google won’t buy Palm (why would it need to?), and it’s unlikely that having similar platform designs will affect the market share of either of them. As long as Palm can capture a significant share of the growing global demand for smartphones, it should be able to survive. And it’s likely to always have an advantage over Android in the beauty of its UI, given the DNA of the two companies.

UPDATE: Google released an “NDK” for Android way back in June 2009, which sounds like webOS’ planned PDK and also sounds like it was built on NativeClient. So, my prediction above that webOS is the future of Android has things a bit turned around.

Also, although the NDK seems to have a very similar design to NativeClient, and might have been built on NaCl, I’m somewhat doubtful because NaCl relies heavily on a feature known as “segmented memory” in the 386 chip architecture, and I wonder if that same feature is present in mobile CPUs such as ARM.

UPDATE: Other devs are worried that we might lose the ability to view html source and thus lose one of the primary learning and innovation paths for web app devs.

    How to enable TestNG launch configurations in Eclipse IDE (Windows)

    When using TestNG 5.11 (and at least one earlier version, 5.9) with the Eclipse IDE 3.4.2 (Ganymede, for Windows), one can’t setup a Run configuration for TestNG in the usual way. That is, one can’t use Project | Properties | Run/Debug because only Java App and Java Applet options are presented there. (Of course, one has to install the TestNG plugin first for this to make any sense.) Instead, here’s a workaround gleaned from a post by Ajay Mehra:

    1. Make sure you haven’t hidden any launch configuration types
      1. Go to the top menu bar and select Window | Preferences.
      2. In the left pane, select Run/Debug | Launching | Launch Configurations. On the right side, make sure that Java Application and TestNG are shown and not checked. (You may have to check ‘Filter checked launch types’ temporarily in order to scroll or uncheck some items.)
    2. Make sure the class(es) you want to test have @Test annotations in them
      1. If you want to see an example, look at the SimpleTest class definition near the top of the TestNG homepage.
      2. Note that in SimpleTest, it’s not necessary to include an @BeforeClass annotation anywhere, nor is it necessary to include “(groups …)” after the @Test annotations. And rather than importing all of “org.testng.annotations.*”, you may be able to get away with just importing “org.testng.annotations.Test”.
      3. To support the @Test annotation, the IDE will want to add the testng-jdkNN.jar (where NN is 15 if you’re using JDK 1.5) to your project’s classpath.
      4. Any methods you want to be used as tests should be marked as public, so TestNG can invoke them.
        • If you don’t make any test methods public, then when you run your tests, the TestNG tab near the console tab will show that zero tests were run.
    3. Create a launch configuration for your test
      1. Go to the top menu bar and click the down-facing black triangle to the right of the Run button (a green circle with a white triangle in it). This should trigger a dropdown menu that includes “Run Configurations…” Select it.
      2. Select TestNG in the left pane, then click the New button in the upper left (the white rectangle with a yellow plus in the upper right). Enter a name for the new launch config; if your test will use just one class of test methods, that class name would probably be a good choice as a memory aid.
      3. Under the Test tab, browse to the “Project” of the test, and then select a “Run…” target. (For example, if you’re trying things out with SimpleTest, you should have created a new empty project, pasted SimpleTest.java into it, and now use the Browse button for Class to select SimpleTest.java.)
      4. If you need to provide any arguments to the JVM before the test is launched, do so under the Arguments tab.
      5. To save your edits, click Apply. When you’re done editing, click Close (or you could execute the test by clicking Run).
    4. Verify that the test is setup correctly
      1. Run the test by selecting the black triangle again near the green Run button, and then selecting the launch config you just named.
      2. The Console tab should show
        [Parser] Running:
          pathToYourProjectProjFoldertemp-testng-customsuite.xml

        The name of the xml file shown here is what TestNG generates if your launch config doesn’t use the “Suite” option and you didn’t provide your own xml file.

        Following that will be any System.out printing your test methods did, plus

        PASSED: testMethodName

        for any of your test methods that passed.

        Finally, there will be a summary report like this

        ===============================================
            SimpleTest
            Tests run: 2, Failures: 0, Skips: 0
        ===============================================
      3. A similar, more graphical view of the summary report should be available under the TestNG tab.
      4. If the report says “Tests run: 0”, double-check that your @Test annotations are on the right methods, and that those methods are public.

    Glossary and notes for Len Talmy’s work on cognitive semantics

    I’m starting to read Talmy’s work on folk concepts of space and causality, and I find that I need to keep a glossary of his specialist terms. Maybe this will be helpful to other readers of Talmy, too.

    As a quick introduction, you might want to read the Wikipedia page on Force dynamics.

    [Why do we think Talmy’s work might be useful to us? We are looking for folk concepts of space, time, causality, and intention that we can formalize and use in a computer simulation of how people attribute causality and intentionality to figures in simple animations. Talmy’s work might provide articulations of the folk concepts we are after. A primary challenge for us is to identify concepts of interest to us (i.e., those that trigger expectations or that are necessary to support explanations) , because most of the concepts that Talmy identifies are powerful generalizations of distinctions made in language but which have little apparent causative power that shapes our thinking. For example, the distinction between moving-to and moving-from seems to have little effect on our expectations of what the moving object will do next, while the distinction between contact and attachment clearly affects our expectations of how two objects will move if rotated, say, around their common center of gravity.]

    All page references refer to his book, Toward a cognitive semantics, volume 1.

    Glossary (sorted in order of appearance, not alphabetically)

    • veridical – appearing to be true (100c)
    • factive – When two representations of the same thing are contradictory, the one that appears more true is called “factive” (100d)
    • fictive – When two representations of the same thing are contradictory, the one that appears less true is called “fictive” (100d)
    • fictivity – there exist multiple conflicting representations of the same thing, some of which seem more true than others
    • see vs sense – When two percepts of the same thing are contradictory, and one is less palpable and thus more fictive, Talmy calls the perception of the factive one “seeing” and the perception of the fictive one “sensing”. (102a). For example, a static Pac-Man quasi-circle shape is “seen” while the dynamic alternative of a circle having a wedge cut from it is “sensed”.
    • ception – A continuous conceptual space whose dimensions are all related to palpability (aka, the ability to recognize or act on something). (102b)  Intended as a replacement for arbitrary pigeon-holing of phenomena as one of sensation, perception, or conception. (139d)
    • constructional vs experienced fictive motion – “Languages systematically and extensively refer to stationary circumstances with forms and constructions whose basic reference is to motion;” however, there are “differences over the degree to which such expressions evoke an actual sense or conceptualization of motion [in their speakers].” (104c)  While some speakers would report a strong sense of movement for a construction that other speakers would report feeling no such sense, there are some constructions that evoke a sense of motion in almost all speakers.
    • active-determinative principle – For “some” [119b] emanation types of motion, the source role will usually be attributed to the more active or determinative candidate objects. For example, in a radiation path between the Sun and one’s hand, the Sun is perceived as the brighter of the two, and thus the more active, and thus given the role of source. “This principle accounts for the absence of any linguistic formulations that depict the sun as drawing energy from objects.” (117c) “One’s experience of the characteristics of agency may provide one with the model for the active-determinative principle” (119d)
    • extramission – “the notion that sight involves something emerging from the eyes” (124b) “The conceptual model in which the Agent emits a sensory Probe appears to hold sway in the cartoon imagery [of Superman’s X-ray vision].” (125b) Similarly, the expression “to look daggers at” or “the evil eye”.

    Notes

    1. When fictivity is present, the representations often differ in a single dimension. (100e)
      • State of occurrence – whether something is present or absent
      • State of change – whether something changed or was in stasis
        • State of motion – whether something moved or not (“stationariness”)
    2. There is a general cognitive bias towards dynamism; i.e., things appear to move when they are in fact still, rather than things appearing to remain still when they in fact have moved. (101b)
      • For example, an utterance and a belief might be contradictory, and where greater credence is given to the belief, and the utterance indicates movement while the belief indicates stationariness: “That mountain range goes from Canada to Mexico.”
    3. “Fictive motion in language encompasses a number of relatively distinct categories” (103c), including:
      1. Emanation - “The fictive motion of something intangible emerging from a source.” (105d) “In most subtypes, the entity continues along its emanation path and terminates by impinging on some distal object.” Note the reliance on distal objects in all the examples below.
        1. Orientation paths – “A continuous linear intangible entity emerging from the front of some object and moving steadily away from it.” E.g., “She crossed in front of the TV.”
          1. Prospect paths – e.g., English verbs “face” and “look out”
          2. Alignment paths – e.g., English verb “lie” with path prepositions “toward” or “away from”
          3. Demonstrative paths – e.g., English verb “point” with path prepositions “toward” or “away from”
          4. Targeting paths – An agent aims an object that has a front so that the front follows a desired path “relative to the object’s surroundings” (109d)
          5. Line of sight – E.g., English verbs “look” and “turn” with path prepositions “toward” or “away from”
        2. Radiation paths – (skipped pp. 111-116)
        3. Shadow paths
        4. Sensory paths
      2. Pattern paths – (skipped pp. 129-138)
      3. Frame-relative motion
      4. Advent paths
        1. Site manifestation
        2. Site arrival
      5. Access paths
      6. Coextension paths (e.g., see mountain range example above) –
        1. Talmy83: Virtual motion
        2. Jackendoff83: Extension
        3. Langacker87: Abstract motion
        4. Matsumoto96: Subjective motion
    4. “Palmer (1980) and Palmer and Bucher (1981) found that in certain arrays consisting of co-oriented equilateral triangles, subjects perceive all the triangles at once pointing by turns in the direction of one or another of their common vertices. Moving the array in the direction of one of the common vertices biases the perception of the pointing to be in the direction of that vertex.” (123b)
    5. Anthropologist Pascal Boyer’s study of “ghost physics” (1994) – Belief systems characteristically permit some exceptions to normal physics, such as invisibility or passing through walls, but not other (barely!) conceivable exceptions such as “reverse causality”.
    6. The semi-abstract level of palpability (146)
      1. Sensing of object structure, e.g. envelope/interior similarity across magnitudes of volcano and thimble
      2. Sensing of path structure, e.g., similarity regardless of shape of “across” when a deer runs straight across a field or zig-zags across it
      3. Sensing of reference frames: earth-based, object-based, or viewer-based
      4. Sensing of structural history and future (object is stationary), e.g. a broken flower pot
      5. Sensing of projected paths (object is moving), e.g. a thrown ball currently arcing through the air, or a path through a crowded restaurant
      6. Sensing of force dynamics, e.g. perceived forces among objects thought to naturally be in motion or at rest. Jepson and Richards (93)  found a sideways T is thought to have its two parts “attached” while in an upside-down T, the two parts are perceived merely to be in “contact”. [See Siskind’s AI work on attributing support vs attachment.]
    7. (skipped pp. 154-172, which is the rest of the chapter on Fictive Motion in Language and “Ception”)
    8. Motion-aspect formulas – e.g., Be at, Move to, …, Move from-along (215-6, 245-52)
    9. (skipped to 409)
    10. Force dynamics (to be continued)

    Porting Amzi prolog to ECLiPSe

    While Amzi Prolog has the best debugger I’ve seen for any flavor of Prolog (and I’ve evaluated many flavors), it’s become clear that my project needs the ability to constrain variables before committing to particular values. This is the key feature difference that leads me to want to port to ECLiPSe, a flavor of Prolog where constraint propagation is the main focus. And there are other features of ECLiPSe that are very appealing:

    • Easily embedded in a Java application (via a jar that implements a JNI bridge to the native ECLiPSe executable) that allows for loose coupling through queues-and-listeners or even asynchronous queues.
    • Similar library support for manipulating terms, strings, lists, etc as other Prolog flavors
    • An Eclipse IDE plugin, Saros. (It’s not quite usable for me in its current stage but I hear a new version is imminent. And it has a Tk-based UI including debugger that’s pretty good.)

    Here are the changes I had to make to my Amzi code so it would run in ECLiPSe:

    • Changed all calls using consult/1 or debug_consult to [...comma-separated filenames...]
    • Changed ‘/’ when used in predicate names to ‘_’
    • Changed import(list) to import(lists)
    • Changed all abolish calls to retractall (although this doesn’t seem absolutely necessary)
    • Changed string_term(String,Term) to term_string(Term,String)
    • Changed stringlist_concat calls to concat_string
    • Discovered that many of my variables were singletons, and changed them to start with ‘_’ (so they would be self-documenting singletons and not trigger a compiler warning)
    • Added dynamic declarations for all my dynamic predicates (Amzi probably doesn’t provide a warning when these are asserted or retracted without having been so declared)
    • Changed my (retractall(predicate(_,...)) ; true) pattern to just retractall(predicate(_,...)), since retractall never fails in ECLiPSe

    I’m grateful to Dennis Merrit and Chip Eastman of Amzi for all their help in getting my initial Amzi app working, and for their excellent documentation of Prolog and Building Expert Systems.

    Evaluating animation toolkits for ‘perception of intentionality’ simulations

    Our team needs to create 2D animations that trigger the ‘perception of causality’ or the ‘perception of intentionality’ through the movement of simple shapes. (Jointed figures with faces and props can come later.) The prototypical example of such animation is the one Heider and Simmel used in their experiments in 1944, since then transcribed into Flash.

    I’ve been evaluating animation toolkits with a few objectives in mind:

    1. Creating such 2D animations must be as easy as possible
    2. It must be possible to inspect such an animation programmatically to determine where each shape is in each frame. (We are creating a simulation that will “watch” the same animation, but instead of observing pixels it will read such data.)
    3. The animations must be easily distributed, such as being able to run in most browsers on most platforms using no plugin or only a commonly installed one

    Here are my evaluations.

    Adobe CS4 Flash Professional

    PROs

    • Flash files (SWF file format) are playable on most platforms using the Flash Player browser plugin, which most people already have installed
    • There are many online tutorials about how to create animation using this toolkit, and expertise with the tool is widespread (so it would be easy to find help or hire someone)
    • The “motion tween” feature available in the CS4 version eliminates the need to copy/paste/tweak each frame into the next frame; instead, one just drags from the starting position to the ending position, and can add arbitrary curvature at many points in between by pulling on edit points.
    • Although it’s a binary file format, the SWF format has been documented by Adobe, and there is an open-source Java library, JSwiff, that provides handy wrapper accessors. It even has a forum for questions about JSwiff, but answers seem infrequent.
      • This library is a little out of date, since it refuses to process files using SWF versions after 8, but the code seems capable of handling version 10 just fine (version 10 is what CS4 Flash Pro generates). To use the library with more recent SWF versions, it seems one has to edit the source (included in the download) by changing SWFDocument.setVersion to eliminate the max version check, and then build one’s own jar file. This method refers to private member “header”, so one can’t just subclass SWFDocument and override the method.
      • The JSwiff site also offers a separate download for its inspector.bat, which provides a GUI for inspecting all tag content of a SWF. Surprisingly, it works on SWF version 10; I’m not sure how it gets around the version check.
      • Note: There is another Java library, JavaSWF, but work on it seems to have stopped around 2005 and it doesn’t handle recent updates to the SWF file format such as the DefineShape4 tag. There is a JavaSWF Yahoo group, but it seems answers are rarely provided for any questions in recent years.

    CONs

    • Adobe’s tool costs US$700.
    • There is an option to Export Motion XML, which seems like a good alternative to the Java wrapper, but despite multiple attempts I couldn’t get it to include information about each frame of my test animation.

    Alternative SWF toolkits

    For example,

    AnimeStudio Debut6

    Swish MaxMini

    Synfig

    Toufee

    PROs

    • At US$50-150, much more affordable than Adobe’s toolkit

    CONs

    • Harder to use than Adobe’s toolkit because they require copy/paste/tweak of each frame of motion, and there is  no motion guide unless one sketches a path using the drawing tool and then erases that path.
    • Toufee’s min frame rate is 1 frame/sec, which is far too slow for my needs. It also has a number of bugs such as having random transition effects on by default, and making it impossible to configure an object to disappear faster than in 5 sec in the first frame
    • Although Synfig is an open-source SWF-creating toolkit, which would otherwise make it very attractive, it has such a convoluted install process for Windows that I’m not willing to put my time into evaluating it further. It seems likely that it could just break someday, and there would be too little interest in Windows users to expect them to fix it.

    Alternative web-based animation platforms

    For example,

    HTML5’s canvas element + Javascript

    Java

    Microsoft Silverlight

    PROs

    • Inexpensive (free) toolkits
    • Easily distributed and demo’d

    CONs

    • Would require investing significant time or hiring funds into programming animations largely from scratch
    • Would require significant effort to design a way that an outside application could inspect what objects are depicted and what is happening

    K-Sketch

    PROs

    • Free, open-source, and allows for easy creation of motion paths
    • Allows export to SWF

    CONs

    • Motion paths aren’t editable — one needs a very steady hand
    • Current release won’t run on my XP machine, and on my Vista machine the lasso selection tool doesn’t work, which means I can’t create any motion paths. Not actively supported right now, but that might change soon.

    Bottom line: Assuming one can afford Adobe’s Flash toolkit, the combination of it and the Java wrapper seem like a very workable solution.

    Tip: If you need to control the playback of an SWF using Java, it seems the best option is a hack where a native SWF player like XULRunner is controlled by Java by injecting Javascript. Have a look at the DJ Native Swing project. It’s hosted on Sourceforge and has discussion forums there.

    A font for eco-friendly printing

    So you already print double-sided or reuse single-sided prints? You can go even further in your quest for eco-friendly printing.

    A font has been developed that reduces the amount of toner used while minimizing loss of readability. The download page includes tips on how to install on a variety of platforms, and here’s a tip for installing a font in Windows 7. Note that after clicking the “Install” button for a font, there is no indication of success beyond the Install button becoming disabled — although you can open the Fonts control panel to verify success.

    Open a page in the default browser following a schedule

    The notification component of our wiki (plone) isn’t useful enough to bother with, so as a workaround I view its “Recent Changes” page on a regular basis. Well, as regularly as I can remember, which hasn’t been regular enough. So now I use Windows7’s Task Scheduler to open the page for me on a repeating schedule.

    This should also be useful for catching regular Internet radio programs.

    Steps:

    1. Start | All Programs | Admin Tools | Task Scheduler | Create Task (in far right pane)
    2. General tab
      • Enter a name you’ll recognize later
    3. Triggers tab | New…
      • Enter when the task should occur
    4. Actions tab | New…
      • Action = Start a Program
      • Program/script = rundll32
      • Arguments = url.dll,FileProtocolHandler http://yourFavoriteSiteHere.net
    5. Conditions tab
      • Start only if the computer is on AC power => off
      • Wake the computer to run this task => on
    6. Settings tab
      • Run as soon as possible after scheduled start is missed => on

    Portable Windows with/without admin privilege

    Update: SORRY! It turns out that VirtualBox ties its configuration very closely to specifics of its host machine (such as through the MAC address), so I have to back away from the “portable” claim in this post’s title. Until VirtualBox realizes the value to them of supporting portability, using VirtualBox won’t actually help you go portable.

    Previously, I described how to get some freedom from the “no admin privilege” restriction that many workplace computers have. That technique involved running specially-built versions of one’s applications from a thumbdrive. After trying that for a month or more, I grew tired of the following limitations:

    • Many useful apps aren’t built to run from a thumbdrive, often because they use the registry
    • Going back to Firefox from Chrome wasn’t a good fit for me (and Chrome isn’t really portable yet)
    • One can’t control default applications; for example, one can’t directly open links in messages in portable Thunderbird using one’s portable browser

    There is a framework called qemu that allows running an entire operating system from a thumbdrive, and it can do so in a mode that doesn’t require admin privilege. But this mode can’t access USB peripherals like keyboards, mice, and other drives connected to the host computer. The final deciding factor against qemu for me is that OSs make many writes to the storage they boot from, and thumbdrives allow only thousands of writes before their storage capability degrades.

    If you really want to try qemu before my final solution, here’s what I tried with it:

    1. Install Qemu manager on usb
    2. Set VM RAM to 512MB
    3. Give the VM whatever name you want, and whatever OS label you want…I used “Windows7Portable” for both
    4. Start VM
    5. In VM/Qemu Client, click CD Drive button in menu bar and browse to your Windows 7 RC ISO file
    6. Wait while OS installer gets going…I had to quit and go to bed before it finished

    Here are some related links that were helpful:

    Because qemu would not be able to use usb-connected keyboard and mouse, because I’d have to copy it to a new thumbdrive fairly often, and because it would be pretty slow by most accounts, I decided to give up on the “no admin privilege required” goal. Instead, I asked my IT dept to install VirtualBox. The following steps were inspired by a Lifehacker post.

    1. Get a portable hard drive. I got a Maxtor 160GB “Basics” model for about US$60.
    2. Plug in the drive to any machine you plan to use where you don’t have full control, then check that you can write to it. (For example, right-click in the drive contents shown in Windows Explorer and choose New | Text Document.) If you can’t write to it, ask your IT dept to grant you write permission for it.
    3. If you’re going to overwrite a virtual disk file you may already have on the portable drive, and you use Chrome, be sure to export your bookmarks and open tabs to the host first. (ouch)
    4. Have VirtualBox installed. Then copy its .VirtualBox directory to your portable hard drive.
    5. Start VirtualBox and go to File | Preferences | General. Change “Default Hard Disk Folder” and “Default Machine Folder” to .VirtualBoxHardDisks and .VirtualBoxMachines on your portable drive.
    6. In VirtualBox, press the New button.
    7. Click through the Next buttons, including leaving virtual memory at 512M.
    8. Click through more Next buttons, including the creation of a new boot (virtual) hard drive of size 20GB. This virtual drive will be kept in .VirtualBoxHardDisks
    9. Choose Dynamically-Sized storage
    10. Take a lunch break while the VDI is created. I also created a brief “reward if returned” text file with contact info at the root of the portable drive.
    11. Once your virtual drive is created, click the Settings button, then the CD/DVD tab and the Add button, and browse to the ISO file of the OS you want to run in the virtual machine. For me, this was Windows 7 RC and my “host” is Windows XP. Click Ok and return to the main VirtualBox window.
    12. Click the Start button. You should eventually see the OS installation prompts. Answer the prompts the way you would if you were installing to a machine sitting right next to your current one and on the same network.
    13. When the OS has been installed (it may need to reboot itself within the window a couple times, but won’t affect the rest of your computer), go to Devices and select Install Guest Additions. This will make it much easier to use your keyboard and mouse. (If you have difficulty getting to the menu, hit the Ctrl key on the right of your keyboard and then try again.)
    14. Once the OS is installed, you can move or delete the ISO file. If VirtualBox complains about not being able to find the ISO file, just use the button to remove it from the CD/DVD mounts.
    15. To get access to folders on the host machine:
      1. Turn on sharing for each of the folders you want to access.
      2. In your VM window, go to Devices | Shared Folders. The “Folder Path” browse button should allow you to select one of the folders you just shared
      3. To make the shared folder appear among the drives shown under Computer in your VM OS (“the guest machine”), go to the Start menu, type in “cmd”, and then type in “net use s: \vboxsrvMyHostFolder” where MyHostFolder is replaced by the name of a folder you shared, and where s: is replaced by a drive letter not already in use.
    16. You should now be able to copy all the data folders from your host to the guest.
    17. Go ahead and install all the apps you need on the guest. I have some recommendations:
      1. Do not install an antivirus program. They can make your guest unusable by turning its display to snow and ignoring your commands. This is the most significant sacrifice I’ve encountered so far.
      2. Use Mozy to backup your data. The first 2GB is free, and you won’t have to remember to backup again.
      3. Windows Explorer settings
        • Change default folder to Computer
          1. Right-click on Windows Explorer in taskbar
          2. When “Windows Explorer” appears in the “jump list”, right-click on it and select Properties
          3. For Target, enter
            %windir%explorer.exe ::{20D04FE0-3AEA-1069-A2D8-08002B30309D}
        • Organize | Folder and Search Options | View
          • Show hidden files, folders, and drives => checked
          • Hide extensions for known file types => NOT checked
          • click Apply to Folders and have it apply to all folders
      4. Control Panel | Programs | Turn Windows features on or off | Games => NOT checked
      5. Control Panel | Taskbar and Start Menu | Start Menu | Customize
        • Computer, Control Panel, Personal folder = > Display as menu
        • Documents, Games, Music, Pictures => Don’t display
        • Run => checked
        • System admin tools => Display in All Programs and the Start menu
        • Use large icons => NOT checked
      6. Pin to taskbar: Cmd shell, Calculator, CharMap
      7. Set built-in Windows Defender (anti malware) to run everyday over lunchtime instead of at 2am, since the portable drive may be disconnected overnight.
      8. Media players like Winamp stutter when run in a VM. I run them from the host instead.
    18. If some of the options for setting screen resolution in your guest are disabled, you might be able to enable them by running this in a cmd window in the host:
      C:Program FilesSunxVM VirtualBox>VBoxManage setextradata global GUI/MaxGuestResolution any
    19. Whenever you need to take the portable drive with you, go to Machine | Close | Save Machine State. All your open windows and state will be saved and the VM will close. Quit VirtualBox so it doesn’t hold onto the VDI file on the drive, then unmount the drive via Safely Remove Hardware. If something still won’t let you unmount, then you may have to shutdown the host.
    20. I have been closing down my VM every night, because when I’ve allowed it to run for several days before shutting down, VirtualBox seems to get stuck in a “stopping” state. It’s not clear if this is a bug in VirtualBox or a behavior in Windows 7 that can be configured not to happen.
    21. Finally, as a note to myself, here are some preferred Win7 settings and applications:
      1. Desktop context menu | View | Small icons
      2. Unpin Windows Media and IE from taskbar
      3. Taskbar properties:
        1. Use small icons
        2. Dock at bottom
        3. Never combine/stack buttons
      4. My favorite apps:
        • Java 1.6 SDK with Netbeans
        • Chrome from the dev channel
          • Create a shortcut and append –enable-sync to the Target value
          • While I would like to pin the shortcut to the taskbar, I’m not sure the –enable-sync flag would survive that, so I put the shortcut in the Startup folder instead. (My bookmarks on different machines have been out of sync for a few days, and I suspect this is because I was starting the browser from the taskbar.)
          • Install AdSweep ad-blocker extension by clicking here
          • under (wrench menu)
            • Turn on syncing of bookmarks
            • Always show bookmarks bar
            • under Options
              • under Basics
                • On startup | Restore the pages that were open last
                • Home page | Show Home button on the toolbar => off
              • under Personal Stuff
                • Passwords | Offer to save
                • Themes | Get Themes | Themes by Google | Greyscale
              • under Under the Hood
                • Download location | Desktop
                • Ask where to save each file before downloading => off
          • Right-click on these tabs and select Pin to tab:
            • Evernote
            • Gmail
            • Gmail calendar
            • Google reader
          • Right-click in address box and select Edit Search Engines, then change these keywords:
            • Wikipedia => w
            • Google Scholar => s
            • Lifehacker => l (small L)
            • Amazon => a
            • Google groups => g
            • Download.com => d
            • Google image search => i
            • Youtube => y
            • Ebay => e
          • Wishlist: Save session/open-tabs, Pinned tabs, and Search Engine Settings to Google bookmarks or some obvious file that I could backup
          • Wishlist: An official theme that matches my Windows default theme!
        • Mozy
          • I keep all my files in 4 desktop folders; I backup only parts of these:
            • Business
            • Papers: Articles by others in PDF and HTML format
            • Projects: Workspaces for IDEs, Todo lists, etc
            • Tools: Portable apps, Printer drivers, Installers
          • Schedule | Alert me if a backup hasn’t happened in this many days => 1
          • Options
            • Notify me when a backup starts => off
            • Show status when a backup successfully completes => off
        • portable Thunderbird email client with Lightning calendar extension
          • Pin to taskbar
        • Notepad++
          • Pin to taskbar
          • Settings | Preferences
            • Global
              • Toolbar | Hide => checked
              • Tab bar | Vertical => checked
              • Tab bar | Enable close button on each tab => checked
            • Misc | Remember current session for next launch => checked
        • portable PDF XChange Viewer
          • Pin to taskbar
        • IZArc unzipper
          • During install, make these settings:
            • Explorer Enhancements tab: Extract Here, Extract to <folder>, Email, Create self-extracting, Test, Display icons on context menus => NOT checked
            • Program Locations tab: Default viewer => Notepad++
          • Wishlist: a setting that allows “Extract all subfolders” as the default
        • OpenOffice
          • Pin to taskbar
        • doPDF virtual printer (use browser’s Print command to create PDF version of webpages for archiving)
      5. Pidgin
        • Don’t pin to taskbar; it will be added to hidden system tray
        • Buddies | Show | Offline buddies
        • Create shortcut and drop in Startup folder
        • Preferences
          • Interface | Show IMs and chats in tabs => off
          • Conversations | Enable buddy icon animation => off
          • Sounds : Enabled only for “Message received”
          • Logging | Log all instant messages => on
      6. TweakLogin
      7. Google Calendar gadget. Move to lower right corner.
      8. Amzi Eclipse IDE version 8 for Prolog and Java
        • Pin to taskbar
        • Set workspace to DesktopToolsEclipseWorkspace
    22. One would think that “.pdf” would appear at Control Panel | “Associate a file type or protocol with a program” but it doesn’t. Instead, double-click any PDF, choose “select a program”, choose your PDF viewer, and make sure the checkbox is checked for “always use the selected program”. (This entire step is probably only necessary because I use the portable version of the viewer, and it doesn’t create entries in the registry.)
    23. Go to Control Panel | Devices&Printers and add the office Grayscale and Color printers
    24. Get more free wallpapers from Microsoft

    Getting svn+ssh client access in Windows

    The following steps should help if you are using an svn commandline client and you get an error like:

    svn: OPTIONS of 'http://YOURSVNDOMAINANDPATH': could not connect to server

    Or if you get,

    Network connection closed unexpectedly

    in NetBeans or TortoiseSVN. UPDATE: The steps below turn out to fix the problem only in the commandline client, not in TortoiseSVN nor in Netbeans, sorry.

    I had been using svn through Eclipse+SVNKit without much problem, but when I had to use NetBeans for a new collaborative project, I couldn’t get svn working even though I was using the same repository as with Eclipse. Switching from Netbeans to TortoiseSVN and a commandline svn client didn’t do any better. It seems a fair number of people have the same problem, and it seems due to all of these programs using a common config file and that file configures ssh support off by default. If you need to access an svn repository whose url starts with “svn+ssh://” and you’re using Windows, this advice is likely to be of use to you.

    1. Install your SVN client (e.g. NetBeans including SVN, TortoiseSVN, or CollabNet’s commandline SVN client)
    2. Download ssh commandline client plink
    3. Add the path to plink to your Path environment variable
    4. Make sure Windows isn’t hiding your AppData folder: In Windows7 in WindowsExplorer, select Organize | “Folder and Search Options” | View, and select “Show hidden files, folders, and drives”.
    5. Edit
      C:UsersYOURWINDOWSUSERIDAppDataRoamingSubversionconfig

      so that the following is uncommented:

      [tunnels]
      ssh = plink -l YOURUSERID -pw YOURPASSWORD

       

    6. Change the commandline to the directory where you want the SVN contents to be savedFor example,

      H:> C:
      C:> cd UsersdavidDesktopProjectsNetbeansWorkspace
       
    7. You should now be able to make a local copy of the SVN contentsFor example, if your SVN is running on a server at DOMAIN:PORT, and the repository on that server is at /home/svn/repos, and the folder you want a local copy of is PROJECTFOLDER, then use

      C:LOCALPROJECTS> svn co svn+ssh://DOMAIN:PORT/home/svn/repos/PROJECTFOLDER

    Making games more fun with artificial stupidity

    If one buys into Daniel Dennett’s proposed use of “the intentional stance” to generate explanations and predictions of human behavior (say, in an AI program that observes a person and tries to find ways of helping), then accounting for human error is a tough problem (because the stance assumes rationality and errors aren’t rational). That’s one reason I’m interested in errors.

    Game AI faces a similar problem in that some games like chess and pool/billiards allow a computer player to make very good predictions many steps ahead, often beyond the ability of human players. Such near-optimal skill makes the computer players not much fun. One has to find ways of making the computers appear to play at a similar level of skill as whatever human they play against.

    I just came across a very interesting article on the topic of how to make computer players have plausibly non-optimal skills. Here’s a good summarizing quote:

    In order to provide an exciting and dynamic game, the AI needs to manipulate the gameplay to create situations that the player can exploit.

    In pool this could mean, instead of blindly taking a shot and not caring where the cue ball ends up, the AI should deliberately fail to pot the ball and ensure that the cue ball ends up in a place where the player can make a good shot.

    An interesting anecdote from the article is that the author created a pool-playing program that understood the physics of the simulated table and balls so well that it could unfailingly knock any ball it wanted to into a pocket. The program didn’t make any attempt to have the cue ball stop at a particular position after the target ball was pocketed, however. Yet naive human players interpreted the computer’s plays as trying to optimize the final position of the cue ball, apparently because they projected human abilities onto the program, and humans cannot unfailingly pocket any ball but seemingly are pretty good at having the cue ball stop where they want.

    Read more

    How to sync Windows Mobile with both Exchange and Google

    For Windows Mobile phones, Google suggests that you use the ActiveSync program on the phone to connect to one of their servers. This is problematic in principle for me, since ActiveSync allows only one connection with a server to be configured, and I need that to stay synced with work email. It also didn’t work even when I experimented, first with ActiveSync, and later with Funambol’s WinMo client.

    What did work was the free GMobileSync client. The 1.3.7 beta version supports multiple Google calendars, but one cannot select to sync just a subset of them, nor are secondary calendar events shown any differently than events from one’s default calendar. (Suggestion to GMobileSync: Allow defining a 2-letter uppercase code per calendar in device settings; strip these when syncing back to the server so they don’t appear to have changed.) It doesn’t sync contacts, but that feature is said to be coming. And syncs can’t yet be scheduled to happen automatically (but you can get CT Scheduler Lite free and tell it to launch the sync, but then you still have to press the Sync button — there is no parameter to automate that part yet; to install, download to a machine where you have admin privileges, or email them for the CAB file).

    As part of your move away from Microsoft, at least on the desktop, you may want to use Mozilla Thunderbird+Lightning to manage your Google Calendar.

    You may also want to try FinchSync with GCalDaemon.

    If you want to try it the way Google suggests, you might need help from this and this (from these matches) and use Funambol client for WM with these settings:

    Server URL: https://m.google.com:443/syncml

    User: [yourname]@gmail.com

    Password:[your gmail password]

    If you really want to use ActiveSync with two servers, it’s said to be possible, but one must first find a way to edit the device’s registry:

    So, to sync two exchange servers, put this in the device’s registry:

    [HKEY_CURRENT_USERSoftwareMicrosoftActiveSyncEnableNonLocalCrossPollination]
    = (DWORD) 1

    This key will most probably be there already, but if it is, its value
    will be 0. If so, change it to 1.

    Soft reset, then, setup activesync ON THE DEVICE (If you try setting up
    on the PC, it will still fail).

    There are also paid services that are geared to people who want more features or who want to avoid doing all this IT work themselves:

    • http://www.scheduleworld.com/
    • http://www.goosync.com/

    Tips if you’re moving out of the USA temporarily

    Banking

    Transferring cash back to the US to pay bills can be a big pain. I found out the hard way that it can take a month for a deposit to appear in one’s US checking account if one writes a check to oneself and mails it to the bank, and in the meantime your online account summaries are likely to show no mention of the progress of the transfer. One can also try international wire transfers, but these might require a trip to the bank and seem more expensive than what I’m going to recommend.

    Citibank has a “global transfer service” which allows for transfers with a normal checking account to which one adds the free “global executive” feature. Transfers are free if the accounts at both ends are held by Citibank, and I recall there is a US$10 fee per transfer for accounts with other banks.

    It’s a good idea to open a Citibank account, either checking/saving or credit card, before leaving the US, since your overseas branch won’t need proof of local permanent residence. Afterall, you may not have a permanent residence for a month or more while you look for the right home.

    Passports

    Make sure that all the people moving with you have passports at least six months from their expiration at the time you plan to arrive. If they’ve expired, then the wait for new ones can take weeks even if you pay for expedited service. Your employer may need you to have current passports well in advance of your arrival in order to apply for work visas on your behalf.

    Absentee voting

    Print out forms from VoteFromAbroad.org and take them to your local courthouse before you leave.

    Income tax

    If you make less than US$87k in 2008 from a foreign employer, then you don’t have to pay US income tax on it; but if you make more than that, you do. (Note that there are other conditions, too, such as being physically present in the foreign country for 330 days. See the instructions for IRS form 2555.) But I’m not a tax expert and I’m not offering professional advice; better to check out the IRS’ Federal Tax Information for U.S. Taxpayers Living Abroad (PDF).

    I’ve been a very happy user of TaxActOnline.com for years because they ask a series of questions and fill out tax forms for you, which you can print as PDFs at the end. They charge just US$15 or so to e-file, and they handle the case of having US employers and foreign employers in the same year.

    Health insurance

    Your foreign employer may offer health insurance at a good price, but it may not cover incidents that occur outside your employment country, and may not cover high-cost items such as neonatal intensive care. IHI.com came highly recommended to me; they provide coverage for any country you might visit, they cover items like neonatal ICU, and their high-deductible (US$5k) option is pretty affordable.

    Paper mail

    EarthClassMail.com provides a low-cost service where you can get a PO box in a major US city, and all mail delivered to that address will have the front of its envelope scanned. They send you an email for each piece delivered with a link to their website where you login and tell them whether to scan it (free for the first 100 pages or so a month), shred it, recycle it, or ship it to a real address.

    You won’t want to use this address for magazine subscriptions, parcel deliveries, or anyone who might send you a check, since shipping options are currently a little expensive (around US$8 for 1-10 thin envelopes).

    Package mail

    If you happen to be moving to Singapore, their postal service offers VPost, which provides a PO box for you in the US, in the UK, and in Japan. This is intended for package deliveries, and you need to forward invoices from Amazon and other sellers to VPost when you make your online purchases. Shipments from those PO boxes to Singapore are at reduced rates, but still pricey in my opinion.

    Movers

    If you’re moving from the San Francisco Bay Area, I recommend using Meridian.

    If you’re moving to Singapore, I recommend HeluTrans, (65) 6225-5448.

    Netflix

    Of course you won’t be having DVDs shipped overseas to you, but wouldn’t it be great if you could use Netflix’s streaming service? It offers many fewer titles than the DVD service, but is still great. Unfortunately, Hollywood requires Netflix to check the IP address and system clock time of your computer to make sure you’re in the US because they haven’t found a way to protect streams from piracy. There is no exemption for US military bases, either. Hollywood forces Hulu.com and Video.aol.com to do the same checks. I don’t have a recommendation for this one yet.

    You might want to check out these services, too:

    Mobile phone plans

    You’re likely to use a GSM phone network outside the US, which means you won’t be able to use a phone you bought from Sprint or Verizon (which use a different kind of network, CDMA), but you may be able to take your AT&T or T-Mobile phone if you “unlock” it from your carrier. Unlocking is something done to the software on the phone, and you can pay a local mobile shop to do it for you. There are also sellers on Ebay who will send you instructions and take a certain number of questions via email for a fee.

    I have not been able to find any way of getting voice+data service for just occasional visits back to the US. The US carriers will want to charge you monthly whether or not you use their service.

    Electric appliances

    Before setting aside any electric appliance for the movers to pack, check if it will work with the electrical system of the country you’re going to. This applies to computers, TVs, razors, blenders, fans, etc. Even appliances that seem like they would have just a motor and no electronics might still have some tucked away. And even though adapter plugs and transformers can help, there is no solution if the Hertz rating doesn’t match. For example, US electric is 60Hz and Singapore is 50Hz (following the UK example, I believe). This difference will lead to blown fuses and smokey burned-out appliances. (I’ve never heard of a fire starting, though.) Even if your computer or printer says it will work on the new Hertz, you still should check your user manual to see if you need to move a switch before plugging in.

    If your appliance doesn’t fit the bill or you’re unsure, you’re probably better off saving the shipment weight cost and just donating to GoodWill or recycling with GreenCitizen.

    Pets

    This is too wide-ranging a topic to cover here, but I do ask that you consider finding a good friend to adopt your pet instead of bringing it with you. I know from personal experience that long flights and new climates can be very hard physically and psychologically on a pet, and that doesn’t even count the effects of a quarantine period.

    If you do bring a pet, be aware that any domestic stops on your journey will subject you to temperature limits by most airlines. That is, they won’t let you check a pet if a stopover city is too hot or too cold, out of concern for your pet’s health. You probably want to move during spring or fall for this reason. Taking a pet as a carry-on seems unfair to your fellow passengers, and the pet would have to be a kitten or puppy to be allowed in one of the small carriers (which are the only kind they allow as carry-ons).

    Social support

    You won’t want to believe everything you read, but message boards can be a great source of info about the country you’re moving to, written by people from your same part of the world who have already moved there. Go to google and search for country you are moving to + “expat”. There are probably several message boards of the kind you’re looking for.

    Going portable as an alternative to using a remote desktop

    This tip is intended for people like me who:

    • Often need to work outside the office but don’t want to carry a laptop
    • Happen not to use any Linux or Mac machines, just Windows
    • Can’t use Window’s Remote Desktop Connection app. (Perhaps your IT dept won’t open the port in their firewall; but if the problem is just that you’re using a Home version of Windows, you could switch to a Business version or upgrade to an Ultimate version.)
    • Can’t use the similar VNC app because your client computer has User Account Control turned on and you want to keep it that way for security reasons.
    • Or, your IT dept won’t give you admin privileges on your machine, so you can’t install apps at will
    • If you use SVN, then either your SVN repo has a publicly-accessible IP address, or that you can access it via VPN. (That is, if you need to use SVN but it’s kept behind a firewall, then these instructions won’t help you access it…you’re stuck working non-portably.)

    The next best alternative I’ve found is to use a thumbdrive to keep your documents and applications (plus application state such as licenses, passwords, bookmarks, files currently being edited, email and contacts, etc).

    If all the desktops you’ll use the usb drive with are XP, then you could put MojoPac on the usb; it emulates an OS and provides a desktop view of your usb that runs as a window in XP. It’s not clear if there will ever be a version that works with Vista or Windows 7.

    Encryption

    Before copying any files to your thumbdrive, or installing any portable apps, consider whether you’d be hurt if the thumbdrive were lost or stolen and someone got access to its contents. If that’s at all important to to you, there are four options:

    • If you expect to have admin rights on any computer you might use, then you could install TrueCrypt on your thumbdrive and also create a TrueCrypt file container there.
    • If you don’t expect to have admin rights, but can convince your IT dept to install TrueCrypt for you, then check out the instruction in the next paragraph.
    • Or, if you won’t need more than 1G of space encrypted on your thumbdrive, you can try Rohos Mini Drive. (I haven’t)
    • Otherwise, you need to use Remora, where you will have to manually unencrypt each file you would want to use, then manually re-encrypt it after saving. (I haven’t tried this either)

    If you can go with TrueCrypt, then install it on your harddrive. Use it to encrypt the thumbdrive. You can configure the encryption so you are prompted for the password as soon as you plug in the thumbdrive anywhere; then, you will be able to access files and run apps from the drive as though it weren’t encypted but all your edits and adds will be encrypted. (Things will run slower due to the on-the-fly encryption, although perhaps not noticeably so.) You might want to encrypt only a folder of documents, but I opted to encrypt everything because apps like Thunderbird store one’s data (e.g. all one’s email, if you have opted to keep local copies) in generally unpredictable places. After installing TrueCrypt, do this (basically following TrueCrypt’s beginner tutorial):

    1. Do the following from a machine where you have admin privileges
    2. Start the TrueCrypt application from your desktop
    3. Click the Create Volume button
    4. Select “Create an encrypted file container”. (I tried “Encrypt a non-system partition/drive”, but when I plugged the drive into another machine where TrueCrypt wasn’t installed, I was prompted to format the usb drive.)
    5. Select “Standard TrueCrypt volume”
    6. Click “Select file”
    7. In the file selector popup, select your thumbdrive in the left pane, and for “File name:” provide a name for your container. Mine is “TrueCryptContainer”, but the paranoid might want to use “junk”. Then hit Save.
    8. In the Encryption Options view, just hit Next.
    9. When prompted for file size, use the full capacity available (e.g., 3680 MB for a “4GB” usb drive). Then Next.
    10. Choose a good password. Next.
    11. For Volume Format, set Filesystem to NTFS (or, if you can’t get admin privileges anywhere, choose FAT). Move your mouse around over the window several times to generate a good random seed for the encryption algorithm. Then click Format.
    12. When formatting’s done, close that dialog window.
    13. Back in the main TrueCrypt dialog, the one showing a list of unused drive letters, choose a letter you want your new container mapped to.
    14. Click the “Select File” button and choose the container file you just created.
    15. Click the “Mount” button. You’ll be prompted for the password you assigned. Sometimes this doesn’t work for me and I have to cancel and then hit Mount again before it works.
    16. Once it works, you’ll see your container file listed alongside the drive letter you chose in TrueCrypt’s main dialog. And the drive letter should appear in WindowsExplorer under “My Computer” along with your other drives. You should be able to open from and save to this encrypted container using any application’s File Open and File Save commands.
    17. Be sure to use the Dismount button before trying “safely remove hardware” and before removing the thumb drive.

    Sync with desktop or server/cloud

    After encryption, this is probably your highest priority. You can try a portable application like Toucan, but I think it’s not full-featured enough. For example, one needs to type or paste in paths when defining items to skip, instead of selecting them through a browse button. And my rules for skipping items were ignored anyway. Instead, rather than using a portable app to do the syncing, you probably just need to sync to one primary desktop, and Microsoft’s SyncToy running from that desktop works well. I configured it to sync docs/projects and apps separately, and I set my desktop-to-cloud sync service Mozy to sync just the docs/projects from the desktop (because I want double protection for things I can’t just reinstall and reconfigure). As a further step, I use Windows Task Scheduler (See “Help | Learn how to schedule SyncToy” within SyncToy) to kick off these SyncToy tasks near the end of every workday. Setting SyncToy to run at the end of a workday assumes your backup desktop is your work desktop; to sync to a desktop at home, you probably want the trigger event to be the insertion of your usb drive. TaskScheduler doesn’t natively support “mounting of usb drive” as a trigger, but you can buy MyTrigger for US$24 which enables TaskScheduler to launch SyncToy for such an event.

    Default programs

    Once you start reading email from Thunderbird on your thumbdrive, when you click links in msgs you’d want them to open with the browser also on your thumbdrive (especially if you might bookmark the link or enter a password). This doesn’t happen automatically; instead, you’ll get whatever app has been set as the default handler of the kind of file you want to open (where “kind” is determined by the file’s extension — the part after the dot). There is no good solution in XP nor Vista to this general problem of wanting to set usb-hosted apps as default handlers (but Windows 7 appears to support it).. However, just for the case of handling urls when they appear in apps other than the browser, one can make a desktop-hosted Firefox the default handler and then use the Foxmarks extension in that installation and all one’s other Firefox installs (including the portable one), since the extension syncs bookmarks and passwords across machines. (However, there does not appear to be any Firefox extension that syncs one’s open tabs, aka session.)

    Auto-start when inserting drive

    Many people like certain apps on their drive to launch as soon as it’s plugged in. PStart is a portable application with a small window (aka “panel”) where one can list other apps hosted on the same drive, and set some of them to launch when the drive is mounted. To make this work, however, one needs to configure each desktop OS to “autoplay” usb drives whenever they are inserted. XP and Windows 7 will prompt you if you want this done when you plug in your first usb drive, but Vista requires extra work:

    1. Open your run box (Start | Run) and type regedit and click OK.
    2. Go to HKLMSOFTWAREMicrosoftWindowsCurrentVersionPoliciesExplorer.
    3. You should see a key called NoDriveTypeAutoRun (see picture below). Double click it and set the Value Data to 91 (hexadecimal).
    4. Restart your computer and it should be fixed.

    Now install PStart to your usb drive.

    Once it’s installed, open its panel and go to Setup | Create autorun file. Select the drive letter that’s mapped to your TrueCrypt container. (You might want to tweak the autorun.inf file even further.)

    You can also tell PStart to launch any programs listed in its panel when PStart launches. To do so:

    1. Right-click on the item in the panel (or add it by right-clicking in an empty part of the panel and selecting “Add file”)
    2. In the dialog that appears, click the Advanced tab
    3. In this tab, set Autorun to “on startup”

    Reminder to take your drive when you log-off

    This is useful, but the Quiet version requires that you log-off instead of using Safely Remove Hardware.

    Force apps to release thumbdrive

    If you often have the problem of Safely Remove Hardware failing to dismount the drive, you might consider this workaround. However, there seems a pretty fair chance of data loss. (If you use PStart, the culprit might be that in Settings you don’t have “when closed” set to “exit application”.)

    I’ve heard that Windows 7 will actually tell you what application is holding onto the drive (but not what file it’s using).

    Reward if returned

    You probably want to create a text file at the root level of your drive called REWARD IF RETURNED.txt providing your email address, and make sure the file remains unencrypted. Or, you may want to make the name of the drive your email address or phone number.

    Mozilla Firefox browser

    I prefer Google’s Chrome browser, and there is a portable version (steps available below) but I haven’t found any way to export bookmarks once one starts using the portable version; that’s a critical flaw, because when I’ve installed updated portable versions, I’ve had to lose any bookmarks accumulated since I first started using the portable version. So, I’m using portable Firefox instead. (Update: I’m using Foxmarks because portable Firefox is slow, particularly when scrolling.) For portable Firefox, I recommend the following addons:

    • Xmarks – Keep your bookmarks in sync across machines and drives
    • Download Statusbar – I find it annoying that FF uses a popup to acknowledge every download attempt, and the designers of this addon felt the same
    • Undo Closed Tabs Button – If you were too quick to close a tab and want it back, this feature will help you
    • Tabs Open Relative – When you right-click to open a link in a new tab, it should appear right next to your current tab, not way down at the right. This feature fixes that.
    • Firebug – Useful for designing/debugging web pages
    • Zotero – Useful for managing a library of e-documents, such as you may have on your usb drive

    And if you really want to use portable Google Chrome, here’s how:

    1. The portable version is available from a German developer, and you’ll have to get a translation of his blog page (plus the download link) from Lifehacker.
    2. To copy over your default tabs and settings, (for Vista) copy your C:UsersyournameAppDataLocalGoogleChromeUserDataDefault contents to Portable_Google_Chrome_0.2.149.30Profil
    3. Migrating bookmarks from desktop Chrome to portable Chrome takes several steps
      1. Download and install Mark Clouden’s chrome bookmark exporter
      2. Run it and hit the Export Bookmarks button. Make sure it refers to the chrome installation on your harddrive rather than the one on your thumbdrive.
      3. We’re going to use Firefox to import the bookmarks, so we can tell Chrome on your thumbdrive to import bookmarks from it. So run an instance of Firefox where you don’t mind emptying all existing bookmarks first.
      4. In Firefox’s menu bar, go to Bookmarks | Organize Bookmarks, and delete all bookmarks in the Bookmarks Toolbar and Bookmarks Menu.
      5. Click the Import and Backup button at the top. Choose Import HTML.
      6. Select “From an HTML file, then select the bookmarks.html file you created with Clouden’s exporter.
      7. It may take awhile to import; wait for the bookmarks you expect to appear. Then quit Firefox.
      8. Start Chrome from your thumbdrive, then go to (Wrench icon in upper right) | Import bookmarks & settings.
      9. Set From to Firefox and click Import.
      10. You’ll find your bookmarks if you click “Other bookmarks” in the upper right, and then “Imported from Firefox”. You can drag items out of “Bookmarks bar” in this view right onto Chrome’s bookmarks bar. And you can drag the other bookmarks and folders onto “Other bookmarks” itself, and then right-click on empty folders to delete them.
    4. You may want to change the default download location under (Wrench icon in upper right) | Options | Minor Tweaks
    5. You may also want to tell Chrome to reopen the same pages when you restart it. Go to (Wrench icon) | Options | Basics | Startup | Restore the pages that were open last

    Notepad++ text editor

    PortableApps.com has a portable version of this very powerful and popular text editor. I recommend renaming the .exe file and all other folders and files that contain “++” to “NotepadPPPortable” because some sync/backup tools like Toucan have a problem with +’s in filenames.

    Mozilla Thunderbird email client

    PortableApps.com has a portable version of this email client and address book application. But there’s no way to connect to your office Exchange server unless they’ve enabled IMAP or POP support. (But if you have a mobile phone running Windows Mobile, its email client does support Exchange.) Also, I recommend adding the following extensions:

    AntiVirus

    PortableApps.com offers the ClamWin antivirus checker.  Note that this is only useful when you suspect a there is a problem, probably in a specific file. It is not a schedulable scanner. You would probably use it only after disconnecting your thumbdrive from a computer you have borrowed.

    WinAmp media player

    Just copy C:Program FilesWinamp to your thumbdrive. But disable Winamp Agent, or it will prevent unmounting the thumbdrive.

    Unzip / file compression

    I used to use the version of 7-zip available from PortableApps.com, but I find its UI very nonintuitive and am now very happy with IZArc, which has a portable version.

    OpenOffice for docs, drawings, spreadsheets

    PortableApps.com offers OpenOffice. OO’s doc writer is a great replacement for MSWord, and its drawing app is far better than Visio in my opinion.

    PDF-XChange viewer

    This PDF viewer is free, portable, and supports highlighting, comments, and typewriter features. The typewriter is great for filling out forms.

    I’m providing a link for the PDFXChange download, because the creator’s web site is so poorly designed and confusing. But the software itself is really good, and I bought a license for the extra features.

    Application launcher

    Launchy is a very popular choice, and its online PDF help file explains how to run it in portable mode. An alternative that bundles encryption and autorun scheduling is GeekMenu.

    Skype messenger and internet phone

    I haven’t tried it portably, but here’s a tip.

    Eclipse IDE

    (A programming environment for Java, C, and other languages like Prolog)

    These instructions are adapted from a forum post on PortableApps.com:

    1. If you’re going to use SVN as source control, make sure your SVN repository has a publicly-accessible IP address, or that you can access it via VPN.
    2. Run all updates in the Installed Software tab.
    3. Select Core SVNKit Library under http://eclipse.svnkit.com/1.2.x/Revision Graph, Subclipse
    4. Select SVNKit Adapter under http://subclipse.tigris.org/update_1.4.x
    5. Disable SVNKit library version 1.2.1.5297
    6. Once you launch Eclipse and select a workspace, the full filepath to that workspace — including drive letter — will be saved to eclipseconfiguration.settingsorg.eclipse.ui.ide.prefs. Because the drive letter assigned to the thumbdrive can differ across machines (and even across uses of the same machine), you probably want to edit this file so that the final line has no drive letter. For example, mine is RECENT_WORKSPACES=ProjectsEclipseWorkspace
    7. Eclipse seems to change eclipse.ini each time it runs by updating the ‘-vm’ value, and for me it puts an absolute path there including drive letter. So you may want to make eclipse.ini read-only to avoid this bug.

    Unfortunately if you are a Prolog coder wanting to use Eclipse, there is no way to work truly portably. There is just one version of Prolog that runs in Eclipse that I know of, Amzi (which works very well), but it depends on environment variables in your OS and there is no way to provide these vars via an Amzi config file (yet). However, if Amzi is installed on your desktop, then an Eclipse running from your usb drive will be able to support Prolog. I have only tried this when I’ve installed Amzi on both the desktop and in the Eclipse on my usb. Here’s how:

    1. Go to the Amzi Logic Server download page and scroll down to section “3. Existing Amzi! and/or Eclipse Users”. Follow that.
    2. When prompted for a Destination Folder, be sure you select a folder on your thumbdrive (I selected “F:AppsAmziProlog”).
    3. When that’s done, you can enable Prolog support in Eclipse by following Amzi’s install instructions in the section on Existing Eclipse Users.
    4. After restarting Eclipse, go to Windows | Open Perspective | Other | Prolog.
    5. Then do the same to open the Debug perspective.

    Task manager / Todo list

    I’m not quite obsessive enough yet to need a multi-level task manager, but ToDoList offers that capability and can run portably.

    Wishlist

    • An app that monitored the health of the usb drive and warned me when it’s time to copy its contents to a new drive. (I already try to auto-sync the drive to my laptop at home whenever I remember/have time to plug it in, and that laptop auto-syncs in turn with Mozy.)
    • A desktop-hosted antivirus that scanned any usb inserted before allowing it to autoplay.  This may just require me to hunt more; I currently use Kaspersky, which seems to work very well, but it’s not clear if it does this and their tech support has ignored my questions about it.

    Causal chain from gaze to mental attitudes in one agent

    An update on my gaze project… This diagram shows pre-formalization work (in AI) on a mapping of gaze actions, mental attitudes, and causal chains among them. The set of actions, and their conditions and effects, are being fleshed out by referring to standard textbooks on nonverbal communication. The mental attitudes, and their conditions and effects, are being fleshed out by referring to BDI models in AI. Eventually, this knowledge will be formalized as rules (probably in Prolog). This effort is made somewhat harder by having to write the rules in a way that they can be used either to “run” the mind of an agent or be used by the agent to explain/predict the observed actions of other agents. This AI microtheory should be useful in a variety of domains, but I’m especially interested in the problem of detecting when an elderly person in one’s care needs help. Eventually, I’ll want to add recognition of facial expressions, gestures, etc but one can tell a lot just from gaze: level of alertness, goals, level of satisfaction with one’s work toward a goal, emotions of fear and surprise, attempts to get one’s attention, and more. causalseqofgaze To give a sense of how this theory of mind could be applied to multi-agent interactions, here is a sample analysis of an imaginary interaction between Randy, Betty, and Greta. Randy admires Betty as she passes by, but she notices and has a negative reaction. Randy realizes he’s been caught out and tries to repair the situation, but doesn’t act quickly enough before Betty looks away. Greta is looking on (without being noticed by the others) and can predict what they each think and feel by the end.One could adjust the scenario in several ways:

    • Betty is shy and avoids meeting Randy’s eyes before even realizing he’s leering.
    • Randy anticipates a bad outcome from leering before doing so and is able to suppress it.
    • Randy and Betty had been introduced a few days before, and Randy recognizes her only after starting to leer. This incident sours their cordial relationship.
    • Before this encounter, Greta had been under the impression that Betty had a romantic interest in Randy, but this encounter changes her mind.
    • At the end, Randy realizes Greta has been watching the whole time and gets mad (explain why!), or they share a laugh.

    alanbettycarmen_scenarioflow1

    Micro-theory of gaze

    Gaze (aka “looking at something”) can be automatic (scanning the environment), strategic (getting a closer look at something interesting), and even communicative (indicating interest, anger, or eagerness to cooperate). I’m aiming for rules that would allow not just simulating an agent, but allow an agent to predict or explain another agent.

    Following my evaluation of potential inference tools last week, I’ve tentatively settled on using either JIProlog or AmziProlog because Prolog is the only rule language that allows me to be expressive enough, and these two tools allow Java and Prolog to invoke each other (while Prolog is embedded in a Java Runtime (JVM)).

    It’s important that the tool run in a JVM because I want outside users not to be restricted to using the same platform as I do, and because I want to use industry-standard libraries for graphics, etc.

    The high-level architecture of my system (named ATOM = automated theory of mind) is:

    • A set of Prolog facts for each agent representing that agent’s initial mental attitudes.
    • A set of Prolog rules for each agent representing what mental attitudes cause others under what conditions.
    • A Java object that simulates a physical environment. When the Prolog module infers an “attempt” attitude, this is converted into a call into the Java module to see what follows from the attempt. When the Java module determines a causal change that should be observable to an agent, then agent-specific “percept” facts are injected into the Prolog module and may trigger new inference there.
    • A set of JUnit tests that swap out files for different initial states in Prolog (and different simulated environments in Java) to verify that the outputs match expected values. There may also be unit tests that swap out rules to test predictions about agents that veer from the “norm”.

    Because a fair amount of research goes into the formulation of each rule, each rule will be in its own Prolog file accompanied by comments describing the scenarios that inspired the rule (also present as unit tests) and the particular pages in articles that inspired the rules or any changes to them. (This may seem fussy, but a career in reading related research tells me we all need to be much better at providing such “provenance”.)

    As mentioned in the summary above, gaze is a good human behavior to start with. But read here for even better motivation: Vertegaal et al, 2001: “Eye Gaze Patterns in Conversations: There is More to Conversational Agents Than Meets the Eyes”.

    Some initial generalizations I wanted to capture:

    • The blurriness of peripheral percepts is high.
    • Moving the eyes in the dir of a peripheral percept will change it to a focal percept, and whatever was focal will become peripheral.
    • The blurriness of a focal percept is usually less than if it were peripheral.
    • For intriguing resemblances with high blurriness, one would want less blurriness.
    • For intriguing resemblances in focus, one usually wants to ignore peripheral resemblances that are less intriguing.
    • For surprising peripheral items, one automatically looks even if intending not to.

    I want to write rules for these in such a way that, not only could the rules drive the behavior of one agent, but they could also be used by that agent to explain the gaze behavior of another agent that one is focused on. That is, it’s the start of an automated “theory of mind”.

    Comparison of tools for rule-based inference

    Although I started writing pseudo-code rules a few weeks ago, I’ve suspended that while I look for a tool for writing and testing rules so I don’t assume too much about the quality of the rules so far. This post describes what I’m looking for and the tools I’ve looked at. I’m likely to go with JIProlog or Amzi Prolog (and perhaps later, ECLiPSe).

    I’m looking for an expert system-type tool that offers these features:

    1. Simple, first-order-predicate-calculus (FOPC) syntax rather than C-style code
    2. Can be embedded easily in Java programs
      • Prefer Java because it’s platform-independent
      • Java has widespread use in simulation research that I might use or partner with
    3. Has a large, active community that might be interested in my work, and from whom I might get tech help
    4. Has good IDE (editor tool) support for auto-completion, syntax-highlighting, breakpointing, etc
    5. Allows for easy changing of any rules governing beliefs, desires, and intentions (aka BDI)
    6. Has a way of tracking what papers/scenarios influenced the conditions and actions of each rule (aka “provenance tracking” in my terms)

    This is what I’ve found:

    Jason

    PROs

    • Uses a Prolog-like syntax (i.e. fairly similar to FOPC)
    • Has a jEdit-based IDE that allows inspecting the mind of each agent, and history of actions among the agents
    • There is a book with in-depth how-to info (which I’ve ordered)

    CONs

    • Structured as a “BDI” system, which sounds at first like a big positive, but on deeper inspection seems to mean that it handles beliefs, desires, and intentions in a pre-defined way. I’m not sure yet how limiting this might be.
    • I asked the creators if there was a compelling reason to use it over a simple inference engine (apart from the ability of MAS frameworks to allow agents to run over a network), and they said there isn’t much advantage in that respect.
    • Might not support back-chaining or facts with universally-quantified vars (however, this might be possible in Jason and all Rete-based systems by encoding such facts as rules with “true” as their antecedent)

    JADEX

    PROs

    • Java based
    • Has its own BDI debugger tool

    CONs

    • BDI support seems “cooked in” and likely to be hard to change
    • Rules must be expressed in an especially verbose form of xml
    • Might not support back-chaining or facts with universally-quantified vars

    JAM

    PROs

    • Java-based BDI
    • Distinguishes between achievement goals and maintenance goals
    • Seems more comprehensive in BDI concepts than similar systems

    CONs

    • Procedural rather than FOPC-style syntax
    • Seems to have same limits on expressiveness (i.e. fairly restricted compared to FOPC or Prolog) as other forward-chaining expert system tools using the Rete clause-indexing algorithm

    Discrete Event Calculus Reasoner (DECL)

    PROs

    • The “event calculus” (EC) is a set of rules for dealing with the frame problem (aka “commonsense law of inertia” — things tend to stay as they are unless explicitly changed). The frame problem occurs when a rule system knows a fact was true in the past but isn’t sure whether to assume it’s still true. CCSS is likely to run into the frame problem often, since our systems are likely to be triggered by combinations of mental attitudes, but some of those attitudes may seem “stale” due to being triggered in the past. The “discrete” part in the name is due to assuming a timeline with discrete points rather than a continuous one.
    • Has a book that shows how to encode BDI and OCC in EC

    CONs

    • Requires integrating Unix env, Python, “PLY”, and one of a small set of SAt logic solvers. This seems too brittle and platform-specific to me.
    • While EC has a good-sized community, this inference engine seems to have a small one
    • Seemingly no IDE support
    • Might not support  facts with universally-quantified vars

    Clips

    PROs

    • FOPC-like syntax

    CONs

    • C-based, so integrations would have to use C, JINI, or system-level calls

    Jess

    [also see Jess vs Prolog]

    PROs

    • Clips syntax implemented in Java
    • Has an Eclipse (IDE) plugin
    • Widely used

    CONs

    • Roundabout support for facts with universally-quantified vars: One must use the special ‘nil’ slot value for such vars. Not sure if this will actually work.
    • Roundabout support for backchaining: Instead of indicating that a particular rule is backchaining, one must commit to making a particular predicate backchainable. This seems like it could have unwanted side-effects.

    ACT-R

    Uses special jargon — “chunk” is what most other systems call a fact; “declarative memory” (for chunks/facts) is what other systems call “working memory”; “procedural memory” is where production rules are kept (is this an oxymoron only to me?)

    Structured very similarly to BDI toolkits — there is a goal buffer and a retrieval/belief buffer, but unlike BDI these buffers are allowed to hold only one fact/chunk at a time.

    PROs

    • Very popular among “cognitive modeling” community

    CONs

    • Limiting the size and number of buffers seems unnecessarily limiting — it limits the expressiveness of rule conditions
    • Seems to be no support for universally-quantified vars in facts, especially because vars are not allowed to bind with ‘nil’ in facts
    • Coded in Lisp, and special purpose IDE is very limited but does have a stepper (but perhaps not breakpointing)

    Soar

    Similar jargon as ACT-R.

    PROs

    • Fair-sized community in AI and Psychology (e.g. John Laird of UMichigan)
    • Formalization of Cohen & Levesque’s teamwork theory exists, STEAM, created at USC by Milind Tambe’s group. (I strongly suspect any team- or cooperation-related work we do will have to build on C&L’s theory.)
    • Java-based
    • Has its own IDE

    CONs

    • Would prefer Eclipse plugin to specialized IDE
    • Syntax allows for separate stages of “proposing” and “adopting” operators, and a lot of flexibility in conflict resolution. I don’t currently need this, though, and the syntax is somewhat different from traditional FOPC. I think I could migrate to Soar later if needed.
    • Not sure if Soar has negative connotations for roboticists and AI folk, due to its somewhat dogmatic views on cognitive architecture

    LEADSTO

    Prolog-based simulation tool. It’s been used by its creators for simulating many domains, including a prey animal that anticipates a predator’s actions (by simulating the predator’s mind) and counteracts.

    It appears to have some sort of unit-testing support built-in. Perhaps embeddable in a Prolog/Java bridge.

    JBoss Rules/Drools

    [IBM tutorial]

    PROs

    • Very active community (enterprise)
    • Not only an Eclipse plugin but also a browser-based “rule management system” (not sure of its features yet)
    • Java-based

    CONs

    • Two primary rule languages – one in “natural” but highly constrained English, and one that mixes FOPC with procedural constructs. There is a nascent effort to provide CLIPS/Jess-style language support also.
    • Doesn’t support backchaining
    • Probably doesn’t support facts with universally-quantified vars (unless through a workaround like Jess’)

    Prolog in Java [external overview]

    PRO: Prolog supports backchaining and facts with universally-quantified vars

    • JIProlog
      • PRO: Java can call Prolog, and vice versa
      • PRO: Available as shareware (free)
      • PRO/CON: Special-purpose IDE
      • PRO: Actively supported
    • AmziProlog
      • PRO: Java can call Prolog, and vice versa
      • PRO: Has a plugin for Eclipse IDE
      • PRO: There is a breakpointing debugger available for the IDE plugin
      • PRO: Good support through Amzi forums
      • CON: The breakpointer is not free
      • CON: The breakpointer, and the interpreter, require some getting used to. For example, the listener must be shutdown manually (via a “quit” typed command) before one exits Eclipse. Common keyboard shortcuts cause errors when the Listener has focus.
    • ECLiPSe
      • An extension of Prolog that supports “constraint programming”. In my terms, this allows one to make temporary bindings such as “T > 2” rather than only permanent absolute bindings like “T = 3” as pure Prolog demands, thereby delaying commitment to a specific value. This is a way to be efficient by doing search in a “constrain and generate” style rather than a “generate and test” style. For example, if one wanted to know if “the book was returned before Monday”, one could constrain a time variable to be before Monday and then search for book return events using that time variable, instead of searching for all book return events using an unbound time variable and then checking each variable binding to see if it was after Monday.
      • However, one is limited to relations for which a “constraint solver” can be written; ECLiPSe comes with several solvers for different kinds of problems.
      • A plugin for the Eclipse IDE is available, Saros, but I couldn’t get version 1.0 working. It was contributed by a Cisco employee, and rumor is that he will get official support to update it. I’ll re-evaluate when that happens.
    • JPL
      • PRO: Java can call Prolog, and vice versa. Commonly used with free SWI-Prolog
      • CON: Not actively supported since 2003 (at least not the docs)
    • tuProlog
      • PRO: open source, and appears actively supported
    • JLogic
      • PRO: Free
      • PRO/CON: Special-purpose IDE
    • Jinni
      • CON: Not free
    • SICStus Jasper
      • CON: Not free
    • GNU Prolog for Java 0.1.0
      • PRO: Free
      • CON: Very early release
      • CON: Appears not to be actively supported
      • CON: No IDE

    Cyc engine and tools (but not KB) – TBD

    Currently can’t run on Vista; will try on XP or Linux soon

    RePast – TBD

    Swarm – TBD

    Cogent

    Targeted at situation-assessment and decision-making applications. Uses Bayesian network as part of its tech.

    PROs

    CONs

    • Appears to be not available, except perhaps through a belief net app from Charles River Analytics
    • “On the situation assessment side, one major problem we are confronted with is that the type of belief network currently used in COGENT does not model temporal information explicitly.”

    MatLab

    PROs

    • Popular in research community

    CONs

    • Seems limited to numeric-oriented simulation rather than rule-based ones

    Prevent HTC phone from switching to Ulaanbaatar time from Singapore/Malaysia time

    My HTC Hermes (AT&T 8525) keeps switching from the time and zone I set in Settings — in particular, from Singapore/Malaysia’s timezone to Ulaanbaatar’s. I found just one Google match about this problem: http://www.ppcsg.com/lofiversion/index.php/t93410.html, and it indicates the problem also affects the HTC Touch. The solution is to go into the Phone app, Menu, Options, find the Time Zones tab on the far lower right, and uncheck “Automatic change time zone and clock”.

    Implementing Calendar Sync via Funambol

    Sync software for managing contacts, events, etc across devices can be very helpful when it works, and very frustrating when it doesn’t. In my experience, when an error occurs it’s often unclear how to rectify it; and even when there is no error, it’s hard to judge whether it worked because many updates may have been made to my data and it’s unclear whether the updates were correct until I run into an instance of not being able to find a contact, or finding that I have multiple instances of the same reminder.

    I was really motivated to find a solution, and recently I had the opportunity because my employer chose to offer a sync service by using the open-source software shepherded by Funambol.com. This article describes how to use Funambol’s software to allow users to sync with a pre-existing store of their event and task data.

    Funambol provides a very capable sync platform that gets one past the fundamental technical challenges and will allow you to focus on the usability issues that all current sync experiences have.

    What Funambol Provides

    The core piece of Funambol’s software suite is the Data Synchronization Server (DSS), comprised of Tomcat 5.x running a webapp that handles the syncML xml-over-http protocol plus a dbms (mysql, postgres, or hypersonic) for state persistence. There is also a webapp (“webdemo”) providing an html interface for managing events and contacts, to be stored in the same dbms, but this article is about using pre-existing storage, so you won’t want to use this webapp or its db tables.

    Another major part of the suite is the large stable of plugins for mobile devices. Most smartphones and PDAs (even the iPod) have calendar and contact functionality, where the data is stored in a local file or dbms. Many of these devices also have factory-installed (“native”) sync software, which works by sending a “begin sync” request encoded in syncML wirelessly(*) to a server that knows how to coordinate sync attempts. (*Strictly speaking, a device doesn’t have to have a wireless connection; instead, like the iPod, it might require a cable to a desktop computer, which itself has to be connected to a network unless you’re syncing only with the desktop itself.) For devices which do have contacts or events storage but no native sync client, one can usually find a “plugin” from Funambol. Unfortunately, finding out whether your device already has a sync client can be difficult, since the app might be hiding several folders deep from your main menu.

    If you don’t have a sync client, but do have the ability to install software directly on your device, you can look for the plugin at https://www.forge.funambol.org/download/. If you don’t have the ability to install directly, but your device can receive binary SMS messages, then you can signup for a free user-oriented account at http://my.funambol.com, where you can indicate your device type and trigger an SMS containing the plugin software — all you have to do is click on an install link in the message.

    There are some surprises in the devices that Funambol does support and those it doesn’t. It does support Microsoft Outlook (via a plugin), allowing data there to be synced with your pre-existing storage even without an Exchange server being involved. But it doesn’t support OS X’s Sync application. It does support the iPhone for syncing contacts only, but not events (because, I’m told, the iPhone SDK doesn’t support access to the device’s calendar storage). It does support the iPod, but only if you can connect it to a Windows desktop (via the iPod’s USB cable, which talks to a plugin from Funambol that you must install); there is no support for using an iPod via a Mac (and since the iPod has no wireless capability, it’s dependent on being tethered to some desktop). Another wrinkle is that some carriers disable the native sync clients on phones they sell, or they block Funambol’s plugins from using the wireless connection unless Funambol has been certified. With new phones arriving all the time, this makes for a moving target.

    Two more major pieces of Funambol’s suite are the “sync portal” and the “PIM listener”. These are not available in the free, open Funambol software but can be gotten in the “carrier edition” through agreement with Funambol. The portal is another webapp using the same dbms, providing an http-based API, which can be used to build a website allowing users to indicate their device type and trigger an SMS containing an installable plugin (similar to what my.funambol.com provides).

    The PIM listener is useful for users who have several devices that must stay in synch; it notices when one of a user’s devices has been synced, and triggers an SMS to all the user’s other devices (which are known to support such SMSs) instructing them to automatically initiate their own sync to pick up the changes. Obviously, this won’t help with Outlook or an iPod (because they can’t receive SMS notifications), but is a good way to satisfy one’s most-engaged users, who tend to have lots of other toys. I believe that if one updates device A, makes a separate update to device B, and syncs A, then not only will B be auto-synced, but the result of that sync will trigger an auto-sync of A to pickup the new data from B. However, as the number of a user’s devices goes up, the number of auto-syncs may thus increase factorially/exponentially; I know of no tests to see if this leads to significant battery drain or application latency. For a demo, check out the video interview with Funambol’s CEO at TalkTv.

    As a last note, although the sync portal allows plugins to be pre-configured to connect with one’s own DSS, I don’t know if it’s easy to configure such plugins for locales other than US English. Also, I believe there is no publicly-available list of all error messages that a user might encounter, for use in one’s own Help page.

    How Sync with DSS Works

    Once a user has located the native sync client on their device, or installed a Funambol plugin (actually, any syncML client from any software provider should work, since Funambol’s DSS is syncML-compliant), and has gotten a login from the data provider (e.g. mail.aol.com or my.funambol.com) to enter into the client, they are ready to start. Actually, there’s one more step: Make sure the data provider url is correct in the sync client.

    When the “synchronize” button is pressed, a “begin sync” syncML message is sent to the data provider’s url along with the user’s login, a deviceId (unique to the device compared to all other devices in the world), and an indicator of what kind of data to sync (i.e., contacts, events, or tasks). The data type is actually mapped to a “sourceUri” in the device — for example, some devices allow syncing events using either sourceUri “scal” (indicating the SIF Event format) or sourceUri “event” (indicating the VCalendar format). The sourceUri values are arbitrary and depend on how the DSS has been configured, but the values given here are Funambol’s plugin defaults. I recommend using SIF formats if possible, since they appear to allow for a wider variety of event data, and are common on Microsoft devices and thus have a large user base and the implicit testing that comes with that.

    If the login is approved, the DSS combines the userId and deviceId into a “principal id” and checks its fnbl_last_sync table to see if this principalId+sourceUri has synced with it before; if not, it’s treated as a “slow” (aka full, complete) sync; if it has synced before, it’s treated as a “fast” (aka incremental, partial) sync. For slow syncs, DSS responds in syncML by asking the device to send an id for each data item of the requested type; for fast syncs, DSS asks the device to send only id’s for the data added, updated, or deleted since the last sync (as indicated by the date in fnbl_last_sync). DSS makes a similar slow-or-fast request to the data provider end.

    Who or what is this data provider? If you use the DSS’s “webdemo” webapp for managing contacts and events, it’s a set of tables in DSS’s dbms. But in our case, it’s a “calendar server” that serves many other client applications and which we access via the network. More on this later.

    For a slow sync, DSS asks the device for the full data behind each data item id, and passes that to the data provider telling it to add it. The data provider must reply with an id for the new item. Similarly, DSS asks the data provider for the full data behind every id, and passes that to the device telling it to add it. The device must also reply with an id for the new item. DSS stores the pairs of {device event id, data provider event id} in db table fnbl_client_mapping, keyed by principalId and sourceUri. (Note: if you reinstall a plugin, the deviceId is likely to change, so the principalId would change, which means that syncing, then reinstalling and syncing again, is likely to lead to duplicates on both the device and data provider ends, because there is no longer a matching principalId in fnbl_last_sync and thus a slow sync is done the second time.)

    For a fast sync, DSS asks the device and data provider ends what has been added since the time of the last sync, sends the new items to the other end, and updates the mapping table with the new id’s. It then makes a similar request to both ends for items updated since the last sync, and then for items deleted since the last time.

    On the data provider end, DSS makes all its requests through “modules” which must implement a predefined API. But before a sync can occur, the userId and password that a user entered in the device must be authenticated, and each module is responsible for indicating what Java class should do the auth check. Funambol calls such an auth class an “Officer”, and provides a default one that checks the co-installed dbms for a matching login. Since we are focused on using pre-existing data providers, you will want to write your own Officer to connect with your existing auth service for user accounts.

    The module API requires implementation of these signatures:

    • beginSync – Called by DSS if the officer indicated successful auth; the principalId and sourceUri are passed in via a context parameter
    • getAllSyncItemKeys – DSS uses this to get data item id’s when starting a slow sync
    • getNewSyncItemKeys – DSS uses this to get data item id’s for fast syncs, to get items added since last time
    • getUpdatedSyncItemKeys – DSS uses this to get data item id’s for fast syncs, to get items updated since last time
    • getDeletedSyncItemKeys – DSS uses this to get data item id’s for fast syncs, to get items deleted since last time
    • getSyncItemById – If DSS wants the device to add or update something, it gets it from the data provider this way
    • getSyncItemKeysFromTwin – The user might have added similar items on both the device and data provider ends; this call gives the data provider a chance to report items it thinks are similar to the given device item, to avoid adding duplicates at both ends.
    • addSyncItem – If the device has an item the data provider should have, DSS uses this to put it there
    • updateSyncItem – If the device has an item that should replace one the data provider has, DSS uses this to put it there
    • removeSyncItem – If an item has been deleted on the device, and a similar one should be deleted from the data provider, DSS uses this to delete it
    • mergeSyncItems – If the module has been designed to resolve conflicting similar items from the device and data provider ends by merging them, DSS uses this to get that merged version as a step before sending the update to the device and data provider ends. (This method is available only if your PIMCalendarSyncSource extends MergeableSyncSource.)
    • commitSync – This is the next-to-last call DSS makes into a module for a sync. It calls it even if a SyncSourceException has been thrown by a previous call, which IMO is a design flaw since it doesn’t follow the traditional semantics of a ‘commit’; for example, if one plans to do all updates and deletions to the data provider via a batch, one would want to do that here — but that allows for loss of data integrity because a related update to the device might have failed.
    • endSync – Always the last call DSS makes into a module

    If one wants to provide support for events and contacts using the SIF, VCalendar, and VCard formats, there are several ways to split up the work across modules. One could make a separate module for each combination, responsible for only one format of one kind of data, or have a single module that handles all types and formats. The major constraint on modules is that each can have only one auth Officer. While this constraint didn’t limit our design choices, we did happen to implement contacts and events in separate modules in separate source trees but where both modules would use the same officer code. Having separate trees had the unfortunate consequence of needing to copy/paste the officer code from one tree to the other, putting it in a different package, and configuring the two modules to use these different officer packages. Clearly, relying on copy/paste opens the door to maintenance problems; I recommend using a single source tree and single officer codebase if at all possible.

    Putting a Module Together

    In this section, we walk through the creation of a module to handle multiple data types and formats, where the data provider is a remote host rather than Funambol’s “webdemo” default.

    First, find the source for DSS at Funambol’s downloads site, or via objectweb (e.g. http://cvs.forge.objectweb.org/cgi-bin/viewcvs.cgi/sync4j/funambol/modules/foundation/connector/src/main/java/com/funambol/foundation/engine/source/Attic/PIMCalendarSyncSource.java?hidecvsroot=0&search=None&hideattic=1&sortby=file&logsort=date&rev=1.18&content-type=text%2Fvnd.viewcvs-markup&diff_format=h). You’re looking for these files:

    • PIMCalendarSyncSource.java (and parent PIMSyncSource.java)
    • PIMCalendarManager.java (and parent PIMEntityManager.java)
    • init_schema.sql
    • SIFTaskSource.xml
    • VCalendarSource.xml
    • Funambol.xml
    • PersistentStoreManager.xml

    Notice that PIMCalendarSyncSource implements the module API mentioned earlier by making calls into PIMCalendarManager, and PIMCalendarManager manipulates the co-installed dbms on behalf of the “webdemo” default webapp. You should hollow-out the method implementations of PIMCalendarManager and code them so they work for your existing data provider. I don’t have more to say about that part of the job.

    The remainder of the work is configuring the module. Look in init_schema.sql at how the fnbl_sync_source table maps an incoming sourceUri (e.g. stask) to a bean file (e.g. SIFTaskSource.xml) that describes how to configure PIMCalendarSyncSource for that sourceUri. The key insight here is that every sync attempt triggers the creation of a new PIMCalendarSyncSource object, and that object manages state between the beginSync and endSync calls. By creating an object for every attempt, we can use the same Java class to handle different data types (events or tasks, even contacts) for different formats. Differences across data types and formats are largely hidden from you by the convenience methods in PIMCalendarSyncSource for marshalling and unmarshalling formats into Funambol data objects (e.g. Event, Task, and their common parent com.funambol.common.pim.calendar.CalendarContent). If you need to know the data type within your module code, you can use PIMCalendarSyncSource’s entityType member (or add your own member field and an entry in the bean files to populate it).

    A module can support multiple data types and formats by adding an entry for each to the fnbl_sync_source table.

    Be sure to make these changes:

    • Change the package and class name of the ‘object’ element in each bean file to match yours
    • Do the same for the ‘class’ attribute of the fnbl_sync_source_type table in init_schema.sql
    • Note that the values of the ‘config’ attribute of the fnbl_sync_source table aren’t filepaths in your source tree; instead, they reflect where the files will be unpacked from a zip after running bin/install-modules.sh. You need to check build.xml to make sure these files are pulled from the right place during the build.
    • Since you will be using your own solution for user acct management, you need to disable Funambol’s by providing a stub. In init_schema.sql, all entries in fnbl_sync_source_type should reference the stub class for the ‘admin_class’ attribute.

    Next, look in Funambol.xml and notice the officer, store, and serverURI properties. The ‘serverURI’ value is probably ‘{serverURI}’, which indicates a placeholder that will be filled using install.properties when install-modules.sh is run; using placeholders is a pretty convenient way to get config into your module; you can define your own bean java and xml files, add your values to install.properties, and update install-module.xml to map those properties to placeholders in your bean xml file. You can also use a placeholder for the entire ‘officer’ property and put the xml in install.properties (which is helpful if you’re using different officer classes in different environments).

    The ‘store’ property is important because it points to PersistentStoreManager.xml, which you will edit for your module. If your module will support fast/incremental sync, it needs its own db storage to keep track of the item id’s involved in the previous sync (aka “anchors”). For example, if the data provider provides a way of asking for all items changed since a certain time, but it doesn’t distinguish between which are new and which are updated, and doesn’t indicate any of the deleted ones, then you need to compare those id’s with the anchors: anchors that aren’t in the list from the data provider should be considered “deleted”; items from the provider that aren’t among the anchors should be considered “new”; all the rest can be considered “updated”.

    To store and read anchors, you should create files create_schema.sql and drop_schema.sql (referenced in install-modules.xml) for creating and dropping the table(s) you need; if your module is listed in the modules-to-install property of install.properties when you run install-modules.sh, you will be prompted whether to recreate a table; answering ‘y’ will run drop_table.sql and then create_table.sql (so, you generally want to answer ‘n’ when installing unless you want to force all users to do slow syncs next time). To allow the module to read from and update the anchors, create java and xml bean code where the xml defines the SQL for inserting, updating, querying, and deleting. Add the bean xml filename to the list of such beans in PersistentStoreManager.xml.

    Some of the files I’ve mentioned may not be in Funambol’s public source, and I’m sorry for that oversight. But they are helpful folks and I’m sure they would send you some samples.

    Lessons Learned

    Other than issues I’ve already mentioned, here are some issues you should be aware of up-front:

    1. There is no automated update process for getting the latest info about the world of devices. The sync portal needs this so it can offer support for the latest phones, and so it can keep up with changes in support for existing ones. You need to arrange with Funambol to be sent regular “phone pack” updates.
    2. There is no support for sending a message from the server to be displayed in the client. For errors, the best one can do is set a message in a thrown SyncSourceException, since the message will appear in the client log if it uses log level “error”.
    3. There is no way for a module to trigger a change from fast to slow sync in case, say, an error is found with anchor storage.
    4. The module API requires addItem to return the id that your storage will use for the item. This makes it hard if not impossible to include add’s with update’s and delete’s in a single batch during commitSync(); instead, you may need a synchronous network roundtrip for every call to addItem().
    5. It appears that the SIF Event content type used by Microsoft uses UTC-based times (e.g. 20080704T010000Z for 9amPT on July 4) for all values unless it’s an all-day event, in which case it uses YYY-MM-DD with no timezone indicator.  Funambol’s convert() in PIMCalendarSyncSource leaves these all-day values as-is instead of changing them to UTC format. I didn’t find any guiding principle about when UTC versus local time was used.
    6. The default dbms in the open source version of DSS is hypersonic. If you plan to use mysql, be sure to indicate your engine is InnoDB in your DDL scripts since Funambol’s dev wiki mentions there’s a risk of memory leaks without that.
    7. Since you’re using your own storage, make sure to remove the “webdemo” webapp (and associated db tables) so users don’t stumble across it by accident.
    8. When dealing with com.funambol.common.pim.calendar.RecurrencePattern, note that recurrence types with “NTH” in their name are for use when instance!=0, not when interval>0.
    9. Funambol’s Outlook plugin has a bug where if it is given an event that repeats every N years, it will change it to repeating yearly.
    10. Outlook has a bug where if it is given an event that repeats monthly, that’s timed/not-allday, and whose start time crosses a day boundary when converted to UTC (e.g. 4pm Pacific in summer becomes 1am UTC the next day), then Outlook will move it a day late (e.g. from Friday to Saturday). There is a similar problem when Outlook provides such an event, instead of consuming it.
    11. If an event is timed/not-allday but it has a duration of 24 hours, then the Outlook plugin will change it to an allday event, thereby losing start and end times.
    12. When a repeating event is created, synced, and then some individual dates on each end are modified or deleted (they must be different dates on each end), then on the next sync they will show as dupes. This is because Funambol’s default merge method has a bug…such changed dates are represented as an “exclusions list” in the RecurrencePattern obj, and merge should take the union of the device’s list and the data provider’s.
    13. Most devices don’t support multiple calendars, so Funambol has no special support for them, but your data provider may have many users who have these. As long as your module uses a concatenation of calendar id and event id as the full id given to DSS (to guarantee each item id that DSS has is unique within a user’s set of id’s), there should be no problem.
    14. It’s not feasible to host, say, a contacts module in one DSS and a calendar module in another DSS, and rely on sniffing to route http requests to the correct server. The problem is that the sourceUri is the one piece of info that would allow such decisions, but it’s embedded in the request payload, not the http headers.
    15. If a user did a manual sync at the same time that one of his devices happened to do an auto-sync, I believe there is no mechanism in DSS to detect this and block one of the requests, to avoid loss of data integrity for the user. For example, imagine there are similar events on device A, data provider B, and auto-syncing device C; while A and B are merging, C might update the event on B; when the merge of A and B is done and the update happens on B, it wipes out the info from C.
    16. If you use the carrier edition, be aware that your init_schema.sql isn’t the only file controlling what sourceUri’s the DSS thinks it supports. There are also portal/database/cared-coredb-mysql.sql and portal/database/cared-mysql.sql. Be sure to remove INSERT calls from these for sourceUri’s you won’t support, to ensure the user sees an error saying something like “Remote name not recognized” rather than a generic error.
    17. If you see error “Fatal error creating the PersistentStore object” in your log, a possible cause is having the wrong line endings in the config files of your s4j file. For example, if you build the s4j on a Windows machine and try to run it on a *nix machine, you are likely to see this error.
    18. Trivia item: The second param to the ICalendarParser constructor, a String, represents the device’s charset, not its timezone.

    Community Mailing Lists

    Good luck!

    How to search for a string pattern in a directory tree in *nix

    grep --include='FILEPATTERN' -rin SEARCHPATTERN *
    
    Where:
    FILEPATTERN is the file name pattern you want to match, for example 'h*' for all files that start with h
    -r is recursive
    i is ignore case for SEARCHPATTERN matching
    n is print out the line number
    SEARCHPATTERN is what you want to search
    * is to include all files (which is usually what you use if you use --include)

    How to avoid the error beep in MySQL client

    Continuing today’s theme of avoiding ill-considered audio feedback in some desktop tools, the MySQL client plays a beep on errors, and one can’t avoid it even if volume is muted.

    To avoid it, invoke the client with a –no-beep argument like this:

    “C:Program FilesMySQLMySQL Server 5.0binmysql.exe” -u userid -p –no-beep

    How to avoid the “breaking glass” sound in TortoiseCVS

    When an error occurs in the TortoiseCVS client, it plays a “breaking glass” sound that can be pretty annoying. To avoid this, rename or delete C:Program FilesTortoiseCVSTortoiseCVSError.wav

    Note that the error sound plays even if uses Preferences to set Progress Messages to “Really Quiet” (although I suspect this preference controls the verbosity of text feedback, not audio — a poor labelling choice).

    EOFException from getBinaryStream

    If you get EOFException from a call to getBinaryStream on a result set, here are two suggestions:

    • Check that when you write the stream, you aren’t trying to read from the same stream twice. For example, you might try to update a row first, and if that returns 0 rows changed, then you may be trying to do an insert using the same stream that you read from for the update. This won’t work because once the stream is read once, the stream pointer is at the end of the content when you try to read the second time.
    • Try deserializing the stream right after serializing and without putting it into a file or dbms.  If this doesn’t work, it’s likely that your object type doesn’t implement Serializable nor Externalizable.

    Steve Souders’ 14 rules for faster-loading websites

    Within a few years, if you notice in Firebug that a site doesn’t adhere to these, you should ask yourself why…

    http://stevesouders.com/hpws/rules.php

    These rules are an excerpt from Steve Souders’ best-seller, High Performance Websites.

    Since modular techniques for building dynamic pages (such as .ascx files in ASP.NET) don’t immediately work well with some of the rules (such as placing all scripts at the bottom of the page), new patterns are sure to emerge.

    Top 7 most common mistakes with Javascript syntax

    Ok, my basis for saying “most common” is not at all scientific, just a gut sense of how many times I’ve seen 15 mins or more spent trying to find a bug that ended up being just a syntax error in one’s script. Tools like JSLint and Firebug are really helpful in general, but not with these issues. (I’ve heard Aptana should be, but I couldn’t get it to work and it kept crashing on me.)

    1. Spurious comma after the last entry in a JSON object string

      {name:’Fred’,visits:3,}

    2. Use of = instead of : in a JSON object string

      {name=‘Fred’}

    3. Misspelled function and object names

      encodeURICompoent

      onReadyStateChange instead of onreadystatechange

    4. Spurious parenthesis after a function call

      confirm(‘Proceed?’));

    5. Using parenthesis instead of braces for access within arrays and objects

      htsVisits(‘Fred’)

    6. Spurious : between keyword case and a switch value

      case: 1:

    7. Expecting “False” or new Boolean(“False”) to evaluate as true

    The Enter key: Making a mess of submit buttons and textboxes

    If one has a submit button in an HTML form, then pressing Enter will trigger the first of these in doc order, as though one pressed the button. At first glance, this seems like a nice feature, but in practice it leads to lots of problems. The root of the problem is that users forget or don’t know about this. (Browsers could help by giving special highlighting to such a button, as desktop apps often do.) Users might be typing in a textbox, not knowing or caring whether it’s a textbox (which will pass on any Enter to the form) or a one-line textarea (which would add a line ending to its content and not pass on the Enter). The Browse button of a file input (in some browsers) will also pass on an Enter rather than trigger a FileOpen dialog, even when it’s in focus.

    To prevent such errors, one can change all submit buttons to normal buttons and use script and a hidden input to transmit to the server the name-value pair that the submit button would have provided. Or, if there are no file inputs (checkboxes and radios might also be a problem, though), then one might try changing all textboxes to textareas. However, a gotcha with the second approach occurs when a user enters a string with no spaces that’s longer than the one-line textarea; in that case, a horizontal scrollbar will appear that might hide the text. One can try making the textarea taller, but in Firefox it has to be more than 20 px or else no vertical scrollbar will appear for multi-line entries (because the scroll thumbs are quite large in FF).

    Making text entry faster and easier for mobile, games, and the disabled

    A Google video about MobileQWERTY™  shows how a 3×3 button layout using letter assignments different than the usual abc, def pattern can provide lots of benefits. For example, the speaker says that an average of 2.14 key presses is needed to type each letter of a typical message using an abc layout on a standard phone, but MobileQWERTY’s layout reduces the average to 1.35 key presses.  That’s 35% more than a full QWERTY keyboard but the abc layout is 114% more!

    MobileQWERTY™  is shown to provide similar improvements for several Western European languages (and one can see a demo of Japanese at minute 40 in the video). It’s targeted not just at standard mobile keypads (a problem space dominated to this point by Tegicâ„¢ , which owns the IP behind predictive spelling for abc layouts) but also game controllers and input devices for the disabled, children, and the elderly — anyone having trouble managing fine finger movements.

    The most impressive thing in the video to me is seeing how fluidly someone trained in MobileQWERTY™  can type typical messages.  I really liked the small form factor of my freebie Sprint Samsung phone, but had to give it up for a Treo650’s fuller keypad.  I think MobileQWERTYâ„¢ could turn out to be a better solution for mobile than Apple’s touch typing and predictive spelling.  Let’s hope it’s an option in Google’s Android phone OS.

    Simplifying markup and CSS selectors through “semantic” tags

    In a brown bag yesterday, Kevin Lawver suggested a best practice for DHTML: Prefer “semantic” markup instead of overuse of div. The principles are:

    • If the content identifies a major section, use an h tag;
    • If the content is list-like, such as left nav or header/footer links, use ul and li;
    • If the content is text-only, use p;
    • Otherwise, one might use div or span, but one might as well use shorter tags such as b or i…where one might adopt a practice of always wrapping text inputs with i and button or dropdown inputs with b and more general kinds of content with, say, u (assuming the styles of these tags are set to “vanilla” styles globally).

    Shorter tags make responses lighter. They also allow for CSS selectors that reflect the structure of the doc; for example, a selector containing “ul li i” suggests a container for a textbox (i.e., the i) within a radio group (i.e., the “ul li”)…which is a common pattern for a radio labelled “other:”. Note that since the selector uses type/tag names, the markup does not need to include a class attribute-value pair; that makes the markup and the CSS both shorter, too.

    But the key benefit proposed was that such markup is more self-documenting than lots of nested divs because the tags indicate the kinds of things they should contain. That is, the main benefit is improved readability and hence better maintainability.

    Read more on the microformats page about “POSH” 

    Understanding Remote Presence

    A comment on: Understanding Remote Presence

    The authors are Scandinavian, and start off with an interesting observation: “When you enter a Scandinavian home in the wintertime you will soon realize the importance of light, and how different lamps are crucial for carrying out work and daily house chores. But the use of light is also essential to show that you are at home and to manifest the presence of life. […] using light […] in some communities […] is a mutual social activity with the neighbors to show that you are doing well and even that you might welcome visits.”

    The authors are also interested in the variety of sentimental artifacts in the home, “A key concern for us is to focus not only on the pattern of communication within a co-residential social unit, but also to investigate what people keep in their homes to act as a reminder of people to whom they are close.”

    The authors did an ethnographic study where they visited homes and asked about the significance of objects while being given a tour. One of the more interesting findings is that some people have mixed feelings about the phone as an object, because it reminds them of painful conversations with remote loved ones. The authors also asked subjects to enact specific scenarios, such as leaving the house, in order to observe habits such as leaving lights on if the owner were staying in the neighborhood.

    From the observations, the authors created several concept devices that combined the qualities of light source, keepsake item, and awareness of presence with a remote loved one. One of these concepts, called the “6th Sense” lamp, was developed into a prototype used in a followup study.

    These lamps are made in pairs that are connected via GSM wireless network. When a human is near one lamp, the other lamp brightens, giving each lamp owner a sense of the other owner’s activity around the lamp. There was a 2-week study with 6 families of different kinds. Subjects were prepped with:

    – a story of how in the old days in small villages, people could look out their window and see their parents’ homes, and could get a sense of how they were by how the house was lit inside

    – a simple ritual once the lamps were installed where each user called the other and turned on the lamps while on the phone

    – journals with prepared questions that were kept by the subjects

    The authors wereinterested in the subject’s perceived quality of (a) sense of presence, and (b) sense of being under surveillance. One subject (a father whose sons complained that he only called about practical matters) said the sense of presence wasn’t useful because he spoke to his sons so often by phone. And the only subjects who worried about surveillance were parents that didn’t want to intrude on their children.

    These studies were very interesting in how they identified the meaningfulness of routine, practical activities like turning on lights, and how we might be able to introduce artifacts that are easily embedded in such activities and which enable even greater, yet subtle, human interactions.

    Massively conferenced phone calls that mimic physical space

    A comment on: The Mad Hatter’s Cocktail Party: A Social Mobile Audio Space supporting Multiple Simultaneous Conversations

    Question: Imagine you have several group conversations going on via conference call, but instead of having separate calls, all the groups are lumped into a single call so that individuals can migrate in and out of conversations at will. Could an algorithm identify the different “floors” of conversation in real time, so that the volume of all conversations that a caller is not in are largely muted? That is, can an algorithm simulate the acoustics of a large meeting room, where the conversation that one hears best is the one of the group one is participating in?

    Method: Create a full-duplex (i.e. you can hear while speaking) conference call using mic’d iPaq PDAs that are connected via 802.11b wireless network to a GStreamer central audio exchange server. Create a “naive” Bayesian algorithm that is trained off-line with audio files from human conversations. The conversations are recorded during a party game that forces conversational groups to split up and reform. The human trainer then segments the audio according to which group is in the audio. People give subtle audio cues in their speech about whether they are participating in the conversation, such as not interrupting the current speaker but jumping in when that speaker indicates he is finishing. The Bayesian training should enable the algorithm to pick up on these cues and attentuate the volume of incoming audio streams that aren’t in the same conversation as a particular user, and make these decisions for all users simultaneously.

    Findings:

    – When the system assigned floors correctly, users preferred it to having no floor assignment (i.e. no volume adjustment). This makes sense, since so many people were in the call at once that it was practically impossible to have group conversations without managing the call as if they were one, all-inclusive group.

    – Users suggested a “maintain the current volume level” widget, since the system sometimes reassigned other people in the conversation to another group and this wasn’t noticed until the other person’s audio was so muted that they had trouble speaking normally to each other and thereby getting the system to notice that they were actually in the same group.

    – Unlike in Fact-To-Face (FTF) interactions, it often took participants quite awhile to notice that another person had moved to another group (whether by that person’s choice or by the system’s mistake).

    IMO, this could be a very useful feature in stationary and mobile conferences, but only if there is really a need for people to move silently in and out of conversations. The only situation like that that I can think of are phone-based chat rooms (which don’t exist yet AFAIK) and party lines. The need for a maintain-this-volume button is also problematic with mobile use, since pressing keys on cellphones while talking is very awkward.

    Using only audio to support telepresence in office environments

    A comment on: Hanging on the ‘wire: a field study of an audio-only media space

    Question: How effective is an audio-only connection for office-based collaborative work?

    Method: The authors setup a conference-call-type (but full-duplex) connectivity in the offices of a group of video editors who were already a social group, within 100 ft of each other. Each editor received a mutable open mike and speaker system that all users to speak and hear each other simultaneously. In their regular routine, the editors rarely worked on the same video segments.

    Findings:

    – The lack of a formal way to take and release “the floor” led to overlapping speech and the need to repeat oneself and overtly manage the floor…this appears to be done most of the time through facial expressions and gestures. It was also more of a problem than in phone conversations, since interactions were open-ended (hours long) and less formal.

    – There was lots of joking and voice play

    – What little time was devoted to work tasks involved scheduling meetings, primarily

    – Similar to the problem of managing the floor, another major problem was determining who was “on” the system and thus how careful one had to be about what was said…in at least one case, catty gossip was overheard by the subject of the gossip

    – A phone ring that interrupted the speaker evolved into an informal sign-off, since it typically indicated that the speaker muted the mic and took the call

    – Some sudden background noises, like ringing phones, caused pain for some participants who were listening in with headphones instead of speakers. This was the only problem that the authors thought merited automated help (i.e. to monitor and squelch loud noises)

    IMO, this study showed that open-ended, non-task-related exchanges aren’t served well by an audio-only Computer-Supported Collaborative Work (CSCW) system. But this result suggests a followup study of how audio-only conference connections might help limited, task-related exchanges. The mobile blue collar subjects in one of the other studies here would be an ideal group.

    Using suggestive sounds to indicate state in complex systems

    A comment on: Effective sounds in complex systems: the ARKOLA simulation

    A prototype was created to answer the question, if you have a complex system that needs monitoring, like a soda bottling operation, and there is too much visual information to display on screen at a time, would auditory clues about the system’s behavior be helpful?

    The prototype simulates the imaginary “ARKola” bottling plant, where empty glass bottles, cola nuts, and carbonated water are delivered at intervals at different ends of the plant. The nuts and water are heated, the bottles filled and then capped, and then marshalled for shipping. Each of these steps has its own box in the graphical simulation and its own sounds (such as the clinking of newly arrived bottles). The screen can show only 1/4 of the simulation at a time, so the user has to listen for the crash of broken bottles, the spill of boiling syrup, and so on, in order to adjust several controls and keep the plant running smoothly.

    In the test, users were put into partner pairs where each partner was in a different room. In the control group, the partners could speak with each other and monitor the GUI. In the other group, the partners heard the auditory icons (“earcons”) also.

    Conclusions:

    • Earcons in general seemed very useful
    • However, more recognizable sounds, like breaking glass, could distract from less recognizable sounds that indicated more important problems.
    • The stopping of a sound tended to be ignored, even if the stopping was important.
    • The partners who had earcons tended to segment the work and talk with each other about the other’s problems. The partners who didn’t have earcons tended to ignore the other person’s problems and focus on their own.
    • The overall performance of the simulated plant was inferrable from the earcons almost as a gestalt, similar to the way people suspect problems with their car based on the sound of the engine.

    WebDAV: IETF Standard for Collaborative Authoring on the Web

    A comment on: Introduction to IETF WebDAV standard (IEEE Internet Computing, September/October, 1998)

    WebDAV, or just DAV, is “Distributed Authoring and Versioning”, a protocol that extends HTTP to allow creating, editing, and managing versions of resources using URLs. DAV completes Tim Berners-Lee’s original vision of the web as not just a collection of read-only resources but of read/write ones.

    The main features are:

    • The ability to define an access control list (ACL) for a remote web resource, to limit who can write to it.
    • The ability to place an exclusive lock (for serial sharing) or a shared lock (for simultaneous sharing) on a remote web resource, just before editing it.
    • The ability to get and set properties (metadata) on a remote resource, similar to the filename/creationDate/etc attributes of a file in a file system.
    • The ability to define resource collections similar to directories, and references similar to soft links.

    Planned extensions to the protocol will define search and versioning should behave.

    These features comprise a “network file system”, which on its own is not a new thing. But since this is a protocol, it’s platform-independent. Microsoft, Netscape, and Xerox have all contributed (and Office 2000 supports it), but no corporation dominates its direction.

    On a practical note, you might wonder why not just use FTP. Well, that’s not secure. But you could use it with SSH. Yet, not all platforms and apps support SSH; DAV, on the other hand, leverages the security of HTTPS.

    IMO, DAV is one of the core Internet technologies of the near future. There is a growing sense that email, documents, webpages/blogs/RSSfeeds, etc have to be tamed through a uniform interface for creating and sharing resources with people. DAV holds the promise of being a uniform underlying technology that could spark that uniform interface. And when such simplifications occur, creativity is freed up to discover new uses.

    Designing to support communication on the move

    A comment on: Designing to support communication on the move

    A new ethnographic study by Jacqueline Brody about blue-collar mobile communication needs (this one from 2003; prev was from 2001).

    She points out that there is no definitive definition of ‘mobile work’ and offers a list of example instances: “working at multiple (but stationary) locations, walking around a central location, travelling between locations, working in hotel rooms, on moving vehicles, and in remote meeting rooms.”

    This was a 2-page report and its findings are fairly general:

    • Information arriving in one medium tends to force the interaction to remain in that medium, because it’s so hard to switch. Mobile tools should make this easier.
    • Mobile workers need to prevent interruptions while reassuring employers that they are “on the job”
    • Current tech limits the extent to which mobile workers can tap into shared resources like knowledge bases.
    • Current tech depends too much on having a flat work surface available (when using a laptop) or a free hand (when using a cellphone, PDA, or laptop)

    Designing for Mobility, Collaboration, and Information Use by Blue-Collar Workers

    A comment on: Designing for Mobility, Collaboration, and Information Use by Blue-Collar Workers

    The focus of this study was mobile blue-collar workers like copy repairmen, and how they solve problems using wireless equipment like cellphones and laptops.

    The general findings were:

    • The tasks most in need of better support are
      • scheduling later Fact-To-Face mtgs, which is complicated by the need to find mutually available times and locations, record these decisions while having only one or no hands free, and where the other person may not even be able to talk with you at the moment
      • delivering docs to complete an interaction. For example, during a cellphone conversation, the other participant might request that a contract be sent in advance of a mtg; since cellphones generally can’t be used to access or transmit files, the promiser has to remember to send the doc later, perhaps hours later, when he has access to a computer or fax.
    • Some companies need to provide phone services to their mobile employees but are so concerned about cell minutes that they opt for phone cards instead of actual phones. Finding a phone is often a hassle, and these phones are often not close enough to the worksite to allow guidance over the phone.
    • Mobile workers need a good way of indicating a ‘busy’ status, to prevent interruptions. (The article seems to suggest a software solution, but I’m not sure why taking one’s phone off-hook wouldn’t be sufficient, and more reliable than a new type of software.)
    • Workers need a hands-free, out-of-car communication solution.

    “Taking Email to Task” – PARC HCI research

    A comment on: Taking Email to Task

    This paper notes that email has morphed from simple asynchronous msg exchange to a means of managing tasks. For example, users have been observed sending themselves email as reminders, since they know that they scan their inbox several times a day. Similarly, users forward msgs to themselves to keep the msg within the visible area of the inbox. The inbox is also used as a file store — users will send themselves an email with an attachment as a way of quickly accessing the file during the run of the task.

    Based on these observations, a prototype was developed that:

    – Allows senders to indicate a deadline, and all recipients of this msg see a progress bar in the inbox showing how much time remains for this item

    – The inbox and outbox are shown in one view, since the two together represent current tasks

    – Attachments appear as their own item in the inbox, which sidesteps the problem of att’s being hidden within msgs

    These features (and others) were implemented as an Outlook plugin, and a study was done across a variety of users in diff locations. The added features were found to be generally very helpful, although most users stopped using it due to the plugin not supporting the full range of Outlook features they had become dependent on.

    The problem examples suggest some features that the authors didn’t discuss:

    – Allow recipients to create a summary meaningful to them that would be shown in place of the subject text (in a diff color)…useful for archiving how-to info and such

    – Allow senders to mark some recipients as “responsible”. The inbox would also be split into 4 bins: MyTasks, OthersTasks, Unread, and Read. Any msg where I’m marked responsible would go to MyTasks from Unread once I read it; similarly for msgs where only others are responsible, for OthersTasks. The bins help avoid important msgs falling out of the visible area (and MyTasks/OthersTasks/Unread should expand to fit new msgs until full, and then pulse).

    – Items with responsible parties should also have a status: {assigned, accepted, declined, finished}…this is a lightweight form of bugtracking/task resolution.

    – Changes to deadlines, those responsible, or status should cause the display of that item in everyone’s display to change

    This paper fired my already keen interest in Personal Information Management tools (“PIMs”) even further.

    Microsoft Research project “MyLifeBits” at PARC 2/2004

    A comment on: MyLifeBits.com

    Gordon Bell was once one of the main engineers at DEC working on the PDP-11 and VAX mainframes. Now, he’s at Microsoft Research in a project aiming to capture much of his life digitally — scans of all books and papers he’s written, photos from his life, and recently, copies of all emails, IMs, phone calls sent/received since the project started. And even some TV shows he watches. (The scope here isn’t as wide as for some “ubiquitous computing” pioneers, who wear AV recording equipment virtually 24/7.)

    This is an extreme version of a trend that’s been happening for all of us, and the aim is to identify ways of managing the storage and retrieval.

    A program on Gordon’s laptop captures all of this flow and stores it in a db on the laptop. It’s accessible via iPhoto-like thumbnails and SQL queries. They’re working on creating auto-classification mechanisms…there seems to be little meta-data for these resources, either. (Just “Dublin Core” for papers and books.)

    Surprisingly, Gordon talked of the difficulty of copying lots of photos to his daughter’s hd, but they haven’t thought much about automating sharing with the people experiencing the events recorded this way. (This is typical Microsoft/Apple mindset: solve it on a PC rather than a server (or peer network).) The sharing challenge seems just as difficult as the indexing/retrieval challenge.

    A good point from the audience:You’d want to enable diff authorization for diff pieces. For example, you wouldn’t want everyone to see footage of you in the bathroom, but you might want to allow your doctor.

    Creating the interface for such massive storage, to allow casual users to easily find things, regardless of whether it’s an email, photo, music track, etc seems like the next great “killer app” challenge.

    Computer History Museum overview by director John Toole, ~2004/02/02

    Director Toole presented a very general overview of the Computer History Museum’s mission and holdings, which includes mainframes, PCs, software, documentation, and ephemera such as computer-related coffee mugs. (This museum is located in Mountain View, CA and this overview was given at Xerox PARC, Palo Alto, CA.)

    Part of the mission is to give visitors as much hands-on experience with the holdings as possible. The fragility and irreplaceability of much of the tangible holdings presents a big challenge. But what interested me most was how they plan to exhibit the software, especially server-based software such as portals and networked games.

    It appears that the museum has very few holdings in this area, and Toole’s best guess of an exhibit would be screenshots. While I don’t have any suggestions for this problem, it seems a real shame that these experiences will soon be inaccessible. Wouldn’t it be great to try Google as it was in its first month, or AOL/Yahoo/MSN in their early stages, or networked computer games, just for a sense of how far we’ve come and what lessons have been learned.

    In with the old, onto the new

    ‘Just finished copying over lots of little tech notes I had posted on various old blogs and home pages from as long ago as 1998.  Whew!

    I wanted to make all those tips findable by search engines, but I’m eager to start on real commentary about issues that excite me now, like commonsense reasoning in AI and parsing of natural language.

    Memory leaks due to iframes in IE (also how to file-upload via dojo)

    1) Changing the src of an iframe in IE may cause later events to fire multiple times – once for each change of src.  See http://www.dynamicdrive.com/forums/archive/index.php/t-10589.html

    jscheuer1
    06-24-2006, 09:21 AM
    Yeah, I see what you mean, big time slow down in IE. No problem in FF. I didn’t test any others. I Thought it might be a memory problem so I tracked memory usage in Task Manager. No real problem with memory usage but I noticed actual CPU usage was spiking and then getting pegged a 100%. The more I loaded pages into the iframe in IE after that the longer CPU usage would remain pegged at 100% and this corresponded exactly with the amount of time that the frame was blank. I then had a look at your source code and saw that you had commented out this line:

    //currentfr.detachEvent(“onload”, readjustIframe) // Bug fix line

    Those two little red slashes at the beginning make it a comment. Why did you do that? I’m like 99% sure that this is the problem as that is an IE specific line designed to prevent multiple instances of the resizing event. Without that line, each time you load something into the iframe an event gets attached to it. After 20 loads, you have 20 events all firing at the same time. Almost has to be it. Just remove the red slashes and you should be fine.

    2) Alex Russell of DojoToolkit explains that memory problems can occur with IE in both DOM event handlers and XHR due to the browser’s reference counting mechanism not realizing that it can recover some closures after they’re no longer needed.  See http://alex.dojotoolkit.org/?p=528 (which also explains how dojo can be used for file uploads).

    How to set locale in OpenAIM

    For one-time change of locale:

    Use commandline flag ‘-l’ (“el”); for example, c> .accbuddy.exe -l zh-cn:cn

    For a persistent change:

    1. Signon to accbuddy
    2. Type :h
    3. In the returned list, locate the command that sets preferences
    4. Set the locale preference
    5. Sign off
    6. Signon with the same userid

    “virtual directory not being configured as an application in IIS”

    Parser Error Message: It is an error to use a section registered as allowDefinition=’MachineToApplication’ beyond application level.  This error can be caused by a virtual directory not being configured as an application in IIS.

    If you get this msg in your browser, then do:

    1. Open Settings/ControlPanel/AdminTools/IIS/(this computer)/WebSites/DefaultWebSite/(your branch dir)
    2. Right-click your project folder
    3. Select Properties
    4. Click the ‘Create’ button for ‘Application Name’
    5. Restart IIS
    6. Re-request your page

    How to use the OpenAIM test client accbuddy.exe

    Accbuddy.exe is a no-frills client distributed by the AIMcc team as part of their SDK, to help with debugging.

    1) Locate accbuddy.exe in your AIMcc SDK dirs (e.g., mine is in aimcc1.2.1.1281distdebug)

    2) Download the following MS debug-support DLLs into that same dir:

    https://www.cm.aol.com/viewcvs/clientdev/viewcvs.cgi/sdks/microsoft/vc6/bin/debug/MSVCRTD.DLL

    https://www.cm.aol.com/viewcvs/clientdev/viewcvs.cgi/sdks/microsoft/vc71/bin/debug/msvcr71d.dll

    https://www.cm.aol.com/viewcvs/clientdev/viewcvs.cgi/sdks/microsoft/vc71/bin/debug/msvcp71d.dll

    3) Double-click accbuddy.exe and enter your login when prompted.

    4) Type :h for a list of commands, or :q to quit

    5) View the trace file by doing Start|Run and entering %temp%.  Locate the xprt* file, representing the xprt library’s portion of the session.  (I believe there should also be a trace.txt file representing the AIMcc portion of the session, but I don’t see one.)

    Here are all the commands:

    :h     {cmd}         get (this) list of commands
    :q                   log out and quit
    :sid   key           enter SecurID key

    === Session ===
    s:a    {message}     sets away message
    s:c                  confirm account
    s:d    {prop} {v}    display info about the current session
    s:l                  list current sessions
    s:sp   prop value    set a session property
    s:ca   cap           add a custom capability
    s:cr   cap           remove a custom capability
    s:w    {name}        switch to a different session

    === per-session bart items ===
    s:gb   {type}        get one of your bart items
    s:gbo  {type}        get one of your bart items as an object
    s:sbf  file {type}   set/upload a bart item from a file
    s:sbt  text {type}   set a bart item with text
    s:sbi  bartid {type} set a bart item from its bartid string
    s:sbu  user {type}   set a bart item copied from another user
    s:sbx  {type}        remove a bart item

    === Buddy List ===
    :w                   show currently logged in buddies
    b:a    buddy group   add a buddy to the specified group
    b:r    buddy group   remove a buddy from your list
    b:l    {group}       list buddies on your list
    b:lf   {num}         list MFU buddies
    b:f    buddy {group} find a buddy
    b:m    from to fromGp {toGp} move a buddy
    g:a    group         add a new group
    g:r    group         remove group and all buddies inside
    g:d    group {prop} {v} dumps group props
    g:l                  list all the groups
    g:m    from to       move a group
    g:rn   old new       rename a group
    b:ta   buddy         add a temp buddy (cannot be in buddy list)
    b:tr   buddy         remove a temp buddy
    b:d    {prop} {v}    dumps buddy list props
    b:ynr                list buddies with unknown state

    === User info ===
    i:d    user {prop} {v} dumps user props
    i:g    user {prop} {v} gets user info, async
    i:gb   user {type}   get/download a user’s bart item
    i:gbo  user {type}   get/download a user’s bart item as an object
    i:e    email {…}   lookup users by email address
    d:s    query {type}  search on e-mail address for screen names
    d:l                  asks LOST to guess where client is
    u:r    user {reason} report a user (aka warn them)
    u:rn   user {reason} report a user (and notify AOL)
    u:sp   user prop {val} set a user property
    u:ig   user          ignore a user
    u:unig user          unignore a user
    u:bl   user          block a user
    u:unbl user          unblock a user

    === IM/Chat ===
    im:ni  user          new std im session
    im:nd  user          new direct im session
    im:nc  room          new chat session
    im:i   user {msg}    invite a user
    im:e   user          eject a user
    im:p   type          propose im session type change
    im:a                 accept incoming proposal
    im:r   {reason}      reject incoming proposal
    im:c   {invitee}     cancel outgoing proposal
    im:q                 end im session
    im:d   {prop} {v}    dump im session properties
    im:s   msg           send specified im
    im:sm  type msg {f}  send specified im and MIME type w/optional flags
    im:st  state         set input state
    im:at  text          append text to outgoing IM
    im:ab  bartid type   append a bart item to the outgoing IM
    im:af  file          append embed file/dir to outgoing DIM
    im:s!                send accumulated IM/DIM
    im:sr                stop receiving DIM embeds
    im:ss                stop sending DIM embeds
    im:l                 list all im sessions
    im:w   num           switch to the specified im session
    im:cp  personality   change personality to send in IM
    im:ex  {value}       display incoming IMs in viewer
    im:rs                requested stored IM summaries
    im:ds  {id}          deliver stored IM(s)
    im:xs  {id}          delete stored IM(s)
    .      msg           shorthand for im:s

    === File Xfer ===
    fx:n   user file     send a file or folder
    fx:a                 accept incoming proposal
    fx:r   {reason}      reject incoming proposal
    fx:q                 end file xfer session
    fx:d   {prop} {v}    dump a file xfer session
    fx:rc  {action} {path} resolve a file xfer collision

    === File Sharing ===
    fs:n   user          request a shared file listing
    fs:a                 accept incoming proposal
    fs:r   {reason}      reject incoming proposal
    fs:q                 end file sharing session
    fs:c   index         change into a different folder
    fs:x   index         request a file or folder
    fs:d   {prop} {v}    dump a sharing session
    fs:di  {prop} {v}    dump current sharing item

    === A/V ===
    av:na  user          start an audio session
    av:nv  user          start a video session
    av:nca {tbd}         start a multiparty audio session
    av:ncv {tbd}         start a multiparty video session
    av:i   user {msg}    invite a user
    av:e   user          eject a user
    av:a                 accept incoming proposal
    av:r   {reason}      reject incoming proposal
    av:c   {invitee}     cancel outgoing proposal
    av:q                 end an a/v session
    av:d   {prop} {v}    dump an a/v session
    av:h   {hold}        hold an a/v session
    av:t   key           play DTMF tone in session
    av:l                 list all av sessions
    av:w   num           switch to the specified av session
    av:dm  {prop} {v}    dump a/v manager
    av:m   {mute}        mutes outgoing audio

    === Share Buddies ===
    sb:ng  user group    share a buddy group
    sb:nu  user buddy    share a single user
    sb:a                 accept incoming proposal
    sb:r   {reason}      reject incoming proposal
    sb:q                 end share buddies session

    === Custom Sessions ===
    cs:np  guid {user}   start a custom offer-answer session
    cs:ns  guid {user}   start a custom stream session
    cs:nm  guid {user}   start a custom message session
    cs:s   data          send data
    cs:i   user {msg}    invite a user
    cs:e   user          eject a user
    cs:a                 accept incoming proposal
    cs:r   {reason}      reject incoming proposal
    cs:q                 end custom session
    cs:d   {prop} {v}    dump an custom session
    cs:l                 list all custom sessions
    cs:w   num           switch to the specified custom session

    === Security ===
    sec:i  file passwd   imports cert(s)
    sec:r                clear cert db & invalidate certs
    sec:l                lists known certs
    sec:e  file filePwd tknPwd export cert(s)

    === Preferences ===
    p:g    spec          get a preference value
    p:G    spec          request a preference value
    p:s    spec val      set a preference value
    p:sbf  file {type}   set/upload a bart item from a file
    p:sbt  text {type}   set a bart item with text
    p:sbi  bartid {type} set a bart item from its bartid string
    p:sbu  user {type}   set a bart item copied from another user
    p:sbx  {type}        remove a bart item
    p:c    {spec}        count prefs under spec
    p:d    {spec}        display all prefs under spec
    p:dd   {spec}        display all prefs under spec without giving up
    p:r    spec          reset a preference
    p:i    file          import preferences from a file & merge
    p:e    file          export preferences to a file

    === Plugins ===
    pl:l                 list installed plugins
    pl:x   uuid cmd {names} execute plugin command

    === Smileys ===
    sm:s   bartid        print info about a smiley set
    sm:n   bartid n      print info about one smiley from a set
    sm:f   bartid text   print info about the smiley that matches ‘text’

    === Other ===
    o:gbi  bartid type   get a bart item by id&type
    o:gbio bartid type   get a bart item by id&type as an object
    o:sim  addr msg      send an email inviting someone to sign up with AIM
    o:srv  service       send a request for an external service
    o:url  url           download a URL using URLMON
    o:urlv url           display a URL in a separate dialog
    o:x    path          convert input file to xhtml
    o:link               link active AV and IM (chat) sessions
    o:v                  show the current version

    NOTE: Whitespace must be double quoted.
    NOTE: Double quotes must be escaped with a forward slash: “.

    System.Reflection.TargetInvocationException

    When you get a popup about this kind of exception in C#, select Break in the popup and then look in your Locals for the innerException property of this exception.  It will probably indicate something like “Exception thrown by type initializer of class“, which means you might be trying to do something like get the Length of an uninitialized string.

    System.UnauthorizedAccessException

    This can happen if you received the ASPX file via an IM file transfer, for example.  Look at the file’s Properties|Security|GroupOrUserNames, and if “Users” isn’t listed there, that means ASP.NET can’t access the file.

    I fixed this by copying the file contents into the clipboard, reverting the file, and pasting the contents over the file.  (When you get the file from Perforce, the users list should be correct.)

    “Failed to access IIS metabase”

    Try reinstalling ASP.NET 2.0:

    c:windowsmicrosoft.netframeworkv2.0.50727aspnet_regiis -i

    Also make sure that the IIS console shows that, for Properties|ASP.NET of each of your web-shared folders, that v2.0 is shown in the first dropdown.

    ASP.NET supports Page_Error()

    If one’s C# code throws an exception that isn’t explicitly caught by any try/catch, then  one can use ASP.NET’s Page_Error() method as a fallback…it’s invoked for all such thrown exceptions.  The default implementation just pops an error dialog.

    Trick IE into reserving a connection for Comet traffic

    “Comet” is a dhtml technique for sending updates from the server to a browser client. One way to do it is to place a hidden iframe in one’s page and have the server write a <script> to the response whenever there’s an update; the script executes as soon as the client receives it, and it might popout a window containing a new message or change styling in the parent doc. In order for the client to keep getting updates, the server purposely never indicates that it has finished writing the response.

    A webapp that uses Comet typically needs one connection for Comet and one to send requests to the server whenever the user makes some kind of edit (aka RPC = remote procedure call).  While the second connection isn’t needed all the time, IE6 permits only two connections to be used at any time, so any other windows open in IE have to fight with your service for them, which sometimes leads IE to close one of your connections prematurely.

    Although it’s not friendly to such other services, one can trick IE into reserving one of its connections for your Comet page by requesting that page from a different domain than one makes RPCs from. If the server must maintain state, the domains can even point to the same machine (although you probably need to map both domains to a switch that then uses a cookie to find the particular machine)

    To simulate such a setup on a Window dev machine, add the following line to C:WINDOWSsystem32driversetchosts

    127.0.0.1       fake-domain1                       fake-domain2

    IIS fails to serve pages, although ping succeeds

    After many failures to install VS2005 on my laptop, I got a second machine and reinstalled XP (and drivers) in order to bypass any app conflicts. This wiped out McAfee (which was part of the install image), and the XP install turned on its firewall by default in response.

    I hadn’t noticed that the firewall was on, and when I tried to get IIS to serve a simple xml file from its doc root (c:inetpubwwwroot), only the hosting machine could access it. Yet ping from other machines to the IP name worked.

    I looked in the IIS access log at c:Windowssystem32LogFilesW3SVC1ex*.log, and there weren’t even any access attempts from other machines shown. Googling “The page cannot be displayed” revealed the firewall involvement.

    To disable the firewall, Start|Run, and enter “Firewall.cpl” and follow the prompts.

    “A project with an output type of Class Library cannot be started directly”

    VisualStudio has forgotten your debug settings. Right-click on your project in the SolutionExplorer pane, and select “Set as startup project”.  In the context menu, you might also want to select Properties|Configuration|Debugging|StartPage and set to path-under-project-dir/your-test-page.aspx

    For example, if the project is “WebApp” and the test page is at “WebApp/test/Login.aspx”, then the start url could be “test/Login.aspx”.  The url could also be an absolute one using localhost or your machine’s true IP address.