XForm Primer

KoBoCollect requires that survey forms be built using valid XML. KoBoForm is an easy way for non-technical users to create valid forms without knowing how to edit XML. However, for those interested in the inner workings of these XML forms, there are rules and practices that are helpful to know. Learning these rules will allow a form designer to edit completed forms without loading them into the Form Designer, or even to build forms from scratch.

To edit XML, you need a simple text editor. Notepad on Windows will work just fine. If you want syntax coloring and other features, you can use Notepad++ (Windows) or SciTE (Linux).

An XForms Primer

XML is a markup language, so everything in it is wrapped in sets of tags.

Tags work like this:

<tagname>content</tagname>

 

Or, in practice:

<sex>Male</sex>

Or, more realistically, you will be using coded numeric values instead of text for your answers:

<sex>1</sex>

 

Tags always open and close. 

<tag></tag>

A tag with no content can also be written as a single self-closing tag, which looks like this:

<tag/>

 

Forms are built using XML, and there is a set of rules that applies to the XML used to build them. Forms that follow these rules are called XForms, and the rules basically just say that certain tags go in a certain order.

 

Here is an example of an XForm:


 

<?xml version="1.0"?>

<h:html xmlns="http://www.w3.org/2002/xforms" xmlns:h="http://www.w3.org/1999/xhtml" xmlns:ev="http://www.w3.org/2001/xml-events" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:jr="http://openrosa.org/javarosa">

        <h:head>

                <h:title>Basic Form</h:title>

                <model>

                        <instance>

                                <!-- submitted data and defaults go here -->

                                <data id="basic">

                                        <StringData/>

                                </data>

                        </instance>

                        <!-- add properties to the questions, like data types, skip logic, and constraints -->

                        <bind nodeset="/data/StringData" type="string" />

                </model>

        </h:head>

        <h:body>  

                <!-- define the questions that the user will fill out -->

                <input ref="StringData">

                        <label>please enter a string</label>

                </input>

        </h:body>

</h:html>

 

 

That's an entire form. I will break it down a little. 

 

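NOTE: You can copy and paste the following two lines, <?xml... and <h:html..., every time. The first is the XML declaration; it just lets the program know that some XML is coming. The second opens the form and declares the namespaces (the xmlns attributes) behind prefixes such as h: and jr: that are used in the rest of the form.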


 

<?xml version="1.0"?>

<h:html xmlns="http://www.w3.org/2002/xforms" xmlns:h="http://www.w3.org/1999/xhtml" xmlns:ev="http://www.w3.org/2001/xml-events" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:jr="http://openrosa.org/javarosa">

 

NOTE: Here is the HEAD. It contains the title of the form and all the data that doesn't get shown to the user. Inside the HEAD is the MODEL, which contains the INSTANCE and the BIND statements.

        <h:head>

                <h:title>Basic Form</h:title>

                <model>


NOTE: The INSTANCE is a description of the data structure. It is basically a flat database that gives us the names of all the data fields and provides a place for the answers to be written into. Later on, when the survey is completed, the software will use the instance to store all the survey answers, and it will save just this instance, with data in it, as the survey results. That way, the survey results are very lightweight; they don't need to carry the rest of the form along with them.
Each piece of data, like the age of the respondent, is a NODE in the INSTANCE. Each node gets a name. The name is arbitrary: if your first question is about age, you can name the node "age" or "q1", it doesn't matter.

<instance>

 <node/>

</instance>

In practice:

<data id="survey">

 <age/>

</data>

Later on, after a survey has been completed, the application will save out a new file, containing only the instance with the answers to the questions saved in the appropriate nodes, like this:

<data id="survey">

 <age>23</age>

</data>
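As a rough sketch of the same idea with more than one node, here is a blank instance with three fields, followed by the kind of file that might be saved from it. The node names "name" and "sex" and all of the answers are made up for illustration; only the "age" node comes from the example above.

```xml
<!-- Blank instance, as it would sit inside the form's <model> -->
<instance>
  <data id="survey">
    <name/>
    <age/>
    <sex/>
  </data>
</instance>
```

```xml
<!-- What a saved result might look like (the answers are invented) -->
<data id="survey">
  <name>Maria</name>
  <age>23</age>
  <sex>2</sex>
</data>
```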

                        <instance>

                                <!-- submitted data and defaults go here -->

                                <data id="basic">

                                        <StringData/>

                                </data>

                        </instance>

                        <!-- add properties to the questions, like data types, skip logic, and constraints -->

The BIND is also in the MODEL, and it is the place where you write rules for each node in the Instance. Those rules can be things like constraining integers. For example, when asking age, we don't want anyone under 18 and we don't want people writing a crazy number in there, so we constrain that data to between 18 and 99. The survey looks at the bind when it is running, and when that question comes up, it will not accept an answer that doesn't fit the constraint.
BIND statements are optional; the form will work without them, but you CAN have a BIND statement for every node in the instance.
In each BIND statement, you can set the question to be required, constrain the answer, insert skip logic, and more. In the following example, the bind statement refers to the node in the instance called "age", sets the type to integer (type="int"), and constrains the answer to between 18 and 99. If the given answer is outside this range, an error message is shown. Also, the question is required, so if it is not answered the respondent will not be able to continue.

                        <bind nodeset="/data/age" type="int" constraint=". &gt; 17 and . &lt; 100" jr:constraintMsg="Age must be between 18 and 99" required="true()"/>
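Putting the two pieces together, a model for a single age question could look roughly like the sketch below. It assumes the jr: prefix has been declared on the <h:html> tag as in the full example, and it uses <data> as the instance root so that the nodeset path /data/age matches.

```xml
<model>
  <instance>
    <data id="survey">
      <!-- the node that will hold the answer -->
      <age/>
    </data>
  </instance>
  <!-- rules for that node: integer, required, between 18 and 99 -->
  <bind nodeset="/data/age" type="int" required="true()"
        constraint=". &gt; 17 and . &lt; 100"
        jr:constraintMsg="Age must be between 18 and 99"/>
</model>
```

The nodeset path is built from the root node name followed by the node name, so if either one is renamed in the instance, the bind has to be updated to match.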

NOTE: Finally, there is a close tag for the Model, then the Head. The Instance and the BIND always go between Model tags, and the Model always goes between Head tags.

                </model>

        </h:head>

NOTE: Now, the Body of the form. The Body contains all the elements that will be presented to the user. That is, here follow all the questions, in the order that they will be presented. Each question must refer to a Node in the Instance with ref="node". Each question has a Label that displays the text of the question.

        <h:body>

                <!-- define the questions that the user will fill out -->

                <input ref="StringData">

                        <label>please enter a string</label>

                </input>

NOTE: Questions may be of several types.

  • INPUT questions display an open box to the user. There is a label, which will usually contain the text of a question such as "What is the age of the respondent?" The INPUT box, when selected, will pop up a keyboard so that the user can enter text. If the BIND statement indicates that the type is an integer (type="int"), then the numeric keyboard will pop up.

  • SELECT1 questions display a series of choices with radio buttons. The user can select only one of the choices. You can code a label and a value for each choice.

  • SELECT questions display a series of choices with check boxes. The user can select any number of choices. You can code a label and a value for each choice.

In addition to the label, questions can also have a Hint, some text that appears below the label. It is optional and is really just a secondary label; it appears in italics and is usually used to clarify the question.
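The three question types and the hint can be sketched as follows. The node names (/data/age, /data/sex, /data/crops), the labels, and the values are made up for illustration, and each node would also need to exist in the instance.

```xml
<h:body>
  <!-- INPUT: an open box; numeric keyboard if the bind says type="int" -->
  <input ref="/data/age">
    <label>What is the age of the respondent?</label>
    <hint>Enter the age in completed years</hint>
  </input>

  <!-- SELECT1: radio buttons, only one choice allowed -->
  <select1 ref="/data/sex">
    <label>What is the sex of the respondent?</label>
    <item>
      <label>Male</label>
      <value>1</value>
    </item>
    <item>
      <label>Female</label>
      <value>2</value>
    </item>
  </select1>

  <!-- SELECT: check boxes, any number of choices allowed -->
  <select ref="/data/crops">
    <label>Which crops does the household grow?</label>
    <item>
      <label>Maize</label>
      <value>1</value>
    </item>
    <item>
      <label>Beans</label>
      <value>2</value>
    </item>
  </select>
</h:body>
```

The label of each item is what the user sees; the value is what gets written into the node.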
Once all the questions are coded, all that remains is the closing tag for the Body section and the </h:html> tag that closes the entire form.

        </h:body>

</h:html>

 

 

Those are the basic elements of an XForm. I did not touch on some of the more complex uses of the BIND, and I left out the multiple languages that can be coded into a section of the Head called Itext. Also, there are ways to use the camera and barcode scanner to collect data, but those question types are worth getting into another time.

XForm Logic Explained

The forms can contain both skip logic and data constraints. In the XML, both are written in BIND statements.

For every piece of data you are collecting (say, question 1) there is a node in the Instance.

<instance>
   <data>
      <question1/>
   </data>
</instance>

For every node, there can be a BINDing (though it's not necessary).

<bind nodeset="/data/question1" />

All by itself, that doesn't do anything. But, if you want the question to be required, this is where you add that control.

<bind nodeset="/data/question1" required="true()" />

Now, let's say that you want to add a data constraint. Question 1 is this: "What is your age?" You can't interview children, so the answer must be 18 or older. You don't want someone accidentally putting in the year of their birth, or some other crazy number, so you constrain the number to be less than 100. Oh, and it has to be a number, not a word.

<bind nodeset="/data/question1" required="true()" constraint=". &gt; 17 and . &lt; 100" type="int" />

If someone puts in a crazy number for age, the form won't accept it. You might want to add an error message, so you can do that too:

<bind nodeset="/data/question1" required="true()" constraint=". &gt; 17 and . &lt; 100" type="int" jr:constraintMsg="Subject must be an adult, between 18 and 99 years old." />

There are lots of kinds of data constraints. You can use regular expressions in there; for example, if the answer must be a 3-digit number, you can constrain it like this:

<bind nodeset="/data/question1" required="true()" type="int" constraint="regex(., '^\d\d\d$')" jr:constraintMsg="The number must be 3 digits" />

A lot of people think these kinds of expressions are awesome, I just think they are regular. Anyway, useful stuff.
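As a further sketch, a regular expression can constrain a text answer in the same way. The node name /data/district_code and the two-capital-letters rule are made up for illustration, and the jr: prefix is assumed to be declared on the <h:html> tag.

```xml
<!-- Accept only two capital letters, for example "KE" -->
<bind nodeset="/data/district_code" type="string"
      constraint="regex(., '^[A-Z]{2}$')"
      jr:constraintMsg="The code must be exactly two capital letters" />
```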

Notice that there is a datatype in there as well: type="int". This makes sure only numbers (integers) can be entered and pops up the numeric keyboard for the user.

Other data types are:

date
time
geopoint
string

There are some others. Types also control how the question is presented to the user: if type="geopoint", the user will see a control to get and record the GPS position.
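For example, a GPS question might be coded roughly as below. The node name /data/location and the label are assumptions for illustration; the bind goes in the model and the input goes in the body.

```xml
<!-- in the model -->
<bind nodeset="/data/location" type="geopoint" />

<!-- in the body -->
<input ref="/data/location">
  <label>Record the location of the household</label>
</input>
```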

Datatypes are useful in combination with Preloaders. For example, you can record the start time of your survey (or the end time, or some other time) like this:

<bind nodeset="/data/start" type="dateTime" jr:preload="timestamp" jr:preloadParams="start"/>

That captures the date and time at which the survey starts and records it in a node called "start".
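The end time can be sketched the same way, assuming a node called "end" has been added to the instance next to "start". The sketch uses the dateTime type, which stores both the date and the time. Neither node has a matching question in the body, because the values are filled in without asking the user.

```xml
<instance>
  <data id="survey">
    <start/>
    <end/>
  </data>
</instance>
<!-- filled in when the survey is opened -->
<bind nodeset="/data/start" type="dateTime" jr:preload="timestamp" jr:preloadParams="start"/>
<!-- filled in when the survey is finished -->
<bind nodeset="/data/end" type="dateTime" jr:preload="timestamp" jr:preloadParams="end"/>
```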

Skip logic is done in the BIND as well, by making use of the relevant property. You basically add this to a question, saying: THIS question should be presented to the user IF the answer to a previous question is THAT.

Q1 asks "Do you have a dog or a cat?" Q2 asks "What is your dog's name?" Q3 asks "What is your cat's name?"

So, Q2 is only RELEVANT if Q1's answer is "dog". Get it? Q3 is only RELEVANT if Q1's answer is "cat".

So, in the bind, it gets written like this:

<bind nodeset="/data/q1" />
<bind nodeset="/data/q2" relevant="/data/q1 = 'dog'" />
<bind nodeset="/data/q3" relevant="/data/q1 = 'cat'" />

ODK will only present the cat question to cat owners, the dog question to dog owners.
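For the whole picture, here is a sketch of the dog/cat example with its instance, binds, and questions together. The instance id "pets" and the choice of a select1 for Q1 are assumptions for illustration.

```xml
<!-- in the model -->
<instance>
  <data id="pets">
    <q1/>
    <q2/>
    <q3/>
  </data>
</instance>
<bind nodeset="/data/q1" />
<bind nodeset="/data/q2" relevant="/data/q1 = 'dog'" />
<bind nodeset="/data/q3" relevant="/data/q1 = 'cat'" />

<!-- in the body -->
<select1 ref="/data/q1">
  <label>Do you have a dog or a cat?</label>
  <item>
    <label>Dog</label>
    <value>dog</value>
  </item>
  <item>
    <label>Cat</label>
    <value>cat</value>
  </item>
</select1>
<input ref="/data/q2">
  <label>What is your dog's name?</label>
</input>
<input ref="/data/q3">
  <label>What is your cat's name?</label>
</input>
```

The relevant expression compares against the <value> of the chosen item, not its label, so 'dog' in the bind has to match the value exactly.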

There are multiple ways to use the RELEVANT property.

Like the dog/cat question above, a question can be relevant IF some other question equals some value:

relevant="/data/q16 = '88'"

If you want to ask ten questions about dogs and ten questions about cats, you just give the appropriate relevant="/data/q1 = 'dog'" to all the dog questions and do the same for the cat questions. So, you can skip whole sections that way.
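A shortened sketch of such a section, with three questions per animal instead of ten, might look like this. The node names (dog_name, dog_age, and so on) are made up for illustration.

```xml
<bind nodeset="/data/q1" />

<!-- dog section: every question carries the same condition -->
<bind nodeset="/data/dog_name"  relevant="/data/q1 = 'dog'" />
<bind nodeset="/data/dog_age"   relevant="/data/q1 = 'dog'" type="int" />
<bind nodeset="/data/dog_breed" relevant="/data/q1 = 'dog'" />

<!-- cat section: same idea, different value -->
<bind nodeset="/data/cat_name"  relevant="/data/q1 = 'cat'" />
<bind nodeset="/data/cat_age"   relevant="/data/q1 = 'cat'" type="int" />
<bind nodeset="/data/cat_breed" relevant="/data/q1 = 'cat'" />
```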

You can combine more than one Relevant value by putting "and" between them.

relevant="/data/q5= '1' and /data/q6 = '88'"

You can also use negatives, that is, Question is Relevant IF another question's value does NOT equal something.

relevant="/data/q5 != '0'"

When working with multiple-select questions (<select>), where users can select more than one answer, you can't use node = 'value' because there may be more than one value in there. Instead, you can use selected(node, 'value'):

relevant="selected(/data/q11, '88')"

Yes, it can be combined with other relevant conditions for other nodes:

relevant="/data/q9 = '1' and selected(/data/q11, '88')"

You can even say THIS question is Relevant only IF the multiple choice question does NOT have a certain value selected.

relevant="not(selected(/data/q11, '0'))"
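To see why selected() is needed, here is a sketch of a multiple-select question and a follow-up that depends on it. The labels, the values, and the follow-up node q12 are made up for illustration.

```xml
<!-- in the body: q11 allows several answers -->
<select ref="/data/q11">
  <label>Which animals do you keep?</label>
  <item>
    <label>Cattle</label>
    <value>1</value>
  </item>
  <item>
    <label>Goats</label>
    <value>2</value>
  </item>
  <item>
    <label>Other</label>
    <value>88</value>
  </item>
</select>
<input ref="/data/q12">
  <label>Please specify the other animal</label>
</input>

<!-- in the model: q12 is asked only if "Other" (88) is among the choices -->
<bind nodeset="/data/q12" relevant="selected(/data/q11, '88')" />

<!-- in a saved result: two ticked boxes are stored as two values separated by a space -->
<q11>1 88</q11>
```

Because the saved answer is "1 88" and not just "88", a plain comparison such as /data/q11 = '88' would fail here, while selected() still finds the value.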

OK, that's what I have to say about that. I write a lot of skip logic and data constraints, so I have figured out some tricks. Also some caveats: the ODK Form Validator (though it is awesome, all praise to the mighty Validator) does not catch all mistakes in the BIND.

If you are writing BIND statements and you refer to a non-existent node, the validator will not catch it, but your form will crash like a lead balloon. This happens easily if you mix up the letter 'O' and the number '0': you have a node called /data/O01, but you spell it /data/OO1.

You can watch it crash with adb logcat, and you will see the error message "Node does not exist".

For more exciting details, please refer to our friends at the Open Data Kit's Guide to Binding.